# MergeTree tables settings

> Settings for MergeTree which are in `system.merge_tree_settings`

System table `system.merge_tree_settings` shows the globally set MergeTree settings.

MergeTree settings can be set in the `merge_tree` section of the server config file, or specified for each `MergeTree` table individually in the `SETTINGS` clause of the `CREATE TABLE` statement.

Example for customizing setting `max_suspicious_broken_parts`:

Configure the default for all `MergeTree` tables in the server configuration file:

```text theme={null}
<merge_tree>
    <max_suspicious_broken_parts>5</max_suspicious_broken_parts>
</merge_tree>
```

Set for a particular table:

```sql theme={null}
CREATE TABLE tab
(
    `A` Int64
)
ENGINE = MergeTree
ORDER BY tuple()
SETTINGS max_suspicious_broken_parts = 500;
```

Change the settings for a particular table using `ALTER TABLE ... MODIFY SETTING`:

```sql theme={null}
ALTER TABLE tab MODIFY SETTING max_suspicious_broken_parts = 100;

-- reset to global default (value from system.merge_tree_settings)
ALTER TABLE tab RESET SETTING max_suspicious_broken_parts;
```

MergeTree settings

adaptive\_write\_buffer\_initial\_size

Initial size of an adaptive write buffer

add\_implicit\_sign\_column\_constraint\_for\_collapsing\_engine

If true, adds an implicit constraint for the `sign` column of a CollapsingMergeTree or VersionedCollapsingMergeTree table to allow only valid values (`1` and `-1`).

add\_minmax\_index\_for\_block\_number\_column

When enabled, an implicit min-max (skipping) index is added for the persistent virtual column `_block_number`. Requires `enable_block_number_column = 1` to take effect. The index is built only during merges, not during inserts: at insert time the block number is provisional and would index a constant.

add\_minmax\_index\_for\_block\_offset\_column

When enabled, an implicit min-max (skipping) index is added for the persistent virtual column `_block_offset`. Requires `enable_block_offset_column = 1` to take effect. The index is built only during merges, not during inserts.

add\_minmax\_index\_for\_numeric\_columns

When enabled, min-max (skipping) indices are added for all numeric columns of the table.

add\_minmax\_index\_for\_string\_columns

When enabled, min-max (skipping) indices are added for all string columns of the table.

add\_minmax\_index\_for\_temporal\_columns

When enabled, min-max (skipping) indices are added for all Date, Date32, Time, Time64, DateTime and DateTime64 columns of the table

allow\_coalescing\_columns\_in\_partition\_or\_order\_key

When enabled, allows coalescing columns in a CoalescingMergeTree table to be used in the partition or sorting key.

allow\_commit\_order\_projection

Enables commit-order projections that store `_block_number` and `_block_offset` virtual columns, preserving original insertion order through merges. Requires `enable_block_number_column` and `enable_block_offset_column` to be enabled.

allow\_experimental\_replacing\_merge\_with\_cleanup

Allow experimental CLEANUP merges for ReplacingMergeTree with `is_deleted` column. When enabled, allows using `OPTIMIZE ... FINAL CLEANUP` to manually merge all parts in a partition down to a single part and remove any deleted rows. Also allows enabling such merges to happen automatically in the background with settings `min_age_to_force_merge_seconds`, `min_age_to_force_merge_on_partition_only` and `enable_replacing_merge_with_cleanup_for_min_age_to_force_merge`.

allow\_experimental\_reverse\_key

Enables support for descending sort order in MergeTree sorting keys. This setting is particularly useful for time series analysis and Top-N queries, allowing data to be stored in reverse chronological order to optimize query performance.

With `allow_experimental_reverse_key` enabled, you can define descending sort orders within the `ORDER BY` clause of a MergeTree table. This enables the use of more efficient `ReadInOrder` optimizations instead of `ReadInReverseOrder` for descending queries.

**Example**

```sql theme={null}
CREATE TABLE example
(
    time DateTime,
    key Int32,
    value String
)
ENGINE = MergeTree
ORDER BY (time DESC, key) -- Descending order on 'time' field
SETTINGS allow_experimental_reverse_key = 1;

-- 'key' is Int32, so it is compared with an integer literal
SELECT * FROM example WHERE key = 42 ORDER BY time DESC LIMIT 10;
```

By using `ORDER BY time DESC` in the query, `ReadInOrder` is applied.

**Default Value:** false

allow\_floating\_point\_partition\_key

Allows a floating-point number to be used as a partition key.

Possible values:

* `0`: Floating-point partition key not allowed.
* `1`: Floating-point partition key allowed.

allow\_nullable\_key

Allow Nullable types as primary keys.

allow\_part\_offset\_column\_in\_projections

Allows using the `_part_offset` column in the SELECT query of a projection.

allow\_reduce\_blocking\_parts\_task

Background task that reduces blocking parts for shared merge tree tables. Only available in ClickHouse Cloud.

allow\_remote\_fs\_zero\_copy\_replication

Don't use this setting in production, because it is not ready.

allow\_summing\_columns\_in\_partition\_or\_order\_key

When enabled, allows summing columns in a SummingMergeTree table to be used in the partition or sorting key.

allow\_suspicious\_indices

When disabled, rejects primary/secondary indexes and sorting keys with identical expressions.

allow\_vertical\_merges\_from\_compact\_to\_wide\_parts

Allows vertical merges from compact to wide parts. This setting must have the same value on all replicas.

alter\_column\_secondary\_index\_mode

Configures whether to allow `ALTER` commands that modify columns covered by secondary indices, and what action to take if they are allowed. By default, such `ALTER` commands are allowed and the indices are rebuilt.

Possible values:

* `rebuild` (default): Rebuilds any secondary indices affected by the column in the `ALTER` command.
* `throw`: Prevents any `ALTER` of columns covered by **explicit** secondary indices by throwing an exception. Implicit indices are excluded from this restriction and will be rebuilt.
* `drop`: Drop the dependent secondary indices. The new parts won't have the indices, requiring `MATERIALIZE INDEX` to recreate them.
* `compatibility`: Matches the original behaviour: `throw` on `ALTER ... MODIFY COLUMN` and `rebuild` on `ALTER ... UPDATE/DELETE`.
* `ignore`: Intended for expert usage. It will leave the indices in an inconsistent state, allowing incorrect query results.

always\_fetch\_merged\_part

If true, this replica never merges parts and always downloads merged parts from other replicas.

Possible values:

* `true`
* `false`

always\_use\_copy\_instead\_of\_hardlinks

Always copy data instead of hardlinking during mutations/replaces/detaches and so on.

apply\_patches\_on\_merge

If true, patch parts are applied on merges.

assign\_part\_uuids

When enabled, a unique part identifier will be assigned for every new part. Before enabling, check that all replicas support UUID version 4.

async\_block\_ids\_cache\_update\_wait\_ms

How long each insert iteration waits for the `async_block_ids_cache` update.

async\_insert

If true, data from an INSERT query is stored in a queue and later flushed to the table in the background.

auto\_statistics\_types

Comma-separated list of statistics types to calculate automatically on all suitable columns. Supported statistics types: tdigest, countmin, minmax, nullcount, uniq.

background\_task\_preferred\_step\_execution\_time\_ms

Target execution time for one step of a merge or mutation. Can be exceeded if a single step takes longer.

cache\_populated\_by\_fetch

This setting applies only to ClickHouse Cloud.

When `cache_populated_by_fetch` is disabled (the default setting), new data parts are loaded into the filesystem cache only when a query is run that requires those parts. If enabled, `cache_populated_by_fetch` will instead cause all nodes to load new data parts from storage into their filesystem cache without requiring a query to trigger such an action.

**See Also**

* [ignore\_cold\_parts\_seconds](/reference/settings/session-settings#ignore_cold_parts_seconds)
* [prefer\_warmed\_unmerged\_parts\_seconds](/reference/settings/session-settings#prefer_warmed_unmerged_parts_seconds)
* [cache\_warmer\_threads](/reference/settings/session-settings#cache_warmer_threads)

cache\_populated\_by\_fetch\_filename\_regexp

This setting applies only to ClickHouse Cloud. If not empty, only files that match this regex will be prewarmed into the cache after fetch (if `cache_populated_by_fetch` is enabled).

check\_delay\_period

Obsolete setting, does nothing.

check\_sample\_column\_is\_correct

Enables the check at table creation that the data type of a column for sampling or sampling expression is correct. The data type must be one of the unsigned [integer types](/reference/data-types/int-uint): `UInt8`, `UInt16`, `UInt32`, `UInt64`.

Possible values:

* `true`: The check is enabled.
* `false`: The check is disabled at table creation.

Default value: `true`.

By default, the ClickHouse server checks at table creation the data type of a column for sampling or sampling expression. If you already have tables with an incorrect sampling expression and do not want the server to raise an exception during startup, set `check_sample_column_is_correct` to `false`.

clean\_deleted\_rows

Obsolete setting, does nothing.

cleanup\_delay\_period

Minimum period to clean old queue logs, block hashes and parts.

cleanup\_delay\_period\_random\_add

Add a uniformly distributed value from 0 to x seconds to `cleanup_delay_period` to avoid the thundering herd effect and subsequent DoS of ZooKeeper in case of a very large number of tables.
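
The jitter described above can be illustrated with a small sketch. It only models the arithmetic (base period plus a uniform offset); the function name and the sample values of 30 and 10 seconds are hypothetical, and this is not the server's actual implementation.

```python
import random


def jittered_cleanup_period(cleanup_delay_period: float,
                            random_add: float,
                            rng: random.Random) -> float:
    """Return the base period plus a uniform offset in [0, random_add] seconds."""
    return cleanup_delay_period + rng.uniform(0.0, random_add)


# Seeded only to make the sketch reproducible.
rng = random.Random(42)

# Hypothetical sample values: 30 s base period, up to 10 s of random add.
periods = [jittered_cleanup_period(30.0, 10.0, rng) for _ in range(5)]

# Each table ends up with a slightly different period, so cleanups
# are spread out instead of hitting ZooKeeper at the same moment.
print(periods)
```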

cleanup\_thread\_preferred\_points\_per\_iteration

Preferred batch size for background cleanup (points are abstract but 1 point is approximately equivalent to 1 inserted block).

cleanup\_threads

Obsolete setting, does nothing.

clone\_replica\_zookeeper\_create\_get\_part\_batch\_size

Batch size for ZooKeeper multi-create get-part requests when cloning replica.

columns\_and\_secondary\_indices\_sizes\_lazy\_calculation

Calculate columns and secondary indices sizes lazily on first request instead of on table initialization.

columns\_to\_prewarm\_mark\_cache

List of columns to prewarm mark cache for (if enabled). Empty means all columns

compact\_parts\_max\_bytes\_to\_buffer

Only available in ClickHouse Cloud. Maximal number of bytes to write in a single stripe in compact parts

compact\_parts\_max\_granules\_to\_buffer

Only available in ClickHouse Cloud. Maximal number of granules to write in a single stripe in compact parts

compact\_parts\_merge\_max\_bytes\_to\_prefetch\_part

Only available in ClickHouse Cloud. Maximal size of a compact part that is read into memory as a whole during a merge.

compatibility\_allow\_sampling\_expression\_not\_in\_primary\_key

Allows creating a table with a sampling expression that is not in the primary key. This is needed only to temporarily allow the server to run with incorrectly defined tables, for backward compatibility.

compress\_marks

Enables compression of marks, which reduces mark file size and speeds up network transmission.

compress\_per\_column\_in\_compact\_parts

Controls the physical layout of Compact parts. If true (default), each column in a granule starts a new compressed block, allowing ClickHouse to skip reading unnecessary columns from disk. If false, all columns within a granule are packed into the same compressed block, improving compression ratio but requiring more data to be decompressed during reads. This is beneficial for workloads that always read all columns (e.g. projections).

compress\_primary\_key

Enables compression of the primary key, which reduces primary key file size and speeds up network transmission.

concurrent\_part\_removal\_threshold

Activate concurrent part removal (see 'max\_part\_removal\_threads') only if the number of inactive data parts is at least this.

concurrent\_part\_removal\_threshold\_for\_remote\_disk

Same as `concurrent_part_removal_threshold`, but used when at least one part being removed is stored on a remote disk. The default is lower because each part removal on remote storage typically requires a network round-trip (e.g. one HTTP `DELETE` per part on object storage), so a serial removal of even 100 parts can stall a `DROP TABLE` for tens of seconds.

deduplicate\_merge\_projection\_mode

Controls whether projections can be created for tables with a non-classic MergeTree engine, that is, an engine other than (Replicated, Shared) MergeTree. The `ignore` option exists purely for compatibility and may result in incorrect answers. If projections are allowed, the setting also selects the action taken when projections are merged: either `drop` or `rebuild`. Classic MergeTree ignores this setting. It also controls `OPTIMIZE DEDUPLICATE`, where it has an effect on all MergeTree family members. Like the option `lightweight_mutation_projection_mode`, it works at the part level.

Possible values:

* `ignore`
* `throw`
* `drop`
* `rebuild`

default\_compression\_codec

Specifies the default compression codec to be used if none is defined for a particular column in the table declaration.

Compression codec selecting order for a column:

1. Compression codec defined for the column in the table declaration
2. Compression codec defined in `default_compression_codec` (this setting)
3. Default compression codec defined in `compression` settings

Default value: an empty string (not defined).
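
The selection order above can be read as a simple fallback chain. The sketch below is only an illustration of that order: the function name and the codec strings used in the usage lines are hypothetical sample values, and an empty string stands for "not defined".

```python
from typing import Optional


def resolve_codec(column_codec: Optional[str],
                  default_compression_codec: str,
                  server_compression_codec: str) -> str:
    """Pick a codec following the documented order of precedence."""
    # 1. A codec declared for the column in the table declaration wins.
    if column_codec:
        return column_codec
    # 2. Otherwise use default_compression_codec, unless it is empty (not defined).
    if default_compression_codec:
        return default_compression_codec
    # 3. Otherwise fall back to the codec from the server `compression` settings.
    return server_compression_codec


# Hypothetical sample values, for illustration only.
print(resolve_codec("ZSTD(3)", "LZ4HC", "LZ4"))  # column codec is used
print(resolve_codec(None, "LZ4HC", "LZ4"))       # table-level default is used
print(resolve_codec(None, "", "LZ4"))            # server-level default is used
```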

detach\_not\_byte\_identical\_parts

Enables or disables detaching a data part on a replica after a merge or a mutation, if it is not byte-identical to data parts on other replicas. If disabled, the data part is removed. Activate this setting if you want to analyze such parts later.

The setting is applicable to `MergeTree` tables with enabled [data replication](/reference/engines/table-engines/mergetree-family/replacingmergetree).

Possible values:

* `0`: Parts are removed.
* `1`: Parts are detached.

detach\_old\_local\_parts\_when\_cloning\_replica

Do not remove old local parts when repairing a lost replica.

Possible values:

* `true`
* `false`

disable\_detach\_partition\_for\_zero\_copy\_replication

Disable DETACH PARTITION query for zero copy replication.

disable\_fetch\_partition\_for\_zero\_copy\_replication

Disable FETCH PARTITION query for zero copy replication.

disable\_freeze\_partition\_for\_zero\_copy\_replication

Disable FREEZE PARTITION query for zero copy replication.

disk

Name of storage disk. Can be specified instead of storage policy.

distributed\_index\_analysis\_min\_indexes\_bytes\_to\_activate

Minimal index sizes (data skipping and primary key) on disk (but uncompressed) to activate distributed index analysis.

distributed\_index\_analysis\_min\_parts\_to\_activate

Minimal number of parts to activate distributed index analysis.

dynamic\_serialization\_version

Serialization version for Dynamic data type. Required for compatibility.

Possible values:

* `v1`
* `v2`
* `v3`

enable\_block\_number\_column

Enable persisting column \_block\_number for each row.

enable\_block\_offset\_column

Persists virtual column `_block_offset` on merges.

enable\_index\_granularity\_compression

Compress in-memory values of index granularity if possible.

enable\_max\_bytes\_limit\_for\_min\_age\_to\_force\_merge

Whether the settings `min_age_to_force_merge_seconds` and `min_age_to_force_merge_on_partition_only` should respect the setting `max_bytes_to_merge_at_max_space_in_pool`.

Possible values:

* `true`
* `false`

enable\_mixed\_granularity\_parts

Enables or disables transitioning to control the granule size with the `index_granularity_bytes` setting. Before version 19.11, there was only the `index_granularity` setting for restricting granule size. The `index_granularity_bytes` setting improves ClickHouse performance when selecting data from tables with big rows (tens and hundreds of megabytes). If you have tables with big rows, you can enable this setting for the tables to improve the efficiency of `SELECT` queries.

enable\_replacing\_merge\_with\_cleanup\_for\_min\_age\_to\_force\_merge

Whether to use CLEANUP merges for ReplacingMergeTree when merging partitions down to a single part. Requires `allow_experimental_replacing_merge_with_cleanup`, `min_age_to_force_merge_seconds` and `min_age_to_force_merge_on_partition_only` to be enabled.

Possible values:

* `true`
* `false`

enable\_the\_endpoint\_id\_with\_zookeeper\_name\_prefix

Enable the endpoint id with zookeeper name prefix for the replicated merge tree table.

enable\_vertical\_merge\_algorithm

Enable usage of Vertical merge algorithm.

enforce\_index\_structure\_match\_on\_partition\_manipulation

If this setting is enabled for destination table of a partition manipulation query (`ATTACH/MOVE/REPLACE PARTITION`), the indices and projections must be identical between the source and destination tables. Otherwise, the destination table can have a superset of the source table's indices and projections.

escape\_index\_filenames

Prior to 26.1 we didn't escape special symbols in filenames created for secondary indices, which could lead to issues with some characters in index names producing broken parts. This is added purely for compatibility reasons. It should not be changed unless you are reading old parts with indices using non-ascii characters in their names.

escape\_variant\_subcolumn\_filenames

Escape special symbols in filenames created for subcolumns of Variant data type in Wide parts of MergeTree table. Needed for compatibility.

exclude\_deleted\_rows\_for\_part\_size\_in\_merge

If enabled, the estimated actual size of data parts (i.e., excluding those rows that have been deleted through `DELETE FROM`) will be used when selecting parts to merge. Note that this behavior is only triggered for data parts affected by `DELETE FROM` executed after this setting is enabled.

Possible values:

* `true`
* `false`

**See Also**

* [load\_existing\_rows\_count\_for\_old\_parts](#load_existing_rows_count_for_old_parts) setting

exclude\_materialize\_skip\_indexes\_on\_merge

Excludes the provided comma-delimited list of skip indexes from being built and stored during merges. Has no effect if [materialize\_skip\_indexes\_on\_merge](#materialize_skip_indexes_on_merge) is false.

The excluded skip indexes will still be built and stored by an explicit [MATERIALIZE INDEX](/reference/statements/alter/skipping-index#materialize-index) query or during INSERTs depending on the [materialize\_skip\_indexes\_on\_insert](/reference/settings/session-settings#materialize_skip_indexes_on_insert) session setting.

Example:

```sql theme={null}
CREATE TABLE tab
(
    a UInt64,
    b UInt64,
    INDEX idx_a a TYPE minmax,
    INDEX idx_b b TYPE set(3)
)
ENGINE = MergeTree
ORDER BY tuple()
SETTINGS exclude_materialize_skip_indexes_on_merge = 'idx_a';

INSERT INTO tab SELECT number, number / 50 FROM numbers(100); -- setting has no effect on INSERTs

-- idx_a will be excluded from update during background or explicit merge via OPTIMIZE TABLE FINAL

-- can exclude multiple indexes by providing a list
ALTER TABLE tab MODIFY SETTING exclude_materialize_skip_indexes_on_merge = 'idx_a, idx_b';

-- default setting, no indexes excluded from being updated during merge
ALTER TABLE tab MODIFY SETTING exclude_materialize_skip_indexes_on_merge = '';
```

execute\_merges\_on\_single\_replica\_time\_threshold

When this setting has a value greater than zero, only a single replica starts the merge immediately, and other replicas wait up to that amount of time to download the result instead of doing merges locally. If the chosen replica doesn't finish the merge during that amount of time, the other replicas fall back to standard behavior.

Possible values:

* Any positive integer.

fault\_probability\_after\_part\_commit

For testing. Do not change it.

fault\_probability\_before\_part\_commit

For testing. Do not change it.

finished\_mutations\_to\_keep

How many records about completed mutations to keep. If zero, all of them are kept.

force\_read\_through\_cache\_for\_merges

Force read-through filesystem cache for merges

fsync\_after\_insert

Do fsync for every inserted part. Significantly decreases performance of inserts, not recommended to use with wide parts.

fsync\_part\_directory

Do fsync for part directory after all part operations (writes, renames, etc.).

in\_memory\_parts\_enable\_wal

Obsolete setting, does nothing.

in\_memory\_parts\_insert\_sync

Obsolete setting, does nothing.

inactive\_parts\_to\_delay\_insert

If the number of inactive parts in a single partition in the table exceeds the `inactive_parts_to_delay_insert` value, an `INSERT` is artificially slowed down. It is useful when a server fails to clean up parts quickly enough.

Possible values:

* Any positive integer.

inactive\_parts\_to\_throw\_insert

If the number of inactive parts in a single partition is more than the `inactive_parts_to_throw_insert` value, `INSERT` is interrupted with the following error:

> "Too many inactive parts (N). Parts cleaning are processing significantly slower than inserts" exception.

Possible values:

* Any positive integer.

index\_granularity

Maximum number of data rows between the marks of an index, i.e. how many rows correspond to one primary key value.

index\_granularity\_bytes

Maximum size of data granules in bytes. To restrict the granule size only by number of rows, set to `0` (not recommended).

initialization\_retry\_period

Retry period for table initialization, in seconds.

kill\_delay\_period

Obsolete setting, does nothing.

kill\_delay\_period\_random\_add

Obsolete setting, does nothing.

kill\_threads

Obsolete setting, does nothing.

lightweight\_mutation\_projection\_mode

By default, lightweight `DELETE` does not work for tables with projections, because rows in a projection may be affected by a `DELETE` operation. The default value is therefore `throw`. This option can change that behavior: with the value `drop` or `rebuild`, deletes work with projections. `drop` deletes the projection, so the current query may be fast, but future queries may be slower because no projection is attached. `rebuild` rebuilds the projection, which may slow down the current query but speed up future queries. Both options work at the part level only, which means projections in parts that are not touched stay intact instead of being dropped or rebuilt.

Possible values:

* `throw`
* `drop`
* `rebuild`

load\_existing\_rows\_count\_for\_old\_parts

If enabled along with [exclude\_deleted\_rows\_for\_part\_size\_in\_merge](#exclude_deleted_rows_for_part_size_in_merge), the deleted rows count for existing data parts is calculated during table startup. Note that this may slow down table loading at startup.

Possible values:

* `true`
* `false`

**See Also**

* [exclude\_deleted\_rows\_for\_part\_size\_in\_merge](#exclude_deleted_rows_for_part_size_in_merge) setting

lock\_acquire\_timeout\_for\_background\_operations

How many seconds background operations such as merges and mutations wait to acquire table locks before failing.

map\_buckets\_coefficient

The coefficient used in `sqrt` and `linear` [map\_buckets\_strategy](#map_buckets_strategy) to calculate the number of buckets from the average map size. For `sqrt` strategy: `round(map_buckets_coefficient * sqrt(avg_map_size))`. For `linear` strategy: `round(map_buckets_coefficient * avg_map_size)`. Ignored when `map_buckets_strategy` is `constant`.

map\_buckets\_min\_avg\_size

The minimum average map size (number of keys per row) required to apply `with_buckets` serialization. If the average map size is less than this value, a single bucket is used regardless of other bucket settings. A value of `0` disables the threshold and always applies the bucketing strategy. This setting is useful to avoid the overhead of bucketed serialization for small maps where the benefit is negligible.

map\_buckets\_strategy

Controls the strategy for choosing the number of buckets in `with_buckets` `Map` serialization based on the average map size.

Possible values:

* `constant`: Always use [max\_buckets\_in\_map](#max_buckets_in_map) as the number of buckets, regardless of the average map size.
* `sqrt`: Use `round(map_buckets_coefficient * sqrt(avg_map_size))` as the number of buckets, clamped to `[1, max_buckets_in_map]`.
* `linear`: Use `round(map_buckets_coefficient * avg_map_size)` as the number of buckets, clamped to `[1, max_buckets_in_map]`.
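
To make the three strategies concrete, here is a minimal sketch of the arithmetic, including the `map_buckets_min_avg_size` threshold described earlier. The function names are hypothetical, and the rounding mode is an assumption (half away from zero); the text only says `round`, so results for exact `.5` values may differ from the server.

```python
import math


def round_half_up(x: float) -> int:
    # Assumption: halves round up for non-negative x; the docs only say `round`.
    return int(math.floor(x + 0.5))


def number_of_buckets(strategy: str,
                      avg_map_size: float,
                      max_buckets_in_map: int,
                      map_buckets_coefficient: float = 1.0,
                      map_buckets_min_avg_size: float = 0) -> int:
    """Sketch of the bucket count for `with_buckets` Map serialization."""
    # Below the minimum average size a single bucket is used,
    # regardless of other bucket settings (0 disables the threshold).
    if map_buckets_min_avg_size > 0 and avg_map_size < map_buckets_min_avg_size:
        return 1
    if strategy == "constant":
        return max_buckets_in_map
    if strategy == "sqrt":
        raw = round_half_up(map_buckets_coefficient * math.sqrt(avg_map_size))
    elif strategy == "linear":
        raw = round_half_up(map_buckets_coefficient * avg_map_size)
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    # Clamp to [1, max_buckets_in_map].
    return max(1, min(raw, max_buckets_in_map))


# Sample inputs chosen for illustration only.
print(number_of_buckets("sqrt", 100, 32))         # sqrt(100) = 10 buckets
print(number_of_buckets("linear", 100, 32))       # 100 is clamped to 32
print(number_of_buckets("linear", 10, 32, 0.5))   # 0.5 * 10 = 5 buckets
```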

map\_serialization\_version

Controls the serialization method used for `Map` columns.

Possible values:

* `basic`: Use the standard serialization for `Map`.
* `with_buckets`: Split keys into buckets during serialization. Using buckets improves reading individual keys from the Map.

The number of buckets in `with_buckets` serialization is determined by [max\_buckets\_in\_map](#max_buckets_in_map) and [map\_buckets\_strategy](#map_buckets_strategy).

map\_serialization\_version\_for\_zero\_level\_parts

This setting allows specifying a different serialization version of `Map` columns for zero-level parts that are created during inserts. It can be useful to keep `basic` serialization for zero-level parts to avoid performance degradation during inserts, while using `with_buckets` for merged parts.

marks\_compress\_block\_size

Mark compress block size, the actual size of the block to compress.

marks\_compression\_codec

Compression encoding used by marks. Marks are small enough and cached, so the default compression is ZSTD(3).

materialize\_skip\_indexes\_on\_merge

When enabled, merges build and store skip indices for new parts. Otherwise they can be created/stored by explicit [MATERIALIZE INDEX](/reference/statements/alter/skipping-index#materialize-index) or [during INSERTs](/reference/settings/session-settings#materialize_skip_indexes_on_insert). See also [exclude\_materialize\_skip\_indexes\_on\_merge](#exclude_materialize_skip_indexes_on_merge) for more fine-grained control.

materialize\_statistics\_on\_merge

When enabled, merges will build and store statistics for new parts. Otherwise they can be created/stored by explicit [MATERIALIZE STATISTICS](/reference/statements/alter/statistics) or [during INSERTs](/reference/settings/session-settings#materialize_statistics_on_insert)

materialize\_ttl\_recalculate\_only

Only recalculate TTL info when executing MATERIALIZE TTL.

max\_avg\_part\_size\_for\_too\_many\_parts

The 'too many parts' check according to `parts_to_delay_insert` and `parts_to_throw_insert` will be active only if the average part size (in the relevant partition) is not larger than the specified threshold. If it is larger than the specified threshold, the INSERTs will be neither delayed nor rejected. This makes it possible to have hundreds of terabytes in a single table on a single server if the parts are successfully merged to larger parts. This does not affect the thresholds on inactive parts or total parts.
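
The rule above can be sketched as a small decision function. This is a simplified model under stated assumptions: the function name and sample numbers are hypothetical, the comparisons against `parts_to_delay_insert` and `parts_to_throw_insert` are modeled as strict "exceeds" (boundary behavior in the server may differ), and limits on inactive or total parts are deliberately left out.

```python
def too_many_parts_action(active_parts: int,
                          avg_part_size_bytes: int,
                          parts_to_delay_insert: int,
                          parts_to_throw_insert: int,
                          max_avg_part_size_for_too_many_parts: int) -> str:
    """Return 'proceed', 'delay' or 'throw' for an INSERT into one partition."""
    # The check is active only while the average part size
    # is not larger than the threshold.
    if avg_part_size_bytes > max_avg_part_size_for_too_many_parts:
        return "proceed"
    # Assumption: strict comparisons, following the word "exceeds" in the docs.
    if active_parts > parts_to_throw_insert:
        return "throw"
    if active_parts > parts_to_delay_insert:
        return "delay"
    return "proceed"


GiB = 1024 ** 3

# Hypothetical sample values: thresholds of 1000/3000 parts and 1 GiB average size.
print(too_many_parts_action(5000, 10 * GiB, 1000, 3000, 1 * GiB))  # large parts: proceed
print(too_many_parts_action(5000, 1024, 1000, 3000, 1 * GiB))      # many small parts: throw
print(too_many_parts_action(2000, 1024, 1000, 3000, 1 * GiB))      # between thresholds: delay
```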

max\_buckets\_in\_map

The maximum number of buckets for `Map` serialization. Works with `with_buckets` `Map` serialization. The actual number of buckets is determined by [map\_buckets\_strategy](#map_buckets_strategy). The maximum allowed value is 256.

max\_bytes\_to\_merge\_at\_max\_space\_in\_pool

The maximum total parts size (in bytes) to be merged into one part, if there are enough resources available. Corresponds roughly to the maximum possible part size created by an automatic background merge. A value of `0` means merges will be disabled.

Possible values:

* Any non-negative integer.

The merge scheduler periodically analyzes the sizes and number of parts in partitions, and if there are enough free resources in the pool, it starts background merges. Merges occur until the total size of the source parts is larger than `max_bytes_to_merge_at_max_space_in_pool`.

Merges initiated by [OPTIMIZE FINAL](/reference/statements/optimize) ignore `max_bytes_to_merge_at_max_space_in_pool` (only the free disk space is taken into account).

max\_bytes\_to\_merge\_at\_min\_space\_in\_pool

The maximum total part size (in bytes) to be merged into one part, with the minimum available resources in the background pool.

Possible values:

* Any positive integer.

`max_bytes_to_merge_at_min_space_in_pool` defines the maximum total size of parts which can be merged despite the lack of available disk space (in pool). This is necessary to reduce the number of small parts and the chance of `Too many parts` errors. Merges book disk space by doubling the total merged parts sizes. Thus, with a small amount of free disk space, a situation may occur in which there is free space, but this space is already booked by ongoing large merges, so other merges are unable to start, and the number of small parts grows with every insert.

max\_cleanup\_delay\_period

Maximum period to clean old queue logs, block hashes and parts.

max\_compress\_block\_size

The maximum size of blocks of uncompressed data before compressing for writing to a table. You can also specify this setting in the global settings (see [max\_compress\_block\_size](/reference/settings/merge-tree-settings#max_compress_block_size) setting). The value specified when the table is created overrides the global value for this setting.

max\_concurrent\_queries

Max number of concurrently executed queries related to the MergeTree table. Queries will still be limited by other `max_concurrent_queries` settings.

Possible values:

* Positive integer.
* `0`: No limit.

Default value: `0` (no limit).

**Example**

```xml theme={null}
<max_concurrent_queries>50</max_concurrent_queries>
```

max\_delay\_to\_insert

The value in seconds, which is used to calculate the `INSERT` delay, if the number of active parts in a single partition exceeds the [parts\_to\_delay\_insert](#parts_to_delay_insert) value.

Possible values:

* Any positive integer.

The delay (in milliseconds) for `INSERT` is calculated by the formula:

```code theme={null}
max_k = parts_to_throw_insert - parts_to_delay_insert
k = 1 + parts_count_in_partition - parts_to_delay_insert
delay_milliseconds = pow(max_delay_to_insert * 1000, k / max_k)
```

For example, if a partition has 299 active parts and `parts_to_throw_insert = 300`, `parts_to_delay_insert = 150`, `max_delay_to_insert = 1`, `INSERT` is delayed for `pow( 1 * 1000, (1 + 299 - 150) / (300 - 150) ) = 1000` milliseconds.

Starting from version 23.1, the formula has been changed to:

```code theme={null}
allowed_parts_over_threshold = parts_to_throw_insert - parts_to_delay_insert
parts_over_threshold = parts_count_in_partition - parts_to_delay_insert + 1
delay_milliseconds = max(min_delay_to_insert_ms, (max_delay_to_insert * 1000) * parts_over_threshold / allowed_parts_over_threshold)
```

For example, if a partition has 224 active parts and `parts_to_throw_insert = 300`, `parts_to_delay_insert = 150`, `max_delay_to_insert = 1`, `min_delay_to_insert_ms = 10`, `INSERT` is delayed for `max( 10, 1 * 1000 * (224 - 150 + 1) / (300 - 150) ) = 500` milliseconds.
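
Both formulas above are plain arithmetic, so they can be checked with a short sketch. The function names are hypothetical; the parameter names mirror the settings, and the inputs are the ones from the two worked examples.

```python
def insert_delay_ms_before_23_1(parts_count_in_partition: int,
                                parts_to_delay_insert: int,
                                parts_to_throw_insert: int,
                                max_delay_to_insert: float) -> float:
    """Exponential delay formula used before version 23.1."""
    max_k = parts_to_throw_insert - parts_to_delay_insert
    k = 1 + parts_count_in_partition - parts_to_delay_insert
    return pow(max_delay_to_insert * 1000, k / max_k)


def insert_delay_ms(parts_count_in_partition: int,
                    parts_to_delay_insert: int,
                    parts_to_throw_insert: int,
                    max_delay_to_insert: float,
                    min_delay_to_insert_ms: float) -> float:
    """Linear delay formula used starting from version 23.1."""
    allowed_parts_over_threshold = parts_to_throw_insert - parts_to_delay_insert
    parts_over_threshold = parts_count_in_partition - parts_to_delay_insert + 1
    return max(
        min_delay_to_insert_ms,
        (max_delay_to_insert * 1000) * parts_over_threshold / allowed_parts_over_threshold,
    )


# Inputs taken from the two examples in the text.
print(insert_delay_ms_before_23_1(299, 150, 300, 1))  # 1000.0
print(insert_delay_ms(224, 150, 300, 1, 10))          # 500.0
```

The newer formula grows linearly with the number of parts over the threshold, whereas the older one grows exponentially and stays small until the partition is close to `parts_to_throw_insert`.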

max\_delay\_to\_mutate\_ms

Max delay of mutating MergeTree table in milliseconds, if there are a lot of unfinished mutations

max\_digestion\_size\_per\_segment

Obsolete setting, does nothing.

max\_file\_name\_length

The maximal length of the file name to keep it as is without hashing. Takes effect only if setting `replace_long_file_name_to_hash` is enabled. The value of this setting does not include the length of file extension. So, it is recommended to set it below the maximum filename length (usually 255 bytes) with some gap to avoid filesystem errors.

max\_files\_to\_modify\_in\_alter\_columns

Do not apply `ALTER` if the number of files to modify (delete or add) is greater than this setting.

Possible values:

* Any positive integer.

Default value: 75

max\_files\_to\_remove\_in\_alter\_columns

Do not apply ALTER, if the number of files for deletion is greater than this setting. Possible values: * Any positive integer.

max\_merge\_delayed\_streams\_for\_parallel\_write

The maximum number of streams (columns) that can be flushed in parallel (analog of max\_insert\_delayed\_streams\_for\_parallel\_write for merges). Works only for Vertical merges.

max\_merge\_selecting\_sleep\_ms

Maximum time to wait before trying to select parts to merge again after no parts were selected. A lower setting triggers selecting tasks in background\_schedule\_pool more frequently, which results in a large number of requests to ZooKeeper in large-scale clusters.

max\_number\_of\_merges\_with\_ttl\_in\_pool

When the pool contains more than the specified number of merges with TTL entries, do not assign new merges with TTL. This leaves free threads for regular merges and helps avoid "Too many parts" errors.

max\_number\_of\_mutations\_for\_replica

Limit the number of part mutations per replica to the specified amount. Zero means no limit on the number of mutations per replica (the execution can still be constrained by other settings).

max\_part\_loading\_threads

Obsolete setting, does nothing.

max\_part\_removal\_threads

Obsolete setting, does nothing.

max\_partitions\_to\_read

Limits the maximum number of partitions that can be accessed in one query. The setting value specified when the table is created can be overridden via query-level setting. Possible values: * Any positive integer. You can also specify a query complexity setting [max\_partitions\_to\_read](/reference/settings/session-settings#max_partitions_to_read) at a query / session / profile level.

max\_parts\_in\_total

If the total number of active parts in all partitions of a table exceeds the `max_parts_in_total` value, `INSERT` is interrupted with the `Too many parts (N)` exception.

Possible values:

* Any positive integer.

A large number of parts in a table reduces the performance of ClickHouse queries and increases ClickHouse boot time. Most often this is a consequence of an incorrect design (for example, a partitioning strategy that produces partitions that are too small).

max\_parts\_to\_merge\_at\_once

Max amount of parts which can be merged at once (0 - disabled). Doesn't affect OPTIMIZE FINAL query.

max\_postpone\_time\_for\_failed\_mutations\_ms

The maximum postpone time for failed mutations.

max\_postpone\_time\_for\_failed\_replicated\_fetches\_ms

The maximum postpone time for failed replicated fetches.

max\_postpone\_time\_for\_failed\_replicated\_merges\_ms

The maximum postpone time for failed replicated merges.

max\_postpone\_time\_for\_failed\_replicated\_tasks\_ms

The maximum postpone time for failed replicated task. The value is used if the task is not a fetch, merge or mutation.

max\_projections

The maximum number of merge tree projections.

max\_replicated\_fetches\_network\_bandwidth

Limits the maximum speed of data exchange over the network in bytes per second for [replicated](/reference/engines/table-engines/mergetree-family/replication) fetches. This setting is applied to a particular table, unlike the [`max_replicated_fetches_network_bandwidth_for_server`](/reference/settings/merge-tree-settings#max_replicated_fetches_network_bandwidth) setting, which is applied to the server.

You can limit both the server network and the network for a particular table, but for this the value of the table-level setting should be less than the server-level one. Otherwise the server considers only the `max_replicated_fetches_network_bandwidth_for_server` setting.

The limit is enforced only approximately.

Possible values:

* Positive integer.
* `0` — Unlimited.

Default value: `0`.

**Usage**

Can be used to throttle speed when replicating data to add or replace nodes.

max\_replicated\_logs\_to\_keep

How many records may be in the ClickHouse Keeper log if there is an inactive replica. An inactive replica becomes lost when this number is exceeded.

Possible values:

* Any positive integer.

max\_replicated\_merges\_in\_queue

How many tasks of merging and mutating parts are allowed simultaneously in ReplicatedMergeTree queue.

max\_replicated\_merges\_with\_ttl\_in\_queue

How many tasks of merging parts with TTL are allowed simultaneously in ReplicatedMergeTree queue.

max\_replicated\_mutations\_in\_queue

How many tasks of mutating parts are allowed simultaneously in ReplicatedMergeTree queue.

max\_replicated\_sends\_network\_bandwidth

Limits the maximum speed of data exchange over the network in bytes per second for [replicated](/reference/engines/table-engines/mergetree-family/replication) sends. This setting is applied to a particular table, unlike the [`max_replicated_sends_network_bandwidth_for_server`](/reference/settings/merge-tree-settings#max_replicated_sends_network_bandwidth) setting, which is applied to the server.

You can limit both the server network and the network for a particular table, but for this the value of the table-level setting should be less than the server-level one. Otherwise the server considers only the `max_replicated_sends_network_bandwidth_for_server` setting.

The limit is enforced only approximately.

Possible values:

* Positive integer.
* `0` — Unlimited.

**Usage**

Can be used to throttle speed when replicating data to add or replace nodes.

max\_suspicious\_broken\_parts

If the number of broken parts in a single partition exceeds the `max_suspicious_broken_parts` value, automatic deletion is denied. Possible values: * Any positive integer.

max\_suspicious\_broken\_parts\_bytes

Maximum total size of all broken parts. If this size is exceeded, automatic deletion is denied.

Possible values:

* Any positive integer.

max\_uncompressed\_bytes\_in\_patches

The maximum uncompressed size of data in all patch parts in bytes. If amount of data in all patch parts exceeds this value, lightweight updates will be rejected. 0 - unlimited.

merge\_max\_block\_size

The number of rows that are read from the merged parts into memory.

Possible values:

* Any positive integer.

Merge reads rows from parts in blocks of `merge_max_block_size` rows, then merges and writes the result into a new part. The read block is placed in RAM, so `merge_max_block_size` affects the size of the RAM required for the merge. Thus, merges can consume a large amount of RAM for tables with very wide rows (if the average row size is 100 KB, then merging 10 parts requires about 100 KB \* 10 \* 8192 ≈ 8 GB of RAM). By decreasing `merge_max_block_size`, you can reduce the amount of RAM required for a merge, at the cost of a slower merge.
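The RAM estimate above is a simple product, so it can be reproduced with a few lines of Python. This is a rough back-of-the-envelope sketch, not how ClickHouse accounts for memory; the function name is hypothetical, 1 KB is taken as 1000 bytes, and the block size of 8192 rows is the value used in the example above.

```python
def estimate_merge_read_buffer_bytes(avg_row_bytes, parts_in_merge,
                                     merge_max_block_size=8192):
    """Rough upper bound: one block of rows is held in RAM per merged part."""
    return avg_row_bytes * parts_in_merge * merge_max_block_size


# Example from the text: 100 KB rows, 10 parts, blocks of 8192 rows.
estimate = estimate_merge_read_buffer_bytes(100_000, 10)
print(f"{estimate / 1e9:.1f} GB")   # 8.2 GB

# Halving merge_max_block_size halves the estimate.
print(estimate_merge_read_buffer_bytes(100_000, 10, 4096) / 1e9)   # 4.096
```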

merge\_max\_block\_size\_bytes

How many bytes in blocks should be formed for merge operations. By default has the same value as `index_granularity_bytes`.

merge\_max\_bytes\_to\_prewarm\_cache

Only available in ClickHouse Cloud. Maximal size of part (compact or packed) to prewarm cache during merge.

merge\_max\_dynamic\_subcolumns\_in\_compact\_part

The maximum number of dynamic subcolumns that can be created in every column in a Compact data part after a merge. It lets you control the number of dynamic subcolumns in a Compact part regardless of the dynamic parameters specified in the data type. For example, if the table has a column with the `JSON(max_dynamic_paths=1024)` type and `merge_max_dynamic_subcolumns_in_compact_part` is set to 128, then after a merge into a Compact data part the number of dynamic paths in this part is reduced to 128, and only 128 paths are written as dynamic subcolumns.

merge\_max\_dynamic\_subcolumns\_in\_wide\_part

The maximum number of dynamic subcolumns that can be created in every column in a Wide data part after a merge. It lets you reduce the number of files created in a Wide data part regardless of the dynamic parameters specified in the data type. For example, if the table has a column with the `JSON(max_dynamic_paths=1024)` type and `merge_max_dynamic_subcolumns_in_wide_part` is set to 128, then after a merge into a Wide data part the number of dynamic paths in this part is reduced to 128, and only 128 paths are written as dynamic subcolumns.

merge\_selecting\_sleep\_ms

Minimum time to wait before trying to select parts to merge again after no parts were selected. A lower setting triggers selecting tasks in background\_schedule\_pool more frequently, which results in a large number of requests to ZooKeeper in large-scale clusters.

merge\_selecting\_sleep\_slowdown\_factor

The sleep time for the merge selecting task is multiplied by this factor when there is nothing to merge and divided by it when a merge was assigned.
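The interaction between this factor and the two sleep bounds can be pictured as a simple backoff loop. The Python sketch below is an assumption-based illustration, not ClickHouse code: it assumes the sleep time is clamped between `merge_selecting_sleep_ms` and `max_merge_selecting_sleep_ms` (as their "minimum" and "maximum" descriptions suggest), and the function name and numeric values are hypothetical.

```python
def next_merge_selecting_sleep_ms(current_ms, merge_assigned,
                                  min_ms, max_ms, slowdown_factor):
    """Back off when idle, speed up when a merge was assigned.

    min_ms / max_ms stand for merge_selecting_sleep_ms and
    max_merge_selecting_sleep_ms; the clamping is an assumption.
    """
    if merge_assigned:
        candidate = current_ms / slowdown_factor
    else:
        candidate = current_ms * slowdown_factor
    return min(max_ms, max(min_ms, candidate))


# Hypothetical values: bounds of 5 s and 60 s, factor 2.
sleep_ms = 5000
for assigned in (False, False, False, False, True):
    sleep_ms = next_merge_selecting_sleep_ms(sleep_ms, assigned, 5000, 60000, 2.0)
    print(sleep_ms)
```

A factor close to 1 keeps the polling interval nearly constant, while a larger factor lets an idle table back off toward the maximum more quickly.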

merge\_selector\_algorithm

The algorithm to select parts for merges assignment

merge\_selector\_base

Affects write amplification of assigned merges (expert level setting, don't change if you don't understand what it is doing). Works for Simple and StochasticSimple merge selectors

merge\_selector\_blurry\_base\_scale\_factor

Controls when the logic kicks in relative to the number of parts in the partition. The bigger the factor, the more belated the reaction will be.

merge\_selector\_enable\_heuristic\_to\_lower\_max\_parts\_to\_merge\_at\_once

Enables a heuristic for the simple merge selector that lowers the maximum limit for merge choice. As a result, the number of concurrent merges increases, which can help with TOO\_MANY\_PARTS errors, but at the same time this increases write amplification.

merge\_selector\_enable\_heuristic\_to\_remove\_small\_parts\_at\_right

Enable heuristic for selecting parts for merge which removes parts from right side of range, if their size is less than specified ratio (0.01) of sum\_size. Works for Simple and StochasticSimple merge selectors

merge\_selector\_heuristic\_to\_lower\_max\_parts\_to\_merge\_at\_once\_exponent

Controls the exponent value used in the formula that builds the lowering curve. Lowering the exponent lowers merge widths, which increases write amplification. The reverse is also true.

merge\_selector\_window\_size

How many parts to look at once.

merge\_total\_max\_bytes\_to\_prewarm\_cache

Only available in ClickHouse Cloud. Maximal size of parts in total to prewarm cache during merge.

merge\_tree\_clear\_old\_broken\_detached\_parts\_ttl\_timeout\_seconds

Obsolete setting, does nothing.

merge\_tree\_clear\_old\_parts\_interval\_seconds

Sets the interval in seconds for ClickHouse to execute the cleanup of old parts, WALs, and mutations. Possible values: * Any positive integer.

merge\_tree\_clear\_old\_temporary\_directories\_interval\_seconds

Sets the interval in seconds for ClickHouse to execute the cleanup of old temporary directories. Possible values: * Any positive integer.

merge\_tree\_enable\_clear\_old\_broken\_detached

Obsolete setting, does nothing.

merge\_with\_recompression\_ttl\_timeout

Minimum delay in seconds before repeating a merge with recompression TTL.

merge\_with\_ttl\_timeout

Minimum delay in seconds before repeating a merge with delete TTL.

merge\_workload

Used to regulate how resources are utilized and shared between merges and other workloads. Specified value is used as `workload` setting value for background merges of this table. If not specified (empty string), then server setting `merge_workload` is used instead. **See Also** * [Workload Scheduling](/concepts/features/configuration/server-config/workload-scheduling)

min\_absolute\_delay\_to\_close

Minimal absolute delay to close, stop serving requests and not return Ok during status check.

min\_age\_to\_force\_merge\_on\_partition\_only

Whether `min_age_to_force_merge_seconds` should be applied only on the entire partition and not on subset. By default, ignores setting `max_bytes_to_merge_at_max_space_in_pool` (see `enable_max_bytes_limit_for_min_age_to_force_merge`). Possible values: * true, false

min\_age\_to\_force\_merge\_seconds

Merge parts if every part in the range is older than the value of `min_age_to_force_merge_seconds`. By default, ignores setting `max_bytes_to_merge_at_max_space_in_pool` (see `enable_max_bytes_limit_for_min_age_to_force_merge`). Possible values: * Positive integer.

min\_bytes\_for\_compact\_part

Obsolete setting, does nothing.

min\_bytes\_for\_full\_part\_storage

Only available in ClickHouse Cloud. Minimal uncompressed size in bytes to use full type of storage for data part instead of packed

min\_bytes\_for\_wide\_part

Minimum number of bytes in a data part for it to be stored in `Wide` format. Together with `min_rows_for_wide_part`, you can set one, both, or none of these settings.

min\_bytes\_to\_prewarm\_caches

Minimal size (uncompressed bytes) to prewarm mark cache and primary index cache for new parts

min\_bytes\_to\_rebalance\_partition\_over\_jbod

Sets minimal amount of bytes to enable balancing when distributing new big parts over volume disks [JBOD](https://en.wikipedia.org/wiki/Non-RAID_drive_architectures). Possible values: * Positive integer. * `0` — Balancing is disabled. **Usage** The value of the `min_bytes_to_rebalance_partition_over_jbod` setting should not be less than the value of the [max\_bytes\_to\_merge\_at\_max\_space\_in\_pool](/reference/settings/merge-tree-settings#max_bytes_to_merge_at_max_space_in_pool) / 1024. Otherwise, ClickHouse throws an exception.
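The usage constraint above can be checked before applying a new value. The following Python sketch only restates that rule; the function name is hypothetical, and treating `0` (balancing disabled) as always valid is an assumption.

```python
def jbod_rebalance_setting_is_valid(min_bytes_to_rebalance_partition_over_jbod,
                                    max_bytes_to_merge_at_max_space_in_pool):
    """True if the value satisfies the documented lower bound."""
    if min_bytes_to_rebalance_partition_over_jbod == 0:
        # Assumption: 0 disables balancing, so the bound does not apply.
        return True
    lower_bound = max_bytes_to_merge_at_max_space_in_pool / 1024
    return min_bytes_to_rebalance_partition_over_jbod >= lower_bound


GIB = 2 ** 30
MIB = 2 ** 20

# Illustrative numbers: with a 150 GiB merge limit the bound is 150 MiB.
print(jbod_rebalance_setting_is_valid(150 * MIB, 150 * GIB))  # True
print(jbod_rebalance_setting_is_valid(1 * MIB, 150 * GIB))    # False
```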

min\_columns\_to\_activate\_adaptive\_write\_buffer

Allows reducing memory usage for tables with many columns by using adaptive writer buffers.

Possible values:

* 0 - unlimited
* 1 - always enabled

min\_compress\_block\_size

Minimum size of blocks of uncompressed data required for compression when writing the next mark. You can also specify this setting in the global settings (see [min\_compress\_block\_size](/reference/settings/merge-tree-settings#min_compress_block_size) setting). The value specified when the table is created overrides the global value for this setting.

min\_compressed\_bytes\_to\_fsync\_after\_fetch

Minimal number of compressed bytes to do fsync for part after fetch (0 - disabled)

min\_compressed\_bytes\_to\_fsync\_after\_merge

Minimal number of compressed bytes to do fsync for part after merge (0 - disabled)

min\_delay\_to\_insert\_ms

Minimum delay, in milliseconds, of inserting data into a MergeTree table when there are many unmerged parts in a single partition.

min\_delay\_to\_mutate\_ms

Minimum delay, in milliseconds, applied to mutations of a MergeTree table when there are many unfinished mutations.

min\_free\_disk\_bytes\_to\_perform\_insert

The minimum number of bytes that should be free in disk space in order to insert data. If the number of available free bytes is less than `min_free_disk_bytes_to_perform_insert`, then an exception is thrown and the insert is not executed.

Note that this setting:

* takes into account the `keep_free_space_bytes` setting.
* does not take into account the amount of data that will be written by the `INSERT` operation.
* is only checked if a positive (non-zero) number of bytes is specified.

Possible values:

* Any positive integer.

If both `min_free_disk_bytes_to_perform_insert` and `min_free_disk_ratio_to_perform_insert` are specified, ClickHouse uses the value that requires the larger amount of free disk space before allowing inserts.

min\_free\_disk\_ratio\_to\_perform\_insert

The minimum ratio of free to total disk space required to perform an `INSERT`. Must be a floating point value between 0 and 1.

Note that this setting:

* takes into account the `keep_free_space_bytes` setting.
* does not take into account the amount of data that will be written by the `INSERT` operation.
* is only checked if a positive (non-zero) ratio is specified.

Possible values:

* Float, 0.0 - 1.0

If both `min_free_disk_ratio_to_perform_insert` and `min_free_disk_bytes_to_perform_insert` are specified, ClickHouse uses the value that requires the larger amount of free disk space before allowing inserts.
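When both the byte threshold and the ratio threshold are set, the stricter of the two decides. The Python sketch below illustrates one way to read that rule; it is not the server implementation. The function name is hypothetical, and `free_bytes` is assumed to be the free space that remains after `keep_free_space_bytes` has already been subtracted.

```python
def insert_allowed(free_bytes, total_bytes,
                   min_free_disk_bytes=0, min_free_disk_ratio=0.0):
    """True if an INSERT would pass the free-space check.

    A value of 0 for a setting means that check is skipped.
    """
    required_from_ratio = min_free_disk_ratio * total_bytes
    required = max(min_free_disk_bytes, required_from_ratio)
    return free_bytes >= required


# Illustrative disk of 1000 bytes with both settings specified.
print(insert_allowed(150, 1000, min_free_disk_bytes=100, min_free_disk_ratio=0.2))  # False
print(insert_allowed(250, 1000, min_free_disk_bytes=100, min_free_disk_ratio=0.2))  # True
```

In the first call the ratio demands 200 free bytes, which is more than the 100-byte threshold, so the ratio is the value that applies.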

min\_index\_granularity\_bytes

Min allowed size of data granules in bytes. To provide a safeguard against accidentally creating tables with very low `index_granularity_bytes`.

min\_level\_for\_full\_part\_storage

Only available in ClickHouse Cloud. Minimal part level to use full type of storage for data part instead of packed

min\_level\_for\_wide\_part

Minimal part level to create a data part in `Wide` format instead of `Compact`.

min\_marks\_to\_honor\_max\_concurrent\_queries

The minimal number of marks read by the query for applying the [max\_concurrent\_queries](#max_concurrent_queries) setting. Queries will still be limited by other `max_concurrent_queries` settings.

Possible values:

* Positive integer.
* `0` — Disabled (`max_concurrent_queries` limit applied to no queries).

**Example**

```xml theme={null}
<min_marks_to_honor_max_concurrent_queries>10</min_marks_to_honor_max_concurrent_queries>
```

min\_merge\_bytes\_to\_use\_direct\_io

The minimum data volume for merge operation that is required for using direct I/O access to the storage disk. When merging data parts, ClickHouse calculates the total storage volume of all the data to be merged. If the volume exceeds `min_merge_bytes_to_use_direct_io` bytes, ClickHouse reads and writes the data to the storage disk using the direct I/O interface (`O_DIRECT` option). If `min_merge_bytes_to_use_direct_io = 0`, then direct I/O is disabled.

min\_parts\_to\_merge\_at\_once

Minimal amount of data parts which merge selector can pick to merge at once (expert level setting, don't change if you don't understand what it is doing). 0 - disabled. Works for Simple and StochasticSimple merge selectors.

min\_relative\_delay\_to\_close

Minimal delay from other replicas to close, stop serving requests and not return Ok during status check.

min\_relative\_delay\_to\_measure

Calculate relative replica delay only if absolute delay is not less than this value.

min\_relative\_delay\_to\_yield\_leadership

Obsolete setting, does nothing.

min\_replicated\_logs\_to\_keep

Keep about this number of last records in ZooKeeper log, even if they are obsolete. It doesn't affect work of tables: used only to diagnose ZooKeeper log before cleaning. Possible values: * Any positive integer.

min\_rows\_for\_compact\_part

Obsolete setting, does nothing.

min\_rows\_for\_full\_part\_storage

Only available in ClickHouse Cloud. Minimal number of rows to use full type of storage for data part instead of packed

min\_rows\_for\_wide\_part

Minimal number of rows to create a data part in `Wide` format instead of `Compact`.

min\_rows\_to\_fsync\_after\_merge

Minimal number of rows to do fsync for part after merge (0 - disabled)

mutation\_workload

Used to regulate how resources are utilized and shared between mutations and other workloads. Specified value is used as `workload` setting value for background mutations of this table. If not specified (empty string), then server setting `mutation_workload` is used instead. **See Also** * [Workload Scheduling](/concepts/features/configuration/server-config/workload-scheduling)

non\_replicated\_deduplication\_window

The number of the most recently inserted blocks in the non-replicated [MergeTree](/reference/engines/table-engines/mergetree-family/mergetree) table for which hash sums are stored to check for duplicates. Possible values: * Any positive integer. * `0` (disable deduplication). A deduplication mechanism is used, similar to replicated tables (see [replicated\_deduplication\_window](#replicated_deduplication_window) setting). The hash sums of the created parts are written to a local file on a disk.

notify\_newest\_block\_number

Notify newest block number to SharedJoin or SharedSet. Only in ClickHouse Cloud.

nullable\_serialization\_version

Controls the serialization method used for `Nullable(T)` columns. Possible values: * basic — Use the standard serialization for `Nullable(T)`. * allow\_sparse — Permit `Nullable(T)` to use sparse encoding.

number\_of\_free\_entries\_in\_pool\_to\_execute\_mutation

When there is less than specified number of free entries in pool, do not execute part mutations. This is to leave free threads for regular merges and to avoid "Too many parts" errors. Possible values: * Any positive integer. **Usage** The value of the `number_of_free_entries_in_pool_to_execute_mutation` setting should be less than the value of the [background\_pool\_size](/reference/settings/server-settings/settings#background_pool_size) * [background\_merges\_mutations\_concurrency\_ratio](/reference/settings/server-settings/settings#background_merges_mutations_concurrency_ratio). Otherwise, ClickHouse will throw an exception.
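The usage rule above is a strict inequality against the effective pool size. A tiny Python check, with a hypothetical function name and illustrative numbers rather than real defaults, could look like this:

```python
def free_entries_threshold_is_valid(free_entries_threshold,
                                    background_pool_size,
                                    background_merges_mutations_concurrency_ratio):
    """True if the threshold is below pool size * concurrency ratio."""
    effective_pool = background_pool_size * background_merges_mutations_concurrency_ratio
    return free_entries_threshold < effective_pool


# Illustrative numbers: a pool of 16 threads with a ratio of 2 gives 32 entries.
print(free_entries_threshold_is_valid(25, 16, 2))  # True
print(free_entries_threshold_is_valid(32, 16, 2))  # False
```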

number\_of\_free\_entries\_in\_pool\_to\_execute\_optimize\_entire\_partition

When there are fewer than the specified number of free entries in the pool, do not execute optimizing an entire partition in the background (this task is generated when `min_age_to_force_merge_seconds` is set and `min_age_to_force_merge_on_partition_only` is enabled). This leaves free threads for regular merges and helps avoid "Too many parts" errors.

Possible values:

* Positive integer.

The value of the `number_of_free_entries_in_pool_to_execute_optimize_entire_partition` setting should be less than the value of the [background\_pool\_size](/reference/settings/server-settings/settings#background_pool_size) * [background\_merges\_mutations\_concurrency\_ratio](/reference/settings/server-settings/settings#background_merges_mutations_concurrency_ratio). Otherwise, ClickHouse throws an exception.

number\_of\_free\_entries\_in\_pool\_to\_lower\_max\_size\_of\_merge

When there is less than the specified number of free entries in pool (or replicated queue), start to lower maximum size of merge to process (or to put in queue). This is to allow small merges to process - not filling the pool with long running merges. Possible values: * Any positive integer.

number\_of\_mutations\_to\_delay

If the table has at least this many unfinished mutations, artificially slow down mutations of the table. Disabled if set to 0.

number\_of\_mutations\_to\_throw

If the table has at least this many unfinished mutations, throw the 'Too many mutations' exception. Disabled if set to 0.

number\_of\_partitions\_to\_consider\_for\_merge

Only available in ClickHouse Cloud. Up to the top N partitions that are considered for merge. Partitions are picked in a random weighted way, where the weight is the number of data parts that can be merged in the partition.

object\_serialization\_version

Serialization version for JSON data type. Required for compatibility. Possible values: * `v1` * `v2` * `v3` Only version `v3` supports changing the shared data serialization version.

object\_shared\_data\_buckets\_for\_compact\_part

The number of buckets for JSON shared data serialization in Compact parts. Works with `map_with_buckets` and `advanced` shared data serializations. The maximum allowed value is 256.

object\_shared\_data\_buckets\_for\_wide\_part

The number of buckets for JSON shared data serialization in Wide parts. Works with `map_with_buckets` and `advanced` shared data serializations. The maximum allowed value is 256.

object\_shared\_data\_serialization\_version

Serialization version for shared data inside JSON data type. Possible values: * `map` - store shared data as `Map(String, String)` * `map_with_buckets` - store shared data as several separate `Map(String, String)` columns. Using buckets improves reading individual paths from shared data. * `advanced` - special serialization of shared data designed to significantly improve reading of individual paths from shared data. Note that this serialization increases the shared data storage size on disk because we store a lot of additional information. The number of buckets for `map_with_buckets` and `advanced` serializations is determined by settings [object\_shared\_data\_buckets\_for\_compact\_part](#object_shared_data_buckets_for_compact_part)/[object\_shared\_data\_buckets\_for\_wide\_part](#object_shared_data_buckets_for_wide_part).

object\_shared\_data\_serialization\_version\_for\_zero\_level\_parts

This setting allows to specify different serialization version of the shared data inside JSON type for zero level parts that are created during inserts. It's recommended not to use `advanced` shared data serialization for zero level parts because it can increase the insertion time significantly.

old\_parts\_lifetime

The time (in seconds) of storing inactive parts to protect against data loss during spontaneous server reboots. Possible values: * Any positive integer. After merging several parts into a new part, ClickHouse marks the original parts as inactive and deletes them only after `old_parts_lifetime` seconds. Inactive parts are removed if they are not used by current queries, i.e. if the `refcount` of the part is 1. `fsync` is not called for new parts, so for some time new parts exist only in the server's RAM (OS cache). If the server is rebooted spontaneously, new parts can be lost or damaged. To protect data inactive parts are not deleted immediately. During startup ClickHouse checks the integrity of the parts. If the merged part is damaged ClickHouse returns the inactive parts to the active list, and later merges them again. Then the damaged part is renamed (the `broken_` prefix is added) and moved to the `detached` folder. If the merged part is not damaged, then the original inactive parts are renamed (the `ignored_` prefix is added) and moved to the `detached` folder. The default `dirty_expire_centisecs` value (a Linux kernel setting) is 30 seconds (the maximum time that written data is stored only in RAM), but under heavy loads on the disk system data can be written much later. Experimentally, a value of 480 seconds was chosen for `old_parts_lifetime`, during which a new part is guaranteed to be written to disk.

optimize\_row\_order

Controls if the row order should be optimized during inserts to improve the compressibility of the newly inserted table part. Only has an effect for ordinary MergeTree-engine tables. Does nothing for specialized MergeTree engine tables (e.g. CollapsingMergeTree). MergeTree tables are (optionally) compressed using [compression codecs](/reference/statements/create/table#column_compression_codec). Generic compression codecs such as LZ4 and ZSTD achieve maximum compression rates if the data exposes patterns. Long runs of the same value typically compress very well. If this setting is enabled, ClickHouse attempts to store the data in newly inserted parts in a row order that minimizes the number of equal-value runs across the columns of the new table part. In other words, a small number of equal-value runs means that individual runs are long and compress well. Finding the optimal row order is computationally infeasible (NP hard). Therefore, ClickHouse uses a heuristic to quickly find a row order which still improves compression rates over the original row order.

Heuristics for finding a row order

It is generally possible to shuffle the rows of a table (or table part) freely, as SQL considers the same table (table part) in different row orders equivalent. This freedom of shuffling rows is restricted when a primary key is defined for the table. In ClickHouse, a primary key `C1, C2, ..., CN` enforces that the table rows are sorted by columns `C1`, `C2`, ..., `CN` ([clustered index](https://en.wikipedia.org/wiki/Database_index#Clustered)). As a result, rows can only be shuffled within "equivalence classes" of rows, i.e. rows which have the same values in their primary key columns. The intuition is that primary keys with high cardinality, e.g. primary keys involving a `DateTime64` timestamp column, lead to many small equivalence classes. Likewise, tables with a low-cardinality primary key create few and large equivalence classes. A table with no primary key represents the extreme case of a single equivalence class which spans all rows.

The fewer and the larger the equivalence classes are, the higher the degree of freedom when re-shuffling rows. The heuristic applied to find the best row order within each equivalence class is suggested by D. Lemire, O. Kaser in [Reordering columns for smaller indexes](https://doi.org/10.1016/j.ins.2011.02.002) and is based on sorting the rows within each equivalence class by ascending cardinality of the non-primary-key columns.

It performs three steps:

1. Find all equivalence classes based on the row values in primary key columns.
2. For each equivalence class, calculate (usually estimate) the cardinalities of the non-primary-key columns.
3. For each equivalence class, sort the rows in order of ascending non-primary-key column cardinality.
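The three steps can be sketched in a few lines of Python. This is a simplified illustration of the idea, not the ClickHouse implementation: it computes exact cardinalities instead of estimates, works on in-memory tuples, and assumes the input rows are already sorted by the primary key. The function name and the sample data are made up for this example.

```python
from itertools import groupby


def optimize_row_order(rows, primary_key_columns):
    """Reorder rows inside each primary-key equivalence class.

    rows: list of tuples, already sorted by the primary key columns.
    primary_key_columns: indexes of the primary key columns.
    """
    if not rows:
        return []

    def primary_key(row):
        return tuple(row[i] for i in primary_key_columns)

    other_columns = [i for i in range(len(rows[0])) if i not in primary_key_columns]
    result = []
    # Step 1: consecutive rows with equal primary key form one equivalence class.
    for _, group in groupby(rows, key=primary_key):
        group = list(group)
        # Step 2: exact cardinality of every non-primary-key column in the class.
        cardinality = {c: len({row[c] for row in group}) for c in other_columns}
        # Step 3: sort by the columns in order of ascending cardinality.
        column_order = sorted(other_columns, key=lambda c: cardinality[c])
        group.sort(key=lambda row: tuple(row[c] for c in column_order))
        result.extend(group)
    return result


rows = [
    (1, "x", "a"),
    (1, "y", "b"),
    (1, "z", "a"),
    (1, "w", "b"),
    (2, "q", "c"),
]
for row in optimize_row_order(rows, primary_key_columns=[0]):
    print(row)
```

In the sample, the third column has only two distinct values within the first equivalence class, so it becomes the leading sort column and its equal values end up in two long runs (`a, a, b, b`) instead of alternating.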

If enabled, insert operations incur additional CPU costs to analyze and optimize the row order of the new data. INSERTs are expected to take 30-50% longer depending on the data characteristics. Compression rates of LZ4 or ZSTD improve on average by 20-40%. This setting works best for tables with no primary key or a low-cardinality primary key, i.e. a table with only few distinct primary key values. High-cardinality primary keys, e.g. involving timestamp columns of type `DateTime64`, are not expected to benefit from this setting.

part\_minmax\_index\_columns

Selects which columns the per-part min-max index covers. Each value enables an additional group of columns on top of the previous one. Possible values: * `partition_key_only` — only the partition-key columns are tracked. * `with_block_number_offset` — partition-key columns plus the persisted `_block_number` and `_block_offset` virtual columns. Enables part-level pruning by these columns.

part\_moves\_between\_shards\_delay\_seconds

Time to wait before/after moving parts between shards.

part\_moves\_between\_shards\_enable

Experimental/Incomplete feature to move parts between shards. Does not take into account sharding expressions.

parts\_to\_delay\_insert

If the number of active parts in a single partition exceeds the `parts_to_delay_insert` value, an `INSERT` is artificially slowed down. Possible values: * Any positive integer. ClickHouse artificially executes `INSERT` longer (adds 'sleep') so that the background merge process can merge parts faster than they are added.

parts\_to\_throw\_insert

If the number of active parts in a single partition exceeds the `parts_to_throw_insert` value, `INSERT` is interrupted with the `Too many parts (N). Merges are processing significantly slower than inserts` exception.

Possible values:

* Any positive integer.

To achieve maximum performance of `SELECT` queries, it is necessary to minimize the number of parts processed, see [Merge Tree](/resources/develop-contribute/introduction/architecture#merge-tree).

Prior to version 23.6 this setting was set to 300. You can set a higher value; this reduces the probability of the `Too many parts` error, but `SELECT` performance might degrade. Also, in case of a merge issue (for example, due to insufficient disk space), you will notice it later than you would with the original 300.

prefer\_fetch\_merged\_part\_size\_threshold

If the sum of the size of parts exceeds this threshold and the time since a replication log entry creation is greater than `prefer_fetch_merged_part_time_threshold`, then prefer fetching merged part from a replica instead of doing merge locally. This is to speed up very long merges. Possible values: * Any positive integer.

prefer\_fetch\_merged\_part\_time\_threshold

If the time passed since a replication log (ClickHouse Keeper or ZooKeeper) entry creation exceeds this threshold, and the sum of the size of parts is greater than `prefer_fetch_merged_part_size_threshold`, then prefer fetching merged part from a replica instead of doing merge locally. This is to speed up very long merges. Possible values: * Any positive integer.
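The two `prefer_fetch_merged_part_*` thresholds only take effect together. As a hedged illustration of that condition (the function name is hypothetical, and the age and time threshold are assumed to be in the same unit), the decision can be written as:

```python
def prefer_fetch_merged_part(total_parts_bytes, log_entry_age,
                             size_threshold, time_threshold):
    """True if fetching the merged part from a replica is preferred.

    Both conditions must hold: the parts are large enough AND the
    replication log entry is old enough.
    """
    return total_parts_bytes > size_threshold and log_entry_age > time_threshold


GIB = 2 ** 30

# Illustrative thresholds: 10 GiB and 3600 time units.
print(prefer_fetch_merged_part(20 * GIB, 7200, 10 * GIB, 3600))  # True
print(prefer_fetch_merged_part(20 * GIB, 60, 10 * GIB, 3600))    # False
print(prefer_fetch_merged_part(1 * GIB, 7200, 10 * GIB, 3600))   # False
```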

prewarm\_mark\_cache

If true, the mark cache will be prewarmed by saving marks to the mark cache on inserts, merges, fetches, and on server startup.

prewarm\_primary\_key\_cache

If true, the primary index cache will be prewarmed by saving the primary index to the primary index cache on inserts, merges, fetches, and on server startup.

primary\_key\_compress\_block\_size

Primary compress block size, the actual size of the block to compress.

primary\_key\_compression\_codec

Compression codec used for the primary key. The primary key is small and cached, so the default compression is ZSTD(3).

primary\_key\_lazy\_load

Load primary key in memory on first use instead of on table initialization. This can save memory in the presence of a large number of tables.
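
To gauge how much memory primary keys currently occupy, a query along these lines can help. It is a sketch that assumes the `primary_key_bytes_in_memory` column of `system.parts` is available in your version; the `LIMIT` is arbitrary.

```sql theme={null}
-- Tables whose active parts hold the most primary key data in memory.
SELECT
    database,
    table,
    formatReadableSize(sum(primary_key_bytes_in_memory)) AS primary_key_in_memory
FROM system.parts
WHERE active
GROUP BY database, table
ORDER BY sum(primary_key_bytes_in_memory) DESC
LIMIT 10;
```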

primary\_key\_ratio\_of\_unique\_prefix\_values\_to\_skip\_suffix\_columns

If the value of a primary key column in a data part changes at least in this ratio of times, skip loading the next columns in memory. This saves memory by not loading useless columns of the primary key.

propagate\_types\_serialization\_versions\_to\_nested\_types

If true, serialization versions like `string_serialization_version` will be propagated inside nested types like Array/Map/Nullable/JSON/etc. If disabled, the serialization version will take effect only on top-level columns of this type and Tuple el

ratio\_of\_defaults\_for\_sparse\_serialization

Minimal ratio of the number of *default* values to the number of *all* values in a column. Setting this value causes the column to be stored using sparse serializations.

If a column is sparse (contains mostly zeros), ClickHouse can encode it in a sparse format and automatically optimize calculations - the data does not require full decompression during queries. To enable this sparse serialization, define the `ratio_of_defaults_for_sparse_serialization` setting to be less than 1.0. If the value is greater than or equal to 1.0, then the columns will be always written using the normal full serialization.

Possible values:

* Float between `0` and `1` to enable sparse serialization
* `1.0` (or greater) if you do not want to use sparse serialization

**Example**

Notice the `s` column in the following table is an empty string for 95% of the rows. In `my_regular_table` we do not use sparse serialization, and in `my_sparse_table` we set `ratio_of_defaults_for_sparse_serialization` to 0.95:

```sql theme={null}
CREATE TABLE my_regular_table
(
    `id` UInt64,
    `s` String
)
ENGINE = MergeTree
ORDER BY id;

INSERT INTO my_regular_table
SELECT
    number AS id,
    number % 20 = 0 ? toString(number): '' AS s
FROM
    numbers(10000000);

CREATE TABLE my_sparse_table
(
    `id` UInt64,
    `s` String
)
ENGINE = MergeTree
ORDER BY id
SETTINGS ratio_of_defaults_for_sparse_serialization = 0.95;

INSERT INTO my_sparse_table
SELECT
    number,
    number % 20 = 0 ? toString(number): ''
FROM
    numbers(10000000);
```

Notice the `s` column in `my_sparse_table` uses less storage space on disk:

```sql theme={null}
SELECT table, name, data_compressed_bytes, data_uncompressed_bytes
FROM system.columns
WHERE table LIKE 'my_%_table';
```

```response theme={null}
┌─table────────────┬─name─┬─data_compressed_bytes─┬─data_uncompressed_bytes─┐
│ my_regular_table │ id   │              37790741 │                75488328 │
│ my_regular_table │ s    │               2451377 │                12683106 │
│ my_sparse_table  │ id   │              37790741 │                75488328 │
│ my_sparse_table  │ s    │               2283454 │                 9855751 │
└──────────────────┴──────┴───────────────────────┴─────────────────────────┘
```

You can verify if a column is using the sparse encoding by viewing the `serialization_kind` column of the `system.parts_columns` table:

```sql theme={null}
SELECT column, serialization_kind
FROM system.parts_columns
WHERE table LIKE 'my_sparse_table';
```

You can see which parts of `s` were stored using the sparse serialization:

```response theme={null}
┌─column─┬─serialization_kind─┐
│ id     │ Default            │
│ s      │ Default            │
│ id     │ Default            │
│ s      │ Default            │
│ id     │ Default            │
│ s      │ Sparse             │
│ id     │ Default            │
│ s      │ Sparse             │
│ id     │ Default            │
│ s      │ Sparse             │
│ id     │ Default            │
│ s      │ Sparse             │
│ id     │ Default            │
│ s      │ Sparse             │
│ id     │ Default            │
│ s      │ Sparse             │
│ id     │ Default            │
│ s      │ Sparse             │
│ id     │ Default            │
│ s      │ Sparse             │
│ id     │ Default            │
│ s      │ Sparse             │
│ id     │ Default            │
│ s      │ Sparse             │
└────────┴────────────────────┘
```

reduce\_blocking\_parts\_sleep\_ms

Only available in ClickHouse Cloud. Minimum time to wait before trying to reduce blocking parts again after no ranges were dropped/replaced. A lower setting triggers tasks in background\_schedule\_pool more frequently, which results in a large number of requests to ZooKeeper in large-scale clusters.

refresh\_parts\_interval

If greater than zero, refresh the list of data parts from the underlying filesystem to check whether the data was updated under the hood. It can be set only if the table is located on readonly disks (which means that this is a readonly replica, while data is being written by another replica).

refresh\_statistics\_interval

The interval of refreshing statistics cache in seconds. If it is set to zero, the refreshing will be disabled.

remote\_fs\_execute\_merges\_on\_single\_replica\_time\_threshold

When this setting has a value greater than zero, only a single replica starts the merge immediately if the merged part is on shared storage.

Zero-copy replication is not ready for production. Zero-copy replication is disabled by default in ClickHouse version 22.8 and higher. This feature is not recommended for production use.

Possible values:

* Any positive integer.

remote\_fs\_zero\_copy\_path\_compatible\_mode

Run zero-copy in compatible mode during conversion process.

remote\_fs\_zero\_copy\_zookeeper\_path

ZooKeeper path for zero-copy table-independent info.

remove\_empty\_parts

Remove empty parts after they were pruned by TTL, mutation, or collapsing merge algorithm.

remove\_rolled\_back\_parts\_immediately

Setting for an incomplete experimental feature.

remove\_unused\_patch\_parts

Remove, in the background, patch parts that have been applied to all active parts.

replace\_long\_file\_name\_to\_hash

If the file name for a column is too long (more than 'max\_file\_name\_length' bytes), replace it with its SipHash128 hash.

replicated\_can\_become\_leader

If true, replicas of replicated tables on this node will try to acquire leadership.

Possible values:

* `true`
* `false`
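
To check how this setting is reflected on a running server, you can look at the `system.replicas` system table. This is a minimal sketch that only reads the `is_leader` and `can_become_leader` columns of that table.

```sql theme={null}
-- One row per replicated table on this server.
SELECT
    database,
    table,
    is_leader,
    can_become_leader
FROM system.replicas;
```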

replicated\_deduplication\_window

The number of most recently inserted blocks for which ClickHouse Keeper stores hash sums to check for duplicates.

Possible values:

* Any positive integer.
* 0 (disable deduplication)

The `Insert` command creates one or more blocks (parts). For [insert deduplication](/reference/engines/table-engines/mergetree-family/replication), when writing into replicated tables, ClickHouse writes the hash sums of the created parts into ClickHouse Keeper. Hash sums are stored only for the most recent `replicated_deduplication_window` blocks. The oldest hash sums are removed from ClickHouse Keeper.

A large number for `replicated_deduplication_window` slows down `Inserts` because more entries need to be compared.

The hash sum is calculated from the composition of the field names and types and the data of the inserted part (stream of bytes).
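
The behavior described above can be sketched with a small replicated table. The table name `dedup_demo`, the Keeper path, and the `{shard}` and `{replica}` macros are assumptions for illustration and must match your own cluster configuration; the window size is arbitrary.

```sql theme={null}
-- Hypothetical table; the Keeper path and macros are assumptions.
CREATE TABLE dedup_demo
(
    `id` UInt64
)
ENGINE = ReplicatedMergeTree('/clickhouse/tables/{shard}/dedup_demo', '{replica}')
ORDER BY id
SETTINGS replicated_deduplication_window = 100;

INSERT INTO dedup_demo VALUES (1), (2), (3);

-- Identical block: its hash sum is already stored in Keeper,
-- so this insert is expected to be skipped as a duplicate.
INSERT INTO dedup_demo VALUES (1), (2), (3);

-- Compare the row count with the number of rows sent above.
SELECT count() FROM dedup_demo;
```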

replicated\_deduplication\_window\_for\_async\_inserts

The number of most recently async inserted blocks for which ClickHouse Keeper stores hash sums to check for duplicates.

Possible values:

* Any positive integer.
* 0 (disable deduplication for async\_inserts)

The [Async Insert](/reference/settings/session-settings#async_insert) command will be cached in one or more blocks (parts). For [insert deduplication](/reference/engines/table-engines/mergetree-family/replication), when writing into replicated tables, ClickHouse writes the hash sums of each insert into ClickHouse Keeper. Hash sums are stored only for the most recent `replicated_deduplication_window_for_async_inserts` blocks. The oldest hash sums are removed from ClickHouse Keeper.

A large number for `replicated_deduplication_window_for_async_inserts` slows down `Async Inserts` because more entries need to be compared.

The hash sum is calculated from the composition of the field names and types and the data of the insert (stream of bytes).

replicated\_deduplication\_window\_seconds

The number of seconds after which the hash sums of the inserted blocks are removed from ClickHouse Keeper.

Possible values:

* Any positive integer.

Similar to [replicated\_deduplication\_window](#replicated_deduplication_window), `replicated_deduplication_window_seconds` specifies how long to store hash sums of blocks for insert deduplication. Hash sums older than `replicated_deduplication_window_seconds` are removed from ClickHouse Keeper, even if they are less than `replicated_deduplication_window`.

The time is relative to the time of the most recent record, not to the wall time. If it's the only record, it will be stored forever.

replicated\_deduplication\_window\_seconds\_for\_async\_inserts

The number of seconds after which the hash sums of the async inserts are removed from ClickHouse Keeper.

Possible values:

* Any positive integer.

Similar to [replicated\_deduplication\_window\_for\_async\_inserts](#replicated_deduplication_window_for_async_inserts), `replicated_deduplication_window_seconds_for_async_inserts` specifies how long to store hash sums of blocks for async insert deduplication. Hash sums older than `replicated_deduplication_window_seconds_for_async_inserts` are removed from ClickHouse Keeper, even if they are less than `replicated_deduplication_window_for_async_inserts`.

The time is relative to the time of the most recent record, not to the wall time. If it's the only record, it will be stored forever.
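
Since the count-based and time-based windows both limit how long hash sums are kept, it can be convenient to set the time windows for regular and async inserts together. A minimal sketch, assuming a replicated table hypothetically named `replicated_tab`; the one-hour value is illustrative only.

```sql theme={null}
ALTER TABLE replicated_tab
    MODIFY SETTING
        replicated_deduplication_window_seconds = 3600,
        replicated_deduplication_window_seconds_for_async_inserts = 3600;
```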

replicated\_fetches\_http\_connection\_timeout

Obsolete setting, does nothing.

replicated\_fetches\_http\_receive\_timeout

Obsolete setting, does nothing.

replicated\_fetches\_http\_send\_timeout

Obsolete setting, does nothing.

replicated\_fetches\_min\_part\_level

Minimum part level to fetch from other replicas. Parts with level below this threshold are postponed (kept in the replication queue and re-evaluated each scheduling cycle, not permanently skipped). Use 1 to postpone fetching level-0 (unmerged) parts, reducing replication overhead during heavy ingestion. Default: 0 (fetch all parts regardless of level).

replicated\_fetches\_min\_part\_level\_timeout\_seconds

Timeout in seconds after which a part below replicated\_fetches\_min\_part\_level will be fetched anyway. Use 0 to disable the timeout (parts below the minimum level are postponed indefinitely until merged). Default: 300 (force fetch after 5 minutes).
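
The two fetch settings above can be combined so that level-0 parts are postponed, but only for a bounded time. A minimal sketch, assuming a replicated table hypothetically named `replicated_tab`; the 10-minute timeout is an illustrative value.

```sql theme={null}
ALTER TABLE replicated_tab
    MODIFY SETTING
        replicated_fetches_min_part_level = 1,                    -- postpone level-0 (unmerged) parts
        replicated_fetches_min_part_level_timeout_seconds = 600;  -- fetch them anyway after 10 minutes
```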

replicated\_max\_mutations\_in\_one\_entry

Max number of mutation commands that can be merged together and executed in one MUTATE\_PART entry (0 means unlimited)

replicated\_max\_parallel\_fetches

Obsolete setting, does nothing.

replicated\_max\_parallel\_fetches\_for\_host

Obsolete setting, does nothing.

replicated\_max\_parallel\_fetches\_for\_table

Obsolete setting, does nothing.

replicated\_max\_parallel\_sends

Obsolete setting, does nothing.

replicated\_max\_parallel\_sends\_for\_table

Obsolete setting, does nothing.

replicated\_max\_ratio\_of\_wrong\_parts

If the ratio of wrong parts to the total number of parts is less than this value, allow the table to start.

Possible values:

* Float, 0.0 - 1.0

search\_orphaned\_parts\_disks

ClickHouse scans all disks for orphaned parts upon any ATTACH or CREATE table so that data parts on undefined disks (not included in the storage policy) are not missed. Orphaned parts originate from potentially unsafe storage reconfiguration, e.g. if a disk was excluded from the storage policy. This setting limits the scope of disks to search by traits of the disks.

Possible values:

* any - scope is not limited.
* local - scope is limited to local disks.
* none - empty scope, do not search.
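
As a sketch of restricting the scan for a single table, the statement below sets the scope to local disks at creation time. The table name `orphan_scan_demo` is hypothetical, and passing the value as a string literal is an assumption about how this setting is written in a `SETTINGS` clause.

```sql theme={null}
CREATE TABLE orphan_scan_demo
(
    `id` UInt64
)
ENGINE = MergeTree
ORDER BY id
SETTINGS search_orphaned_parts_disks = 'local';
```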

serialization\_info\_version

Serialization info version used when writing `serialization.json`. This setting is required for compatibility during cluster upgrades.

Possible values:

* `basic` - Basic format.
* `with_types` - Format with additional `types_serialization_versions` field, allowing per-type serialization versions. This makes settings like `string_serialization_version` effective.

During rolling upgrades, set this to `basic` so that new servers produce data parts compatible with old servers. After the upgrade completes, switch to `with_types` to enable per-type serialization versions.
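
The rolling-upgrade procedure described above could look roughly like this for a single table. It is a sketch that reuses the `tab` table from the top of this page and assumes the setting can be changed with `ALTER TABLE ... MODIFY SETTING`; the same values can alternatively be set in the `merge_tree` section of the server configuration.

```sql theme={null}
-- Before and during the rolling upgrade: keep parts readable by old servers.
ALTER TABLE tab MODIFY SETTING serialization_info_version = 'basic';

-- After every server has been upgraded: enable per-type serialization versions.
ALTER TABLE tab MODIFY SETTING serialization_info_version = 'with_types';
```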

share\_nested\_offsets

When enabled (default), Array columns with dotted names that share a common prefix (e.g. n.a and n.b) are treated as part of a Nested structure: they share a single offsets file on disk (e.g. n.size0), and their array sizes are validated to be equal during INSERT. When disabled, each Array column gets its own independent offset file, dotted names carry no special semantics, and a scalar column may coexist with dotted Array columns sharing the same prefix (e.g. n UInt32 alongside n.a Array(String)). This setting is immutable after table creation.
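
The disabled mode can be sketched with the column layout mentioned above, where a scalar column `n` coexists with a dotted Array column `n.a`. The table name `nested_demo` is hypothetical, and the example has not been verified against a specific ClickHouse version. Because the setting is immutable, it is specified at table creation.

```sql theme={null}
CREATE TABLE nested_demo
(
    `id` UInt64,
    `n` UInt32,            -- scalar column sharing the prefix "n"
    `n.a` Array(String)    -- dotted name carries no Nested semantics here
)
ENGINE = MergeTree
ORDER BY id
SETTINGS share_nested_offsets = 0;
```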

shared\_merge\_tree\_activate\_coordinated\_merges\_tasks

Activates rescheduling of coordinated merges tasks. It can be useful even when shared\_merge\_tree\_enable\_coordinated\_merges=0 because this will populate merge coordinator statistics and help with cold start.

shared\_merge\_tree\_create\_per\_replica\_metadata\_nodes

Enables creation of per-replica /metadata and /columns nodes in ZooKeeper. Only available in ClickHouse Cloud

shared\_merge\_tree\_disable\_merges\_and\_mutations\_assignment

Stop merges assignment for shared merge tree. Only available in ClickHouse Cloud

shared\_merge\_tree\_empty\_partition\_lifetime

How many seconds a partition will be stored in Keeper if it has no parts.

shared\_merge\_tree\_enable\_automatic\_empty\_partitions\_cleanup

Enables cleanup of Keeper entries for empty partitions.

shared\_merge\_tree\_enable\_coordinated\_merges

Enables coordinated merges strategy

shared\_merge\_tree\_enable\_keeper\_parts\_extra\_data

Enables writing attributes into virtual parts and committing blocks in keeper

shared\_merge\_tree\_enable\_outdated\_parts\_check

Enable outdated parts check. Only available in ClickHouse Cloud

shared\_merge\_tree\_idle\_parts\_update\_seconds

Interval in seconds for parts update without being triggered by ZooKeeper watch in the shared merge tree. Only available in ClickHouse Cloud

shared\_merge\_tree\_initial\_parts\_update\_backoff\_ms

Initial backoff for parts update. Only available in ClickHouse Cloud

shared\_merge\_tree\_interserver\_http\_connection\_timeout\_ms

Timeouts for interserver HTTP connection. Only available in ClickHouse Cloud

shared\_merge\_tree\_interserver\_http\_timeout\_ms

Timeouts for interserver HTTP communication. Only available in ClickHouse Cloud

shared\_merge\_tree\_leader\_update\_period\_random\_add\_seconds

Add uniformly distributed value from 0 to x seconds to shared\_merge\_tree\_leader\_update\_period to avoid thundering herd effect. Only available in ClickHouse Cloud

shared\_merge\_tree\_leader\_update\_period\_seconds

Maximum period to recheck leadership for parts update. Only available in ClickHouse Cloud

shared\_merge\_tree\_max\_outdated\_parts\_to\_process\_at\_once

Maximum number of outdated parts the leader will try to confirm for removal in one HTTP request. Only available in ClickHouse Cloud.

shared\_merge\_tree\_max\_parts\_update\_backoff\_ms

Max backoff for parts update. Only available in ClickHouse Cloud

shared\_merge\_tree\_max\_parts\_update\_leaders\_in\_total

Maximum number of parts update leaders. Only available in ClickHouse Cloud

shared\_merge\_tree\_max\_parts\_update\_leaders\_per\_az

Maximum number of parts update leaders per availability zone (AZ). Only available in ClickHouse Cloud

shared\_merge\_tree\_max\_replicas\_for\_parts\_deletion

Max replicas which will participate in parts deletion (killer thread). Only available in ClickHouse Cloud

shared\_merge\_tree\_max\_replicas\_to\_merge\_parts\_for\_each\_parts\_range

Max replicas which will try to assign potentially conflicting merges (allow to avoid redundant conflicts in merges assignment). 0 means disabled. Only available in ClickHouse Cloud

shared\_merge\_tree\_max\_suspicious\_broken\_parts

Maximum number of broken parts for SharedMergeTree (SMT). If there are more, automatic detach is denied.

shared\_merge\_tree\_max\_suspicious\_broken\_parts\_bytes

Maximum total size of all broken parts for SharedMergeTree (SMT). If the size is larger, automatic detach is denied.

shared\_merge\_tree\_memo\_ids\_remove\_timeout\_seconds

How long we store insert memoization ids to avoid wrong actions during insert retries. Only available in ClickHouse Cloud

shared\_merge\_tree\_merge\_coordinator\_election\_check\_period\_ms

Time between runs of merge coordinator election thread

shared\_merge\_tree\_merge\_coordinator\_factor

Time changing factor for delay of coordinator thread

shared\_merge\_tree\_merge\_coordinator\_fetch\_fresh\_metadata\_period\_ms

How often the merge coordinator should sync with ZooKeeper to fetch fresh metadata

shared\_merge\_tree\_merge\_coordinator\_max\_merge\_request\_size

Number of merges that coordinator can request from MergerMutator at once

shared\_merge\_tree\_merge\_coordinator\_max\_period\_ms

Maximum time between runs of merge coordinator thread

shared\_merge\_tree\_merge\_coordinator\_merges\_prepare\_count

Number of merge entries that coordinator should prepare and distribute across workers. When set to 'auto', equals the max number of merge tasks allowed on a single replica multiplied by the number of active replicas.

shared\_merge\_tree\_merge\_coordinator\_min\_period\_ms

Minimum time between runs of merge coordinator thread

shared\_merge\_tree\_merge\_worker\_fast\_timeout\_ms

Timeout that the merge worker thread uses when it needs to update its state after an immediate action

shared\_merge\_tree\_merge\_worker\_regular\_timeout\_ms

Time between runs of merge worker thread

shared\_merge\_tree\_outdated\_parts\_group\_size

How many replicas will be in the same rendezvous hash group for outdated parts cleanup. Only available in ClickHouse Cloud.

shared\_merge\_tree\_partitions\_hint\_ratio\_to\_reload\_merge\_pred\_for\_mutations

Will reload merge predicate in merge/mutate selecting task when `/` ratio is higher than the setting. Only available in ClickHouse Cloud

shared\_merge\_tree\_parts\_load\_batch\_size

Amount of fetch parts metadata jobs to schedule at once. Only available in ClickHouse Cloud

shared\_merge\_tree\_postpone\_next\_merge\_for\_locally\_merged\_parts\_ms

Time to keep a locally merged part without starting a new merge containing this part. Gives other replicas a chance to fetch the part and start this merge. Only available in ClickHouse Cloud.

shared\_merge\_tree\_postpone\_next\_merge\_for\_locally\_merged\_parts\_rows\_threshold

Minimum size of part (in rows) to postpone assigning a next merge just after merging it locally. Only available in ClickHouse Cloud.

shared\_merge\_tree\_range\_for\_merge\_window\_size

Time to keep a locally merged part without starting a new merge containing this part. Gives other replicas a chance to fetch the part and start this merge. Only available in ClickHouse Cloud

shared\_merge\_tree\_read\_virtual\_parts\_from\_leader

Read virtual parts from leader when possible. Only available in ClickHouse Cloud