diff --git a/solr/solr-ref-guide/src/cross-data-center-replication-cdcr.adoc b/solr/solr-ref-guide/src/cross-data-center-replication-cdcr.adoc index 3aa4a3b0cb2..7d36e3f7457 100644 --- a/solr/solr-ref-guide/src/cross-data-center-replication-cdcr.adoc +++ b/solr/solr-ref-guide/src/cross-data-center-replication-cdcr.adoc @@ -167,10 +167,21 @@ The Source and Target configurations differ in the case of the data centers bein === Source Configuration -Here is a sample of a Source configuration file, a section in `solrconfig.xml`. The presence of the section causes CDCR to use this cluster as the Source and should not be present in the Target collections. Details about each setting are after the two examples: +Here is a sample of a Source configuration file, a section in `solrconfig.xml`. The presence of the section causes CDCR to use this cluster as the Source and should not be present in the Target collections. Details about each setting are after the two examples. The source example has buffering disabled; the default is enabled: [source,xml] ---- + + + + + + + + cdcr-processor-chain + + + 10.240.18.211:2181,10.240.18.212:2181 @@ -191,6 +202,12 @@ Here is a sample of a Source configuration file, a section in `solrconfig.xml`. 1000 + + + + DISABLED + + + + ---- @@ -212,6 +231,7 @@ Target instance must configure an update processor chain that is specific to CDC [source,xml] ---- + disabled @@ -235,6 +255,9 @@ Target instance must configure an update processor chain that is specific to CDC ${solr.ulog.dir:} + + + ---- @@ -274,12 +297,14 @@ The number of updates to send in one batch. The optimal size depends on the size Expert: Non-leader nodes need to synchronize their update logs with their leader node from time to time in order to clean deprecated transaction log files. By default, such a synchronization process is performed every minute. 
The schedule of the synchronization can be modified with a “updateLogSynchronizer” list as follows: +TIP: If the updateLogSynchronizer element is omitted from the Source cluster, transaction logs may accumulate on non-leaders. + `schedule`:: The delay in milliseconds for synchronizing the update logs. The default is `60000`. ==== The Buffer Element -When buffering updates, the update logs will store all the updates indefinitely. It is recommended to disable buffering on both the Source and Target clusters during normal operation as when buffering is enabled the Update Logs will grow without limit. Leaving buffering enabled is intended for special maintenance periods. The buffer can be disabled at startup with a “buffer” list and the parameter “defaultState” as follows: +When buffering updates, the update logs will store all the updates indefinitely. It is best to disable buffering on both the Source and Target clusters during normal operation as when buffering is enabled the Update Logs will grow without limit. Enabling buffering is intended for special maintenance periods. Buffering can be disabled at startup with a “buffer” list and the parameter “defaultState” as follows: `defaultState`:: The state of the buffer at startup. The default is `enabled`. @@ -293,7 +318,7 @@ Buffering is designed to augment maintenance windows. The following points shoul * During normal operation, the Update Logs will automatically accrue on the Source data center if the Target data center is unavailable; It is not necessary to enable buffering for CDCR to handle routine network disruptions. ** For this reason, monitoring disk usage on the Source data center is recommended as an additional check that the Target data center is receiving updates. * Buffering should _not_ be enabled on the Target data center as Update Logs would accrue without limit. - * If buffering is enabled then disabled, the Update Logs will be removed when their contents have been sent to the Target data center. 
This process may take some time. + * If buffering is enabled then disabled, the Update Logs will be removed when their contents have been sent to the Target data center. This process may take some time and is triggered by additional updates to the Source cluster. ** Update Log cleanup is not triggered until a new update is sent to the Source data center. ==== @@ -630,33 +655,7 @@ As usual, it is good to start small. Sync a single cloud and monitor for a perio * Before starting, stop or pause the indexers. This is best done during a small maintenance window. * Stop the SolrCloud instances at the Source -* Include the CDCR request handler configuration in `solrconfig.xml` as in the below example. -+ -[source,xml] ----- - - - ${TargetZk} - ${SourceCollection} - ${TargetCollection} - - - 8 - 10 - 2000 - - - 1000 - - - - - - - ----- -+ -* Upload the modified `solrconfig.xml` to ZooKeeper on both Source and Target +* Upload the modified `solrconfig.xml` to ZooKeeper on both Source and Target as appropriate; see the examples above. * Sync the index directories from the Source collection to Target collection across to the corresponding shard nodes. `rsync` works well for this. + For example, if there are 2 shards on collection1 with 2 replicas for each shard, copy the corresponding index directories from