HBASE-22700: incorrect timeout in recommended ZooKeeper configuration
Signed-off-by: Guanghao Zhang <zghao@apache.org>
parent 9d5e5adaf0
commit 54b514d62a
|
@@ -380,7 +380,7 @@ possible configurations would overwhelm and obscure the important.
|
||||||
But, a region server that connects to an ensemble managed with a different configuration
|
But, a region server that connects to an ensemble managed with a different configuration
|
||||||
will be subjected to that ensemble's maxSessionTimeout. So, even though HBase might propose
|
will be subjected to that ensemble's maxSessionTimeout. So, even though HBase might propose
|
||||||
using 90 seconds, the ensemble can have a max timeout lower than this and it will take
|
using 90 seconds, the ensemble can have a max timeout lower than this and it will take
|
||||||
precedence. The current default that ZK ships with is 40 seconds, which is lower than
|
precedence. The current default maxSessionTimeout that ZK ships with is 40 seconds, which is lower than
|
||||||
HBase's.
|
HBase's.
|
||||||
</description>
|
</description>
|
||||||
</property>
|
</property>
|
||||||
|
|
|
@@ -749,7 +749,7 @@ See link:https://issues.apache.org/jira/browse/HBASE-6389[HBASE-6389 Modify the
|
||||||
[[sect.zookeeper.session.timeout]]
|
[[sect.zookeeper.session.timeout]]
|
||||||
===== `zookeeper.session.timeout`
|
===== `zookeeper.session.timeout`
|
||||||
|
|
||||||
The default timeout is three minutes (specified in milliseconds). This means that if a server crashes, it will be three minutes before the Master notices the crash and starts recovery.
|
The default timeout is 90 seconds (specified in milliseconds). This means that if a server crashes, it will be 90 seconds before the Master notices the crash and starts recovery.
|
||||||
You might need to tune the timeout down to a minute or even less so the Master notices failures sooner.
|
You might need to tune the timeout down to a minute or even less so the Master notices failures sooner.
|
||||||
Before changing this value, be sure you have your JVM garbage collection configuration under control, otherwise, a long garbage collection that lasts beyond the ZooKeeper session timeout will take out your RegionServer. (You might be fine with this -- you probably want recovery to start on the server if a RegionServer has been in GC for a long period of time).
|
Before changing this value, be sure you have your JVM garbage collection configuration under control, otherwise, a long garbage collection that lasts beyond the ZooKeeper session timeout will take out your RegionServer. (You might be fine with this -- you probably want recovery to start on the server if a RegionServer has been in GC for a long period of time).
|
||||||
|
|
||||||
|
|
|
@@ -465,7 +465,7 @@ ZooKeeper session timeout in milliseconds. It is used in two different ways.
|
||||||
session timeout will be the one specified by this configuration. But, a region server that connects
|
session timeout will be the one specified by this configuration. But, a region server that connects
|
||||||
to an ensemble managed with a different configuration will be subjected to that ensemble's maxSessionTimeout. So,
|
to an ensemble managed with a different configuration will be subjected to that ensemble's maxSessionTimeout. So,
|
||||||
even though HBase might propose using 90 seconds, the ensemble can have a max timeout lower than this and
|
even though HBase might propose using 90 seconds, the ensemble can have a max timeout lower than this and
|
||||||
it will take precedence. The current default that ZK ships with is 40 seconds, which is lower than HBase's.
|
it will take precedence. The current default maxSessionTimeout that ZK ships with is 40 seconds, which is lower than HBase's.
|
||||||
|
|
||||||
+
|
+
|
||||||
.Default
|
.Default
|
||||||
|
|
|
@@ -1142,6 +1142,7 @@ Disable Nagle’s algorithm. Delayed ACKs can add up to ~200ms to RPC round trip
|
||||||
Detect regionserver failure as fast as reasonable. Set the following parameters:
|
Detect regionserver failure as fast as reasonable. Set the following parameters:
|
||||||
|
|
||||||
* In `hbase-site.xml`, set `zookeeper.session.timeout` to 30 seconds or less to bound failure detection (20-30 seconds is a good start).
|
* In `hbase-site.xml`, set `zookeeper.session.timeout` to 30 seconds or less to bound failure detection (20-30 seconds is a good start).
|
||||||
|
- Notice: the ZooKeeper `sessionTimeout` is limited to between 2 and 20 times the `tickTime` (the basic time unit, in milliseconds, used by ZooKeeper; the default value is 2000 ms. It is used for heartbeats, and the minimum session timeout is twice the `tickTime`).
|
||||||
* Detect and avoid unhealthy or failed HDFS DataNodes: in `hdfs-site.xml` and `hbase-site.xml`, set the following parameters:
|
* Detect and avoid unhealthy or failed HDFS DataNodes: in `hdfs-site.xml` and `hbase-site.xml`, set the following parameters:
|
||||||
- `dfs.namenode.avoid.read.stale.datanode = true`
|
- `dfs.namenode.avoid.read.stale.datanode = true`
|
||||||
- `dfs.namenode.avoid.write.stale.datanode = true`
|
- `dfs.namenode.avoid.write.stale.datanode = true`
|
||||||
|
|
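The interaction described in the hunks above (HBase proposes a session timeout, and the ensemble clamps it to its own bounds) can be sketched as follows. This is a minimal illustration, not code from HBase or ZooKeeper: the function name `negotiated_session_timeout_ms` and its parameters are hypothetical, and it assumes the ensemble keeps the default bounds of 2 and 20 times `tickTime` stated in the notice.

```python
# Hypothetical helper for illustration only; not part of HBase or ZooKeeper.
DEFAULT_TICK_TIME_MS = 2000  # default tickTime, as stated in the notice above


def negotiated_session_timeout_ms(requested_ms, tick_time_ms=DEFAULT_TICK_TIME_MS,
                                  min_session_timeout_ms=None,
                                  max_session_timeout_ms=None):
    """Return the session timeout the ensemble would grant for a client request.

    Assumes the bounds default to 2x and 20x tickTime when not set explicitly.
    """
    lower = 2 * tick_time_ms if min_session_timeout_ms is None else min_session_timeout_ms
    upper = 20 * tick_time_ms if max_session_timeout_ms is None else max_session_timeout_ms
    # The ensemble's bounds take precedence over whatever the client proposes.
    return max(lower, min(requested_ms, upper))


if __name__ == "__main__":
    # HBase proposes 90 seconds; a default ensemble caps it at 20 * 2000 ms.
    print(negotiated_session_timeout_ms(90_000))
    # A 30-second request falls inside the default bounds and is kept.
    print(negotiated_session_timeout_ms(30_000))
    # Raising the ensemble's maxSessionTimeout lets the 90-second proposal through.
    print(negotiated_session_timeout_ms(90_000, max_session_timeout_ms=120_000))
```

Under these assumptions, a `zookeeper.session.timeout` of 90 seconds in `hbase-site.xml` only takes effect if the ensemble's `maxSessionTimeout` (or its `tickTime`) is raised to allow it; the 20-30 second values recommended for fast failure detection already fit within the default bounds.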