HBASE-22700: Incorrect timeout in recommended ZooKeeper configuration
Signed-off-by: Guanghao Zhang <zghao@apache.org>
parent 9d5e5adaf0
commit 54b514d62a
|
@ -380,7 +380,7 @@ possible configurations would overwhelm and obscure the important.
|
|||
But, a region server that connects to an ensemble managed with a different configuration
|
||||
will be subjected to that ensemble's maxSessionTimeout. So, even though HBase might propose
|
||||
using 90 seconds, the ensemble can have a max timeout lower than this and it will take
|
||||
precedence. The current default that ZK ships with is 40 seconds, which is lower than
|
||||
precedence. The current default maxSessionTimeout that ZK ships with is 40 seconds, which is lower than
|
||||
HBase's.
|
||||
</description>
|
||||
</property>
|
||||
|
|
|
@ -749,7 +749,7 @@ See link:https://issues.apache.org/jira/browse/HBASE-6389[HBASE-6389 Modify the
|
|||
[[sect.zookeeper.session.timeout]]
|
||||
===== `zookeeper.session.timeout`
|
||||
|
||||
The default timeout is three minutes (specified in milliseconds). This means that if a server crashes, it will be three minutes before the Master notices the crash and starts recovery.
|
||||
The default timeout is 90 seconds (specified in milliseconds). This means that if a server crashes, it will be 90 seconds before the Master notices the crash and starts recovery.
|
||||
You might need to tune the timeout down to a minute or even less so the Master notices failures sooner.
|
||||
Before changing this value, be sure you have your JVM garbage collection configuration under control; otherwise, a long garbage collection that lasts beyond the ZooKeeper session timeout will take out your RegionServer. (You might be fine with this -- you probably want recovery to start on the server if a RegionServer has been in GC for a long period of time.)
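
To make the interaction between the HBase setting and the ensemble limits concrete, here is a minimal sketch of the arithmetic. It assumes the server clamps the requested timeout into the range from 2 to 20 times `tickTime` and uses the 2000 ms `tickTime` mentioned in this change; the function name `effective_session_timeout` and its parameters are hypothetical and not part of HBase or ZooKeeper.

```python
def effective_session_timeout(requested_ms, tick_time_ms=2000,
                              min_timeout_ms=None, max_timeout_ms=None):
    """Return the session timeout the ensemble would grant (illustrative only).

    Assumption: when the ensemble sets no explicit limits, the bounds are
    2 * tickTime and 20 * tickTime, and the request is clamped into them.
    """
    lower = 2 * tick_time_ms if min_timeout_ms is None else min_timeout_ms
    upper = 20 * tick_time_ms if max_timeout_ms is None else max_timeout_ms
    # Clamp the proposed value into [lower, upper].
    return max(lower, min(requested_ms, upper))


# HBase proposes 90 seconds, but the ensemble caps it at 20 * 2000 ms.
print(effective_session_timeout(90_000))   # 40000
# A 30 second proposal already lies inside the allowed range.
print(effective_session_timeout(30_000))   # 30000
```

Under these assumptions, raising the HBase value above the ensemble maximum has no effect unless the ensemble's own limit is raised as well.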
|
||||
|
||||
|
|
|
@ -465,7 +465,7 @@ ZooKeeper session timeout in milliseconds. It is used in two different ways.
|
|||
session timeout will be the one specified by this configuration. But, a region server that connects
|
||||
to an ensemble managed with a different configuration will be subjected to that ensemble's maxSessionTimeout. So,
|
||||
even though HBase might propose using 90 seconds, the ensemble can have a max timeout lower than this and
|
||||
it will take precedence. The current default that ZK ships with is 40 seconds, which is lower than HBase's.
|
||||
it will take precedence. The current default maxSessionTimeout that ZK ships with is 40 seconds, which is lower than HBase's.
|
||||
|
||||
+
|
||||
.Default
|
||||
|
|
|
@ -1142,6 +1142,7 @@ Disable Nagle’s algorithm. Delayed ACKs can add up to ~200ms to RPC round trip
|
|||
Detect regionserver failure as fast as reasonable. Set the following parameters:
|
||||
|
||||
* In `hbase-site.xml`, set `zookeeper.session.timeout` to 30 seconds or less to bound failure detection (20-30 seconds is a good start).
|
||||
- Note: ZooKeeper restricts the `sessionTimeout` to between 2 and 20 times the `tickTime` (the basic time unit ZooKeeper uses, in milliseconds, for heartbeats; the default value is 2000 ms), so the minimum session timeout is twice the `tickTime`.
|
||||
* Detect and avoid unhealthy or failed HDFS DataNodes: in `hdfs-site.xml` and `hbase-site.xml`, set the following parameters:
|
||||
- `dfs.namenode.avoid.read.stale.datanode = true`
|
||||
- `dfs.namenode.avoid.write.stale.datanode = true`
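
As an illustration of the settings listed so far, the sketch below renders them as Hadoop-style `<configuration>` XML. It is only a sketch: the helper `hadoop_site_xml` is hypothetical, the 30000 ms value is one choice from the suggested 20-30 second range, and only the parameters visible in this hunk are included.

```python
import xml.etree.ElementTree as ET


def hadoop_site_xml(properties):
    """Render a dict of settings as a Hadoop-style <configuration> document."""
    root = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = str(value)
    return ET.tostring(root, encoding="unicode")


# Stale-DataNode settings go into both hdfs-site.xml and hbase-site.xml.
stale_datanode = {
    "dfs.namenode.avoid.read.stale.datanode": "true",
    "dfs.namenode.avoid.write.stale.datanode": "true",
}

# zookeeper.session.timeout is given in milliseconds (30 seconds here).
hbase_site = hadoop_site_xml({"zookeeper.session.timeout": 30000, **stale_datanode})
hdfs_site = hadoop_site_xml(stale_datanode)

print(hbase_site)
print(hdfs_site)
```

In a real deployment these properties would be merged into the existing site files rather than generated from scratch.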
|
||||
|
|