HBASE-22700:incorrect timeout in recommended ZooKeeper configuration

Signed-off-by: Guanghao Zhang <zghao@apache.org>
毛蛤丝 2019-07-17 22:08:52 +08:00 committed by Guanghao
parent 9d5e5adaf0
commit 54b514d62a
4 changed files with 4 additions and 3 deletions

@@ -380,7 +380,7 @@ possible configurations would overwhelm and obscure the important.
 But, a region server that connects to an ensemble managed with a different configuration
 will be subjected that ensemble's maxSessionTimeout. So, even though HBase might propose
 using 90 seconds, the ensemble can have a max timeout lower than this and it will take
-precedence. The current default that ZK ships with is 40 seconds, which is lower than
+precedence. The current default maxSessionTimeout that ZK ships with is 40 seconds, which is lower than
 HBase's.
 </description>
 </property>

@@ -749,7 +749,7 @@ See link:https://issues.apache.org/jira/browse/HBASE-6389[HBASE-6389 Modify the
 [[sect.zookeeper.session.timeout]]
 ===== `zookeeper.session.timeout`
-The default timeout is three minutes (specified in milliseconds). This means that if a server crashes, it will be three minutes before the Master notices the crash and starts recovery.
+The default timeout is 90 seconds (specified in milliseconds). This means that if a server crashes, it will be 90 seconds before the Master notices the crash and starts recovery.
 You might need to tune the timeout down to a minute or even less so the Master notices failures sooner.
 Before changing this value, be sure you have your JVM garbage collection configuration under control, otherwise, a long garbage collection that lasts beyond the ZooKeeper session timeout will take out your RegionServer. (You might be fine with this -- you probably want recovery to start on the server if a RegionServer has been in GC for a long period of time).

@@ -465,7 +465,7 @@ ZooKeeper session timeout in milliseconds. It is used in two different ways.
 session timeout will be the one specified by this configuration. But, a region server that connects
 to an ensemble managed with a different configuration will be subjected that ensemble's maxSessionTimeout. So,
 even though HBase might propose using 90 seconds, the ensemble can have a max timeout lower than this and
-it will take precedence. The current default that ZK ships with is 40 seconds, which is lower than HBase's.
+it will take precedence. The current default maxSessionTimeout that ZK ships with is 40 seconds, which is lower than HBase's.
 +
 .Default

@@ -1142,6 +1142,7 @@ Disable Nagles algorithm. Delayed ACKs can add up to ~200ms to RPC round trip
 Detect regionserver failure as fast as reasonable. Set the following parameters:
 * In `hbase-site.xml`, set `zookeeper.session.timeout` to 30 seconds or less to bound failure detection (20-30 seconds is a good start).
+- Notice: the `sessionTimeout` of zookeeper is limited between 2 times and 20 times the `tickTime`(the basic time unit in milliseconds used by ZooKeeper.the default value is 2000ms.It is used to do heartbeats and the minimum session timeout will be twice the tickTime).
 * Detect and avoid unhealthy or failed HDFS DataNodes: in `hdfs-site.xml` and `hbase-site.xml`, set the following parameters:
 - `dfs.namenode.avoid.read.stale.datanode = true`
 - `dfs.namenode.avoid.write.stale.datanode = true`
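
Reviewer note: the arithmetic behind the documentation changes in this commit can be checked with a small sketch. It assumes the behaviour described in the hunks above: the ensemble bounds a requested session timeout between 2 and 20 times `tickTime`, and `tickTime` is 2000 ms as the added note states. The function `negotiated_session_timeout` and its parameter names are hypothetical, chosen for illustration only; they are not part of the HBase or ZooKeeper API.

```python
def negotiated_session_timeout(requested_ms, tick_time_ms=2000,
                               min_session_timeout_ms=None,
                               max_session_timeout_ms=None):
    """Return the session timeout (ms) an ensemble would grant under the
    bounds described in the note above. Hypothetical helper, not a real API."""
    # When no explicit bounds are given, derive them from tickTime:
    # the floor is 2 * tickTime and the ceiling is 20 * tickTime.
    if min_session_timeout_ms is None:
        min_session_timeout_ms = 2 * tick_time_ms
    if max_session_timeout_ms is None:
        max_session_timeout_ms = 20 * tick_time_ms
    # Clamp the requested value into [min, max].
    return max(min_session_timeout_ms,
               min(requested_ms, max_session_timeout_ms))


# HBase proposes 90 s, but a 2000 ms tickTime caps the session at 40 s.
print(negotiated_session_timeout(90_000))  # 40000
# The recommended 20-30 s range fits inside the 4 s to 40 s window.
print(negotiated_session_timeout(30_000))  # 30000
# A request below the floor is raised to 2 * tickTime.
print(negotiated_session_timeout(1_000))   # 4000
```

Under these assumptions, the 90-second value in the corrected text takes effect only if the ensemble's `maxSessionTimeout` (or its `tickTime`) is raised. The 20-30 second recommendation already fits within the bounds described.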