HBASE-24535: Tweak the master registry docs for branch-2 (#1890)
Updated to include changes in HBASE-24265 and some rewording to make it version agnostic.

Signed-off-by: Nick Dimiduk <ndimiduk@apache.org>
parent
21fe873eba
commit
fd5002d0da
|
@ -261,8 +261,8 @@ For region name, we only accept `byte[]` as the parameter type and it may be a f
|
|||
Information on non-Java clients and custom protocols is covered in <<external_apis>>
|
||||
|
||||
[[client.masterregistry]]
|
||||
=== Master registry (new as of release 3.0.0)
|
||||
|
||||
=== Master Registry (new as of 2.3.0)
|
||||
Client internally works with a _connection registry_ to fetch the metadata needed by connections.
|
||||
This connection registry implementation is responsible for fetching the following metadata.
|
||||
|
||||
|
@ -271,18 +271,18 @@ This connection registry implementation is responsible for fetching the followin
|
|||
* Cluster ID (unique to this cluster)
|
||||
|
||||
This information is needed as a part of various client operations like connection set up, scans,
|
||||
gets etc. Up until releases 2.x.y, the default connection registry is based on ZooKeeper as the
|
||||
source of truth and the the clients fetched the metadata from zookeeper znodes. As of release 3.0.0,
|
||||
the default implementation for connection registry has been switched to a master based
|
||||
implementation. With this change, the clients now fetch the required metadata from master RPC end
|
||||
points directly. This change was done for the following reasons.
|
||||
gets, etc. Traditionally, the connection registry implementation has been based on ZooKeeper as the
|
||||
source of truth and clients fetched the metadata directly from the ZooKeeper quorum. HBase 2.3.0
|
||||
introduces a new connection registry implementation based on direct communication with the Masters.
|
||||
With this implementation, clients now fetch required metadata via master RPC end points instead of
|
||||
maintaining connections to ZooKeeper. This change was done for the following reasons.
|
||||
|
||||
* Reduce load on ZooKeeper since that is critical for cluster operation.
|
||||
* Holistic client timeout and retry configurations since the new registry brings all the client
|
||||
operations under HBase rpc framework.
|
||||
* Remove the ZooKeeper client dependency on HBase client library.
|
||||
|
||||
This means that
|
||||
This means:
|
||||
|
||||
* At least a single active or stand by master is needed for cluster connection setup. Refer to
|
||||
<<master.runtime>> for more details.
|
||||
|
@ -293,22 +293,42 @@ HMasters instead of ZooKeeper ensemble`
|
|||
|
||||
To reduce hot-spotting on a single master, all the masters (active & stand-by) expose the needed
|
||||
service to fetch the connection metadata. This lets the client connect to any master (not just active).
|
||||
Both ZooKeeper- and Master-based connection registry implementations are available in 2.3+. For
|
||||
2.3 and earlier, the ZooKeeper-based implementation remains the default configuration.
|
||||
The Master-based implementation becomes the default in 3.0.0.
|
||||
|
||||
==== RPC hedging
|
||||
Change the connection registry implementation by updating the value configured for
|
||||
`hbase.client.registry.impl`. To explicitly enable the ZooKeeper-based registry, use
|
||||
|
||||
This feature also implements an new RPC channel that can hedge requests to multiple masters. This
|
||||
lets the client make the same request to multiple servers and which ever responds first is returned
|
||||
back to the client and the other other in-flight requests are canceled. This improves the
|
||||
performance, especially when a subset of servers are under load. The hedging fan out size is
|
||||
configurable, meaning the number of requests that are hedged in a single attempt, using the
|
||||
configuration key _hbase.rpc.hedged.fanout_ in the client configuration. It defaults to 2. With this
|
||||
default, the RPCs are tried in batches of 2. The hedging policy is still primitive and does not
|
||||
[source, xml]
|
||||
<property>
|
||||
<name>hbase.client.registry.impl</name>
|
||||
<value>org.apache.hadoop.hbase.client.ZKConnectionRegistry</value>
|
||||
</property>
|
||||
|
||||
To explicitly enable the Master-based registry, use
|
||||
|
||||
[source, xml]
|
||||
<property>
|
||||
<name>hbase.client.registry.impl</name>
|
||||
<value>org.apache.hadoop.hbase.client.MasterRegistry</value>
|
||||
</property>
|
||||
|
||||
==== MasterRegistry RPC hedging
|
||||
|
||||
MasterRegistry implements hedging of connection registry RPCs across active and stand-by masters.
|
||||
This lets the client make the same request to multiple servers and whichever responds first is
|
||||
returned to the client immediately. This improves performance, especially when a subset of
|
||||
servers are under load. The hedging fan out size is configurable, meaning the number of requests
|
||||
that are hedged in a single attempt, using the configuration key
|
||||
_hbase.client.master_registry.hedged.fanout_ in the client configuration. It defaults to 2. With
|
||||
this default, the RPCs are tried in batches of 2. The hedging policy is still primitive and does not
|
||||
adapt to any sort of live rpc performance metrics.
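
As a minimal sketch of the fan-out setting described above, a client-side configuration entry could look like the following. The key name is the one given in the text; the value 3 is only an illustrative assumption, not a recommendation (the stated default is 2).

[source, xml]
----
<!-- Illustrative only: hedge each connection registry RPC across up to 3 masters per attempt -->
<property>
  <name>hbase.client.master_registry.hedged.fanout</name>
  <value>3</value>
</property>
----

A larger fan-out presumably trades extra load on the masters for a better chance that one of them answers quickly.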
|
||||
|
||||
==== Additional Notes
|
||||
|
||||
* Clients hedge the requests in a randomized order to avoid hot-spotting a single server.
|
||||
* Cluster internal connections (master<->regionservers) still use ZooKeeper based connection
|
||||
* Clients hedge the requests in a randomized order to avoid hot-spotting a single master.
|
||||
* Cluster internal connections (masters <-> regionservers) still use ZooKeeper based connection
|
||||
registry.
|
||||
* Cluster internal state is still tracked in ZooKeeper, hence ZK availability requirements are same
|
||||
as before.
|
||||
|
|