diff --git a/src/main/asciidoc/_chapters/appendix_hfile_format.adoc b/src/main/asciidoc/_chapters/appendix_hfile_format.adoc index 0f37beb3c88..98659c26ccf 100644 --- a/src/main/asciidoc/_chapters/appendix_hfile_format.adoc +++ b/src/main/asciidoc/_chapters/appendix_hfile_format.adoc @@ -106,11 +106,11 @@ In the version 2 every block in the data section contains the following fields: .. BLOOM_CHUNK – Bloom filter chunks .. META – meta blocks (not used for Bloom filters in version 2 anymore) .. INTERMEDIATE_INDEX – intermediate-level index blocks in a multi-level blockindex -.. ROOT_INDEX – root>level index blocks in a multi>level block index -.. FILE_INFO – the ``file info'' block, a small key>value map of metadata -.. BLOOM_META – a Bloom filter metadata block in the load>on>open section -.. TRAILER – a fixed>size file trailer. - As opposed to the above, this is not an HFile v2 block but a fixed>size (for each HFile version) data structure +.. ROOT_INDEX – root-level index blocks in a multi-level block index +.. FILE_INFO – the ''file info'' block, a small key-value map of metadata +.. BLOOM_META – a Bloom filter metadata block in the load-on-open section +.. TRAILER – a fixed-size file trailer. + As opposed to the above, this is not an HFile v2 block but a fixed-size (for each HFile version) data structure .. INDEX_V1 – this block type is only used for legacy HFile v1 block . Compressed size of the block's data, not including the header (int). + @@ -127,7 +127,7 @@ The above format of blocks is used in the following HFile sections: Scanned block section:: The section is named so because it contains all data blocks that need to be read when an HFile is scanned sequentially. - Also contains leaf block index and Bloom chunk blocks. + Also contains Leaf index blocks and Bloom chunk blocks. Non-scanned block section:: This section still contains unified-format v2 blocks but it does not have to be read when doing a sequential scan. This section contains "meta" blocks and intermediate-level index blocks. @@ -140,10 +140,10 @@ There are three types of block indexes in HFile version 2, stored in two differe . Data index -- version 2 multi-level block index, consisting of: .. Version 2 root index, stored in the data block index section of the file -.. Optionally, version 2 intermediate levels, stored in the non%root format in the data index section of the file. Intermediate levels can only be present if leaf level blocks are present -.. Optionally, version 2 leaf levels, stored in the non%root format inline with data blocks +.. Optionally, version 2 intermediate levels, stored in the non-root format in the data index section of the file. Intermediate levels can only be present if leaf level blocks are present +.. Optionally, version 2 leaf levels, stored in the non-root format inline with data blocks . Meta index -- version 2 root index format only, stored in the meta index section of the file -. Bloom index -- version 2 root index format only, stored in the ``load-on-open'' section as part of Bloom filter metadata. +. Bloom index -- version 2 root index format only, stored in the ''load-on-open'' section as part of Bloom filter metadata. ==== Root block index format in version 2 @@ -156,7 +156,7 @@ A version 2 root index block is a sequence of entries of the following format, s . Offset (long) + -This offset may point to a data block or to a deeper>level index block. +This offset may point to a data block or to a deeper-level index block. . On-disk size (int) . 
Key (a serialized byte array stored using Bytes.writeByteArray) @@ -172,7 +172,7 @@ For the data index and the meta index the number of entries is stored in the tra For a multi-level block index we also store the following fields in the root index block in the load-on-open section of the HFile, in addition to the data structure described above: . Middle leaf index block offset -. Middle leaf block on-disk size (meaning the leaf index block containing the reference to the ``middle'' data block of the file) +. Middle leaf block on-disk size (meaning the leaf index block containing the reference to the ''middle'' data block of the file) . The index of the mid-key (defined below) in the middle leaf-level block. @@ -200,9 +200,9 @@ Every non-root index block is structured as follows. . Entries. Each entry contains: + -. Offset of the block referenced by this entry in the file (long) -. On>disk size of the referenced block (int) -. Key. +.. Offset of the block referenced by this entry in the file (long) +.. On-disk size of the referenced block (int) +.. Key. The length can be calculated from entryOffsets. @@ -214,7 +214,7 @@ In contrast with version 1, in a version 2 HFile Bloom filter metadata is stored + . Bloom filter version = 3 (int). There used to be a DynamicByteBloomFilter class that had the Bloom filter version number 2 . The total byte size of all compound Bloom filter chunks (long) -. Number of hash functions (int +. Number of hash functions (int) . Type of hash functions (int) . The total key count inserted into the Bloom filter (long) . The maximum total number of keys in the Bloom filter (long) @@ -246,7 +246,7 @@ This is because we need to know the comparator at the time of parsing the load-o ==== Fixed file trailer format differences between versions 1 and 2 The following table shows common and different fields between fixed file trailers in versions 1 and 2. -Note that the size of the trailer is different depending on the version, so it is ``fixed'' only within one version. +Note that the size of the trailer is different depending on the version, so it is ''fixed'' only within one version. However, the version is always stored as the last four-byte integer in the file. .Differences between HFile Versions 1 and 2 diff --git a/src/main/asciidoc/_chapters/architecture.adoc b/src/main/asciidoc/_chapters/architecture.adoc index c942e5418e8..218e6744a56 100644 --- a/src/main/asciidoc/_chapters/architecture.adoc +++ b/src/main/asciidoc/_chapters/architecture.adoc @@ -260,6 +260,93 @@ For region name, we only accept `byte[]` as the parameter type and it may be a f Information on non-Java clients and custom protocols is covered in <> +[[client.masterregistry]] + +=== Master Registry (new as of 2.3.0) +Client internally works with a _connection registry_ to fetch the metadata needed by connections. +This connection registry implementation is responsible for fetching the following metadata. + +* Active master address +* Current meta region(s) locations +* Cluster ID (unique to this cluster) + +This information is needed as a part of various client operations like connection set up, scans, +gets, etc. Traditionally, the connection registry implementation has been based on ZooKeeper as the +source of truth and clients fetched the metadata directly from the ZooKeeper quorum. HBase 2.3.0 +introduces a new connection registry implementation based on direct communication with the Masters. 
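From the application's perspective nothing changes in the client API; the registry is consulted internally during connection setup. As a rough illustration (the table and row names below are placeholders, the property value anticipates the Master-based registry described later in this section, and the usual `org.apache.hadoop.hbase` client imports are assumed):

[source,java]
----
// Minimal sketch: the connection registry implementation is chosen from
// configuration when the Connection is created; client code is otherwise unchanged.
Configuration conf = HBaseConfiguration.create();
conf.set("hbase.client.registry.impl",
    "org.apache.hadoop.hbase.client.MasterRegistry");

try (Connection connection = ConnectionFactory.createConnection(conf);
     Table table = connection.getTable(TableName.valueOf("my_table"))) {
  // The cluster ID, active master address and meta region locations needed to
  // route this Get are fetched through the configured registry.
  Result result = table.get(new Get(Bytes.toBytes("some_row")));
}
----

The same code runs unchanged against the ZooKeeper-based registry; only the configuration differs.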
With this implementation, clients now fetch the required metadata via master RPC endpoints instead of maintaining connections to ZooKeeper. This change was made for the following reasons.

* Reduce the load on ZooKeeper, since it is critical for cluster operation.
* Holistic client timeout and retry configuration, since the new registry brings all client operations under the HBase RPC framework.
* Remove the ZooKeeper client dependency from the HBase client library.

This means:

* At least one active or standby master is needed for cluster connection setup. Refer to <> for more details.
* The Master can be in the critical path of read/write operations, especially if the client metadata cache is empty or stale.
* There is a higher connection load on the masters than before, since clients talk directly to the HMasters instead of the ZooKeeper ensemble.

To reduce hot-spotting on a single master, all the masters (active and standby) expose the needed service to fetch the connection metadata. This lets the client connect to any master (not just the active one).
Both ZooKeeper- and Master-based connection registry implementations are available in 2.3+. For the 2.x release line, the ZooKeeper-based implementation remains the default configuration.
The Master-based implementation becomes the default in 3.0.0.

Change the connection registry implementation by updating the value configured for `hbase.client.registry.impl`. To explicitly enable the ZooKeeper-based registry, use

[source, xml]
----
<property>
    <name>hbase.client.registry.impl</name>
    <value>org.apache.hadoop.hbase.client.ZKConnectionRegistry</value>
</property>
----

To explicitly enable the Master-based registry, use

[source, xml]
----
<property>
    <name>hbase.client.registry.impl</name>
    <value>org.apache.hadoop.hbase.client.MasterRegistry</value>
</property>
----

==== MasterRegistry RPC hedging

MasterRegistry implements hedging of connection registry RPCs across the active and standby masters.
This lets the client make the same request to multiple servers; whichever responds first is returned to the client immediately. This improves performance, especially when a subset of servers is under load.
The hedging fan-out size, meaning the number of requests that are hedged in a single attempt, is configurable through the key _hbase.client.master_registry.hedged.fanout_ in the client configuration. It defaults to 2, so with the default the RPCs are tried in batches of 2. The hedging policy is still primitive and does not adapt to any sort of live RPC performance metrics.

==== Additional Notes

* Clients hedge the requests in a randomized order to avoid hot-spotting a single master.
* Cluster-internal connections (masters <-> regionservers) still use the ZooKeeper-based connection registry.
* Cluster-internal state is still tracked in ZooKeeper, hence the ZooKeeper availability requirements are the same as before.
* Inter-cluster replication still uses the ZooKeeper-based connection registry to simplify configuration management.

For more implementation details, please refer to the https://github.com/apache/hbase/tree/master/dev-support/design-docs[design doc] and https://issues.apache.org/jira/browse/HBASE-18095[HBASE-18095].

'''
NOTE: (Advanced) In case of any issues with the Master-based registry, use the following configuration to fall back to the ZooKeeper-based connection registry implementation.
+[source, xml] + + hbase.client.registry.impl + org.apache.hadoop.hbase.client.ZKConnectionRegistry + + [[client.filter]] == Client Request Filters @@ -466,8 +553,8 @@ scan.setFilter(f); scan.setBatch(10); // set this if there could be many columns returned ResultScanner rs = t.getScanner(scan); for (Result r = rs.next(); r != null; r = rs.next()) { - for (KeyValue kv : r.raw()) { - // each kv represents a column + for (Cell cell : result.listCells()) { + // each cell represents a column } } rs.close(); @@ -496,8 +583,8 @@ scan.setFilter(f); scan.setBatch(10); // set this if there could be many columns returned ResultScanner rs = t.getScanner(scan); for (Result r = rs.next(); r != null; r = rs.next()) { - for (KeyValue kv : r.raw()) { - // each kv represents a column + for (Cell cell : result.listCells()) { + // each cell represents a column } } rs.close(); @@ -532,8 +619,8 @@ scan.setFilter(f); scan.setBatch(10); // set this if there could be many columns returned ResultScanner rs = t.getScanner(scan); for (Result r = rs.next(); r != null; r = rs.next()) { - for (KeyValue kv : r.raw()) { - // each kv represents a column + for (Cell cell : result.listCells()) { + // each cell represents a column } } rs.close(); @@ -577,11 +664,24 @@ If the active Master loses its lease in ZooKeeper (or the Master shuts down), th [[master.runtime]] === Runtime Impact -A common dist-list question involves what happens to an HBase cluster when the Master goes down. +A common dist-list question involves what happens to an HBase cluster when the Master goes down. This information has changed staring 3.0.0. + +==== Up until releases 2.x.y Because the HBase client talks directly to the RegionServers, the cluster can still function in a "steady state". Additionally, per <>, `hbase:meta` exists as an HBase table and is not resident in the Master. However, the Master controls critical functions such as RegionServer failover and completing region splits. So while the cluster can still run for a short time without the Master, the Master should be restarted as soon as possible. +==== Staring release 3.0.0 +As mentioned in section <>, the default connection registry for clients is now based on master rpc end points. Hence the requirements for +masters' uptime are even tighter starting this release. + +- At least one active or stand by master is needed for a connection set up, unlike before when all the clients needed was a ZooKeeper ensemble. +- Master is now in critical path for read/write operations. For example, if the meta region bounces off to a different region server, clients +need master to fetch the new locations. Earlier this was done by fetching this information directly from ZooKeeper. +- Masters will now have higher connection load than before. So, the server side configuration might need adjustment depending on the load. + +Overall, the master uptime requirements, when this feature is enabled, are even higher for the client operations to go through. + [[master.api]] === Interface @@ -610,6 +710,83 @@ See <> for more information on region assignment. Periodically checks and cleans up the `hbase:meta` table. See <> for more information on the meta table. +[[master.wal]] +=== MasterProcWAL + +_MasterProcWAL is replaced in hbase-2.3.0 by an alternate Procedure Store implementation; see +<>. 
This section pertains to hbase-2.0.0 through hbase-2.2.x_

HMaster records administrative operations and their running states, such as the handling of a crashed server, table creation, and other DDLs, into a Procedure Store. The Procedure Store WALs are stored under the MasterProcWALs directory. The Master WALs are not like RegionServer WALs. Keeping up the Master WAL allows us to run a state machine that is resilient across Master failures. For example, if an HMaster that was in the middle of creating a table encounters an issue and fails, the next active HMaster can take up where the previous one left off and carry the operation to completion. Since hbase-2.0.0, a new AssignmentManager (a.k.a. AMv2) was introduced and the HMaster handles region assignment operations, server crash processing, balancing, etc., all via AMv2, persisting all state and transitions into MasterProcWALs rather than up into ZooKeeper, as we do in hbase-1.x.

See <> (and <> for its basis) if you would like to learn more about the new AssignmentManager.

[[master.wal.conf]]
==== Configurations for MasterProcWAL
Here is the list of configurations that affect MasterProcWAL operation.
You should not have to change your defaults.

[[hbase.procedure.store.wal.periodic.roll.msec]]
*`hbase.procedure.store.wal.periodic.roll.msec`*::
+
.Description
Frequency of generating a new WAL.
+
.Default
`1h (3600000 in msec)`

[[hbase.procedure.store.wal.roll.threshold]]
*`hbase.procedure.store.wal.roll.threshold`*::
+
.Description
Threshold in size before the WAL rolls. Every time the WAL reaches this size, or the above period (1 hour) passes since the last log roll, the HMaster will generate a new WAL.
+
.Default
`32MB (33554432 in bytes)`

[[hbase.procedure.store.wal.warn.threshold]]
*`hbase.procedure.store.wal.warn.threshold`*::
+
.Description
If the number of WALs goes beyond this threshold, the following message should appear in the HMaster log with WARN level when rolling.

 procedure WALs count=xx above the warning threshold 64. check running procedures to see if something is stuck.

+
.Default
`64`

[[hbase.procedure.store.wal.max.retries.before.roll]]
*`hbase.procedure.store.wal.max.retries.before.roll`*::
+
.Description
Maximum number of retries when syncing slots (records) to the underlying storage, such as HDFS. On every attempt, the following message should appear in the HMaster log.

 unable to sync slots, retry=xx

+
.Default
`3`

[[hbase.procedure.store.wal.sync.failure.roll.max]]
*`hbase.procedure.store.wal.sync.failure.roll.max`*::
+
.Description
After the above 3 retries, the log is rolled and the retry count is reset to 0, at which point a new set of retries starts. This configuration controls the maximum number of log-roll attempts upon sync failure. That is, the HMaster is allowed to fail to sync 9 times in total. Once this is exceeded, the following log should appear in the HMaster log.

 Sync slots after log roll failed, abort.
+
.Default
`3`

 [[regionserver.arch]]
 == RegionServer
@@ -779,7 +956,7 @@ Here are two use cases:
 Setting block caching on such a table is a waste of memory and CPU cycles, more so that it will generate more garbage to pick up by the JVM. For more information on monitoring GC, see <>.
 * Mapping a table: In a typical MapReduce job that takes a table in input, every row will be read only once so there's no need to put them into the block cache.
- The Scan object has the option of turning this off via the setCaching method (set it to false).
You can still keep block caching turned on on this table if you need fast random read access. + The Scan object has the option of turning this off via the setCacheBlocks method (set it to false). You can still keep block caching turned on on this table if you need fast random read access. An example would be counting the number of rows in a table that serves live traffic, caching every block of that table would create massive churn and would surely evict data that's currently in use. [[data.blocks.in.fscache]] @@ -831,8 +1008,9 @@ benefit of NOT provoking GC. From HBase 2.0.0 onwards, the notions of L1 and L2 have been deprecated. When BucketCache is turned on, the DATA blocks will always go to BucketCache and INDEX/BLOOM blocks go to on heap LRUBlockCache. `cacheDataInL1` support hase been removed. ==== -The BucketCache Block Cache can be deployed _off-heap_, _file_ or _mmaped_ file mode. - +[[bc.deloy.modes]] +====== BucketCache Deploy Modes +The BucketCache Block Cache can be deployed _offheap_, _file_ or _mmaped_ file mode. You set which via the `hbase.bucketcache.ioengine` setting. Setting it to `offheap` will have BucketCache make its allocations off-heap, and an ioengine setting of `file:PATH_TO_FILE` will direct BucketCache to use file caching (Useful in particular if you have some fast I/O attached to the box such as SSDs). From 2.0.0, it is possible to have more than one file backing the BucketCache. This is very useful specially when the Cache size requirement is high. For multiple backing files, configure ioengine as `files:PATH_TO_FILE1,PATH_TO_FILE2,PATH_TO_FILE3`. BucketCache can be configured to use an mmapped file also. Configure ioengine as `mmap:PATH_TO_FILE` for this. @@ -851,6 +1029,7 @@ See the link:https://hbase.apache.org/devapidocs/org/apache/hadoop/hbase/io/hfil To check it enabled, look for the log line describing cache setup; it will detail how BucketCache has been deployed. Also see the UI. It will detail the cache tiering and their configuration. +[[bc.example]] ====== BucketCache Example Configuration This sample provides a configuration for a 4 GB off-heap BucketCache with a 1 GB on-heap cache. @@ -894,14 +1073,14 @@ In the above, we set the BucketCache to be 4G. We configured the on-heap LruBlockCache have 20% (0.2) of the RegionServer's heap size (0.2 * 5G = 1G). In other words, you configure the L1 LruBlockCache as you would normally (as if there were no L2 cache present). link:https://issues.apache.org/jira/browse/HBASE-10641[HBASE-10641] introduced the ability to configure multiple sizes for the buckets of the BucketCache, in HBase 0.98 and newer. -To configurable multiple bucket sizes, configure the new property `hfile.block.cache.sizes` (instead of `hfile.block.cache.size`) to a comma-separated list of block sizes, ordered from smallest to largest, with no spaces. +To configurable multiple bucket sizes, configure the new property `hbase.bucketcache.bucket.sizes` to a comma-separated list of block sizes, ordered from smallest to largest, with no spaces. The goal is to optimize the bucket sizes based on your data access patterns. The following example configures buckets of size 4096 and 8192. [source,xml] ---- - hfile.block.cache.sizes + hbase.bucketcache.bucket.sizes 4096,8192 ---- @@ -1145,127 +1324,40 @@ WAL log splitting and recovery can be resource intensive and take a long time, d Distributed log processing is enabled by default since HBase 0.92. 
The setting is controlled by the `hbase.master.distributed.log.splitting` property, which can be set to `true` or `false`, but defaults to `true`. -[[log.splitting.step.by.step]] -.Distributed Log Splitting, Step by Step +==== WAL splitting based on procedureV2 +After HBASE-20610, we introduce a new way to do WAL splitting coordination by procedureV2 framework. This can simplify the process of WAL splitting and no need to connect zookeeper any more. -After configuring distributed log splitting, the HMaster controls the process. -The HMaster enrolls each RegionServer in the log splitting process, and the actual work of splitting the logs is done by the RegionServers. -The general process for log splitting, as described in <> still applies here. +[[background]] +.Background +Currently, splitting WAL processes are coordinated by zookeeper. Each region server are trying to grab tasks from zookeeper. And the burden becomes heavier when the number of region server increase. -. If distributed log processing is enabled, the HMaster creates a _split log manager_ instance when the cluster is started. - .. The split log manager manages all log files which need to be scanned and split. - .. The split log manager places all the logs into the ZooKeeper splitWAL node (_/hbase/splitWAL_) as tasks. - .. You can view the contents of the splitWAL by issuing the following `zkCli` command. Example output is shown. -+ -[source,bash] ----- -ls /hbase/splitWAL -[hdfs%3A%2F%2Fhost2.sample.com%3A56020%2Fhbase%2FWALs%2Fhost8.sample.com%2C57020%2C1340474893275-splitting%2Fhost8.sample.com%253A57020.1340474893900, -hdfs%3A%2F%2Fhost2.sample.com%3A56020%2Fhbase%2FWALs%2Fhost3.sample.com%2C57020%2C1340474893299-splitting%2Fhost3.sample.com%253A57020.1340474893931, -hdfs%3A%2F%2Fhost2.sample.com%3A56020%2Fhbase%2FWALs%2Fhost4.sample.com%2C57020%2C1340474893287-splitting%2Fhost4.sample.com%253A57020.1340474893946] ----- -+ -The output contains some non-ASCII characters. -When decoded, it looks much more simple: -+ ----- -[hdfs://host2.sample.com:56020/hbase/WALs -/host8.sample.com,57020,1340474893275-splitting -/host8.sample.com%3A57020.1340474893900, -hdfs://host2.sample.com:56020/hbase/WALs -/host3.sample.com,57020,1340474893299-splitting -/host3.sample.com%3A57020.1340474893931, -hdfs://host2.sample.com:56020/hbase/WALs -/host4.sample.com,57020,1340474893287-splitting -/host4.sample.com%3A57020.1340474893946] ----- -+ -The listing represents WAL file names to be scanned and split, which is a list of log splitting tasks. +[[implementation.on.master.side]] +.Implementation on Master side +During ServerCrashProcedure, SplitWALManager will create one SplitWALProcedure for each WAL file which should be split. Then each SplitWALProcedure will spawn a SplitWalRemoteProcedure to send the request to region server. +SplitWALProcedure is a StateMachineProcedure and here is the state transfer diagram. -. The split log manager monitors the log-splitting tasks and workers. -+ -The split log manager is responsible for the following ongoing tasks: -+ -* Once the split log manager publishes all the tasks to the splitWAL znode, it monitors these task nodes and waits for them to be processed. -* Checks to see if there are any dead split log workers queued up. - If it finds tasks claimed by unresponsive workers, it will resubmit those tasks. - If the resubmit fails due to some ZooKeeper exception, the dead worker is queued up again for retry. -* Checks to see if there are any unassigned tasks. 
- If it finds any, it create an ephemeral rescan node so that each split log worker is notified to re-scan unassigned tasks via the `nodeChildrenChanged` ZooKeeper event. -* Checks for tasks which are assigned but expired. - If any are found, they are moved back to `TASK_UNASSIGNED` state again so that they can be retried. - It is possible that these tasks are assigned to slow workers, or they may already be finished. - This is not a problem, because log splitting tasks have the property of idempotence. - In other words, the same log splitting task can be processed many times without causing any problem. -* The split log manager watches the HBase split log znodes constantly. - If any split log task node data is changed, the split log manager retrieves the node data. - The node data contains the current state of the task. - You can use the `zkCli` `get` command to retrieve the current state of a task. - In the example output below, the first line of the output shows that the task is currently unassigned. -+ ----- -get /hbase/splitWAL/hdfs%3A%2F%2Fhost2.sample.com%3A56020%2Fhbase%2FWALs%2Fhost6.sample.com%2C57020%2C1340474893287-splitting%2Fhost6.sample.com%253A57020.1340474893945 +.WAL_splitting_coordination +image::WAL_splitting.png[] -unassigned host2.sample.com:57000 -cZxid = 0×7115 -ctime = Sat Jun 23 11:13:40 PDT 2012 -... ----- -+ -Based on the state of the task whose data is changed, the split log manager does one of the following: -+ -* Resubmit the task if it is unassigned -* Heartbeat the task if it is assigned -* Resubmit or fail the task if it is resigned (see <>) -* Resubmit or fail the task if it is completed with errors (see <>) -* Resubmit or fail the task if it could not complete due to errors (see <>) -* Delete the task if it is successfully completed or failed -+ -[[distributed.log.replay.failure.reasons]] -[NOTE] -.Reasons a Task Will Fail -==== -* The task has been deleted. -* The node no longer exists. -* The log status manager failed to move the state of the task to `TASK_UNASSIGNED`. -* The number of resubmits is over the resubmit threshold. -==== +[[implementation.on.region.server.side]] +.Implementation on Region Server side +Region Server will receive a SplitWALCallable and execute it, which is much more straightforward than before. It will return null if success and return exception if there is any error. -. Each RegionServer's split log worker performs the log-splitting tasks. -+ -Each RegionServer runs a daemon thread called the _split log worker_, which does the work to split the logs. -The daemon thread starts when the RegionServer starts, and registers itself to watch HBase znodes. -If any splitWAL znode children change, it notifies a sleeping worker thread to wake up and grab more tasks. -If a worker's current task's node data is changed, -the worker checks to see if the task has been taken by another worker. -If so, the worker thread stops work on the current task. -+ -The worker monitors the splitWAL znode constantly. -When a new task appears, the split log worker retrieves the task paths and checks each one until it finds an unclaimed task, which it attempts to claim. -If the claim was successful, it attempts to perform the task and updates the task's `state` property based on the splitting outcome. -At this point, the split log worker scans for another unclaimed task. -+ -.How the Split Log Worker Approaches a Task -* It queries the task state and only takes action if the task is in `TASK_UNASSIGNED `state. 
-* If the task is in `TASK_UNASSIGNED` state, the worker attempts to set the state to `TASK_OWNED` by itself. - If it fails to set the state, another worker will try to grab it. - The split log manager will also ask all workers to rescan later if the task remains unassigned. -* If the worker succeeds in taking ownership of the task, it tries to get the task state again to make sure it really gets it asynchronously. - In the meantime, it starts a split task executor to do the actual work: -** Get the HBase root folder, create a temp folder under the root, and split the log file to the temp folder. -** If the split was successful, the task executor sets the task to state `TASK_DONE`. -** If the worker catches an unexpected IOException, the task is set to state `TASK_ERR`. -** If the worker is shutting down, set the task to state `TASK_RESIGNED`. -** If the task is taken by another worker, just log it. +[[preformance]] +.Performance +According to tests on a cluster which has 5 regionserver and 1 master. +procedureV2 coordinated WAL splitting has a better performance than ZK coordinated WAL splitting no master when restarting the whole cluster or one region server crashing. +[[enable.this.feature]] +.Enable this feature +To enable this feature, first we should ensure our package of HBase already contains these code. If not, please upgrade the package of HBase cluster without any configuration change first. +Then change configuration 'hbase.split.wal.zk.coordinated' to false. Rolling upgrade the master with new configuration. Now WAL splitting are handled by our new implementation. +But region server are still trying to grab tasks from zookeeper, we can rolling upgrade the region servers with the new configuration to stop that. -. The split log manager monitors for uncompleted tasks. -+ -The split log manager returns when all tasks are completed successfully. -If all tasks are completed with some failures, the split log manager throws an exception so that the log splitting can be retried. -Due to an asynchronous implementation, in very rare cases, the split log manager loses track of some completed tasks. -For that reason, it periodically checks for remaining uncompleted task in its task map or ZooKeeper. -If none are found, it throws an exception so that the log splitting can be retried right away instead of hanging there waiting for something that won't happen. +* steps as follows: +** Upgrade whole cluster to get the new Implementation. +** Upgrade Master with new configuration 'hbase.split.wal.zk.coordinated'=false. +** Upgrade region server to stop grab tasks from zookeeper. [[wal.compression]] ==== WAL Compression ==== @@ -1295,6 +1387,26 @@ It is possible to set _durability_ on each Mutation or on a Table basis. Options Do not confuse the _ASYNC_WAL_ option on a Mutation or Table with the _AsyncFSWAL_ writer; they are distinct options unfortunately closely named +[[arch.custom.wal.dir]] +==== Custom WAL Directory +HBASE-17437 added support for specifying a WAL directory outside the HBase root directory or even in a different FileSystem since 1.3.3/2.0+. Some FileSystems (such as Amazon S3) don’t support append or consistent writes, in such scenario WAL directory needs to be configured in a different FileSystem to avoid loss of writes. + +Following configurations are added to accomplish this: + +. `hbase.wal.dir` ++ +This defines where the root WAL directory is located, could be on a different FileSystem than the root directory. 
WAL directory can not be set to a subdirectory of the root directory. The default value of this is the root directory if unset. + +. `hbase.rootdir.perms` ++ +Configures FileSystem permissions to set on the root directory. This is '700' by default. + +. `hbase.wal.dir.perms` ++ +Configures FileSystem permissions to set on the WAL directory FileSystem. This is '700' by default. + +NOTE: While migrating to custom WAL dir (outside the HBase root directory or a different FileSystem) existing WAL files must be copied manually to new WAL dir, otherwise it may lead to data loss/inconsistency as HMaster has no information about previous WAL directory. + [[wal.disable]] ==== Disabling the WAL @@ -1681,6 +1793,9 @@ For example, to view the content of the file _hdfs://10.81.47.41:8020/hbase/defa If you leave off the option -v to see just a summary on the HFile. See usage for other things to do with the `hfile` tool. +NOTE: In the output of this tool, you might see 'seqid=0' for certain keys in places such as 'Mid-key'/'firstKey'/'lastKey'. These are + 'KeyOnlyKeyValue' type instances - meaning their seqid is irrelevant & we just need the keys of these Key-Value instances. + [[store.file.dir]] ===== StoreFile Directory Structure on HDFS @@ -1794,8 +1909,8 @@ Instead, the expired data is filtered out and is not written back to the compact [[compaction.and.versions]] .Compaction and Versions -When you create a Column Family, you can specify the maximum number of versions to keep, by specifying `HColumnDescriptor.setMaxVersions(int versions)`. -The default value is `3`. +When you create a Column Family, you can specify the maximum number of versions to keep, by specifying `ColumnFamilyDescriptorBuilder.setMaxVersions(int versions)`. +The default value is `1`. If more versions than the specified maximum exist, the excess versions are filtered out and not written back to the compacted StoreFile. .Major Compactions Can Impact Query Results @@ -2436,14 +2551,8 @@ To control these for stripe compactions, use `hbase.store.stripe.compaction.minF HBase includes several methods of loading data into tables. The most straightforward method is to either use the `TableOutputFormat` class from a MapReduce job, or use the normal client APIs; however, these are not always the most efficient methods. -The bulk load feature uses a MapReduce job to output table data in HBase's internal data format, and then directly loads the generated StoreFiles into a running cluster. -Using bulk load will use less CPU and network resources than simply using the HBase API. - -[[arch.bulk.load.limitations]] -=== Bulk Load Limitations - -As bulk loading bypasses the write path, the WAL doesn't get written to as part of the process. -Replication works by reading the WAL files so it won't see the bulk loaded data – and the same goes for the edits that use `Put.setDurability(SKIP_WAL)`. One way to handle that is to ship the raw files or the HFiles to the other cluster and do the other processing there. +The bulk load feature uses a MapReduce job to output table data in HBase's internal data format, and then directly load the generated StoreFiles into a running cluster. +Using bulk load will use less CPU and network resources than loading via the HBase API. [[arch.bulk.load.arch]] === Bulk Load Architecture @@ -2454,7 +2563,7 @@ The HBase bulk load process consists of two main steps. 
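Both steps are described in the subsections that follow. As a quick orientation, here is a minimal driver sketch for the preparation step; the mapper class, table name and paths are placeholders, and the usual `org.apache.hadoop.hbase.client` and `org.apache.hadoop.hbase.mapreduce` imports are assumed.

[source,java]
----
// Sketch of wiring HFileOutputFormat2 into a MapReduce job (step 1).
// MyBulkLoadMapper is a hypothetical mapper emitting (ImmutableBytesWritable, Put) pairs.
Configuration conf = HBaseConfiguration.create();
Job job = Job.getInstance(conf, "bulk-load-prepare");
job.setJarByClass(MyBulkLoadMapper.class);
job.setMapperClass(MyBulkLoadMapper.class);
FileInputFormat.addInputPath(job, new Path("/user/todd/input"));
FileOutputFormat.setOutputPath(job, new Path("/user/todd/myoutput"));

TableName tableName = TableName.valueOf("mytable");
try (Connection connection = ConnectionFactory.createConnection(conf);
     Table table = connection.getTable(tableName);
     RegionLocator regionLocator = connection.getRegionLocator(tableName)) {
  // Configures the partitioner, reducer and output format so that each
  // generated HFile fits within a single region of the target table.
  HFileOutputFormat2.configureIncrementalLoad(job, table, regionLocator);
}
job.waitForCompletion(true);
----

The output directory (`/user/todd/myoutput` here, matching the `completebulkload` example later in this section) is then handed to the completion step.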
==== Preparing data via a MapReduce job The first step of a bulk load is to generate HBase data files (StoreFiles) from a MapReduce job using `HFileOutputFormat2`. -This output format writes out data in HBase's internal storage format so that they can be later loaded very efficiently into the cluster. +This output format writes out data in HBase's internal storage format so that they can be later loaded efficiently into the cluster. In order to function efficiently, `HFileOutputFormat2` must be configured such that each output HFile fits within a single region. In order to do this, jobs whose output will be bulk loaded into HBase use Hadoop's `TotalOrderPartitioner` class to partition the map output into disjoint ranges of the key space, corresponding to the key ranges of the regions in the table. @@ -2471,22 +2580,20 @@ It then contacts the appropriate RegionServer which adopts the HFile, moving it If the region boundaries have changed during the course of bulk load preparation, or between the preparation and completion steps, the `completebulkload` utility will automatically split the data files into pieces corresponding to the new boundaries. This process is not optimally efficient, so users should take care to minimize the delay between preparing a bulk load and importing it into the cluster, especially if other clients are simultaneously loading data through other means. +[[arch.bulk.load.complete.help]] [source,bash] ---- -$ hadoop jar hbase-server-VERSION.jar completebulkload [-c /path/to/hbase/config/hbase-site.xml] /user/todd/myoutput mytable +$ hadoop jar hbase-mapreduce-VERSION.jar completebulkload [-c /path/to/hbase/config/hbase-site.xml] /user/todd/myoutput mytable ---- The `-c config-file` option can be used to specify a file containing the appropriate hbase parameters (e.g., hbase-site.xml) if not supplied already on the CLASSPATH (In addition, the CLASSPATH must contain the directory that has the zookeeper configuration file if zookeeper is NOT managed by HBase). -NOTE: If the target table does not already exist in HBase, this tool will create the table automatically. - - [[arch.bulk.load.also]] === See Also For more information about the referenced utilities, see <> and <>. -See link:http://blog.cloudera.com/blog/2013/09/how-to-use-hbase-bulk-loading-and-why/[How-to: Use HBase Bulk Loading, and Why] for a recent blog on current state of bulk loading. +See link:http://blog.cloudera.com/blog/2013/09/how-to-use-hbase-bulk-loading-and-why/[How-to: Use HBase Bulk Loading, and Why] for an old blog post on loading. [[arch.bulk.load.adv]] === Advanced Usage @@ -2497,6 +2604,79 @@ To get started doing so, dig into `ImportTsv.java` and check the JavaDoc for HFi The import step of the bulk load can also be done programmatically. See the `LoadIncrementalHFiles` class for more information. +[[arch.bulk.load.complete.strays]] +==== 'Adopting' Stray Data +Should an HBase cluster lose account of regions or files during an outage or error, you can use +the `completebulkload` tool to add back the dropped data. HBase operator tooling such as +link:https://github.com/apache/hbase-operator-tools/tree/master/hbase-hbck2[HBCK2] or +the reporting added to the Master's UI under the `HBCK Report` (Since HBase 2.0.6/2.1.6/2.2.1) +can identify such 'orphan' directories. + +Before you begin the 'adoption', ensure the `hbase:meta` table is in a healthy state. +Run the `CatalogJanitor` by executing the `catalogjanitor_run` command on the HBase shell. 
+When finished, check the `HBCK Report` page on the Master UI. Work on fixing any +inconsistencies, holes, or overlaps found before proceeding. The `hbase:meta` table +is the authority on where all data is to be found and must be consistent for +the `completebulkload` tool to work properly. + +The `completebulkload` tool takes a directory and a `tablename`. +The directory has subdirectories named for column families of the targeted `tablename`. +In these subdirectories are `hfiles` to load. Given this structure, you can pass +errant region directories (and the table name to which the region directory belongs) +and the tool will bring the data files back into the fold by moving them under the +approprate serving directory. If stray files, then you will need to mock up this +structure before invoking the `completebulkload` tool; you may have to look at the +file content using the <> to see what the column family to use is. +When the tool completes its run, you will notice that the +source errant directory has had its storefiles moved/removed. It is now desiccated +since its data has been drained, and the pointed-to directory can be safely +removed. It may still have `.regioninfo` files and other +subdirectories but they are of no relevance now (There may be content still +under the _recovered_edits_ directory; a TODO is tooling to replay the +content of _recovered_edits_ if needed; see +link:https://issues.apache.org/jira/browse/HBASE-22976[Add RecoveredEditsPlayer]). +If you pass `completebulkload` a directory without store files, it will run and +note the directory is storefile-free. Just remove such 'empty' directories. + +For example, presuming a directory at the top level in HDFS named +`eb3352fb5c9c9a05feeb2caba101e1cc` has data we need to re-add to the +HBase `TestTable`: + +[source,bash] +---- +$ ${HBASE_HOME}/bin/hbase --config ~/hbase-conf completebulkload hdfs://server.example.org:9000/eb3352fb5c9c9a05feeb2caba101e1cc TestTable +---- + +After it successfully completes, any files that were in `eb3352fb5c9c9a05feeb2caba101e1cc` have been moved +under hbase and the `eb3352fb5c9c9a05feeb2caba101e1cc` directory can be deleted (Check content +before and after by running `ls -r` on the HDFS directory). + +[[arch.bulk.load.replication]] +=== Bulk Loading Replication +HBASE-13153 adds replication support for bulk loaded HFiles, available since HBase 1.3/2.0. This feature is enabled by setting `hbase.replication.bulkload.enabled` to `true` (default is `false`). +You also need to copy the source cluster configuration files to the destination cluster. + +Additional configurations are required too: + +. `hbase.replication.source.fs.conf.provider` ++ +This defines the class which loads the source cluster file system client configuration in the destination cluster. This should be configured for all the RS in the destination cluster. Default is `org.apache.hadoop.hbase.replication.regionserver.DefaultSourceFSConfigurationProvider`. ++ +. `hbase.replication.conf.dir` ++ +This represents the base directory where the file system client configurations of the source cluster are copied to the destination cluster. This should be configured for all the RS in the destination cluster. Default is `$HBASE_CONF_DIR`. ++ +. `hbase.replication.cluster.id` ++ +This configuration is required in the cluster where replication for bulk loaded data is enabled. A source cluster is uniquely identified by the destination cluster using this id. 
This should be configured for all the RS in the source cluster configuration file for all the RS. ++ + + + +For example: If source cluster FS client configurations are copied to the destination cluster under directory `/home/user/dc1/`, then `hbase.replication.cluster.id` should be configured as `dc1` and `hbase.replication.conf.dir` as `/home/user`. + +NOTE: `DefaultSourceFSConfigurationProvider` supports only `xml` type files. It loads source cluster FS client configuration only once, so if source cluster FS client configuration files are updated, every peer(s) cluster RS must be restarted to reload the configuration. + [[arch.hdfs]] == HDFS diff --git a/src/main/asciidoc/_chapters/community.adoc b/src/main/asciidoc/_chapters/community.adoc index 3a896cf2bd9..ffe209ede77 100644 --- a/src/main/asciidoc/_chapters/community.adoc +++ b/src/main/asciidoc/_chapters/community.adoc @@ -82,20 +82,17 @@ NOTE: End-of-life releases are not included in this list. | Release | Release Manager -| 1.2 -| Sean Busbey - | 1.3 | Mikhail Antonov | 1.4 | Andrew Purtell -| 2.0 -| Michael Stack +| 2.2 +| Guanghao Zhang -| 2.1 -| Duo Zhang +| 2.3 +| Nick Dimiduk |=== diff --git a/src/main/asciidoc/_chapters/compression.adoc b/src/main/asciidoc/_chapters/compression.adoc index b2ff5ce6999..9a1bf0c981f 100644 --- a/src/main/asciidoc/_chapters/compression.adoc +++ b/src/main/asciidoc/_chapters/compression.adoc @@ -336,9 +336,7 @@ If you are changing codecs, be sure the old codec is still available until all t .Enabling Compression on a ColumnFamily of an Existing Table using HBaseShell ---- -hbase> disable 'test' hbase> alter 'test', {NAME => 'cf', COMPRESSION => 'GZ'} -hbase> enable 'test' ---- .Creating a New Table with Compression On a ColumnFamily @@ -436,15 +434,12 @@ Following is an example using HBase Shell: .Enable Data Block Encoding On a Table ---- -hbase> disable 'test' hbase> alter 'test', { NAME => 'cf', DATA_BLOCK_ENCODING => 'FAST_DIFF' } Updating all regions with the new schema... 0/1 regions updated. 1/1 regions updated. Done. 0 row(s) in 2.2820 seconds -hbase> enable 'test' -0 row(s) in 0.1580 seconds ---- .Verifying a ColumnFamily's Data Block Encoding diff --git a/src/main/asciidoc/_chapters/configuration.adoc b/src/main/asciidoc/_chapters/configuration.adoc index f7d6b799270..41d78107f2b 100644 --- a/src/main/asciidoc/_chapters/configuration.adoc +++ b/src/main/asciidoc/_chapters/configuration.adoc @@ -27,62 +27,66 @@ :icons: font :experimental: -This chapter expands upon the <> chapter to further explain configuration of Apache HBase. -Please read this chapter carefully, especially the <> -to ensure that your HBase testing and deployment goes smoothly. -Familiarize yourself with <> as well. +This chapter expands upon the <> chapter to further explain configuration of +Apache HBase. Please read this chapter carefully, especially the +<> to ensure that your HBase testing and deployment goes +smoothly. Familiarize yourself with <> as well. == Configuration Files -Apache HBase uses the same configuration system as Apache Hadoop. -All configuration files are located in the _conf/_ directory, which needs to be kept in sync for each node on your cluster. +Apache HBase uses the same configuration system as Apache Hadoop. All configuration files are +located in the _conf/_ directory, which needs to be kept in sync for each node on your cluster. .HBase Configuration File Descriptions _backup-masters_:: - Not present by default. 
- A plain-text file which lists hosts on which the Master should start a backup Master process, one host per line. + Not present by default. A plain-text file which lists hosts on which the Master should start a + backup Master process, one host per line. _hadoop-metrics2-hbase.properties_:: Used to connect HBase Hadoop's Metrics2 framework. - See the link:https://wiki.apache.org/hadoop/HADOOP-6728-MetricsV2[Hadoop Wiki entry] for more information on Metrics2. - Contains only commented-out examples by default. + See the link:https://cwiki.apache.org/confluence/display/HADOOP2/HADOOP-6728-MetricsV2[Hadoop Wiki entry] + for more information on Metrics2. Contains only commented-out examples by default. _hbase-env.cmd_ and _hbase-env.sh_:: - Script for Windows and Linux / Unix environments to set up the working environment for HBase, including the location of Java, Java options, and other environment variables. - The file contains many commented-out examples to provide guidance. + Script for Windows and Linux / Unix environments to set up the working environment for HBase, + including the location of Java, Java options, and other environment variables. The file contains + many commented-out examples to provide guidance. _hbase-policy.xml_:: - The default policy configuration file used by RPC servers to make authorization decisions on client requests. - Only used if HBase <> is enabled. + The default policy configuration file used by RPC servers to make authorization decisions on + client requests. Only used if HBase <> is enabled. _hbase-site.xml_:: The main HBase configuration file. This file specifies configuration options which override HBase's default configuration. You can view (but do not edit) the default configuration file at _docs/hbase-default.xml_. - You can also view the entire effective configuration for your cluster (defaults and overrides) in the [label]#HBase Configuration# tab of the HBase Web UI. + You can also view the entire effective configuration for your cluster (defaults and overrides) in + the [label]#HBase Configuration# tab of the HBase Web UI. _log4j.properties_:: Configuration file for HBase logging via `log4j`. _regionservers_:: A plain-text file containing a list of hosts which should run a RegionServer in your HBase cluster. - By default this file contains the single entry `localhost`. - It should contain a list of hostnames or IP addresses, one per line, and should only contain `localhost` if each node in your cluster will run a RegionServer on its `localhost` interface. + By default, this file contains the single entry `localhost`. + It should contain a list of hostnames or IP addresses, one per line, and should only contain + `localhost` if each node in your cluster will run a RegionServer on its `localhost` interface. .Checking XML Validity [TIP] ==== -When you edit XML, it is a good idea to use an XML-aware editor to be sure that your syntax is correct and your XML is well-formed. -You can also use the `xmllint` utility to check that your XML is well-formed. -By default, `xmllint` re-flows and prints the XML to standard output. -To check for well-formedness and only print output if errors exist, use the command `xmllint -noout filename.xml`. +When you edit XML, it is a good idea to use an XML-aware editor to be sure that your syntax is +correct and your XML is well-formed. You can also use the `xmllint` utility to check that your XML +is well-formed. By default, `xmllint` re-flows and prints the XML to standard output. 
To check for +well-formedness and only print output if errors exist, use the command `xmllint -noout filename.xml`. ==== .Keep Configuration In Sync Across the Cluster [WARNING] ==== -When running in distributed mode, after you make an edit to an HBase configuration, make sure you copy the contents of the _conf/_ directory to all nodes of the cluster. -HBase will not do this for you. -Use `rsync`, `scp`, or another secure mechanism for copying the configuration files to your nodes. -For most configurations, a restart is needed for servers to pick up changes. Dynamic configuration is an exception to this, to be described later below. +When running in distributed mode, after you make an edit to an HBase configuration, make sure you +copy the contents of the _conf/_ directory to all nodes of the cluster. HBase will not do this for +you. Use a configuration management tool for managing and copying the configuration files to your +nodes. For most configurations, a restart is needed for servers to pick up changes. Dynamic +configuration is an exception to this, to be described later below. ==== [[basic.prerequisites]] @@ -93,123 +97,189 @@ This section lists required services and some required system configuration. [[java]] .Java -The following table summarizes the recommendation of the HBase community wrt deploying on various Java versions. -A icon:check-circle[role="green"] symbol is meant to indicate a base level of testing and willingness to help diagnose and address issues you might run into. -Similarly, an entry of icon:exclamation-circle[role="yellow"] or icon:times-circle[role="red"] generally means that should you run into an issue the community is likely to ask you to change the Java environment before proceeding to help. -In some cases, specific guidance on limitations (e.g. whether compiling / unit tests work, specific operational issues, etc) will also be noted. +HBase runs on the Java Virtual Machine, thus all HBase deployments require a JVM runtime. -.Long Term Support JDKs are recommended +The following table summarizes the recommendations of the HBase community with respect to running +on various Java versions. The icon:check-circle[role="green"] symbol indicates a base level of +testing and willingness to help diagnose and address issues you might run into; these are the +expected deployment combinations. An entry of icon:exclamation-circle[role="yellow"] +means that there may be challenges with this combination, and you should look for more information +before deciding to pursue this as your deployment strategy. The icon:times-circle[role="red"] means +this combination does not work; either an older Java version is considered deprecated by the HBase +community, or this combination is known to not work. For combinations of newer JDK with older HBase +releases, it's likely there are known compatibility issues that cannot be addressed under our +compatibility guarantees, making the combination impossible. In some cases, specific guidance on +limitations (e.g. whether compiling / unit tests work, specific operational issues, etc) are also +noted. Assume any combination not listed here is considered icon:times-circle[role="red"]. + +.Long-Term Support JDKs are Recommended +[WARNING] +==== +HBase recommends downstream users rely only on JDK releases that are marked as Long-Term Supported +(LTS), either from the OpenJDK project or vendors. 
At the time of this writing, the following JDK +releases are NOT LTS releases and are NOT tested or advocated for use by the Apache HBase +community: JDK9, JDK10, JDK12, JDK13, and JDK14. Community discussion around this decision is +recorded on link:https://issues.apache.org/jira/browse/HBASE-20264[HBASE-20264]. +==== + +.HotSpot vs. OpenJ9 [TIP] ==== -HBase recommends downstream users rely on JDK releases that are marked as Long Term Supported (LTS) either from the OpenJDK project or vendors. As of March 2018 that means Java 8 is the only applicable version and that the next likely version to see testing will be Java 11 near Q3 2018. +At this time, all testing performed by the Apache HBase project runs on the HotSpot variant of the +JVM. When selecting your JDK distribution, please take this into consideration. ==== .Java support by release line -[cols="6*^.^", options="header"] +[cols="4*^.^", options="header"] |=== -|HBase Version -|JDK 7 -|JDK 8 -|JDK 9 (Non-LTS) -|JDK 10 (Non-LTS) -|JDK 11 +|Java Version +|HBase 1.3+ +|HBase 2.1+ +|HBase 2.3+ -|2.0+ +|JDK7 +|icon:check-circle[role="green"] +|icon:times-circle[role="red"] |icon:times-circle[role="red"] -|icon:check-circle[role="green"] -v|icon:exclamation-circle[role="yellow"] -link:https://issues.apache.org/jira/browse/HBASE-20264[HBASE-20264] -v|icon:exclamation-circle[role="yellow"] -link:https://issues.apache.org/jira/browse/HBASE-20264[HBASE-20264] -v|icon:exclamation-circle[role="yellow"] -link:https://issues.apache.org/jira/browse/HBASE-21110[HBASE-21110] -|1.2+ +|JDK8 |icon:check-circle[role="green"] |icon:check-circle[role="green"] -v|icon:exclamation-circle[role="yellow"] -link:https://issues.apache.org/jira/browse/HBASE-20264[HBASE-20264] -v|icon:exclamation-circle[role="yellow"] -link:https://issues.apache.org/jira/browse/HBASE-20264[HBASE-20264] -v|icon:exclamation-circle[role="yellow"] -link:https://issues.apache.org/jira/browse/HBASE-21110[HBASE-21110] +|icon:check-circle[role="green"] + +|JDK11 +|icon:times-circle[role="red"] +|icon:times-circle[role="red"] +|icon:exclamation-circle[role="yellow"]* |=== -NOTE: HBase will neither build nor run with Java 6. +.A Note on JDK11 icon:exclamation-circle[role="yellow"]* +[WARNING] +==== +Preliminary support for JDK11 is introduced with HBase 2.3.0. This support is limited to +compilation and running the full test suite. There are open questions regarding the runtime +compatibility of JDK11 with Apache ZooKeeper and Apache Hadoop +(link:https://issues.apache.org/jira/browse/HADOOP-15338[HADOOP-15338]). Significantly, neither +project has yet released a version with explicit runtime support for JDK11. The remaining known +issues in HBase are catalogued in +link:https://issues.apache.org/jira/browse/HBASE-22972[HBASE-22972]. +==== -NOTE: You must set `JAVA_HOME` on each node of your cluster. _hbase-env.sh_ provides a handy mechanism to do this. +NOTE: You must set `JAVA_HOME` on each node of your cluster. _hbase-env.sh_ provides a handy +mechanism to do this. [[os]] .Operating System Utilities ssh:: - HBase uses the Secure Shell (ssh) command and utilities extensively to communicate between cluster nodes. Each server in the cluster must be running `ssh` so that the Hadoop and HBase daemons can be managed. You must be able to connect to all nodes via SSH, including the local node, from the Master as well as any backup Master, using a shared key rather than a password. You can see the basic methodology for such a set-up in Linux or Unix systems at "<>". 
If your cluster nodes use OS X, see the section, link:https://wiki.apache.org/hadoop/Running_Hadoop_On_OS_X_10.5_64-bit_%28Single-Node_Cluster%29[SSH: Setting up Remote Desktop and Enabling Self-Login] on the Hadoop wiki. + HBase uses the Secure Shell (ssh) command and utilities extensively to communicate between +cluster nodes. Each server in the cluster must be running `ssh` so that the Hadoop and HBase +daemons can be managed. You must be able to connect to all nodes via SSH, including the local +node, from the Master as well as any backup Master, using a shared key rather than a password. +You can see the basic methodology for such a set-up in Linux or Unix systems at +"<>". If your cluster nodes use OS X, see the section, +link:https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=120730246#RunningHadoopOnOSX10.564-bit(Single-NodeCluster)-SSH:SettingupRemoteDesktopandEnablingSelf-Login[SSH: Setting up Remote Desktop and Enabling Self-Login] +on the Hadoop wiki. DNS:: HBase uses the local hostname to self-report its IP address. NTP:: - The clocks on cluster nodes should be synchronized. A small amount of variation is acceptable, but larger amounts of skew can cause erratic and unexpected behavior. Time synchronization is one of the first things to check if you see unexplained problems in your cluster. It is recommended that you run a Network Time Protocol (NTP) service, or another time-synchronization mechanism on your cluster and that all nodes look to the same service for time synchronization. See the link:http://www.tldp.org/LDP/sag/html/basic-ntp-config.html[Basic NTP Configuration] at [citetitle]_The Linux Documentation Project (TLDP)_ to set up NTP. + The clocks on cluster nodes should be synchronized. A small amount of variation is acceptable, +but larger amounts of skew can cause erratic and unexpected behavior. Time synchronization is one +of the first things to check if you see unexplained problems in your cluster. It is recommended +that you run a Network Time Protocol (NTP) service, or another time-synchronization mechanism on +your cluster and that all nodes look to the same service for time synchronization. See the +link:http://www.tldp.org/LDP/sag/html/basic-ntp-config.html[Basic NTP Configuration] at +[citetitle]_The Linux Documentation Project (TLDP)_ to set up NTP. [[ulimit]] Limits on Number of Files and Processes (ulimit):: - Apache HBase is a database. It requires the ability to open a large number of files at once. Many Linux distributions limit the number of files a single user is allowed to open to `1024` (or `256` on older versions of OS X). You can check this limit on your servers by running the command `ulimit -n` when logged in as the user which runs HBase. See <> for some of the problems you may experience if the limit is too low. You may also notice errors such as the following: + Apache HBase is a database. It requires the ability to open a large number of files at once. Many +Linux distributions limit the number of files a single user is allowed to open to `1024` (or `256` +on older versions of OS X). You can check this limit on your servers by running the command +`ulimit -n` when logged in as the user which runs HBase. See +<> for some of the problems you may +experience if the limit is too low. 
You may also notice errors such as the following: + ---- 2010-04-06 03:04:37,542 INFO org.apache.hadoop.hdfs.DFSClient: Exception increateBlockOutputStream java.io.EOFException 2010-04-06 03:04:37,542 INFO org.apache.hadoop.hdfs.DFSClient: Abandoning block blk_-6935524980745310745_1391901 ---- + -It is recommended to raise the ulimit to at least 10,000, but more likely 10,240, because the value is usually expressed in multiples of 1024. Each ColumnFamily has at least one StoreFile, and possibly more than six StoreFiles if the region is under load. The number of open files required depends upon the number of ColumnFamilies and the number of regions. The following is a rough formula for calculating the potential number of open files on a RegionServer. +It is recommended to raise the ulimit to at least 10,000, but more likely 10,240, because the value +is usually expressed in multiples of 1024. Each ColumnFamily has at least one StoreFile, and +possibly more than six StoreFiles if the region is under load. The number of open files required +depends upon the number of ColumnFamilies and the number of regions. The following is a rough +formula for calculating the potential number of open files on a RegionServer. + .Calculate the Potential Number of Open Files ---- (StoreFiles per ColumnFamily) x (regions per RegionServer) ---- + -For example, assuming that a schema had 3 ColumnFamilies per region with an average of 3 StoreFiles per ColumnFamily, and there are 100 regions per RegionServer, the JVM will open `3 * 3 * 100 = 900` file descriptors, not counting open JAR files, configuration files, and others. Opening a file does not take many resources, and the risk of allowing a user to open too many files is minimal. +For example, assuming that a schema had 3 ColumnFamilies per region with an average of 3 StoreFiles +per ColumnFamily, and there are 100 regions per RegionServer, the JVM will open `3 * 3 * 100 = 900` +file descriptors, not counting open JAR files, configuration files, and others. Opening a file does +not take many resources, and the risk of allowing a user to open too many files is minimal. + -Another related setting is the number of processes a user is allowed to run at once. In Linux and Unix, the number of processes is set using the `ulimit -u` command. This should not be confused with the `nproc` command, which controls the number of CPUs available to a given user. Under load, a `ulimit -u` that is too low can cause OutOfMemoryError exceptions. +Another related setting is the number of processes a user is allowed to run at once. In Linux and +Unix, the number of processes is set using the `ulimit -u` command. This should not be confused +with the `nproc` command, which controls the number of CPUs available to a given user. Under load, +a `ulimit -u` that is too low can cause OutOfMemoryError exceptions. + -Configuring the maximum number of file descriptors and processes for the user who is running the HBase process is an operating system configuration, rather than an HBase configuration. It is also important to be sure that the settings are changed for the user that actually runs HBase. To see which user started HBase, and that user's ulimit configuration, look at the first line of the HBase log for that instance. +Configuring the maximum number of file descriptors and processes for the user who is running the +HBase process is an operating system configuration, rather than an HBase configuration. 
It is also +important to be sure that the settings are changed for the user that actually runs HBase. To see +which user started HBase, and that user's ulimit configuration, look at the first line of the +HBase log for that instance. + .`ulimit` Settings on Ubuntu ==== -To configure ulimit settings on Ubuntu, edit _/etc/security/limits.conf_, which is a space-delimited file with four columns. Refer to the man page for _limits.conf_ for details about the format of this file. In the following example, the first line sets both soft and hard limits for the number of open files (nofile) to 32768 for the operating system user with the username hadoop. The second line sets the number of processes to 32000 for the same user. +To configure ulimit settings on Ubuntu, edit _/etc/security/limits.conf_, which is a +space-delimited file with four columns. Refer to the man page for _limits.conf_ for details about +the format of this file. In the following example, the first line sets both soft and hard limits +for the number of open files (nofile) to 32768 for the operating system user with the username +hadoop. The second line sets the number of processes to 32000 for the same user. ---- hadoop - nofile 32768 hadoop - nproc 32000 ---- -The settings are only applied if the Pluggable Authentication Module (PAM) environment is directed to use them. To configure PAM to use these limits, be sure that the _/etc/pam.d/common-session_ file contains the following line: +The settings are only applied if the Pluggable Authentication Module (PAM) environment is directed +to use them. To configure PAM to use these limits, be sure that the _/etc/pam.d/common-session_ +file contains the following line: ---- session required pam_limits.so ---- ==== Linux Shell:: - All of the shell scripts that come with HBase rely on the link:http://www.gnu.org/software/bash[GNU Bash] shell. + All of the shell scripts that come with HBase rely on the +link:http://www.gnu.org/software/bash[GNU Bash] shell. Windows:: Running production systems on Windows machines is not recommended. - [[hadoop]] === link:https://hadoop.apache.org[Hadoop](((Hadoop))) -The following table summarizes the versions of Hadoop supported with each version of HBase. Older versions not appearing in this table are considered unsupported and likely missing necessary features, while newer versions are untested but may be suitable. +The following table summarizes the versions of Hadoop supported with each version of HBase. Older +versions not appearing in this table are considered unsupported and likely missing necessary +features, while newer versions are untested but may be suitable. -Based on the version of HBase, you should select the most appropriate version of Hadoop. -You can use Apache Hadoop, or a vendor's distribution of Hadoop. -No distinction is made here. -See link:https://wiki.apache.org/hadoop/Distributions%20and%20Commercial%20Support[the Hadoop wiki] for information about vendors of Hadoop. +Based on the version of HBase, you should select the most appropriate version of Hadoop. You can +use Apache Hadoop, or a vendor's distribution of Hadoop. No distinction is made here. See +link:https://cwiki.apache.org/confluence/display/HADOOP2/Distributions+and+Commercial+Support[the Hadoop wiki] +for information about vendors of Hadoop. .Hadoop 2.x is recommended. [TIP] ==== -Hadoop 2.x is faster and includes features, such as short-circuit reads (see <>), -which will help improve your HBase random read profile. 
-Hadoop 2.x also includes important bug fixes that will improve your overall HBase experience. HBase does not support running with -earlier versions of Hadoop. See the table below for requirements specific to different HBase versions. +Hadoop 2.x is faster and includes features, such as short-circuit reads (see +<>), which will help improve your HBase random read profile. Hadoop +2.x also includes important bug fixes that will improve your overall HBase experience. HBase does +not support running with earlier versions of Hadoop. See the table below for requirements specific +to different HBase versions. Hadoop 3.x is still in early access releases and has not yet been sufficiently tested by the HBase community for production use cases. ==== @@ -219,25 +289,27 @@ Use the following legend to interpret this table: .Hadoop version support matrix * icon:check-circle[role="green"] = Tested to be fully-functional -* icon:times-circle[role="red"] = Known to not be fully-functional +* icon:times-circle[role="red"] = Known to not be fully-functional, or there are +link:https://hadoop.apache.org/cve_list.html[CVEs] so we drop the support in newer minor releases * icon:exclamation-circle[role="yellow"] = Not tested, may/may-not function [cols="1,6*^.^", options="header"] |=== -| | HBase-1.2.x, HBase-1.3.x | HBase-1.4.x | HBase-1.5.x | HBase-2.0.x | HBase-2.1.x | HBase-2.2.x +| | HBase-1.3.x | HBase-1.4.x | HBase-1.5.x | HBase-2.1.x | HBase-2.2.x | HBase-2.3.x |Hadoop-2.4.x | icon:check-circle[role="green"] | icon:times-circle[role="red"] | icon:times-circle[role="red"] | icon:times-circle[role="red"] | icon:times-circle[role="red"] | icon:times-circle[role="red"] |Hadoop-2.5.x | icon:check-circle[role="green"] | icon:times-circle[role="red"] | icon:times-circle[role="red"] | icon:times-circle[role="red"] | icon:times-circle[role="red"] | icon:times-circle[role="red"] |Hadoop-2.6.0 | icon:times-circle[role="red"] | icon:times-circle[role="red"] | icon:times-circle[role="red"] | icon:times-circle[role="red"] | icon:times-circle[role="red"] | icon:times-circle[role="red"] -|Hadoop-2.6.1+ | icon:check-circle[role="green"] | icon:times-circle[role="red"] | icon:times-circle[role="red"] | icon:check-circle[role="green"] | icon:times-circle[role="red"] | icon:times-circle[role="red"] +|Hadoop-2.6.1+ | icon:check-circle[role="green"] | icon:times-circle[role="red"] | icon:times-circle[role="red"] | icon:times-circle[role="red"] | icon:times-circle[role="red"] | icon:times-circle[role="red"] |Hadoop-2.7.0 | icon:times-circle[role="red"] | icon:times-circle[role="red"] | icon:times-circle[role="red"] | icon:times-circle[role="red"] | icon:times-circle[role="red"] | icon:times-circle[role="red"] -|Hadoop-2.7.1+ | icon:check-circle[role="green"] | icon:check-circle[role="green"] | icon:times-circle[role="red"] | icon:check-circle[role="green"] | icon:check-circle[role="green"] | icon:times-circle[role="red"] -|Hadoop-2.8.[0-1] | icon:times-circle[role="red"] | icon:times-circle[role="red"] | icon:times-circle[role="red"] | icon:times-circle[role="red"] | icon:times-circle[role="red"] | icon:times-circle[role="red"] -|Hadoop-2.8.2 | icon:exclamation-circle[role="yellow"] | icon:exclamation-circle[role="yellow"] | icon:check-circle[role="green"] | icon:exclamation-circle[role="yellow"] | icon:exclamation-circle[role="yellow"] | icon:check-circle[role="green"] -|Hadoop-2.8.3+ | icon:exclamation-circle[role="yellow"] | icon:exclamation-circle[role="yellow"] | icon:check-circle[role="green"] | 
icon:check-circle[role="green"] | icon:check-circle[role="green"] | icon:check-circle[role="green"] -|Hadoop-2.9.0 | icon:times-circle[role="red"] | icon:times-circle[role="red"] | icon:times-circle[role="red"] | icon:times-circle[role="red"] | icon:times-circle[role="red"] | icon:times-circle[role="red"] -|Hadoop-2.9.1+ | icon:exclamation-circle[role="yellow"] | icon:exclamation-circle[role="yellow"] | icon:exclamation-circle[role="yellow"] | icon:exclamation-circle[role="yellow"] | icon:exclamation-circle[role="yellow"] | icon:exclamation-circle[role="yellow"] +|Hadoop-2.7.1+ | icon:check-circle[role="green"] | icon:check-circle[role="green"] | icon:times-circle[role="red"] | icon:check-circle[role="green"] | icon:times-circle[role="red"] | icon:times-circle[role="red"] +|Hadoop-2.8.[0-2] | icon:times-circle[role="red"] | icon:times-circle[role="red"] | icon:times-circle[role="red"] | icon:times-circle[role="red"] | icon:times-circle[role="red"] | icon:times-circle[role="red"] +|Hadoop-2.8.[3-4] | icon:exclamation-circle[role="yellow"] | icon:exclamation-circle[role="yellow"] | icon:times-circle[role="red"] | icon:check-circle[role="green"] | icon:times-circle[role="red"] | icon:times-circle[role="red"] +|Hadoop-2.8.5+ | icon:exclamation-circle[role="yellow"] | icon:exclamation-circle[role="yellow"] | icon:check-circle[role="green"] | icon:check-circle[role="green"] | icon:check-circle[role="green"] | icon:times-circle[role="red"] +|Hadoop-2.9.[0-1] | icon:times-circle[role="red"] | icon:times-circle[role="red"] | icon:times-circle[role="red"] | icon:times-circle[role="red"] | icon:times-circle[role="red"] | icon:times-circle[role="red"] +|Hadoop-2.9.2+ | icon:exclamation-circle[role="yellow"] | icon:exclamation-circle[role="yellow"] | icon:check-circle[role="green"] | icon:exclamation-circle[role="yellow"] | icon:check-circle[role="green"] | icon:times-circle[role="red"] +|Hadoop-2.10.0 | icon:exclamation-circle[role="yellow"] | icon:exclamation-circle[role="yellow"] | icon:check-circle[role="green"] | icon:exclamation-circle[role="yellow"] | icon:exclamation-circle[role="yellow"] | icon:check-circle[role="green"] |Hadoop-3.0.[0-2] | icon:times-circle[role="red"] | icon:times-circle[role="red"] | icon:times-circle[role="red"] | icon:times-circle[role="red"] | icon:times-circle[role="red"] | icon:times-circle[role="red"] -|Hadoop-3.0.3+ | icon:times-circle[role="red"] | icon:times-circle[role="red"] | icon:times-circle[role="red"] | icon:check-circle[role="green"] | icon:check-circle[role="green"] | icon:check-circle[role="green"] +|Hadoop-3.0.3+ | icon:times-circle[role="red"] | icon:times-circle[role="red"] | icon:times-circle[role="red"] | icon:check-circle[role="green"] | icon:times-circle[role="red"] | icon:times-circle[role="red"] |Hadoop-3.1.0 | icon:times-circle[role="red"] | icon:times-circle[role="red"] | icon:times-circle[role="red"] | icon:times-circle[role="red"] | icon:times-circle[role="red"] | icon:times-circle[role="red"] |Hadoop-3.1.1+ | icon:times-circle[role="red"] | icon:times-circle[role="red"] | icon:times-circle[role="red"] | icon:check-circle[role="green"] | icon:check-circle[role="green"] | icon:check-circle[role="green"] |=== @@ -246,9 +318,9 @@ Use the following legend to interpret this table: [TIP] ==== When using pre-2.6.1 Hadoop versions and JDK 1.8 in a Kerberos environment, HBase server can fail -and abort due to Kerberos keytab relogin error. Late version of JDK 1.7 (1.7.0_80) has the problem too. 
-Refer to link:https://issues.apache.org/jira/browse/HADOOP-10786[HADOOP-10786] for additional details. -Consider upgrading to Hadoop 2.6.1+ in this case. +and abort due to Kerberos keytab relogin error. Late version of JDK 1.7 (1.7.0_80) has the problem +too. Refer to link:https://issues.apache.org/jira/browse/HADOOP-10786[HADOOP-10786] for additional +details. Consider upgrading to Hadoop 2.6.1+ in this case. ==== .Hadoop 2.6.x @@ -263,31 +335,43 @@ data loss. This patch is present in Apache Hadoop releases 2.6.1+. .Hadoop 2.y.0 Releases [TIP] ==== -Starting around the time of Hadoop version 2.7.0, the Hadoop PMC got into the habit of calling out new minor releases on their major version 2 release line as not stable / production ready. As such, HBase expressly advises downstream users to avoid running on top of these releases. Note that additionally the 2.8.1 release was given the same caveat by the Hadoop PMC. For reference, see the release announcements for link:https://s.apache.org/hadoop-2.7.0-announcement[Apache Hadoop 2.7.0], link:https://s.apache.org/hadoop-2.8.0-announcement[Apache Hadoop 2.8.0], link:https://s.apache.org/hadoop-2.8.1-announcement[Apache Hadoop 2.8.1], and link:https://s.apache.org/hadoop-2.9.0-announcement[Apache Hadoop 2.9.0]. +Starting around the time of Hadoop version 2.7.0, the Hadoop PMC got into the habit of calling out +new minor releases on their major version 2 release line as not stable / production ready. As such, +HBase expressly advises downstream users to avoid running on top of these releases. Note that +additionally the 2.8.1 release was given the same caveat by the Hadoop PMC. For reference, see the +release announcements for link:https://s.apache.org/hadoop-2.7.0-announcement[Apache Hadoop 2.7.0], +link:https://s.apache.org/hadoop-2.8.0-announcement[Apache Hadoop 2.8.0], +link:https://s.apache.org/hadoop-2.8.1-announcement[Apache Hadoop 2.8.1], and +link:https://s.apache.org/hadoop-2.9.0-announcement[Apache Hadoop 2.9.0]. ==== .Hadoop 3.0.x Releases [TIP] ==== -Hadoop distributions that include the Application Timeline Service feature may cause unexpected versions of HBase classes to be present in the application classpath. Users planning on running MapReduce applications with HBase should make sure that link:https://issues.apache.org/jira/browse/YARN-7190[YARN-7190] is present in their YARN service (currently fixed in 2.9.1+ and 3.1.0+). +Hadoop distributions that include the Application Timeline Service feature may cause unexpected +versions of HBase classes to be present in the application classpath. Users planning on running +MapReduce applications with HBase should make sure that +link:https://issues.apache.org/jira/browse/YARN-7190[YARN-7190] is present in their YARN service +(currently fixed in 2.9.1+ and 3.1.0+). ==== .Hadoop 3.1.0 Release [TIP] ==== -The Hadoop PMC called out the 3.1.0 release as not stable / production ready. As such, HBase expressly advises downstream users to avoid running on top of this release. For reference, see the link:https://s.apache.org/hadoop-3.1.0-announcement[release announcement for Hadoop 3.1.0]. +The Hadoop PMC called out the 3.1.0 release as not stable / production ready. As such, HBase +expressly advises downstream users to avoid running on top of this release. For reference, see +the link:https://s.apache.org/hadoop-3.1.0-announcement[release announcement for Hadoop 3.1.0]. ==== .Replace the Hadoop Bundled With HBase! 
[NOTE] ==== -Because HBase depends on Hadoop, it bundles Hadoop jars under its _lib_ directory. -The bundled jars are ONLY for use in standalone mode. -In distributed mode, it is _critical_ that the version of Hadoop that is out on your cluster match what is under HBase. -Replace the hadoop jars found in the HBase lib directory with the equivalent hadoop jars from the version you are running -on your cluster to avoid version mismatch issues. -Make sure you replace the jars under HBase across your whole cluster. -Hadoop version mismatch issues have various manifestations. Check for mismatch if +Because HBase depends on Hadoop, it bundles Hadoop jars under its _lib_ directory. The bundled jars +are ONLY for use in stand-alone mode. In distributed mode, it is _critical_ that the version of +Hadoop that is out on your cluster match what is under HBase. Replace the hadoop jars found in the +HBase lib directory with the equivalent hadoop jars from the version you are running on your +cluster to avoid version mismatch issues. Make sure you replace the jars under HBase across your +whole cluster. Hadoop version mismatch issues have various manifestations. Check for mismatch if HBase appears hung. ==== @@ -295,7 +379,8 @@ HBase appears hung. ==== `dfs.datanode.max.transfer.threads` (((dfs.datanode.max.transfer.threads))) An HDFS DataNode has an upper bound on the number of files that it will serve at any one time. -Before doing any loading, make sure you have configured Hadoop's _conf/hdfs-site.xml_, setting the `dfs.datanode.max.transfer.threads` value to at least the following: +Before doing any loading, make sure you have configured Hadoop's _conf/hdfs-site.xml_, setting the +`dfs.datanode.max.transfer.threads` value to at least the following: [source,xml] ---- @@ -318,12 +403,16 @@ For example: contain current block. Will get new block locations from namenode and retry... ---- -See also <> and note that this property was previously known as `dfs.datanode.max.xcievers` (e.g. link:http://ccgtech.blogspot.com/2010/02/hadoop-hdfs-deceived-by-xciever.html[Hadoop HDFS: Deceived by Xciever]). +See also <> and note that this +property was previously known as `dfs.datanode.max.xcievers` (e.g. +link:http://ccgtech.blogspot.com/2010/02/hadoop-hdfs-deceived-by-xciever.html[Hadoop HDFS: Deceived by Xciever]). [[zookeeper.requirements]] === ZooKeeper Requirements -ZooKeeper 3.4.x is required. +An Apache ZooKeeper quorum is required. The exact version depends on your version of HBase, though +the minimum ZooKeeper version is 3.4.x due to the `useMulti` feature made default in 1.0.0 +(see https://issues.apache.org/jira/browse/HBASE-16598[HBASE-16598]). [[standalone_dist]] == HBase run modes: Standalone and Distributed @@ -332,16 +421,18 @@ HBase has two run modes: <> and <> section. -In standalone mode, HBase does not use HDFS -- it uses the local filesystem instead -- and it runs all HBase daemons and a local ZooKeeper all up in the same JVM. -ZooKeeper binds to a well known port so clients may talk to HBase. +In standalone mode, HBase does not use HDFS -- it uses the local filesystem instead -- and it runs +all HBase daemons and a local ZooKeeper all up in the same JVM. ZooKeeper binds to a well-known +port so clients may talk to HBase. [[standalone.over.hdfs]] ==== Standalone HBase over HDFS @@ -376,12 +467,15 @@ to _false_. For example: [[distributed]] === Distributed -Distributed mode can be subdivided into distributed but all daemons run on a single node -- a.k.a. 
_pseudo-distributed_ -- and _fully-distributed_ where the daemons are spread across all nodes in the cluster. -The _pseudo-distributed_ vs. _fully-distributed_ nomenclature comes from Hadoop. +Distributed mode can be subdivided into distributed but all daemons run on a single node -- a.k.a. +_pseudo-distributed_ -- and _fully-distributed_ where the daemons are spread across all nodes in +the cluster. The _pseudo-distributed_ vs. _fully-distributed_ nomenclature comes from Hadoop. -Pseudo-distributed mode can run against the local filesystem or it can run against an instance of the _Hadoop Distributed File System_ (HDFS). Fully-distributed mode can ONLY run on HDFS. +Pseudo-distributed mode can run against the local filesystem or it can run against an instance of +the _Hadoop Distributed File System_ (HDFS). Fully-distributed mode can ONLY run on HDFS. See the Hadoop link:https://hadoop.apache.org/docs/current/[documentation] for how to set up HDFS. -A good walk-through for setting up HDFS on Hadoop 2 can be found at http://www.alexjf.net/blog/distributed-systems/hadoop-yarn-installation-definitive-guide. +A good walk-through for setting up HDFS on Hadoop 2 can be found at +http://www.alexjf.net/blog/distributed-systems/hadoop-yarn-installation-definitive-guide. [[pseudo]] ==== Pseudo-distributed @@ -401,22 +495,26 @@ Do not use this configuration for production or for performance evaluation. [[fully_dist]] === Fully-distributed -By default, HBase runs in standalone mode. -Both standalone mode and pseudo-distributed mode are provided for the purposes of small-scale testing. -For a production environment, distributed mode is advised. -In distributed mode, multiple instances of HBase daemons run on multiple servers in the cluster. +By default, HBase runs in stand-alone mode. Both stand-alone mode and pseudo-distributed mode are +provided for the purposes of small-scale testing. For a production environment, distributed mode +is advised. In distributed mode, multiple instances of HBase daemons run on multiple servers in the +cluster. -Just as in pseudo-distributed mode, a fully distributed configuration requires that you set the `hbase.cluster.distributed` property to `true`. -Typically, the `hbase.rootdir` is configured to point to a highly-available HDFS filesystem. +Just as in pseudo-distributed mode, a fully distributed configuration requires that you set the +`hbase.cluster.distributed` property to `true`. Typically, the `hbase.rootdir` is configured to +point to a highly-available HDFS filesystem. -In addition, the cluster is configured so that multiple cluster nodes enlist as RegionServers, ZooKeeper QuorumPeers, and backup HMaster servers. -These configuration basics are all demonstrated in <>. +In addition, the cluster is configured so that multiple cluster nodes enlist as RegionServers, +ZooKeeper QuorumPeers, and backup HMaster servers. These configuration basics are all demonstrated +in <>. .Distributed RegionServers -Typically, your cluster will contain multiple RegionServers all running on different servers, as well as primary and backup Master and ZooKeeper daemons. -The _conf/regionservers_ file on the master server contains a list of hosts whose RegionServers are associated with this cluster. -Each host is on a separate line. -All hosts listed in this file will have their RegionServer processes started and stopped when the master server starts or stops. 
+Typically, your cluster will contain multiple RegionServers all running on different servers, as +well as primary and backup Master and ZooKeeper daemons. The _conf/regionservers_ file on the +master server contains a list of hosts whose RegionServers are associated with this cluster. +Each host is on a separate line. All hosts listed in this file will have their RegionServer +processes started and stopped when the +master server starts or stops. .ZooKeeper and HBase See the <> section for ZooKeeper setup instructions for HBase. @@ -425,8 +523,8 @@ See the <> section for ZooKeeper setup instructions for HBa ==== This is a bare-bones _conf/hbase-site.xml_ for a distributed HBase cluster. A cluster that is used for real-world work would contain more custom configuration parameters. -Most HBase configuration directives have default values, which are used unless the value is overridden in the _hbase-site.xml_. -See "<>" for more information. +Most HBase configuration directives have default values, which are used unless the value is +overridden in the _hbase-site.xml_. See "<>" for more information. [source,xml] ---- @@ -447,8 +545,9 @@ See "<>" for more information. ---- -This is an example _conf/regionservers_ file, which contains a list of nodes that should run a RegionServer in the cluster. -These nodes need HBase installed and they need to use the same contents of the _conf/_ directory as the Master server +This is an example _conf/regionservers_ file, which contains a list of nodes that should run a +RegionServer in the cluster. These nodes need HBase installed and they need to use the same +contents of the _conf/_ directory as the Master server. [source] ---- @@ -458,8 +557,9 @@ node-b.example.com node-c.example.com ---- -This is an example _conf/backup-masters_ file, which contains a list of each node that should run a backup Master instance. -The backup Master instances will sit idle unless the main Master becomes unavailable. +This is an example _conf/backup-masters_ file, which contains a list of each node that should run +a backup Master instance. The backup Master instances will sit idle unless the main Master becomes +unavailable. [source] ---- @@ -470,28 +570,37 @@ node-c.example.com ==== .Distributed HBase Quickstart -See <> for a walk-through of a simple three-node cluster configuration with multiple ZooKeeper, backup HMaster, and RegionServer instances. +See <> for a walk-through of a simple +three-node cluster configuration with multiple ZooKeeper, backup HMaster, and RegionServer +instances. .Procedure: HDFS Client Configuration -. Of note, if you have made HDFS client configuration changes on your Hadoop cluster, such as configuration directives for HDFS clients, as opposed to server-side configurations, you must use one of the following methods to enable HBase to see and use these configuration changes: +. Of note, if you have made HDFS client configuration changes on your Hadoop cluster, such as +configuration directives for HDFS clients, as opposed to server-side configurations, you must use +one of the following methods to enable HBase to see and use these configuration changes: + -a. Add a pointer to your `HADOOP_CONF_DIR` to the `HBASE_CLASSPATH` environment variable in _hbase-env.sh_. -b. Add a copy of _hdfs-site.xml_ (or _hadoop-site.xml_) or, better, symlinks, under _${HBASE_HOME}/conf_, or +a. Add a pointer to your `HADOOP_CONF_DIR` to the `HBASE_CLASSPATH` environment variable in +_hbase-env.sh_. +b. 
Add a copy of _hdfs-site.xml_ (or _hadoop-site.xml_) or, better, symlinks, under +_${HBASE_HOME}/conf_, or c. if only a small set of HDFS client configurations, add them to _hbase-site.xml_. An example of such an HDFS client configuration is `dfs.replication`. -If for example, you want to run with a replication factor of 5, HBase will create files with the default of 3 unless you do the above to make the configuration available to HBase. +If for example, you want to run with a replication factor of 5, HBase will create files with the +default of 3 unless you do the above to make the configuration available to HBase. [[confirm]] == Running and Confirming Your Installation Make sure HDFS is running first. -Start and stop the Hadoop HDFS daemons by running _bin/start-hdfs.sh_ over in the `HADOOP_HOME` directory. -You can ensure it started properly by testing the `put` and `get` of files into the Hadoop filesystem. -HBase does not normally use the MapReduce or YARN daemons. These do not need to be started. +Start and stop the Hadoop HDFS daemons by running _bin/start-hdfs.sh_ over in the `HADOOP_HOME` +directory. You can ensure it started properly by testing the `put` and `get` of files into the +Hadoop filesystem. HBase does not normally use the MapReduce or YARN daemons. These do not need to +be started. -_If_ you are managing your own ZooKeeper, start it and confirm it's running, else HBase will start up ZooKeeper for you as part of its start process. +_If_ you are managing your own ZooKeeper, start it and confirm it's running, else HBase will start +up ZooKeeper for you as part of its start process. Start HBase with the following command: @@ -506,9 +615,13 @@ HBase logs can be found in the _logs_ subdirectory. Check them out especially if HBase had trouble starting. HBase also puts up a UI listing vital attributes. -By default it's deployed on the Master host at port 16010 (HBase RegionServers listen on port 16020 by default and put up an informational HTTP server at port 16030). If the Master is running on a host named `master.example.org` on the default port, point your browser at pass:[http://master.example.org:16010] to see the web interface. +By default it's deployed on the Master host at port 16010 (HBase RegionServers listen on port 16020 +by default and put up an informational HTTP server at port 16030). If the Master is running on a +host named `master.example.org` on the default port, point your browser at +pass:[http://master.example.org:16010] to see the web interface. -Once HBase has started, see the <> section for how to create tables, add data, scan your insertions, and finally disable and drop your tables. +Once HBase has started, see the <> section for how to create +tables, add data, scan your insertions, and finally disable and drop your tables. To stop HBase after exiting the HBase shell enter @@ -519,7 +632,8 @@ stopping hbase............... Shutdown can take a moment to complete. It can take longer if your cluster is comprised of many machines. -If you are running a distributed operation, be sure to wait until HBase has shut down completely before stopping the Hadoop daemons. +If you are running a distributed operation, be sure to wait until HBase has shut down completely +before stopping the Hadoop daemons. 
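+As a quick sanity check, the following is a minimal console sketch for confirming that the daemons
+are up and then shutting HBase down. It assumes a Master host named `master.example.org`, that
+HBase is managing its own ZooKeeper, and illustrative process IDs; adjust hostnames and paths for
+your deployment.
+
+----
+$ jps                  # expect HMaster, HRegionServer (and HQuorumPeer if HBase manages ZooKeeper)
+20612 HMaster
+20789 HRegionServer
+20855 HQuorumPeer
+$ curl -I http://master.example.org:16010/master-status   # the Master web UI on its default port
+$ ./bin/stop-hbase.sh  # stop HBase before stopping the HDFS daemons
+----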
[[config.files]] == Default Configuration @@ -527,11 +641,14 @@ If you are running a distributed operation, be sure to wait until HBase has shut [[hbase.site]] === _hbase-site.xml_ and _hbase-default.xml_ -Just as in Hadoop where you add site-specific HDFS configuration to the _hdfs-site.xml_ file, for HBase, site specific customizations go into the file _conf/hbase-site.xml_. -For the list of configurable properties, see <> below or view the raw _hbase-default.xml_ source file in the HBase source code at _src/main/resources_. +Just as in Hadoop where you add site-specific HDFS configuration to the _hdfs-site.xml_ file, for +HBase, site specific customizations go into the file _conf/hbase-site.xml_. For the list of +configurable properties, see <> below +or view the raw _hbase-default.xml_ source file in the HBase source code at _src/main/resources_. Not all configuration options make it out to _hbase-default.xml_. -Some configurations would only appear in source code; the only way to identify these changes are through code review. +Some configurations would only appear in source code; the only way to identify these changes are +through code review. Currently, changes here will require a cluster restart for HBase to notice the change. // hbase/src/main/asciidoc @@ -542,35 +659,89 @@ include::{docdir}/../../../target/asciidoc/hbase-default.adoc[] [[hbase.env.sh]] === _hbase-env.sh_ -Set HBase environment variables in this file. -Examples include options to pass the JVM on start of an HBase daemon such as heap size and garbage collector configs. -You can also set configurations for HBase configuration, log directories, niceness, ssh options, where to locate process pid files, etc. -Open the file at _conf/hbase-env.sh_ and peruse its content. -Each option is fairly well documented. -Add your own environment variables here if you want them read by HBase daemons on startup. +Set HBase environment variables in this file. Examples include options to pass the JVM on start of +an HBase daemon such as heap size and garbage collector configs. +You can also set configurations for HBase configuration, log directories, niceness, ssh options, +where to locate process pid files, etc. Open the file at _conf/hbase-env.sh_ and peruse its content. +Each option is fairly well documented. Add your own environment variables here if you want them +read by HBase daemons on startup. Changes here will require a cluster restart for HBase to notice the change. [[log4j]] === _log4j.properties_ -Edit this file to change rate at which HBase files are rolled and to change the level at which HBase logs messages. +Edit this file to change rate at which HBase files are rolled and to change the level at which +HBase logs messages. -Changes here will require a cluster restart for HBase to notice the change though log levels can be changed for particular daemons via the HBase UI. +Changes here will require a cluster restart for HBase to notice the change though log levels can +be changed for particular daemons via the HBase UI. [[client_dependencies]] === Client configuration and dependencies connecting to an HBase cluster -If you are running HBase in standalone mode, you don't need to configure anything for your client to work provided that they are all on the same machine. +If you are running HBase in standalone mode, you don't need to configure anything for your client +to work provided that they are all on the same machine. 
-Since the HBase Master may move around, clients bootstrap by looking to ZooKeeper for current critical locations.
-ZooKeeper is where all these values are kept.
-Thus clients require the location of the ZooKeeper ensemble before they can do anything else.
-Usually this ensemble location is kept out in the _hbase-site.xml_ and is picked up by the client from the `CLASSPATH`.
+Starting with the 3.0.0 release, the default connection registry has been switched to a
+master-based implementation. Refer to <> for more details about what a connection
+registry is and the implications of this change. Depending on your HBase version, the following is
+the expected minimal client configuration.
-If you are configuring an IDE to run an HBase client, you should include the _conf/_ directory on your classpath so _hbase-site.xml_ settings can be found (or add _src/test/resources_ to pick up the hbase-site.xml used by tests).
+==== Up until 2.x.y releases
+In 2.x.y releases, the default connection registry was based on ZooKeeper as the source of truth.
+This means that the clients always looked up ZooKeeper znodes to fetch the required metadata. For
+example, if the active master crashed and a new master was elected, clients looked up the master
+znode to fetch the active master address (similarly for meta locations). This meant that the
+clients needed to have access to ZooKeeper and needed to know the ZooKeeper ensemble information
+before they could do anything. This can be configured in the client configuration XML as follows:
-For Java applications using Maven, including the hbase-shaded-client module is the recommended dependency when connecting to a cluster:
+[source,xml]
+----
+<configuration>
+  <property>
+    <name>hbase.zookeeper.quorum</name>
+    <value>example1,example2,example3</value>
+    <description>ZooKeeper ensemble information</description>
+  </property>
+</configuration>
+----
+
+==== Starting with the 3.0.0 release
+
+The default implementation was switched to a master-based connection registry. With this
+implementation, clients always contact the active or stand-by master RPC end points to fetch the
+connection registry information. This means that the clients should have access to the list of
+active and stand-by master end points before they can do anything. This can be configured in the
+client configuration XML as follows:
+
+[source,xml]
+----
+<configuration>
+  <property>
+    <name>hbase.masters</name>
+    <value>example1,example2,example3</value>
+    <description>List of master RPC end points for the HBase cluster.</description>
+  </property>
+</configuration>
+----
+
+The configuration value for _hbase.masters_ is a comma-separated list of _host:port_ values. If no
+port value is specified, the default of _16000_ is assumed.
+
+Usually this configuration is kept in the _hbase-site.xml_ and is picked up by the client from
+the `CLASSPATH`.
+
+If you are configuring an IDE to run an HBase client, you should include the _conf/_ directory on
+your classpath so _hbase-site.xml_ settings can be found (or add _src/test/resources_ to pick up
+the hbase-site.xml used by tests).
+
+For Java applications using Maven, including the hbase-shaded-client module is the recommended
+dependency when connecting to a cluster:
[source,xml]
----
@@ -580,41 +751,34 @@ For Java applications using Maven, including the hbase-shaded-client module is t
----
-A basic example _hbase-site.xml_ for client only may look as follows: -[source,xml] ----- - - - - - hbase.zookeeper.quorum - example1,example2,example3 - The directory shared by region servers.
- - - ----- - [[java.client.config]] ==== Java client configuration -The configuration used by a Java client is kept in an link:https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/HBaseConfiguration[HBaseConfiguration] instance. +The configuration used by a Java client is kept in an +link:https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/HBaseConfiguration[HBaseConfiguration] +instance. + +The factory method on HBaseConfiguration, `HBaseConfiguration.create();`, on invocation, will read +in the content of the first _hbase-site.xml_ found on the client's `CLASSPATH`, if one is present +(Invocation will also factor in any _hbase-default.xml_ found; an _hbase-default.xml_ ships inside +the _hbase.X.X.X.jar_). It is also possible to specify configuration directly without having to +read from a _hbase-site.xml_. -The factory method on HBaseConfiguration, `HBaseConfiguration.create();`, on invocation, will read in the content of the first _hbase-site.xml_ found on the client's `CLASSPATH`, if one is present (Invocation will also factor in any _hbase-default.xml_ found; an _hbase-default.xml_ ships inside the _hbase.X.X.X.jar_). It is also possible to specify configuration directly without having to read from a _hbase-site.xml_. For example, to set the ZooKeeper ensemble for the cluster programmatically do as follows: [source,java] ---- Configuration config = HBaseConfiguration.create(); -config.set("hbase.zookeeper.quorum", "localhost"); // Here we are running zookeeper locally +config.set("hbase.zookeeper.quorum", "localhost"); // Until 2.x.y versions +// ---- or ---- +config.set("hbase.masters", "localhost:1234"); // Starting 3.0.0 version ---- -If multiple ZooKeeper instances make up your ZooKeeper ensemble, they may be specified in a comma-separated list (just as in the _hbase-site.xml_ file). This populated `Configuration` instance can then be passed to an link:https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Table.html[Table], and so on. - [[config_timeouts]] === Timeout settings -HBase provides a wide variety of timeout settings to limit the execution time of various remote operations. +HBase provides a wide variety of timeout settings to limit the execution time of various remote +operations. * hbase.rpc.timeout * hbase.rpc.read.timeout @@ -624,15 +788,18 @@ HBase provides a wide variety of timeout settings to limit the execution time of * hbase.client.scanner.timeout.period The `hbase.rpc.timeout` property limits how long a single RPC call can run before timing out. -To fine tune read or write related RPC timeouts set `hbase.rpc.read.timeout` and `hbase.rpc.write.timeout` configuration properties. -In the absence of these properties `hbase.rpc.timeout` will be used. +To fine tune read or write related RPC timeouts set `hbase.rpc.read.timeout` and +`hbase.rpc.write.timeout` configuration properties. In the absence of these properties +`hbase.rpc.timeout` will be used. A higher-level timeout is `hbase.client.operation.timeout` which is valid for each client call. -When an RPC call fails for instance for a timeout due to `hbase.rpc.timeout` it will be retried until `hbase.client.operation.timeout` is reached. -Client operation timeout for system tables can be fine tuned by setting `hbase.client.meta.operation.timeout` configuration value. +When an RPC call fails for instance for a timeout due to `hbase.rpc.timeout` it will be retried +until `hbase.client.operation.timeout` is reached. 
Client operation timeout for system tables can +be fine tuned by setting `hbase.client.meta.operation.timeout` configuration value. When this is not set its value will use `hbase.client.operation.timeout`. -Timeout for scan operations is controlled differently. Use `hbase.client.scanner.timeout.period` property to set this timeout. +Timeout for scan operations is controlled differently. Use `hbase.client.scanner.timeout.period` +property to set this timeout. [[example_config]] == Example Configurations @@ -646,7 +813,8 @@ Here is a basic configuration example for a distributed ten node cluster: * A 3-node ZooKeeper ensemble runs on `example1`, `example2`, and `example3` on the default ports. * ZooKeeper data is persisted to the directory _/export/zookeeper_. -Below we show what the main configuration files -- _hbase-site.xml_, _regionservers_, and _hbase-env.sh_ -- found in the HBase _conf_ directory might look like. +Below we show what the main configuration files -- _hbase-site.xml_, _regionservers_, and +_hbase-env.sh_ -- found in the HBase _conf_ directory might look like. [[hbase_site]] ==== _hbase-site.xml_ @@ -708,7 +876,9 @@ example9 [[hbase_env]] ==== _hbase-env.sh_ -The following lines in the _hbase-env.sh_ file show how to set the `JAVA_HOME` environment variable (required for HBase) and set the heap to 4 GB (rather than the default value of 1 GB). If you copy and paste this example, be sure to adjust the `JAVA_HOME` to suit your environment. +The following lines in the _hbase-env.sh_ file show how to set the `JAVA_HOME` environment variable +(required for HBase) and set the heap to 4 GB (rather than the default value of 1 GB). If you copy +and paste this example, be sure to adjust the `JAVA_HOME` to suit your environment. ---- # The java implementation to use. @@ -734,9 +904,11 @@ Review the <> and <> sections. [[big.cluster.config]] ==== Big Cluster Configurations -If you have a cluster with a lot of regions, it is possible that a Regionserver checks in briefly after the Master starts while all the remaining RegionServers lag behind. This first server to check in will be assigned all regions which is not optimal. -To prevent the above scenario from happening, up the `hbase.master.wait.on.regionservers.mintostart` property from its default value of 1. -See link:https://issues.apache.org/jira/browse/HBASE-6389[HBASE-6389 Modify the +If you have a cluster with a lot of regions, it is possible that a Regionserver checks in briefly +after the Master starts while all the remaining RegionServers lag behind. This first server to +check in will be assigned all regions which is not optimal. To prevent the above scenario from +happening, up the `hbase.master.wait.on.regionservers.mintostart` property from its default value +of 1. See link:https://issues.apache.org/jira/browse/HBASE-6389[HBASE-6389 Modify the conditions to ensure that Master waits for sufficient number of Region Servers before starting region assignments] for more detail. @@ -749,16 +921,22 @@ See link:https://issues.apache.org/jira/browse/HBASE-6389[HBASE-6389 Modify the [[sect.zookeeper.session.timeout]] ===== `zookeeper.session.timeout` -The default timeout is 90 seconds (specified in milliseconds). This means that if a server crashes, it will be 90 seconds before the Master notices the crash and starts recovery. -You might need to tune the timeout down to a minute or even less so the Master notices failures sooner. 
-Before changing this value, be sure you have your JVM garbage collection configuration under control, otherwise, a long garbage collection that lasts beyond the ZooKeeper session timeout will take out your RegionServer. (You might be fine with this -- you probably want recovery to start on the server if a RegionServer has been in GC for a long period of time). +The default timeout is 90 seconds (specified in milliseconds). This means that if a server crashes, +it will be 90 seconds before the Master notices the crash and starts recovery. You might need to +tune the timeout down to a minute or even less so the Master notices failures sooner. Before +changing this value, be sure you have your JVM garbage collection configuration under control, +otherwise, a long garbage collection that lasts beyond the ZooKeeper session timeout will take out +your RegionServer. (You might be fine with this -- you probably want recovery to start on the +server if a RegionServer has been in GC for a long period of time). -To change this configuration, edit _hbase-site.xml_, copy the changed file across the cluster and restart. +To change this configuration, edit _hbase-site.xml_, copy the changed file across the cluster and +restart. -We set this value high to save our having to field questions up on the mailing lists asking why a RegionServer went down during a massive import. -The usual cause is that their JVM is untuned and they are running into long GC pauses. -Our thinking is that while users are getting familiar with HBase, we'd save them having to know all of its intricacies. -Later when they've built some confidence, then they can play with configuration such as this. +We set this value high to save our having to field questions up on the mailing lists asking why a +RegionServer went down during a massive import. The usual cause is that their JVM is untuned and +they are running into long GC pauses. Our thinking is that while users are getting familiar with +HBase, we'd save them having to know all of its intricacies. Later when they've built some +confidence, then they can play with configuration such as this. [[zookeeper.instances]] ===== Number of ZooKeeper Instances @@ -771,109 +949,141 @@ See <>. [[dfs.datanode.failed.volumes.tolerated]] ===== `dfs.datanode.failed.volumes.tolerated` -This is the "...number of volumes that are allowed to fail before a DataNode stops offering service. -By default any volume failure will cause a datanode to shutdown" from the _hdfs-default.xml_ description. -You might want to set this to about half the amount of your available disks. +This is the "...number of volumes that are allowed to fail before a DataNode stops offering +service. By default, any volume failure will cause a datanode to shutdown" from the +_hdfs-default.xml_ description. You might want to set this to about half the amount of your +available disks. [[hbase.regionserver.handler.count]] ===== `hbase.regionserver.handler.count` -This setting defines the number of threads that are kept open to answer incoming requests to user tables. -The rule of thumb is to keep this number low when the payload per request approaches the MB (big puts, scans using a large cache) and high when the payload is small (gets, small puts, ICVs, deletes). The total size of the queries in progress is limited by the setting `hbase.ipc.server.max.callqueue.size`. +This setting defines the number of threads that are kept open to answer incoming requests to user +tables. 
The rule of thumb is to keep this number low when the payload per request approaches the MB +(big puts, scans using a large cache) and high when the payload is small (gets, small puts, ICVs, +deletes). The total size of the queries in progress is limited by the setting +`hbase.ipc.server.max.callqueue.size`. -It is safe to set that number to the maximum number of incoming clients if their payload is small, the typical example being a cluster that serves a website since puts aren't typically buffered and most of the operations are gets. +It is safe to set that number to the maximum number of incoming clients if their payload is small, +the typical example being a cluster that serves a website since puts aren't typically buffered and +most of the operations are gets. -The reason why it is dangerous to keep this setting high is that the aggregate size of all the puts that are currently happening in a region server may impose too much pressure on its memory, or even trigger an OutOfMemoryError. -A RegionServer running on low memory will trigger its JVM's garbage collector to run more frequently up to a point where GC pauses become noticeable (the reason being that all the memory used to keep all the requests' payloads cannot be trashed, no matter how hard the garbage collector tries). After some time, the overall cluster throughput is affected since every request that hits that RegionServer will take longer, which exacerbates the problem even more. +The reason why it is dangerous to keep this setting high is that the aggregate size of all the puts +that are currently happening in a region server may impose too much pressure on its memory, or even +trigger an OutOfMemoryError. A RegionServer running on low memory will trigger its JVM's garbage +collector to run more frequently up to a point where GC pauses become noticeable (the reason being +that all the memory used to keep all the requests' payloads cannot be trashed, no matter how hard +the garbage collector tries). After some time, the overall cluster throughput is affected since +every request that hits that RegionServer will take longer, which exacerbates the problem even more. -You can get a sense of whether you have too little or too many handlers by <> on an individual RegionServer then tailing its logs (Queued requests consume memory). +You can get a sense of whether you have too little or too many handlers by +<> on an individual RegionServer then tailing its logs (Queued requests +consume memory). [[big_memory]] ==== Configuration for large memory machines -HBase ships with a reasonable, conservative configuration that will work on nearly all machine types that people might want to test with. -If you have larger machines -- HBase has 8G and larger heap -- you might find the following configuration options helpful. +HBase ships with a reasonable, conservative configuration that will work on nearly all machine +types that people might want to test with. If you have larger machines -- HBase has 8G and larger +heap -- you might find the following configuration options helpful. TODO. [[config.compression]] ==== Compression You should consider enabling ColumnFamily compression. -There are several options that are near-frictionless and in most all cases boost performance by reducing the size of StoreFiles and thus reducing I/O. +There are several options that are near-frictionless and in most all cases boost performance by +reducing the size of StoreFiles and thus reducing I/O. See <> for more information. 
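+For example, the following HBase shell sketch enables compression on a hypothetical table named
+`test_table` with a column family named `cf`. GZ is used here because it requires no extra native
+libraries; other codecs, such as Snappy, may need additional native library support depending on
+your installation. Compression only applies to newly written StoreFiles, so a major compaction is
+requested afterwards to rewrite the existing ones.
+
+----
+hbase> alter 'test_table', {NAME => 'cf', COMPRESSION => 'GZ'}
+hbase> major_compact 'test_table'
+hbase> describe 'test_table'     # confirm the COMPRESSION attribute on the column family
+----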
[[config.wals]]
==== Configuring the size and number of WAL files
-HBase uses <> to recover the memstore data that has not been flushed to disk in case of an RS failure.
-These WAL files should be configured to be slightly smaller than HDFS block (by default a HDFS block is 64Mb and a WAL file is ~60Mb).
+HBase uses <> to recover the memstore data that has not been flushed to disk in case of
+a RegionServer failure. These WAL files should be configured to be slightly smaller than the HDFS
+block size (by default an HDFS block is 64MB and a WAL file is ~60MB).
-HBase also has a limit on the number of WAL files, designed to ensure there's never too much data that needs to be replayed during recovery.
-This limit needs to be set according to memstore configuration, so that all the necessary data would fit.
-It is recommended to allocate enough WAL files to store at least that much data (when all memstores are close to full). For example, with 16Gb RS heap, default memstore settings (0.4), and default WAL file size (~60Mb), 16Gb*0.4/60, the starting point for WAL file count is ~109.
-However, as all memstores are not expected to be full all the time, less WAL files can be allocated.
+HBase also has a limit on the number of WAL files, designed to ensure there's never too much data
+that needs to be replayed during recovery. This limit needs to be set according to memstore
+configuration, so that all the necessary data can fit. It is recommended to allocate enough WAL
+files to store at least that much data (when all memstores are close to full). For example, with a
+16GB RegionServer heap, default memstore settings (0.4), and default WAL file size (~60MB),
+16GB*0.4/60MB, the starting point for WAL file count is ~109. However, as all memstores are not
+expected to be full all the time, fewer WAL files can be allocated.
[[disable.splitting]]
==== Managed Splitting
-HBase generally handles splitting of your regions based upon the settings in your _hbase-default.xml_ and _hbase-site.xml_ configuration files.
-Important settings include `hbase.regionserver.region.split.policy`, `hbase.hregion.max.filesize`, `hbase.regionserver.regionSplitLimit`.
-A simplistic view of splitting is that when a region grows to `hbase.hregion.max.filesize`, it is split.
-For most usage patterns, you should use automatic splitting.
-See <> for more information about manual region splitting.
+HBase generally handles splitting of your regions based upon the settings in your
+_hbase-default.xml_ and _hbase-site.xml_ configuration files. Important settings include
+`hbase.regionserver.region.split.policy`, `hbase.hregion.max.filesize`, and
+`hbase.regionserver.regionSplitLimit`. A simplistic view of splitting is that when a region grows
+to `hbase.hregion.max.filesize`, it is split. For most usage patterns, you should use automatic
+splitting. See <> for more
+information about manual region splitting.
-Instead of allowing HBase to split your regions automatically, you can choose to manage the splitting yourself.
-Manually managing splits works if you know your keyspace well, otherwise let HBase figure where to split for you.
-Manual splitting can mitigate region creation and movement under load.
-It also makes it so region boundaries are known and invariant (if you disable region splitting). If you use manual splits, it is easier doing staggered, time-based major compactions to spread out your network IO load.
+Instead of allowing HBase to split your regions automatically, you can choose to manage the
+splitting yourself.
Manually managing splits works if you know your keyspace well, otherwise let +HBase figure where to split for you. Manual splitting can mitigate region creation and movement +under load. It also makes it so region boundaries are known and invariant (if you disable region +splitting). If you use manual splits, it is easier doing staggered, time-based major compactions +to spread out your network IO load. .Disable Automatic Splitting -To disable automatic splitting, you can set region split policy in either cluster configuration or table configuration to be `org.apache.hadoop.hbase.regionserver.DisabledRegionSplitPolicy` +To disable automatic splitting, you can set region split policy in either cluster configuration +or table configuration to be `org.apache.hadoop.hbase.regionserver.DisabledRegionSplitPolicy` .Automatic Splitting Is Recommended [NOTE] ==== -If you disable automatic splits to diagnose a problem or during a period of fast data growth, it is recommended to re-enable them when your situation becomes more stable. -The potential benefits of managing region splits yourself are not undisputed. +If you disable automatic splits to diagnose a problem or during a period of fast data growth, it +is recommended to re-enable them when your situation becomes more stable. The potential benefits +of managing region splits yourself are not undisputed. ==== .Determine the Optimal Number of Pre-Split Regions -The optimal number of pre-split regions depends on your application and environment. -A good rule of thumb is to start with 10 pre-split regions per server and watch as data grows over time. -It is better to err on the side of too few regions and perform rolling splits later. -The optimal number of regions depends upon the largest StoreFile in your region. -The size of the largest StoreFile will increase with time if the amount of data grows. -The goal is for the largest region to be just large enough that the compaction selection algorithm only compacts it during a timed major compaction. -Otherwise, the cluster can be prone to compaction storms with a large number of regions under compaction at the same time. -It is important to understand that the data growth causes compaction storms and not the manual split decision. +The optimal number of pre-split regions depends on your application and environment. A good rule of +thumb is to start with 10 pre-split regions per server and watch as data grows over time. It is +better to err on the side of too few regions and perform rolling splits later. The optimal number +of regions depends upon the largest StoreFile in your region. The size of the largest StoreFile +will increase with time if the amount of data grows. The goal is for the largest region to be just +large enough that the compaction selection algorithm only compacts it during a timed major +compaction. Otherwise, the cluster can be prone to compaction storms with a large number of regions +under compaction at the same time. It is important to understand that the data growth causes +compaction storms and not the manual split decision. -If the regions are split into too many large regions, you can increase the major compaction interval by configuring `HConstants.MAJOR_COMPACTION_PERIOD`. -The `org.apache.hadoop.hbase.util.RegionSplitter` utility also provides a network-IO-safe rolling split of all regions. +If the regions are split into too many large regions, you can increase the major compaction +interval by configuring `HConstants.MAJOR_COMPACTION_PERIOD`. 
The +`org.apache.hadoop.hbase.util.RegionSplitter` utility also provides a network-IO-safe rolling +split of all regions. [[managed.compactions]] ==== Managed Compactions By default, major compactions are scheduled to run once in a 7-day period. -If you need to control exactly when and how often major compaction runs, you can disable managed major compactions. -See the entry for `hbase.hregion.majorcompaction` in the <> table for details. +If you need to control exactly when and how often major compaction runs, you can disable managed +major compactions. See the entry for `hbase.hregion.majorcompaction` in the +<> table for details. .Do Not Disable Major Compactions [WARNING] ==== -Major compactions are absolutely necessary for StoreFile clean-up. -Do not disable them altogether. -You can run major compactions manually via the HBase shell or via the link:https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Admin.html#majorCompact-org.apache.hadoop.hbase.TableName-[Admin API]. +Major compactions are absolutely necessary for StoreFile clean-up. Do not disable them altogether. +You can run major compactions manually via the HBase shell or via the +link:https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Admin.html#majorCompact-org.apache.hadoop.hbase.TableName-[Admin API]. ==== -For more information about compactions and the compaction file selection process, see <> +For more information about compactions and the compaction file selection process, see +<> [[spec.ex]] ==== Speculative Execution -Speculative Execution of MapReduce tasks is on by default, and for HBase clusters it is generally advised to turn off Speculative Execution at a system-level unless you need it for a specific case, where it can be configured per-job. -Set the properties `mapreduce.map.speculative` and `mapreduce.reduce.speculative` to false. +Speculative Execution of MapReduce tasks is on by default, and for HBase clusters it is generally +advised to turn off Speculative Execution at a system-level unless you need it for a specific case, +where it can be configured per-job. Set the properties `mapreduce.map.speculative` and +`mapreduce.reduce.speculative` to false. [[other_configuration]] === Other Configurations @@ -881,34 +1091,49 @@ Set the properties `mapreduce.map.speculative` and `mapreduce.reduce.speculative [[balancer_config]] ==== Balancer -The balancer is a periodic operation which is run on the master to redistribute regions on the cluster. -It is configured via `hbase.balancer.period` and defaults to 300000 (5 minutes). +The balancer is a periodic operation which is run on the master to redistribute regions on the +cluster. It is configured via `hbase.balancer.period` and defaults to 300000 (5 minutes). -See <> for more information on the LoadBalancer. +See <> for more information on the +LoadBalancer. [[disabling.blockcache]] ==== Disabling Blockcache -Do not turn off block cache (You'd do it by setting `hfile.block.cache.size` to zero). Currently we do not do well if you do this because the RegionServer will spend all its time loading HFile indices over and over again. -If your working set is such that block cache does you no good, at least size the block cache such that HFile indices will stay up in the cache (you can get a rough idea on the size you need by surveying RegionServer UIs; you'll see index block size accounted near the top of the webpage). +Do not turn off block cache (You'd do it by setting `hfile.block.cache.size` to zero). 
Currently, +we do not do well if you do this because the RegionServer will spend all its time loading HFile +indices over and over again. If your working set is such that block cache does you no good, at +least size the block cache such that HFile indices will stay up in the cache (you can get a rough +idea on the size you need by surveying RegionServer UIs; you'll see index block size accounted near +the top of the webpage). [[nagles]] ==== link:http://en.wikipedia.org/wiki/Nagle's_algorithm[Nagle's] or the small package problem If a big 40ms or so occasional delay is seen in operations against HBase, try the Nagles' setting. -For example, see the user mailing list thread, link:http://search-hadoop.com/m/pduLg2fydtE/Inconsistent+scan+performance+with+caching+set+&subj=Re+Inconsistent+scan+performance+with+caching+set+to+1[Inconsistent scan performance with caching set to 1] and the issue cited therein where setting `notcpdelay` improved scan speeds. -You might also see the graphs on the tail of link:https://issues.apache.org/jira/browse/HBASE-7008[HBASE-7008 Set scanner caching to a better default] where our Lars Hofhansl tries various data sizes w/ Nagle's on and off measuring the effect. +For example, see the user mailing list thread, +link:http://search-hadoop.com/m/pduLg2fydtE/Inconsistent+scan+performance+with+caching+set+&subj=Re+Inconsistent+scan+performance+with+caching+set+to+1[Inconsistent scan performance with caching set to 1] +and the issue cited therein where setting `notcpdelay` improved scan speeds. You might also see the +graphs on the tail of +link:https://issues.apache.org/jira/browse/HBASE-7008[HBASE-7008 Set scanner caching to a better default] +where our Lars Hofhansl tries various data sizes w/ Nagle's on and off measuring the effect. [[mttr]] ==== Better Mean Time to Recover (MTTR) -This section is about configurations that will make servers come back faster after a fail. -See the Deveraj Das and Nicolas Liochon blog post link:http://hortonworks.com/blog/introduction-to-hbase-mean-time-to-recover-mttr/[Introduction to HBase Mean Time to Recover (MTTR)] for a brief introduction. +This section is about configurations that will make servers come back faster after a fail. See the +Deveraj Das and Nicolas Liochon blog post +link:http://hortonworks.com/blog/introduction-to-hbase-mean-time-to-recover-mttr/[Introduction to HBase Mean Time to Recover (MTTR)] +for a brief introduction. -The issue link:https://issues.apache.org/jira/browse/HBASE-8389[HBASE-8354 forces Namenode into loop with lease recovery requests] is messy but has a bunch of good discussion toward the end on low timeouts and how to cause faster recovery including citation of fixes added to HDFS. Read the Varun Sharma comments. -The below suggested configurations are Varun's suggestions distilled and tested. -Make sure you are running on a late-version HDFS so you have the fixes he refers to and himself adds to HDFS that help HBase MTTR (e.g. -HDFS-3703, HDFS-3712, and HDFS-4791 -- Hadoop 2 for sure has them and late Hadoop 1 has some). Set the following in the RegionServer. +The issue +link:https://issues.apache.org/jira/browse/HBASE-8389[HBASE-8354 forces Namenode into loop with lease recovery requests] +is messy but has a bunch of good discussion toward the end on low timeouts and how to cause faster +recovery including citation of fixes added to HDFS. Read the Varun Sharma comments. The below +suggested configurations are Varun's suggestions distilled and tested. 
Make sure you are running +on a late-version HDFS so you have the fixes he refers to and himself adds to HDFS that help HBase +MTTR (e.g. HDFS-3703, HDFS-3712, and HDFS-4791 -- Hadoop 2 for sure has them and late Hadoop 1 has +some). Set the following in the RegionServer. [source,xml] ---- @@ -925,7 +1150,8 @@ HDFS-3703, HDFS-3712, and HDFS-4791 -- Hadoop 2 for sure has them and late Hadoo ---- -And on the NameNode/DataNode side, set the following to enable 'staleness' introduced in HDFS-3703, HDFS-3912. +And on the NameNode/DataNode side, set the following to enable 'staleness' introduced in HDFS-3703, +HDFS-3912. [source,xml] ---- @@ -969,13 +1195,17 @@ And on the NameNode/DataNode side, set the following to enable 'staleness' intro [[jmx_config]] ==== JMX -JMX (Java Management Extensions) provides built-in instrumentation that enables you to monitor and manage the Java VM. -To enable monitoring and management from remote systems, you need to set system property `com.sun.management.jmxremote.port` (the port number through which you want to enable JMX RMI connections) when you start the Java VM. -See the link:http://docs.oracle.com/javase/8/docs/technotes/guides/management/agent.html[official documentation] for more information. -Historically, besides above port mentioned, JMX opens two additional random TCP listening ports, which could lead to port conflict problem. (See link:https://issues.apache.org/jira/browse/HBASE-10289[HBASE-10289] for details) +JMX (Java Management Extensions) provides built-in instrumentation that enables you to monitor and +manage the Java VM. To enable monitoring and management from remote systems, you need to set system +property `com.sun.management.jmxremote.port` (the port number through which you want to enable JMX +RMI connections) when you start the Java VM. See the +link:http://docs.oracle.com/javase/8/docs/technotes/guides/management/agent.html[official documentation] +for more information. Historically, besides above port mentioned, JMX opens two additional random +TCP listening ports, which could lead to port conflict problem. (See +link:https://issues.apache.org/jira/browse/HBASE-10289[HBASE-10289] for details) -As an alternative, you can use the coprocessor-based JMX implementation provided by HBase. -To enable it, add below property in _hbase-site.xml_: +As an alternative, you can use the coprocessor-based JMX implementation provided by HBase. To +enable it, add below property in _hbase-site.xml_: [source,xml] ---- @@ -988,7 +1218,8 @@ To enable it, add below property in _hbase-site.xml_: NOTE: DO NOT set `com.sun.management.jmxremote.port` for Java VM at the same time. Currently it supports Master and RegionServer Java VM. -By default, the JMX listens on TCP port 10102, you can further configure the port using below properties: +By default, the JMX listens on TCP port 10102, you can further configure the port using below +properties: [source,xml] ---- @@ -1002,11 +1233,12 @@ By default, the JMX listens on TCP port 10102, you can further configure the por ---- -The registry port can be shared with connector port in most cases, so you only need to configure regionserver.rmi.registry.port. -However if you want to use SSL communication, the 2 ports must be configured to different values. +The registry port can be shared with connector port in most cases, so you only need to configure +`regionserver.rmi.registry.port`. However, if you want to use SSL communication, the 2 ports must +be configured to different values. 
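+
+For example, a minimal _hbase-site.xml_ sketch of the SSL case (the connector port value below is
+purely illustrative; any free port different from the registry port will do):
+
+[source,xml]
+----
+<!-- registry port: defaults to 10102 -->
+<property>
+  <name>regionserver.rmi.registry.port</name>
+  <value>10102</value>
+</property>
+<!-- connector port: must differ from the registry port when SSL is enabled -->
+<property>
+  <name>regionserver.rmi.connector.port</name>
+  <value>10103</value>
+</property>
+----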
By default the password authentication and SSL communication is disabled. -To enable password authentication, you need to update _hbase-env.sh_ like below: +To enable password authentication, you need to update _hbase-env.sh_ like below: [source,bash] ---- export HBASE_JMX_BASE="-Dcom.sun.management.jmxremote.authenticate=true \ @@ -1039,7 +1271,7 @@ And then update _hbase-env.sh_ like below: ---- export HBASE_JMX_BASE="-Dcom.sun.management.jmxremote.ssl=true \ -Djavax.net.ssl.keyStore=/home/tianq/myKeyStore \ - -Djavax.net.ssl.keyStorePassword=your_password_in_step_1 \ + -Djavax.net.ssl.keyStorePassword=your_password_in_step_1 \ -Dcom.sun.management.jmxremote.authenticate=true \ -Dcom.sun.management.jmxremote.password.file=your_password file \ -Dcom.sun.management.jmxremote.access.file=your_access_file" @@ -1055,7 +1287,8 @@ Finally start `jconsole` on the client using the key store: jconsole -J-Djavax.net.ssl.trustStore=/home/tianq/jconsoleKeyStore ---- -NOTE: To enable the HBase JMX implementation on Master, you also need to add below property in _hbase-site.xml_: +NOTE: To enable the HBase JMX implementation on Master, you also need to add below property in +_hbase-site.xml_: [source,xml] ---- @@ -1065,13 +1298,15 @@ NOTE: To enable the HBase JMX implementation on Master, you also need to add bel ---- -The corresponding properties for port configuration are `master.rmi.registry.port` (by default 10101) and `master.rmi.connector.port` (by default the same as registry.port) +The corresponding properties for port configuration are `master.rmi.registry.port` (by default +10101) and `master.rmi.connector.port` (by default the same as registry.port) [[dyn_config]] == Dynamic Configuration -It is possible to change a subset of the configuration without requiring a server restart. -In the HBase shell, the operations `update_config` and `update_all_config` will prompt a server or all servers to reload configuration. +It is possible to change a subset of the configuration without requiring a server restart. In the +HBase shell, the operations `update_config` and `update_all_config` will prompt a server or all +servers to reload configuration. Only a subset of all configurations can currently be changed in the running server. Here are those configurations: diff --git a/src/main/asciidoc/_chapters/cp.adoc b/src/main/asciidoc/_chapters/cp.adoc index e769efd2840..43aa55137b9 100644 --- a/src/main/asciidoc/_chapters/cp.adoc +++ b/src/main/asciidoc/_chapters/cp.adoc @@ -139,7 +139,7 @@ Referential Integrity:: Secondary Indexes:: You can use a coprocessor to maintain secondary indexes. For more information, see - link:https://wiki.apache.org/hadoop/Hbase/SecondaryIndexing[SecondaryIndexing]. + link:https://cwiki.apache.org/confluence/display/HADOOP2/Hbase+SecondaryIndexing[SecondaryIndexing]. ==== Types of Observer Coprocessor @@ -178,11 +178,28 @@ average or summation for an entire table which spans hundreds of regions. 
In contrast to observer coprocessors, where your code is run transparently, endpoint coprocessors must be explicitly invoked using the -link:https://hbase.apache.org/devapidocs/org/apache/hadoop/hbase/client/Table.html#coprocessorService-java.lang.Class-byte:A-byte:A-org.apache.hadoop.hbase.client.coprocessor.Batch.Call-[CoprocessorService()] +link:https://hbase.apache.org/devapidocs/org/apache/hadoop/hbase/client/AsyncTable.html#coprocessorService-java.util.function.Function-org.apache.hadoop.hbase.client.ServiceCaller-byte:A-[CoprocessorService()] method available in -link:https://hbase.apache.org/devapidocs/org/apache/hadoop/hbase/client/Table.html[Table] -or -link:https://hbase.apache.org/devapidocs/org/apache/hadoop/hbase/client/HTable.html[HTable]. +link:https://hbase.apache.org/devapidocs/org/apache/hadoop/hbase/client/AsyncTable.html[AsyncTable]. + +[WARNING] +.On using coprocessorService method with sync client +==== +The coprocessorService method in link:https://hbase.apache.org/devapidocs/org/apache/hadoop/hbase/client/Table.html[Table] +has been deprecated. + +In link:https://issues.apache.org/jira/browse/HBASE-21512[HBASE-21512] +we reimplement the sync client based on the async client. The coprocessorService +method defined in `Table` interface directly references a method from protobuf's +`BlockingInterface`, which means we need to use a separate thread pool to execute +the method so we avoid blocking the async client(We want to avoid blocking calls in +our async implementation). + +Since coprocessor is an advanced feature, we believe it is OK for coprocessor users to +instead switch over to use `AsyncTable`. There is a lightweight +link:https://hbase.apache.org/devapidocs/org/apache/hadoop/hbase/client/Connection.html#toAsyncConnection--[toAsyncConnection] +method to get an `AsyncConnection` from `Connection` if needed. +==== Starting with HBase 0.96, endpoint coprocessors are implemented using Google Protocol Buffers (protobuf). For more details on protobuf, see Google's @@ -193,6 +210,12 @@ link:https://issues.apache.org/jira/browse/HBASE-5448[HBASE-5448]). To upgrade y HBase cluster from 0.94 or earlier to 0.96 or later, you need to reimplement your coprocessor. +In HBase 2.x, we made use of a shaded version of protobuf 3.x, but kept the +protobuf for coprocessors on 2.5.0. In HBase 3.0.0, we removed all dependencies on +non-shaded protobuf so you need to reimplement your coprocessor to make use of the +shaded protobuf version provided in hbase-thirdparty. Please see +the <> section for more details. + Coprocessor Endpoints should make no use of HBase internals and only avail of public APIs; ideally a CPEP should depend on Interfaces and data structures only. This is not always possible but beware @@ -310,13 +333,6 @@ dependencies. [[load_coprocessor_in_shell]] ==== Using HBase Shell -. Disable the table using HBase Shell: -+ -[source] ----- -hbase> disable 'users' ----- - . Load the Coprocessor, using a command like the following: + [source] @@ -346,12 +362,6 @@ observers registered at the same hook using priorities. This field can be left b case the framework will assign a default priority value. * Arguments (Optional): This field is passed to the Coprocessor implementation. This is optional. -. Enable the table. -+ ----- -hbase(main):003:0> enable 'users' ----- - . 
Verify that the coprocessor loaded: + ---- @@ -372,7 +382,6 @@ String path = "hdfs://:/user//coprocessor.jar"; Configuration conf = HBaseConfiguration.create(); Connection connection = ConnectionFactory.createConnection(conf); Admin admin = connection.getAdmin(); -admin.disableTable(tableName); HTableDescriptor hTableDescriptor = new HTableDescriptor(tableName); HColumnDescriptor columnFamily1 = new HColumnDescriptor("personalDet"); columnFamily1.setMaxVersions(3); @@ -384,7 +393,6 @@ hTableDescriptor.setValue("COPROCESSOR$1", path + "|" + RegionObserverExample.class.getCanonicalName() + "|" + Coprocessor.PRIORITY_USER); admin.modifyTable(tableName, hTableDescriptor); -admin.enableTable(tableName); ---- ==== Using the Java API (HBase 0.96+ only) @@ -399,7 +407,6 @@ Path path = new Path("hdfs://:/user//coprocessor.ja Configuration conf = HBaseConfiguration.create(); Connection connection = ConnectionFactory.createConnection(conf); Admin admin = connection.getAdmin(); -admin.disableTable(tableName); HTableDescriptor hTableDescriptor = new HTableDescriptor(tableName); HColumnDescriptor columnFamily1 = new HColumnDescriptor("personalDet"); columnFamily1.setMaxVersions(3); @@ -410,7 +417,6 @@ hTableDescriptor.addFamily(columnFamily2); hTableDescriptor.addCoprocessor(RegionObserverExample.class.getCanonicalName(), path, Coprocessor.PRIORITY_USER, null); admin.modifyTable(tableName, hTableDescriptor); -admin.enableTable(tableName); ---- WARNING: There is no guarantee that the framework will load a given Coprocessor successfully. @@ -422,13 +428,6 @@ verifies whether the given class is actually contained in the jar file. ==== Using HBase Shell -. Disable the table. -+ -[source] ----- -hbase> disable 'users' ----- - . Alter the table to remove the coprocessor. + [source] @@ -436,13 +435,6 @@ hbase> disable 'users' hbase> alter 'users', METHOD => 'table_att_unset', NAME => 'coprocessor$1' ---- -. Enable the table. -+ -[source] ----- -hbase> enable 'users' ----- - ==== Using the Java API Reload the table definition without setting the value of the coprocessor either by @@ -456,7 +448,6 @@ String path = "hdfs://:/user//coprocessor.jar"; Configuration conf = HBaseConfiguration.create(); Connection connection = ConnectionFactory.createConnection(conf); Admin admin = connection.getAdmin(); -admin.disableTable(tableName); HTableDescriptor hTableDescriptor = new HTableDescriptor(tableName); HColumnDescriptor columnFamily1 = new HColumnDescriptor("personalDet"); columnFamily1.setMaxVersions(3); @@ -465,7 +456,6 @@ HColumnDescriptor columnFamily2 = new HColumnDescriptor("salaryDet"); columnFamily2.setMaxVersions(3); hTableDescriptor.addFamily(columnFamily2); admin.modifyTable(tableName, hTableDescriptor); -admin.enableTable(tableName); ---- In HBase 0.96 and newer, you can instead use the `removeCoprocessor()` method of the @@ -499,6 +489,7 @@ The following Observer coprocessor prevents the details of the user `admin` from returned in a `Get` or `Scan` of the `users` table. . Write a class that implements the +link:https://hbase.apache.org/devapidocs/org/apache/hadoop/hbase/coprocessor/RegionCoprocessor.html[RegionCoprocessor], link:https://hbase.apache.org/devapidocs/org/apache/hadoop/hbase/coprocessor/RegionObserver.html[RegionObserver] class. @@ -516,16 +507,20 @@ empty result. Otherwise, process the request as normal. 
Following are the implementation of the above steps: - [source,java] ---- -public class RegionObserverExample implements RegionObserver { +public class RegionObserverExample implements RegionCoprocessor, RegionObserver { private static final byte[] ADMIN = Bytes.toBytes("admin"); private static final byte[] COLUMN_FAMILY = Bytes.toBytes("details"); private static final byte[] COLUMN = Bytes.toBytes("Admin_det"); private static final byte[] VALUE = Bytes.toBytes("You can't see Admin details"); + @Override + public Optional getRegionObserver() { + return Optional.of(this); + } + @Override public void preGetOp(final ObserverContext e, final Get get, final List results) throws IOException { diff --git a/src/main/asciidoc/_chapters/datamodel.adoc b/src/main/asciidoc/_chapters/datamodel.adoc index 4dad015d69b..dd54b1cc04c 100644 --- a/src/main/asciidoc/_chapters/datamodel.adoc +++ b/src/main/asciidoc/_chapters/datamodel.adoc @@ -425,7 +425,7 @@ Get get = new Get(Bytes.toBytes("row1")); get.setMaxVersions(3); // will return last 3 versions of row Result r = table.get(get); byte[] b = r.getValue(CF, ATTR); // returns current version of value -List kv = r.getColumn(CF, ATTR); // returns all versions of this column +List cells = r.getColumnCells(CF, ATTR); // returns all versions of this column ---- ==== Put @@ -471,6 +471,26 @@ Caution: the version timestamp is used internally by HBase for things like time- It's usually best to avoid setting this timestamp yourself. Prefer using a separate timestamp attribute of the row, or have the timestamp as a part of the row key, or both. +===== Cell Version Example + +The following Put uses a method getCellBuilder() to get a CellBuilder instance +that already has relevant Type and Row set. + +[source,java] +---- + +public static final byte[] CF = "cf".getBytes(); +public static final byte[] ATTR = "attr".getBytes(); +... + +Put put = new Put(Bytes.toBytes(row)); +put.add(put.getCellBuilder().setQualifier(ATTR) + .setFamily(CF) + .setValue(Bytes.toBytes(data)) + .build()); +table.put(put); +---- + [[version.delete]] ==== Delete diff --git a/src/main/asciidoc/_chapters/developer.adoc b/src/main/asciidoc/_chapters/developer.adoc index d17c79f775d..6987ffd6b63 100644 --- a/src/main/asciidoc/_chapters/developer.adoc +++ b/src/main/asciidoc/_chapters/developer.adoc @@ -284,22 +284,37 @@ For additional information on setting up Eclipse for HBase development on Window === IntelliJ IDEA -You can set up IntelliJ IDEA for similar functionality as Eclipse. -Follow these steps. +A functional development environment can be setup around an IntelliJ IDEA installation that has the +plugins necessary for building Java projects with Maven. -. Select -. You do not need to select a profile. - Be sure [label]#Maven project - required# is selected, and click btn:[Next]. -. Select the location for the JDK. +. Use either File > New > "Project from Existing Sources..." or "Project From Version Control.." +. Depending on your version of IntelliJ, you may need to choose Maven as the "project" or "model" +type. -.Using the HBase Formatter in IntelliJ IDEA -Using the Eclipse Code Formatter plugin for IntelliJ IDEA, you can import the HBase code formatter described in <>. +The following plugins are recommended: + +. Maven, bundled. This allows IntelliJ to resolve dependencies and recognize the project structure. +. EditorConfig, bundled. 
This will apply project whitespace settings found in the the +`.editorconfig` file available on branches with +link:https://issues.apache.org/jira/browse/HBASE-23234[HBASE-23234] or later. +. link:https://plugins.jetbrains.com/plugin/1065-checkstyle-idea/[Checkstyle-IDEA]. Configure this +against the configuration file found under `hbase-checkstyle/src/main/resources/hbase/checkstyle.xml` +(If the Intellij checkstyle plugin complains parsing the volunteered hbase `checkstyle.xml`, make +sure the plugin's `version` popup menu matches the hbase checkstyle version -- see +link:https://issues.apache.org/jira/browse/HBASE-23242[HBASE-23242] for more). +This plugin will highlight style errors in the IDE, so you can fix them before they get flagged during the +pre-commit process. +. link:https://plugins.jetbrains.com/plugin/8277-protobuf-support/[Protobuf Support]. HBase uses +link:https://developers.google.com/protocol-buffers/[Protocol Buffers] in a number of places where +serialization is required. This plugin is helpful when editing these object definitions. +. link:https://plugins.jetbrains.com/plugin/7391-asciidoc/[AsciiDoc]. HBase uses +link:http://asciidoc.org[AsciiDoc] for building it's project documentation. This plugin is helpful +when editing this book. === Other IDEs -It would be useful to mirror the <> set-up instructions for other IDEs. -If you would like to assist, please have a look at link:https://issues.apache.org/jira/browse/HBASE-11704[HBASE-11704]. +If you'd have another environment with which you'd like to develop on HBase, please consider +documenting your setup process here. [[build]] == Building Apache HBase @@ -307,16 +322,15 @@ If you would like to assist, please have a look at link:https://issues.apache.or [[build.basic]] === Basic Compile -HBase is compiled using Maven. -You must use at least Maven 3.0.4. -To check your Maven version, run the command +mvn -version+. +HBase is compiled using Maven. You must use at least Maven 3.0.4. To check your Maven version, run +the command +mvn -version+. -.JDK Version Requirements -[NOTE] -==== -Starting with HBase 1.0 you must use Java 7 or later to build from source code. -See <> for more complete information about supported JDK versions. -==== +[[build.basic.jdk_requirements]] +==== JDK Version Requirements + +HBase has Java version compiler requirements that vary by release branch. At compilation time, +HBase has the same version requirements as it does for runtime. See <> for a complete +support matrix of Java version by HBase version. [[maven.build.commands]] ==== Maven Build Commands @@ -382,28 +396,111 @@ mvn clean install -DskipTests See the <> section in <> [[maven.build.hadoop]] -==== Building against various hadoop versions. +==== Building against various Hadoop versions -HBase supports building against Apache Hadoop versions: 2.y and 3.y (early release artifacts). By default we build against Hadoop 2.x. +HBase supports building against Apache Hadoop versions: 2.y and 3.y (early release artifacts). +Exactly which version of Hadoop is used by default varies by release branch. See the section +<> for the complete breakdown of supported Hadoop version by HBase release. -To build against a specific release from the Hadoop 2.y line, set e.g. `-Dhadoop-two.version=2.6.3`. +The mechanism for selecting a Hadoop version at build time is identical across all releases. Which +version of Hadoop is default varies. We manage Hadoop major version selection by way of Maven +profiles. 
Due to the peculiarities of Maven profile mutual exclusion, the profile that builds +against a particular Hadoop version is activated by setting a property, *not* the usual profile +activation. Hadoop version profile activation is summarized by the following table. + +.Hadoop Profile Activation by HBase Release +[cols="3*^.^", options="header"] +|=== +| | Hadoop2 Activation | Hadoop3 Activation +| HBase 1.3+ | _active by default_ | `-Dhadoop.profile=3.0` +| HBase 3.0+ | _not supported_ | _active by default_ +|=== + +[WARNING] +==== +Please note that where a profile is active by default, `hadoop.profile` must NOT be provided. +==== + +Once the Hadoop major version profile is activated, the exact Hadoop version can be +specified by overriding the appropriate property value. For Hadoop2 versions, the property name +is `hadoop-two.version`. With Hadoop3 versions, the property name is `hadoop-three.version`. + +.Example 1, Building HBase 1.7 against Hadoop 2.10.0 + +For example, to build HBase 1.7 against Hadoop 2.10.0, the profile is set for Hadoop2 by default, +so only `hadoop-two.version` must be specified: [source,bourne] ---- -mvn -Dhadoop-two.version=2.6.3 ... +git checkout branch-1 +mvn -Dhadoop-two.version=2.10.0 ... ---- -To change the major release line of Hadoop we build against, add a hadoop.profile property when you invoke +mvn+: +.Example 2, Building HBase 2.3 against Hadoop 3.3.0-SNAPSHOT + +This is how a developer might check the compatibility of HBase 2.3 against an unreleased Hadoop +version (currently 3.3). Both the Hadoop3 profile and version must be specified: [source,bourne] ---- -mvn -Dhadoop.profile=3.0 ... +git checkout branch-2.3 +mvn -Dhadoop.profile=3.0 -Dhadoop-three.version=3.3.0-SNAPSHOT ... ---- -The above will build against whatever explicit hadoop 3.y version we have in our _pom.xml_ as our '3.0' version. -Tests may not all pass so you may need to pass `-DskipTests` unless you are inclined to fix the failing tests. +.Example 3, Building HBase 3.0 against Hadoop 3.3.0-SNAPSHOT -To pick a particular Hadoop 3.y release, you'd set hadoop-three.version property e.g. `-Dhadoop-three.version=3.0.0`. +The same developer might want also to check the development version of HBase (currently 3.0) +against the development version of Hadoop (currently 3.3). In this case, the Hadoop3 profile is +active by default, so only `hadoop-three.version` must be specified: + +[source,bourne] +---- +git checkout master +mvn -Dhadoop-three.version=3.3.0-SNAPSHOT ... +---- + +[[maven.build.jdk11_hadoop3]] +==== Building with JDK11 and Hadoop3 + +HBase manages JDK-specific build settings using Maven profiles. The profile appropriate to the JDK +in use is automatically activated. Building and running on JDK8 supports both Hadoop2 and Hadoop3. +For JDK11, only Hadoop3 is supported. Thus, the Hadoop3 profile must be active when building on +JDK11, and the artifacts used when running HBase on JDK11 must be compiled against Hadoop3. +Furthermore, the JDK11 profile requires a minimum Hadoop version of 3.2.0. This value is specified +by the JDK11 profile, but it can be overridden using the `hadoop-three.version` property as normal. +For details on Hadoop profile activation by HBase branch, see +<>. See <> for a complete +support matrix of Java version by HBase version. + +.Example 1, Building HBase 2.3 with JDK11 + +To build HBase 2.3 with JDK11, the Hadoop3 profile must be activated explicitly. 
+ +[source,bourne] +---- +git checkout branch-2.3 +JAVA_HOME=/usr/lib/jvm/java-11 mvn -Dhadoop.profile=3.0 ... +---- + +.Example 2, Building HBase 3.0 with JDK11 + +For HBase 3.0, the Hadoop3 profile is active by default, no additional properties need be +specified. + +[source,bourne] +---- +git checkout master +JAVA_HOME=/usr/lib/jvm/java-11 mvn ... +---- + +[[maven.build.jdk11_hadoop3_ide]] +==== Building and testing in an IDE with JDK11 and Hadoop3 + +Continuing the discussion from the <>, building and +testing with JDK11 and Hadoop3 within an IDE may require additional configuration. Specifically, +make sure the JVM version used by the IDE is a JDK11, the active JDK Maven profile is for JDK11, +and the Maven profile for JDK8 is NOT active. Likewise, ensure the Hadoop3 Maven profile is active +and the Hadoop2 Maven profile is NOT active. [[build.protobuf]] ==== Build Protobuf @@ -488,6 +585,40 @@ If you see `Unable to find resource 'VM_global_library.vm'`, ignore it. It's not an error. It is link:https://issues.apache.org/jira/browse/MSITE-286[officially ugly] though. +[[build.on.linux.aarch64]] +=== Build On Linux Aarch64 +HBase runs on both Windows and UNIX-like systems, and it should run on any platform +that runs a supported version of Java. This should include JVMs on x86_64 and aarch64. +The documentation below describes how to build hbase on aarch64 platform. + +==== Set Environment Variables +Manually install Java and Maven on aarch64 servers if they are not installed, +and set environment variables. For example: + +[source,bourne] +---- +export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-arm64 +export MAVEN_HOME=/opt/maven +export PATH=${MAVEN_HOME}/bin:${JAVA_HOME}/bin:${PATH} +---- + +==== Use Protobuf Supported On Aarch64 +Now HBase uses protobuf of two versions. Version '3.11.4' of protobuf that hbase uses +internally and version '2.5.0' as external usage. +Package protoc-2.5.0 does not work on aarch64 platform, we should add maven +profile '-Paarch64' when building. It downloads protoc-2.5.0 package from maven +repository which we made on aarch64 platform locally. + +[source,bourne] +---- +mvn clean install -Paarch64 -DskipTests +---- + +[NOTE] +Protobuf is released with aarch64 protoc since version '3.5.0', and we are planning to +upgrade protobuf later, then we don't have to add the profile '-Paarch64' anymore. + + [[releasing]] == Releasing Apache HBase @@ -546,24 +677,173 @@ For the build to sign them for you, you a properly configured _settings.xml_ in [[maven.release]] === Making a Release Candidate -Only committers may make releases of hbase artifacts. +Only committers can make releases of hbase artifacts. .Before You Begin -Make sure your environment is properly set up. Maven and Git are the main tooling -used in the below. You'll need a properly configured _settings.xml_ file in your -local _~/.m2_ maven repository with logins for apache repos (See <>). -You will also need to have a published signing key. Browse the Hadoop -link:http://wiki.apache.org/hadoop/HowToRelease[How To Release] wiki page on -how to release. It is a model for most of the instructions below. It often has more -detail on particular steps, for example, on adding your code signing key to the -project KEYS file up in Apache or on how to update JIRA in preparation for release. - -Before you make a release candidate, do a practice run by deploying a SNAPSHOT. Check to be sure recent builds have been passing for the branch from where you are going to take your release. 
You should also have tried recent branch tips out on a cluster under load, perhaps by running the `hbase-it` integration test suite for a few hours to 'burn in' the near-candidate bits. +You will need a published signing key added to the hbase +link:https://dist.apache.org/repos/dist/release/hbase/KEYS[KEYS] file. +(For how to add a KEY, see _Step 1._ in link:https://cwiki.apache.org/confluence/display/HADOOP2/HowToRelease[How To Release], +the Hadoop version of this document). + +Next make sure JIRA is properly primed, that all issues targeted against +the prospective release have been resolved and are present in git on the +particular branch. If any outstanding issues, move them out of the release by +adjusting the fix version to remove this pending release as a target. +Any JIRA with a fix version that matches the release candidate +target release will be included in the generated _CHANGES.md/RELEASENOTES.md_ +files that ship with the release so make sure JIRA is correct before you begin. + +After doing the above, you can move to the manufacture of an RC. +Building an RC is involved. We've tried to script it. In the next section +we describe the script. It is followed by a description of the steps +involved which the script automates. + +[[do-release-docker.sh]] +==== Release Candidate Generating Script + +The _dev-support/create-release/do-release-docker.sh_ Release Candidate (RC) +Generating script is maintained in the master branch but can generate RCs +for any 2.x+ branch (The script does not work against branch-1). Check out +and update the master branch when making RCs. + +The script builds in a Docker container to ensure we have a consistent +environment building. It will ask you for passwords for apache and for your +gpg signing key so it can sign and commit on your behalf. The passwords +are passed to gpg-agent in the container and purged along with the container +when the build is done. + +[NOTE] +==== +_dev-support/create-release/do-release-docker.sh_ supercedes the previous +_dev-support/make_rc.sh_ script. It is more comprehensive automating all +steps, rather than a portion, building a RC. +==== + +The script will: + + * Set version to the release version + * Updates RELEASENOTES.md and CHANGES.md + * Tag the RC + * Set version to next SNAPSHOT version. + * Builds, signs, and hashes all artifacts. + * Generates the api compatibility report + * Pushes release tgzs to the dev dir in a apache dist. + * Pushes to repository.apache.org staging. + * Creates vote email template. + +The RC building script is _dev-support/create-release/do-release-docker.sh_. +Pass _-h_ to _dev-support/create-release/do-release-docker.sh_ to +see available options: + +``` +Usage: do-release-docker.sh [options] + +This script runs the release scripts inside a docker image. + +Options: + + -d [path] required. working directory. output will be written to "output" in here. + -n dry run mode. Checks and local builds, but does not upload anything. + -t [tag] tag for the hbase-rm docker image to use for building (default: "latest"). + -j [path] path to local JDK installation to use building. By default the script will + use openjdk8 installed in the docker image. + -s [step] runs a single step of the process; valid steps are: tag, build, publish. if + none specified, runs tag, then build, and then publish. 
+``` + +Running the below command will do all steps above using the +'rm' working directory under Downloads as workspace: +``` + $ ./dev-support/create-release/do-release-docker.sh -d ~/Downloads/rm +``` + +The script will ask you a set of questions about the release version +and branch, the version to generate the compatibility report against, +and so on, before it starts executing (If you set the appropriate +environment variables, the script will skip asking you questions -- +which can come in handy if you end up having to re-run the script +multiple times). + +On branch 2.1, a Release Candidate (RC) creation can take many hours +(~8 hours) so run your build on a machine you know will be +around for this swath of time. Start the build inside a _screen_ +or _tmux_ session in case you become disconnected from your +build box. + +The build is made of three stages: tag, build, and +publish. If the script fails, you may resort to 'fixing' the +failure manually and then asking the script to run the +subsequent stage rather than start over. + +When the scripts run, they use the passed working directory. +Under the working directory is an _output_ dir. In here is +where the checkouts go, where we build up the _svn_ directory +to _svn_ commit to _apache/dist/dev_, etc. Each step also +dumps a log file in here: e.g. _tag.log_ for the tagging +step and _build.log_ for building. + +The _tagging_ step will checkout hbase, set the version number +in all the poms – e.g. if branch-2.0 is at 2.0.6-SNAPSHOT +and you are making a 2.0.5 RC, it will set the versions in +all poms to 2.0.5 – appropriately. It then generate CHANGES.md +and RELEASENOTES.md by checking out yetus and then +calling its generator scripts. It then commits the poms with +their new versions along with the changed CHANGES.md and +RELEASENOTES.md, tags, and pushes up all changes to the +apache repo. + +The _build_ step will checkout hbase, build all including +javadoc and doc (javadoc takes the bulk of the time – 4 hours plus), +run assemblies to produce src and bin tarballs, sign and hash it +all, and then make a dir under apache dist dev named for the RC. +It will copy all artifacts in here including top-level CHANGES.md +and RELEASENOTES.md. It will generate api diff docs and put them +into this RC dir too. When done, it commits the svn RC. + +The publish step will checkout hbase, build, and then copy up all +artifacts to repository.apache.org (signed and hashed). When done, +it will dump out an email template with all the correct links in place. + +Check the artifacts pushed to the dev distribution directory and up +in repository.apache.org. If all looks good, check the generated +email and send to the dev list. + +Under the create-release dir, scripts should make some sense: +``` +do-release-docker.sh # Main entrance. +do-release.sh . # More checks. Not usable really other than by setting env variables before running it. +release-tag.sh # Does tagging steps. +release-build.sh . # Does the build and publish step. +release-util.sh # Utility used by all of the above. +vote.tmpl # Template for email to send out. +hbase-rm # Has docker image we use. +``` + +If the RC fails, the script will do the right thing when it comes +to edit of the _CHANGES.md_ and _RELEASENOTES.md_ removing the old +and updating the files with the updated content (No harm verifying +though). + +One trick for checking stuff especially in utility is to do as follows: + +``` +$ source release-util.sh ; generate_api_report ../../ rel/2.1.3 2.14RC1 +``` + +i.e. 
source the release-util.sh script and then run one of its functions +passing args. Helped debugging stuff. + +[[rc_procedure]] +==== Release Candidate Procedure +Here we describe the steps involved generating a Release Candidate, the steps +automated by the script described in the previous section. + +The process below makes use of various tools, mainly _git_ and _maven_. .Specifying the Heap Space for Maven [NOTE] @@ -579,75 +859,60 @@ MAVEN_OPTS="-Xmx4g -XX:MaxPermSize=256m" mvn package You could also set this in an environment variable or alias in your shell. ==== +===== Update the _CHANGES.md_ and _RELEASENOTES.md_ files and the POM files. -[NOTE] -==== -The script _dev-support/make_rc.sh_ automates many of the below steps. -It will checkout a tag, clean the checkout, build src and bin tarballs, -and deploy the built jars to repository.apache.org. -It does NOT do the modification of the _CHANGES.txt_ for the release, -the checking of the produced artifacts to ensure they are 'good' -- -e.g. extracting the produced tarballs, verifying that they -look right, then starting HBase and checking that everything is running -correctly -- or the signing and pushing of the tarballs to -link:https://people.apache.org[people.apache.org]. -Take a look. Modify/improve as you see fit. -==== +Update _CHANGES.md_ with the changes since the last release. Be careful with where you put +headings and license. Respect the instructions and warning you find in current +_CHANGES.md_ and _RELEASENOTES.md_ since these two files are processed by tooling that is +looking for particular string sequences. See link:https://issues.apache.org/jira/browse/HBASE-21399[HBASE-21399] +for description on how to make use of yetus generating additions to +_CHANGES.md_ and _RELEASENOTES.md_ (RECOMMENDED!). Adding JIRA fixes, make sure the +URL to the JIRA points to the proper location which lists fixes for this release. -.Procedure: Release Procedure -. Update the _CHANGES.txt_ file and the POM files. -+ -Update _CHANGES.txt_ with the changes since the last release. -Make sure the URL to the JIRA points to the proper location which lists fixes for this release. -Adjust the version in all the POM files appropriately. +Next, adjust the version in all the POM files appropriately. If you are making a release candidate, you must remove the `-SNAPSHOT` label from all versions in all pom.xml files. If you are running this receipe to publish a snapshot, you must keep the `-SNAPSHOT` suffix on the hbase version. The link:http://www.mojohaus.org/versions-maven-plugin/[Versions Maven Plugin] can be of use here. To set a version in all the many poms of the hbase multi-module project, use a command like the following: -+ + [source,bourne] ---- $ mvn clean org.codehaus.mojo:versions-maven-plugin:2.5:set -DnewVersion=2.1.0-SNAPSHOT ---- -+ -Make sure all versions in poms are changed! Checkin the _CHANGES.txt_ and any maven version changes. -. Update the documentation. -+ +Make sure all versions in poms are changed! Checkin the _CHANGES.md_, _RELEASENOTES.md_, and +any maven version changes. + +===== Update the documentation. + Update the documentation under _src/main/asciidoc_. This usually involves copying the latest from master branch and making version-particular -adjustments to suit this release candidate version. +adjustments to suit this release candidate version. Commit your changes. -. Clean the checkout dir -+ +===== Clean the checkout dir [source,bourne] ---- - $ mvn clean $ git clean -f -x -d ---- - -. 
Run Apache-Rat +===== Run Apache-Rat Check licenses are good -+ + [source,bourne] ---- - -$ mvn apache-rat +$ mvn apache-rat:check ---- -+ + If the above fails, check the rat log. -+ [source,bourne] ---- $ grep 'Rat check' patchprocess/mvn_apache_rat.log ---- -+ -. Create a release tag. +===== Create a release tag. Presuming you have run basic tests, the rat check, passes and all is looking good, now is the time to tag the release candidate (You always remove the tag if you need to redo). To tag, do @@ -656,10 +921,8 @@ All tags should be signed tags; i.e. pass the _-s_ option (See link:http://https://git-scm.com/book/id/v2/Git-Tools-Signing-Your-Work[Signing Your Work] for how to set up your git environment for signing). -+ [source,bourne] ---- - $ git tag -s 2.0.0-alpha4-RC0 -m "Tagging the 2.0.0-alpha4 first Releae Candidate (Candidates start at zero)" ---- @@ -672,25 +935,22 @@ they are preserved in the Apache repo as in: ---- Push the (specific) tag (only) so others have access. -+ + [source,bourne] ---- - $ git push origin 2.0.0-alpha4-RC0 ---- -+ + For how to delete tags, see link:http://www.manikrathee.com/how-to-delete-a-tag-in-git.html[How to Delete a Tag]. Covers deleting tags that have not yet been pushed to the remote Apache repo as well as delete of tags pushed to Apache. - -. Build the source tarball. -+ +===== Build the source tarball. Now, build the source tarball. Lets presume we are building the source tarball for the tag _2.0.0-alpha4-RC0_ into _/tmp/hbase-2.0.0-alpha4-RC0/_ (This step requires that the mvn and git clean steps described above have just been done). -+ + [source,bourne] ---- $ git archive --format=tar.gz --output="/tmp/hbase-2.0.0-alpha4-RC0/hbase-2.0.0-alpha4-src.tar.gz" --prefix="hbase-2.0.0-alpha4/" $git_tag @@ -701,7 +961,7 @@ _/tmp/hbase-2.0.0-alpha4-RC0_ build output directory (We don't want the _RC0_ in These bits are currently a release candidate but if the VOTE passes, they will become the release so we do not taint the artifact names with _RCX_). -. Build the binary tarball. +===== Build the binary tarball. Next, build the binary tarball. Add the `-Prelease` profile when building. It runs the license apache-rat check among other rules that help ensure all is wholesome. Do it in two steps. @@ -710,7 +970,6 @@ First install into the local repository [source,bourne] ---- - $ mvn clean install -DskipTests -Prelease ---- @@ -720,24 +979,21 @@ documentation. [source,bourne] ---- - $ mvn install -DskipTests site assembly:single -Prelease ---- -+ Otherwise, the build complains that hbase modules are not in the maven repository when you try to do it all in one step, especially on a fresh repository. It seems that you need the install goal in both steps. -+ + Extract the generated tarball -- you'll find it under _hbase-assembly/target_ and check it out. Look at the documentation, see if it runs, etc. If good, copy the tarball beside the source tarball in the build output directory. +===== Deploy to the Maven Repository. -. Deploy to the Maven Repository. -+ Next, deploy HBase to the Apache Maven repository. Add the apache-release` profile when running the `mvn deploy` command. This profile comes from the Apache parent pom referenced by our pom files. @@ -746,43 +1002,42 @@ _settings.xml_ is configured correctly, as described in <>. This step depends on the local repository having been populate by the just-previous bin tarball build. 
-+ + [source,bourne] ---- - $ mvn deploy -DskipTests -Papache-release -Prelease ---- -+ + This command copies all artifacts up to a temporary staging Apache mvn repository in an 'open' state. More work needs to be done on these maven artifacts to make them generally available. -+ + We do not release HBase tarball to the Apache Maven repository. To avoid deploying the tarball, do not include the `assembly:single` goal in your `mvn deploy` command. Check the deployed artifacts as described in the next section. .make_rc.sh [NOTE] ==== -If you run the _dev-support/make_rc.sh_ script, this is as far as it takes you. +If you ran the old _dev-support/make_rc.sh_ script, this is as far as it takes you. To finish the release, take up the script from here on out. ==== -. Make the Release Candidate available. -+ +===== Make the Release Candidate available. + The artifacts are in the maven repository in the staging area in the 'open' state. While in this 'open' state you can check out what you've published to make sure all is good. To do this, log in to Apache's Nexus at link:https://repository.apache.org[repository.apache.org] using your Apache ID. Find your artifacts in the staging repository. Click on 'Staging Repositories' and look for a new one ending in "hbase" with a status of 'Open', select it. Use the tree view to expand the list of repository contents and inspect if the artifacts you expect are present. Check the POMs. As long as the staging repo is open you can re-upload if something is missing or built incorrectly. -+ + If something is seriously wrong and you would like to back out the upload, you can use the 'Drop' button to drop and delete the staging repository. Sometimes the upload fails in the middle. This is another reason you might have to 'Drop' the upload from the staging repository. -+ + If it checks out, close the repo using the 'Close' button. The repository must be closed before a public URL to it becomes available. It may take a few minutes for the repository to close. Once complete you'll see a public URL to the repository in the Nexus UI. You may also receive an email with the URL. Provide the URL to the temporary staging repository in the email that announces the release candidate. (Folks will need to add this repo URL to their local poms or to their local _settings.xml_ file to pull the published release candidate artifacts.) -+ + When the release vote concludes successfully, return here and click the 'Release' button to release the artifacts to central. The release process will automatically drop and delete the staging repository. -+ + .hbase-downstreamer [NOTE] ==== @@ -793,11 +1048,11 @@ Make sure you are pulling from the repository when tests run and that you are no ==== See link:https://www.apache.org/dev/publishing-maven-artifacts.html[Publishing Maven Artifacts] for some pointers on this maven staging process. -+ + If the HBase version ends in `-SNAPSHOT`, the artifacts go elsewhere. They are put into the Apache snapshots repository directly and are immediately available. Making a SNAPSHOT release, this is what you want to happen. -+ + At this stage, you have two tarballs in your 'build output directory' and a set of artifacts in a staging area of the maven repository, in the 'closed' state. Next sign, fingerprint and then 'stage' your release candiate build output directory via svnpubsub by committing @@ -808,7 +1063,6 @@ link:https://dist.apache.org/repos/dist/release/hbase[release/hbase]). 
In the _v [source,bourne] ---- - $ for i in *.tar.gz; do echo $i; gpg --print-md MD5 $i > $i.md5 ; done $ for i in *.tar.gz; do echo $i; gpg --print-md SHA512 $i > $i.sha ; done $ for i in *.tar.gz; do echo $i; gpg --armor --output $i.asc --detach-sig $i ; done @@ -832,12 +1086,11 @@ $ mv 0.96.0RC0 /Users/stack/checkouts/hbase.dist.dev.svn $ svn add 0.96.0RC0 $ svn commit ... ---- -+ + Ensure it actually gets published by checking link:https://dist.apache.org/repos/dist/dev/hbase/[https://dist.apache.org/repos/dist/dev/hbase/]. Announce the release candidate on the mailing list and call a vote. - [[maven.snapshot]] === Publishing a SNAPSHOT to maven @@ -876,6 +1129,50 @@ Regards the latter, run `mvn apache-rat:check` to verify all files are suitably See link:http://search-hadoop.com/m/DHED4dhFaU[HBase, mail # dev - On recent discussion clarifying ASF release policy] for how we arrived at this process. +To help with the release verification, please follow the guideline below and vote based on the your verification. + +=== Baseline Verifications for Voting Release Candidates + +Although contributors have their own checklist for verifications, the following items are usually used for voting on release candidates. + +* CHANGES.md if any +* RELEASENOTES.md (release notes) if any +* Generated API compatibility report +** For what should be compatible please refer the link:https://hbase.apache.org/book.html#hbase.versioning[versioning guideline], especially for items with marked as high severity +* Use `hbase-vote.sh` to perform sanity checks for checksum, signatures, files are licensed, built from source, and unit tests. +** `hbase-vote.sh` shell script is available under `dev-support` directory of HBase source. Following are the usage details. + +[source,bourne] +---- +./dev-support/hbase-vote.sh -h +hbase-vote. A script for standard vote which verifies the following items +1. Checksum of sources and binaries +2. Signature of sources and binaries +3. Rat check +4. Built from source +5. Unit tests + +Usage: hbase-vote.sh -s | --source [-k | --key ] [-f | --keys-file-url ] [-o | --output-dir ] + hbase-vote.sh -h | --help + + -h | --help Show this screen. + -s | --source '' A URL pointing to the release candidate sources and binaries + e.g. https://dist.apache.org/repos/dist/dev/hbase/hbase-RC0/ + -k | --key '' A signature of the public key, e.g. 9AD2AE49 + -f | --keys-file-url '' the URL of the key file, default is + http://www.apache.org/dist/hbase/KEYS + -o | --output-dir '' directory which has the stdout and stderr of each verification target +---- +* If you see any unit test failures, please call out the solo test result and whether it's part of flaky (nightly) tests dashboard, e.g. link:https://builds.apache.org/view/H-L/view/HBase/job/HBase-Find-Flaky-Tests/job/master/lastSuccessfulBuild/artifact/dashboard.html[dashboard of master branch] (please change the test branch accordingly). + +=== Additional Verifications for Voting Release Candidates + +Other than the common verifications, contributors may call out additional concerns, e.g. for a specific feature by running end to end tests on a distributed environment. This is optional and always encouraged. + +* Start a distributed HBase cluster and call out the test result of specific workload on cluster. e.g. +** Run basic table operations, e.g. `create/put/get/scan/flush/list/disable/drop` +** Run built-in tests, e.g. 
`LoadTestTool` (LTT) and `IntegrationTestBigLinkedList` (ITBLL) + [[hbase.release.announcement]] == Announcing Releases @@ -1106,7 +1403,7 @@ Medium Tests (((MediumTests))):: Large Tests (((LargeTests))):: _Large_ test cases are everything else. They are typically large-scale tests, regression tests for specific bugs, timeout tests, or performance tests. No large test suite can take longer than - ten minutes. It will be killed as timed out. Cast your test as an Integration Test if it needs + thirteen minutes. It will be killed as timed out. Cast your test as an Integration Test if it needs to run longer. Integration Tests (((IntegrationTests))):: @@ -1117,22 +1414,46 @@ Integration Tests (((IntegrationTests))):: [[hbase.unittests.cmds]] === Running tests +The state of tests on the hbase branches varies. Some branches keep good test hygiene and all tests pass +reliably with perhaps an unlucky sporadic flakey test failure. On other branches, the case may be less so with +frequent flakies and even broken tests in need of attention that fail 100% of the time. Try and figure +the state of tests on the branch you are currently interested in; the current state of nightly +link:https://builds.apache.org/view/H-L/view/HBase/job/HBase%20Nightly/[apache jenkins builds] is a good +place to start. Tests on master branch are generally not in the best of condition as releases +are less frequent off master. This can make it hard landing patches especially given our dictum that +patches land on master branch first. + +The full test suite can take from 5-6 hours on an anemic VM with 4 CPUs and minimal +parallelism to 50 minutes or less on a linux machine with dozens of CPUs and plenty of +RAM. + +When you go to run the full test suite, make sure you up the test runner user nproc +(`ulimit -u` -- make sure it > 6000 or more if more parallelism) and the number of +open files (`ulimit -n` -- make sure it > 10240 or more) limits on your system. +Errors because the test run hits +limits are often only opaquely related to the constraint. You can see the current +user settings by running `ulimit -a`. + [[hbase.unittests.cmds.test]] ==== Default: small and medium category tests -Running `mvn test` will execute all small tests in a single JVM (no fork) and then medium tests in a separate JVM for each test instance. -Medium tests are NOT executed if there is an error in a small test. Large tests are NOT executed. +Running `mvn test` will execute all small tests in a single JVM (no fork) and then medium tests in a +forked, separate JVM for each test instance (For definition of 'small' test and so on, see +<>). Medium tests are NOT executed if there is an error in a +small test. Large tests are NOT executed. [[hbase.unittests.cmds.test.runalltests]] ==== Running all tests -Running `mvn test -P runAllTests` will execute small tests in a single JVM then medium and large tests in a separate JVM for each test. -Medium and large tests are NOT executed if there is an error in a small test. +Running `mvn test -P runAllTests` will execute small tests in a single JVM, then medium and large tests +in a forked, separate JVM for each test. Medium and large tests are NOT executed if there is an error in +a small test. [[hbase.unittests.cmds.test.localtests.mytest]] ==== Running a single test or all tests in a package -To run an individual test, e.g. `MyTest`, rum `mvn test -Dtest=MyTest` You can also pass multiple, individual tests as a comma-delimited list: +To run an individual test, e.g. 
`MyTest`, run `mvn test -Dtest=MyTest`. You can also pass multiple, +individual tests as a comma-delimited list: [source,bash] ---- mvn test -Dtest=MyTest1,MyTest2,MyTest3 @@ -1144,10 +1465,12 @@ mvn test '-Dtest=org.apache.hadoop.hbase.client.*' ---- When `-Dtest` is specified, the `localTests` profile will be used.
-Each junit test is executed in a separate JVM (A fork per test class). There is no parallelization when tests are running in this mode.
+Each junit test is executed in a separate JVM (A fork per test class). +There is no parallelization when tests are running in this mode. You will see a new message at the end of the
-report: `"[INFO] Tests are skipped"`. -It's harmless. -However, you need to make sure the sum of `Tests run:` in the `Results:` section of test reports matching the number of tests you specified because no error will be reported when a non-existent test case is specified.
+It's harmless. However, you need to make sure the sum of +`Tests run:` in the `Results:` section of test reports matches the number of tests +you specified because no error will be reported when a non-existent test case is specified.
[[hbase.unittests.cmds.test.profiles]] ==== Other test invocation permutations @@ -1163,17 +1486,36 @@ For convenience, you can run `mvn test -P runDevTests` to execute both small and [[hbase.unittests.test.faster]] ==== Running tests faster
-By default, `$ mvn test -P runAllTests` runs all small tests in 1 forked instance and the medium and large tests in 5 parallel forked instances. Up these counts to get the build to run faster (you may run into -rare issues of test mutual interference). For example, -allowing that you want to have 2 tests in parallel per core, and you need about 2GB of memory per test (at the extreme), if you have an 8 core, 24GB box, you can have 16 tests in parallel. -but the memory available limits it to 12 (24/2), To run all tests with 12 tests in parallel, do this: +mvn test -P runAllTests -Dsurefire.secondPartForkCount=12+. -If using a version earlier than 2.0, do: +mvn test -P runAllTests -Dsurefire.secondPartThreadCount=12 +. -You can also increase the fork count for the first party by setting -Dsurefire.firstPartForkCount to a value > 1. -The values passed as fork counts can be specified as a fraction of CPU as follows: for two forks per available CPU, set the value to 2.0C; for a fork for every two CPUs, set it to 0.5C. -To increase the speed, you can as well use a ramdisk. -You will need 2GB of memory to run all tests. -You will also need to delete the files between two test run. -The typical way to configure a ramdisk on Linux is:
+By default, `$ mvn test -P runAllTests` runs all tests using a quarter of the CPUs available on the machine +hosting the test run (see `surefire.firstPartForkCount` and `surefire.secondPartForkCount` in the top-level +hbase `pom.xml` which default to 0.25C, or 1/4 of CPU count). Up these counts to get the build to run faster. +You can also have hbase modules +run their tests in parallel when the dependency graph allows by passing `--threads=N` when you invoke +maven, where `N` is the amount of _module_ parallelism wanted.
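+
+Before kicking off a heavily parallel run, it can also help to confirm that the per-user process and
+open-file limits mentioned above are set high enough. A minimal sketch (the numbers are just the
+minimums suggested above; raising hard limits may require root on some systems):
+
+[source,bourne]
+----
+# raise the limits for the current shell session only
+ulimit -u 6000
+ulimit -n 10240
+# confirm what is now in effect before starting the test run
+ulimit -a
+----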
+ +For example, allowing that you want to use all cores on a machine to run tests, +you could start up the maven test run with: + +---- + $ x="1.0C"; mvn -Dsurefire.firstPartForkCount=$x -Dsurefire.secondPartForkCount=$x test -PrunAllTests +---- + +If a 32 core machine, you should see periods during which 32 forked jvms appear in your process listing each running unit tests. +Your milage may vary. Dependent on hardware, overcommittment of CPU and/or memory can bring the test suite crashing down, +usually complaining with a spew of test system exits and incomplete test report xml files. Start gently, with the default fork +and move up gradually. + +Adding the `--threads=N`, maven will run N maven modules in parallel (when module inter-dependencies allow). Be aware, if you have +set the forkcount to `1.0C`, and the `--threads` count to '2', the number of concurrent test runners can approach +2 * CPU, a count likely to overcommit the host machine (with attendant test exits failures). + +You will need ~2.2GB of memory per forked JVM plus the memory used by maven itself (3-4G). + +===== RAM Disk + +To increase the speed, you can as well use a ramdisk. 2-3G should be sufficient. Be sure to +delete the files between each test run. The typical way to configure a ramdisk on Linux is: ---- $ sudo mkdir /ram2G @@ -1183,17 +1525,7 @@ sudo mount -t tmpfs -o size=2048M tmpfs /ram2G You can then use it to run all HBase tests on 2.0 with the command: ---- -mvn test - -P runAllTests -Dsurefire.secondPartForkCount=12 - -Dtest.build.data.basedirectory=/ram2G ----- - -On earlier versions, use: - ----- -mvn test - -P runAllTests -Dsurefire.secondPartThreadCount=12 - -Dtest.build.data.basedirectory=/ram2G +mvn test -PrunAllTests -Dtest.build.data.basedirectory=/ram2G ---- [[hbase.unittests.cmds.test.hbasetests]] @@ -1396,21 +1728,40 @@ mvn verify ---- If you just want to run the integration tests in top-level, you need to run two commands. -First: +mvn failsafe:integration-test+ This actually runs ALL the integration tests. +First: +---- +mvn failsafe:integration-test +---- + +This actually runs ALL the integration tests. NOTE: This command will always output `BUILD SUCCESS` even if there are test failures. At this point, you could grep the output by hand looking for failed tests. -However, maven will do this for us; just use: +mvn - failsafe:verify+ The above command basically looks at all the test results (so don't remove the 'target' directory) for test failures and reports the results. +However, maven will do this for us; just use: +---- +mvn failsafe:verify +---- + +The above command basically looks at all the test results (so don't remove the 'target' directory) for test failures and reports the results. [[maven.build.commands.integration.tests2]] ===== Running a subset of Integration tests This is very similar to how you specify running a subset of unit tests (see above), but use the property `it.test` instead of `test`. -To just run `IntegrationTestClassXYZ.java`, use: +mvn - failsafe:integration-test -Dit.test=IntegrationTestClassXYZ+ The next thing you might want to do is run groups of integration tests, say all integration tests that are named IntegrationTestClassX*.java: +mvn failsafe:integration-test -Dit.test=*ClassX*+ This runs everything that is an integration test that matches *ClassX*. This means anything matching: "**/IntegrationTest*ClassX*". You can also run multiple groups of integration tests using comma-delimited lists (similar to unit tests). 
Using a list of matches still supports full regex matching for each of the groups. This would look something like: +mvn - failsafe:integration-test -Dit.test=*ClassX*, *ClassY+ +To just run `IntegrationTestClassXYZ.java`, use: +---- +mvn failsafe:integration-test -Dit.test=IntegrationTestClassXYZ -DfailIfNoTests=false +---- +The next thing you might want to do is run groups of integration tests, say all integration tests that are named IntegrationTestClassX*.java: +---- +mvn failsafe:integration-test -Dit.test=*ClassX* -DfailIfNoTests=false +---- + +This runs everything that is an integration test that matches *ClassX*. This means anything matching: "**/IntegrationTest*ClassX*". You can also run multiple groups of integration tests using comma-delimited lists (similar to unit tests). Using a list of matches still supports full regex matching for each of the groups. This would look something like: +---- +mvn failsafe:integration-test -Dit.test=*ClassX*,*ClassY -DfailIfNoTests=false +---- [[maven.build.commands.integration.tests.distributed]] ==== Running integration tests against distributed cluster @@ -2047,32 +2398,123 @@ Patches larger than one screen, or patches that will be tricky to review, should For more information on how to use ReviewBoard, see link:http://www.reviewboard.org/docs/manual/1.5/[the ReviewBoard documentation]. +[[github]] +==== GitHub +Submitting link:https://github.com/apache/hbase[GitHub] pull requests is another accepted form of +contributing patches. Refer to GitHub link:https://help.github.com/[documentation] for details on +how to create pull requests. + +NOTE: This section is incomplete and needs to be updated. Refer to +link:https://issues.apache.org/jira/browse/HBASE-23557[HBASE-23557] + +===== GitHub Tooling + +====== Browser bookmarks + +Following is a useful javascript based browser bookmark that redirects from GitHub pull +requests to the corresponding jira work item. This redirects based on the HBase jira ID mentioned +in the issue title for the PR. Add the following javascript snippet as a browser bookmark to the +tool bar. Clicking on it while you are on an HBase GitHub PR page redirects you to the corresponding +jira item. + +[source, javascript] +----- +javascript:location.href='https://issues.apache.org/jira/browse/'+document.getElementsByClassName("js-issue-title")[0].innerHTML.match(/HBASE-\d+/)[0]; +----- + ==== Guide for HBase Committers +===== Becoming a committer + +Committers are responsible for reviewing and integrating code changes, testing +and voting on release candidates, weighing in on design discussions, as well as +other types of project contributions. The PMC votes to make a contributor a +committer based on an assessment of their contributions to the project. It is +expected that committers demonstrate a sustained history of high-quality +contributions to the project and community involvement. + +Contributions can be made in many ways. There is no single path to becoming a +committer, nor any expected timeline. Submitting features, improvements, and bug +fixes is the most common avenue, but other methods are both recognized and +encouraged (and may be even more important to the health of HBase as a project and a +community). A non-exhaustive list of potential contributions (in no particular +order): + +* <> for new + changes, best practices, recipes, and other improvements. +* Keep the website up to date. +* Perform testing and report the results. 
For instance, scale testing and + testing non-standard configurations is always appreciated. +* Maintain the shared Jenkins testing environment and other testing + infrastructure. +* <> after performing validation, even if non-binding. + A non-binding vote is a vote by a non-committer. +* Provide input for discussion threads on the link:/mail-lists.html[mailing lists] (which usually have + `[DISCUSS]` in the subject line). +* Answer questions questions on the user or developer mailing lists and on + Slack. +* Make sure the HBase community is a welcoming one and that we adhere to our + link:/coc.html[Code of conduct]. Alert the PMC if you + have concerns. +* Review other people's work (both code and non-code) and provide public + feedback. +* Report bugs that are found, or file new feature requests. +* Triage issues and keep JIRA organized. This includes closing stale issues, + labeling new issues, updating metadata, and other tasks as needed. +* Mentor new contributors of all sorts. +* Give talks and write blogs about HBase. Add these to the link:/[News] section + of the website. +* Provide UX feedback about HBase, the web UI, the CLI, APIs, and the website. +* Write demo applications and scripts. +* Help attract and retain a diverse community. +* Interact with other projects in ways that benefit HBase and those other + projects. + +Not every individual is able to do all (or even any) of the items on this list. +If you think of other ways to contribute, go for it (and add them to the list). +A pleasant demeanor and willingness to contribute are all you need to make a +positive impact on the HBase project. Invitations to become a committer are the +result of steady interaction with the community over the long term, which builds +trust and recognition. + ===== New committers -New committers are encouraged to first read Apache's generic committer documentation: +New committers are encouraged to first read Apache's generic committer +documentation: * link:https://www.apache.org/dev/new-committers-guide.html[Apache New Committer Guide] * link:https://www.apache.org/dev/committers.html[Apache Committer FAQ] ===== Review -HBase committers should, as often as possible, attempt to review patches submitted by others. -Ideally every submitted patch will get reviewed by a committer _within a few days_. -If a committer reviews a patch they have not authored, and believe it to be of sufficient quality, then they can commit the patch, otherwise the patch should be cancelled with a clear explanation for why it was rejected. +HBase committers should, as often as possible, attempt to review patches +submitted by others. Ideally every submitted patch will get reviewed by a +committer _within a few days_. If a committer reviews a patch they have not +authored, and believe it to be of sufficient quality, then they can commit the +patch. Otherwise the patch should be cancelled with a clear explanation for why +it was rejected. -The list of submitted patches is in the link:https://issues.apache.org/jira/secure/IssueNavigator.jspa?mode=hide&requestId=12312392[HBase Review Queue], which is ordered by time of last modification. -Committers should scan the list from top to bottom, looking for patches that they feel qualified to review and possibly commit. +The list of submitted patches is in the +link:https://issues.apache.org/jira/secure/IssueNavigator.jspa?mode=hide&requestId=12312392[HBase Review Queue], +which is ordered by time of last modification. 
Committers should scan the list +from top to bottom, looking for patches that they feel qualified to review and +possibly commit. If you see a patch you think someone else is better qualified +to review, you can mention them by username in the JIRA. -For non-trivial changes, it is required to get another committer to review your own patches before commit. -Use the btn:[Submit Patch] button in JIRA, just like other contributors, and then wait for a `+1` response from another committer before committing. +For non-trivial changes, it is required that another committer review your +patches before commit. **Self-commits of non-trivial patches are not allowed.** +Use the btn:[Submit Patch] button in JIRA, just like other contributors, and +then wait for a `+1` response from another committer before committing. ===== Reject -Patches which do not adhere to the guidelines in link:https://hbase.apache.org/book.html#developer[HowToContribute] and to the link:https://wiki.apache.org/hadoop/CodeReviewChecklist[code review checklist] should be rejected. -Committers should always be polite to contributors and try to instruct and encourage them to contribute better patches. -If a committer wishes to improve an unacceptable patch, then it should first be rejected, and a new patch should be attached by the committer for review. +Patches which do not adhere to the guidelines in +link:https://hbase.apache.org/book.html#developer[HowToContribute] and to the +link:https://cwiki.apache.org/confluence/display/HADOOP2/CodeReviewChecklist[code review checklist] +should be rejected. Committers should always be polite to contributors and try +to instruct and encourage them to contribute better patches. If a committer +wishes to improve an unacceptable patch, then it should first be rejected, and a +new patch should be attached by the committer for further review. [[committing.patches]] ===== Commit @@ -2083,29 +2525,34 @@ Committers commit patches to the Apache HBase GIT repository. [NOTE] ==== Make sure your local configuration is correct, especially your identity and email. -Examine the output of the +$ git config - --list+ command and be sure it is correct. -See this GitHub article, link:https://help.github.com/articles/set-up-git[Set Up Git] if you need pointers. +Examine the output of the +$ git config --list+ command and be sure it is correct. +See link:https://help.github.com/articles/set-up-git[Set Up Git] if you need +pointers. ==== -When you commit a patch, please: +When you commit a patch: -. Include the Jira issue id in the commit message along with a short description of the change. Try - to add something more than just the Jira title so that someone looking at git log doesn't - have to go to Jira to discern what the change is about. - Be sure to get the issue ID right, as this causes Jira to link to the change in Git (use the - issue's "All" tab to see these). -. Commit the patch to a new branch based off master or other intended branch. - It's a good idea to call this branch by the JIRA ID. - Then check out the relevant target branch where you want to commit, make sure your local branch has all remote changes, by doing a +git pull --rebase+ or another similar command, cherry-pick the change into each relevant branch (such as master), and do +git push - +. +. Include the Jira issue ID in the commit message along with a short description + of the change. 
Try to add something more than just the Jira title so that + someone looking at `git log` output doesn't have to go to Jira to discern what + the change is about. Be sure to get the issue ID right, because this causes + Jira to link to the change in Git (use the issue's "All" tab to see these + automatic links). +. Commit the patch to a new branch based off `master` or the other intended + branch. It's a good idea to include the JIRA ID in the name of this branch. + Check out the relevant target branch where you want to commit, and make sure + your local branch has all remote changes, by doing a +git pull --rebase+ or + another similar command. Next, cherry-pick the change into each relevant + branch (such as master), and push the changes to the remote branch using + a command such as +git push +. + WARNING: If you do not have all remote changes, the push will fail. If the push fails for any reason, fix the problem or ask for help. Do not do a +git push --force+. + Before you can commit a patch, you need to determine how the patch was created. -The instructions and preferences around the way to create patches have changed, and there will be a transition period. +The instructions and preferences around the way to create patches have changed, +and there will be a transition period. + .Determine How a Patch Was Created * If the first few lines of the patch look like the headers of an email, with a From, Date, and @@ -2132,16 +2579,18 @@ diff --git src/main/asciidoc/_chapters/developer.adoc src/main/asciidoc/_chapter + .Example of committing a Patch ==== -One thing you will notice with these examples is that there are a lot of +git pull+ commands. -The only command that actually writes anything to the remote repository is +git push+, and you need to make absolutely sure you have the correct versions of everything and don't have any conflicts before pushing. -The extra +git - pull+ commands are usually redundant, but better safe than sorry. +One thing you will notice with these examples is that there are a lot of ++git pull+ commands. The only command that actually writes anything to the +remote repository is +git push+, and you need to make absolutely sure you have +the correct versions of everything and don't have any conflicts before pushing. +The extra +git pull+ commands are usually redundant, but better safe than sorry. -The first example shows how to apply a patch that was generated with +git format-patch+ and apply it to the `master` and `branch-1` branches. +The first example shows how to apply a patch that was generated with +git +format-patch+ and apply it to the `master` and `branch-1` branches. -The directive to use +git format-patch+ rather than +git diff+, and not to use `--no-prefix`, is a new one. -See the second example for how to apply a patch created with +git - diff+, and educate the person who created the patch. +The directive to use +git format-patch+ rather than +git diff+, and not to use +`--no-prefix`, is a new one. See the second example for how to apply a patch +created with +git diff+, and educate the person who created the patch. ---- $ git checkout -b HBASE-XXXX @@ -2163,13 +2612,13 @@ $ git push origin branch-1 $ git branch -D HBASE-XXXX ---- -This example shows how to commit a patch that was created using +git diff+ without `--no-prefix`. -If the patch was created with `--no-prefix`, add `-p0` to the +git apply+ command. +This example shows how to commit a patch that was created using +git diff+ +without `--no-prefix`. 
If the patch was created with `--no-prefix`, add `-p0` to +the +git apply+ command. ---- $ git apply ~/Downloads/HBASE-XXXX-v2.patch -$ git commit -m "HBASE-XXXX Really Good Code Fix (Joe Schmo)" --author= -a # This -and next command is needed for patches created with 'git diff' +$ git commit -m "HBASE-XXXX Really Good Code Fix (Joe Schmo)" --author= -a # This and next command is needed for patches created with 'git diff' $ git commit --amend --signoff $ git checkout master $ git pull --rebase @@ -2190,7 +2639,9 @@ $ git branch -D HBASE-XXXX ==== . Resolve the issue as fixed, thanking the contributor. - Always set the "Fix Version" at this point, but please only set a single fix version for each branch where the change was committed, the earliest release in that branch in which the change will appear. + Always set the "Fix Version" at this point, but only set a single fix version + for each branch where the change was committed, the earliest release in that + branch in which the change will appear. ====== Commit Message Format @@ -2205,22 +2656,42 @@ The preferred commit message format is: HBASE-12345 Fix All The Things (jane@example.com) ---- -If the contributor used +git format-patch+ to generate the patch, their commit message is in their patch and you can use that, but be sure the JIRA ID is at the front of the commit message, even if the contributor left it out. +If the contributor used +git format-patch+ to generate the patch, their commit +message is in their patch and you can use that, but be sure the JIRA ID is at +the front of the commit message, even if the contributor left it out. [[committer.amending.author]] -====== Add Amending-Author when a conflict cherrypick backporting +====== Use GitHub's "Co-authored-by" when there are multiple authors We've established the practice of committing to master and then cherry picking back to branches whenever possible, unless * it's breaking compat: In which case, if it can go in minor releases, backport to branch-1 and branch-2. * it's a new feature: No for maintenance releases, For minor releases, discuss and arrive at consensus. -When there is a minor conflict we can fix it up and just proceed with the commit. -The resulting commit retains the original author. -When the amending author is different from the original committer, add notice of this at the end of the commit message as: `Amending-Author: Author - ` See discussion at link:http://search-hadoop.com/m/DHED4wHGYS[HBase, mail # dev - - [DISCUSSION] Best practice when amending commits cherry picked - from master to branch]. +There are occasions when there are multiple author for a patch. +For example when there is a minor conflict we can fix it up and just proceed with the commit. +The amending author will be different from the original committer, so you should also attribute to the original author by +adding one or more `Co-authored-by` trailers to the commit's message. +See link:https://help.github.com/en/articles/creating-a-commit-with-multiple-authors/[the GitHub documentation for "Creating a commit with multiple authors"]. + +In short, these are the steps to add Co-authors that will be tracked by GitHub: + +. Collect the name and email address for each co-author. +. Commit the change, but after your commit description, instead of a closing quotation, add two empty lines. (Do not close the commit message with a quotation mark) +. On the next line of the commit message, type `Co-authored-by: name `. After the co-author information, add a closing quotation mark. 
+ +Here is the example from the GitHub page, using 2 Co-authors: +[source,xml] +---- +$ git commit -m "Refactor usability tests. +> +> +Co-authored-by: name +Co-authored-by: another-name " +---- + +Note: `Amending-Author: Author ` was used prior to this +link:https://lists.apache.org/thread.html/f00b5f9b65570e777dbb31c37d7b0ffc55c5fc567aefdb456608a042@%3Cdev.hbase.apache.org%3E[DISCUSSION]. ====== Close related GitHub PRs @@ -2253,6 +2724,28 @@ Avoid merge commits, as they create problems in the git history. See <>. +====== How to re-trigger github Pull Request checks/re-build + +A Pull Request (PR) submission triggers the hbase yetus checks. The checks make +sure the patch doesn't break the build or introduce test failures. The checks take +around four hours to run (They are the same set run when you submit a patch via +HBASE JIRA). When finished, they add a report to the PR as a comment. If a problem +w/ the patch -- failed compile, checkstyle violation, or an added findbugs -- +the original author makes fixes and pushes a new patch. This re-runs the checks +to produce a new report. + +Sometimes though, the patch is good but a flakey, unrelated test has the report vote -1 +on the patch. In this case, **committers** can retrigger the check run by doing a force push of the +exact same patch. Or, click on the `Console output` link which shows toward the end +of the report (For example `https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-289/1/console`). +This will take you to `builds.apache.org`, to the build run that failed. See the +"breadcrumbs" along the top (where breadcrumbs is the listing of the directories that +gets us to this particular build page). It'll look something like +`Jenkins > HBase-PreCommit-GitHub-PR > PR-289 > #1`. Click on the +PR number -- i.e. PR-289 in our example -- and then, when you've arrived at the PR page, +find the 'Build with Parameters' menu-item (along top left-hand menu). Click here and +then `Build` leaving the JIRA_ISSUE_KEY empty. This will re-run your checks. + ==== Dialog Committers should hang out in the #hbase room on irc.freenode.net for real-time discussions. diff --git a/src/main/asciidoc/_chapters/faq.adoc b/src/main/asciidoc/_chapters/faq.adoc index 0e498ac9d38..baf33ac9648 100644 --- a/src/main/asciidoc/_chapters/faq.adoc +++ b/src/main/asciidoc/_chapters/faq.adoc @@ -32,9 +32,6 @@ When should I use HBase?:: See <> in the Architecture chapter. -Are there other HBase FAQs?:: - See the FAQ that is up on the wiki, link:https://wiki.apache.org/hadoop/Hbase/FAQ[HBase Wiki FAQ]. - Does HBase support SQL?:: Not really. SQL-ish support for HBase via link:https://hive.apache.org/[Hive] is in development, however Hive is based on MapReduce which is not generally suitable for low-latency requests. See the <> section for examples on the HBase client. diff --git a/src/main/asciidoc/_chapters/getting_started.adoc b/src/main/asciidoc/_chapters/getting_started.adoc index e50ea6bd4a7..9e4aa8c069b 100644 --- a/src/main/asciidoc/_chapters/getting_started.adoc +++ b/src/main/asciidoc/_chapters/getting_started.adoc @@ -67,18 +67,15 @@ $ tar xzvf hbase-{Version}-bin.tar.gz $ cd hbase-{Version}/ ---- -. You are required to set the `JAVA_HOME` environment variable before starting HBase. - You can set the variable via your operating system's usual mechanism, but HBase - provides a central mechanism, _conf/hbase-env.sh_. - Edit this file, uncomment the line starting with `JAVA_HOME`, and set it to the - appropriate location for your operating system. 
- The `JAVA_HOME` variable should be set to a directory which contains the executable file _bin/java_. - Most modern Linux operating systems provide a mechanism, such as /usr/bin/alternatives on RHEL or CentOS, for transparently switching between versions of executables such as Java. - In this case, you can set `JAVA_HOME` to the directory containing the symbolic link to _bin/java_, which is usually _/usr_. +. You must set the `JAVA_HOME` environment variable before starting HBase. + To make this easier, HBase lets you set it within the _conf/hbase-env.sh_ file. You must locate where Java is + installed on your machine, and one way to find this is by using the _whereis java_ command. Once you have the location, + edit the _conf/hbase-env.sh_ file and uncomment the line starting with _#export JAVA_HOME=_, and then set it to your Java installation path. + ----- -JAVA_HOME=/usr ----- +.Example extract from _hbase-env.sh_ where _JAVA_HOME_ is set + # Set environment variables here. + # The java implementation to use. + export JAVA_HOME=/usr/jdk64/jdk1.8.0_112 + . The _bin/start-hbase.sh_ script is provided as a convenient way to start HBase. Issue the command, and if all goes well, a message is logged to standard output showing that HBase started successfully. @@ -577,11 +574,11 @@ For more about ZooKeeper configuration, including using an external ZooKeeper in . Browse to the Web UI. + .Web UI Port Changes -NOTE: Web UI Port Changes -+ +[NOTE] +==== In HBase newer than 0.98.x, the HTTP ports used by the HBase Web UI changed from 60010 for the Master and 60030 for each RegionServer to 16010 for the Master and 16030 for the RegionServer. - +==== + If everything is set up correctly, you should be able to connect to the UI for the Master `http://node-a.example.com:16010/` or the secondary master at `http://node-b.example.com:16010/` diff --git a/src/main/asciidoc/_chapters/hbase-default.adoc b/src/main/asciidoc/_chapters/hbase-default.adoc index c20604fd68c..cdefb5cfcd7 100644 --- a/src/main/asciidoc/_chapters/hbase-default.adoc +++ b/src/main/asciidoc/_chapters/hbase-default.adoc @@ -523,7 +523,7 @@ The host name or IP address of the name server (DNS) + .Description Port used by ZooKeeper peers to talk to each other. - See https://zookeeper.apache.org/doc/r3.3.3/zookeeperStarted.html#sc_RunningReplicatedZooKeeper + See https://zookeeper.apache.org/doc/r3.4.10/zookeeperStarted.html#sc_RunningReplicatedZooKeeper for more information. + .Default @@ -535,7 +535,7 @@ Port used by ZooKeeper peers to talk to each other. + .Description Port used by ZooKeeper for leader election. - See https://zookeeper.apache.org/doc/r3.3.3/zookeeperStarted.html#sc_RunningReplicatedZooKeeper + See https://zookeeper.apache.org/doc/r3.4.10/zookeeperStarted.html#sc_RunningReplicatedZooKeeper for more information. + .Default @@ -2027,108 +2027,6 @@ A comma-separated list of `0` -[[hbase.master.regions.recovery.check.interval]] -*`hbase.master.regions.recovery.check.interval`*:: -+ -.Description - - Regions Recovery Chore interval in milliseconds. - This chore keeps running at this interval to - find all regions with configurable max store file ref count - and reopens them. - -+ -.Default -`1200000` - - -[[hbase.regions.recovery.store.file.ref.count]] -*`hbase.regions.recovery.store.file.ref.count`*:: -+ -.Description - - Very large number of ref count on a compacted - store file indicates that it is a ref leak - on that object(compacted store file). - Such files can not be removed after - it is invalidated via compaction. 
- Only way to recover in such scenario is to - reopen the region which can release - all resources, like the refcount, - leases, etc. This config represents Store files Ref - Count threshold value considered for reopening - regions. Any region with compacted store files - ref count > this value would be eligible for - reopening by master. Here, we get the max - refCount among all refCounts on all - compacted away store files that belong to a - particular region. Default value -1 indicates - this feature is turned off. Only positive - integer value should be provided to - enable this feature. - -+ -.Default -`-1` - - -[[hbase.regionserver.slowlog.ringbuffer.size]] -*`hbase.regionserver.slowlog.ringbuffer.size`*:: -+ -.Description - - Default size of ringbuffer to be maintained by each RegionServer in order - to store online slowlog responses. This is an in-memory ring buffer of - requests that were judged to be too slow in addition to the responseTooSlow - logging. The in-memory representation would be complete. - For more details, please look into Doc Section: - <> - - -+ -.Default -`256` - - - -[[hbase.regionserver.slowlog.buffer.enabled]] -*`hbase.regionserver.slowlog.buffer.enabled`*:: -+ -.Description - - Indicates whether RegionServers have ring buffer running for storing - Online Slow logs in FIFO manner with limited entries. The size of - the ring buffer is indicated by config: hbase.regionserver.slowlog.ringbuffer.size - The default value is false, turn this on and get latest slowlog - responses with complete data. - For more details, please look into Doc Section: - <> - - -+ -.Default -`false` - - -[[hbase.regionserver.slowlog.systable.enabled]] -*`hbase.regionserver.slowlog.systable.enabled`*:: -+ -.Description - - Should be enabled only if hbase.regionserver.slowlog.buffer.enabled is enabled. - If enabled (true), all slow/large RPC logs would be persisted to system table - hbase:slowlog (in addition to in-memory ring buffer at each RegionServer). - The records are stored in increasing order of time. - Operators can scan the table with various combination of ColumnValueFilter and - time range. - More details are provided in the doc section: - "Get Slow/Large Response Logs from System table hbase:slowlog" - -+ -.Default -`false` - - [[hbase.region.replica.replication.enabled]] *`hbase.region.replica.replication.enabled`*:: + @@ -2265,6 +2163,121 @@ The percent of region server RPC threads failed to abort RS. `0` +[[hbase.master.regions.recovery.check.interval]] +*`hbase.master.regions.recovery.check.interval`*:: ++ +.Description + + Regions Recovery Chore interval in milliseconds. + This chore keeps running at this interval to + find all regions with configurable max store file ref count + and reopens them. + ++ +.Default +`1200000` + + +[[hbase.regions.recovery.store.file.ref.count]] +*`hbase.regions.recovery.store.file.ref.count`*:: ++ +.Description + + Very large number of ref count on a compacted + store file indicates that it is a ref leak + on that object(compacted store file). + Such files can not be removed after + it is invalidated via compaction. + Only way to recover in such scenario is to + reopen the region which can release + all resources, like the refcount, + leases, etc. This config represents Store files Ref + Count threshold value considered for reopening + regions. Any region with compacted store files + ref count > this value would be eligible for + reopening by master. 
Here, we get the max + refCount among all refCounts on all + compacted away store files that belong to a + particular region. Default value -1 indicates + this feature is turned off. Only positive + integer value should be provided to + enable this feature. + ++ +.Default +`-1` + + +[[hbase.regionserver.slowlog.ringbuffer.size]] +*`hbase.regionserver.slowlog.ringbuffer.size`*:: ++ +.Description + + Default size of ringbuffer to be maintained by each RegionServer in order + to store online slowlog responses. This is an in-memory ring buffer of + requests that were judged to be too slow in addition to the responseTooSlow + logging. The in-memory representation would be complete. + For more details, please look into Doc Section: + <> + + ++ +.Default +`256` + + + +[[hbase.regionserver.slowlog.buffer.enabled]] +*`hbase.regionserver.slowlog.buffer.enabled`*:: ++ +.Description + + Indicates whether RegionServers have ring buffer running for storing + Online Slow logs in FIFO manner with limited entries. The size of + the ring buffer is indicated by config: hbase.regionserver.slowlog.ringbuffer.size + The default value is false, turn this on and get latest slowlog + responses with complete data. + For more details, please look into Doc Section: + <> + + ++ +.Default +`false` + + +[[hbase.regionserver.slowlog.systable.enabled]] +*`hbase.regionserver.slowlog.systable.enabled`*:: ++ +.Description + + Should be enabled only if hbase.regionserver.slowlog.buffer.enabled is enabled. + If enabled (true), all slow/large RPC logs would be persisted to system table + hbase:slowlog (in addition to in-memory ring buffer at each RegionServer). + The records are stored in increasing order of time. + Operators can scan the table with various combination of ColumnValueFilter and + time range. + More details are provided in the doc section: + "Get Slow/Large Response Logs from System table hbase:slowlog" + ++ +.Default +`false` + + +[[hbase.master.metafixer.max.merge.count]] +*`hbase.master.metafixer.max.merge.count`*:: ++ +.Description + + Maximum regions to merge at a time when we fix overlaps noted in + CJ consistency report, but avoid merging 100 regions in one go! + ++ +.Default +`64` + + [[hbase.rpc.rows.size.threshold.reject]] *`hbase.rpc.rows.size.threshold.reject`*:: + @@ -2283,3 +2296,4 @@ The percent of region server RPC threads failed to abort RS. + .Default `false` + diff --git a/src/main/asciidoc/_chapters/hbase_mob.adoc b/src/main/asciidoc/_chapters/hbase_mob.adoc index f0b60938333..0ce4ea37efd 100644 --- a/src/main/asciidoc/_chapters/hbase_mob.adoc +++ b/src/main/asciidoc/_chapters/hbase_mob.adoc @@ -36,22 +36,15 @@ read and write paths are optimized for values smaller than 100KB in size. When HBase deals with large numbers of objects over this threshold, referred to here as medium objects, or MOBs, performance is degraded due to write amplification caused by splits and compactions. When using MOBs, ideally your objects will be between -100KB and 10MB (see the <>). HBase ***FIX_VERSION_NUMBER*** adds support -for better managing large numbers of MOBs while maintaining performance, -consistency, and low operational overhead. MOB support is provided by the work -done in link:https://issues.apache.org/jira/browse/HBASE-11339[HBASE-11339]. To -take advantage of MOB, you need to use <>. Optionally, +100KB and 10MB (see the <>). HBase 2 added special internal handling of MOBs +to maintain performance, consistency, and low operational overhead. 
MOB support is +provided by the work done in link:https://issues.apache.org/jira/browse/HBASE-11339[HBASE-11339]. +To take advantage of MOB, you need to use <>. Optionally, configure the MOB file reader's cache settings for each RegionServer (see <>), then configure specific columns to hold MOB data. Client code does not need to change to take advantage of HBase MOB support. The feature is transparent to the client. -MOB compaction - -MOB data is flushed into MOB files after MemStore flush. There will be lots of MOB files -after some time. To reduce MOB file count, there is a periodic task which compacts -small MOB files into a large one (MOB compaction). - === Configuring Columns for MOB You can configure columns to support MOB during table creation or alteration, @@ -79,41 +72,6 @@ hcd.setMobThreshold(102400L); ---- ==== -=== Configure MOB Compaction Policy - -By default, MOB files for one specific day are compacted into one large MOB file. -To reduce MOB file count more, there are other MOB Compaction policies supported. - -daily policy - compact MOB Files for one day into one large MOB file (default policy) -weekly policy - compact MOB Files for one week into one large MOB file -montly policy - compact MOB Files for one month into one large MOB File - -.Configure MOB compaction policy Using HBase Shell ----- -hbase> create 't1', {NAME => 'f1', IS_MOB => true, MOB_THRESHOLD => 102400, MOB_COMPACT_PARTITION_POLICY => 'daily'} -hbase> create 't1', {NAME => 'f1', IS_MOB => true, MOB_THRESHOLD => 102400, MOB_COMPACT_PARTITION_POLICY => 'weekly'} -hbase> create 't1', {NAME => 'f1', IS_MOB => true, MOB_THRESHOLD => 102400, MOB_COMPACT_PARTITION_POLICY => 'monthly'} - -hbase> alter 't1', {NAME => 'f1', IS_MOB => true, MOB_THRESHOLD => 102400, MOB_COMPACT_PARTITION_POLICY => 'daily'} -hbase> alter 't1', {NAME => 'f1', IS_MOB => true, MOB_THRESHOLD => 102400, MOB_COMPACT_PARTITION_POLICY => 'weekly'} -hbase> alter 't1', {NAME => 'f1', IS_MOB => true, MOB_THRESHOLD => 102400, MOB_COMPACT_PARTITION_POLICY => 'monthly'} ----- - -=== Configure MOB Compaction mergeable threshold - -If the size of a mob file is less than this value, it's regarded as a small file and needs to -be merged in mob compaction. The default value is 1280MB. - -==== -[source,xml] ----- - - hbase.mob.compaction.mergeable.threshold - 10000000000 - ----- -==== - === Testing MOB The utility `org.apache.hadoop.hbase.IntegrationTestIngestWithMOB` is provided to assist with testing @@ -133,9 +91,219 @@ $ sudo -u hbase hbase org.apache.hadoop.hbase.IntegrationTestIngestWithMOB \ * `*maxMobDataSize*` is the maximum value for the size of MOB data. The default is 5 kB, expressed in bytes. +=== MOB architecture + +This section is derived from information found in +link:https://issues.apache.org/jira/browse/HBASE-11339[HBASE-11339], which covered the initial GA +implementation of MOB in HBase and +link:https://issues.apache.org/jira/browse/HBASE-22749[HBASE-22749], which improved things by +parallelizing MOB maintenance across the RegionServers. For more information see +the last version of the design doc created during the initial work, +"link:https://github.com/apache/hbase/blob/master/dev-support/design-docs/HBASE-11339%20MOB%20GA%20design.pdf[HBASE-11339 MOB GA design.pdf]", +and the design doc for the distributed mob compaction feature, +"link:https://github.com/apache/hbase/blob/master/dev-support/design-docs/HBASE-22749%20MOB%20distributed%20compaction.pdf[HBASE-22749 MOB distributed compaction.pdf]". 
+ + +==== Overview + +The MOB feature reduces the overall IO load for configured column families by storing values that +are larger than the configured threshold outside of the normal regions to avoid splits, merges, and +most importantly normal compactions. + +When a cell is first written to a region it is stored in the WAL and memstore regardless of value +size. When memstores from a column family configured to use MOB are eventually flushed two hfiles +are written simultaneously. Cells with a value smaller than the threshold size are written to a +normal region hfile. Cells with a value larger than the threshold are written into a special MOB +hfile and also have a MOB reference cell written into the normal region HFile. As the Region Server +flushes a MOB enabled memstore and closes a given normal region HFile it appends metadata that lists +each of the special MOB hfiles referenced by the cells within. + +MOB reference cells have the same key as the cell they are based on. The value of the reference cell +is made up of two pieces of metadata: the size of the actual value and the MOB hfile that contains +the original cell. In addition to any tags originally written to HBase, the reference cell prepends +two additional tags. The first is a marker tag that says the cell is a MOB reference. This can be +used later to scan specifically just for reference cells. The second stores the namespace and table +at the time the MOB hfile is written out. This tag is used to optimize how the MOB system finds +the underlying value in MOB hfiles after a series of HBase snapshot operations (ref HBASE-12332). +Note that tags are only available within HBase servers and by default are not sent over RPCs. + +All MOB hfiles for a given table are managed within a logical region that does not directly serve +requests. When these MOB hfiles are created from a flush or MOB compaction they are placed in a +dedicated mob data area under the hbase root directory specific to the namespace, table, mob +logical region, and column family. In general that means a path structured like: + +---- +%HBase Root Dir%/mobdir/data/%namespace%/%table%/%logical region%/%column family%/ +---- + +With default configs, an example table named 'some_table' in the +default namespace with a MOB enabled column family named 'foo' this HDFS directory would be + +---- +/hbase/mobdir/data/default/some_table/372c1b27e3dc0b56c3a031926e5efbe9/foo/ +---- + +These MOB hfiles are maintained by special chores in the HBase Master and across the individual +Region Servers. Specifically those chores take care of enforcing TTLs and compacting them. Note that +this compaction is primarily a matter of controlling the total number of files in HDFS because our +operational assumptions for MOB data is that it will seldom update or delete. + +When a given MOB hfile is no longer needed as a result of our compaction process then a chore in +the Master will take care of moving it to the archive just +like any normal hfile. Because the table's mob region is independent of all the normal regions it +can coexist with them in the regular archive storage area: + +---- +/hbase/archive/data/default/some_table/372c1b27e3dc0b56c3a031926e5efbe9/foo/ +---- + +The same hfile cleaning chores that take care of eventually deleting unneeded archived files from +normal regions thus also will take care of these MOB hfiles. 
As such, if there is a snapshot of a +MOB enabled table then the cleaning system will make sure those MOB files stick around in the +archive area as long as they are needed by a snapshot or a clone of a snapshot. + +==== MOB compaction + +Each time the memstore for a MOB enabled column family performs a flush, HBase will write values over +the MOB threshold into MOB specific hfiles. When normal region compaction occurs the Region Server +rewrites the normal data files while maintaining references to these MOB files without rewriting +them. Normal client lookups for MOB values transparently will receive the original values because +the Region Server internals take care of using the reference data to then pull the value out of a +specific MOB file. This indirection means that building up a large number of MOB hfiles doesn't +impact the overall time to retrieve any specific MOB cell. Thus, we need not perform compactions of +the MOB hfiles nearly as often as normal hfiles. As a result, HBase saves IO by not rewriting MOB +hfiles as a part of the periodic compactions a Region Server does on its own. + +However, if deletes and updates of MOB cells are frequent then this indirection will begin to waste +space. The only way to stop using the space of a particular MOB hfile is to ensure no cells still +hold references to it. To do that we need to ensure we have written the current values into a new +MOB hfile. If our backing filesystem has a limitation on the number of files that can be present, as +HDFS does, then even if we do not have deletes or updates of MOB cells eventually there will be a +sufficient number of MOB hfiles that we will need to coalesce them. + +Periodically a chore in the master coordinates having the region servers +perform a special major compaction that also handles rewriting new MOB files. Like all compactions +the Region Server will create updated hfiles that hold both the cells that are smaller than the MOB +threshold and cells that hold references to the newly rewritten MOB file. Because this rewriting has +the advantage of looking across all active cells for the region our several small MOB files should +end up as a single MOB file per region. The chore defaults to running weekly and can be +configured by setting `hbase.mob.compaction.chore.period` to the desired period in seconds. + +==== +[source,xml] +---- +<property> +  <name>hbase.mob.compaction.chore.period</name> +  <value>2592000</value> +  <description>Example of changing the chore period from a week to a month.</description> +</property> +---- +==== + +By default, the periodic MOB compaction coordination chore will attempt to keep every region +busy doing compactions in parallel in order to maximize the amount of work done on the cluster. +If you need to tune the amount of IO this compaction generates on the underlying filesystem, you +can control how many concurrent region-level compaction requests are allowed by setting +`hbase.mob.major.compaction.region.batch.size` to an integer number greater than zero. If you set +the configuration to 0 then you will get the default behavior of attempting to do all regions in +parallel. + +==== +[source,xml] +---- +<property> +  <name>hbase.mob.major.compaction.region.batch.size</name> +  <value>1</value> +  <description>Example of switching from "as parallel as possible" to "serially"</description> +</property> +---- +==== + +==== MOB file archiving + +Eventually we will have MOB hfiles that are no longer needed. Either clients will overwrite the +value or a MOB-rewriting compaction will store a reference to a newer larger MOB hfile.
Because any +given MOB cell could have originally been written either in the current region or in a parent region +that existed at some prior point in time, individual Region Servers do not decide when it is time +to archive MOB hfiles. Instead a periodic chore in the Master evaluates MOB hfiles for archiving. + +A MOB HFile will be subject to archiving under any of the following conditions: + +* Any MOB HFile older than the column family's TTL +* Any MOB HFile older than a "too recent" threshold with no references to it from the regular hfiles + for all regions in a column family + +To determine if a MOB HFile meets the second criterion, the chore extracts metadata from the regular +HFiles for each MOB enabled column family for a given table. That metadata enumerates the complete +set of MOB HFiles needed to satisfy the references stored in the normal HFile area. + +The period of the cleaner chore can be configured by setting `hbase.master.mob.cleaner.period` to a +positive integer number of seconds. It defaults to running daily. You should not need to tune it +unless you have a very aggressive TTL or a very high rate of MOB updates with a correspondingly +high rate of non-MOB compactions. + +=== MOB Optimization Tasks + +==== Further limiting write amplification + +If your MOB workload has few to no updates or deletes then you can opt-in to MOB compactions that +optimize for limiting the amount of write amplification. It achieves this by setting a +size threshold to ignore MOB files during the compaction process. When a given region goes +through MOB compaction it will evaluate the size of the MOB file that currently holds the actual +value and skip rewriting the value if that file is over the threshold. + +The bound of write amplification in this mode can be approximated as +stem:["Write Amplification" = log_K(M/S)] where *K* is the number of files in compaction +selection, *M* is the configurable threshold for MOB file size, and *S* is the minimum size of +memstore flushes that create MOB files in the first place. For example, given 5 files picked up per +compaction, a threshold of 1 GB, and a flush size of 10MB, the write amplification will be +stem:[log_5((1GB)/(10MB)) = log_5(100) = 2.86]. + +If we are using an underlying filesystem with a limitation on the number of files, such as HDFS, +and we know our expected data set size we can choose our maximum file size in order to approach +this limit but stay within it in order to minimize write amplification. For example, if we expect to +store a petabyte and we have a conservative limitation of a million files in our HDFS instance, then +stem:[(1PB)/(1M) = 1GB] gives us a target limitation of a gigabyte per MOB file. + +To opt-in to this compaction mode you must set `hbase.mob.compaction.type` to `optimized`. The +default MOB size threshold in this mode is set to 1GB. It can be changed by setting +`hbase.mob.compactions.max.file.size` to a positive integer number of bytes. + + +==== +[source,xml] +---- +<property> +  <name>hbase.mob.compaction.type</name> +  <value>optimized</value> +  <description>opt-in to write amplification optimized mob compaction.</description> +</property> +<property> +  <name>hbase.mob.compactions.max.file.size</name> +  <value>10737418240</value> +  <description>Example of tuning the max mob file size to 10GB</description> +</property> +---- +==== + +Additionally, when operating in this mode the compaction process will seek to avoid writing MOB +files that are over the max file threshold. As it is writing out additional MOB values into a MOB +hfile it will check to see if the additional data causes the hfile to be over the max file size.
+When the hfile of MOB values reaches limit, the MOB hfile is committed to the MOB storage area and +a new one is created. The hfile with reference cells will track the complete set of MOB hfiles it +needs in its metadata. + +.Be mindful of total time to complete compaction of a region +[WARNING] +==== +When using the write amplification optimized compaction mode you need to watch for the maximum time +to compact a single region. If it nears an hour you should read through the troubleshooting section +below <>. Failure to make the adjustments discussed there could +lead to dataloss. +==== [[mob.cache.configure]] -=== Configuring the MOB Cache +==== Configuring the MOB Cache Because there can be a large number of MOB files at any time, as compared to the number of HFiles, @@ -181,85 +349,61 @@ suit your environment, and restart or rolling restart the RegionServer. ---- ==== -=== MOB Optimization Tasks - ==== Manually Compacting MOB Files To manually compact MOB files, rather than waiting for the -<> to trigger compaction, use the -`compact` or `major_compact` HBase shell commands. These commands +periodic chore to trigger compaction, use the +`major_compact` HBase shell commands. These commands require the first argument to be the table name, and take a column -family as the second argument. and take a compaction type as the third argument. +family as the second argument. If used with a column family that includes MOB data, then +these operator requests will result in the MOB data being compacted. ---- -hbase> compact 't1', 'c1’, ‘MOB’ -hbase> major_compact 't1', 'c1’, ‘MOB’ +hbase> major_compact 't1' +hbase> major_compact 't2', 'c1’ ---- -These commands are also available via `Admin.compact` and -`Admin.majorCompact` methods. - -=== MOB architecture - -This section is derived from information found in -link:https://issues.apache.org/jira/browse/HBASE-11339[HBASE-11339]. For more information see -the attachment on that issue -"link:https://issues.apache.org/jira/secure/attachment/12724468/HBase%20MOB%20Design-v5.pdf[Base MOB Design-v5.pdf]". - -==== Overview -The MOB feature reduces the overall IO load for configured column families by storing values that -are larger than the configured threshold outside of the normal regions to avoid splits, merges, and -most importantly normal compactions. - -When a cell is first written to a region it is stored in the WAL and memstore regardless of value -size. When memstores from a column family configured to use MOB are eventually flushed two hfiles -are written simultaneously. Cells with a value smaller than the threshold size are written to a -normal region hfile. Cells with a value larger than the threshold are written into a special MOB -hfile and also have a MOB reference cell written into the normal region HFile. - -MOB reference cells have the same key as the cell they are based on. The value of the reference cell -is made up of two pieces of metadata: the size of the actual value and the MOB hfile that contains -the original cell. In addition to any tags originally written to HBase, the reference cell prepends -two additional tags. The first is a marker tag that says the cell is a MOB reference. This can be -used later to scan specifically just for reference cells. The second stores the namespace and table -at the time the MOB hfile is written out. This tag is used to optimize how the MOB system finds -the underlying value in MOB hfiles after a series of HBase snapshot operations (ref HBASE-12332). 
-Note that tags are only available within HBase servers and by default are not sent over RPCs. - -All MOB hfiles for a given table are managed within a logical region that does not directly serve -requests. When these MOB hfiles are created from a flush or MOB compaction they are placed in a -dedicated mob data area under the hbase root directory specific to the namespace, table, mob -logical region, and column family. In general that means a path structured like: - ----- -%HBase Root Dir%/mobdir/data/%namespace%/%table%/%logical region%/%column family%/ ----- - -With default configs, an example table named 'some_table' in the -default namespace with a MOB enabled column family named 'foo' this HDFS directory would be - ----- -/hbase/mobdir/data/default/some_table/372c1b27e3dc0b56c3a031926e5efbe9/foo/ ----- - -These MOB hfiles are maintained by special chores in the HBase Master rather than by any individual -Region Server. Specifically those chores take care of enforcing TTLs and compacting them. Note that -this compaction is primarily a matter of controlling the total number of files in HDFS because our -operational assumptions for MOB data is that it will seldom update or delete. - -When a given MOB hfile is no longer needed as a result of our compaction process it is archived just -like any normal hfile. Because the table's mob region is independent of all the normal regions it -can coexist with them in the regular archive storage area: - ----- -/hbase/archive/data/default/some_table/372c1b27e3dc0b56c3a031926e5efbe9/foo/ ----- - -The same hfile cleaning chores that take care of eventually deleting unneeded archived files from -normal regions thus also will take care of these MOB hfiles. +This same request can be made via the `Admin.majorCompact` Java API. === MOB Troubleshooting +[[mob.troubleshoot.cleaner.toonew]] +==== Adjusting the MOB cleaner's tolerance for new hfiles + +The MOB cleaner chore ignores all MOB hfiles that were created more recently than an hour prior to +the start of the chore to ensure we don't miss the reference metadata from the corresponding regular +hfile. Without this safety check it would be possible for the cleaner chore to see a MOB hfile for +an in progress flush or compaction and prematurely archive the MOB data. This default buffer should +be sufficient for normal use. + +You will need to adjust the tolerance if you use write amplification optimized MOB compaction and +the combination of your underlying filesystem performance and data shape is such that it could take +more than an hour to complete major compaction of a single region. For example, if your MOB data is +distributed such that your largest region adds 80GB of MOB data between compactions that include +rewriting MOB data and your HDFS cluster is only capable of writing 20MB/s for a single file then +when performing the optimized compaction the Region Server will take about a minute to write the +first 1GB MOB hfile and then another hour and seven minutes to write the remaining seventy-nine 1GB +MOB hfiles before finally committing the new reference hfile at the end of the compaction. Given +this example, you would need a larger tolerance window. + +You will also need to adjust the tolerance if Region Server flush operations take longer than an +hour for the two HDFS move operations needed to commit both the MOB hfile and the normal hfile that +references it. Such a delay should not happen with a normally configured and healthy HDFS and HBase. 
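+
+As a rough sanity check of the sizing example above, the arithmetic can be done in the shell. This
+is only an estimate (it assumes the 80GB-per-region and 20MB/s figures used in the example and
+ignores any HDFS pipeline or commit overhead):
+
+[source,bourne]
+----
+# seconds to rewrite 80GB of MOB data at a sustained 20MB/s
+echo $(( 80 * 1024 / 20 ))        # 4096 seconds
+# the same figure in minutes; anything near or past 60 means the default
+# one hour tolerance window is too tight for this region
+echo $(( 80 * 1024 / 20 / 60 ))   # roughly 68 minutes
+----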
+ +The cleaner's window for "too recent" is controlled by setting `hbase.mob.min.age.archive` to a +positive integer number of milliseconds. + +==== +[source,xml] +---- + + hbase.mob.min.age.archive + 86400000 + Example of tuning the cleaner to only archive files older than a day. + +---- +==== + ==== Retrieving MOB metadata through the HBase Shell While working on troubleshooting failures in the MOB system you can retrieve some of the internal @@ -468,3 +612,64 @@ $ hdfs dfs -count /hbase/mobdir/data/default/some_table + This data is spurious and may be reclaimed. You should sideline it, verify your application’s view of the table, and then delete it. + +=== MOB Upgrade Considerations + +Generally, data stored using the MOB feature should transparently continue to work correctly across +HBase upgrades. + +==== Upgrading to a version with the "distributed MOB compaction" feature + +Prior to the work in HBASE-22749, "Distributed MOB compactions", HBase had the Master coordinate all +compaction maintenance of the MOB hfiles. Centralizing management of the MOB data allowed for space +optimizations but safely coordinating that managemet with Region Servers resulted in edge cases that +caused data loss (ref link:https://issues.apache.org/jira/browse/HBASE-22075[HBASE-22075]). + +Users of the MOB feature upgrading to a version of HBase that includes HBASE-22749 should be aware +of the following changes: + +* The MOB system no longer allows setting "MOB Compaction Policies" +* The MOB system no longer attempts to group MOB values by the date of the original cell's timestamp + according to said compaction policies, daily or otherwise +* The MOB system no longer needs to track individual cell deletes through the use of special + files in the MOB storage area with the suffix `_del`. After upgrading you should sideline these + files. +* Under default configuration the MOB system should take much less time to perform a compaction of + MOB stored values. This is a direct consequence of the fact that HBase will place a much larger + load on the underlying filesystem when doing compactions of MOB stored values; the additional load + should be a multiple on the order of magnitude of number of region servers. I.e. for a cluster + with three region servers and two masters the default configuration should have HBase put three + times the load on HDFS during major compactions that rewrite MOB data when compared to Master + handled MOB compaction; it should also be approximately three times as fast. +* When the MOB system detects that a table has hfiles with references to MOB data but the reference + hfiles do not yet have the needed file level metadata (i.e. from use of the MOB feature prior to + HBASE-22749) then it will refuse to archive _any_ MOB hfiles from that table. The normal course of + periodic compactions done by Region Servers will update existing hfiles with MOB references, but + until a given table has been through the needed compactions operators should expect to see an + increased amount of storage used by the MOB feature. +* Performing a compaction with type "MOB" no longer has special handling to compact specifically the + MOB hfiles. Instead it will issue a warning and do a compaction of the table. For example using + the HBase shell as follows will result in a warning in the Master logs followed by a major + compaction of the 'example' table in its entirety or for the 'big' column respectively. 
++ +---- +hbase> major_compact 'example', nil, 'MOB' +hbase> major_compact 'example', 'big', 'MOB' +---- ++ +The same is true for directly using the Java API for +`admin.majorCompact(TableName.valueOf("example"), CompactType.MOB)`. +* Similarly, manually performing a major compaction on a table or region will also handle compacting + the MOB stored values for that table or region respectively. + +The following configuration setting has been deprecated and replaced: + +* `hbase.master.mob.ttl.cleaner.period` has been replaced with `hbase.master.mob.cleaner.period` + +The following configuration settings are no longer used: + +* `hbase.mob.compaction.mergeable.threshold` +* `hbase.mob.delfile.max.count` +* `hbase.mob.compaction.batch.size` +* `hbase.mob.compactor.class` +* `hbase.mob.compaction.threads.max` diff --git a/src/main/asciidoc/_chapters/hbtop.adoc b/src/main/asciidoc/_chapters/hbtop.adoc new file mode 100644 index 00000000000..9aad5fd22f2 --- /dev/null +++ b/src/main/asciidoc/_chapters/hbtop.adoc @@ -0,0 +1,269 @@ +//// +/** + * + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +//// + +[[hbtop]] += hbtop +:doctype: book +:numbered: +:toc: left +:icons: font +:experimental: + +== Overview + +`hbtop` is a real-time monitoring tool for HBase like Unix's top command. +It can display summary information as well as metrics per Region/Namespace/Table/RegionServer. +In this tool, you can see the metrics sorted by a selected field and filter the metrics to see only metrics you really want to see. +Also, with the drill-down feature, you can find hot regions easily in a top-down manner. + +== Usage + +You can run hbtop with the following command: + +---- +$ hbase hbtop +---- + +In this case, the values of `hbase.client.zookeeper.quorum` and `zookeeper.znode.parent` in `hbase-site.xml` in the classpath or the default values of them are used to connect. + +Or, you can specify your own zookeeper quorum and znode parent as follows: + +---- +$ hbase hbtop -Dhbase.client.zookeeper.quorum= -Dzookeeper.znode.parent= +---- + +image::https://hbase.apache.org/hbtop-images/top_screen.gif[Top screen] + +The top screen consists of a summary part and of a metrics part. +In the summary part, you can see `HBase Version`, `Cluster ID`, `The number of region servers`, `Region count`, `Average Cluster Load` and `Aggregated Request/s`. +In the metrics part, you can see metrics per Region/Namespace/Table/RegionServer depending on the selected mode. +The top screen is refreshed in a certain period – 3 seconds by default. + +=== Scrolling metric records + +You can scroll the metric records in the metrics part. 
+ +image::https://hbase.apache.org/hbtop-images/scrolling_metric_records.gif[Scrolling metric records] + +=== Command line arguments + +[options="header"] +|================================= +| Argument | Description +| -d,--delay <arg> | The refresh delay (in seconds); default is 3 seconds +| -h,--help | Print usage; for help while the tool is running press `h` key +| -m,--mode <arg> | The mode; `n` (Namespace)|`t` (Table)|r (Region)|`s` (RegionServer), default is `r` (Region) +|================================= + +=== Modes + +There are the following 4 modes in hbtop: + +[options="header"] +|================================= +| Mode | Description +| Region | Showing metric records per region +| Namespace | Showing metric records per namespace +| Table | Showing metric records per table +| RegionServer | Showing metric records per region server +|================================= + +==== Region mode + +In Region mode, the default sort field is `#REQ/S`. + +The fields in this mode are as follows: + +[options="header"] +|================================= +| Field | Description | Displayed by default +| RNAME | Region Name | false +| NAMESPACE | Namespace Name | true +| TABLE | Table Name | true +| SCODE | Start Code | false +| REPID | Replica ID | false +| REGION | Encoded Region Name | true +| RS | Short Region Server Name | true +| LRS | Long Region Server Name | false +| #REQ/S | Request Count per second | true +| #READ/S | Read Request Count per second | true +| #FREAD/S | Filtered Read Request Count per second | true +| #WRITE/S | Write Request Count per second | true +| SF | StoreFile Size | true +| USF | Uncompressed StoreFile Size | false +| #SF | Number of StoreFiles | true +| MEMSTORE | MemStore Size | true +| LOCALITY | Block Locality | true +| SKEY | Start Key | false +| #COMPingCELL | Compacting Cell Count | false +| #COMPedCELL | Compacted Cell Count | false +| %COMP | Compaction Progress | false +| LASTMCOMP | Last Major Compaction Time | false +|================================= + +==== Namespace mode + +In Namespace mode, the default sort field is `#REQ/S`. + +The fields in this mode are as follows: + +[options="header"] +|================================= +| Field | Description | Displayed by default +| NAMESPACE | Namespace Name | true +| #REGION | Region Count | true +| #REQ/S | Request Count per second | true +| #READ/S | Read Request Count per second | true +| #FREAD/S | Filtered Read Request Count per second | true +| #WRITE/S | Write Request Count per second | true +| SF | StoreFile Size | true +| USF | Uncompressed StoreFile Size | false +| #SF | Number of StoreFiles | true +| MEMSTORE | MemStore Size | true +|================================= + +==== Table mode + +In Table mode, the default sort field is `#REQ/S`. + +The fields in this mode are as follows: + +[options="header"] +|================================= +| Field | Description | Displayed by default +| NAMESPACE | Namespace Name | true +| TABLE | Table Name | true +| #REGION | Region Count | true +| #REQ/S | Request Count per second | true +| #READ/S | Read Request Count per second | true +| #FREAD/S | Filtered Read Request Count per second | true +| #WRITE/S | Write Request Count per second | true +| SF | StoreFile Size | true +| USF | Uncompressed StoreFile Size | false +| #SF | Number of StoreFiles | true +| MEMSTORE | MemStore Size | true +|================================= + +==== RegionServer mode + +In RegionServer mode, the default sort field is `#REQ/S`. 
+ +The fields in this mode are as follows: + +[options="header"] +|================================= +| Field | Description | Displayed by default +| RS | Short Region Server Name | true +| LRS | Long Region Server Name | false +| #REGION | Region Count | true +| #REQ/S | Request Count per second | true +| #READ/S | Read Request Count per second | true +| #FREAD/S | Filtered Read Request Count per second | true +| #WRITE/S | Write Request Count per second | true +| SF | StoreFile Size | true +| USF | Uncompressed StoreFile Size | false +| #SF | Number of StoreFiles | true +| MEMSTORE | MemStore Size | true +| UHEAP | Used Heap Size | true +| MHEAP | Max Heap Size | true +|================================= + +=== Changing mode + +You can change mode by pressing `m` key in the top screen. + +image::https://hbase.apache.org/hbtop-images/changing_mode.gif[Changing mode] + +=== Changing the refresh delay + +You can change the refresh by pressing `d` key in the top screen. + +image::https://hbase.apache.org/hbtop-images/changing_refresh_delay.gif[Changing the refresh delay] + +=== Changing the displayed fields + +You can move to the field screen by pressing `f` key in the top screen. In the fields screen, you can change the displayed fields by choosing a field and pressing `d` key or `space` key. + +image::https://hbase.apache.org/hbtop-images/changing_displayed_fields.gif[Changing the displayed fields] + +=== Changing the sort field + +You can move to the fields screen by pressing `f` key in the top screen. In the field screen, you can change the sort field by choosing a field and pressing `s`. Also, you can change the sort order (ascending or descending) by pressing `R` key. + +image::https://hbase.apache.org/hbtop-images/changing_sort_field.gif[Changing the sort field] + +=== Changing the order of the fields + +You can move to the fields screen by pressing `f` key in the top screen. In the field screen, you can change the order of the fields. + +image::https://hbase.apache.org/hbtop-images/changing_order_of_fields.gif[Changing the sort field] + +=== Filters + +You can filter the metric records with the filter feature. We can add filters by pressing `o` key for ignoring case or `O` key for case sensitive. + +image::https://hbase.apache.org/hbtop-images/adding_filters.gif[Adding filters] + +The syntax is as follows: +---- + +---- + +For example, we can add filters like the following: +---- +NAMESPACE==default +REQ/S>1000 +---- + +The operators we can specify are as follows: + +[options="header"] +|================================= +| Operator | Description +| = | Partial match +| == | Exact match +| > | Greater than +| >= | Greater than or equal to +| < | Less than +| <= | Less than and equal to +|================================= + +You can see the current filters by pressing `^o` key and clear them by pressing `=` key. + +image::https://hbase.apache.org/hbtop-images/showing_and_clearing_filters.gif[Showing and clearing filters] + +=== Drilling down + +You can drill down the metric record by choosing a metric record that you want to drill down and pressing `i` key in the top screen. With this feature, you can find hot regions easily in a top-down manner. + +image::https://hbase.apache.org/hbtop-images/driling_down.gif[Drilling down] + +=== Help screen + +You can see the help screen by pressing `h` key in the top screen. 
+ +image::https://hbase.apache.org/hbtop-images/help_screen.gif[Help screen] + +== Others + +=== How hbtop gets the metrics data + +hbtop gets the metrics from ClusterMetrics which is returned as the result of a call to Admin#getClusterMetrics() on the current HMaster. To add metrics to hbtop, they will need to be exposed via ClusterMetrics. diff --git a/src/main/asciidoc/_chapters/inmemory_compaction.adoc b/src/main/asciidoc/_chapters/inmemory_compaction.adoc index f64bc7aca3f..371257b1785 100644 --- a/src/main/asciidoc/_chapters/inmemory_compaction.adoc +++ b/src/main/asciidoc/_chapters/inmemory_compaction.adoc @@ -59,11 +59,9 @@ attribute can have one of four values. * _EAGER_: This is _BASIC_ policy plus in-memory compaction of flushes (much like the on-disk compactions done to hfiles); on compaction we apply on-disk rules eliminating versions, duplicates, ttl'd cells, etc. * _ADAPTIVE_: Adaptive compaction adapts to the workload. It applies either index compaction or data compaction based on the ratio of duplicate cells in the data. Experimental. -To enable _BASIC_ on the _info_ column family in the table _radish_, disable the table and add the attribute to the _info_ column family, and then reenable: +To enable _BASIC_ on the _info_ column family in the table _radish_, add the attribute to the _info_ column family: [source,ruby] ---- -hbase(main):002:0> disable 'radish' -Took 0.5570 seconds hbase(main):003:0> alter 'radish', {NAME => 'info', IN_MEMORY_COMPACTION => 'BASIC'} Updating all regions with the new schema... All regions updated. @@ -77,8 +75,6 @@ COLUMN FAMILIES DESCRIPTION 'IN_MEMORY_COMPACTION' => 'BASIC'}} 1 row(s) Took 0.0239 seconds -hbase(main):005:0> enable 'radish' -Took 0.7537 seconds ---- Note how the IN_MEMORY_COMPACTION attribute shows as part of the _METADATA_ map. diff --git a/src/main/asciidoc/_chapters/mapreduce.adoc b/src/main/asciidoc/_chapters/mapreduce.adoc index 2f72a2db247..bba8cc92b94 100644 --- a/src/main/asciidoc/_chapters/mapreduce.adoc +++ b/src/main/asciidoc/_chapters/mapreduce.adoc @@ -120,7 +120,7 @@ You might find the more selective `hbase mapredcp` tool output of interest; it l to run a basic mapreduce job against an hbase install. It does not include configuration. You'll probably need to add these if you want your MapReduce job to find the target cluster. You'll probably have to also add pointers to extra jars once you start to do anything of substance. Just specify the extras by passing the system propery `-Dtmpjars` when -you run `hbase mapredcp`. +you run `hbase mapredcp`. 
For jobs that do not package their dependencies or call `TableMapReduceUtil#addDependencyJars`, the following command structure is necessary: @@ -417,8 +417,8 @@ public static class MyMapper extends TableMapper { private static Put resultToPut(ImmutableBytesWritable key, Result result) throws IOException { Put put = new Put(key.get()); - for (KeyValue kv : result.raw()) { - put.add(kv); + for (Cell cell : result.listCells()) { + put.add(cell); } return put; } diff --git a/src/main/asciidoc/_chapters/offheap_read_write.adoc b/src/main/asciidoc/_chapters/offheap_read_write.adoc index 44ee97c2458..7ba4778285b 100644 --- a/src/main/asciidoc/_chapters/offheap_read_write.adoc +++ b/src/main/asciidoc/_chapters/offheap_read_write.adoc @@ -46,29 +46,41 @@ image::offheap-overview.png[] == Offheap read-path In HBase-2.0.0, link:https://issues.apache.org/jira/browse/HBASE-11425[HBASE-11425] changed the HBase read path so it could hold the read-data off-heap (from BucketCache) avoiding copying of cached data on to the java heap. -This reduces GC pauses given there is less garbage made and so less to clear. The off-heap read path has a performance -that is similar/better to that of the on-heap LRU cache. This feature is available since HBase 2.0.0. -If the BucketCache is in `file` mode, fetching will always be slower compared to the native on-heap LruBlockCache. +This reduces GC pauses given there is less garbage made and so less to clear. The off-heap read path can have a performance +that is similar or better to that of the on-heap LRU cache. This feature is available since HBase 2.0.0. Refer to below blogs for more details and test results on off heaped read path link:https://blogs.apache.org/hbase/entry/offheaping_the_read_path_in[Offheaping the Read Path in Apache HBase: Part 1 of 2] and link:https://blogs.apache.org/hbase/entry/offheap-read-path-in-production[Offheap Read-Path in Production - The Alibaba story] -For an end-to-end off-heaped read-path, first of all there should be an off-heap backed <>. Configure 'hbase.bucketcache.ioengine' to off-heap in -_hbase-site.xml_. Also specify the total capacity of the BucketCache using `hbase.bucketcache.size` config. Please remember to adjust value of 'HBASE_OFFHEAPSIZE' in -_hbase-env.sh_. This is how we specify the max possible off-heap memory allocation for the RegionServer java process. -This should be bigger than the off-heap BC size. Please keep in mind that there is no default for `hbase.bucketcache.ioengine` -which means the BC is turned OFF by default (See <>). +For an end-to-end off-heaped read-path, all you have to do is enable an off-heap backed <>(BC). +Configure _hbase.bucketcache.ioengine_ to be _offheap_ in _hbase-site.xml_ (See <> to learn more about _hbase.bucketcache.ioengine_ options). +Also specify the total capacity of the BC using `hbase.bucketcache.size` config. Please remember to adjust value of 'HBASE_OFFHEAPSIZE' in +_hbase-env.sh_ (See <> for help sizing and an example enabling). This configuration is for specifying the maximum +possible off-heap memory allocation for the RegionServer java process. This should be bigger than the off-heap BC size +to accommodate usage by other features making use of off-heap memory such as Server RPC buffer pool and short-circuit +reads (See discussion in <>). -Next thing to tune is the ByteBuffer pool on the RPC server side: +Please keep in mind that there is no default for `hbase.bucketcache.ioengine` +which means the BC is OFF by default (See <>). 
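+
+As an illustration (not a recommendation), a minimal _hbase-site.xml_ sketch that enables an
+off-heap BucketCache might look like the following; the 4096 MB capacity is an arbitrary example
+value, and `HBASE_OFFHEAPSIZE` in _hbase-env.sh_ still needs to be set higher than it:
+
+[source,xml]
+----
+<!-- Illustrative sketch only: the property names are the ones discussed above, the size is an example -->
+<property>
+  <name>hbase.bucketcache.ioengine</name>
+  <value>offheap</value>
+</property>
+<property>
+  <name>hbase.bucketcache.size</name>
+  <value>4096</value>
+</property>
+----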
+ +This is all you need to do to enable the off-heap read path. Most buffers in HBase are already off-heap. With BC off-heap, +the read pipeline copies data between HDFS and the server socket that sends the results back to the client. + +[[regionserver.offheap.rpc.bb.tuning]] +===== Tuning the RPC buffer pool +It is possible to tune the ByteBuffer pool on the RPC server side +used to accumulate the cell bytes and create result cell blocks to send back to the client side. +`hbase.ipc.server.reservoir.enabled` can be used to turn this pool ON or OFF. By default this pool is ON and available. HBase will create off-heap ByteBuffers +and pool them by default. Please make sure not to turn this OFF if you want end-to-end off-heaping in read path. NOTE: the config keys which start with prefix `hbase.ipc.server.reservoir` are deprecated in HBase3.x. If you are still on HBase2.x, then just use the old config keys; otherwise, on HBase3.x, please use the new config keys. (See <>) -The buffers from this pool will be used to accumulate the cell bytes and create a result cell block to send back to the client side. -`hbase.ipc.server.reservoir.enabled` can be used to turn this pool ON or OFF. By default this pool is ON and available. HBase will create off heap ByteBuffers -and pool them. Please make sure not to turn this OFF if you want end-to-end off-heaping in read path. -If this pool is turned off, the server will create temp buffers on heap to accumulate the cell bytes and make a result cell block. This can impact the GC on a highly read loaded server. +If this pool is turned off, the server will create temp buffers on heap to accumulate the cell bytes and +make a result cell block. This can impact the GC on a highly read loaded server. + The user can tune this pool with respect to how many buffers are in the pool and what should be the size of each ByteBuffer. Use the config `hbase.ipc.server.reservoir.initial.buffer.size` to tune each of the buffer sizes. Default is 64 KB for HBase2.x, while it will be changed to 65KB by default for HBase3.x (see link:https://issues.apache.org/jira/browse/HBASE-22532[HBASE-22532]) diff --git a/src/main/asciidoc/_chapters/ops_mgt.adoc b/src/main/asciidoc/_chapters/ops_mgt.adoc index 5a6d2c54f9d..b2a20647daf 100644 --- a/src/main/asciidoc/_chapters/ops_mgt.adoc +++ b/src/main/asciidoc/_chapters/ops_mgt.adoc @@ -51,7 +51,8 @@ Options: Commands: Some commands take arguments. Pass no args or -h for usage. shell Run the HBase shell - hbck Run the hbase 'fsck' tool + hbck Run the HBase 'fsck' tool. Defaults read-only hbck1. + Pass '-j /path/to/HBCK2.jar' to run hbase-2.x HBCK2. snapshot Tool for managing snapshots wal Write-ahead-log analyzer hfile Store file analyzer @@ -386,12 +387,33 @@ Each command except `RowCounter` and `CellCounter` accept a single `--help` argu [[hbck]] === HBase `hbck` -To run `hbck` against your HBase cluster run `$./bin/hbase hbck`. At the end of the command's output it prints `OK` or `INCONSISTENCY`. -If your cluster reports inconsistencies, pass `-details` to see more detail emitted. -If inconsistencies, run `hbck` a few times because the inconsistency may be transient (e.g. cluster is starting up or a region is splitting). - Passing `-fix` may correct the inconsistency (This is an experimental feature). +The `hbck` tool that shipped with hbase-1.x has been made read-only in hbase-2.x. It is not able to repair +hbase-2.x clusters as hbase internals have changed.
Nor should its assessments in read-only mode be +trusted as it does not understand hbase-2.x operation. -For more information, see <>. +A new tool, <>, described in the next section, replaces `hbck`. + +[[HBCK2]] +=== HBase `HBCK2` + +`HBCK2` is the successor to <>, the hbase-1.x fix tool (A.K.A `hbck1`). Use it in place of `hbck1` +making repairs against hbase-2.x installs. + +`HBCK2` does not ship as part of hbase. It can be found as a subproject of the companion +link:https://github.com/apache/hbase-operator-tools[hbase-operator-tools] repository at +link:https://github.com/apache/hbase-operator-tools/tree/master/hbase-hbck2[Apache HBase HBCK2 Tool]. +`HBCK2` was moved out of hbase so it could evolve at a cadence apart from that of hbase core. + +See the [https://github.com/apache/hbase-operator-tools/tree/master/hbase-hbck2](HBCK2) Home Page +for how `HBCK2` differs from `hbck1`, and for how to build and use it. + +Once built, you can run `HBCK2` as follows: + +``` +$ hbase hbck -j /path/to/HBCK2.jar +``` + +This will generate `HBCK2` usage describing commands and options. [[hfile_tool2]] === HFile Tool @@ -399,6 +421,8 @@ For more information, see <>. See <>. === WAL Tools +For bulk replaying WAL files or _recovered.edits_ files, see +<>. For reading/verifying individual files, read on. [[hlog_tool]] ==== FSHLog tool @@ -506,6 +530,13 @@ Caching for the input Scan is configured via `hbase.client.scanner.caching` By default, CopyTable utility only copies the latest version of row cells unless `--versions=n` is explicitly specified in the command. ==== +.Data Load +[NOTE] +==== +CopyTable does not perform a diff, it copies all Cells in between the specified startrow/stoprow starttime/endtime range. +This means that already existing cells with same values will still be copied. +==== + See Jonathan Hsieh's link:https://blog.cloudera.com/blog/2012/06/online-hbase-backups-with-copytable-2/[Online HBase Backups with CopyTable] blog post for more on `CopyTable`. @@ -531,21 +562,22 @@ $ ./bin/hbase org.apache.hadoop.hbase.mapreduce.HashTable --help Usage: HashTable [options] Options: - batchsize the target amount of bytes to hash in each batch - rows are added to the batch until this size is reached - (defaults to 8000 bytes) - numhashfiles the number of hash files to create - if set to fewer than number of regions then - the job will create this number of reducers - (defaults to 1/100 of regions -- at least 1) - startrow the start row - stoprow the stop row - starttime beginning of the time range (unixtime in millis) - without endtime means from starttime to forever - endtime end of the time range. Ignored if no starttime specified. - scanbatch scanner batch size to support intra row scans - versions number of cell versions to include - families comma-separated list of families to include + batchsize the target amount of bytes to hash in each batch + rows are added to the batch until this size is reached + (defaults to 8000 bytes) + numhashfiles the number of hash files to create + if set to fewer than number of regions then + the job will create this number of reducers + (defaults to 1/100 of regions -- at least 1) + startrow the start row + stoprow the stop row + starttime beginning of the time range (unixtime in millis) + without endtime means from starttime to forever + endtime end of the time range. Ignored if no starttime specified. 
+ scanbatch scanner batch size to support intra row scans + versions number of cell versions to include + families comma-separated list of families to include + ignoreTimestamps if true, ignores cell timestamps Args: tablename Name of the table to hash @@ -584,6 +616,10 @@ Options: (defaults to true) doPuts if false, does not perform puts (defaults to true) + ignoreTimestamps if true, ignores cells timestamps while comparing + cell values. Any missing cell on target then gets + added with current time as timestamp + (defaults to false) Args: sourcehashdir path to HashTable output dir for source table @@ -597,6 +633,13 @@ Examples: $ bin/hbase org.apache.hadoop.hbase.mapreduce.SyncTable --dryrun=true --sourcezkcluster=zk1.example.com,zk2.example.com,zk3.example.com:2181:/hbase hdfs://nn:9000/hashes/tableA tableA tableA ---- +Cell comparison takes ROW/FAMILY/QUALIFIER/TIMESTAMP/VALUE into account for equality. When syncing at the target, missing cells will be +added with original timestamp value from source. That may cause unexpected results after SyncTable completes, for example, if missing +cells on target have a delete marker with a timestamp T2 (say, a bulk delete performed by mistake), but source cells timestamps have an +older value T1, then those cells would still be unavailable at target because of the newer delete marker timestamp. Since cell timestamps +might not be relevant to all use cases, _ignoreTimestamps_ option adds the flexibility to avoid using cells timestamp in the comparison. +When using _ignoreTimestamps_ set to true, this option must be specified for both HashTable and SyncTable steps. + The *dryrun* option is useful when a read only, diff report is wanted, as it will produce only COUNTERS indicating the differences, but will not perform any actual changes. It can be used as an alternative to VerifyReplication tool. @@ -606,6 +649,7 @@ Setting doDeletes to false modifies default behaviour to not delete target cells Similarly, setting doPuts to false modifies default behaviour to not add missing cells on target. Setting both doDeletes and doPuts to false would give same effect as setting dryrun to true. + .Additional info on doDeletes/doPuts [NOTE] ==== @@ -616,6 +660,16 @@ For major 1.x versions, minimum minor release including it is *1.4.10*. For major 2.x versions, minimum minor release including it is *2.1.5*. ==== +.Additional info on ignoreTimestamps +[NOTE] +==== +"ignoreTimestamps" was only added by +link:https://issues.apache.org/jira/browse/HBASE-24302[HBASE-24302], so it may not be available on +all released versions. +For major 1.x versions, minimum minor release including it is *1.4.14*. +For major 2.x versions, minimum minor release including it is *2.2.5*. +==== + .Set doDeletes to false on Two-Way Replication scenarios [NOTE] ==== @@ -633,8 +687,11 @@ which does not give any meaningful result. .Remote Clusters on different Kerberos Realms [NOTE] ==== -Currently, SyncTable can't be ran for remote clusters on different Kerberos realms. -There's some work in progress to resolve this on link:https://jira.apache.org/jira/browse/HBASE-20586[HBASE-20586] +Often, remote clusters may be deployed on different Kerberos Realms. +link:https://jira.apache.org/jira/browse/HBASE-20586[HBASE-20586] added SyncTable support for +cross realm authentication, allowing a SyncTable process running on target cluster to connect to +source cluster and read both HashTable output files and the given HBase table when performing the +required comparisons. 
==== [[export]] @@ -847,6 +904,13 @@ The output can optionally be mapped to another set of tables. WALPlayer can also generate HFiles for later bulk importing, in that case only a single table and no mapping can be specified. +.WALPrettyPrinter/FSHLog Tool +[NOTE] +==== +To read or verify single WAL files or _recovered.edits_ files, since they share the WAL format, +see <<_wal_tools>>. +==== + Invoke via: ---- @@ -947,15 +1011,85 @@ See link:https://issues.apache.org/jira/browse/HBASE-4391[HBASE-4391 Add ability [[compaction.tool]] === Offline Compaction Tool -See the usage for the -link:https://hbase.apache.org/devapidocs/org/apache/hadoop/hbase/regionserver/CompactionTool.html[CompactionTool]. -Run it like: +*CompactionTool* provides a way of running compactions (either minor or major) as an independent +process from the RegionServer. It reuses the same internal implementation classes executed by the RegionServer +compaction feature. However, since this runs in a completely separate java process, it +releases RegionServers from the overhead involved in rewriting a set of hfiles, which can be critical +for latency-sensitive use cases. -[source, bash] +Usage: ---- $ ./bin/hbase org.apache.hadoop.hbase.regionserver.CompactionTool + +Usage: java org.apache.hadoop.hbase.regionserver.CompactionTool \ + [-compactOnce] [-major] [-mapred] [-D]* files... + +Options: + mapred Use MapReduce to run compaction. + compactOnce Execute just one compaction step. (default: while needed) + major Trigger major compaction. + +Note: -D properties will be applied to the conf used. +For example: + To stop delete of compacted file, pass -Dhbase.compactiontool.delete=false + To set tmp dir, pass -Dhbase.tmp.dir=ALTERNATE_DIR + +Examples: + To compact the full 'TestTable' using MapReduce: + $ hbase org.apache.hadoop.hbase.regionserver.CompactionTool -mapred hdfs://hbase/data/default/TestTable + + To compact column family 'x' of the table 'TestTable' region 'abc': + $ hbase org.apache.hadoop.hbase.regionserver.CompactionTool hdfs://hbase/data/default/TestTable/abc/x ---- + +As shown by the usage options above, *CompactionTool* can run as a standalone client or as a mapreduce job. +When running as a mapreduce job, each family dir is handled as an input split and is processed +by a separate map task. + +The *compactOnce* parameter controls how many compaction cycles will be performed until the +*CompactionTool* program decides to finish its work. If omitted, it will keep +running compactions on each specified family as determined by the configured compaction policy. +For more info on compaction policy, see <>. + +If a major compaction is desired, the *major* flag can be specified. If omitted, *CompactionTool* will +assume a minor compaction is wanted by default. + +It also allows for configuration overrides with the `-D` flag. In the usage section above, for example, +the `-Dhbase.compactiontool.delete=false` option will instruct the compaction engine not to delete original +files from the temp folder. + +Files targeted for compaction must be specified as parent hdfs dirs. Multiple dirs may be +specified, as long as each of these dirs is either a *family*, a *region*, or a *table* dir. If a +table or region dir is passed, the program will recursively iterate through related sub-folders, +effectively running compaction for each family found below the table/region level.
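+
+For example, a single standalone run could be pointed at a whole table dir and at one specific
+family dir in the same invocation (the paths and the encoded region name below are hypothetical,
+shown only to illustrate mixing dir levels):
+
+----
+$ ./bin/hbase org.apache.hadoop.hbase.regionserver.CompactionTool -compactOnce -major \
+  hdfs://hbase/data/default/TestTable \
+  hdfs://hbase/data/default/OtherTable/0a1b2c3d4e5f60718293a4b5c6d7e8f9/cf1
+----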
+ +Since these dirs are nested under *hbase* hdfs directory tree, *CompactionTool* requires hbase super +user permissions in order to have access to required hfiles. + +.Running in MapReduce mode +[NOTE] +==== +MapReduce mode offers the ability to process each family dir in parallel, as a separate map task. +Generally, it would make sense to run in this mode when specifying one or more table dirs as targets +for compactions. The caveat, though, is that if number of families to be compacted become too large, +the related mapreduce job may have indirect impacts on *RegionServers* performance . +Since *NodeManagers* are normally co-located with RegionServers, such large jobs could +compete for IO/Bandwidth resources with the *RegionServers*. +==== + +.MajorCompaction completely disabled on RegionServers due performance impacts +[NOTE] +==== +*Major compactions* can be a costly operation (see <>), and can indeed +impact performance on RegionServers, leading operators to completely disable it for critical +low latency application. *CompactionTool* could be used as an alternative in such scenarios, +although, additional custom application logic would need to be implemented, such as deciding +scheduling and selection of tables/regions/families target for a given compaction run. +==== + +For additional details about CompactionTool, see also +link:https://hbase.apache.org/devapidocs/org/apache/hadoop/hbase/regionserver/CompactionTool.html[CompactionTool]. + === `hbase clean` The `hbase clean` command cleans HBase data from ZooKeeper, HDFS, or both. @@ -1327,7 +1461,7 @@ But usually disks do the "John Wayne" -- i.e. take a while to go down spewing errors in _dmesg_ -- or for some reason, run much slower than their companions. In this case you want to decommission the disk. You have two options. -You can link:https://wiki.apache.org/hadoop/FAQ#I_want_to_make_a_large_cluster_smaller_by_taking_out_a_bunch_of_nodes_simultaneously._How_can_this_be_done.3F[decommission +You can link:https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/HdfsDataNodeAdminGuide.html[decommission the datanode] or, less disruptive in that only the bad disks data will be rereplicated, can stop the datanode, unmount the bad volume (You can't umount a volume while the datanode is using it), and then restart the datanode (presuming you have set dfs.datanode.failed.volumes.tolerated > 0). The regionserver will throw some errors in its logs as it recalibrates where to get its data from -- it will likely roll its WAL log too -- but in general but for some latency spikes, it should keep on chugging. .Short Circuit Reads @@ -1400,15 +1534,6 @@ Monitor the output of the _/tmp/log.txt_ file to follow the progress of the scri Use the following guidelines if you want to create your own rolling restart script. . Extract the new release, verify its configuration, and synchronize it to all nodes of your cluster using `rsync`, `scp`, or another secure synchronization mechanism. -. Use the hbck utility to ensure that the cluster is consistent. -+ ----- - -$ ./bin/hbck ----- -+ -Perform repairs if required. -See <> for details. . Restart the master first. You may need to modify these commands if your new HBase directory is different from the old one, such as for an upgrade. @@ -1440,7 +1565,6 @@ $ for i in `cat conf/regionservers|sort`; do ./bin/graceful_stop.sh --restart -- ---- . Restart the Master again, to clear out the dead servers list and re-enable the load balancer. -. 
Run the `hbck` utility again, to be sure the cluster is consistent. [[adding.new.node]] === Adding a New Node @@ -1996,24 +2120,46 @@ include::slow_log_responses_from_systable.adoc[] === Block Cache Monitoring Starting with HBase 0.98, the HBase Web UI includes the ability to monitor and report on the performance of the block cache. -To view the block cache reports, click . +To view the block cache reports, see the Block Cache section of the region server UI. Following are a few examples of the reporting capabilities. -.Basic Info +.Basic Info shows the cache implementation. image::bc_basic.png[] -.Config +.Config shows all cache configuration options. image::bc_config.png[] -.Stats +.Stats shows statistics about the performance of the cache. image::bc_stats.png[] -.L1 and L2 +.L1 and L2 show information about the L1 and L2 caches. image::bc_l1.png[] This is not an exhaustive list of all the screens and reports available. Have a look in the Web UI. +=== Snapshot Space Usage Monitoring + +Starting with HBase 0.95, Snapshot usage information on individual snapshots was shown in the HBase Master Web UI. This was further enhanced starting with HBase 1.3 to show the total Storefile size of the Snapshot Set. The following metrics are shown in the Master Web UI with HBase 1.3 and later. + +* Shared Storefile Size is the Storefile size shared between snapshots and active tables. +* Mob Storefile Size is the Mob Storefile size shared between snapshots and active tables. +* Archived Storefile Size is the Storefile size in Archive. + +The format of Archived Storefile Size is NNN(MMM). NNN is the total Storefile size in Archive, MMM is the total Storefile size in Archive that is specific to the snapshot (not shared with other snapshots and tables). + +.Master Snapshot Overview +image::master-snapshot.png[] + +.Snapshot Storefile Stats Example 1 +image::1-snapshot.png[] + +.Snapshot Storefile Stats Example 2 +image::2-snapshots.png[] + +.Empty Snapshot Storefile Stats Example +image::empty-snapshots.png[] + == Cluster Replication NOTE: This information was previously available at @@ -2030,6 +2176,9 @@ Some use cases for cluster replication include: NOTE: Replication is enabled at the granularity of the column family. Before enabling replication for a column family, create the table and all column families to be replicated, on the destination cluster. +NOTE: Replication is asynchronous as we send WAL to another cluster in the background, which means that when you want to do recovery through replication, you could lose some data. To address this problem, we have introduced a new feature called synchronous replication. As the mechanism is a bit different, we use a separate section to describe it. Please see +<>. + === Replication Overview Cluster replication uses a source-push methodology. @@ -2852,14 +3001,17 @@ Since the cluster is up, there is a risk that edits could be missed in the copy The <> approach dumps the content of a table to HDFS on the same cluster. To restore the data, the <> utility would be used. -Since the cluster is up, there is a risk that edits could be missed in the export process. +Since the cluster is up, there is a risk that edits could be missed in the export process. If you want to know more about HBase back-up and restore see the page on link:http://hbase.apache.org/book.html#backuprestore[Backup and Restore]. [[ops.snapshots]] == HBase Snapshots -HBase Snapshots allow you to take a snapshot of a table without too much impact on Region Servers.
-Snapshot, Clone and restore operations don't involve data copying. -Also, Exporting the snapshot to another cluster doesn't have impact on the Region Servers. +HBase Snapshots allow you to take a copy of a table (both contents and metadata)with a very small performance impact. A Snapshot is an immutable +collection of table metadata and a list of HFiles that comprised the table at the time the Snapshot was taken. A "clone" +of a snapshot creates a new table from that snapshot, and a "restore" of a snapshot returns the contents of a table to +what it was when the snapshot was created. The "clone" and "restore" operations do not require any data to be copied, +as the underlying HFiles (the files which contain the data for an HBase table) are not modified with either action. +Simiarly, exporting a snapshot to another cluster has little impact on RegionServers of the local cluster. Prior to version 0.94.6, the only way to backup or to clone a table is to use CopyTable/ExportTable, or to copy all the hfiles in HDFS after disabling the table. The disadvantages of these methods are that you can degrade region server performance (Copy/Export Table) or you need to disable the table, that means no reads or writes; and this is usually unacceptable. @@ -3193,8 +3345,6 @@ HDFS replication factor only affects your disk usage and is invisible to most HB You can view the current number of regions for a given table using the HMaster UI. In the [label]#Tables# section, the number of online regions for each table is listed in the [label]#Online Regions# column. This total only includes the in-memory state and does not include disabled or offline regions. -If you do not want to use the HMaster UI, you can determine the number of regions by counting the number of subdirectories of the /hbase// subdirectories in HDFS, or by running the `bin/hbase hbck` command. -Each of these methods may return a slightly different number, depending on the status of each region. [[ops.capacity.regions.count]] ==== Number of regions per RS - upper bound @@ -3481,8 +3631,8 @@ If it appears stuck, restart the Master process. === Remove RegionServer Grouping Removing RegionServer Grouping feature from a cluster on which it was enabled involves -more steps in addition to removing the relevant properties from `hbase-site.xml`. This is -to clean the RegionServer grouping related meta data so that if the feature is re-enabled +more steps in addition to removing the relevant properties from `hbase-site.xml`. This is +to clean the RegionServer grouping related meta data so that if the feature is re-enabled in the future, the old meta data will not affect the functioning of the cluster. - Move all tables in non-default rsgroups to `default` regionserver group @@ -3491,7 +3641,7 @@ in the future, the old meta data will not affect the functioning of the cluster. 
#Reassigning table t1 from non default group - hbase shell hbase(main):005:0> move_tables_rsgroup 'default',['t1'] ---- -- Move all regionservers in non-default rsgroups to `default` regionserver group +- Move all regionservers in non-default rsgroups to `default` regionserver group [source, bash] ---- #Reassigning all the servers in the non-default rsgroup to default - hbase shell @@ -3575,21 +3725,21 @@ To check normalizer status and enable/disable normalizer [source,bash] ---- hbase(main):001:0> normalizer_enabled -true +true 0 row(s) in 0.4870 seconds - + hbase(main):002:0> normalizer_switch false -true +true 0 row(s) in 0.0640 seconds - + hbase(main):003:0> normalizer_enabled -false +false 0 row(s) in 0.0120 seconds - + hbase(main):004:0> normalizer_switch true false 0 row(s) in 0.0200 seconds - + hbase(main):005:0> normalizer_enabled true 0 row(s) in 0.0090 seconds @@ -3608,19 +3758,19 @@ merge action being taken as a result of the normalization plan computed by Simpl Consider an user table with some pre-split regions having 3 equally large regions (about 100K rows) and 1 relatively small region (about 25K rows). Following is the -snippet from an hbase meta table scan showing each of the pre-split regions for +snippet from an hbase meta table scan showing each of the pre-split regions for the user table. ---- -table_p8ddpd6q5z,,1469494305548.68b9892220865cb6048 column=info:regioninfo, timestamp=1469494306375, value={ENCODED => 68b9892220865cb604809c950d1adf48, NAME => 'table_p8ddpd6q5z,,1469494305548.68b989222 09c950d1adf48. 0865cb604809c950d1adf48.', STARTKEY => '', ENDKEY => '1'} -.... -table_p8ddpd6q5z,1,1469494317178.867b77333bdc75a028 column=info:regioninfo, timestamp=1469494317848, value={ENCODED => 867b77333bdc75a028bb4c5e4b235f48, NAME => 'table_p8ddpd6q5z,1,1469494317178.867b7733 bb4c5e4b235f48. 3bdc75a028bb4c5e4b235f48.', STARTKEY => '1', ENDKEY => '3'} -.... -table_p8ddpd6q5z,3,1469494328323.98f019a753425e7977 column=info:regioninfo, timestamp=1469494328486, value={ENCODED => 98f019a753425e7977ab8636e32deeeb, NAME => 'table_p8ddpd6q5z,3,1469494328323.98f019a7 ab8636e32deeeb. 53425e7977ab8636e32deeeb.', STARTKEY => '3', ENDKEY => '7'} -.... -table_p8ddpd6q5z,7,1469494339662.94c64e748979ecbb16 column=info:regioninfo, timestamp=1469494339859, value={ENCODED => 94c64e748979ecbb166f6cc6550e25c6, NAME => 'table_p8ddpd6q5z,7,1469494339662.94c64e74 6f6cc6550e25c6. 8979ecbb166f6cc6550e25c6.', STARTKEY => '7', ENDKEY => '8'} -.... -table_p8ddpd6q5z,8,1469494339662.6d2b3f5fd1595ab8e7 column=info:regioninfo, timestamp=1469494339859, value={ENCODED => 6d2b3f5fd1595ab8e7c031876057b1ee, NAME => 'table_p8ddpd6q5z,8,1469494339662.6d2b3f5f c031876057b1ee. d1595ab8e7c031876057b1ee.', STARTKEY => '8', ENDKEY => ''} +table_p8ddpd6q5z,,1469494305548.68b9892220865cb6048 column=info:regioninfo, timestamp=1469494306375, value={ENCODED => 68b9892220865cb604809c950d1adf48, NAME => 'table_p8ddpd6q5z,,1469494305548.68b989222 09c950d1adf48. 0865cb604809c950d1adf48.', STARTKEY => '', ENDKEY => '1'} +.... +table_p8ddpd6q5z,1,1469494317178.867b77333bdc75a028 column=info:regioninfo, timestamp=1469494317848, value={ENCODED => 867b77333bdc75a028bb4c5e4b235f48, NAME => 'table_p8ddpd6q5z,1,1469494317178.867b7733 bb4c5e4b235f48. 3bdc75a028bb4c5e4b235f48.', STARTKEY => '1', ENDKEY => '3'} +.... +table_p8ddpd6q5z,3,1469494328323.98f019a753425e7977 column=info:regioninfo, timestamp=1469494328486, value={ENCODED => 98f019a753425e7977ab8636e32deeeb, NAME => 'table_p8ddpd6q5z,3,1469494328323.98f019a7 ab8636e32deeeb. 
53425e7977ab8636e32deeeb.', STARTKEY => '3', ENDKEY => '7'} +.... +table_p8ddpd6q5z,7,1469494339662.94c64e748979ecbb16 column=info:regioninfo, timestamp=1469494339859, value={ENCODED => 94c64e748979ecbb166f6cc6550e25c6, NAME => 'table_p8ddpd6q5z,7,1469494339662.94c64e74 6f6cc6550e25c6. 8979ecbb166f6cc6550e25c6.', STARTKEY => '7', ENDKEY => '8'} +.... +table_p8ddpd6q5z,8,1469494339662.6d2b3f5fd1595ab8e7 column=info:regioninfo, timestamp=1469494339859, value={ENCODED => 6d2b3f5fd1595ab8e7c031876057b1ee, NAME => 'table_p8ddpd6q5z,8,1469494339662.6d2b3f5f c031876057b1ee. d1595ab8e7c031876057b1ee.', STARTKEY => '8', ENDKEY => ''} ---- Invoking the normalizer using ‘normalize’ int the HBase shell, the below log snippet from HMaster log shows the normalization plan computed as per the logic defined for @@ -3646,15 +3796,15 @@ and end key as ‘1’, with another region having start key as ‘1’ and end Now, that these regions have been merged we see a single new region with start key as ‘’ and end key as ‘3’ ---- -table_p8ddpd6q5z,,1469516907210.e06c9b83c4a252b130e column=info:mergeA, timestamp=1469516907431, -value=PBUF\x08\xA5\xD9\x9E\xAF\xE2*\x12\x1B\x0A\x07default\x12\x10table_p8ddpd6q5z\x1A\x00"\x011(\x000\x00 ea74d246741ba. 8\x00 +table_p8ddpd6q5z,,1469516907210.e06c9b83c4a252b130e column=info:mergeA, timestamp=1469516907431, +value=PBUF\x08\xA5\xD9\x9E\xAF\xE2*\x12\x1B\x0A\x07default\x12\x10table_p8ddpd6q5z\x1A\x00"\x011(\x000\x00 ea74d246741ba. 8\x00 table_p8ddpd6q5z,,1469516907210.e06c9b83c4a252b130e column=info:mergeB, timestamp=1469516907431, -value=PBUF\x08\xB5\xBA\x9F\xAF\xE2*\x12\x1B\x0A\x07default\x12\x10table_p8ddpd6q5z\x1A\x011"\x013(\x000\x0 ea74d246741ba. 08\x00 +value=PBUF\x08\xB5\xBA\x9F\xAF\xE2*\x12\x1B\x0A\x07default\x12\x10table_p8ddpd6q5z\x1A\x011"\x013(\x000\x0 ea74d246741ba. 08\x00 table_p8ddpd6q5z,,1469516907210.e06c9b83c4a252b130e column=info:regioninfo, timestamp=1469516907431, value={ENCODED => e06c9b83c4a252b130eea74d246741ba, NAME => 'table_p8ddpd6q5z,,1469516907210.e06c9b83c ea74d246741ba. 4a252b130eea74d246741ba.', STARTKEY => '', ENDKEY => '3'} -.... -table_p8ddpd6q5z,3,1469514778736.bf024670a847c0adff column=info:regioninfo, timestamp=1469514779417, value={ENCODED => bf024670a847c0adffb74b2e13408b32, NAME => 'table_p8ddpd6q5z,3,1469514778736.bf024670 b74b2e13408b32. a847c0adffb74b2e13408b32.' STARTKEY => '3', ENDKEY => '7'} -.... -table_p8ddpd6q5z,7,1469514790152.7c5a67bc755e649db2 column=info:regioninfo, timestamp=1469514790312, value={ENCODED => 7c5a67bc755e649db22f49af6270f1e1, NAME => 'table_p8ddpd6q5z,7,1469514790152.7c5a67bc 2f49af6270f1e1. 755e649db22f49af6270f1e1.', STARTKEY => '7', ENDKEY => '8'} +.... +table_p8ddpd6q5z,3,1469514778736.bf024670a847c0adff column=info:regioninfo, timestamp=1469514779417, value={ENCODED => bf024670a847c0adffb74b2e13408b32, NAME => 'table_p8ddpd6q5z,3,1469514778736.bf024670 b74b2e13408b32. a847c0adffb74b2e13408b32.' STARTKEY => '3', ENDKEY => '7'} +.... +table_p8ddpd6q5z,7,1469514790152.7c5a67bc755e649db2 column=info:regioninfo, timestamp=1469514790312, value={ENCODED => 7c5a67bc755e649db22f49af6270f1e1, NAME => 'table_p8ddpd6q5z,7,1469514790152.7c5a67bc 2f49af6270f1e1. 755e649db22f49af6270f1e1.', STARTKEY => '7', ENDKEY => '8'} .... table_p8ddpd6q5z,8,1469514790152.58e7503cda69f98f47 column=info:regioninfo, timestamp=1469514790312, value={ENCODED => 58e7503cda69f98f4755178e74288c3a, NAME => 'table_p8ddpd6q5z,8,1469514790152.58e7503c 55178e74288c3a. 
da69f98f4755178e74288c3a.', STARTKEY => '8', ENDKEY => ''} ---- @@ -3682,6 +3832,7 @@ server=hbase-test-rc-5.openstacklocal,16020,1469419333913} ---- + [[auto_reopen_regions]] == Auto Region Reopen diff --git a/src/main/asciidoc/_chapters/performance.adoc b/src/main/asciidoc/_chapters/performance.adoc index 866779ca785..ad6572dabcc 100644 --- a/src/main/asciidoc/_chapters/performance.adoc +++ b/src/main/asciidoc/_chapters/performance.adoc @@ -257,12 +257,12 @@ The following examples illustrate some of the possibilities. Note that you always have at least one write queue, no matter what setting you use. * The default value of `0` does not split the queue. -* A value of `.3` uses 30% of the queues for reading and 60% for writing. +* A value of `.3` uses 30% of the queues for reading and 70% for writing. Given a value of `10` for `hbase.ipc.server.num.callqueue`, 3 queues would be used for reads and 7 for writes. * A value of `.5` uses the same number of read queues and write queues. Given a value of `10` for `hbase.ipc.server.num.callqueue`, 5 queues would be used for reads and 5 for writes. -* A value of `.6` uses 60% of the queues for reading and 30% for reading. - Given a value of `10` for `hbase.ipc.server.num.callqueue`, 7 queues would be used for reads and 3 for writes. +* A value of `.6` uses 60% of the queues for reading and 40% for writing. + Given a value of `10` for `hbase.ipc.server.num.callqueue`, 6 queues would be used for reads and 4 for writes. * A value of `1.0` uses one queue to process write requests, and all other queues process read requests. A value higher than `1.0` has the same effect as a value of `1.0`. Given a value of `10` for `hbase.ipc.server.num.callqueue`, 9 queues would be used for reads and 1 for writes. @@ -273,11 +273,11 @@ More queues are used for Gets if the value is below `.5` and more are used for s No matter what setting you use, at least one read queue is used for Get operations. * A value of `0` does not split the read queue. -* A value of `.3` uses 60% of the read queues for Gets and 30% for Scans. +* A value of `.3` uses 70% of the read queues for Gets and 30% for Scans. Given a value of `20` for `hbase.ipc.server.num.callqueue` and a value of `.5` for `hbase.ipc.server.callqueue.read.ratio`, 10 queues would be used for reads, out of those 10, 7 would be used for Gets and 3 for Scans. * A value of `.5` uses half the read queues for Gets and half for Scans. Given a value of `20` for `hbase.ipc.server.num.callqueue` and a value of `.5` for `hbase.ipc.server.callqueue.read.ratio`, 10 queues would be used for reads, out of those 10, 5 would be used for Gets and 5 for Scans. -* A value of `.6` uses 30% of the read queues for Gets and 60% for Scans. +* A value of `.7` uses 30% of the read queues for Gets and 70% for Scans. Given a value of `20` for `hbase.ipc.server.num.callqueue` and a value of `.5` for `hbase.ipc.server.callqueue.read.ratio`, 10 queues would be used for reads, out of those 10, 3 would be used for Gets and 7 for Scans. * A value of `1.0` uses all but one of the read queues for Scans. Given a value of `20` for `hbase.ipc.server.num.callqueue` and a value of `.5` for `hbase.ipc.server.callqueue.read.ratio`, 10 queues would be used for reads, out of those 10, 1 would be used for Gets and 9 for Scans.
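+
+Putting the two ratios together, an illustrative _hbase-site.xml_ sketch that splits the call queues
+evenly between reads and writes and then favours Gets over Scans within the read queues could look
+like the following (the values are examples only and should be tuned per workload):
+
+[source,xml]
+----
+<!-- Illustrative values only -->
+<property>
+  <name>hbase.ipc.server.callqueue.read.ratio</name>
+  <value>0.5</value>
+</property>
+<property>
+  <name>hbase.ipc.server.callqueue.scan.ratio</name>
+  <value>0.3</value>
+</property>
+----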
diff --git a/src/main/asciidoc/_chapters/preface.adoc b/src/main/asciidoc/_chapters/preface.adoc index 280f2d8cc6c..deebdd3dff1 100644 --- a/src/main/asciidoc/_chapters/preface.adoc +++ b/src/main/asciidoc/_chapters/preface.adoc @@ -68,7 +68,7 @@ Yours, the HBase Community. Please use link:https://issues.apache.org/jira/browse/hbase[JIRA] to report non-security-related bugs. -To protect existing HBase installations from new vulnerabilities, please *do not* use JIRA to report security-related bugs. Instead, send your report to the mailing list private@apache.org, which allows anyone to send messages, but restricts who can read them. Someone on that list will contact you to follow up on your report. +To protect existing HBase installations from new vulnerabilities, please *do not* use JIRA to report security-related bugs. Instead, send your report to the mailing list private@hbase.apache.org, which allows anyone to send messages, but restricts who can read them. Someone on that list will contact you to follow up on your report. [[hbase_supported_tested_definitions]] .Support and Testing Expectations diff --git a/src/main/asciidoc/_chapters/profiler.adoc b/src/main/asciidoc/_chapters/profiler.adoc index 09cbccc33a8..522cc7deed6 100644 --- a/src/main/asciidoc/_chapters/profiler.adoc +++ b/src/main/asciidoc/_chapters/profiler.adoc @@ -34,6 +34,9 @@ HBASE-21926 introduced a new servlet that supports integrated profiling via asyn == Prerequisites Go to https://github.com/jvm-profiling-tools/async-profiler, download a release appropriate for your platform, and install on every cluster host. +If 4.6 or later linux, be sure to set proc variables as per 'Basic Usage' section in the +Async Profiler Home Page +(Not doing this will draw you diagrams with no content). Set `ASYNC_PROFILER_HOME` in the environment (put it in hbase-env.sh) to the root directory of the async-profiler install location, or pass it on the HBase daemon's command line as a system property as `-Dasync.profiler.home=/path/to/async-profiler`. diff --git a/src/main/asciidoc/_chapters/protobuf.adoc b/src/main/asciidoc/_chapters/protobuf.adoc index ad7e378d962..7b26f974e36 100644 --- a/src/main/asciidoc/_chapters/protobuf.adoc +++ b/src/main/asciidoc/_chapters/protobuf.adoc @@ -148,3 +148,75 @@ consider extending it also in Going forward, we will provide a new module of common types for use by CPEPs that will have the same guarantees against change as does our public API. TODO. + +=== protobuf changes for hbase-3.0.0 (HBASE-23797) +Since hadoop(start from 3.3.x) also shades protobuf and bumps the version to +3.x, there is no reason for us to stay on protobuf 2.5.0 any more. + +In HBase 3.0.0, the hbase-protocol module has been purged, the CPEP +implementation should use the protos in hbase-protocol-shaded module, and also +make use of the shaded protobuf in hbase-thirdparty. In general, we will keep +the protobuf version compatible for a whole major release, unless there are +critical problems, for example, a critical CVE on protobuf. + +Add this dependency to your pom: +[source,xml] +---- + + org.apache.hbase.thirdparty + hbase-shaded-protobuf + + ${hbase-thirdparty.version} + provided + +---- + +And typically you also need to add this plugin to your pom to make your +generated protobuf code also use the shaded and relocated protobuf version +in hbase-thirdparty. 
+[source,xml]
+----
+<plugin>
+  <groupId>com.google.code.maven-replacer-plugin</groupId>
+  <artifactId>replacer</artifactId>
+  <version>1.5.3</version>
+  <executions>
+    <execution>
+      <phase>process-sources</phase>
+      <goals>
+        <goal>replace</goal>
+      </goals>
+    </execution>
+  </executions>
+  <configuration>
+    <basedir>${basedir}/target/generated-sources/</basedir>
+    <includes>
+      <include>**/*.java</include>
+    </includes>
+    <quiet>true</quiet>
+    <replacements>
+      <replacement>
+        <token>([^\.])com.google.protobuf</token>
+        <value>$1org.apache.hbase.thirdparty.com.google.protobuf</value>
+      </replacement>
+      <replacement>
+        <token>(public)(\W+static)?(\W+final)?(\W+class)</token>
+        <value>@javax.annotation.Generated("proto") $1$2$3$4</value>
+      </replacement>
+      <replacement>
+        <token>(@javax.annotation.Generated\("proto"\) ){2}</token>
+        <value>$1</value>
+      </replacement>
+    </replacements>
+  </configuration>
+</plugin>
+----
+
+In the hbase-examples module, we have some examples under the
+`org.apache.hadoop.hbase.coprocessor.example` package. You can see
+`BulkDeleteEndpoint` and `BulkDelete.proto` for more details, and you can also
+check the `pom.xml` of the hbase-examples module to see how to make use of the above
+plugin.
diff --git a/src/main/asciidoc/_chapters/schema_design.adoc b/src/main/asciidoc/_chapters/schema_design.adoc
index 3dd4ba0a722..d74c2f1a304 100644
--- a/src/main/asciidoc/_chapters/schema_design.adoc
+++ b/src/main/asciidoc/_chapters/schema_design.adoc
@@ -1143,7 +1143,11 @@ Disable Nagle’s algorithm. Delayed ACKs can add up to ~200ms to RPC round trip
 Detect regionserver failure as fast as reasonable. Set the following parameters:
 
 * In `hbase-site.xml`, set `zookeeper.session.timeout` to 30 seconds or less to bound failure detection (20-30 seconds is a good start).
-- Notice: the `sessionTimeout` of zookeeper is limited between 2 times and 20 times the `tickTime`(the basic time unit in milliseconds used by ZooKeeper.the default value is 2000ms.It is used to do heartbeats and the minimum session timeout will be twice the tickTime).
+- Note: Zookeeper clients negotiate a session timeout with the server during client init. Server enforces this timeout to be in the
+range [`minSessionTimeout`, `maxSessionTimeout`] and both these timeouts (measured in milliseconds) are configurable in Zookeeper service configuration.
+If not configured, these default to 2 * `tickTime` and 20 * `tickTime` respectively (`tickTime` is the basic time unit used by ZooKeeper,
+as measured in milliseconds. It is used to regulate heartbeats, timeouts etc.). Refer to Zookeeper documentation for additional details.
+
 * Detect and avoid unhealthy or failed HDFS DataNodes: in `hdfs-site.xml` and `hbase-site.xml`, set the following parameters:
 - `dfs.namenode.avoid.read.stale.datanode = true`
 - `dfs.namenode.avoid.write.stale.datanode = true`
@@ -1160,7 +1164,7 @@ the regionserver/dfsclient side.
 * In `hbase-site.xml`, set the following parameters:
 - `dfs.client.read.shortcircuit = true`
-- `dfs.client.read.shortcircuit.skip.checksum = true` so we don't double checksum (HBase does its own checksumming to save on i/os. See <> for more on this.
+- `dfs.client.read.shortcircuit.skip.checksum = true` so we don't double checksum (HBase does its own checksumming to save on i/os). See <> for more on this.
 - `dfs.domain.socket.path` to match what was set for the datanodes.
 - `dfs.client.read.shortcircuit.buffer.size = 131072` Important to avoid OOME -- hbase has a default it uses if unset, see `hbase.dfs.client.read.shortcircuit.buffer.size`; its default is 131072.
 * Ensure data locality.
In `hbase-site.xml`, set `hbase.hstore.min.locality.to.skip.major.compact = 0.7` (Meaning that 0.7 \<= n \<= 1) diff --git a/src/main/asciidoc/_chapters/security.adoc b/src/main/asciidoc/_chapters/security.adoc index 4369e48892a..107b2fff0e6 100644 --- a/src/main/asciidoc/_chapters/security.adoc +++ b/src/main/asciidoc/_chapters/security.adoc @@ -30,7 +30,7 @@ [IMPORTANT] .Reporting Security Bugs ==== -NOTE: To protect existing HBase installations from exploitation, please *do not* use JIRA to report security-related bugs. Instead, send your report to the mailing list private@apache.org, which allows anyone to send messages, but restricts who can read them. Someone on that list will contact you to follow up on your report. +NOTE: To protect existing HBase installations from exploitation, please *do not* use JIRA to report security-related bugs. Instead, send your report to the mailing list private@hbase.apache.org, which allows anyone to send messages, but restricts who can read them. Someone on that list will contact you to follow up on your report. HBase adheres to the Apache Software Foundation's policy on reported vulnerabilities, available at http://apache.org/security/. @@ -1811,7 +1811,7 @@ All options have been discussed separately in the sections above. hbase.superuser - hbase, admin + hbase,admin @@ -1831,8 +1831,7 @@ All options have been discussed separately in the sections above. hbase.coprocessor.regionserver.classes - org.apache.hadoop/hbase.security.access.AccessController, - org.apache.hadoop.hbase.security.access.VisibilityController + org.apache.hadoop.hbase.security.access.AccessController diff --git a/src/main/asciidoc/_chapters/snapshot_scanner.adoc b/src/main/asciidoc/_chapters/snapshot_scanner.adoc new file mode 100644 index 00000000000..781b76074d5 --- /dev/null +++ b/src/main/asciidoc/_chapters/snapshot_scanner.adoc @@ -0,0 +1,152 @@ +//// +/** + * + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +//// + +[[snapshot_scanner]] +== Scan over snapshot +:doctype: book +:numbered: +:toc: left +:icons: font +:experimental: +:toc: left +:source-language: java + +In HBase, a scan of a table costs server-side HBase resources reading, formating, and returning data back to the client. +Luckily, HBase provides a TableSnapshotScanner and TableSnapshotInputFormat (introduced by link:https://issues.apache.org/jira/browse/HBASE-8369[HBASE-8369]), +which can scan HBase-written HFiles directly in the HDFS filesystem completely by-passing hbase. This access mode +performs better than going via HBase and can be used with an offline HBase with in-place or exported +snapshot HFiles. + +To read HFiles directly, the user must have sufficient permissions to access snapshots or in-place hbase HFiles. 
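+
+The examples in the next two sections assume that a snapshot of the table already exists.
+If you still need to create one, a minimal sketch using the Admin API is shown below; the
+table name `"t1"` is a placeholder, and `conf` and `snapshotName` are assumed to be defined
+the same way as in the examples that follow:
+
+.Take a snapshot with the Admin API
+====
+[source,java]
+----
+try (Connection connection = ConnectionFactory.createConnection(conf);
+     Admin admin = connection.getAdmin()) {
+  // Snapshots the (placeholder) table "t1" under the name carried in snapshotName.
+  admin.snapshot(snapshotName, TableName.valueOf("t1"));
+}
+----
+====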
+
+=== TableSnapshotScanner
+
+TableSnapshotScanner provides a means for running a single client-side scan over snapshot files.
+When using TableSnapshotScanner, we must specify a temporary directory to copy the snapshot files into.
+The client user should have write permissions to this directory, and the dir should not be a subdirectory of
+the hbase.rootdir. The scanner deletes the contents of the directory once the scanner is closed.
+
+.Use TableSnapshotScanner
+====
+[source,java]
+----
+Path restoreDir = new Path("XX"); // restore dir should not be a subdirectory of hbase.rootdir
+Scan scan = new Scan();
+try (TableSnapshotScanner scanner = new TableSnapshotScanner(conf, restoreDir, snapshotName, scan)) {
+  Result result = scanner.next();
+  while (result != null) {
+    ...
+    result = scanner.next();
+  }
+}
+----
+====
+
+=== TableSnapshotInputFormat
+TableSnapshotInputFormat provides a way to scan over snapshot HFiles in a MapReduce job.
+
+.Use TableSnapshotInputFormat
+====
+[source,java]
+----
+Job job = new Job(conf);
+Path restoreDir = new Path("XX"); // restore dir should not be a subdirectory of hbase.rootdir
+Scan scan = new Scan();
+TableMapReduceUtil.initTableSnapshotMapperJob(snapshotName, scan, MyTableMapper.class, MyMapKeyOutput.class, MyMapOutputValueWritable.class, job, true, restoreDir);
+----
+====
+
+=== Permission to access snapshot and data files
+Generally, only the HBase owner or the HDFS admin has permission to access HFiles.
+
+link:https://issues.apache.org/jira/browse/HBASE-18659[HBASE-18659] uses HDFS ACLs so that users granted read permission in HBase can also access the snapshot files.
+
+==== link:https://hadoop.apache.org/docs/r2.7.1/hadoop-project-dist/hadoop-hdfs/HdfsPermissionsGuide.html#ACLs_Access_Control_Lists[HDFS ACLs]
+
+HDFS ACLs support an "access ACL", which defines the rules to enforce during permission checks, and a "default ACL",
+which defines the ACL entries that new child files or sub-directories receive automatically during creation.
+Via HDFS ACLs, HBase gives users that hold read permission in HBase read access to the corresponding HFiles.
+
+==== Basic idea
+
+The HBase files are organized in the following directories:
+
+ * {hbase-rootdir}/.tmp/data/{namespace}/{table}
+ * {hbase-rootdir}/data/{namespace}/{table}
+ * {hbase-rootdir}/archive/data/{namespace}/{table}
+ * {hbase-rootdir}/.hbase-snapshot/{snapshotName}
+
+So the basic idea is to add or remove HDFS ACLs on the files of the global/namespace/table directories
+when permission is granted or revoked at the global/namespace/table level.
+
+See the design doc in link:https://issues.apache.org/jira/browse/HBASE-18659[HBASE-18659] for more details.
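+
+To make the basic idea concrete, below is a rough sketch of the kind of HDFS ACL entries the
+feature maintains for a granted user, written against the plain Hadoop `FileSystem` API
+(classes from `org.apache.hadoop.fs` and `org.apache.hadoop.fs.permission`). It is an
+illustration only; the real logic lives in the `SnapshotScannerHDFSAclController` coprocessor
+described below, and the user name `"bob"` and the table directory path are placeholders:
+
+[source,java]
+----
+FileSystem fs = FileSystem.get(conf);
+Path tableDir = new Path("/hbase/data/default/t1");
+// One "access" entry for the directory itself, plus one "default" entry so that new child
+// files and sub-directories created later inherit the same read access.
+List<AclEntry> entries = Arrays.asList(
+  new AclEntry.Builder().setScope(AclEntryScope.ACCESS).setType(AclEntryType.USER)
+      .setName("bob").setPermission(FsAction.READ_EXECUTE).build(),
+  new AclEntry.Builder().setScope(AclEntryScope.DEFAULT).setType(AclEntryType.USER)
+      .setName("bob").setPermission(FsAction.READ_EXECUTE).build());
+fs.modifyAclEntries(tableDir, entries);
+----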
+ +==== Configuration to use this feature + + * Firstly, make sure that HDFS ACLs are enabled and umask is set to 027 +---- +dfs.namenode.acls.enabled = true +fs.permissions.umask-mode = 027 +---- + + * Add master coprocessor, please make sure the SnapshotScannerHDFSAclController is configured after AccessController +---- +hbase.coprocessor.master.classes = "org.apache.hadoop.hbase.security.access.AccessController +,org.apache.hadoop.hbase.security.access.SnapshotScannerHDFSAclController" +---- + + * Enable this feature +---- +hbase.acl.sync.to.hdfs.enable=true +---- + + * Modify table scheme to enable this feature for a specified table, this config is + false by default for every table, this means the HBase granted ACLs will not be synced to HDFS +---- +alter 't1', CONFIGURATION => {'hbase.acl.sync.to.hdfs.enable' => 'true'} +---- + +==== Limitation +There are some limitations for this feature: + +===== +If we enable this feature, some master operations such as grant, revoke, snapshot... +(See the design doc for more details) will be slower as we need to sync HDFS ACLs to related hfiles. +===== + +===== +HDFS has a config which limits the max ACL entries num for one directory or file: +---- +dfs.namenode.acls.max.entries = 32(default value) +---- +The 32 entries include four fixed users for each directory or file: owner, group, other, and mask. +For a directory, the four users contain 8 ACL entries(access and default) and for a file, the four +users contain 4 ACL entries(access). This means there are 24 ACL entries left for named users or groups. + +Based on this limitation, we can only sync up to 12 HBase granted users' ACLs. This means, if a table +enables this feature, then the total users with table, namespace of this table, global READ permission +should not be greater than 12. +===== + +===== +There are some cases that this coprocessor has not handled or could not handle, so the user HDFS ACLs +are not synced normally. It will not make a reference link to another hfile of other tables. +===== diff --git a/src/main/asciidoc/_chapters/spark.adoc b/src/main/asciidoc/_chapters/spark.adoc new file mode 100644 index 00000000000..207528f057d --- /dev/null +++ b/src/main/asciidoc/_chapters/spark.adoc @@ -0,0 +1,699 @@ +//// +/** + * + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + . . http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +//// + +[[spark]] += HBase and Spark +:doctype: book +:numbered: +:toc: left +:icons: font +:experimental: + +link:https://spark.apache.org/[Apache Spark] is a software framework that is used +to process data in memory in a distributed manner, and is replacing MapReduce in +many use cases. + +Spark itself is out of scope of this document, please refer to the Spark site for +more information on the Spark project and subprojects. 
This document will focus
+on 4 main interaction points between Spark and HBase. Those interaction points are:
+
+Basic Spark::
+  The ability to have an HBase Connection at any point in your Spark DAG.
+Spark Streaming::
+  The ability to have an HBase Connection at any point in your Spark Streaming
+  application.
+Spark Bulk Load::
+  The ability to write directly to HBase HFiles for bulk insertion into HBase.
+SparkSQL/DataFrames::
+  The ability to write SparkSQL that draws on tables that are represented in HBase.
+
+The following sections will walk through examples of all these interaction points.
+
+== Basic Spark
+
+This section discusses Spark HBase integration at the lowest and simplest levels.
+All the other interaction points are built upon the concepts that will be described
+here.
+
+At the root of all Spark and HBase integration is the HBaseContext. The HBaseContext
+takes in HBase configurations and pushes them to the Spark executors. This allows
+us to have an HBase Connection per Spark Executor in a static location.
+
+For reference, Spark Executors can be on the same nodes as the Region Servers or
+on different nodes; there is no dependence on co-location. Think of every Spark
+Executor as a multi-threaded client application. This allows any Spark Tasks
+running on the executors to access the shared Connection object.
+
+.HBaseContext Usage Example
+====
+
+This example shows how HBaseContext can be used to do a `foreachPartition` on an RDD
+in Scala:
+
+[source, scala]
+----
+val sc = new SparkContext("local", "test")
+val config = new HBaseConfiguration()
+
+...
+
+val hbaseContext = new HBaseContext(sc, config)
+
+rdd.hbaseForeachPartition(hbaseContext, (it, conn) => {
+  val bufferedMutator = conn.getBufferedMutator(TableName.valueOf("t1"))
+  it.foreach((putRecord) => {
+    val put = new Put(putRecord._1)
+    putRecord._2.foreach((putValue) => put.addColumn(putValue._1, putValue._2, putValue._3))
+    bufferedMutator.mutate(put)
+  })
+  bufferedMutator.flush()
+  bufferedMutator.close()
+})
+----
+
+Here is the same example implemented in Java:
+
+[source, java]
+----
+JavaSparkContext jsc = new JavaSparkContext(sparkConf);
+
+try {
+  List<byte[]> list = new ArrayList<>();
+  list.add(Bytes.toBytes("1"));
+  ...
+  list.add(Bytes.toBytes("5"));
+
+  JavaRDD<byte[]> rdd = jsc.parallelize(list);
+  Configuration conf = HBaseConfiguration.create();
+
+  JavaHBaseContext hbaseContext = new JavaHBaseContext(jsc, conf);
+
+  hbaseContext.foreachPartition(rdd,
+      new VoidFunction<Tuple2<Iterator<byte[]>, Connection>>() {
+        public void call(Tuple2<Iterator<byte[]>, Connection> t)
+            throws Exception {
+          Table table = t._2().getTable(TableName.valueOf(tableName));
+          BufferedMutator mutator = t._2().getBufferedMutator(TableName.valueOf(tableName));
+          while (t._1().hasNext()) {
+            byte[] b = t._1().next();
+            Result r = table.get(new Get(b));
+            if (r.getExists()) {
+              mutator.mutate(new Put(b));
+            }
+          }
+
+          mutator.flush();
+          mutator.close();
+          table.close();
+        }
+      });
+} finally {
+  jsc.stop();
+}
+----
+====
+
+All functionality between Spark and HBase will be supported both in Scala and in
+Java, with the exception of SparkSQL which will support any language that is
+supported by Spark. For the remainder of this documentation we will focus on
+Scala examples.
+
+The examples above illustrate how to do a foreachPartition with a connection.
A +number of other Spark base functions are supported out of the box: + +// tag::spark_base_functions[] +`bulkPut`:: For massively parallel sending of puts to HBase +`bulkDelete`:: For massively parallel sending of deletes to HBase +`bulkGet`:: For massively parallel sending of gets to HBase to create a new RDD +`mapPartition`:: To do a Spark Map function with a Connection object to allow full +access to HBase +`hbaseRDD`:: To simplify a distributed scan to create a RDD +// end::spark_base_functions[] + +For examples of all these functionalities, see the +link:https://github.com/apache/hbase-connectors/tree/master/spark[hbase-spark integration] +in the link:https://github.com/apache/hbase-connectors[hbase-connectors] repository +(the hbase-spark connectors live outside hbase core in a related, +Apache HBase project maintained, associated repo). + +== Spark Streaming +https://spark.apache.org/streaming/[Spark Streaming] is a micro batching stream +processing framework built on top of Spark. HBase and Spark Streaming make great +companions in that HBase can help serve the following benefits alongside Spark +Streaming. + +* A place to grab reference data or profile data on the fly +* A place to store counts or aggregates in a way that supports Spark Streaming's +promise of _only once processing_. + +The link:https://github.com/apache/hbase-connectors/tree/master/spark[hbase-spark integration] +with Spark Streaming is similar to its normal Spark integration points, in that the following +commands are possible straight off a Spark Streaming DStream. + +include::spark.adoc[tags=spark_base_functions] + +.`bulkPut` Example with DStreams +==== + +Below is an example of bulkPut with DStreams. It is very close in feel to the RDD +bulk put. + +[source, scala] +---- +val sc = new SparkContext("local", "test") +val config = new HBaseConfiguration() + +val hbaseContext = new HBaseContext(sc, config) +val ssc = new StreamingContext(sc, Milliseconds(200)) + +val rdd1 = ... +val rdd2 = ... + +val queue = mutable.Queue[RDD[(Array[Byte], Array[(Array[Byte], + Array[Byte], Array[Byte])])]]() + +queue += rdd1 +queue += rdd2 + +val dStream = ssc.queueStream(queue) + +dStream.hbaseBulkPut( + hbaseContext, + TableName.valueOf(tableName), + (putRecord) => { + val put = new Put(putRecord._1) + putRecord._2.foreach((putValue) => put.addColumn(putValue._1, putValue._2, putValue._3)) + put + }) +---- + +There are three inputs to the `hbaseBulkPut` function. +The hbaseContext that carries the configuration broadcast information link +to the HBase Connections in the executor, the table name of the table we are +putting data into, and a function that will convert a record in the DStream +into an HBase Put object. +==== + +== Bulk Load + +There are two options for bulk loading data into HBase with Spark. There is the +basic bulk load functionality that will work for cases where your rows have +millions of columns and cases where your columns are not consolidated and +partitioned before the map side of the Spark bulk load process. + +There is also a thin record bulk load option with Spark. This second option is +designed for tables that have less then 10k columns per row. The advantage +of this second option is higher throughput and less over-all load on the Spark +shuffle operation. 
+
+Both implementations work more or less like the MapReduce bulk load process in
+that a partitioner partitions the rowkeys based on region splits and
+the row keys are sent to the reducers in order, so that HFiles can be written
+out directly from the reduce phase.
+
+In Spark terms, the bulk load will be implemented around a Spark
+`repartitionAndSortWithinPartitions` followed by a Spark `foreachPartition`.
+
+First let's look at an example of using the basic bulk load functionality.
+
+.Bulk Loading Example
+====
+
+The following example shows bulk loading with Spark.
+
+[source, scala]
+----
+val sc = new SparkContext("local", "test")
+val config = new HBaseConfiguration()
+
+val hbaseContext = new HBaseContext(sc, config)
+
+val stagingFolder = ...
+val rdd = sc.parallelize(Array(
+  (Bytes.toBytes("1"),
+    (Bytes.toBytes(columnFamily1), Bytes.toBytes("a"), Bytes.toBytes("foo1"))),
+  (Bytes.toBytes("3"),
+    (Bytes.toBytes(columnFamily1), Bytes.toBytes("b"), Bytes.toBytes("foo2.b"))), ...
+
+rdd.hbaseBulkLoad(TableName.valueOf(tableName),
+  t => {
+    val rowKey = t._1
+    val family:Array[Byte] = t._2(0)._1
+    val qualifier = t._2(0)._2
+    val value = t._2(0)._3
+
+    val keyFamilyQualifier = new KeyFamilyQualifier(rowKey, family, qualifier)
+
+    Seq((keyFamilyQualifier, value)).iterator
+  },
+  stagingFolder.getPath)
+
+val load = new LoadIncrementalHFiles(config)
+load.doBulkLoad(new Path(stagingFolder.getPath),
+  conn.getAdmin, table, conn.getRegionLocator(TableName.valueOf(tableName)))
+----
+====
+
+The `hbaseBulkLoad` function takes three required parameters:
+
+. The table name of the table we intend to bulk load into
+
+. A function that will convert a record in the RDD to a tuple key value pair, with
+the tuple key being a KeyFamilyQualifier object and the value being the cell value.
+The KeyFamilyQualifier object will hold the RowKey, Column Family, and Column Qualifier.
+The shuffle will partition on the RowKey but will sort by all three values.
+
+. The temporary path for the HFile to be written out to
+
+Following the Spark bulk load command, use HBase's LoadIncrementalHFiles object
+to load the newly created HFiles into HBase.
+
+.Additional Parameters for Bulk Loading with Spark
+
+You can set the following attributes with additional parameter options on hbaseBulkLoad.
+
+* Max file size of the HFiles
+* A flag to exclude HFiles from compactions
+* Column Family settings for compression, bloomType, blockSize, and dataBlockEncoding
+
+.Using Additional Parameters
+====
+
+[source, scala]
+----
+val sc = new SparkContext("local", "test")
+val config = new HBaseConfiguration()
+
+val hbaseContext = new HBaseContext(sc, config)
+
+val stagingFolder = ...
+val rdd = sc.parallelize(Array(
+  (Bytes.toBytes("1"),
+    (Bytes.toBytes(columnFamily1), Bytes.toBytes("a"), Bytes.toBytes("foo1"))),
+  (Bytes.toBytes("3"),
+    (Bytes.toBytes(columnFamily1), Bytes.toBytes("b"), Bytes.toBytes("foo2.b"))), ...
+ +val familyHBaseWriterOptions = new java.util.HashMap[Array[Byte], FamilyHFileWriteOptions] +val f1Options = new FamilyHFileWriteOptions("GZ", "ROW", 128, "PREFIX") + +familyHBaseWriterOptions.put(Bytes.toBytes("columnFamily1"), f1Options) + +rdd.hbaseBulkLoad(TableName.valueOf(tableName), + t => { + val rowKey = t._1 + val family:Array[Byte] = t._2(0)._1 + val qualifier = t._2(0)._2 + val value = t._2(0)._3 + + val keyFamilyQualifier= new KeyFamilyQualifier(rowKey, family, qualifier) + + Seq((keyFamilyQualifier, value)).iterator + }, + stagingFolder.getPath, + familyHBaseWriterOptions, + compactionExclude = false, + HConstants.DEFAULT_MAX_FILE_SIZE) + +val load = new LoadIncrementalHFiles(config) +load.doBulkLoad(new Path(stagingFolder.getPath), + conn.getAdmin, table, conn.getRegionLocator(TableName.valueOf(tableName))) +---- +==== + +Now lets look at how you would call the thin record bulk load implementation + +.Using thin record bulk load +==== + +[source, scala] +---- +val sc = new SparkContext("local", "test") +val config = new HBaseConfiguration() + +val hbaseContext = new HBaseContext(sc, config) + +val stagingFolder = ... +val rdd = sc.parallelize(Array( + ("1", + (Bytes.toBytes(columnFamily1), Bytes.toBytes("a"), Bytes.toBytes("foo1"))), + ("3", + (Bytes.toBytes(columnFamily1), Bytes.toBytes("b"), Bytes.toBytes("foo2.b"))), ... + +rdd.hbaseBulkLoadThinRows(hbaseContext, + TableName.valueOf(tableName), + t => { + val rowKey = t._1 + + val familyQualifiersValues = new FamiliesQualifiersValues + t._2.foreach(f => { + val family:Array[Byte] = f._1 + val qualifier = f._2 + val value:Array[Byte] = f._3 + + familyQualifiersValues +=(family, qualifier, value) + }) + (new ByteArrayWrapper(Bytes.toBytes(rowKey)), familyQualifiersValues) + }, + stagingFolder.getPath, + new java.util.HashMap[Array[Byte], FamilyHFileWriteOptions], + compactionExclude = false, + 20) + +val load = new LoadIncrementalHFiles(config) +load.doBulkLoad(new Path(stagingFolder.getPath), + conn.getAdmin, table, conn.getRegionLocator(TableName.valueOf(tableName))) +---- +==== + +Note that the big difference in using bulk load for thin rows is the function +returns a tuple with the first value being the row key and the second value +being an object of FamiliesQualifiersValues, which will contain all the +values for this row for all column families. + +== SparkSQL/DataFrames + +The link:https://github.com/apache/hbase-connectors/tree/master/spark[hbase-spark integration] +leverages +link:https://databricks.com/blog/2015/01/09/spark-sql-data-sources-api-unified-data-access-for-the-spark-platform.html[DataSource API] +(link:https://issues.apache.org/jira/browse/SPARK-3247[SPARK-3247]) +introduced in Spark-1.2.0, which bridges the gap between simple HBase KV store and complex +relational SQL queries and enables users to perform complex data analytical work +on top of HBase using Spark. HBase Dataframe is a standard Spark Dataframe, and is able to +interact with any other data sources such as Hive, Orc, Parquet, JSON, etc. +The link:https://github.com/apache/hbase-connectors/tree/master/spark[hbase-spark integration] +applies critical techniques such as partition pruning, column pruning, +predicate pushdown and data locality. + +To use the +link:https://github.com/apache/hbase-connectors/tree/master/spark[hbase-spark integration] +connector, users need to define the Catalog for the schema mapping +between HBase and Spark tables, prepare the data and populate the HBase table, +then load the HBase DataFrame. 
After that, users can do integrated query and access records +in HBase tables with SQL query. The following illustrates the basic procedure. + +=== Define catalog + +[source, scala] +---- +def catalog = s"""{ +       |"table":{"namespace":"default", "name":"table1"}, +       |"rowkey":"key", +       |"columns":{ +         |"col0":{"cf":"rowkey", "col":"key", "type":"string"}, +         |"col1":{"cf":"cf1", "col":"col1", "type":"boolean"}, +         |"col2":{"cf":"cf2", "col":"col2", "type":"double"}, +         |"col3":{"cf":"cf3", "col":"col3", "type":"float"}, +         |"col4":{"cf":"cf4", "col":"col4", "type":"int"}, +         |"col5":{"cf":"cf5", "col":"col5", "type":"bigint"}, +         |"col6":{"cf":"cf6", "col":"col6", "type":"smallint"}, +         |"col7":{"cf":"cf7", "col":"col7", "type":"string"}, +         |"col8":{"cf":"cf8", "col":"col8", "type":"tinyint"} +       |} +     |}""".stripMargin +---- + +Catalog defines a mapping between HBase and Spark tables. There are two critical parts of this catalog. +One is the rowkey definition and the other is the mapping between table column in Spark and +the column family and column qualifier in HBase. The above defines a schema for a HBase table +with name as table1, row key as key and a number of columns (col1 `-` col8). Note that the rowkey +also has to be defined in details as a column (col0), which has a specific cf (rowkey). + +=== Save the DataFrame + +[source, scala] +---- +case class HBaseRecord( + col0: String, + col1: Boolean, + col2: Double, + col3: Float, + col4: Int,        + col5: Long, + col6: Short, + col7: String, + col8: Byte) + +object HBaseRecord +{                                                                                                              + def apply(i: Int, t: String): HBaseRecord = { + val s = s"""row${"%03d".format(i)}"""        + HBaseRecord(s, + i % 2 == 0, + i.toDouble, + i.toFloat,   + i, + i.toLong, + i.toShort,   + s"String$i: $t",       + i.toByte) + } +} + +val data = (0 to 255).map { i =>  HBaseRecord(i, "extra")} + +sc.parallelize(data).toDF.write.options( + Map(HBaseTableCatalog.tableCatalog -> catalog, HBaseTableCatalog.newTable -> "5")) + .format("org.apache.hadoop.hbase.spark ") + .save() + +---- +`data` prepared by the user is a local Scala collection which has 256 HBaseRecord objects. +`sc.parallelize(data)` function distributes `data` to form an RDD. `toDF` returns a DataFrame. +`write` function returns a DataFrameWriter used to write the DataFrame to external storage +systems (e.g. HBase here). Given a DataFrame with specified schema `catalog`, `save` function +will create an HBase table with 5 regions and save the DataFrame inside. + +=== Load the DataFrame + +[source, scala] +---- +def withCatalog(cat: String): DataFrame = { + sqlContext + .read + .options(Map(HBaseTableCatalog.tableCatalog->cat)) + .format("org.apache.hadoop.hbase.spark") + .load() +} +val df = withCatalog(catalog) +---- +In ‘withCatalog’ function, sqlContext is a variable of SQLContext, which is the entry point +for working with structured data (rows and columns) in Spark. +`read` returns a DataFrameReader that can be used to read data in as a DataFrame. +`option` function adds input options for the underlying data source to the DataFrameReader, +and `format` function specifies the input data source format for the DataFrameReader. +The `load()` function loads input in as a DataFrame. The date frame `df` returned +by `withCatalog` function could be used to access HBase table, such as 4.4 and 4.5. 
+ +=== Language Integrated Query + +[source, scala] +---- +val s = df.filter(($"col0" <= "row050" && $"col0" > "row040") || + $"col0" === "row005" || + $"col0" <= "row005") + .select("col0", "col1", "col4") +s.show +---- +DataFrame can do various operations, such as join, sort, select, filter, orderBy and so on. +`df.filter` above filters rows using the given SQL expression. `select` selects a set of columns: +`col0`, `col1` and `col4`. + +=== SQL Query + +[source, scala] +---- +df.registerTempTable("table1") +sqlContext.sql("select count(col1) from table1").show +---- + +`registerTempTable` registers `df` DataFrame as a temporary table using the table name `table1`. +The lifetime of this temporary table is tied to the SQLContext that was used to create `df`. +`sqlContext.sql` function allows the user to execute SQL queries. + +=== Others + +.Query with different timestamps +==== +In HBaseSparkConf, four parameters related to timestamp can be set. They are TIMESTAMP, +MIN_TIMESTAMP, MAX_TIMESTAMP and MAX_VERSIONS respectively. Users can query records with +different timestamps or time ranges with MIN_TIMESTAMP and MAX_TIMESTAMP. In the meantime, +use concrete value instead of tsSpecified and oldMs in the examples below. + +The example below shows how to load df DataFrame with different timestamps. +tsSpecified is specified by the user. +HBaseTableCatalog defines the HBase and Relation relation schema. +writeCatalog defines catalog for the schema mapping. + +[source, scala] +---- +val df = sqlContext.read + .options(Map(HBaseTableCatalog.tableCatalog -> writeCatalog, HBaseSparkConf.TIMESTAMP -> tsSpecified.toString)) + .format("org.apache.hadoop.hbase.spark") + .load() +---- + +The example below shows how to load df DataFrame with different time ranges. +oldMs is specified by the user. + +[source, scala] +---- +val df = sqlContext.read + .options(Map(HBaseTableCatalog.tableCatalog -> writeCatalog, HBaseSparkConf.MIN_TIMESTAMP -> "0", + HBaseSparkConf.MAX_TIMESTAMP -> oldMs.toString)) + .format("org.apache.hadoop.hbase.spark") + .load() +---- +After loading df DataFrame, users can query data. + +[source, scala] +---- +df.registerTempTable("table") +sqlContext.sql("select count(col1) from table").show +---- +==== + +.Native Avro support +==== +The link:https://github.com/apache/hbase-connectors/tree/master/spark[hbase-spark integration] +connector supports different data formats like Avro, JSON, etc. The use case below +shows how spark supports Avro. Users can persist the Avro record into HBase directly. Internally, +the Avro schema is converted to a native Spark Catalyst data type automatically. +Note that both key-value parts in an HBase table can be defined in Avro format. + +1) Define catalog for the schema mapping: + +[source, scala] +---- +def catalog = s"""{ + |"table":{"namespace":"default", "name":"Avrotable"}, + |"rowkey":"key", + |"columns":{ + |"col0":{"cf":"rowkey", "col":"key", "type":"string"}, + |"col1":{"cf":"cf1", "col":"col1", "type":"binary"} + |} + |}""".stripMargin +---- + +`catalog` is a schema for a HBase table named `Avrotable`. row key as key and +one column col1. The rowkey also has to be defined in details as a column (col0), +which has a specific cf (rowkey). 
+ +2) Prepare the Data: + +[source, scala] +---- + object AvroHBaseRecord { + val schemaString = + s"""{"namespace": "example.avro", + | "type": "record", "name": "User", + | "fields": [ + | {"name": "name", "type": "string"}, + | {"name": "favorite_number", "type": ["int", "null"]}, + | {"name": "favorite_color", "type": ["string", "null"]}, + | {"name": "favorite_array", "type": {"type": "array", "items": "string"}}, + | {"name": "favorite_map", "type": {"type": "map", "values": "int"}} + | ] }""".stripMargin + + val avroSchema: Schema = { + val p = new Schema.Parser + p.parse(schemaString) + } + + def apply(i: Int): AvroHBaseRecord = { + val user = new GenericData.Record(avroSchema); + user.put("name", s"name${"%03d".format(i)}") + user.put("favorite_number", i) + user.put("favorite_color", s"color${"%03d".format(i)}") + val favoriteArray = new GenericData.Array[String](2, avroSchema.getField("favorite_array").schema()) + favoriteArray.add(s"number${i}") + favoriteArray.add(s"number${i+1}") + user.put("favorite_array", favoriteArray) + import collection.JavaConverters._ + val favoriteMap = Map[String, Int](("key1" -> i), ("key2" -> (i+1))).asJava + user.put("favorite_map", favoriteMap) + val avroByte = AvroSedes.serialize(user, avroSchema) + AvroHBaseRecord(s"name${"%03d".format(i)}", avroByte) + } + } + + val data = (0 to 255).map { i => + AvroHBaseRecord(i) + } +---- + +`schemaString` is defined first, then it is parsed to get `avroSchema`. `avroSchema` is used to +generate `AvroHBaseRecord`. `data` prepared by users is a local Scala collection +which has 256 `AvroHBaseRecord` objects. + +3) Save DataFrame: + +[source, scala] +---- + sc.parallelize(data).toDF.write.options( + Map(HBaseTableCatalog.tableCatalog -> catalog, HBaseTableCatalog.newTable -> "5")) + .format("org.apache.spark.sql.execution.datasources.hbase") + .save() +---- + +Given a data frame with specified schema `catalog`, above will create an HBase table with 5 +regions and save the data frame inside. + +4) Load the DataFrame + +[source, scala] +---- +def avroCatalog = s"""{ + |"table":{"namespace":"default", "name":"avrotable"}, + |"rowkey":"key", + |"columns":{ + |"col0":{"cf":"rowkey", "col":"key", "type":"string"}, + |"col1":{"cf":"cf1", "col":"col1", "avro":"avroSchema"} + |} + |}""".stripMargin + + def withCatalog(cat: String): DataFrame = { + sqlContext + .read + .options(Map("avroSchema" -> AvroHBaseRecord.schemaString, HBaseTableCatalog.tableCatalog -> avroCatalog)) + .format("org.apache.spark.sql.execution.datasources.hbase") + .load() + } + val df = withCatalog(catalog) +---- + +In `withCatalog` function, `read` returns a DataFrameReader that can be used to read data in as a DataFrame. +The `option` function adds input options for the underlying data source to the DataFrameReader. +There are two options: one is to set `avroSchema` as `AvroHBaseRecord.schemaString`, and one is to +set `HBaseTableCatalog.tableCatalog` as `avroCatalog`. The `load()` function loads input in as a DataFrame. +The date frame `df` returned by `withCatalog` function could be used to access the HBase table. + +5) SQL Query + +[source, scala] +---- + df.registerTempTable("avrotable") + val c = sqlContext.sql("select count(1) from avrotable"). +---- + +After loading df DataFrame, users can query data. registerTempTable registers df DataFrame +as a temporary table using the table name avrotable. `sqlContext.sql` function allows the +user to execute SQL queries. 
+==== diff --git a/src/main/asciidoc/_chapters/troubleshooting.adoc b/src/main/asciidoc/_chapters/troubleshooting.adoc index 378ad4b2ceb..032d2c1868e 100644 --- a/src/main/asciidoc/_chapters/troubleshooting.adoc +++ b/src/main/asciidoc/_chapters/troubleshooting.adoc @@ -724,6 +724,17 @@ Insure the JCE jars are on the classpath on both server and client systems. You may also need to download the link:http://www.oracle.com/technetwork/java/javase/downloads/jce-6-download-429243.html[unlimited strength JCE policy files]. Uncompress and extract the downloaded file, and install the policy jars into _/lib/security_. +[[trouble.client.masterregistry]] +=== Trouble shooting master registry issues + +* For connectivity issues, usually an exception like "MasterRegistryFetchException: Exception making rpc to masters..." is logged in the client logs. The logging includes the +list of master end points that were attempted by the client. The bottom part of the stack trace should include the underlying reason. If you suspect connectivity +issues (ConnectionRefused?), make sure the master end points are accessible from client. +* If there is a suspicion of higher load on the masters due to hedging of RPCs, it can be controlled by either reducing the hedging fan out (via _hbase.rpc.hedged.fanout_) or +by restricting the set of masters that clients can access for the master registry purposes (via _hbase.masters_). + +Refer to <> and <> for more details. + [[trouble.mapreduce]] == MapReduce @@ -870,9 +881,9 @@ Snapshots:: When you create a snapshot, HBase retains everything it needs to recreate the table's state at that time of the snapshot. This includes deleted cells or expired versions. For this reason, your snapshot usage pattern should be well-planned, and you should - prune snapshots that you no longer need. Snapshots are stored in `/hbase/.snapshots`, + prune snapshots that you no longer need. Snapshots are stored in `/hbase/.hbase-snapshot`, and archives needed to restore snapshots are stored in - `/hbase/.archive/<_tablename_>/<_region_>/<_column_family_>/`. + `/hbase/archive/<_tablename_>/<_region_>/<_column_family_>/`. *Do not* manage snapshots or archives manually via HDFS. HBase provides APIs and HBase Shell commands for managing them. For more information, see <>. @@ -1290,7 +1301,7 @@ If you have a DNS server, you can set `hbase.zookeeper.dns.interface` and `hbase ZooKeeper is the cluster's "canary in the mineshaft". It'll be the first to notice issues if any so making sure its happy is the short-cut to a humming cluster. -See the link:https://wiki.apache.org/hadoop/ZooKeeper/Troubleshooting[ZooKeeper Operating Environment Troubleshooting] page. +See the link:https://cwiki.apache.org/confluence/display/HADOOP2/ZooKeeper+Troubleshooting[ZooKeeper Operating Environment Troubleshooting] page. It has suggestions and tools for checking disk and networking performance; i.e. the operating environment your ZooKeeper and HBase are running in. diff --git a/src/main/asciidoc/_chapters/upgrading.adoc b/src/main/asciidoc/_chapters/upgrading.adoc index 02739279b96..39e86b9f61c 100644 --- a/src/main/asciidoc/_chapters/upgrading.adoc +++ b/src/main/asciidoc/_chapters/upgrading.adoc @@ -162,7 +162,7 @@ HBase Private API:: [[hbase.binary.compatibility]] .Binary Compatibility -When we say two HBase versions are compatible, we mean that the versions are wire and binary compatible. Compatible HBase versions means that clients can talk to compatible but differently versioned servers. 
It means too that you can just swap out the jars of one version and replace them with the jars of another, compatible version and all will just work. Unless otherwise specified, HBase point versions are (mostly) binary compatible. You can safely do rolling upgrades between binary compatible versions; i.e. across maintenance releases: e.g. from 1.2.4 to 1.2.6. See link:[Does compatibility between versions also mean binary compatibility?] discussion on the HBase dev mailing list.
+When we say two HBase versions are compatible, we mean that the versions are wire and binary compatible. Compatible HBase versions means that clients can talk to compatible but differently versioned servers. It means too that you can just swap out the jars of one version and replace them with the jars of another, compatible version and all will just work. Unless otherwise specified, HBase point versions are (mostly) binary compatible. You can safely do rolling upgrades between binary compatible versions; i.e. across maintenance releases: e.g. from 1.4.4 to 1.4.6. See link:[Does compatibility between versions also mean binary compatibility?] discussion on the HBase dev mailing list.
 
 [[hbase.rolling.upgrade]]
 === Rolling Upgrades
@@ -180,7 +180,7 @@ The rolling-restart script will first gracefully stop and restart the master, an
 
 [[hbase.rolling.restart]]
 .Rolling Upgrade Between Versions that are Binary/Wire Compatible
-Unless otherwise specified, HBase minor versions are binary compatible. You can do a <> between HBase point versions. For example, you can go to 1.2.4 from 1.2.6 by doing a rolling upgrade across the cluster replacing the 1.2.4 binary with a 1.2.6 binary.
+Unless otherwise specified, HBase minor versions are binary compatible. You can do a <> between HBase point versions. For example, you can go to 1.4.4 from 1.4.6 by doing a rolling upgrade across the cluster replacing the 1.4.4 binary with a 1.4.6 binary.
 
 In the minor version-particular sections below, we call out where the versions are wire/protocol compatible and in this case, it is also possible to do a <>.
 
@@ -315,6 +315,50 @@ Quitting...
 
 == Upgrade Paths
 
+[[upgrade2.3]]
+=== Upgrade from 2.0.x-2.2.x to 2.3+
+There is no special consideration upgrading to hbase-2.3.x from earlier versions. From 2.2.x, it should be
+rolling upgradeable. From 2.1.x or 2.0.x, you will need to clear the <> hurdle first.
+
+[[upgrade2.3_zookeeper]]
+==== Upgraded ZooKeeper Dependency Version
+
+Our dependency on Apache ZooKeeper has been upgraded to 3.5.7
+(https://issues.apache.org/jira/browse/HBASE-24132[HBASE-24132]), as 3.4.x is EOL. The newer 3.5.x
+client is compatible with the older 3.4.x server. However, if you're using HBase in stand-alone
+mode and perform an in-place upgrade, there are some upgrade steps
+https://cwiki.apache.org/confluence/display/ZOOKEEPER/Upgrade+FAQ[documented by the ZooKeeper community].
+This doesn't impact a production deployment, but would impact a developer's local environment.
+
+[[upgrade2.3_in-master-procedure-store-region]]
+==== New In-Master Procedure Store
+
+Of note, HBase 2.3.0 changes the in-Master Procedure Store implementation. What had been a dedicated custom store
+(see <>) is replaced by a standard HBase Region (https://issues.apache.org/jira/browse/HBASE-23326[HBASE-23326]).
+The migration from the old to the new format is run automatically by the new 2.3.0 Master on startup. The old _MasterProcWALs_
+dir which hosted the old custom implementation files in _${hbase.rootdir}_ is deleted on successful
+migration. A new _MasterProc_ sub-directory replaces it to host the Store files and WALs for the new
+Procedure Store in-Master Region. The in-Master Region is unusual in that it writes to an
+alternate location at _${hbase.rootdir}/MasterProc_ rather than under _${hbase.rootdir}/data_ in the
+filesystem and the special Procedure Store in-Master Region is hidden from all clients other than the active
+Master itself. Otherwise, it is like any other Region, with the Master process running flushes and compactions,
+archiving WALs when over-flushed, and so on. Its files are readable by standard Region and Store file
+tooling for triage and analysis as long as they are pointed to the appropriate location in the filesystem.
+
+[[upgrade2.2]]
+=== Upgrade from 2.0 or 2.1 to 2.2+
+
+HBase 2.2+ uses a new Procedure form for assigning/unassigning/moving Regions. It does not process HBase 2.1 and 2.0's Unassign/Assign Procedure types. Upgrade requires that we first drain the Master Procedure Store of old style Procedures before starting the new 2.2 Master. So you need to make sure that before you kill the old version (2.0 or 2.1) Master, there is no region in transition. And once the new version (2.2+) Master is up, you can do a rolling upgrade of the RegionServers one by one.
+
+There is a safer way if you are running a 2.1.1+ or 2.0.3+ cluster. It needs four steps to upgrade the Master.
+
+. Shutdown both active and standby Masters (Your cluster will continue to serve reads and writes without interruption).
+. Set the property hbase.procedure.upgrade-to-2-2 to true in hbase-site.xml for the Master, and start only one Master, still using the 2.1.1+ (or 2.0.3+) version.
+. Wait until the Master quits. Confirm that there is a 'READY TO ROLLING UPGRADE' message in the Master log as the cause of the shutdown. The Procedure Store is now empty.
+. Start new Masters with the new 2.2+ version.
+
+Then you can do a rolling upgrade of the RegionServers one by one. See link:https://issues.apache.org/jira/browse/HBASE-21075[HBASE-21075] for more details.
+
 [[upgrade2.0]]
 === Upgrading from 1.x to 2.x
@@ -332,7 +376,11 @@ As noted in the section <>, HBase 2.0+ requires a minimum o
 .HBCK must match HBase server version
 You *must not* use an HBase 1.x version of HBCK against an HBase 2.0+ cluster. HBCK is strongly tied to the HBase server version. Using the HBCK tool from an earlier release against an HBase 2.0+ cluster will destructively alter said cluster in unrecoverable ways.
-As of HBase 2.0, HBCK is a read-only tool that can report the status of some non-public system internals. You should not rely on the format nor content of these internals to remain consistent across HBase releases.
+As of HBase 2.0, HBCK (A.K.A _HBCK1_ or _hbck1_) is a read-only tool that can report the status of some non-public system internals but will often misread state because it does not understand the workings of hbase2.
+
+To read about HBCK's replacement, see <> in <>.
+
+IMPORTANT: Related, before you upgrade, ensure that _hbck1_ reports no `INCONSISTENCIES`. Fixing hbase1-type inconsistencies post-upgrade is an involved process.
 
 ////
 Link to a ref guide section on HBCK in 2.0 that explains use and calls out the inability of clients and server sides to detect version of each other.
 ////
@@ -614,6 +662,18 @@ Performance is also an area that is now under active review so look forward to
 improvement in coming releases (See link:https://issues.apache.org/jira/browse/HBASE-20188[HBASE-20188 TESTING Performance]).
+[[upgrade2.0.it.kerberos]] +.Integration Tests and Kerberos +Integration Tests (`IntegrationTests*`) used to rely on the Kerberos credential cache +for authentication against secured clusters. This used to lead to tests failing due +to authentication failures when the tickets in the credential cache expired. +As of hbase-2.0.0 (and hbase-1.3.0+), the integration test clients will make use +of the configuration properties `hbase.client.keytab.file` and +`hbase.client.kerberos.principal`. They are required. The clients will perform a +login from the configured keytab file and automatically refresh the credentials +in the background for the process lifetime (See +link:https://issues.apache.org/jira/browse/HBASE-16231[HBASE-16231]). + [[upgrade2.0.compaction.throughput.limit]] .Default Compaction Throughput HBase 2.x comes with default limits to the speed at which compactions can execute. This @@ -706,6 +766,7 @@ rolling upgrade of a 1.4 cluster. .Pre-Requirements * Upgrade to the latest 1.4.x release. Pre 1.4 releases may also work but are not tested, so please upgrade to 1.4.3+ before upgrading to 2.x, unless you are an expert and familiar with the region assignment and crash processing. See the section <> on how to upgrade to 1.4.x. * Make sure that the zk-less assignment is enabled, i.e, set `hbase.assignment.usezk` to `false`. This is the most important thing. It allows the 1.x master to assign/unassign regions to/from 2.x region servers. See the release note section of link:https://issues.apache.org/jira/browse/HBASE-11059[HBASE-11059] on how to migrate from zk based assignment to zk less assignment. +* Before you upgrade, ensure that _hbck1_ reports no `INCONSISTENCIES`. Fixing hbase1-type inconsistencies post-upgrade is an involved process. * We have tested rolling upgrading from 1.4.3 to 2.1.0, but it should also work if you want to upgrade to 2.0.x. .Instructions @@ -726,6 +787,7 @@ NOTE: If you have success running this prescription, please notify the dev list To upgrade an existing HBase 1.x cluster, you should: +* Ensure that _hbck1_ reports no `INCONSISTENCIES`. Fixing hbase1-type inconsistencies post-upgrade is an involved process. Fix all _hbck1_ complaints before proceeding. * Clean shutdown of existing 1.x cluster * Update coprocessors * Upgrade Master roles first @@ -764,6 +826,11 @@ Notes: Doing a raw scan will now return results that have expired according to TTL settings. +[[upgrade1.3]] +=== Upgrading from pre-1.3 to 1.3+ +If running Integration Tests under Kerberos, see <>. + + [[upgrade1.0]] === Upgrading to 1.x diff --git a/src/main/asciidoc/_chapters/zookeeper.adoc b/src/main/asciidoc/_chapters/zookeeper.adoc index 75dd71db4e7..98fc4980ef3 100644 --- a/src/main/asciidoc/_chapters/zookeeper.adoc +++ b/src/main/asciidoc/_chapters/zookeeper.adoc @@ -137,7 +137,7 @@ Just make sure to set `HBASE_MANAGES_ZK` to `false` if you want it to stay For more information about running a distinct ZooKeeper cluster, see the ZooKeeper link:https://zookeeper.apache.org/doc/current/zookeeperStarted.html[Getting Started Guide]. 
-Additionally, see the link:https://wiki.apache.org/hadoop/ZooKeeper/FAQ#A7[ZooKeeper Wiki] or the link:https://zookeeper.apache.org/doc/r3.3.3/zookeeperAdmin.html#sc_zkMulitServerSetup[ZooKeeper +Additionally, see the link:https://cwiki.apache.org/confluence/display/HADOOP2/ZooKeeper+FAQ#ZooKeeperFAQ-7[ZooKeeper Wiki] or the link:https://zookeeper.apache.org/doc/r3.4.10/zookeeperAdmin.html#sc_zkMulitServerSetup[ZooKeeper documentation] for more information on ZooKeeper sizing. [[zk.sasl.auth]] diff --git a/src/main/asciidoc/book.adoc b/src/main/asciidoc/book.adoc index a3ef047f6a1..e0df010283a 100644 --- a/src/main/asciidoc/book.adoc +++ b/src/main/asciidoc/book.adoc @@ -38,6 +38,7 @@ :experimental: :source-language: java :leveloffset: 0 +:stem: // Logo for HTML -- doesn't render in PDF ifdef::backend-html5[] @@ -64,11 +65,13 @@ include::_chapters/mapreduce.adoc[] include::_chapters/security.adoc[] include::_chapters/architecture.adoc[] include::_chapters/hbase_mob.adoc[] +include::_chapters/snapshot_scanner.adoc[] include::_chapters/inmemory_compaction.adoc[] include::_chapters/offheap_read_write.adoc[] include::_chapters/hbase_apis.adoc[] include::_chapters/external_apis.adoc[] include::_chapters/thrift_filter_language.adoc[] +include::_chapters/spark.adoc[] include::_chapters/cp.adoc[] include::_chapters/performance.adoc[] include::_chapters/profiler.adoc[] @@ -82,12 +85,12 @@ include::_chapters/pv2.adoc[] include::_chapters/amv2.adoc[] include::_chapters/zookeeper.adoc[] include::_chapters/community.adoc[] +include::_chapters/hbtop.adoc[] = Appendix include::_chapters/appendix_contributing_to_documentation.adoc[] include::_chapters/faq.adoc[] -include::_chapters/hbck_in_depth.adoc[] include::_chapters/appendix_acl_matrix.adoc[] include::_chapters/compression.adoc[] include::_chapters/sql.adoc[] diff --git a/src/site/asciidoc/acid-semantics.adoc b/src/site/asciidoc/acid-semantics.adoc index 0b56aa8e136..b557165cb5b 100644 --- a/src/site/asciidoc/acid-semantics.adoc +++ b/src/site/asciidoc/acid-semantics.adoc @@ -82,7 +82,7 @@ NOTE:This is not true _across rows_ for multirow batch mutations. A scan is *not* a consistent view of a table. Scans do *not* exhibit _snapshot isolation_. Rather, scans have the following properties: -. Any row returned by the scan will be a consistent view (i.e. that version of the complete row existed at some point in time)footnoteref[consistency,A consistent view is not guaranteed intra-row scanning -- i.e. fetching a portion of a row in one RPC then going back to fetch another portion of the row in a subsequent RPC. Intra-row scanning happens when you set a limit on how many values to return per Scan#next (See link:http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Scan.html#setBatch(int)"[Scan#setBatch(int)]).] +. Any row returned by the scan will be a consistent view (i.e. that version of the complete row existed at some point in time)footnoteref[consistency,A consistent view is not guaranteed intra-row scanning -- i.e. fetching a portion of a row in one RPC then going back to fetch another portion of the row in a subsequent RPC. Intra-row scanning happens when you set a limit on how many values to return per Scan#next (See link:https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Scan.html#setBatch(int)"[Scan#setBatch(int)]).] . A scan will always reflect a view of the data _at least as new as_ the beginning of the scan. This satisfies the visibility guarantees enumerated below. .. 
For example, if client A writes data X and then communicates via a side channel to client B, any scans started by client B will contain data at least as new as X.
.. A scan _must_ reflect all mutations committed prior to the construction of the scanner, and _may_ reflect some mutations committed subsequent to the construction of the scanner.
@@ -115,4 +115,4 @@ All of the above guarantees must be possible within Apache HBase. For users who
== More Information
-For more information, see the link:book.html#client[client architecture] and link:book.html#datamodel[data model] sections in the Apache HBase Reference Guide.
+For more information, see the link:book.html#client[client architecture] and link:book.html#datamodel[data model] sections in the Apache HBase Reference Guide.
diff --git a/src/site/asciidoc/bulk-loads.adoc b/src/site/asciidoc/bulk-loads.adoc
index 8fc9a1a1a5e..fc320d88fde 100644
--- a/src/site/asciidoc/bulk-loads.adoc
+++ b/src/site/asciidoc/bulk-loads.adoc
@@ -20,3 +20,4 @@ under the License.
= Bulk Loads in Apache HBase (TM)
This page has been retired. The contents have been moved to the link:book.html#arch.bulk.load[Bulk Loading] section in the Reference Guide.
+
diff --git a/src/site/asciidoc/export_control.adoc b/src/site/asciidoc/export_control.adoc
index f6e5e181838..1bbefb50a2a 100644
--- a/src/site/asciidoc/export_control.adoc
+++ b/src/site/asciidoc/export_control.adoc
@@ -29,11 +29,11 @@ encryption software, to see if this is permitted.
See the link:http://www.wassenaar.org/[Wassenaar Arrangement] for more information.
-The U.S. Government Department of Commerce, Bureau of Industry and Security
-(BIS), has classified this software as Export Commodity Control Number (ECCN)
-5D002.C.1, which includes information security software using or performing
+The U.S. Government Department of Commerce, Bureau of Industry and Security
+(BIS), has classified this software as Export Commodity Control Number (ECCN)
+5D002.C.1, which includes information security software using or performing
cryptographic functions with asymmetric algorithms. The form and manner of this
-Apache Software Foundation distribution makes it eligible for export under the
+Apache Software Foundation distribution makes it eligible for export under the
License Exception ENC Technology Software Unrestricted (TSU) exception (see the
BIS Export Administration Regulations, Section 740.13) for both object code and
source code.
diff --git a/src/site/asciidoc/index.adoc b/src/site/asciidoc/index.adoc
index 9b31c493bc5..dd19a99d05b 100644
--- a/src/site/asciidoc/index.adoc
+++ b/src/site/asciidoc/index.adoc
@@ -20,7 +20,7 @@ under the License.
= Apache HBase™ Home
.Welcome to Apache HBase(TM)
-link:http://www.apache.org/[Apache HBase(TM)] is the link:http://hadoop.apache.org[Hadoop] database, a distributed, scalable, big data store.
+link:https://www.apache.org/[Apache HBase(TM)] is the link:https://hadoop.apache.org[Hadoop] database, a distributed, scalable, big data store.
.When Would I Use Apache HBase?
Use Apache HBase when you need random, realtime read/write access to your Big Data. +
diff --git a/src/site/asciidoc/metrics.adoc b/src/site/asciidoc/metrics.adoc
index 41db2a05b88..146b7e1ae97 100644
--- a/src/site/asciidoc/metrics.adoc
+++ b/src/site/asciidoc/metrics.adoc
@@ -20,13 +20,13 @@ under the License.
= Apache HBase (TM) Metrics
== Introduction
-Apache HBase (TM) emits Hadoop link:http://hadoop.apache.org/core/docs/current/api/org/apache/hadoop/metrics/package-summary.html[metrics].
+Apache HBase (TM) emits Hadoop link:https://hadoop.apache.org/core/docs/stable/api/org/apache/hadoop/metrics/package-summary.html[metrics].
== Setup
-First read up on Hadoop link:http://hadoop.apache.org/core/docs/current/api/org/apache/hadoop/metrics/package-summary.html[metrics].
+First read up on Hadoop link:https://hadoop.apache.org/core/docs/stable/api/org/apache/hadoop/metrics/package-summary.html[metrics].
-If you are using ganglia, the link:http://wiki.apache.org/hadoop/GangliaMetrics[GangliaMetrics] wiki page is useful read.
+If you are using ganglia, the link:https://cwiki.apache.org/confluence/display/HADOOP2/GangliaMetrics[GangliaMetrics] wiki page is useful read.
To have HBase emit metrics, edit `$HBASE_HOME/conf/hadoop-metrics.properties` and enable metric 'contexts' per plugin. As of this writing, hadoop supports *file* and *ganglia* plugins. Yes, the hbase metrics files is named hadoop-metrics rather than _hbase-metrics_ because currently at least the hadoop metrics system has the properties filename hardcoded. Per metrics _context_, comment out the NullContext and enable one or more plugins instead.
@@ -41,9 +41,9 @@ The _jvm_ context is useful for long-term stats on running hbase jvms -- memory
== Using with JMX
-In addition to the standard output contexts supported by the Hadoop
-metrics package, you can also export HBase metrics via Java Management
-Extensions (JMX). This will allow viewing HBase stats in JConsole or
+In addition to the standard output contexts supported by the Hadoop
+metrics package, you can also export HBase metrics via Java Management
+Extensions (JMX). This will allow viewing HBase stats in JConsole or
any other JMX client.
=== Enable HBase stats collection
@@ -67,7 +67,7 @@ rpc.period=60
=== Setup JMX Remote Access
For remote access, you will need to configure JMX remote passwords and access profiles. Create the files:
-`$HBASE_HOME/conf/jmxremote.passwd` (set permissions
+`$HBASE_HOME/conf/jmxremote.passwd` (set permissions
to 600)::
+
----
monitorRole monitorpass
@@ -98,4 +98,5 @@ After restarting the processes you want to monitor, you should now be able to ru
== Understanding HBase Metrics
-For more information on understanding HBase metrics, see the link:book.html#hbase_metrics[metrics section] in the Apache HBase Reference Guide.
+For more information on understanding HBase metrics, see the link:book.html#hbase_metrics[metrics section] in the Apache HBase Reference Guide.
+
diff --git a/src/site/asciidoc/old_news.adoc b/src/site/asciidoc/old_news.adoc
index 75179e0114a..9c8d0885676 100644
--- a/src/site/asciidoc/old_news.adoc
+++ b/src/site/asciidoc/old_news.adoc
@@ -57,7 +57,7 @@ October 25th, 2012:: link:http://www.meetup.com/HBase-NYC/events/81728932/[Strat
September 11th, 2012:: link:http://www.meetup.com/hbaseusergroup/events/80621872/[Contributor's Pow-Wow at HortonWorks HQ.]
-August 8th, 2012:: link:http://www.apache.org/dyn/closer.cgi/hbase/[Apache HBase 0.94.1 is available for download]
+August 8th, 2012:: link:https://www.apache.org/dyn/closer.lua/hbase/[Apache HBase 0.94.1 is available for download]
June 15th, 2012:: link:http://www.meetup.com/hbaseusergroup/events/59829652/[Birds-of-a-feather] in San Jose, day after:: link:http://hadoopsummit.org[Hadoop Summit]
@@ -69,9 +69,9 @@ March 27th, 2012:: link:http://www.meetup.com/hbaseusergroup/events/56021562/[Me
January 19th, 2012:: link:http://www.meetup.com/hbaseusergroup/events/46702842/[Meetup @ EBay]
-January 23rd, 2012:: Apache HBase 0.92.0 released. link:http://www.apache.org/dyn/closer.cgi/hbase/[Download it!]
+January 23rd, 2012:: Apache HBase 0.92.0 released. link:https://www.apache.org/dyn/closer.lua/hbase/[Download it!]
-December 23rd, 2011:: Apache HBase 0.90.5 released. link:http://www.apache.org/dyn/closer.cgi/hbase/[Download it!]
+December 23rd, 2011:: Apache HBase 0.90.5 released. link:https://www.apache.org/dyn/closer.lua/hbase/[Download it!]
November 29th, 2011:: link:http://www.meetup.com/hackathon/events/41025972/[Developer Pow-Wow in SF] at Salesforce HQ
@@ -83,9 +83,9 @@ June 30th, 2011:: link:http://www.meetup.com/hbaseusergroup/events/20572251/[HBa
June 8th, 2011:: link:http://berlinbuzzwords.de/wiki/hbase-workshop-and-hackathon[HBase Hackathon] in Berlin to coincide with:: link:http://berlinbuzzwords.de/[Berlin Buzzwords]
-May 19th, 2011: Apache HBase 0.90.3 released. link:http://www.apache.org/dyn/closer.cgi/hbase/[Download it!]
+May 19th, 2011: Apache HBase 0.90.3 released. link:https://www.apache.org/dyn/closer.lua/hbase/[Download it!]
-April 12th, 2011: Apache HBase 0.90.2 released. link:http://www.apache.org/dyn/closer.cgi/hbase/[Download it!]
+April 12th, 2011: Apache HBase 0.90.2 released. link:https://www.apache.org/dyn/closer.lua/hbase/[Download it!]
March 21st, 2011:: link:http://www.meetup.com/hackathon/events/16770852/[HBase 0.92 Hackathon at StumbleUpon, SF]
February 22nd, 2011:: link:http://www.meetup.com/hbaseusergroup/events/16492913/[HUG12: February HBase User Group at StumbleUpon SF]
@@ -97,7 +97,7 @@ October 12th, 2010:: HBase-related presentations by core contributors and users
October 11th, 2010:: link:http://www.meetup.com/hbaseusergroup/calendar/14606174/[HUG-NYC: HBase User Group NYC Edition] (Night before Hadoop World)
June 30th, 2010:: link:http://www.meetup.com/hbaseusergroup/calendar/13562846/[Apache HBase Contributor Workshop] (Day after Hadoop Summit)
-May 10th, 2010:: Apache HBase graduates from Hadoop sub-project to Apache Top Level Project
+May 10th, 2010:: Apache HBase graduates from Hadoop sub-project to Apache Top Level Project
April 19, 2010:: Signup for link:http://www.meetup.com/hbaseusergroup/calendar/12689490/[HBase User Group Meeting, HUG10] hosted by Trend Micro
@@ -105,7 +105,7 @@ March 10th, 2010:: link:http://www.meetup.com/hbaseusergroup/calendar/12689351/[
January 27th, 2010:: Sign up for the link:http://www.meetup.com/hbaseusergroup/calendar/12241393/[HBase User Group Meeting, HUG8], at StumbleUpon in SF
-September 8th, 2010:: Apache HBase 0.20.0 is faster, stronger, slimmer, and sweeter tasting than any previous Apache HBase release. Get it off the link:http://www.apache.org/dyn/closer.cgi/hbase/[Releases] page.
+September 8th, 2010:: Apache HBase 0.20.0 is faster, stronger, slimmer, and sweeter tasting than any previous Apache HBase release. Get it off the link:https://www.apache.org/dyn/closer.lua/hbase/[Releases] page.
November 2-6th, 2009:: link:http://dev.us.apachecon.com/c/acus2009/[ApacheCon] in Oakland. The Apache Foundation will be celebrating its 10th anniversary in beautiful Oakland by the Bay. Lots of good talks and meetups including an HBase presentation by a couple of the lads.
@@ -118,3 +118,4 @@ June, 2009:: HBase at HadoopSummit2009 and at NOSQL: See the link:https://hbase
March 3rd, 2009 :: HUG6 -- link:http://www.meetup.com/hbaseusergroup/calendar/9764004/[HBase User Group 6]
January 30th, 2009:: LA Hbackathon: link:http://www.meetup.com/hbasela/calendar/9450876/[HBase January Hackathon Los Angeles] at link:http://streamy.com[Streamy] in Manhattan Beach
+
diff --git a/src/site/asciidoc/pseudo-distributed.adoc b/src/site/asciidoc/pseudo-distributed.adoc
index ec6f53de74b..d13c63b0836 100644
--- a/src/site/asciidoc/pseudo-distributed.adoc
+++ b/src/site/asciidoc/pseudo-distributed.adoc
@@ -20,3 +20,4 @@ under the License.
= Running Apache HBase (TM) in pseudo-distributed mode
This page has been retired. The contents have been moved to the link:book.html#distributed[Distributed Operation: Pseudo- and Fully-distributed modes] section in the Reference Guide.
+
diff --git a/src/site/asciidoc/resources.adoc b/src/site/asciidoc/resources.adoc
index 5f2d5d4a28f..fef217e4287 100644
--- a/src/site/asciidoc/resources.adoc
+++ b/src/site/asciidoc/resources.adoc
@@ -24,3 +24,4 @@ HBase: The Definitive Guide:: link:http://shop.oreilly.com/product/0636920014348
HBase In Action:: link:http://www.manning.com/dimidukkhurana[HBase In Action] By Nick Dimiduk and Amandeep Khurana. Publisher: Manning, MEAP Began: January 2012, Softbound print: Fall 2012, Pages: 350.
HBase Administration Cookbook:: link:http://www.packtpub.com/hbase-administration-for-optimum-database-performance-cookbook/book[HBase Administration Cookbook] by Yifeng Jiang. Publisher: PACKT Publishing, Release: Expected August 2012, Pages: 335.
+
diff --git a/src/site/asciidoc/sponsors.adoc b/src/site/asciidoc/sponsors.adoc
index bf93557b9c7..e6fec1b1a2f 100644
--- a/src/site/asciidoc/sponsors.adoc
+++ b/src/site/asciidoc/sponsors.adoc
@@ -19,11 +19,11 @@ under the License.
= Apache HBase(TM) Sponsors
-First off, thanks to link:http://www.apache.org/foundation/thanks.html[all who sponsor] our parent, the Apache Software Foundation.
+First off, thanks to link:https://www.apache.org/foundation/thanks.html[all who sponsor] our parent, the Apache Software Foundation.
The below companies have been gracious enough to provide their commerical tool offerings free of charge to the Apache HBase(TM) project.
-* The crew at link:http://www.ej-technologies.com/[ej-technologies] have been letting us use link:http://www.ej-technologies.com/products/jprofiler/overview.html[JProfiler] for years now.
+* The crew at link:http://www.ej-technologies.com/[ej-technologies] have been letting us use link:http://www.ej-technologies.com/products/jprofiler/overview.html[JProfiler] for years now.
* The lads at link:http://headwaysoftware.com/[headway software] have given us a license for link:http://headwaysoftware.com/products/?code=Restructure101[Restructure101] so we can untangle our interdependency mess.
@@ -32,4 +32,5 @@ The below companies have been gracious enough to provide their commerical tool o
* Thank you to Boris at link:http://www.vectorportal.com/[Vector Portal] for granting us a license on the image on which our logo is based.
== Sponsoring the Apache Software Foundation">
-To contribute to the Apache Software Foundation, a good idea in our opinion, see the link:http://www.apache.org/foundation/sponsorship.html[ASF Sponsorship] page.
+To contribute to the Apache Software Foundation, a good idea in our opinion, see the link:https://www.apache.org/foundation/sponsorship.html[ASF Sponsorship] page.
+
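Not part of the patch itself: for readers following the metrics page the hunks above touch, here is a minimal sketch of the kind of `$HBASE_HOME/conf/hadoop-metrics.properties` edit that page describes, enabling a real plugin per metrics context in place of the NullContext. It assumes the Ganglia 3.1 and file plugins that ship with the Hadoop metrics package; the host, port, period, and file path are illustrative values, not taken from this patch.

----
# Sketch of $HBASE_HOME/conf/hadoop-metrics.properties -- values are examples only.
# Per metrics context, comment out the NullContext and enable one plugin instead.

# hbase context, emitted to Ganglia (assumes a gmond listening at the given host:port):
hbase.class=org.apache.hadoop.metrics.ganglia.GangliaContext31
hbase.period=10
hbase.servers=gmond.example.com:8649

# jvm context, written to a local file via the file plugin instead:
jvm.class=org.apache.hadoop.metrics.file.FileContext
jvm.period=10
jvm.fileName=/tmp/hbase_jvm_metrics.log
----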