diff --git a/src/main/docbkx/performance.xml b/src/main/docbkx/performance.xml
index c37cd93022d..0e2c5305332 100644
--- a/src/main/docbkx/performance.xml
+++ b/src/main/docbkx/performance.xml
@@ -687,6 +687,27 @@ more discussion around short circuit reads.
 on the difference between the old and new implementations. See
 Hadoop shortcircuit reads configuration page
 for how to enable the latter, better version of shortcircuit.
+For example, here is a minimal config. enabling short-circuit reads added to
+hbase-site.xml:
+<programlisting><![CDATA[<property>
+  <name>dfs.client.read.shortcircuit</name>
+  <value>true</value>
+  <description>
+    This configuration parameter turns on short-circuit local reads.
+  </description>
+</property>
+<property>
+  <name>dfs.domain.socket.path</name>
+  <value>/home/stack/sockets/short_circuit_read_socket_PORT</value>
+  <description>
+    Optional. This is a path to a UNIX domain socket that will be used for
+    communication between the DataNode and local HDFS clients.
+    If the string "_PORT" is present in this path, it will be replaced by the
+    TCP port of the DataNode.
+  </description>
+</property>]]></programlisting>
+Be careful about permissions for the directory that hosts the shared domain
+socket; dfsclient will complain if open to other than the hbase user.
 If you are running on an old Hadoop, one that is without HDFS-347 but that has
@@ -699,14 +720,6 @@ This has to be the user that started HBase. Then in hbase-site.xml, set dfs.client.read.shortcircuit to be true
-
- With short-circuit enabled, for more speed-up, it is recommended that you have HBase do the checksum validation. HBase writes
- checksums inline with the data whereas HDFS keeps checksums in a separate file that it must seek independent of
- the data file. Set to have HBase do checksum validation.
- While you might think it safe to set the HDFS configuration parameter "dfs.client.read.shortcircuit.skip.checksum",
- you should NOT; HBase checksum validation covers hfiles only, not WAL files and if the HBase checksum validation fails,
- we will fall back on HDFS's.
- Services -- at least the HBase RegionServers -- will need to be restarted in order to pick up the new configurations.