diff --git a/hadoop-common-project/hadoop-common/CHANGES.txt b/hadoop-common-project/hadoop-common/CHANGES.txt
index 0c533c9cb9e..1e65a6d95b0 100644
--- a/hadoop-common-project/hadoop-common/CHANGES.txt
+++ b/hadoop-common-project/hadoop-common/CHANGES.txt
@@ -107,6 +107,9 @@ Release 2.7.4 - UNRELEASED
     HADOOP-11859. PseudoAuthenticationHandler fails with httpcomponents v4.4.
     (Eugene Koifman via jitendra)
 
+    HADOOP-13839. Fix outdated tracing documentation.
+    (Elek, Marton via iwasakims)
+
 Release 2.7.3 - 2016-08-25
 
   INCOMPATIBLE CHANGES
diff --git a/hadoop-common-project/hadoop-common/src/site/markdown/Tracing.md b/hadoop-common-project/hadoop-common/src/site/markdown/Tracing.md
index 3ef35b23780..66b0ede0933 100644
--- a/hadoop-common-project/hadoop-common/src/site/markdown/Tracing.md
+++ b/hadoop-common-project/hadoop-common/src/site/markdown/Tracing.md
@@ -12,6 +12,7 @@
   limitations under the License. See accompanying LICENSE file.
 -->
 
+
 Enabling Dapper-like Tracing in Hadoop
 ======================================
 
@@ -24,7 +25,8 @@ Enabling Dapper-like Tracing in Hadoop
 * [Dynamic update of tracing configuration](#Dynamic_update_of_tracing_configuration)
 * [Starting tracing spans by HTrace API](#Starting_tracing_spans_by_HTrace_API)
 * [Sample code for tracing](#Sample_code_for_tracing)
-
+
+
 Dapper-like Tracing in Hadoop
 -----------------------------
 
@@ -37,14 +39,14 @@ Setting up tracing is quite simple, however it requires some very minor changes
 
 ### Samplers
 
-Configure the samplers in `core-site.xml` property: `hadoop.htrace.sampler`.
+Configure the samplers in `core-site.xml` property: `dfs.htrace.sampler`.
 The value can be NeverSampler, AlwaysSampler or ProbabilitySampler.
 NeverSampler: HTrace is OFF for all spans;
 AlwaysSampler: HTrace is ON for all spans;
 ProbabilitySampler: HTrace is ON for some percentage% of top-level spans.
       <property>
-        <name>hadoop.htrace.sampler</name>
+        <name>dfs.htrace.sampler</name>
         <value>NeverSampler</value>
       </property>
@@ -61,18 +63,18 @@ by putting a comma separated list of the fully-qualified class name of classes i
 in `core-site.xml` property: `hadoop.htrace.spanreceiver.classes`.
 
       <property>
-        <name>hadoop.htrace.spanreceiver.classes</name>
+        <name>dfs.htrace.spanreceiver.classes</name>
         <value>org.apache.htrace.impl.LocalFileSpanReceiver</value>
       </property>
       <property>
-        <name>hadoop.htrace.local-file-span-receiver.path</name>
+        <name>dfs.htrace.local-file-span-receiver.path</name>
         <value>/var/log/hadoop/htrace.out</value>
       </property>
 
 You can omit package name prefix if you use span receiver bundled with HTrace.
 
       <property>
-        <name>hadoop.htrace.spanreceiver.classes</name>
+        <name>dfs.htrace.spanreceiver.classes</name>
         <value>LocalFileSpanReceiver</value>
       </property>
@@ -83,30 +85,32 @@
 you can use `ZipkinSpanReceiver`
 which uses [Zipkin](https://github.com/twitter/zipkin)
 for collecting and displaying tracing data.
 
 In order to use `ZipkinSpanReceiver`,
-you need to download and setup [Zipkin](https://github.com/twitter/zipkin) first.
+you need to download and setup [Zipkin](https://github.com/twitter/zipkin) first. With Docker, you can start an experimental Zipkin server with the following command.
+
+```
+docker run -d -p 9411:9411 -p 9410:9410 openzipkin/zipkin
+```
 
 you also need to add the jar of `htrace-zipkin` to the classpath of Hadoop on each node.
-Here is example setup procedure.
+The easiest way to achieve this is to download the binary jars from Maven Central and put them into the Hadoop lib directory.
-    $ git clone https://github.com/cloudera/htrace
-    $ cd htrace/htrace-zipkin
-    $ mvn compile assembly:single
-    $ cp target/htrace-zipkin-*-jar-with-dependencies.jar $HADOOP_HOME/share/hadoop/common/lib/
+    $ wget http://repo1.maven.org/maven2/org/apache/thrift/libthrift/0.9.0/libthrift-0.9.0.jar -O $HADOOP_HOME/share/hadoop/common/lib/libthrift-0.9.0.jar
+    $ wget http://repo1.maven.org/maven2/org/apache/htrace/htrace-zipkin/3.1.0-incubating/htrace-zipkin-3.1.0-incubating.jar -O $HADOOP_HOME/share/hadoop/common/lib/htrace-zipkin-3.1.0-incubating.jar
 
 The sample configuration for `ZipkinSpanReceiver` is shown below.
 By adding these to `core-site.xml` of NameNode and DataNodes,
 `ZipkinSpanReceiver` is initialized on the startup.
 You also need this configuration on the client node in addition to the servers.
 
       <property>
-        <name>hadoop.htrace.spanreceiver.classes</name>
+        <name>dfs.htrace.spanreceiver.classes</name>
         <value>ZipkinSpanReceiver</value>
       </property>
       <property>
-        <name>hadoop.htrace.zipkin.collector-hostname</name>
+        <name>dfs.htrace.zipkin.collector-hostname</name>
         <value>192.168.1.2</value>
       </property>
       <property>
-        <name>hadoop.htrace.zipkin.collector-port</name>
+        <name>dfs.htrace.zipkin.collector-port</name>
         <value>9410</value>
       </property>
@@ -136,8 +140,8 @@
 You need to run the command against all servers if you want to update the configuration of all servers.
 You need to specify the class name of span receiver as argument of `-class` option.
 You can specify the configuration associated with span receiver by `-Ckey=value` options.
 
-    $ hadoop trace -add -class LocalFileSpanReceiver -Chadoop.htrace.local-file-span-receiver.path=/tmp/htrace.out -host 192.168.56.2:9000
-    Added trace span receiver 2 with configuration hadoop.htrace.local-file-span-receiver.path = /tmp/htrace.out
+    $ hadoop trace -add -class LocalFileSpanReceiver -Clocal-file-span-receiver.path=/tmp/htrace.out -host 192.168.56.2:9000
+    Added trace span receiver 2 with configuration local-file-span-receiver.path = /tmp/htrace.out
 
     $ hadoop trace -list -host 192.168.56.2:9000
     ID  CLASS
@@ -159,7 +163,7 @@
 In addition, you need to initialize `SpanReceiver` once per process.
 
         ...
-        SpanReceiverHost.getInstance(new HdfsConfiguration());
+        SpanReceiverHost.get(new HdfsConfiguration(), "dfs");
         ...
@@ -189,7 +193,7 @@
 which start tracing span before invoking HDFS shell command.
 
         FsShell shell = new FsShell();
         conf.setQuietMode(false);
         shell.setConf(conf);
-        SpanReceiverHost.getInstance(conf);
+        SpanReceiverHost.get(conf, "dfs");
         int res = 0;
         TraceScope ts = null;
         try {
@@ -207,3 +211,5 @@
 You can compile and execute this code as shown below.
 
     $ javac -cp `hadoop classpath` TracingFsShell.java
     $ java -cp .:`hadoop classpath` TracingFsShell -ls /
+
+The configuration prefix for the client-side HTrace configuration is defined by the second argument of the `SpanReceiverHost.get` call. In the case above you should use the `dfs` prefix, the same as on the server side (the `dfs.htrace.*` configuration values).
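
For reference, the renamed `dfs.htrace.*` keys touched by this patch combine into a single `core-site.xml` fragment along these lines. This is only an illustrative sketch: the sampler choice, hostname, port, and collector settings below are the example values already used in the patched Tracing.md, not recommendations.

```
<configuration>
  <!-- Sampler: NeverSampler, AlwaysSampler, or ProbabilitySampler -->
  <property>
    <name>dfs.htrace.sampler</name>
    <value>AlwaysSampler</value>
  </property>
  <!-- Span receiver; package prefix may be omitted for bundled receivers -->
  <property>
    <name>dfs.htrace.spanreceiver.classes</name>
    <value>ZipkinSpanReceiver</value>
  </property>
  <!-- Zipkin collector endpoint (example address from the doc) -->
  <property>
    <name>dfs.htrace.zipkin.collector-hostname</name>
    <value>192.168.1.2</value>
  </property>
  <property>
    <name>dfs.htrace.zipkin.collector-port</name>
    <value>9410</value>
  </property>
</configuration>
```

Per the doc, the same fragment would go into `core-site.xml` on the NameNode, the DataNodes, and any client node, since the `dfs` prefix passed to `SpanReceiverHost.get` must match these keys.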