HADOOP-13839. Fix outdated tracing documentation. Contributed by Elek, Marton.
parent a3f4ee53c8
commit 04ccb1bec5
@@ -107,6 +107,9 @@ Release 2.7.4 - UNRELEASED
 HADOOP-11859. PseudoAuthenticationHandler fails with httpcomponents v4.4.
 (Eugene Koifman via jitendra)
 
+HADOOP-13839. Fix outdated tracing documentation.
+(Elek, Marton via iwasakims)
+
 Release 2.7.3 - 2016-08-25
 
 INCOMPATIBLE CHANGES

@@ -12,6 +12,7 @@
 limitations under the License. See accompanying LICENSE file.
 -->
 
+
 Enabling Dapper-like Tracing in Hadoop
 ======================================
 
@@ -25,6 +26,7 @@ Enabling Dapper-like Tracing in Hadoop
 * [Starting tracing spans by HTrace API](#Starting_tracing_spans_by_HTrace_API)
 * [Sample code for tracing](#Sample_code_for_tracing)
 
+
 Dapper-like Tracing in Hadoop
 -----------------------------
 
@@ -37,14 +39,14 @@ Setting up tracing is quite simple, however it requires some very minor changes
 
 ### Samplers
 
-Configure the samplers in `core-site.xml` property: `hadoop.htrace.sampler`.
+Configure the samplers in `core-site.xml` property: `dfs.htrace.sampler`.
 The value can be NeverSampler, AlwaysSampler or ProbabilitySampler.
 NeverSampler: HTrace is OFF for all spans;
 AlwaysSampler: HTrace is ON for all spans;
 ProbabilitySampler: HTrace is ON for some percentage% of top-level spans.
 
 <property>
-<name>hadoop.htrace.sampler</name>
+<name>dfs.htrace.sampler</name>
 <value>NeverSampler</value>
 </property>
 
@@ -61,18 +63,18 @@ by putting a comma separated list of the fully-qualified class name of classes i
 in `core-site.xml` property: `hadoop.htrace.spanreceiver.classes`.
 
 <property>
-<name>hadoop.htrace.spanreceiver.classes</name>
+<name>dfs.htrace.spanreceiver.classes</name>
 <value>org.apache.htrace.impl.LocalFileSpanReceiver</value>
 </property>
 <property>
-<name>hadoop.htrace.local-file-span-receiver.path</name>
+<name>dfs.htrace.local-file-span-receiver.path</name>
 <value>/var/log/hadoop/htrace.out</value>
 </property>
 
 You can omit package name prefix if you use span receiver bundled with HTrace.
 
 <property>
-<name>hadoop.htrace.spanreceiver.classes</name>
+<name>dfs.htrace.spanreceiver.classes</name>
 <value>LocalFileSpanReceiver</value>
 </property>
 
@@ -83,30 +85,32 @@ you can use `ZipkinSpanReceiver` which uses
 [Zipkin](https://github.com/twitter/zipkin) for collecting and displaying tracing data.
 
 In order to use `ZipkinSpanReceiver`,
-you need to download and setup [Zipkin](https://github.com/twitter/zipkin) first.
+you need to download and setup [Zipkin](https://github.com/twitter/zipkin) first. With docker you can start an experimental zipkin server with the following command.
 
+```
+docker run -d -p 9411:9411 -p 9410:9410 openzipkin/zipkin
+```
+
 you also need to add the jar of `htrace-zipkin` to the classpath of Hadoop on each node.
-Here is example setup procedure.
+The easiest way to achieve this is downloading the binary jars from maven central add put it to the hadoop lib directory.
 
-$ git clone https://github.com/cloudera/htrace
-$ cd htrace/htrace-zipkin
-$ mvn compile assembly:single
-$ cp target/htrace-zipkin-*-jar-with-dependencies.jar $HADOOP_HOME/share/hadoop/common/lib/
+$ wget http://repo1.maven.org/maven2/org/apache/thrift/libthrift/0.9.0/libthrift-0.9.0.jar -O $HADOOP_HOME/share/hadoop/common/lib/libthrift-0.9.0.jar
+$ wget http://repo1.maven.org/maven2/org/apache/htrace/htrace-zipkin/3.1.0-incubating/htrace-zipkin-3.1.0-incubating.jar -O $HADOOP_HOME/share/hadoop/common/lib/htrace-zipkin-3.1.0-incubating.jar
 
 The sample configuration for `ZipkinSpanReceiver` is shown below.
 By adding these to `core-site.xml` of NameNode and DataNodes, `ZipkinSpanReceiver` is initialized on the startup.
 You also need this configuration on the client node in addition to the servers.
 
 <property>
-<name>hadoop.htrace.spanreceiver.classes</name>
+<name>dfs.htrace.spanreceiver.classes</name>
 <value>ZipkinSpanReceiver</value>
 </property>
 <property>
-<name>hadoop.htrace.zipkin.collector-hostname</name>
+<name>dfs.htrace.zipkin.collector-hostname</name>
 <value>192.168.1.2</value>
 </property>
 <property>
-<name>hadoop.htrace.zipkin.collector-port</name>
+<name>dfs.htrace.zipkin.collector-port</name>
 <value>9410</value>
 </property>
 
@@ -136,8 +140,8 @@ You need to run the command against all servers if you want to update the config
 You need to specify the class name of span receiver as argument of `-class` option.
 You can specify the configuration associated with span receiver by `-Ckey=value` options.
 
-$ hadoop trace -add -class LocalFileSpanReceiver -Chadoop.htrace.local-file-span-receiver.path=/tmp/htrace.out -host 192.168.56.2:9000
-Added trace span receiver 2 with configuration hadoop.htrace.local-file-span-receiver.path = /tmp/htrace.out
+$ hadoop trace -add -class LocalFileSpanReceiver -Clocal-file-span-receiver.path=/tmp/htrace.out -host 192.168.56.2:9000
+Added trace span receiver 2 with configuration local-file-span-receiver.path = /tmp/htrace.out
 
 $ hadoop trace -list -host 192.168.56.2:9000
 ID CLASS
@@ -159,7 +163,7 @@ In addition, you need to initialize `SpanReceiver` once per process.
 
 ...
 
-SpanReceiverHost.getInstance(new HdfsConfiguration());
+SpanReceiverHost.get(new HdfsConfiguration(), "dfs");
 
 ...
 
@@ -189,7 +193,7 @@ which start tracing span before invoking HDFS shell command.
 FsShell shell = new FsShell();
 conf.setQuietMode(false);
 shell.setConf(conf);
-SpanReceiverHost.getInstance(conf);
+SpanReceiverHost.get(conf, "dfs");
 int res = 0;
 TraceScope ts = null;
 try {
@@ -207,3 +211,5 @@ You can compile and execute this code as shown below.
 
 $ javac -cp `hadoop classpath` TracingFsShell.java
 $ java -cp .:`hadoop classpath` TracingFsShell -ls /
+
+The configuration prefix for the client-side htrace configuration is defined in the SpanReceiverHost.get call. In the case above you should use the dfs prefix as on the server side. (*dfs*.htrace…. configuration values)
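For reference, the `dfs.`-prefixed keys on the new side of this diff could be gathered into one `core-site.xml` fragment, roughly as sketched below. This is an illustrative consolidation rather than part of the commit: the property names and the file path are the samples from the diff, `AlwaysSampler` is picked only because it is one of the three listed sampler values, and the fragment has not been checked against a running cluster.

```xml
<configuration>
  <!-- Sampler: NeverSampler, AlwaysSampler or ProbabilitySampler (per the doc text in the diff). -->
  <property>
    <name>dfs.htrace.sampler</name>
    <value>AlwaysSampler</value>
  </property>
  <!-- Span receiver bundled with HTrace, so the package prefix is omitted. -->
  <property>
    <name>dfs.htrace.spanreceiver.classes</name>
    <value>LocalFileSpanReceiver</value>
  </property>
  <!-- Sample output path taken from the diff; adjust for the local environment. -->
  <property>
    <name>dfs.htrace.local-file-span-receiver.path</name>
    <value>/var/log/hadoop/htrace.out</value>
  </property>
</configuration>
```

According to the last line added by the diff, a client that calls `SpanReceiverHost.get(conf, "dfs")` reads its htrace settings under this same `dfs` prefix.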