diff --git a/CHANGES.txt b/CHANGES.txt index bba620899dc..bda14aabdfb 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -1122,6 +1122,8 @@ Release 0.21.0 - Unreleased HADOOP-6286. Fix bugs in related to URI handling in glob methods in FileContext. (Boris Shkolnik via suresh) + HADOOP-6292. Update native libraries guide. (Corinne Chandel via cdouglas) + Release 0.20.2 - Unreleased HADOOP-6231. Allow caching of filesystem instances to be disabled on a diff --git a/src/docs/src/documentation/content/xdocs/native_libraries.xml b/src/docs/src/documentation/content/xdocs/native_libraries.xml index 379d47cf435..8609605623f 100644 --- a/src/docs/src/documentation/content/xdocs/native_libraries.xml +++ b/src/docs/src/documentation/content/xdocs/native_libraries.xml @@ -26,90 +26,82 @@ -
- Purpose - -

Hadoop has native implementations of certain components for reasons of - both performance and non-availability of Java implementations. These - components are available in a single, dynamically-linked, native library. - On the *nix platform it is libhadoop.so. This document describes - the usage and details on how to build the native libraries.

-
- -
- Components - -

Hadoop currently has the following - - compression codecs as the native components:

- - -

Of the above, the availability of native hadoop libraries is imperative - for the gzip and bzip2 compression codecs to work.

-
+
+ Overview + +

This guide describes the native hadoop library and includes a small discussion about native shared libraries.

+

Note: Depending on your environment, the term "native libraries" could refer to all *.so's you need to compile; the term "native compression" could refer to all *.so's you need to compile that are specifically related to compression. Currently, however, this document addresses only the native hadoop library (libhadoop.so).

+ +
+ +
+ Native Hadoop Library + +

Hadoop has native implementations of certain components for + performance reasons and for non-availability of Java implementations. These + components are available in a single, dynamically-linked native library called + the native hadoop library. On the *nix platforms the library is named libhadoop.so.

+
Usage -

It is fairly simple to use the native hadoop libraries:

+

It is fairly easy to use the native hadoop library:

+
    +
  1. Review the components.
  2. Review the supported platforms.
  3. Either download a hadoop release, which will include a pre-built version of the native hadoop library, or build your own version of the native hadoop library. Whether you download or build, the name for the library is the same: libhadoop.so.
  4. Install the compression codec development packages (>zlib-1.2, >gzip-1.2):
     • If you download the library, install one or more development packages - whichever compression codecs you want to use with your deployment.
     • If you build the library, it is mandatory to install both development packages.
  5. Check the runtime log files.
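The last step, checking the runtime log files, can be scripted. The sketch below writes a sample NativeCodeLoader line to a temporary file so it is self-contained; in practice, point LOG at a real task log:

```shell
# Sketch: scan a Hadoop log for the NativeCodeLoader message that shows
# whether libhadoop.so was found. The sample line is written locally so the
# example runs anywhere; substitute a real task log for LOG in practice.
LOG=$(mktemp)
echo 'INFO util.NativeCodeLoader - Loaded the native-hadoop library' > "$LOG"
if grep -q 'Loaded the native-hadoop library' "$LOG"; then
  native_status="loaded"
else
  native_status="builtin-java fallback"
fi
echo "native hadoop library: $native_status"
rm -f "$LOG"
```

If the library cannot be loaded, the log instead reports that Hadoop is using builtin-java classes where applicable.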
+
+
+ Components +

The native hadoop library includes two components, the zlib and gzip compression codecs:

- -

The bin/hadoop script ensures that the native hadoop - library is on the library path via the system property - -Djava.library.path=<path>.

- -

To check everything went alright check the hadoop log files for:

- -

- - DEBUG util.NativeCodeLoader - Trying to load the custom-built - native-hadoop library... -
- - INFO util.NativeCodeLoader - Loaded the native-hadoop library - -

- -

If something goes wrong, then:

-

- - INFO util.NativeCodeLoader - Unable to load native-hadoop library for - your platform... using builtin-java classes where applicable - -

+

The native hadoop library is imperative for gzip to work.

Supported Platforms -

Hadoop native library is supported only on *nix platforms only. - Unfortunately it is known not to work on Cygwin - and Mac OS X and has mainly been used on the - GNU/Linux platform.

+

The native hadoop library is supported on *nix platforms only. The library does not work with Cygwin or the Mac OS X platform.

-

It has been tested on the following GNU/Linux distributions:

+

The native hadoop library is mainly used on the GNU/Linux platform and has been tested on these distributions:

-

On all the above platforms a 32/64 bit Hadoop native library will work +

On all the above distributions a 32/64 bit native hadoop library will work with a respective 32/64 bit jvm.

- Building Native Hadoop Libraries + Download -

Hadoop native library is written in - ANSI C and built using - the GNU autotools-chain (autoconf, autoheader, automake, autoscan, libtool). - This means it should be straight-forward to build them on any platform with - a standards compliant C compiler and the GNU autotools-chain. - See supported platforms.

+

The pre-built 32-bit i386-Linux native hadoop library is available as part of the + hadoop distribution and is located in the lib/native directory. You can download the + hadoop distribution from Hadoop Common Releases.

+ +

Be sure to install the zlib and/or gzip development packages - whichever compression + codecs you want to use with your deployment.
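Before deploying, you can confirm that a development package is present. The package names in the comments below are assumptions that vary by distribution; the header check itself is portable:

```shell
# Check for the zlib development headers. The install commands are
# illustrative assumptions (package names differ per distribution):
#   Debian/Ubuntu: apt-get install zlib1g-dev
#   RHEL/Fedora:   yum install zlib-devel
if [ -f /usr/include/zlib.h ]; then
  zlib_dev="present"
else
  zlib_dev="missing"
fi
echo "zlib development headers: $zlib_dev"
```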

+
+ +
+ Build + +

The native hadoop library is written in ANSI C and is built using the GNU autotools-chain (autoconf, autoheader, automake, autoscan, libtool). This means it should be straightforward to build the library on any platform with a standards-compliant C compiler and the GNU autotools-chain (see the supported platforms).

-

In particular the various packages you would need on the target - platform are:

+

The packages you need to install on the target platform are:

-

Once you have the prerequisites use the standard build.xml - and pass along the compile.native flag (set to - true) to build the native hadoop library:

+

Once you have installed the prerequisite packages, use the standard hadoop build.xml file and pass along the compile.native flag (set to true) to build the native hadoop library:

$ ant -Dcompile.native=true <target>

-

The native hadoop library is not built by default since not everyone is - interested in building them.

- -

You should see the newly-built native hadoop library in:

+

You should see the newly-built library in:

$ build/native/<platform>/lib

-

where <platform> is combination of the system-properties: - ${os.name}-${os.arch}-${sun.arch.data.model}; for e.g. - Linux-i386-32.

+

where <platform> is a combination of the system-properties: + ${os.name}-${os.arch}-${sun.arch.data.model} (for example, Linux-i386-32).
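If you want to derive that directory name from a shell, the platform string can be approximated from uname output. The mapping below is an assumption that covers common cases only; the build itself uses the JVM's own property values:

```shell
# Approximate ${os.name}-${os.arch}-${sun.arch.data.model} from uname.
# This mapping is an illustrative assumption, not what the build uses.
os=$(uname -s)        # JVM os.name, e.g. Linux
machine=$(uname -m)   # e.g. i686 or x86_64
case "$machine" in
  i?86)   arch=i386;  bits=32 ;;
  x86_64) arch=amd64; bits=64 ;;
  *)      arch=$machine; bits=64 ;;
esac
platform="${os}-${arch}-${bits}"
echo "build/native/${platform}/lib"
```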

-
- Notes - +

Please note the following:

  • - It is mandatory to have the - zlib, gzip, and bzip2 - development packages on the target platform for building the - native hadoop library; however for deployment it is sufficient to - install one of them if you wish to use only one of them. + It is mandatory to install both the zlib and gzip + development packages on the target platform in order to build the + native hadoop library; however, for deployment it is sufficient to + install just one package if you wish to use only one codec.
  • - It is necessary to have the correct 32/64 libraries of both zlib - depending on the 32/64 bit jvm for the target platform for - building/deployment of the native hadoop library. + It is necessary to have the correct 32/64 libraries for zlib, + depending on the 32/64 bit jvm for the target platform, in order to + build and deploy the native hadoop library.
-
+ +
+ Runtime +

The bin/hadoop script ensures that the native hadoop + library is on the library path via the system property:
+ -Djava.library.path=<path>
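As an illustration, the resulting property looks like this; the install path and platform string below are hypothetical examples, not values taken from bin/hadoop:

```shell
# Hypothetical values for illustration only; bin/hadoop computes these itself.
HADOOP_HOME=/opt/hadoop          # assumed install location
PLATFORM=Linux-i386-32           # example platform string
JAVA_LIBRARY_PATH="$HADOOP_HOME/lib/native/$PLATFORM"
echo "-Djava.library.path=$JAVA_LIBRARY_PATH"
```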

+ +

During runtime, check the hadoop log files for your MapReduce tasks.

+ + +
+
+
- Loading Native Libraries Through DistributedCache -

User can load native shared libraries through + Native Shared Libraries +

You can load any native shared library using DistributedCache for distributing and symlinking the library files.

-

Here is an example, describing how to distribute the library and - load it from a MapReduce task.

+

This example shows you how to distribute a shared library, mylib.so, + and load it from a MapReduce task.

    -
  1. First copy the library to the HDFS.
    +
  2. First copy the library to the HDFS:
    bin/hadoop fs -copyFromLocal mylib.so.1 /libraries/mylib.so.1
  3. The job launching program should contain the following:
    @@ -206,6 +223,9 @@ System.loadLibrary("mylib.so");
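The distribute-and-symlink flow can be simulated locally without Hadoop. This sketch assumes DistributedCache's symlinking behaves as described above; it only demonstrates the working-directory layout a task would see:

```shell
# Local simulation (an assumption for illustration; no Hadoop involved) of
# the symlinking DistributedCache performs: the cached file mylib.so.1 shows
# up in the task's working directory under the link name mylib.so, which is
# the name System.loadLibrary("mylib.so") resolves.
workdir=$(mktemp -d)
touch "$workdir/mylib.so.1"                      # stand-in for the real library
ln -s "$workdir/mylib.so.1" "$workdir/mylib.so"  # the symlink the task sees
first=$(ls "$workdir" | sort | head -n 1)
count=$(ls "$workdir" | wc -l)
echo "loadable name: $first"
rm -rf "$workdir"
```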
+ +


Note: If you downloaded or built the native hadoop library, you don’t need to use DistributedCache to make the library available to your MapReduce tasks.