hbase/hbase-examples
stack 95c1dc93fb HBASE-15638 Shade protobuf
Which includes

    HBASE-16742 Add chapter for devs on how we do protobufs going forward

    HBASE-16741 Amend the generate protobufs out-of-band build step
    to include shade, pulling in protobuf source and a hook for patching protobuf

    Removed ByteStringer from hbase-protocol-shaded. Use the protobuf-3.1.0
    trick directly instead. Makes stuff cleaner. All under 'shaded' dir is
    now generated.

    HBASE-16567 Upgrade to protobuf-3.1.x
    Regenerate all protos in this module with protoc3.
    Redo ByteStringer to use new pb3.1.0 unsafebytesutil
    instead of HBaseZeroCopyByteString

    HBASE-16264 Figure how to deal with endpoints and shaded pb Shade our protobufs.
    Do it in a manner that makes it so we can still have in our API references to
    com.google.protobuf (and in REST). The c.g.p in API is for Coprocessor Endpoints (CPEP)

            This patch is Tactic #4 from Shading Doc attached to the referenced issue.
            Figuring an appoach took a while because we have Coprocessor Endpoints
            mixed in with the core of HBase that are tough to untangle (FIX).

            Tactic #4 (the fourth attempt at addressing this issue) is COPY all but
            the CPEP .proto files currently in hbase-protocol to a new module named
            hbase-protocol-shaded. Generate .protos again in the new location and
            then relocate/shade the generated files. Let CPEPs keep on with the
            old references at com.google.protobuf.* and
            org.apache.hadoop.hbase.protobuf.* but change the hbase core so all
            instead refer to the relocated files in their new location at
            org.apache.hadoop.hbase.shaded.com.google.protobuf.*.

            Let the new module also shade protobufs themselves and change hbase
            core to pick up this shaded protobuf rather than directly reference
            com.google.protobuf.

            This approach allows us to explicitly refer to either the shaded or
            non-shaded version of a protobuf class in any particular context (though
            usually context dictates one or the other). Core runs on shaded protobuf.
            CPEPs continue to use whatever is on the classpath with
            com.google.protobuf.* which is pb2.5.0 for the near future at least.

            See above cited doc for follow-ons and downsides. In short, IDEs will complain
            about not being able to find the shaded protobufs since shading happens at package
            time; will fix by checking in all generated classes and relocated protobuf in
            a follow-on. Also, CPEPs currently suffer an extra-copy as marshalled from
            non-shaded to shaded. To fix. Finally, our .protos are duplicated; once
            shaded, and once not. Pain, but how else to reveal our protos to CPEPs or
            C++ client that wants to talk with HBase AND shade protobuf.

            Details:

            Add a new hbase-protocol-shaded module. It is a copy of hbase-protocol
    i       with all relocated offset from o.a.h.h. to o.a.h.h.shaded. The new module
            also includes the relocated pb. It does not include CPEPs. They stay in
            their old location.

            Add another module hbase-endpoint which has in it all the endpoints
            that ship as part of hbase -- at least the ones that are not
            entangled with core such as AccessControl and Auth. Move all protos
            for these CPEPs here as well as their unit tests (mostly moving a
            bunch of stuff out of hbase-server module)

            Much of the change looks like this:

                 -import org.apache.hadoop.hbase.protobuf.ProtobufUtil;
                 -import org.apache.hadoop.hbase.protobuf.generated.ClusterIdProtos;
                 +import org.apache.hadoop.hbase.protobuf.shaded.ProtobufUtil;
                 +import org.apache.hadoop.hbase.shaded.protobuf.generated.ClusterIdProtos;

            In HTable and in HBaseAdmin, regularize the way Callables are used and also hide
            protobuf usage as much as possible moving it up into Callable super classes or out
            to utility classes. Still TODO is adding in of retries, etc., but can wait on
            procedure which will redo all this.

            Also in HTable and HBaseAdmin as well as in HRegionServer and Server, be explicit
            when using non-shaded protobuf. Do the full-path so it is clear. This is around
            endpoint coprocessors registration of services and execution of CPEP methods.

            Shrunk ProtobufUtil by moving methods used by one CPEP only back to the CPEP either
            into Client class or as new Util class; e.g. AccessControlUtil.

            There are actually two versions of ProtobufUtil now; a shaded one and a subset
            that is used by CPEPs doing non-shaded work.

            Made it so hbase-common no longer depends on hbase-protocol (with Matteo's help)

            R*Converter classes got moved down under shaded package -- they are for internal
            use only. There are no non-shaded versions of these classes.

            D hbase-client/src/main/java/org/apache/hadoop/hbase/client/AbstractRegionServerCallable
            D RetryingCallableBase
             Not used anymore and we have too many tiers of Callables so removed/cleaned-up.

            A ClientServicecallable
             Had to add this one. RegionServerCallable was made generic so it could be used
             for a few Interfaces (Client and Admin). Then added ClientServiceCallable to
             implement RegionServerCallable with the Client Interface.
2016-10-03 21:37:32 -07:00
..
src HBASE-15638 Shade protobuf 2016-10-03 21:37:32 -07:00
README.txt HBASE-15638 Shade protobuf 2016-10-03 21:37:32 -07:00
pom.xml HBASE-15638 Shade protobuf 2016-10-03 21:37:32 -07:00

README.txt

Example code.

* org.apache.hadoop.hbase.mapreduce.SampleUploader
    Demonstrates uploading data from text files (presumably stored in HDFS) to HBase.

* org.apache.hadoop.hbase.mapreduce.IndexBuilder
    Demonstrates map/reduce with a table as the source and other tables as the sink.
    You can generate sample data for this MR job via hbase-examples/src/main/ruby/index-builder-setup.rb.


* Thrift examples
    Sample clients of the HBase ThriftServer. They perform the same actions, implemented in
    C++, Java, Ruby, PHP, Perl, and Python. Pre-generated Thrift code for HBase is included
    to be able to compile/run the examples without Thrift installed.
    If desired, the code can be re-generated as follows:
    thrift --gen cpp --gen java --gen rb --gen py --gen php --gen perl \
        ${HBASE_ROOT}/hbase-thrift/src/main/resources/org/apache/hadoop/hbase/thrift/Hbase.thrift
    and re-placed at the corresponding paths. You should not have to do this generally.

    Before you run any Thrift examples, find a running HBase Thrift server (and a running
    hbase cluster for this server to talk to -- at a minimum start a standalone instance
    by doing ./bin/start-hbase.sh). If you start one locally (bin/hbase thrift start),
    the default port is 9090 (a webserver with basic stats defaults showing on port 9095).

    * Java: org.apache.hadoop.hbase.thrift.DemoClient (jar under lib/).
      1. Make sure your client has all required jars on the CLASSPATH when it starts. If lazy,
      just add all jars as follows: {HBASE_EXAMPLE_CLASSPATH=`./bin/hbase classpath`}
      2. If HBase server is not secure, or authentication is not enabled for the Thrift server, execute:
      {java -cp hbase-examples-[VERSION].jar:${HBASE_EXAMPLE_CLASSPATH} org.apache.hadoop.hbase.thrift.DemoClient <host> <port>}
      3. If HBase server is secure, and authentication is enabled for the Thrift server, run kinit at first, then execute:
      {java -cp hbase-examples-[VERSION].jar:${HBASE_EXAMPLE_CLASSPATH} org.apache.hadoop.hbase.thrift.DemoClient <host> <port> true}
      4. Here is a lazy example that just pulls in all hbase dependency jars and that goes against default location on localhost.
      It should work with a standalone hbase instance started by doing ./bin/start-hbase.sh:
      {java -cp ./hbase-examples/target/hbase-examples-2.0.0-SNAPSHOT.jar:`./bin/hbase classpath` org.apache.hadoop.hbase.thrift.DemoClient localhost 9090}

    * Ruby: hbase-examples/src/main/ruby/DemoClient.rb
      1. Modify the import path in the file to point to {$THRIFT_HOME}/lib/rb/lib.
      2. Execute {ruby DemoClient.rb} (or {ruby DemoClient.rb <host> <port>}).

    * Python: hbase-examples/src/main/python/DemoClient.py
      1. Modify the added system path in the file to point to {$THRIFT_HOME}/lib/py/build/lib.[YOUR SYSTEM]
      2. Execute {python DemoClient.py <host> <port>}.

    * PHP: hbase-examples/src/main/php/DemoClient.php
      1. Modify the THRIFT_HOME path in the file to point to actual {$THRIFT_HOME}.
      2. Execute {php DemoClient.php}.
      3. Starting from Thrift 0.9.0, if Thrift.php complains about some files it cannot include, go to thrift root,
        and copy the contents of php/lib/Thrift under lib/php/src. Thrift.php appears to include, from under the same root,
        both TStringUtils.php, only present in src/, and other files only present under lib/; this will bring them under
        the same root (src/).
        If you know better about PHP and Thrift, please feel free to fix this.

    * Perl: hbase-examples/src/main/perl/DemoClient.pl
      1. Modify the "use lib" path in the file to point to {$THRIFT_HOME}/lib/perl/lib.
      2. Use CPAN to get Bit::Vector and Class::Accessor modules if not present (see thrift perl README if more modules are missing).
      3. Execute {perl DemoClient.pl}.

    * CPP: hbase-examples/src/main/cpp/DemoClient.cpp
      1. Make sure you have boost and Thrift C++ libraries; modify Makefile if necessary.
        The recent (0.9.0 as of this writing) version of Thrift can be downloaded from http://thrift.apache.org/download/.
        Boost can be found at http://www.boost.org/users/download/.
      2. Execute {make}.
      3. Execute {./DemoClient}.

Also includes example coprocessor endpoint examples. The protobuf files are at src/main/protobuf.
See hbase-protocol README.txt for how to generate the example RowCountService Coprocessor
Endpoint and Aggregator examples.