------------------------------------------------------
TL;DR
------------------------------------------------------
We are moving from Inheritence
- Observer *is* Coprocessor
- FooService *is* CoprocessorService
To Composition
- Coprocessor *has* Observer
- Coprocessor *has* Service
------------------------------------------------------
Design Changes
------------------------------------------------------
- Adds four new interfaces - MasterCoprocessor, RegionCoprocessor, RegionServierCoprocessor,
WALCoprocessor
- These new *Coprocessor interfaces have a get*Observer() function for each observer type
supported by them.
- Added Coprocessor#getService() to base interface. All extending *Coprocessor interfaces will
get it from the base interface.
- Added BulkLoadObserver hooks to RegionCoprocessorHost instad of SecureBulkLoadManager doing its
own trickery.
- CoprocessorHost#find*() fuctions: Too many testing hooks digging into CP internals.
Deleted if can, else marked @VisibleForTesting.
------------------------------------------------------
Backward Compatibility
------------------------------------------------------
- Old coprocessors implementing *Observer won't get loaded (no backward compatibility guarantees).
- Third party coprocessors only implementing Coprocessor will not get loaded (just like Observers).
- Old coprocessors implementing CoprocessorService (for master/region host)
/SingletonCoprocessorService (for RegionServer host) will continue to work with 2.0.
- Added test to ensure backward compatibility of CoprocessorService/SingletonCoprocessorService
- Note that if a coprocessor implements both observer and service in same class, its service
component will continue to work but it's observer component won't work.
------------------------------------------------------
Notes
------------------------------------------------------
Did a side-by-side comparison of CPs in master and after patch. These coprocessors which were just
CoprocessorService earlier, needed a home in some coprocessor in new design. For most it was clear
since they were using a particular type of environment. Some were tricky.
- JMXListener - MasterCoprocessor and RSCoprocessor (because jmx listener makes sense for
processes?)
- RSGroupAdminEndpoint --> MasterCP
- VisibilityController -> MasterCP and RegionCP
These were converted to RegionCoprocessor because they were using RegionCoprocessorEnvironment
which can only come from a RegionCPHost.
- AggregateImplementation
- BaseRowProcessorEndpoint
- BulkDeleteEndpoint
- Export
- RefreshHFilesEndpoint
- RowCountEndpoint
- MultiRowMutationEndpoint
- SecureBulkLoadEndpoint
- TokenProvider
Change-Id: I813145f2bc11815f52ac703563b879962c249764
Do a pass with dependency:analyze; remove unused and
explicity list the dependencies we exploit.
Remove the parent dependencies set which had junit, mockito,
log4j, and findbugs annotations (had to put junit back
temporarily in subsequent version of this patch TODO). Listing in
parent set meant these libs were dependencies for all modules
which in practice was not the case. Edited all modules so
those that need any from this parent set now do explicit listing.
Ran the dependency:analyze over the project. Acted on most
suggested removals and requests for explicit listing. Some
grey areas remain around transitives that come in with
hadoop -needs better excludes, another project- and that
the dependency:analyze tool is not always accurate in its
reporting.
Selective add of dependency on hbase-thirdparty jars.
Update to READMEs on how protobuf is done (and update to refguide).
Removed all checked in generated protobuf files. They are generated
on the fly now as part of mainline build.
Pull in guava 22.0 by using the shaded version up in new hbase-thirdparty project.
In poms, exclude guava everywhere except on hadoop-common. Do this so
we minimize transitive includes. hadoop-common is needed because hadoop
Configuration uses guava doing preconditions.
Everywhere we used guava, instead use shaded so fix a load of imports.
Stopwatch API changed as did hashing and toStringHelper which is now
in MoreObjects class. Otherwise, minimal changes to come up on 22.0
hbase-thirdparty jars. Update to READMEs on how protobuf is done (and update to
refguide) Removed all checked in generated protobuf files. They are generatedon
the fly now as part of mainline build.
Reason for refactor:
In cases where one might need to use multiple observers, say region, master and regionserver; and the fact that only one class can be extended, it gives rise to following pattern:
public class BaseMasterAndRegionObserver
extends BaseRegionObserver
implements MasterObserver
class AccessController
extends BaseMasterAndRegionObserver
implements RegionServerObserver
were BaseMasterAndRegionObserver is full copy of BaseMasterObserver.
There is an example of simple case too where the current design fails.
Say only one observer is needed by the coprocessor, but the design doesn't permit extending even that single observer (see RSGroupAdminEndpoint), that leads to copy of full Bas
e...Observer class into coprocessor class leading to 1000s of lines of code and this ugly mix of 5 main functions with 100 useless functions.
Javadocs changes:
- Adds class comments on 'default' methods and expectations.
- Adds explanaiton of Exception handling in Observers' class comment. Removes redundant @throws before each function.
- Improves javadocs for a bunch of functions
- deletes empty @params in a bunch of places
Change-Id: I265738d47e8554e7b4678e88bb916a0cc7d00ab3
M TestStressWALProcedureStore.java
Disable test that now runs that fails because of difference in pb3.1.0.
Signed-off-by: Michael Stack <stack@apache.org>
Rely on the new plugin to do all proto generation. No need of an
external protoc setup anymore. Mvn will do it all for you.
Updated all READMEs appropriately.
Signed-off-by: Michael Stack <stack@apache.org>
This patch changes poms to use protobuf-maven-plugin instaed of
hadoop-maven-plugins generating protos. Adds a few missing READMEs too
as well as purge of unused protos turned up by the new plugin.
Which includes
HBASE-16742 Add chapter for devs on how we do protobufs going forward
HBASE-16741 Amend the generate protobufs out-of-band build step
to include shade, pulling in protobuf source and a hook for patching protobuf
Removed ByteStringer from hbase-protocol-shaded. Use the protobuf-3.1.0
trick directly instead. Makes stuff cleaner. All under 'shaded' dir is
now generated.
HBASE-16567 Upgrade to protobuf-3.1.x
Regenerate all protos in this module with protoc3.
Redo ByteStringer to use new pb3.1.0 unsafebytesutil
instead of HBaseZeroCopyByteString
HBASE-16264 Figure how to deal with endpoints and shaded pb Shade our protobufs.
Do it in a manner that makes it so we can still have in our API references to
com.google.protobuf (and in REST). The c.g.p in API is for Coprocessor Endpoints (CPEP)
This patch is Tactic #4 from Shading Doc attached to the referenced issue.
Figuring an appoach took a while because we have Coprocessor Endpoints
mixed in with the core of HBase that are tough to untangle (FIX).
Tactic #4 (the fourth attempt at addressing this issue) is COPY all but
the CPEP .proto files currently in hbase-protocol to a new module named
hbase-protocol-shaded. Generate .protos again in the new location and
then relocate/shade the generated files. Let CPEPs keep on with the
old references at com.google.protobuf.* and
org.apache.hadoop.hbase.protobuf.* but change the hbase core so all
instead refer to the relocated files in their new location at
org.apache.hadoop.hbase.shaded.com.google.protobuf.*.
Let the new module also shade protobufs themselves and change hbase
core to pick up this shaded protobuf rather than directly reference
com.google.protobuf.
This approach allows us to explicitly refer to either the shaded or
non-shaded version of a protobuf class in any particular context (though
usually context dictates one or the other). Core runs on shaded protobuf.
CPEPs continue to use whatever is on the classpath with
com.google.protobuf.* which is pb2.5.0 for the near future at least.
See above cited doc for follow-ons and downsides. In short, IDEs will complain
about not being able to find the shaded protobufs since shading happens at package
time; will fix by checking in all generated classes and relocated protobuf in
a follow-on. Also, CPEPs currently suffer an extra-copy as marshalled from
non-shaded to shaded. To fix. Finally, our .protos are duplicated; once
shaded, and once not. Pain, but how else to reveal our protos to CPEPs or
C++ client that wants to talk with HBase AND shade protobuf.
Details:
Add a new hbase-protocol-shaded module. It is a copy of hbase-protocol
i with all relocated offset from o.a.h.h. to o.a.h.h.shaded. The new module
also includes the relocated pb. It does not include CPEPs. They stay in
their old location.
Add another module hbase-endpoint which has in it all the endpoints
that ship as part of hbase -- at least the ones that are not
entangled with core such as AccessControl and Auth. Move all protos
for these CPEPs here as well as their unit tests (mostly moving a
bunch of stuff out of hbase-server module)
Much of the change looks like this:
-import org.apache.hadoop.hbase.protobuf.ProtobufUtil;
-import org.apache.hadoop.hbase.protobuf.generated.ClusterIdProtos;
+import org.apache.hadoop.hbase.protobuf.shaded.ProtobufUtil;
+import org.apache.hadoop.hbase.shaded.protobuf.generated.ClusterIdProtos;
In HTable and in HBaseAdmin, regularize the way Callables are used and also hide
protobuf usage as much as possible moving it up into Callable super classes or out
to utility classes. Still TODO is adding in of retries, etc., but can wait on
procedure which will redo all this.
Also in HTable and HBaseAdmin as well as in HRegionServer and Server, be explicit
when using non-shaded protobuf. Do the full-path so it is clear. This is around
endpoint coprocessors registration of services and execution of CPEP methods.
Shrunk ProtobufUtil by moving methods used by one CPEP only back to the CPEP either
into Client class or as new Util class; e.g. AccessControlUtil.
There are actually two versions of ProtobufUtil now; a shaded one and a subset
that is used by CPEPs doing non-shaded work.
Made it so hbase-common no longer depends on hbase-protocol (with Matteo's help)
R*Converter classes got moved down under shaded package -- they are for internal
use only. There are no non-shaded versions of these classes.
D hbase-client/src/main/java/org/apache/hadoop/hbase/client/AbstractRegionServerCallable
D RetryingCallableBase
Not used anymore and we have too many tiers of Callables so removed/cleaned-up.
A ClientServicecallable
Had to add this one. RegionServerCallable was made generic so it could be used
for a few Interfaces (Client and Admin). Then added ClientServiceCallable to
implement RegionServerCallable with the Client Interface.