Adds a means of patching our shaded protobuf. It uses Anoop's patch,
attached to HBASE-15789, which adds ByteInput to protobuf. The patch is
applied after protobuf has been downloaded, relocated, and unpacked
over src/main/java.
Also fixes a few small build WARNINGs caused by duplicate
mentions of dependencies.
1). Fix resource leak issue upon exception during mob compaction.
2). Reorg the code in compactMobFilesInBatch() to make it more readable.
Signed-off-by: Jonathan M Hsieh <jmhsieh@apache.org>
Which includes
HBASE-16742 Add chapter for devs on how we do protobufs going forward
HBASE-16741 Amend the generate protobufs out-of-band build step
to include shade, pulling in protobuf source and a hook for patching protobuf
Removed ByteStringer from hbase-protocol-shaded. Use the protobuf-3.1.0
trick directly instead. Makes stuff cleaner. All under 'shaded' dir is
now generated.
HBASE-16567 Upgrade to protobuf-3.1.x
Regenerate all protos in this module with protoc3.
Redo ByteStringer to use new pb3.1.0 unsafebytesutil
instead of HBaseZeroCopyByteString
HBASE-16264 Figure how to deal with endpoints and shaded pb
Shade our protobufs.
Do it in a manner that makes it so we can still have in our API references to
com.google.protobuf (and in REST). The c.g.p in API is for Coprocessor Endpoints (CPEP)
This patch is Tactic #4 from Shading Doc attached to the referenced issue.
Figuring an approach took a while because we have Coprocessor Endpoints
mixed in with the core of HBase that are tough to untangle (FIX).
Tactic #4 (the fourth attempt at addressing this issue) is COPY all but
the CPEP .proto files currently in hbase-protocol to a new module named
hbase-protocol-shaded. Generate .protos again in the new location and
then relocate/shade the generated files. Let CPEPs keep on with the
old references at com.google.protobuf.* and
org.apache.hadoop.hbase.protobuf.* but change the hbase core so all
instead refer to the relocated files in their new location at
org.apache.hadoop.hbase.shaded.com.google.protobuf.*.
Let the new module also shade protobufs themselves and change hbase
core to pick up this shaded protobuf rather than directly reference
com.google.protobuf.
This approach allows us to explicitly refer to either the shaded or
non-shaded version of a protobuf class in any particular context (though
usually context dictates one or the other). Core runs on shaded protobuf.
CPEPs continue to use whatever is on the classpath with
com.google.protobuf.* which is pb2.5.0 for the near future at least.
See above cited doc for follow-ons and downsides. In short, IDEs will complain
about not being able to find the shaded protobufs since shading happens at package
time; will fix by checking in all generated classes and relocated protobuf in
a follow-on. Also, CPEPs currently suffer an extra-copy as marshalled from
non-shaded to shaded. To fix. Finally, our .protos are duplicated; once
shaded, and once not. Pain, but how else to reveal our protos to CPEPs or
C++ client that wants to talk with HBase AND shade protobuf.
Details:
Add a new hbase-protocol-shaded module. It is a copy of hbase-protocol
with everything relocated from o.a.h.h. to o.a.h.h.shaded. The new module
also includes the relocated pb. It does not include CPEPs. They stay in
their old location.
Add another module hbase-endpoint which has in it all the endpoints
that ship as part of hbase -- at least the ones that are not
entangled with core such as AccessControl and Auth. Move all protos
for these CPEPs here as well as their unit tests (mostly moving a
bunch of stuff out of hbase-server module)
Much of the change looks like this:
-import org.apache.hadoop.hbase.protobuf.ProtobufUtil;
-import org.apache.hadoop.hbase.protobuf.generated.ClusterIdProtos;
+import org.apache.hadoop.hbase.shaded.protobuf.ProtobufUtil;
+import org.apache.hadoop.hbase.shaded.protobuf.generated.ClusterIdProtos;
In HTable and in HBaseAdmin, regularize the way Callables are used and also hide
protobuf usage as much as possible moving it up into Callable super classes or out
to utility classes. Still TODO is adding in of retries, etc., but can wait on
procedure which will redo all this.
Also in HTable and HBaseAdmin as well as in HRegionServer and Server, be explicit
when using non-shaded protobuf. Do the full-path so it is clear. This is around
endpoint coprocessors registration of services and execution of CPEP methods.
Shrunk ProtobufUtil by moving methods used by one CPEP only back to the CPEP either
into Client class or as new Util class; e.g. AccessControlUtil.
There are actually two versions of ProtobufUtil now; a shaded one and a subset
that is used by CPEPs doing non-shaded work.
Made it so hbase-common no longer depends on hbase-protocol (with Matteo's help)
R*Converter classes got moved down under shaded package -- they are for internal
use only. There are no non-shaded versions of these classes.
D hbase-client/src/main/java/org/apache/hadoop/hbase/client/AbstractRegionServerCallable
D RetryingCallableBase
Not used anymore and we have too many tiers of Callables so removed/cleaned-up.
A ClientServiceCallable
Had to add this one. RegionServerCallable was made generic so it could be used
for a few Interfaces (Client and Admin). Then added ClientServiceCallable to
implement RegionServerCallable with the Client Interface.
In some particular deployments, the Replication code believes it has
reached EOF for a WAL prior to successfully parsing all bytes known to
exist in a cleanly closed file.
Consistently this failure happens due to an InvalidProtobufException
after some number of seeks during our attempts to tail the in-progress
RegionServer WAL. As a work-around, this patch treats cleanly closed
files differently than other execution paths. If an EOF is detected due
to parsing or other errors while there are still unparsed bytes before
the end-of-file trailer, we now reset the WAL to the very beginning and
attempt a clean read-through.
In current testing, a single such reset is sufficient to work around
observed dataloss. However, the above change will retry a given WAL file
indefinitely. On each such attempt, a log message like the below will
be emitted at the WARN level:
Processing end of WAL file '{}'. At position {}, which is too far away
from reported file length {}. Restarting WAL reading (see HBASE-15983
for details).
Additionally, this patch adds more log detail at the TRACE
level about file offsets seen while handling recoverable errors. It also
adds metrics that measure the use of this recovery mechanism.
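The reset decision described above can be sketched roughly as follows. This is a minimal illustration only, not the actual replication reader: the class name, method name, and the idea of passing the trailer length as a plain number are all assumptions made for the example.

```java
/** Hypothetical helper; names and parameters are illustrative, not HBase API. */
final class WalEofCheck {
    private WalEofCheck() {}

    /**
     * Decides whether an apparent EOF on a cleanly closed WAL is suspect.
     *
     * @param position      offset reached when EOF or a parse error was seen
     * @param fileLength    reported length of the closed file
     * @param trailerLength bytes occupied by the end-of-file trailer
     * @return true if unparsed bytes remain before the trailer, in which case
     *         the reader should reset to the beginning and read through again
     */
    static boolean shouldRestartFromBeginning(long position, long fileLength, long trailerLength) {
        long trailerStart = fileLength - trailerLength;
        return position < trailerStart;
    }
}
```

A caller would log the WARN message and reset the reader whenever this returns true; since nothing bounds the number of resets, the retry is indefinite, as noted above.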
Eliminates use of removed or deprecated hadoop2 api
- MBeanUtil -> MBeans. Hadoop2 has both; Hadoop 3 removes MBeanUtil and uses MBeans
- FSDataOutputStream(OutputStream) -> FSDataOutputStream(OutputStream, FileSystem.Statistics)
- MetricsServlet is removed. See HADOOP-12504
The port for the KDC service is selected in the constructor, but we bind to it later in MiniKdc.start() --> MiniKdc.initKDCServer() --> KdcServer.start(). In the meantime, some other service can capture the port, which results in a BindException. The solution here is to catch the exception and retry.
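A minimal sketch of the catch-and-retry idea, under stated assumptions: the Starter interface and startWithRetry method are hypothetical names for this example, and the starter stands in for "select a port, build the KDC, start it". It is not the code in the patch.

```java
import java.io.IOException;
import java.net.BindException;

final class BindRetry {
    private BindRetry() {}

    /** Hypothetical stand-in for "pick a port, then start the service on it". */
    interface Starter<T> {
        T start() throws IOException;
    }

    /**
     * Runs the starter until it succeeds or maxAttempts is exhausted.
     * Only BindException triggers a retry; any other IOException propagates.
     */
    static <T> T startWithRetry(Starter<T> starter, int maxAttempts) throws IOException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        BindException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return starter.start();
            } catch (BindException e) {
                // Another process took the port between selection and bind; try again.
                last = e;
            }
        }
        throw last;
    }
}
```

Because each attempt re-runs the whole starter, a fresh port is selected on every retry rather than re-binding to the port that was just lost.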
Testing methodology:
- Used python and intellij.
- breakpoint on kdc.start(1), in catch block(2) and just after catch block(3).
- used python to bind to the selected port on breakpoint 1 --> run the program --> stops at breakpoint 2 (catch block)
- On breakpoint 1 and after 2 failures, close the port --> run the program --> skips catch block and goes to breakpoint 3.
Change-Id: I4e06e69819d1ec9a0a7fa471bf017f3a72c75cb3
Tool to test performance of locks and queues in procedure scheduler independently from other framework components.
Inserts table and region operations in the scheduler, then polls them and exercises their locks. Number of tables, regions and operations can be set using cli args.
Change-Id: I0fb27e67d3fcab70dd5d0b5197396b117b11eac6
New tool to dump existing replication peers, configurations and
queues when using HBase Replication. The tool provides two flags:
--distributed This flag will poll each RS for information about
the replication queues being processed on this RS.
By default this is not enabled and the information
about the replication queues and configuration will
be obtained from ZooKeeper.
--hdfs When --distributed is used, this flag will attempt
to calculate the total size of the WAL files used
by the replication queues. Since it's possible that
multiple peers can be configured this value can be
overestimated.
Signed-off-by: Matteo Bertozzi <matteo.bertozzi@cloudera.com>
This is a revert of a revert; i.e. we are adding the change back, this time
with fixes for the broken unit test. The failure was a real issue, exposed by a
test that went in at the same time as this commit: I was getting a new nonce on
each retry rather than getting one for the mutation.
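The nonce problem can be illustrated with a small sketch. Everything here is hypothetical (the class, the counter-based nonce source, and the LongPredicate standing in for the RPC); the only point is that the nonce is created once per mutation, outside the retry loop.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongPredicate;

final class NonceRetry {
    private static final AtomicLong NEXT_NONCE = new AtomicLong(1);

    private NonceRetry() {}

    /** Hypothetical nonce source; a real client would not use a simple counter. */
    static long newNonce() {
        return NEXT_NONCE.getAndIncrement();
    }

    /**
     * Sends one mutation, retrying up to maxAttempts times.
     *
     * @param send hypothetical RPC: receives the nonce, returns true on success
     * @return the nonce that every attempt carried
     */
    static long mutateWithRetries(LongPredicate send, int maxAttempts) {
        long nonce = newNonce(); // one nonce per mutation, not one per attempt
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (send.test(nonce)) {
                return nonce;
            }
        }
        throw new IllegalStateException("mutation failed after " + maxAttempts + " attempts");
    }
}
```

Had newNonce() been called inside the loop, each retry would look like a brand-new operation to the server, which is the bug described above.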
Other changes since revert are more hiding of RpcController. Use
accessor method rather than always pass in a RpcController
Walked back retrying operations that used to be single-shot (though
code comment said need a retry) because it opens a can of worms where
we retry stuff like bad column family when we shouldn't (needs
work adding in DoNotRetryIOEs)
Changed name of class from PayloadCarryingServerCallable to
CancellableRegionServerCallable.
Fix javadoc and findbugs warnings.
Fix case of not initializing the ScannerCallable RpcController.
Below is original commit message:
Remove mention of ServiceException and other protobuf classes from all over the codebase.
Purge TimeLimitedRpcController. Let's just have one override of RpcController.
M hbase-client/src/main/java/org/apache/hadoop/hbase/client/AbstractRegionServerCallable.java
Cleanup. Make it clear this is an odd class for async hbase intro.
M hbase-client/src/main/java/org/apache/hadoop/hbase/client/HTable.java
Refactor of RegionServerCallable allows me to clean up a bunch of
boilerplate in here and remove protobuf references.
M hbase-client/src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java
Purge protobuf references everywhere except a reference to a throw of a
ServiceException in method checkHBaseAvailable. I deprecated it in favor
of new available method (the SE is not actually needed)
M hbase-client/src/main/java/org/apache/hadoop/hbase/client/PayloadCarryingServerCallable.java
Move the RetryingTimeTracker instance in here from HTable.
Allows me to contain the tracker and remove repeated code in HTable.
M hbase-client/src/main/java/org/apache/hadoop/hbase/client/RegionServerCallable.java
Cleanup. Move setup of rpc in here rather than have it repeat in HTable.
Allows me to remove protobuf references from a bunch of places.
M hbase-client/src/main/java/org/apache/hadoop/hbase/client/FlushRegionCallable.java
Make use of the push of boilerplate up into RegionServerCallable
M hbase-client/src/main/java/org/apache/hadoop/hbase/client/MultiServerCallable.java
M hbase-client/src/main/java/org/apache/hadoop/hbase/client/PayloadCarryingServerCallable.java
M hbase-client/src/main/java/org/apache/hadoop/hbase/client/RegionAdminServiceCallable.java
M hbase-client/src/main/java/org/apache/hadoop/hbase/client/RpcRetryingCallerWithReadReplicas.java
M hbase-client/src/main/java/org/apache/hadoop/hbase/client/ScannerCallable.java
M hbase-client/src/main/java/org/apache/hadoop/hbase/client/SecureBulkLoadClient.java
M hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles.java
Move boilerplate up into superclass.
M hbase-client/src/main/java/org/apache/hadoop/hbase/client/RetryingTimeTracker.java
Cleanup
M hbase-client/src/main/java/org/apache/hadoop/hbase/ipc/PayloadCarryingRpcController.java
M hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RSRpcServices.java
M hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/WALEditsReplaySink.java
M hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/RegionReplicaReplicationEndpoint.java
Factor in TimeLimitedRpcController. Just have one RpcController override.
D hbase-client/src/main/java/org/apache/hadoop/hbase/ipc/TimeLimitedRpcController.java
Removed. Let's have one override of pb rpccontroller only.
M hbase-client/src/main/java/org/apache/hadoop/hbase/protobuf/ProtobufUtil.java
(handleRemoteException) added
(toText) added
Purge ServiceException from Callable subclasses by pushing SE handling
up into the parent Callable class (varies by context but this is the basic
pattern). Allows us to remove a bunch of boilerplate.
Do this in the public facing classes in particular (though if
an API has SE in it -- which a few do, this patch leaves these
untouched -- for now.) Make it so HBaseAdmin and HTable have no
direct pb imports (except for endpoint processor API).
Change a few of the HBaseAdmin calls to be retrying where comments
ask that we do retry rather than one time.
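The basic pattern of pushing ServiceException handling into the parent can be sketched as below. This is an illustration under assumptions, not the patch itself: ServiceException here is a local stand-in for the protobuf class, and BaseCallable, rpcCall, and the unwrap logic are hypothetical names and behavior.

```java
import java.io.IOException;

final class CallableSketch {
    private CallableSketch() {}

    /** Local stand-in for protobuf's ServiceException (assumption for this sketch). */
    static class ServiceException extends Exception {
        ServiceException(Throwable cause) {
            super(cause);
        }
    }

    /** Parent class owns the protobuf-specific exception handling. */
    abstract static class BaseCallable<T> {
        /** Subclasses implement only the RPC itself. */
        protected abstract T rpcCall() throws ServiceException;

        /** Callers see IOException only; no protobuf type leaks out. */
        public final T call() throws IOException {
            try {
                return rpcCall();
            } catch (ServiceException se) {
                Throwable cause = se.getCause();
                if (cause instanceof IOException) {
                    throw (IOException) cause; // unwrap the underlying remote failure
                }
                throw new IOException(se);
            }
        }
    }
}
```

With the try/catch living in one place, each subclass shrinks to its rpcCall body, which is where the boilerplate saving comes from.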
Signed-off-by: stack <stack@apache.org>
TimeRangeTracker as point of contention when many threads reading a StoreFile
Fixes HBASE-16074 (ITBLL fails, reports lost big or tiny families): scanning
broke because a cleanup in HBASE-15650, made so that TimeRange construction is
consistent, exposed a latent issue in TimeRange#compare.
See HBASE-16074 for more detail.
Also change HFile Writer constructor so we pass in the TimeRangeTracker, if one,
on construction rather than set it later (the flag and reference were not volatile
so could have made for issues in concurrent case). And make sure the construction
of a TimeRange from a TimeRangeTracker on open of an HFile Reader never makes a
bad minimum value, one that would preclude us reading any values from a file
(set min to 0).
M hbase-common/src/main/java/org/apache/hadoop/hbase/io/TimeRange.java
Call through to next constructor (if minStamp was 0, we'd skip setting
allTime=true). Add asserts that timestamps are not < 0 cos it messes
us up if they are (we already were checking for < 0 on construction but
assert passed in timestamps are not < 0).
M hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HStore.java
Add constructor override that takes a TimeRangeTracker (set when flushing
but not when compacting)
M hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
Add override creating an HFile in tmp that takes a TimeRangeTracker
M hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java
Add override for HFile Writer that takes a TimeRangeTracker. Take it on
construction instead of having it passed by a setter later (flags and
reference set by the setter were not volatile... could have been a problem
in the concurrent case)
M hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/TimeRangeTracker.java
Log WARN if bad initial TimeRange value (and then 'fix' it)
M hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestTimeRangeTracker.java
A few tests to prove serialization works as expected and that we'll get a bad min if not constructed properly.
M hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java
Handle OLDEST_TIMESTAMP explicitly. Don't expect TimeRange to do it.
M hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestQueryMatcher.java
Refactor from junit3 to junit4 and add test for this weird case.
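A rough sketch of the two TimeRange points above: chaining constructors so allTime is computed in one place, and falling back to a minimum of 0 when the tracked minimum is unusable. The class below is hypothetical, and what counts as a "bad" minimum (negative, or greater than the tracked maximum) is an assumption made for the example; the tracked maximum is assumed to be valid.

```java
/** Hypothetical, simplified range type; not the HBase TimeRange class. */
final class TimeRangeSketch {
    final long min;
    final long max;
    final boolean allTime;

    /** Chains to the two-argument form so allTime is set even when min is 0. */
    TimeRangeSketch(long min) {
        this(min, Long.MAX_VALUE);
    }

    TimeRangeSketch(long min, long max) {
        if (min < 0 || max < 0) {
            throw new IllegalArgumentException("timestamps must not be negative");
        }
        if (max < min) {
            throw new IllegalArgumentException("max must not be less than min");
        }
        this.min = min;
        this.max = max;
        this.allTime = (min == 0 && max == Long.MAX_VALUE);
    }

    /**
     * Builds a range from tracked values. If the tracked minimum is unusable
     * (assumed here: negative or above the tracked maximum), use 0 so the
     * range cannot exclude every value in the file.
     */
    static TimeRangeSketch fromTracked(long trackedMin, long trackedMax) {
        long safeMin = (trackedMin < 0 || trackedMin > trackedMax) ? 0L : trackedMin;
        return new TimeRangeSketch(safeMin, trackedMax);
    }
}
```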
Instead of running the primary test in a separate thread and hoping it finishes in time, just run the test in the primary thread.
Signed-off-by: Elliott Clark <eclark@apache.org>
All ReplicationTableBase methods that need to access the Replication Table will block until it is created though.
Also refactored ReplicationSourceManager so that abandoned queue adoption is run in the background too so that it does not block HRegionServer initialization.
Signed-off-by: Elliott Clark <eclark@apache.org>
Building on HBASE-15958.
Provided a ReplicationQueuesClientHBaseImpl that relies on the HBase Replication Table to track WAL queues.
Refactored out a large section of ReplicationQueuesHBaseImpl into a ReplicationTableClient class that handles Replication Table operations.
Signed-off-by: Elliott Clark <eclark@apache.org>
M hbase-server/src/main/java/org/apache/hadoop/hbase/ipc/RpcExecutor.java
Refactor which makes a Handler type. Put all 'handler' stuff inside this
new type. Also make it so subclass can provide its own Handler type.
M hbase-server/src/main/java/org/apache/hadoop/hbase/ipc/SimpleRpcScheduler.java
Name the handler threads for their type so we can tell if configs are
having an effect.
Signed-off-by: stack <stack@apache.org>
Invoking 'hbase hfile' inside a servlet raises several concerns. This
patch avoids invoking a separate process, and also adds validation that
the file being read is at least inside the HBase root directory.
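The containment check can be sketched like this. It is an illustration only: it uses java.nio.file paths as a stand-in for the filesystem paths the servlet actually deals with, the class and method names are hypothetical, and it does not resolve symlinks.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

final class RootDirCheck {
    private RootDirCheck() {}

    /**
     * Returns true if the requested path, once normalized, lies inside rootDir.
     * Normalizing first defeats ".." segments; Path.startsWith compares whole
     * path elements, so "/hbasefoo" is not treated as being under "/hbase".
     */
    static boolean isUnderRoot(String rootDir, String requested) {
        Path root = Paths.get(rootDir).toAbsolutePath().normalize();
        Path target = root.resolve(requested).normalize();
        return target.startsWith(root);
    }
}
```

A request that fails this check would be rejected before any file is opened.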
Signed-off-by: Mikhail Antonov <antonov@apache.org>
Building on HBASE-15883.
Now implementing the claim queues procedure within an HBase table.
Also added UnitTests to test claimQueue.
Peer tracking will still be performed by ZooKeeper though.
Also modified the queueId tracking procedure so we no longer have to perform scans over the Replication Table.
This does make our queue naming schema slightly different from ReplicationQueuesZKImpl though.
Signed-off-by: Elliott Clark <eclark@apache.org>
Changes how we do accounting of Connections to match how it is done in Hadoop.
Adds a ConnectionManager class. Adds new configurations for this new class.
"hbase.ipc.client.idlethreshold" 4000
"hbase.ipc.client.connection.idle-scan-interval.ms" 10000
"hbase.ipc.client.connection.maxidletime" 10000
"hbase.ipc.client.kill.max" 10
"hbase.ipc.server.handler.queue.size" 100
The new scheme does away with synchronization that purportedly would freeze out
reads while we were cleaning up stale connections (according to HADOOP-9955)
Also adds in a new mechanism for accepting Connections: pull in as many
as we can at a time, adding them to a Queue, instead of doing one at a time.
Can help with bursty traffic, according to HADOOP-9956. Removes blocking
while the Reader is busy parsing a request. Adds configuration
"hbase.ipc.server.read.connection-queue.size" with default of 100 for
queue size.
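A sketch of how an idle scan driven by settings like these might work, assuming they mean what their names suggest: "idlethreshold" is the connection count below which no scan is needed, "maxidletime" is how long a connection may be silent, and "kill.max" caps closures per scan. The class, fields, and Conn type are hypothetical, and the concurrent set is one way to avoid a lock that blocks readers during cleanup.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

final class IdleSweepSketch {
    /** Hypothetical connection record; only the last-contact time matters here. */
    static final class Conn {
        final long lastContactMs;
        Conn(long lastContactMs) {
            this.lastContactMs = lastContactMs;
        }
    }

    // Concurrent set: scanning and removal need no lock shared with readers.
    private final Set<Conn> connections = ConcurrentHashMap.newKeySet();
    private final int idleScanThreshold;
    private final long maxIdleTimeMs;
    private final int maxIdleToClose;

    IdleSweepSketch(int idleScanThreshold, long maxIdleTimeMs, int maxIdleToClose) {
        this.idleScanThreshold = idleScanThreshold;
        this.maxIdleTimeMs = maxIdleTimeMs;
        this.maxIdleToClose = maxIdleToClose;
    }

    void add(Conn c) {
        connections.add(c);
    }

    int size() {
        return connections.size();
    }

    /** One scan pass; returns the connections it dropped. */
    List<Conn> closeIdle(long nowMs) {
        List<Conn> closed = new ArrayList<>();
        long minLastContact = nowMs - maxIdleTimeMs;
        for (Conn c : connections) {
            if (connections.size() < idleScanThreshold) break; // few enough connections
            if (closed.size() >= maxIdleToClose) break;        // per-scan cap reached
            if (c.lastContactMs < minLastContact && connections.remove(c)) {
                closed.add(c);
            }
        }
        return closed;
    }
}
```

A timer firing every idle-scan interval would call closeIdle with the current time.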
Signed-off-by: stack <stack@apache.org>
Adds HADOOP-9955 RPC idle connection closing is extremely inefficient
Then removes the queue added by HADOOP-9956 at Enis's suggestion
Implemented ReplicationQueuesHBaseImpl that tracks WAL offsets and replication queues in an HBase table.
Only wrote the basic tracking methods; have not implemented claimQueue() or HFileRef methods yet.
Wrote a basic unit test for ReplicationQueuesHBaseImpl that tests the implemented functions on a single Region Server.
Signed-off-by: Elliott Clark <elliott@fb.com>
Signed-off-by: Elliott Clark <eclark@apache.org>
@Before and @After to setup/teardown tables using @Rule to set table name based on testname.
Refactor out copy-pasted code fragments to single function.
(Apekshit)
Change-Id: Ic22e5027cc3952bab5ec30070ed20e98017db65a