* move mapreduce version of TableInputFormat tests out of mapred
* add ability to get runnable job via MR test shims
* correct the javadoc example for current APIs.
* add tests that run a job based on extending TableInputFormatBase (as given in the javadocs)
* add tests that run jobs based on the javadocs from 0.98
* fall back to our own Connection if users of the deprecated table configuration have a managed connection.
In our pre-1.0 API, HTable was considered a light-weight object consumed by
a single thread at a time. The HTablePool class provided a means of sharing
multiple HTable instances across a number of threads. As an optimization,
HTable managed a "write buffer", accumulating edits and sending a "batch" all
at once. By default the batch was sent as the last step in invocations of
put(Put) and put(List<Put>). The user could disable the automatic flushing of
the write buffer, retaining edits locally and only sending the whole "batch"
once the write buffer has filled or when the flushCommits() method is invoked
explicitly. Explicit or implicit batch writing was controlled by the
setAutoFlushTo(boolean) method. A value of true (the default) had the write
buffer flushed at the completion of a call to put(Put) or put(List<Put>). A
value of false allowed for explicit buffer management. HTable also exposed the
buffer to consumers via getWriteBuffer().
The combination of HTable with setAutoFlushTo(false) and the HTablePool
provided a convenient mechanism by which multiple "Put-producing" threads could
share a common write buffer. Both HTablePool and HTable are deprecated, and
they are officially replaced in the new 1.0 API by Table and BufferedMutator.
Table, which replaces HTable, no longer exposes explicit write-buffer
management. Instead, explicit buffer management is exposed via BufferedMutator.
BufferedMutator is made safe for concurrent use. Where code would previously
retrieve and return HTables from an HTablePool, now that code creates and
shares a single BufferedMutator instance across all threads.
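The sharing pattern described above can be sketched with plain JDK classes. This is only a toy model under stated assumptions: SharedWriteBuffer and SharedWriteBufferDemo are hypothetical names invented for illustration, edits are plain strings, and "sending" a batch just records it in memory. It is not the BufferedMutator implementation; it only shows several producer threads writing through one thread-safe buffer that flushes implicitly when full and explicitly on close.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for a thread-safe write buffer; NOT an HBase class.
class SharedWriteBuffer implements AutoCloseable {
    private final int flushThreshold;                    // buffered edits that trigger an implicit flush
    private final List<String> pending = new ArrayList<>();
    private final List<List<String>> sentBatches = new ArrayList<>();

    SharedWriteBuffer(int flushThreshold) {
        this.flushThreshold = flushThreshold;
    }

    // Called concurrently by many producer threads.
    synchronized void mutate(String edit) {
        pending.add(edit);
        if (pending.size() >= flushThreshold) {
            flush();                                     // implicit flush once the buffer has filled
        }
    }

    // Explicit flush: "send" whatever is buffered as one batch.
    synchronized void flush() {
        if (!pending.isEmpty()) {
            sentBatches.add(new ArrayList<>(pending));
            pending.clear();
        }
    }

    @Override
    public synchronized void close() {
        flush();                                         // do not lose edits still in the buffer
    }

    synchronized int totalSent() {
        int n = 0;
        for (List<String> batch : sentBatches) {
            n += batch.size();
        }
        return n;
    }

    synchronized int batchCount() {
        return sentBatches.size();
    }
}

class SharedWriteBufferDemo {
    // Starts `threads` producers that each write `editsPerThread` edits into ONE shared buffer.
    static SharedWriteBuffer run(int threads, int editsPerThread, int flushThreshold) {
        SharedWriteBuffer buffer = new SharedWriteBuffer(flushThreshold);
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            final int id = t;
            pool.submit(() -> {
                for (int i = 0; i < editsPerThread; i++) {
                    buffer.mutate("row-" + id + "-" + i);
                }
            });
        }
        pool.shutdown();
        try {
            pool.awaitTermination(30, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RuntimeException(e);
        }
        buffer.close();
        return buffer;
    }
}
```

The design point is that the producers never check a buffer in or out of a pool; they all hold a reference to the same instance, and the instance itself serializes access.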
Instead of just blocking the client for 90 seconds when the region gets too
busy, the server now sends along region load stats to the client so the client
can know how busy the server is. Currently, it's just the load on the memstore, but
it can be extended for other stats (e.g. cpu, general memory, etc.).
It is then up to the client to decide if it wants to listen to these stats.
By default, the client ignores the stats, but it can easily be toggled to the
built-in exponential back-off, or users can plug in their own back-off
implementations.
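A pluggable back-off of this kind might look roughly like the sketch below. All names here (BackoffPolicy, NoBackoff, ExponentialBackoff) and the 0..100 memstore load scale are assumptions made for illustration; they are not the interfaces shipped with the client, and the exponential curve is one arbitrary choice among many.

```java
// Hypothetical policy interface; NOT the HBase client API.
interface BackoffPolicy {
    // memstoreLoadPercent: assumed 0..100 load figure reported by the server.
    // Returns how many milliseconds the client should wait before its next request.
    long backoffMillis(int memstoreLoadPercent);
}

// Mirrors the default behaviour described above: the stats are ignored.
class NoBackoff implements BackoffPolicy {
    @Override
    public long backoffMillis(int memstoreLoadPercent) {
        return 0L;
    }
}

// Waits longer and longer as the reported load approaches 100%.
class ExponentialBackoff implements BackoffPolicy {
    private final long maxBackoffMillis;

    ExponentialBackoff(long maxBackoffMillis) {
        this.maxBackoffMillis = maxBackoffMillis;
    }

    @Override
    public long backoffMillis(int memstoreLoadPercent) {
        int clamped = Math.max(0, Math.min(100, memstoreLoadPercent));
        double load = clamped / 100.0;
        // Normalised so that load 0 maps to 0 ms and load 1 maps to maxBackoffMillis,
        // with exponential growth in between.
        double scaled = (Math.pow(2.0, 10.0 * load) - 1.0) / (Math.pow(2.0, 10.0) - 1.0);
        return Math.round(scaled * maxBackoffMillis);
    }
}
```

Because the policy is a single-method interface, switching between "ignore the stats" and "back off" is just a matter of which implementation the client is handed.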
HConnection#getTable (0.98, 0.99)
Replaced HTable under hbase-*/src/main/java. Skipped tests. Would take
till end of time to do all and some cases are cryptic. Also skipped
some mapreduce where HTable comes through in API. Can do both of
these stragglers in another issue.
Generally, if a utility class or standalone class, tried to pass in a
Connection rather than have the utility or standalone create its own
connection on each invocation; e.g. the Quota stuff. Where not possible,
noted where invocation comes from... if test or hbck, didn't worry about
it.
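The difference between a utility that sets up its own connection on each invocation and one that is handed a connection can be sketched as follows. DemoConnection and QuotaLookup are hypothetical stand-ins written against the JDK only; the counter simply makes the number of connection setups visible and is not something the real classes expose.

```java
import java.io.Closeable;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for an expensive-to-create connection; NOT an HBase class.
class DemoConnection implements Closeable {
    DemoConnection(AtomicInteger openCounter) {
        openCounter.incrementAndGet();   // pretend this is the costly setup work
    }

    String lookup(String key) {
        return "quota-for-" + key;
    }

    @Override
    public void close() {
        // nothing to release in this toy model
    }
}

class QuotaLookup {
    // Old style: the utility builds and tears down a connection on every call.
    static String perInvocation(AtomicInteger openCounter, String user) {
        try (DemoConnection conn = new DemoConnection(openCounter)) {
            return conn.lookup(user);
        }
    }

    // New style: the caller owns the connection and passes it in.
    static String withSharedConnection(DemoConnection conn, String user) {
        return conn.lookup(user);
    }
}
```

With the second form, three lookups cost one connection setup instead of three, and the caller decides when the connection is closed.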
Some classes are just standalone and nothing to be done to avoid
a Connection setup per invocation (this is probably how it worked
in the new HTable...days anyways). Some classes are not used:
AggregationClient, FavoredNodes... we should just purge this stuff.
Doc on what short circuit connection does (I can just use it...
I thought it was just for short circuit but no, it switches dependent
on where you are connecting).
Changed HConnection to super Interface ClusterConnection where safe
(internal usage by private classes only).
Doc cleanup in example usage so we do new mode rather than the old
fashion.
Used java7 idiom that allows you to avoid writing out finally to call close
on implementations of Closeable.
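For reference, the idiom in question is try-with-resources. The sketch below uses a hypothetical LoggingResource class (not part of the codebase) that records when it is opened and closed, to show that resources are closed automatically, in reverse order of declaration, with no finally block written out.

```java
import java.io.Closeable;
import java.util.ArrayList;
import java.util.List;

// Hypothetical Closeable that records its lifecycle; for illustration only.
class LoggingResource implements Closeable {
    private final String name;
    private final List<String> log;

    LoggingResource(String name, List<String> log) {
        this.name = name;
        this.log = log;
        log.add("open " + name);
    }

    @Override
    public void close() {
        log.add("close " + name);
    }
}

class TryWithResourcesDemo {
    static List<String> run() {
        List<String> log = new ArrayList<>();
        // Both resources are closed when the block exits, even if it throws.
        try (LoggingResource connection = new LoggingResource("connection", log);
             LoggingResource table = new LoggingResource("table", log)) {
            log.add("use table");
        }
        return log;
    }
}
```

The resource declared last ("table") is closed first, which matches the order one would write by hand in nested finally blocks.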
Added a RegistryFactory; moved it out from being an inner class.
Added a utility createGetClosestRowOrBeforeReverseScan method to Scan
to create a Scan that can ...
Renamed getShortCircuitConnection as getConnection – users don't need
to know what implementation does (that it can short-circuit RPC).
The old name gave pause. I was frightened to use it, thinking it was only
for short-circuit reading – that it would not do remote too.
Squashed commit of the following:
Move from HConnection to ClusterConnection or Connection
Use unmanaged connections where we previously used managed ones
(used the jdk7 https://docs.oracle.com/javase/7/docs/technotes/guides/language/try-with-resources.html idiom).
In ZKConfig, synchronize on Configuration rather than make a copy.
Making a copy we were dropping hbase configs in certain test context
(could not find the zk ensemble because of the default port).
In tests, some move to the new style connection setup but mostly
fixes for premature connection close or adding cleanup where it
was lacking.