Summary:
Allow TestTimestampFilterSeekHint to provide a seek next hint.
This can be incorrect as it might skip deletes. However it can
make things much much faster.
Test Plan: Added a unit test.
Differential Revision: https://reviews.facebook.net/D55617
M hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java
moreRowsMayExistAfterCell Exploit the fact a Scan is a Get Scan. Also save compares
if no non-default stopRow.
M hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java
optimize Add doc on what is being optimized. Also, if a Get Scan, do not
optimize else we'll keep going after our row is DONE.
Another place to make use of the Get Scan fact is when we are DONE.. if
Get Scan, we can close out the scan.
M hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreScanner.java
Add tests for Get Scans and optimize around block loading.
When we read from HDFS, we overread to pick up the next blocks header.
Doing this saves a seek as we move through the hfile; we save having to
do an explicit seek just to read the block header every time we need to
read the body. We used to read in the next header as part of the
current blocks buffer. This buffer was then what got persisted to
blockcache; so we were over-persisting: our block plus the next blocks'
header (33 bytes).
This patch undoes this over-persisting.
Removes support for version 1 blocks (0.2 was added in hbase-0.92.0).
Not needed any more.
There is an open question on whether checksums should be persisted
when caching. The code seems to say no but if cache is SSD backed or
backed by anything that does not do error correction, we'll want
checksums.
Adds loads of documentation.
M hbase-common/src/main/java/org/apache/hadoop/hbase/io/hfile/BlockType.java
(write) Add writing from a ByteBuff.
M hbase-common/src/main/java/org/apache/hadoop/hbase/nio/ByteBuff.java
(toString) Add one so ByteBuff looks like ByteBuffer when you click on
it in IDE
M hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java
Remove support for version 1 blocks.
Cleaned up handling of metadata added when we serialize a block to
caches. Metadata is smaller now.
When we serialize (used when caching), do not persist the next blocks
header if present.
Removed a bunch of methods, a few of which had overlapping
functionality and others that exposed too much of our internals.
Also removed a bunch of constructors and unified the constructors we
had left over making them share a common init method.
Shutdown access to defines that should only be used internally here.
Renamed all to do w/ 'EXTRA' and 'extraSerialization' to instead talk
about metadata saved to caches; was unclear previously what EXTRA was
about.
Renamed static final declarations as all uppercase.
(readBlockDataInternal): Redid. Couldn't make sense of it previously.
Undid heavy-duty parse of header by constructing HFileBlock. Other
cleanups. Its 1/3rd the length it used to be. More to do in here.
When we read from HDFS, we overread to pick up the next blocks header.
Doing this saves a seek as we move through the hfile; we save having to
do an explicit seek just to read the block header every time we need to
read the body. We used to read in the next header as part of the
current blocks buffer. This buffer was then what got persisted to
blockcache; so we were over-persisting: our block plus the next blocks'
header (33 bytes).
This patch undoes this over-persisting.
Removes support for version 1 blocks (0.2 was added in hbase-0.92.0).
Not needed any more.
There is an open question on whether checksums should be persisted
when caching. The code seems to say no but if cache is SSD backed or
backed by anything that does not do error correction, we'll want
checksums.
Adds loads of documentation.
M hbase-common/src/main/java/org/apache/hadoop/hbase/io/hfile/BlockType.java
(write) Add writing from a ByteBuff.
M hbase-common/src/main/java/org/apache/hadoop/hbase/nio/ByteBuff.java
(toString) Add one so ByteBuff looks like ByteBuffer when you click on
it in IDE
M hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java
Remove support for version 1 blocks.
Cleaned up handling of metadata added when we serialize a block to
caches. Metadata is smaller now.
When we serialize (used when caching), do not persist the next blocks
header if present.
Removed a bunch of methods, a few of which had overlapping
functionality and others that exposed too much of our internals.
Also removed a bunch of constructors and unified the constructors we
had left over making them share a common init method.
Shutdown access to defines that should only be used internally here.
Renamed all to do w/ 'EXTRA' and 'extraSerialization' to instead talk
about metadata saved to caches; was unclear previously what EXTRA was
about.
Renamed static final declarations as all uppercase.
(readBlockDataInternal): Redid. Couldn't make sense of it previously.
Undid heavy-duty parse of header by constructing HFileBlock. Other
cleanups. Its 1/3rd the length it used to be. More to do in here.
When we read from HDFS, we overread to pick up the next blocks header.
Doing this saves a seek as we move through the hfile; we save having to
do an explicit seek just to read the block header every time we need to
read the body. We used to read in the next header as part of the
current blocks buffer. This buffer was then what got persisted to
blockcache; so we were over-persisting: our block plus the next blocks'
header (33 bytes).
This patch undoes this over-persisting.
Removes support for version 1 blocks (0.2 was added in hbase-0.92.0).
Not needed any more.
There is an open question on whether checksums should be persisted
when caching. The code seems to say no but if cache is SSD backed or
backed by anything that does not do error correction, we'll want
checksums.
Adds loads of documentation.
M hbase-common/src/main/java/org/apache/hadoop/hbase/io/hfile/BlockType.java
(write) Add writing from a ByteBuff.
M hbase-common/src/main/java/org/apache/hadoop/hbase/nio/ByteBuff.java
(toString) Add one so ByteBuff looks like ByteBuffer when you click on
it in IDE
M hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java
Remove support for version 1 blocks.
Cleaned up handling of metadata added when we serialize a block to
caches. Metadata is smaller now.
When we serialize (used when caching), do not persist the next blocks
header if present.
Removed a bunch of methods, a few of which had overlapping
functionality and others that exposed too much of our internals.
Also removed a bunch of constructors and unified the constructors we
had left over making them share a common init method.
Shutdown access to defines that should only be used internally here.
Renamed all to do w/ 'EXTRA' and 'extraSerialization' to instead talk
about metadata saved to caches; was unclear previously what EXTRA was
about.
Renamed static final declarations as all uppercase.
(readBlockDataInternal): Redid. Couldn't make sense of it previously.
Undid heavy-duty parse of header by constructing HFileBlock. Other
cleanups. Its 1/3rd the length it used to be. More to do in here.
M hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java
optimize Add doc on what is being optimized. Also, if a Get Scan, do not
optimize else we'll keep going after our row is DONE.
Another place to make use of the Get Scan fact is when we are DONE.. if
Get Scan, we can close out the scan.
M hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreScanner.java
Add tests for Get Scans and optimize around block loading.
Summary:
Move the config keys to one place
Make Two different config keys. One for default, one for priority
Test Plan: unit tests
Differential Revision: https://reviews.facebook.net/D55575
Summary:
Currently WAL splitting is broken when a region has been opened multiple times in recent minutes.
Region open and region close write event markers to the wal. These markers should have the sequence id in them. However it is currently getting 1. That means that if a region has moved multiple times in the last few mins then multiple split log workers will try and create the recovered edits file for sequence id 1. One of the workers will fail and on failing they will delete the recovered edits. Causing all split wal attempts to fail.
We need to:
It appears that the close event with a sequence id of one is coming from region warm up.
This patch fixes that by making sure the close on warm up doesn't happen. Also splitting will ignore any of the events that are already in the logs.
Test Plan: Unit tests pass
Differential Revision: https://reviews.facebook.net/D55557
M hbase-common/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileContext.java
Make it emit its toString in format that matches the way we log
elsewhere
M hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java
Capitalize statics.
M hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java
Verify and cleanup documentation of hfileblock format at head of class.
Explain what 'EXTRA_SERIALIZATION_SPACE' is all about.
Connect how we serialize and deserialize... done in different places
and one way when pulling from HDFS and another when pulling from cache
(TO BE FIXED). Shut down a load of public access.
M hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderImpl.java
Add trace-level logging
M hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileScanner.java
Make it Closeable
Summary:
Use less contended things for metrics.
For histogram which was the largest culprit we use FastLongHistogram
For atomic long where possible we now use counter.
Test Plan: unit tests
Reviewers:
Subscribers:
Differential Revision: https://reviews.facebook.net/D54381
Further investigation after HBASE-15221 lead to some findings that
AsyncProcess should have been managing the contents of the region
location cache, appropriately clearing it when necessary (e.g. an
RPC to a server fails because the server doesn't host that region)
For multi() RPCs, the tableName argument is null since there is no
single table that the updates are destined to. This inadvertently
caused the existing region location cache updates to fail on 1.x
branches. AsyncProcess needs to handle when tableName is null
and perform the necessary cache evictions.
As such, much of the new retry logic in HTableMultiplexer is
unnecessary and is removed with this commit. Getters which were
added as a part of testing were left since that are mostly
harmless and should contain no negative impact.
Signed-off-by: stack <stack@apache.org>
M hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/ChecksumUtil.java
Cleanup trace message and include offset; makes debug the easier.
M hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/FixedFileTrailer.java
Fix incorrect data member javadoc.
M hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java
Pass along the offset we are checksumming at.
M hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderImpljava
Add trace logging for debugging and set the end of the prefetch to be
last datablock, not size minus trailersize (there is the root indices
and file info to be skipped)
HeapMemoryTuner often over tunes memstore without looking at
the lower limit of the previous memstore size and causing a
situation in which memstore used size suddenly exceeds the
total memstore size.
Signed-off-by: Elliott Clark <eclark@apache.org>
ReplicationSinkManager and HBaseInterClusterReplicationEndpoint
perform certain unsafe operations which might lead to undesirable
behavior with multiwal enabled.
Signed-off-by: Elliott Clark <eclark@apache.org>
In HBaseInterClusterReplicationEndpoint, we fail the whole batch
in case of a RejectedExecutionException on an individual sub-batch.
We should let the submitted sub-batches finish and retry only for
the remaining ones.
Signed-off-by: Elliott Clark <eclark@apache.org>
Summary:
* Add VersionInfoUtil to determine if a client has a specified version or better
* Add an exception type to say that the response should be chunked
* Add on client knowledge of retry exceptions
* Add on metrics for how often this happens
Test Plan: Added a unit test
Differential Revision: https://reviews.facebook.net/D51771
Patch includes loosening of the test so we continue when threads
run in not-expected order. Also includes minor clean ups in
FSHLog -- a formatting change, removal of an unused trace logging,
and a check so we don't create a new exception when not needed --
but it also includes a subtle so we check if we need to get to safe
point EVEN IF an outstanding exception. Previous we could by-pass
the safe point check. This should make us even more robust against
lockup (though this is a change that comes of code reading, not of
any issue seen in test).
Here is some detail on how I loosened the test:
The test can run in an unexpected order. Attempts at dictating the
order in which threads fire only had me deadlocking one latch
against another (the test latch vs the WAL zigzag latch) so I
gave up trying and instead, if we happen to go the unusual route of
rolling WALs and failing flush before the scheduled log roll
latch goes into place, just time out the run after a few seconds
and exit the test (but do not fail it); just log a WARN.
This is less than ideal but allows us keep some coverage of the
tricky scenario that was bringing on deadlock (a broken WAL that
is throwing exceptions getting stuck waiting on a sync to clear
out the ring buffer getting overshadowed by a subsequent append
added in by a concurrent flush).