HBASE-2406 Define semantics of cell timestamps/versions
git-svn-id: https://svn.apache.org/repos/asf/hbase/trunk@1028949 13f79535-47bb-0310-9956-ffa450edef68
parent
8379057874
commit
c9da74ebc7
@ -626,6 +626,7 @@ Release 0.21.0 - Unreleased
|
||||
(Nicolas Spiegelberg via Stack)
|
||||
HBASE-3172 Reverse order of AssignmentManager and MetaNodeTracker in
|
||||
ZooKeeperWatcher
|
||||
HBASE-2406 Define semantics of cell timestamps/versions
|
||||
|
||||
|
||||
IMPROVEMENTS
|
||||
|
4
pom.xml
@ -255,6 +255,10 @@
|
||||
<xincludeSupported>true</xincludeSupported>
|
||||
<chunkedOutput>true</chunkedOutput>
|
||||
<useIdAsFilename>true</useIdAsFilename>
|
||||
<baseDir>book-</baseDir>
|
||||
<sectionAutolabelMaxDepth>100</sectionAutolabelMaxDepth>
|
||||
<sectionAutolabel>true</sectionAutolabel>
|
||||
<sectionLabelIncludesComponentLabel>true</sectionLabelIncludesComponentLabel>
|
||||
<targetDirectory>${basedir}/target/site/</targetDirectory>
|
||||
</configuration>
|
||||
</plugin>
|
||||
|
@ -23,17 +23,373 @@
|
||||
</revhistory>
|
||||
</info>
|
||||
|
||||
<chapter xml:id="introduction">
|
||||
<title>Introduction</title>
|
||||
|
||||
<para>This book aims to be the official guide for the <link
|
||||
xlink:href="http://hbase.apache.org/">HBase</link> version it ships with.
|
||||
This document describes HBase version <emphasis><?eval ${project.version}?></emphasis>.
|
||||
Herein you will find either the definitive documentation on an HBase topic
|
||||
as of its standing when the referenced HBase version shipped, or failing
|
||||
that, this book will point to the location in <link
|
||||
xlink:href="http://hbase.apache.org/docs/current/api/index.html">javadoc</link>,
|
||||
<link xlink:href="https://issues.apache.org/jira/browse/HBASE">JIRA</link>
|
||||
or <link xlink:href="http://wiki.apache.org/hadoop/Hbase">wiki</link>
|
||||
where the pertinent information can be found.</para>
|
||||
|
||||
<para>This book is a work in progress. It is lacking in many areas, but we
hope to fill in the holes with time. If you would like to contribute, attach
a patch to an issue in the HBase <link
xlink:href="https://issues.apache.org/jira/browse/HBASE">JIRA</link>.</para>
|
||||
</chapter>
|
||||
|
||||
<chapter xml:id="getting_started">
|
||||
<title>Getting Started</title>
|
||||
|
||||
<section>
|
||||
<title>Requirements</title>
|
||||
<section xml:id="quickstart">
|
||||
<title>Quick Start</title>
|
||||
|
||||
<para>First...</para>
|
||||
<para><itemizedlist>
|
||||
<para>Here is a quick guide to starting up a standalone HBase
|
||||
instance, inserting rows into a table via the <link
|
||||
linkend="shell">HBase Shell</link>, and then clean up and shutting
|
||||
down your instance.</para>
|
||||
|
||||
<listitem>
|
||||
<para>Download and unpack the latest stable release.</para>
|
||||
|
||||
<para>Choose a download source from <link
|
||||
xlink:href="http://www.apache.org/dyn/closer.cgi/hbase/">Apache
|
||||
Download Mirrors</link>. Click on it. This will take you to a
|
||||
mirror of the <emphasis>HBase Releases</emphasis> page. Click on
|
||||
the folder named <filename>stable</filename> and then download the
|
||||
file <filename><?eval ${project.version}?>.tar.gz</filename>.</para>
|
||||
|
||||
<para>Decompress and untar your download. Then change into the
unpacked directory and start HBase.</para>
|
||||
|
||||
<para><programlisting>$ tar xfz <?eval ${project.version}?>.tar.gz
|
||||
$ cd <?eval ${project.version}?>
$ ./bin/start-hbase.sh
starting master, logging to logs/hbase-user-master-example.org.out</programlisting></para>
|
||||
|
||||
<para>You now have a running HBase instance. HBase logs can be
|
||||
found in the <filename>logs</filename> subdirectory. Check them
|
||||
out.</para>
|
||||
</listitem>
|
||||
|
||||
<listitem>
|
||||
<para>Connect to your running HBase via the HBase Shell</para>
|
||||
|
||||
<para><programlisting>$ ./bin/hbase shell
|
||||
HBase Shell; enter 'help<RETURN>' for list of supported commands.
|
||||
Type "exit<RETURN>" to leave the HBase Shell
|
||||
Version: 0.89.20100924, r1001068, Fri Sep 24 13:55:42 PDT 2010
|
||||
|
||||
hbase(main):001:0> </programlisting></para>
|
||||
|
||||
<para>Type <command>help</command> to see a listing of shell
commands and options. Read at least the paragraphs at the end of
the help output for the gist of how variables are entered in the
HBase shell; in particular, note that table names, rows, and
columns must be quoted.</para>
|
||||
</listitem>
|
||||
|
||||
<listitem>
|
||||
<para>Create a table named <filename>test</filename> with a single
column family named <filename>cf</filename>.</para>
|
||||
|
||||
<para><programlisting>hbase(main):003:0> create 'test', 'cf'
|
||||
0 row(s) in 1.2200 seconds</programlisting></para>
|
||||
</listitem>
|
||||
|
||||
<listitem>
|
||||
<para>Insert some values into the table
|
||||
<varname>test</varname>.</para>
|
||||
|
||||
<para>Below we insert 3 values. The first insert is at
|
||||
<varname>row1</varname>, column <varname>cf:a</varname> -- columns
|
||||
have a column family prefix delimited by the colon character --
|
||||
with a value of <varname>value1</varname>.</para>
|
||||
|
||||
<para><programlisting>hbase(main):004:0> put 'test', 'row1', 'cf:a', 'value1'
|
||||
0 row(s) in 0.0560 seconds
|
||||
hbase(main):005:0> put 'test', 'row2', 'cf:b', 'value2'
|
||||
0 row(s) in 0.0370 seconds
|
||||
hbase(main):006:0> put 'test', 'row3', 'cf:c', 'value3'
|
||||
0 row(s) in 0.0450 seconds</programlisting></para>
|
||||
</listitem>
|
||||
|
||||
<listitem>
|
||||
<para>Verify the table content</para>
|
||||
|
||||
<para>Run a scan of the table by doing the following</para>
|
||||
|
||||
<para><programlisting>hbase(main):007:0> scan 'test'
|
||||
ROW COLUMN+CELL
|
||||
row1 column=cf:a, timestamp=1288380727188, value=value1
|
||||
row2 column=cf:b, timestamp=1288380738440, value=value2
|
||||
row3 column=cf:c, timestamp=1288380747365, value=value3
|
||||
3 row(s) in 0.0590 seconds</programlisting></para>
|
||||
|
||||
<para>Get a single row as follows</para>
|
||||
|
||||
<para><programlisting>hbase(main):008:0> get 'test', 'row1'
|
||||
COLUMN CELL
|
||||
cf:a timestamp=1288380727188, value=value1
|
||||
1 row(s) in 0.0400 seconds</programlisting></para>
|
||||
</listitem>
|
||||
|
||||
<listitem>
|
||||
<para>Now, disable and drop your table. This cleans up everything
done above.</para>
|
||||
|
||||
<para><programlisting>hbase(main):012:0> disable 'test'
|
||||
0 row(s) in 1.0930 seconds
|
||||
hbase(main):013:0> drop 'test'
|
||||
0 row(s) in 0.0770 seconds </programlisting></para>
|
||||
</listitem>
|
||||
|
||||
<listitem>
|
||||
<para>Exit the shell by typing <command>exit</command>.</para>
|
||||
|
||||
<para><programlisting>hbase(main):014:0> exit
|
||||
$ </programlisting></para>
|
||||
</listitem>
|
||||
|
||||
<listitem>
|
||||
<para>Stop your HBase instance by running the stop script.</para>
|
||||
|
||||
<para><programlisting>$ ./bin/stop-hbase.sh
|
||||
stopping hbase...............</programlisting></para>
|
||||
</listitem>
|
||||
</itemizedlist></para>
|
||||
</section>
|
||||
|
||||
<section xml:id="notsoquick">
|
||||
<title>Not-so-quick Start</title>
|
||||
|
||||
<para>The HBase API overview document contains a detailed <link
xlink:href="http://hbase.apache.org/docs/current/api/overview-summary.html#overview_description">Getting
Started</link> with a list of requirements and a description of the
different HBase run modes: standalone (described above in <link
linkend="quickstart">Quick Start</link>), pseudo-distributed, where all
daemons run on a single server, and distributed.</para>
|
||||
</section>
|
||||
</chapter>
|
||||
|
||||
<chapter>
|
||||
<chapter xml:id="datamodel">
|
||||
<title>Data Model</title>
|
||||
|
||||
<section>
|
||||
<title>Table</title>
|
||||
|
||||
<para></para>
|
||||
</section>
|
||||
|
||||
<section>
|
||||
<title>Row</title>
|
||||
|
||||
<para></para>
|
||||
</section>
|
||||
|
||||
<section>
|
||||
<title>Column Family</title>
|
||||
|
||||
<para></para>
|
||||
</section>
|
||||
|
||||
<section xml:id="versions">
|
||||
<title>Versions</title>
|
||||
|
||||
<para>A <emphasis>{row, column, version}</emphasis> tuple exactly
specifies a <literal>cell</literal> in HBase. It is possible to have an
unbounded number of cells where the row and column are the same but the
cell address differs only in its version dimension.</para>
|
||||
|
||||
<para>While rows and column keys are expressed as bytes, the version is
|
||||
specified using a long integer. Typically this long contains time
|
||||
instances such as those returned by
|
||||
<code>java.util.Date.getTime()</code> or
|
||||
<code>System.currentTimeMillis()</code>, that is: <quote>the difference,
|
||||
measured in milliseconds, between the current time and midnight, January
|
||||
1, 1970 UTC</quote>.</para>
|
||||
|
||||
<para>The HBase version dimension is stored in decreasing order, so that
|
||||
when reading from a store file, the most recent values are found
|
||||
first.</para>
|
||||
|
||||
<para>There is a lot of confusion over the semantics of
<literal>cell</literal> versions in HBase. In particular, a couple of
questions that often come up are:<itemizedlist>
|
||||
<listitem>
|
||||
<para>If multiple writes to a cell have the same version, are all
|
||||
versions maintained or just the last?<footnote>
|
||||
<para>Currently, only the last written is fetchable.</para>
|
||||
</footnote></para>
|
||||
</listitem>
|
||||
|
||||
<listitem>
|
||||
<para>Is it OK to write cells in a non-increasing version
|
||||
order?<footnote>
|
||||
<para>Yes</para>
|
||||
</footnote></para>
|
||||
</listitem>
|
||||
</itemizedlist></para>
|
||||
|
||||
<para>Below we describe how the version dimension in HBase currently
|
||||
works<footnote>
|
||||
<para>See <link
|
||||
xlink:href="https://issues.apache.org/jira/browse/HBASE-2406">HBASE-2406</link>
|
||||
for discussion of HBase versions. <link
|
||||
xlink:href="http://outerthought.org/blog/417-ot.html">Bending time
|
||||
in HBase</link> makes for a good read on the version, or time,
|
||||
dimension in HBase. It has more detail on versioning than is
|
||||
provided here. As of this writing, the limitation
|
||||
<emphasis>Overwriting values at existing timestamps</emphasis>
|
||||
mentioned in the article no longer holds in HBase. This section is
|
||||
basically a synopsis of this article by Bruno Dumon.</para>
|
||||
</footnote>.</para>
|
||||
|
||||
<section>
|
||||
<title>Versions and HBase Operations</title>
|
||||
|
||||
<para>In this section we look at the behavior of the version dimension
|
||||
for each of the core HBase operations.</para>
|
||||
|
||||
<section>
|
||||
<title>Get/Scan</title>
|
||||
|
||||
<para>Gets are implemented on top of Scans. The below discussion of
|
||||
Get applies equally to Scans.</para>
|
||||
|
||||
<para>By default, i.e. if you specify no explicit version, when
|
||||
doing a <literal>get</literal>, the cell whose version has the
|
||||
largest value is returned (which may or may not be the latest one
|
||||
written, see later). The default behavior can be modified in the
|
||||
following ways:</para>
|
||||
|
||||
<itemizedlist>
|
||||
<listitem>
|
||||
<para>to return more than one version, see <link
|
||||
xlink:href="http://hbase.apache.org/docs/current/api/org/apache/hadoop/hbase/client/Get.html#setMaxVersions()">Get.setMaxVersions()</link></para>
|
||||
</listitem>
|
||||
|
||||
<listitem>
|
||||
<para>to return versions other than the latest, see <link
|
||||
xlink:href="???">Get.setTimeRange()</link></para>
|
||||
|
||||
<para>To retrieve the latest version that is less than or equal
|
||||
to a given value, thus giving the 'latest' state of the record
|
||||
at a certain point in time, just use a range from 0 to the
|
||||
desired version and set the max versions to 1.</para>
|
||||
</listitem>
|
||||
</itemizedlist>
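<para>The read semantics above can be sketched with a small in-memory
model. This is an illustration only, not the HBase client API: the class
<classname>VersionedCell</classname> and its methods are hypothetical names
invented for this example. It keeps versions in decreasing order, returns
the largest version by default, and finds the latest version less than or
equal to a given value. In the real client, the upper bound passed to
<code>Get.setTimeRange()</code> is, to our understanding, exclusive, so a
range meant to include version T would end at T + 1; check the javadoc
before relying on this.</para>

```java
import java.util.Collections;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

/** Toy in-memory model of one HBase column; not the HBase client API. */
public class VersionedCell {
    // Versions are kept in decreasing order, so the newest entry comes first.
    private final NavigableMap<Long, String> versions =
            new TreeMap<>(Collections.reverseOrder());

    /** A put at an existing version replaces the value stored there. */
    public void put(long version, String value) {
        versions.put(version, value);
    }

    /** Default get: the value with the largest version, or null if empty. */
    public String getLatest() {
        Map.Entry<Long, String> first = versions.firstEntry();
        return first == null ? null : first.getValue();
    }

    /** Latest value whose version is less than or equal to asOf, or null. */
    public String getAsOf(long asOf) {
        // With a reversed comparator, ceilingEntry yields the largest
        // version that is numerically <= asOf.
        Map.Entry<Long, String> entry = versions.ceilingEntry(asOf);
        return entry == null ? null : entry.getValue();
    }

    public static void main(String[] args) {
        VersionedCell cell = new VersionedCell();
        cell.put(30L, "c");
        cell.put(10L, "a"); // written later, but with a smaller version
        cell.put(20L, "b");
        System.out.println(cell.getLatest());  // c
        System.out.println(cell.getAsOf(25L)); // b
        System.out.println(cell.getAsOf(5L));  // null
    }
}
```

<para>Note that <code>getLatest()</code> returns the value at version 30
even though the values at versions 10 and 20 were written afterwards: the
largest version wins, not the most recent write.</para>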
|
||||
</section>
|
||||
|
||||
<section>
|
||||
<title>Put</title>
|
||||
|
||||
<para>Doing a put always creates a new version of a
|
||||
<literal>cell</literal>, at a certain timestamp. By default the
|
||||
system uses the server's <literal>currentTimeMillis</literal>, but
|
||||
you can specify the version (= the long integer) yourself, on a
|
||||
per-column level. This means you could assign a time in the past or
|
||||
the future, or use the long value for non-time purposes.</para>
|
||||
|
||||
<para>To overwrite an existing value, do a put at exactly the same
|
||||
row, column, and version as that of the cell you would
|
||||
overshadow.</para>
|
||||
</section>
|
||||
|
||||
<section>
|
||||
<title>Delete</title>
|
||||
|
||||
<para>When performing a delete operation in HBase, there are two
ways to specify the versions to be deleted:</para>
|
||||
|
||||
<itemizedlist>
|
||||
<listitem>
|
||||
<para>Delete all versions at or older than a certain timestamp</para>
|
||||
</listitem>
|
||||
|
||||
<listitem>
|
||||
<para>Delete the version at a specific timestamp</para>
|
||||
</listitem>
|
||||
</itemizedlist>
|
||||
|
||||
<para>A delete can apply to a complete row, a complete column
|
||||
family, or to just one column. It is only in the last case that you
|
||||
can delete explicit versions. For the deletion of a row or all the
|
||||
columns within a family, it always works by deleting all cells older
|
||||
than a certain version.</para>
|
||||
|
||||
<para>Deletes work by creating <emphasis>tombstone</emphasis>
|
||||
markers. For example, let's suppose we want to delete a row. For
|
||||
this you can specify a version, or else by default the
|
||||
<literal>currentTimeMillis</literal> is used. What this means is
|
||||
<quote>delete all cells where the version is less than or equal to
|
||||
this version</quote>. HBase never modifies data in place, so for
|
||||
example a delete will not immediately delete (or mark as deleted)
|
||||
the entries in the storage file that correspond to the delete
|
||||
condition. Rather, a so-called <emphasis>tombstone</emphasis> is
|
||||
written, which will mask the deleted values<footnote>
|
||||
<para>When HBase does a major compaction, the tombstones are
|
||||
processed to actually remove the dead values, together with the
|
||||
tombstones themselves.</para>
|
||||
</footnote>. If the version you specified when deleting a row is
|
||||
larger than the version of any value in the row, then you can
|
||||
consider the complete row to be deleted.</para>
|
||||
</section>
|
||||
</section>
|
||||
|
||||
<section>
|
||||
<title>Current Limitations</title>
|
||||
|
||||
<para>There are still some bugs (or at least 'undecided behavior')
|
||||
with the version dimension that will be addressed by later HBase
|
||||
releases.</para>
|
||||
|
||||
<section>
|
||||
<title>Deletes mask Puts</title>
|
||||
|
||||
<para>Deletes mask puts, even puts that happened after the delete
|
||||
was entered<footnote>
|
||||
<para><link
|
||||
xlink:href="https://issues.apache.org/jira/browse/HBASE-2256">HBASE-2256</link></para>
|
||||
</footnote>. Remember that a delete writes a tombstone, which only
|
||||
disappears after the next major compaction has run. Suppose you do
a delete of everything <= T. After this you do a new put with a
timestamp <= T. This put, even if it happened after the delete,
will be masked by the delete tombstone. Performing the put will not
fail, but when you do a get you will notice the put had no
effect. It will start working again after the major compaction has
|
||||
run. These issues should not be a problem if you use
|
||||
always-increasing versions for new puts to a row. But they can occur
|
||||
even if you do not care about time: just do delete and put
|
||||
immediately after each other, and there is some chance they happen
|
||||
within the same millisecond.</para>
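<para>The masking behavior can be sketched with a toy model. This is not
HBase code: <classname>TombstoneModel</classname> and its methods are
hypothetical names invented for this illustration, and the model tracks a
single column with a single <quote>delete everything up to this
version</quote> tombstone.</para>

```java
import java.util.Collections;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

/** Toy model of one column with a delete tombstone; not the HBase client API. */
public class TombstoneModel {
    private final NavigableMap<Long, String> versions =
            new TreeMap<>(Collections.reverseOrder());
    // Masks every version <= this value; null means no tombstone is present.
    private Long tombstone = null;

    /** A put never fails, even if a tombstone will mask it on read. */
    public void put(long version, String value) {
        versions.put(version, value);
    }

    /** Delete everything with version <= upTo by recording a tombstone. */
    public void delete(long upTo) {
        tombstone = (tombstone == null) ? upTo : Math.max(tombstone, upTo);
    }

    /** Newest value not masked by the tombstone, or null if none. */
    public String get() {
        for (Map.Entry<Long, String> entry : versions.entrySet()) {
            if (tombstone == null || entry.getKey() > tombstone) {
                return entry.getValue();
            }
        }
        return null;
    }

    /** Major compaction: drop the masked values, then the tombstone itself. */
    public void majorCompact() {
        if (tombstone != null) {
            final long limit = tombstone;
            versions.keySet().removeIf(version -> version <= limit);
            tombstone = null;
        }
    }

    public static void main(String[] args) {
        TombstoneModel column = new TombstoneModel();
        column.put(100L, "old");
        column.delete(200L);              // delete everything <= 200
        column.put(150L, "new");          // issued after the delete, version <= 200
        System.out.println(column.get()); // null: the put is masked
        column.majorCompact();
        column.put(150L, "new");
        System.out.println(column.get()); // new: puts take effect again
    }
}
```

<para>In this model, a put with a version larger than the tombstone (for
example 201) would be visible immediately, which is why always-increasing
versions avoid the problem.</para>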
|
||||
</section>
|
||||
|
||||
<section>
|
||||
<title>Major compactions change query results</title>
|
||||
|
||||
<para><quote>...create three cell versions at t1, t2 and t3, with a
|
||||
maximum-versions setting of 2. So when getting all versions, only
|
||||
the values at t2 and t3 will be returned. But if you delete the
|
||||
version at t2 or t3, the one at t1 will appear again. Obviously,
|
||||
once a major compaction has run, such behavior will not be the case
|
||||
anymore...<footnote>
|
||||
<para>See <emphasis>Garbage Collection</emphasis> in <link
|
||||
xlink:href="http://outerthought.org/blog/417-ot.html">Bending
|
||||
time in HBase</link> </para>
|
||||
</footnote></quote></para>
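<para>The scenario in the quote can be sketched with a toy model. This is
an illustration under simplifying assumptions, not HBase code:
<classname>MaxVersionsModel</classname> and its methods are hypothetical
names. The model applies the maximum-versions limit at read time and only
discards excess versions physically when a major compaction runs.</para>

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Toy model of a max-versions limit applied at read time; not HBase code. */
public class MaxVersionsModel {
    private final int maxVersions;
    // Stored versions, newest first.
    private final TreeSet<Long> stored = new TreeSet<>(Collections.reverseOrder());
    // Tombstones for individual versions.
    private final Set<Long> deleted = new HashSet<>();

    public MaxVersionsModel(int maxVersions) {
        this.maxVersions = maxVersions;
    }

    public void put(long version) {
        stored.add(version);
    }

    public void deleteVersion(long version) {
        deleted.add(version);
    }

    /** Newest-first list of up to maxVersions versions that are not deleted. */
    public List<Long> getAllVersions() {
        List<Long> result = new ArrayList<>();
        for (long version : stored) {
            if (deleted.contains(version)) {
                continue;
            }
            if (result.size() == maxVersions) {
                break;
            }
            result.add(version);
        }
        return result;
    }

    /** Major compaction: physically keep only what a read would return. */
    public void majorCompact() {
        List<Long> survivors = getAllVersions();
        stored.retainAll(survivors);
        deleted.clear();
    }

    public static void main(String[] args) {
        MaxVersionsModel before = new MaxVersionsModel(2);
        before.put(1L); before.put(2L); before.put(3L);
        System.out.println(before.getAllVersions()); // [3, 2]
        before.deleteVersion(2L);
        System.out.println(before.getAllVersions()); // [3, 1]: t1 reappears

        MaxVersionsModel after = new MaxVersionsModel(2);
        after.put(1L); after.put(2L); after.put(3L);
        after.majorCompact();                        // t1 is physically removed
        after.deleteVersion(2L);
        System.out.println(after.getAllVersions());  // [3]
    }
}
```

<para>The same delete yields different results depending on whether a major
compaction ran first, which is the inconsistency this section
describes.</para>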
|
||||
</section>
|
||||
</section>
|
||||
</section>
|
||||
</chapter>
|
||||
|
||||
<chapter xml:id="shell">
|
||||
<title>The HBase Shell</title>
|
||||
|
||||
<para></para>
|
||||
@ -63,11 +419,14 @@
|
||||
</section>
|
||||
</chapter>
|
||||
|
||||
<chapter>
|
||||
<chapter xml:id="regions">
|
||||
<title>Regions</title>
|
||||
|
||||
<para>This chapter is all about Regions.</para>
|
||||
|
||||
<note>
|
||||
<para>Does this belong in the data model chapter?</para>
|
||||
</note>
|
||||
|
||||
<section>
|
||||
<title>Region Size</title>
|
||||
@ -114,10 +473,11 @@
|
||||
|
||||
<section>
|
||||
<title>Region Transitions</title>
|
||||
<note>
|
||||
<para>TODO: Review all of the below to ensure it matches what was
|
||||
committed -- St.Ack 20100901</para>
|
||||
</note>
|
||||
|
||||
<note>
|
||||
<para>TODO: Review all of the below to ensure it matches what was
|
||||
committed -- St.Ack 20100901</para>
|
||||
</note>
|
||||
|
||||
<para>Regions only transition in a limited set of circumstances.</para>
|
||||
|
||||
@ -674,20 +1034,21 @@
|
||||
</itemizedlist>
|
||||
</section>
|
||||
</section>
|
||||
|
||||
<section>
|
||||
<title>Region Splits</title>
|
||||
<para>Splits run unaided on the RegionServer; i.e. the Master does not
|
||||
participate. The RegionServer splits
|
||||
a region, offlines the split region and then adds the daughter regions
|
||||
to META, opens daughters on the parent's hosting RegionServer and then
|
||||
reports the split to the master.
|
||||
</para>
|
||||
<title>Region Splits</title>
|
||||
|
||||
<para>Splits run unaided on the RegionServer; i.e. the Master does not
|
||||
participate. The RegionServer splits a region, offlines the split
|
||||
region and then adds the daughter regions to META, opens daughters on
|
||||
the parent's hosting RegionServer and then reports the split to the
|
||||
master.</para>
|
||||
</section>
|
||||
</section>
|
||||
</chapter>
|
||||
|
||||
<chapter>
|
||||
<title>The WAL</title>
|
||||
<title xml:id="wal">The WAL</title>
|
||||
|
||||
<subtitle>HBase's<link
|
||||
xlink:href="http://en.wikipedia.org/wiki/Write-ahead_logging"> Write-Ahead
|
||||
@ -767,7 +1128,7 @@
|
||||
</chapter>
|
||||
|
||||
<chapter>
|
||||
<title>Bloom Filters</title>
|
||||
<title xml:id="blooms">Bloom Filters</title>
|
||||
|
||||
<para>Bloom filters were developed over in <link
|
||||
xlink:href="https://issues.apache.org/jira/browse/HBASE-1200">HBase-1200
|
||||
@ -796,7 +1157,8 @@
|
||||
<title>Configurations</title>
|
||||
|
||||
<para>Blooms are enabled by specifying options on a column family in the
|
||||
HBase shell or in java code as specification on <classname>org.apache.hadoop.hbase.HColumnDescriptor</classname>.</para>
|
||||
HBase shell or in java code as specification on
|
||||
<classname>org.apache.hadoop.hbase.HColumnDescriptor</classname>.</para>
|
||||
|
||||
<section>
|
||||
<title><code>HColumnDescriptor</code> option</title>
|
||||
@ -885,9 +1247,25 @@
|
||||
</chapter>
|
||||
|
||||
<appendix>
|
||||
<title>Tools</title>
|
||||
<title xml:id="tools">Tools</title>
|
||||
|
||||
<para>Here we list HBase tools for administration, analysis, fixup, and
|
||||
debugging.</para>
|
||||
</appendix>
|
||||
|
||||
<glossary xml:id="glossary">
|
||||
<title>HBase Glossary</title>
|
||||
|
||||
<glossentry>
|
||||
<glossterm xml:id="cf">column family</glossterm>
|
||||
|
||||
<acronym>cf</acronym>
|
||||
|
||||
<abbrev>cf</abbrev>
|
||||
|
||||
<glossdef>
|
||||
<para>Define a column family</para>
|
||||
</glossdef>
|
||||
</glossentry>
|
||||
</glossary>
|
||||
</book>
|
||||
|
@ -1,57 +0,0 @@
|
||||
<?xml version="1.0" encoding="UTF-8"?>
|
||||
<article version="5.0" xmlns="http://docbook.org/ns/docbook"
|
||||
xmlns:xlink="http://www.w3.org/1999/xlink"
|
||||
xmlns:xi="http://www.w3.org/2001/XInclude"
|
||||
xmlns:svg="http://www.w3.org/2000/svg"
|
||||
xmlns:m="http://www.w3.org/1998/Math/MathML"
|
||||
xmlns:html="http://www.w3.org/1999/xhtml"
|
||||
xmlns:db="http://docbook.org/ns/docbook">
|
||||
<info>
|
||||
<title>Wah-wah
|
||||
<?eval ${project.version}?>
|
||||
</title>
|
||||
|
||||
|
||||
</info>
|
||||
|
||||
<section xml:id="wahwah">
|
||||
<title>Wah-Wah changed my life</title>
|
||||
|
||||
<para>I was born very young...</para>
|
||||
|
||||
<para>This is a sample docbook article.</para>
|
||||
<para>
|
||||
<?eval ${project.version}?>
|
||||
</para>
|
||||
|
||||
<section xml:id="then">
|
||||
<title>Then</title>
|
||||
|
||||
<para></para>
|
||||
</section>
|
||||
|
||||
<section xml:id="and">
|
||||
<title>And</title>
|
||||
|
||||
<para></para>
|
||||
</section>
|
||||
|
||||
<section xml:id="later">
|
||||
<title>Later</title>
|
||||
|
||||
<para></para>
|
||||
</section>
|
||||
</section>
|
||||
|
||||
<section xml:id="good_books">
|
||||
<title>Good books</title>
|
||||
|
||||
<para></para>
|
||||
</section>
|
||||
|
||||
<section xml:id="rainy_days">
|
||||
<title>Rainy days</title>
|
||||
|
||||
<para>Today it was raining</para>
|
||||
</section>
|
||||
</article>
|
@ -38,7 +38,6 @@
|
||||
<item name="Cluster replication" href="replication.html" />
|
||||
<item name="Pseudo-Distributed HBase" href="pseudo-distributed.html" />
|
||||
<item name="HBase Book" href="book.html" />
|
||||
<item name="Example Docbook Article" href="sample_article.html" />
|
||||
</menu>
|
||||
</body>
|
||||
<skin>
|
||||
|