diff --git a/src/main/asciidoc/_chapters/architecture.adoc b/src/main/asciidoc/_chapters/architecture.adoc index 8b521dfc58e..0f02a7901e3 100644 --- a/src/main/asciidoc/_chapters/architecture.adoc +++ b/src/main/asciidoc/_chapters/architecture.adoc @@ -76,7 +76,7 @@ HBase can run quite well stand-alone on a laptop - but this should be considered [[arch.overview.hbasehdfs]] === What Is The Difference Between HBase and Hadoop/HDFS? -link:http://hadoop.apache.org/hdfs/[HDFS] is a distributed file system that is well suited for the storage of large files. +link:https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/HdfsDesign.html[HDFS] is a distributed file system that is well suited for the storage of large files. Its documentation states that it is not, however, a general purpose file system, and does not provide fast individual record lookups in files. HBase, on the other hand, is built on top of HDFS and provides fast record lookups (and updates) for large tables. This can sometimes be a point of conceptual confusion. @@ -119,9 +119,7 @@ If a region has both an empty start and an empty end key, it is the only region ==== In the (hopefully unlikely) event that programmatic processing of catalog metadata -is required, see the -+++Writables+++ -utility. +is required, see the link:http://hbase.apache.org/devapidocs/org/apache/hadoop/hbase/client/RegionInfo.html#parseFrom-byte:A-[RegionInfo.parseFrom] utility. [[arch.catalog.startup]] === Startup Sequencing @@ -221,7 +219,7 @@ In HBase 2.0 and later, link:http://hbase.apache.org/devapidocs/org/apache/hadoo For additional information on write durability, review the link:/acid-semantics.html[ACID semantics] page. -For fine-grained control of batching of ``Put``s or ``Delete``s, see the link:http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Table.html#batch%28java.util.List%29[batch] methods on Table. 
+For fine-grained control of batching of ``Put``s or ``Delete``s, see the link:http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Table.html#batch-java.util.List-java.lang.Object:A-[batch] methods on Table. [[async.client]] === Asynchronous Client === @@ -2799,7 +2797,7 @@ if (result.isStale()) { === Resources . More information about the design and implementation can be found at the jira issue: link:https://issues.apache.org/jira/browse/HBASE-10070[HBASE-10070] -. HBaseCon 2014 link:http://hbasecon.com/sessions/#session15[talk] also contains some details and link:http://www.slideshare.net/enissoz/hbase-high-availability-for-reads-with-time[slides]. +. HBaseCon 2014 talk: link:http://hbase.apache.org/www.hbasecon.com/#2014-PresentationsRecordings[HBase Read High Availability Using Timeline-Consistent Region Replicas] also contains some details and link:http://www.slideshare.net/enissoz/hbase-high-availability-for-reads-with-time[slides]. ifdef::backend-docbook[] [index] diff --git a/src/main/asciidoc/_chapters/community.adoc b/src/main/asciidoc/_chapters/community.adoc index f63d597ba26..d141dbf2e10 100644 --- a/src/main/asciidoc/_chapters/community.adoc +++ b/src/main/asciidoc/_chapters/community.adoc @@ -47,9 +47,9 @@ The below policy is something we put in place 09/2012. It is a suggested policy rather than a hard requirement. We want to try it first to see if it works before we cast it in stone. -Apache HBase is made of link:https://issues.apache.org/jira/browse/HBASE#selectedTab=com.atlassian.jira.plugin.system.project%3Acomponents-panel[components]. +Apache HBase is made of link:https://issues.apache.org/jira/projects/HBASE?selectedItem=com.atlassian.jira.jira-projects-plugin:components-page[components]. Components have one or more <>s. 
-See the 'Description' field on the link:https://issues.apache.org/jira/browse/HBASE#selectedTab=com.atlassian.jira.plugin.system.project%3Acomponents-panel[components] JIRA page for who the current owners are by component. +See the 'Description' field on the link:https://issues.apache.org/jira/projects/HBASE?selectedItem=com.atlassian.jira.jira-projects-plugin:components-page[components] JIRA page for who the current owners are by component. Patches that fit within the scope of a single Apache HBase component require, at least, a +1 by one of the component's owners before commit. If owners are absent -- busy or otherwise -- two +1s by non-owners will suffice. @@ -88,7 +88,7 @@ We also are currently in violation of this basic tenet -- replication at least k [[owner]] .Component Owner/Lieutenant -Component owners are listed in the description field on this Apache HBase JIRA link:https://issues.apache.org/jira/browse/HBASE#selectedTab=com.atlassian.jira.plugin.system.project%3Acomponents-panel[components] page. +Component owners are listed in the description field on this Apache HBase JIRA link:https://issues.apache.org/jira/projects/HBASE?selectedItem=com.atlassian.jira.jira-projects-plugin:components-page[components] page. The owners are listed in the 'Description' field rather than in the 'Component Lead' field because the latter only allows us list one individual whereas it is encouraged that components have multiple owners. Owners or component lieutenants are volunteers who are (usually, but not necessarily) expert in their component domain and may have an agenda on how they think their Apache HBase component should evolve. diff --git a/src/main/asciidoc/_chapters/compression.adoc b/src/main/asciidoc/_chapters/compression.adoc index e5b9b8f463b..4f86631cd3c 100644 --- a/src/main/asciidoc/_chapters/compression.adoc +++ b/src/main/asciidoc/_chapters/compression.adoc @@ -267,8 +267,7 @@ See <>). 
.Install LZO Support HBase cannot ship with LZO because of incompatibility between HBase, which uses an Apache Software License (ASL) and LZO, which uses a GPL license. -See the link:http://wiki.apache.org/hadoop/UsingLzoCompression[Using LZO - Compression] wiki page for information on configuring LZO support for HBase. +See the link:https://github.com/twitter/hadoop-lzo/blob/master/README.md[Hadoop-LZO at Twitter] for information on configuring LZO support for HBase. If you depend upon LZO compression, consider configuring your RegionServers to fail to start if LZO is not available. See <>. diff --git a/src/main/asciidoc/_chapters/configuration.adoc b/src/main/asciidoc/_chapters/configuration.adoc index 6c550c4fd3c..3d34dccda20 100644 --- a/src/main/asciidoc/_chapters/configuration.adoc +++ b/src/main/asciidoc/_chapters/configuration.adoc @@ -820,7 +820,7 @@ See the entry for `hbase.hregion.majorcompaction` in the <> diff --git a/src/main/asciidoc/_chapters/cp.adoc b/src/main/asciidoc/_chapters/cp.adoc index 2f5267f2d64..21b8c50ed7b 100644 --- a/src/main/asciidoc/_chapters/cp.adoc +++ b/src/main/asciidoc/_chapters/cp.adoc @@ -121,8 +121,8 @@ package. Observer coprocessors are triggered either before or after a specific event occurs. Observers that happen before an event use methods that start with a `pre` prefix, -such as link:http://hbase.apache.org/devapidocs/org/apache/hadoop/hbase/coprocessor/RegionObserver.html#prePut%28org.apache.hadoop.hbase.coprocessor.ObserverContext,%20org.apache.hadoop.hbase.client.Put,%20org.apache.hadoop.hbase.regionserver.wal.WALEdit,%20org.apache.hadoop.hbase.client.Durability%29[`prePut`]. 
Observers that happen just after an event override methods that start -with a `post` prefix, such as link:http://hbase.apache.org/devapidocs/org/apache/hadoop/hbase/coprocessor/RegionObserver.html#postPut%28org.apache.hadoop.hbase.coprocessor.ObserverContext,%20org.apache.hadoop.hbase.client.Put,%20org.apache.hadoop.hbase.regionserver.wal.WALEdit,%20org.apache.hadoop.hbase.client.Durability%29[`postPut`]. +such as link:http://hbase.apache.org/devapidocs/org/apache/hadoop/hbase/coprocessor/RegionObserver.html#prePut-org.apache.hadoop.hbase.coprocessor.ObserverContext-org.apache.hadoop.hbase.client.Put-org.apache.hadoop.hbase.wal.WALEdit-org.apache.hadoop.hbase.client.Durability-[`prePut`]. Observers that happen just after an event override methods that start +with a `post` prefix, such as link:http://hbase.apache.org/devapidocs/org/apache/hadoop/hbase/coprocessor/RegionObserver.html#postPut-org.apache.hadoop.hbase.coprocessor.ObserverContext-org.apache.hadoop.hbase.client.Put-org.apache.hadoop.hbase.wal.WALEdit-org.apache.hadoop.hbase.client.Durability-[`postPut`]. ==== Use Cases for Observer Coprocessors @@ -178,7 +178,7 @@ average or summation for an entire table which spans hundreds of regions. 
In contrast to observer coprocessors, where your code is run transparently, endpoint coprocessors must be explicitly invoked using the -link:https://hbase.apache.org/devapidocs/org/apache/hadoop/hbase/client/Table.html#coprocessorService%28java.lang.Class,%20byte%5B%5D,%20byte%5B%5D,%20org.apache.hadoop.hbase.client.coprocessor.Batch.Call%29[CoprocessorService()] +link:https://hbase.apache.org/devapidocs/org/apache/hadoop/hbase/client/Table.html#coprocessorService-java.lang.Class-byte:A-byte:A-org.apache.hadoop.hbase.client.coprocessor.Batch.Call-[CoprocessorService()] method available in link:https://hbase.apache.org/devapidocs/org/apache/hadoop/hbase/client/Table.html[Table] or diff --git a/src/main/asciidoc/_chapters/datamodel.adoc b/src/main/asciidoc/_chapters/datamodel.adoc index 3d08ace471d..12fb804890c 100644 --- a/src/main/asciidoc/_chapters/datamodel.adoc +++ b/src/main/asciidoc/_chapters/datamodel.adoc @@ -67,7 +67,7 @@ Timestamp:: [[conceptual.view]] == Conceptual View -You can read a very understandable explanation of the HBase data model in the blog post link:http://jimbojw.com/wiki/index.php?title=Understanding_Hbase_and_BigTable[Understanding HBase and BigTable] by Jim R. Wilson. +You can read a very understandable explanation of the HBase data model in the blog post link:http://jimbojw.com/#understanding%20hbase[Understanding HBase and BigTable] by Jim R. Wilson. Another good explanation is available in the PDF link:http://0b4af6cdc2f0c5998459-c0245c5c937c5dedcca3f1764ecc9b2f.r43.cf2.rackcdn.com/9353-login1210_khurana.pdf[Introduction to Basic Schema Design] by Amandeep Khurana. It may help to read different perspectives to get a solid understanding of HBase schema design. @@ -275,11 +275,11 @@ Operations are applied via link:http://hbase.apache.org/apidocs/org/apache/hadoo === Get link:http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Get.html[Get] returns attributes for a specified row. 
-Gets are executed via link:http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Table.html#get(org.apache.hadoop.hbase.client.Get)[Table.get]. +Gets are executed via link:http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Table.html#get-org.apache.hadoop.hbase.client.Get-[Table.get]. === Put -link:http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Put.html[Put] either adds new rows to a table (if the key is new) or can update existing rows (if the key already exists). Puts are executed via link:http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Table.html#put(org.apache.hadoop.hbase.client.Put)[Table.put] (non-writeBuffer) or link:http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Table.html#batch(java.util.List,%20java.lang.Object%5B%5D)[Table.batch] (non-writeBuffer). +link:http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Put.html[Put] either adds new rows to a table (if the key is new) or can update existing rows (if the key already exists). Puts are executed via link:http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Table.html#put-org.apache.hadoop.hbase.client.Put-[Table.put] (non-writeBuffer) or link:http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Table.html#batch-java.util.List-java.lang.Object:A-[Table.batch] (non-writeBuffer). [[scan]] === Scans @@ -316,7 +316,7 @@ Note that generally the easiest way to specify a specific stop point for a scan === Delete link:http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Delete.html[Delete] removes a row from a table. -Deletes are executed via link:http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Table.html#delete(org.apache.hadoop.hbase.client.Delete)[Table.delete]. +Deletes are executed via link:http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Table.html#delete-org.apache.hadoop.hbase.client.Delete-[Table.delete]. 
HBase does not modify data in place, and so deletes are handled by creating new markers called _tombstones_. These tombstones, along with the dead values, are cleaned up on major compactions. @@ -389,8 +389,8 @@ The below discussion of link:http://hbase.apache.org/apidocs/org/apache/hadoop/h By default, i.e. if you specify no explicit version, when doing a `get`, the cell whose version has the largest value is returned (which may or may not be the latest one written, see later). The default behavior can be modified in the following ways: -* to return more than one version, see link:http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Get.html#setMaxVersions()[Get.setMaxVersions()] -* to return versions other than the latest, see link:http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Get.html#setTimeRange(long,%20long)[Get.setTimeRange()] +* to return more than one version, see link:http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Get.html#setMaxVersions--[Get.setMaxVersions()] +* to return versions other than the latest, see link:http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Get.html#setTimeRange-long-long-[Get.setTimeRange()] + To retrieve the latest version that is less than or equal to a given value, thus giving the 'latest' state of the record at a certain point in time, just use a range from 0 to the desired version and set the max versions to 1. diff --git a/src/main/asciidoc/_chapters/developer.adoc b/src/main/asciidoc/_chapters/developer.adoc index d937d77bff3..ed82dedbaa6 100644 --- a/src/main/asciidoc/_chapters/developer.adoc +++ b/src/main/asciidoc/_chapters/developer.adoc @@ -64,7 +64,7 @@ FreeNode offers a web-based client, but most people prefer a native client, and === Jira -Check for existing issues in link:https://issues.apache.org/jira/browse/HBASE[Jira]. +Check for existing issues in link:https://issues.apache.org/jira/projects/HBASE/issues[Jira]. 
If it's either a new feature request, enhancement, or a bug, file a ticket. We track multiple types of work in JIRA: @@ -479,8 +479,7 @@ mvn -DskipTests package assembly:single deploy If you see `Unable to find resource 'VM_global_library.vm'`, ignore it. It's not an error. -It is link:http://jira.codehaus.org/browse/MSITE-286[officially - ugly] though. +It is link:https://issues.apache.org/jira/browse/MSITE-286[officially ugly] though. [[releasing]] == Releasing Apache HBase @@ -593,8 +592,7 @@ Adjust the version in all the POM files appropriately. If you are making a release candidate, you must remove the `-SNAPSHOT` label from all versions in all pom.xml files. If you are running this receipe to publish a snapshot, you must keep the `-SNAPSHOT` suffix on the hbase version. -The link:http://mojo.codehaus.org/versions-maven-plugin/[Versions - Maven Plugin] can be of use here. +The link:http://www.mojohaus.org/versions-maven-plugin/[Versions Maven Plugin] can be of use here. To set a version in all the many poms of the hbase multi-module project, use a command like the following: + [source,bourne] @@ -738,7 +736,7 @@ If you run the script, do your checks at this stage verifying the src and bin ta Tag before you start the build. You can always delete it if the build goes haywire. -. Sign, fingerprint and then 'stage' your release candiate version directory via svnpubsub by committing your directory to link:https://dist.apache.org/repos/dist/dev/hbase/[The 'dev' distribution directory] (See comments on link:https://issues.apache.org/jira/browse/HBASE-10554[HBASE-10554 Please delete old releases from mirroring system] but in essence it is an svn checkout of https://dist.apache.org/repos/dist/dev/hbase -- releases are at https://dist.apache.org/repos/dist/release/hbase). In the _version directory_ run the following commands: +. 
Sign, fingerprint and then 'stage' your release candidate version directory via svnpubsub by committing your directory to link:https://dist.apache.org/repos/dist/dev/hbase/[The 'dev' distribution directory] (See comments on link:https://issues.apache.org/jira/browse/HBASE-10554[HBASE-10554 Please delete old releases from mirroring system] but in essence, it is an svn checkout of https://dist.apache.org/repos/dist/dev/hbase. Releases are at https://dist.apache.org/repos/dist/release/hbase). In the _version directory_ run the following commands: + [source,bourne] ---- @@ -920,7 +918,7 @@ Also, keep in mind that if you are running tests in the `hbase-server` module yo === Unit Tests Apache HBase test cases are subdivided into four categories: small, medium, large, and -integration with corresponding JUnit link:http://www.junit.org/node/581[categories]: `SmallTests`, `MediumTests`, `LargeTests`, `IntegrationTests`. +integration with corresponding JUnit link:https://github.com/junit-team/junit4/wiki/Categories[categories]: `SmallTests`, `MediumTests`, `LargeTests`, `IntegrationTests`. JUnit categories are denoted using java annotations and look like this in your unit test code. [source,java] @@ -1286,7 +1284,7 @@ For other deployment options, a ClusterManager can be implemented and plugged in ==== Destructive integration / system tests (ChaosMonkey) HBase 0.96 introduced a tool named `ChaosMonkey`, modeled after -link:http://techblog.netflix.com/2012/07/chaos-monkey-released-into-wild.html[same-named tool by Netflix's Chaos Monkey tool]. +link:https://netflix.github.io/chaosmonkey/[the same-named tool by Netflix]. ChaosMonkey simulates real-world faults in a running cluster by killing or disconnecting random servers, or injecting other failures into the environment. 
You can use ChaosMonkey as a stand-alone tool @@ -1934,7 +1932,7 @@ Use the btn:[Submit Patch] button in JIRA, just like othe ===== Reject -Patches which do not adhere to the guidelines in link:https://wiki.apache.org/hadoop/Hbase/HowToCommit/hadoop/Hbase/HowToContribute#[HowToContribute] and to the link:https://wiki.apache.org/hadoop/Hbase/HowToCommit/hadoop/CodeReviewChecklist#[code review checklist] should be rejected. +Patches which do not adhere to the guidelines in link:https://hbase.apache.org/book.html#developer[HowToContribute] and to the link:https://wiki.apache.org/hadoop/CodeReviewChecklist[code review checklist] should be rejected. Committers should always be polite to contributors and try to instruct and encourage them to contribute better patches. If a committer wishes to improve an unacceptable patch, then it should first be rejected, and a new patch should be attached by the committer for review. diff --git a/src/main/asciidoc/_chapters/external_apis.adoc b/src/main/asciidoc/_chapters/external_apis.adoc index c0e4a5f5ec7..fe5e5fb2917 100644 --- a/src/main/asciidoc/_chapters/external_apis.adoc +++ b/src/main/asciidoc/_chapters/external_apis.adoc @@ -625,7 +625,9 @@ Documentation about Thrift has moved to <>. == C/C++ Apache HBase Client FB's Chip Turner wrote a pure C/C++ client. -link:https://github.com/facebook/native-cpp-hbase-client[Check it out]. +link:https://github.com/hinaria/native-cpp-hbase-client[Check it out]. + +For the C++ client implementation, see link:https://issues.apache.org/jira/browse/HBASE-14850[HBASE-14850]. 
[[jdo]] diff --git a/src/main/asciidoc/_chapters/ops_mgt.adoc b/src/main/asciidoc/_chapters/ops_mgt.adoc index 4a9815cdf9b..8e5c848dada 100644 --- a/src/main/asciidoc/_chapters/ops_mgt.adoc +++ b/src/main/asciidoc/_chapters/ops_mgt.adoc @@ -766,7 +766,7 @@ The LoadTestTool has received many updates in recent HBase releases, including s [[ops.regionmgt.majorcompact]] === Major Compaction -Major compactions can be requested via the HBase shell or link:http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Admin.html#majorCompact%28java.lang.String%29[Admin.majorCompact]. +Major compactions can be requested via the HBase shell or link:http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Admin.html#majorCompact-org.apache.hadoop.hbase.TableName-[Admin.majorCompact]. Note: major compactions do NOT do region merges. See <> for more information about compactions. @@ -783,7 +783,7 @@ $ bin/hbase org.apache.hadoop.hbase.util.Merge If you feel you have too many regions and want to consolidate them, Merge is the utility you need. Merge must run be done when the cluster is down. -See the link:http://ofps.oreilly.com/titles/9781449396107/performance.html[O'Reilly HBase +See the link:https://web.archive.org/web/20111231002503/http://ofps.oreilly.com/titles/9781449396107/performance.html[O'Reilly HBase Book] for an example of usage. You will need to pass 3 parameters to this application. @@ -1050,7 +1050,7 @@ In this case, or if you are in a OLAP environment and require having locality, t [[hbase_metrics]] == HBase Metrics -HBase emits metrics which adhere to the link:http://hadoop.apache.org/core/docs/current/api/org/apache/hadoop/metrics/package-summary.html[Hadoop metrics] API. +HBase emits metrics which adhere to the link:https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/Metrics.html[Hadoop Metrics] API. Starting with HBase 0.95footnote:[The Metrics system was redone in HBase 0.96. 
See Migration to the New Metrics Hotness – Metrics2 by Elliot Clark for detail], HBase is configured to emit a default set of metrics with a default sampling period of every 10 seconds. @@ -1331,7 +1331,7 @@ Have a look in the Web UI. == Cluster Replication NOTE: This information was previously available at -link:http://hbase.apache.org#replication[Cluster Replication]. +link:https://hbase.apache.org/0.94/replication.html[Cluster Replication]. HBase provides a cluster replication mechanism which allows you to keep one cluster's state synchronized with that of another cluster, using the write-ahead log (WAL) of the source cluster to propagate the changes. Some use cases for cluster replication include: @@ -2093,7 +2093,7 @@ The act of copying these files creates new HDFS metadata, which is why a restore === Live Cluster Backup - Replication This approach assumes that there is a second cluster. -See the HBase page on link:http://hbase.apache.org/book.html#replication[replication] for more information. +See the HBase page on link:http://hbase.apache.org/book.html#_cluster_replication[replication] for more information. [[ops.backup.live.copytable]] === Live Cluster Backup - CopyTable diff --git a/src/main/asciidoc/_chapters/other_info.adoc b/src/main/asciidoc/_chapters/other_info.adoc index 8bcbe0f7431..f2dd1b81da4 100644 --- a/src/main/asciidoc/_chapters/other_info.adoc +++ b/src/main/asciidoc/_chapters/other_info.adoc @@ -32,16 +32,14 @@ === HBase Videos .Introduction to HBase -* link:http://www.cloudera.com/content/cloudera/en/resources/library/presentation/chicago_data_summit_apache_hbase_an_introduction_todd_lipcon.html[Introduction to HBase] by Todd Lipcon (Chicago Data Summit 2011). -* link:http://www.cloudera.com/videos/intorduction-hbase-todd-lipcon[Introduction to HBase] by Todd Lipcon (2010). 
-link:http://www.cloudera.com/videos/hadoop-world-2011-presentation-video-building-realtime-big-data-services-at-facebook-with-hadoop-and-hbase[Building Real Time Services at Facebook with HBase] by Jonathan Gray (Hadoop World 2011). - -link:http://www.cloudera.com/videos/hw10_video_how_stumbleupon_built_and_advertising_platform_using_hbase_and_hadoop[HBase and Hadoop, Mixing Real-Time and Batch Processing at StumbleUpon] by JD Cryans (Hadoop World 2010). +* link:https://vimeo.com/23400732[Introduction to HBase] by Todd Lipcon (Chicago Data Summit 2011). +* link:https://vimeo.com/26804675[Building Real Time Services at Facebook with HBase] by Jonathan Gray (Berlin Buzzwords 2011). +* link:http://www.cloudera.com/videos/hw10_video_how_stumbleupon_built_and_advertising_platform_using_hbase_and_hadoop[The Multiple Uses Of HBase] by Jean-Daniel Cryans (Berlin Buzzwords 2011). [[other.info.pres]] === HBase Presentations (Slides) -link:http://www.cloudera.com/content/cloudera/en/resources/library/hadoopworld/hadoop-world-2011-presentation-video-advanced-hbase-schema-design.html[Advanced HBase Schema Design] by Lars George (Hadoop World 2011). +link:https://www.slideshare.net/cloudera/hadoop-world-2011-advanced-hbase-schema-design-lars-george-cloudera[Advanced HBase Schema Design] by Lars George (Hadoop World 2011). link:http://www.slideshare.net/cloudera/chicago-data-summit-apache-hbase-an-introduction[Introduction to HBase] by Todd Lipcon (Chicago Data Summit 2011). @@ -61,9 +59,7 @@ link:http://ianvarley.com/UT/MR/Varley_MastersReport_Full_2009-08-07.pdf[No Rela link:https://blog.cloudera.com/blog/category/hbase/[Cloudera's HBase Blog] has a lot of links to useful HBase information. -* link:https://blog.cloudera.com/blog/2010/04/cap-confusion-problems-with-partition-tolerance/[CAP Confusion] is a relevant entry for background information on distributed storage systems. 
- -link:http://wiki.apache.org/hadoop/HBase/HBasePresentations[HBase Wiki] has a page with a number of presentations. +link:https://blog.cloudera.com/blog/2010/04/cap-confusion-problems-with-partition-tolerance/[CAP Confusion] is a relevant entry for background information on distributed storage systems. link:http://refcardz.dzone.com/refcardz/hbase[HBase RefCard] from DZone. diff --git a/src/main/asciidoc/_chapters/performance.adoc b/src/main/asciidoc/_chapters/performance.adoc index d3942bc4725..f1d89b511b7 100644 --- a/src/main/asciidoc/_chapters/performance.adoc +++ b/src/main/asciidoc/_chapters/performance.adoc @@ -587,7 +587,7 @@ If all your data is being written to one region at a time, then re-read the sect Also, if you are pre-splitting regions and all your data is _still_ winding up in a single region even though your keys aren't monotonically increasing, confirm that your keyspace actually works with the split strategy. There are a variety of reasons that regions may appear "well split" but won't work with your data. -As the HBase client communicates directly with the RegionServers, this can be obtained via link:http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Table.html#getRegionLocation(byte%5B%5D)[Table.getRegionLocation]. +As the HBase client communicates directly with the RegionServers, this can be obtained via link:https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/RegionLocator.html#getRegionLocation-byte:A-[RegionLocator.getRegionLocation]. See <>, as well as <> @@ -699,7 +699,7 @@ Enabling Bloom Filters can save your having to go to disk and can help improve r link:http://en.wikipedia.org/wiki/Bloom_filter[Bloom filters] were developed over in link:https://issues.apache.org/jira/browse/HBASE-1200[HBase-1200 Add bloomfilters]. 
For description of the development process -- why static blooms rather than dynamic -- and for an overview of the unique properties that pertain to blooms in HBase, as well as possible future directions, see the _Development Process_ section of the document link:https://issues.apache.org/jira/secure/attachment/12444007/Bloom_Filters_in_HBase.pdf[BloomFilters in HBase] attached to link:https://issues.apache.org/jira/browse/HBASE-1200[HBASE-1200]. The bloom filters described here are actually version two of blooms in HBase. -In versions up to 0.19.x, HBase had a dynamic bloom option based on work done by the link:http://www.one-lab.org/[European Commission One-Lab Project 034819]. +In versions up to 0.19.x, HBase had a dynamic bloom option based on work done by the link:http://www.onelab.org[European Commission One-Lab Project 034819]. The core of the HBase bloom work was later pulled up into Hadoop to implement org.apache.hadoop.io.BloomMapFile. Version 1 of HBase blooms never worked that well. Version 2 is a rewrite from scratch though again it starts with the one-lab work. @@ -816,7 +816,7 @@ In this case, special care must be taken to regularly perform major compactions As is documented in <>, marking rows as deleted creates additional StoreFiles which then need to be processed on reads. Tombstones only get cleaned up with major compactions. -See also <> and link:http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Admin.html#majorCompact%28java.lang.String%29[Admin.majorCompact]. +See also <> and link:http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Admin.html#majorCompact-org.apache.hadoop.hbase.TableName-[Admin.majorCompact]. [[perf.deleting.rpc]] === Delete RPC Behavior @@ -825,8 +825,7 @@ Be aware that `Table.delete(Delete)` doesn't use the writeBuffer. It will execute an RegionServer RPC with each invocation. For a large number of deletes, consider `Table.delete(List)`. -See -+++hbase.client.Delete+++. 
+See link:http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Table.html#delete-org.apache.hadoop.hbase.client.Delete-[hbase.client.Delete]. [[perf.hdfs]] == HDFS diff --git a/src/main/asciidoc/_chapters/protobuf.adoc b/src/main/asciidoc/_chapters/protobuf.adoc index 8c73dd0c374..ad7e378d962 100644 --- a/src/main/asciidoc/_chapters/protobuf.adoc +++ b/src/main/asciidoc/_chapters/protobuf.adoc @@ -29,7 +29,7 @@ == Protobuf -HBase uses Google's link:http://protobuf.protobufs[protobufs] wherever +HBase uses Google's link:https://developers.google.com/protocol-buffers/[protobufs] wherever it persists metadata -- in the tail of hfiles or Cells written by HBase into the system hbase:meta table or when HBase writes znodes to zookeeper, etc. -- and when it passes objects over the wire making diff --git a/src/main/asciidoc/_chapters/schema_design.adoc b/src/main/asciidoc/_chapters/schema_design.adoc index d17f06bd50a..92064ae9596 100644 --- a/src/main/asciidoc/_chapters/schema_design.adoc +++ b/src/main/asciidoc/_chapters/schema_design.adoc @@ -338,7 +338,7 @@ This is the main trade-off. ==== link:https://issues.apache.org/jira/browse/HBASE-4811[HBASE-4811] implements an API to scan a table or a range within a table in reverse, reducing the need to optimize your schema for forward or reverse scanning. This feature is available in HBase 0.98 and later. -See https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Scan.html#setReversed%28boolean for more information. +See link:https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Scan.html#setReversed-boolean-[Scan.setReversed()] for more information. ==== A common problem in database processing is quickly finding the most recent version of a value. 
@@ -760,7 +760,7 @@ Neither approach is wrong, it just depends on what is most appropriate for the s ==== link:https://issues.apache.org/jira/browse/HBASE-4811[HBASE-4811] implements an API to scan a table or a range within a table in reverse, reducing the need to optimize your schema for forward or reverse scanning. This feature is available in HBase 0.98 and later. -See https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Scan.html#setReversed%28boolean for more information. +See link:https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Scan.html#setReversed-boolean-[Scan.setReversed()] for more information. ==== [[schema.casestudies.log_timeseries.varkeys]] @@ -789,8 +789,7 @@ The rowkey of LOG_TYPES would be: * `[bytes]` variable length bytes for raw hostname or event-type. A column for this rowkey could be a long with an assigned number, which could be obtained -by using an -+++HBase counter+++. +by using an link:https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Table.html#incrementColumnValue-byte:A-byte:A-byte:A-long-[HBase counter]. So the resulting composite rowkey would be: @@ -806,7 +805,7 @@ In either the Hash or Numeric substitution approach, the raw values for hostname This effectively is the OpenTSDB approach. What OpenTSDB does is re-write data and pack rows into columns for certain time-periods. For a detailed explanation, see: http://opentsdb.net/schema.html, and -+++Lessons Learned from OpenTSDB+++ +link:https://www.slideshare.net/cloudera/4-opentsdb-hbasecon[Lessons Learned from OpenTSDB] from HBaseCon2012. But this is how the general concept works: data is ingested, for example, in this manner... 
diff --git a/src/main/asciidoc/_chapters/security.adoc b/src/main/asciidoc/_chapters/security.adoc index 9a67778c149..6657b5032bb 100644 --- a/src/main/asciidoc/_chapters/security.adoc +++ b/src/main/asciidoc/_chapters/security.adoc @@ -1390,11 +1390,11 @@ When you issue a Scan or Get, HBase uses your default set of authorizations to filter out cells that you do not have access to. A superuser can set the default set of authorizations for a given user by using the `set_auths` HBase Shell command or the -link:http://hbase.apache.org/devapidocs/org/apache/hadoop/hbase/security/visibility/VisibilityClient.html#setAuths(org.apache.hadoop.hbase.client.Connection,%20java.lang.String\[\],%20java.lang.String)[VisibilityClient.setAuths()] method. +link:http://hbase.apache.org/devapidocs/org/apache/hadoop/hbase/security/visibility/VisibilityClient.html#setAuths-org.apache.hadoop.hbase.client.Connection-java.lang.String:A-java.lang.String-[VisibilityClient.setAuths()] method. You can specify a different authorization during the Scan or Get, by passing the AUTHORIZATIONS option in HBase Shell, or the -link:http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Scan.html#setAuthorizations%28org.apache.hadoop.hbase.security.visibility.Authorizations%29[setAuthorizations()] +link:http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Scan.html#setAuthorizations-org.apache.hadoop.hbase.security.visibility.Authorizations-[Scan.setAuthorizations()] method if you use the API. This authorization will be combined with your default set as an additional filter. It will further filter your results, rather than giving you additional authorization. 
diff --git a/src/main/asciidoc/_chapters/troubleshooting.adoc b/src/main/asciidoc/_chapters/troubleshooting.adoc index 1cf93d636b3..67a9defdb99 100644 --- a/src/main/asciidoc/_chapters/troubleshooting.adoc +++ b/src/main/asciidoc/_chapters/troubleshooting.adoc @@ -706,7 +706,10 @@ Because of a change in the format in which MIT Kerberos writes its credentials c If you have this problematic combination of components in your environment, to work around this problem, first log in with `kinit` and then immediately refresh the credential cache with `kinit -R`. The refresh will rewrite the credential cache without the problematic formatting. -Finally, depending on your Kerberos configuration, you may need to install the link:http://docs.oracle.com/javase/1.4.2/docs/guide/security/jce/JCERefGuide.html[Java Cryptography Extension], or JCE. +Prior to JDK 1.4, the JCE was an unbundled product, and as such, the JCA and JCE were regularly referred to as separate, distinct components. +As JCE is now bundled in the JDK 7.0, the distinction is becoming less apparent. Since the JCE uses the same architecture as the JCA, the JCE should be more properly thought of as a part of the JCA. + +You may need to install the link:https://docs.oracle.com/javase/1.5.0/docs/guide/security/jce/JCERefGuide.html[Java Cryptography Extension], or JCE because of JDK 1.5 or earlier version. Insure the JCE jars are on the classpath on both server and client systems. You may also need to download the link:http://www.oracle.com/technetwork/java/javase/downloads/jce-6-download-429243.html[unlimited strength JCE policy files]. @@ -758,7 +761,7 @@ For example (substitute VERSION with your HBase version): HADOOP_CLASSPATH=`hbase classpath` hadoop jar $HBASE_HOME/hbase-server-VERSION.jar rowcounter usertable ---- -See http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/mapreduce/package-summary.html#classpathfor more information on HBase MapReduce jobs and classpaths. 
+See <> for more information on HBase MapReduce jobs and classpaths. [[trouble.hbasezerocopybytestring]] === Launching a job, you get java.lang.IllegalAccessError: com/google/protobuf/HBaseZeroCopyByteString or class com.google.protobuf.ZeroCopyLiteralByteString cannot access its superclass com.google.protobuf.LiteralByteString @@ -799,7 +802,7 @@ hadoop fs -du /hbase/myTable ---- ...returns a list of the regions under the HBase table 'myTable' and their disk utilization. -For more information on HDFS shell commands, see the link:http://hadoop.apache.org/common/docs/current/file_system_shell.html[HDFS FileSystem Shell documentation]. +For more information on HDFS shell commands, see the link:http://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/FileSystemShell.html[HDFS FileSystem Shell documentation]. [[trouble.namenode.hbase.objects]] === Browsing HDFS for HBase Objects @@ -830,7 +833,7 @@ The HDFS directory structure of HBase WAL is.. / (WAL files for the RegionServer) ---- -See the link:http://hadoop.apache.org/common/docs/current/hdfs_user_guide.html[HDFS User Guide] for other non-shell diagnostic utilities like `fsck`. +See the link:https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/HdfsUserGuide.html[HDFS User Guide] for other non-shell diagnostic utilities like `fsck`. [[trouble.namenode.0size.hlogs]] ==== Zero size WALs with data in them diff --git a/src/main/asciidoc/_chapters/unit_testing.adoc b/src/main/asciidoc/_chapters/unit_testing.adoc index 6131d5a9c62..50d4a71500f 100644 --- a/src/main/asciidoc/_chapters/unit_testing.adoc +++ b/src/main/asciidoc/_chapters/unit_testing.adoc @@ -33,7 +33,7 @@ For information on unit tests for HBase itself, see <>. 
== JUnit -HBase uses link:http://junit.org[JUnit] 4 for unit tests +HBase uses link:http://junit.org[JUnit] for unit tests This example will add unit tests to the following example class: diff --git a/src/main/asciidoc/_chapters/upgrading.adoc b/src/main/asciidoc/_chapters/upgrading.adoc index 35f38fae034..086fa86187e 100644 --- a/src/main/asciidoc/_chapters/upgrading.adoc +++ b/src/main/asciidoc/_chapters/upgrading.adoc @@ -161,7 +161,7 @@ HBase Private API:: .HBase Pre-1.0 versions are all EOM NOTE: For new installations, do not deploy 0.94.y, 0.96.y, or 0.98.y. Deploy our stable version. See link:https://issues.apache.org/jira/browse/HBASE-11642[EOL 0.96], link:https://issues.apache.org/jira/browse/HBASE-16215[clean up of EOM releases], and link:http://www.apache.org/dist/hbase/[the header of our downloads]. -Before the semantic versioning scheme pre-1.0, HBase tracked either Hadoop's versions (0.2x) or 0.9x versions. If you are into the arcane, checkout our old wiki page on link:http://wiki.apache.org/hadoop/Hbase/HBaseVersions[HBase Versioning] which tries to connect the HBase version dots. Below sections cover ONLY the releases before 1.0. +Before the semantic versioning scheme pre-1.0, HBase tracked either Hadoop's versions (0.2x) or 0.9x versions. If you are into the arcane, checkout our old wiki page on link:https://web.archive.org/web/20150905071342/https://wiki.apache.org/hadoop/Hbase/HBaseVersions[HBase Versioning] which tries to connect the HBase version dots. Below sections cover ONLY the releases before 1.0. [[hbase.development.series]] .Odd/Even Versioning or "Development" Series Releases diff --git a/src/main/asciidoc/_chapters/zookeeper.adoc b/src/main/asciidoc/_chapters/zookeeper.adoc index 91577dadd93..5f92ff03e00 100644 --- a/src/main/asciidoc/_chapters/zookeeper.adoc +++ b/src/main/asciidoc/_chapters/zookeeper.adoc @@ -181,7 +181,7 @@ We'll refer to this JAAS configuration file as _$CLIENT_CONF_ below. 
=== HBase-managed ZooKeeper Configuration -On each node that will run a zookeeper, a master, or a regionserver, create a link:http://docs.oracle.com/javase/1.4.2/docs/guide/security/jgss/tutorials/LoginConfigFile.html[JAAS] configuration file in the conf directory of the node's _HBASE_HOME_ directory that looks like the following: +On each node that will run a zookeeper, a master, or a regionserver, create a link:http://docs.oracle.com/javase/7/docs/technotes/guides/security/jgss/tutorials/LoginConfigFile.html[JAAS] configuration file in the conf directory of the node's _HBASE_HOME_ directory that looks like the following: [source,java] ----