HBASE-24271 Set values in `conf/hbase-site.xml` that enable running on `LocalFileSystem` out of the box
Simplify the new user experience by shipping a configuration that enables a fresh checkout or tarball distribution to run in standalone mode without direct user configuration. This change restores the behavior we had when running on Hadoop 2.8 and earlier.

Patch for master includes an update to the book. That book update will be omitted when backporting to earlier branches.

Signed-off-by: stack <stack@apache.org>
Signed-off-by: Josh Elser <elserj@apache.org>
Signed-off-by: Duo Zhang <zhangduo@apache.org>
parent 888eaa094d
commit 9f25673bb5

@@ -21,3 +21,4 @@ linklint/
 .checkstyle
 **/.checkstyle
 .java-version
+tmp

@@ -1,8 +1,7 @@
 <?xml version="1.0"?>
 <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
 <!--
-/**
-*
+/*
 * Licensed to the Apache Software Foundation (ASF) under one
 * or more contributor license agreements. See the NOTICE file
 * distributed with this work for additional information
@@ -21,4 +20,35 @@
 */
 -->
 <configuration>
+<!--
+The following properties are set for running HBase as a single process on a
+developer workstation. With this configuration, HBase is running in
+"stand-alone" mode and without a distributed file system. In this mode, and
+without further configuration, HBase and ZooKeeper data are stored on the
+local filesystem, in a path under the value configured for `hbase.tmp.dir`.
+This value is overridden from its default value of `/tmp` because many
+systems clean `/tmp` on a regular basis. Instead, it points to a path within
+this HBase installation directory.
+
+Running against the `LocalFileSystem`, as opposed to a distributed
+filesystem, runs the risk of data integrity issues and data loss. Normally
+HBase will refuse to run in such an environment. Setting
+`hbase.unsafe.stream.capability.enforce` to `false` overrides this behavior,
+permitting operation. This configuration is for the developer workstation
+only and __should not be used in production!__
+
+See also https://hbase.apache.org/book.html#standalone_dist
+-->
+<property>
+<name>hbase.cluster.distributed</name>
+<value>false</value>
+</property>
+<property>
+<name>hbase.tmp.dir</name>
+<value>./tmp</value>
+</property>
+<property>
+<name>hbase.unsafe.stream.capability.enforce</name>
+<value>false</value>
+</property>
 </configuration>

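The comment added above says that HBase and ZooKeeper data land under `hbase.tmp.dir` when nothing else is configured. A developer who would rather keep that data outside the installation directory could therefore override that one property. The fragment below is only a sketch of such a local edit, not part of this commit; the path `/home/testuser/hbase-tmp` is a hypothetical example, and the other two properties are repeated unchanged from the shipped file.

```xml
<configuration>
  <!-- Unchanged from the shipped file: run as a single process. -->
  <property>
    <name>hbase.cluster.distributed</name>
    <value>false</value>
  </property>
  <!-- Hypothetical absolute path (an assumption for illustration);
       choose any directory that is not cleaned on a regular basis. -->
  <property>
    <name>hbase.tmp.dir</name>
    <value>/home/testuser/hbase-tmp</value>
  </property>
  <!-- Unchanged from the shipped file: still required on LocalFileSystem,
       and still unsuitable for production. -->
  <property>
    <name>hbase.unsafe.stream.capability.enforce</name>
    <value>false</value>
  </property>
</configuration>
```
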
@@ -1,4 +1,4 @@
-/**
+/*
 *
 * Licensed to the Apache Software Foundation (ASF) under one
 * or more contributor license agreements. See the NOTICE file
@@ -350,7 +350,7 @@ public final class CommonFSUtils {
 public static FileSystem getWALFileSystem(final Configuration c) throws IOException {
 Path p = getWALRootDir(c);
 FileSystem fs = p.getFileSystem(c);
-// hadoop-core does fs caching, so need to propogate this if set
+// hadoop-core does fs caching, so need to propagate this if set
 String enforceStreamCapability = c.get(UNSAFE_STREAM_CAPABILITY_ENFORCE);
 if (enforceStreamCapability != null) {
 fs.getConf().set(UNSAFE_STREAM_CAPABILITY_ENFORCE, enforceStreamCapability);

@@ -80,76 +80,12 @@ $ cd hbase-{Version}/
 JAVA_HOME=/usr
 ----
 +
-
-. Edit _conf/hbase-site.xml_, which is the main HBase configuration file.
-At this time, you need to specify the directory on the local filesystem where HBase and ZooKeeper write data and acknowledge some risks.
-By default, a new directory is created under /tmp.
-Many servers are configured to delete the contents of _/tmp_ upon reboot, so you should store the data elsewhere.
-The following configuration will store HBase's data in the _hbase_ directory, in the home directory of the user called `testuser`.
-Paste the `<property>` tags beneath the `<configuration>` tags, which should be empty in a new HBase install.
-+
-.Example _hbase-site.xml_ for Standalone HBase
-====
-[source,xml]
-----
-
-<configuration>
-<property>
-<name>hbase.rootdir</name>
-<value>file:///home/testuser/hbase</value>
-</property>
-<property>
-<name>hbase.zookeeper.property.dataDir</name>
-<value>/home/testuser/zookeeper</value>
-</property>
-<property>
-<name>hbase.unsafe.stream.capability.enforce</name>
-<value>false</value>
-<description>
-Controls whether HBase will check for stream capabilities (hflush/hsync).
-
-Disable this if you intend to run on LocalFileSystem, denoted by a rootdir
-with the 'file://' scheme, but be mindful of the NOTE below.
-
-WARNING: Setting this to false blinds you to potential data loss and
-inconsistent system state in the event of process and/or node failures. If
-HBase is complaining of an inability to use hsync or hflush it's most
-likely not a false positive.
-</description>
-</property>
-</configuration>
-----
-====
-+
-You do not need to create the HBase data directory.
-HBase will do this for you. If you create the directory,
-HBase will attempt to do a migration, which is not what you want.
-+
-NOTE: The _hbase.rootdir_ in the above example points to a directory
-in the _local filesystem_. The 'file://' prefix is how we denote local
-filesystem. You should take the WARNING present in the configuration example
-to heart. In standalone mode HBase makes use of the local filesystem abstraction
-from the Apache Hadoop project. That abstraction doesn't provide the durability
-promises that HBase needs to operate safely. This is fine for local development
-and testing use cases where the cost of cluster failure is well contained. It is
-not appropriate for production deployments; eventually you will lose data.
-
-To home HBase on an existing instance of HDFS, set the _hbase.rootdir_ to point at a
-directory up on your instance: e.g. _hdfs://namenode.example.org:8020/hbase_.
-For more on this variant, see the section below on Standalone HBase over HDFS.
-
 . The _bin/start-hbase.sh_ script is provided as a convenient way to start HBase.
 Issue the command, and if all goes well, a message is logged to standard output showing that HBase started successfully.
 You can use the `jps` command to verify that you have one running process called `HMaster`.
 In standalone mode HBase runs all daemons within this single JVM, i.e.
 the HMaster, a single HRegionServer, and the ZooKeeper daemon.
 Go to _http://localhost:16010_ to view the HBase Web UI.
-+
-NOTE: Java needs to be installed and available.
-If you get an error indicating that Java is not installed,
-but it is on your system, perhaps in a non-standard location,
-edit the _conf/hbase-env.sh_ file and modify the `JAVA_HOME`
-setting to point to the directory that contains _bin/java_ on your system.
 
 
 [[shell_exercises]]
@@ -309,7 +245,7 @@ The above has shown you how to start and stop a standalone instance of HBase.
 In the next sections we give a quick overview of other modes of hbase deploy.
 
 [[quickstart_pseudo]]
-=== Pseudo-Distributed Local Install
+=== Pseudo-Distributed for Local Testing
 
 After working your way through <<quickstart,quickstart>> standalone mode,
 you can re-configure HBase to run in pseudo-distributed mode.
@@ -351,8 +287,8 @@ First, add the following property which directs HBase to run in distributed mode
 </property>
 ----
 +
-Next, change the `hbase.rootdir` from the local filesystem to the address of your HDFS instance, using the `hdfs:////` URI syntax.
-In this example, HDFS is running on the localhost at port 8020. Be sure to either remove the entry for `hbase.unsafe.stream.capability.enforce` or set it to true.
+Next, add a configuration for `hbase.rootdir`, pointing to the address of your HDFS instance, using the `hdfs:////` URI syntax.
+In this example, HDFS is running on the localhost at port 8020.
 +
 [source,xml]
 ----
@@ -364,8 +300,9 @@ In this example, HDFS is running on the localhost at port 8020. Be sure to eithe
 ----
 +
 You do not need to create the directory in HDFS.
-HBase will do this for you.
-If you create the directory, HBase will attempt to do a migration, which is not what you want.
+HBase will do this for you. If you create the directory, HBase will attempt to do a migration, which is not what you want.
++
+Finally, remove existing configuration for `hbase.tmp.dir` and `hbase.unsafe.stream.capability.enforce`,
 
 . Start HBase.
 +
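Taken together, the revised pseudo-distributed steps (enable distributed mode, add `hbase.rootdir`, and remove `hbase.tmp.dir` and `hbase.unsafe.stream.capability.enforce`) would leave _conf/hbase-site.xml_ looking roughly like the sketch below. The diff elides the book's own example, so the `hdfs://localhost:8020/hbase` value is an assumption, built from the localhost and port 8020 mentioned in the text; substitute the address of your own HDFS instance.

```xml
<configuration>
  <!-- Directs HBase to run in distributed mode. -->
  <property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
  </property>
  <!-- Assumed example value: HDFS on localhost, port 8020. -->
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://localhost:8020/hbase</value>
  </property>
  <!-- hbase.tmp.dir and hbase.unsafe.stream.capability.enforce are
       intentionally absent, per the "Finally, remove ..." step. -->
</configuration>
```
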
@@ -452,7 +389,7 @@ You can stop HBase the same way as in the <<quickstart,quickstart>> procedure, u
 
 
 [[quickstart_fully_distributed]]
-=== Advanced - Fully Distributed
+=== Fully Distributed for Production
 
 In reality, you need a fully-distributed configuration to fully test HBase and to use it in real-world scenarios.
 In a distributed configuration, the cluster contains multiple nodes, each of which runs one or more HBase daemon.