From 4ab697139556822ff460895f81ffa045070c4d59 Mon Sep 17 00:00:00 2001
From: Jonathan M Hsieh
Date: Wed, 6 Aug 2014 14:20:43 -0700
Subject: [PATCH] HBASE-11681 Update and move doc about disabling the WAL (Misty Stanley-Jones)
---
 src/main/docbkx/book.xml | 15 +++++++++++++++
 src/main/docbkx/performance.xml | 19 ++++++++++---------
 2 files changed, 25 insertions(+), 9 deletions(-)
diff --git a/src/main/docbkx/book.xml b/src/main/docbkx/book.xml
index cdfe5b2ca47..0234f85774c 100644
--- a/src/main/docbkx/book.xml
+++ b/src/main/docbkx/book.xml
@@ -2671,6 +2671,21 @@ ctime = Sat Jun 23 11:13:40 PDT 2012
+
+ Disabling the WAL
+ It is possible to disable the WAL, to improve performance in certain specific
+ situations. However, disabling the WAL puts your data at risk. The only situation where
+ this is recommended is during a bulk load. This is because, in the event of a problem,
+ the bulk load can be re-run with no risk of data loss.
+ The WAL is disabled by calling the HBase client field
+ Mutation.writeToWAL(false). Use the
+ Mutation.setDurability(Durability.SKIP_WAL) and Mutation.getDurability()
+ methods to set and get the field's value. There is no way to disable the WAL for only a
+ specific table.
+
+ If you disable the WAL for anything other than bulk loads, your data is at
+ risk.
+
diff --git a/src/main/docbkx/performance.xml b/src/main/docbkx/performance.xml
index 47b67be7621..57c866ad3a5 100644
--- a/src/main/docbkx/performance.xml
+++ b/src/main/docbkx/performance.xml
@@ -564,16 +564,17 @@ admin.createTable(table, splits);
 HBase Client: Turn off WAL on Puts
- A frequently discussed option for increasing throughput on Puts
- is to call writeToWAL(false). Turning this off means that the RegionServer will
- not write the Put to the Write Ahead Log, only
- into the memstore, HOWEVER the consequence is that if there is a RegionServer failure
- there will be data loss. If writeToWAL(false) is used,
- do so with extreme caution. You may find in actuality that it makes little difference if
- your load is well distributed across the cluster.
+ A frequent request is to disable the WAL to increase performance of Puts. This is only
+ appropriate for bulk loads, as it puts your data at risk by removing the protection of the
+ WAL in the event of a region server crash. Bulk loads can be re-run in the event of a crash,
+ with little risk of data loss.
+
+ If you disable the WAL for anything other than bulk loads, your data is at
+ risk.
 In general, it is best to use WAL for Puts, and where loading throughput is a concern to
- use bulk loading techniques instead.
+ use bulk loading techniques instead. For normal
+ Puts, you are not likely to see a performance improvement which would outweigh the risk. To
+ disable the WAL, see .
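
The risk described in this patch can be illustrated with a small toy model. The sketch below is not HBase code and uses no HBase API: the class `ToyRegionServer`, its methods, and the `useWal` flag are hypothetical names invented for illustration. It assumes only that the in-memory store is lost on a crash while the log survives, and it does not model memstore flushes.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model of a region server: an in-memory store plus a durable log. Not HBase code. */
class ToyRegionServer {
    // Stands in for the memstore: lost when the process crashes.
    private Map<String, String> memstore = new HashMap<>();
    // Stands in for the WAL: assumed to survive a crash.
    private final List<String[]> wal = new ArrayList<>();

    /** Applies a write; when useWal is false the edit exists only in memory. */
    void put(String row, String value, boolean useWal) {
        if (useWal) {
            wal.add(new String[] {row, value}); // log first, then apply
        }
        memstore.put(row, value);
    }

    /** Returns the current in-memory value for a row, or null if absent. */
    String get(String row) {
        return memstore.get(row);
    }

    /** Simulates a crash: memory is wiped, then rebuilt by replaying the log. */
    void crashAndRecover() {
        memstore = new HashMap<>();
        for (String[] edit : wal) {
            memstore.put(edit[0], edit[1]);
        }
    }

    public static void main(String[] args) {
        ToyRegionServer rs = new ToyRegionServer();
        rs.put("row1", "logged", true);     // written through the log
        rs.put("row2", "unlogged", false);  // log skipped
        rs.crashAndRecover();
        System.out.println("row1 = " + rs.get("row1"));
        System.out.println("row2 = " + rs.get("row2"));
    }
}
```

After the simulated crash, the logged write is recovered by replay and the unlogged write is gone. A bulk load tolerates this because the whole load can be re-run from its source; an ordinary Put has no such second copy.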