Core: don't load bloom filters by default

This change sets the default for index.codec.bloom.load to false. With recent performance improvements to ID lookup, such as #6298, bloom filters no longer give much of a performance gain, and they can consume non-trivial RAM when there are many tiny documents. For now, we still index the bloom filters, so an app that wants them back can set index.codec.bloom.load to true.

Closes #6959
2014-07-23 05:58:31 -04:00 · cc4d7c6272
parent 3f9aea883f
commit cc4d7c6272
3 changed files with 15 additions and 10 deletions
--- a/docs/reference/index-modules/codec.asciidoc
+++ b/docs/reference/index-modules/codec.asciidoc
@@ -144,21 +144,21 @@ Type name: `bloom`
 [TIP]
 ==================================================

-It can sometime make sense to disable bloom filters. For instance, if you are
-logging into an index per day, and you have thousands of indices, the bloom
-filters can take up a sizable amount of memory. For most queries you are only
-interested in recent indices, so you don't mind CRUD operations on older
-indices taking slightly longer.
+As of 1.4, the bloom filters are no longer loaded at search time by
+default: they consume ~10 bits per unique id value, which can quickly
+add up for indices with many tiny documents, and separate performance
+improvements have made the performance gains with bloom filters very
+small.

-In these cases you can disable loading of the bloom filter on  a per-index
-basis by updating the index settings:
+You can enable loading of the bloom filter at search time on a
+per-index basis by updating the index settings:

 [source,js]
 --------------------------------------------------
 PUT /old_index/_settings?index.codec.bloom.load=false
 --------------------------------------------------

-This setting, which defaults to `true`, can be updated on a live index. Note,
+This setting, which defaults to `false`, can be updated on a live index. Note,
 however, that changing the value will cause the index to be reopened, which
 will invalidate any existing caches.

--- a/src/main/java/org/elasticsearch/index/codec/CodecService.java
+++ b/src/main/java/org/elasticsearch/index/codec/CodecService.java
@@ -45,7 +45,7 @@ import org.elasticsearch.index.settings.IndexSettings;
 public class CodecService extends AbstractIndexComponent {

    public static final String INDEX_CODEC_BLOOM_LOAD = "index.codec.bloom.load";
-    public static final boolean INDEX_CODEC_BLOOM_LOAD_DEFAULT = true;
+    public static final boolean INDEX_CODEC_BLOOM_LOAD_DEFAULT = false;

    private final PostingsFormatService postingsFormatService;
    private final DocValuesFormatService docValuesFormatService;
--- a/src/test/java/org/elasticsearch/test/ElasticsearchIntegrationTest.java
+++ b/src/test/java/org/elasticsearch/test/ElasticsearchIntegrationTest.java
@@ -73,10 +73,11 @@ import org.elasticsearch.common.xcontent.XContentBuilder;
 import org.elasticsearch.common.xcontent.XContentFactory;
 import org.elasticsearch.common.xcontent.support.XContentMapValues;
 import org.elasticsearch.discovery.zen.elect.ElectMasterService;
+import org.elasticsearch.index.codec.CodecService;
 import org.elasticsearch.index.fielddata.FieldDataType;
 import org.elasticsearch.index.mapper.DocumentMapper;
-import org.elasticsearch.index.mapper.FieldMapper;
 import org.elasticsearch.index.mapper.FieldMapper.Loading;
+import org.elasticsearch.index.mapper.FieldMapper;
 import org.elasticsearch.index.mapper.internal.FieldNamesFieldMapper;
 import org.elasticsearch.index.mapper.internal.IdFieldMapper;
 import org.elasticsearch.index.merge.policy.*;
@@ -437,6 +438,10 @@ public abstract class ElasticsearchIntegrationTest extends ElasticsearchTestCase
        if (random.nextBoolean()) {
             builder.put(FsTranslog.INDEX_TRANSLOG_FS_TYPE, RandomPicks.randomFrom(random, FsTranslogFile.Type.values()).name());
        }
+
+        // Randomly load or don't load bloom filters:
+        builder.put(CodecService.INDEX_CODEC_BLOOM_LOAD, random.nextBoolean());
+
        return builder;
    }
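
To put the memory argument in rough numbers: the documentation change in this commit cites about 10 bits per unique id value. The sketch below only does that arithmetic. The class and method names (`BloomMemoryEstimate`, `estimateBytes`) are hypothetical and not part of Elasticsearch, and the real footprint will depend on segment layout and filter sizing, so treat the result as a ballpark figure.

```java
// Hypothetical helper, not part of Elasticsearch: back-of-the-envelope
// estimate of the RAM taken by loaded bloom filters, assuming the
// ~10 bits per unique id value quoted in the docs change above.
final class BloomMemoryEstimate {

    private BloomMemoryEstimate() {}

    /** Returns the bytes needed for uniqueIds entries at bitsPerId bits each, rounded up. */
    static long estimateBytes(long uniqueIds, int bitsPerId) {
        if (uniqueIds < 0 || bitsPerId < 0) {
            throw new IllegalArgumentException("arguments must be non-negative");
        }
        // multiplyExact throws ArithmeticException instead of silently overflowing.
        long bits = Math.multiplyExact(uniqueIds, (long) bitsPerId);
        return bits / 8 + (bits % 8 == 0 ? 0 : 1);
    }

    public static void main(String[] args) {
        long ids = 100_000_000L; // assumed example: 100 million tiny documents
        long bytes = estimateBytes(ids, 10);
        System.out.printf("%d unique ids -> about %d MB of bloom filter%n",
                ids, bytes / 1_000_000);
    }
}
```

Under these assumptions, 100 million unique ids come to 125,000,000 bytes, which is the kind of overhead the new default avoids unless index.codec.bloom.load is explicitly set to true.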