mirror of https://github.com/apache/druid.git
Fix wrong partitionsSpec type names in the document (#8297)
* Fix wrong type names for partitionsSpec
* add unit tests; add json properties for backward compatibility
* beautify conf names
* remove maxRowsPerSegment from hashed partitionsSpec
* fix doc build
parent 7749571a7f
commit 95fa609615
@@ -273,7 +273,7 @@ The tuningConfig is optional and default parameters will be used if no tuningCon
|-----|----|-----------|--------|
|workingPath|String|The working path to use for intermediate results (results between Hadoop jobs).|Only used by the [Command-line Hadoop indexer](#cli). The default is '/tmp/druid-indexing'. This field must be null otherwise.|
|version|String|The version of created segments. Ignored for HadoopIndexTask unless useExplicitVersion is set to true|no (default == datetime that indexing starts at)|
-|partitionsSpec|Object|A specification of how to partition each time bucket into segments. Absence of this property means no partitioning will occur. See [`partitionsSpec`](#partitionsspec) below.|no (default == 'hadoop_hashed_partitions')|
+|partitionsSpec|Object|A specification of how to partition each time bucket into segments. Absence of this property means no partitioning will occur. See [`partitionsSpec`](#partitionsspec) below.|no (default == 'hashed')|
|maxRowsInMemory|Integer|The number of rows to aggregate before persisting. Note that this is the number of post-aggregation rows which may not be equal to the number of input events due to roll-up. This is used to manage the required JVM heap size. Normally user does not need to set this, but depending on the nature of data, if rows are short in terms of bytes, user may not want to store a million rows in memory and this value should be set.|no (default == 1000000)|
|maxBytesInMemory|Long|The number of bytes to aggregate in heap memory before persisting. Normally this is computed internally and user does not need to set it. This is based on a rough estimate of memory usage and not actual usage. The maximum heap memory usage for indexing is maxBytesInMemory * (2 + maxPendingPersists).|no (default == One-sixth of max JVM memory)|
|leaveIntermediate|Boolean|Leave behind intermediate files (for debugging) in the workingPath when a job completes, whether it passes or fails.|no (default == false)|
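For orientation, a sketch of where the renamed default sits inside a Hadoop ingestion spec's `tuningConfig`; the surrounding fields and values here are illustrative only and are not part of this change:

```json
"tuningConfig": {
  "type": "hadoop",
  "partitionsSpec": {
    "type": "hashed",
    "targetPartitionSize": 5000000
  },
  "maxRowsInMemory": 1000000
}
```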
@@ -313,8 +313,8 @@ for more details.
## `partitionsSpec`

Segments are always partitioned based on timestamp (according to the granularitySpec) and may be further partitioned in
-some other way depending on partition type. Druid supports two types of partitioning strategies: `hadoop_hashed_partitions` (based on the
-hash of all dimensions in each row), and `hadoop_single_dim_partitions` (based on ranges of a single dimension).
+some other way depending on partition type. Druid supports two types of partitioning strategies: `hashed` (based on the
+hash of all dimensions in each row), and `single_dim` (based on ranges of a single dimension).

Hashed partitioning is recommended in most cases, as it will improve indexing performance and create more uniformly
sized data segments relative to single-dimension partitioning.
@@ -323,7 +323,7 @@ sized data segments relative to single-dimension partitioning.

```json
"partitionsSpec": {
-  "type": "hadoop_hashed_partitions",
+  "type": "hashed",
  "targetPartitionSize": 5000000
}
```
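Alongside the `targetPartitionSize` form above, a hashed spec can instead fix the shard count directly; `numShards` and `partitionDimensions` are described in the table in the next hunk. A minimal sketch with illustrative dimension names:

```json
"partitionsSpec": {
  "type": "hashed",
  "numShards": 10,
  "partitionDimensions": ["dim1", "dim2"]
}
```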
@@ -336,21 +336,21 @@ The configuration options are:

|Field|Description|Required|
|--------|-----------|---------|
-|type|Type of partitionSpec to be used.|"hadoop_hashed_partitions"|
+|type|Type of partitionSpec to be used.|"hashed"|
|targetPartitionSize|Target number of rows to include in a partition, should be a number that targets segments of 500MB\~1GB.|either this or numShards|
|numShards|Specify the number of partitions directly, instead of a target partition size. Ingestion will run faster, since it can skip the step necessary to select a number of partitions automatically.|either this or targetPartitionSize|
|partitionDimensions|The dimensions to partition on. Leave blank to select all dimensions. Only used with numShards, will be ignored when targetPartitionSize is set|no|

-### Single-dimension partitioning
+### Single-dimension range partitioning

```json
"partitionsSpec": {
-  "type": "hadoop_single_dim_partitions",
+  "type": "single_dim",
  "targetPartitionSize": 5000000
}
```

-Single-dimension partitioning works by first selecting a dimension to partition on, and then separating that dimension
+Single-dimension range partitioning works by first selecting a dimension to partition on, and then separating that dimension
into contiguous ranges. Each segment will contain all rows with values of that dimension in that range. For example,
your segments may be partitioned on the dimension "host" using the ranges "a.example.com" to "f.example.com" and
"f.example.com" to "z.example.com". By default, the dimension to use is determined automatically, although you can
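Building on the single-dimension example above, a sketch of a `single_dim` spec that pins the dimension and caps partition size; both fields appear in the options table in the next hunk, and the values shown are illustrative:

```json
"partitionsSpec": {
  "type": "single_dim",
  "targetPartitionSize": 5000000,
  "maxPartitionSize": 7500000,
  "partitionDimension": "host"
}
```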
@@ -360,7 +360,7 @@ The configuration options are:

|Field|Description|Required|
|--------|-----------|---------|
-|type|Type of partitionSpec to be used.|"hadoop_single_dim_partitions"|
+|type|Type of partitionSpec to be used.|"single_dim"|
|targetPartitionSize|Target number of rows to include in a partition, should be a number that targets segments of 500MB\~1GB.|yes|
|maxPartitionSize|Maximum number of rows to include in a partition. Defaults to 50% larger than the targetPartitionSize.|no|
|partitionDimension|The dimension to partition on. Leave blank to select a dimension automatically.|no|
@@ -199,9 +199,12 @@ The tuningConfig is optional and default parameters will be used if no tuningCon
|property|description|default|required?|
|--------|-----------|-------|---------|
|type|The task type, this should always be `index_parallel`.|none|yes|
+|maxRowsPerSegment|Deprecated. Use `partitionsSpec` instead. Used in sharding. Determines how many rows are in each segment.|5000000|no|
|maxRowsInMemory|Used in determining when intermediate persists to disk should occur. Normally user does not need to set this, but depending on the nature of data, if rows are short in terms of bytes, user may not want to store a million rows in memory and this value should be set.|1000000|no|
|maxBytesInMemory|Used in determining when intermediate persists to disk should occur. Normally this is computed internally and user does not need to set it. This value represents number of bytes to aggregate in heap memory before persisting. This is based on a rough estimate of memory usage and not actual usage. The maximum heap memory usage for indexing is maxBytesInMemory * (2 + maxPendingPersists)|1/6 of max JVM memory|no|
-|partitionsSpec|Defines how to partition the segments in a timeChunk, see [PartitionsSpec](#partitionsspec)|`dynamic` if `forceGuaranteedRollup` = false, `hashed` if `forceGuaranteedRollup` = true|no|
+|maxTotalRows|Deprecated. Use `partitionsSpec` instead. Total number of rows in segments waiting for being pushed. Used in determining when intermediate pushing should occur.|20000000|no|
+|numShards|Deprecated. Use `partitionsSpec` instead. Directly specify the number of shards to create. If this is specified and `intervals` is specified in the `granularitySpec`, the index task can skip the determine intervals/partitions pass through the data. `numShards` cannot be specified if `maxRowsPerSegment` is set.|null|no|
+|partitionsSpec|Defines how to partition data in each timeChunk, see [PartitionsSpec](#partitionsspec)|`dynamic` if `forceGuaranteedRollup` = false, `hashed` if `forceGuaranteedRollup` = true|no|
|indexSpec|Defines segment storage format options to be used at indexing time, see [IndexSpec](index.html#indexspec)|null|no|
|indexSpecForIntermediatePersists|Defines segment storage format options to be used at indexing time for intermediate persisted temporary segments. this can be used to disable dimension/metric compression on intermediate segments to reduce memory required for final merging. however, disabling compression on intermediate segments might increase page cache use while they are used before getting merged into final segment published, see [IndexSpec](index.html#indexspec) for possible values.|same as indexSpec|no|
|maxPendingPersists|Maximum number of persists that can be pending but not started. If this limit would be exceeded by a new intermediate persist, ingestion will block until the currently-running persist finishes. Maximum heap memory usage for indexing scales with maxRowsInMemory * (2 + maxPendingPersists).|0 (meaning one persist can be running concurrently with ingestion, and none can be queued up)|no|
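To show how the renamed spec plugs into the parallel task's `tuningConfig` described above, a sketch for the perfect-rollup case; the `forceGuaranteedRollup` setting and the values are illustrative assumptions rather than part of this diff:

```json
"tuningConfig": {
  "type": "index_parallel",
  "forceGuaranteedRollup": true,
  "partitionsSpec": {
    "type": "hashed",
    "numShards": 10,
    "partitionDimensions": ["dim1", "dim2"]
  }
}
```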
@@ -226,8 +229,7 @@ For perfect rollup, you should use `hashed`.
|property|description|default|required?|
|--------|-----------|-------|---------|
|type|This should always be `hashed`|none|yes|
-|maxRowsPerSegment|Used in sharding. Determines how many rows are in each segment.|5000000|no|
-|numShards|Directly specify the number of shards to create. If this is specified and 'intervals' is specified in the granularitySpec, the index task can skip the determine intervals/partitions pass through the data. numShards cannot be specified if maxRowsPerSegment is set.|null|no|
+|numShards|Directly specify the number of shards to create. If this is specified and `intervals` is specified in the `granularitySpec`, the index task can skip the determine intervals/partitions pass through the data. `numShards` cannot be specified if `maxRowsPerSegment` is set.|null|no|
|partitionDimensions|The dimensions to partition on. Leave blank to select all dimensions.|null|no|

For best-effort rollup, you should use `dynamic`.
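For the best-effort rollup case mentioned above, a sketch of a `dynamic` spec; the field names mirror the deprecated top-level `maxRowsPerSegment` and `maxTotalRows` properties, and the values are illustrative:

```json
"partitionsSpec": {
  "type": "dynamic",
  "maxRowsPerSegment": 5000000,
  "maxTotalRows": 20000000
}
```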
@@ -599,9 +601,13 @@ The tuningConfig is optional and default parameters will be used if no tuningCon
|property|description|default|required?|
|--------|-----------|-------|---------|
|type|The task type, this should always be "index".|none|yes|
+|maxRowsPerSegment|Deprecated. Use `partitionsSpec` instead. Used in sharding. Determines how many rows are in each segment.|5000000|no|
|maxRowsInMemory|Used in determining when intermediate persists to disk should occur. Normally user does not need to set this, but depending on the nature of data, if rows are short in terms of bytes, user may not want to store a million rows in memory and this value should be set.|1000000|no|
|maxBytesInMemory|Used in determining when intermediate persists to disk should occur. Normally this is computed internally and user does not need to set it. This value represents number of bytes to aggregate in heap memory before persisting. This is based on a rough estimate of memory usage and not actual usage. The maximum heap memory usage for indexing is maxBytesInMemory * (2 + maxPendingPersists)|1/6 of max JVM memory|no|
-|partitionsSpec|Defines how to partition the segments in a timeChunk, see [PartitionsSpec](#partitionsspec)|`dynamic` if `forceGuaranteedRollup` = false, `hashed` if `forceGuaranteedRollup` = true|no|
+|maxTotalRows|Deprecated. Use `partitionsSpec` instead. Total number of rows in segments waiting for being pushed. Used in determining when intermediate pushing should occur.|20000000|no|
+|numShards|Deprecated. Use `partitionsSpec` instead. Directly specify the number of shards to create. If this is specified and `intervals` is specified in the `granularitySpec`, the index task can skip the determine intervals/partitions pass through the data. `numShards` cannot be specified if `maxRowsPerSegment` is set.|null|no|
+|partitionDimensions|Deprecated. Use `partitionsSpec` instead. The dimensions to partition on. Leave blank to select all dimensions. Only used with `forceGuaranteedRollup` = true, will be ignored otherwise.|null|no|
+|partitionsSpec|Defines how to partition data in each timeChunk, see [PartitionsSpec](#partitionsspec)|`dynamic` if `forceGuaranteedRollup` = false, `hashed` if `forceGuaranteedRollup` = true|no|
|indexSpec|Defines segment storage format options to be used at indexing time, see [IndexSpec](index.html#indexspec)|null|no|
|indexSpecForIntermediatePersists|Defines segment storage format options to be used at indexing time for intermediate persisted temporary segments. this can be used to disable dimension/metric compression on intermediate segments to reduce memory required for final merging. however, disabling compression on intermediate segments might increase page cache use while they are used before getting merged into final segment published, see [IndexSpec](index.html#indexspec) for possible values.|same as indexSpec|no|
|maxPendingPersists|Maximum number of persists that can be pending but not started. If this limit would be exceeded by a new intermediate persist, ingestion will block until the currently-running persist finishes. Maximum heap memory usage for indexing scales with maxRowsInMemory * (2 + maxPendingPersists).|0 (meaning one persist can be running concurrently with ingestion, and none can be queued up)|no|
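Per the commit note about re-adding JSON properties for backward compatibility, an older-style `tuningConfig` that sets the deprecated top-level fields should still deserialize; a sketch of the legacy form and the `partitionsSpec` form it roughly corresponds to (values illustrative):

```json
"tuningConfig": {
  "type": "index",
  "maxRowsPerSegment": 5000000,
  "maxTotalRows": 20000000
}
```

```json
"tuningConfig": {
  "type": "index",
  "partitionsSpec": {
    "type": "dynamic",
    "maxRowsPerSegment": 5000000,
    "maxTotalRows": 20000000
  }
}
```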
@@ -56,7 +56,7 @@ If your multitenant cluster uses shared datasources, most of your queries will l
dimension. These sorts of queries perform best when data is well-partitioned by tenant. There are a few ways to
accomplish this.

-With batch indexing, you can use [single-dimension partitioning](../ingestion/hadoop.html#single-dimension-partitioning)
+With batch indexing, you can use [single-dimension partitioning](../ingestion/hadoop.html#single-dimension-range-partitioning)
to partition your data by tenant_id. Druid always partitions by time first, but the secondary partition within each
time bucket will be on tenant_id.
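To make the multitenancy advice above concrete, a sketch of the spec it implies under the renamed type; the row-count target is illustrative:

```json
"partitionsSpec": {
  "type": "single_dim",
  "targetPartitionSize": 5000000,
  "partitionDimension": "tenant_id"
}
```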
@@ -1276,18 +1276,6 @@ public class IndexTask extends AbstractBatchIndexTask implements ChatHandler
      );
    }

-    /**
-     * Return the max number of rows per segment. This returns null if it's not specified in tuningConfig.
-     * Deprecated in favor of {@link #getGivenOrDefaultPartitionsSpec()}.
-     */
-    @Nullable
-    @Override
-    @Deprecated
-    public Integer getMaxRowsPerSegment()
-    {
-      return partitionsSpec == null ? null : partitionsSpec.getMaxRowsPerSegment();
-    }
-
    @JsonProperty
    @Override
    public int getMaxRowsInMemory()
@@ -1302,37 +1290,6 @@ public class IndexTask extends AbstractBatchIndexTask implements ChatHandler
      return maxBytesInMemory;
    }

-    /**
-     * Return the max number of total rows in appenderator. This returns null if it's not specified in tuningConfig.
-     * Deprecated in favor of {@link #getGivenOrDefaultPartitionsSpec()}.
-     */
-    @Override
-    @Nullable
-    @Deprecated
-    public Long getMaxTotalRows()
-    {
-      return partitionsSpec instanceof DynamicPartitionsSpec
-             ? ((DynamicPartitionsSpec) partitionsSpec).getMaxTotalRows()
-             : null;
-    }
-
-    @Deprecated
-    @Nullable
-    public Integer getNumShards()
-    {
-      return partitionsSpec instanceof HashedPartitionsSpec
-             ? ((HashedPartitionsSpec) partitionsSpec).getNumShards()
-             : null;
-    }
-
-    @Deprecated
-    public List<String> getPartitionDimensions()
-    {
-      return partitionsSpec instanceof HashedPartitionsSpec
-             ? ((HashedPartitionsSpec) partitionsSpec).getPartitionDimensions()
-             : Collections.emptyList();
-    }
-
    @JsonProperty
    @Nullable
    public PartitionsSpec getPartitionsSpec()
@@ -1364,12 +1321,6 @@ public class IndexTask extends AbstractBatchIndexTask implements ChatHandler
      return indexSpecForIntermediatePersists;
    }

-    @Override
-    public File getBasePersistDirectory()
-    {
-      return basePersistDirectory;
-    }
-
    @JsonProperty
    @Override
    public int getMaxPendingPersists()
@@ -1406,6 +1357,14 @@ public class IndexTask extends AbstractBatchIndexTask implements ChatHandler
      return pushTimeout;
    }

+    @Nullable
+    @Override
+    @JsonProperty
+    public SegmentWriteOutMediumFactory getSegmentWriteOutMediumFactory()
+    {
+      return segmentWriteOutMediumFactory;
+    }
+
    @JsonProperty
    public boolean isLogParseExceptions()
    {
@@ -1424,20 +1383,65 @@ public class IndexTask extends AbstractBatchIndexTask implements ChatHandler
      return maxSavedParseExceptions;
    }

+    /**
+     * Return the max number of rows per segment. This returns null if it's not specified in tuningConfig.
+     * Deprecated in favor of {@link #getGivenOrDefaultPartitionsSpec()}.
+     */
+    @Nullable
+    @Override
+    @Deprecated
+    @JsonProperty
+    public Integer getMaxRowsPerSegment()
+    {
+      return partitionsSpec == null ? null : partitionsSpec.getMaxRowsPerSegment();
+    }
+
+    /**
+     * Return the max number of total rows in appenderator. This returns null if it's not specified in tuningConfig.
+     * Deprecated in favor of {@link #getGivenOrDefaultPartitionsSpec()}.
+     */
+    @Override
+    @Nullable
+    @Deprecated
+    @JsonProperty
+    public Long getMaxTotalRows()
+    {
+      return partitionsSpec instanceof DynamicPartitionsSpec
+             ? ((DynamicPartitionsSpec) partitionsSpec).getMaxTotalRows()
+             : null;
+    }
+
+    @Deprecated
+    @Nullable
+    @JsonProperty
+    public Integer getNumShards()
+    {
+      return partitionsSpec instanceof HashedPartitionsSpec
+             ? ((HashedPartitionsSpec) partitionsSpec).getNumShards()
+             : null;
+    }
+
+    @Deprecated
+    @JsonProperty
+    public List<String> getPartitionDimensions()
+    {
+      return partitionsSpec instanceof HashedPartitionsSpec
+             ? ((HashedPartitionsSpec) partitionsSpec).getPartitionDimensions()
+             : Collections.emptyList();
+    }
+
+    @Override
+    public File getBasePersistDirectory()
+    {
+      return basePersistDirectory;
+    }
+
    @Override
    public Period getIntermediatePersistPeriod()
    {
      return new Period(Integer.MAX_VALUE); // intermediate persist doesn't make much sense for batch jobs
    }

-    @Nullable
-    @Override
-    @JsonProperty
-    public SegmentWriteOutMediumFactory getSegmentWriteOutMediumFactory()
-    {
-      return segmentWriteOutMediumFactory;
-    }
-
    @Override
    public boolean equals(Object o)
    {
@@ -0,0 +1,262 @@
/*
 * Licensed to the Apache Software Foundation (ASF) under one
 * or more contributor license agreements.  See the NOTICE file
 * distributed with this work for additional information
 * regarding copyright ownership.  The ASF licenses this file
 * to you under the Apache License, Version 2.0 (the
 * "License"); you may not use this file except in compliance
 * with the License.  You may obtain a copy of the License at
 *
 *   http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing,
 * software distributed under the License is distributed on an
 * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
 * KIND, either express or implied.  See the License for the
 * specific language governing permissions and limitations
 * under the License.
 */

package org.apache.druid.indexing.common.task;

import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.jsontype.NamedType;
import com.google.common.collect.ImmutableList;
import org.apache.druid.indexer.partitions.DynamicPartitionsSpec;
import org.apache.druid.indexer.partitions.HashedPartitionsSpec;
import org.apache.druid.indexing.common.task.IndexTask.IndexTuningConfig;
import org.apache.druid.jackson.DefaultObjectMapper;
import org.apache.druid.segment.IndexSpec;
import org.apache.druid.segment.data.CompressionFactory.LongEncodingStrategy;
import org.apache.druid.segment.data.CompressionStrategy;
import org.apache.druid.segment.data.RoaringBitmapSerdeFactory;
import org.apache.druid.segment.indexing.TuningConfig;
import org.apache.druid.segment.writeout.OffHeapMemorySegmentWriteOutMediumFactory;
import org.junit.Assert;
import org.junit.BeforeClass;
import org.junit.Rule;
import org.junit.Test;
import org.junit.rules.ExpectedException;

import java.io.IOException;

public class IndexTaskSerdeTest
{
  private static final ObjectMapper MAPPER = new DefaultObjectMapper();

  @Rule
  public ExpectedException expectedException = ExpectedException.none();

  @BeforeClass
  public static void setup()
  {
    MAPPER.registerSubtypes(new NamedType(IndexTuningConfig.class, "index"));
  }

  @Test
  public void testSerdeTuningConfigWithDynamicPartitionsSpec() throws IOException
  {
    final IndexTuningConfig tuningConfig = new IndexTuningConfig(
        null,
        null,
        100,
        2000L,
        null,
        null,
        null,
        null,
        new DynamicPartitionsSpec(1000, 2000L),
        new IndexSpec(
            new RoaringBitmapSerdeFactory(false),
            CompressionStrategy.LZ4,
            CompressionStrategy.LZF,
            LongEncodingStrategy.LONGS
        ),
        null,
        null,
        false,
        null,
        null,
        100L,
        OffHeapMemorySegmentWriteOutMediumFactory.instance(),
        true,
        10,
        100
    );
    assertSerdeTuningConfig(tuningConfig);
  }

  @Test
  public void testSerdeTuningConfigWithHashedPartitionsSpec() throws IOException
  {
    final IndexTuningConfig tuningConfig = new IndexTuningConfig(
        null,
        null,
        100,
        2000L,
        null,
        null,
        null,
        null,
        new HashedPartitionsSpec(null, 10, ImmutableList.of("dim1", "dim2")),
        new IndexSpec(
            new RoaringBitmapSerdeFactory(false),
            CompressionStrategy.LZ4,
            CompressionStrategy.LZF,
            LongEncodingStrategy.LONGS
        ),
        null,
        null,
        true,
        null,
        null,
        100L,
        OffHeapMemorySegmentWriteOutMediumFactory.instance(),
        true,
        10,
        100
    );
    assertSerdeTuningConfig(tuningConfig);
  }

  @Test
  public void testSerdeTuningConfigWithDeprecatedDynamicPartitionsSpec() throws IOException
  {
    final IndexTuningConfig tuningConfig = new IndexTuningConfig(
        null,
        1000,
        100,
        2000L,
        3000L,
        null,
        null,
        null,
        null,
        new IndexSpec(
            new RoaringBitmapSerdeFactory(false),
            CompressionStrategy.LZ4,
            CompressionStrategy.LZF,
            LongEncodingStrategy.LONGS
        ),
        null,
        null,
        false,
        null,
        null,
        100L,
        OffHeapMemorySegmentWriteOutMediumFactory.instance(),
        true,
        10,
        100
    );
    assertSerdeTuningConfig(tuningConfig);
  }

  @Test
  public void testSerdeTuningConfigWithDeprecatedHashedPartitionsSpec() throws IOException
  {
    final IndexTuningConfig tuningConfig = new IndexTuningConfig(
        null,
        null,
        100,
        2000L,
        null,
        null,
        10,
        ImmutableList.of("dim1", "dim2"),
        null,
        new IndexSpec(
            new RoaringBitmapSerdeFactory(false),
            CompressionStrategy.LZ4,
            CompressionStrategy.LZF,
            LongEncodingStrategy.LONGS
        ),
        null,
        null,
        false,
        null,
        null,
        100L,
        OffHeapMemorySegmentWriteOutMediumFactory.instance(),
        true,
        10,
        100
    );
    assertSerdeTuningConfig(tuningConfig);
  }

  @Test
  public void testForceGuaranteedRollupWithDynamicPartitionsSpec()
  {
    expectedException.expect(IllegalStateException.class);
    expectedException.expectMessage("HashedPartitionsSpec must be used for perfect rollup");
    final IndexTuningConfig tuningConfig = new IndexTuningConfig(
        null,
        null,
        100,
        2000L,
        null,
        null,
        null,
        null,
        new DynamicPartitionsSpec(1000, 2000L),
        new IndexSpec(
            new RoaringBitmapSerdeFactory(false),
            CompressionStrategy.LZ4,
            CompressionStrategy.LZF,
            LongEncodingStrategy.LONGS
        ),
        null,
        null,
        true,
        null,
        null,
        100L,
        OffHeapMemorySegmentWriteOutMediumFactory.instance(),
        true,
        10,
        100
    );
  }

  @Test
  public void testBestEffortRollupWithHashedPartitionsSpec()
  {
    expectedException.expect(IllegalStateException.class);
    expectedException.expectMessage("DynamicPartitionsSpec must be used for best-effort rollup");
    final IndexTuningConfig tuningConfig = new IndexTuningConfig(
        null,
        null,
        100,
        2000L,
        null,
        null,
        null,
        null,
        new HashedPartitionsSpec(null, 10, ImmutableList.of("dim1", "dim2")),
        new IndexSpec(
            new RoaringBitmapSerdeFactory(false),
            CompressionStrategy.LZ4,
            CompressionStrategy.LZF,
            LongEncodingStrategy.LONGS
        ),
        null,
        null,
        false,
        null,
        null,
        100L,
        OffHeapMemorySegmentWriteOutMediumFactory.instance(),
        true,
        10,
        100
    );
  }

  private static void assertSerdeTuningConfig(IndexTuningConfig tuningConfig) throws IOException
  {
    final byte[] json = MAPPER.writeValueAsBytes(tuningConfig);
    final IndexTuningConfig fromJson = (IndexTuningConfig) MAPPER.readValue(json, TuningConfig.class);
    Assert.assertEquals(tuningConfig, fromJson);
  }
}