mirror of https://github.com/apache/druid.git
fix typo in segments.md
parent 252bb5a6bc
commit e222e6b86b
@@ -82,11 +82,11 @@ Note that the bitmap is different from the first two data structures:
 whereas the first two grow linearly in the size of the data (in the
 worst case), the size of the bitmap section is the product of data
 size * column cardinality. Compression will help us here though
-because we know that each row will have only non-zero entry in a only
-a single bitmap. This means that high cardinality columns will have
-extremely sparse, and therefore highly compressible, bitmaps. Druid
-exploits this using compression algorithms that are specially suited
-for bitmaps, such as roaring bitmap compression.
+because we know that for each row in 'column data', there will only be a
+single bitmap that has non-zero entry. This means that high cardinality
+columns will have extremely sparse, and therefore highly compressible,
+bitmaps. Druid exploits this using compression algorithms that are
+specially suited for bitmaps, such as roaring bitmap compression.
 
 ### Multi-value columns
 
@@ -121,8 +121,8 @@ data structures would now look as follows:
 Note the changes to the second row in the column data and the Ke$ha
 bitmap. If a row has more than one value for a column, its entry in
 the 'column data' is an array of values. Additionally, a row with *n*
-values in a column columns will have *n* non-zero valued entries in
-that column's bitmaps.
+values in 'column data' will have *n* non-zero valued entries in
+bitmaps.
 
 Naming Convention
 -----------------
@@ -176,4 +176,4 @@ representing the same time interval for the same datasource may be
 created. These segments will contain some partition number as part of
 their identifier. Sharding by dimension reduces some of the the costs
 associated with operations over high cardinality dimensions. For more
-information on sharding, see the ingestion documentat
+information on sharding, see the ingestion documentation.
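
Review note: the two reworded claims in this change (a single-value row sets a bit in exactly one bitmap; a row with *n* values sets *n* bits) can be checked with a small sketch. This is not Druid's implementation. `build_bitmaps` and `set_bits_per_row` are hypothetical helpers, the bitmaps are plain uncompressed Python lists rather than roaring bitmaps, and the sample rows are illustrative values chosen to match the "second row" and "Ke$ha bitmap" wording in the diff.

```python
from collections import defaultdict


def build_bitmaps(column_data):
    """Build one uncompressed 0/1 list per distinct value, one flag per row."""
    n_rows = len(column_data)
    bitmaps = defaultdict(lambda: [0] * n_rows)
    for row, entry in enumerate(column_data):
        # A multi-value row is represented here as a list of values.
        values = entry if isinstance(entry, list) else [entry]
        for value in values:
            bitmaps[value][row] = 1
    return dict(bitmaps)


def set_bits_per_row(bitmaps, n_rows):
    """Count, for each row, how many bitmaps have a non-zero entry."""
    return [sum(bitmap[row] for bitmap in bitmaps.values()) for row in range(n_rows)]


# Single-value column: every row appears in exactly one bitmap.
single = build_bitmaps(["Justin Bieber", "Justin Bieber", "Ke$ha", "Ke$ha"])

# Multi-value column: the second row carries two values.
multi = build_bitmaps(["Justin Bieber", ["Justin Bieber", "Ke$ha"], "Ke$ha", "Ke$ha"])

print(set_bits_per_row(single, 4))
print(set_bits_per_row(multi, 4))
print(multi["Ke$ha"])
```

The uncompressed size here is rows times distinct values (4 * 2 = 8 flags), which is the "data size * column cardinality" product the first hunk refers to; with one set bit per row in the single-value case, most flags are zero once cardinality grows.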