Commit Graph

236 Commits

Author SHA1 Message Date
Adrien Grand 398d70b567 Add `scaled_float`. #19264
This is a tentative to revive #15939 motivated by elastic/beats#1941.
Half-floats are a pretty bad option for storing percentages. They would likely
require 2 bytes all the time while they don't need more than one byte.

So this PR exposes a new `scaled_float` type that requires a `scaling_factor`
and internally indexes `value*scaling_factor` in a long field. Compared to the
original PR it exposes a lower-level API so that the trade-offs are clearer and
avoids any reference to fixed precision that might imply that this type is more
accurate (actually it is *less* accurate).

In addition to being more space-efficient for some use-cases that beats is
interested in, this is also faster that `half_float` unless we can improve the
efficiency of decoding half-float bits (which is currently done using software)
or until Java gets first-class support for half-floats.
2016-07-18 12:36:23 +02:00
Nik Everett 7aeea764ba Remove wait_for_status=yellow from the docs
It is no longer required after 687e2e12b3.
2016-07-15 16:02:07 -04:00
Clinton Gormley 05271d58ca Updated fielddata docs to make it easier for users with old mappings 2016-07-14 19:58:12 +02:00
Martijn van Groningen ff5527f037 percolator: Forbid the usage or `range` queries with a range based on the current time
If there are percolator queries containing `range` queries with ranges based on the current time then this can lead to incorrect results if the `percolate` query gets cached.  These ranges are changing each time the `percolate` query gets executed and if this query gets cached then the results will be based on how the range was at the time when the `percolate` query got cached.

The ExtractQueryTermsService has been renamed `QueryAnalyzer` and now only deals with analyzing the query (extracting terms and deciding if the entire query is a verified match) . The `PercolatorFieldMapper` is responsible for adding the right fields based on the analysis the `QueryAnalyzer` has performed, because this is highly dependent on the field mappings. Also the `PercolatorFieldMapper` is responsible for creating the percolate query.
2016-07-08 14:20:56 +02:00
Britta Weber f36c1b4e60 Update fielddata.asciidoc 2016-07-05 16:21:52 +02:00
Jim Ferenczi afe99fcdcd Restore reverted change now that alpha4 is out:
Rename `fields` to `stored_fields` and add `docvalue_fields`

`stored_fields` parameter will no longer try to retrieve fields from the _source but will only return stored fields.
`fields` will throw an exception if the user uses it.
Add `docvalue_fields` as an adjunct to `fielddata_fields` which is deprecated. `docvalue_fields` will try to load the value from the docvalue and fallback to fielddata cache if docvalues are not enabled on that field.

Closes #18943
2016-07-04 10:39:49 +02:00
Jim Ferenczi 6d2df0dc18 Fix docs example for the _id field, the field is not accessible in scripts 2016-06-29 15:25:51 +02:00
Robert Muir 6d52cec2a0 Merge pull request #19092 from rmuir/more_painless_docs
cutover some docs to painless
2016-06-28 13:40:25 -04:00
Jim Ferenczi eb1e231a63 Revert "Rename `fields` to `stored_fields` and add `docvalue_fields`"
This reverts commit 2f46f53dc8.
2016-06-27 17:20:32 +02:00
Robert Muir 6fc1a22977 cutover some docs to painless 2016-06-27 09:55:16 -04:00
Martijn van Groningen 0cae9ad30e docs: removed obsolete information, percolator queries are not longer loaded into jvm heap memory. 2016-06-23 15:32:26 +02:00
Jim Ferenczi 2f46f53dc8 Rename `fields` to `stored_fields` and add `docvalue_fields`
`stored_fields` parameter will no longer try to retrieve fields from the _source but will only return stored fields.
`fields` will throw an exception if the user uses it.
Add `docvalue_fields` as an adjunct to `fielddata_fields` which is deprecated. `docvalue_fields` will try to load the value from the docvalue and fallback to fielddata cache if docvalues are not enabled on that field.

Closes #18943
2016-06-22 17:38:30 +02:00
Adrien Grand 7d63f4b8db Fix doc build. 2016-06-22 09:34:49 +02:00
Adrien Grand db9af54ec0 Remove `_timestamp` and `_ttl` on 5.x indices. #18980
This removes the ability to use `_timestamp` and `_ttl` on indices created on
or after 5.0.

Closes #18280
2016-06-22 08:35:54 +02:00
Clinton Gormley 0160d91c2c Removed docs for precision_step - no longer used 2016-06-21 15:19:12 +02:00
Adrien Grand 9ffb2ff6ba Expose half-floats. #18887
They have been implemented in https://issues.apache.org/jira/browse/LUCENE-7289.
Ranges are implemented so that the accuracy loss only occurs at index time,
which means that if you are searching for values between A and B, the query will
match exactly all documents whose value rounded to the closest half-float point
is between A and B.
2016-06-16 09:46:39 +02:00
Jim Ferenczi 6d62f33702 Make doc_values accessible for _type
`doc_values` for _type field are created but any attempt to load them throws an IAE.

This PR re-enables `doc_values` loading for _type, it also enables `fielddata` loading for indices created between 2.0 and 2.1 since doc_values were disabled during that period.

It also restores the old docs that gives example on how to sort or aggregate on _type field.
2016-05-25 18:56:13 +02:00
G. Richard Bellamy cf54903580 Support full range of Java Long for epoch DateTime
Remove the arbitrary limit on epoch_millis and epoch_seconds of 13 and 10
characters, respectively. Instead allow any character combination that can
be converted to a Java Long.

Update the docs to reflect this change.
2016-05-22 13:08:20 -07:00
Clinton Gormley 97a41ee973 First pass at improving analyzer docs (#18269)
* Docs: First pass at improving analyzer docs

I've rewritten the intro to analyzers plus the docs
for all analyzers to provide working examples.

I've also removed:

* analyzer aliases (see #18244)
* analyzer versions (see #18267)
* snowball analyzer (see #8690)

Next steps will be tokenizers, token filters, char filters

* Fixed two typos
2016-05-11 14:17:56 +02:00
Clinton Gormley 3f594089c2 Renamed all AUTOSENSE snippets to CONSOLE (#18210) 2016-05-09 15:42:23 +02:00
Clinton Gormley b352a90454 Correct docs for dynamic mapping of fields
Floating point numbers are added as `float`, and Strings are added as `text` with `keyword sub-field
2016-05-07 17:16:31 +02:00
Nik Everett cb40b986d1 Allow leading `/` in AUTOSENSE path
Relates to #18160
2016-05-06 09:26:19 -04:00
Clinton Gormley c55df195c5 Fixed bad asciidoc 2016-05-06 09:25:58 +02:00
Nik Everett f3b2ab822d Another wait_for_yellow to the docs
All in service of the snippets passing consistently.
2016-05-05 19:03:23 -04:00
Nik Everett 4b1c116461 Generate and run tests from the docs
Adds infrastructure so `gradle :docs:check` will extract tests from
snippets in the documentation and execute the tests. This is included
in `gradle check` so it should happen on CI and during a normal build.

By default each `// AUTOSENSE` snippet creates a unique REST test. These
tests are executed in a random order and the cluster is wiped between
each one. If multiple snippets chain together into a test you can annotate
all snippets after the first with `// TEST[continued]` to have the
generated tests for both snippets joined.

Snippets marked as `// TESTRESPONSE` are checked against the response
of the last action.

See docs/README.asciidoc for lots more.

Closes #12583. That issue is about catching bugs in the docs during build.
This catches *some* bugs in the docs during build which is a good start.
2016-05-05 13:58:03 -04:00
Adrien Grand 80dbe31d59 Add note about using ipv6 addresses in `query_string`. 2016-05-04 08:53:11 +02:00
Clinton Gormley 7c8397d99b Update keyword.asciidoc
`ignore_above` doesn't apply to analyzed `text` fields
2016-05-02 13:47:14 +02:00
Robin Joseph e322903f2c Fix typo in include-in-all.asciidoc (#18055) 2016-04-29 18:03:22 +02:00
Shane Connelly 713c0df3a3 Merge pull request #17994 from eskibars/master
Add new IPv6 types to docs where it's supported
2016-04-29 06:00:32 -07:00
Clinton Gormley 84a2b4e17e Update id-field.asciidoc
Clarified which queries support the `_id` field
2016-04-28 13:36:14 +02:00
Christoph Büscher a2c3b5cae1 Update keyword.asciidoc 2016-04-27 12:10:19 +02:00
Shane Connelly aff148f532 Add new IPv6 types to docs where it's supported 2016-04-26 11:38:49 -07:00
Martijn van Groningen 81449fc912 percolator: renamed `percolator` query to `percolate` query 2016-04-20 15:23:54 +02:00
Martijn van Groningen 40c22fc654 percolator: removed .percolator type instead a field of type `percolator` should be configured before indexing percolator queries
* Added an extra `field` parameter to the `percolator` query to indicate what percolator field should be used. This must be an existing field in the mapping of type `percolator`.
* The `.percolator` type is now forbidden. (just like any type that starts with a `.`)

This only applies for new indices created on 5.0 and later. Indices created on previous versions the .percolator type is still allowed to exist.
The new `percolator` field type isn't active in such indices and the `PercolatorQueryCache` knows how to load queries from these legacy indices.
The `PercolatorQueryBuilder` will not enforce that the `field` parameter is of type `percolator`.
2016-04-19 11:20:31 +02:00
LeonardGC 0b8be7f894 Update field-mapping.asciidoc (#17670) 2016-04-15 09:22:38 +02:00
Adrien Grand d84c643f58 Use the new points API to index numeric fields. #17746
This makes all numeric fields including `date`, `ip` and `token_count` use
points instead of the inverted index as a lookup structure. This is expected
to perform worse for exact queries, but faster for range queries. It also
requires less storage.

Notes about how the change works:
 - Numeric mappers have been split into a legacy version that is essentially
   the current mapper, and a new version that uses points, eg.
   LegacyDateFieldMapper and DateFieldMapper.
 - Since new and old fields have the same names, the decision about which one
   to use is made based on the index creation version.
 - If you try to force using a legacy field on a new index or a field that uses
   points on an old index, you will get an exception.
 - IP addresses now support IPv6 via Lucene's InetAddressPoint and store them
   in SORTED_SET doc values using the same encoding (fixed length of 16 bytes
   and sortable).
 - The internal MappedFieldType that is stored by the new mappers does not have
   any of the points-related properties set. Instead, it keeps setting the index
   options when parsing the `index` property of mappings and does
   `if (fieldType.indexOptions() != IndexOptions.NONE) { // add point field }`
   when parsing documents.

Known issues that won't fix:
 - You can't use numeric fields in significant terms aggregations anymore since
   this requires document frequencies, which points do not record.
 - Term queries on numeric fields will now return constant scores instead of
   giving better scores to the rare values.

Known issues that we could work around (in follow-up PRs, this one is too large
already):
 - Range queries on `ip` addresses only work if both the lower and upper bounds
   are inclusive (exclusive bounds are not exposed in Lucene). We could either
   decide to implement it, or drop range support entirely and tell users to
   query subnets using the CIDR notation instead.
 - Since IP addresses now use a different representation for doc values,
   aggregations will fail when running a terms aggregation on an ip field on a
   list of indices that contains both pre-5.0 and 5.0 indices.
 - The ip range aggregation does not work on the new ip field. We need to either
   implement range aggs for SORTED_SET doc values or drop support for ip ranges
   and tell users to use filters instead. #17700

Closes #16751
Closes #17007
Closes #11513
2016-04-14 17:56:23 +02:00
Nik Everett 0f9804b0e2 reindex: gracefully handle when _source is disabled
Closes #17666
2016-04-13 08:19:58 -04:00
Ibrahim Awwal 5121060e75 Fix typo in templates.asciidoc
The doc mentions match_path in one place but the correct syntax is path_match which is mentioned everywhere else. Using the wrong string leads to errors because the mapping becomes too greedy, and matches things it shouldn't.
2016-04-06 16:40:20 -06:00
Sergii Golubev 8430b379d8 string.asciidoc: fix for `position_increment_gap`
Remove  outdated and duplicate description for the `position_increment_gap` parameter.
2016-04-05 16:23:42 -04:00
Adrien Grand 26a0fb37a4 Add examples of useful dynamic templates to the docs. #17413 2016-03-31 09:45:11 +02:00
Adrien Grand fc47007e17 Add a soft limit on the mapping depth. #17400
This commit adds the new `index.mapping.depth.limit` setting which controls the
maximum mapping depth that is allowed. It has a default value of 20.
2016-03-30 14:37:00 +02:00
Yanjun Huang 361adcf387 Add limit to total number of fields in mapping. #17357
This is to prevent mapping explosion when dynamic keys such as UUID are used as field names. index.mapping.total_fields.limit specifies the total number of fields an index can have. An exception will be thrown when the limit is reached. The default limit is 1000. Value 0 means no limit. This setting is runtime adjustable

Closes #11443
2016-03-29 19:39:46 +02:00
Adrien Grand b42f66c8ac Document 5.0 mapping changes. 2016-03-22 16:22:58 +01:00
Clinton Gormley 2fa573bc58 Missing word in docs 2016-03-10 14:34:05 +01:00
Nicholas Knize 55635d5de1 update coerce and breaking changes documentation 2016-03-09 16:09:44 -06:00
Nicholas Knize 61f39e6c92 GeoPointV2 update docs and query builders
This commit updates the documentation for GeoPointField by removing all references to the coerce and doc_values parameters. DocValues are enabled in lucene GeoPointField by default (required for boundary filtering). The QueryBuilders are updated to automatically normalize points (ignoring the coerce parameter) for any index created onOrAfter version 2.2.
2016-03-09 16:09:44 -06:00
Jim Ferenczi 927303e7a9 Change the field mapping index time boost into a query time boost.
Index time boost will still be applied for indices created before 5.0.0.
2016-03-04 11:47:35 +01:00
Clinton Gormley 05e3cd6b97 Merge pull request #16878 from peschlowp/patch-8
Update index-options.asciidoc
2016-03-02 10:52:44 +01:00
Clinton Gormley 812f03a33f Merge pull request #16842 from anhlqn/patch-1
Fix minor spelling
2016-02-29 01:32:42 +01:00
Clinton Gormley 00b9640208 Merge pull request #16672 from teuneboon/patch-1
Clarify text about date format range
2016-02-15 16:16:19 +01:00