Commit Graph

451 Commits

Author SHA1 Message Date
Jim Ferenczi 7ad71f906a
Upgrade to a Lucene 8 snapshot (#33310)
The main benefit of the upgrade for users is the search optimization for top scored documents when the total hit count is not needed. However this optimization is not activated in this change, there is another issue opened to discuss how it should be integrated smoothly.
Some comments about the change:
* Tests that can produce negative scores have been adapted but we need to forbid them completely: #33309

Closes #32899
2018-09-06 14:42:06 +02:00
Pablo Musa a88f8789a0
Highlight that index_phrases only works if no slop is used (#33303)
Highlight that `index_phrases` only works if no slop is used at query time.
2018-08-31 14:48:55 +02:00
Luca Cavanna 393eec1482
Set maxScore for empty TopDocs to Nan rather than 0 (#32938)
We used to set `maxScore` to `0` within `TopDocs` in situations where there is really no score as the size was set to `0` and scores were not even tracked. In such scenarios, `Float.Nan` is more appropriate, which gets converted to `max_score: null` on the REST layer. That's also more consistent with lucene which set `maxScore` to `Float.Nan` when merging empty `TopDocs` (see `TopDocs#merge`).
2018-08-22 17:23:54 +02:00
Dimitrios Liappis abb4c183f1
Clarify ignore_above behavior with arrays of strings
Currently docs don't explain how `ignore_above` behaves with arrays of
strings.

Clarify how `ignore_above` applies for arrays of strings and
also note that all string(s) will still be visible in the
`_source` field.

Relates #33057
2018-08-22 18:18:30 +03:00
Julie Tibshirani 815c56b677
Fix an inaccuracy in the dynamic templates documentation. (#32890) 2018-08-20 11:00:11 -07:00
Julie Tibshirani 0f0068b91c
Ensure that field aliases cannot be used in multi-fields. (#32219) 2018-07-20 00:18:54 -07:00
Julie Tibshirani 15ff3da653
Add support for field aliases. (#32172)
* Add basic support for field aliases in index mappings. (#31287)
* Allow for aliases when fetching stored fields. (#31411)
* Add tests around accessing field aliases in scripts. (#31417)
* Add documentation around field aliases. (#31538)
* Add validation for field alias mappings. (#31518)
* Return both concrete fields and aliases in DocumentFieldMappers#getMapper. (#31671)
* Make sure that field-level security is enforced when using field aliases. (#31807)
* Add more comprehensive tests for field aliases in queries + aggregations. (#31565)
* Remove the deprecated method DocumentFieldMappers#getFieldMapper. (#32148)
2018-07-18 09:33:09 -07:00
Nik Everett 0522c6644d Docs: Remove duplicate test setup
The range docs had an introductory section that described how to set up
and index *and* a test setup section in `docs/build.gradle` that
duplicated that section. This is bad because these section can (and do)
drift from one another. This change removes the setup in build.gradle
and marks the introductor snippet with `// TESTSETUP` so it is used on
all the snippets.
2018-06-28 10:59:35 -04:00
Peter Dyson e7a7b9689d [Docs] Mention ip_range datatypes on ip type page (#31416)
A link to the ip_range datatype page provides a way for newer users to know
it exists if they land directly on the ip datatype page first via a search.
2018-06-20 13:04:03 +02:00
Julie Tibshirani 3f5ebb862d
Clarify that IP range data can be specified in CIDR notation. (#31374) 2018-06-18 08:21:41 -07:00
David Turner 6ad7217656
Remove reference to multiple fields with one name (#31127)
If there is only one type per index then each field's name is unique.
2018-06-07 12:38:57 +01:00
Rafał Bigaj 749d39061a [Docs] Correct minor typos in templates.asciidoc (#31167) 2018-06-07 10:44:57 +02:00
Adrien Grand 458bca11bc
Add a `feature_vector` field. (#31102)
This field is similar to the `feature` field but is better suited to index
sparse feature vectors. A use-case for this field could be to record topics
associated with every documents alongside a metric that quantifies how well
the topic is connected to this document, and then boost queries based on the
topics that the logged user is interested in.

Relates #27552
2018-06-07 10:05:37 +02:00
Colin Goodheart-Smithe d09d60858a
[DOCS] Clarify nested datatype introduction (#31055) 2018-06-06 09:32:45 +01:00
Christoph Büscher 1cee45e768
[Docs] Delete superfluous callouts (#31111)
Those callout create rendering problems on the subsequent page.

Closes #30532
2018-06-06 09:53:14 +02:00
Adrien Grand 500094f5c8
Improve documentation of dynamic mappings. (#30952)
Closes #30939
2018-06-05 08:51:52 +02:00
Jim Ferenczi fa6b7266eb Remove wrong link in index phrases doc
Relates #30450
2018-06-04 12:13:55 +02:00
Colin Goodheart-Smithe 1efb1aae28
[DOCS] Rewords _field_names documentation (#31029)
* [DOCS] Rewords _field_names documentation

Corrects the language around when we write to `_field_names` and when you might want to disable it given that n recent versions it does not carry the indexing overhead it once did.

Relates to #30862

* Update wording following review
2018-06-04 09:17:11 +01:00
Alan Woodward 0427339ab0
Index phrases (#30450)
Specifying `index_phrases: true` on a text field mapping will add a subsidiary
[field]._index_phrase field, indexing two-term shingles from the parent field.
The parent analysis chain is re-used, wrapped with a FixedShingleFilter.

At query time, if a phrase match query is executed, the mapping will redirect it
to run against the subsidiary field.

This should trade faster phrase querying for a larger index and longer indexing
times.

Relates to #27049
2018-06-04 08:50:35 +01:00
Igor Motov 7376c35960
[DOCS] Make geoshape docs less memory hungry (#31014)
Reduces shape size and precision in geo shape mapper examples to reduce
amount of memory required to check docs.

Fixes #23836
2018-06-01 15:05:37 -04:00
Jim Ferenczi 0791f93dbd
Add an option to split keyword field on whitespace at query time (#30691)
This change adds an option named `split_queries_on_whitespace` to the `keyword`
field type. When set to true full text queries (`match`, `multi_match`, `query_string`, ...) that target the field will split the input on whitespace to build the query terms. Defaults to `false`.
Closes #30393
2018-06-01 09:47:03 +02:00
Alan Woodward 67905c85a5
Rename index_prefix to index_prefixes (#30932)
This commit also adds index_prefixes tests to TextFieldMapperTests to ensure that cloning and wire-serialization work correctly
2018-05-30 08:32:31 +01:00
Adrien Grand 886db84ad2
Expose Lucene's FeatureField. (#30618)
Lucene has a new `FeatureField` which gives the ability to record numeric
features as term frequencies. Its main benefit is that it allows to boost
queries with the values of these features and efficiently skip non-competitive
documents at the same time using block-max WAND and indexed impacts.
2018-05-23 08:55:21 +02:00
Jason Tedor 4a4e3d70d5
Default to one shard (#30539)
This commit changes the default out-of-the-box configuration for the
number of shards from five to one. We think this will help address a
common problem of oversharding. For users with time-based indices that
need a different default, this can be managed with index templates. For
users with non-time-based indices that find they need to re-shard with
the split API in place they no longer need to resort only to
reindexing.

Since this has the impact of changing the default number of shards used
in REST tests, we want to ensure that we still have coverage for issues
that could arise from multiple shards. As such, we randomize (rarely)
the default number of shards in REST tests to two. This is managed via a
global index template. However, some tests check the templates that are
in the cluster state during the test. Since this template is randomly
there, we need a way for tests to skip adding the template used to set
the number of shards to two. For this we add the default_shards feature
skip. To avoid having to write our docs in a complicated way because
sometimes they might be behind one shard, and sometimes they might be
behind two shards we apply the default_shards feature skip to all docs
tests. That is, these tests will always run with the default number of
shards (one).
2018-05-14 12:22:35 -04:00
Sue Gallagher 09a6ba4fea
Change quad tree max levels to 29. Closes #21191 (#29663)
* [DOCS] Changed quad tree max levels to 29. Clears 21191

* Changed QuadPrefixTree max levels to 29 and added defaults. Closes #21191
2018-05-03 09:48:21 -07:00
wmellouli c8d8407012 [Docs] Add term query with normalizer example 2018-05-03 10:23:14 +02:00
Adrien Grand 5991ede9ef Fix docs of the `_ignored` meta field.
Relates #29658
2018-05-02 11:43:50 +02:00
Adrien Grand 7358946bda
Add a new `_ignored` meta field. (#29658)
This adds a new `_ignored` meta field which indexes and stores fields that have
been ignored at index time because of the `ignore_malformed` option. It makes
malformed documents easier to identify by using `exists` or `term(s)` queries
on the `_ignored` field.

Closes #29494
2018-05-02 10:47:02 +02:00
Adrien Grand 0a5a9a2086 Remove reference to `not_analyzed`.
Relates #30122.
2018-04-25 15:00:53 +02:00
Adrien Grand 6e62b481b4
Update plan for the removal of mapping types. (#29586)
8.x will no longer allow types in APIs and 7.x will issue deprecation warnings
when `include_type_name` is set to `false`.
2018-04-19 15:09:14 +02:00
Igor Motov 983d6c15a2
Add null_value support to geo_point type (#29451)
Adds support for null_value attribute to the geo_point types.

Closes #12998
2018-04-17 10:19:54 -04:00
Adrien Grand 3367948be6
Add documentation about the include_type_name option. (#29555)
This option will be useful in 7.x to prepare for upgrade to 8.0 which won't
know about types anymore.
2018-04-17 15:04:46 +02:00
Igor Motov e334baf6fc
Fix overflow error in parsing of long geohashes (#29418)
Fixes a possible overflow error that geohashes longer than 12 characters
can cause during parsing.

Fixes #24616
2018-04-16 12:37:38 -04:00
Adrien Grand 3a147b442a Fix docs build. 2018-04-11 13:48:53 +02:00
Adrien Grand 4918924fae
Remove legacy mapping code. (#29224)
Some features have been deprecated since `6.0` like the `_parent` field or the
ability to have multiple types per index. This allows to remove quite some
code, which in-turn will hopefully make it easier to proceed with the removal
of types.
2018-04-11 09:41:37 +02:00
Adrien Grand 569d0c0e89
Improve similarity integration. (#29187)
This improves the way similarities are plugged in in order to:
 - reject the classic similarity on 7.x indices and emit a deprecation
   warning otherwise
 - reject unkwown parameters on 7.x indices and emit a deprecation
   warning otherwise

Even though this breaks the plugin API, I'd like to backport to 7.x so
that users can get deprecation warnings when they are doing something
that will become unsupported in the future.

Closes #23208
Closes #29035
2018-04-03 16:45:25 +02:00
David Turner 40d19532bc
Clarify expectations of false positives/negatives (#27964)
Today this part of the documentation just says that Geo queries are not 100% 
accurate, but in fact we can be more precise about which kinds of queries see
which kinds of error. This commit clarifies this point.
2018-04-02 10:03:42 +01:00
David Turner 3ca9310aee
Update docs on vertex ordering (#27963)
At time of writing, GeoJSON did not enforce a specific ordering of vertices in
a polygon, but it now does. We occasionally get reports of Elasticsearch
rejecting apparently-valid GeoJSON because of badly oriented polygons, and it's
helpful to be able to point at this bit of the documentation when responding.
2018-04-02 09:59:12 +01:00
Sue Gallagher 5518640d46
[DOCS] Added info on WGS-84. Closes issue #3590 (#29305) 2018-03-29 15:50:05 -07:00
Nicholas Knize d400a08788 [DOCS] Remove ignore_z_value parameter link
Removes invalid ignore_z_value parameter link in geo-point.asciidoc.
2018-03-23 11:07:24 -05:00
Nicholas Knize fede633563 Add Z value support to geo_shape
This enhancement adds Z value support (source only) to geo_shape fields. If vertices are provided with a third dimension, the third dimension is ignored for indexing but returned as part of source. Like beofre, any values greater than the 3rd dimension are ignored.

closes #23747
2018-03-23 08:50:55 -05:00
Adrien Grand 8f9d2ee4e2
Reject updates to the `_default_` mapping. (#29165)
This will reject mapping updates to the `_default_` mapping with 7.x indices
and still emit a deprecation warning with 6.x indices.

Relates #15613
Supersedes #28248
2018-03-21 10:44:11 +01:00
Adrien Grand 0755ff425f
Clarify requirements of strict date formats. (#29090)
Closes #29014
2018-03-16 14:39:36 +01:00
Adrien Grand 695ec05160
Clarify that dates are always rendered as strings. (#29093)
Even in the case that the date was originally supplied as a long in the
JSON document.

Closes #26504
2018-03-16 14:34:33 +01:00
Cladis 3234fb1369 Grammar: "by geographically" → "geographically" (#28595) 2018-02-15 16:12:58 -08:00
Alex Moros Marco abe1e05ba4 [Docs] Add missing word in nested.asciidoc (#28507) 2018-02-15 14:56:02 +01:00
Christoph Büscher bc10334f7a
[Docs] Move callouts in range.asciidoc (#28264)
Currently the callouts for this section are below all the examples, making it
harder to relate them to the snippets. Instead they should be moved closer 
to the examples.
2018-02-02 11:00:07 +01:00
Adrien Grand 3f5716b9b8
Clarify that the `null_value` option doesn't modify the `_source` document. (#28374)
Closes #15959
2018-01-31 15:04:11 +01:00
Adrien Grand 9163c9b8d1
Clarify the defaults for `ignore_above`. (#28372)
Closes #27992
2018-01-31 15:03:20 +01:00
Alan Woodward 424ecb3c7d
Add ability to index prefixes on text fields (#28290)
This adds the ability to index term prefixes into a hidden subfield, enabling prefix queries to be run without multitermquery rewrites. The subfield reuses the analysis chain of its parent text field, appending an EdgeNGramTokenFilter. It can be configured with minimum and maximum ngram lengths. Query terms with lengths outside this min-max range fall back to using prefix queries against the parent text field.

The mapping looks like this:

"my_text_field" : {
"type" : "text",
"analyzer" : "english",
"index_prefix" : { "min_chars" : 1, "max_chars" : 10 }
}

Relates to #27049
2018-01-30 08:26:56 +00:00
David Kemp 531c58cf81 Documents applicability of term query to range type (#28166)
Closes #27030
2018-01-18 17:19:01 -05:00
Christoph Büscher d4ac0026fc
[Docs] Clarify numeric datatype ranges (#28240)
Since #25826 we reject infinite values for float, double and half_float
datatypes. This change adds this restriction to the documentation for the
supported datatypes.

Closes #27653
2018-01-16 15:53:28 +01:00
Martijn van Groningen cef7bd2079
docs: add best practises for wildcard queries inside percolator queries 2017-12-15 10:49:59 +01:00
Adrien Grand 1b660821a2
Allow `_doc` as a type. (#27816)
Allowing `_doc` as a type will enable users to make the transition to 7.0
smoother since the index APIs will be `PUT index/_doc/id` and `POST index/_doc`.
This also moves most of the documentation to `_doc` as a type name.

Closes #27750
Closes #27751
2017-12-14 17:47:53 +01:00
Ryan Ernst c51e48bec0
Correct docs for binary fields and their default for doc values (#27680)
closes #27240
2017-12-05 15:10:18 -08:00
Nicholas Knize 8bcf5393f2 [Geo] Add Well Known Text (WKT) Parsing Support to ShapeBuilders
This commit adds WKT support to Geo ShapeBuilders.

This supports the following format:

POINT (30 10)
LINESTRING (30 10, 10 30, 40 40)
BBOX (-10, 10, 10, -10)
POLYGON ((30 10, 40 40, 20 40, 10 20, 30 10))
POLYGON ((35 10, 45 45, 15 40, 10 20, 35 10), (20 30, 35 35, 30 20, 20 30))
MULTIPOINT ((10 40), (40 30), (20 20), (30 10))
MULTIPOINT (10 40, 40 30, 20 20, 30 10)
MULTILINESTRING ((10 10, 20 20, 10 40),(40 40, 30 30, 40 20, 30 10))
MULTIPOLYGON (((30 20, 45 40, 10 40, 30 20)), ((15 5, 40 10, 10 20, 5 10, 15 5)))
MULTIPOLYGON (((40 40, 20 45, 45 30, 40 40)), ((20 35, 10 30, 10 10, 30 5, 45 20, 20 35), (30 20, 20 15, 20 25, 30 20)))
GEOMETRYCOLLECTION (POINT (30 10), MULTIPOINT ((10 40), (40 30), (20 20), (30 10)))

closes #9120
2017-12-05 10:56:41 -06:00
Clinton Gormley 0bba2a8438 Update removal_of_types.asciidoc
Corrected  `include_in_type` to `include_type_name`
2017-12-05 10:44:48 +01:00
Christoph Büscher 0d11b9fe34
[Docs] Unify spelling of Elasticsearch (#27567)
Removes occurences of "elasticsearch" or "ElasticSearch" in favour of
"Elasticsearch" where appropriate.
2017-11-29 09:44:25 +01:00
David Turner a165d1df40
Minor improvements to docs for numeric types (#27553)
* Caps
* Fix awkward wording that took multiple passes to parse
* Floating point _number_
* Something more descriptive about the `scaled_float` scaling factor.
2017-11-28 11:36:07 +00:00
Mayya Sharipova 57e4d10007
Limit the number of nested documents (#27405)
Add an index level setting `index.mapping.nested_objects.limit` to control
the number of nested json objects that can be in a single document
across all fields. Defaults to 10000.

Throw an error if the number of created nested documents exceed this
limit during the parsing of a document.

Closes #26962
2017-11-22 10:16:28 -05:00
Jim Ferenczi bf72858ce8
[Docs] Restore section about multi-level parent/child relation in parent-join (#27392)
This section was removed to hide this ability to new users.
This change restores the section and adds a warning regarding the expected performance.

Closes #27336
2017-11-16 11:29:16 +01:00
Martijn van Groningen b4048b4e7f
Use CoveringQuery to select percolate candidate matches and
extract all clauses from a conjunction query.

When clauses from a conjunction are extracted the number of clauses is
also stored in an internal doc values field (minimum_should_match field).
This field is used by the CoveringQuery and allows the percolator to
reduce the number of false positives when selecting candidate matches and
in certain cases be absolutely sure that a conjunction candidate match
will match and then skip MemoryIndex validation. This can greatly improve
performance.

Before this change only a single clause was extracted from a conjunction
query. The percolator tried to extract the clauses that was rarest in order
(based on term length) to attempt less candidate queries to be selected
in the first place. However this still method there is still a very high
chance that candidate query matches are false positives.

This change also removes the influencing query extraction added via #26081
as this is no longer needed because now all conjunction clauses are extracted.

https://www.elastic.co/guide/en/elasticsearch/reference/6.x/percolator.html#_influencing_query_extraction

Closes #26307
2017-11-10 07:44:42 +01:00
Nicholas Knize 06ff92d237 Add ignore_malformed to geo_shape fields
This commit adds ignore_malformed support to geo_shape field types to skip malformed geoJson fields.

closes #23747
2017-11-09 17:59:05 -06:00
Holger Bartnick aa03fb72b7 [Docs] Correct link target for datatype murmur3 (#27143) 2017-10-30 09:31:55 +01:00
Martijn van Groningen f1e944a675
docs: describe parent/child performances 2017-10-26 11:49:13 +02:00
markwalkom 2b864156ca [Docs] Clarify mapping `index` option default (#27104) 2017-10-25 12:42:29 +02:00
David Turner 559fc5a4de Update numbers to reflect 4-byte UTF-8-encoded characters (#27083)
You need 4 bytes for characters outside the BMP, which includes many emoji and
a bunch of less-common writing characters too.
2017-10-24 09:50:47 +01:00
Adrien Grand 4e1ff8d086 Add documentation about disabling `_field_names`. (#26813)
This field has significant index-time overhead.

Closes #26779
2017-10-06 16:49:15 +02:00
Clinton Gormley eb3ead6561 Update type-field.asciidoc
Fixed asciidoc syntax on deprecated annotation
2017-10-06 11:57:27 +02:00
Christoph Büscher 6189c54c84 Reject the `index_options` parameter for numeric fields (#26668)
Numeric fields no longer support the index_options parameter. This changes the parameter
to be rejected in numeric field types after it was deprecated in 6.0.

Closes #21475
2017-09-25 23:43:14 +02:00
Michael Basnight f385e0cf26 Add bad_request to the rest-api-spec catch params (#26539)
This adds another request to the catch params. It also makes sure that
the generic request param does not allow 400 either.
2017-09-14 14:24:03 -05:00
Bernd 59600dfe2d [Docs] Correct typo in removal_of_types.asciidoc (#26646) 2017-09-14 15:34:07 +02:00
Daniel A. Ochoa 914416e9f4 [Docs] Update link in removal_of_types.asciidoc (#26614)
Fix link to [parent-child relationship].
2017-09-14 10:11:03 +02:00
Jim Ferenczi c709b8d6ac Fix incomplete sentences in parent-join docs (#26623)
* Fix incomplete sentences in parent-join docs

Closes #26590
2017-09-13 16:09:00 +02:00
Martijn van Groningen b391425da1
Added support to the percolate query to percolate multiple documents
The percolator will add a `_percolator_document_slot` field to all percolator
hits to indicate with what document it has matched. This number matches with
the order in which the documents have been specified in the percolate query.

Also improved the support for multiple percolate queries in a search request.
2017-09-08 17:28:39 +02:00
Martijn van Groningen a4d5c6418e
percolator: Rename map_unmapped_fields_as_string setting to map_unmapped_fields_as_text
The `index.percolator.map_unmapped_fields_as_text` is a more better name, because unmapped fields are mapped to a text field with default settings
and string is no longer a field type (it is either keyword or text).
2017-09-04 14:12:44 +02:00
Jim Ferenczi 86d97971a4 Remove the _all metadata field (#26356)
* Remove the _all metadata field

This change removes the `_all` metadata field. This field is deprecated in 6
and cannot be activated for indices created in 6 so it can be safely removed in
the next major version (e.g. 7).
2017-08-28 17:43:59 +02:00
Martijn van Groningen 636e85e5b7
percolator: Hint what clauses are important in a conjunction query based on fields
The percolator field mapper doesn't need to extract all terms and ranges from a bool query with must or filter clauses.
In order to help to default extraction behavior, boost fields can be configured, so that fields that are known for not being
selective enough can be ignored in favor for other fields or clauses with specific fields can forcefully take precedence over other clauses.
This can help selecting clauses for fields that don't match with a lot of percolator queries over other clauses and thus improving performance of the percolate query.

For example a status like field is something that should configured as an ignore field.
Queries on this field tend to match with more documents and so if clauses for this fields
get selected as best clause then that isn't very helpful for the candidate query that the
percolate query generates to filter out percolator queries that are likely not going to match.
2017-08-11 15:32:01 +02:00
Martijn van Groningen b88cfe2008
docs: Use stackexchange based example to make documentation easier to understand 2017-08-04 16:04:26 +02:00
Martijn van Groningen ec7ac32772
docs: document work around for the percolator if query time text analysis is expensive. 2017-07-28 15:04:15 +02:00
Martijn van Groningen 7c3735bdc4
percolator: Store the QueryBuilder's Writable representation instead of its XContent representation.
The Writeble representation is less heavy to parse and that will benefit percolate performance and throughput.

The query builder's binary format has now the same bwc guarentees as the xcontent format.

Added a qa test that verifies that percolator queries written in older versions are still readable by the current version.
2017-07-28 12:24:10 +02:00
Martijn van Groningen 5cf56a846a
docs: Remove incorrect warning
Closes #25935
2017-07-28 10:53:47 +02:00
Colin Goodheart-Smithe f1f1725fcf [DOCS] improve explanation of dynamic mapping setting (#25829)
Closes #25825
2017-07-21 12:24:38 +01:00
Clinton Gormley febb4bf7bc Update removal_of_types.asciidoc
Fixed `include_in_type` -> `include_type_name`
2017-07-20 19:18:51 +02:00
Clinton Gormley f69decf509 NOCONSOLE -> NOTCONSOLE in removal-of-types 2017-07-19 14:06:04 +02:00
Clinton Gormley ff4a2519f2 Update experimental labels in the docs (#25727)
Relates https://github.com/elastic/elasticsearch/issues/19798

Removed experimental label from:
* Painless
* Diversified Sampler Agg
* Sampler Agg
* Significant Terms Agg
* Terms Agg document count error and execution_hint
* Cardinality Agg precision_threshold
* Pipeline Aggregations
* index.shard.check_on_startup
* index.store.type (added warning)
* Preloading data into the file system cache
* foreach ingest processor
* Field caps API
* Profile API

Added experimental label to:
* Moving Average Agg Prediction


Changed experimental to beta for:
* Adjacency matrix agg
* Normalizers
* Tasks API
* Index sorting

Labelled experimental in Lucene:
* ICU plugin custom rules file
* Flatten graph token filter
* Synonym graph token filter
* Word delimiter graph token filter
* Simple pattern tokenizer
* Simple pattern split tokenizer

Replaced experimental label with warning that details may change in the future:
* Analysis explain output format
* Segments verbose output format
* Percentile Agg compression and HDR Histogram
* Percentile Rank Agg HDR Histogram
2017-07-18 14:06:22 +02:00
Simon Willnauer e81804cfa4 Add a shard filter search phase to pre-filter shards based on query rewriting (#25658)
Today if we search across a large amount of shards we hit every shard. Yet, it's quite
common to search across an index pattern for time based indices but filtering will exclude
all results outside a certain time range ie. `now-3d`. While the search can potentially hit
hundreds of shards the majority of the shards might yield 0 results since there is not document
that is within this date range. Kibana for instance does this regularly but used `_field_stats`
to optimize the indexes they need to query. Now with the deprecation of `_field_stats` and it's upcoming removal a single dashboard in kibana can potentially turn into searches hitting hundreds or thousands of shards and that can easily cause search rejections even though the most of the requests are very likely super cheap and only need a query rewriting to early terminate with 0 results.

This change adds a pre-filter phase for searches that can, if the number of shards are higher than a the `pre_filter_shard_size` threshold (defaults to 128 shards), fan out to the shards
and check if the query can potentially match any documents at all. While false positives are possible, a negative response means that no matches are possible. These requests are not subject to rejection and can greatly reduce the number of shards a request needs to hit. The approach here is preferable to the kibana approach with field stats since it correctly handles aliases and uses the correct threadpools to execute these requests. Further it's completely transparent to the user and improves scalability of elasticsearch in general on large clusters.
2017-07-12 22:19:20 +02:00
James Baiera 847378a43b Add another parent value option to join documentation (#25609)
Indexing a join field on a document requires a value of type "object" and two sub fields "name" 
and "parent". The "parent" field is only required on child documents, but the "name" field which 
denotes the name of the relation is always needed. Previously, only the short-hand version of the 
join field was documented. This adds documentation for the long-hand join field data, and 
explicitly points out that just specifying the name of the relation for the field value is a 
convenience shortcut.
2017-07-11 15:36:59 -04:00
Martijn van Groningen d0f9f425bd
parent/child: Removed ParentJoinFieldSubFetchPhase 2017-07-06 13:15:02 +02:00
Adrien Grand 26de905f1e Fix the documentation to state that the `_id` field is indexed. (#25540) 2017-07-05 16:09:31 +02:00
Clinton Gormley 0170e0e8d3 Remove usage of multi-types from the docs and added a page explaining type removal (#25543)
Closes #25401
2017-07-05 12:30:19 +02:00
Martijn van Groningen 9ce9c21b83
docs: added percolator script query limitation 2017-06-28 17:10:30 +02:00
Nathan Taylor 645bb9d0fb Docs: Removed duplicated line in mapping docs 2017-06-21 10:47:19 +02:00
Jim Ferenczi afada69ea9 [Docs] more fix for the parent-join docs 2017-06-16 12:49:16 +02:00
Jim Ferenczi 664193185e [Docs] Fix cross reference for parent-join field 2017-06-16 11:53:16 +02:00
Jim Ferenczi ccb3c9aae7 Add documentation for the new parent-join field (#25227)
* Add documentation for the new parent-join field

This commit adds the docs for the new parent-join field.
It explains how to define, index and query this new field.

Relates #20257
2017-06-16 11:13:23 +02:00
Russ Cam f6821c41d8 Add half_float and scaled float (#22988)
to numeric datatypes
(cherry picked from commit 67ea06145a80d5ec52ba55d1f2e1e8287e1882b1)
2017-06-13 09:54:44 +10:00
Ryan Ernst a03b6c2fa5 Scripting: Change keys for inline/stored scripts to source/id (#25127)
This commit adds back "id" as the key within a script to specify a
stored script (which with file scripts now gone is no longer ambiguous).
It also adds "source" as a replacement for "code". This is in an attempt
to normalize how scripts are specified across both put stored scripts and script usages, including search template requests. This also deprecates the old inline/stored keys.
2017-06-09 08:29:25 -07:00
Jim Ferenczi 8250aa4267 Remove the postings highlighter and make unified the default highlighter choice (#25028)
This change removes the `postings` highlighter. This highlighter has been removed from Lucene master (7.x) because it behaves
exactly like the `unified` highlighter when index_options is set to `offsets`:
https://issues.apache.org/jira/browse/LUCENE-7815

It also makes the `unified` highlighter the default choice for highlighting a field (if `type` is not provided).
The strategy used internally by this highlighter remain the same as before, it checks `term_vectors` first, then `postings` and ultimately it re-analyzes the text.
Ultimately it rewrites the docs so that the options that the `unified` highlighter cannot handle are clearly marked as such.
There are few features that the `unified` highlighter is not able to handle which is why the other highlighters (`plain` and `fvh`) are still available.
I'll open separate issues for these features and we'll deprecate the `fvh` and `plain` highlighters when full support for these features have been added to the `unified`.
2017-06-09 14:09:57 +02:00
Andrey Groshev e4fd8485ce Made the same length of opening and closing lines (#23583) 2017-06-09 00:50:43 -07:00