Commit Graph

299 Commits

Author SHA1 Message Date
markharwood b7197f5e21 SignificantText aggregation - like significant_terms, but for text (#24432)
* SignificantText aggregation - like significant_terms but doesn’t require fielddata=true, recommended used with `sampler` agg to limit expense of tokenizing docs and takes optional `filter_duplicate_text`:true setting to avoid stats skew from repeated sections of text in search results.

Closes #23674
2017-05-24 13:46:43 +01:00
Adrien Grand a72eaa8e0f Identify documents by their `_id`. (#24460)
Now that indices have a single type by default, we can move to the next step
and identify documents using their `_id` rather than the `_uid`.

One notable change in this commit is that I made deletions implicitly create
types. This helps with the live version map in the case that documents are
deleted before the first type is introduced. Otherwise there would be no way
to differenciate `DELETE index/foo/1` followed by `PUT index/foo/1` from
`DELETE index/bar/1` followed by `PUT index/foo/1`, even though those are
different if versioning is involved.
2017-05-09 16:33:52 +02:00
Nicholas Knize 0c4eb0a029 Add new ip_range field type
This commit adds support for indexing and searching a new ip_range field type. Both IPv4 and IPv6 formats are supported. Tests are updated and docs are added.
2017-05-05 09:43:42 -05:00
Nik Everett a01f846226 CONSOLEify a few more docs
Adds CONSOLE to cross-cluster-search docs but skips them for testing
because we don't have a second cluster set up. This gets us the
`VIEW IN CONSOLE` and `COPY AS CURL` links and makes sure that they
are valid yaml (not json, technically) but doesn't get testing.
Which is better than we had before.

Adds CONSOLE to the dynamic templates docs and ingest-node docs.
The ingest-node docs contain a *ton* of non-console snippets. We
might want to convert them to full examples later, but that can be
a separate thing.

Relates to #18160
2017-05-04 21:01:14 -04:00
Adrien Grand 1be2800120 Only allow one type on 7.0 indices (#24317)
This adds the `index.mapping.single_type` setting, which enforces that indices
have at most one type when it is true. The default value is true for 6.0+ indices
and false for old indices.

Relates #15613
2017-04-27 08:43:20 +02:00
Danilo Akamine 0adaf9fb4c Drop `search_analyzer` parameter from keyword.asciidoc (#24221)
`search_analyzer` isn't supported by `keyword` fields so this removes
it from the documentation for them.
2017-04-25 12:49:50 -04:00
Nik Everett e429d66956 CONSOLEify some more docs
Relates to #18160
2017-04-24 16:08:19 -04:00
Fabien Baligand 4a45579506 token_count type : add an option to count tokens (fix #23227) (#24175)
Add option "enable_position_increments" with default value true.
If option is set to false, indexed value is the number of tokens
(not position increments count)
2017-04-21 00:53:28 +02:00
Loek van Gool e11d892562 Update field-names-field.asciidoc (#24178)
fix typo in field name
2017-04-19 11:57:37 +02:00
Martijn van Groningen 3d9671a668
[PERCOLATOR] Allowing range queries with now ranges inside percolator queries.
Before now ranges where forbidden, because the percolator query itself could get cached and then the percolator queries with now ranges that should no longer match, incorrectly will continue to match.
By disabling caching when the `percolator` is being used, the percolator can now correctly support range queries with now based ranges.

 I think this is the right tradeoff. The percolator query is likely to not be the same between search requests and disabling range queries with now ranges really disabled people using the percolator for their use cases.

 Also fixed an issue that existed in the percolator fieldmapper, it was unable to find forbidden queries inside `dismax` queries.

 Closes #23859
2017-04-07 08:44:43 +02:00
Lee Hinman b6b9ef8e26 [DOCS] Remove line about eager loading global ordinals
Fielddata can no longer be configured to be loaded eagerly (it only accepts
`true` and `false`), so this line is a little misleading because it talks about
a procedure we can no longer do.
2017-04-03 12:56:21 -06:00
Nik Everett 653f50973a CONSOLEify geo-shape docs
`CONSOLE`ify geo-shape type and geo-shape query docs.

Relates to #18160
2017-03-31 09:11:54 -04:00
Nik Everett 5f91241f57 CONSOLEify geo aggregation docs
Turns the top example in each of the geo aggregation docs into a working
example that can be opened in CONSOLE. Subsequent examples can all also
be opened in console and will work after you've run the first example.
All examples are tested as part of the build.
2017-03-30 21:28:52 -04:00
Ali Beyad 8359dd05c9 Adds boolean similarity to Elasticsearch (#23637)
This commit adds the boolean similarity scoring from Lucene to
Elasticsearch.  The boolean similarity provides a means to specify that
a field should not be scored with typical full-text ranking algorithms,
but rather just whether the query terms match the document or not.
Boolean similarity scores a query term equal to its query boost only.
Boolean similarity is available as a default similarity option and thus
a field can be specified to have boolean similarity by declaring in its
mapping:
    "similarity": "boolean"

Closes #6731
2017-03-28 10:17:23 -04:00
Martijn van Groningen b116b8f0cb
[DOCS] Update the docs about the fact that global ordinals for _parent field are loaded eagerly instead of lazily by default.
Relates to #8053
2017-03-22 10:39:39 +01:00
Lee Hinman b3c27a7fdd Disallow include_in_all for 6.0+ indices
Since `_all` is now deprecated and cannot be set for new indices, we should also
disallow any field that has the `include_in_all` parameter set.

Resolves #22923
2017-02-07 19:31:51 -07:00
AlexNodex fb8bdbc57a Update typo in date (#22955)
your example has yyy and it should be yyyy
2017-02-03 13:16:17 +01:00
Clinton Gormley 19ce039d2d Update type-field.asciidoc
Wildcard type names are not supported
2017-01-27 17:50:28 +01:00
Yannick Welsch 881993de3a [Docs] Remove outdated info about enabling/disabling doc_values (#22694) 2017-01-19 17:33:40 +01:00
Daniel Mitterdorfer aece89d6a1 Make boolean conversion strict (#22200)
This PR removes all leniency in the conversion of Strings to booleans: "true"
is converted to the boolean value `true`, "false" is converted to the boolean
value `false`. Everything else raises an error.
2017-01-19 07:59:18 +01:00
Scott Somerville 372812da98 Allow an index to be partitioned with custom routing (#22274)
This change makes it possible for custom routing values to go to a subset of shards rather than
just a single shard. This enables the ability to utilize the spatial locality that custom routing can
provide while mitigating the likelihood of ending up with an imbalanced cluster or suffering
from a hot shard.

This is ideal for large multi-tenant indices with custom routing that suffer from one or both of
the following:
- The big tenants cannot fit into a single shard or there is so many of them that they will likely
end up on the same shard
- Tenants often have a surge in write traffic and a single shard cannot process it fast enough

Beyond that, this should also be useful for use cases where most queries are done under the context
of a specific field (e.g. a category) since it gives a hint at how the data can be stored to minimize
the number of shards to check per query. While a similar solution can be achieved with multiple
concrete indices or aliases per value today, those approaches breakdown for high cardinality fields.

A partitioned index enforces that mappings have routing required, that the partition size does not
change when shrinking an index (the partitions will shrink proportionally), and rejects mappings
that have parent/child relationships.

Closes #21585
2017-01-18 08:51:23 +01:00
Alex a0c83c4511 Minor doc changes to clarify mapping index param for string type (#22652)
* Grammatical correction

* Add note for legacy string mapping type

* Update truncate token filter to not mention the keyword tokenizer

The advice predates the existence of the keyword field

Closes #22650
2017-01-17 16:43:11 +01:00
Lee Hinman 7a18bb50fc Disable _all by default
This change disables the _all meta field by default.

Now that we have the "all-fields" method of query execution, we can save both
indexing time and disk space by disabling it.

_all can no longer be configured for indices created after 6.0.

Relates to #20925 and #21341
Resolves #19784
2017-01-11 16:47:13 -07:00
Nik Everett 75d5b3d9eb Fix parent_id example in docs
And fix some indentation I noticed while looking up the query.
2017-01-10 10:01:31 -05:00
Clinton Gormley cb7952e71d Docs: Parent field is no longer indexed and should use parent_id instead of term query
Closes #22517
2017-01-10 13:48:07 +01:00
Jason Veatch 20f90178fe Docs: Detail on false/strict dynamic mapping setting (#22451)
Reference: https://www.elastic.co/guide/en/elasticsearch/guide/master/dynamic-mapping.html
2017-01-05 14:36:18 -05:00
Adrien Grand 3f805d68cb Add the ability to set an analyzer on keyword fields. (#21919)
This adds a new `normalizer` property to `keyword` fields that pre-processes the
field value prior to indexing, but without altering the `_source`. Note that
only the normalization components that work on a per-character basis are
applied, so for instance stemming filters will be ignored while lowercasing or
ascii folding will be applied.

Closes #18064
2016-12-30 09:36:10 +01:00
Adrien Grand 84edf36f11 Make `-0` compare less than `+0` consistently. (#22173)
Our `float`/`double` fields generally assume that `-0` compares less than `+0`,
except when bounds are exclusive: an exclusive lower bound on `-0` excludes
`+0` and an exclusive upper bound on `+0` excludes `-0`.

Closes #22167
2016-12-21 16:51:45 +01:00
Adrien Grand 9524c81af9 Document the `locale` option of the `date` field. (#22050)
This also adds another level of protection against using the default locale.
Relates to https://discuss.elastic.co/t/mapping-for-12h-date-format/68433/3.
2016-12-09 09:45:53 +01:00
Nicholas Knize af1ab68b64 Add RangeFieldMapper for numeric and date range types
Lucene 6.2 added index and query support for numeric ranges. This commit adds a new RangeFieldMapper for indexing numeric (int, long, float, double) and date ranges and creating appropriate range and term queries. The design is similar to NumericFieldMapper in that it uses a RangeType enumerator for implementing the logic specific to each type. The following range types are supported by this field mapper: int_range, float_range, long_range, double_range, date_range.

Lucene does not provide a DocValue field specific to RangeField types so the RangeFieldMapper implements a CustomRangeDocValuesField for handling doc value support.

When executing a Range query over a Range field, the RangeQueryBuilder has been enhanced to accept a new relation parameter for defining the type of query as one of: WITHIN, CONTAINS, INTERSECTS. This provides support for finding all ranges that are related to a specific range in a desired way. As with other spatial queries, DISJOINT can be achieved as a MUST_NOT of an INTERSECTS query.
2016-11-29 10:10:14 -06:00
Clinton Gormley 5555e85619 Document that the PUT mapping API with the _default_ type overwrites instead of merging
Closes #8215
2016-11-26 12:43:56 +01:00
Clinton Gormley a4e88bb64a Fixed bad asciidoc in boolean mapping docs 2016-11-15 17:50:23 +00:00
Lee Hinman 96122aa518 Be strict when parsing values searching for booleans (#21555)
This changes only the query parsing behavior to be strict when searching on
boolean values. We continue to accept the variety of values during index time,
but searches will only be parsed using `"true"` or `"false"`.

Resolves #21545
2016-11-15 10:36:57 -07:00
Alexander Lin 0219a211d3 Allows multiple patterns to be specified for index templates (#21009)
* Allows for an array of index template patterns to be provided to an
index template, and rename the field from 'template' to 'index_pattern'.

Closes #20690
2016-11-10 18:00:30 -05:00
LakumiNarayanan 5af6deb5b5 Fix typo in keyword.asciidoc (#21237) 2016-11-01 10:15:12 -04:00
Lee Hinman 6a8bad8a06 [DOCS] Document all date formats (#21164)
Resolves #21046
2016-10-31 09:15:36 -06:00
Jun Ohtani a66c76eb44 Merge pull request #20704 from johtani/remove_request_params_in_analyze_api
Removing request parameters in _analyze API
2016-10-27 17:43:18 +09:00
Colin Goodheart-Smithe c1a9833445 Correct similarity default for 5.0 (#21144) 2016-10-27 09:33:21 +01:00
Pascal Borreli fcb01deb34 Fixed typos (#20843) 2016-10-10 14:51:47 -06:00
Jun Ohtani 370f0b885e Removing request parameters in _analyze API
Remove request params in _analyze API without index param
Change rest-api-test using JSON
Change docs using JSON

Closes #20246
2016-10-07 16:23:24 +09:00
Anatolii Stepaniuk f895abcf40 Fix grammar issues in some docs
This commit fixes some grammar issues in various docs.

Closes #20751
Closes #20752
Closes #20754
Closes #20755
2016-10-05 11:20:45 -04:00
Lee Hinman 3f77eacab1 Revert "Default `include_in_all` for numeric-like types to false"
This reverts commit 6666892038.
2016-09-28 07:07:46 -06:00
Clinton Gormley e3b7b4f032 Reorganised docs for mapping safeguard settings 2016-09-22 14:58:17 +02:00
Martijn van Groningen ad7c22198c docs: describe more explicitly what happens when indexing queries that fetch terms 2016-09-22 10:00:11 +00:00
David Pilato dfd1eebdd0 Remove mapper attachments plugin
We now have in 5.0.0 `ingest-attachment` plugin.
We can remove `mapper-attachments` plugin for 6.0.

Closes #18837.
2016-09-19 09:01:16 +02:00
Nicholas Knize 598bab93ae [DOC] Cleanup dangling references to deprecated geo parameters
With the cut over to LatLonPoint the geohash, geohash_precision, lat_lon, and geohash_prefix parameters have been removed. This commit fixes the doc build by removing the remaining dangling references to these removed parameters.
2016-09-13 16:38:38 -05:00
Nicholas Knize 1a60e1c3d2 Update docs for LatLonPoint cut over
This commit removes documentation for:

* geohash cell query
* lat_lon parameter
* geohash parameter
* geohash_precision parameter
* geohash_prefix parameter

It also updates failing tests that reference these parameters for backcompat.
2016-09-13 12:18:21 -05:00
Lee Hinman 40b088d728 Rework documentation example for _all to be less ambigious with numerics 2016-09-08 09:09:48 -06:00
Lee Hinman 6666892038 Default `include_in_all` for numeric-like types to false
This includes:

- All regular numeric types such as int, long, scaled-float, double, etc
- IP addresses
- Dates
- Geopoints and Geoshapes

Relates to #19784
2016-09-08 09:09:48 -06:00
Nik Everett e03fb602cd Add CONSOLE places where it is obviously missing
These places already have other annotations like `// TEST` and
`// TESTSETUP` so they are already in console format.
2016-09-06 10:48:19 -04:00