OpenSearch

Commit Graph

Author	SHA1	Message	Date
Nick Knize	ec0dc2c0e9	[Geo] Integrate Lucene's LatLonShape (BKD Backed GeoShapes) as default `geo_shape` indexing approach (#36751 ) * [Geo] Expose BKDBackedGeoShapes as new VECTOR strategy This commit exposes lucene's LatLonShape field as a new strategy in GeoShapeFieldMapper. To use the new indexing approach, strategy should be set to "vector" in the geo_shape field mapper. If the tree parameter is set the mapper will throw an IAE. Note the following: When using vector strategy: * geo_shape query does not support querying by POINT, MULTIPOINT, or GEOMETRYCOLLECTION. * LINESTRING and MULTILINESTRING queries do not support WITHIN relation. * CONTAINS relation is not supported. * The tree, precision, tree_levels, distance_error_pct, and points_only parameters will not throw an exception but they have no effect and will be marked as deprecated. All other features are supported. * revert change to PercolatorFieldMapper * fix ExistsQuery for geo_shape vector strategy * add deprecation logging for tree, precision, tree_levels, distance_error_pct, and points_only * initial update to geoshape docs, including mapping migration updates * initial support for GeoCollection queries * fix docs and javadoc errors * clean up geocollection queries * set deprecated mapping tests to NOTCONSOLE * fix geo-shape mapper asciidoc mapping and test warnings * add support for point queries using LatLonShapeBoundingBoxQuery * update GeoShapeQueryBuilderTests to include POINT queries for VECTOR strategy. Other comment cleanups * add lucene geometry build testing to ShapeBuilder tests * remove deprecated prefix tree mapping from geo-shape.asciidoc * refactor GeoShapeFieldMapper into LegacyGeoShapeFieldMapper and GeoShapeFieldMapper Both classes derive from BaseGeoShapeFieldMapper that provides shared parameters: coerce, ignoreMalformed, ignore_z_value, orientation. * update docs to remove vector strategy * fix GeometryCollectionBuilder#buildLucene to return the object created by the shape builder * fix LineLength failure in GeoJsonShapeParserTests * ShapeMapper refactor changes from PR feedback * fix typo in geo-shape.asciidoc * ignore circle test in docs * update indexing-approach ref to geoshape-indexing-approach * add warnings check for LegacyGeoShapeFieldMapper to AbstractBuilderTestCase * fix deprecatedParameters setup * update indexing approach * fixing unexpected warnings failures * move orientation back to field type * remove if in LegacyGeoShapeFieldMapper#doXContent. Fix GeoShapeFieldMapper to work with double array as a point * fix indexing-approach link in circle section of geoshape docs * add strategy to deprecation warnings check * fix test failures * fix typo in QueryStringQueryBuilderTests * fix total hits to totalHits().value * fix version number * add version check to BaseGeoShapeFieldMapper * fix line length! * revert version check in BaseGeoShapeFieldMapper * Fix serialization of mappings of legacy shapes.	2018-12-18 09:54:56 -06:00
Nicholas Knize	96d279ed83	Revert "[Geo] Integrate Lucene's LatLonShape (BKD Backed GeoShapes) as default `geo_shape` indexing approach (#35320 )" This reverts commit `5bc7822562`.	2018-12-17 20:09:46 -06:00
Nick Knize	5bc7822562	[Geo] Integrate Lucene's LatLonShape (BKD Backed GeoShapes) as default `geo_shape` indexing approach (#35320 ) This commit exposes lucene's LatLonShape field as the default type in GeoShapeFieldMapper. To use the new indexing approach, simply set "type" : "geo_shape" in the mappings without setting any of the strategy, precision, tree_levels, or distance_error_pct parameters. Note the following when using the new indexing approach: * geo_shape query does not support querying by MULTIPOINT. * LINESTRING and MULTILINESTRING queries do not yet support WITHIN relation. * CONTAINS relation is not yet supported. The tree, precision, tree_levels, distance_error_pct, and points_only parameters are deprecated.	2018-12-17 14:38:14 -06:00
Mayya Sharipova	bda03163e7	Make vector fields experimental feature Relates to #33022	2018-12-13 07:17:52 -05:00
Mayya Sharipova	b5d532f9e3	Vector field (#33022 ) 1. Dense vector PUT dindex { "mappings": { "_doc": { "properties": { "my_vector": { "type": "dense_vector" }, "my_text" : { "type" : "keyword" } } } } } PUT dindex/_doc/1 { "my_text" : "text1", "my_vector" : [ 0.5, 10, 6 ] } 2. Sparse vector PUT sindex { "mappings": { "_doc": { "properties": { "my_vector": { "type": "sparse_vector" }, "my_text" : { "type" : "keyword" } } } } } PUT sindex/_doc/1 { "my_text" : "text1", "my_vector" : {"1": 0.5, "99": -0.5, "5": 1} }	2018-12-12 21:20:53 -05:00
Jim Ferenczi	18866c4c0b	Make hits.total an object in the search response (#35849 ) This commit changes the format of the `hits.total` in the search response to be an object with a `value` and a `relation`. The `value` indicates the number of hits that match the query and the `relation` indicates whether the number is accurate (in which case the relation is equal to `eq`) or a lower bound of the total (in which case it is equal to `gte`). This change also adds a parameter called `rest_total_hits_as_int` that can be used in the search APIs to opt out from this change (retrieve the total hits as a number in the rest response). Note that currently all search responses are accurate (`track_total_hits: true`) or they don't contain `hits.total` (`track_total_hits: false`). We'll add a way to get a lower bound of the total hits in a follow up (to allow numbers to be passed to `track_total_hits`). Relates #33028	2018-12-05 19:49:06 +01:00
Alan Woodward	73ceaad03a	Update to lucene-8.0.0-snapshot-c78429a554 (#36212 ) Includes: * A fix for a bug in Intervals.or() (https://issues.apache.org/jira/browse/LUCENE-8586) * The ability to disable offset mangling in WordDelimiterGraphFilter (https://issues.apache.org/jira/browse/LUCENE-8509) * BM25Similarity no longer multiplies scores by k1 + 1	2018-12-05 12:43:56 +00:00
Guido Lena Cota	89fae42833	(Minor) Fix some typos (#36180 )	2018-12-04 11:10:30 +01:00
Peter Dyson	1f25a0bd31	[Docs] Add example for updating meta field (#35893 )	2018-11-28 12:04:57 +01:00
Alan Woodward	be8097f9ce	Improve docs for index_prefixes option (#35778 ) This commit moves the documentation and examples for the `index_prefixes` option on text fields to its own file, to bring it in line with other mapping parameters, and expands a bit on both.	2018-11-22 09:20:46 +00:00
Alan Woodward	26cc8ff8c3	Add pointer to the index-phrases option in shingle filter docs (#35771 ) We should be discouraging the use of shingle filters and instead pointing users to the index-phrases parameter on text fields.	2018-11-21 15:27:11 +00:00
Takuro Wada	7b2d547e8e	[Docs] Delete inappropriate backtick (#35722 )	2018-11-20 10:08:32 +01:00
Julie Tibshirani	ec53288fc0	Remove include_type_name from the relevant APIs. (#35192 ) We've decided that the bulk, delete, get, index, update, and search APIs should not contain this request parameter, and we will instead accept both typed and typeless calls.	2018-11-06 14:33:48 -08:00
Julie Tibshirani	70da490f34	Remove some documentation that only makes sense with multiple types. (#35066 ) * Remove a tip about ignore_above that only makes sense with multiple types. * Remove a line from the percolator documentation that refers to multiple types.	2018-10-30 10:19:12 -07:00
Julie Tibshirani	f854330e06	Make sure to use the type _doc in the REST documentation. (#34662 ) * Replace custom type names with _doc in REST examples. * Avoid using two mapping types in the percolator docs. * Rename doc -> _doc in the main repository README. * Also replace some custom type names in the HLRC docs.	2018-10-22 11:54:04 -07:00
Igor Motov	94bde37bcf	Geo: Don't flip longitude of envelopes crossing dateline (#34535 ) When an envelope that crosses the dateline is specified as part of a geo_shape query, it shouldn't have its left and right points flipped during parsing. Fixes #34418	2018-10-19 13:53:54 -04:00
Daniel Mitterdorfer	02fb5aa4ec	Remove leftover doc about format being updatable With this commit we remove a leftover in the docs about the `format` field being updatable. This is not true since we removed support for updates in #25285. Closes #33986 Relates #25285 Relates #34006	2018-09-25 10:13:23 +02:00
markharwood	2fa09f062e	New plugin - Annotated_text field type (#30364 ) New plugin for annotated_text field type. Largely a copy of `text` field type but adds ability to include markdown-like syntax in the text. The “AnnotatedText” class parses text+markup and converts into plain text and AnnotationTokens. The annotation token values are injected unchanged alongside the regular text tokens to provide a form of additional indexed overlay useful in positional searches and highlighting. Annotated_text fields do not support fielddata as we want to phase this out. Also includes a new "annotated" highlighter type that retains annotations and merges in search hits as additional annotation markup. Closes #29467	2018-09-18 10:25:27 +01:00
Jim Ferenczi	7ad71f906a	Upgrade to a Lucene 8 snapshot (#33310 ) The main benefit of the upgrade for users is the search optimization for top scored documents when the total hit count is not needed. However this optimization is not activated in this change, there is another issue opened to discuss how it should be integrated smoothly. Some comments about the change: * Tests that can produce negative scores have been adapted but we need to forbid them completely: #33309 Closes #32899	2018-09-06 14:42:06 +02:00
Pablo Musa	a88f8789a0	Highlight that index_phrases only works if no slop is used (#33303 ) Highlight that `index_phrases` only works if no slop is used at query time.	2018-08-31 14:48:55 +02:00
Luca Cavanna	393eec1482	Set maxScore for empty TopDocs to NaN rather than 0 (#32938 ) We used to set `maxScore` to `0` within `TopDocs` in situations where there is really no score as the size was set to `0` and scores were not even tracked. In such scenarios, `Float.NaN` is more appropriate, which gets converted to `max_score: null` on the REST layer. That's also more consistent with Lucene, which sets `maxScore` to `Float.NaN` when merging empty `TopDocs` (see `TopDocs#merge`).	2018-08-22 17:23:54 +02:00
Dimitrios Liappis	abb4c183f1	Clarify ignore_above behavior with arrays of strings Currently docs don't explain how `ignore_above` behaves with arrays of strings. Clarify how `ignore_above` applies for arrays of strings and also note that all string(s) will still be visible in the `_source` field. Relates #33057	2018-08-22 18:18:30 +03:00
Julie Tibshirani	815c56b677	Fix an inaccuracy in the dynamic templates documentation. (#32890 )	2018-08-20 11:00:11 -07:00
Julie Tibshirani	0f0068b91c	Ensure that field aliases cannot be used in multi-fields. (#32219 )	2018-07-20 00:18:54 -07:00
Julie Tibshirani	15ff3da653	Add support for field aliases. (#32172 ) * Add basic support for field aliases in index mappings. (#31287) * Allow for aliases when fetching stored fields. (#31411) * Add tests around accessing field aliases in scripts. (#31417) * Add documentation around field aliases. (#31538) * Add validation for field alias mappings. (#31518) * Return both concrete fields and aliases in DocumentFieldMappers#getMapper. (#31671) * Make sure that field-level security is enforced when using field aliases. (#31807) * Add more comprehensive tests for field aliases in queries + aggregations. (#31565) * Remove the deprecated method DocumentFieldMappers#getFieldMapper. (#32148)	2018-07-18 09:33:09 -07:00
Nik Everett	0522c6644d	Docs: Remove duplicate test setup The range docs had an introductory section that described how to set up an index and a test setup section in `docs/build.gradle` that duplicated that section. This is bad because these sections can (and do) drift from one another. This change removes the setup in build.gradle and marks the introductory snippet with `// TESTSETUP` so it is used on all the snippets.	2018-06-28 10:59:35 -04:00
Peter Dyson	e7a7b9689d	[Docs] Mention ip_range datatypes on ip type page (#31416 ) A link to the ip_range datatype page provides a way for newer users to know it exists if they land directly on the ip datatype page first via a search.	2018-06-20 13:04:03 +02:00
Julie Tibshirani	3f5ebb862d	Clarify that IP range data can be specified in CIDR notation. (#31374 )	2018-06-18 08:21:41 -07:00
David Turner	6ad7217656	Remove reference to multiple fields with one name (#31127 ) If there is only one type per index then each field's name is unique.	2018-06-07 12:38:57 +01:00
Rafał Bigaj	749d39061a	[Docs] Correct minor typos in templates.asciidoc (#31167 )	2018-06-07 10:44:57 +02:00
Adrien Grand	458bca11bc	Add a `feature_vector` field. (#31102 ) This field is similar to the `feature` field but is better suited to index sparse feature vectors. A use-case for this field could be to record topics associated with every document alongside a metric that quantifies how well the topic is connected to this document, and then boost queries based on the topics that the logged user is interested in. Relates #27552	2018-06-07 10:05:37 +02:00
Colin Goodheart-Smithe	d09d60858a	[DOCS] Clarify nested datatype introduction (#31055 )	2018-06-06 09:32:45 +01:00
Christoph Büscher	1cee45e768	[Docs] Delete superfluous callouts (#31111 ) Those callouts create rendering problems on the subsequent page. Closes #30532	2018-06-06 09:53:14 +02:00
Adrien Grand	500094f5c8	Improve documentation of dynamic mappings. (#30952 ) Closes #30939	2018-06-05 08:51:52 +02:00
Jim Ferenczi	fa6b7266eb	Remove wrong link in index phrases doc Relates #30450	2018-06-04 12:13:55 +02:00
Colin Goodheart-Smithe	1efb1aae28	[DOCS] Rewords _field_names documentation (#31029 ) * [DOCS] Rewords _field_names documentation Corrects the language around when we write to `_field_names` and when you might want to disable it given that in recent versions it does not carry the indexing overhead it once did. Relates to #30862 * Update wording following review	2018-06-04 09:17:11 +01:00
Alan Woodward	0427339ab0	Index phrases (#30450 ) Specifying `index_phrases: true` on a text field mapping will add a subsidiary [field]._index_phrase field, indexing two-term shingles from the parent field. The parent analysis chain is re-used, wrapped with a FixedShingleFilter. At query time, if a phrase match query is executed, the mapping will redirect it to run against the subsidiary field. This should trade faster phrase querying for a larger index and longer indexing times. Relates to #27049	2018-06-04 08:50:35 +01:00
Igor Motov	7376c35960	[DOCS] Make geoshape docs less memory hungry (#31014 ) Reduces shape size and precision in geo shape mapper examples to reduce amount of memory required to check docs. Fixes #23836	2018-06-01 15:05:37 -04:00
Jim Ferenczi	0791f93dbd	Add an option to split keyword field on whitespace at query time (#30691 ) This change adds an option named `split_queries_on_whitespace` to the `keyword` field type. When set to true full text queries (`match`, `multi_match`, `query_string`, ...) that target the field will split the input on whitespace to build the query terms. Defaults to `false`. Closes #30393	2018-06-01 09:47:03 +02:00
Alan Woodward	67905c85a5	Rename index_prefix to index_prefixes (#30932 ) This commit also adds index_prefixes tests to TextFieldMapperTests to ensure that cloning and wire-serialization work correctly	2018-05-30 08:32:31 +01:00
Adrien Grand	886db84ad2	Expose Lucene's FeatureField. (#30618 ) Lucene has a new `FeatureField` which gives the ability to record numeric features as term frequencies. Its main benefit is that it allows boosting queries with the values of these features while efficiently skipping non-competitive documents using block-max WAND and indexed impacts.	2018-05-23 08:55:21 +02:00
Jason Tedor	4a4e3d70d5	Default to one shard (#30539 ) This commit changes the default out-of-the-box configuration for the number of shards from five to one. We think this will help address a common problem of oversharding. For users with time-based indices that need a different default, this can be managed with index templates. For users with non-time-based indices that find they need to re-shard with the split API in place they no longer need to resort only to reindexing. Since this has the impact of changing the default number of shards used in REST tests, we want to ensure that we still have coverage for issues that could arise from multiple shards. As such, we randomize (rarely) the default number of shards in REST tests to two. This is managed via a global index template. However, some tests check the templates that are in the cluster state during the test. Since this template is randomly there, we need a way for tests to skip adding the template used to set the number of shards to two. For this we add the default_shards feature skip. To avoid having to write our docs in a complicated way because sometimes they might be behind one shard, and sometimes they might be behind two shards we apply the default_shards feature skip to all docs tests. That is, these tests will always run with the default number of shards (one).	2018-05-14 12:22:35 -04:00
Sue Gallagher	09a6ba4fea	Change quad tree max levels to 29. Closes #21191 (#29663 ) * [DOCS] Changed quad tree max levels to 29. Clears 21191 * Changed QuadPrefixTree max levels to 29 and added defaults. Closes #21191	2018-05-03 09:48:21 -07:00
wmellouli	c8d8407012	[Docs] Add term query with normalizer example	2018-05-03 10:23:14 +02:00
Adrien Grand	5991ede9ef	Fix docs of the `_ignored` meta field. Relates #29658	2018-05-02 11:43:50 +02:00
Adrien Grand	7358946bda	Add a new `_ignored` meta field. (#29658 ) This adds a new `_ignored` meta field which indexes and stores fields that have been ignored at index time because of the `ignore_malformed` option. It makes malformed documents easier to identify by using `exists` or `term(s)` queries on the `_ignored` field. Closes #29494	2018-05-02 10:47:02 +02:00
Adrien Grand	0a5a9a2086	Remove reference to `not_analyzed`. Relates #30122.	2018-04-25 15:00:53 +02:00
Adrien Grand	6e62b481b4	Update plan for the removal of mapping types. (#29586 ) 8.x will no longer allow types in APIs and 7.x will issue deprecation warnings when `include_type_name` is set to `false`.	2018-04-19 15:09:14 +02:00
Igor Motov	983d6c15a2	Add null_value support to geo_point type (#29451 ) Adds support for null_value attribute to the geo_point types. Closes #12998	2018-04-17 10:19:54 -04:00
Adrien Grand	3367948be6	Add documentation about the include_type_name option. (#29555 ) This option will be useful in 7.x to prepare for upgrade to 8.0 which won't know about types anymore.	2018-04-17 15:04:46 +02:00
Igor Motov	e334baf6fc	Fix overflow error in parsing of long geohashes (#29418 ) Fixes a possible overflow error that geohashes longer than 12 characters can cause during parsing. Fixes #24616	2018-04-16 12:37:38 -04:00
Adrien Grand	3a147b442a	Fix docs build.	2018-04-11 13:48:53 +02:00
Adrien Grand	4918924fae	Remove legacy mapping code. (#29224 ) Some features have been deprecated since `6.0` like the `_parent` field or the ability to have multiple types per index. This allows removing quite a lot of code, which in turn will hopefully make it easier to proceed with the removal of types.	2018-04-11 09:41:37 +02:00
Adrien Grand	569d0c0e89	Improve similarity integration. (#29187 ) This improves the way similarities are plugged in in order to: - reject the classic similarity on 7.x indices and emit a deprecation warning otherwise - reject unknown parameters on 7.x indices and emit a deprecation warning otherwise Even though this breaks the plugin API, I'd like to backport to 7.x so that users can get deprecation warnings when they are doing something that will become unsupported in the future. Closes #23208 Closes #29035	2018-04-03 16:45:25 +02:00
David Turner	40d19532bc	Clarify expectations of false positives/negatives (#27964 ) Today this part of the documentation just says that Geo queries are not 100% accurate, but in fact we can be more precise about which kinds of queries see which kinds of error. This commit clarifies this point.	2018-04-02 10:03:42 +01:00
David Turner	3ca9310aee	Update docs on vertex ordering (#27963 ) At time of writing, GeoJSON did not enforce a specific ordering of vertices in a polygon, but it now does. We occasionally get reports of Elasticsearch rejecting apparently-valid GeoJSON because of badly oriented polygons, and it's helpful to be able to point at this bit of the documentation when responding.	2018-04-02 09:59:12 +01:00
Sue Gallagher	5518640d46	[DOCS] Added info on WGS-84. Closes issue #3590 (#29305 )	2018-03-29 15:50:05 -07:00
Nicholas Knize	d400a08788	[DOCS] Remove ignore_z_value parameter link Removes invalid ignore_z_value parameter link in geo-point.asciidoc.	2018-03-23 11:07:24 -05:00
Nicholas Knize	fede633563	Add Z value support to geo_shape This enhancement adds Z value support (source only) to geo_shape fields. If vertices are provided with a third dimension, the third dimension is ignored for indexing but returned as part of source. Like before, any values greater than the 3rd dimension are ignored. closes #23747	2018-03-23 08:50:55 -05:00
Adrien Grand	8f9d2ee4e2	Reject updates to the `_default_` mapping. (#29165 ) This will reject mapping updates to the `_default_` mapping with 7.x indices and still emit a deprecation warning with 6.x indices. Relates #15613 Supersedes #28248	2018-03-21 10:44:11 +01:00
Adrien Grand	0755ff425f	Clarify requirements of strict date formats. (#29090 ) Closes #29014	2018-03-16 14:39:36 +01:00
Adrien Grand	695ec05160	Clarify that dates are always rendered as strings. (#29093 ) Even in the case that the date was originally supplied as a long in the JSON document. Closes #26504	2018-03-16 14:34:33 +01:00
Cladis	3234fb1369	Grammar: "by geographically" → "geographically" (#28595 )	2018-02-15 16:12:58 -08:00
Alex Moros Marco	abe1e05ba4	[Docs] Add missing word in nested.asciidoc (#28507 )	2018-02-15 14:56:02 +01:00
Christoph Büscher	bc10334f7a	[Docs] Move callouts in range.asciidoc (#28264 ) Currently the callouts for this section are below all the examples, making it harder to relate them to the snippets. Instead they should be moved closer to the examples.	2018-02-02 11:00:07 +01:00
Adrien Grand	3f5716b9b8	Clarify that the `null_value` option doesn't modify the `_source` document. (#28374 ) Closes #15959	2018-01-31 15:04:11 +01:00
Adrien Grand	9163c9b8d1	Clarify the defaults for `ignore_above`. (#28372 ) Closes #27992	2018-01-31 15:03:20 +01:00
Alan Woodward	424ecb3c7d	Add ability to index prefixes on text fields (#28290 ) This adds the ability to index term prefixes into a hidden subfield, enabling prefix queries to be run without multitermquery rewrites. The subfield reuses the analysis chain of its parent text field, appending an EdgeNGramTokenFilter. It can be configured with minimum and maximum ngram lengths. Query terms with lengths outside this min-max range fall back to using prefix queries against the parent text field. The mapping looks like this: "my_text_field" : { "type" : "text", "analyzer" : "english", "index_prefix" : { "min_chars" : 1, "max_chars" : 10 } } Relates to #27049	2018-01-30 08:26:56 +00:00
David Kemp	531c58cf81	Documents applicability of term query to range type (#28166 ) Closes #27030	2018-01-18 17:19:01 -05:00
Christoph Büscher	d4ac0026fc	[Docs] Clarify numeric datatype ranges (#28240 ) Since #25826 we reject infinite values for float, double and half_float datatypes. This change adds this restriction to the documentation for the supported datatypes. Closes #27653	2018-01-16 15:53:28 +01:00
Martijn van Groningen	cef7bd2079	docs: add best practises for wildcard queries inside percolator queries	2017-12-15 10:49:59 +01:00
Adrien Grand	1b660821a2	Allow `_doc` as a type. (#27816 ) Allowing `_doc` as a type will enable users to make the transition to 7.0 smoother since the index APIs will be `PUT index/_doc/id` and `POST index/_doc`. This also moves most of the documentation to `_doc` as a type name. Closes #27750 Closes #27751	2017-12-14 17:47:53 +01:00
Ryan Ernst	c51e48bec0	Correct docs for binary fields and their default for doc values (#27680 ) closes #27240	2017-12-05 15:10:18 -08:00
Nicholas Knize	8bcf5393f2	[Geo] Add Well Known Text (WKT) Parsing Support to ShapeBuilders This commit adds WKT support to Geo ShapeBuilders. This supports the following format: POINT (30 10) LINESTRING (30 10, 10 30, 40 40) BBOX (-10, 10, 10, -10) POLYGON ((30 10, 40 40, 20 40, 10 20, 30 10)) POLYGON ((35 10, 45 45, 15 40, 10 20, 35 10), (20 30, 35 35, 30 20, 20 30)) MULTIPOINT ((10 40), (40 30), (20 20), (30 10)) MULTIPOINT (10 40, 40 30, 20 20, 30 10) MULTILINESTRING ((10 10, 20 20, 10 40),(40 40, 30 30, 40 20, 30 10)) MULTIPOLYGON (((30 20, 45 40, 10 40, 30 20)), ((15 5, 40 10, 10 20, 5 10, 15 5))) MULTIPOLYGON (((40 40, 20 45, 45 30, 40 40)), ((20 35, 10 30, 10 10, 30 5, 45 20, 20 35), (30 20, 20 15, 20 25, 30 20))) GEOMETRYCOLLECTION (POINT (30 10), MULTIPOINT ((10 40), (40 30), (20 20), (30 10))) closes #9120	2017-12-05 10:56:41 -06:00
Clinton Gormley	0bba2a8438	Update removal_of_types.asciidoc Corrected `include_in_type` to `include_type_name`	2017-12-05 10:44:48 +01:00
Christoph Büscher	0d11b9fe34	[Docs] Unify spelling of Elasticsearch (#27567 ) Removes occurrences of "elasticsearch" or "ElasticSearch" in favour of "Elasticsearch" where appropriate.	2017-11-29 09:44:25 +01:00
David Turner	a165d1df40	Minor improvements to docs for numeric types (#27553 ) * Caps * Fix awkward wording that took multiple passes to parse * Floating point _number_ * Something more descriptive about the `scaled_float` scaling factor.	2017-11-28 11:36:07 +00:00
Mayya Sharipova	57e4d10007	Limit the number of nested documents (#27405 ) Add an index level setting `index.mapping.nested_objects.limit` to control the number of nested json objects that can be in a single document across all fields. Defaults to 10000. Throw an error if the number of created nested documents exceeds this limit during the parsing of a document. Closes #26962	2017-11-22 10:16:28 -05:00
Jim Ferenczi	bf72858ce8	[Docs] Restore section about multi-level parent/child relation in parent-join (#27392 ) This section was removed to hide this ability to new users. This change restores the section and adds a warning regarding the expected performance. Closes #27336	2017-11-16 11:29:16 +01:00
Martijn van Groningen	b4048b4e7f	Use CoveringQuery to select percolate candidate matches and extract all clauses from a conjunction query. When clauses from a conjunction are extracted the number of clauses is also stored in an internal doc values field (minimum_should_match field). This field is used by the CoveringQuery and allows the percolator to reduce the number of false positives when selecting candidate matches and in certain cases be absolutely sure that a conjunction candidate match will match and then skip MemoryIndex validation. This can greatly improve performance. Before this change only a single clause was extracted from a conjunction query. The percolator tried to extract the rarest clause (based on term length) so that fewer candidate queries were selected in the first place. However, with this method there is still a very high chance that candidate query matches are false positives. This change also removes the influencing query extraction added via #26081 as this is no longer needed because now all conjunction clauses are extracted. https://www.elastic.co/guide/en/elasticsearch/reference/6.x/percolator.html#_influencing_query_extraction Closes #26307	2017-11-10 07:44:42 +01:00
Nicholas Knize	06ff92d237	Add ignore_malformed to geo_shape fields This commit adds ignore_malformed support to geo_shape field types to skip malformed geoJson fields. closes #23747	2017-11-09 17:59:05 -06:00
Holger Bartnick	aa03fb72b7	[Docs] Correct link target for datatype murmur3 (#27143 )	2017-10-30 09:31:55 +01:00
Martijn van Groningen	f1e944a675	docs: describe parent/child performances	2017-10-26 11:49:13 +02:00
markwalkom	2b864156ca	[Docs] Clarify mapping `index` option default (#27104 )	2017-10-25 12:42:29 +02:00
David Turner	559fc5a4de	Update numbers to reflect 4-byte UTF-8-encoded characters (#27083 ) You need 4 bytes for characters outside the BMP, which includes many emoji and a bunch of less-common writing characters too.	2017-10-24 09:50:47 +01:00
Adrien Grand	4e1ff8d086	Add documentation about disabling `_field_names`. (#26813 ) This field has significant index-time overhead. Closes #26779	2017-10-06 16:49:15 +02:00
Clinton Gormley	eb3ead6561	Update type-field.asciidoc Fixed asciidoc syntax on deprecated annotation	2017-10-06 11:57:27 +02:00
Christoph Büscher	6189c54c84	Reject the `index_options` parameter for numeric fields (#26668 ) Numeric fields no longer support the index_options parameter. This changes the parameter to be rejected in numeric field types after it was deprecated in 6.0. Closes #21475	2017-09-25 23:43:14 +02:00
Michael Basnight	f385e0cf26	Add bad_request to the rest-api-spec catch params (#26539 ) This adds another request to the catch params. It also makes sure that the generic request param does not allow 400 either.	2017-09-14 14:24:03 -05:00
Bernd	59600dfe2d	[Docs] Correct typo in removal_of_types.asciidoc (#26646 )	2017-09-14 15:34:07 +02:00
Daniel A. Ochoa	914416e9f4	[Docs] Update link in removal_of_types.asciidoc (#26614 ) Fix link to [parent-child relationship].	2017-09-14 10:11:03 +02:00
Jim Ferenczi	c709b8d6ac	Fix incomplete sentences in parent-join docs (#26623 ) * Fix incomplete sentences in parent-join docs Closes #26590	2017-09-13 16:09:00 +02:00
Martijn van Groningen	b391425da1	Added support to the percolate query to percolate multiple documents The percolator will add a `_percolator_document_slot` field to all percolator hits to indicate with what document it has matched. This number matches with the order in which the documents have been specified in the percolate query. Also improved the support for multiple percolate queries in a search request.	2017-09-08 17:28:39 +02:00
Martijn van Groningen	a4d5c6418e	percolator: Rename map_unmapped_fields_as_string setting to map_unmapped_fields_as_text The `index.percolator.map_unmapped_fields_as_text` is a better name, because unmapped fields are mapped to a text field with default settings and string is no longer a field type (it is either keyword or text).	2017-09-04 14:12:44 +02:00
Jim Ferenczi	86d97971a4	Remove the _all metadata field (#26356 ) * Remove the _all metadata field This change removes the `_all` metadata field. This field is deprecated in 6 and cannot be activated for indices created in 6 so it can be safely removed in the next major version (e.g. 7).	2017-08-28 17:43:59 +02:00
Martijn van Groningen	636e85e5b7	percolator: Hint what clauses are important in a conjunction query based on fields The percolator field mapper doesn't need to extract all terms and ranges from a bool query with must or filter clauses. To help the default extraction behavior, boost fields can be configured so that fields known for not being selective enough can be ignored in favor of other fields, or so that clauses with specific fields can forcefully take precedence over other clauses. This can help select clauses for fields that don't match a lot of percolator queries over other clauses, thus improving the performance of the percolate query. For example, a status-like field is something that should be configured as an ignore field. Queries on this field tend to match more documents, so if clauses for this field get selected as the best clause, that isn't very helpful for the candidate query that the percolate query generates to filter out percolator queries that are unlikely to match.	2017-08-11 15:32:01 +02:00
Martijn van Groningen	b88cfe2008	docs: Use stackexchange based example to make documentation easier to understand	2017-08-04 16:04:26 +02:00
Martijn van Groningen	ec7ac32772	docs: document work around for the percolator if query time text analysis is expensive.	2017-07-28 15:04:15 +02:00
Martijn van Groningen	7c3735bdc4	percolator: Store the QueryBuilder's Writable representation instead of its XContent representation. The Writable representation is less heavy to parse and that will benefit percolate performance and throughput. The query builder's binary format has now the same bwc guarantees as the xcontent format. Added a qa test that verifies that percolator queries written in older versions are still readable by the current version.	2017-07-28 12:24:10 +02:00
Martijn van Groningen	5cf56a846a	docs: Remove incorrect warning Closes #25935	2017-07-28 10:53:47 +02:00
Colin Goodheart-Smithe	f1f1725fcf	[DOCS] improve explanation of dynamic mapping setting (#25829 ) Closes #25825	2017-07-21 12:24:38 +01:00
Clinton Gormley	febb4bf7bc	Update removal_of_types.asciidoc Fixed `include_in_type` -> `include_type_name`	2017-07-20 19:18:51 +02:00
Clinton Gormley	f69decf509	NOCONSOLE -> NOTCONSOLE in removal-of-types	2017-07-19 14:06:04 +02:00
Clinton Gormley	ff4a2519f2	Update experimental labels in the docs (#25727 ) Relates https://github.com/elastic/elasticsearch/issues/19798 Removed experimental label from: * Painless * Diversified Sampler Agg * Sampler Agg * Significant Terms Agg * Terms Agg document count error and execution_hint * Cardinality Agg precision_threshold * Pipeline Aggregations * index.shard.check_on_startup * index.store.type (added warning) * Preloading data into the file system cache * foreach ingest processor * Field caps API * Profile API Added experimental label to: * Moving Average Agg Prediction Changed experimental to beta for: * Adjacency matrix agg * Normalizers * Tasks API * Index sorting Labelled experimental in Lucene: * ICU plugin custom rules file * Flatten graph token filter * Synonym graph token filter * Word delimiter graph token filter * Simple pattern tokenizer * Simple pattern split tokenizer Replaced experimental label with warning that details may change in the future: * Analysis explain output format * Segments verbose output format * Percentile Agg compression and HDR Histogram * Percentile Rank Agg HDR Histogram	2017-07-18 14:06:22 +02:00
Simon Willnauer	e81804cfa4	Add a shard filter search phase to pre-filter shards based on query rewriting (#25658 ) Today if we search across a large amount of shards we hit every shard. Yet, it's quite common to search across an index pattern for time based indices but filtering will exclude all results outside a certain time range ie. `now-3d`. While the search can potentially hit hundreds of shards the majority of the shards might yield 0 results since there is no document that is within this date range. Kibana for instance does this regularly but used `_field_stats` to optimize the indexes they need to query. Now with the deprecation of `_field_stats` and its upcoming removal a single dashboard in kibana can potentially turn into searches hitting hundreds or thousands of shards and that can easily cause search rejections even though most of the requests are very likely super cheap and only need a query rewriting to early terminate with 0 results. This change adds a pre-filter phase for searches that can, if the number of shards is higher than the `pre_filter_shard_size` threshold (defaults to 128 shards), fan out to the shards and check if the query can potentially match any documents at all. While false positives are possible, a negative response means that no matches are possible. These requests are not subject to rejection and can greatly reduce the number of shards a request needs to hit. The approach here is preferable to the kibana approach with field stats since it correctly handles aliases and uses the correct threadpools to execute these requests. Further it's completely transparent to the user and improves scalability of elasticsearch in general on large clusters.	2017-07-12 22:19:20 +02:00
James Baiera	847378a43b	Add another parent value option to join documentation (#25609 ) Indexing a join field on a document requires a value of type "object" and two sub fields "name" and "parent". The "parent" field is only required on child documents, but the "name" field which denotes the name of the relation is always needed. Previously, only the short-hand version of the join field was documented. This adds documentation for the long-hand join field data, and explicitly points out that just specifying the name of the relation for the field value is a convenience shortcut.	2017-07-11 15:36:59 -04:00
Martijn van Groningen	d0f9f425bd	parent/child: Removed ParentJoinFieldSubFetchPhase	2017-07-06 13:15:02 +02:00
Adrien Grand	26de905f1e	Fix the documentation to state that the `_id` field is indexed. (#25540 )	2017-07-05 16:09:31 +02:00
Clinton Gormley	0170e0e8d3	Remove usage of multi-types from the docs and added a page explaining type removal (#25543 ) Closes #25401	2017-07-05 12:30:19 +02:00
Martijn van Groningen	9ce9c21b83	docs: added percolator script query limitation	2017-06-28 17:10:30 +02:00
Nathan Taylor	645bb9d0fb	Docs: Removed duplicated line in mapping docs	2017-06-21 10:47:19 +02:00
Jim Ferenczi	afada69ea9	[Docs] more fix for the parent-join docs	2017-06-16 12:49:16 +02:00
Jim Ferenczi	664193185e	[Docs] Fix cross reference for parent-join field	2017-06-16 11:53:16 +02:00
Jim Ferenczi	ccb3c9aae7	Add documentation for the new parent-join field (#25227 ) * Add documentation for the new parent-join field This commit adds the docs for the new parent-join field. It explains how to define, index and query this new field. Relates #20257	2017-06-16 11:13:23 +02:00
Russ Cam	f6821c41d8	Add half_float and scaled float (#22988 ) to numeric datatypes (cherry picked from commit 67ea06145a80d5ec52ba55d1f2e1e8287e1882b1)	2017-06-13 09:54:44 +10:00
Ryan Ernst	a03b6c2fa5	Scripting: Change keys for inline/stored scripts to source/id (#25127 ) This commit adds back "id" as the key within a script to specify a stored script (which with file scripts now gone is no longer ambiguous). It also adds "source" as a replacement for "code". This is in an attempt to normalize how scripts are specified across both put stored scripts and script usages, including search template requests. This also deprecates the old inline/stored keys.	2017-06-09 08:29:25 -07:00
Jim Ferenczi	8250aa4267	Remove the postings highlighter and make unified the default highlighter choice (#25028 ) This change removes the `postings` highlighter. This highlighter has been removed from Lucene master (7.x) because it behaves exactly like the `unified` highlighter when index_options is set to `offsets`: https://issues.apache.org/jira/browse/LUCENE-7815 It also makes the `unified` highlighter the default choice for highlighting a field (if `type` is not provided). The strategy used internally by this highlighter remains the same as before, it checks `term_vectors` first, then `postings` and ultimately it re-analyzes the text. Ultimately it rewrites the docs so that the options that the `unified` highlighter cannot handle are clearly marked as such. There are a few features that the `unified` highlighter is not able to handle which is why the other highlighters (`plain` and `fvh`) are still available. I'll open separate issues for these features and we'll deprecate the `fvh` and `plain` highlighters when full support for these features has been added to the `unified`.	2017-06-09 14:09:57 +02:00
Andrey Groshev	e4fd8485ce	Made the same length of opening and closing lines (#23583 )	2017-06-09 00:50:43 -07:00
Jim Ferenczi	ad905924ae	update docs that claim that classic is the default similarity	2017-06-09 09:22:48 +02:00
Adrien Grand	ebf806d38f	Reorganize docs of global ordinals. (#24982 ) Currently global ordinals are documented under `fielddata`. It moves them to their own file since they also work with doc values and fielddata is on the way out. Closes #23101	2017-06-01 16:47:44 +02:00
markharwood	b7197f5e21	SignificantText aggregation - like significant_terms, but for text (#24432 ) * SignificantText aggregation - like significant_terms but doesn’t require fielddata=true, recommended used with `sampler` agg to limit expense of tokenizing docs and takes optional `filter_duplicate_text`:true setting to avoid stats skew from repeated sections of text in search results. Closes #23674	2017-05-24 13:46:43 +01:00
Adrien Grand	a72eaa8e0f	Identify documents by their `_id`. (#24460 ) Now that indices have a single type by default, we can move to the next step and identify documents using their `_id` rather than the `_uid`. One notable change in this commit is that I made deletions implicitly create types. This helps with the live version map in the case that documents are deleted before the first type is introduced. Otherwise there would be no way to differentiate `DELETE index/foo/1` followed by `PUT index/foo/1` from `DELETE index/bar/1` followed by `PUT index/foo/1`, even though those are different if versioning is involved.	2017-05-09 16:33:52 +02:00
Nicholas Knize	0c4eb0a029	Add new ip_range field type This commit adds support for indexing and searching a new ip_range field type. Both IPv4 and IPv6 formats are supported. Tests are updated and docs are added.	2017-05-05 09:43:42 -05:00
Nik Everett	a01f846226	CONSOLEify a few more docs Adds CONSOLE to cross-cluster-search docs but skips them for testing because we don't have a second cluster set up. This gets us the `VIEW IN CONSOLE` and `COPY AS CURL` links and makes sure that they are valid yaml (not json, technically) but doesn't get testing. Which is better than we had before. Adds CONSOLE to the dynamic templates docs and ingest-node docs. The ingest-node docs contain a ton of non-console snippets. We might want to convert them to full examples later, but that can be a separate thing. Relates to #18160	2017-05-04 21:01:14 -04:00
Adrien Grand	1be2800120	Only allow one type on 7.0 indices (#24317 ) This adds the `index.mapping.single_type` setting, which enforces that indices have at most one type when it is true. The default value is true for 6.0+ indices and false for old indices. Relates #15613	2017-04-27 08:43:20 +02:00
Danilo Akamine	0adaf9fb4c	Drop `search_analyzer` parameter from keyword.asciidoc (#24221 ) `search_analyzer` isn't supported by `keyword` fields so this removes it from the documentation for them.	2017-04-25 12:49:50 -04:00
Nik Everett	e429d66956	CONSOLEify some more docs Relates to #18160	2017-04-24 16:08:19 -04:00
Fabien Baligand	4a45579506	token_count type : add an option to count tokens (fix #23227 ) (#24175 ) Add option "enable_position_increments" with default value true. If option is set to false, indexed value is the number of tokens (not position increments count)	2017-04-21 00:53:28 +02:00
Loek van Gool	e11d892562	Update field-names-field.asciidoc (#24178 ) fix typo in field name	2017-04-19 11:57:37 +02:00
Martijn van Groningen	3d9671a668	[PERCOLATOR] Allowing range queries with now ranges inside percolator queries. Previously, now ranges were forbidden, because the percolator query itself could get cached and then the percolator queries with now ranges that should no longer match would incorrectly continue to match. By disabling caching when the `percolator` is being used, the percolator can now correctly support range queries with now based ranges. I think this is the right tradeoff. The percolator query is likely to not be the same between search requests and disabling range queries with now ranges really prevented people from using the percolator for their use cases. Also fixed an issue that existed in the percolator fieldmapper, it was unable to find forbidden queries inside `dismax` queries. Closes #23859	2017-04-07 08:44:43 +02:00
Lee Hinman	b6b9ef8e26	[DOCS] Remove line about eager loading global ordinals Fielddata can no longer be configured to be loaded eagerly (it only accepts `true` and `false`), so this line is a little misleading because it talks about a procedure we can no longer do.	2017-04-03 12:56:21 -06:00
Nik Everett	653f50973a	CONSOLEify geo-shape docs `CONSOLE`ify geo-shape type and geo-shape query docs. Relates to #18160	2017-03-31 09:11:54 -04:00
Nik Everett	5f91241f57	CONSOLEify geo aggregation docs Turns the top example in each of the geo aggregation docs into a working example that can be opened in CONSOLE. Subsequent examples can all also be opened in console and will work after you've run the first example. All examples are tested as part of the build.	2017-03-30 21:28:52 -04:00
Ali Beyad	8359dd05c9	Adds boolean similarity to Elasticsearch (#23637 ) This commit adds the boolean similarity scoring from Lucene to Elasticsearch. The boolean similarity provides a means to specify that a field should not be scored with typical full-text ranking algorithms, but rather just whether the query terms match the document or not. Boolean similarity scores a query term equal to its query boost only. Boolean similarity is available as a default similarity option and thus a field can be specified to have boolean similarity by declaring in its mapping: "similarity": "boolean" Closes #6731	2017-03-28 10:17:23 -04:00
Martijn van Groningen	b116b8f0cb	[DOCS] Update the docs about the fact that global ordinals for _parent field are loaded eagerly instead of lazily by default. Relates to #8053	2017-03-22 10:39:39 +01:00
Lee Hinman	b3c27a7fdd	Disallow include_in_all for 6.0+ indices Since `_all` is now deprecated and cannot be set for new indices, we should also disallow any field that has the `include_in_all` parameter set. Resolves #22923	2017-02-07 19:31:51 -07:00
AlexNodex	fb8bdbc57a	Update typo in date (#22955 ) your example has yyy and it should be yyyy	2017-02-03 13:16:17 +01:00
Clinton Gormley	19ce039d2d	Update type-field.asciidoc Wildcard type names are not supported	2017-01-27 17:50:28 +01:00
Yannick Welsch	881993de3a	[Docs] Remove outdated info about enabling/disabling doc_values (#22694 )	2017-01-19 17:33:40 +01:00
Daniel Mitterdorfer	aece89d6a1	Make boolean conversion strict (#22200 ) This PR removes all leniency in the conversion of Strings to booleans: "true" is converted to the boolean value `true`, "false" is converted to the boolean value `false`. Everything else raises an error.	2017-01-19 07:59:18 +01:00
Scott Somerville	372812da98	Allow an index to be partitioned with custom routing (#22274 ) This change makes it possible for custom routing values to go to a subset of shards rather than just a single shard. This enables the ability to utilize the spatial locality that custom routing can provide while mitigating the likelihood of ending up with an imbalanced cluster or suffering from a hot shard. This is ideal for large multi-tenant indices with custom routing that suffer from one or both of the following: - The big tenants cannot fit into a single shard or there are so many of them that they will likely end up on the same shard - Tenants often have a surge in write traffic and a single shard cannot process it fast enough Beyond that, this should also be useful for use cases where most queries are done under the context of a specific field (e.g. a category) since it gives a hint at how the data can be stored to minimize the number of shards to check per query. While a similar solution can be achieved with multiple concrete indices or aliases per value today, those approaches break down for high cardinality fields. A partitioned index enforces that mappings have routing required, that the partition size does not change when shrinking an index (the partitions will shrink proportionally), and rejects mappings that have parent/child relationships. Closes #21585	2017-01-18 08:51:23 +01:00
Alex	a0c83c4511	Minor doc changes to clarify mapping index param for string type (#22652 ) * Grammatical correction * Add note for legacy string mapping type * Update truncate token filter to not mention the keyword tokenizer The advice predates the existence of the keyword field Closes #22650	2017-01-17 16:43:11 +01:00
Lee Hinman	7a18bb50fc	Disable _all by default This change disables the _all meta field by default. Now that we have the "all-fields" method of query execution, we can save both indexing time and disk space by disabling it. _all can no longer be configured for indices created after 6.0. Relates to #20925 and #21341 Resolves #19784	2017-01-11 16:47:13 -07:00
Nik Everett	75d5b3d9eb	Fix parent_id example in docs And fix some indentation I noticed while looking up the query.	2017-01-10 10:01:31 -05:00
Clinton Gormley	cb7952e71d	Docs: Parent field is no longer indexed and should use parent_id instead of term query Closes #22517	2017-01-10 13:48:07 +01:00
Jason Veatch	20f90178fe	Docs: Detail on false/strict dynamic mapping setting (#22451 ) Reference: https://www.elastic.co/guide/en/elasticsearch/guide/master/dynamic-mapping.html	2017-01-05 14:36:18 -05:00
Adrien Grand	3f805d68cb	Add the ability to set an analyzer on keyword fields. (#21919 ) This adds a new `normalizer` property to `keyword` fields that pre-processes the field value prior to indexing, but without altering the `_source`. Note that only the normalization components that work on a per-character basis are applied, so for instance stemming filters will be ignored while lowercasing or ascii folding will be applied. Closes #18064	2016-12-30 09:36:10 +01:00
Adrien Grand	84edf36f11	Make `-0` compare less than `+0` consistently. (#22173 ) Our `float`/`double` fields generally assume that `-0` compares less than `+0`, except when bounds are exclusive: an exclusive lower bound on `-0` excludes `+0` and an exclusive upper bound on `+0` excludes `-0`. Closes #22167	2016-12-21 16:51:45 +01:00
Adrien Grand	9524c81af9	Document the `locale` option of the `date` field. (#22050 ) This also adds another level of protection against using the default locale. Relates to https://discuss.elastic.co/t/mapping-for-12h-date-format/68433/3.	2016-12-09 09:45:53 +01:00
Nicholas Knize	af1ab68b64	Add RangeFieldMapper for numeric and date range types Lucene 6.2 added index and query support for numeric ranges. This commit adds a new RangeFieldMapper for indexing numeric (int, long, float, double) and date ranges and creating appropriate range and term queries. The design is similar to NumericFieldMapper in that it uses a RangeType enumerator for implementing the logic specific to each type. The following range types are supported by this field mapper: int_range, float_range, long_range, double_range, date_range. Lucene does not provide a DocValue field specific to RangeField types so the RangeFieldMapper implements a CustomRangeDocValuesField for handling doc value support. When executing a Range query over a Range field, the RangeQueryBuilder has been enhanced to accept a new relation parameter for defining the type of query as one of: WITHIN, CONTAINS, INTERSECTS. This provides support for finding all ranges that are related to a specific range in a desired way. As with other spatial queries, DISJOINT can be achieved as a MUST_NOT of an INTERSECTS query.	2016-11-29 10:10:14 -06:00

519 Commits