OpenSearch

Commit Graph

Author	SHA1	Message	Date
Adrien Grand	d84c643f58	Use the new points API to index numeric fields. #17746 This makes all numeric fields including `date`, `ip` and `token_count` use points instead of the inverted index as a lookup structure. This is expected to perform worse for exact queries, but faster for range queries. It also requires less storage. Notes about how the change works: - Numeric mappers have been split into a legacy version that is essentially the current mapper, and a new version that uses points, eg. LegacyDateFieldMapper and DateFieldMapper. - Since new and old fields have the same names, the decision about which one to use is made based on the index creation version. - If you try to force using a legacy field on a new index or a field that uses points on an old index, you will get an exception. - IP addresses now support IPv6 via Lucene's InetAddressPoint and store them in SORTED_SET doc values using the same encoding (fixed length of 16 bytes and sortable). - The internal MappedFieldType that is stored by the new mappers does not have any of the points-related properties set. Instead, it keeps setting the index options when parsing the `index` property of mappings and does `if (fieldType.indexOptions() != IndexOptions.NONE) { // add point field }` when parsing documents. Known issues that won't fix: - You can't use numeric fields in significant terms aggregations anymore since this requires document frequencies, which points do not record. - Term queries on numeric fields will now return constant scores instead of giving better scores to the rare values. Known issues that we could work around (in follow-up PRs, this one is too large already): - Range queries on `ip` addresses only work if both the lower and upper bounds are inclusive (exclusive bounds are not exposed in Lucene). We could either decide to implement it, or drop range support entirely and tell users to query subnets using the CIDR notation instead. - Since IP addresses now use a different representation for doc values, aggregations will fail when running a terms aggregation on an ip field on a list of indices that contains both pre-5.0 and 5.0 indices. - The ip range aggregation does not work on the new ip field. We need to either implement range aggs for SORTED_SET doc values or drop support for ip ranges and tell users to use filters instead. #17700 Closes #16751 Closes #17007 Closes #11513	2016-04-14 17:56:23 +02:00
Christoph Büscher	e15e7f7e6e	Remove parser argument from methods where we already pass in a parse context When we pass down both parser and QueryParseContext to a method, we cannot make sure that the parser contained in the context and the parser that is passed as an argument have the same state. This removes the parser argument from methods where we currently have both the parser and the parse context as arguments and instead retrieves the parser from the context inside the method.	2016-04-14 16:18:05 +02:00
Martijn van Groningen	2928fd6ef3	Cleanup query builder for inner hits construction. * Inner hits can now only be provided and prepared via setter in the nested, has_child and has_parent query. * Also made `score_mode` a required constructor parameter. * Moved has_child's min_children/max_children validation from doToQuery(...) to a setter.	2016-04-14 14:43:21 +02:00
Nik Everett	cca3154c43	Rename isSourceEmpty to hasSource And add a test case for {} to reindex.	2016-04-13 08:19:58 -04:00
Nik Everett	c2e745bf3b	reindex: Guard against user disabling fields	2016-04-13 08:19:58 -04:00
Nik Everett	0f9804b0e2	reindex: gracefully handle when _source is disabled Closes #17666	2016-04-13 08:19:58 -04:00
Adrien Grand	013acf9179	Remove MappedFieldType.value. #17557 This commit removes `MappedFieldType.value` and simplifies `MappedFieldType.valueforSearch`. `valueforSearch` was used to post-process values that come from stored fields (eg. to convert a long back to a string representation of a date in the case of a date field) and also values that are extracted from the source but only in the case of GET calls: it would not be called when performing source filtering on search requests. `valueforSearch` is now only called for stored fields, since values that are extracted from the source should already be formatted as expected.	2016-04-12 09:12:56 +02:00
Adrien Grand	a14db8e17e	Remove MappedFieldType.useTermQueryWithQueryString() and isNumeric(). #17599 In both cases, what elasticsearch is really interested in is whether the field is an analyzed string field. So it can just check `tokenized()` instead.	2016-04-12 08:45:28 +02:00
Adrien Grand	496c7fbd84	Upgrade Lucene 6 Release * upgrades numerics to new Point format * updates geo api changes * adds GeoPointDistanceRangeQuery as XGeoPointDistanceRangeQuery * cuts over to ES GeoHashUtils	2016-04-11 16:50:04 -05:00
Adrien Grand	0eb1a816c8	Allow the query cache to be disabled. #16268 This replaces the internal `index.queries.cache.type` setting with a new `index.queries.cache.enabled` setting, which is documented. Closes #15802	2016-04-11 18:06:16 +02:00
Adrien Grand	42526ac28e	Remove Settings.settingsBuilder. We have both `Settings.settingsBuilder` and `Settings.builder` that do exactly the same thing, so we should keep only one. I kept `Settings.builder` since it has my preference but also it is the one that we use in examples of the Java API.	2016-04-08 18:10:02 +02:00
Adrien Grand	bef38a4d12	Fix test bug.	2016-04-07 09:49:27 +02:00
Adrien Grand	e1bfe23c22	ExtendedStatsAggregator should also pass sigma to empty aggs. #17388 Because sigma is also used at reduce time, it should be passed to empty aggs. Otherwise it causes bugs when an empty aggregation is used to perform reduction as it would assume a sigma of zero. Closes #17362	2016-04-07 09:34:11 +02:00
Nik Everett	16c12afabe	Rework ScoreFunctionBuilder registration to remove PROTOTYPEs This removes PROTOTYPEs from ScoreFunctionsBuilders. To do so we rework registration so it doesn't need PROTOTYPEs and lines up with the recent changes to query registration.	2016-04-06 13:04:11 -04:00
Nik Everett	2b6866d26b	Fix references to the removed parsers Mostly stuff is just in the builder now.	2016-04-06 11:15:22 -04:00
Adrien Grand	4c4bbb3e45	Replace FieldStatsProvider with a method on MappedFieldType. #17334 FieldStatsProvider had to perform instanceof calls to properly handle dates or ip addresses. By moving the logic to MappedFieldType, each field type can check whether all values are within bounds its way. Note that this commit only keeps rewriting support for dates, which are the only field for which the rewriting mechanism is likely to help (because of time-based indices).	2016-04-01 10:28:58 +02:00
Nik Everett	14d37baa4b	[reindex] Don't get rejected BulkByScrollTaskTest#testDelayAndRethrottle was getting rejected exceptions every once in a while. This was reproducible ~20% of the time for me. I added a CyclicBarrier to prevent the test from shutting down the thread pool before the threads get finished.	2016-03-31 14:50:14 -04:00
Nik Everett	0c762fca35	Fix test mistake	2016-03-31 12:27:35 -04:00
Nik Everett	7f794e7b77	Test for invalid scroll_size	2016-03-31 12:21:32 -04:00
Nik Everett	30a1862339	Remove PROTOTYPE from BulkItemResponse.Failure Closes #17086	2016-03-31 09:10:36 -04:00
javanna	32b6e529f4	Merge branch 'master' into enhancement/discovery_node_one_getter	2016-03-31 10:49:26 +02:00
Jack Conradson	a37e53c50f	Painless clean up including fixing _score issues and improving type error messages. Closes #17428	2016-03-30 16:40:17 -07:00
Nik Everett	78ab6c5b7f	[reindex] Dynamic throttle! This allows the user to update the reindex throttle on the fly, with changes that speed up the throttling being applied immediately and changes that slow down the throttling being applied during the next batch. This means that if a user throttles reindex in such a way that it tries to sleep for 16 years and then realizes that they've done something wrong then they can change the throttle and reindex will wake up again. We don't apply slow downs immediately so we never get in danger of losing the scan context. Also, if reindex is canceled while it is sleeping (how it honors throttling) then it'll immediately wake up and cancel itself.	2016-03-30 16:40:42 -04:00
javanna	b9f9b2e3ee	Merge branch 'master' into enhancement/discovery_node_one_getter	2016-03-30 17:22:40 +02:00
Nik Everett	1c16d63a9a	Merge pull request #17394 from camilojd/refactor/replace-getrandom Refactor: replace all occurrences of ESTestCase.getRandom() with LuceneTestCase.random()	2016-03-30 08:58:21 -04:00
javanna	a8bbdff3bc	Remove DiscoveryNode#name in favour of existing DiscoveryNode#getName	2016-03-30 14:47:36 +02:00
Adrien Grand	068c788ec8	Disable fielddata on text fields by default. #17386 `text` fields will have fielddata disabled by default. Fielddata can still be enabled on an existing index by setting `fielddata=true` in the mappings.	2016-03-30 14:35:32 +02:00
Camilo Diaz Repka	7be11a36cd	Refactor: replace all occurrences of ESTestCase.getRandom() with random(). Remove getRandom().	2016-03-29 23:18:05 -04:00
Nik Everett	df08854c60	Remove PROTOTYPEs from suggesters Also stops using guice for suggesters at all and lots of checkstyle.	2016-03-29 17:55:01 -04:00
javanna	061f09d9a4	Merge branch 'master' into enhancement/remove_node_client_setting	2016-03-29 20:19:33 +02:00
Tal Levy	16e888fac3	Merge pull request #17260 from talevy/fix-regex-exceptions Handle regex parsing errors in Gsub and Grok Processors	2016-03-29 08:12:26 -07:00
Clinton Gormley	3087d2b882	Fixed bad YAML in reindex REST test: 50_routing.yaml	2016-03-29 15:03:09 +02:00
Clinton Gormley	52daed0732	Update-by-query rest tests: fixed bad yaml and deleted a client-dependent test	2016-03-29 14:58:29 +02:00
Colin Goodheart-Smithe	ff3fd99074	Prevents exception being raised when ordering by an aggregation which wasn't collected If a terms aggregation was ordered by a metric nested in a single bucket aggregator which did not collect any documents (e.g. a filters aggregation which did not match in that term bucket) an ArrayOutOfBoundsException would be thrown when the ordering code tried to retrieve the value for the metric. This fix fixes all numeric metric aggregators so they return their default value when a bucket ordinal is requested which was not collected. Closes #17225	2016-03-29 13:28:03 +01:00
javanna	8fc9dbbb99	Merge branch 'master' into enhancement/remove_node_client_setting	2016-03-29 14:27:04 +02:00
Clinton Gormley	5f24581de3	The reindex body is now required, which changes the exception thrown by the REST test	2016-03-29 14:09:59 +02:00
Clinton Gormley	b87beeb05f	Rename update-by-query REST tests to update_by_query	2016-03-29 13:13:49 +02:00
Clinton Gormley	97606850e8	Renamed update-by-query REST spec to update_by_query	2016-03-29 11:45:20 +02:00
javanna	de5cbda8e7	Merge branch 'master' into enhancement/remove_node_client_setting	2016-03-29 10:48:47 +02:00
Nik Everett	0e6141e675	Replace is_true: took with took >= 0 This prevents tests from failing on machines that can finish the request in less than half a millisecond.	2016-03-28 13:03:48 -04:00
javanna	27d4994aff	Merge branch 'master' into enhancement/remove_node_client_setting	2016-03-24 18:10:11 +01:00
Nik Everett	93ab4cfc99	Stop using PROTOTYPE in NamedWriteableRegistry readFrom is confusing because it requires an instance of the type that it is reading but it doesn't modify it. But we also have (deprecated) methods named readFrom that do modify the instance. The "right" way to implement the non-modifying readFrom is to delegate to a constructor that takes a StreamInput so that the read object can be immutable. Now that we have `@FunctionalInterface`s it is fairly easy to register things by referring directly to the constructor. This change modifies NamedWriteableRegistry so that it does that. It keeps supporting `registerPrototype` which registers objects to be read by readFrom but deprecates it and delegates it to a new `register` method that allows passing a simple functional interface. It also cuts Task.Status subclasses over to using that method. The start of #17085	2016-03-24 11:26:44 -04:00
Nik Everett	48aaebf23d	[reindex] Wait for headers The test was checking that we'd set the headers properly but in some cases the request had yet to come in because it was running on another thread. Now we wait for the headers to show up before failing the test. Closes #17299	2016-03-24 09:55:49 -04:00
Jim Ferenczi	da42f199bd	Enforce isolated mode for all plugins This commit removes the isolated option; each plugin now has its own classloader.	2016-03-24 09:17:33 +01:00
Nik Everett	aaa4d57fff	[reindex] Don't attempt to refresh on noop If the user asks for a refresh but their reindex or update-by-query operation touched no indexes we should just skip the refresh call entirely. Without this commit we refresh all indexes which is totally wrong. Closes #17296	2016-03-23 18:12:40 -04:00
Areek Zillur	ed49ec437f	remove suggest transport action	2016-03-23 16:37:56 -04:00
javanna	030453d320	Merge branch 'master' into enhancement/remove_node_client_setting	2016-03-23 11:25:34 +01:00
Adrien Grand	e50eeeaffb	Refactor fielddata mappings. #17148 The fielddata settings in mappings have been refactored so that: - text and string have a `fielddata` (boolean) setting that tells whether it is ok to load in-memory fielddata. It is true by default for now but the plan is to make it default to false for text fields. - text and string have a `fielddata_frequency_filter` which contains the same thing as `fielddata.filter.frequency` used to (but validated at parsing time instead of being unchecked settings) - regex fielddata filtering is not supported anymore and will be dropped from mappings automatically on upgrade. - text, string and _parent fields have an `eager_global_ordinals` (boolean) setting that tells whether to load global ordinals eagerly on refresh. - in-memory fielddata is not supported on keyword fields anymore at all. - the `fielddata` setting is not supported on fields other than text and string and will be dropped when upgrading if specified.	2016-03-23 09:48:13 +01:00
Tal Levy	534caa8927	Handle regex parsing errors in Gsub and Grok Processors Currently, both Gsub and Grok parse regex strings during Pipeline creation. Thrown parsing exceptions were leaking out; this commit wraps those exceptions in ElasticsearchParseExceptions.	2016-03-22 15:06:29 -07:00
Nik Everett	da96b6e41d	[reindex] Add throttling support The throttle is applied when starting the next scroll request so that its timeout can include the throttle time.	2016-03-22 12:34:14 -04:00

3016 Commits