OpenSearch

mirror of https://github.com/honeymoose/OpenSearch.git synced 2025-02-06 04:58:50 +00:00

Author	SHA1	Message	Date
javanna	d7187238a2	Merge branch 'master' into feature/query-refactoring Conflicts: core/src/main/java/org/elasticsearch/transport/netty/MessageChannelHandler.java	2015-07-02 11:46:47 +02:00
javanna	cab3a68cc0	Query refactoring: unify boost and query name Following the discussion in #11744, move boost and query _name to base class AbstractQueryBuilder with their getters and setters. Unify their serialization code and equals/hashcode handling in the base class too. This guarantees that every query supports both _name and boost and nothing needs to be done around those in subclasses besides properly parsing the fields in the parsers and printing them out as part of the doXContent method in the builders. More specifically, these are the performed changes: - Introduced printBoostAndQueryName utility method in AbstractQueryBuilder that subclasses can use to print out _name and boost in their doXContent method. - readFrom and writeTo are now final methods that take care of _name and boost serialization. Subclasses have to implement doReadFrom and doWriteTo instead. - toQuery is a final method too that takes care of properly applying _name and boost to the lucene query. Subclasses have to implement doToQuery instead. The query returned will have boost and queryName applied automatically. - Removed BoostableQueryBuilder interface, given that every query is boostable after this change. This won't have any negative effect on filters, as the boost simply gets ignored in that case. - Extended equals and hashcode to handle queryName and boost automatically as well. - Update the query test infra so that queryName and boost are tested automatically, and whenever they are forgotten in parser or doXContent tests fail, so this makes things a lot less error-prone - Introduced DEFAULT_BOOST constant to make sure we don't repeat 1.0f all the time for default boost values. SpanQueryBuilder is again a marker interface only. The convenient toQuery that allowed us to override the return type to SpanQuery cannot be supported anymore due to a clash with the toQuery implementation from AbstractQueryBuilder. We have to go back to casting lucene Query to SpanQuery when dealing with span queries unfortunately. Note that this change touches not only the already refactored queries but also the untouched ones, by making sure that we parse _name and boost whenever we need to and that we print them out as part of QueryBuilder#doXContent. This will result in printing out the default boost all the time rather than skipping it in non refactored queries, something that we would have changed anyway as part of the query refactoring. The following are the queries that support boost now while previously they didn't (parser now parses it and builder prints it out): and, exists, fquery, geo_bounding_box, geo_distance, geo_distance_range, geo_hash_cell, geo_polygon, indices, limit, missing, not, or, script, type. The following are the queries that support _name now while previously they didn't (parser now parses it and builder prints it out): boosting, constant_score, function_score, limit, match_all, type. Range query parser supports now _name at the same level as boost too (_name is still supported on the outer object though for bw comp). There are two exceptions that despite having getters and setters for queryName and boost don't really support boost and queryName: query filter and span multi term query. The reason for this is that they only support a single inner object which is another query that they wrap, no other elements. Relates to #11744 Closes #10776 Closes #11974	2015-07-01 17:48:40 +02:00
Martijn van Groningen	c6ae6fc6d9	percolator: `getTime` -> `time`	2015-06-30 18:44:58 +02:00
Christoph Büscher	6678acfe23	Merge branch 'master' into feature/query-refactoring Conflicts: core/src/main/java/org/elasticsearch/index/query/RangeQueryBuilder.java	2015-06-26 14:48:20 +02:00
Alexander Reelsen	23cf9af495	Dates: Be backwards compatible with pre 2.x indices In order to be backwards compatible, indices created before 2.x must support indexing of a unix timestamp and its configured date format. Indices created with 2.x must configure the `epoch_millis` date formatter in order to support this. Relates #10971	2015-06-25 17:21:29 +02:00
Isabel Drost-Fromm	4f7ed2132e	Remove duplicate operator enums As we now have an enum Operator that comes with many useful helper methods switching to use that instead of the enums defined separately. Also switches to using the new enum's helper methods where applicable removing duplicate parsing logic. This breaks backwards compatibility. Documenting the break in migrate_query_refactoring.asciidoc Relates to #10217	2015-06-25 10:47:39 +02:00
Christoph Büscher	a2122fdc2b	Merge branch 'master' into feature/query-refactoring	2015-06-24 11:29:59 +02:00
David Pursehouse	b49e66c3a1	Replace references to ImmutableSettings with Settings ImmutableSettings was merged into Settings in commit 4873070. Change-Id: I06bd0150381d131593920c2328c46beacf49661f	2015-06-24 14:54:53 +09:00
javanna	99147228d7	Merge branch 'master' into feature/query-refactoring Conflicts: core/src/main/java/org/elasticsearch/index/query/GeoShapeQueryBuilder.java core/src/main/java/org/elasticsearch/index/query/TermsQueryBuilder.java	2015-06-23 10:16:21 +02:00
Ryan Ernst	12e7cbe92b	Mappings: Lockdown _timestamp This is a follow up to #8143 and #6730 for _timestamp. It removes support for `path`, as well as any field type settings, and enables docvalues for _timestamp, for 2.0. Users who need to adjust these settings can use a date field.	2015-06-22 10:21:03 -07:00
Christoph Büscher	b6cdc46a61	Query refactoring: QueryFilterBuilder and Parser Moving the query building functionality from the parser to the builders new toQuery() method analogous to other recent query refactorings. In this case this also includes FQueryFilterParser, since both queries are closely related. Relates to #10217 Closes #11729	2015-06-22 18:17:01 +02:00
Alexander Reelsen	38ddc8159c	Dates: Allow for negative unix timestamps This fixes an issue to allow for negative unix timestamps. An own printer for epochs instead of just having a parser has been added. Added docs that only 10/13 length unix timestamps are supported Added docs in upgrade documentation Fixes #11478	2015-06-22 11:56:31 +02:00
Adrien Grand	ac7ce2b899	Rivers removal. While we had initially planned to keep rivers around in 2.0 to ease migration, keeping support for rivers is challenging as it conflicts with other important changes that we want to bring to 2.0 like synchronous dynamic mappings updates. Nothing impossible to fix, but it would increase the complexity of how we deal with dynamic mappings updates and manage rivers, while handling dynamic mappings updates correctly is important for resiliency and rivers are on the go. So removing rivers in 2.0 may well be a better trade-off.	2015-06-10 09:22:09 +02:00
Alexander Reelsen	3bda78e43b	ResourceWatcher: Rename settings to prevent watcher clash The ResourceWatcher used settings prefixed `watcher.`, which potentially could clash with the watcher plugin. In order to prevent confusion, the settings have been renamed to `resource.reload` prefixes. This also uses the deprecation logging infrastructure introduced in #11033 to log deprecated settings and their alternative at startup. Closes #11175	2015-06-09 10:02:49 +02:00
javanna	2ef0fcfd6a	Plugins: one single (global) way to register custom query parsers There are different ways to register custom query parsers through plugins, a couple of them work per index via index settings, which is probably even too flexible. There are also three different ways to add a global custom query parser through either IndicesQueriesModule or IndicesQueriesRegistry. This commit consolidates the registration of custom query parsers via IndicesQueriesModule#addQuery(Class<? extends QueryParser>). The complexity of supporting parsers per index is not needed hence it got removed. Also the other ways of registering global custom parsers are dropped in favour of the one mentioned above. Closes #11481	2015-06-08 12:19:53 +02:00
Adrien Grand	7c698146f5	Rest: Add all meta fields to the top level json document. Some of our meta fields (such as _id, _version, ...) are returned as top-level properties of the json document, while other properties (_timestamp, _routing, ...) are returned under `fields`. This commit makes all meta fields returned as top-level properties. So eg. `GET test/test/1?fields=_timestamp,foo` would now return ```json { "_index": "test", "_type": "test", "_id": "1", "_version": 1, "_timestamp": 10000000, "found": true, "fields": { "foo": [ "bar" ] } } ``` while it used to return ```json { "_index": "test", "_type": "test", "_id": "1", "_version": 1, "found": true, "fields": { "_timestamp": 10000000, "foo": [ "bar" ] } } ```	2015-06-04 23:42:17 +02:00
Lee Hinman	65f43970da	Default to binding to loopback address Binds to the address returned by `InetAddress.getLoopbackAddress()`. Closes #11300	2015-06-04 10:25:49 -06:00
Martijn van Groningen	1cfb6a79f1	Parent/child: refactored _parent field mapper and parent/child queries * Cut the `has_child` and `has_parent` queries over to use Lucene's query time global ordinal join. The main benefit of this change is that parent/child queries can now efficiently execute if parent/child queries are wrapped in a bigger boolean query. If the rest of the query only hit a few documents both has_child and has_parent queries don't need to evaluate all parent or child documents any more. * Cut the `_parent` field over to use doc values. This significantly reduces the on heap memory footprint of parent/child, because the parent id values are never loaded into memory. Breaking changes: * The `type` option on the `_parent` field can only point to a parent type that doesn't exist yet, so this means that an existing type/mapping can't become a parent type any longer. * The `has_child` and `has_parent` queries can no longer be used in alias filters. All these changes, improvements and breaks in compatibility only apply for indices created with ES version 2.0 or higher. For indices created with ES < 2.0 the older implementation is used. It is highly recommended to re-index all your indices with parent and child documents to benefit from all the improvements that come with this refactoring. The easiest way to achieve this is by using the scan and bulk apis using a simple script. Closes #6107 Closes #8134	2015-05-29 21:44:17 +02:00
javanna	6c81a8daf3	Internal: count api to become a shortcut to the search api The count api used to have its own execution path, although it would do the same (up to bugs!) as the search api. This commit makes it a shortcut to the search api with size set to 0. The change is made in a backwards compatible manner, by leaving all of the java api code around too, given that you may not want to get back a whole SearchResponse when asking only for number of hits matching a query, also because migrating from countResponse.getCount() to searchResponse.getHits().totalHits() doesn't look great from a user perspective. We can always decide to drop more code around the count api if we want to break backwards compatibility on the java api, making it a shortcut on the rest layer only. Closes #9117 Closes #11198	2015-05-26 19:12:11 +02:00
Adrien Grand	461683ac58	Mappings: Remove the `compress`/`compress_threshold` options of the BinaryFieldMapper. This option is broken currently since it potentially interprets an incoming binary value as compressed while it just happens that the first bytes are the same as the LZF header.	2015-05-22 14:20:42 +02:00
Igor Motov	dd41c68741	Snapshot/Restore: fix FSRepository location configuration Closes #11068	2015-05-20 22:14:31 -04:00
Adrien Grand	2c241e8a36	Mappings: Remove the `ignore_conflicts` option. Mappings conflicts should not be ignored. If I read the history correctly, this option was added when a mapping update to an existing field was considered a conflict, even if the new mapping was exactly the same. Now that mapping updates are smart enough to detect conflicting options, we don't need an option to ignore conflicts.	2015-05-18 15:28:23 +02:00
javanna	a843008b17	Highlighting: require_field_match set to true by default The default `false` for `require_field_match` is a bit odd and confusing for users, given that field names get ignored by default and every field gets highlighted if it contains terms extracted out of the query, regardless of which fields were queried. Changed the default to `true`, it can always be changed per request. Closes #10627 Closes #11067	2015-05-15 21:38:45 +02:00
javanna	46c521f7ec	Highlighting: nuke XPostingsHighlighter Our own fork of the lucene PostingsHighlighter is not easy to maintain and doesn't give us any added value at this point. In particular, it was introduced to support the require_field_match option and discrete per value highlighting, used in case one wants to highlight the whole content of a field, but get back one snippet per value. These two features won't make it into lucene as they slow things down and shouldn't have been supported from day one on our end probably. One other customization we had was support for a wider range of queries via custom rewrite etc. (yet another way to slow things down), which got added to lucene and works much much better than what we used to do (instead of or rewrite, terms are pulled out of the automata for multi term queries). Removing our fork means the following in terms of features: - dropped support for require_field_match: the postings highlighter will only highlight fields that were queried - some custom es queries won't be supported anymore, meaning they won't be highlighted. The only one I found up until now is the phrase_prefix. Postings highlighter rewrites against an empty reader to avoid slow operations (like the ones that we were performing with the fork that we are removing here), thus the prefix will not be expanded to any term. What the postings highlighter does instead is pulling the automata out of multi term queries, but this is not supported at the moment with our MultiPhrasePrefixQuery. Closes #10625 Closes #11077	2015-05-15 20:41:33 +02:00
Jun Ohtani	597c53a0bb	Add migration note for AnalyzeRequest	2015-05-16 00:25:53 +09:00
Martijn van Groningen	ece18f162e	Removed `id_cache` from stats and cat apis. Also removed the `id_cache` option from the clear cache api. Closes #5269	2015-05-15 14:06:18 +02:00
Adrien Grand	630757906a	Query DSL: Add `filter` clauses to `bool` queries. These clauses filter the document space without affecting scoring and map to Lucene's BooleanClause.Occur.FILTER. The `filtered` query is now deprecated and ```json { "filtered": { "query": { //query }, "filter": { //filter } } } ``` should be replaced with ```json { "bool": { "must": { //query }, "filter": { //filter } } } ```	2015-05-13 12:04:56 +02:00
Ryan Ernst	f766b260ba	Add tests for includeInObject backcompat	2015-05-12 23:11:15 -07:00
Ryan Ernst	565ffb16f1	Mappings: Remove ability to set meta fields inside documents A few meta fields can currently be set within a document's source. However, the recommended way to set meta fields like this is through the api, and setting within the document can be a performance trap (e.g. needing to find _id in order to route the document). This change removes the ability to set meta fields within a document source for 2.0+ indexes. closes #11051 closes #11074	2015-05-12 23:09:03 -07:00
Ryan Ernst	e7618b8528	Settings: Remove file based index templates As a follow up to #10870, this removes support for index templates on disk. It also removes a missed place still allowing disk based mappings. closes #11052	2015-05-11 12:51:22 -07:00
Martijn van Groningen	acdd9a5dd9	parent/child: Removed the `top_children` query.	2015-05-10 16:30:19 +02:00
Lee Hinman	c6747ded16	Truncate log messages at 10,000 characters	2015-05-08 10:10:44 -06:00
Adrien Grand	a0af88e996	Query DSL: Remove filter parsers. This commit makes queries and filters parsed the same way using the QueryParser abstraction. This allowed to remove duplicate code that we had for similar queries/filters such as `range`, `prefix` or `term`.	2015-05-07 20:14:34 +02:00
Alex Ksikes	4787cf701f	More Like This: remove percent_terms_to_match Users should use minimum_should_match instead. Closes #11030	2015-05-07 14:21:29 +02:00
Martijn van Groningen	f7c29457d0	parent/child: Deprecated the `top_children` in favour of the `has_child` query.	2015-05-07 09:27:54 +02:00
Alex Ksikes	ec4f12f9ef	More Like This: removal of the MLT API Removes the More Like This API, users should now use the More Like This query. The MLT API tests were converted to their query equivalent. Also some clean ups in MLT tests. Closes #10736 Closes #11003	2015-05-06 18:11:11 +02:00
Ryan Ernst	7a7bd6086a	Mappings: Remove ability to disable _source field Current features (eg. update API) and future features (eg. reindex API) depend on _source. This change locks down the field so that it can no longer be disabled. It also removes legacy settings compress/compress_threshold. closes #8142 closes #10915	2015-05-05 22:04:18 -07:00
Clinton Gormley	e28ad853c7	Docs: Fixed bad asciidoc in migrate_2_0	2015-05-05 11:17:21 +02:00
Pascal Borreli	af6d890ad5	Docs: Fixed typos Closes #10973	2015-05-05 10:38:05 +02:00
Shay Banon	187d79b6df	Centralize admin implementations and action execution This change removes the multiple implementations of different admin interfaces and centralizes it with AbstractClient. It also makes sure all executions of actions now go through a single AbstractClient#execute method, taking care of copying headers and wrapping listener. This also has the side benefit of removing all the code around different possible clients, and removes quite a bit of code (most of the + code is actually removal of generics and such). This change also changes how TransportClient is constructed, requiring a Builder to create it, it's a breaking change and it's noted in the migration guide. Yet another step towards simplifying the action infra and making it simpler...	2015-05-04 23:40:17 +02:00
Robert Muir	4b3672b7df	Add migration note for hunspell dictionaries	2015-05-04 10:00:05 -04:00
Adrien Grand	b72f27a410	Core: Cut over to the Lucene filter cache. This removes Elasticsearch's filter cache and uses Lucene's instead. It has some implications: - custom cache keys (`_cache_key`) are unsupported - decisions are made internally and can't be overridden by users (`_cache`) - not only filters can be cached but also all queries that do not need scores - parent/child queries can now be cached, however cached entries are only valid for the current top-level reader so in practice it will likely only be used on read-only indices - the cache deduplicates filters, which plays nicer with large keys (eg. `terms`) - better stats: we already had ram usage and evictions, but now also hit count, miss count, lookup count, number of cached doc id sets and current number of doc id sets in the cache - dynamically changing the filter cache size is not supported anymore Internally, an important change is that it removes the NoCacheFilter infrastructure in favour of making Query.rewrite specializing the query for the current reader so that it will only be cached on this reader (look for IndexCacheableQuery). Note that consuming filters with the query API (createWeight/scorer) instead of the filter API (getDocIdSet) is important for parent/child queries because otherwise a QueryWrapperFilter(ParentQuery) would run the wrapped query per segment while relations might be cross segments.	2015-05-04 09:02:15 +02:00
Clinton Gormley	c28bf3bb3f	Docs: Updated elasticsearch.org links to elastic.co	2015-05-01 20:46:12 +02:00
Ryan Ernst	4ef9f3ca63	Mappings: Remove file based default mappings Using files that must be specified on each node is an anti-pattern from the API based goal of ES. This change removes the ability to specify the default mapping with a file on each node. closes #10620	2015-04-30 13:50:35 -07:00
Adrien Grand	e5be85d586	Aggs: Change the default `min_doc_count` to 0 on histograms. The assumption is that gaps in histogram are generally undesirable, for instance if you want to build a visualization from it. Additionally, we are building new aggregations that require that there are no gaps to work correctly (eg. derivatives).	2015-04-30 15:48:23 +02:00
Simon Willnauer	94d8b20611	Add multi data.path to migration guide this commit removes the obsolete settings for distributors and updates the documentation on multiple data.path. It also adds an explain to the migration guide. Relates to #9498 Closes #10770	2015-04-29 11:51:37 +02:00
Ryan Ernst	bf09e58cb3	Mappings: Remove includes and excludes from _source Regardless of the outcome of #8142, we should at least enforce that when _source is enabled, it is sufficient to reindex. This change removes the excludes and includes settings, since these modify the source, causing us to lose the ability to reindex some fields. closes #10814	2015-04-28 15:03:51 -07:00
javanna	c914134355	Scripting: remove groovy sandbox Groovy sandboxing was disabled by default from 1.4.3 on though since we found out that it could be worked around, so it makes little sense to keep it and maintain it. Closes #10156 Closes #10480	2015-04-28 11:27:50 +02:00
Jun Ohtani	933edf7bcc	Analysis: Fix wrong position number by analyze API Add breaking changes comment to migrate docs Fix the stopword included text using stopword filter	2015-04-28 17:44:41 +09:00
Simon Willnauer	d164526d27	Remove `_shutdown` API This commit removes the `_shutdown` API entirely without any replacement. Nodes should be managed from the operating system not via REST APIs	2015-04-27 17:19:36 +02:00
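
The `filtered` to `bool` replacement described in commit 630757906a can be sketched as a small rewrite over the JSON query body. This is only an illustration in Python, not code from the repository: the helper name `filtered_to_bool` is hypothetical, it handles just the top-level shape shown in that commit message, and it ignores any other keys a `filtered` query may carry as well as any scoring differences.

```python
import copy


def filtered_to_bool(query):
    """Rewrite {"filtered": {"query": q, "filter": f}} as {"bool": {"must": q, "filter": f}}.

    Hypothetical helper: only the top-level shape from the commit message is
    handled; any other query is returned as an unchanged copy.
    """
    if "filtered" not in query:
        return copy.deepcopy(query)

    filtered = query["filtered"]
    bool_body = {}
    # The scoring part of the old query becomes a `must` clause.
    if "query" in filtered:
        bool_body["must"] = copy.deepcopy(filtered["query"])
    # The non-scoring part becomes a `filter` clause.
    if "filter" in filtered:
        bool_body["filter"] = copy.deepcopy(filtered["filter"])
    return {"bool": bool_body}


if __name__ == "__main__":
    old = {
        "filtered": {
            "query": {"match": {"title": "search"}},
            "filter": {"term": {"status": "published"}},
        }
    }
    print(filtered_to_bool(old))
```

The input is deep-copied so the original request body is left untouched, which keeps the sketch safe to run against queries that are reused elsewhere.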


123 Commits