The `_index` field is now a completely virtual field thanks
to #12027. It is no longer necessary to index the actual value
of the index name.
closes#12329
Require urls for URL repository to be listed in repositories.url.allowed_urls setting. This change ensures that only authorized URLs can be accessed by elasticsearch
Moving the query building functionality from the parser to the builders
new toQuery() method analogous to other recent query refactorings.
Relates to #10217
Plugin Manager can now use another simplified form when a user wants to install an official plugin hosted at elasticsearch download service.
The form we use is:
```sh
bin/plugin install pluginname
```
As plugins share now the same version as elasticsearch, we can automatically guess what is the exact current version of the plugin manager script.
Also, download service will now use `/org.elasticsearch.plugins/pluginName/pluginName-version.zip` URL path to download a plugin.
If the older form is provided (`user/plugin/version` or `user/plugin`), we will still use:
* elasticsearch download service at `/user/plugin/plugin-version.zip`
* maven central with groupIp=user, artifactId=plugin and version=version
* github with user=user, repoName=plugin and tag=version
* github with user=user, repoName=plugin and branch=master if no version is set
Note that community plugin providers can use other download services by using `--url` option.
If you try to use the new form with a non core elasticsearch plugin, the plugin manager will reject
it and will give you all known core plugins.
```
Usage:
-u, --url [plugin location] : Set exact URL to download the plugin from
-i, --install [plugin name] : Downloads and installs listed plugins [*]
-t, --timeout [duration] : Timeout setting: 30s, 1m, 1h... (infinite by default)
-r, --remove [plugin name] : Removes listed plugins
-l, --list : List installed plugins
-v, --verbose : Prints verbose messages
-s, --silent : Run in silent mode
-h, --help : Prints this help message
[*] Plugin name could be:
elasticsearch-plugin-name for Elasticsearch 2.0 Core plugin (download from download.elastic.co)
elasticsearch/plugin/version for elasticsearch commercial plugins (download from download.elastic.co)
groupId/artifactId/version for community plugins (download from maven central or oss sonatype)
username/repository for site plugins (download from github master)
Elasticsearch Core plugins:
- elasticsearch-analysis-icu
- elasticsearch-analysis-kuromoji
- elasticsearch-analysis-phonetic
- elasticsearch-analysis-smartcn
- elasticsearch-analysis-stempel
- elasticsearch-cloud-aws
- elasticsearch-cloud-azure
- elasticsearch-cloud-gce
- elasticsearch-delete-by-query
- elasticsearch-lang-javascript
- elasticsearch-lang-python
```
If you are using the default date or the named identifiers of dates,
the current implementation was allowed to read a year with only one
digit. In order to make this more strict, this fixes a year to be at
least 4 digits. Same applies for month, day, hour, minute, seconds.
Also the new default is `strictDateOptionalTime` for indices created
with Elasticsearch 2.0 or newer.
In addition a couple of not exposed date formats have been exposed, as they
have been mentioned in the documentation.
Closes#6158
Moving the query building functionality from the parser to the builders
new toQuery() method analogous to other recent query refactorings.
Relates to #10217
Currently parsing inner queries can return `null` which leads to unnecessary complicated
null checks further up the query hierarchy. By introducing a special EmptyQueryBuilder
that can be used to signal that this query clause should be ignored upstream where possible,
we can avoid additional null checks in parent query builders and still allow for this query
to be ignored when creating the lucene query later.
This new builder returns `null` when calling its `toQuery()` method, so at this point
we still need to handle it, but having an explicit object for the intermediate query
representation makes it easier to differentiate between cases where inner query clauses are
empty (legal) or are not set at all (illegal).
Also added check for empty inner builder list to BoolQueryBuilder and OrQueryBuilder.
Removed setters for positive/negatice query in favour of constructor with
mandatory non-null arguments in BoostingQueryBuilder.
Closes#11915
Following the discussion in #11744, move boost and query _name to base class AbstractQueryBuilder with their getters and setters. Unify their serialization code and equals/hashcode handling in the base class too. This guarantess that every query supports both _name and boost and nothing needs to be done around those in subclasses besides properly parsing the fields in the parsers and printing them out as part of the doXContent method in the builders. More specifically, these are the performed changes:
- Introduced printBoostAndQueryName utility method in AbstractQueryBuilder that subclasses can use to print out _name and boost in their doXContent method.
- readFrom and writeTo are now final methods that take care of _name and boost serialization. Subclasses have to implement doReadFrom and doWriteTo instead.
- toQuery is a final method too that takes care of properly applying _name and boost to the lucene query. Subclasses have to implement doToQuery instead. The query returned will have boost and queryName applied automatically.
- Removed BoostableQueryBuilder interface, given that every query is boostable after this change. This won't have any negative effect on filters, as the boost simply gets ignored in that case.
- Extended equals and hashcode to handle queryName and boost automatically as well.
- Update the query test infra so that queryName and boost are tested automatically, and whenever they are forgotten in parser or doXContent tests fail, so this makes things a lot less error-prone
- Introduced DEFAULT_BOOST constant to make sure we don't repeat 1.0f all the time for default boost values.
SpanQueryBuilder is again a marker interface only. The convenient toQuery that allowed us to override the return type to SpanQuery cannot be supported anymore due to a clash with the toQuery implementation from AbstractQueryBuilder. We have to go back to castin lucene Query to SpanQuery when dealing with span queries unfortunately.
Note that this change touches not only the already refactored queries but also the untouched ones, by making sure that we parse _name and boost whenever we need to and that we print them out as part of QueryBuilder#doXContent. This will result in printing out the default boost all the time rather than skipping it in non refactored queries, something that we would have changed anyway as part of the query refactoring.
The following are the queries that support boost now while previously they didn't (parser now parses it and builder prints it out): and, exists, fquery, geo_bounding_box, geo_distance, geo_distance_range, geo_hash_cell, geo_polygon, indices, limit, missing, not, or, script, type.
The following are the queries that support _name now while previously they didn't (parser now parses it and builder prints it out): boosting, constant_score, function_score, limit, match_all, type.
Range query parser supports now _name at the same level as boost too (_name is still supported on the outer object though for bw comp).
There are two exceptions that despite have getters and setters for queryName and boost don't really support boost and queryName: query filter and span multi term query. The reason for this is that they only support a single inner object which is another query that they wrap, no other elements.
Relates to #11744Closes#10776Closes#11974
In order to be backwards compatible, indices created before 2.x must support
indexing of a unix timestamp and its configured date format. Indices created
with 2.x must configure the `epoch_millis` date formatter in order to
support this.
Relates #10971
As we now have an enum Operator that comes with many useful helper methods switching to use
that instead of the enums defined separately. Also switches to using the new enum's helper
methods where applicable removing duplicate parsing logic.
This breaks backwards compatibility. Documenting the break in
migrate_query_refactoring.asciidoc
Relates to #10217
This is a follow up to #8143 and #6730 for _timestamp. It removes
support for `path`, as well as any field type settings, and
enables docvalues for _timestamp, for 2.0. Users who need to
adjust these settings can use a date field.
Moving the query building functionality from the parser to the builders
new toQuery() method analogous to other recent query refactorings. In
this case this also includes FQueryFilterParser, since both queries are
closely related.
Relates to #10217Closes#11729
This fixes an issue to allow for negative unix timestamps.
An own printer for epochs instead of just having a parser has been added.
Added docs that only 10/13 length unix timestamps are supported
Added docs in upgrade documentation
Fixes#11478
While we had initially planned to keep rivers around in 2.0 to ease migration,
keeping support for rivers is challenging as it conflicts with other important
changes that we want to bring to 2.0 like synchronous dynamic mappings updates.
Nothing impossible to fix, but it would increase the complexity of how we
deal with dynamic mappings updates and manage rivers, while handling dynamic
mappings updates correctly is important for resiliency and rivers are on the go.
So removing rivers in 2.0 may well be a better trade-off.
The ResourceWatcher used settings prefixed `watcher.`, which
potentially could clash with the watcher plugin.
In order to prevent confusion, the settings have been renamed to
`resource.reload` prefixes.
This also uses the deprecation logging infrastructure introduced
in #11033 to log deprecated settings and their alternative at
startup.
Closes#11175
There are different ways to register custom query parsers through plugins, a couple of them work per index via index settings, which is probably even too flexible. There also three different ways to add a global custom query parser through either IndicesQueriesModule or IndicesQueriesRegistry. This commit consolidates the registration of custom query parsers via IndicesQueriesModule#addQuery(Class<? extends QueryParser>). The complexity of supporting parsers per index is not needed hence it got removed. Also the other ways of registering global custom parsers are dropped in favour of the one mentioned above.
Closes#11481
Some of our meta fields (such as _id, _version, ...) are returned as top-level
properties of the json document, while other properties (_timestamp, _routing,
...) are returned under `fields`. This commit makes all meta fields returned
as top-level properties.
So eg. `GET test/test/1?fields=_timestamp,foo` would now return
```json
{
"_index": "test",
"_type": "test",
"_id": "1",
"_version": 1,
"_timestamp": 10000000,
"found": true,
"fields": {
"foo": [ "bar" ]
}
}
```
while it used to return
```json
{
"_index": "test",
"_type": "test",
"_id": "1",
"_version": 1,
"found": true,
"fields": {
"_timestamp": 10000000,
"foo": [ "bar" ]
}
}
```
* Cut the `has_child` and `has_parent` queries over to use Lucene's query time global ordinal join. The main benefit of this change is that parent/child queries can now efficiently execute if parent/child queries are wrapped in a bigger boolean query. If the rest of the query only hit a few documents both has_child and has_parent queries don't need to evaluate all parent or child documents any more.
* Cut the `_parent` field over to use doc values. This significantly reduces the on heap memory footprint of parent/child, because the parent id values are never loaded into memory.
Breaking changes:
* The `type` option on the `_parent` field can only point to a parent type that doesn't exist yet, so this means that an existing type/mapping can't become a parent type any longer.
* The `has_child` and `has_parent` queries can no longer be use in alias filters.
All these changes, improvements and breaks in compatibility only apply for indices created with ES version 2.0 or higher. For indices creates with ES <= 2.0 the older implementation is used.
It is highly recommended to re-index all your indices with parent and child documents to benefit from all the improvements that come with this refactoring. The easiest way to achieve this is by using the scan and bulk apis using a simple script.
Closes#6107Closes#8134
The count api used to have its own execution path, although it would do the same (up to bugs!) of the search api. This commit makes it a shortcut to the search api with size set to 0. The change is made in a backwards compatible manner, by leaving all of the java api code around too, given that you may not want to get back a whole SearchResponse when asking only for number of hits matching a query, also cause migrating from countResponse.getCount() to searchResponse.getHits().totalHits() doesn't look great from a user perspective. We can always decide to drop more code around the count api if we want to break backwards compatibility on the java api, making it a shortcut on the rest layer only.
Closes#9117Closes#11198
This option is broken currently since it potentially interprets an incoming
binary value as compressed while it just happens that the first bytes are the
same as the LZF header.
Mappings conflicts should not be ignored. If I read the history correctly, this
option was added when a mapping update to an existing field was considered a
conflict, even if the new mapping was exactly the same. Now that mapping updates
are smart enough to detect conflicting options, we don't need an option to
ignore conflicts.
The default `false` for `require_field_match` is a bit odd and confusing for users, given that field names get ignored by default and every field gets highlighted if it contains terms extracted out of the query, regardless of which fields were queries. Changed the default to `true`, it can always be changed per request.
Closes#10627Closes#11067
Our own fork of the lucene PostingsHighlighter is not easy to maintain and doesn't give us any added value at this point. In particular, it was introduced to support the require_field_match option and discrete per value highlighting, used in case one wants to highlight the whole content of a field, but get back one snippet per value. These two features won't
make it into lucene as they slow things down and shouldn't have been supported from day one on our end probably.
One other customization we had was support for a wider range of queries via custom rewrite etc. (yet another way to slow
things down), which got added to lucene and works much much better than what we used to do (instead of or rewrite, term
s are pulled out of the automata for multi term queries).
Removing our fork means the following in terms of features:
- dropped support for require_field_match: the postings highlighter will only highlight fields that were queried
- some custom es queries won't be supported anymore, meaning they won't be highlighted. The only one I found up until now is the phrase_prefix. Postings highlighter rewrites against an empty reader to avoid slow operations (like the ones that we were performing with the fork that we are removing here), thus the prefix will not be expanded to any term. What the postings highlighter does instead is pulling the automata out of multi term queries, but this is not supported at the moment with our MultiPhrasePrefixQuery.
Closes#10625Closes#11077
These clauses filter the document space without affecting scoring and map to
Lucene's BooleanClause.Occur.FILTER. The `filtered` query is now deprecated and
```json
{
"filtered": {
"query": { //query },
"filter": { //filter }
}
}
```
should be replaced with
```json
{
"bool": {
"must": { //query },
"filter": { //filter }
}
}
```
A few meta fields can currently be set within a document's source.
However, the recommended way to set meta fields like this is through
the api, and setting within the document can be a performance trap
(e.g. needing to find _id in order to route the document).
This change removes the ability to set meta fields within
a document source for 2.0+ indexes.
closes#11051closes#11074