Commit Graph

149 Commits

Author SHA1 Message Date
Zachary Tong c9db2de41d
[7.x] Comprehensively test supported/unsupported field type:agg combinations (#54451)
* Comprehensively test supported/unsupported field type:agg combinations (#52493)

This adds a test to AggregatorTestCase that allows us to programmatically
verify that an aggregator supports or does not support a particular
field type.  It fetches the list of registered field type parsers,
creates a MappedFieldType from the parser and then attempts to run
a basic agg against the field.

A supplied list of supported VSTypes are then compared against the
output (success or exception) and suceeds or fails the test accordingly.

Co-Authored-By: Mark Tozzi <mark.tozzi@gmail.com>
* Skip fields that are not aggregatable

* Use newIndexSearcher() to avoid incompatible readers (#52723)

Lucene's `newSearcher()` can generate readers like ParallelCompositeReader
which we can't use.  We need to instead use our helper `newIndexSearcher`
2020-03-31 14:35:03 -04:00
Nik Everett e58ad9fed3
Clean up how pipeline aggs check for multi-bucket (backport of #54161) (#54379)
Pipeline aggregations like `stats_bucket`, `sum_bucket`, and
`percentiles_bucket` only operate on buckets that have multiple buckets.
This adds support for those aggregations to `geo_distance`, `ip_range`,
`auto_date_histogram`, and `rare_terms`.

This all happened because we used a marker interface to mark compatible
aggs, `MultiBucketAggregationBuilder` and it was fairly easy to forget
to implement the interface.

This replaces the marker interface with an abstract method in
`AggregationBuilder`, `bucketCardinality` which makes you return `NONE`,
`ONE`, or `MANY`. The `bucket` aggregations can check for `MANY`. At
this point `ONE` and `NONE` amount to about the same thing, but I
suspect that'll be a useful distinction when validating bucket sorts.

Closes #53215
2020-03-30 10:44:55 -04:00
Jake Landis db3420d757
[7.x] Optimize which Rest resources are used by the Rest tests… (#53766)
This should help with Gradle's incremental compile such that projects
only depend upon the resources they use.

related #52114
2020-03-19 12:28:59 -05:00
Marios Trivyzas dac720d7a1
Add a cluster setting to disallow expensive queries (#51385) (#52279)
Add a new cluster setting `search.allow_expensive_queries` which by
default is `true`. If set to `false`, certain queries that have
usually slow performance cannot be executed and an error message
is returned.

- Queries that need to do linear scans to identify matches:
  - Script queries
- Queries that have a high up-front cost:
  - Fuzzy queries
  - Regexp queries
  - Prefix queries (without index_prefixes enabled
  - Wildcard queries
  - Range queries on text and keyword fields
- Joining queries
  - HasParent queries
  - HasChild queries
  - ParentId queries
  - Nested queries
- Queries on deprecated 6.x geo shapes (using PrefixTree implementation)
- Queries that may have a high per-document cost:
  - Script score queries
  - Percolate queries

Closes: #29050
(cherry picked from commit a8b39ed842c7770bd9275958c9f747502fd9a3ea)
2020-02-12 22:56:14 +01:00
Julie Tibshirani 337d73a7c6 Rename MapperService#fullName to fieldType.
The new name more accurately describes what the method returns.
2020-02-07 10:35:53 -08:00
Mayya Sharipova 0b7309ec9c Fix NPE bug inner_hits (#50709)
When there several subqueries on different relations of the join field,
and only one of subqueries is using inner_hits, NPE occurs.
This PR prevents NPE error.

Closes #50539
2020-01-07 14:21:54 -05:00
Jim Ferenczi d6445fae4b Add a cluster setting to disallow loading fielddata on _id field (#49166)
This change adds a dynamic cluster setting named `indices.id_field_data.enabled`.
When set to `false` any attempt to load the fielddata for the `_id` field will fail
with an exception. The default value in this change is set to `false` in order to prevent
fielddata usage on this field for future versions but it will be set to `true` when backporting
to 7x. When the setting is set to true (manually or by default in 7x) the loading will also issue
a deprecation warning since we want to disallow fielddata entirely when https://github.com/elastic/elasticsearch/issues/26472
is implemented.

Closes #43599
2019-11-28 09:35:28 +01:00
Mark Tozzi 17358b5af7
(refactor) Extract Empty/Script/Missing ValuesSource behavior to an interface (#48320) (#49330)
This is a pure code rearrangement refactor.  Logic for what specific ValuesSource instance to use for a given type (e.g. script or field) moved out of ValuesSourceConfig and into CoreValuesSourceType (previously just ValueSourceType; we extract an interface for future extensibility).  ValueSourceConfig still selects which case to use, and then the ValuesSourceType instance knows how to construct the ValuesSource for that case.
2019-11-19 16:44:29 -05:00
jimczi b858e19bcc Revert #46598 that breaks the cachability of the sub search contexts. 2019-10-15 09:40:59 +02:00
Jim Ferenczi 08f28e642b Replace SearchContext with QueryShardContext in query builder tests (#46978)
This commit replaces the SearchContext used in AbstractQueryTestCase with
a QueryShardContext in order to reduce the visibility of search contexts.

Relates #46523
2019-09-23 20:24:02 +02:00
Jim Ferenczi 4407f3af1b Delay the creation of SubSearchContext to the FetchSubPhase (#46598)
This change delays the creation of the SubSearchContext for nested and parent/child inner_hits
to the fetch sub phase in order to ensure that a SearchContext can built entirely from a
QueryShardContext. This commit also adds a validation step to the inner hits builder that ensures that we fail the request early if the inner hits path is invalid.

Relates #46523
2019-09-12 14:52:15 +02:00
Jim Ferenczi 23bf310c84 Replace the SearchContext with QueryShardContext when building aggregator factories (#46527)
This commit replaces the `SearchContext` with the `QueryShardContext` when building aggregator factories. Aggregator factories are part of the `SearchContext` so they shouldn't require a `SearchContext` to create them.
The main changes here are the signatures of `AggregationBuilder#build` that now takes a `QueryShardContext` and `AggregatorFactory#createInternal` that passes the `SearchContext` to build the `Aggregator`.

Relates #46523
2019-09-11 16:43:30 +02:00
Zachary Tong 92ad588275
Remove generic on AggregatorFactory (#43664) (#44079)
AggregatorFactory was generic over itself, but it doesn't appear we
use this functionality anywhere (e.g. to allow the super class
to declare arguments/return types generically for subclasses to
override).  Most places use a wildcard constraint, and even when a
concrete type is specified it wasn't used.

But since AggFactories are widely used, this led to
the generic touching many pieces of code and making type signatures
fairly complex
2019-07-10 13:20:28 -04:00
Christos Soulios d1637ca476
Backport: Refactor aggregation base classes to remove doEquals() and doHashCode() (#43363)
This PR is a backport a of #43214 from v8.0.0

A number of the aggregation base classes have an abstract doEquals() and doHashCode() (e.g. InternalAggregation.java, AbstractPipelineAggregationBuilder.java).

Theoretically this is so the sub-classes can add to the equals/hashCode and don't need to worry about calling super.equals(). In practice, it's mostly just confusing/inconsistent. And if there are more than two levels, we end up with situations like InternalMappedSignificantTerms which has to call super.doEquals() which defeats the point of having these overridable methods.

This PR removes the do versions and just use equals/hashCode ensuring the super when necessary.
2019-06-19 22:31:06 +03:00
Hicham Mallah 22f3b53ed7 Deprecate using 0 value for `min_children` in `has_child` query (#41555)
After changing the allowed minimum value for min_children in has_child query from 0 to 1 in 
the next major version, this PR adds a deprecation warning for these cases.

Closes #41548
2019-04-26 21:00:11 +02:00
Julie Tibshirani be9c37fc76 Small simplifications to mapping validation. (#39777)
These simplifications to `MapperMergeValidator` are possible now that there is
always a single mapping definition.

* Remove the type argument in `validateMapperStructure`.
* Remove unnecessary checks against existing mappers.
2019-03-08 12:34:09 -08:00
Julie Tibshirani c2e9d13ebd
Default include_type_name to false in the yml test harness. (#38058)
This PR removes the temporary change we made to the yml test harness in #37285
to automatically set `include_type_name` to `true` in index creation requests
if it's not already specified. This is possible now that the vast majority of
index creation requests were updated to be typeless in #37611. A few additional
tests also needed updating here.

Additionally, this PR updates the test harness to set `include_type_name` to
`false` in index creation requests when communicating with 6.x nodes. This
mirrors the logic added in #37611 to allow for typeless document write requests
in test set-up code. With this update in place, we can remove many references
to `include_type_name: false` from the yml tests.
2019-02-01 11:44:13 -08:00
Desmond Vehar c1c4abae10 Throw if two inner_hits have the same name (#37645)
This change throws an error if two inner_hits have the same name

Closes #37584
2019-02-01 15:53:50 +01:00
Colin Goodheart-Smithe 21e392e95e
Removes typed calls from YAML REST tests (#37611)
This PR attempts to remove all typed calls from our YAML REST tests. The PR adds include_type_name: false to create index requests that use a mapping and also to put mapping requests. It also removes _type from index requests where they haven't already been removed. The PR ignores tests named *_with_types.yml since this are specifically testing typed API behaviour.

The change also includes changing the test harness to add the type _doc to index, update, get and bulk requests that do not specify the document type when the test is running against a mixed 7.x/6.x cluster.
2019-01-30 16:32:58 +00:00
Boaz Leskes af2f4c8f73 enable bwc tests and bump versions after backporting https://github.com/elastic/elasticsearch/pull/37639 2019-01-24 20:55:55 +01:00
Boaz Leskes 52ba407931
Expose sequence number and primary terms in search responses (#37639)
Users may require the sequence number and primary terms to perform optimistic concurrency control operations. Currently, you can get the sequence number via the `docvalues_fields` API but the primary term is not accessible because it is maintained by the `SeqNoFieldMapper` and the infrastructure can't find it. 

This commit adds a dedicated sub fetch phase to return both numbers that is connected to a new `seq_no_primary_term` parameter.
2019-01-23 09:01:58 +01:00
Zachary Tong 2ba9e361ab
Add helper classes to determine if aggs have a value (#36020)
This adds a set of helper classes to determine if an agg "has a value". 
This is needed because InternalAggs represent "empty" in different 
manners according to convention. Some use `NaN`, `+/- Inf`, `0.0`, etc.

A user can pass the Internal agg type to one of these helper methods
and it will report if the agg contains a value or not, which allows the
user to differentiate "empty" from a real `NaN`.

These helpers are best-effort in some cases.  For example, several
pipeline aggs share a single return class but use different conventions
to mark "empty", so the helper uses the loosest definition that applies
to all the aggs that use the class.

Sums in particular are unreliable.  The InternalSum simply returns 0.0
if the agg is empty (which is correct, no values == sum of zero).  But this
also means the helper cannot differentiate from "empty" and `+1 + -1`.
2019-01-22 12:38:55 -05:00
Jim Ferenczi 95479f1766 Ensure that a non static top docs is created during the search phase
This change fixes an unreleased bug that trips an assertion because a static instance
shared among threads is modified during the search. This commit copies the static
instance in order to ensure that each thread can modify the value without modifying
the other instances.

Closes #37179
Closes #37266
2019-01-09 22:57:34 +01:00
Alpar Torok 7de4d2cb0f Mute failing test ChildQuerySearchIT
Tracked in #37266
2019-01-09 16:48:49 +02:00
Josh Soref 1df66d21fe Spelling: replace uknown with unknown (#37056) 2019-01-02 17:33:02 +01:00
Nhat Nguyen 7580d9d925
Make SourceToParse immutable (#36971)
Today the routing of a SourceToParse is assigned in a separate step
after the object is created. We can easily forget to set the routing.
With this commit, the routing must be provided in the constructor of
SourceToParse.

Relates #36921
2018-12-24 14:06:50 -05:00
Jim Ferenczi 18866c4c0b
Make hits.total an object in the search response (#35849)
This commit changes the format of the `hits.total` in the search response to be an object with
a `value` and a `relation`. The `value` indicates the number of hits that match the query and the
`relation` indicates whether the number is accurate (in which case the relation is equals to `eq`)
or a lower bound of the total (in which case it is equals to `gte`).
This change also adds a parameter called `rest_total_hits_as_int` that can be used in the
search APIs to opt out from this change (retrieve the total hits as a number in the rest response).
Note that currently all search responses are accurate (`track_total_hits: true`) or they don't contain
`hits.total` (`track_total_hits: true`). We'll add a way to get a lower bound of the total hits in a
follow up (to allow numbers to be passed to `track_total_hits`).

Relates #33028
2018-12-05 19:49:06 +01:00
Dominik Stadler d351422215 Add parent-aggregation to parent-join module (#34210)
Add `parent` aggregation, a special single bucket aggregation that joins children documents to their parent.
2018-11-08 14:13:00 +01:00
Jim Ferenczi 1b879ea8ac
Refactor children aggregator into a generic ParentJoinAggregator (#34845)
This commit adds a new ParentJoinAggregator that implements a join using global ordinals
in a way that can be reused by the `children` and the upcoming `parent` aggregation.
This new aggregator is a refactor of the existing ParentToChildrenAggregator with two main changes:
* It uses a dense bit array instead of a long array when the aggregation does not have any parent.
* It uses a single aggregator per bucket if it is nested under another aggregation.
For the latter case we use a `MultiBucketAggregatorWrapper` in the factory in order to ensure that each
instance of the aggregator handles a single bucket. This is more inlined with the strategy we use for other
aggregations like `terms` aggregation for instance since the number of buckets to handle should be low (thanks to the breadth_first strategy).
This change is also required for #34210 which adds the `parent` aggregation in the parent-join module.

Relates #34508
2018-10-26 16:26:45 +02:00
Christoph Büscher ba3ceeaccf
Clean up "unused variable" warnings (#31876)
This change cleans up "unused variable" warnings. There are several cases were we 
most likely want to suppress the warnings (especially in the client documentation test
where the snippets contain many unused variables). In a lot of cases the unused
variables can just be deleted though.
2018-09-26 14:09:32 +02:00
Christoph Büscher c9131983f5
[Docs] Minor fix in `has_child` javadoc comment (#33674)
The min and max constants are accidentaly the wrong way around.
2018-09-14 09:41:20 +02:00
Alan Woodward 39c3234c2f
Upgrade to latest Lucene snapshot (#33505)
* LeafCollector.setScorer() now takes a Scorable
* Scorers may not have null Weights
* IndexWriter.getFlushingBytes() reports how much memory is being used by IW threads writing to disk
2018-09-10 20:51:55 +01:00
Jim Ferenczi 79cd6385fe
Collapse package structure for metrics aggs (#33463)
This change collapses all metrics aggregations classes into a single package `org.elasticsearch.aggregations.metrics`.
It also restricts the visibility of some classes (aggregators and factories) that should not be used outside of the package.

Relates #22868
2018-09-07 10:58:06 +02:00
Jim Ferenczi 7ad71f906a
Upgrade to a Lucene 8 snapshot (#33310)
The main benefit of the upgrade for users is the search optimization for top scored documents when the total hit count is not needed. However this optimization is not activated in this change, there is another issue opened to discuss how it should be integrated smoothly.
Some comments about the change:
* Tests that can produce negative scores have been adapted but we need to forbid them completely: #33309

Closes #32899
2018-09-06 14:42:06 +02:00
Julie Tibshirani 78df00ff24
Simplify the return type of FieldMapper#parse. (#32654) 2018-09-04 01:15:19 +00:00
Nhat Nguyen 24d60c7f4b
Fix from_range in search_after in changes snapshot (#33335)
We can have multiple documents in Lucene with the same seq_no for
parent-child documents (or without rollback). In this case, the usage
"lastSeenSeqNo + 1" is an off-by-one error as it may miss some
documents. This error merely affects the `skippedOperations` contract.

See: https://github.com/elastic/elasticsearch/pull/33222#discussion_r213842257

Closes #33318
2018-09-03 11:58:49 -04:00
Nhat Nguyen 8703d875c0 TEST: Disable soft-deletes in ParentChildTestCase
Tracked at #33318
2018-08-31 12:54:51 -04:00
Jim Ferenczi f4e9729d64
Remove unsupported Version.V_5_* (#32937)
This change removes the es 5x version constants and their usages.
2018-08-24 09:51:21 +02:00
Luca Cavanna 393eec1482
Set maxScore for empty TopDocs to Nan rather than 0 (#32938)
We used to set `maxScore` to `0` within `TopDocs` in situations where there is really no score as the size was set to `0` and scores were not even tracked. In such scenarios, `Float.Nan` is more appropriate, which gets converted to `max_score: null` on the REST layer. That's also more consistent with lucene which set `maxScore` to `Float.Nan` when merging empty `TopDocs` (see `TopDocs#merge`).
2018-08-22 17:23:54 +02:00
Christoph Büscher 22f7b03430
Fix test reproducability in AbstractBuilderTestCase setup (#32403)
Currently AbstractBuilderTestCase generates certain random values in its
`beforeTest()` method annotated with @Before only the first time that a test
method in the suite is run while initializing the serviceHolder that we use for
the rest of the test. This changes the values of subsequent random values
and has the effect that when running single methods from a test suite with
"-Dtests.method=*", the random values it sees are different from when the same
test method is run as part of the whole test suite. This makes it hard to use
the reproduction lines logged on failure.

This change runs the inialization of the serviceHolder and the randomization 
connected to it using the test runners master seed, so reproduction by running
just one method is possible again.


Closes #32400
2018-08-10 15:13:44 +02:00
Nirmal Chidambaram c827a4e8e1 has_parent builder: exception message/param fix (#31182)
has_parent builder throws exception message that it expects a `type`
while parser excepts `parent_type`
2018-06-30 11:17:37 -07:00
Tanguy Leroux bf58660482
Remove all unused imports and fix CRLF (#31207)
The X-Pack opening and the recent other refactorings left a lot of 
unused imports in the codebase. This commit removes them all.
2018-06-11 15:12:12 +02:00
Adrien Grand 231a63fdf8
Remove useless version checks in REST tests. (#30165)
Many tests are added with a version check so that they do not run against a
version that doesn't have the feature yet. Master is 7.0, so all tests that
do not run against 6.0+ can be removed and the version check can be removed
on all tests that always run on 6.0+.
2018-05-02 11:34:15 +02:00
Nik Everett 9c8e015552
Build: Mostly silence warning about html4 javadoc (#30220)
This *mostly* silences `javadoc`'s warning about defaulting to
generating html4 files by enabling generating html5 file for the
projects for which that works. It didn't work in a half dozen projects,
about half of which I've fixed in this PR, entirely by replacing
`<tt>thing</tt>` with `{@code thing}`.

There are a few remaining projects that contain javadoc with invalid
html5. I'll fix those projects in a followup.
2018-04-28 09:50:54 -04:00
Adrien Grand 4918924fae
Remove legacy mapping code. (#29224)
Some features have been deprecated since `6.0` like the `_parent` field or the
ability to have multiple types per index. This allows to remove quite some
code, which in-turn will hopefully make it easier to proceed with the removal
of types.
2018-04-11 09:41:37 +02:00
Adrien Grand c052e989cf Fix HasChildQueryBuilderTests to not use the `classic` similarity.
Closes #29362
2018-04-04 12:48:41 +02:00
Jason Tedor 4b1ed20a67 Add awaits fix for HasChildQueryBuilderTests
These tests are failing since
569d0c0e89. This commit adds an awaits fix
for them until they can be addressed.
2018-04-03 23:18:51 -04:00
Adrien Grand 569d0c0e89
Improve similarity integration. (#29187)
This improves the way similarities are plugged in in order to:
 - reject the classic similarity on 7.x indices and emit a deprecation
   warning otherwise
 - reject unkwown parameters on 7.x indices and emit a deprecation
   warning otherwise

Even though this breaks the plugin API, I'd like to backport to 7.x so
that users can get deprecation warnings when they are doing something
that will become unsupported in the future.

Closes #23208
Closes #29035
2018-04-03 16:45:25 +02:00
Jim Ferenczi 2aaa057387
Propagate ignore_unmapped to inner_hits (#29261)
In 5.2 `ignore_unmapped` was added to `inner_hits` in order to ignore invalid mapping.
This value was automatically set to the value defined in the parent query (`nested`, `has_child`, `has_parent`) but the refactoring of the parent/child in 5.6 removed this behavior unintentionally.
This commit restores this behavior but also makes sure that we always automatically enforce this value when the query builder is used directly (previously this was only done by the XContent deserialization).

Closes #29071
2018-03-27 18:55:42 +02:00
Lee Hinman 8e8fdc4f0e
Decouple XContentBuilder from BytesReference (#28972)
* Decouple XContentBuilder from BytesReference

This commit removes all mentions of `BytesReference` from `XContentBuilder`.
This is needed so that we can completely decouple the XContent code and move it
into its own dependency.

While this change appears large, it is due to two main changes, moving
`.bytes()` and `.string()` out of XContentBuilder itself into static methods
`BytesReference.bytes` and `Strings.toString` respectively. The rest of the
change is code reacting to these changes (the majority of it in tests).

Relates to #28504
2018-03-14 13:47:57 -06:00
Lee Hinman 3ddea8d8d2
Start switching to non-deprecated ParseField.match method (#28488)
This commit switches all the modules and server test code to use the
non-deprecated `ParseField.match` method, passing in the parser's deprecation
handler or the logging deprecation handler when a parser is not available (like
in tests).

Relates to #28449
2018-02-02 10:10:13 -07:00
Jim Ferenczi dd40b984c4
Add a shallow copy method to aggregation builders (#28430)
This change adds a shallow copy method for aggregation builders. This method returns a copy of the builder replacing the factoriesBuilder and metaDada
This method is used when the builder is rewritten (AggregationBuilder#rewrite) in order to make sure that we create a new instance of the parent builder when sub aggregations are rewritten.

Relates #27782
2018-02-01 09:22:32 +01:00
Adrien Grand 700d9ecc95
Remove the `update_all_types` option. (#28288)
This option is not useful in 7.x since no indices may have more than one type
anymore.
2018-01-22 12:03:07 +01:00
Adrien Grand 1b660821a2
Allow `_doc` as a type. (#27816)
Allowing `_doc` as a type will enable users to make the transition to 7.0
smoother since the index APIs will be `PUT index/_doc/id` and `POST index/_doc`.
This also moves most of the documentation to `_doc` as a type name.

Closes #27750
Closes #27751
2017-12-14 17:47:53 +01:00
Adrien Grand 6323bb0d97
Upgrade to lucene-7.2.0-snapshot-8c94404. (#27619)
This new snapshot mostly brings a change to TopFieldCollector which can now
early terminate collection when trackTotalHits is `false`.

As a follow-up, we should replace our usage of
`EarlyTerminatingSortingCollector` with this new option.
2017-12-04 09:40:08 +01:00
Martijn van Groningen cb1204774b
Include the _index, _type and _id to nested search hits in the top_hits and inner_hits response.
Also include _type and _id for parent/child hits inside inner hits.

In the case of top_hits aggregation the nested search hits are
directly returned and are not grouped by a root or parent document, so
it is important to include the _id and _index attributes in order to know
to what documents these nested search hits belong to.

Closes #27053
2017-11-28 14:05:29 +01:00
Colin Goodheart-Smithe 99aca9cdfc
Enhances exists queries to reduce need for `_field_names` (#26930)
* Enhances exists queries to reduce need for `_field_names`

Before this change we wrote the name all the fields in a document to a `_field_names` field and then implemented exists queries as a term query on this field. The problem with this approach is that it bloats the index and also affects indexing performance.

This change adds a new method `existsQuery()` to `MappedFieldType` which is implemented by each sub-class. For most field types if doc values are available a `DocValuesFieldExistsQuery` is used, falling back to using `_field_names` if doc values are disabled. Note that only fields where no doc values are available are written to `_field_names`.

Closes #26770

* Addresses review comments

* Addresses more review comments

* implements existsQuery explicitly on every mapper

* Reinstates ability to perform term query on `_field_names`

* Added bwc depending on index created version

* Review Comments

* Skips tests that are not supported in 6.1.0

These values will need to be changed after backporting this PR to 6.x
2017-11-01 10:46:59 +00:00
Simon Willnauer 8dda827ff4 Don't refresh on `_flush` `_force_merge` and `_upgrade` (#27000)
Today all these API calls have a sideeffect of making documents visible
to search requests. While this is sometimes desired it's an unnecessary sideeffect
and now that we have an internal (engine-private) index reader (#26972) we artificially
add a refresh call for bwc. This change removes this sideeffect in 7.0.
2017-10-16 10:16:35 +02:00
Martijn van Groningen dca787ed8a
upgrade to Lucene 7.1.0 snapshot version 2017-10-05 09:06:56 +02:00
Simon Willnauer aab4655e63 Unify Settings xcontent reading and writing (#26739)
This change adds a fromXContent method to Settings that allows to read
the xcontent that is produced by toXContent. It also replaces the entire settings
loader infrastructure and removes the structured map representation. Future PRs will
also tackle the `getAsMap` that exposes the internal represenation of settings for
better encapsulation.
2017-09-25 13:23:01 +02:00
Christoph Büscher 86b00b84bc Remove parse field deprecations in query builders (#26711)
The `fielddata` field and the use of the `_name` field in the short syntax of the range 
query have been deprecated in 5.0 and can be removed.

The same goes for the deprecated `score_mode` field in HasParentQueryBuilder,
the deprecated `like_text`, `ids` and `docs` parameter in the `more_like_this` query,
the deprecated query name in the short version of the `regexp` query, and several
deprecated alternative field names in other query builders.
2017-09-20 16:22:21 +02:00
Martijn van Groningen 78e9c96d7f
Added a limit to from + size in top_hits and inner hits.
Relates to #11511
2017-09-05 08:44:45 +02:00
Colin Goodheart-Smithe ce1d85d7d0 Moves deferring code into its own subclass (#26421)
* Moves deferring code into its own subclass

This change moves the code that deals with deferring collection to a subclass of BucketAggregator called DeferringBucketAggregator. This means that the code in AggregatorBase is simplified and also means that the code for deferring colleciton is in one place and easier to maintain.

* Makes SIngleBucketAggregator an interface

This is so aggregators that extend BucketsAggregator directly and those that extend DeferringBucketAggregator can be a single bucket aggregator

* review comments

* More review comments
2017-08-30 11:15:40 +01:00
olcbean 5c4c1c5e15 Verify that _bulk and _msearch requests are terminated by a newline (#25740) 2017-08-08 10:45:44 +02:00
Simon Willnauer 82fa531ab4 Remove `_index` fielddata hack if cluster alias is present (#26082)
We introduced a hack in #25885 to respect the cluster alias if available on the `_index` field. This is important if aggregations or other field data related operations are executed. Yet, we added a small hack that duplicated an implementation detail from the `_index` field data builder to make this work. This change adds a necessary but simple API change that allows us to remove the hack and only have a single implementation.
2017-08-08 09:24:24 +02:00
Adrien Grand f0cba4fce5 Add a scripted similarity. (#25831)
The goal of this similarity is to help users who would like to keep the
functionality of the `tf-idf` similarity that we want to remove, or to allow
for specific usec-cases (disabling idf, disabling tf, disabling length norm,
etc.) to not have to build a custom plugin and familiarize with the low-level
Lucene API.
2017-08-08 08:55:12 +02:00
Adrien Grand 88d456989e Make FieldMapper.copyTo() always non-null. (#25994)
Otherwise it is confusing that both a null copyTo and an empty copyTo should
be treated the same.
2017-08-02 10:07:29 +02:00
Jim Ferenczi 562c3744ca Merge FunctionScoreQuery and FiltersFunctionScoreQuery (#25889)
This change merges the functionality of the FiltersFunctionScoreQuery in the FunctionScoreQuery.
It also ensures that an exception is thrown when the computed score is equals to Float.NaN or Float.NEGATIVE_INFINITY.
These scores are invalid for TopDocsCollectors that relies on score comparison.

Fixes #15709
Fixes #23628
2017-07-28 09:22:20 +02:00
Simon Willnauer 634ce90dc0 Respect cluster alias in `_index` aggs and queries (#25885)
Today when we aggregate on the `_index` field the cross cluster search
alias is not taken into account. Neither is it respected when we search
on the field. This change adds support for cluster alias when the cluster
alias is present on the `_index` field.

Closes #25606
2017-07-26 09:16:52 +02:00
Adrien Grand 40bb1663ee Index ids in binary form. (#25352)
Indexing ids in binary form should help with indexing speed since we would
have to compare fewer bytes upon sorting, should help with memory usage of
the live version map since keys will be shorter, and might help with disk
usage depending on how efficient the terms dictionary is at compressing
terms.

Since we can only expect base64 ids in the auto-generated case, this PR tries
to use an encoding that makes the binary id equal to the base64-decoded id in
the majority of cases (253 out of 256). It also specializes numeric ids, since
this seems to be common when content that is stored in Elasticsearch comes
from another database that uses eg. auto-increment ids.

Another option could be to require base64 ids all the time. It would make things
simpler but I'm not sure users would welcome this requirement.

This PR should bring some benefits, but I expect it to be mostly useful when
coupled with something like #24615.

Closes #18154
2017-07-07 14:22:47 +02:00
Martijn van Groningen d0f9f425bd
parent/child: Removed ParentJoinFieldSubFetchPhase 2017-07-06 13:15:02 +02:00
Martijn van Groningen 407273f81d
parent/child: Support parent id being specified as number in the _source 2017-07-06 11:48:57 +02:00
Christoph Büscher f576c987ce Remove QueryParseContext (#25486)
QueryParseContext is currently only used as a wrapper for an XContentParser, so
this change removes it entirely and changes the appropriate APIs that use it so
far to only accept a parser instead.
2017-07-03 17:30:40 +02:00
Christoph Büscher 927111c91d Remove QueryParseContext from parsing QueryBuilders (#25448)
Currently QueryParseContext is only a thin wrapper around an XContentParser that
adds little functionality of its own. I provides helpers for long deprecated
field names which can be removed and two helper methods that can be made static
and moved to other classes. This is a first step in helping to remove
QueryParseContext entirely.
2017-06-29 17:10:20 +02:00
olcbean 3518e313b8 Unify the result interfaces from get and search in Java client (#25361)
As GetField and SearchHitField have the same members, they have been unified into
DocumentField.

Closes #16440
2017-06-29 11:35:28 +02:00
Simon Willnauer d338a09812 Remove `mapping.single_type` from parent join test (#25391)
This removes the remaining usage of `mapping.single_type` from the parent join
module and moves it's bwc test to the mixed cluster tests

Relates to #24961
Relates to #20257
2017-06-26 17:33:07 +02:00
Simon Willnauer 4ae426a552 Remove remaining `index.mapping.single_type=false` (#25369)
This change cleans up remaining tests  to not use index.mapping.single_type=false
but instead where applicable use a single type or markt the index as created
with a pre 6.x version.

Yet, there is still on leftover in the client tests that needs special attention.
See `org.elasticsearch.client.SearchIT`

Relates to #24961
2017-06-23 10:26:06 +02:00
Jim Ferenczi 5e64cd08bc [Test] restore BWC for parent-join now that the new mapping format is in 5.x 2017-06-15 15:15:48 +02:00
Jim Ferenczi 9ca33e2450 Add a section named "relations" in the ParentJoinFieldMapper (#25248)
* Add a section named "relation" in the ParentJoinFieldMapper

This commit puts the parent/child definition in an inner section named "relation".
Mapping for the parent-join will look like this:

```
"join_field": {
  "type": "join"
  "relations":
    "parent": "child"
  }
}
```
2017-06-15 14:56:20 +02:00
Tanguy Leroux 27f1206999 Use SPI in High Level Rest Client to load XContent parsers (#25098)
This commit adds a NamedXContentProvider interface that can 
be implemented by plugins or modules using Java's SPI feature 
in order to provide additional NamedXContent parsers to external
applications like the Java High Level Rest Client.
2017-06-15 12:50:02 +02:00
Adrien Grand a8ea2f0df4 Leverage scorerSupplier when applicable. (#25109)
The `scorerSupplier` API allows to give a hint to queries in order to let them
know that they will be consumed in a random-access fashion. We should use this
for aggregations, function_score and matched queries.
2017-06-08 10:19:38 +02:00
Jim Ferenczi 3924fd79ef Add BWC rest test for parent-join after the backport to 5.x 2017-06-07 19:29:01 +02:00
Martijn van Groningen db8aa8e94e
Changed inner_hits to work with the new join field type and
at the same time maintaining support for the `_parent` meta field type/

Relates to #20257
2017-06-07 10:52:49 +02:00
Jim Ferenczi 7e60cf3e54 Move parent_id query to the parent-join module (#25072)
This change moves the parent_id query to the parent-join module and handles the case when only the parent-join field can be declared on an index (index with single type on).
If single type is off it uses the legacy parent join field mapper and switch to the new one otherwise (default in 6).

Relates #20257
2017-06-06 19:35:14 +02:00
Martijn van Groningen 2a71a7bffc
Change `has_child`, `has_parent` queries and `childen` aggregation to work with the new join field type and
at the same time maintaining support for the `_parent` meta field type.

Relates to #20257
2017-06-02 23:27:16 +02:00
Jim Ferenczi 4077600035 Disallow the new parent join field on indices with multiple types
Relates https://github.com/elastic/elasticsearch/pull/24978
2017-06-02 18:28:03 +02:00
Jim Ferenczi b8605775df Add the ability to set eager_global_ordinals in the new parent-join field (#25019)
Defaults to true
2017-06-02 15:34:22 +02:00
Jim Ferenczi f4aee1e583 Disallow multiple parent-join fields per mapping (#25002)
This change ensures that there is a single parent-join field defined per mapping.
The verification is done through the addition of a special field mapper (MetaJoinFieldMapper) with a unique name (_parent_join) that is registered to the mapping service
when the first parent-join field is defined. If a new parent-join is added, this field mapper will clash with the new one and the update will fail.
This change also simplifies the parent join fetch sub phase by retrieving the parent-join field without iterating on all fields in the mapping.
2017-06-02 09:21:15 +02:00
Jim Ferenczi b5d62ae747 Introduce ParentJoinFieldMapper, a field mapper that creates parent/child relation within documents of the same index (#24978)
* Introduce ParentJoinFieldMapper, a field mapper that creates parent/child relation within documents of the same index

This change adds a new field mapper named ParentJoinFieldMapper. This mapper is a replacement for the ParentFieldMapper but instead of using the types in the mapping
it uses an internal field to materialize parent/child relation within a single index.
This change also adds a fetch sub phase that automatically retrieves the join name (parent or child name) and the parent id for child documents in the response hit fields.
The compatibility with `has_parent`, `has_child` queries and `children` agg will be added in a follow up.

Relates #20257
2017-05-31 18:07:21 +02:00
Nik Everett 5da8ce8318 Remove the need for _UNRELEASED suffix in versions (#24798)
Removes the need for the `_UNRELEASED` suffix on versions by detecting if a version should be unreleased or not based on the versions around it. This should make it simpler to automate the task of adding a new version label.
2017-05-26 18:36:32 -04:00
Jim Ferenczi 4707377cea Move InnerHitBuilder queries BWC version to 5.5 after the backport
Relates #24676
2017-05-23 22:41:39 +02:00
Jim Ferenczi 9087803cd9 Add the ability to define custom inner hit sub context builder (#24676)
This commit moves the handling of nested and parent/child inner hits to specialized classes that can be defined outside of ES core.
InnerHitBuilderContext is now used by the parent query (nested or hasChild, ...) to build the sub context from the InnerHitBuilder definition.
BWC is also ensured so that nodes in previous versions can still send/receive inner hits to/from this version.

Relates #20257
2017-05-23 13:06:22 +02:00
javanna db0490343e Merge branch 'master' into feature/client_aggs_parsing 2017-05-19 18:17:06 +02:00
Jim Ferenczi d241c4898e Removes parent child fielddata specialization (#24737)
This change removes the field data specialization needed for the parent field and replaces it with
a simple DocValuesIndexFieldData. The underlying global ordinals are retrieved via a new function called
IndexOrdinalsFieldData#getOrdinalMap.
The children aggregation is also modified to use a simple WithOrdinals value source rather than the deleted WithOrdinals.Parent.

Relates #20257
2017-05-19 17:11:23 +02:00
javanna ce7326eb88 Merge branch 'master' into feature/client_aggs_parsing 2017-05-17 17:59:00 +02:00
Ryan Ernst 2a65bed243 Tests: Change rest test extension from .yaml to .yml (#24659)
This commit renames all rest test files to use the .yml extension
instead of .yaml. This way the extension used within all of
elasticsearch for yaml is consistent.
2017-05-16 17:24:35 -07:00
Christoph Büscher 0b688a8733 Small improvement in InternalAggregationTestCase test setup after changes in master (#24675) 2017-05-15 15:06:01 +02:00
Christoph Büscher 42e8d4b761 Merge branch 'master' into feature/client_aggs_parsing
Conflicts:
	core/src/test/java/org/elasticsearch/search/aggregations/bucket/filter/InternalFilterTests.java
	core/src/test/java/org/elasticsearch/search/aggregations/bucket/global/InternalGlobalTests.java
	core/src/test/java/org/elasticsearch/search/aggregations/bucket/missing/InternalMissingTests.java
	core/src/test/java/org/elasticsearch/search/aggregations/bucket/nested/InternalNestedTests.java
	core/src/test/java/org/elasticsearch/search/aggregations/bucket/nested/InternalReverseNestedTests.java
	core/src/test/java/org/elasticsearch/search/aggregations/bucket/sampler/InternalSamplerTests.java
	modules/parent-join/src/test/java/org/elasticsearch/join/aggregations/InternalChildrenTests.java
	test/framework/src/main/java/org/elasticsearch/search/aggregations/InternalSingleBucketAggregationTestCase.java
2017-05-15 12:25:07 +02:00
Jim Ferenczi 279a18a527 Add parent-join module (#24638)
* Add parent-join module

This change adds a new module named `parent-join`.
The goal of this module is to provide a replacement for the `_parent` field but as a first step this change only moves the `has_child`, `has_parent` queries and the `children` aggregation to this module.
These queries and aggregations are no longer in core but they are deployed by default as a module.

Relates #20257
2017-05-12 15:58:06 +02:00