This change fixes the lookup during document parsing of whether an
object field is dynamic to handle looking up through parent object
mappers, and also handle when a parent object mapper was created during
document parsing.
closes#17854
Now there aren't any more specialized methods to read or write
NamedWriteables! Now StreamInput and StreamOutput don't need to know
about large chunks of the application!
In Elasticsearch 5.0.0, by default unquoted field names in JSON will be
rejected. This can cause issues, however, for documents that were
already indexed with unquoted field names. To alleviate this, a system
property has been added that can be enabled so migration can occur.
This system property will be removed in Elasticsearch 6.0.0
Resolves#17674
When I pulled on the thread that is "Remove PROTOTYPEs from
SignificanceHeuristics" I ended up removing SignificanceHeuristicStreams
and replacing it with readNamedWriteable. That seems like a lot at once
but it made sense at the time. And it is what we want in the end, I think.
Anyway, this also converts registration of SignificanceHeuristics to
use ParseFieldRegistry to make them consistent with Queries, Aggregations
and lots of other stuff.
Adds a new and wonderous hack to support serialization checking of
NamedWriteables registered by plugins!
Related to #17085
Extracts all the replication logic that is done on the Primary to a separated class called ReplicationOperation. The goal
here is to make unit testing of this logic easier and in the future allow setting up tests that work directly on IndexShards
without the need for networking.
Closes#16492
Removes deprecated registration methods from SearchModule and
NamedWriteableRegistry and removes the "shims" used to migrate
aggregations to the new registration methods.
Relates to #17085
* Added an extra `field` parameter to the `percolator` query to indicate what percolator field should be used. This must be an existing field in the mapping of type `percolator`.
* The `.percolator` type is now forbidden. (just like any type that starts with a `.`)
This only applies for new indices created on 5.0 and later. Indices created on previous versions the .percolator type is still allowed to exist.
The new `percolator` field type isn't active in such indices and the `PercolatorQueryCache` knows how to load queries from these legacy indices.
The `PercolatorQueryBuilder` will not enforce that the `field` parameter is of type `percolator`.
This commit refactors the UUID-generating methods out of Strings into
their own class. The primary motive for this refactoring is to avoid a
chain of class initializers from loading this class earlier than
necessary. This was discovered when it was noticed that starting
Elasticsearch without any active network interfaces leads to some
logging statements being executed before logging had been
initailized. Thus:
- these UUID methods have no place being on Strings
- removing them reduces spooky action-at-distance loading of this class
- removed the troublesome, logging statements from MacAddressProvider,
logging using statically-initialized instances of ESLogger are prone
to this problem
Relates #17837
We have this TransportAddressSerializers that works similarly to
NamedWriteables except it uses shorts instead of streams. I don't know
enough to propose removing it in favor of NamedWriteables to I just ported
it to using Writeable.Reader and left it alone.
Relates to #17085
This test should demonstrate that a single (larger)
request is processed but on of multiple large concurrent requests
is rejected. This test broke too early under some circumstances in
network mode as the limit is quite low.
With this commit we reduce the size of the individual large
requests but issue more concurrent ones thus increasing stability
of this test.
The queryShardContext we create during setup was sometimes
accessed directly, sometimes by making a copy through
the createShardContext() helper. This should be the default.
Also making sure that strict parsing is switched on via
IndexSettings in the test testup.
This change cleans up a few things in QueryParseContext and QueryShardContext
that make it hard to reason about the state of these context objects and are
thus error prone and should be simplified.
Currently the parser that used in QueryParseContext can be set and reset any
time from the outside, which makes reasoning about it hard. This change makes
the parser a mandatory constructor argument removes ability to later set a
different ParseFieldMatcher. If none is provided at construction time, the
one set inside the parser is used. If a ParseFieldMatcher is specified at
construction time, it is implicitely set on the parser that is beeing used.
The ParseFieldMatcher is only kept inside the parser.
Also currently the QueryShardContext historically holds an inner QueryParseContext
(in the super class QueryRewriteContext), that is mainly used to hold the parser
and parseFieldMatcher. For that reason, the parser can also be reset, which leads
to the same problems as above. This change removes the QueryParseContext from
QueryRewriteContext and removes the ability to reset or retrieve it from the
QueryShardContext. Instead, `QueryRewriteContext#newParseContext(parser)` can be
used to create new parse contexts with the given parser from a shard context
when needed.
Currently we are able to set a ParseFieldMatcher on XContentParsers,
mainly to conveniently carry it around to be available where the
actual parsing happens. This was just recently introduced together
with ObjectParser so that ObjectParser can make use of deprecation
logging and throwing errors while parsing.
This however is trappy because we create parsers in so many places in
the code and it is easy to forget setting the right ParseFieldMatcher.
Instead we should hold the ParseFieldMatcher only in the parse contexts
(e.g. QueryParseContext).
This PR removes the ParseFieldMatcher from XContentParser. ObjectParser
can still make use of it because we can make the otherwise unbounded
`context` type to extend an interface that makes sure contexts used in
ObjectParser can supply a ParseFieldMatcher. Contexts in ObjectParser
are now no longer optional, but it is sufficient to pass in a small
lambda expression in places where no other context is available.
Relates to #17417
and remove its PROTOTYPE. This is the first aggregation builder that
serializes its targetValueType so ValuesSourceAggregatorBuilder had to
grow support for that.
Relates to #17085
The current api allows for choosing which "case" response json keys are
written in. This has the options of camelCase or underscore. camelCase
is going to be deprecated from the query apis. However, with the case
api, it is not necessary to deprecate, as users who were using it in 2.x
can transition completely on 2.x before upgrading by simply using
the underscore option.
This change removes the 'case' option from rest apis.
see #8988
During the bulk action a hierachy of tasks is getting created: bulk->bulk[s] (coord node) -> bulk[s] (primary shard node) -> bulk[s][p] and bulk[s][r]. Due to a bug the first bulk[s] task didn't have bulk's task id is set as a parent id. This commit fixes this bug.
Handling of the current path when parsing a document is very sensitive.
This fixes a subtle bug in array parsing, where the path that was added
by parsing an array would not be cleared. It also adds a hard state
check at the end of parsing to ensure we ended with a clean path.
The doc parser uses a context object to store the state of parsing,
namely the existing mappings, new mappings, and the parsed document.
Currently this uses a threadlocal which is "reset" for each doc parsed.
However, the thread local doesn't actually save anything, since
resetting is constructing new objects. This change removes the thread
local, which also simplifies the mapper service as it now does not need
to be closeable.
In 2.0 we began restricting fields to not contains dots in their names.
This change adds back part of dots in fieldnames support. Specifically,
it allows indexing documents that contain dots in the field names, when
the correct corresponding mappers exist. For example, if mappings
contain an object field `foo`, and a subfield `bar`, then indexing a
document with `foo.bar` will work.
see #15951
This commit really reverts the inadvertent removal of allowing duplicate
calls to Node#start to be a no-op (but was mistakenly restored to
Node#stop in ddfa3a661510f25c2ce431dfd6fb86ac11eb8888).
This makes all numeric fields including `date`, `ip` and `token_count` use
points instead of the inverted index as a lookup structure. This is expected
to perform worse for exact queries, but faster for range queries. It also
requires less storage.
Notes about how the change works:
- Numeric mappers have been split into a legacy version that is essentially
the current mapper, and a new version that uses points, eg.
LegacyDateFieldMapper and DateFieldMapper.
- Since new and old fields have the same names, the decision about which one
to use is made based on the index creation version.
- If you try to force using a legacy field on a new index or a field that uses
points on an old index, you will get an exception.
- IP addresses now support IPv6 via Lucene's InetAddressPoint and store them
in SORTED_SET doc values using the same encoding (fixed length of 16 bytes
and sortable).
- The internal MappedFieldType that is stored by the new mappers does not have
any of the points-related properties set. Instead, it keeps setting the index
options when parsing the `index` property of mappings and does
`if (fieldType.indexOptions() != IndexOptions.NONE) { // add point field }`
when parsing documents.
Known issues that won't fix:
- You can't use numeric fields in significant terms aggregations anymore since
this requires document frequencies, which points do not record.
- Term queries on numeric fields will now return constant scores instead of
giving better scores to the rare values.
Known issues that we could work around (in follow-up PRs, this one is too large
already):
- Range queries on `ip` addresses only work if both the lower and upper bounds
are inclusive (exclusive bounds are not exposed in Lucene). We could either
decide to implement it, or drop range support entirely and tell users to
query subnets using the CIDR notation instead.
- Since IP addresses now use a different representation for doc values,
aggregations will fail when running a terms aggregation on an ip field on a
list of indices that contains both pre-5.0 and 5.0 indices.
- The ip range aggregation does not work on the new ip field. We need to either
implement range aggs for SORTED_SET doc values or drop support for ip ranges
and tell users to use filters instead. #17700Closes#16751Closes#17007Closes#11513
This commit contains the following improvements/fixes:
1. Renaming method names and variables to better reflect the purpose
of the method and the semantics of the variable.
2. For deleting indexes, replace the closed parameter passed to the
delete index/store methods with obtaining the index's state from the
IndexSettings that is already passed in.
3. Added tests to the IndexWithShadowReplicaIT suite, some of which
show issues in the shadow replica delete process that are captured in
Github issue 17695.
Closes#17638