Adds verification that geohashes are not empty and contain only
valid characters. It fixes the issue when en empty geohash is
treated as [-180, -90] and geohashes with non-geohash character
are getting resolved into invalid coordinates.
Closes#23579
The test indexes new documents and is thus correct in testing that the response result
is `CREATED`. Sadly we can't guarantee exactly once delivery just yet.
Relates #9967Closes#21658
Changes how data is read from CipherInputStream
Instead of using `read()` and checking that the bytes read are what we
expect, use `readFully()` which will read exactly the number of bytes
while keep reading until the end of the stream or throw an
`EOFException` if not all bytes can be read.
This approach keeps the simplicity of using CipherInputStream while
working as expected with both JCE and BCFIPS Security Providers
This PR adds support for the Get Settings API to the java high-level rest client.
Furthermore, logic related to the retrieval of default settings has been moved from the rest layer into the transport layer and now default settings may be retrieved consistency via both the rest API and the transport API.
We were recently looking at bugs that can only occur if two different documents were indexed concurrently. For example, what happens if the local checkpoint advances above the sequence number of a document that's being indexed. That can only happen if another concurrent operation caused the checkpoint to advance. It has to be another document to allow concurrency as we acquire a per uid lock.While our investigation proved that the suspected bug doesn't exists, we still discovered our unit testing coverage is not good enough to cover this case.
This PR extend the test concurrent out of order replica processing to use two documents in its history.
Fix NPE when CumulativeSum agg encounters null/empty bucket
If the cusum agg encounters a null value, it's because the value is
missing (like the first value from a derivative agg), the path is
not valid, or the bucket in the path was empty.
Previously cusum would just explode on the null, but this changes it
so we only increment the sum if the value is non-null and finite.
This is safe because even if the cusum encounters all null or empty
buckets, the cumulative sum is still zero (like how the sum agg returns
zero even if all the docs were missing values)
I went ahead and tweaked AggregatorTestCase to allow testing pipelines,
so that I could delete the IT test and reimplement it as AggTests.
Closes#27544
This commit removes the http.enabled setting. While all real nodes (started with bin/elasticsearch) will always have an http binding, there are many tests that rely on the quickness of not actually needing to bind to 2 ports. For this case, the MockHttpTransport.TestPlugin provides a dummy http transport implementation which is used by default in ESIntegTestCase.
closes#12792
Suggester Options have a collate match field that is returned when the prune
option is set to true. These values should be merged together in the query
reduce phase, otherwise good suggestions that result in rare hits in shards with
results that do not arrive first may be incorrectly marked as not matching the
collate query.
At the end of recovery, we mark the recovering shard as "in sync" on the primary. From this point on
the primary will treat any replication failure on it as critical and will reach out to the master to fail the
shard. To do so, we wait for the local checkpoint of the recovered shard to be above the global
checkpoint (in order to maintain global checkpoint invariant).
If the master decides to cancel the allocation of the recovering shard while we wait, the method can
currently hang and fail to return. It will also ignore the interrupts that are triggered by the cancelled
recovery due to the primary closing.
Note that this is crucial as this method is called while holding a primary permit. Since the method
never comes back, the permit is never released. The unreleased permit will then block any primary
relocation *and* while the primary is trying to relocate all indexing will be blocked for 30m as it
waits to acquire the missing permit.
The code in `SourceRecoveryHandler` runs under a `CancellableThreads` instance in order to allow long running operations to be interrupted when the recovery is cancelled. Sadly if this happens at just the wrong moment while acquiring a permit from the primary, that primary can be leaked and never be freed.
Note that this is slightly better than it sounds - we only cancel recoveries on the source side if the primary shard itself is closed.
Relates to https://github.com/elastic/elasticsearch/pull/30316
This adds a new `_ignored` meta field which indexes and stores fields that have
been ignored at index time because of the `ignore_malformed` option. It makes
malformed documents easier to identify by using `exists` or `term(s)` queries
on the `_ignored` field.
Closes#29494
* WIP commit to try calling rewrite on coordinating node during TransportSearchAction
* Use re-written query instead of using the original query
* fix incorrect/unused imports and wildcarding
* add error handling for cases where an exception is thrown
* correct exception handling such that integration tests pass successfully
* fix additional case covered by IndicesOptionsIntegrationIT.
* add integration test case that verifies queries are now valid
* add optional value for index
* address review comments: catch superclass of XContentParseException
fixes#29483
The variadic constructor was only used in a few places and the
RepositoriesMetaData class is backed by a List anyway, so just using a
List will make it simpler to instantiate it.
We still don't have a strong reason for the failures of
testDoNotRenewSyncedFlushWhenAllSealed and
testSyncedFlushSkipOutOfSyncReplicas.
This commit adds debug logging for these two tests.
Today when an index is created from shrinking or splitting an existing
index, the target index inherits almost none of the source index
settings. This is surprising and a hassle for operators managing such
indices. Given this is the default behavior, we can not simply change
it. Instead, we start by introducing the ability to copy settings. This
flag can be set on the REST API or on the transport layer and it has the
behavior that it copies all settings from the source except non-copyable
settings (a property of a setting introduced in this
change). Additionally, settings on the request will always override.
This change is the first step in our adventure:
- this flag is added here in 7.0.0 and immediately deprecated
- this flag will be backported to 6.4.0 and remain deprecated
- then, we will remove the ability to set this flag to false in 7.0.0
- finally, in 8.0.0 we will remove this flag and the only behavior will
be for settings to be copied
Just like `ElasticsearchException`, the inner most
`XContentParseException` tends to contain the root cause of the
exception and show be show to the user in the `root_cause` field.
The effectively undoes most of the changes that #29373 made to the
`root_cause` for parsing exceptions. The `type` field still changes from
`parse_exception` to `x_content_parse_exception`, but this seems like a
fairly safe change.
`ElasticsearchWrapperException` *looks* tempting to implement this but
the behavior isn't quite right. `ElasticsearchWrapperExceptions` are
entirely unwrapped until the cause no longer
`implements ElasticsearchWrapperException` but `XContentParseException`
should be unwrapped until its cause is no longer an
`XContentParseException` but no further. In other words,
`ElasticsearchWrapperException` are unwrapped one step too far.
Closes#30261
Starting with the refactoring in https://github.com/elastic/elasticsearch/pull/22778 (released in 5.3) we may fail to properly replicate operation when a mapping update on master fails. If a bulk
operations needs a mapping update half way, it will send a request to the master before continuing
to index the operations. If that request times out or isn't acked (i.e., even one node in the cluster
didn't process it within 30s), we end up throwing the exception and aborting the entire bulk. This is
a problem because all operations that were processed so far are not replicated any more to the
replicas. Although these operations were never "acked" to the user (we threw an error) it cause the
local checkpoint on the replicas to lag (on 6.x) and the primary and replica to diverge.
This PR does a couple of things:
1) Most importantly, treat *any* mapping update failure as a document level failure, meaning only
the relevant indexing operation will fail.
2) Removes the mapping update callbacks from `IndexShard.applyIndexOperationOnPrimary` and
similar methods for simpler execution. We don't use exceptions any more when a mapping
update was successful.
I think we need to do more work here (the fact that a single slow node can prevent those mappings
updates from being acked and thus fail operations is bad), but I want to keep this as small as I can
(it is already too big).
Currently, the only way to get the REST response for the `/_cluster/state`
call to return the `cluster_uuid` is to request the `metadata` metrics,
which is one of the most expensive response structures. However, external
monitoring agents will likely want the `cluster_uuid` to correlate the
response with other API responses whether or not they want cluster
metadata.
Today when a resize operation is performed, we copy the analysis,
similarity, and sort settings from the source index. It is possible for
the resize request to include additional index settings including
analysis, similarity, and sort settings. We reject sort settings when
validating the request. However, we silently ignore analysis and
similarity settings on the request that are already set on the source
index. Since it is possible to change the analysis and similarity
settings on an existing index, this should be considered a bug and the
sort of leniency that we abhor. This commit addresses this bug by
allowing the request analysis/similarity settings to override the
existing analysis/similarity settings on the target.
The `testDeleteSnapshotWithMissingIndexAndShardMetadata` test uses an
obsolete repository directory structure based on index names instead of
UUIDs. Because it swallows exceptions when deleting test files the test
never failed when the directory structure changed.
This commit fixes the test to use the right directory structure and file
names and to not swallow exceptions anymore.
A NullPointerException is thrown when trying to create or delete
a snapshot in a repository that has been written to by an older
Elasticsearch after writing to it with a newer Elasticsearch version.
This is because the way snapshots are formatted in the repository
snapshots index file changed in #24477.
This commit changes the parsing of the repository index file so that
it now detects a corrupted index file and fails early the snapshot
operation.
closes#29052
Today we update index settings directly via IndexService instead of the
cluster state in IndexServiceTests. However, those changes will be lost
if there is a cluster state update. In general, we should update index
settings via client and limit the direct usage in only special tests.
This commit replaces direct usages by the updateSettings api of client.
Closes#24491
This commit propagates the preference and routing of the original SearchRequest in the ShardSearchRequest.
This information is then use to fix a bug in sliced scrolls when executed with a preference (or a routing).
Instead of computing the slice query from the total number of shards in the index, this commit computes this number from the number of shards per index that participates in the request.
Fixes#27550
Today we always add no-ops to translog regardless of its origin, thus a
noop may appear in the translog multiple times. This is not a big deal
as noops are small and rare to appear.
This commit ensures to add a noop to translog only if its origin is not
from local translog. This restriction has been applied for index and
delete.
This metric previously existed for backwards compatibility reasons
although the suggest stats were folded into search stats. This metric
was deprecated in 6.3.0 and this commit removes them for 7.0.0.
This commit fixes two issues with the byte size value equals/hash code
test.
The first problem is due to a test failure when the original instance is
zero bytes and we pick the mutation branch where we preserve the size
but change the unit. The mutation should result in a different byte size
value but changing the unit on zero bytes still leaves us with zero
bytes.
During the course of fixing this test I discovered another problem. When
we need to randomize size, we could randomly select a size that would
lead to an overflow of Long.MAX_VALUE.
This commit fixes both of these issues.
This commit adds the distribution type to the startup scripts so that we
can discern from log output and the main response the type of the
distribution (deb/rpm/tar/zip).
This commit adds the distribution flavor (default versus oss) to the
build process which is passed through the startup scripts to
Elasticsearch. This change will be used to customize the message on
attempting to install/remove x-pack based on the distribution flavor.
Adds a check in BlobstoreRepository.snapshot(...) that prevents duplicate snapshot names and fails
the snapshot before writing out the new index file. This ensures that you cannot end up in this
situation where the index file has duplicate names and cannot be read anymore .
Relates to #28906
The suggest stats were folded into the search stats as part of the
indices stats API in 5.0.0. However, the suggest metric remained as a
synonym for the search metric for BWC reasons. This commit deprecates
usage of the suggest metric on the indices stats API.
Similarly, due to the changes to fold the suggest stats into the search
stats, requesting the suggest index metric on the indices metric on the
nodes stats API has produced an empty object as the response since
5.0.0. This commit deprecates this index metric on the indices metric on
the nodes stats API.
This commit implements the ability to remove values from a Cache using
the values iterator. This brings the values iterator in line with the
keys iterator and adds support for removing items in the cache that are
not easily found by the key used for the cache.
Previously we did not put an indexing to a version map if that map does
not require safe access but removed the existing delete tombstone only
if assertion enabled. In #29585, we removed the side-effect caused by
assertion then this test started failing. This failure can be explained
as follows:
- Step 1: Index a doc then delete that doc
- Step 2: The version map can switch to unsafe mode because of
concurrent refreshes (implicitly called by flushes)
- Step 3: Index a document - the version map won't add this version
value and won't prune the tombstone (previously it did)
- Step 4: Delete a document - this will return NOT_FOUND instead of
DELETED because of the stale delete tombstone
This failure is actually fixed by #29619 in which we never leave stale
delete tombstones
Closes#29626
Today the VersionMap does not clean up a stale delete tombstone if it
does not require safe access. However, in a very rare situation due to
concurrent refreshes, the safe-access flag may be flipped over then an
engine accidentally consult that stale delete tombstone.
This commit ensures to never leave stale delete tombstones in a version
map by always pruning delete tombstones when putting a new index entry
regardless of the value of the safe-access flag.
The name of the bulk thread pool was renamed to "write" with "bulk" as a
fallback name. This change was made in 6.x for BWC reasons yet in 7.0.0
we are removing this fallback. This commit removes this fallback for the
write thread pool.
Today when a version map does not require safe access, we will skip that
document. However, if the assertion is enabled, we remove the delete
tombstone of that document if existed. This side-effect may accidentally
hide bugs in which stale delete tombstone can be accessed.
This change ensures putAssertionMap not modify the tombstone maps.
This commit renames the bulk thread pool to the write thread pool. This
is to better reflect the fact that the underlying thread pool is used to
execute any document write request (single-document index/delete/update
requests, and bulk requests).
With this change, we add support for fallback settings
thread_pool.bulk.* which will be supported until 7.0.0.
We also add a system property so that the display name of the thread
pool remains as "bulk" if needed to avoid breaking users.
Now that single-document indexing requests are executed on the bulk
thread pool the index thread pool is no longer needed. This commit
removes this thread pool from Elasticsearch.
Binary doc values are retrieved during the DocValueFetchSubPhase through an instance of ScriptDocValues.
Since 6.0 ScriptDocValues instances are not allowed to reuse the object that they return (https://github.com/elastic/elasticsearch/issues/26775) but BinaryScriptDocValues doesn't follow this restriction and reuses instances of BytesRefBuilder among different documents.
This results in `field` values assigned to the wrong document in the response.
This commit fixes this issue by recreating the BytesRef for each value that needs to be returned.
Fixes#29565
When comparing doubles, fixed epsilons can fail because the absolute
difference in values may be quite large, even though the relative
difference is tiny (e.g. with two very large numbers).
Instead, we can scale epsilon by the absolute value of the expected
value. This means we are looking for a diff that is epsilon-percent
away from the value, rather than just epsilon.
This is basically checking the relative error using junit's assertEqual.
Closes#29456, unmutes the test
As part of adding support for new API to the high-level REST client,
we added support for the `flat_settings` parameter to some of our
request classes. We added documentation that such flag is only ever
read by the high-level REST client, but the truth is that it doesn't
do anything given that settings are always parsed back into a `Settings`
object, no matter whether they are returned in a flat format or not.
It was a mistake to add support for this flag in the context of the
high-level REST client, hence this commit removes it.
This refactors MapperService so that it wraps a single `DocumentMapper` rather
than a `Map<String, DocumentMapper>`. We will need follow-ups since I haven't
fixed most APIs that still expose collections of types of mappers, but this is
a start...
Today the translog of an engine is exposed and can be accessed directly.
While this exposure offers much flexibility, it also causes these troubles:
- Inconsistent behavior between translog method and engine method.
For example, rolling a translog generation via an engine also trims
unreferenced files, but translog's method does not.
- An engine does not get notified when critical errors happen in translog
as the access is direct.
This change isolates translog of an engine and enforces all accesses to
translog via the engine.
The index thread pool is no longer needed as its primary use-case for
single-document indexing requests has been relieved now that
single-document indexing requests are converted to bulk indexing
requests (with a single document payload).
With the move long ago to execute all single-document indexing requests
as bulk indexing request, the method
PipelineExecutionService#executeIndexRequest is unused and will never be
used in production code. This commit removes this method and cuts over
all tests to use PipelineExecutionService#executeBulkRequest.
CRUD: Parsing changes for UpdateRequest (#29293)
Use `ObjectParser` to parse `UpdateRequest` so we reject unknown fields
and drop support for the `_fields` parameter because it was deprecated
in 5.x.
Today when reading an operation from the current generation fails
tragically we attempt to close the translog. However, by invoking close
before releasing the read lock we end up in self-deadlock because
closing tries to acquire the write lock and the read lock can not be
upgraded to a write lock. To avoid this, we move the close invocation
outside of the try-with-resources that acquired the read lock. As an
extra guard against this, we document the problem and add an assertion
that we are not trying to invoke close while holding the read lock.
This change adds a client that is connected to a remote cluster.
This allows plugins and internal structures to invoke actions on
remote clusters just like a if it's a local cluster. The remote
cluster must be configured via the cross cluster search infrastructure.
This adds 2 testcases that test if a shard goes idle
pending (uncommitted) segments are committed and unreferenced
files will be freed.
Relates to #29482
Control max size and count of warning headers
Add a static persistent cluster level setting
"http.max_warning_header_count" to control the maximum number of
warning headers in client HTTP responses.
Defaults to unbounded.
Add a static persistent cluster level setting
"http.max_warning_header_size" to control the maximum total size of
warning headers in client HTTP responses.
Defaults to unbounded.
With every warning header that exceeds these limits,
a message will be logged in the main ES log,
and any more warning headers for this response will be
ignored.
This change adds the current primary term to the header of the current
translog file. Having a term in a translog header is a prerequisite step
that allows us to trim translog operations given the max valid seq# for
that term.
This commit also updates tests to conform the primary term invariant
which guarantees that all translog operations in a translog file have
its terms at most the term stored in the translog header.
This commit moves the `TimeValue` class into the elasticsearch-core project.
This allows us to use this class in many of our other projects without relying
on the entire `server` jar.
Relates to #28504
* Decouple TimeValue from Elasticsearch server classes
This commit decouples the `TimeValue` class from the other server classes. This
is in preperation to move `TimeValue` into the `elasticsearch-core` jar,
allowing us to use it from projects that cannot depend on the elasticsearch-core
library.
Relates to #28504
Currently, a flush stats contains only the total flush which is the sum
of manual flush (via API) and periodic flush (async triggered when the
uncommitted translog size is exceeded the flush threshold). Sometimes,
it's useful to know these two numbers independently. This commit tracks
and returns a periodic flush count in a flush stats.
This change validates that the `_search` request does not have trailing
tokens after the main object and fails the request with a parsing exception otherwise.
Closes#28995
Some features have been deprecated since `6.0` like the `_parent` field or the
ability to have multiple types per index. This allows to remove quite some
code, which in-turn will hopefully make it easier to proceed with the removal
of types.
#29409 removed the nearlyEquals() double comparison snippet, which
makes these tests very flaky because they can generate very large or
very small doubles which don't work well with absolute error comparison.
We need to either refactor these tests to guarantee they stay in a small
range (which could be difficult due to holt/holt-winters) or re-implement
the more robust double comparison.
Tracking issue: #29456
* Remove copy-pasted code
We had two instances of copy-pasted code with a bad license from
another website. The code was doing something rather simple, and
that functionality already exists within junit. This PR simply leverages
the junit functionality.
`BaseRandomBinaryDocValuesRangeQueryTestCase.testRandomBig` should only run with
nightly tests. It doesn't make sense to make it part of every test run.
`UUIDTests` had a slow test for compression, which I made a bit faster by
decreasing the number of indexed docs.
This commit fixes the formatting of the values in the composite
aggregation response. `date` fields should return timestamp as longs
when used in a `terms` source and `ip` fields should always be formatted as strings.
This commit also fixes the parsing of the `after` key for these field types.
Finally, this commit disables the index optimization for the `ip` field and any source that provides a `missing` value.
* Move Streams.copy into elasticsearch-core and make a multi-release jar
This moves the method `Streams.copy(InputStream in, OutputStream out)` into the
`elasticsearch-core` project (inside the `o.e.core.internal.io` package). It
also makes this class into a multi-release class where the Java 9 equivalent
uses `InputStream#transferTo`.
This is a followup from
https://github.com/elastic/elasticsearch/pull/29300#discussion_r178147495
* Move ObjectParser into the x-content lib
This moves `ObjectParser`, `AbstractObjectParser`, and
`ConstructingObjectParser` into the libs/x-content dependency. This decoupling
allows them to be used for parsing for projects that don't want to depend on the
entire Elasticsearch jar.
Relates to #28504
* Fixes query_string query equals timezone check
This change fixes a bug where two `QueryStringQueryBuilder`s were found
to be equal if they had the same timezone set even if the query string
in the builders were different
Closes#29403
* Adds mutate function to QueryStringQueryBuilderTests
* iter
Fixes instances of
- Equals methods without type check
- Equals methods where the field of `this` was compared to the same
field of `this` instead of the `that` object that is compared to
Since #26542 the NodeVersionAllocationDecider tries to explain its NO decisions
as follows:
... may not support codecs or postings formats for a newer Lucene version
However, this message often appears during a rolling upgrade, and experience
has shown that it seems to cause more confusion and worry than it needs to.
This change fixes that by removing the explanation again, reducing the message
to a statement of fact about the respective nodes' versions.
Additionally, the same wording was used for version incompatibilities when
allocating a primary (vs its previous location) and a replica (vs its primary).
This change separates these two cases so they can have separate, clearer
wording.
Fixes#29228
This change fixes the handling of the `quote_field_suffix` option on `query_string`
query. The expansion was not applied to default fields query.
Closes#29324
Today when you input a byte size setting that is out of bounds for the
setting, you get an error message that indicates the maximum value of
the setting. The problem is that because we use ByteSize#toString, we
end up with a representation of the value that does not really tell you
what the bound is. For example, if the bound is 2^31 - 1 bytes, the
output would be 1.9gb which does not really tell you want the limit as
there are many byte size values that we format to the same 1.9gb with
ByteSize#toString. We have a method ByteSize#getStringRep that uses the
input units to the value as the output units for the string
representation, so we end up with no loss if we use this to report the
bound. This commit does this.