The parent filter for nested sort should always match **all** parents regardless
of the child queries. It is used to find the boundaries of a single parent and we use
the child query to match all the filters set in the nested tree so there is no need to
repeat the nested filters.
With this change we ensure that we build bitset filters
only to find the root docs (or the docs at the level where the sort applies) that can be reused
among queries.
Closes#31554Closes#32130Closes#31783
Co-authored-by: Dominic Bevacqua <bev@treatwell.com>
* Enhance Parent circuit breaker error message
This adds information about either the current real usage (if tracking "real"
memory usage) or the child breaker usages to the exception message when the
parent circuit breaker trips.
The messages now look like:
```
[parent] Data too large, data for [my_request] would be [211288064/201.5mb], which is larger than the limit of [209715200/200mb], usages [request=157286400/150mb, fielddata=54001664/51.5mb, in_flight_requests=0/0b, accounting=0/0b]
```
Or when tracking real memory usage:
```
[parent] Data too large, data for [request] would be [251/251b], which is larger than the limit of [200/200b], real usage: [181/181b], new bytes reserved: [70/70b]
```
* Only call currentMemoryUsage once by returning structured object
Normally translog operations will not be replayed on the primary.
Following engine is an exception where we replay translog on both
primary and replica as a non-primary strategy. Even though we won't use
the version_type in the following engine, we still need to pass a valid
value for the primary operation in order not to trip assertions in an
engine.
This commit passes version_type EXTERNAL for translog operation if its
origin is primary.
Relates #31945
Resolving wildcards in aliases expression is challenging as we may end
up with no aliases to replace the original expression with, but if we
replace with an empty array that means _all which is quite the opposite.
Now that we support and serialize the original requested aliases,
whenever aliases are replaced we will be able to know what was
initially requested. `MetaData#findAliases` can then be updated to not
return anything in case it gets empty aliases, but the original aliases
were not empty. That means that empty aliases are interpreted as _all
only if they were originally requested that way.
Relates to #31516
* master:
Painless: Simplify Naming in Lookup Package (#32177)
Handle missing values in painless (#32207)
add support for write index resolution when creating/updating documents (#31520)
ECS Task IAM profile credentials ignored in repository-s3 plugin (#31864)
Remove indication of future multi-homing support (#32187)
Rest test - allow for snapshots to take 0 milliseconds
Make x-pack-core generate a pom file
Rest HL client: Add put watch action (#32026)
Build: Remove pom generation for plugin zip files (#32180)
Fix comments causing errors with Java 11
Fix rollup on date fields that don't support epoch_millis (#31890)
Detect and prevent configuration that triggers a Gradle bug (#31912)
[test] port linux package packaging tests (#31943)
Revert "Introduce a Hashing Processor (#31087)" (#32178)
Remove empty @return from JavaDoc
Adjust SSLDriver behavior for JDK11 changes (#32145)
[test] use randomized runner in packaging tests (#32109)
Add support for field aliases. (#32172)
Painless: Fix caching bug and clean up addPainlessClass. (#32142)
Call setReferences() on custom referring tokenfilters in _analyze (#32157)
Fix BwC Tests looking for UUID Pre 6.4 (#32158)
Improve docs for search preferences (#32159)
use before instead of onOrBefore
Add more contexts to painless execute api (#30511)
Add EC2 credential test for repository-s3 (#31918)
A replica can be promoted and started in one cluster state update (#32042)
Fix Java 11 javadoc compile problem
Fix CP for namingConventions when gradle home has spaces (#31914)
Fix `range` queries on `_type` field for singe type indices (#31756)
[DOCS] Update TLS on Docker for 6.3 (#32114)
ESIndexLevelReplicationTestCase doesn't support replicated failures but it's good to know what they are
Remove versionType from translog (#31945)
Switch distribution to new style Requests (#30595)
Build: Skip jar tests if jar disabled
Painless: Add PainlessClassBuilder (#32141)
Build: Make additional test deps of check (#32015)
Disable C2 from using AVX-512 on JDK 10 (#32138)
Build: Move shadow customizations into common code (#32014)
Painless: Fix Bug with Duplicate PainlessClasses (#32110)
Remove empty @param from Javadoc
Re-disable packaging tests on suse boxes
Docs: Fix missing example script quote (#32010)
[ML] Wait for aliases in multi-node tests (#32086)
[ML] Move analyzer dependencies out of categorization config (#32123)
Ensure to release translog snapshot in primary-replica resync (#32045)
Handle TokenizerFactory TODOs (#32063)
Relax TermVectors API to work with textual fields other than TextFieldType (#31915)
Updates the build to gradle 4.9 (#32087)
Mute :qa:mixed-cluster indices.stats/10_index/Index - all’
Check that client methods match API defined in the REST spec (#31825)
Enable testing in FIPS140 JVM (#31666)
Fix put mappings java API documentation (#31955)
Add exclusion option to `keep_types` token filter (#32012)
[Test] Modify assert statement for ssl handshake (#32072)
Throw an exception for doc['field'].value
if this document is missing a value for the field.
After deprecation changes have been backported to 6.x,
make this a default behaviour in 7.0
Closes#29286
Now write operations like Index, Delete, Update rely on the write-index associated with
an alias to operate against. This means writes will be accepted even when an alias points to multiple indices, so long as one is the write index. Routing values will be used from the AliasMetaData for the alias in the write-index. All read operations are left untouched.
* Add basic support for field aliases in index mappings. (#31287)
* Allow for aliases when fetching stored fields. (#31411)
* Add tests around accessing field aliases in scripts. (#31417)
* Add documentation around field aliases. (#31538)
* Add validation for field alias mappings. (#31518)
* Return both concrete fields and aliases in DocumentFieldMappers#getMapper. (#31671)
* Make sure that field-level security is enforced when using field aliases. (#31807)
* Add more comprehensive tests for field aliases in queries + aggregations. (#31565)
* Remove the deprecated method DocumentFieldMappers#getFieldMapper. (#32148)
When building custom tokenfilters without an index in the _analyze endpoint,
we need to ensure that referring filters are correctly built by calling
their #setReferences() method
Fixes#32154
When a replica is fully recovered (i.e., in `POST_RECOVERY` state) we send a request to the master
to start the shard. The master changes the state of the replica and publishes a cluster state to that
effect. In certain cases, that cluster state can be processed on the node hosting the replica
*together* with a cluster state that promotes that, now started, replica to a primary. This can
happen due to cluster state batched processing or if the master died after having committed the
cluster state that starts the shard but before publishing it to the node with the replica. If the master
also held the primary shard, the new master node will remove the primary (as it failed) and will also
immediately promote the replica (thinking it is started).
Sadly our code in IndexShard didn't allow for this which caused [assertions](13917162ad/server/src/main/java/org/elasticsearch/index/seqno/ReplicationTracker.java (L482)) to be tripped in some of our tests runs.
With the introduction of single types in 6.x, the `_type` field is no longer
indexed, which leads to certain queries that were working before throw errors
now. One such query is the `range` query, that, if performed on a single typer
index, currently throws an IAE since the field is not indexed.
This change adds special treatment for this case in the TypeFieldMapper,
comparing the range queries lower and upper bound to the one existing type and
either returns a MatchAllDocs or a MatchNoDocs query.
Relates to #31632Closes#31476
With the introduction of sequence number, we no longer use versionType to
resolve out of order collision in replication and recovery requests.
This PR removes removes the versionType from translog. We can only remove
it in 7.0 because it is still required in a mixed cluster between 6.x and 5.x.
This commit moves additional unit test runners from being dependencies
of the test task to dependencies of check. Without this change,
reproduce lines are incorrect due to the additional test runner not
matching any of the reproduce class/method info.
closes#31964
Previously we create a translog snapshot inside the resync method,
and that snapshot will be closed by the resync listener. However, if
the resync method throws an exception before the resync listener
is initialized, the translog snapshot won't be released.
Closes#32030
Ensure our tests can run in a FIPS JVM
JKS keystores cannot be used in a FIPS JVM as attempting to use one
in order to init a KeyManagerFactory or a TrustManagerFactory is not
allowed.( JKS keystore algorithms for private key encryption are not
FIPS 140 approved)
This commit replaces JKS keystores in our tests with the
corresponding PEM encoded key and certificates both for key and trust
configurations.
Whenever it's not possible to refactor the test, i.e. when we are
testing that we can load a JKS keystore, etc. we attempt to
mute the test when we are running in FIPS 140 JVM. Testing for the
JVM is naive and is based on the name of the security provider as
we would control the testing infrastrtucture and so this would be
reliable enough.
Other cases of tests being muted are the ones that involve custom
TrustStoreManagers or KeyStoreManagers, null TLS Ciphers and the
SAMLAuthneticator class as we cannot sign XML documents in the
way we were doing. SAMLAuthenticator tests in a FIPS JVM can be
reenabled with precomputed and signed SAML messages at a later stage.
IT will be covered in a subsequent PR
The current docs of the put-mapping Java API is currently broken. It its current
form, it creates an index and uses the whole mapping definition given as a JSON
string as the type name. Since we didn't check the index created in the
IndicesDocumentationIT so far this went unnoticed.
This change adds test to catch this error to the documentation test, changes the
documentation so it works correctly now and adds an input validation to
PutMappingRequest#buildFromSimplifiedDef() which was used internally to reject
calls where no mapping definition is given.
Closes#31906
* es/master:
Add Index UUID to `/_stats` Response (#31871)
Painless: Move and Rename Several Methods in the lookup package (#32105)
Bypass highlight query terms extraction on empty fields (#32090)
Switch non-x-pack to new style requests (#32106)
[Rollup] Add new capabilities endpoint for concrete rollup indices (#30401)
Revert "[test] disable packaging tests for suse boxes"
SQL: allow LEFT and RIGHT as function names (#32066)
DOCS: put LIMIT 10 to the SQL query (#32065)
[test] turn on host io cache for opensuse (#32053)
Tweaked Elasticsearch Service links for SEO
Dealing with empty fields in the highlight phase can
slow down the query because the query terms extraction is done independently
on each field. This change shortcuts the highlighting performed by the unified highlighter
for fields that are not present in the document. In such cases there is nothing to higlight so
we don't need to visit the query to build the highligh builder.
* es/master: (21 commits)
Tweaked Elasticsearch Service links for SEO
Watcher: Store username on watch execution (#31873)
Use correct formatting for links (#29460)
Painless: Separate PainlessLookup into PainlessLookup and PainlessLookupBuilder (#32054)
Scripting: Remove dead code from painless module (#32064)
[Rollup] Replace RollupIT with a ESRestTestCase version (#31977)
[TEST] Consistent algorithm usage (#32077)
[Rollup] Fix duplicate field names in test (#32075)
Ensure only parent breaker trips in unit test
Unmute field collapsing rest tests
Fix BWC check after backport
[Tests] Fix failure due to changes exception message (#32036)
Remove unused params from SSource and Walker (#31935)
[Test] Mute MlJobIT#testDeleteJobAfterMissingAliases
Turn off real-mem breaker in REST tests
Turn off real-mem breaker in single node tests
Fix broken OpenLDAP Vagrant QA test
Cleanup Duplication in `PainlessScriptEngine` (#31991)
SCRIPTING: Remove unused MultiSearchTemplateRequestBuilder (#32049)
Fix compile issues introduced by merge (#32058)
...
With this commit we raise the limit of the child circuit breaker used in
the unit test for the circuit breaker service so it is high enough to trip
only the parent circuit breaker. The previous limit was 300 bytes but
theoretically (considering overhead) we could reach 346 bytes. Thus any
value larger than 300 bytes could trip the child circuit breaker leading
to spurious failures.
Relates #31767
* Replace Ingest ScriptContext with Custom Interface
* Make org.elasticsearch.ingest.common.ScriptProcessorTests#testScripting more precise
* Don't mock script factory in ScriptProcessorTests
* Adjust mock script plugin in IT for new API
Make SnapshotInfo and CreateSnapshotResponse parsers lenient for backwards compatibility. Remove extraneous fields from CreateSnapshotRequest toXContent.
* Adds a new auto-interval date histogram
This change adds a new type of histogram aggregation called `auto_date_histogram` where you can specify the target number of buckets you require and it will find an appropriate interval for the returned buckets. The aggregation works by first collecting documents in buckets at second interval, when it has created more than the target number of buckets it merges these buckets into minute interval bucket and continues collecting until it reaches the target number of buckets again. It will keep merging buckets when it exceeds the target until either collection is finished or the highest interval (currently years) is reached. A similar process happens at reduce time.
This aggregation intentionally does not support min_doc_count, offest and extended_bounds to keep the already complex logic from becoming more complex. The aggregation accepts sub-aggregations but will always operate in `breadth_first` mode deferring the computation of sub-aggregations until the final buckets from the shard are known. min_doc_count is effectively hard-coded to zero meaning that we will insert empty buckets where necessary.
Closes#9572
* Adds documentation
* Added sub aggregator test
* Fixes failing docs test
* Brings branch up to date with master changes
* trying to get tests to pass again
* Fixes multiBucketConsumer accounting
* Collects more buckets than needed on shards
This gives us more options at reduce time in terms of how we do the
final merge of the buckeets to produce the final result
* Revert "Collects more buckets than needed on shards"
This reverts commit 993c782d117892af9a3c86a51921cdee630a3ac5.
* Adds ability to merge within a rounding
* Fixes nonn-timezone doc test failure
* Fix time zone tests
* iterates on tests
* Adds test case and documentation changes
Added some notes in the documentation about the intervals that can bbe
returned.
Also added a test case that utilises the merging of conseecutive buckets
* Fixes performance bug
The bug meant that getAppropriate rounding look a huge amount of time
if the range of the data was large but also sparsely populated. In
these situations the rounding would be very low so iterating through
the rounding values from the min key to the max keey look a long time
(~120 seconds in one test).
The solution is to add a rough estimate first which chooses the
rounding based just on the long values of the min and max keeys alone
but selects the rounding one lower than the one it thinks is
appropriate so the accurate method can choose the final rounding taking
into account the fact that intervals are not always fixed length.
Thee commit also adds more tests
* Changes to only do complex reduction on final reduce
* merge latest with master
* correct tests and add a new test case for 10k buckets
* refactor to perform bucket number check in innerBuild
* correctly derive bucket setting, update tests to increase bucket threshold
* fix checkstyle
* address code review comments
* add documentation for default buckets
* fix typo
Because this is a static method on a public API, and one that we encourage
plugin authors to use, the method with the typo is deprecated in 6.x
rather than just renamed.
With this commit we introduce a new circuit-breaking strategy to the parent
circuit breaker. Contrary to the current implementation which only accounts for
memory reserved via child circuit breakers, the new strategy measures real heap
memory usage at the time of reservation. This allows us to be much more
aggressive with the circuit breaker limit so we bump it to 95% by default. The
new strategy is turned on by default and can be controlled with the new cluster
setting `indices.breaker.total.userealmemory`.
Note that we turn it off for all integration tests with an internal test cluster
because it leads to spurious test failures which are of no value (we cannot
fully control heap memory usage in tests). All REST tests, however, will make
use of the real memory circuit breaker.
Relates #31767
* master:
[TEST] Mute SlackMessageTests.testTemplateRender
Docs: Explain closing the high level client
[ML] Re-enable memory limit integration tests (#31328)
[test] disable packaging tests for suse boxes
Add nio transport to security plugin (#31942)
XContentTests : Insert random fields at random positions (#30867)
Force execution of fetch tasks (#31974)
Fix unreachable error condition in AmazonS3Fixture (#32005)
Tests: Fix SearchFieldsIT.testDocValueFields (#31995)
Add Expected Reciprocal Rank metric (#31891)
[ML] Get ForecastRequestStats doc in RestoreModelSnapshotIT (#31973)
SQL: Add support for single parameter text manipulating functions (#31874)
[ML] Ensure immutability of MlMetadata (#31957)
Tests: Mute SearchFieldsIT.testDocValueFields()
muted tests due to #31940
Work around reported problem in eclipse (#31960)
Move build integration tests out of :buildSrc project (#31961)
Tests: Remove use of joda time in some tests (#31922)
[Test] Reactive 3rd party tests on CI (#31919)
SQL: Support for escape sequences (#31884)
SQL: HAVING clause should accept only aggregates (#31872)
Docs: fix typo in datehistogram (#31972)
Switch url repository rest tests to new style requests (#31944)
Switch reindex tests to new style requests (#31941)
Docs: Added note about cloud service to installation and getting started
[DOCS] Removes alternative docker pull example (#31934)
Add Snapshots Status API to High Level Rest Client (#31515)
ingest: date_index_name processor template resolution (#31841)
Test: fix null failure in watcher test (#31968)
Switch test framework to new style requests (#31939)
Switch low level rest tests to new style Requests (#31938)
Switch high level rest tests to new style requests (#31937)
[ML] Mute test failing due to Java 11 date time format parsing bug (#31899)
[TEST] Mute SlackMessageTests.testTemplateRender
Fix assertIngestDocument wrongfully passing (#31913)
Remove unused reference to filePermissionsCache (#31923)
rolling upgrade should use a replica to prevent relocations while running a scroll
HLREST: Bundle the x-pack protocol project (#31904)
Increase logging level for testStressMaybeFlush
Added lenient flag for synonym token filter (#31484)
[X-Pack] Beats centralized management: security role + licensing (#30520)
HLRest: Move xPackInfo() to xPack().info() (#31905)
Docs: add security delete role to api call table (#31907)
[test] port archive distribution packaging tests (#31314)
Watcher: Slack message empty text (#31596)
[ML] Mute failing DetectionRulesIT.testCondition() test
Fix broken NaN check in MovingFunctions#stdDev() (#31888)
Date: Add DateFormatters class that uses java.time (#31856)
[ML] Switch native QA tests to a 3 node cluster (#31757)
Change trappy float comparison (#31889)
Fix building AD URL from domain name (#31849)
Add opaque_id to audit logging (#31878)
re-enable backcompat tests
add support for is_write_index in put-alias body parsing (#31674)
Improve release notes script (#31833)
[DOCS] Fix broken link in painless example
Handle missing values in painless (#30975)
Remove the ability to index or query context suggestions without context (#31007)
Ingest: Enable Templated Fieldnames in Rename (#31690)
[Docs] Fix typo in the Rollup API Quick Reference (#31855)
Ingest: Add ignore_missing option to RemoveProc (#31693)
Add template config for Beat state to X-Pack Monitoring (#31809)
Watcher: Add ssl.trust email account setting (#31684)
Remove link to oss-MSI (#31844)
Painless: Restructure Definition/Whitelist (#31879)
HLREST: Add x-pack-info API (#31870)
Forces fetch tasks to queue even in the event that the queue is
already full. The reasoning is that fetch tasks may only be follow-up
to query tasks, so the number of additional fetch tasks that may enter
the threadpool is expected to be reasonable.
Closes#29442
This test produced different implementations of joda time classes,
depending on if the data was serialized or not (DateTime vs
MutableDateTime). This now uses a common base class to extract the
milliseconds from the data.
Closes#31992
We introduced these changes in #26708 because for CCR.
However, CCR now uses Lucene instead of translog.
This commit reverts these changes so that we can
minimize differences between the ccr and the master branch.
Relates ##26708
* Added lenient flag for synonym-tokenfilter.
Relates to #30968
* added docs for synonym-graph-tokenfilter
-- Also made lenient final
-- changed from !lenient to lenient == false
* Changes after review (1)
-- Renamed to ElasticsearchSynonymParser
-- Added explanation for ElasticsearchSynonymParser::add method
-- Changed ElasticsearchSynonymParser::logger instance to static
* Added lenient option for WordnetSynonymParser
-- also added more documentation
* Added additional documentation
* Improved documentation
The current shard follow mechanism is complex and does not give us easy ways the have visibility into the system (e.g. why we are falling behind).
The main reason why it is complex is because the current design is highly asynchronous. Also in the current model it is hard to apply backpressure
other than reducing the concurrent reads from the leader shard.
This PR has the following changes:
* Rewrote the shard follow task to coordinate the shard follow mechanism between a leader and follow shard in a single threaded manner.
This allows for better unit testing and makes it easier to add stats.
* All write operations read from the shard changes api should be added to a buffer instead of directly sending it to the bulk shard operations api.
This allows to apply backpressure. In this PR there is a limit that controls how many write ops are allowed in the buffer after which no new reads
will be performed until the number of ops is below that limit.
* The shard changes api includes the current global checkpoint on the leader shard copy. This allows reading to be a more self sufficient process;
instead of relying on a background thread to fetch the leader shard's global checkpoint.
* Reading write operations from the leader shard (via shard changes api) is a separate step then writing the write operations (via bulk shards operations api).
Whereas before a read would immediately result into a write.
* The bulk shard operations api returns the local checkpoint on the follow primary shard, to keep the shard follow task up to date with what has been written.
* Moved the shard follow logic that was previously in ShardFollowTasksExecutor to ShardFollowNodeTask.
* Moved over the changes from #31242 to make shard follow mechanism resilient from node and shard failures.
Relates to #30086