With the upcoming instance bindings, the singular *Binding name isn't descriptive enough
with multiple binding types. This renames the existing *Binding classes to *ClassBinding.
Mechanical change (with some error messages changed from binding to class binding by
hand).
We currently special-case SynonymFilterFactory and SynonymGraphFilterFactory, which need to
know their predecessors in the analysis chain in order to correctly analyze their synonym lists. This
special-casing doesn't work with Referring filter factories, such as the Multiplexer or Conditional
filters. We also have a number of filters (eg the Multiplexer) that will break synonyms when they
appear before them in a chain, because they produce multiple tokens at the same position.
This commit adds two methods to the TokenFilterFactory interface.
* `getChainAwareTokenFilterFactory()` allows a filter factory to rewrite itself against its preceding
filter chain, or to resolve references to other filters. It replaces `ReferringFilterFactory` and
`CustomAnalyzerProvider.checkAndApplySynonymFilter`, and by default returns `this`.
* `getSynonymFilter()` defines whether or not a filter should be applied when building a synonym
list `Analyzer`. By default it returns `true`.
Fixes#33609
* MINOR: Drop Redundant Ctx. Check in ScriptService
* This check is completely redundant, the expression script
engine will throw anyway (and with a similar message) for
those contexts that it cannot compile. Moreover, the update context
is not the only context that is not suported by the expression engine
at this point so handling the update context separately here makes
no sense.
This commit switches the joda time backcompat in scripting to use
augmentation over ZonedDateTime. The augmentation methods provide
compatibility with the missing methods between joda's DateTime and
java's ZonedDateTime. Due to getDayOfWeek returning an enum in the java
API, ZonedDateTime is wrapped so that the method can return int like the
joda time does. The java time api version is renamed to
getDayOfWeekEnum, which will be kept through 7.x for compatibility while
users switch back to getDayOfWeek once joda compatibility is removed.
This allows users to filter out tokens from a TokenStream using painless scripts,
instead of having to write specialised Java code and packaging it up into a plugin.
The commit also refactors the AnalysisPredicateScript.Token class so that it wraps
and makes read-only an AttributeSource.
* LeafCollector.setScorer() now takes a Scorable
* Scorers may not have null Weights
* IndexWriter.getFlushingBytes() reports how much memory is being used by IW threads writing to disk
This change collapses all metrics aggregations classes into a single package `org.elasticsearch.aggregations.metrics`.
It also restricts the visibility of some classes (aggregators and factories) that should not be used outside of the package.
Relates #22868
The main benefit of the upgrade for users is the search optimization for top scored documents when the total hit count is not needed. However this optimization is not activated in this change, there is another issue opened to discuss how it should be integrated smoothly.
Some comments about the change:
* Tests that can produce negative scores have been adapted but we need to forbid them completely: #33309Closes#32899
Historically we have had a ESLoggingHandler in the netty module that
logs low-level connection operations. This class just extends the netty
logging handler with some (broken) message deserialization. This commit
fixes this message serialization and moves the class to server.
This new logger logs inbound and outbound messages. Eventually, we
should move other event logging to this class (connect, close, flush).
That way we will have consistent logging regards of which transport is
loaded.
Resolves#27306 on master. Older branches will need a different fix.
This commit is related to #32517. It allows an "server_name"
attribute on a DiscoveryNode to be propagated to the server using
the TLS SNI extentsion. This functionality is only implemented for
the netty security transport.
This allows tokenfilters to be applied selectively, depending on the status of the current token in the tokenstream. The filter takes a scripted predicate, and only applies its subfilter when the predicate returns true.
We can have multiple documents in Lucene with the same seq_no for
parent-child documents (or without rollback). In this case, the usage
"lastSeenSeqNo + 1" is an off-by-one error as it may miss some
documents. This error merely affects the `skippedOperations` contract.
See: https://github.com/elastic/elasticsearch/pull/33222#discussion_r213842257Closes#33318
This PR integrates Lucene soft-deletes(LUCENE-8200) into Elasticsearch.
Highlight works in this PR include:
- Replace hard-deletes by soft-deletes in InternalEngine
- Use _recovery_source if _source is disabled or modified (#31106)
- Soft-deletes retention policy based on the global checkpoint (#30335)
- Read operation history from Lucene instead of translog (#30120)
- Use Lucene history in peer-recovery (#30522)
Relates #30086Closes#29530
---
These works have been done by the whole team; however, these individuals
(lexical order) have significant contribution in coding and reviewing:
Co-authored-by: Adrien Grand <jpountz@gmail.com>
Co-authored-by: Boaz Leskes <b.leskes@gmail.com>
Co-authored-by: Jason Tedor <jason@tedor.me>
Co-authored-by: Martijn van Groningen <martijn.v.groningen@gmail.com>
Co-authored-by: Nhat Nguyen <nhat.nguyen@elastic.co>
Co-authored-by: Simon Willnauer <simonw@apache.org>
This PR integrates Lucene soft-deletes(LUCENE-8200) into Elasticsearch.
Highlight works in this PR include:
- Replace hard-deletes by soft-deletes in InternalEngine
- Use _recovery_source if _source is disabled or modified (#31106)
- Soft-deletes retention policy based on the global checkpoint (#30335)
- Read operation history from Lucene instead of translog (#30120)
- Use Lucene history in peer-recovery (#30522)
Relates #30086Closes#29530
---
These works have been done by the whole team; however, these individuals
(lexical order) have significant contribution in coding and reviewing:
Co-authored-by: Adrien Grand jpountz@gmail.com
Co-authored-by: Boaz Leskes b.leskes@gmail.com
Co-authored-by: Jason Tedor jason@tedor.me
Co-authored-by: Martijn van Groningen martijn.v.groningen@gmail.com
Co-authored-by: Nhat Nguyen nhat.nguyen@elastic.co
Co-authored-by: Simon Willnauer simonw@apache.org
When the change was made to the format for in the whitelist for bindings, parameters from
both the constructor and the method were combined into a single list instead of separate
lists. The check for method parameters was being executed from the start of the combined
list rather than the correct position. The tests for bindings used a constructor and a method
that only used the int types so this was not caught. The test has been changed to also use
a double type and this issue is fixed.
- third party audit detects jar hell with JDK so we disable it
- jdk non portable in forbiddenapis detects classes being used from the
JDK ( for fips ) that are not portable, this is intended so we don't
scan for it on fips.
- different exclusion rules for third party audit on fips
Closes#33179
Trailers (statements following something like an if statement) that don't use brackets currently require a semicolon even if they're the last statement. This is a regression caused by (#29566) and noted by (#33193). This change fixes the regression and adds a test for the broken case.
In #29623 we added `Request` object flavored requests to the low level
REST client and in #30315 we deprecated the old `performRequest`s. In a
long series of PRs I've changed all of the old style requests that I
could find with `grep`. In this PR I change all requests that I could
find by *removing* the deprecated methods. Since this is a non-trivial
change I do not include actually removing the deprecated requests. I'll
do that in a follow up. But this should be the last set of usage
removals before the actual deprecated method removal. Yay!
* ingest: Introduce the dissect processor
The ingest node dissect processor is an alternative to Grok
to split a string based on a pattern. Dissect differs from
Grok such that regular expressions are not used to split the
string.
Dissect can be used to parse a source text field with a
simpler pattern, and is often faster the Grok for basic string
parsing. This processor uses the dissect library which
does most of the work.
We used to set `maxScore` to `0` within `TopDocs` in situations where there is really no score as the size was set to `0` and scores were not even tracked. In such scenarios, `Float.Nan` is more appropriate, which gets converted to `max_score: null` on the REST layer. That's also more consistent with lucene which set `maxScore` to `Float.Nan` when merging empty `TopDocs` (see `TopDocs#merge`).
In our Netty layer we have had to take extra precautions against Netty
catching throwables which prevents them from reaching the uncaught
exception handler. This code has taken on additional uses in NIO layer
and now in the scheduler engine because there are other components in
stack traces that could catch throwables and suppress them from reaching
the uncaught exception handler. This commit is a simple cleanup of the
iterative evolution of this code to refactor all uses into a single
method in ExceptionsHelper.
This is related to #32517. This commit passes the DiscoveryNode to the
initiateChannel method for different Transport implementation. This
will allow additional attributes (besides just the socket address) to be
used when opening channels.
This is a followup to #31886. After that commit the
TransportConnectionListener had to be propogated to both the
Transport and the ConnectionManager. This commit moves that listener
to completely live in the ConnectionManager. The request and response
related methods are moved to a TransportMessageListener. That listener
continues to live in the Transport class.
This removes def from the classes map in PainlessLookup and instead always special
cases it. This prevents potential calls against the def type that shouldn't be made and
forces all cases of def throughout Painless code to be special cased.