Uses JDK 11's per-socket configuration of TCP keepalive (supported on Linux and Mac), see
https://bugs.openjdk.java.net/browse/JDK-8194298, and exposes these as transport settings.
By default, these options are disabled for now (i.e. fall-back to OS behavior), but we would like
to explore whether we can enable them by default, in particular to force keepalive configurations
that are better tuned for running ES.
Currently in the transport-nio work we connect and bind channels on the
a thread before the channel is registered with a selector. Additionally,
it is at this point that we set all the socket options. This commit
moves these operations onto the event-loop after the channel has been
registered with a selector. It attempts to set the socket options for a
non-server channel at registration time. If that fails, it will attempt
to set the options after the channel is connected. This should fix
#41071.
Currently, we do not handle READ or WRITE events until the channel
connection process is complete. However, the external write queue path
allows a write to be attempted when the conneciton is not complete. This
commit closes the loophole and only queues write operations when the
connection process is not complete.
This change allows the Kotlin compiler to type check methods annotated with the
org.elasticsearch.common.Nullable annotation in Elasticsearch Java
APIs as described in: https://kotlinlang.org/docs/reference/java-interop.html#jsr-305-support.
(cherry picked from commit 0d0485ad9cf10e16b75b862b023b42827c375599)
Switches to more robust way of generating random test geometries by
reusing lucene's GeoTestUtil. Removes duplicate random geometry
generators by moving them to the test framework.
Closes#37278
We often start testing with early access versions of new Java
versions and this have caused minor issues in our tests
(i.e. #43141) because the version string that the JVM reports
cannot be parsed as it ends with the string -ea.
This commit changes how we parse and compare Java versions to
allow correct parsing and comparison of the output of java.version
system property that might include an additional alphanumeric
part after the version numbers
(see [JEP 223[(https://openjdk.java.net/jeps/223)). In short it
handles a version number part, like before, but additionally a
PRE part that matches ([a-zA-Z0-9]+).
It also changes a number of tests that would attempt to parse
java.specification.version in order to get the full version
of Java. java.specification.version only contains the major
version and is thus inappropriate when trying to compare against
a version that might contain a minor, patch or an early access
part. We know parse java.version that can be consistently
parsed.
Resolves#43141
Registering a channel with a selector is a required operation for the
channel to be handled properly. Currently, we mix the registeration with
other setup operations (ip filtering, SSL initiation, etc). However, a
fail to register is fatal. This PR modifies how registeration occurs to
immediately close the channel if it fails.
There are still two clear loopholes for how a user can interact with a
channel even if registration fails. 1. through the exception handler.
2. through the channel accepted callback. These can perhaps be improved
in the future. For now, this PR prevents writes from proceeding if the
channel is not registered.
* Fix Exceptions in EventHandler#postHandling Breaking Select Loop
* We can run into the `write` path for SSL channels when they are not fully registered (if registration fails and a close message is attempted to be written) and thus into NPEs from missing selection keys
* This is a quick fix to quiet down tests, a cleaner solution will be incoming for #44343
* Relates #44343
* Remove Redundant Setting of OP_WRITE Interest
* We shouldn't have to set OP_WRITE interest before running into a partial write. Since setting OP_WRITE is handled by the `eventHandler.postHandling` logic, I think we can simply remove this operation and simplify/remove tests that were testing the setting of the write interest
In some cases we need to parse some XContent that is already parsed into
a map. This is currently happening in handling source in SQL and ingest
processors as well as parsing null_value values in geo mappings. To avoid
re-serializing and parsing the value again or writing another map-based
parser this commit adds an iterator that iterates over a map as if it was
XContent. This makes reusing existing XContent parser on maps possible.
Relates to #43554
By default, we don't check ranges while indexing geo_shapes. As a
result, it is possible to index geoshapes that contain contain
coordinates outside of -90 +90 and -180 +180 ranges. Such geoshapes
will currently break SQL and ML retrieval mechanism. This commit removes
these restriction from the validator is used in SQL and ML retrieval.
* Use atomic boolean to guard wakeups
* Don't trigger wakeups from the select loops thread itself for registering and closing channels
* Don't needlessly queue writes
Co-authored-by: Tim Brooks <tim@uncontended.net>
* Enhance TimeValue.toString() to allow specifying fractional values.
This enhances the `TimeValue` class to allow specifying the number of
truncated fractional decimals when calling `toString()`. The default
remains 1, however, more (or less, such as 0) can be specified to change
the output.
This commit also re-organizes some things in `TimeValue` such as putting
all the class variables near the top of the class, and moving the
constructors to the first methods in the class, in order to follow the
structure of our other code.
* Rename `toString(...)` to `toHumanReadableString(...)`
Currently nio implements ip filtering at the channel context level. This
is kind of a hack as the application logic should be implemented at the
handler level. This commit moves the ip filtering into a channel
handler. This requires adding an indicator to the channel handler to
show when a channel should be closed.
This PR proposes to model big integers as longs (and big decimals as doubles)
in the context of dynamic mappings.
Previously, the dynamic mapping logic did not recognize big integers or
decimals, and would an error of the form "No matching token for number_type
[BIG_INTEGER]" when a dynamic big integer was encountered. It now accepts these
numeric types and interprets them as 'long' and 'double' respectively. This
allows `dynamic_templates` to accept and and remap them as another type such as
`keyword` or `scaled_float`.
Addresses #37846.
Fsyncing directories on Windows is not possible. We always suppressed
this by allowing that an AccessDeniedException is thrown when attemping
to open the directory for reading. Yet, this suppression also allowed
other IOExceptions to be suppressed, and that was a bug (e.g., the
directory not existing, or a filesystem error and reasons that we might
get an access denied there, like genuine permissions issues). This
leniency was previously removed yet it exposed that we were suppressing
this case on Windows. Rather than relying on exceptions for flow control
and continuing to suppress there, we simply return early if attempting
to fsync a directory on Windows (we will not put this burden on the
caller).
Today in the method IOUtils#fsync we ignore IOExceptions when fsyncing a
directory. However, the catch block here is too broad, for example it
would be ignoring IOExceptions when we try to open a non-existant
file. This commit addresses that by scoping the ignored exceptions only
to the invocation of FileChannel#force.
Currently, when the SSLEngine needs to produce handshake or close data,
we must manually call the nonApplicationWrite method. However, this data
is only required when something triggers the need (starting handshake,
reading from the wire, initiating close, etc). As we have a dedicated
outbound buffer, this data can be produced automatically. Additionally,
with this refactoring, we combine handshake and application mode into a
single mode. This is necessary as there are non-application messages that
are sent post handshake in TLS 1.3. Finally, this commit modifies the
SSLDriver tests to test against TLS 1.3.
ObjectParser has two ways of dealing with unknown fields: ignore them entirely,
or throw an error. Sometimes it can be useful instead to gather up these unknown
fields and record them separately, for example as arbitrary entries in a map.
This commit adds the ability to specify an unknown field consumer on an ObjectParser,
called with the field name and parsed value of each unknown field encountered during
parsing. The public API of ObjectParser is largely unchanged, with a single new
constructor method and interface definition.
Refactors the WKT and GeoJSON parsers from an utility class into an
instantiatable objects. This is a preliminary step in
preparation for moving out coordinate validators from Geometry
constructors. This should allow us to make validators plugable.
This change moves the construction of the result
HashMap in Grok.captures() into the branch that
actually needs it.
This probably will not make a measurable difference
for ingest pipelines, but it is beneficial to the
ML find_file_structure endpoint, as it tries out
many Grok patterns that will fail to match.
This commit updates the default ciphers and TLS protocols that are used
when the runtime JDK supports them. New cipher support has been
introduced in JDK 11 and 12 along with performance fixes for AES GCM.
The ciphers are ordered with PFS ciphers being most preferred, then
AEAD ciphers, and finally those with mainstream hardware support. When
available stronger encryption is preferred for a given cipher.
This is a backport of #41385 and #41808. There are known JDK bugs with
TLSv1.3 that have been fixed in various versions. These are:
1. The JDK's bundled HttpsServer will endless loop under JDK11 and JDK
12.0 (Fixed in 12.0.1) based on the way the Apache HttpClient performs
a close (half close).
2. In all versions of JDK 11 and 12, the HttpsServer will endless loop
when certificates are not trusted or another handshake error occurs. An
email has been sent to the openjdk security-dev list and #38646 is open
to track this.
3. In JDK 11.0.2 and prior there is a race condition with session
resumption that leads to handshake errors when multiple concurrent
handshakes are going on between the same client and server. This bug
does not appear when client authentication is in use. This is
JDK-8213202, which was fixed in 11.0.3 and 12.0.
4. In JDK 11.0.2 and prior there is a bug where resumed TLS sessions do
not retain peer certificate information. This is JDK-8212885.
The way these issues are addressed is that the current java version is
checked and used to determine the supported protocols for tests that
provoke these issues.
This is related to #27260. Currently we have a single read buffer that
is no larger than a single TLS packet. This prevents us from reading
multiple TLS packets in a single socket read call. This commit modifies
our TLS work to support reading similar to the plaintext case. The data
will be copied to a (potentially) recycled TLS packet-sized buffer for
interaction with the SSLEngine.
This is related to #27260. Currently there is a setting
http.read_timeout that allows users to define a read timeout for the
http transport. This commit implements support for this functionality
with the transport-nio plugin. The behavior here is that a repeating
task will be scheduled for the interval defined. If there have been
no requests received since the last run and there are no inflight
requests, the channel will be closed.
This is related to #27260. Currently for the SSLDriver we allocate a
dedicated network write buffer and encrypt the data into that buffer one
buffer at a time. This requires constantly switching between encrypting
and flushing. This commit adds a dedicated outbound buffer for SSL
operations that will internally allocate new packet sized buffers as
they are need (for writing encrypted data). This allows us to totally
encrypt an operation before writing it to the network. Eventually it can
be hooked up to buffer recycling.
This commit also backports the following commit:
Handle WRAP ops during SSL read
It is possible that a WRAP operation can occur while decrypting
handshake data in TLS 1.3. The SSLDriver does not currently handle this
well as it does not have access to the outbound buffer during read call.
This commit moves the buffer into the Driver to fix this issue. Data
wrapped during a read call will be queued for writing after the read
call is complete.
This commit refactors GeoHashUtils class into a new Geohash utility class located in the ES geo library. The intent is to not only better control what geo methods are whitelisted for painless scripting but to clean up the geo utility API in general.
Under random seed 4304ED44CB755610 the generated byte pattern causes
BC-FIPS to throw
java.io.IOException: DER length more than 4 bytes: 101
Rather than simply returning an empty list (as it does for most random
values).
Backport of: #40939
This is related to #27260. Currently for the SSLDriver we allocate a
dedicated network write buffer and encrypt the data into that buffer one
buffer at a time. This requires constantly switching between encrypting
and flushing. This commit adds a dedicated outbound buffer for SSL
operations that will internally allocate new packet sized buffers as
they are need (for writing encrypted data). This allows us to totally
encrypt an operation before writing it to the network. Eventually it can
be hooked up to buffer recycling.
hamcrest has some improvements in newer versions, like FileMatchers
that make assertions regarding file exists cleaner. This commit upgrades
to the latest version of hamcrest so we can start using new and improved
matchers.
* Replace usages RandomizedTestingTask with built-in Gradle Test (#40978)
This commit replaces the existing RandomizedTestingTask and supporting code with Gradle's built-in JUnit support via the Test task type. Additionally, the previous workaround to disable all tasks named "test" and create new unit testing tasks named "unitTest" has been removed such that the "test" task now runs unit tests as per the normal Gradle Java plugin conventions.
(cherry picked from commit 323f312bbc829a63056a79ebe45adced5099f6e6)
* Fix forking JVM runner
* Don't bump shadow plugin version
When geo point parsing threw a parse exception, it did not consume
remaining tokens from the parser. This in turn meant that
indexing documents with malformed geo points into mappings with
ignore_malformed=true would fail in some cases, since DocumentParser
expects geo_point parsing to end on the END_OBJECT token.
Related to #17617
Today if you try and insert a very large number like `1e9999999` into a long
field we first construct this number as a `BigDecimal`, convert this to a
`BigInteger` and then reject it because it is out of range. Unfortunately
making such a large `BigInteger` is rather expensive.
We can avoid this expense by performing a (weaker) range check on the
`BigDecimal` representation of incoming `long`s too.
Relates #26137Closes#40323
Recently we have had a number of test issues related to blocking
activity occuring on the io thread. This commit adds a log warning for
when handling event takes a >150 milliseconds. This is implemented
for the MockNioTransport which is the transport used in
ESIntegTestCase.
* Un-mute and fix BuildExamplePluginsIT
There doesn't seem to be anything wrong with the test iteself.
I think the failure were CI performance related, but while it was muted,
some failures managed to sneak in.
Closes#38784
* PR review
* A few warnings could be observed in test logs about `NoSuchElementException` being thrown in `InboundChannelBuffer#sliceBuffersTo`.
These were the result of calls to this method after the relevant channel and hence the buffer was closed already as a result of a failed IO operation.
* Fixed by adding the necessary guard statements to break out in these cases. I don't think there is a need here to do any additional error handling since `eventHandler.postHandling(channelContext);` at the end of the `processKey`
call in the main selection loop handles closing channels and invoking callbacks for writes that failed to go through already.
The build script file for the `:libs:elasticsearch-ssl-config` and
`:libs:ssl-config-tests` projects was incorrectly named `eclipse.build.gradle`
while the expected name was `eclipse-build.gradle`.
In addition, this also adds a missing snippet in the `build.gradle` conf file,
that fixes the project setup for Eclipse users.
This commit enables the use of TLSv1.3 with security by enabling us to
properly map `TLSv1.3` in the supported protocols setting to the
algorithm for a SSLContext. Additionally, we also enable TLSv1.3 by
default on JDKs that support it.
An issue was uncovered with the MockWebServer when TLSv1.3 is used that
ultimately winds up in an endless loop when the client does not trust
the server's certificate. Due to this, SSLConfigurationReloaderTests
has been pinned to TLSv1.2.
Closes#32276
Scheduler.schedule(...) would previously assume that caller handles
exception by calling get() on the returned ScheduledFuture.
schedule() now returns a ScheduledCancellable that no longer gives
access to the exception. Instead, any exception thrown out of a
scheduled Runnable is logged as a warning.
This is a continuation of #28667, #36137 and also fixes#37708.
Replaces intermediate geo objects built by ShapeBuilders with
objects from the libs/geo hierarchy. This should allow us to build
all geo functionality around a single hierarchy.
Follow up for #35320
* Remove empty statements
There are a couple of instances of undocumented empty statements all across the
code base. While they are mostly harmless, they make the code hard to read and
are potentially error-prone. Removing most of these instances and marking blocks
that look empty by intention as such.
* Change test, slightly more verbose but less confusing
The default value for ssl.supported_protocols no longer includes TLSv1
as this is an old protocol with known security issues.
Administrators can enable TLSv1.0 support by configuring the
appropriate `ssl.supported_protocols` setting, for example:
xpack.security.http.ssl.supported_protocols: ["TLSv1.2","TLSv1.1","TLSv1"]
Relates: #36021
* Testing conventions now checks for tests in main
This is the last outstanding feature of the old NamingConventionsTask,
so time to remove it.
* PR review
This commit introduces a NetworkMessage class. This class has two
subclasses - InboundMessage and OutboundMessage. These messages can
be serialized and deserialized independent of the transport. This allows
more granular testing. Additionally, the serialization mechanism is now
a simple Supplier. This builds the framework to eventually move the
serialization of transport messages to the network thread. This is the
one serialization component that is not currently performed on the
network thread (transport deserialization and http serialization and
deserialization are all on the network thread).
Currently we create dedicated network threads for both the http and
transport implementations. Since these these threads should never
perform blocking operations, these threads could be shared. This commit
modifies the nio-transport to have 0 http workers be default. If the
default configs are used, this will cause the http transport to be run
on the transport worker threads. The http worker setting will still exist
in case the user would like to configure dedicated workers. Additionally,
this commmit deletes dedicated acceptor threads. We have never had these
for the netty transport and they can be added back if a need is
determined in the future.
This introduces a new ssl-config library that can parse
and validate SSL/TLS settings and files.
It supports the standard configuration settings as used in the
Elastic Stack such as "ssl.verification_mode" and
"ssl.certificate_authorities" as well as all file formats used
in other parts of Elasticsearch security (such as PEM, JKS,
PKCS#12, PKCS#8, et al).
Adds a set of geo classes to represent geo data in the JDBC driver and
to be used as an intermediate format to pass geo shapes for indexing
and query generation in #35320.
Relates to #35767 and #35320
Currently we read and write 64KB at a time in the nio libraries. As a
single byte buffer per event loop thread does not consume much memory,
there is little reason to not increase it further. This commit increases
the buffer to 256KB but still limits a single write to 64KB. The write
limit could be increased, but too high of a write limit will lead to
copying more data (if all the data is not flushed and needs to be copied
on the next call). This is something to explore in the future.
Closing a channel using TLS/SSL requires reading and writing a
CLOSE_NOTIFY message (for pre-1.3 TLS versions). Many implementations do
not actually send the CLOSE_NOTIFY message, which means we are depending
on the TCP close from the other side to ensure channels are closed. In
case there is an issue with this, we need a timeout. This commit adds a
timeout to the channel close process for TLS secured channels.
As part of this change, we need a timer service. We could use the
generic Elasticsearch timeout threadpool. However, it would be nice to
have a local to the nio event loop timer service dedicated to network needs. In
the future this service could support read timeouts, connect timeouts,
request timeouts, etc. This commit adds a basic priority queue backed
service. Since our timeout volume (channel closes) is very low, this
should be fine. However, this can be updated to something more efficient
in the future if needed (timer wheel). Everything being local to the event loop
thread makes the logic simple as no locking or synchronization is necessary.
`PageCacheRecycler` is the class that creates and holds pages of arrays
for various uses. `BigArrays` is just one user of these pages. This
commit moves the constants that define the page sizes for the recycler
to be on the recycler class.
This is related to #27260. In Elasticsearch all of the messages that we
serialize to write to the network are composed of heap bytes. When you
read or write to a nio socket in java, the heap memory you passed down
must be copied to/from direct memory. The JVM internally does some
buffering of the direct memory, however it is essentially unbounded.
This commit introduces a simple mechanism of buffering and copying the
memory in transport-nio. Each network event loop is given a 64kb
DirectByteBuffer. When we go to read we use this buffer and copy the
data after the read. Additionally, when we go to write, we copy the data
to the direct memory before calling write. 64KB is chosen as this is the
default receive buffer size we use for transport-netty4
(NETTY_RECEIVE_PREDICTOR_SIZE).
Since we only have one buffer per thread, we could afford larger.
However, if we the buffer is large and not all of the data is flushed in
a write call, we will do excess copies. This is something we can
explore in the future.
This commit changes the format of the `hits.total` in the search response to be an object with
a `value` and a `relation`. The `value` indicates the number of hits that match the query and the
`relation` indicates whether the number is accurate (in which case the relation is equals to `eq`)
or a lower bound of the total (in which case it is equals to `gte`).
This change also adds a parameter called `rest_total_hits_as_int` that can be used in the
search APIs to opt out from this change (retrieve the total hits as a number in the rest response).
Note that currently all search responses are accurate (`track_total_hits: true`) or they don't contain
`hits.total` (`track_total_hits: true`). We'll add a way to get a lower bound of the total hits in a
follow up (to allow numbers to be passed to `track_total_hits`).
Relates #33028
This commit is related to #32517. It allows an "sni_server_name"
attribute on a DiscoveryNode to be propagated to the server using
the TLS SNI extentsion. Prior to this commit, this functionality
was only support for the netty transport. This commit adds this
functionality to the security nio transport.
Adds an XContent sub parser class that can to wrap another
XContent parser at the beginning of an object and allow skiping
all children in case of the parsing failure. It also uses this
subparser to ignore the rest of the GeoJson shape if the
parsing fails and we need to ignore the geoshape due to the
ignore_malformed flag.
Supersedes #34498Closes#34047
This change adds a `frozen` engine that allows lazily open a directory reader
on a read-only shard. The engine wraps general purpose searchers in a LazyDirectoryReader
that also allows to release and reset the underlying index readers after any and before
secondary search phases.
Relates to #34352
Adds checks for misbehaving parsers. The checks aren't perfect at all but
they are simple and fast enough that we can do them all the time so
they'll catch most badly behaving parsers.
Closes#34351
The testCharsBeginsWith test has a check that a random prefix of length
2 is not the prefix of a char[]. However, there is no check that the
char[] is not randomly generated with the same two characters as the
prefix. This change ensures that the char[] does not begin with the
prefix.
Closes#34765
With this commit we cleanup hand-coded duplicate checks in XContent
parsing. They were necessary previously but since we reconfigured the
underlying parser in #22073 and #22225, these checks are obsolete and
were also ineffective unless an undocumented system property has been
set. As we also remove this escape hatch, we can remove the additional
checks as well.
Closes#22253
Relates #34588
Although we allow to index BigInteger and BigDecimal into a keyword
field, source filtering on these fields would fail
as XContentBuilder was not able to deserialize BigInteger and BigDecimal
to json.
This modifies XContentBuilder to allow to handle BigInteger and
BigDecimal.
Closes#32395
This change cleans up "unused variable" warnings. There are several cases were we
most likely want to suppress the warnings (especially in the client documentation test
where the snippets contain many unused variables). In a lot of cases the unused
variables can just be deleted though.
This change adds a `_source` only snapshot repository that allows to wrap
any existing repository as a _backend_ to snapshot only the `_source` part
including live docs markers. Snapshots taken with the `source` repository
won't include any indices, doc-values or points. The snapshot will be reduced in size and
functionality such that it requires full re-indexing after it's successfully restored.
The restore process will copy the `_source` data locally starts a special shard and engine
to allow `match_all` scrolls and searches. Any other query, or get call will fail with and unsupported operation exception. The restored index is also marked as read-only.
This feature aims mainly for disaster recovery use-cases where snapshot size is
a concern or where time to restore is less of an issue.
**NOTE**: The snapshot produced by this repository is still a valid lucene index. This change doesn't allow for any longer retention policies which is out of scope for this change.