Recent simplifications to org.elasticsearch.Version led to "-SNAPSHOT"
no longer being displayed in the node version on startup. Yet, this is
useful information to have. This simple commit adds back the display of
"-SNAPSHOT" in the version on startup if the build is a snapshot build.
Closes #16908
This commit refactors the bootstrap checks into a dedicated class. The
refactoring provides a model for different limits per operating system,
and provides a model for unit tests for individual checks.
Closes #16844
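A minimal sketch, with hypothetical names, of how per-check objects might be modeled so that each check is unit-testable and the set of checks can vary per operating system; this is not the actual Elasticsearch code.

```java
import java.util.Collections;
import java.util.List;

// Hypothetical shapes only; the interface name and the example check are illustrative.
interface StartupCheck {
    boolean failed();        // true if the check fails
    String errorMessage();   // explanation reported for a failed check
}

final class StartupChecks {
    // Each check can be unit tested in isolation, and the list can differ per operating system.
    static List<StartupCheck> checks() {
        StartupCheck heapCheck = new StartupCheck() {
            @Override
            public boolean failed() {
                return Runtime.getRuntime().maxMemory() < 64L * 1024 * 1024; // illustrative floor
            }

            @Override
            public String errorMessage() {
                return "configured heap is below the illustrative 64 MB floor";
            }
        };
        return Collections.singletonList(heapCheck);
    }

    static void enforce() {
        for (StartupCheck check : checks()) {
            if (check.failed()) {
                throw new IllegalStateException(check.errorMessage());
            }
        }
    }
}
```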
In this case we compute elapsed time and `System.nanoTime()` is designed to do just that.
The absolute timing `System.currentTimeMillis()` provides is not needed and the relative timing `System.nanoTime()` provides is likely to be more accurate.
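A small self-contained example of the pattern in question: measuring an elapsed interval with `System.nanoTime()` instead of wall-clock time.

```java
import java.util.concurrent.TimeUnit;

public class ElapsedTimeExample {
    public static void main(String[] args) throws InterruptedException {
        // System.nanoTime() is monotonic, so differences are safe for elapsed-time measurement;
        // System.currentTimeMillis() can jump when the wall clock is adjusted.
        long start = System.nanoTime();
        Thread.sleep(50); // stand-in for the work being timed
        long elapsedMillis = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
        System.out.println("took " + elapsedMillis + "ms");
    }
}
```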
Use 'includeSegmentFileSizes' as the flag name to report disk usage.
Added a test that verifies the reported segment disk usage grows after adding a document.
Documentation: reference the new parameter as part of indices stats.
I've noticed throughout the code that we have a need to remove the boilerplate lifecycle check when starting/rescheduling certain runnables. This provides a simpler implementation to get this functionality without duplicating it.
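A rough sketch of the idea, with hypothetical names (this is not the class added by the commit): the lifecycle check lives in one place, and the runnable only reschedules itself while the owning component is still running.

```java
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical sketch: centralize the "are we still running?" check instead of
// repeating it at every call site that schedules or reschedules a task.
public class LifecycleAwareRunnable implements Runnable {
    private final AtomicBoolean stopped;                // stands in for the component's lifecycle state
    private final ScheduledExecutorService scheduler;
    private final Runnable task;

    public LifecycleAwareRunnable(AtomicBoolean stopped, ScheduledExecutorService scheduler, Runnable task) {
        this.stopped = stopped;
        this.scheduler = scheduler;
        this.task = task;
    }

    @Override
    public void run() {
        if (stopped.get()) {
            return;                                      // the boilerplate check lives here, once
        }
        task.run();
        scheduler.schedule(this, 1, TimeUnit.SECONDS);   // reschedule only while still running
    }
}
```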
The `ingest_took` is separate from `took`, which keeps track of how much time is spent on indexing/deleting/updating.
The `ingest_took` is only visible in the REST response if ingest was enabled for at least one bulk item.
In the case where the publish address is different from all bound addresses and an explicit publish_port
setting is not provided, the publish_port is now selected as the unique port of the bound addresses.
An exception is thrown in case of ambiguity.
Closes #16626
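A hedged sketch of the selection rule described above (names and structure are illustrative, not the actual implementation): use the single port shared by all bound addresses, otherwise fail loudly.

```java
import java.net.InetSocketAddress;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

final class PublishPortResolver {
    // Illustrative only: derive the publish port from the bound addresses when no
    // explicit publish_port setting was provided.
    static int resolvePublishPort(List<InetSocketAddress> boundAddresses) {
        Set<Integer> ports = new TreeSet<>();
        for (InetSocketAddress address : boundAddresses) {
            ports.add(address.getPort());
        }
        if (ports.size() == 1) {
            return ports.iterator().next();
        }
        throw new IllegalStateException("publish_port is ambiguous, bound addresses use ports "
            + ports + "; please set an explicit publish_port");
    }
}
```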
If a node was isolated from the cluster while a delete was happening,
the node will ignore the delete operation when rejoining, as we couldn't
detect whether the new master genuinely deleted the indices or it is a
fresh "reset" master that was started without the old data folder.
We can now be smarter and detect these reset masters, and actually delete
the indices on the node if it's not the case of a reset master.
Note that this new protection doesn't hold if the node was shut down. In
that case its indices will still be imported as dangling indices.
Closes #16825
Closes #11665
If you connect to a client node and call _cat/nodes, and there is at least one other client node in the cluster, the http address cannot be retrieved, thus we get an NPE. This commit prevents an NPE from being thrown. This bug was introduced with #16770
The suffix TransportAction is misleading as it may make one think that it extends TransportAction, but it does not. This class makes accessible the different search operations exposed by SearchService through the transport layer. Also resolved a few compiler warnings in the class itself.
The big win here is catching tests that are incorrectly named and will
be skipped by gradle, providing a false sense of security.
The whole thing takes about 10 seconds on my Macbook Air, not counting
compiling the test classes, which seems worth it. Because this runs as
a gradle task with proper UP-TO-DATE handling it can be skipped if the
tests haven't been changed which should save some time.
I chose to keep this in test:framework rather than a new subproject of
buildSrc because it uses ESIntegTestCase and doesn't introduce any additional
dependencies.
DiscoveryService was a bridge into the discovery universe. This is unneeded and we can just access discovery directly or do things in a different way.
One of those different ways is to not have a dedicated discovery implementation for each of our discovery plugins but rather reuse ZenDiscovery.
UnicastHostProviders are now classified by discovery type, removing unneeded checks on plugins.
Closes #16821
It's useful when you need to have defaults that don't match
SearchSourceBuilder's defaults. You build your SearchSourceBuilder, set the
defaults, and then layer the XContent on top of that.
TransportSearchTypeAction and subclasses are not actually transport actions, but just support classes useful for their inner async actions, which can easily be extracted out so that we get rid of one too many levels of abstraction.
The same pattern can be applied to TransportSearchScrollQueryAndFetchAction & TransportSearchScrollQueryThenFetchAction, which we could remove in favour of keeping only their inner classes named SearchScrollQueryAndFetchAsyncAction and SearchScrollQueryThenFetchAsyncAction.
Removed the org.elasticsearch.action.search.type package and collapsed the remaining classes into the existing org.elasticsearch.action.search package.
Also made the ParsedScrollId, ScrollIdForNode and TransportSearchHelper classes and their methods package private.
Closes #11710
As we rely on active allocation ids persisted in the cluster state to select
the primary shard copy, we can write shard state metadata on the allocated node
as soon as the node knows about receiving this shard. This also ensures that
in case of primary relocation, when the relocation target is marked as started
by the master node, the shard state metadata with the correct allocation id has
already been written on the relocation target. Before this change, shard state
metadata was only written once the node knows it is marked as started. In case
of failures between the master marking the shard as started and the node
receiving and processing this event, the relation between the shard copy on disk
and the cluster state could get lost. This means that manual allocation of
the shard using the reroute command allocate_stale_primary was necessary.
Closes #16625
As part of #10136 we removed the transport action for broadcast deletes in case routing is required but not specified. Bulk api worked differently though and kept on doing the broadcast delete internally in that case. This commit makes sure that delete items are marked as failed in such cases. Also the check has been moved up in the code together with the existing check for the update api, and we now make sure that the exception is the same as the one thrown for single document apis (delete/update).
Note that the failure for the update api contained the wrong optype (the type of the document rather than "update"), that's been fixed too and tested.
Closes #16645
This commit disables the production limits checks on snapshot
builds. This is at a minimum short-term relief for developers that do in
fact bind to external network interfaces, and is possibly a long-term
fix as well. The situation with using the JVM flag MaxFDLimit is far too
complicated.
Closes #16835
At some time in the distant past, Bootstrap could be used by those
embedding elasticsearch. However, it is no longer allowed (now package
private), and so any errors (for example, log4j missing, or any other
exception while initializing) should be exposed directly and cause
elasticsearch to fail to start. This change removes hiding of logging
initialization exceptions.
Removes all our logger wrappers except the wrapper for log4j1.2. If you
depend on Elasticsearch's jar in your application you'll need to declare
log4j 1.2 and/or some bridge to your favorite logger.
We did this to simplify our builds and code. No more commons-logging-like
log implementation sniffing. No more optional dependency hacks in gradle.
We might one day want to use j.u.l instead of log4j. If we do want that
we can recover its wrapper by studying this commit. We didn't go directly
to j.u.l in this commit because that is a bigger change. Our logging
configuration is based on log4j1.2 and people are used to it. So it'd
be a much more fraught breaking change to do that conversion.
It is possible to register multiple settings with complex matchers that could both match
a given key. The behavior when this occurs can lead to issues and depends on the
number of settings that have been registered. In order to identify the setting for a given
key, we iterate over the values in a map to find the first setting that matches the given key,
and the iteration order of a map should not be relied upon.
This commit checks complex settings when adding them and if the keys for these overlap,
an IllegalArgumentException is now thrown.
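A simplified sketch of the overlap check, assuming prefix-style complex settings; the real matchers and registry are more involved, and the names here are illustrative.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch: reject complex (prefix-style) settings whose key spaces overlap,
// so that key lookup never depends on map iteration order.
final class GroupSettingRegistry {
    private final Map<String, Object> complexSettings = new HashMap<>();

    void register(String keyPrefix, Object setting) {
        for (String existing : complexSettings.keySet()) {
            if (existing.startsWith(keyPrefix) || keyPrefix.startsWith(existing)) {
                throw new IllegalArgumentException("complex setting key [" + keyPrefix
                    + "] overlaps already registered setting key [" + existing + "]");
            }
        }
        complexSettings.put(keyPrefix, setting);
    }
}
```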
We should open up the node to the world when it's as ready as possible. At the moment we open up the transport service before the local node has been fully initialized. This causes bugs as some data structures are not fully initialized yet. See for example #16723.
Sadly, we can't just start the TransportService last (as we do with the HTTP server) because the ClusterService needs to know the bound published network address for the local DiscoveryNode. This address can only be determined by actually binding (people may use, for example, port 0). Instead we start the TransportService as late as possible but block any incoming requests until the node has completed initialization.
A couple of other cleanups during start time:
1) The gateway service now starts before the initial cluster join so we can simplify the logic to recover state if the local node has become master.
2) The discovery is started before the transport service accepts requests, but we only start the join process later using a dedicated method.
Closes #16723
Closes #16746
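A minimal sketch of the "bind early, serve late" idea described above, using a plain latch; the real TransportService plumbing is not shown and the names are hypothetical.

```java
import java.util.concurrent.CountDownLatch;

// Hypothetical sketch: the transport layer can bind (so the published address is known),
// but inbound requests wait on a gate until node initialization has completed.
final class InboundRequestGate {
    private final CountDownLatch nodeInitialized = new CountDownLatch(1);

    void markNodeInitialized() {
        nodeInitialized.countDown();
    }

    void handleInbound(Runnable request) throws InterruptedException {
        nodeInitialized.await(); // block (or reject) requests that arrive before startup completes
        request.run();
    }
}
```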
This commit removes the system property "es.useLinkedTransferQueue" that
defaulted to false and was used to control the queue implementation used
in a few places.
Closes #16786
This commit adds a check on startup for G1 GC while running on early
versions of HotSpot version 25. This is to prevent potential data
corruption issues that can occur on those versions.
Closes #16737
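A rough, hypothetical sketch of how such a check could be expressed; the exact cutoff build and the way Elasticsearch actually inspects the JVM flags are not shown here.

```java
import java.lang.management.ManagementFactory;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: flag early HotSpot 25.x builds (e.g. "25.66-b17") running with G1 enabled.
final class G1StartupCheck {
    private static final Pattern HOTSPOT_25 = Pattern.compile("^25\\.(\\d+)-b\\d+.*");

    static boolean isEarlyHotSpot25(String jvmVersion, int minimumSafeUpdate) {
        Matcher matcher = HOTSPOT_25.matcher(jvmVersion);
        return matcher.matches() && Integer.parseInt(matcher.group(1)) < minimumSafeUpdate;
    }

    static boolean g1Enabled() {
        // Heuristic only: the G1 collector registers MXBeans named "G1 Young/Old Generation".
        return ManagementFactory.getGarbageCollectorMXBeans().stream()
            .anyMatch(bean -> bean.getName().startsWith("G1"));
    }
}
```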
Java NIO has the notion of gathering writes. These are writes that
gather data from multiple buffers into a single channel. These gathering
writes in Netty have been enabled by default with the possibility to
disable them using "es.netty.gathering". This flag was added in case
having gathering writes on by default did not work out. We have not
published this ability and sufficient time has passed to render
judgement that using gathering writes is okay.
Closes #16774
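For context, this is what a gathering write looks like in plain Java NIO: several buffers are drained into one channel with a single call.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class GatheringWriteExample {
    public static void main(String[] args) throws IOException {
        Path path = Files.createTempFile("gathering", ".bin");
        ByteBuffer[] buffers = {
            ByteBuffer.wrap("header".getBytes(StandardCharsets.UTF_8)),
            ByteBuffer.wrap("body".getBytes(StandardCharsets.UTF_8))
        };
        try (FileChannel channel = FileChannel.open(path, StandardOpenOption.WRITE)) {
            channel.write(buffers); // GatheringByteChannel#write(ByteBuffer[]) writes both buffers at once
        }
        Files.delete(path);
    }
}
```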
Expose http address in cat/nodes and cat/nodeattrs APIs
We expose a lot of information like IP address and port but never
expose the http address/ip:port in the CAT API. It's nice to have it
there too since otherwise JSON parsing is required to get this information.
Elasticsearch should reject ids that are too long, to ensure a document
always remains retrievable for clients that impose a maximum URI length.
Closes #16034
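A hedged sketch of the kind of validation this implies; the limit shown is purely illustrative, and the point is that the length is measured in bytes as encoded in the URI.

```java
import java.nio.charset.StandardCharsets;

final class IdValidator {
    private static final int MAX_ID_BYTES = 512; // illustrative value, not a statement of the actual limit

    static void validateId(String id) {
        int lengthInBytes = id.getBytes(StandardCharsets.UTF_8).length;
        if (lengthInBytes > MAX_ID_BYTES) {
            throw new IllegalArgumentException("id is too long, must be no longer than "
                + MAX_ID_BYTES + " bytes but was [" + lengthInBytes + "]");
        }
    }
}
```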
Most elements in SearchSourceBuilder (e.g. aggs, queries) write their top-level
ParseField name in toXContent(), while HighlightBuilder used to do it in
its own toXContent() method. Moved this up to SearchSourceBuilder for consistency.
Today we might start a node and some of the paths might not have the
required permissions. This commit goes through all data directories as
well as index, shard and state directories and ensures we have write access.
To make this work across all operating systems, we try to write a real file
and remove it again in each of those directories.
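A minimal sketch of such a probe for a single directory, using NIO:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

final class WriteAccessProbe {
    // Prove write access by creating and deleting a real file, which behaves
    // consistently across operating systems and filesystems.
    static void ensureWritable(Path directory) throws IOException {
        Path probe = Files.createTempFile(directory, ".write_check", ".tmp");
        Files.delete(probe);
    }
}
```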
This commit removes the es.max-open-files flag as the same information
can be obtained from the cluster nodes info API, and a warning is logged on
startup if the limit is set too low anyway.
Closes #16757
This commit tries to 'guess' if a user starts a node in production by
checking if any network host is configured. If that is the case, soft limits
that would otherwise only be logged are enforced, such as the number of open file descriptors.
Closes #16727
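A hypothetical sketch of that heuristic: treat binding to a non-loopback address as "production" and turn a soft limit into a hard failure. The names and the descriptor handling are illustrative, not the real bootstrap logic.

```java
import java.net.InetAddress;

// Hypothetical sketch only; the real bootstrap checks differ in detail.
final class ProductionModeCheck {
    static boolean looksLikeProduction(InetAddress boundAddress) {
        return !boundAddress.isLoopbackAddress() && !boundAddress.isLinkLocalAddress();
    }

    static void checkFileDescriptors(InetAddress boundAddress, long currentLimit, long recommendedMinimum) {
        if (currentLimit >= recommendedMinimum) {
            return;
        }
        String message = "max file descriptors [" + currentLimit
            + "] is lower than the recommended [" + recommendedMinimum + "]";
        if (looksLikeProduction(boundAddress)) {
            throw new IllegalStateException(message);   // enforced when a network host is configured
        }
        System.err.println("warning: " + message);      // otherwise only logged
    }
}
```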