* [Upgrade] Lucene 9.1.0-snapshot-ea989fe8f30
Upgrades from Lucene 9.0.0 to 9.1.0-snapshot-ea989fe8f30 in preparation for
9.1.0 GA.
Signed-off-by: Nicholas Walter Knize <nknize@apache.org>
* Add spanishplural token filter
Signed-off-by: Nicholas Walter Knize <nknize@apache.org>
* fix KNOWN_TOKENIZERS
Signed-off-by: Nicholas Walter Knize <nknize@apache.org>
* Make default number of shards configurable
The default number of primary shards for a new index, when the number of shards are not provided in the request, can be configured for the cluster.
Signed-off-by: Arunabh Singh <arunabs@amazon.com>
* Address PR comments
Signed-off-by: Arunabh Singh <arunabs@amazon.com>
Co-authored-by: Arunabh Singh <arunabs@amazon.com>
This commit adds the SPDX license header and modifications copyright to security
policy files.
Signed-off-by: Nicholas Walter Knize <nknize@apache.org>
This commit adds the SPDX Apache-2.0 license header along with an additional
copyright header for all modifications.
Signed-off-by: Nicholas Walter Knize <nknize@apache.org>
Fix miscellaneous issues identified during `gradle precommit`. These issues are the side effects of the renaming to OpenSearch work.
Signed-off-by: Rabi Panda <adnapibar@gmail.com>
This commit cleans up imports, variable names, comments, and other misc usages
of ES with the new OpenSearch name.
Signed-off-by: Nicholas Walter Knize <nknize@apache.org>
This commit refactors the remaining o.e.index and o.e.test packages in the
test/fixtures module. References throughout the codebase are also refactored.
Signed-off-by: Nicholas Walter Knize <nknize@apache.org>
This commit refactors the heavily used ESPolicy, Elasticsearch (main class), and Elasticsearch
prefixed test classes used in the bootstrap package under the server module. Refactoring the
namespace will come in a separate commit.
Signed-off-by: Nicholas Knize <nknize@amazon.com>
* Rename directory elasticsearch to opensearch
Rename EvilElasticsearchCliTests EvilOpenSearchCliTests
Signed-off-by: Harold Wang <harowang@amazon.com>
* Rename org.elasticsearch to org.opensearch
Signed-off-by: Harold Wang <harowang@amazon.com>
* Rename waitForElasticsearch to waitForOpenSearch
Signed-off-by: Harold Wang <harowang@amazon.com>
* Rename OpensearchNode to OpenSearchNode
Signed-off-by: Harold Wang <harowang@amazon.com>
* Rename "elasticsearch.version" to "opensearch.version"
Signed-off-by: Harold Wang <harowang@amazon.com>
* Rename elasticsearchVersionString to opensearchVersionString
Signed-off-by: Harold Wang <harowang@amazon.com>
* Rename elasticsearch.yml to opensearch.yml
Signed-off-by: Harold Wang <harowang@amazon.com>
* Rename runElasticsearchTests to runOpenSearchTests
Signed-off-by: Harold Wang <harowang@amazon.com>
* Rename elasticsearchVersion to opensearchVersion
Signed-off-by: Harold Wang <harowang@amazon.com>
* Rename elasticsearch to opensearch in gradle files
Signed-off-by: Harold Wang <harowang@amazon.com>
* Rename ElasticsearchAssertions to OpenSearchAssertions
Signed-off-by: Harold Wang <harowang@amazon.com>
* Rename folder share/elasticsearch to share/opensearch
Signed-off-by: Harold Wang <harowang@amazon.com>
* Rename elasticsearch to opensearch
Signed-off-by: Harold Wang <harowang@amazon.com>
* Rename elasticsearch-service-x64 to opensearch-service-x64
Signed-off-by: Harold Wang <harowang@amazon.com>
* Rename elasticsearch-service.bat to opensearch-service.bat
Signed-off-by: Harold Wang <harowang@amazon.com>
* Rename Elasticsearch to Opensearch
Rename elasticsearch to opensearch
Signed-off-by: Harold Wang <harowang@amazon.com>
* Rename ELASTIC_PASSWORD_FILE to OPENSEARCH_PASSWORD_FILE
Signed-off-by: Harold Wang <harowang@amazon.com>
* Rename ELASTICSEARCH_PASSWORD to OPENSEARCH_PASSWORD
Rename elasticsearch to opensearch
Signed-off-by: Harold Wang <harowang@amazon.com>
* Rename elasticsearch to opensearch
Signed-off-by: Harold Wang <harowang@amazon.com>
* Rename elasticsearch.log to opensearch.log
Signed-off-by: Harold Wang <harowang@amazon.com>
* Rename es-repo to opensearch-repo
Signed-off-by: Harold Wang <harowang@amazon.com>
* Rename ESTestCase to OpenSearchTestCase
Signed-off-by: Harold Wang <harowang@amazon.com>
* Rename ESRestTestCase to OpenSearchRestTestCase
Signed-off-by: Harold Wang <harowang@amazon.com>
* Rename elasticsearch to opensearch
Rename "Starts ElasticSearch" to "Starts OpenSearch"
Signed-off-by: Harold Wang <harowang@amazon.com>
* Rename ESElasticsearchCliTestCase to BaseOpenSearchCliTestCase
Signed-off-by: Harold Wang <harowang@amazon.com>
* Rename "elasticsearch:test" to "opensearch:test"
Signed-off-by: Harold Wang <harowang@amazon.com>
* Rename test91ElasticsearchShardCliPackaging to test91OpenSearchShardCliPackaging
Signed-off-by: Harold Wang <harowang@amazon.com>
* Rename elasticsearch.toString to opensearch.toString
Signed-off-by: Harold Wang <harowang@amazon.com>
* Rename elasticsearch.pid to opensearch.pid
Signed-off-by: Harold Wang <harowang@amazon.com>
* Rename "Opensearch" to "OpenSearch"
Rename "elasticsearch" to "opensearch"
Signed-off-by: Harold Wang <harowang@amazon.com>
* Rename Elasticsearch to OpenSearch
Remove unecessary dot after opensearch.
Signed-off-by: Harold Wang <harowang@amazon.com>
* [Rename] Rename qa folder
Signed-off-by: Harold Wang <harowang@amazon.com>
* Remove the dot in the end of "package org.opensearch."
Signed-off-by: Harold Wang <harowang@amazon.com>
* Add semicolon
Signed-off-by: Harold Wang <harowang@amazon.com>
The recursive data.path FilePermission check is an extremely hot
codepath in Elasticsearch. Unfortunately the FilePermission check in
Java is extremely allocation heavy. As it iterates through different
file permissions, it allocates byte arrays for each Path component that
must be compared. This PR improves the situation by adding the recursive
data.path FilePermission it its own PermissionsCollection object which
is checked first.
Backport of #61474.
Part of #46106. Simplify the implementation of deprecation logging by
relying of log4j more completely, and implementing additional behaviour
through custom appenders and filters.
DeprecationLogger's constructor should not create two loggers. It was
taking parent logger instance, changing its name with a .deprecation
prefix and creating a new logger.
Most of the time parent logger was not needed. It was causing Log4j to
unnecessarily cache the unused parent logger instance.
depends on #61515
backports #58435
Splitting DeprecationLogger into two. HeaderWarningLogger - responsible for adding a response warning headers and ThrottlingLogger - responsible for limiting the duplicated log entries for the same key (previously deprecateAndMaybeLog).
Introducing A ThrottlingAndHeaderWarningLogger which is a base for other common logging usages where both response warning header and logging throttling was needed.
relates #55699
relates #52369
backports #55941
Backport of #55115.
Replace calls to deprecate(String,Object...) with deprecateAndMaybeLog(...),
with an appropriate key, so that all messages can potentially be deduplicated.
I've noticed that a lot of our tests are using deprecated static methods
from the Hamcrest matchers. While this is not a big deal in any
objective sense, it seems like a small good thing to reduce compilation
warnings and be ready for a new release of the matcher library if we
need to upgrade. I've also switched a few other methods in tests that
have drop-in replacements.
This is a simple naming change PR, to fix the fact that "metadata" is a
single English word, and for too long we have not followed general
naming conventions for it. We are also not consistent about it, for
example, METADATA instead of META_DATA if we were trying to be
consistent with MetaData (although METADATA is correct when considered
in the context of "metadata"). This was a simple find and replace across
the code base, only taking a few minutes to fix this naming issue
forever.
It's simple to deprecate a field used in an ObjectParser just by adding deprecation
markers to the relevant ParseField objects. The warnings themselves don't currently
have any context - they simply say that a deprecated field has been used, but not
where in the input xcontent it appears. This commit adds the parent object parser
name and XContentLocation to these deprecation messages.
Note that the context is automatically stripped from warning messages when they
are asserted on by integration tests and REST tests, because randomization of
xcontent type during these tests means that the XContentLocation is not constant
* Record Force Merges in live commit data
Prerequisite of #52182. Record force merges in the live commit data
so two shard states with the same sequence number that differ only in whether
or not they have been force merged can be distinguished when creating snapshots.
This is a follow up of https://github.com/elastic/elasticsearch/issues/43453 where we added
a system property to disallow allocation awareness in search requests. Since search requests
will no longer check the allocation awareness attributes for routing in the next major version,
this change adds a deprecation warning on any setup that uses these attributes.
Relates #43453
This is a follow up of #19191 for 7.x.
This change adds a system property called "es.routing.search_ignore_awareness_attributes" that when set to true will
effectively ignore allocation awareness attributes when routing search and get requests. This is now the default in 8.x so this
commit adds a way to opt-in to this new behavior in a minor version of 7.x.
Relates #45735
When the Elasticsearch process does not have write permissions to
upgrade the Elasticsearch keystore, we bail with an error message that
indicates there is a filesystem permissions problem. This commit
clarifies that error message by pointing out the directory where write
permissions are required, or that the user can also run the
elasticsearch-keystore upgrade command manually before starting the
Elasticsearch process. In this case, the upgrade would not be needed at
runtime, so the permissions would not be needed then.
Most of our CLI tools use the Terminal class, which previously did not provide methods for writing to standard output. When all output goes to standard out, there are two basic problems. First, errors and warnings are "swallowed" in pipelines, making it hard for a user to know when something's gone wrong. Second, errors and warnings are intermingled with legitimate output, making it difficult to pass the results of interactive scripts to other tools.
This commit adds a second set of print commands to Terminal for printing to standard error, with errorPrint corresponding to print and errorPrintln corresponding to println. This leaves it to developers to decide which output should go where. It also adjusts existing commands to send errors and warnings to stderr.
Usage is printed to standard output when it's correctly requested (e.g., bin/elasticsearch-keystore --help) but goes to standard error when a command is invoked incorrectly (e.g. bin/elasticsearch-keystore list-with-a-typo | sort).
This commit applies a normalization process to environment paths, both
in how they are stored internally, also their settings values. This
normalization is done via two means:
- we make the paths absolute
- we remove redundant name elements from the path (what Java calls
"normalization")
This change ensures that when we compare and refer to these paths within
the system, we are using a common ground. For example, prior to the
change if the data path was relative, we would not compare it correctly
to paths from disk usage. This is because the paths in disk usage were
being made absolute.
* Mute failing test
tracked in #44552
* mute EvilSecurityTests
tracking in #44558
* Fix line endings in ESJsonLayoutTests
* Mute failing ForecastIT test on windows
Tracking in #44609
* mute BasicRenormalizationIT.testDefaultRenormalization
tracked in #44613
* fix mute testDefaultRenormalization
* Increase busyWait timeout windows is slow
* Mute failure unconfigured node name
* mute x-pack internal cluster test windows
tracking #44610
* Mute JvmErgonomicsTests on windows
Tracking #44669
* mute SharedClusterSnapshotRestoreIT testParallelRestoreOperationsFromSingleSnapshot
Tracking #44671
* Mute NodeTests on Windows
Tracking #44256
Executors of type fixed_auto_queue_size (i.e. search / search_throttled) wrap runnables into
TimedRunnable, which is an AbstractRunnable. This is dangerous as it might silently swallow
exceptions, and possibly miss calling a response listener. While this has not triggered any failures in
the tests I have run so far, it might help uncover future problems.
Follow-up to #36137
Instead of logging warnings we now rethrow exceptions thrown inside
scheduled/submitted tasks. This will still log them as warnings in
production but has the added benefit that if they are thrown during
unit/integration test runs, the test will be flagged as an error.
This is a continuation of #38014
Fixed NPE that caused CCR tests (IndexFollowingIT and likely others)
to fail.
schedule could bubble rejected exception to uncaught exception
handler when not using SAME executor if thread pool is terminated.
Now ignore rejected exception silently if executor is shutdown.
Scheduler.schedule(...) would previously assume that caller handles
exception by calling get() on the returned ScheduledFuture.
schedule() now returns a ScheduledCancellable that no longer gives
access to the exception. Instead, any exception thrown out of a
scheduled Runnable is logged as a warning.
This is a continuation of #28667, #36137 and also fixes#37708.
This is a continuation of #28667 and has as goal to convert all executors to propagate errors to the
uncaught exception handler. Notable missing ones were the direct executor and the scheduler. This
commit also makes it the property of the executor, not the runnable, to ensure this property. A big
part of this commit also consists of vastly improving the test coverage in this area.
When the deprecation log is written to within scripting support code
like ScriptDocValues, it runs under the reduces privileges of scripts.
Sometimes this can trigger log rolling, which then causes uncaught
security errors, as was handled in #28485. While doing individual
deprecation handling within each deprecation scripting location is
possible, there are a growing number of deprecations in scripts.
This commit wraps the logging call within the deprecation logger use a
doPrivileged block, just was we would within individual logging call
sites for scripting utilities.
There are certain BootstrapCheck checks that may need access environment-specific
values. Watcher's EncryptSensitiveDataBootstrapCheck passes in the node's environment
via a constructor to bypass the shortcoming in BootstrapContext. This commit
pulls in the node's environment into BootstrapContext.
Another case is found in #36519, where it is useful to check the state of the
data-path. Since PathUtils.get and Paths.get are forbidden APIs, we rely on
the environment to retrieve references to things like node data paths.
This means that the BootstrapContext will have the same Settings used in the
Environment, which currently differs from the Node's settings.
data files under the cluster name subdirectory has been deprecated and was
meant to be removed in 6.0. This commit removes some leftover referrences to
these paths.
Some very old ancient versions of Linux do not have /etc/os-release. For
example, old Red Hat-like OS. This commit adds a fallback for handling
pretty name for these OS.
Some OS (e.g., Oracle Linux Server 6.9) have a trailing space at the end
of the PRETTY_NAME line in /etc/os-release. This commit addresses this
by accounting for this trailing space when extracting the pretty name.
Today our OS information returned in node stats only returns a
high-level name of the OS (e.g., "Linux"). Yet, for some uses this is
too high-level and knowing at a finer level of granularity the
underlying OS can be useful. This commit extracts the pretty name on
Linux from /etc/os-release. This pretty name usually includes the Linux
vendor and the Linux vendor version number (e.g., Fedora 28).
Changes the default of the `node.name` setting to the hostname of the
machine on which Elasticsearch is running. Previously it was the first 8
characters of the node id. This had the advantage of producing a unique
name even when the node name isn't configured but the disadvantage of
being unrecognizable and not being available until fairly late in the
startup process. Of particular interest is that it isn't available until
after logging is configured. This forces us to use a volatile read
whenever we add the node name to the log.
Using the hostname is available immediately on startup and is generally
recognizable but has the disadvantage of not being unique when run on
machines that don't set their hostname or when multiple elasticsearch
processes are run on the same host. I believe that, taken together, it
is better to default to the hostname.
1. Running multiple copies of Elasticsearch on the same node is a fairly
advanced feature. We do it all the as part of the elasticsearch build
for testing but we make sure to set the node name then.
2. That the node.name defaults to some flavor of "localhost" on an
unconfigured box feels like it isn't going to come up too much in
production. I expect most production deployments to at least set the
hostname.
As a bonus, production deployments need no longer set the node name in
most cases. At least in my experience most folks set it to the hostname
anyway.
Change the logging infrastructure to handle when the node name isn't
available in `elasticsearch.yml`. In that case the node name is not
available until long after logging is configured. The biggest change is
that the node name logging no longer fixed at pattern build time.
Instead it is read from a `SetOnce` on every print. If it is unset it is
printed as `unknown` so we have something that fits in the pattern.
On normal startup we don't log anything until the node name is available
so we never see the `unknown`s.
Remove a few of the logger constructors that aren't widely used or
aren't used at all and deprecate a few more logger constructors in favor
of log4j2's `LogManager`.
First, some background: we have 15 different methods to get a logger in
Elasticsearch but they can be broken down into three broad categories
based on what information is provided when building the logger.
Just a class like:
```
private static final Logger logger = ESLoggerFactory.getLogger(ActionModule.class);
```
or:
```
protected final Logger logger = Loggers.getLogger(getClass());
```
The class and settings:
```
this.logger = Loggers.getLogger(getClass(), settings);
```
Or more information like:
```
Loggers.getLogger("index.store.deletes", settings, shardId)
```
The goal of the "class and settings" variant is to attach the node name
to the logger. Because we don't always have the settings available, we
often use the "just a class" variant and get loggers without node names
attached. There isn't any real consistency here. Some loggers get the
node name because it is convenient and some do not.
This change makes the node name available to all loggers all the time.
Almost. There are some caveats are testing that I'll get to. But in
*production* code the node name is node available to all loggers. This
means we can stop using the "class and settings" variants to fetch
loggers which was the real goal here, but a pleasant side effect is that
the ndoe name is now consitent on every log line and optional by editing
the logging pattern. This is all powered by setting the node name
statically on a logging formatter very early in initialization.
Now to tests: tests can't set the node name statically because
subclasses of `ESIntegTestCase` run many nodes in the same jvm, even in
the same class loader. Also, lots of tests don't run with a real node so
they don't *have* a node name at all. To support multiple nodes in the
same JVM tests suss out the node name from the thread name which works
surprisingly well and easy to test in a nice way. For those threads
that are not part of an `ESIntegTestCase` node we stick whatever useful
information we can get form the thread name in the place of the node
name. This allows us to keep the logger format consistent.