* ingest: Introduce the dissect processor
The ingest node dissect processor is an alternative to Grok
to split a string based on a pattern. Dissect differs from
Grok such that regular expressions are not used to split the
string.
Dissect can be used to parse a source text field with a
simpler pattern, and is often faster the Grok for basic string
parsing. This processor uses the dissect library which
does most of the work.
* Adding new MonitoredSystem for APM server
* Teaching Monitoring template utils about APM server monitoring indices
* Documenting new monitoring index for APM server
* Adding monitoring index template for APM server
* Copy pasta typo
* Removing metrics.libbeat.config section from mapping
* Adding built-in user and role for APM server user
* Actually define the role :)
* Adding missing import
* Removing index template and system ID for apm server
* Shortening line lengths
* Updating expected number of built-in users in integration test
* Removing "system" from role and user names
* Rearranging users to make tests pass
* Search: Support of wildcard on docvalue_fields
For consistency with stored_fields, docvalue_fields should support the use of wildcards.
Documentation of doc values fields is updated accordingly.
See also: #26390Closes#26299
We used to set `maxScore` to `0` within `TopDocs` in situations where there is really no score as the size was set to `0` and scores were not even tracked. In such scenarios, `Float.Nan` is more appropriate, which gets converted to `max_score: null` on the REST layer. That's also more consistent with lucene which set `maxScore` to `Float.Nan` when merging empty `TopDocs` (see `TopDocs#merge`).
Currently docs don't explain how `ignore_above` behaves with arrays of
strings.
Clarify how `ignore_above` applies for arrays of strings and
also note that all string(s) will still be visible in the
`_source` field.
Relates #33057
Today `_msearch` doesn't allow modifying the `max_concurrent_shard_requests`
per sub search request. This change adds support for setting this parameter on
all sub-search requests in an `_msearch`.
Relates to #31877
Add documentation for #31238
- Add documentation for the req_authn_context_class_ref setting
- Add a section in SAML Guide regarding the use of SAML
Authentication Context.
The main installation instructions page for the Windows MSI installer includes a header at the top to indicate that the installer is in beta, but the Installing Elasticsearch page does not. This commit adds the beta label to the MSI entry within the installation options.
The maximum map count boostrap check can be a hindrance to users that do
not own the underlying platform on which they are executing
Elasticsearch. This is because addressing it requires tuning the kernel
and a platform provider might now allow this, especially on shared
infrastructure. However, this bootstrap check is not needed if mmapfs is
not in use. Today we do not have a way for the user to communicate that
they are not going to use mmapfs. This commit therefore adds a setting
that enables the user to disallow mmapfs. When mmapfs is disallowed, the
maximum map count bootstrap check is not enforced. Additionally, we
fallback to a different default index store and prevent the explicit use
of mmapfs for an index.
* Add relevant documentation for FIPS 140-2 compliance.
* Introduce `fips_mode` setting.
* Discuss necessary configuration for FIPS 140-2
* Discuss introduced limitations by FIPS 140-2
* [DOCS] Add configurable password hashing docs
Adds documentation about the newly introduced configuration option
for setting the password hashing algorithm to be used for the users
cache and for storing credentials for the native and file realm.
Expands and clarifies exactly what is and isn't allowed when specifying a
subset of the nodes as targets of a cluster API, and adds missing links to this
from the hot threads and cluster stats API docs.
Co-authored-by: David Turner <david.turner@elastic.co>
Co-authored-by: Yu <yu.liu003@gmail.com>
This commit adds documentation for configuring Kerberos realm.
Configuring Kerberos realm documentation highlights important
terminology and requirements before creating Kerberos realm.
Most of the documentation is centered around configuration from
Elasticsearch rather than go deep into Kerberos implementation.
Kerberos realm settings are mentioned in the security settings
for Kerberos realm.
* Modified a reference to real time to match the previous line reference of
realtime.
* Modified eg to e.g. as it's an abbreviation for the latin exempli gratia
* Added missing pronoun to `_executing_filters` section.
Currently, if geo context is represented by something other than
geo_point or an object with lat and lon fields, the parsing of it
as a geo context can result in ignoring the context altogether,
returning confusing errors such as number_format_exception or trying
to parse the number specifying as long-encoded hash code. It would also
fail if the geo_point was stored.
This commit makes the mapping parsing more strict and will fail during
mapping update or index creation if the geo context doesn't point to
a geo_point field.
Supersedes #32412Closes#32202
It took me quite a while of online searching and experimenting to realize the function-call asymmetry in the Add versus Remove from a list, like the "tags" list! I realize we cannot give examples for every single thing the user wants to do in Painless, but this is such a common use case (removing a tag from a single doc, or from a set of docs with Update-By-Query) that I believe it ought to be demonstrated immediately after the "add a tag" example. We have an example of removing an entire document field, but not removing one element of a list (a multi-valued field).
Also, a minor grammar fix: I have added an apostrophe to the word "its" in the accompanying text of the example just above.
The docs here incorrectly state that it is okay for a heap dump file to
exist when heap dump path is configured to a fixed filename. This is
incorrect, the JVM will fail to write the heap dump if a heap dump file
already exists at the specified location (see the DumpWriter constructor
DumpWriter::DumpWriter(const char* path) in the JVM source).
Suggestion responses were previously serialized as streamables which
made writing suggesters in plugins with custom suggestion response types
impossible. This commit makes them serialized as named writeables and
provides a facility for registering a reader for suggestion responses
when registering a suggester.
This also makes Suggestion responses abstract, requiring a suggester
implementation to provide its own types. Suggesters which do not need
anything additional to what is defined in Suggest.Suggestion should
provide a minimal subclass.
The existing plugin suggester integration tests are removed and
replaced with an equivalent implementation as an example
plugin.
On some Linux distributions tmpfiles.d cleans files and
directories under /tmp if they haven't been accessed for
10 days.
This can cause problems for ML as ML is currently the only
component that uses the temp directory more than a few
seconds after startup. If you didn't open an ML job for
10 days and then tried to open one then the temp directory
would have been deleted.
This commit prevents the problem occurring in the case of
Elasticsearch being managed by systemd, as systemd private
temp directories are not subject to periodic cleanup (by
default).
Additionally there are now some docs to warn people about
the risk and suggest a manual mitigation for .tar.gz users.
* Make cluster stats response contain cluster UUID
* Updating constructor usage in Monitoring tests
* Adding cluster_uuid field to Cluster Stats API reference doc
* Adding rest api spec test for expecting cluster_uuid in cluster stats response
* Adding missing newline
* Indenting do section properly
* Missed a spot!
* Fixing the test cluster ID
This commit adds a boolean system property, `es.scripting.use_java_time`,
which controls the concrete return type used by doc values within
scripts. The return type of accessing doc values for a date field is
changed to Object, essentially duck typing the type to allow
co-existence during the transition from joda time to java time.
First, some background: we have 15 different methods to get a logger in
Elasticsearch but they can be broken down into three broad categories
based on what information is provided when building the logger.
Just a class like:
```
private static final Logger logger = ESLoggerFactory.getLogger(ActionModule.class);
```
or:
```
protected final Logger logger = Loggers.getLogger(getClass());
```
The class and settings:
```
this.logger = Loggers.getLogger(getClass(), settings);
```
Or more information like:
```
Loggers.getLogger("index.store.deletes", settings, shardId)
```
The goal of the "class and settings" variant is to attach the node name
to the logger. Because we don't always have the settings available, we
often use the "just a class" variant and get loggers without node names
attached. There isn't any real consistency here. Some loggers get the
node name because it is convenient and some do not.
This change makes the node name available to all loggers all the time.
Almost. There are some caveats are testing that I'll get to. But in
*production* code the node name is node available to all loggers. This
means we can stop using the "class and settings" variants to fetch
loggers which was the real goal here, but a pleasant side effect is that
the ndoe name is now consitent on every log line and optional by editing
the logging pattern. This is all powered by setting the node name
statically on a logging formatter very early in initialization.
Now to tests: tests can't set the node name statically because
subclasses of `ESIntegTestCase` run many nodes in the same jvm, even in
the same class loader. Also, lots of tests don't run with a real node so
they don't *have* a node name at all. To support multiple nodes in the
same JVM tests suss out the node name from the thread name which works
surprisingly well and easy to test in a nice way. For those threads
that are not part of an `ESIntegTestCase` node we stick whatever useful
information we can get form the thread name in the place of the node
name. This allows us to keep the logger format consistent.
Rollover should not swap aliases when `is_write_index` is set to `true`.
Instead, both the new and old indices should have the rollover alias,
with the newly created index as the new write index
Updates Rollover to leverage the ability to preserve aliases and swap which is the write index.
Historically, Rollover would swap which index had the designated alias for writing documents against. This required users to keep a separate read-alias that enabled reading against both rolled over and newly created indices, whiles the write-alias was being re-assigned at every rollover.
With the ability for aliases to designate a write index, Rollover can be a bit more flexible with its use of aliases.
Updates include:
- Rollover validates that the target alias has a write index (the index that is being rolled over). This means that the restriction that aliases only point to one index is no longer necessary.
- Rollover explicitly (and atomically) swaps which index is the write-index by explicitly assigning the existing index to have `is_write_index: false` and have the newly created index have its rollover alias as `is_write_index: true`. This is only done when `is_write_index: true` on the write index. Default behavior of removing the alias from the rolled over index stays when `is_write_index` is not explicitly set
Relevant things that are staying the same:
- Rollover is rejected if there exist any templates that match the newly-created index and configure the rollover-alias
- I think this existed to prevent the situation where an alias pointed to two indices for a short while. Although this can technically be relaxed, the specific cases that are safe are really particular and difficult to reason, so leaving the broad restriction sounds good
Stating that the Fuzzy Query generates "all possible" matching terms is misleading, given that the query's default behavior is to generate a maximum of 50 matching terms.
(cherry picked from commit 345a0071a2a41fd7f80ae9ef8a39a2cb4991aedd)