Following performance optimisations to the adjacency_matrix aggregation we no longer require this setting. Marked as deprecated and due for removal in 8.0
Related #46324
This is a follow up of #19191 for 7.x.
This change adds a system property called "es.routing.search_ignore_awareness_attributes" that when set to true will
effectively ignore allocation awareness attributes when routing search and get requests. This is now the default in 8.x so this
commit adds a way to opt-in to this new behavior in a minor version of 7.x.
Relates #45735
this field can be present in search slow logs and deprecation logs. The
docs describes how to enable this functionality and what expect in logs.
closes#44851
This PR merges the `vectors-optimize-brute-force` feature branch, which makes
the following changes to how vector functions are computed:
* Precompute the L2 norm of each vector at indexing time. (#45390)
* Switch to ByteBuffer for vector encoding. (#45936)
* Decode vectors and while computing the vector function. (#46103)
* Use an array instead of a List for the query vector. (#46155)
* Precompute the normalized query vector when using cosine similarity. (#46190)
Co-authored-by: Mayya Sharipova <mayya.sharipova@elastic.co>
Previously we only turned on tests if we saw either `// CONSOLE` or
`// TEST`. These magic comments are difficult for the docs build to deal
with so it has moved away from using them where possible. We should
catch up. This adds another trigger to enable testing: marking a snippet
with the `console` language. It looks like this:
```
[source,console]
----
GET /
----
```
This saves a line which is nice, I guess. But it is more important to me
that this is consistent with the way the docs build works now.
Similarly this enables response testing when you mark a snippet with the
language `console-result`. That looks like:
```
[source,console-result]
----
{
"result": "0.1"
}
----
```
`// TESTRESPONSE` is still available for situations like `// TEST`: when
the response isn't *in* the console-result language (like `_cat`) or
when you want to perform substitutions on the generated test.
Should unblock #46159.
Besides a rename, this changes allows to processor to attach multiple
enrich docs to the document being ingested.
Also in order to control the maximum number of enrich docs to be
included in the document being ingested, the `max_matches` setting
is added to the enrich processor.
Relates #32789
This commit updates the docs about translog retention and flushing to reflect
recent changes in how peer recoveries work. It also adds some docs to describe
how history is retained for replay using soft deletes and shard history
retention leases.
Relates #45473
Though we allow CCS within datafeeds, users could prevent nodes from accessing remote clusters. This can cause mysterious errors and difficult to troubleshoot.
This commit adds a check to verify that `cluster.remote.connect` is enabled on the current node when a datafeed is configured with a remote index pattern.
* [DOCS] Reformats delete by query API (#46051)
* Reformats delete by query API
* Update docs/reference/docs/delete-by-query.asciidoc
Co-Authored-By: James Rodewig <james.rodewig@elastic.co>
* Updated common parms includes.
* [DOCS] Fixed issue in Common Parms.
Refresh the setup for the new versions of DbVisualizer and SQL
Workbench/J which have Elasticsearch JDBC support out of the box.
(cherry picked from commit 6d257194c1055d060505e0faaaa37b41e21699f5)
The _cat/health call in getting-started assumes that the master task max
wait time is always 0 (-), however, the test could sometimes run into a
short wait time (like some ms). Fixed to allow this.
While the plugin installation directory used to be settable, it has not
been so for several major versions. This commit removes a lingering
reference to the plugins directory in upgrade docs.
closes#45889
The existing privilege model for API keys with privileges like
`manage_api_key`, `manage_security` etc. are too permissive and
we would want finer-grained control over the cluster privileges
for API keys. Previously APIs created would also need these
privileges to get its own information.
This commit adds support for `manage_own_api_key` cluster privilege
which only allows api key cluster actions on API keys owned by the
currently authenticated user. Also adds support for retrieval of
the API key self-information when authenticating via API key
without the need for the additional API key privileges.
To support this privilege, we are introducing additional
authentication context along with the request context such that
it can be used to authorize cluster actions based on the current
user authentication.
The API key get and invalidate APIs introduce an `owner` flag
that can be set to true if the API key request (Get or Invalidate)
is for the API keys owned by the currently authenticated user only.
In that case, `realm` and `username` cannot be set as they are
assumed to be the currently authenticated ones.
The changes cover HLRC changes, documentation for the API changes.
Closes#40031
This commit introduces PKI realm delegation. This feature
supports the PKI authentication feature in Kibana.
In essence, this creates a new API endpoint which Kibana must
call to authenticate clients that use certificates in their TLS
connection to Kibana. The API call passes to Elasticsearch the client's
certificate chain. The response contains an access token to be further
used to authenticate as the client. The client's certificates are validated
by the PKI realms that have been explicitly configured to permit
certificates from the proxy (Kibana). The user calling the delegation
API must have the delegate_pki privilege.
Closes#34396
This adds support for verifying that snippets with the `console-result`
language are valid json. It also switches the response snippets on the
`docs/get` page from `js` to `console-result` which will allow clients
to provide "alternatives" for them like they can now do with
`// CONSOLE` snippets.
This adds a pipeline aggregation that calculates the cumulative
cardinality of a field. It does this by iteratively merging in the
HLL sketch from consecutive buckets and emitting the cardinality up
to that point.
This is useful for things like finding the total "new" users that have
visited a website (as opposed to "repeat" visitors).
This is a Basic+ aggregation and adds a new Data Science plugin
to house it and future advanced analytics/data science aggregations.
Previously, the stats API reports a progress percentage
for DF analytics tasks that are running and are in the
`reindexing` or `analyzing` state.
This means that when the task is `stopped` there is no progress
reported. Thus, one cannot distinguish between a task that never
run to one that completed.
In addition, there are blind spots in the progress reporting.
In particular, we do not account for when data is loaded into the
process. We also do not account for when results are written.
This commit addresses the above issues. It changes progress
to being a list of objects, each one describing the phase
and its progress as a percentage. We currently have 4 phases:
reindexing, loading_data, analyzing, writing_results.
When the task stops, progress is persisted as a document in the
state index. The stats API now reports progress from in-memory
if the task is running, or returns the persisted document
(if there is one).
A policy type controls how the enrich index is created and
the query executed against the match field. Currently there
is a single policy type (`exact_match`). In the near future
more policy types will be added and different policy may have
different configuration options.
For this reason type should be a json object instead of a string field:
```
{
"exact_match": {
...
}
}
```
instead of:
```
{
"type": "exact_match",
...
}
```
This will make streaming parsing of enrich policies easier as in the
new format, the parsing code can know ahead what configuration fields
to expect. In the latter format that is not possible if the type field
appears not as the first field.
Relates to #32789
Customers occasionally discover a known behavior in Elasticsearch's pagination that does not appear to be documented. This warning is intended to educate customers of this behavior while still highlighting alternative solutions.
This change adds a new SSL context
xpack.notification.email.ssl.*
that supports the standard SSL configuration settings (truststore,
verification_mode, etc). This SSL context is used when configuring
outbound SMTP properties for watcher email notifications.
Backport of: #45272
Since #45136, we use soft-deletes instead of translog in peer recovery.
There's no need to retain extra translog to increase a chance of
operation-based recoveries. This commit ignores the translog retention
policy if soft-deletes is enabled so we can discard translog more
quickly.
Backport of #45473
Relates #45136
* [DOCS] Add template docs to scripts. Reorder template examples.
* Adds a 'Search template' section to the 'How to use scripts' chapter.
This links to the 'Search template' chapter for detailed info and
examples.
* Reorders and retitles several examples in the 'Search template'
chapter. This is primarily to make examples for storing, deleting, and
using search templates more prominent.
* Change <templatename> to <templateid>
Enrich processor configuration changes:
* Renamed `enrich_key` option to `field` option.
* Replaced `set_from` and `targets` options with `target_field`.
The `target_field` option behaves different to how `set_from` and
`targets` worked. The `target_field` is the field that will contain
the looked up document.
Relates to #32789
* Add is_write_index column to cat.aliases (#44772)
Aliases have had the option to set `is_write_index` since 6.4,
but the cat.aliases action was never updated.
* correct version bounds to 7.4
* Repository Cleanup Endpoint (#43900)
* Snapshot cleanup functionality via transport/REST endpoint.
* Added all the infrastructure for this with the HLRC and node client
* Made use of it in tests and resolved relevant TODO
* Added new `Custom` CS element that tracks the cleanup logic.
Kept it similar to the delete and in progress classes and gave it
some (for now) redundant way of handling multiple cleanups but only allow one
* Use the exact same mechanism used by deletes to have the combination
of CS entry and increment in repository state ID provide some
concurrency safety (the initial approach of just an entry in the CS
was not enough, we must increment the repository state ID to be safe
against concurrent modifications, otherwise we run the risk of "cleaning up"
blobs that just got created without noticing)
* Isolated the logic to the transport action class as much as I could.
It's not ideal, but we don't need to keep any state and do the same
for other repository operations
(like getting the detailed snapshot shard status)
This change adds a new option called user_dictionary_rules to
Kuromoji's tokenizer. It can be used to set additional tokenization rules
to the Japanese tokenizer directly in the settings (instead of using a file).
This commit also adds a check that no rules are duplicated since this is not allowed
in the UserDictionary.
Closes#25343
This commit updates the documentation to
- use the batch file included with the zip distribution; the exe file is included in the MSI only.
- introduce a space between the -E arguments and their values. Without a space (or quoted, but adding a space is cleaner), the argument will fail with PowerShell
(cherry picked from commit 5c8dbcedb0edf3a48ca1ec52aad9ea41fa941f8a)
The get and list APIs are a single API in this commit. Whether
requesting one named policy or all policies, a list of policies is
returened. The list API code has all been removed and the GET api is
what remains, which contains much of the list response code.
* Search enhancement: pinned queries (#44345)
Search enhancement: - new query type allows selected documents to be promoted above any "organic” search results.
This is the first feature in a new module `search-business-rules` which will house licensed (non OSS) logic for rewriting queries according to business rules.
The PinnedQueryBuilder class offers a new `pinned` query in the DSL that takes an array of promoted IDs and an “organic” query and ensures the documents with the promoted IDs rank higher than the organic matches.
Closes#44074
Changes the order of parameters in Geometries from lat, lon to lon, lat
and moves all Geometry classes are moved to the
org.elasticsearch.geomtery package.
Backport of #45332Closes#45048
This fixes the mappings and types required to run watcher and other
examples. A new set of seat data will be updated and available for
download to go with this change.
This commit adds CNAME reporting for transport.publish_address same way
it's done for http.publish_address.
Relates #32806
Relates #39970
(cherry picked from commit e0a2558a4c3a6b6fbfc6cd17ed34a6f6ef7b15a9)
* Introduce Spatial Plugin (#44389)
Introduce a skeleton Spatial plugin that holds new licensed features coming to
Geo/Spatial land!
* [GEO] Refactor DeprecatedParameters in AbstractGeometryFieldMapper (#44923)
Refactor DeprecatedParameters specific to legacy geo_shape out of
AbstractGeometryFieldMapper.TypeParser#parse.
* [SPATIAL] New ShapeFieldMapper for indexing cartesian geometries (#44980)
Add a new ShapeFieldMapper to the xpack spatial module for
indexing arbitrary cartesian geometries using a new field type called shape.
The indexing approach leverages lucene's new XYShape field type which is
backed by BKD in the same manner as LatLonShape but without the WGS84
latitude longitude restrictions. The new field mapper builds on and
extends the refactoring effort in AbstractGeometryFieldMapper and accepts
shapes in either GeoJSON or WKT format (both of which support non geospatial
geometries).
Tests are provided in the ShapeFieldMapperTest class in the same manner
as GeoShapeFieldMapperTests and LegacyGeoShapeFieldMapperTests.
Documentation for how to use the new field type and what parameters are
accepted is included. The QueryBuilder for searching indexed shapes is
provided in a separate commit.
* [SPATIAL] New ShapeQueryBuilder for querying indexed cartesian geometry (#45108)
Add a new ShapeQueryBuilder to the xpack spatial module for
querying arbitrary Cartesian geometries indexed using the new shape field
type.
The query builder extends AbstractGeometryQueryBuilder and leverages the
ShapeQueryProcessor added in the previous field mapper commit.
Tests are provided in ShapeQueryTests in the same manner as
GeoShapeQueryTests and docs are updated to explain how the query works.
If a pipeline that refrences the policy exists, we should not allow the
policy to be deleted. The user will need to remove the processor from
the pipeline before deleting the policy. This commit adds a check to
ensure that the policy cannot be deleted if it is referenced by any
pipeline in the system.
This change adds the support for the RankFeatureQuery in the HLRC by
providing an extra dependency on mapper-extras-client. It also removes
the dependency on lang-painless in mapper-extras which is not needed
anymore since the move of the vector field into a dedicated module.
Closes#43634
* SQL: ODBC: document newest conn string parameters
This commit adds the documentation for two newly added connection string
parameters: AutoEscapePVA and IndexIncludeFrozen.
It also removes the recommended OSes from the prerequisites list and
places the recommendation distinctively: the unmet prerequisites will
fail the installation, while the driver would install on other OSes than
those recommended.
* address review suggestions.
- adjust phrasing for clearer message.
(cherry picked from commit e18ac10c6e163a04f5b7cf7fa72f262882ffb711)
This commit replaces task_state and indexer_state in the
data frame _stats output with a single top level state
that combines the two. It is defined as:
- failed if what's currently reported as task_state is failed
- stopped if there is no persistent task
- Otherwise what's currently reported as indexer_state
Backport of #45276
Adds to the `index.blocks.read_only_allow_delete` docs the information that
this block may be added or removed automatically, and rewords the
breaking-changes docs to mention the blocks explicitly and to recommend using a
different block.
Relates #42559
* [ML][Data Frame] Add update transform api endpoint (#45154)
This adds the ability to `_update` stored data frame transforms. All mutable fields are applied when the next checkpoint starts. The exception being `description`.
This PR contains all that is necessary for this addition:
* HLRC
* Docs
* Server side
Our docs previously included several community plugins that are only supported for versions 5.x and earlier. This removes those plugins for our 6.6+ docs.
If a node exceeds the flood-stage disk watermark then we add a block to all of
its indices to prevent further writes as a last-ditch attempt to prevent the
node completely exhausting its disk space. However today this block remains in
place until manually removed, and this block is a source of confusion for users
who current have ample disk space and did not even realise they nearly ran out
at some point in the past.
This commit changes our behaviour to automatically remove this block when a
node drops below the high watermark again. The expectation is that the high
watermark is some distance below the flood-stage watermark and therefore the
disk space problem is truly resolved.
Fixes#39334
Previously, the reindex examples did not include `_doc` as the destination type.
This would result in the reindex failing with the error "Rejecting mapping
update to [users] as the final mapping would have more than 1 type: [_doc,
user]".
Relates to #43100.
Uses JDK 11's per-socket configuration of TCP keepalive (supported on Linux and Mac), see
https://bugs.openjdk.java.net/browse/JDK-8194298, and exposes these as transport settings.
By default, these options are disabled for now (i.e. fall-back to OS behavior), but we would like
to explore whether we can enable them by default, in particular to force keepalive configurations
that are better tuned for running ES.
Sometimes the recovery in this docs test takes long enough that it is expressed
in `s` rather than `ms`. This commit relaxes the assertion to account for this.
In many cases, including migration from previous versions of data
shippers (e.g. Beats), it is useful to use ILM to manage historical
indices, which are no longer being written to. This commit adds a guide
which gives an example of how to do that.
This adjusts the `buckets_path` parser so that pipeline aggs can
select specific buckets (via their bucket keys) instead of fetching
the entire set of buckets. This is useful for bucket_script in
particular, which might want specific buckets for calculations.
It's possible to workaround this with `filter` aggs, but the workaround
is hacky and probably less performant.
- Adjusts documentation
- Adds a barebones AggregatorTestCase for bucket_script
- Tweaks AggTestCase to use getMockScriptService() for reductions and
pipelines. Previously pipelines could just pass in a script service
for testing, but this didnt work for regular aggs. The new
getMockScriptService() method fixes that issue, but needs to be used
for pipelines too. This had a knock-on effect of touching MovFn,
AvgBucket and ScriptedMetric
Introduce shift field to MovingFunction aggregation.
By default, shift = 0. Behavior, in this case, is the same as before.
Increasing shift by 1 moves starting window position by 1 to the right.
To simply include current bucket to the window, use shift = 1
For center alignment (n/2 values before and after the current bucket), use shift = window / 2
For right alignment (n values after the current bucket), use shift = window.
Introduce shift field to MovingFunction aggregation.
By default, shift = 0. Behavior, in this case, is the same as before.
Increasing shift by 1 moves starting window position by 1 to the right.
To simply include current bucket to the window, use shift = 1
For center alignment (n/2 values before and after the current bucket), use shift = window / 2
For right alignment (n values after the current bucket), use shift = window.
Today the lag detector may remove nodes from the cluster if they fail to apply
a cluster state within a reasonable timeframe, but it is rather unclear from
the default logging that this has occurred and there is very little extra
information beyond the fact that the removed node was lagging. Moreover the
only forewarning that the lag detector might be invoked is a message indicating
that cluster state publication took unreasonably long, which does not contain
enough information to investigate the problem further.
This commit adds a good deal more detail to make the issues of slow nodes more
prominent:
- after 10 seconds (by default) we log an INFO message indicating that a
publication is still waiting for responses from some nodes, including the
identities of the problematic nodes.
- when the publication times out after 30 seconds (by default) we log a WARN
message identifying the nodes that are still pending.
- the lag detector logs a more detailed warning when a fatally-lagging node is
detected.
- if applying a cluster state takes too long then the cluster applier service
logs a breakdown of all the tasks it ran as part of that process.
Most of the circuit breaker settings are dynamically configurable.
However, `indices.breaker.total.use_real_memory` is not. With this
commit we add a clarifying note that this specific setting is static.
Closes#44974
In order to make it easier to interpret the output of the ILM Explain
API, this commit adds two request parameters to that API:
- `only_managed`, which causes the response to only contain indices
which have `index.lifecycle.name` set
- `only_errors`, which causes the response to contain only indices in an
ILM error state
"Error state" is defined as either being in the `ERROR` step or having
`index.lifecycle.name` set to a policy that does not exist.
This PR addresses the feedback in https://github.com/elastic/ml-team/issues/175#issuecomment-512215731.
* Adds an example to `analyzed_fields`
* Includes `source` and `dest` objects inline in the resource page
* Lists `model_memory_limit` in the PUT API page
* Amends the `analysis` section in the resource page
* Removes Properties headings in subsections
With this change, we will return primary_term and seq_no of the current
document if an update is detected as a noop. We already return the
version; hence we should also return seq_no and primary_term.
Relates #42497
Adds an API to clone an index. This is similar to the index split and shrink APIs, just with the
difference that the number of primary shards is kept the same. In case where the filesystem
provides hard-linking capabilities, this is a very cheap operation.
Indexing cloning can be done by running `POST my_source_index/_clone/my_target_index` and it
supports the same options as the split and shrink APIs.
Closes#44128
* Switch from using docvalue_fields to extracting values from _source
where applicable. Doing this means parsing the _source and handling the
numbers parsing just like Elasticsearch is doing it when it's indexing
a document.
* This also introduces a minor limitation: aliases type of fields that
are NOT part of a tree of sub-fields will not be able to be retrieved
anymore. field_caps API doesn't shed any light into a field being an
alias or not and at _source parsing time there is no way to know if a
root field is an alias or not. Fields of the type "a.b.c.alias" can be
extracted from docvalue_fields, only if the field they point to can be
extracted from docvalue_fields. Also, not all fields in a hierarchy of
fields can be evaluated to being an alias.
(cherry picked from commit 8bf8a055e38f00df5f49c8d97f632f69d6e00c2c)
We already have a note that the order of actions is up to ILM for each
phase, this commit puts the actions in the same order as they will be
executed.
Resolves#41729
The `// TEARDOWN` test snippet was added with #34716. You can use this
snippet to end and clean up a test series started with `// TESTSETUP` or
`// TEST[setup:name]`.
This change adjusts the data frame transforms stats
endpoint to return a structure that is easier to
understand.
This is a breaking change for clients of the data frame
transforms stats endpoint, but the feature is in beta so
stability is not guaranteed.
Backport of #44350
Some small clarifications about force-merging and global ordinals, particularly
that global ordinals are cheap on a single-segment index and how this relates
to frozen indices.
Fixes#41687
In data frame transforms the same scheduler controls both
retries in the event of search failures and gaps between
checks for changes when the transform is running continuously.
Co-Authored-By: Lisa Cawley <lcawley@elastic.co>
This adds a new dynamic cluster setting `xpack.data_frame.num_transform_failure_retries`.
This setting indicates how many times non-critical failures should be retried before a data frame transform is marked as failed and should stop executing. At the time of this commit; Min: 0, Max: 100, Default: 10
Several files in the REST APIs nav section are included using
:leveloffset: tags. This increments headings (h2 -> h3, h3 -> h4, etc.)
in those files and removes the :leveloffset: tags.
Other supporting changes:
* Alphabetizes top-level REST API nav items.
* Change 'indices APIs' heading to 'index APIs.'
* Changes 'Snapshot lifecycle management' heading to sentence case.
elastic/elasticsearch#41281 added custom metadata parameter to
snapshots. During review, the parameter name was changed from '_meta' to
'metadata,' but the documentation wasn't updated. This corrects the
documentation to use the 'metadata' name.
* Expose index age in ILM explain output
This adds the index's age to the ILM explain output, for example:
```
{
"indices" : {
"ilm-000001" : {
"index" : "ilm-000001",
"managed" : true,
"policy" : "full-lifecycle",
"lifecycle_date" : "2019-07-16T19:48:22.294Z",
"lifecycle_date_millis" : 1563306502294,
"age" : "1.34m",
"phase" : "hot",
"phase_time" : "2019-07-16T19:48:22.487Z",
... etc ...
}
}
}
```
This age can be used to tell when ILM will transition the index to the
next phase, based on that phase's `min_age`.
Resolves#38988
* Expose age in getters and in HLRC
Today we reroute the cluster as part of the process of starting a shard, which
runs at `URGENT` priority. In large clusters, rerouting may take some time to
complete, and this means that a mere trickle of shard-started events can cause
starvation for other, lower-priority, tasks that are pending on the master.
However, it isn't really necessary to perform a reroute when starting a shard,
as long as one occurs eventually. This commit removes the inline reroute from
the process of starting a shard and replaces it with a deferred one that runs
at `NORMAL` priority, avoiding starvation of higher-priority tasks.
Backport of #44433 and #44543.
Moves the following API sections under the REST APIs navigations:
- API Conventions
- Document APIs
- Search APIs
- Index APIs (previously named Indices APIs)
- cat APIs
- Cluster APIs
Other supporting changes:
- Removes the previous index APIs page under REST APIs. Adds a redirect for the removed page.
- Removes several [partintro] macros so the docs build correctly.
- Changes anchors for pages that become sections of a parent page.
- Adds several redirects for existing pages that become sections of a parent page.
This commit re-applies changes from #44238. Changes from that PR were reverted due to broken links in several repos. This commit adds redirects for those broken links.
The versioning of Update API doesn't rely on version number anymore (and
rather on sequence number). But in rest api level we ignored the
"version" and "version_type" parameter, so that the server cannot raise
the exception when whey were set.
This PR restores "version" and "version_type" parsing in Update Rest API
so that we can get the appropriate errors.
Relates to #42497
* Add Snapshot Lifecycle Management (#43934)
* Add SnapshotLifecycleService and related CRUD APIs
This commit adds `SnapshotLifecycleService` as a new service under the ilm
plugin. This service handles snapshot lifecycle policies by scheduling based on
the policies defined schedule.
This also includes the get, put, and delete APIs for these policies
Relates to #38461
* Make scheduledJobIds return an immutable set
* Use Object.equals for SnapshotLifecyclePolicy
* Remove unneeded TODO
* Implement ToXContentFragment on SnapshotLifecyclePolicyItem
* Copy contents of the scheduledJobIds
* Handle snapshot lifecycle policy updates and deletions (#40062)
(Note this is a PR against the `snapshot-lifecycle-management` feature branch)
This adds logic to `SnapshotLifecycleService` to handle updates and deletes for
snapshot policies. Policies with incremented versions have the old policy
cancelled and the new one scheduled. Deleted policies have their schedules
cancelled when they are no longer present in the cluster state metadata.
Relates to #38461
* Take a snapshot for the policy when the SLM policy is triggered (#40383)
(This is a PR for the `snapshot-lifecycle-management` branch)
This commit fills in `SnapshotLifecycleTask` to actually perform the
snapshotting when the policy is triggered. Currently there is no handling of the
results (other than logging) as that will be added in subsequent work.
This also adds unit tests and an integration test that schedules a policy and
ensures that a snapshot is correctly taken.
Relates to #38461
* Record most recent snapshot policy success/failure (#40619)
Keeping a record of the results of the successes and failures will aid
troubleshooting of policies and make users more confident that their
snapshots are being taken as expected.
This is the first step toward writing history in a more permanent
fashion.
* Validate snapshot lifecycle policies (#40654)
(This is a PR against the `snapshot-lifecycle-management` branch)
With the commit, we now validate the content of snapshot lifecycle policies when
the policy is being created or updated. This checks for the validity of the id,
name, schedule, and repository. Additionally, cluster state is checked to ensure
that the repository exists prior to the lifecycle being added to the cluster
state.
Part of #38461
* Hook SLM into ILM's start and stop APIs (#40871)
(This pull request is for the `snapshot-lifecycle-management` branch)
This change allows the existing `/_ilm/stop` and `/_ilm/start` APIs to also
manage snapshot lifecycle scheduling. When ILM is stopped all scheduled jobs are
cancelled.
Relates to #38461
* Add tests for SnapshotLifecyclePolicyItem (#40912)
Adds serialization tests for SnapshotLifecyclePolicyItem.
* Fix improper import in build.gradle after master merge
* Add human readable version of modified date for snapshot lifecycle policy (#41035)
* Add human readable version of modified date for snapshot lifecycle policy
This small change changes it from:
```
...
"modified_date": 1554843903242,
...
```
To
```
...
"modified_date" : "2019-04-09T21:05:03.242Z",
"modified_date_millis" : 1554843903242,
...
```
Including the `"modified_date"` field when the `?human` field is used.
Relates to #38461
* Fix test
* Add API to execute SLM policy on demand (#41038)
This commit adds the ability to perform a snapshot on demand for a policy. This
can be useful to take a snapshot immediately prior to performing some sort of
maintenance.
```json
PUT /_ilm/snapshot/<policy>/_execute
```
And it returns the response with the generated snapshot name:
```json
{
"snapshot_name" : "production-snap-2019.04.09-rfyv3j9qreixkdbnfuw0ug"
}
```
Note that this does not allow waiting for the snapshot, and the snapshot could
still fail. It *does* record this information into the cluster state similar to
a regularly trigged SLM job.
Relates to #38461
* Add next_execution to SLM policy metadata (#41221)
* Add next_execution to SLM policy metadata
This adds the next time a snapshot lifecycle policy will be executed when
retriving a policy's metadata, for example:
```json
GET /_ilm/snapshot?human
{
"production" : {
"version" : 1,
"modified_date" : "2019-04-15T21:16:21.865Z",
"modified_date_millis" : 1555362981865,
"policy" : {
"name" : "<production-snap-{now/d}>",
"schedule" : "*/30 * * * * ?",
"repository" : "repo",
"config" : {
"indices" : [
"foo-*",
"important"
],
"ignore_unavailable" : true,
"include_global_state" : false
}
},
"next_execution" : "2019-04-15T21:16:30.000Z",
"next_execution_millis" : 1555362990000
},
"other" : {
"version" : 1,
"modified_date" : "2019-04-15T21:12:19.959Z",
"modified_date_millis" : 1555362739959,
"policy" : {
"name" : "<other-snap-{now/d}>",
"schedule" : "0 30 2 * * ?",
"repository" : "repo",
"config" : {
"indices" : [
"other"
],
"ignore_unavailable" : false,
"include_global_state" : true
}
},
"next_execution" : "2019-04-16T02:30:00.000Z",
"next_execution_millis" : 1555381800000
}
}
```
Relates to #38461
* Fix and enhance tests
* Figured out how to Cron
* Change SLM endpoint from /_ilm/* to /_slm/* (#41320)
This commit changes the endpoint for snapshot lifecycle management from:
```
GET /_ilm/snapshot/<policy>
```
to:
```
GET /_slm/policy/<policy>
```
It mimics the ILM path only using `slm` instead of `ilm`.
Relates to #38461
* Add initial documentation for SLM (#41510)
* Add initial documentation for SLM
This adds the initial documentation for snapshot lifecycle management.
It also includes the REST spec API json files since they're sort of
documentation.
Relates to #38461
* Add `manage_slm` and `read_slm` roles (#41607)
* Add `manage_slm` and `read_slm` roles
This adds two more built in roles -
`manage_slm` which has permission to perform any of the SLM actions, as well as
stopping, starting, and retrieving the operation status of ILM.
`read_slm` which has permission to retrieve snapshot lifecycle policies as well
as retrieving the operation status of ILM.
Relates to #38461
* Add execute to the test
* Fix ilm -> slm typo in test
* Record SLM history into an index (#41707)
It is useful to have a record of the actions that Snapshot Lifecycle
Management takes, especially for the purposes of alerting when a
snapshot fails or has not been taken successfully for a certain amount of
time.
This adds the infrastructure to record SLM actions into an index that
can be queried at leisure, along with a lifecycle policy so that this
history does not grow without bound.
Additionally,
SLM automatically setting up an index + lifecycle policy leads to
`index_lifecycle` custom metadata in the cluster state, which some of
the ML tests don't know how to deal with due to setting up custom
`NamedXContentRegistry`s. Watcher would cause the same problem, but it
is already disabled (for the same reason).
* High Level Rest Client support for SLM (#41767)
* High Level Rest Client support for SLM
This commit add HLRC support for SLM.
Relates to #38461
* Fill out documentation tests with tags
* Add more callouts and asciidoc for HLRC
* Update javadoc links to real locations
* Add security test testing SLM cluster privileges (#42678)
* Add security test testing SLM cluster privileges
This adds a test to `PermissionsIT` that uses the `manage_slm` and `read_slm`
cluster privileges.
Relates to #38461
* Don't redefine vars
* Add Getting Started Guide for SLM (#42878)
This commit adds a basic Getting Started Guide for SLM.
* Include SLM policy name in Snapshot metadata (#43132)
Keep track of which SLM policy in the metadata field of the Snapshots
taken by SLM. This allows users to more easily understand where the
snapshot came from, and will enable future SLM features such as
retention policies.
* Fix compilation after master merge
* [TEST] Move exception wrapping for devious exception throwing
Fixes an issue where an exception was created from one line and thrown in another.
* Fix SLM for the change to AcknowledgedResponse
* Add Snapshot Lifecycle Management Package Docs (#43535)
* Fix compilation for transport actions now that task is required
* Add a note mentioning the privileges needed for SLM (#43708)
* Add a note mentioning the privileges needed for SLM
This adds a note to the top of the "getting started with SLM"
documentation mentioning that there are two built-in privileges to
assist with creating roles for SLM users and administrators.
Relates to #38461
* Mention that you can create snapshots for indices you can't read
* Fix REST tests for new number of cluster privileges
* Mute testThatNonExistingTemplatesAreAddedImmediately (#43951)
* Fix SnapshotHistoryStoreTests after merge
* Remove overridden newResponse functions that have been removed
* Fix compilation for backport
* Fix get snapshot output parsing in test
* [DOCS] Add redirects for removed autogen anchors (#44380)
* Switch <tt>...</tt> in javadocs for {@code ...}
After starting up elasticsearch the documentation said that their node
name was "6-bjhwl" but in the documentation's output I did not see that
node name. Instead I saw the node name as `localhost.localdomain`
Two new settings were introduced in #43669 to control the
behaviour of the Document Level Security BitSet cache.
This change adds documentation for these 2 settings.
Backport of: #44100
Test clusters currently has its own set of logic for dealing with
finding different versions of Elasticsearch, downloading them, and
extracting them. This commit converts testclusters to use the
DistributionDownloadPlugin.
* HLRC: Fix '+' Not Correctly Encoded in GET Req.
* Encode `+` correctly as `%2B` in URL paths
* Keep encoding `+` as space in URL parameters
* Closes#33077
This commit documents the backup and restore of a cluster's
security configuration.
It is not possible to only backup (or only restore) security
configuration, independent to the rest of the cluster's conf,
so this describes how a full configuration backup&restore
will include security as well. Moreover, it explains how part
of the security conf data resides on the special .security
index and how to backup that using regular data snapshot API.
Co-Authored-By: Lisa Cawley <lcawley@elastic.co>
Co-Authored-By: Tim Vernum <tim@adjective.org>
Previously a data frame transform would check whether the
source index was changed every 10 seconds. Sometimes it
may be desirable for the check to be done less frequently.
This commit increases the default to 60 seconds but also
allows the frequency to be overridden by a setting in the
data frame transform config.
* Provide an Option to Use Path-Style-Access with S3 Repo
* As discussed, added the option to use path style access back again and
deprecated it.
* Defaulted to `false`
* Added warning to docs
* Closes#41816
Currently when a document misses a vector value, vector function
returns 0 as a score for this document. We think this is incorrect
behaviour.
With this change, an error will be thrown if vector functions are
used with docs that are missing vector doc values.
Also VectorScriptDocValues is modified to allow size() function,
which can be used to check if a document has a value for the
vector field.
This PR adds the reference documentation pages of the data frame analytics APIs (PUT, START, STOP, GET, GET stats, DELETE, Evaluate) to the ML APIs pool.
This brings TokenizerFactory into line with CharFilterFactory and TokenFilterFactory,
and removes the need to pass around tokenizer names when building custom analyzers.
As this means that TokenizerFactory is no longer a functional interface, the commit also
adds a factory method to TokenizerFactory to make construction simpler.
This introduces a `failed` state to which the data frame analytics
persistent task is set to when something unexpected fails. It could
be the process crashing, the results processor hitting some error,
etc. The failure message is then captured and set on the task state.
From there, it becomes available via the _stats API as `failure_reason`.
The df-analytics stop API now has a `force` boolean parameter. This allows
the user to call it for a failed task in order to reset it to `stopped` after
we have ensured the failure has been communicated to the user.
This commit also adds the analytics version in the persistent task
params as this allows us to prevent tasks to run on unsuitable nodes in
the future.
This commit deprecates the `transport.profiles.*.xpack.security.type`
setting. This setting is used to configure a profile that would only
allow client actions. With the upcoming removal of the transport client
the setting should also be deprecated so that it may be removed in
a future version.
Typically, dense vectors of both documents and queries must have the same
number of dimensions. Different number of dimensions among documents
or query vector indicate an error. This PR enforces that all vectors
for the same field have the same number of dimensions. It also enforces
that query vectors have the same number of dimensions.
* [ML][Data Frame] add node attr to GET _stats (#43842)
* [ML][Data Frame] add node attr to GET _stats
* addressing testing issues with node.attributes
* adjusting for backport
This change explains why Painless doesn't natively support datetime now, and
gives examples of how to create a version of now through user-defined
parameters.
Currently the repsonse of the "_reload_search_analyzer" endpoint contains the
index names and nodeIds of indices were analyzers reloading was triggered. This
change add the names of the search-time analyzers that were reloaded.
Closes#43804
Clarifies the roles of a dedicated voting-only master-eligible node.
Co-Authored-By: James Rodewig <james.rodewig@elastic.co>
Co-Authored-By: David Turner <david.turner@elastic.co>
A few places in the documentation had mentioned 6.7 as the version to
upgrade from, when doing an upgrade to 7.0. While this is technically
possible, this commit will replace all those mentions to 6.8, as this is
the latest version with the latest bugfixes, deprecation checks and
ugprade assistant features - which should be the one used for upgrades.
Co-Authored-By: James Rodewig <james.rodewig@elastic.co>
This adds a `rare_terms` aggregation. It is an aggregation designed
to identify the long-tail of keywords, e.g. terms that are "rare" or
have low doc counts.
This aggregation is designed to be more memory efficient than the
alternative, which is setting a terms aggregation to size: LONG_MAX
(or worse, ordering a terms agg by count ascending, which has
unbounded error).
This aggregation works by maintaining a map of terms that have
been seen. A counter associated with each value is incremented
when we see the term again. If the counter surpasses a predefined
threshold, the term is removed from the map and inserted into a cuckoo
filter. If a future term is found in the cuckoo filter we assume it
was previously removed from the map and is "common".
The map keys are the "rare" terms after collection is done.
Following the removal of the `unzip` package from the Elasticsearch
Docker image in #39040, update setup instructions for TLS in Docker.
Also avoid cross-platform ownership+permission issues by not relying
on local bind mounts for storing generated certs and don't require
`curl` locally installed.
Backport of #43748
Removes the suggestion to use IP addresses for `cluster.initial_master_nodes`
in the "important settings" discovery docs, leaving only the suggestion to use
node names.
Relates #41179, #41569
This commit merges the `object-fields` feature branch. The new 'flattened
object' field type allows an entire JSON object to be indexed into a field, and
provides limited search functionality over the field's contents.
This commit adds a wildcard intervals source, similar to the prefix. It
also changes the term parameter in prefix to read prefix, to bring it
in to line with the pattern parameter in wildcard.
Closes#43198
Currently changing resources (like dictionaries, synonym files etc...) of search
time analyzers is only possible by closing an index, changing the underlying
resource (e.g. synonym files) and then re-opening the index for the change to
take effect.
This PR adds a new API endpoint that allows triggering reloading of certain
analysis resources (currently token filters) that will then pick up changes in
underlying file resources. To achieve this we introduce a new type of custom
analyzer (ReloadableCustomAnalyzer) that uses a ReuseStrategy that allows
swapping out analysis components. Custom analyzers that contain filters that are
markes as "updateable" will automatically choose this implementation. This PR
also adds this capability to `synonym` token filters for use in search time
analyzers.
Relates to #29051
We should throw an exception at construction time if a list of
articles is not provided, otherwise we can get random NPEs during
indexing.
Relates to #43002
It is possible for internal ML indices like `.data-frame-notifications-1` to leak,
causing other docs tests to fail when they accidentally search over these
indices. This PR updates the ignore_above tests to only search a specific index.
This commit adds a prefix intervals source, allowing you to search
for intervals that contain terms starting with a given prefix. The source
can make use of the index_prefixes mapping option.
Relates to #43198
* [ML][Data Frame] Add support for allow_no_match for endpoints (#43490)
* [ML][Data Frame] Add support for allow_no_match parameter in endpoints
Adds support for:
* Get Transforms
* Get Transforms stats
* stop transforms
* Update DataFrameTransformDocumentationIT.java
Given a nested structure composed of Lists and Maps, getByPath will return the value
keyed by path. getByPath is a method on Lists and Maps.
The path is string Map keys and integer List indices separated by dot. An optional third
argument returns a default value if the path lookup fails due to a missing value.
Eg.
['key0': ['a', 'b'], 'key1': ['c', 'd']].getByPath('key1') = ['c', 'd']
['key0': ['a', 'b'], 'key1': ['c', 'd']].getByPath('key1.0') = 'c'
['key0': ['a', 'b'], 'key1': ['c', 'd']].getByPath('key2', 'x') = 'x'
[['key0': 'value0'], ['key1': 'value1']].getByPath('1.key1') = 'value1'
Throws IllegalArgumentException if an item cannot be found and a default is not given.
Throws NumberFormatException if a path element operating on a List is not an integer.
Fixes#42769
This change introduces a new setting,
xpack.ml.process_connect_timeout, to enable
the timeout for one of the external ML processes
to connect to the ES JVM to be increased.
The timeout may need to be increased if many
processes are being started simultaneously on
the same machine. This is unlikely in clusters
with many ML nodes, as we balance the processes
across the ML nodes, but can happen in clusters
with a single ML node and a high value for
xpack.ml.node_concurrent_job_allocations.
A voting-only master-eligible node is a node that can participate in master elections but will not act
as a master in the cluster. In particular, a voting-only node can help elect another master-eligible
node as master, and can serve as a tiebreaker in elections. High availability (HA) clusters require at
least three master-eligible nodes, so that if one of the three nodes is down, then the remaining two
can still elect a master amongst them-selves. This only requires one of the two remaining nodes to
have the capability to act as master, but both need to have voting powers. This means that one of
the three master-eligible nodes can be made as voting-only. If this voting-only node is a dedicated
master, a less powerful machine or a smaller heap-size can be chosen for this node. Alternatively, a
voting-only non-dedicated master node can play the role of the third master-eligible node, which
allows running an HA cluster with only two dedicated master nodes.
Closes#14340
Co-authored-by: David Turner <david.turner@elastic.co>
This merges the initial work that adds a framework for performing
machine learning analytics on data frames. The feature is currently experimental
and requires a platinum license. Note that the original commits can be
found in the `feature-ml-data-frame-analytics` branch.
A new set of APIs is added which allows the creation of data frame analytics
jobs. Configuration allows specifying different types of analysis to be performed
on a data frame. At first there is support for outlier detection.
The APIs are:
- PUT _ml/data_frame/analysis/{id}
- GET _ml/data_frame/analysis/{id}
- GET _ml/data_frame/analysis/{id}/_stats
- POST _ml/data_frame/analysis/{id}/_start
- POST _ml/data_frame/analysis/{id}/_stop
- DELETE _ml/data_frame/analysis/{id}
When a data frame analytics job is started a persistent task is created and started.
The main steps of the task are:
1. reindex the source index into the dest index
2. analyze the data through the data_frame_analyzer c++ process
3. merge the results of the process back into the destination index
In addition, an evaluation API is added which packages commonly used metrics
that provide evaluation of various analysis:
- POST _ml/data_frame/_evaluate
The existing language was misleading about the model snapshots and where they are located. Saying "to disk" sounds like files external to Elasticsearch IMO. It raises the obvious question, where on disk? which node? Is it in the Elasticsearch snapshot repo? The model snapshots are held in an internal index.
* Example of how to set slow logs dynamically per-index
* Make _settings API example more explicit
Co-Authored-By: James Rodewig <james.rodewig@elastic.co>
* Add TEST directive to fix CI
Co-Authored-By: James Rodewig <james.rodewig@elastic.co>
the geo-bounding-box and phrase-suggest docs were susceptible to
failing due to other indices in the cluster. This change restricts
the queries to the index that is set up for the test.
relates to #43271.
This commit tweaks the docs for secure settings to ensure the user is
aware adding non secure settings to the keystore will result in
elasticsearch not starting.
fixes#43328
Co-Authored-By: James Rodewig <james.rodewig@elastic.co>
Together with types removal, any mention of "fields with the same name in the same index" doesn't make sense anymore.
(cherry picked from commit c5190106cbd4c007945156249cce462956933326)
* [ML][Data Frame] adds new pipeline field to dest config (#43124)
* [ML][Data Frame] adds new pipeline field to dest config
* Adding pipeline support to _preview
* removing unused import
* moving towards extracting _source from pipeline simulation
* fixing permission requirement, adding _index entry to doc
* adjusting for java 8 compatibility
* adjusting bwc serialization version to 7.3.0
These docs were misleading for package installations of
Elasticsearch. Instead, we should refer to $ES_CONFIG/ingest-geoip as
the path to place the custom database files. For non-package
installations, this is the same as $ES_HOME/config, but for package
installations this is not the case as the config directory for package
installations is /etc/elasticsearch, and is not relative to
$ES_HOME. This commit corrects the docs.
* [DOCS] Add introduction to Elasticsearch.
* [DOCS] Incorporated review comments.
* [DOCS] Minor edits to add an abbreviated title and cross refs.
* [DOCS] Added sizing tips & link to quantatative sizing video.
To be consistent with the `search.max_buckets` default setting,
set the hard limit of the PriorityQueue used for in memory sorting,
when sorting on an aggregate function, to 10000.
Fixes: #43168
(cherry picked from commit 079e012fdea68ea0a7daae078359495047e9c407)
The machine learning feature of xpack has native binaries with a
different commit id than the rest of code. It is currently exposed in
the xpack info api. This commit adds that commit information to the ML
info api, so that it may be removed from the info api.
Now emphasises the test is for indexed values.
Previous documentation only mentioned the state of the input JSON doc (null values) but this is only one of several reasons why an indexed value may not exist.
Closes#24256
The description field of xpack featuresets is optionally part of the
xpack info api, when using the verbose flag. However, this information
is unnecessary, as it is better left for documentation (and the existing
descriptions describe anything meaningful). This commit removes the
description field from feature sets.
Rest docs page update
- have the section be on separate pages
- add an Overview page
- add other formats examples
(cherry picked from commit 309bd691ff3f8625f67ca09fc1dd8e265f7e6c92)
* [ML] Adding support for geo_shape, geo_centroid, geo_point in datafeeds
* only supporting doc_values for geo_point fields
* moving validation into GeoPointField ctor
Previously, a reindex request had two different size specifications in the body:
* Outer level, determining the maximum documents to process
* Inside the source element, determining the scroll/batch size.
The outer level size has now been renamed to max_docs to
avoid confusion and clarify its semantics, with backwards compatibility and
deprecation warnings for using size.
Similarly, the size parameter has been renamed to max_docs for
update/delete-by-query to keep the 3 interfaces consistent.
Finally, all 3 endpoints now support max_docs in both body and URL.
Relates #24344
This change adds the earliest and latest timestamps into
the field stats for fields of type "date" in the output of
the ML find_file_structure endpoint. This will enable the
cards for date fields in the file data visualizer in the UI
to be made to look more similar to the cards for date
fields in the index data visualizer in the UI.
Adds a metadata field to snapshots which can be used to store arbitrary
key-value information. This may be useful for attaching a description of
why a snapshot was taken, tagging snapshots to make categorization
easier, or identifying the source of automatically-created snapshots.
The `replace` option in the phonetic token filter can have suprising side
effects, e.g. such as described in #26921. This PR adds a note to be mindful
about such scenarios and offers alternatives to using the `replace` option.
Closes#26921
This change abstracts the specific types away from the different
representations of datetime as a datetime representation in code can be all
kinds of different things. This defines the three most common types of
datetimes as numeric, string, and complex while outlining the type most
typically used for these as long, String, and ZonedDateTime, respectively.
Documentation uses the definitions while examples use the types. This makes
the documentation easier to consume especially for people from a non-Java
background.
This commit adds functionality so that aliases that are manipulated on
leader indices are replicated by the shard follow tasks to the follower
indices. Note that we ignore write indices. This is due to the fact that
follower indices do not receive direct writes so the concept is not
useful.
Relates #41815
Adding notes to the existing docs about how using `preference` might increase
request cache utilization but also add warning about the downsides.
Closes#24278
This commit addresses a few more frequently-asked questions:
* clarifies that bootstrapping doesn't happen even after a full cluster
restart.
* removes the example that uses IP addresses, to try and further encourage the
use of node names for bootstrapping.
* clarifies that auto-bootstrapping might form different clusters on different
hosts, and gives a process for starting again if this wasn't what you wanted.
* adds the "do not stop half-or-more of the master-eligible nodes" slogan that
was notably absent.
* reformats one of the console examples to a narrower width
For `multi_match` query: link `boost` param to the generic reference
for query usage and `slop` to the `match_phrase` query where its usage
is documented.
Fixes: #40091
(cherry picked from commit 69993049a8bd9e7f042935729fe69a8266d95a0a)
Add an explanatory NOTE section to draw attention to the difference
between small and capital letters used for the index date patterns.
e.g.: HH vs hh, MM vs mm.
Closes: #22322
(cherry picked from commit c8125417dc33215651f9bb76c9b1ffaf25f41caf)
Fix a couple of wrong links because of the order of the anchor
and the usage of backquotes.
(cherry picked from commit 4e0c6525153b60a57202937c2ae57968c8e35285)
When analysing a semi-structured text file the
find_file_structure endpoint merges lines to form
multi-line messages using the assumption that the
first line in each message contains the timestamp.
However, if the timestamp is misdetected then this
can lead to excessive numbers of lines being merged
to form massive messages.
This commit adds a line_merge_size_limit setting
(default 10000 characters) that halts the analysis
if a message bigger than this is created. This
prevents significant CPU time being spent subsequently
trying to determine the internal structure of the
huge bogus messages.
Adding an example of how to re-implement the polish stempel analyzer
in case a user want to modify or extend it. In order for the analyzer to be
able to use polish stopwords, also registering a polish_stop filter for the
stempel plugin.
Closes#13150
This commit clones the existing AnalyzeRequest/AnalyzeResponse classes
to the high-level rest client, and adjusts request converters to use these new
classes.
This is a prerequisite to removing the Streamable interface from the internal
server version of these classes.
This PR updates the docs for `docvalue_fields` and `stored_fields` to clarify
that nested fields must be accessed through `inner_hits`. It also tweaks the
nested fields documentation to make this point more visible.
Addresses #23766.
In AsciiDoc, `subs="attributes,callouts,macros"` options were required
to render `include-tagged::` in a code block.
With elastic/docs#827, Elasticsearch Reference documentation migrated
from AsciiDoc to Asciidoctor.
In Asciidoctor, the `subs="attributes,callouts,macros"` options are no
longer needed to render `include-tagged::` in a code block. This commit
removes those unneeded options.
Resolves#41589
Several `ifdef::asciidoctor` conditionals were added so that AsciiDoc
and Asciidoctor doc builds rendered consistently.
With https://github.com/elastic/docs/pull/827, Elasticsearch Reference
documentation migrated completely to Asciidoctor. We no longer need to
support AsciiDoc so we can remove these conditionals.
Resolves#41722
Several `ifdef::asciidoctor` conditionals were added so that AsciiDoc
and Asciidoctor doc builds rendered consistently.
With https://github.com/elastic/docs/pull/827, Elasticsearch Reference
documentation migrated completely to Asciidoctor. We no longer need to
support AsciiDoc so we can remove these conditionals.
Resolves#41722