* [ML] Add data frame task state object and field
* A new state item is added so that the overall task state can be
accoutned for
* A new FAILED state and reason have been added as well so that failures
can be shown to the user for optional correction
* Addressing PR comments
* adjusting after master merge
* addressing pr comment
* Adjusting auditor usage with failure state
* Refactor, renamed state items to task_state and indexer_state
* Adding todo and removing redundant auditor call
* Address HLRC changes and PR comment
* adjusting hlrc IT test
To avoid having to specify each spec by hand (which can miss specs to be
added), the test infrastructure now performs classpath discovery so that
each spec added, is automatically considered.
Relates #40358
(cherry picked from commit d0f60b4425c731509aa8ca765d55f563f866ef90)
* [ML] make source and dest objects in the transform config
* addressing PR comments
* Fixing compilation post merge
* adding comment for Arrays.hashCode
* addressing changes for moving dest to object
* fixing data_frame yml tests
* fixing API test
Extend CAST to support all data types notations (whether SQL or ES
specific)
Fix#40282
(cherry picked from commit eb2ee8a344da946920598839a5db76c8bb9bc3fe)
This commit expands the ccr overview page to include more information
about the lifecycle of following an index. It adds information linking
to the remote recovery documentation. And describes how an index can
fall-behind and how to fix it when this happens.
This is the equivalent of the `field_masking_span` query, allowing users to
merge intervals from multiple fields - for example, to search for stemmed tokens
near unstemmed tokens.
This change adds an option to convert a `date` field to nanoseconds resolution
and a `date_nanos` field to millisecond resolution when sorting.
The resolution of the sort can be set using the `numeric_type` option of the
field sort builder. The conversion is done at the shard level and is restricted
to dates from 1970 to 2262 for the nanoseconds resolution in order to avoid
numeric overflow.
The new cluster coordination subsystem introduced in 7.0 will only keep an
unresponsive node in the cluster for 30 seconds, whereas in earlier versions it
might have remained in the cluster for 90 seconds. This commit adds a note to
the migration documentation to that effect.
This commit clarifies how the gateway selection works when configuring
remote clusters for CCR or CCS. Specifically, it clarifies compatibility
between different versions which is a very common question.
The Migration Assistance API has been functionally replaced by the
Deprecation Info API, and the Migration Upgrade API is not used for the
transition from ES 6.x to 7.x, and does not need to be kept around to
repair indices that were not properly upgraded before upgrading the
cluster, as was the case in 6.
For cases where fields can have multi values, allow the behavior to be
customized through a dedicated configuration field.
By default this will be enabled on the drivers so that existing datasets
work instead of throwing an exception.
For regular SQL usage, the behavior is false so that the user is aware
of the underlying data.
Fix#39700
(cherry picked from commit 2b351571961f172fd59290ee079126bbd081ceaf)
The `GET /_cluster/state` API returns an internal representation of the cluster
state that does change from version to version. It's useful for debugging, but
it is not intended for regular use by clients.
This change adjusts the documentation of `GET /_cluster/state` to clarify that
this API yields an internal representation that should not be expected to
remain stable between versions.
Relates #40061, #40016
This change adds an option to the `FieldSortBuilder` that allows to transform the type
of a numeric field into another. Possible values for this option are `long` that transforms
the source field into an integer and `double` that transforms the source field into a floating point.
This new option is useful for cross-index search when the sort field is mapped differently on some
indices. For instance if a field is mapped as a floating point in one index and as an integer in another
it is possible to align the type for both indices using the `numeric_type` option:
```
{
"sort": {
"field": "my_field",
"numeric_type": "double" <1>
}
}
```
<1> Ensure that values for this field are transformed to a floating point if needed.
Currently if a field alias is updated, any percolator queries that contain the
alias will still refer to its old target. This PR documents the issue while we
look into addressing it.
Relates to #37212.
`document_type` is the type to use for parsing the document to percolate, which
is already optional and deprecated. However `percotale` queries also have the
ability to percolate existing documents, identified by an index, an id and a
type. This change makes the latter optional and deprecated.
Closes#39963
This commit, mostly authored by @DaveCTurner,
adds documentation for elasticsearch-node tool #37696.
(cherry picked from commit 09425d5a5158c2d3fdad794411b3bbc4bba47b15)
Lucene 8.0 includes a [change](https://issues.apache.org/jira/browse/LUCENE-8635)
that moves the terms index off-heap for all fields but ID fields. I'm
including this in the migration notes so that users who have queries that match
lots of terms won't be surprised in case of slowdown.
- new `rank_feature`/`script_score` queries
- new `index_phrases`/`index_prefixes` options
- disabling `_field_names` doesn't help anymore
- adaptive replica selection is on by default
We were missing release notes for 7.0.0-alpha1. I generated them by running
the release-notes script with a quick hack that filtered out pull requests
that had been closed on or after 2018-11-15.
Some breaking changes had been documented in the release notes rather than
the migration guide so I moved them.
Prior to this commit (and after 6.5.0), if an ingest node changes
the _index in a pipeline, the original target index would be created.
For daily indexes this could create an extra, empty index per day.
This commit changes the TransportBulkAction to execute the ingest node
pipeline before attempting to create the index. This ensures that the
only index created is the original or one set by the ingest node pipeline.
This was the execution order prior to 6.5.0 (#32786).
The execution order was changed in 6.5 to better support default pipelines.
Specifically the execution order was changed to be able to read the settings
from the index meta data. This commit also includes a change in logic such
that if the target index does not exist when ingest node pipeline runs, it
will now pull the default pipeline (if one exists) from the settings of the
best matched of the index template.
Relates #32786
Relates #32758Closes#36545
This commit introduces the forget follower API. This API is needed in cases that
unfollowing a following index fails to remove the shard history retention leases
on the leader index. This can happen explicitly through user action, or
implicitly through an index managed by ILM. When this occurs, history will be
retained longer than necessary. While the retention lease will eventually
expire, it can be expensive to allow history to persist for that long, and also
prevent ILM from performing actions like shrink on the leader index. As such, we
introduce an API to allow for manual removal of the shard history retention
leases in this case.
This change does the following:
1. Makes the per-node setting xpack.ml.max_open_jobs
into a cluster-wide dynamic setting
2. Changes the job node selection to continue to use the
per-node attributes storing the maximum number of open
jobs if any node in the cluster is older than 7.1, and
use the dynamic cluster-wide setting if all nodes are on
7.1 or later
3. Changes the docs to reflect this
4. Changes the thread pools for native process communication
from fixed size to scaling, to support the dynamic nature
of xpack.ml.max_open_jobs
5. Renames the autodetect thread pool to the job comms
thread pool to make clear that it will be used for other
types of ML jobs (data frame analytics in particular)
Backport of #39320
This is related to #35975. It adds documentation on the remote recovery
process. Additionally, it adds documentation about the various settings
that can impact the process.
* SYS COLUMNS will skip UNSUPPORTED field types in ODBC and JDBC, as well.
NESTED and OBJECT types were already skipped in ODBC mode, now they are
skipped in JDBC mode, as well.
(cherry picked from commit 9e0df64b2d36c9069dfa506570468f0522c86417)
These docs are out of date, now that we override the infinite DNS cache
within Elasticsearch. This commit completely removes this content, as
specific guidance is no longer needed here.
* Add "columnar" option for REST requests (but be lenient for non-"plain"
modes) for json, yaml, smile and cbor formats.
* Updated documentation
(cherry picked from commit 5b7e0de237fb514d14a61a347bc669d4b4adbe56)
While running these commands from alias, facing issues using kill `cat pid`, In some situations, the more compact:
```
pkill -F /var/run/myProcess.pid
```
is the way to go.
* Remove Hipchat support from Watcher (#39199)
Hipchat has been shut down and has previously been deprecated in
Watcher (#39160), therefore we should remove support for these actions.
* Add migrate note
This fixes a bug in the sensing of the current OS family in the test cluster
formation code. Previously all builds would assume every environment
was windows and would jump to using the windows zip build. This fixes
the OS sensing code as well as updates some tests to account for
different build flavors.
Backport of #38457
Currently remote compression and ping schedule settings are dynamic.
However, we do not listen for changes. This commit adds listeners for
changes to those two settings. Additionally, when those settings change
we now close existing connections and open new ones with the settings
applied.
Fixes#37201.
In #30209 we deprecated the camel case `nGram` filter name in favour of `ngram` and
did the same for `edgeNGram` and `edge_ngram` and we are removing those names in
8.0. This change disallows using the deprecated names for new indices created in 7.0 by
throwing an error if these filters are used.
Relates to #38911
* missing 'test2' index example (#39055)
If I got the idea of aliases properly, I think that the index "test2" should have
a reference in the example above of the following sentence:
" ... we associate the alias `alias1` to both `test` and `test2` ... "
* add PUT test2
* Update aliases.asciidoc
swap which is write/read
Co-authored-by: Guilherme Ferreira <guilhermeaferreira_t@yahoo.com.br>
Rollup jobs should be stopped + deleted before the indices are removed.
It's possible for an active rollup job to issue a bulk request, the test
ends and the cleanup code deletes all indices. The in-flight bulk
request will then stall + error because the index no-longer exists...
but this process might take longer than the StopRollup timeout.
Which means the test fails, and often fails several other tests since
the job is still active (e.g. other tests cannot create the same-named
job, or fail to stop the job in their cleanup because it's still stalled).
This tends to knock over several tests before the bulk finally times
out and the job shuts down.
Instead, we need to simply stop jobs first. Inflight bulks will resolve
quickly, and we can carry on with deleting indices after the jobs are
confirmed inactive.
stop-job.asciidoc tended to trigger this issue because it executed
an async stop API and then exited, which setup the above situation. In
can and did happen with other tests though. As an extra precaution,
the doc test was modified to substitute in wait_for_completion
to help head off these issues too.