There are currently half a dozen ways to add plugins and modules for
test clusters to use. All of them require the calling project to peek
into the plugin or module they want to use to grab its bundlePlugin
task, and then both depend on that task, as well as extract the archive
path the task will produce. This creates cross project dependencies that
are difficult to detect, and if the dependent plugin/module has not yet
been configured, the build will fail because the task does not yet
exist.
This commit makes the plugin and module methods for testclusters
symmetetric, and simply adding a file provider directly, or a project
path that will produce the plugin/module zip. Internally this new
variant uses normal configuration/dependencies across projects to get
the zip artifact. It also has the added benefit of no longer needing the
caller to add to the test task a dependsOn for bundlePlugin task.
We now link to the top-level keyword type family page instead of its individual
subsections. This better fits the page format, where each type name is a link.
This commit enhances the verbose output for the
`_ingest/pipeline/_simulate?verbose` api. Specifically
this adds the following:
* the pipeline processor is now included in the output
* the conditional (if) and result is now included in the output iff it was defined
* a status field is always displayed. the possible values of status are
* `success` - if the processor ran with out errors
* `error` - if the processor ran but threw an error that was not ingored
* `error_ignored` - if the processor ran but threw an error that was ingored
* `skipped` - if the process did not run (currently only possible if the if condition evaluates to false)
* `dropped` - if the the `drop` processor ran and dropped the document
* a `processor_type` field for the type of processor (e.g. set, rename, etc.)
* throw a better error if trying to simulate with a pipeline that does not exist
closes#56004
This commit adds the functionality to allocate newly created indices on nodes in the "hot" tier by
default when they are created.
This does not break existing behavior, as nodes with the `data` role are considered to be part of
the hot tier. Users that separate their deployments by using the `data_hot` (and `data_warm`,
`data_cold`, `data_frozen`) roles will have their data allocated on the hot tier nodes now by
default.
This change is a little more complicated than changing the default value for
`index.routing.allocation.include._tier` from null to "data_hot". Instead, this adds the ability to
have a plugin inject a setting into the builder for a newly created index. This has the benefit of
allowing this setting to be visible as part of the settings when retrieving the index, for example:
```
// Create an index
PUT /eggplant
// Get an index
GET /eggplant?flat_settings
```
Returns the default settings now of:
```json
{
"eggplant" : {
"aliases" : { },
"mappings" : { },
"settings" : {
"index.creation_date" : "1597855465598",
"index.number_of_replicas" : "1",
"index.number_of_shards" : "1",
"index.provided_name" : "eggplant",
"index.routing.allocation.include._tier" : "data_hot",
"index.uuid" : "6ySG78s9RWGystRipoBFCA",
"index.version.created" : "8000099"
}
}
}
```
After the initial setting of this setting, it can be treated like any other index level setting.
This new setting is *not* set on a new index if any of the following is true:
- The index is created with an `index.routing.allocation.include.<anything>` setting
- The index is created with an `index.routing.allocation.exclude.<anything>` setting
- The index is created with an `index.routing.allocation.require.<anything>` setting
- The index is created with a null `index.routing.allocation.include._tier` value
- The index was created from an existing source metadata (shrink, clone, split, etc)
Relates to #60848
* updated shard limit doc
As the documentation was not so clear. I have updated saying this limit includes open indices with unassigned primaries and replicas count towards the limit.
* [DOCS] Incorporated edits.
Co-authored-by: Deb Adair <debadair@elastic.co>
Co-authored-by: gadekishore <50092970+gadekishore@users.noreply.github.com>
Backport to add case insensitive support for regex queries.
Forks a copy of Lucene’s RegexpQuery and RegExp from Lucene master.
This can be removed when 8.7 Lucene is released.
Closes#59235
The building block of the eql response is currently the SearchHit. This
is a problem since it is tied to an actual search, and thus has scoring,
highlighting, shard information and a lot of other things that are not
relevant for EQL.
This becomes a problem when doing sequence queries since the response is
not generated from one search query and thus there are no SearchHits to
speak of.
Emulating one is not just conceptually incorrect but also problematic
since most of the data is missed or made-up.
As such this PR introduces a simple class, Event, that maps nicely to
the terminology while hiding the ES internals (the use of SearchHit or
GetResult/GetResponse depending on the API used).
Fix#59764Fix#59779
Co-authored-by: Igor Motov <igor@motovs.org>
(cherry picked from commit 997376fbe6ef2894038968842f5e0635731ede65)
No-op changes to:
* Move `Search your data` source files into the same directory
* Rename `Search your data` source files based on page ID
* Remove unneeded includes
* Remove the `Request` dir
* [ML] adding docs + hlrc for data frame analysis feature_processors (#61149)
Adds HLRC and some docs for the new feature_processors field in Data frame analytics.
Co-authored-by: Przemysław Witek <przemyslaw.witek@elastic.co>
Co-authored-by: Lisa Cawley <lcawley@elastic.co>
Changes:
* Removes narrative around URI searches. These aren't commonly used in production. The `q` param is already covered in the search API docs: https://www.elastic.co/guide/en/elasticsearch/reference/master/search-search.html#search-api-query-params-q
* Adds a common options section that highlights narrative docs for query DSL, aggregations, multi-index search, search fields, pagination, sorting, and async search.
* Adds a `Search shard routing` page. Moves narrative docs for adaptive replica selection, preference, routing , and shard limits to that section.
* Moves search timeout and cancellation content to the `Search your data` page.
* Creates a `Search multiple data streams and indices` page. Moves related narrative docs for multi-target syntax searches and `indices_boost` to that page.
* Removes narrative examples for the `search_type` parameters. Moves documentation for this parameter to the search API docs.
Previously migration guide incorrectly stated that joda-time patterns have to be fixed before upgrading to 7.x
since (7.7) #52555 and our bwc policy 6.x created indices even with joda-time are supported
relates #60374
Per #35284, it looks like we changed this from a max field expansions limit to a soft limit using the `indices.query.bool.max_clause_count` dynamic cluster settting.
* First crack at rewriting the CCR introduction.
* Emphasizing Kibana in configuring CCR (part one).
* Many more edits, plus new files.
* Fixing test case.
* Removing overview page and consolidating that information in the main page.
* Adding redirects for moved and deleted pages.
* Removing, consolidating, and adding redirects.
* Fixing duplicate ID in redirects and removing outdated reference.
* Adding test case and steps for recreating a follower index.
* Adding steps for managing CCR tasks in Kibana.
* Adding tasks for managing auto-follow patterns.
* Fixing glossary link.
* Fixing glossary link, again.
* Updating the upgrade information and other stuff.
* Apply suggestions from code review
* Incorporating review feedback.
* Adding more edits.
* Fixing link reference.
* Adding use cases for #59812.
* Incorporating feedback from reviewers.
* Apply suggestions from code review
* Incorporating more review comments.
* Condensing some of the steps for accessing Kibana.
* Incorporating small changes from reviewers.
Adds an important admonition for the built-in `metrics-*-*` and `logs-*-*` index
templates.
Updates several put index template snippets to include a priority.
Followup to #60216, fixing the formatting of
`transport.tcp.reuse_address` and clarifying some wording around the
distinction between the transport and HTTP layers.
Changes:
* Moves "Notes" sections for the joining queries and percolate query
pages to the parent page
* Adds related redirects for the moved "Notes" pages
* Assigns explicit anchor IDs to other "Notes" headings. This was required for
the redirects to work.
This adds a frozen phase to ILM that will allow the execution of the
set_priority, unfollow, allocate, freeze and searchable_snapshot actions.
The frozen phase will be executed after the cold and before the delete phase.
(cherry picked from commit 6d0148001c3481290ed7e60dab588e0191346864)
Signed-off-by: Andrei Dan <andrei.dan@elastic.co>
Elasticsearch currently blocks writes by default when a master is unavailable. The cluster.no_master_block setting allows
a user to change this behavior to also block reads when a master is unavailable. This PR introduces a way to now also still
allow writes when a master is offline. Writes will continue to work as long as routing table changes are not needed (as
those require the master for consistency), or if dynamic mapping updates are not required (as again, these require the
master for consistency).
Eventually we should switch the default of cluster.no_master_block to this new mode.
This commit introduces a new thread pool, `system_read`, which is
intended for use by system indices for all read operations (get and
search). The `system_read` pool is a fixed thread pool with a maximum
number of threads equal to lesser of half of the available processors
or 5. Given the combination of both get and read operations in this
thread pool, the queue size has been set to 2000. The motivation for
this change is to allow system read operations to be serviced in spite
of the number of user searches.
In order to avoid a significant performance hit due to pattern matching
on all search requests, a new metadata flag is added to mark indices
as system or non-system. Previously created system indices will have
flag added to their metadata upon upgrade to a version with this
capability.
Additionally, this change also introduces a new class, `SystemIndices`,
which encapsulates logic around system indices. Currently, the class
provides a method to check if an index is a system index and a method
to find a matching index descriptor given the name of an index.
Relates #50251
Relates #37867
Backport of #57936
Split the autoscaling decider into a service and configuration
in order to enable having additional context information available
in the service. Added AutoscalingDeciderContext holding generic
information all deciders are expected to need. Implemented GET
_autoscaling/decision
There is no point in timing out a join attempt any more once a cluster
is entirely in 7.x. Timing out and retrying with the same master is
pointless, and an in-flight join attempt to one master no longer blocks
attempts to join other masters. This commit deprecates this unnecessary
setting and removes its effect from the joining process.
Relates #60873 which removes this setting in master.
This adds a force-merge step to the searchable snapshot action, enabled by default,
but parameterizable using the `force_merge-index" optional boolean.
eg.
```
PUT _ilm/policy/my_policy
{
"policy": {
"phases": {
"cold": {
"actions": {
"searchable_snapshot" : {
"snapshot_repository" : "backing_repo",
"force_merge_index": true
}
}
}
}
}
}
```
(cherry picked from commit d0a17b2d35f1b083b574246bdbf3e1929471a4a9)
Signed-off-by: Andrei Dan <andrei.dan@elastic.co>
This commit makes IpFieldMapper extend ParametrizedFieldMapper. It also
updates the IpFieldMapper docs to add the ignore_malformed parameter,
which was not previously documented.
Changes:
* Moves `Retrieve selected fields` to its own page and adds a title abbreviation.
* Adds existing script and stored fields content to `Retrieve selected fields`
* Adds a xref for `Retrieve selected fields` to `Search your data`
* Adds related redirects and updates existing xrefs
Uses `my-data-stream` in place of `logs` for data stream examples.
This provides a more intuitive experience for users that copy/paste
their own values into snippets.
Add VPC endpoint as the recommended way of connecting to s3 in private subnets
Backport of #60654
Co-authored-by: Bill Mitchell <vocatan@users.noreply.github.com>
Co-authored-by: David Turner <david.turner@elastic.co>
Changes:
* Moves sample data to reusable rest test
* Combines EQL index, requirements, and run a search pages
* Combines EQL syntax and limitations pages
* Adds related redirects
This commit uses the new location for the reindex java-api documentation.
Temporary files have been left behind to pacify the docs build.
related #60339
The current `tee` command appends a definition to
`/etc/apt/sources.list.d/elastic-{version}.list`.
This can lead to duplicate lines and significantly slow apt-get
operations.
This updates the command to overwrite rather than append.
This commit fixes the list dangling indices response.
The dangling_indices array is an array of objects
that represent aggregated dangling index information
(cherry picked from commit 24c72d4e71c95f2d7690090933e0657152f6af9b)
* [DOCS] Add info about why we removed test fw docs
* Apply suggestions from code review
Co-authored-by: James Rodewig <40268737+jrodewig@users.noreply.github.com>
Co-authored-by: James Rodewig <40268737+jrodewig@users.noreply.github.com>
When a new cluster starts, the HTTP layer becomes ready to accept incoming
requests while the basic license is still being populated in the background.
When a get license request comes in before the license is ready, it can get
404 error. This PR fixes it by either wrap the license check in assertBusy or
ensure the license is ready before perform the check.
This is a backport for both #60498 and #60573
Plugin discovery documentation contained information about installing
Elasticsearch 2.0 and installing an oracle JDK, both of which is no
longer valid.
While noticing that the instructions used cleartext HTTP to install
packages, this commit replaces HTTPs links instead of HTTP where possible.
In addition a few community links have been removed, as they do not seem
to exist anymore.
Co-authored-by: Alexander Reelsen <alexander@reelsen.net>
* SQL: Add option to provide the delimiter for the CSV format (#59907)
* Add option to provide the delimiter to the CSV fmt
This adds the option to provide the desired character as the separator
for the CSV format (the default remains comma).
A set of characters are excluded though - like CR, LF, `"` - to avoid
slipping onto the CSV-dialects slope. The tab is also forbidden, the
user needs to choose the "tsv" format explicitely.
Update the doc to make it clear that the textual CSV, TSV and TXT
formats pass the cursor back to the user through the Cursor HTTP header.
(cherry picked from commit 3a8b00cc7480f7ada57fcea3cbac957facac08fc)
* Java8 fixes
- replace Set#of();
- URLDecoder#decode() requires a string (vs a charset) as 2nd arg.
Changes:
* Adds the `number_of_routing_shards` index setting to index modules docs.
* Updates the split API docs to mention that `number_of_routing_shards`
is a static setting.
Today there are a few places in the transport layer docs where we talk
about communication between nodes _within a cluster_. We also use the
transport layer for remote cluster connections, and these statements
also apply there, but this is not clear from today's docs. This commit
generalises these statements to make it clear that they apply to remote
cluster connections too.
It also adds a link from the docs on configuring TCP retries to the
(deeply-buried) docs on preserving long-lived connections.
This feature adds a new `fields` parameter to the search request, which
consults both the document `_source` and the mappings to fetch fields in a
consistent way. The PR merges the `field-retrieval` feature branch.
Addresses #49028 and #55363.
Transport connections between nodes remain in place until one or other
node shuts down or the connection is disrupted by a flaky network.
Today it is very difficult to demonstrate that transient failures and
cluster instability are caused by the network even though this is often
the case. In particular, transport connections open and close without
logging anything, even at `DEBUG` level, making it very hard to quantify
the scale of the problem or to correlate the networking problems with
external events.
This commit adds the missing `DEBUG`-level logging when transport
connections open and close, and also tracks the total number of
transport connections a node has opened as a measure of the stability of
the underlying network.
* Adds table with icons for simplicity.
* Updating table for clarity.
* Changing table formatting and incorporating more feedback.
* Changing table alignment.
Keepalive options are not well-documented (only in transport section, although also available at http and network level).
Co-authored-by: David Turner <david.turner@elastic.co>
Co-authored-by: James Rodewig <40268737+jrodewig@users.noreply.github.com>
keepalives tell any intermediate devices that the connection remains alive, which helps with overzealous firewalls that are
killing idle connections. keepalives are enabled by default in Elasticsearch, but use system defaults for their
configuration, which often times do not have reasonable defaults (e.g. 7200s for TCP_KEEP_IDLE) in the context of
distributed systems such as Elasticsearch.
This PR sets the socket-level keep_alive options for network.tcp.{keep_idle,keep_interval} to 5 minutes on configurations
that support it (>= Java 11 & (MacOS || Linux)) and where the system defaults are set to something higher than 5
minutes. This helps keep the connections alive while not interfering with system defaults or user-specified settings
unless they are deemed to be set too high by providing better out-of-the-box defaults.