* SQL: Rename SQL data type DATE to DATETIME
SQL data type DATE has only the date part (e.g.: 2019-01-14)
without any time information. Previously the SQL type DATE was
referring to the ES DATE which contains also the time part along
with TZ information. To conform with SQL data types the data type
`DATE` is renamed to `DATETIME`, since it includes also the time,
as a new runtime SQL `DATE` data type will be introduced down the road,
which only contains the date part and meets the SQL standard.
Closes: #36440
* Address comments
Some systems default to a nofile ulimit of 65535. To reduce the pain of
deploying Elasticsearch to such systems, this commit lowers the required
limit from 65536 to 65535.
In order to distinguish the ES-SQL type from the standard SQL type
add a new ES-SQL column that will make clear this distingstion,
e.g.: datetime vs TIMSTAMP
Fixes: #37519
The semantics of the API changed considerably since the documentation was written.
The main change is to remove references to memory reduction (this is related to refresh).
Instead, flush refers to recovery times. I also removed the references to trimming the translog
as the translog may be required for other purposes (operation history for ops based recovery
and complement ongoing file based recoveries).
Closes#32869
* Default include_type_name to false for get and put mappings.
* Default include_type_name to false for get field mappings.
* Add a constant for the default include_type_name value.
* Default include_type_name to false for get and put index templates.
* Default include_type_name to false for create index.
* Update create index calls in REST documentation to use include_type_name=true.
* Some minor clean-ups around the get index API.
* In REST tests, use include_type_name=true by default for index creation.
* Make sure to use 'expression == false'.
* Clarify the different IndexTemplateMetaData toXContent methods.
* Fix FullClusterRestartIT#testSnapshotRestore.
* Fix the ml_anomalies_default_mappings test.
* Fix GetFieldMappingsResponseTests and GetIndexTemplateResponseTests.
We make sure to specify include_type_name=true during xContent parsing,
so we continue to test the legacy typed responses. XContent generation
for the typeless responses is currently only covered by REST tests,
but we will be adding unit test coverage for these as we implement
each typeless API in the Java HLRC.
This commit also refactors GetMappingsResponse to follow the same appraoch
as the other mappings-related responses, where we read include_type_name
out of the xContent params, instead of creating a second toXContent method.
This gives better consistency in the response parsing code.
* Fix more REST tests.
* Improve some wording in the create index documentation.
* Add a note about types removal in the create index docs.
* Fix SmokeTestMonitoringWithSecurityIT#testHTTPExporterWithSSL.
* Make sure to mention include_type_name in the REST docs for affected APIs.
* Make sure to use 'expression == false' in FullClusterRestartIT.
* Mention include_type_name in the REST templates docs.
This commit removes the fallback for SSL settings. While this may be
seen as a non user friendly change, the intention behind this change
is to simplify the reasoning needed to understand what is actually
being used for a given SSL configuration. Each configuration now needs
to be explicitly specified as there is no global configuration or
fallback to some other configuration.
Closes#29797
Today file-chunks are sent sequentially one by one in peer-recovery. This is a
correct choice since the implementation is straightforward and recovery is
network bound in most of the time. However, if the connection is encrypted, we
might not be able to saturate the network pipe because encrypting/decrypting
are cpu bound rather than network-bound.
With this commit, a source node can send multiple (default to 2) file-chunks
without waiting for the acknowledgments from the target.
Below are the benchmark results for PMC and NYC_taxis.
- PMC (20.2 GB)
| Transport | Baseline | chunks=1 | chunks=2 | chunks=3 | chunks=4 |
| ----------| ---------| -------- | -------- | -------- | -------- |
| Plain | 184s | 137s | 106s | 105s | 106s |
| TLS | 346s | 294s | 176s | 153s | 117s |
| Compress | 1556s | 1407s | 1193s | 1183s | 1211s |
- NYC_Taxis (38.6GB)
| Transport | Baseline | chunks=1 | chunks=2 | chunks=3 | chunks=4 |
| ----------| ---------| ---------| ---------| ---------| -------- |
| Plain | 321s | 249s | 191s | * | * |
| TLS | 618s | 539s | 323s | 290s | 213s |
| Compress | 2622s | 2421s | 2018s | 2029s | n/a |
Relates #33844
Update the scroll example ascii and Java docs, so it is more clear when to
consume the scroll documents. Before this change the user could loose
the first results if one uses copy & paste.
This adds a configurable whitelist to the HTTP client in watcher. By
default every URL is allowed to retain BWC. A dynamically configurable
setting named "xpack.http.whitelist" was added that allows to
configure an array of URLs, which can also contain simple regexes.
Closes#29937
`+` for index name inclusions is no longer supported for 6.x+. This
commit removes references of the `+` from the documenation. System
indices additional example is also included.
fixes#37237
Added warnings checks to existing tests
Added “defaultTypeIfNull” to DocWriteRequest interface so that Bulk requests can override a null choice of document type with any global custom choice.
Related to #35190
Previously these were only linked in a circuitous way rather than being
available from the top level API documentation and "Put Lifecycle" API docs.
This makes them slightly easier to find for a user.
* provide overriden `hashCode` and toString methods to account for `DISTINCT`
* change the analyzer for scenarios where `COUNT <field_name>` and `COUNT DISTINCT` have different paths
* defined a new `filter` aggregation encapsulating an `exists` query to filter out null or missing values
Upgrading the Elastic Stack perfectly documents the process to
upgrade ES from 5 to 6 when internal indices are present. However,
the rolling upgrade docs do not mention anything about internal indices.
This adds a warning in the rolling upgrade procedure, highlighting that
internal indices should be upgraded before the rolling upgrade procedure
can be started.
This change adds support for the 'include_type_name' parameter for the
indices.get API. This parameter, which defaults to `false` starting in 7.0,
changes the response to not include the indices type names any longer.
If the parameter is set in the request, we additionally emit a deprecation
warning since using the parameter should be only temporarily necessary while
adapting to the new response format and we will remove it with the next major
version.
* [Analysis] Deprecate Standard Html Strip Analyzer
Deprecate only Standard Html Strip Analyzer
If user create index with the analyzer since 7.0, es throws an exception.
If an index was created before 7.0, es issue deprecation log
We will remove it in 8.0
Related #4704
In Lucene 8 searches can skip non-competitive hits if the total hit count is not requested.
It is also possible to track the number of hits up to a certain threshold. This is a trade off to speed up searches while still being able to know a lower bound of the total hit count. This change adds the ability to set this threshold directly in the track_total_hits search option. A boolean value (true, false) indicates whether the total hit count should be tracked in the response. When set as an integer this option allows to compute a lower bound of the total hits while preserving the ability to skip non-competitive hits when enough matches have been collected.
Relates #33028
Today it's very difficult to see which indices are frozen or rather
throttled via the commonly used monitoring APIs. This change adds
a cell to the `_cat/indices` API to render if an index is `search.throttled`
Relates to #34352
Adds an example on translating geohashes returned by geohashgrid
agg as bucket keys into geo bounding box filters in elasticsearch as well
as 3rd party applications.
Closes#36413
Enhance error message for the case that the 2nd argument of PERCENTILE
and PERCENTILE_RANK is not a foldable, as it doesn't make sense to have
a dynamic value coming from a field.
Fixes: #36903
Types can be used both in the source and dest section of the body which will
be translated to search and index requests respectively. Adding a deprecation warning
for those cases and removing examples using more than one type in reindex since
support for this is going to be removed.
There are a handful of examples in the ILM documentation that could result in
rolling over indices more quickly than we might normally recommend,
contributing to over-sharding in cases where the examples are copied without
modification. This change makes some numbers bigger to try and avoid this.
With this commit we rename `node.store.allow_mmapfs` to
`node.store.allow_mmap`. Previously this setting has controlled whether
`mmapfs` could be used as a store type. With the introduction of
`hybridfs` which also relies on memory-mapping,
`node.store.allow_mmapfs` also applies to `hybridfs` and thus we rename
it in order to convey that it is actually used to allow memory-mapping
but not a specific store type.
Relates #36668
Relates #37070
When executing terms aggregations we set the shard_size, meaning the
number of buckets to collect on each shard, to a value that's higher than
the number of requested buckets, to guarantee some basic level of
precision. We have an optimization in place so that we leave shard_size
set to size whenever we are searching against a single shard, in which
case maximum precision is guaranteed by definition.
Such optimization requires us access to the total number of shards that
the search is executing against. In the context of cross-cluster search,
once we will introduce multiple reduction steps (one per cluster) each
cluster will only know the number of local shards, which is problematic
as we should only optimize if we are searching against a single shard in a
single cluster. It could be that we are searching against one shard per cluster
in which case the current code would optimize number of terms causing
a loss of precision.
While discussing how to address the CCS scenario, we decided that we do
not want to introduce further complexity caused by this single shard
optimization, as it benefits only a minority of cases, especially when
the benefits are not so great.
This commit removes the single shard optimization, meaning that we will
always have heuristic enabled on how many number of buckets to collect
on the shards, even when searching against a single shard.
This will cause more buckets to be collected when searching against a single
shard compared to before. If that becomes a problem for some users, they
can work around that by setting the shard_size equal to the size.
Relates to #32125
With this commit we introduce a new store type `hybridfs` that is a
hybrid between `mmapfs` and `niofs`. This store type chooses different
strategies to read Lucene files based on the read access pattern (random
or linear) in order to optimize performance.
This store type has been available in earlier versions of Elasticsearch
as `default_fs`. We have chosen a different name now in order to convey
the intent of the store type instead of tying it to the fact whether it
is the default choice.
Relates #36668
This commit fixes some cross-doc links from the old ingest plugins page
to the new ingest processor pages that arose after converting
ingest-geoip and ingest-user-agent to modules.
This commit adds a placeholder ingest-geoip plugin page as there are
other components in the Elastic Stack that still refer to these
pages. These docs would be broken without this placeholder page forcing
teams responsible for those docs to scramble to fix the build over the
weekend before a holiday period. Instead, we add a placeholder page so
the docs build continues to function, and those teams can fix their docs
without the constraint of a broken build. We also cleanup a few minor
docs issues that were missed during the initial changes to convert
ingest-geoip to a module.
This extra scenario describes the case where an updated
policy increases the current phase's `min_age`. Now, the
docs explicitly describe this scenario as to what is
expected -- old min_age is used.
Closes#35356.
* Added Limitations page
* Made the aggregations page follow the common template for functions
* Modified all tables to have the first row's cells content centered
* Polishing in other various sections
This is a follow-up to some discussions around #36399. Currently we have
relatively confusing compression behavior where compression can be
configured for requests based on transport.compress or a specific
setting for a remote cluster. However, we can only compress responses
based on transport.compress as we do not know where a request is
coming from (currently).
This commit modifies the behavior to NEVER compress responses based on
settings. Instead, a response will only be compressed if the request was
compressed. This commit also updates the documentation to more clearly
described transport level compression.
Allow scripts to correctly reference grouping functions
Fix bug in translation of date/time functions mixed with histograms.
Enhance Verifier to prevent histograms being nested inside other
functions inside GROUP BY (as it implies double grouping)
Extend Histogram docs
This commit breaks the single ingest docs file into multiple files,
factoring out the processor docs into a documentation file per
processor. This will help make this content easier to maintain.
This commit overhauls the documentation of discovery and cluster coordination,
removing mention of the Zen Discovery module and replacing it with docs for the
new cluster coordination mechanism introduced in 7.0.
Relates #32006
Leaving `index.lifecycle.indexing_complete` in place when removing the
lifecycle policy from an index can cause confusion, as if a new policy
is associated with the policy, rollover will be silently skipped.
Removing that setting when removing the policy from an index makes
associating a new policy with the index more involved, but allows ILM to
fail loudly, rather than silently skipping operations which the user may
assume are being performed.
* Adjust order of checks in WaitForRolloverReadyStep
This allows ILM to error out properly for indices that have a valid
alias, but are not the write index, while still handling
`indexing_complete` on old-style aliases and rollover (that is, those
which only point to a single index at a time with no explicit write
index)
This is related to #36652. In 7.0 we plan to deprecate a number of
settings that make reference to the concept of a tcp transport. We
mostly just have a single transport type now (based on tcp). Settings
should only reference tcp if they are referring to socket options. This
commit updates the settings in the docs. And removes string usages of
the old settings. Additionally it adds a missing remote compress setting
to the docs.