OpenSearch

Commit Graph

Author	SHA1	Message	Date
David Roberts	09e8910b0f	[DOCS] Adding ML-specific prerequisites to setup docs (#42529 )	2019-05-24 10:49:41 -07:00
James Rodewig	43dd081e22	[DOCS] Fix nested def list for Asciidoctor (#42353 )	2019-05-24 13:39:49 -04:00
Simon Willnauer	46ccfba808	Remove IndexStore and DirectoryService (#42446 ) Both of these classes are basically a bloated wrapper around a simple construct that can simply be a DirectoryFactory interface. This change removes both classes and replaces them with a simple stateless interface that creates a new `Directory` per shard. The concept of `index.store` is preserved since it makes sense from a configuration perspective.	2019-05-24 12:14:56 +02:00
David Roberts	f472186b9f	[ML] Improve file structure finder timestamp format determination (#41948 ) This change contains a major refactoring of the timestamp format determination code used by the ML find file structure endpoint. Previously timestamp format determination was done separately for each piece of text supplied to the timestamp format finder. This had the drawback that it was not possible to distinguish dd/MM and MM/dd in the case where both numbers were 12 or less. In order to do this sensibly it is best to look across all the available timestamps and see if one of the numbers is greater than 12 in any of them. This necessitates making the timestamp format finder an instantiable class that can accumulate evidence over time. Another problem with the previous approach was that it was only possible to override the timestamp format to one of a limited set of timestamp formats. There was no way out if a file to be analysed had a timestamp that was sane yet not in the supported set. This is now changed to allow any timestamp format that can be parsed by a combination of these Java date/time formats: yy, yyyy, M, MM, MMM, MMMM, d, dd, EEE, EEEE, H, HH, h, mm, ss, a, XX, XXX, zzz Additionally S letter groups (fractional seconds) are supported providing they occur after ss and separated from the ss by a dot, comma or colon. Spacing and punctuation is also permitted with the exception of the question mark, newline and carriage return characters, together with literal text enclosed in single quotes. 
The full list of changes/improvements in this refactor is: - Make TimestampFormatFinder an instantiable class - Overrides must be specified in Java date/time format - Joda format is no longer accepted - Joda timestamp formats in outputs are now derived from the determined or overridden Java timestamp formats, not stored separately - Functionality for determining the "best" timestamp format in a set of lines has been moved from TextLogFileStructureFinder to TimestampFormatFinder, taking advantage of the fact that TimestampFormatFinder is now an instantiable class with state - The functionality to quickly rule out some possible Grok patterns when looking for timestamp formats has been changed from using simple regular expressions to the much faster approach of using the Shift-And method of sub-string search, but using an "alphabet" consisting of just 1 (representing any digit) and 0 (representing non-digits) - Timestamp format overrides are now much more flexible - Timestamp format overrides that do not correspond to a built-in Grok pattern are mapped to a %{CUSTOM_TIMESTAMP} Grok pattern whose definition is included within the date processor in the ingest pipeline - Grok patterns that correspond to multiple Java date/time patterns are now handled better - the Grok pattern is accepted as matching broadly, and the required set of Java date/time patterns is built up considering all observed samples - As a result of the more flexible acceptance of Grok patterns, when looking for the "best" timestamp in a set of lines timestamps are considered different if they are preceded by a different sequence of punctuation characters (to prevent timestamps far into some lines being considered similar to timestamps near the beginning of other lines) - Out-of-the-box Grok patterns that are considered now include %{DATE} and %{DATESTAMP}, which have indeterminate day/month ordering - The order of day/month in formats with indeterminate day/month order is determined by considering all 
observed samples (plus the server locale if the observed samples still do not suggest an ordering) Relates #38086 Closes #35137 Closes #35132	2019-05-24 09:10:08 +01:00
Adrien Grand	f3c33d6d96	Add 7.1.1 release notes.	2019-05-24 09:26:04 +02:00
Costin Leau	9fdf4215dd	Docs: Documentation for the upcoming SQL support of frozen indices (#41863 ) (cherry picked from commit a3cc03eb1503df24c1706a721fcc9af38c3b2873) (cherry picked from commit f42dcf2ffd7bd25f3f91aa6127515f393cd1860f)	2019-05-23 21:16:16 +03:00
Yannick Welsch	f57fdc57e9	Deprecate max_local_storage_nodes (#42426 ) Allows this setting to be removed in 8.0, see #42428	2019-05-23 15:59:55 +02:00
Jim Ferenczi	4ca5649a0d	Upgrade to lucene 8.1.0-snapshot-e460356abe (#40952 )	2019-05-23 11:45:33 +02:00
Jake Landis	496fee3333	bump to 7.3 (#42365 )	2019-05-22 11:57:07 -05:00
swstepp	4181c5ccf5	Fix grammar problem in stemming reference. (#42148 )	2019-05-22 09:50:30 -07:00
Julie Tibshirani	a3caed2bee	Fix a rendering issue in the geo envelope docs. (#42332 ) Previously the formatting information didn't display in the docs, and the sentence just rendered as "bounding rectangle in the format :".	2019-05-22 09:49:58 -07:00
Luca Cavanna	e747326b04	Adapt low-level REST client to java 8 (#41537 ) As a follow-up to #38540 we can use lambda functions and method references where convenient in the low-level REST client. Also, we need to update the docs to state that the minimum java version required is 1.8.	2019-05-22 18:47:54 +02:00
Alpar Torok	eb1639c5fc	TestClusters: Convert docs (#42100 ) * TestClusters: Convert docs	2019-05-22 14:44:08 +03:00
David Turner	b1c413ea63	Rework discovery-ec2 docs (#41630 ) This commit reworks and clarifies the docs for the `discovery-ec2` plugin: - folds the tiny "Getting started with AWS" into the page on configuration - spells out the name of each setting in full instead of noting the `discovery.ec2` prefix at the top of the page. - replaces each `(Secure)` marker with a sentence describing what that means in situ - notes some missing defaults - clarifies the behaviour of `discovery.ec2.groups` (dependent on `.any_group`) - clarifies what `discovery.ec2.host_type` is for - adds `discovery.ec2.tag.TAGNAME` as a (meta-)setting rather than describing it in a separate section - notes that the tags mentioned in `discovery.ec2.tag.TAGNAME` cannot contain colons (see #38406) - clarifies the EC2-specific interface names and what they're for - reorders and rewords the recommendations for storage - expands on why you should not span a cluster across regions - adds a suggestion on protecting instances against termination during scale-in - reformat to 80 columns where possible Fixes #38406	2019-05-22 09:46:56 +01:00
Jack Conradson	813db163d8	Reorganize Painless doc structure (#42303 )	2019-05-21 10:50:21 -07:00
Glen Smith	a6204a5eaf	Remove stray back tick that's messing up table format (#41705 )	2019-05-21 09:00:06 -04:00
Mayya Sharipova	216c74d10a	Add experimental and warnings to vector functions (#42205 )	2019-05-21 06:39:05 -04:00
David Turner	7abeaba8bb	Prevent in-place downgrades and invalid upgrades (#41731 ) Downgrading an Elasticsearch node to an earlier version is unsupported, because we do not make any attempt to guarantee that a node can read any of the on-disk data written by a future version. Yet today we do not actively prevent downgrades, and sometimes users will attempt to roll back a failed upgrade with an in-place downgrade and get into an unrecoverable state. This change adds the current version of the node to the node metadata file, and checks the version found in this file against the current version at startup. If the node cannot be sure of its ability to read the on-disk data then it refuses to start, preserving any on-disk data in its upgraded state. This change also adds a command-line tool to overwrite the node metadata file without performing any version checks, to unsafely bypass these checks and recover the historical and lenient behaviour.	2019-05-21 08:04:30 +01:00
Jake Landis	df8fef3c1a	fix assumption that 6.7 is last 6.x release (#42255 )	2019-05-20 14:35:28 -05:00
Jake Landis	87bff89500	7.1.0 release notes forward port (#42252 ) Forward port of #42208	2019-05-20 14:39:17 -04:00
Zachary Tong	6ae6f57d39	[7.x Backport] Force selection of calendar or fixed intervals (#41906 ) The date_histogram accepts an interval which can be either a calendar interval (DST-aware, leap seconds, arbitrary length of months, etc) or fixed interval (strict multiples of SI units). Unfortunately this is inferred by first trying to parse as a calendar interval, then falling back to fixed if that fails. This leads to a confusing arrangement where `1d` == calendar, but `2d` == fixed. And if you want a day of fixed time, you have to specify `24h` (i.e. the next smallest unit). This arrangement is very error-prone for users. This PR adds `calendar_interval` and `fixed_interval` parameters to any code that uses intervals (date_histogram, rollup, composite, datafeed, etc). Calendar only accepts calendar intervals, fixed accepts any combination of units (meaning `1d` can be used to specify `24h` in fixed time), and the two are mutually exclusive. The old interval behavior is deprecated and will emit a deprecation warning. It is also mutually exclusive with the two new parameters. In the future the old dual-purpose interval will be removed. The change applies to both REST and java clients.	2019-05-20 12:07:29 -04:00
Jay Modi	dbbdcea128	Update ciphers for TLSv1.3 and JDK11 if available (#42082 ) This commit updates the default ciphers and TLS protocols that are used when the runtime JDK supports them. New cipher support has been introduced in JDK 11 and 12 along with performance fixes for AES GCM. The ciphers are ordered with PFS ciphers being most preferred, then AEAD ciphers, and finally those with mainstream hardware support. When available, stronger encryption is preferred for a given cipher. This is a backport of #41385 and #41808. There are known JDK bugs with TLSv1.3 that have been fixed in various versions. These are: 1. The JDK's bundled HttpsServer will loop endlessly under JDK11 and JDK 12.0 (Fixed in 12.0.1) based on the way the Apache HttpClient performs a close (half close). 2. In all versions of JDK 11 and 12, the HttpsServer will loop endlessly when certificates are not trusted or another handshake error occurs. An email has been sent to the openjdk security-dev list and #38646 is open to track this. 3. In JDK 11.0.2 and prior there is a race condition with session resumption that leads to handshake errors when multiple concurrent handshakes are going on between the same client and server. This bug does not appear when client authentication is in use. This is JDK-8213202, which was fixed in 11.0.3 and 12.0. 4. In JDK 11.0.2 and prior there is a bug where resumed TLS sessions do not retain peer certificate information. This is JDK-8212885. These issues are addressed by checking the current java version and using it to determine the supported protocols for tests that provoke these issues.	2019-05-20 09:45:36 -04:00
Lisa Cawley	fd2d4d761b	[DOCS] Updates TLS configuration info (#41983 )	2019-05-20 09:13:37 -04:00
Nhat Nguyen	1362944c23	Minor improvement translog docs (#42184 ) Closes #42183	2019-05-19 20:45:34 -04:00
David Turner	51376f98a7	Clarify rolling upgrade fallback to restart upgrade (#42161 ) Adds a note that restarting half-or-more of the master-eligible nodes means you're no longer doing a rolling upgrade, and may need to upgrade all the things before the cluster returns to health.	2019-05-16 13:38:48 -04:00
Hendrik Muhs	4063701f5e	[DOCS] add a warning about bypassing PUT API's, update example responses (#42062 ) Configurations are stored in the .data-frame-internal-1 index, but users should not add configurations directly to the index as additional information to enable access control is added. This adds a warning against allowing access to the internal index.	2019-05-16 10:12:19 -04:00
Ryan Ernst	fa1d1d1f57	Deprecate the native realm migration tool (#42142 ) The migrate tool was added when the native realm was created, to aid users in converting from file realms that were per node, into the cluster managed native realm. While this tool was useful at the time, users should now be using the native realm directly. This commit deprecates the tool, to be removed in a followup for 8.0.	2019-05-16 09:52:31 -04:00
Igor Motov	2f8c5ac6f8	Docs: Mark SQL Geo functionality as beta (#42138 ) Adds beta marker to geosql documentation	2019-05-15 10:51:33 -04:00
David Turner	15fd233ae3	Minor cluster coordination docs fixes (#42111 ) Fixes a typo and a badly-formatted warning.	2019-05-15 09:27:08 -04:00
Igor Motov	70ea3cf847	SQL: Add initial geo support (#42031 ) (#42135 ) Adds an initial limited implementation of geo features to SQL. This implementation is based on the [OpenGIS® Implementation Standard for Geographic information - Simple feature access](http://www.opengeospatial.org/standards/sfs), which is the current standard for GIS system implementation. This effort is concentrated on the SQL option AKA ISO 19125-2. Queries that are supported as a result of this initial implementation Metadata commands - `DESCRIBE table` - returns the correct column types `GEOMETRY` for geo shapes and geo points. - `SHOW FUNCTIONS` - returns a list that includes supported `ST_` functions - `SYS TYPES` and `SYS COLUMNS` display correct types `GEO_SHAPE` and `GEO_POINT` for geo shapes and geo points respectively. Returning geoshapes and geopoints from elasticsearch - `SELECT geom FROM table` - returns the geoshapes and geo_points as libs/geo objects in JDBC or as WKT strings in console. - `SELECT ST_AsWKT(geom) FROM table;` and `SELECT ST_AsText(geom) FROM table;` - returns the geoshapes and geopoints in their WKT representation; Using geopoints to elasticsearch - The following functions will be supported for geopoints in queries, sorting and aggregations: `ST_GeomFromText`, `ST_X`, `ST_Y`, `ST_Z`, `ST_GeometryType`, and `ST_Distance`. In most cases when used in queries, sorting and aggregations, these functions are translated into scripts. These functions can be used in the SELECT clause for both geopoints and geoshapes. - `SELECT * FROM table WHERE ST_Distance(ST_GeomFromText('POINT(1 2)'), point) < 10;` - returns all records for which `point` is located within 10m from the `POINT(1 2)`. In this case the WHERE clause is translated into a range query. Limitations: Geoshapes cannot be used in queries, sorting and aggregations as part of this initial effort. In order to fully take advantage of geoshapes we would need to have access to geoshape doc values, which is coming in #37206. 
`ST_Z` cannot be used on geopoints in queries, sorting and aggregations since we don't store altitude in geo_point doc values. Relates to #29872 Backport of #42031	2019-05-14 18:57:12 -05:00
James Rodewig	58f2e91684	[DOCS] Rewrite 'rewrite' parameter docs (#42018 )	2019-05-13 08:43:12 -04:00
Benjamin Trent	febee07dcc	[ML] adding pivot.max_search_page_size option for setting paging size (#41920 ) (#42079 ) * [ML] adding pivot.size option for setting paging size * Changing field name to address PR comments * fixing ctor usage * adjust hlrc for field name change	2019-05-10 13:22:31 -05:00
Jason Tedor	cd5f1b53e8	Remove reference to fs.data.spins in docs We long ago removed fs.data.spins from the nodes stats. This commit removes reference to this in the docs.	2019-05-10 11:49:01 -04:00
David Turner	1be5bb5bfd	Recognise direct buffers in heap size docs (#42070 ) This commit slightly reworks the recommendations in the docs about setting the heap size: * the "rules of thumb" are actually instructions that should be followed * the reason for setting `Xmx` to 50% of the heap size is more subtle than just leaving space for the filesystem cache * it is normal to see Elasticsearch using more memory than `Xmx` * replace `cutoff` and `limit` with `threshold` since all three terms are used interchangeably * since we recommend setting `Xmx` equal to `Xms`, avoid talking about setting `Xmx` in isolation Relates #41954	2019-05-10 13:56:47 +01:00
Christian Mesh	99a50ac3b7	Add painless string split function (splitOnToken) (#39772 ) Adds two String split functions to Painless that can be used without enabling regexes.	2019-05-09 15:16:11 -07:00
James Rodewig	732ef15f0d	[DOCS] Adds placeholder for 7.1.0 release notes (#42024 )	2019-05-09 13:17:04 -04:00
James Rodewig	ea5019665a	[DOCS] Replace table with def list for ids query (#41865 )	2019-05-09 09:52:20 -04:00
Daniel Schneiter	0b21fb0ee6	Mentioned the name of the icu_analyzer	2019-05-09 15:08:31 +02:00
Alexander Reelsen	8e33a5292a	Add HTML strip processor (#41888 ) This processor uses the lucene HTMLStripCharFilter class to remove HTML entities from a field. This adds to the char filter, so that it is possible to store the stripped version as well. Note that the character filter replaces tags with a newline, so that the produced HTML will look slightly different than the incoming HTML with regards to newlines.	2019-05-09 13:01:07 +02:00
Flavio Pompermaier	83fef23fd1	Fix wrong property name (#40636 )	2019-05-09 08:53:05 +02:00
Gordon Brown	4358cc6ac8	Add note about ILM action ordering (#41771 ) Adds a note clarifying that actions are ordered automatically.	2019-05-08 16:42:50 -06:00
Jack Conradson	2c561481cd	Add static section whitelist info to api docs generation (#41870 ) This change adds imported methods, class bindings, and instance bindings to the documentation generation for the Painless Context APIs.	2019-05-08 11:15:38 -07:00
David Turner	60f84a2eb2	Remove mention of bulk threadpool in examples (#41935 ) The `bulk` threadpool is now called `write`, but `bulk` is still used in some examples. This commit fixes that. Also, the only way `threadpool.bulk.write: 30` is a valid increase in the size of this threadpool is if you have 29 processors, which is an odd number of processors to have. This commit removes the "more threads" bit.	2019-05-08 12:14:23 +01:00
David Turner	99b5a27ea0	Node names in bootstrap config have no ports (#41569 ) In cases where node names and transport addresses can be muddled, it is unclear that `cluster.initial_master_nodes: master-a:9300` means to look for a node called `master-a:9300` rather than a node called `master-a` with transport port `9300`. This commit adds docs to that effect.	2019-05-08 10:38:40 +01:00
Yannick Welsch	818e05c05f	Highlight the use of single-node discovery in docker docs (#41241 ) Relates to https://discuss.elastic.co/t/es-7-and-docker-single-node-cluster/176585	2019-05-08 09:38:37 +02:00
David Turner	4c909e93bb	Reject port ranges in `discovery.seed_hosts` (#41905 ) Today Elasticsearch accepts, but silently ignores, port ranges in the `discovery.seed_hosts` setting: ``` discovery.seed_hosts: 10.1.2.3:9300-9400 ``` Silently ignoring part of a setting like this is trappy. With this change we reject seed host addresses of this form. Closes #40786 Backport of #41404	2019-05-08 08:34:32 +01:00
Tim Vernum	e04953a2bf	Clarify settings in default SSL/TLS (#41930 ) The settings listed under the "Default values for TLS/SSL settings" heading are not actual settings, rather they are common suffixes that are used for settings that exist in a variety of contexts. This commit changes the way they are presented to reduce this confusion. Backport of: #41779	2019-05-08 16:07:21 +10:00
Marios Trivyzas	d5b0badeb7	SQL: Remove CircuitBreaker from parser (#41835 ) The CircuitBreaker was introduced as a means of preventing a `StackOverflowException` during the build of the AST by the parser. The ANTLR4 grammar causes a weird behaviour for a Parser Listener. The `enterEveryRule()` method is often called with a different parsing context than the respective `exitEveryRule()`. This makes it difficult to keep track of the tree's depth, and a custom Map was used in an attempt to match the contexts as they are encountered during `enter` and during `exit` of the rules. This approach had 2 important drawbacks: 1. It's hard to maintain this custom Map as the grammar changes. 2. The CircuitBreaker could often lead to false positives which caused valid queries to return an Exception and prevent them from executing. So, this completely removes the CircuitBreaker, which is replaced by a simple handling of the `StackOverflowException` Fixes: #41471 (cherry picked from commit 1559a8e2dbd729138b52e89b7e80264c9f4ad1e7)	2019-05-07 23:25:37 +03:00
Lisa Cawley	cf8a2be27b	[DOCS] Fix callouts for dataframe APIs (#41904 )	2019-05-07 10:07:04 -07:00
James Rodewig	77f634ba25	[DOCS] Rewrite `exists` query docs (#41868 )	2019-05-07 09:23:20 -04:00
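
Note on f472186b9f: the commit message explains that dd/MM and MM/dd can only be told apart by looking across all available timestamps for a number greater than 12. The following is a minimal sketch of that idea only, in Python rather than the project's Java. The function name `guess_day_month_order`, the regular expression, and the slash-separated date shape are assumptions made for illustration; none of this is the actual `TimestampFormatFinder` code.

```python
import re
from typing import Iterable, Optional

# Hypothetical pattern: two 1-2 digit numbers and a year, separated by slashes.
_DATE_RE = re.compile(r"\b(\d{1,2})/(\d{1,2})/(\d{2,4})\b")


def guess_day_month_order(lines: Iterable[str]) -> Optional[str]:
    """Accumulate evidence across all samples before deciding the ordering."""
    first_over_12 = False
    second_over_12 = False
    for line in lines:
        for match in _DATE_RE.finditer(line):
            first, second = int(match.group(1)), int(match.group(2))
            # A value above 12 cannot be a month, so that position holds the day.
            first_over_12 = first_over_12 or first > 12
            second_over_12 = second_over_12 or second > 12
    if first_over_12 and not second_over_12:
        return "dd/MM"
    if second_over_12 and not first_over_12:
        return "MM/dd"
    # Still ambiguous (or contradictory): the caller needs another tie-breaker.
    return None
```

When the function returns `None`, the samples alone do not settle the ordering; the commit message says the server locale is consulted in that case, which this sketch does not model.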

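Note on 7abeaba8bb: the commit message describes recording the node version in the node metadata file and refusing to start when the on-disk data was written by a later version. A minimal Python sketch of that comparison follows, assuming plain dotted numeric version strings. `parse_version`, `check_on_disk_version`, and `NodeStartupRefused` are hypothetical names, and the "invalid upgrades" half of the check is not modelled because the commit message does not spell out its rule.

```python
from typing import Tuple


class NodeStartupRefused(Exception):
    """Raised when the node cannot be sure it can read the on-disk data."""


def parse_version(text: str) -> Tuple[int, ...]:
    # Assumes versions look like "7.1.1"; qualifiers such as "-SNAPSHOT" are not handled.
    return tuple(int(part) for part in text.split("."))


def check_on_disk_version(on_disk: str, current: str) -> None:
    """Refuse to start if the data directory was written by a newer version."""
    if parse_version(on_disk) > parse_version(current):
        raise NodeStartupRefused(
            f"cannot downgrade a node from version [{on_disk}] to version [{current}]"
        )
```

Refusing before anything is written leaves the on-disk data in its upgraded state, which is the property the commit message emphasises.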

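Note on 4c909e93bb: the commit message says port ranges in `discovery.seed_hosts` used to be silently ignored and are now rejected. The sketch below shows one way such a validation could look in Python, assuming hostnames or IPv4 addresses only (no bracketed IPv6). `validate_seed_host` is a hypothetical helper, not the project's parser.

```python
def validate_seed_host(address: str) -> str:
    """Return the address unchanged if valid; raise ValueError on a port range."""
    host, sep, port = address.rpartition(":")
    if not sep:
        # No port given at all, e.g. "10.1.2.3": nothing to check here.
        return address
    if "-" in port:
        # Fail loudly instead of silently ignoring part of the setting.
        raise ValueError(f"port ranges are not supported: [{address}]")
    if not port.isdigit():
        raise ValueError(f"invalid port [{port}] in [{address}]")
    return address
```

With this sketch, `validate_seed_host("10.1.2.3:9300")` passes, while the example from the commit message, `10.1.2.3:9300-9400`, raises a `ValueError`.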
6700 Commits