Commit Graph

473 Commits

Author SHA1 Message Date
Adrien Grand 866a5459f0 Make significant terms work on fields that are indexed with points. #18031
It will keep using the caching terms enum for keyword/text fields and falls back
to IndexSearcher.count for fields that do not use the inverted index for
searching (such as numbers and ip addresses). Note that this probably means that
significant terms aggregations on these fields will be less efficient than they
used to be. It should be ok under a sampler aggregation though.

This moves tests back to the state they were in before numbers started using
points, and also adds a new test that significant terms aggs fail if a field is
not indexed.

In the long term, we might want to follow the approach that Robert initially
proposed that consists in collecting all documents from the background filter in
order to compute frequencies using doc values. This would also mean that
significant terms aggregations do not require fields to be indexed anymore.
2016-05-11 16:52:58 +02:00
Jason Tedor 2bf585e642 Require /bin/bash in packaging
This commit adds a hard requirement to the RPM and Debian packages for
/bin/bash to be present, and adds a note regarding this to the migration
docs.

Relates #18259
2016-05-10 21:17:09 -04:00
Lee Hinman 1c54033e92 Merge branch 'pr/18068' 2016-05-10 08:27:43 -06:00
Alexander Kazakov 667a091205 Add note about cat field data API changes into migration doc 2016-05-10 16:41:21 +03:00
Jason Tedor 7d1fd17172 Remove plugin script parsing of system properties
The plugin script parses command-line options looking for Java system
properties and extracts these arguments to pass to the java command when
starting the JVM. Since elasticsearch-plugin allows arbitrary user
arguments to the JVM via ES_JAVA_OPTS, this parsing is unnecessary. This
commit removes this unnecessary 

Relates #18207
2016-05-09 13:06:18 -04:00
Clinton Gormley 3f594089c2 Renamed all AUTOSENSE snippets to CONSOLE (#18210) 2016-05-09 15:42:23 +02:00
Adrien Grand de8354dd7f Allow binary sort values. #17959
The `ip` field uses a binary representation internally. This breaks when
rendering sort values in search responses since elasticsearch tries to write a
binary byte[] as an utf8 json string. This commit extends the `DocValueFormat`
API in order to give fields a chance to choose how to render values.

Closes #6077
2016-05-06 09:27:02 +02:00
Adrien Grand 7d8708716e QueryBuilder does not need generics. #18133
QueryBuilder has generics, but those are never used: all call sites use
`QueryBuilder<?>`. Only `AbstractQueryBuilder` needs generics so that the base
class can contain a default implementation for setters that returns `this`.
2016-05-06 08:38:20 +02:00
Nik Everett 4b1c116461 Generate and run tests from the docs
Adds infrastructure so `gradle :docs:check` will extract tests from
snippets in the documentation and execute the tests. This is included
in `gradle check` so it should happen on CI and during a normal build.

By default each `// AUTOSENSE` snippet creates a unique REST test. These
tests are executed in a random order and the cluster is wiped between
each one. If multiple snippets chain together into a test you can annotate
all snippets after the first with `// TEST[continued]` to have the
generated tests for both snippets joined.

Snippets marked as `// TESTRESPONSE` are checked against the response
of the last action.

See docs/README.asciidoc for lots more.

Closes #12583. That issue is about catching bugs in the docs during build.
This catches *some* bugs in the docs during build which is a good start.
2016-05-05 13:58:03 -04:00
Isabel Drost-Fromm 6b9ac46402 Merge branch 'master' into enhancement/switch_geodistancesortbuilder_to_geovalidationmethod 2016-05-04 11:30:15 +02:00
Daniel Mitterdorfer 0a6f40c7f5 Enable HTTP compression by default with compression level 3
With this commit we compress HTTP responses provided the client
supports it (as indicated by the HTTP header 'Accept-Encoding').

We're also able to process compressed HTTP requests if needed.

The default compression level is lowered from 6 to 3 as benchmarks
have indicated that this reduces query latency with a negligible
increase in network traffic.

Closes #7309
2016-05-03 08:53:15 +02:00
Clinton Gormley 04bd55d61c Added perl migration script for indexed scripts to migration docs 2016-04-29 14:18:27 +02:00
Martijn van Groningen 6c3beaa2eb Drop top level inner hits in favour of inner hits defined in the query dsl.
Fix a limitation that prevent from hierarchical inner hits be defined in query dsl.

Removed the nested_path, parent_child_type and query options from inner hits dsl. These options are only set by ES
upon parsing the has_child, has_parent and nested queries are using their respective query builders.

These options are still used internally, when these options are set a new private copy is created based on the
provided InnerHitBuilder and configuring either nested_path or parent_child_type and the inner query of the query builder
being used.

Closes #11118
2016-04-29 11:17:24 +02:00
Isabel Drost-Fromm c1fa9cd18e Add note that coerce and ignore_malformed are deprecated for geo
distance sorting
2016-04-28 14:13:58 +02:00
Isabel Drost-Fromm a19c426e0f Deprecate coerce/ignore_malformed for GeoBoundingBoxQuery 2016-04-28 14:10:59 +02:00
Isabel Drost-Fromm 3f743a30cf Deprecate coerce/ignore_malformed in GeoDistanceQueryBuilder 2016-04-28 14:06:27 +02:00
Isabel Drost-Fromm 3160798084 Deprecate coerce/ignore_malformed for GeoDistanceRangeQuery 2016-04-28 14:01:54 +02:00
Isabel Drost-Fromm 5306de3ce3 Deprecate coerce/ignore_malformed for GeoPolygonQueryBuilder
Includes update to parsing code, tests, migration docs and reference
docs.
2016-04-28 13:56:50 +02:00
Areek Zillur afacc18dcc Merge branch 'master' into docs/completion_suggester 2016-04-26 10:16:38 -04:00
Areek Zillur cc99b24bf7 Document completion suggest breaking changes 2016-04-26 10:15:21 -04:00
Martijn Laarman 166cfcee8a Document missing shard version in routing table of cluster state (#17945)
as breaking change

removed as per: https://github.com/elastic/elasticsearch/pull/16243

because of: https://github.com/elastic/elasticsearch/issues/14739
2016-04-26 10:50:38 +02:00
Pius 66686040ca Update settings.asciidoc
Changed "must bet set" to "must be set" under Discovery Settings.
2016-04-26 00:15:39 -07:00
Pius f6656aa6ff Update settings.asciidoc
Added the 2 missing ` under Request Cache Settings section.
2016-04-25 23:57:42 -07:00
Pius 1364cc89f1 Update mapping.asciidoc
Changed "referrer to" to "refer to".
2016-04-25 23:35:42 -07:00
Clinton Gormley d56a8e5dd8 Update index-apis.asciidoc
Asciidoc typo
2016-04-25 13:06:57 +02:00
Lee Hinman 5fe1916be9 Merge pull request #17924 from elastic/russcam-patch-1
Update settings.asciidoc
2016-04-24 18:25:08 -06:00
Clinton Gormley b9978ace40 Update settings.asciidoc
Asciidoc typo
2016-04-23 13:44:42 +02:00
Russ Cam fb58ae3b4f Update settings.asciidoc
Add note for removal of index.translog.interval
2016-04-23 11:44:59 +10:00
Martijn van Groningen c5ad2e2865 Changed indexed scripts to be stored in the cluster state instead of the `.scripts` index.
Also added max script size soft limit for stored scripts.

Closes #16651
2016-04-22 13:42:55 +02:00
Christoph Büscher 0ec4ffcb3a Remove QueryFilterBuilder section from migration docs.
This query builder was deprecated in 2.0 and has been removed.
2016-04-21 18:11:01 +02:00
Jun Ohtani 9eb242a5fe Analyze API : Rename filters/token_filters/char_filter to filter/token_filter/char_filter
Closes #15189
2016-04-21 18:05:11 +09:00
Martijn van Groningen 81449fc912 percolator: renamed `percolator` query to `percolate` query 2016-04-20 15:23:54 +02:00
Lee Hinman b8899cdb78 Merge remote-tracking branch 'dakrone/allow-bad-json' 2016-04-19 10:02:53 -06:00
Martijn van Groningen ba08313417 settings: Removed `action.get.realtime` setting
Closes #12543
2016-04-19 17:14:23 +02:00
Lee Hinman a1e8fb794c Allow JSON with unquoted field names by enabling system property
In Elasticsearch 5.0.0, by default unquoted field names in JSON will be
rejected. This can cause issues, however, for documents that were
already indexed with unquoted field names. To alleviate this, a system
property has been added that can be enabled so migration can occur.

This system property will be removed in Elasticsearch 6.0.0

Resolves #17674
2016-04-19 09:14:13 -06:00
Clinton Gormley c024504842 Update search.asciidoc
Corrected breaking changes for `has_parent`.  Relates to https://github.com/elastic/elasticsearch/pull/17841
2016-04-19 11:54:48 +02:00
Martijn van Groningen 40c22fc654 percolator: removed .percolator type instead a field of type `percolator` should be configured before indexing percolator queries
* Added an extra `field` parameter to the `percolator` query to indicate what percolator field should be used. This must be an existing field in the mapping of type `percolator`.
* The `.percolator` type is now forbidden. (just like any type that starts with a `.`)

This only applies for new indices created on 5.0 and later. Indices created on previous versions the .percolator type is still allowed to exist.
The new `percolator` field type isn't active in such indices and the `PercolatorQueryCache` knows how to load queries from these legacy indices.
The `PercolatorQueryBuilder` will not enforce that the `field` parameter is of type `percolator`.
2016-04-19 11:20:31 +02:00
Clinton Gormley 40b84d2ef6 Update mapping.asciidoc
Correct `fielddata.frequency.regex` to `fielddata.filter.regex` in breaking changes
2016-04-18 21:00:27 +02:00
Adrien Grand d84c643f58 Use the new points API to index numeric fields. #17746
This makes all numeric fields including `date`, `ip` and `token_count` use
points instead of the inverted index as a lookup structure. This is expected
to perform worse for exact queries, but faster for range queries. It also
requires less storage.

Notes about how the change works:
 - Numeric mappers have been split into a legacy version that is essentially
   the current mapper, and a new version that uses points, eg.
   LegacyDateFieldMapper and DateFieldMapper.
 - Since new and old fields have the same names, the decision about which one
   to use is made based on the index creation version.
 - If you try to force using a legacy field on a new index or a field that uses
   points on an old index, you will get an exception.
 - IP addresses now support IPv6 via Lucene's InetAddressPoint and store them
   in SORTED_SET doc values using the same encoding (fixed length of 16 bytes
   and sortable).
 - The internal MappedFieldType that is stored by the new mappers does not have
   any of the points-related properties set. Instead, it keeps setting the index
   options when parsing the `index` property of mappings and does
   `if (fieldType.indexOptions() != IndexOptions.NONE) { // add point field }`
   when parsing documents.

Known issues that won't fix:
 - You can't use numeric fields in significant terms aggregations anymore since
   this requires document frequencies, which points do not record.
 - Term queries on numeric fields will now return constant scores instead of
   giving better scores to the rare values.

Known issues that we could work around (in follow-up PRs, this one is too large
already):
 - Range queries on `ip` addresses only work if both the lower and upper bounds
   are inclusive (exclusive bounds are not exposed in Lucene). We could either
   decide to implement it, or drop range support entirely and tell users to
   query subnets using the CIDR notation instead.
 - Since IP addresses now use a different representation for doc values,
   aggregations will fail when running a terms aggregation on an ip field on a
   list of indices that contains both pre-5.0 and 5.0 indices.
 - The ip range aggregation does not work on the new ip field. We need to either
   implement range aggs for SORTED_SET doc values or drop support for ip ranges
   and tell users to use filters instead. #17700

Closes #16751
Closes #17007
Closes #11513
2016-04-14 17:56:23 +02:00
Jason Tedor 3879aa2a98 Add JVM options configuration file
This commit adds a new configuration file jvm.options to centralize and
simplify management of JVM options. This separates the configuration of
the JVM from the packaging scripts (bin/elasticsearch*, bin/service.bat,
and init.d/elasticsearch) simplifying end-user operational management of
custom JVM options.
2016-04-12 11:19:16 -04:00
Adrien Grand 4adc31fe11 Use `mmapfs` by default.
I case any problem was discovered, you can still enable the legacy `default`
directory instead. But the plan is to get rid of it in 6.0.

Closes #16983
2016-04-08 20:23:27 +02:00
Jimmy Jones f157dae053 Disallow unquoted field names, fix testcases using unquoted JSON 2016-04-06 14:37:15 -06:00
Martijn van Groningen 7e2696c570 Refactored inner hits parsing and intoduced InnerHitBuilder
Both top level and inline inner hits are now covered by InnerHitBuilder.
Although there are differences between top level and inline inner hits,
they now make use of the same builder logic.

The parsing of top level inner hits slightly changed to be more readable.
Before the nested path or parent/child type had to be specified as encapsuting
json object, now these settings are simple fields. Before this was required
to allow streaming parsing of inner hits without missing contextual information.

Once some issues are fixed with inline inner hits (around multi level hierachy of inner hits),
top level inner hits will be deprecated and removed in the next major version.
2016-03-30 15:15:56 +02:00
Simon Willnauer 8b075dbb75 Remove ability to specify arbitrary node attributes with `node.` prefix
Today the basic node settings like `node.data` and `node.master` can't really be fully validated
since we allow to specify custom user attributes on the node level. We have to, in order to
support that, add a wildcard setting for `node.*` to let these setting pass validation.
Instead we should require a more contraint prefix like `node.attr.` that defines a namespace
that is reserved for user attributes.
This commit adds a new namespace for attributes in `node.attr`.

Closes #17280
2016-03-30 13:29:48 +02:00
Isabel Drost-Fromm f27399dc0e Merge pull request #17282 from MaineC/deprecation/sort-option-reverse-removal
Remove deprecated reverse option from sorting
2016-03-30 11:02:19 +02:00
javanna 19eeb68bc4 Merge branch 'master' into enhancement/remove_node_client_setting 2016-03-29 21:53:22 +02:00
javanna ae34c20a62 add node.client breaking changes to migrate guide 2016-03-29 20:33:59 +02:00
Isabel Drost-Fromm 407e2cdcf9 Merge branch 'master' into deprecation/sort-option-reverse-removal
Conflicts:
	core/src/main/java/org/elasticsearch/search/sort/ScoreSortBuilder.java
	core/src/test/java/org/elasticsearch/search/sort/FieldSortBuilderTests.java
2016-03-29 11:04:02 +02:00
spalger ce44bbfadf [docs] clarify where discovery.zen.minimum_master_node is required
https://github.com/elastic/elasticsearch/pull/17288 added a check to enforce that the `discovery.zen.minimum_master_nodes` configuration is set when nodes have the `host`, `port`, or `bind_host` set in either `transport` or general `network` configuration sections. This was documented incorrectly as "nodes that are bound to a non-loopback interface", which lead to confusion as I set `network.host: "localhost"` and the check was still failing.

This change updates the docs to detail the actual check. I think it also highlights how complex the check is and the need for a simpler solution.
2016-03-28 12:53:40 -07:00
Boaz Leskes b8227a7222 Enforce `discovery.zen.minimum_master_nodes` is set when bound to a public ip #17288
discovery.zen.minimum_master_nodes is the single most important setting to set on a production cluster. We have no way of supplying a good default so it must be set by the user. Binding a node to a public IP (as opposed to the default local host) is a good enough indication that a node will be part of a production cluster cluster and thus it's a good tradeoff to enforce the settings. Note that nothing prevent users from setting it to 1 in a single node cluster.

Closes #17288
2016-03-25 12:56:20 +01:00