435 Commits

Author SHA1 Message Date
Adrien Grand d84c643f58 Use the new points API to index numeric fields. #17746
This makes all numeric fields including `date`, `ip` and `token_count` use
points instead of the inverted index as a lookup structure. This is expected
to perform worse for exact queries, but faster for range queries. It also
requires less storage.

Notes about how the change works:
 - Numeric mappers have been split into a legacy version that is essentially
   the current mapper, and a new version that uses points, e.g.
   LegacyDateFieldMapper and DateFieldMapper.
 - Since new and old fields have the same names, the decision about which one
   to use is made based on the index creation version.
 - If you try to force using a legacy field on a new index or a field that uses
   points on an old index, you will get an exception.
 - IP addresses now support IPv6 via Lucene's InetAddressPoint and store them
   in SORTED_SET doc values using the same encoding (fixed length of 16 bytes
   and sortable).
 - The internal MappedFieldType that is stored by the new mappers does not have
   any of the points-related properties set. Instead, it keeps setting the index
   options when parsing the `index` property of mappings and does
   `if (fieldType.indexOptions() != IndexOptions.NONE) { // add point field }`
   when parsing documents.
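
The fixed-length 16-byte encoding mentioned above can be illustrated without
Lucene. This is a minimal sketch, assuming IPv4 addresses are stored in
IPv4-mapped IPv6 form (ten zero bytes, then 0xFF 0xFF, then the four address
bytes). The class and method names are hypothetical and are not part of this
commit.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical helper, not the Lucene implementation.
final class IpEncodingSketch {
    // Prefix of an IPv4-mapped IPv6 address: ten zero bytes followed by 0xFF 0xFF.
    private static final byte[] IPV4_PREFIX =
        {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, (byte) 0xFF, (byte) 0xFF};

    // Returns a 16-byte form of a raw 4-byte or 16-byte address.
    static byte[] encode(byte[] raw) {
        if (raw.length == 16) {
            return raw.clone(); // IPv6 addresses already have the target length
        }
        if (raw.length != 4) {
            throw new IllegalArgumentException("expected 4 or 16 bytes but got " + raw.length);
        }
        byte[] out = new byte[16];
        System.arraycopy(IPV4_PREFIX, 0, out, 0, IPV4_PREFIX.length);
        System.arraycopy(raw, 0, out, IPV4_PREFIX.length, raw.length);
        return out;
    }

    public static void main(String[] args) throws UnknownHostException {
        // Literal addresses are parsed locally, so no DNS lookup happens here.
        System.out.println(encode(InetAddress.getByName("192.168.0.1").getAddress()).length);
        System.out.println(encode(InetAddress.getByName("2001:db8::1").getAddress()).length);
    }
}
```

Because every value has the same length, comparing two encoded values byte by
byte (as unsigned bytes) orders IPv4 and IPv6 addresses consistently.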

Known issues that won't fix:
 - You can't use numeric fields in significant terms aggregations anymore since
   this requires document frequencies, which points do not record.
 - Term queries on numeric fields will now return constant scores instead of
   giving better scores to the rare values.

Known issues that we could work around (in follow-up PRs, this one is too large
already):
 - Range queries on `ip` addresses only work if both the lower and upper bounds
   are inclusive (exclusive bounds are not exposed in Lucene). We could either
   decide to implement it, or drop range support entirely and tell users to
   query subnets using the CIDR notation instead.
 - Since IP addresses now use a different representation for doc values,
   aggregations will fail when running a terms aggregation on an ip field on a
   list of indices that contains both pre-5.0 and 5.0 indices.
 - The ip range aggregation does not work on the new ip field. We need to either
   implement range aggs for SORTED_SET doc values or drop support for ip ranges
   and tell users to use filters instead. #17700
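
To illustrate why CIDR notation fits the inclusive-bounds restriction: a subnet
is exactly an inclusive range from the address with all host bits cleared to the
address with all host bits set. The sketch below is a hypothetical helper, not
code from this commit, and works on raw address bytes of any length.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical helper showing that a CIDR block maps to inclusive bounds.
final class CidrRangeSketch {
    // Returns {lower, upper}, the inclusive bounds of address/prefixLength.
    static byte[][] bounds(byte[] address, int prefixLength) {
        int totalBits = address.length * 8;
        if (prefixLength < 0 || prefixLength > totalBits) {
            throw new IllegalArgumentException("invalid prefix length " + prefixLength);
        }
        byte[] lower = address.clone();
        byte[] upper = address.clone();
        for (int bit = prefixLength; bit < totalBits; bit++) {
            int mask = 0x80 >>> (bit % 8); // most significant bit of each byte first
            lower[bit / 8] &= ~mask;       // clear host bit for the lower bound
            upper[bit / 8] |= mask;        // set host bit for the upper bound
        }
        return new byte[][] {lower, upper};
    }

    public static void main(String[] args) throws UnknownHostException {
        byte[] address = InetAddress.getByName("192.168.1.77").getAddress();
        byte[][] range = bounds(address, 24);
        System.out.println(InetAddress.getByAddress(range[0]).getHostAddress()); // 192.168.1.0
        System.out.println(InetAddress.getByAddress(range[1]).getHostAddress()); // 192.168.1.255
    }
}
```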

Closes #16751
Closes #17007
Closes #11513
2016-04-14 17:56:23 +02:00
Jason Tedor 3879aa2a98 Add JVM options configuration file
This commit adds a new configuration file jvm.options to centralize and
simplify management of JVM options. This separates the configuration of
the JVM from the packaging scripts (bin/elasticsearch*, bin/service.bat,
and init.d/elasticsearch) simplifying end-user operational management of
custom JVM options.
2016-04-12 11:19:16 -04:00
Adrien Grand 4adc31fe11 Use `mmapfs` by default.
In case any problem is discovered, you can still enable the legacy `default`
directory instead. But the plan is to get rid of it in 6.0.

Closes #16983
2016-04-08 20:23:27 +02:00
Jimmy Jones f157dae053 Disallow unquoted field names, fix testcases using unquoted JSON 2016-04-06 14:37:15 -06:00
Martijn van Groningen 7e2696c570 Refactored inner hits parsing and introduced InnerHitBuilder
Both top level and inline inner hits are now covered by InnerHitBuilder.
Although there are differences between top level and inline inner hits,
they now make use of the same builder logic.

The parsing of top level inner hits slightly changed to be more readable.
Before, the nested path or parent/child type had to be specified as an encapsulating
JSON object; now these settings are simple fields. The old format was required
to allow streaming parsing of inner hits without missing contextual information.

Once some issues with inline inner hits are fixed (around multi-level hierarchies of inner hits),
top level inner hits will be deprecated and removed in the next major version.
2016-03-30 15:15:56 +02:00
Simon Willnauer 8b075dbb75 Remove ability to specify arbitrary node attributes with `node.` prefix
Today the basic node settings like `node.data` and `node.master` can't really be fully validated
since we allow custom user attributes to be specified on the node level. In order to
support that, we have to add a wildcard setting for `node.*` to let these settings pass validation.
Instead we should require a more constrained prefix like `node.attr.` that defines a namespace
that is reserved for user attributes.
This commit adds a new namespace for attributes in `node.attr`.
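
The effect of the reserved namespace on validation can be sketched as follows.
This is a simplified illustration, not the actual settings infrastructure: the
class name is hypothetical and the set of known settings is reduced to the two
named above.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical validator for keys in the `node.` namespace.
final class NodeSettingsValidationSketch {
    // Assumption: a tiny stand-in for the registry of built-in node settings.
    private static final Set<String> KNOWN =
        new HashSet<>(Arrays.asList("node.data", "node.master"));
    private static final String ATTR_PREFIX = "node.attr.";

    static boolean isValid(String key) {
        if (key.startsWith(ATTR_PREFIX)) {
            // User attributes are accepted as long as they have a non-empty name.
            return key.length() > ATTR_PREFIX.length();
        }
        // Without a `node.*` wildcard, anything else must be a registered setting.
        return KNOWN.contains(key);
    }

    public static void main(String[] args) {
        System.out.println(isValid("node.attr.rack")); // true
        System.out.println(isValid("node.rack"));      // false
        System.out.println(isValid("node.master"));    // true
    }
}
```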

Closes #17280
2016-03-30 13:29:48 +02:00
Isabel Drost-Fromm f27399dc0e Merge pull request #17282 from MaineC/deprecation/sort-option-reverse-removal
Remove deprecated reverse option from sorting
2016-03-30 11:02:19 +02:00
javanna 19eeb68bc4 Merge branch 'master' into enhancement/remove_node_client_setting 2016-03-29 21:53:22 +02:00
javanna ae34c20a62 add node.client breaking changes to migrate guide 2016-03-29 20:33:59 +02:00
Isabel Drost-Fromm 407e2cdcf9 Merge branch 'master' into deprecation/sort-option-reverse-removal
Conflicts:
	core/src/main/java/org/elasticsearch/search/sort/ScoreSortBuilder.java
	core/src/test/java/org/elasticsearch/search/sort/FieldSortBuilderTests.java
2016-03-29 11:04:02 +02:00
spalger ce44bbfadf [docs] clarify where discovery.zen.minimum_master_nodes is required
https://github.com/elastic/elasticsearch/pull/17288 added a check to enforce that the `discovery.zen.minimum_master_nodes` configuration is set when nodes have the `host`, `port`, or `bind_host` set in either `transport` or general `network` configuration sections. This was documented incorrectly as "nodes that are bound to a non-loopback interface", which led to confusion as I set `network.host: "localhost"` and the check was still failing.

This change updates the docs to detail the actual check. I think it also highlights how complex the check is and the need for a simpler solution.
2016-03-28 12:53:40 -07:00
Boaz Leskes b8227a7222 Enforce `discovery.zen.minimum_master_nodes` is set when bound to a public ip #17288
`discovery.zen.minimum_master_nodes` is the single most important setting to set on a production cluster. We have no way of supplying a good default, so it must be set by the user. Binding a node to a public IP (as opposed to the default localhost) is a good enough indication that a node will be part of a production cluster, and thus it's a good tradeoff to enforce the setting. Note that nothing prevents users from setting it to 1 in a single-node cluster.
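
For context on why no default fits every cluster: the value usually recommended
in the Elasticsearch documentation is a quorum of the master-eligible nodes,
which depends on cluster size. A minimal sketch of that calculation follows;
the helper is hypothetical and is not part of this commit.

```java
// Hypothetical helper computing the commonly recommended quorum value.
final class MinimumMasterNodesSketch {
    // Quorum of master-eligible nodes: (n / 2) + 1 using integer division.
    static int quorum(int masterEligibleNodes) {
        if (masterEligibleNodes < 1) {
            throw new IllegalArgumentException("need at least one master-eligible node");
        }
        return masterEligibleNodes / 2 + 1;
    }

    public static void main(String[] args) {
        System.out.println(quorum(1)); // 1, the single-node case mentioned above
        System.out.println(quorum(3)); // 2
    }
}
```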

Closes #17288
2016-03-25 12:56:20 +01:00
Boaz Leskes 6dd164d0bd Include pings from client nodes in master election
We currently have a `discovery.zen.master_election.filter_client` setting that controls whether ping responses from client nodes are ignored for master election (they currently are by default). With the push to treat client nodes as normal nodes (and promote the transport/rest clients for client work), this should be changed. This commit removes this setting and its companion `discovery.zen.master_election.filter_data` setting (currently defaulting to false) in favor of a single `discovery.zen.master_election.ignore_non_master_pings` setting with a more intuitive name (defaulting to false).
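
The behaviour of the new setting can be sketched as a simple filter over ping
responses. The `Ping` class and `filter` method below are hypothetical stand-ins
for illustration only and do not mirror the real discovery classes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model of how `ignore_non_master_pings` could affect election input.
final class PingFilterSketch {
    // Stand-in for a ping response, reduced to the fields this sketch needs.
    static final class Ping {
        final String nodeName;
        final boolean masterEligible;

        Ping(String nodeName, boolean masterEligible) {
            this.nodeName = nodeName;
            this.masterEligible = masterEligible;
        }
    }

    // With the default (false) every response counts, including client nodes.
    static List<Ping> filter(List<Ping> pings, boolean ignoreNonMasterPings) {
        List<Ping> kept = new ArrayList<>();
        for (Ping ping : pings) {
            if (!ignoreNonMasterPings || ping.masterEligible) {
                kept.add(ping);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<Ping> pings = Arrays.asList(
            new Ping("master-1", true), new Ping("data-1", false), new Ping("client-1", false));
        System.out.println(filter(pings, false).size()); // 3
        System.out.println(filter(pings, true).size());  // 1
    }
}
```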

Resolves #17325
Closes #17329
2016-03-24 17:48:05 +01:00
javanna ce86fc5647 Cluster Stats: remove mem section
The available memory metric has always been `0` since 2.0.beta1 (bug): it was left behind but never set. It turns out the section wasn't that useful anyway, as it would only output the total memory available across all nodes in the cluster, so we decided to remove it.
2016-03-24 15:49:27 +01:00
Isabel Drost-Fromm 08d989d9b6 Merge branch 'master' into deprecation/sort-option-reverse-removal
Conflicts:
	core/src/main/java/org/elasticsearch/search/sort/FieldSortBuilder.java
	core/src/main/java/org/elasticsearch/search/sort/ScoreSortBuilder.java
2016-03-24 12:06:10 +01:00
Isabel Drost-Fromm 801d178ade Remove mention of reverse in docs and add to migration doc 2016-03-24 12:04:31 +01:00
Jim Ferenczi da42f199bd Enforce isolated mode for all plugins
This commit removes the isolated option; each plugin now has its own classloader.
2016-03-24 09:17:33 +01:00
Areek Zillur e16e113691 Remove suggest threadpool
In #17198, we removed the suggest transport action, which
used the `suggest` threadpool to execute requests. Now the
`suggest` threadpool is unused and suggest requests are
executed on the `search` threadpool.
2016-03-23 18:01:45 -04:00
Areek Zillur 442a6e0009 document suggest stats being merged with search stats 2016-03-23 16:37:57 -04:00
Areek Zillur e7e93f98e3 add migration guide to use search api for suggest 2016-03-23 16:37:57 -04:00
Colin Goodheart-Smithe d6fe7515fd Merge pull request #17243 from colings86/docs/searchRequestBreakingChanges
added breaking changes for the Java API to the breaking changes doc for 5.0
2016-03-22 15:58:40 +00:00
Colin Goodheart-Smithe 25c4446942 iter 2016-03-22 15:58:12 +00:00
Colin Goodheart-Smithe ee7e84acc3 review comments 2016-03-22 15:34:47 +00:00
Adrien Grand c52b1f3a7c An `exists` query on an object should query a single term.
Currently if you run an `exists` query on an object, it will resolve all sub
fields and create a disjunction for all those fields. However the `_field_names`
mapper indexes paths for objects so we could query object paths directly.

I also changed the query parser to reject `exists` queries if the `_field_names`
field is disabled since it would be a big performance trap.
2016-03-22 16:26:45 +01:00
Adrien Grand b42f66c8ac Document 5.0 mapping changes. 2016-03-22 16:22:58 +01:00
Colin Goodheart-Smithe b8a96d9a65 added breaking changes for the Java API to the breaking changes doc for 5.0 2016-03-22 14:39:16 +00:00
Simon Willnauer 7f16a1d9a7 Improve upgrade experience of node level index settings
In 5.0 we don't allow index settings to be specified on the node level, i.e.
in yaml files or via commandline arguments. This can cause problems during
upgrade if this was used extensively. For instance, if analyzers were
specified on a node level this might cause the index to be closed when
imported (see #17187). In such a case all indices relying on this
must be updated via `PUT /${index}/_settings`. Yet, this API has slightly
different semantics since it overrides existing settings. To make this less
painful this change adds a `preserve_existing` parameter on that API to ensure
we have the same semantics as if the setting was applied on the node level.
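
The difference in semantics can be sketched with plain maps. This is a
simplified illustration of the idea, not the actual update-settings code; the
class and method names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical model of a settings update with and without preserve_existing.
final class PreserveExistingSketch {
    static Map<String, String> apply(Map<String, String> current,
                                     Map<String, String> update,
                                     boolean preserveExisting) {
        Map<String, String> result = new LinkedHashMap<>(current);
        for (Map.Entry<String, String> entry : update.entrySet()) {
            if (preserveExisting) {
                // Behaves like a node-level default: only fills in missing keys.
                result.putIfAbsent(entry.getKey(), entry.getValue());
            } else {
                // Regular update: the new value overrides the existing one.
                result.put(entry.getKey(), entry.getValue());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> current = new LinkedHashMap<>();
        current.put("index.translog.durability", "request");
        Map<String, String> update = new LinkedHashMap<>();
        update.put("index.translog.durability", "async");
        update.put("index.query.default_field", "main_field");
        System.out.println(apply(current, update, true));
        System.out.println(apply(current, update, false));
    }
}
```

With `preserveExisting` set, an index that already defines a setting keeps its
own value, which is what a node-level default would have done.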

This change also adds a better error message and a change to the migration guide
to ensure upgrades are smooth if index settings are specified on the node level.

If an index setting is detected, this change fails the node startup and prints a message
like this:
```
*************************************************************************************
Found index level settings on node level configuration.

Since elasticsearch 5.x index level settings can NOT be set on the nodes
configuration like the elasticsearch.yaml, in system properties or command line
arguments.In order to upgrade all indices the settings must be updated via the
/${index}/_settings API. Unless all settings are dynamic all indices must be closed
in order to apply the upgradeIndices created in the future should use index templates
to set default values.

Please ensure all required values are updated on all indices by executing:

curl -XPUT 'http://localhost:9200/_all/_settings?preserve_existing=true' -d '{
  "index.number_of_shards" : "1",
  "index.query.default_field" : "main_field",
  "index.translog.durability" : "async",
  "index.ttl.disable_purge" : "true"
}'
*************************************************************************************
```
2016-03-21 20:12:18 +01:00
Martijn van Groningen e3b7e5d75a percolator: Replace percolate api with the new percolator query
Also replaced the PercolatorQueryRegistry with the new PercolatorQueryCache.

The PercolatorFieldMapper stores the rewritten form of each percolator query's xcontent
in a binary doc values field. This makes sure that the query rewrite happens only during
indexing (some queries, for example, fetch shapes or terms in remote indices) and
speeds up the loading of the queries into the percolator query cache.

Because the percolator now works inside the search infrastructure a number of features
(sorting fields, pagination, fetch features) are available out of the box.

The following feature requests are automatically implemented via this refactoring:

Closes #10741
Closes #7297
Closes #13176
Closes #13978
Closes #11264
Closes #4317
2016-03-21 12:21:50 +01:00
Martijn van Groningen 3b17ddcd46 Removed old 1.x parent/child logic that should have been removed.
`0` really means, don't match any child docs.
2016-03-18 10:07:27 +01:00
Martijn van Groningen 1dd2be81c3 nested / parent child: Removed `total` score mode in favour of `sum` score mode.
Closes #17083
2016-03-18 10:07:26 +01:00
Areek Zillur da165f425f update migration doc for removing gateway.format setting 2016-03-16 18:48:02 -04:00
Christoph Büscher 39667b5793 Merge branch 'master' into feature-suggest-refactoring
Conflicts:
	docs/reference/migration/migrate_5_0/java.asciidoc
2016-03-16 12:06:42 +01:00
Jason Tedor 618441aea3 Merge pull request #17088 from jasontedor/simplify-bootstrap-settings
Bootstrap does not set system properties
2016-03-15 19:25:16 -04:00
Jason Tedor 2f7e181318 Fix typo inadvertently introduced 2016-03-15 10:05:28 -04:00
Christoph Büscher bc84cdfed1 Using SortMode enum in all sort builders 2016-03-15 12:43:19 +01:00
Christoph Büscher b4b874f0d8 Merge branch 'master' into feature-suggest-refactoring 2016-03-15 12:11:39 +01:00
Areek Zillur 35f7cfb6c0 Add upgrader to upgrade old indices to new naming convention 2016-03-14 23:24:05 -04:00
Christoph Büscher 97638c95fc Merge branch 'master' into feature-suggest-refactoring
Conflicts:
	docs/reference/migration/migrate_5_0.asciidoc
2016-03-14 11:13:47 +01:00
Clinton Gormley c90b4f3bae Docs: Added note about upgrading from 1.x to 5.x 2016-03-14 09:58:46 +01:00
Jason Tedor 8a05c2a2be Bootstrap does not set system properties
Today, certain bootstrap properties are set and read via system
properties. This action-at-distance way of managing these properties is
rather confusing, and completely unnecessary. But another problem exists
with setting these as system properties. Namely, these system properties
are interpreted as Elasticsearch settings, not all of which are
registered. This leads to Elasticsearch failing to start up if any of
these special properties are set. Instead, these properties should be
kept as local as possible, and passed around as method parameters where
needed. This eliminates the action-at-distance way of handling these
properties, and eliminates the need to register these non-setting
properties. This commit does exactly that.

Additionally, today we use the "-D" command line flag to set the
properties, but this is confusing because "-D" is a special flag to the
JVM for setting system properties. This creates confusion because some
"-D" properties should be passed via arguments to the JVM (so via
ES_JAVA_OPTS), and some should be passed as arguments to
Elasticsearch. This commit changes the "-D" flag for Elasticsearch
settings to "-E".
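
The idea of keeping "-E" settings local rather than pushing them into system
properties can be sketched as below. This is a simplified, hypothetical parser
that only handles the `-Ekey=value` form; it is not the actual command line
handling added by this commit.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical parser that collects -Ekey=value arguments into a local map.
final class SettingArgsSketch {
    static Map<String, String> parse(String[] args) {
        Map<String, String> settings = new LinkedHashMap<>();
        for (String arg : args) {
            if (!arg.startsWith("-E")) {
                continue; // other arguments are out of scope for this sketch
            }
            int eq = arg.indexOf('=');
            if (eq <= 2) { // no '=' at all, or an empty key such as "-E=value"
                throw new IllegalArgumentException("expected -Ekey=value but got [" + arg + "]");
            }
            settings.put(arg.substring(2, eq), arg.substring(eq + 1));
        }
        return settings; // passed around as a parameter, never via System.setProperty
    }

    public static void main(String[] args) {
        Map<String, String> settings = parse(new String[] {"-Ecluster.name=test", "-d"});
        System.out.println(settings.get("cluster.name"));       // test
        System.out.println(System.getProperty("cluster.name")); // null
    }
}
```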
2016-03-13 20:09:15 -04:00
Jason Tedor 8ac5a98b87 Remove links to nonexistent migration docs 2016-03-13 19:12:06 -04:00
Clinton Gormley 5f48b9c86a Removed breaking changes docs for < 5.0 2016-03-13 21:18:44 +01:00
Clinton Gormley 5c845f8bb5 Reworked 5.0 breaking changes docs 2016-03-13 21:17:48 +01:00
Jason Tedor f465d98eb3 Add raw recovery progress to cat recovery API
This commit adds fields bytes_recovered and files_recovered to the cat
recovery API. These fields, respectively, indicate the total number of
bytes and files recovered. Additionally, for consistency, some totals
fields and translog recovery fields have been renamed.

Closes #17064
2016-03-11 08:27:09 -05:00
Christoph Büscher daeffb149c Merge branch 'master' into feature-suggest-refactoring 2016-03-11 10:37:28 +01:00
Daniel Mitterdorfer 94aa025b93 Document breaking change in ClusterHealthResponse in 2.2 2016-03-11 09:47:53 +01:00
Yannick Welsch 295d33c2a6 Merge pull request #17021 from ywelsch/fix/block-delete-on-snapshot
Fail closing or deleting indices during a full snapshot
2016-03-10 18:51:04 +01:00
Yannick Welsch 266394c3ab Fail closing or deleting indices during a full snapshot
Closes #16321
2016-03-10 18:11:47 +01:00
Lee Hinman 22e716551b Add -XX:+AlwaysPreTouch JVM flag
Enables the touching of all memory pages used by the JVM heap spaces
during initialization of the HotSpot VM, which commits all memory pages
at initialization time. By default, pages are committed only as they are
needed.
2016-03-10 10:11:32 -07:00
Nicholas Knize f7a2dbfcaf fixing silly typo in docs 2016-03-10 07:28:13 -06:00