Commit Graph

1313 Commits

Author SHA1 Message Date
Martijn van Groningen 6a2f9c2682 docs: fixed title out of sequence 2015-04-23 09:57:31 +02:00
Martijn van Groningen 5705537ecf Added field stats api
The field stats api returns field level statistics such as lowest, highest values and number of documents that have at least one value for a field.

An api like this can be useful to explore a data set you don't know much about. For example you can figure at with the lowest and highest response times are, so that you can create a histogram or range aggregation with sane settings.

This api doesn't run a search to figure this statistics out, but rather use the Lucene index look these statics up (using Terms class in Lucene). So finding out these stats for fields is cheap and quick.

The min/max values are based on the type of the field. So for a numeric field min/max are numbers and date field the min/max date and other fields the min/max are term based.

Closes #10523
2015-04-23 08:52:34 +02:00
Lee Hinman a4f98e7400 [DOCS] Add example of setting disk threshold decider settings
Fixes #10686
2015-04-22 11:53:19 -06:00
Clinton Gormley a60571c597 Docs: Removed some unused callout from the scroll docs 2015-04-22 12:49:06 +02:00
Jun Ohtani 0955c127c0 Rest: Add json in request body to scroll, clear scroll, and analyze API
Change analyze.asciidoc and scroll.asciidoc
Add json support to Analyze and Scroll, and clear scrollAPI
Add rest-api-spec/test

Closes #5866
2015-04-22 17:53:20 +09:00
Nicholas Knize 453217fd7a [GEO] Prioritize tree_level and precision parameters over default distance_error_pct
If a user explicitly defined the tree_level or precision parameter in a geo_shape mapping their specification was always overridden by the default_error_pct parameter (even though our docs say this parameter is a 'hint'). This lead to unexpected accuracy problems in the results of a geo_shape filter. (example provided in issue #9691)

This simple patch fixes the unexpected behavior by setting the default distance_error_pct parameter to zero when the tree_level or precision parameters are provided by the user. Under the covers the quadtree will now use the tree level defined by the user. The docs will be updated to alert the user to exercise caution with these parameters.  Specifying a precision of "1m" for an index using large complex shapes can quickly lead to OOM issues.

closes #9691
2015-04-21 14:42:10 -05:00
Adrien Grand d7abb12100 Replace deprecated filters with equivalent queries.
In Lucene 5.1 lots of filters got deprecated in favour of equivalent queries.
Additionally, random-access to filters is now replaced with approximations on
scorers. This commit
 - replaces the deprecated NumericRangeFilter, PrefixFilter, TermFilter and
   TermsFilter with NumericRangeQuery, PrefixQuery, TermQuery and TermsQuery,
   wrapped in a QueryWrapperFilter
 - replaces XBooleanFilter, AndFilter and OrFilter with a BooleanQuery in a
   QueryWrapperFilter
 - removes DocIdSets.isBroken: the new two-phase iteration API will now help
   execute slow filters efficiently
 - replaces FilterCachingPolicy with QueryCachingPolicy

Close #8960
2015-04-21 15:32:43 +02:00
markharwood 63db34f649 New feature - Sampler aggregation used to limit any nested aggregations' processing to a sample of the top-scoring documents.
Optionally, a “diversify” setting can limit the number of collected matches that share a common value such as an "author".

Closes #8108
2015-04-21 10:22:05 +01:00
Adrien Grand f4d5914511 Docs: Warn about the fact that min_doc_count=0 might return terms that only belong to different types. 2015-04-21 00:57:57 +02:00
Honza Král e929c1560d [DOCS] Be explicit about scan doing no scoring 2015-04-20 18:05:45 +02:00
Tanguy Leroux b3d91b1cbb Doc: Change the wording a bit for the HOSTNAME environment variable
I should have done this while merging #9474.
2015-04-17 10:24:50 +02:00
Tanguy Leroux a806314e2c Merge pull request #9474 from AndreKR/export-hostname-for-config
Export the hostname as environment variable
2015-04-17 10:17:55 +02:00
André Hänsel c107f0bcb9 Export the hostname as environment variable and mention it in the docs 2015-04-17 09:17:02 +02:00
Michael McCandless 399f0ccce9 Core: add only_ancient_segments to upgrade API, so only segments with an old Lucene version are upgraded
This option defaults to false, because it is also important to upgrade
the "merely old" segments since many Lucene improvements happen within
minor releases.

But you can pass true to do the minimal work necessary to upgrade to
the next major Elasticsearch release.

The HTTP GET upgrade request now also breaks out how many bytes of
ancient segments need upgrading.

Closes #10213

Closes #10540

Conflicts:
	dev-tools/create_bwc_index.py
	rest-api-spec/api/indices.upgrade.json
	src/main/java/org/elasticsearch/action/admin/indices/optimize/OptimizeRequest.java
	src/main/java/org/elasticsearch/action/admin/indices/optimize/ShardOptimizeRequest.java
	src/main/java/org/elasticsearch/action/admin/indices/optimize/TransportOptimizeAction.java
	src/main/java/org/elasticsearch/index/engine/InternalEngine.java
	src/test/java/org/elasticsearch/bwcompat/StaticIndexBackwardCompatibilityTest.java
	src/test/java/org/elasticsearch/index/engine/InternalEngineTests.java
	src/test/java/org/elasticsearch/rest/action/admin/indices/upgrade/UpgradeReallyOldIndexTest.java
2015-04-16 05:24:33 -04:00
Alex Ksikes d339ee4005 Term Vectors: terms filtering
This adds a new feature to the Term Vectors API which allows for filtering of
terms based on their tf-idf scores. With `dfs` option on, this could be useful
for finding out a good characteric vector of a document or a set of documents.
The parameters are similar to the ones used in the MLT Query.

Closes #9561
2015-04-14 19:11:09 +02:00
Clinton Gormley 7c147b2db5 Doc: Updates to resiliency page for 1.5.0/1 2015-04-14 15:30:33 +02:00
Alex Ksikes c347dfe91c Validate API: support for verbose explanation of succesfully validated queries
This commit adds a `rewrite` parameter to the validate API in order to shown
how the given query is re-written into primitive queries. For example, an MLT
query is re-written into a disjunction of the selected terms. Other use cases
include `fuzzy`, `common_terms`, or `match` query especially with a
`cutoff_frequency` parameter. Note that the explanation is only given for a
single randomly chosen shard only, so the output may vary from one shard to
another.

Relates #1412
Closes #10147
2015-04-13 19:17:58 +02:00
Clinton Gormley ab3fa78ae0 Docs: Reverte migration docs mentioning parent removal from update request
Relates to #9612
2015-04-13 16:35:21 +02:00
Benoit Delbosc 1b35854768 Docs: Fix simple_query_string example
The "&" is not part of the simple_query_string DSL

Closes #10563
2015-04-13 14:46:47 +02:00
Adrien Grand ab8926bc6a Docs: fix build. 2015-04-10 17:38:36 +02:00
Adrien Grand 5b3cc2f07c Search: deprecate the limit filter.
This is really a Collector instead of a filter. This commit deprecates the
`limit` filter, makes it a no-op and recommends to use the `terminate_after`
parameter instead that we introduced in the meantime.
2015-04-10 17:18:50 +02:00
Adrien Grand 919589b908 Queries: Remove fuzzy-like-this support.
The fuzzy-like-this query builds very expensive queries and only serves esoteric
use-cases.
2015-04-10 17:16:02 +02:00
Clinton Gormley abc7de96ae Docs: Updated version annotations in master 2015-04-09 14:50:11 +02:00
David Pilato 88ee7a5dca Deprecate rivers
* In code, we mark `River`, `AbstractRiverComponent`, `RiverComponent` and `RiverName` classes as deprecated
* We log that information when a cluster is still using it
* We add this information in the plugins list as well
2015-04-09 14:29:16 +02:00
Adrien Grand fae124103a Merge pull request #10420 from jpountz/feature/numeric_resolution
Mappings: Bring back numeric_resolution.

Close #10420
2015-04-09 12:28:33 +02:00
Adrien Grand aecd9ac515 Aggregations: Speed up include/exclude in terms aggregations with regexps.
Today we check every regular expression eagerly against every possible term.
This can be very slow if you have lots of unique terms, and even the bottleneck
if your query is selective.

This commit switches to Lucene regular expressions instead of Java (not exactly
the same syntax yet most existing regular expressions should keep working) and
uses the same logic as RegExpQuery to intersect the regular expression with the
terms dictionary. I wrote a quick benchmark (in the PR) to make sure it made
things faster and the same request that took 750ms on master now takes 74ms with
this change.

Close #7526
2015-04-09 12:12:56 +02:00
javanna acabf2d55a Cluster state REST api: print routing_nodes out only when requested through specific flag
For bacwards compatibility reasons routing_nodes were previously printed out when routing_table was requested, together with the actual routing_table. Now they are printed out only when requests through `routing_nodes` flag.

Relates to #10412
Closes #10486
2015-04-08 16:10:36 +02:00
marko asplund 5585175173 Docs: fix typos in example JSON data
Closes #10479
2015-04-08 13:40:35 +02:00
javanna d9aebf4906 Scripting: remove deprecated methods from ScriptService
Removed the following methods from `ScriptService`, which don't require the `ScriptContext` argument:

```
public CompiledScript compile(String lang,  String script, ScriptType scriptType)

public ExecutableScript executable(String lang, String script, ScriptType scriptType, Map<String, Object> vars)

public SearchScript search(SearchLookup lookup, String lang, String script, ScriptType scriptType, @Nullable Map<String, Object> vars)
```

Also removed the ScriptContext.Standard.GENERIC_PLUGIN enum value, as it was used only for backwards compatibility.

 Plugins that make use of scripts should declare their own script contexts through `ScriptModule#registerScriptContext` and use them when compiling/executing scripts.

Closes #10476
2015-04-08 12:20:03 +02:00
javanna 7bd7ea8f13 Scripting: allow plugins to define custom operations that they use scripts for
Plugins can now define multiple operations/contexts that they use scripts for. Fine-grained settings can then be used to enable/disable scripts based on each single registered context.

Also added a new generic category called `plugin`, which will be used as a default when the context is not specified. This allows us to restore backwards compatibility for plugins on `ScriptService` by restoring the old methods that don't require the script context and making them internally use the `plugin` context, as they can only be called from plugins.

Closes #10347
Closes #10419
2015-04-08 11:57:00 +02:00
blackorzar 5d4a4fc3ff Docs: Include groovy community
This change includes the a new community: groovy to point to the official client.

Closes #10470
2015-04-08 11:46:05 +02:00
Stefan f7e6d79569 [DOCS] update versions in java api module.xml
The filenames are updated to fit to the current elasticsearch version
2015-04-07 18:10:53 +02:00
Isabel Drost-Fromm 60bb65c4d9 Docs: Note on shard vs. index level doc frequencies.
Relates to #10154 and #10150

Adds link to additional information on how document frequencies are treated across shards to the cutoff_frequency parameter documentation.

Closes #10451
2015-04-07 14:28:01 +02:00
joelbourbon 3c52bc1098 Docs: Missing 1 escape character in example
Closes #10446
2015-04-07 14:10:17 +02:00
Isabel Drost-Fromm 110b4c9d81 Update README.md
Updating link to documentation docs from old elasticsearch to new elastic repository.
2015-04-07 10:31:48 +02:00
Lee Hinman eed7c8af6d [DOCS] Document `indices.recovery.concurrent_small_file_streams` 2015-04-06 11:16:50 -06:00
Christian Rohling 795b889da2 Warning in documentation for deprecation of rivers
Adding a warning in the documentation on the wiki to inform users that rivers are being deprecated.

Closes #10423
2015-04-05 20:54:26 +02:00
Hakan Özler a97af113f4 Add Pes Plugin to the plugin page
I do not know normally whether this section is the right place or not, please let me know.

Here is the pes plugin's GitHub site :https://github.com/kodcu/pes

Closes #10398
2015-04-05 20:14:24 +02:00
Alexander Reelsen 6170c274ba Documentation: Add note about not having sources in repositories
Closes #10390
2015-04-05 18:08:57 +02:00
Dustin Shiver ae60144123 Update for clarification
Make it clear which nodes in the cluster should have `http.enabled` set to `false`.

Closes #10305
2015-04-05 18:05:14 +02:00
Maurizio De Magnis 589f8d4468 typo in ruby docs
Closes #10331
2015-04-05 15:14:40 +02:00
Clinton Gormley 276dbc2925 Update repositories.asciidoc
Added a warning explaining how `add-apt-repository` adds a `deb-src` entry, which can result in errors.

Closes #10223
2015-04-05 11:43:49 +02:00
Clinton Gormley 9607f4c22d Docs: Elasticsearch will refuse to start with a known bad JVM 2015-04-04 20:21:37 +02:00
Clinton Gormley a95b11ca61 Document `doc_values` for field type `ip`
Closes #9809
2015-04-04 17:51:28 +02:00
Clinton Gormley c046398093 Update indices.asciidoc
Fixed typo in cat indices

Relates to #7936
2015-04-04 16:50:04 +02:00
B1nj0y bb7cbea71e Docs: Update client.asciidoc
change a typo.

Closes #10039
2015-04-04 15:21:11 +02:00
Adrien Grand c7115f8364 Mappings: Bring back numeric_resolution.
We had an undocumented parameter called `numeric_resolution` which allows to
configure how to deal with dates when provided as a number. The default is to
handle them as milliseconds, but you can also opt-on for eg. seconds.

Close #10072
2015-04-03 19:54:14 +02:00
Guillaume Dievart adcb782423 Update core-types.asciidoc 2015-04-03 14:12:29 +02:00
Adrien Grand 08f93cf33f Add doc values support to boolean fields.
This pull request makes boolean handled like dates and ipv4 addresses: things
are stored as as numerics under the hood and aggregations add some special
formatting logic in order to return true/false in addition to 1/0.

For example, here is an output of a terms aggregation on a boolean field:
```
   "aggregations": {
      "top_f": {
         "doc_count_error_upper_bound": 0,
         "buckets": [
            {
               "key": 0,
               "key_as_string": "false",
               "doc_count": 2
            },
            {
               "key": 1,
               "key_as_string": "true",
               "doc_count": 1
            }
         ]
      }
   }
```

Sorted numeric doc values are used under the hood.

Close #4678
Close #7851
2015-04-02 15:40:46 +02:00
Tanguy Leroux eeec90be79 [DOCS] Add verify parameter to snapshot documentation
Add verify parameter to snapshot documentation and remove 'verify' setting at FS repository level (not supported)
2015-04-02 14:23:31 +02:00