Commit Graph

7456 Commits

Author SHA1 Message Date
Britta Weber a3cefd919e significant terms: add google normalized distance, add chi square
closes #6858
2014-08-04 08:15:26 +02:00
Shay Banon 95762e8126 Support "default" for tcpNoDelay and tcpKeepAlive
Allow to set the value default to network.tcp.no_delay and network.tcp.keep_alive so they won't be set at all, since on solaris, setting tcpNoDelay can actually cause failure
relates to #7115
2014-08-02 17:32:41 +02:00
uboness 3c9c9f33e2 Aggregations Added Filters aggregation
A multi-bucket aggregation where multiple filters can be defined (each filter defines a bucket). The buckets will collect all the documents that match their associated filter.

This aggregation can be very useful when one wants to compare analytics between different criterias. It can also be accomplished using multiple definitions of the single filter aggregation, but here, the user will only need to define the sub-aggregations only once.

Closes #6118
2014-08-01 16:01:08 +01:00
Adrien Grand d9d5b35be9 Sort: Make `ignore_unmapped` work for cross-index queries.
Close #2255
2014-08-01 15:30:17 +02:00
Stefan Antoni 8e862f15c1 [DOCS] fixed small typo in percolate.asciidoc 2014-08-01 12:38:35 +02:00
Britta Weber d6a18ab2ba Docs: add 1.4.0 label to many to many geo distance sort 2014-08-01 12:30:08 +02:00
Kurt Hurtado 66560acebb Update fielddata-fields.asciidoc 2014-08-01 09:20:19 +02:00
Areek Zillur 1d581e6286 Search Exists API: Checks if any matching documents exist for a given query
Implements a new Exists API allowing users to do fast exists check on any matched documents for a given query.
This API should be faster then using the Count API as it will:
 - early terminate the search execution once any document is found to exist
 - return the response as soon as the first shard reports matched documents

closes #6995
2014-07-31 15:42:30 -04:00
David Pilato 85eb0ea0e7 Generate timestamp when path is null
Index process fails when having `_timestamp` enabled and `path` option is set.
It fails with a `TimestampParsingException[failed to parse timestamp [null]]` message.

Reproduction:

```
DELETE test
PUT  test
{
    "mappings": {
        "test": {
            "_timestamp" : {
                "enabled" : "yes",
                "path" : "post_date"
            }
        }
    }
}
PUT test/test/1
{
  "foo": "bar"
}
```

You can define a default value for when timestamp is not provided
within the index request or in the `_source` document.

By default, the default value is `now` which means the date the document was processed by the indexing chain.

You can disable that default value by setting `default` to `null`. It means that `timestamp` is mandatory:

```
{
    "tweet" : {
        "_timestamp" : {
            "enabled" : true,
            "default" : null
        }
    }
}
```

If you don't provide any timestamp value, indexation will fail.

You can also set the default value to any date respecting timestamp format:

```
{
    "tweet" : {
        "_timestamp" : {
            "enabled" : true,
            "format" : "YYYY-MM-dd",
            "default" : "1970-01-01"
        }
    }
}
```

If you don't provide any timestamp value, indexation will fail.

Closes #4718.
Closes #7036.
2014-07-31 19:48:22 +02:00
Britta Weber fe86c8bc88 _geo_distance sort: allow many to many geo point distance
Add computation of disyance to many geo points. Example request:

```
{
  "sort": [
    {
      "_geo_distance": {
        "location": [
          {
            "lat":1.2,
            "lon":3
          },
          {
             "lat":1.2,
            "lon":3
          }
        ],
        "order": "desc",
        "unit": "km",
        "sort_mode": "max"
      }
    }
  ]
}
```

closes #3926
2014-07-31 17:33:45 +02:00
Clinton Gormley 4b0a89d4fb Update translog.asciidoc
Documented `index.gateway.local.sync`
2014-07-31 14:06:24 +02:00
Clinton Gormley 36e1c7928c Rewrote post-filter.asciidoc
Closes #5166
2014-07-31 12:56:11 +02:00
Nik Everett 34426eb8c2 Docs: Fix syntax on lang-analyzer
Some of the language analyzer documentation contained invalid json.

Closes #7098
2014-07-30 20:17:27 +02:00
Alex Ksikes e3b3b6c055 Term Vectors API: adds support for wildcards in selected fields
This could useful to generate all term vectors or a chosen set of them.

Closes #7061
2014-07-30 17:44:37 +02:00
gabriel-tessier eaac8141cc Docs: Fix typo in scripting.asciidoc
Replace the mvel by groovy in the forgotten place.
I add the previous change in this one.
Sorry for the spam!

Closes #7071
2014-07-29 12:30:09 +02:00
gabriel-tessier c2c2190d27 Docs: Fix typo in scripting.asciidoc
Closes #7070
2014-07-29 12:28:41 +02:00
Adrien Grand 1fe76b891b Docs: Add links to the equivalent aggs in facets documentation. 2014-07-28 15:22:49 +02:00
Lee Hinman 6abe4c951d Add HierarchyCircuitBreakerService
Adds a breaker for request BigArrays, which are used for parent/child
queries as well as some aggregations. Certain operations like Netty HTTP
responses and transport responses increment the breaker, but will not
trip.

This also changes the output of the nodes' stats endpoint to show the
parent breaker as well as the fielddata and request breakers.

There are a number of new settings for breakers now:

`indices.breaker.total.limit`: starting limit for all memory-use breaker,
defaults to 70%

`indices.breaker.fielddata.limit`: starting limit for fielddata breaker,
defaults to 60%
`indices.breaker.fielddata.overhead`: overhead for fielddata breaker
estimations, defaults to 1.03

(the fielddata breaker settings also use the backwards-compatible
setting `indices.fielddata.breaker.limit` and
`indices.fielddata.breaker.overhead`)

`indices.breaker.request.limit`: starting limit for request breaker,
defaults to 40%
`indices.breaker.request.overhead`: request breaker estimation overhead,
defaults to 1.0

The breaker service infrastructure is now generic and opens the path to
adding additional circuit breakers in the future.

Fixes #6129

Conflicts:
	src/main/java/org/elasticsearch/index/fielddata/IndexFieldData.java
	src/main/java/org/elasticsearch/index/fielddata/IndexFieldDataService.java
	src/main/java/org/elasticsearch/index/fielddata/RamAccountingTermsEnum.java
	src/main/java/org/elasticsearch/index/fielddata/ordinals/GlobalOrdinalsBuilder.java
	src/main/java/org/elasticsearch/index/fielddata/ordinals/InternalGlobalOrdinalsBuilder.java
	src/main/java/org/elasticsearch/index/fielddata/plain/AbstractIndexOrdinalsFieldData.java
	src/main/java/org/elasticsearch/index/fielddata/plain/DisabledIndexFieldData.java
	src/main/java/org/elasticsearch/index/fielddata/plain/IndexIndexFieldData.java
	src/main/java/org/elasticsearch/index/fielddata/plain/NonEstimatingEstimator.java
	src/main/java/org/elasticsearch/index/fielddata/plain/PackedArrayIndexFieldData.java
	src/main/java/org/elasticsearch/index/fielddata/plain/ParentChildIndexFieldData.java
	src/main/java/org/elasticsearch/index/fielddata/plain/SortedSetDVOrdinalsIndexFieldData.java
	src/main/java/org/elasticsearch/node/internal/InternalNode.java
	src/test/java/org/elasticsearch/index/aliases/IndexAliasesServiceTests.java
	src/test/java/org/elasticsearch/index/codec/CodecTests.java
	src/test/java/org/elasticsearch/index/fielddata/AbstractFieldDataTests.java
	src/test/java/org/elasticsearch/index/fielddata/IndexFieldDataServiceTests.java
	src/test/java/org/elasticsearch/index/mapper/MapperTestUtils.java
	src/test/java/org/elasticsearch/index/query/IndexQueryParserFilterCachingTests.java
	src/test/java/org/elasticsearch/index/query/SimpleIndexQueryParserTests.java
	src/test/java/org/elasticsearch/index/query/guice/IndexQueryParserModuleTests.java
	src/test/java/org/elasticsearch/index/search/FieldDataTermsFilterTests.java
	src/test/java/org/elasticsearch/index/search/child/ChildrenConstantScoreQueryTests.java
	src/test/java/org/elasticsearch/index/similarity/SimilarityTests.java
2014-07-28 11:27:33 +02:00
Clinton Gormley be86556946 Update request-body.asciidoc
Added link from `timeout` to time-units

Closes #6361
2014-07-28 11:08:59 +02:00
mikemccand 96ecec34d1 Docs: fix documentation for bloom filter defaults 2014-07-27 18:39:29 -04:00
Clinton Gormley c367ae09e3 Update nested-query.asciidoc
Changed score_mode `total` to `sum` to be consistent with parent-child etc
2014-07-26 22:32:28 +02:00
Clinton Gormley 10b4177def Docs: Fixed path to search-shards 2014-07-26 15:05:53 +02:00
Clinton Gormley 88c8754a3c Docs: Removed search-shards from request-body 2014-07-26 14:52:50 +02:00
Clinton Gormley 93d9628975 Docs: Reorganised the search-shards API docs 2014-07-26 14:51:44 +02:00
Colin Goodheart-Smithe 655157c83a Aggregations: Added an option to show the upper bound of the error for the terms aggregation.
This is only applicable when the order is set to _count.  The upper bound of the error in the doc count is calculated by summing the doc count of the last term on each shard which did not return the term.  The implementation calculates the error by summing the doc count for the last term on each shard for which the term IS returned and then subtracts this value from the sum of the doc counts for the last term from ALL shards.

Closes #6696
2014-07-25 14:24:24 +01:00
Justin Honold 593fffc7a1 Docs: Changing ES_MAX_MEM default from '1gb' to '1g'
If you set ES_HEAP_SIZE to '1gb' as suggested, Java will yield an "Invalid initial heap size".

Closes #6824
2014-07-25 12:50:59 +02:00
rendel 50634e6a3d Docs: Added new entry for the SIREn plugin.
Closes #6961
2014-07-25 12:49:50 +02:00
Alexander Reelsen a1e335b1e9 CORS: Support regular expressions for origin to match against
This commit adds regular expression support for the allow-origin
header depending on the value of the request `Origin` header.

The existing HttpRequestBuilder is also extended to support the
OPTIONS HTTP method.

Relates #5601
Closes #6891
2014-07-25 10:51:22 +02:00
Lee Hinman 1fb9f404df [DOCS] correct documentation about groovy/mvel defaults and deprecations 2014-07-25 10:39:33 +02:00
Simon Willnauer bd51d7a07f Add `wait_if_ongoing` option to _flush requests
This commit adds the ability to force blocking on the flush operaition
to make sure all files have been written and synced to disk. Without
this option a flush might be executing at the same time causing the
current flush to fail and return before all files being synced.

Closes #6996
2014-07-24 15:34:53 +02:00
Areek Zillur 5487c56c70 Search & Count: Add option to early terminate doc collection
Allow users to control document collection termination, if a specified terminate_after number is
set. Upon setting the newly added parameter, the response will include a boolean terminated_early
flag, indicating if the document collection for any shard terminated early.

closes #6876
2014-07-23 15:10:15 -04:00
Lee Hinman a1a03a184c [DOCS] Fix nested root object indexing documentation
Types can no longer be specified when indexing, see:
https://github.com/elasticsearch/elasticsearch/pull/4552
2014-07-23 18:34:27 +02:00
Britta Weber 10201d511c [doc] Correct decay function equations in function_score description
Impact of decay and scale was missing from the equations.

Closes #6983
2014-07-23 17:33:22 +02:00
Clinton Gormley 0f943850a0 Update named-queries-and-filters.asciidoc 2014-07-23 17:28:49 +02:00
Simon Willnauer 5bfea56457 [DOCS] move all coming tags to added in master 2014-07-23 16:37:19 +02:00
babeya 81a83aab22 Docs: Update query-string-syntax.asciidoc
Closes #6253
2014-07-23 16:32:32 +02:00
Konrad Feldmeier 48812ff1f2 Reflect that 'field_value_factor' is only in 1.2.x
While the blogpost http://www.elasticsearch.org/blog/2014-04-02-this-week-in-elasticsearch/ states, that feature #5519 was
added to 1.x, the release notes for, e.g. v1.1.2, however tell otherwise.
Only the release notes for 1.2.0 list #5519 as a new feature.

Since the 1.x docs deprecate/discourage from using `_boost`, and seemingly give a migration example at
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping-boost-field.html#function-score-instead-of-boost
users of 1.1.x should be warned.
2014-07-23 15:49:03 +02:00
Peter Johnson @insertcoffee 9a4abc2620 Docs: typo
example fails in bash

Closes #6977
2014-07-23 12:43:43 +02:00
mikemccand cc4d7c6272 Core: don't load bloom filters by default
This change just changes the default for index.codec.bloom.load to
false: with recent performance improvements to ID lookup, such as
#6298, bloom filters don't give much of a performance gain anymore,
and they can consume non-trivial RAM when there are many tiny
documents.

For now, we still index the bloom filters, so if a given app wants
them back, it can just update the index.codec.bloom.load to true.

Closes #6959
2014-07-23 05:58:41 -04:00
Clinton Gormley 3f9aea883f Docs: Made current version, branch and jdk into asciidoc attributes 2014-07-23 11:55:35 +02:00
Clinton Gormley 17df714229 Docs: Change public signing key instructions to work with sudo
Closes #6823
2014-07-23 11:13:03 +02:00
Clinton Gormley 254aa71693 Docs: Added Tiki Wiki integration
Closes #6746
2014-07-23 11:00:46 +02:00
Areek Zillur f39d4e1f89 PhraseSuggester: Collate option should allow returning phrases with no matching docs
A new option `prune` has been added to allow users to control phrase suggestion pruning when `collate`
is set. If the new option is set, the phrase suggestion option will contain a boolean `collate_match`
indicating whether the respective result had hits in collation.

CLoses #6927
2014-07-22 17:17:15 -04:00
Brian Murphy b98f19a54b [DOCS] Fix typo 2014-07-22 14:51:31 +01:00
Brian Murphy 3c5de7d4a1 [DOCS] Fix indentation 2014-07-22 14:49:45 +01:00
Brian Murphy e3b1aed0fc [DOCS] Update examples to groovy. 2014-07-22 14:45:46 +01:00
Nik Everett 79433d23e3 Update: Detect noop updates sent with doc_as_upsert
This should help prevent spurious updates that just cause extra writing
and cache invalidation for no real reason.

Close #6822
2014-07-22 14:55:34 +02:00
Clinton Gormley 8aefaef68a Update scripting.asciidoc
Added an ID for native java scripts
2014-07-22 11:36:40 +02:00
Peter Johnson @insertcoffee 77a2c979ab typo
causes the example to fail in bash
2014-07-21 19:09:22 +02:00
Clinton Gormley a862732434 Docs: Typo 2014-07-21 18:51:49 +02:00
Adrien Grand abeefbddea Docs: Update documentation about execution hints for the terms aggregation. 2014-07-21 11:55:57 +02:00
Clinton Gormley 6a7a77eada Docs: Add links to client helper classes for bulk/scroll/reindexing 2014-07-18 13:55:47 +02:00
Alex Ksikes f22f3db30f Term Vectors API: Computes term vectors on the fly if not stored in the index.
Adds the ability to the Term Vector API to generate term vectors for some
chosen fields, even though they haven't been explicitely stored in the index.

Relates to #5184
Closes #6567
2014-07-17 23:29:05 +02:00
Peter Kim 6a25d9b7b5 [DOCS] Fixed typos 2014-07-17 15:25:34 -04:00
Simon Willnauer f9a9348508 [DOCS] Move benchmark API to 1.4 2014-07-16 15:02:20 +02:00
Brian Murphy d6cd2c2b73 [DOCS][FIX] Fix reference check in indexed scripts/templates doc. 2014-07-16 11:24:18 +01:00
Brian Murphy bc570919ee [DOCS][FIX] Fix doc parsing, broken closing block 2014-07-16 11:18:21 +01:00
Brian Murphy cbd2a97abd [DOCS] : Indexed scripts/templates
These are the docs for the indexed scripts/templates feature.
Also moved the namespace for the REST endpoints.

Closes #6851
2014-07-16 10:49:02 +01:00
Nik Everett da5fb34163 Mappings: Add transform to document before index.
Closes #6566
2014-07-15 18:40:46 +02:00
mikemccand 63cab559e3 Docs: explain that SerialMergeScheduler just maps to CMS for back compat
Closes #6878
2014-07-15 11:38:43 -04:00
Ryan Ernst 64ab22816c Scripting: Add script engine for lucene expressions.
These are javascript expressions, which can only access numeric
fielddata, parameters, and _score. They can only be used for searches (not document updates).

closes #6818
2014-07-15 07:49:01 -07:00
Areek Zillur d0d1b98d23 Stats: Expose IndexWriter and VersionMap RAM usage to ShardStats and _cat endpoint
This commit adds the RAM usage of IndexWriter and VersionMap

Closes #6483
2014-07-14 19:46:12 -04:00
Areek Zillur 76343899ea Phrase Suggester: Add collate option to PhraseSuggester
The newly added collate option will let the user provide a template query/filter which will be executed for every phrase suggestions generated to ensure that the suggestion matches at least one document for the filter/query.
The user can also add routing preference `preference` to route the collate query/filter and additional `params` to inject into the collate template.

Closes #3482
2014-07-14 16:07:52 -04:00
Malte Schirnacher 647a2a64a1 Docs: Update query-string-syntax.asciidoc
Closes #6853
2014-07-14 16:35:17 +02:00
Clinton Gormley 6e70edb0a4 Analysis: Improve Hunspell error messages
The Hunspell service would throw a confusing error message if more than
one affix file was present.  This commit distinguishes between the two
error cases: where there are no affix files and when there are too many
affix files.

Also implements lazy dictionary loading, which was used in the tests
but not implemented.

Closes #6850
2014-07-14 12:13:32 +02:00
Britta Weber 74927adced significant terms: infrastructure for changing easily the significance heuristic
This commit adds the infrastructure to allow pluging in different
measures for computing the significance of a term.
Significance measures can be provided externally by overriding

- SignificanceHeuristic
- SignificanceHeuristicBuilder
- SignificanceHeuristicParser

closes #6561
2014-07-14 11:00:50 +02:00
Igor Motov 60b317caa4 Snapshot/Restore: Add ability to restore indices without their aliases
Closes #6457
2014-07-13 17:52:41 +09:00
Florian Hopf 3689f67a76 Docs: Fixed invalid word count in geodistance agg doc
Closes #6838
2014-07-11 18:35:36 +02:00
mikemccand 6c78147f5f Docs: remove orphan comma 2014-07-11 08:26:08 -04:00
mikemccand b4e80999a7 Docs: fix merge docs to match the code (the max_thread_count default is 'aggressive' (favor SSDs)) 2014-07-11 07:00:57 -04:00
Boaz Leskes f480969503 [Gateway] set a default of 5m to `recover_after_time` when any to the `expected*Nodes` is set
The `recovery_after_time` tells the gateway to wait before starting recovery from disk. The goal here is to allow for more nodes to join the cluster and thus not start potentially unneeded replications. The `expectedNodes` setting (and friends) tells the gateway when it can start recovering even if the `recover_after_time` has not yet elapsed. However, `expectedNodes` is useless if one doesn't set `recovery_after_time`. This commit changes that by setting a sensible default of 5m for `recover_after_time` *if* a `expectedNodes` setting is present.

Closes #6742
2014-07-11 11:28:45 +02:00
Iulia Pasov eed3513c37 Docs: Update plugins.asciidoc to fix typo
Changed the name of the European Environment Agency (from European Environmental Agency)

Closes #6807
2014-07-10 14:04:26 +02:00
Simon Willnauer 154bd0309c [DOCS] Fix typo in reference 2014-07-10 08:47:18 +02:00
Simon Willnauer d82a434d10 [STORE] Make a hybrid directory default using `mmapfs` and `niofs`
`mmapfs` is really good for random access but can have sideeffects if
memory maps are large depending on the operating system etc. A hybrid
solution where only selected files are actually memory mapped but others
mostly consumed sequentially brings the best of both worlds and
minimizes the memory map impact.
This commit mmaps only the `dvd` and `tim` file for fast random access
on docvalues and term dictionaries.

Closes #6636
2014-07-10 00:01:43 +02:00
Shay Banon 8910e09beb Disable JSONP by default
By default, disable the option to use JSONP in our REST layer
closes #6795
2014-07-09 21:17:17 +02:00
Iulia Pasov a79d0744d3 Docs: Update plugins.asciidoc
Closes #6683
2014-07-09 16:15:59 +02:00
Clinton Gormley b6baa4be4a Update preference.asciidoc
Clarify that `preference` is a query string parameter only
and provide an example.
2014-07-09 11:13:17 +02:00
Clinton Gormley 6c30ad1ce6 Docs: Improved the docs for nested mapping
Closes #1643
2014-07-08 15:54:11 +02:00
Clinton Gormley feb81e228b Docs: Rewrote the scroll/scan docs
Closes #6774
2014-07-08 11:54:53 +02:00
Andrii Gakhov 80321d89d9 Docs: Update histogram-aggregation.asciidoc
filter in a filtered query should be under "filter" key

Closes #6738
2014-07-07 10:44:11 +02:00
Carsten Brandt bd4699da7e Docs: fixed a typo in the docs
Closes: #6718
2014-07-07 10:41:36 +02:00
Clinton Gormley e4baa56f4b Docs: Language analyzers
Clarified the use of stem_exclusion and the keyword_marker
token filter

Closes #6613
2014-07-07 10:06:18 +02:00
Clinton Gormley 54790eea10 Update lang-analyzer.asciidoc
Clarified the use of the `stem_exclusion` token filter.

Closes #6613
2014-07-04 17:50:43 +02:00
Shinsuke Sugaya 4bddb4e346 Update plugins.asciidoc 2014-07-05 00:44:02 +09:00
Shikhar Bhushan 1e894111b0 Docs: Link to eskka discovery plugin from doc
Closes #6721
2014-07-04 17:06:51 +02:00
Clinton Gormley d3f8c66e26 Updated cache.asciidoc
The index level filter cache was removed a long time ago

Closes #6455
2014-07-04 14:26:20 +02:00
David Pilato 162c62dbcc [DOCS] Add information regarding _type parameter requirement for _mget
Change ID to `[[mget-type]]`

Closes #6670.
2014-07-03 15:38:06 +02:00
David Pilato de48d7f94c [DOCS] Add information regarding _type parameter requirement for _mget
Closes #6670.
2014-07-03 15:23:35 +02:00
Jun Ohtani 0c6a859357 Docs: fixed ICU plugin documentation
add ICU Normalization CharFilter to docs

Closes #6711
2014-07-03 15:21:51 +02:00
Mikhail Korobov 955473f475 Docs: unescape regexes in Pattern Tokenizer docs
Currently regexes in Pattern Tokenizer docs are escaped (it seems according to Java rules). I think it is better not to escape them because JSON escaping should be automatic in client libraries, and string escaping depends on a client language used. The default pattern is `\W+`, not `\\W+`.

Closes #6615
2014-07-03 13:34:13 +02:00
hanneskaeufler 6e6f4def5d Docs: Fix typo in timestamp-field.asciidoc
Closes #6661
2014-07-03 13:27:37 +02:00
Robert Muir 2935b751e9 Fix doc formatting. Norwegian stemmers and Scandinavian normalizers
were missing commas between entries.
2014-07-03 07:08:33 -04:00
Robert Muir b9a09c2b06 Analysis: Add additional Analyzers, Tokenizers, and TokenFilters from Lucene
Add `irish` analyzer
Add `sorani` analyzer (Kurdish)

Add `classic` tokenizer: specific to english text and tries to recognize hostnames, companies, acronyms, etc.
Add `thai` tokenizer: segments thai text into words.

Add `classic` tokenfilter: cleans up acronyms and possessives from classic tokenizer
Add `apostrophe` tokenfilter: removes text after apostrophe and the apostrophe itself
Add `german_normalization` tokenfilter: umlaut/sharp S normalization
Add `hindi_normalization` tokenfilter: accounts for hindi spelling differences
Add `indic_normalization` tokenfilter: accounts for different unicode representations in Indian languages
Add `sorani_normalization` tokenfilter: normalizes kurdish text
Add `scandinavian_normalization` tokenfilter: normalizes Norwegian, Danish, Swedish text
Add `scandinavian_folding` tokenfilter: much more aggressive form of `scandinavian_normalization`
Add additional languages to stemmer tokenfilter: `galician`, `minimal_galician`, `irish`, `sorani`, `light_nynorsk`, `minimal_nynorsk`

Add support access to default Thai stopword set "_thai_"

Fix some bugs and broken links in documentation.

Closes #5935
2014-07-03 05:47:49 -04:00
Matthew L Daniel 53f2301eea Docs: Add clarifying text about regexp and terms
For the casual reader, the reference to "term queries" may be glossed over, yielding an unexpected result when using `regexp` queries.
This attempts to make that distinction more prominent.

Closes #6698
2014-07-03 11:39:57 +02:00
jnguyenx 1883f74cc0 Docs: Fixed missing comma in multi match query example 2014-07-03 08:17:09 +02:00
Ian Babrou 698eb7de9b Fixed JSON in fielddata docs 2014-07-01 12:53:10 +02:00
Duncan Angus Wilkie 60a8515fb7 Update histogram-facet.asciidoc
Spotted a typo, which I've fixed.
2014-07-01 10:49:43 +02:00
Igor Motov 1425e28639 Add ability to restore partial snapshots
Closes #5742
2014-06-30 20:18:02 -04:00
Lee Hinman b43b56a6a8 Add a transformer to translate constant BigDecimal to double 2014-06-26 10:52:28 +02:00
mahdeto e78f1edca3 DOC:Added field data circuit breaker settings 2014-06-26 10:29:41 +02:00
Clinton Gormley 30c80319c0 Match query with operator and, cutoff_frequency and stacked tokens
If the match query with cutoff_frequency encounters stacked tokens,
like synonyms in the same position, it returns a boolean query instead
of a common terms query.  However, if the original operator was set
to "and", it was ignoring that and resetting the operator to "or".

In fact, if operator is "and" then there is little benefit in using
a common terms query as a must query is already
executed efficiently.
2014-06-25 17:53:43 +02:00
Lee Hinman 5c6d28240f Switch to Groovy as the default scripting language
This is a breaking change to move from MVEL -> Groovy
2014-06-25 12:15:12 +02:00
Clinton Gormley 64a4acc49b Docs: Added IDs to the highlighters for linking 2014-06-22 16:46:42 +02:00
Clinton Gormley cf059378d1 Docs: Updated stop token filter docs 2014-06-21 18:42:38 +02:00
Clinton Gormley fac724cc99 Docs: Updated the explanation about memory usage with parent/child 2014-06-21 16:32:29 +02:00
Clinton Gormley e52364a95a Docs: Updated cluster health docs 2014-06-20 18:05:46 +02:00
Clinton Gormley adf6e794b6 Docs: Rewrote the filtered query docs to be clearer
Closes #1688
2014-06-19 16:34:26 +02:00
Adrien Grand 703dbff83d Index field names of documents.
The `exists` and `missing` filters need to merge postings lists of all existing
terms, which can be very costly, especially on high-cardinality fields. This
commit indexes the field names of a document under `_field_names` and reuses it
to speed up the `exists` and `missing` filters.

This is only enabled for indices that are created on or after Elasticsearch
1.3.0.

Close #5659
2014-06-19 11:50:06 +02:00
Fitblip d18fb8bfbd REST API: Allow to configure JSONP/callback support
Added the http.jsonp.enable option to configure disabling of JSONP responses, as those
might pose a security risk, and can be disabled if unused.

This also fixes bugs in NettyHttpChannel
* JSONP responses were never setting application/javascript as the content-type
* The content-type and content-length headers were being overwritten even if they were set before

Closes #6164
2014-06-19 08:34:38 +02:00
Chris 011e20678d [DOCS] Fixed json example in nested-aggregation.asciidoc 2014-06-18 19:38:02 +02:00
Colin Goodheart-Smithe 7423ce0560 Aggregations: Added percentile rank aggregation
Percentile Rank Aggregation is the reverse of the Percetiles aggregation.  It determines the percentile rank (the proportion of values less than a given value) of the provided array of values.

Closes #6386
2014-06-18 12:02:08 +01:00
Clinton Gormley 69350dc426 Update stemmer-override-tokenfilter.asciidoc 2014-06-18 11:34:20 +02:00
Clinton Gormley 3eb291f334 Docs: tidied configuration.asciidoc 2014-06-17 17:37:07 +02:00
Shay Banon f450c3ea30 update docs to reflect how default write consistency with 1 replica behaves 2014-06-17 14:25:04 +02:00
Matt Janssen 946dde287a [DOCS] Fixed is/if typo in Api Conventions doc 2014-06-16 15:44:47 +02:00
Volker Fröhlich 06192686a2 [DOCS] Fixd typo in http.asciidoc 2014-06-16 10:42:34 +02:00
stephlag 13d910f016 Added missing comma in suggester example 2014-06-13 16:01:04 +02:00
Adrien Grand 7a34702925 [DOCS] Clarify the trade-off of the `disk` doc values format. 2014-06-13 13:24:53 +02:00
Adrien Grand 01327d7136 Facets: deprecation.
Users are encouraged to move to the new aggregation framework that was
introduced in Elasticsearch 1.0.

Close #6485
2014-06-13 13:13:44 +02:00
Clinton Gormley eb6c9fe111 Docs: Linked to fielddata formats from core types
Closes #6489
2014-06-13 12:58:03 +02:00
Boaz Leskes 7fb16c783d Added caching support to geohash_filter
Caching is turned off by default.

Closes #6478
2014-06-12 22:19:34 +02:00
Shay Banon 2330421816 Wait till node is part of cluster state for join process
When a node sends a join request to the master, only send back the response after it has been added to the master cluster state and published.
This will fix the rare cases where today, a join request can return, and the master, since its under load, have not yet added the node to its cluster state, and the node that joined will start a fault detect against the master, failing since its not part of the cluster state.
Since now the join request is longer, also increase the join request timeout default.
closes #6480
2014-06-12 18:15:51 +02:00
Lee Hinman 3a3f81d59b Enable DiskThresholdDecider by default, change default limits to 85/90%
Fixes #6200
Fixes #6201
2014-06-12 16:35:29 +02:00
Clinton Gormley c41e63c2f9 Docs: Updated index-modules/store and setup/configuration
Explain how to set different index storage types, and
added the vm settings required to stop mmapfs from running
out of memory

Closes #6327
2014-06-12 13:56:06 +02:00
shadow000fire 1b45b216fd Update nested-query.asciidoc
Added note that fields inside a nested query must be full qualified.
2014-06-12 12:48:23 +02:00
Luke Fender f9da5259bc [DOCS] Fixed typo in post-filter.asciidoc
Remove 'be' where it is not needed
2014-06-12 12:09:19 +02:00
Igor Motov 56a264cf6d [DOCS] Snapshot/restore: add more information about snapshot and restore monitoring 2014-06-11 20:52:45 -04:00
Clinton Gormley f546662e8f Docs: Hunspell tidied
Tidied some formatting
2014-06-11 21:49:02 +02:00
Clinton Gormley 04dacaaf27 Docs: Use the "stemmer" token filter for the english analyzer, to be consistent 2014-06-11 13:47:07 +02:00
Clinton Gormley 8a94b71b75 Docs: Corrected the use of keyword_marker on the lang analyzers 2014-06-11 13:43:02 +02:00
Clinton Gormley 673ef3db3f The StemmerTokenFilter had a number of issues:
* `english` returned the slow snowball English stemmer
* `porter2` returned the snowball Porter stemmer (v1)
* `portuguese` was used twice, preventing the second version from working

Changes:

* `english` now returns the fast PorterStemmer (for indices created from v1.3.0 onwards)
* `porter2` now returns the snowball English stemmer (for indices created from v1.3.0 onwards)
* `light_english` now returns the `kstem` stemmer (`kstem` still works)
* `portuguese_rslp` returns the PortugueseStemmer
* `dutch_kp` is a synonym for `kp`

Tests and docs updated

Fixes #6345
Fixes #6213
Fixes #6330
2014-06-11 12:30:16 +02:00
Martijn van Groningen 5e408f3d40 Change the top_hits to be a metric aggregation instead of a bucket aggregation (which can't have an sub aggs)
Closes #6395
Closes #6434
2014-06-10 09:09:50 +02:00
Clinton Gormley e323e577e8 Docs: Fixed bad ref on cjk_width/bigram pages 2014-06-09 23:36:58 +02:00
Clinton Gormley 5e40868f44 Docs: Fixed a bad ref on lang analyzers page 2014-06-09 23:03:12 +02:00
Clinton Gormley 5c5c1da06c Docs: Fixed some errors on the language analyzers page 2014-06-09 22:51:28 +02:00
Clinton Gormley 585b0ef730 Docs: Added custom-analyzer equivalents of all the language analyzers 2014-06-09 22:41:25 +02:00
Clinton Gormley bc402d5f87 Docs: Documented the cjk_width and cjk_bigram token filters 2014-06-09 22:40:58 +02:00
Matthew L Daniel b0a85f6ca3 Guard against improper auto_expand_replica values
Previously if the user provided a non-conforming string, it would blow up with
`java.lang.StringIndexOutOfBoundsException: String index out of range: -1`
which is not a *helpful* error message.

Also updated the documentation to make the possible setting values more clear.

Close #5752
2014-06-07 01:19:06 +02:00
markharwood 724129e6ce Aggregations optimisation for memory usage. Added changes to core Aggregator class to support a new mode of deferred collection.
A new "breadth_first" results collection mode allows upper branches of aggregation tree to be calculated and then pruned
to a smaller selection before advancing into executing collection on child branches.

Closes #6128
2014-06-06 15:59:51 +01:00
fransflippo cdbde4a578 [DOCS] Reworded note about shorthand suggest syntax
The existing Note about the shorthand suggest syntax was poorly worded and confusing. Please check whether the way I've phrased it now is still correct as to what the shorthand form actually does and doesn't do: the original wording did not provide me enough information to be sure.
Thanks!
2014-06-06 10:21:01 +02:00
Evgeniy Sokovikov 1383ab77b6 [DOCS] Fixed typo in put-mapping docs
split backwardscompatibility to backwards compatibility
2014-06-05 19:55:11 +02:00
Yervand Aghababyan cb22417cc1 [DOCS] Fixed the fuzzy query docs with correct default value max_expansion option 2014-06-05 19:52:12 +02:00
Steve Fuller e991c1f717 [DOCS] fixed typo in date-format.asciidoc 2014-06-05 19:49:20 +02:00
Jad Naous 5aa84c9aab [DOCS] Fixed typos in aggregations.asciidoc
Fix plural/singular forms.
2014-06-05 19:47:01 +02:00
gseng 7b5807fe4a [DOCS] Fixed typo in object-type.asciidoc 2014-06-05 19:34:50 +02:00
Philip Stevens 4998c0928f [DOCS] Replace facets example with aggregations in warmers docs 2014-06-05 19:22:16 +02:00
Israel Tsadok 1a58016ea1 [DOCS] Add special attributes for indices allocation filtering 2014-06-05 10:38:07 +02:00
Rob Young 07a6143386 [DOCS] Fix grammar in dynamic mappings 2014-06-04 08:56:15 +02:00
Colin Goodheart-Smithe b9f4d44b14 Aggregations: Adds GeoBounds Aggregation
The GeoBounds Aggregation is a new single bucket aggregation which outputs the coordinates of a bounding box containing all the points from all the documents passed to the aggregation as well as the doc count. Geobound Aggregation also use a wrap_logitude parameter which specifies whether the resulting bounding box is permitted to overlap the international date line.  This option defaults to true.

This aggregation introduces the idea of MetricsAggregation which do not return double values and cannot be used for sorting.  The existing MetricsAggregation has been renamed to NumericMetricsAggregation and is a subclass of MetricsAggregation.  MetricsAggregations do not store doc counts and do not support child aggregations.

Closes #5634
2014-06-03 15:59:56 +01:00
violuke 4f99f0c6f1 [DOCS] Improved readability of multi-match query docs 2014-06-03 14:23:34 +02:00
darkwarriors d8765a8f1d [DOCS] fixed urls in nodes-stats docs 2014-06-03 13:48:42 +02:00
Patrik Ragnarsson 9a3368b937 [DOCS] Fix minor error in cluster stats example 2014-06-03 13:38:37 +02:00
Gaurav Arora 4a3837acf0 [DOCS] fix typo in network module docs 2014-06-03 13:19:36 +02:00
James Yu 8994eed82b [DOCS] Update elasticsearch version in repositories.asciidoc 2014-06-03 12:30:51 +02:00
Steve Fuller b800be891f [DOCS] fixed typo in fucntion-score query docs 2014-06-03 12:05:59 +02:00
violuke 0020e5fc0a [DOCS] Improved grammar in multi-match query docs 2014-06-03 11:50:41 +02:00
javanna 5a1ad7b42e [DOCS] fixed curl requests in benchmark docs 2014-06-03 11:47:13 +02:00
leonardo menezes f3eca05c3b [DOCS] removed slowest on single query benchmark requests
Relates to #5904
2014-06-03 11:47:13 +02:00
javanna 3fcbe1d6cf [DOCS] reordered cat apis menu 2014-06-03 11:06:35 +02:00
Ivan Brusic 29bc6bce1a [DOCS] Fielddata cat API added in 1.2.0 2014-06-03 11:06:28 +02:00
salyh db9921fc03 [DOCS] Add community supported MSI installer to docs 2014-06-03 10:59:57 +02:00
salyh 27b38818b7 [DOCS] Add imap river and security plugin to docs 2014-06-03 10:59:50 +02:00
Andrew Raines b2d1b3df4b [DOCS] Clarify that only_expunge_deletes doesn't override expunge_deletes_allowed 2014-06-02 17:49:01 -05:00
Clinton Gormley 46a67b638d Parent/Child: Added min_children/max_children to has_child query/filter
Added support for min_children and max_children parameters to
the has_child query and filter. A parent document will only
be considered if a match if the number of matching children
fall between the min/max bounds.

Closes #6019
2014-05-30 19:38:39 +02:00
Shay Banon 9c98bb3554 Have a dedicated join timeout that is higher than ping.timeout for node join
Using ping.timeout, which defaults to 3s, to use as a timeout value on the join request a node makes to the master once its discovered can be too small, specifically when there is a large cluster state involved (and by definition, all the buffers and such on the nio layer will be "cold"). Introduce a dedicated join.timeout setting, that by default is 10x the ping.timeout (so 30s by default).
closes #6342
2014-05-30 12:42:08 +02:00
Clinton Gormley 7fff6f1f43 Docs: Tidied percolate.asciidoc 2014-05-30 11:56:06 +02:00
Martijn van Groningen aab38fb2e6 Aggregations: added pagination support to `top_hits` aggregation by adding `from` option.
Closes #6299
2014-05-30 11:45:31 +02:00
javanna 74eff87dd6 [DOCS] Java 7 is required since 1.2.0 2014-05-30 10:45:22 +02:00
Adrien Grand 328a7e513c [DOCS] Document filtered query strategies. 2014-05-28 17:57:43 +02:00
David Pilato 1dc186a595 [DOCS] fix typo 2014-05-27 15:57:39 +02:00
Itamar Syn-Hershko ac812f72b7 Docs: Adding Hebrew analyzer
Closes #6306
2014-05-27 13:40:53 +02:00
Martijn van Groningen 5fafd2451a Added `top_hits` aggregation that keeps track of the most relevant document being aggregated per bucket.
Closes #6124
2014-05-23 16:01:18 +02:00
Nik Everett 3573822b7e Highlight fields in request order
Because json objects are unordered this also adds an explicit order syntax
that looks like
    "highlight": {
        "fields": [
            {"title":{ /*params*/ }},
            {"text":{ /*params*/ }}
        ]
    }

This is not useful for any of the builtin highlighters but will be useful
in plugins.

Closes #4649
2014-05-22 16:44:14 +02:00
Simon Willnauer 9d5507047f Update Documentation Feature Flags [1.2.0] 2014-05-22 15:06:42 +02:00
Alex Ksikes 2546c06131 More Like This Query: allow for both 'like_text' and 'docs/ids' to be specified.
Closes #6246
2014-05-22 13:50:17 +02:00
Clinton Gormley f950344546 [DOCS] Fixed title levels in context suggester 2014-05-21 20:47:25 +02:00
Alex Ksikes a29b4a800d More Like This Query: replaced 'exclude' with 'include' to avoid double negation when set.
Closes #6248
2014-05-21 18:45:03 +02:00
Simon Willnauer ec3b1c57ac Move Benchmark release to 1.3 2014-05-21 10:17:59 +02:00
Igor Motov 91c7892305 Add ability to snapshot replicating primary shards
This change adds a new cluster state that waits for the replication of a shard to finish before starting snapshotting process. Because this change adds a new snapshot state, an pre-1.2.0 nodes will not be able to join the 1.2.0 cluster that is currently running snapshot/restore operation.

Closes #5531
2014-05-20 08:57:21 -04:00
Simon Willnauer 85a0b76dbb Upgrade to Lucene 4.8.1
This commit upgrades to the latest Lucene 4.8.1 release including the
following bugfixes:

 * An IndexThrottle now kicks in when merges start falling behind
   limiting index threads to 1 until merges caught up. Closes #6066
 * RateLimiter now kicks in at the configured rate where previously
   the limiter was limiting at ~8MB/sec almost all the time. Closes #6018
2014-05-19 20:47:55 +02:00
Itamar Syn-Hershko d1589b3815 Fixing invalid jsons 2014-05-19 15:07:56 +02:00
Andrew Selden 420f2db4cd [DOCS] Cat recovery API update
This is an update for the _cat/recovery API documentation. The examples
have been updated. Removed the bottom paragraph explaining why there
could be values > 100%. This can no longer happen so that had to be
removed.

Closes #6159
2014-05-18 17:43:13 -07:00
Simon Willnauer f79b28375d Add missing coming tag
Relates to #6188
Relates to #5539
2014-05-18 10:54:17 +02:00
Alex Ksikes db991dc3a4 More Like This Query: Added searching for multiple items.
The syntax to specify one or more items is the same as for the Multi GET API.
If only one document is specified, the results returned are the same as when
using the More Like This API.

Relates #4075 Closes #5857
2014-05-17 19:14:56 +02:00
Igor Motov c20713530d Switch to shared thread pool for all snapshot repositories
Closes #6181
2014-05-16 19:03:15 -04:00
Boaz Leskes 9f10547f4b Allow 0 as a valid external version
Until now all version types have officially required the version to be a positive long number. Despite of this has being documented, ES versions <=1.0 did not enforce it when using the `external` version type. As a result people have succesfully indexed documents with 0 as a version. In 1.1. we introduced validation checks on incoming version values and causing indexing request to fail if the version was set to 0. While this is strictly speaking OK, we effectively have a situation where data already indexed does not match the version invariant.

To be lenient and adhere to spirit of our data backward compatibility policy, we have decided to allow 0 as a valid external version type. This is somewhat complicated as 0 is also the internal value of `MATCH_ANY`, which indicates requests should succeed regardles off the current doc version. To keep things simple, this commit changes the internal value of `MATCH_ANY` to `-3` for all version types.

Since we're doing this in a minor release (and because versions are stored in the transaction log), the default `internal` version type still accepts 0 as a `MATCH_ANY` value. This is not a problem for other version types as `MATCH_ANY` doesn't make sense in that context.

Closes #5662
2014-05-16 22:10:16 +02:00
Clinton Gormley f510e25306 [DOCS] Renamed the "cat" chapters to be more searchable 2014-05-16 21:43:35 +02:00
Clinton Gormley bfeb5a7120 added install instruction with apt
Closes #6206
2014-05-16 19:07:05 +02:00
David Pilato bd871f96c2 Check that a plugin is Lucene compatible with the current running node using `lucene` property in `es-plugin.properties` file.
* If plugin does not provide `lucene` property, we consider that the plugin is compatible.
* If plugin provides `lucene` property, we try to load related Enum org.apache.lucene.util.Version. If this fails, it means that the node is too "old" comparing to the Lucene version the plugin was built for.
* We compare then two first digits of current node lucene version against two first digits of plugin Lucene version. If not equal, it means that the plugin is too "old" for the current node.

Plugin developers who wants to launch plugin check only have to add a `lucene` property in `es-plugin.properties` file. If you are using maven to build your plugin, you can do it like this:

In `pom.xml`:

```xml
    <properties>
        <lucene.version>4.6.0</lucene.version>
    </properties>

    <build>
        <resources>
            <resource>
                <directory>src/main/resources</directory>
                <filtering>true</filtering>
            </resource>
        </resources>
    </build>
```

In `es-plugin.properties`, add:

```properties
lucene=${lucene.version}
```

BTW, if you don't already have it, you can add the plugin version as well:

```properties
version=${project.version}
```

You can disable that check using `plugins.check_lucene: false`.
2014-05-16 13:41:20 +02:00
Clinton Gormley 8f0991c14f [DOCS] Rewrote the memory settings section on the configuration page 2014-05-14 16:02:59 +02:00
Britta Weber 08e57890f8 use shard_min_doc_count also in TermsAggregation
This was discussed in issue #6041 and #5998 .

closes #6143
2014-05-14 14:10:04 +02:00
Gaurav Arora e041b5992c Fix typo in docs 2014-05-14 12:36:35 +02:00
Clinton Gormley ff12585fea Improved wording in search-type.asciidoc
Closes #5951
2014-05-14 12:15:48 +02:00
Clinton Gormley 2912e1cce3 Fixed typo in getting-started.asciidoc
Closes #6064
2014-05-14 12:03:12 +02:00
ericheiker 0eb7b5024d Update match-query.asciidoc 2014-05-14 11:59:12 +02:00
Mahesh Paolini-Subramanya c93e7f26c5 Type is the 'doc-type', not the word 'type' 2014-05-14 11:50:08 +02:00
David Pilato 1cb2c3bdd3 [DOCS] reverse-nested aggs are added in 1.2.0 2014-05-13 20:00:42 +02:00
matdere b9c58adf28 Update repositories.asciidoc
Improved instructions for using YUM
2014-05-13 15:55:46 +02:00
Tiago Alves Macambira a8242e6c8c Clarify `missing` behavior. 2014-05-13 15:49:46 +02:00
Clinton Gormley b331aa1670 [DOCS] Changed coming[1.1.0] to added in snapshot status 2014-05-13 11:19:28 +02:00
Adrien Grand cc530b9037 Use t-digest as a dependency.
Our improvements to t-digest have been pushed upstream and t-digest also got
some additional nice improvements around memory usage and speedups of quantile
estimation. So it makes sense to use it as a dependency now.

This also allows to remove the test dependency on Apache Mahout.

Close #6142
2014-05-13 10:38:08 +02:00
Clinton Gormley 3aac594503 [DOCS] Fix typos in context suggest 2014-05-13 10:34:16 +02:00
markharwood 1e560b0d92 Significant_terms agg: added option for a background_filter to define background context for analysis of term frequencies
Closes #5944
2014-05-13 09:10:30 +01:00
Clinton Gormley 5b93255ec8 [DOCS] Added "Aggregation" to all aggs titles 2014-05-13 01:35:58 +02:00
Rashid Khan 233aaa63c9 Change key to keyed 2014-05-12 13:15:07 -07:00
mikemccand 00fcf4d560 #6081: set IO throttling back to 20 MB/sec now that #6018 is fixed 2014-05-12 14:42:26 -04:00
mikemccand b6ae7fbadb #5882: fix docs 2014-05-12 14:16:27 -04:00
mikemccand 254ebc2f88 #6120 Remove SerialMergeScheduler (master only)
It's dangerous to expose SerialMergeScheduler as an option: since it only allows one merge at a time, it can easily cause merging to fall behind.

Closes #6120
2014-05-12 14:06:20 -04:00
Lee Hinman e7e4ef859a Add /_cat/fielddata to display fielddata usage
Closes #4593
2014-05-09 13:18:02 +02:00
Alex Ksikes dae48d9fe8 Added the ability to include the queried document for More Like This API.
By default More Like This API excludes the queried document from the response.
However, when debugging or when comparing scores across different queries, it
could be useful to have the best possible matched hit. So this option lets users
explicitly specify the desired behavior.

Closes #6067
2014-05-09 12:59:39 +02:00
Alex Ksikes 48b7172ee7 Provided some insights as to how More Like This works internally.
In the Google Groups forum there appears to be some confusion as to what mlt
does. This documentation update should hopefully help demystifying this
feature, and provide some understanding as to how to use its parameters.

Closes #6092
2014-05-09 12:13:29 +02:00
javanna bd2a616c82 [DOCS] fixed broken json in multi term vectors docs 2014-05-08 16:01:13 +02:00
javanna 2999152e19 [DOCS] fixed typo in multi term vectors docs 2014-05-08 15:50:24 +02:00
Ivan Brusic bac0627c5e Update fielddata.asciidoc
Spelling correction
2014-05-08 10:59:24 +02:00
Ivan Brusic 59e0c34cdb Update fielddata.asciidoc
Fixed default value for circuit breaker
2014-05-08 10:58:10 +02:00
Andrew Selden f23274523a Integration tests for benchmark API.
- Randomized integration tests for the benchmark API.
- Negative tests for cases where the cluster cannot run benchmarks.
- Return 404 on missing benchmark name.
- Allow to specify 'types' as an array in the JSON syntax when describing a benchmark competition.
- Don't record slowest for single-request competitions.

Closes #6003, #5906, #5903, #5904
2014-05-07 14:14:54 -07:00
mikemccand 9daaae27b3 clarify that CMS defaults change is coming in 1.2 2014-05-07 13:49:54 -04:00
uboness fc52db1209 Changed the respnose structure of the percentiles aggregation where now all the percentiles are placed under a `values` object (or `values` array in case the `keyed` flag is set to `false`
Closes #5870
2014-05-07 18:35:24 +02:00
Chris Earle 12f758e811 [DOCS] Update nodes documentation with all headers
Adds a table with the exhaustive list of all available headers with a brief description (mostly from `org.elasticsearch.rest.action.cat.RestNodesAction`) so that people do not need to go searching for them in the code like I did, or search through `nodes?help`.
2014-05-07 11:18:22 -05:00
Britta Weber 7944369fd1 Add `shard_min_doc_count` parameter for significant terms similar to `shard_size`
Significant terms internally maintain a priority queue per shard with a size potentially
lower than the number of terms. This queue uses the score as criterion to determine if
a bucket is kept or not. If many terms with low subsetDF score very high
but the `min_doc_count` is set high, this might result in no terms being
returned because the pq is filled with low frequent terms which are all sorted
out in the end.

This can be avoided by increasing the `shard_size` parameter to a higher value.
However, it is not immediately clear to which value this parameter must be set
because we can not know how many terms with low frequency are scored higher that
the high frequent terms that we are actually interested in.

On the other hand, if there is no routing of docs to shards involved, we can maybe
assume that the documents of classes and also the terms therein are distributed evenly
across shards. In that case it might be easier to not add documents to the pq that have
subsetDF <= `shard_min_doc_count` which can be set to something like
`min_doc_count`/number of shards  because we would assume that even when summing up
the subsetDF across shards `min_doc_count` will not be reached.

closes #5998
closes #6041
2014-05-07 18:02:56 +02:00
Richard Boulton fdb5eb6555 Update keyword-tokenizer.asciidoc 2014-05-07 15:04:07 +02:00
violuke 9ed34b5a9e Correcting gramma 2014-05-06 18:00:19 +02:00
Clinton Gormley 394a3e4332 [DOCS] Updated the mapping and field mapping docs to use the new format
Closes #6057
2014-05-06 17:21:09 +02:00
Keiji Yoshida 80d7bc3423 Update getting-started.asciidoc
Fixed "Jone Done" to "Jone Doe"
2014-05-06 16:32:33 +02:00
Matthieu Bacconnier 7fd5f18539 Update asciifolding-tokenfilter.asciidoc
Typo
2014-05-06 16:30:09 +02:00
Benjamin Devèze 6feeac98c8 s/boost_factor/boost in custom_filters_score doc
I may be wrong but I think custom_filters_score used boost rather than boost factor?
2014-05-06 16:15:36 +02:00
Clinton Gormley 2e03a6629b Update create-index.asciidoc
Document defaults for `number_of_shards` and `number_of_replicas`

Closes #5899
2014-05-06 16:10:23 +02:00
Audrey d7023fbb3f Update "Character classes" part 2014-05-06 16:05:51 +02:00
Kevin Wang 33d256119d fix field data stats doc 2014-05-06 15:57:00 +02:00
gabriel-tessier 7b0efcbd96 fix typo 2014-05-06 15:54:36 +02:00
Radu Gheorghe c4477f0ded Removed mention of Spatial4J and JTS requirement
AFAIK, on 1.0 at least (and later), those libraries are included.
2014-05-06 14:49:48 +02:00
pickypg 2c11475bdd Update geo-shape-type documentation
Update `geo-shape-type.asciidoc` to include all `GeoShapeType`s supported by the `org.elasticsearch.common.geo.builders.ShapeBuilder`.

Changes include:

1. A tabular mapping of GeoJSON types to Elasticsearch types
2. Listing all types, with brief examples, for all support Elasticsearch types
3. Putting non-standard types to the bottom (really just moving Envelope to the bottom)
4. Linking to all GeoJSON types.
5. Adding whitespace around tightly nested arrays (particularly `multipolygon`) for readability
2014-05-06 14:41:00 +02:00
Kevin Wang 19468880a8 [DOCS] add compass and compress_threshold to binary field mapping doc 2014-05-06 14:27:35 +02:00
Ali Bozorgkhan f1af845795 [DOCS] Fixed a typo
Close #5963
2014-05-06 10:28:13 +02:00
Audrey 52d2f2d229 [DOCS] Update phrase-suggest.asciidoc
Grammatical error

Close #5993
2014-05-06 10:28:13 +02:00
Adrien Grand fc78dd2f13 [DOC] Fix default values for filter cache size and field data circuit breaker.
Relates to #5990
2014-05-06 10:13:05 +02:00
mikemccand 07563379dc fix docs for merging and throttling 2014-05-05 16:22:00 -04:00
Clinton Gormley 7a9aad30f4 [DOCS] Changed score_type to score_mode for has_child/parent queries 2014-05-05 18:30:12 +02:00
Alexander Reelsen d4fcf23057 Cluster State API: Remove index template filtering
The possibility of filtering for index templates in the cluster state API
had been introduced before there was a dedicated index templates API. This
commit removes this support from the cluster state API, as it was not really
clean, requiring you to specify the metadata and the index templates.

Closes #4954
2014-05-05 14:54:14 +02:00
gabriel-tessier 48930c2950 [DOC] Fix typo in function score query documentation. 2014-05-02 23:44:56 +02:00
Alex Ksikes b55d8ed2e3 Fix behavior on default boost factor for More Like This.
A boost terms factor of 1.0 is not the same as no boosting of terms.
The desired behavior is to deactivate boosting by default. If the user
specifies any value other than 0, then boosting is activated.

Closes #6021
2014-05-02 16:59:09 +02:00
Holger Hoffstätte f5c9bf6f0f Update JNA to latest version
Updating to this version allows to configure a special JNA directory,
in case the /tmp directory is mounted with the noexec option, as JNA
extracts some data and tries to execute parts of it.

Also updated documentation to clarify mlockall and memory settings as well
as pointing to the new jna.tmpdir system property.

Closes #5493
2014-05-02 11:52:57 +02:00
Martijn van Groningen 013b319415 Added `reverse_nested` aggregation.
The `reverse_nested` aggregation allows to aggregate on properties outside of the nested scope of a `nested` aggregation.

Closes #5507
2014-05-01 00:23:05 +07:00
Binh Ly fe89b8735a [DOC] Fixed filtered_query typo 2014-04-29 10:24:52 -04:00
Robert Muir 8e0a479316 Upgrade to Lucene 4.8
Closes #5932
2014-04-28 06:45:50 -04:00
Chris Earle 5528370e24 Added type, max, min, queueSize & keepAlive to _cat/thread_pool
Closes #5366
2014-04-28 12:00:27 +02:00
Simon Willnauer f285ffc610 Multi value handling in decay functions
Decay functions currently only use the first value in a field that contains
multiple values to compute the distance to the origin. Instead, it should
consider all distances if more values are in the field and then use
one of min/max/sum/avg which is defined by the user.

Relates to #3960
closes #5940
2014-04-28 11:55:32 +02:00
javanna 5d1d5d6754 [DOCS] Removed leftover indices status link 2014-04-28 11:39:12 +02:00
javanna 1685e3611c [DOCS] Fixed get asciidoc missing section warning 2014-04-28 11:39:12 +02:00
javanna 16468f9ca3 [DOCS] Fixed scripting example 2014-04-28 11:39:12 +02:00
Clinton Gormley 4b9f1d261d Removed indices-status docs.
Related #4854
2014-04-28 10:40:45 +02:00
Lee Hinman 81e83cca74 Disable dynamic scripting by default
Closes #5853
2014-04-25 15:08:26 -06:00
Boaz Leskes 051beb51a3 Version types `EXTERNAL` & `EXTERNAL_GTE` test for version equality in read operation & disallow them in the Update API
Separate version check logic for reads and writes for all version types, which allows different behavior in these cases.
Change `VersionType.EXTERNAL` & `VersionType.EXTERNAL_GTE` to behave the same as `VersionType.INTERNAL` for read operations.
The previous behavior was fit for writes but is useless in reads.

This commit also makes the usage of `EXTERNAL` & `EXTERNAL_GTE` in the update api raise a validation error as it make cause data to
be lost.

Closes #5663 , Closes #5661, Closes #5929
2014-04-25 23:06:12 +02:00
Uwe Dauernheim 080c4ade25 Fix typo 2014-04-25 14:59:10 -06:00
Benoss ed33b022d3 Update setup repositories documentation
Update doc so
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/setup-repositories.html
example is going to 1.1 instead of 0.90
2014-04-25 14:57:23 -06:00
Clinton Gormley c1e03bf860 Update keyword-repeat-tokenfilter.asciidoc 2014-04-24 16:44:02 +02:00
Clinton Gormley 39705aa236 [DOCS] rewrite -> fuzzy_rewrite in match query
Fixed typo
2014-04-23 21:05:14 +02:00
Simon Willnauer b36ef995bb Change default recovery throttling to 50MB / sec
The current setting of 20MB/sec seems to be too conservative given
the capabilities of modern hardware / network throughput.
A 50MB default should provide better out of the box performance.
2014-04-23 15:40:21 +02:00
Robert Muir 8568c18e6f Change default numeric precision_step
Change the default numeric precision_step to 16 for 64-bit types,
8 for 32-bit and 16-bit types. Disable precision_step for the 8-bit
byte type.

Closes #5905
2014-04-23 09:01:25 -04:00
Simon Willnauer b4f0603169 Change default merge throttling to 50MB / sec
The current setting of 20MB/sec seems to be too conservative given
the capabilities of modern hardware. Even on cloud infrastructure this
seems to be too lowish. A 50MB default should provide better out of the box
performance
2014-04-22 21:08:40 +02:00
Binh Ly 1746f2f792 [DOCS] getting started tutorial 2014-04-22 13:33:03 -04:00
Lee Hinman 57bee03193 [DOCS] Add /_search_shards documentation 2014-04-22 08:54:32 -06:00
Simon Willnauer 1cf62e7782 Use unlimited flush_threshold_ops for translog
Currently we use 5k operations as a flush threshold. Indexing 5k documents
per second is rather common which would cause the index to be committed on
the lucene level each time the flush logic runs which is 5 seconds by default.
We should rather use a size based threshold similar to the lucene index writer
that doesn't cause such agressive commits which can slow down indexing significantly
especially since they cause the underlying devices to fsync their data.
2014-04-22 16:37:07 +02:00
Clinton Gormley 3ba8fbbef8 Update benchmark.asciidoc
Fixed incorrect parameter spec for benchmark nodes
2014-04-22 14:16:10 +02:00
Clinton Gormley 0e782331be Update benchmark.asciidoc 2014-04-21 20:39:33 +02:00
Samuel Molinari 909cf4de44 Update function-score-query.asciidoc 2014-04-20 13:39:32 +02:00
David Pilato f3fe50aac4 [DOCS] fix typo 2014-04-19 22:44:44 +02:00
Xiao Yu 4b5e8cec8e Add a site plugin into list
Howdy,

Not sure if this is kosher but I would like to add my site plugin to the list in the docs.
2014-04-17 19:28:37 +02:00
Christoph Frick e3e631eca5 Update allocation.asciidoc 2014-04-17 14:42:58 +02:00
Igor Motov 4c3027729e [DOCS] Make snapshot repository examples consistent 2014-04-16 17:28:43 -04:00
Clinton Gormley 65906d176a Update multi-match-query.asciidoc
Typo
2014-04-16 15:41:38 +02:00
Kouhei Sutou de59cde926 Remove garbage 2014-04-15 17:57:25 +02:00
Simon Willnauer 9898eed30c [DOCS] Update merge docs to reflect the max_merge_at_once property 2014-04-15 16:42:23 +02:00
Simon Willnauer 320a206352 Switch back to ConcurrentMergeScheduler
Load tests showed that SerialMS has problems to keep up with
the merges under high load. We should switch back to CMS
until we have a better story to balance merge
threads / efforts across shards on a single node.

Closes #5817
2014-04-15 16:42:23 +02:00
Scott Wilkerson 9ea0e3a95b Update percolate.asciidoc
fix typo
2014-04-15 16:01:44 +02:00
eliasah c61110c28d Update core-types.asciidoc
Missing bracket
2014-04-15 15:57:04 +02:00
Yousef d7fda621e9 Updated date_formats to new dynamic_date_formats 2014-04-15 15:44:08 +02:00
Andrew Selden 2cf66c4115 Benchmark documentation
Moving benchmark documentation under the search section.

Closes #5786
2014-04-14 14:08:41 -07:00
Peter Dyson f8537183b9 [DOCS] update old status of plugins 2014-04-13 20:18:19 -04:00
Malte Schirnacher 8ce3bba010 Fix typos in percolate.asciidoc
Close #5762 #5763 #5764
2014-04-11 18:09:16 +02:00
Sean Gallagher 80ebd49253 [DOCS] Added tables and fixes to upgrade.asciidoc, fixed version in README.textile
Author: Sean Gallagher
Date: 10 Apr 2014 15:23 EDT
2014-04-10 15:23:07 -04:00
Nik Everett 40f1913cf3 [Docs] Add experimental highlighter plugin 2014-04-10 13:32:34 -04:00
Andrew Selden e2c8ff92ba Benchmark API
Add an API endpoint at /_bench for submitting, listing, and aborting
search benchmarks. This API can be used for timing search requests,
subject to various user-defined settings.

Benchmark results provide summary and detailed statistics on such
values as min, max, and mean time. Values are reported per-node so that
it is easy to spot outliers. Slow requests are also reported.

Long running benchmarks can be viewed with a GET request, or aborted
with a POST request.

Benchmark results are optionally stored in an index for subsequent
analysis.

Closes #5407
2014-04-09 13:06:55 -07:00
Nik Everett af0278b51b [Docs] Allocation setting explanation
Closes #5748
2014-04-09 12:11:36 -06:00
Costin Leau 960d353dbd Remove plugin isolation feature for a future version
relates #5261
2014-04-09 17:28:11 +03:00
Andrew O'Brien 48031b6236 Fixes typo in "Scan" search type documention 2014-04-07 16:01:37 -06:00
Sean Gallagher 5138083e13 Author: Sean Gallagher
Date: Tue Apr 1 12:28:00 2014

Added upgrade.asciidoc and links to it from setup.asciidoc

Author: Sean Gallagher
Date: Apr 1 2014

Added upgrade.asciidoc

Add upgrade instructions
Author: Sean Gallagher
Date: 4/4/14
Closes issue #5651

Fixed upgrade.asciidoc typo and incorrect usage.
Author: Sean Gallagher
Date: 4 Apr 2014
Closes 5651
2014-04-07 14:43:35 -04:00
wittyameta 94278d81e3 Update advanced-scripting.asciidoc 2014-04-07 07:20:13 -06:00
Kevin Wang ecab74fe6c add lucene language model similarities (Dirichlet & JelinekMercer) 2014-04-07 10:48:03 +02:00
Kevin Wang 866c520abb Add doc value for binary field.
Close #5669
2014-04-07 10:18:55 +02:00
gabriel-tessier 000c33aac3 fix typo 2014-04-07 09:23:46 +02:00
Martijn van Groningen ade1d0ef57 Added global ordinals (unique incremental numbering for terms) to fielddata.
Added a terms aggregation implementations that work on global ordinals, which is also the default.

Closes #5672
2014-04-07 11:06:41 +07:00
Lee Hinman 211f740100 Add `getAsRatio` to Settings class, allow DiskThresholdDecider to take percentages
Adds new RatioValue class that parses ratios between 0-100% expressed in
either floating-point (0.13) or percentage (51.12%) notation.

Closes #5690
2014-04-04 13:19:35 -06:00
Karl Meisterheim 6d993bc810 [DOCS] A few grammar and word use corrections 2014-04-04 19:26:38 +02:00
Peter Dyson 233279bb64 [DOCS] Fixed typo 2014-04-04 17:37:56 +02:00
Lee Hinman c3089701f2 [DOCS] remove extraneous ` from cache page 2014-04-02 16:07:00 -06:00
Alexander Reelsen e547e113e1 Geo context suggester: Require precision in mapping
The default precision was way too exact and could lead people to
think that geo context suggestions are not working. This patch now
requires you to set the precision in the mapping, as elasticsearch itself
can never tell exactly, what the required precision for the users
suggestions are.

Closes #5621
2014-04-02 23:51:14 +02:00
Radu Gheorghe b9cb70198e Typo in the description for include_in_all
I know this is uber-minor, but I was confused by the phrase "the raw field value to be copied". I assume "is" was supposed to be instead of "to"
2014-04-02 12:02:12 +02:00
Binh Ly 51a6a95de3 [DOC] Fixed flags example incorrect syntax 2014-04-01 14:43:38 -04:00
Igor Motov d13850814e [DOCS] "F" is not valid false value for boolean type 2014-04-01 08:16:43 -04:00
Nik Everett 1df942b463 [docs] Indices stats groups in nodes api
Closes #5349
2014-03-31 19:54:48 +02:00
Hannes Korte c11293ad78 Fix some typos in documentation. 2014-03-31 13:48:17 +02:00
Alex Brasetvik cd8ed388d9 Document http.cors-settings 2014-03-31 11:34:46 +02:00
Andrew O'Brien bd9c1bc8d9 Update has-parent-filter.asciidoc
"This filter return child..." => This filter returns child...
2014-03-31 00:06:35 +02:00
Kevin Wang ceed22fe00 Add suggest stats
closes #4032
2014-03-28 11:13:54 +01:00
Lee Hinman 8fbd1bdd48 Add the `field_value_factor` function to the function_score query
The `field_value_factor` function uses the value of a field in the
document to influence the score.

A query that looks like:
{
  "query": {
    "function_score": {
      "query": {"match": { "body": "foo" }},
      "functions": [
        {
          "field_value_factor": {
            "field": "popularity",
            "factor": 1.1,
            "modifier": "square"
          }
        }
      ],
      "score_mode": "max",
      "boost_mode": "sum"
    }
  }
}

Would have the score modified by:

square(1.1 * doc['popularity'].value)

Closes #5519
2014-03-27 14:29:37 -06:00
Shay Banon 6fce15beec Tribe: Index level blocks, index conflict settings
allow to configure on the index level which blocks can optionally be applied using tribe.blocks.indices prefix settings.
allow to control what will be done when a conflict is detected on index names coming from several clusters using the tribe.on_conflict setting. Defaults remains "any", but now support also "drop" and "prefer_[tribeName]".
closes #5501
2014-03-27 09:45:20 -07:00
David Pilato 85b9aafaad [DOCS] `_type` instead of Type Field 2014-03-27 08:35:15 +01:00
Igor Motov 3ffd0a1dfa Remove deprecated gateways
Closes #5422
2014-03-26 18:10:51 -04:00
Igor Motov c2e38fbf78 [DOCS] Clarify nested type documentation 2014-03-26 11:57:41 -04:00
javanna 42c36ef72d [DOCS] fixed typo
Closes #5272
2014-03-26 14:51:02 +01:00
Kevin Wang 374b633a4b add uppercase token filter
closes #5539
2014-03-26 15:07:43 +07:00
bleskes 5d832374dd Update Documentation Feature Flags [1.1.0] 2014-03-25 17:51:30 +01:00
Adrien Grand c977a49b76 [DOC] Clarify settings and documentation about norms. 2014-03-25 16:05:23 +01:00
Boaz Leskes fc8dc3f733 [Docs] updated the search template and query template docs 2014-03-25 15:25:02 +01:00
Adrien Grand 1c0b6da0ac Allow to disable norms on an existing field.
Close #4813
2014-03-25 14:13:06 +01:00
Alexander Reelsen 4fc461a97c [DOCS] Moved the template query documentation into search section 2014-03-25 10:01:41 +01:00
Simon Willnauer b4e504df99 [Docs] Add coming tag for context suggester docs 2014-03-25 09:46:49 +01:00
Igor Motov 3414deb215 [DOCS] Mark snapshot status API as coming in 1.1.0 2014-03-24 21:55:19 -04:00
Kevin 1496b03458 Merge null_value for boolean field and remove include_in_all for boolean field in doc
Close #5502
2014-03-24 11:00:57 +01:00
Kevin Wang bfd3236378 Merge GeoPoint specific mapping properties
Close #5505
2014-03-24 09:30:55 +01:00
Jun Ohtani 20e596cb86 fix typo joda-time link 2014-03-21 10:02:53 +01:00
Andrew Selden 89e45fde9c Recovery API
Adds a new API endpoint at /_recovery as well as to the Java API. The
recovery API allows one to see the recovery status of all shards in the
cluster. It will report on percent complete, recovery type, and which
files are copied.

Closes #4637
2014-03-20 10:13:30 -07:00
Alexander Reelsen 8f6e1d4720 Query Templates: Adding dedicated /_search/template endpoint
In order to simplify query template execution an own endpoint has been added

Closes #5353
2014-03-20 17:43:40 +01:00
uboness 7d6ad8d91c Added extended_bounds support for date_/histogram aggs
By default the date_/histogram returns all the buckets within the range of the data itself, that is, the documents with the smallest values (on which with histogram) will determine the min bucket (the bucket with the smallest key) and the documents with the highest values will determine the max bucket (the bucket with the highest key). Often, when when requesting empty buckets (min_doc_count : 0), this causes a confusion, specifically, when the data is also filtered.

To understand why, let's look at an example:

Lets say the you're filtering your request to get all docs from the last month, and in the date_histogram aggs you'd like to slice the data per day. You also specify min_doc_count:0 so that you'd still get empty buckets for those days to which no document belongs. By default, if the first document that fall in this last month also happen to fall on the first day of the **second week** of the month, the date_histogram will **not** return empty buckets for all those days prior to that second week. The reason for that is that by default the histogram aggregations only start building buckets when they encounter documents (hence, missing on all the days of the first week in our example).

With extended_bounds, you now can "force" the histogram aggregations to start building buckets on a specific min values and also keep on building buckets up to a max value (even if there are no documents anymore). Using extended_bounds only makes sense when min_doc_count is 0 (the empty buckets will never be returned if the min_doc_count is greater than 0).

Note that (as the name suggest) extended_bounds is **not** filtering buckets. Meaning, if the min bounds is higher than the values extracted from the documents, the documents will still dictate what the min bucket will be (and the same goes to the extended_bounds.max and the max bucket). For filtering buckets, one should nest the histogram agg under a range filter agg with the appropriate min/max.

Closes #5224
2014-03-20 14:48:27 +01:00
Clinton Gormley 1fff379742 [DOCS] Documented the fact that binary fields are not stored by default 2014-03-20 12:43:43 +01:00
Florian Schilling c0a092aa92 [Doc] Updated docs for distance scripting
Updated docs for distance scripting and
added missing geohash distance functions
Closes #5397
2014-03-20 12:18:25 +01:00
Clinton Gormley 4c34615686 [DOCS] Fixed some bad UTF8 2014-03-19 12:46:06 +01:00
Shay Banon 0ef3b03be1 Move to use serial merge schedule by default
Today, we use ConcurrentMergeScheduler, and this can be painful since it is concurrent on a shard level, with a max of 3 threads doing concurrent merges. If there are several shards being indexed, then there will be a minor explosion of threads trying to do merges, all being throttled by our merge throttling.
Moving to serial merge scheduler will still maintain concurrency of merges across shards, as we have the merge thread pool that schedules those merges. It will just be a serial one on a specific shard.
Also, on serial merge scheduler, we now have a limit of how many merges it will do at one go, so it will let other shards get their fair chance of merging. We use the pending merges on IW to check if merges are needed or not for it.
Note, that if a merge is happening, it will not block due to a sync on the maybeMerge call at indexing (flush) time, since we wrap our merge scheduler with the EnabledMergeScheduler, where maybeMerge is not activated during indexing, only with explicit calls to IW#maybeMerge (see Merges).
closes #5447
2014-03-18 13:17:00 +01:00
Igor Motov a1192044f2 Add ability to get snapshot status for running snapshots
Closes #4946
2014-03-17 20:13:49 -04:00
David Pilato 0805c01984 [DOCS] Add Azure storage repositories 2014-03-17 19:40:28 +01:00
markharwood 5f1d9af9fe Documentation fix for significant_terms heading levels 2014-03-17 12:17:54 +00:00
Randy Stauner 1486188a3b [DOCS] Reword clear-scroll sentence 2014-03-17 12:08:49 +01:00
lzhoucs 5a5171cb70 [DOCS] Fix typo in the reference doc. SuSe -> SUSE
SUSE, as a Linux distribution, is never lower cased

fixes #5354
2014-03-17 12:03:25 +01:00
Justin Etheredge 36219a1786 [DOCS] Updating scripting docs for geo functions
Added a few functions are corrected the default unit where necessary
2014-03-17 11:59:02 +01:00
Boaz Leskes ee8743f3f2 [Docs] added a missing reference to significantterms-aggergations
Also fix header level mismatch issue reported by the build
2014-03-17 11:45:55 +01:00
David Pilato f54e9246c1 Add _cat/plugins endpoint
If we want to have a full picture of versions running in a cluster, we need to add a `_cat/plugins` endpoint.

Response could look like:

```sh
% curl es2:9200/_cat/plugins?v
node component                        version   type url                                   desc
es1  mapper-attachments               1.7.0       j                                        Adds the attachment type allowing to parse difference attachment formats
es1  lang-javascript                  1.4.0       j                                        JavaScript plugin allowing to add javascript scripting support
es1  analysis-smartcn                 1.9.0       j                                        Smart Chinese analysis support
es1  marvel                           1.1.0      j/s http://localhost:9200/_plugins/marvel Elasticsearch Management & Monitoring
es1  kopf                             0.5.3       s  http://localhost:9200/_plugins/kopf   kopf - simple web administration tool for ElasticSearch
es2  mapper-attachments               2.0.0.RC1   j                                        Adds the attachment type allowing to parse difference attachment formats
es2  lang-javascript                  2.0.0.RC1   j                                        JavaScript plugin allowing to add javascript scripting support
es2  analysis-smartcn                 2.0.0.RC1   j                                        Smart Chinese analysis support
```

Closes #4824.
2014-03-16 12:16:09 +01:00
Clinton Gormley fb934aff57 [DOCS] Documented gateway.local.auto_import_dangled
Relates to #4996
2014-03-15 12:07:17 +01:00
rphadake 36a0cb99d7 [Doc] doc updates for date histogram interval
Close #5308
2014-03-14 18:55:32 +01:00
Adrien Grand 65d3b61b97 Add an option to force _optimize operations.
When forced, the index will be merged even if it contains a single segment with
no deletions.

Close #5243
2014-03-14 18:21:56 +01:00
Adrien Grand eef71da650 [Doc] Add a chart about the relative error of the percentiles aggregation. 2014-03-14 12:23:23 +01:00
markharwood 767bef0596 Significant_terms aggregation identifies terms that are significant rather than merely popular in a set.
Significance is related to the changes in document frequency observed between everyday use in the corpus and
frequency observed in the result set. The asciidocs include extensive details on the applications of this feature.

Closes #5146
2014-03-14 10:34:24 +00:00
Adrien Grand 5821fa042c Cardinality aggregation.
This aggregation computes unique term counts using the hyperloglog++ algorithm
which uses linear counting to estimate low cardinalities and hyperloglog on
higher cardinalities.

Since this algorithm works on hashes, it is useful for high-cardinality fields
to store the hash of values directly in the index, which is the purpose of
the new `murmur3` field type. This is less necessary on low-cardinality
string fields because the aggregator is smart enough to only compute the hash
once per unique value per segment thanks to ordinals, or on numeric fields
since hashing them is very fast.

Close #5426
2014-03-13 19:19:56 +01:00
Florian Schilling 81e537bd5e ContextSuggester
================

This commit extends the `CompletionSuggester` by context
informations. In example such a context informations can
be a simple string representing a category reducing the
suggestions in order to this category.

Three base implementations of these context informations
have been setup in this commit.

- a Category Context
- a Geo Context

All the mapping for these context informations are
specified within a context field in the completion
field that should use this kind of information.
2014-03-13 11:24:46 +01:00
Kurt Hurtado ca6a2bb790 [DOCS] Various aggregation doc fixes 2014-03-13 09:05:25 +01:00
Costin Leau 9624b215fb Add docs for plugin isolation 2014-03-11 12:32:58 +02:00
Boaz Leskes b7a95d11a7 Introduced VersionType.FORCE & VersionType.EXTERNAL_GTE
Also added "external_gt" as an alias name for VersionType.EXTERNAL , accessible for the rest layer.

Closes #4213 , Closes #2946
2014-03-10 21:07:17 +01:00
javanna d5aaa90f34 [TEST] Randomized number of shards used for indices created during tests
Introduced two levels of randomization for the number of shards (between 1 and 10) when running tests:

1) through the existing random index template, which now sets a random number of shards that is shared across all the indices created in the same test method unless overwritten

2) through `createIndex` and `prepareCreate` methods, similar to what happens using the `indexSettings` method, which changes for every `createIndex` or `prepareCreate` unless overwritten (overwrites index template for what concerns the number of shards)

Added the following facilities to deal with the random number of shards:
- `getNumShards` to retrieve the number of shards of a given existing index, useful when doing comparisons based on the number of shards and we can avoid specifying a static number. The method returns an object containing the number of primaries, number of replicas and the total number of shards for the existing index

- added `assertFailures` that checks that a shard failure happened during a search request, either partial failure or total (all shards failed). Checks also the error code and the error message related to the failure. This is needed as without knowing the number of shards upfront, when simulating errors we can run into either partial (search returns partial results and failures) or total failures (search returns an error)

- added common methods similar to `indexSettings`, to be used in combination with `createIndex` and `prepareCreate` method and explicitly control the second level of randomization: `numberOfShards`, `minimumNumberOfShards` and `maximumNumberOfShards`. Added also `numberOfReplicas` despite the number of replicas is not randomized (default not specified but can be overwritten by tests)

Tests that specified the number of shards have been reviewed and the results follow:
- removed number_of_shards in node settings, ignored anyway as it would be overwritten by both mechanisms above
- remove specific number of shards when not needed
- removed manual shards randomization where present, replaced with ordinary one that's now available
- adapted tests that didn't need a specific number of shards to the new random behaviour
- fixed a couple of test bugs (e.g. 3 levels parent child test could only work on a single shard as the routing key used for grand-children wasn't correct)
- also done some cleanup, shared code through shard size facets and aggs tests and used common methods like `assertAcked`, `ensureGreen`, `refresh`, `flush` and `refreshAndFlush` where possible
- made sure that `indexSettings()` is always used as a basis when using `prepareCreate` to inject specific settings
- converted indexRandom(false, ...) + refresh to indexRandom(true, ...)
2014-03-10 13:01:52 +01:00
Simon Willnauer fbb8c0fafa [DOCS] Add `coming` tag to multiple rescores
Closes #5365
2014-03-10 09:27:44 +01:00
Andrew Raines 2f48be597e Display all available endpoints by default at /_cat
Closes #5106
2014-03-07 13:21:43 -06:00
Konrad Feldmeier d7b0d547d4 [DOCS] Multiple doc fixes
Closes #5047
2014-03-07 14:24:58 +01:00
Benjamin Devèze 2affa5004f Fix small typo in percentiles doc 2014-03-07 10:10:19 +01:00
Adrien Grand f359b7f38b [DOC] The percentiles aggregation is coming in 1.1.0. 2014-03-07 10:03:15 +01:00
Brusic 95274c18c5 Added support for char filters in the analyze API
Closes #5148
2014-03-06 12:23:51 +01:00
James Brook a93d6d55a5 Added support for aliases to index templates
Adapted existing PR (#2739) to updated code (post #4920), added tests and docs (@javanna)

Closes #1825
2014-03-06 11:11:07 +01:00
uboness 9d0fc76f54 Added support for sorting buckets based on sub aggregations
Supports sorting on sub-aggs down the current hierarchy. This is supported as long as the aggregation in the specified order path are of a single-bucket type, where the last aggregation in the path points to either a single-bucket aggregation or a metrics one. If it's a single-bucket aggregation, the sort will be applied on the document count in the bucket (i.e. doc_count), and if it is a metrics type, the sort will be applied on the pointed out metric (in case of a single-metric aggregations, such as avg, the sort will be applied on the single metric value)

 NOTE: this commit adds a constraint on what should be considered a valid aggregation name. Aggregations names must be alpha-numeric and may contain '-' and '_'.

 Closes #5253
2014-03-06 00:05:27 +01:00
Igor Motov b723ee0d20 [DOCS] Update boolean mapping docs with a full list of values that are treated as false
Closes #5337
2014-03-05 15:33:59 -05:00
Clinton Gormley 98ecf80f07 [DOCS] Formatting error
Closes #5346
2014-03-05 17:40:51 +01:00
Kevin 2c7a3a49c5 [DOCS] add Elasticsearch Image Plugin 2014-03-05 14:16:56 +01:00
Zachary Tong 7b16c5857d Percentiles aggregation.
A new metric aggregation that can compute approximate values of arbitrary
percentiles.

Close #5323
2014-03-03 18:06:14 +01:00
Martijn van Groningen dcb590398d [DOCS] Better document the limitation of nested objects. 2014-03-03 14:12:18 +01:00
Binh Ly 7e49848697 Clarify range aggregations 2014-02-28 14:38:57 -05:00
Clinton Gormley 53ce0e8e27 [DOCS] Fixed added[] tag version number 2014-02-28 15:29:43 +01:00
Lee Hinman e53a43800e Add `explain` flag support to the reroute API
By specifying the `explain` flag, an explanation for the reason a
command can or cannot be executed is returned. No allocation commands
are actually performed.

Returns a response similar to:

{
  "state": {...cluster state...},
  "acknowledged": true,
  "explanations" : [ {
    "command" : "cancel",
      "parameters" : {
        "index" : "decide",
        "shard" : 0,
        "node" : "IvpoKRdtRiGrQ_WKtt4_4w",
        "allow_primary" : false
      },
      "decisions" : [ {
        "decider" : "cancel_allocation_command",
        "decision" : "YES",
        "explanation" : "..."
        } ]
     }, {
      "command" : "move",
      "parameters" : {
        "index" : "decide",
        "shard" : 0,
        "from_node" : "IvpoKRdtRiGrQ_WKtt4_4w",
        "to_node" : "IvpoKRdtRiGrQ_WKtt4_4w"
       },
       "decisions" : [ {
         "decider" : "same_shard",
         "decision" : "NO",
         "explanation" : "shard cannot be allocated on same node [IvpoKRdtRiGrQ_WKtt4_4w] it already exists on"
       },
       etc
       ]
  }]
}

also removes AllocationExplanation from cluster state

Closes #2483
Closes #5169
2014-02-27 09:48:51 -07:00
Simon Willnauer 9160516b28 Expose `filler_token` via ShingleTokenFilterFactory
Lucene 4.7 supports a setter for the `filler_token` that is
inserted if there are gaps in the token stream. This change exposes
this setting.

Closes #4307
2014-02-26 22:21:10 +01:00
Martijn van Groningen 1441fec068 [DOCS] Updated memory considerations for p/c queries and filters. 2014-02-26 22:16:51 +01:00
Simon Willnauer 90e57c15e8 [DOCS]: fixed small problem in example json 2014-02-26 16:40:04 +01:00
Clinton Gormley 03ad168b24 [DOCS] Added note about dely in clearing filter cache.
Closes #5231
2014-02-24 11:36:22 +01:00
hura 818f8c0e2b [DOCS] Fix wrong explanation in configuration.asciidoc
Replaced network.host with node.name to match config file
2014-02-24 11:29:50 +01:00
Luca Cavanna 4e6610a798 Fixed multi term queries support in postings highlighter for non top-level queries
In #4052 we added support for highlighting multi term queries using the postings highlighter. That worked only for top-level queries though, and not for multi term queries that are nested for instance within a bool query, or filtered query, or a constant score query.

The way we make this work is by walking the query structure and temporarily overriding the query rewrite method with a method that allows for multi terms extraction.

Closes #5102
2014-02-21 21:43:40 +01:00
Adrien Grand edb854d952 Document the indices segments response format. 2014-02-21 12:01:32 +01:00
Lee Hinman 8f8cc7205d Add "locale" parameter to query_string and simple_query_string
Fixes #5128

Remove java 7 specific Locale functions, add "coming[1.1.0]" to documentation

add LocaleUtils utility class for dealing with Locale functions
2014-02-20 15:53:08 -07:00
Martijn van Groningen a81a4a5efe [DOCS] Included the `_percolator` index breaking change to migration docs. 2014-02-20 16:43:06 +01:00
Isabel Drost-Fromm 48004ff8a5 Add mustache templating to query execution.
Adds support for storing mustache based query templates that can later be filled
with query parameter values at execution time. Templates may be both quoted,
non-quoted and referencing templates stored in config/scripts/*.mustache by file
name.

See docs/reference/query-dsl/queries/template-query.asciidoc for templating
examples.

Implementation detail: mustache itself is being shaded as it depends directly on
guava - so having it marked optional but included in the final distribution
raises chances of version conflicts downstream.

Fixes #4879
2014-02-20 12:21:59 +01:00
javanna 419db6ee12 [DOCS] Fixed typo in create index api 2014-02-19 17:49:38 +01:00
Boaz Leskes e379f419e6 [DOCS] Remove clear flag from node-stats as it is not used anymore 2014-02-17 15:20:12 +01:00
Luca Cavanna 3afdf4a872 Added support for aliases to create index api
It is now possible to specify aliases during index creation:

curl -XPUT 'http://localhost:9200/test' -d '
{
    "aliases" : {
        "alias1" : {},
        "alias2" : {
            "filter" : { "term" : {"field":"value"}}
        }
    }
}'

Closes #4920
2014-02-17 14:54:21 +01:00
Britta Weber db3c6c2a8e Enable percolation for nested documents
closes #5082
2014-02-14 22:42:33 +01:00
Lee Hinman c97bcc3602 Add support for `lowercase_expanded_terms` flag to simple_query_string
Default the flag to true, making simple_query_string behave similarly to
query_string

Fixes #5008
2014-02-14 11:51:23 -07:00
Nik Everett 5c3f4ceafb Add preserve original token option to ASCIIFolding
Closes #4931
2014-02-14 19:37:00 +01:00
Luca Cavanna 6abd0a76bd [DOCS] improved get docs
- added _version to response
- exists call use -XHEAD with -i flag to include headers in the output
2014-02-14 13:11:10 +01:00
Lars Francke 2a765415c8 Update get.asciidoc
Minor improvements.

curl -XHEAD doesn't actually print anything so I've changed to use -I which actually prints the headers received.
2014-02-14 13:11:10 +01:00
Brian Yoder 41dba68bda Added the `DistanceUnit.NAUTICALMILES` enumeration
label with the corresponding *NM* and *nmi* unit
suffixes. Update the docs to match.

Closes #5085
2014-02-14 19:48:58 +09:00
uboness d335630e57 [docs] fixed errors in aggs docs
- error in nested aggs example
- error in terms aggs example
2014-02-13 20:36:02 +01:00
Oleg Anashkin eb0e1aa38f Fix typo in similarity docs
DRF similarity -> DFR similarity
2014-02-13 07:45:30 -08:00
Luca Cavanna 179750f0f5 [DOCS] fixed count docs, it now requires a top-level query object, same as other apis
Relates to #4074
2014-02-13 13:36:20 +01:00
Luca Cavanna 9902f04033 [DOCS] rephrased delete by query docs 2014-02-13 11:44:51 +01:00
Luca Cavanna 01abea5945 [DOCS] fixed count and validate query docs, they now require a top-level query object, same as other apis
Relates to #4074
Closes #5111
2014-02-13 11:42:04 +01:00
Kevin 99942089a8 [DOCS] add DynamoDB river plugin 2014-02-13 10:38:04 +01:00
James Yu 699fe5e929 fixed markup and typo 2014-02-13 10:33:15 +01:00
Clinton Gormley 80c7619591 [DOCS] Changed coming[] to added[] for 1.0.0* 2014-02-12 17:17:25 +02:00
Luca Cavanna 1d8d58391f [DOCS] added coming tags for `zen.discovery.publish_timeout` made dynamic 2014-02-12 15:24:38 +01:00
Luca Cavanna 16e4ac8713 [DOCS] Documented `discovery.zen.publish_timeout` setting 2014-02-12 10:45:37 +01:00
Luca Cavanna 847521b44c [DOCS] added `discovery.zen.publish_timeout` to the dynamic settings list 2014-02-12 10:45:30 +01:00
Igor Motov 02ebe33758 [DOCS] Fix typo in rename_pattern in snapshot/restore documentation 2014-02-11 09:23:07 -05:00
Simon Willnauer 990ce658a4 [Docs] Remove `custom_score` from documentation and add a migration
section.
2014-02-11 14:59:15 +01:00
Mihnea Dobrescu-Balaur 1f7efb5471 [DOCS] Add GitHub community river plugin 2014-02-11 11:55:24 +01:00
Alexander Reelsen b02e6dc996 Migrating NodesInfo API to use plugins instead of singular plugin
In order to be consistent (and because in 1.0 we switched from
parameter driven information to specifzing the metrics as part of the URI)
this patch moves from 'plugin' to 'plugins' in the Nodes Info API.
2014-02-11 10:05:10 +01:00
Luca Cavanna 7de7a0ace3 [TEST] fixed typo in _cat/thread_pool docs 2014-02-10 16:20:03 +01:00
Shay Banon e5f43a1867 add version and master_node flags to cluster state 2014-02-10 02:24:03 +01:00
David Pilato c214acc5e7 [DOCS] Add GridFS repository community plugin 2014-02-08 10:43:54 +01:00
Sean Gallagher e935a301df Doc fix explaining resynchronization with the Cancel command.
Added line explaining resync process to Reroute/Cancel command.

Closes #5025
2014-02-07 17:02:36 -05:00
Clinton Gormley 93930d6dc7 Removed 0.90.* deprecation and addition notifications
Closes #5052
2014-02-07 20:52:49 +01:00
Adrien Grand 9cb17408cb Make size=0 return all buckets for the geohash_grid aggregation.
Close #4875
2014-02-07 09:55:10 +01:00
David Pilato 444dff7b40 [DOCS] delete by query requires a top-level query parameter
Closes #5044
(cherry picked from commit 1e265b3)
2014-02-07 08:50:15 +01:00
Kevin d9b704fd86 add redis transport plugin 2014-02-06 18:19:54 +01:00
Lee Hinman d2078a5e28 Add fuzzy/slop support to `simple_query_string`
Ports the change from https://issues.apache.org/jira/browse/LUCENE-5410
2014-02-06 10:05:10 -07:00
Costin Leau f5a8de6321 [DOCS] organize a bit the repository plugins
(cherry picked from commit 88e1c20c4581885db7e5e65edf7eb3629c2d31ca)
2014-02-06 19:01:58 +02:00
Simon Willnauer 162ca99376 Added `cross_fields` mode to multi_match query
`cross_fields` attemps to treat fields with the same analysis
configuration as a single field and uses maximum score promotion or
combination of the scores based depending on the `use_dis_max` setting.
By default scores are combined. `cross_fields` can also search across
fields of hetrogenous types for instance if numbers can be part of
the query it makes sense to search also on numeric fields if an analyzer
is provided in the reqeust.

Relates to #2959
2014-02-06 17:15:55 +01:00
Clinton Gormley 56479fb0e4 [DOCS] Make apt/yum repos more visible 2014-02-06 17:04:37 +01:00
Boaz Leskes 9bf263c741 [DOCS] Fix terms agg value script example 2014-02-06 16:35:49 +01:00
Boaz Leskes ae4ed29f9b [Docs] value_count supports script per 1.1 2014-02-06 15:04:50 +01:00
Clinton Gormley 17e2ca5259 [DOCS] Updated migration docs for multi_field to point to copy_to 2014-02-06 14:34:07 +01:00
Clinton Gormley 6238d406b5 [DOCS] Removed the experimental label from Tribe, Hot Threads
and Completion Suggester
2014-02-06 14:19:17 +01:00
David Pilato 583f148334 [DOCS] add azure and gce discovery plugins
Clean EC2 disco doc
Add Azure disco doc
Add Google Compute Engine doc
Fix Zen doc (add `enabled` in `multicast` parameters list) - Fix #5032.
2014-02-06 09:18:42 +01:00
David Pilato 8b1a6fc5b6 Add S3 and HDFS repositories 2014-02-05 17:53:37 +01:00
Clinton Gormley d9bdfe3fec [DOCS] Deprecated the path setting in favour of copy_to
Relates to #4729
2014-02-05 14:47:48 +01:00
Adrien Grand 6777be60ce Add script support to value_count aggregations.
Close #5001
2014-02-04 14:29:32 +01:00
Clinton Gormley 238b26a466 [DOC] Tidied up geohashgrid aggregations 2014-02-04 11:54:32 +01:00
Jun Ohtani ba415b8ad2 Does not support "script" in value_clunt aggregation. 2014-02-04 10:26:07 +01:00
Adrien Grand cc1ff560df Rename `geohashgrid` to `geohash_grid` in documentation.
It was renamed in fc6bc4c477.

Close #4997
2014-02-04 09:39:55 +01:00
Lars Francke 1bd9dc129b Fix confusing sentence
The original sentence didn't make much sense. I hope this is a bit better. Taken heavy inspiration from c63d8c4fb5
2014-02-03 17:20:40 +01:00
Lars Francke 7cbd0962b5 Improve Aggregations documentation
* Mostly minor things like typos and grammar stuff
* Some clarifications
* The note on the deprecation was ambiguous. I've removed the problematic part so that it now definitely says it's deprecated
2014-02-03 17:16:52 +01:00
Shay Banon d36e345f1f fix docs to reflect removal of byte buffer memory 2014-02-03 09:54:30 -05:00
Igor Motov 90da268237 Remove support for boost in copy_to field
Currently, boosting on `copy_to` is misleading and does not work as originally specified in #4520. Instead of boosting just the terms from the origin field, it boosts the whole destination field.  If two fields copy_to a third field, one with a boost of 2 and another with a boost of 3, all the terms in the third field end up with a boost of 6.  This was not the intention.

  The alternative: to store the boost in a payload for every term, results in poor performance and inflexibility. Instead, users should either (1) query the common field AND the field that requires boosting, or (2) the multi_match query will soon be able to perform term-centric cross-field matching that will allow per-field boosting at query time (coming in 1.1).
2014-01-31 14:34:01 -05:00
Martijn van Groningen 7e1eed9814 The forceful no cache behaviour for range filter with now date match expression should only be active if no rounding has been specified for `now` in the date range range expression (for example: `now/d`).
Also the automatic now detection in range filters is overrideable by the `_cache` option.

 Closes #4947
 Relates to #4846
2014-01-30 15:51:33 +01:00
uboness d3f2173ef9 fixed date_/histogram aggregation documentation - added documentation for the `min_doc_count` setting
Closes #4944
2014-01-29 20:55:26 +01:00
Igor Motov 2755eecf65 Add throttling to snaphost and restore operations
Closes #4855
2014-01-29 10:33:59 -05:00
Martijn van Groningen c82f27577b Added dedicated thread pool cat api, that can show all thread pool related statistic (size, rejected, queue etc.) for all thread pools (get, search, index etc.)
By default active, rejected and queue thread statistics are included for the index, bulk and search thread pool.
Other thread statistics of other thread pools can be included via the `h` query string parameter.

Closes #4907
2014-01-29 13:25:06 +01:00
uboness 9f04e5fe38 fixed nested example response in docs
Closes #4935
2014-01-29 13:09:12 +01:00
uboness dd389d1cc5 Made all multi-bucket aggs return consistent response format
Closes #4926
2014-01-28 17:46:57 +01:00
Luca Cavanna b61ca9932a [DOCS] Clarified docs for cluster.routing.allocation.same_shard.host cluster setting
Clarified also javadocs for SameShardAllocationDecider
2014-01-28 12:32:37 +01:00
Luca Cavanna 95bf091dd6 [DOCS] unified index settings info and added warmers section in create index docs 2014-01-27 17:10:38 +01:00
Costin Leau 2690019e95 update link to Hadoop Snapshot/Restore plugin 2014-01-25 18:27:14 +02:00
Clinton Gormley 1aa1e83e03 [DOCS] Updated the breaking changes for the fields param
Closes #4888
2014-01-25 12:34:15 +01:00
Karel Minarik 241bb09db1 [DOCS] More assertive statement about requiring `query` in _count, etc 2014-01-23 20:35:44 +01:00
Nik Everett 93a8e80aff Support multiple rescores
Detects if rescores arrive as an array instead of a plain object.  If so
then parse each element of the array as a separate rescore to be executed
one after another.  It looks like this:
   "rescore" : [ {
      "window_size" : 100,
      "query" : {
         "rescore_query" : {
            "match" : {
               "field1" : {
                  "query" : "the quick brown",
                  "type" : "phrase",
                  "slop" : 2
               }
            }
         },
         "query_weight" : 0.7,
         "rescore_query_weight" : 1.2
      }
   }, {
      "window_size" : 10,
      "query" : {
         "score_mode": "multiply",
         "rescore_query" : {
            "function_score" : {
               "script_score": {
                  "script": "log10(doc['numeric'].value + 2)"
               }
            }
         }
      }
   } ]

Rescores as a single object are still supported.

Closes #4748
2014-01-23 16:29:07 +01:00
Nik Everett 37f80c8d80 Documentation for score_mode
Closes #4742
2014-01-23 16:24:48 +01:00
Brusic d9b71a8083 [DOCS] various docs fixes
Removed unused misc.asciidoc file
Added plugins directory to directory layout
Fixed transport.tcp.connect_timeout value to match the code found in NetworkService.TcpSettings
Clarified that phrase query does not preserve order of terms
Clarified merge page
Added instructions on how to build documentation to docs/README
2014-01-23 10:52:13 +01:00
Clinton Gormley 8685818ad3 [DOCS] Moved termvector and mtermvectors from search to docs 2014-01-22 14:10:26 +01:00
Simon Willnauer cb3bcb05be [DOCS]: Fix added version termvectors.asciidoc 2014-01-22 12:08:13 +01:00
Simon Willnauer e6ace1313e [DOCS]: fixed added / coming tags in docs 2014-01-22 12:02:37 +01:00
Martijn van Groningen 2981edca54 [DOCS] `coming` instead of `added` for copy_to feature. 2014-01-22 11:26:22 +01:00
Martijn van Groningen 5a61a8b098 [DOCS] annotated the multi fields and copy_to feature with the right version. 2014-01-22 11:16:41 +01:00
Adrien Grand 9282ae4ffd Terms aggregations: make size=0 return all terms.
Terms aggregations return up to `size` terms, so up to now, the way to get all
matching terms back was to set `size` to an arbitrary high number that would be
larger than the number of unique terms.

Terms aggregators already made sure to not allocate memory based on the `size`
parameter so this commit mostly consists in making `0` an alias for the
maximum integer value in the TermsParser.

Close #4837
2014-01-22 11:05:10 +01:00
Martijn van Groningen 75778d082b [DOCS] Moved multi fields documentation into the core-types page
Removed docs about setting inheriting (was never added)
Made mapping samples formatting similar as other ones.
2014-01-22 10:05:58 +01:00
Lee Hinman 2c289fb538 Add the ability to retrieve fields from field data
Adds a new FetchSubPhase, FieldDataFieldsFetchSubPhase, which loads the
field data cache for a field and returns an array of values for the
field.

Also removes `doc['<field>']` and `_source.<field>` workaround no longer
needed in field name resolving.

Closes #4492
2014-01-21 09:13:32 -07:00
Adrien Grand fe351f14e8 Document `index.shard.check_on_startup`. 2014-01-21 15:55:59 +01:00
Martijn van Groningen 66ed9a855a [DOCS] Added multi fields link to mapping page. 2014-01-21 10:52:32 +01:00
Shay Banon e29659e36d add internal force local flag, used by tribe node
tribe node to set it to true so all master read operations will automatically execute on the local tribe node
2014-01-20 22:40:26 +01:00
Luca Cavanna bdb1992e85 Fixed typo 2014-01-20 19:32:50 +01:00
Martijn van Groningen 9bc3d996ff [SPECS] Updated percolator specs. 2014-01-20 18:18:27 +01:00
Igor Motov 649f1b13da Initial implementation of custom _all field
Closes #4520
2014-01-20 10:44:33 -05:00
Simon Willnauer f0bce08c30 Return `MatchNoDocsQuery` if query string is emtpy
Closes #3952
2014-01-20 16:08:57 +01:00
Florian Gilcher eed079aaac Reference docs fixes
* Make it clearer that `aggs` is an allowed synomym
  for the `aggregations` key
* Fix broken example in for datehistogram, `1.5M` is
  not an allowed interval
* Make use of colon before examples consistent
* Fix typos
2014-01-20 12:14:17 +01:00
Dawid Weiss ae71b25145 Documentation typo. 2014-01-20 11:51:08 +01:00
Martijn van Groningen db394117c4 Made sure that any filter that wraps a p/c filter (has_child & has_parent) either directly or indirectly will never be cached by making CustomQueryWrappingFilter extend from NoCacheFilter.
Closes #4757
2014-01-20 10:54:09 +01:00
Alexander Reelsen e34a35244c [DOCS] Added documentation for CAT Aliases API
Added asciidoc. Added new lines in java class.
2014-01-20 09:23:00 +01:00
Clinton Gormley 5003ca9278 [DOCS] Fixed file:/// URL for installing plugins 2014-01-20 01:34:12 +01:00
Andy Goldstein 8f659bccb1 Add documentation for transport.publish_port 2014-01-17 22:06:22 +01:00
David Pilato 38874e5f9b Remove the "-f" script argument from the documentation
Closes #4778.
2014-01-17 11:44:30 +01:00
Clinton Gormley 8cb091e55d [DOCS] Tidied up asciidoc for migration page 2014-01-16 12:22:05 +01:00
Luca Cavanna 4126ae2631 [DOCS] updated json responses after #4310 and #4480
- Removed "ok": true from response examples
 - Added "created" flag to index response examples
 - Replaced exists flag with found in delete response examples
2014-01-16 12:01:39 +01:00
Luca Cavanna 3399f6926a [DOCS] made it clearer that the _version is incremented by all write operations (deletes included) 2014-01-16 11:44:46 +01:00
Igor Motov 4643f78098 [DOCS] Add documentation for URL repository 2014-01-15 13:13:16 -05:00
Clinton Gormley 3d4891321b [DOCS] Minor changes to the breaking changes doc 2014-01-15 18:23:03 +01:00
Alexander Reelsen c6155c5142 release [1.0.0.RC1] 2014-01-15 17:02:22 +00:00
Clinton Gormley 9e3f527721 [DOCS] Fixed asciidoc issue 2014-01-15 18:00:13 +01:00
Clinton Gormley faddd66e87 [DOCS] Added breaking changes in 1.0 2014-01-15 17:50:24 +01:00
Clinton Gormley 12a095d797 [DOCS] Tidied up the multi-indices docs 2014-01-15 16:13:38 +01:00
Clinton Gormley 93ba3b5e70 [DOCS] Tidied up layout of setup docs 2014-01-15 15:09:34 +01:00
Lee Hinman 3062e59f51 [DOCS] Fix default setting in circuit breaker documentation 2014-01-15 07:05:05 -07:00
Clinton Gormley a0b993e2dc [DOCS] Tidied up cluster settings docs 2014-01-15 14:51:18 +01:00
Clinton Gormley f8a427e266 [DOCS] Moved fielddata circuit breaker higher up the page 2014-01-15 14:00:08 +01:00
Alexander Reelsen 349a8be4fd Consistent REST API changes for GETting data
* Made GET mappings consistent, supporting
  * /{index}/_mappings/{type}
  * /{index}/_mapping/{type}
  * /_mapping/{type}
  * Added "mappings" in the JSON response to align it with other responses
* Made GET warmers consistent, support /{index}/_warmers/{type} and /_warmer, /_warner/{name}
  as well as wildcards and _all notation
* Made GET aliases consistent, support /{index}/_aliases/{name} and /_alias, /_aliases/{name}
  as well as wildcards and _all notation
* Made GET settings consistent, added /{index}/_setting/{name}, /_settings/{name}
  as well as supportings wildcards in settings name
* Returning empty JSON instead of a 404, if a specific warmer/
  setting/alias/type is missing
* Added a ton of spec tests for all of the above
* Added a couple of more integration tests for several features

Relates #4071
2014-01-14 22:33:52 +01:00
Igor Motov ba7699a38b Add documentation for index.routing.allocation.*._name and index.routing.allocation.*._id options 2014-01-14 16:20:46 -05:00
Britta Weber 411739fe3b Make PUT and DELETE consistent for _mapping, _alias and _warmer
See issue #4071

PUT options for _mapping:

Single type can now be added with

`[PUT|POST] {index|_all|*|regex|blank}/[_mapping|_mappings]/type`

and

`[PUT|POST] {index|_all|*|regex|blank}/type/[_mapping|_mappings]`

PUT options for _warmer:

PUT with a single warmer can now be done with

`[PUT|POST] {index|_all|*|prefix*|blank}/{type|_all|*|prefix*|blank}/[_warmer|_warmers]/warmer_name`

PUT options for _alias:

Single alias can now be PUT with

`[PUT|POST] {index|_all|*|prefix*|blank}/[_alias|_aliases]/alias`

DELETE options _mapping:

Several mappings can be deleted at once by defining several indices and types with

`[DELETE] /{index}/{type}`

`[DELETE] /{index}/{type}/_mapping`

`[DELETE] /{index}/_mapping/{type}`

where

`index= * | _all | glob pattern | name1, name2, …`

`type= * | _all | glob pattern | name1, name2, …`

Alternatively, the keyword `_mapings` can be used.

DELETE options for  _warmer:

Several warmers can be deleted at once by defining several indices and names with

`[DELETE] /{index}/_warmer/{type}`

where

`index= * | _all | glob pattern | name1, name2, …`

`type= * | _all | glob pattern | name1, name2, …`

Alternatively, the keyword `_warmers` can be used.

DELETE options for _alias:

Several aliases can be deleted at once by defining several indices and names with

`[DELETE] /{index}/_alias/{type}`

where

`index= * | _all | glob pattern | name1, name2, …`

`type= * | _all | glob pattern | name1, name2, …`

Alternatively, the keyword `_aliases` can be used.
2014-01-14 20:02:43 +01:00
Benjamin Vetter ba8e012be9 Referring to stop analyzer for stopword docs #329 2014-01-14 11:53:30 +01:00
Benjamin Vetter 22a96e6a18 Added stopwords: _none_ to the docs #329 2014-01-14 11:53:29 +01:00
Igor Motov b987615f5e Improve support for partial snapshots
Fixes #4701. Changes behavior of the snapshot operation. The operation now fails if not all primary shards are available at the beginning of the snapshot operation. The restore operation no longer tries to restore indices with shards that failed or were missing during snapshot operation.
2014-01-13 16:59:21 -05:00
Lee Hinman b379bf5668 Default to not accepting type wrapper in indexing requests
Currently it is possible to index a document as:

```
POST /myindex/mytype/1
{ "foo"...}
```

or as:

```
POST /myindex/mytype/1
{
    "mytype": {
        "foo"...
    }
}
```

This makes indexing non-deterministic and fields can be misinterpreted
as type names.

This changes makes Elasticsearch accept only the first form by default,
ie without the type wrapper. This can be changed by setting
`index.mapping.allow_type_wrapper` to `true`` when creating the index.

Closes #4484
2014-01-13 14:37:00 -07:00
Clinton Gormley 0751f0b7c6 [DOCS] Fixed link to tribe.asciidoc 2014-01-13 22:01:12 +01:00
Clinton Gormley 2e79246c1a [DOCS] Added docs for tribe node
Related #4708
2014-01-13 21:53:53 +01:00
Andrew Raines e13f55dfca [DOCS] Update cat/indices to reflect ?pri flag 2014-01-13 14:18:27 -06:00
markharwood 541059a4d1 Adds a new coerce flag for numeric field mappings which is defaulted to true.
When set to false a new strict mode of parsing is employed which
a) does not permit numbers to be passed as JSON strings in quotes
b) rejects numbers with fractions that are passed to integer, short or long fields.

Closes #4117
2014-01-13 17:58:18 +00:00
markharwood 2795f4e55d Standardized use of “*_length” for parameter names rather than “*_len”.
Java Builder apis drop old “len” methods in favour of new “length”
Rest APIs support both old “len: and new “length” forms using new ParseField class to a) provide compiler-checked consistency between Builder and Parser classes and
b) a common means of handling deprecated syntax in the DSL.
Documentation and rest specs only document the new “*length” forms
Closes #4083
2014-01-13 15:59:15 +00:00
Simon Willnauer 8247e4beae Rename RobinEngine and friends to InternalEngine
Closes #4633
2014-01-13 15:49:10 +01:00
LightGuard e89d5d0d86 Fixing up code block delimeters for asciidoctor
You can now successfully run the docs through asciidoctor
2014-01-13 15:26:53 +01:00
Simon Willnauer 7f63ddf94e Default stopwords list should be `_none_` for all but language-specific analyzers
`standard_html_strip` and `pattern` analyzer support stopwords which are
set to the default `english` stopwords by default. Those analyzers
should not use stopwords by default since they are language neutral

Closes #4699
2014-01-13 14:44:10 +01:00
Adrien Grand 5c237fe834 Add new option `min_doc_count` to terms and histogram aggregations.
`min_doc_count` is the minimum number of hits that a term or histogram key
should match in order to appear in the response.

`min_doc_count=0` replaces `compute_empty_buckets` for histograms and will
behave exactly like facets' `all_terms=true` for terms aggregations.

Close #4662
2014-01-13 10:09:38 +01:00
Martijn van Groningen 943b62634c Replaced the multi-field type in favour for the multi fields option that can be set on any core field.
When upgrading to ES 1.0 the existing mappings with a multi-field type automatically get replaced to a core field with the new `fields` option.

If a `multi_field` type-ed field doesn't have a main / default field, a default field will be chosen for the multi fields syntax. The new main field type
will be equal to the first `multi_field` fields' field or type string if no fields have been configured for the `multi_field` field and in both cases
the default index will not be indexed (`index=no` is set on the default field).

If a `multi_field` typed field has a default field, that field will replace the `multi_field` typed field.

Closes to #4521
2014-01-13 09:21:53 +01:00
Florian Schilling 464037e0c1 Geo clean Up
============
The default unit for measuring distances is *MILES* in most cases. This commit moves ES
over to the *International System of Units* and make it work on a default which relates
to *METERS* . Also the current structures of the `GeoBoundingBox Filter` changed in
order to define the *Bounding* by setting abitrary corners.

Distances
---------
Since the default unit for measuring distances has changed to a default unit
`DistanceUnit.DEFAULT` relating to *meters*, the **REST API** has changed at the
following places:

  * `ScriptDocValues.factorDistance()` returns *meters* instead of *miles*
  * `ScriptDocValues.factorDistanceWithDefault()` returns *meters* instead of *miles*
  * `ScriptDocValues.arcDistance()` returns *meters* instead of *miles*
        one might use `ScriptDocValues.arcDistanceInMiles()`
  * `ScriptDocValues.arcDistanceWithDefault()` returns *meters* instead of *miles*
  * `ScriptDocValues.distance()` returns *meters* instead of *miles*
        one might use `ScriptDocValues.distanceInMiles()`
  * `ScriptDocValues.distanceWithDefault()` returns *meters* instead of *miles*
        one might use `ScriptDocValues.distanceInMilesWithDefault()`
  * `GeoDistanceFilter` default unit changes from *kilometers* to *meters*
  * `GeoDistanceRangeFilter` default unit changes from *miles* to *meters*
  * `GeoDistanceFacet` default unit changes from *miles* to *meters*

Geo Bounding Box Filter
-----------------------
The naming of the GeoBoundingBoxFilter properties allows to set arbitrary corners
(see #4084) namely `top_right`, `top_left`, `bottom_right` and `bottom_left`. This
change also includes the fields `topRight` and `bottomLeft` Also it is be possible to
set the single values by using just `top`, `bottom`, `left` and `right` parameters.

Closes #4515, #4084
2014-01-11 21:30:29 +09:00
Boaz Leskes 5ac7bd83ad Expose min/max open file descriptors in Cluster Stats API
Also changes the response format of that section to:

```
 "open_file_descriptors": {
      "min": 200,
      "max": 346,
       "avg": 273
 }
```

Closes #4681

Note: this is an aggregate of 3 commits in the 0.90 branch
2014-01-10 12:15:56 +01:00
Shay Banon fe2a70831f remove bloom from clear cache API, add id_cache 2014-01-09 21:08:45 +01:00
Clinton Gormley 3ab73ab957 Deprecate document _boost
Fixes #4664
2014-01-09 16:04:01 +01:00
Simon Willnauer bc5a9ca342 Rename edit_distance/min_similarity to fuzziness
A lot of different API's currently use different names for the
same logical parameter. Since lucene moved away from the notion
of a `similarity` and now uses an `fuzziness` we should generalize
this and encapsulate the generation, parsing and creation of these
settings across all queries.

This commit adds a new `Fuzziness` class that handles the renaming
and generalization in a backwards compatible manner.

This commit also added a ParseField class to better support deprecated
Query DSL parameters

The ParseField class allows specifying parameger that have been deprecated.
Those parameters can be more easily tracked and removed in future version.
This also allows to run queries in `strict` mode per index to throw
exceptions if a query is executed with deprected keys.

Closes #4082
2014-01-09 15:14:51 +01:00
Martijn van Groningen eb63bb259d Added `action.destructive_requires_name` that controls whether wildcard expressions and `_all` is allowed to be used for destructive operat Also the delete index api requires always an index to be specified (either concrete index, alias or wildcard expression)
Closes #4549 #4481
2014-01-09 11:36:50 +01:00
Alexander Reelsen 7042a9aa65 [DOCS] Fix HTTP endpoints after stats API changes 2014-01-09 11:30:28 +01:00
Alexander Reelsen 1652767ec8 [DOCS] Added documentation for SameShardAllocationDecider
Closes #4615
2014-01-09 11:24:12 +01:00
Martijn van Groningen e6f83248a2 Deprecated disable allocation decider which has the following options:
`allocation.disable_new_allocation`, `allocation.disable_allocation`, `allocation.disable_replica_allocation`,
in favour for the enable allocation decider which has a single option `allocation.enable` wich can be set to the following values:
`none`, `new_primaries`, `primaries` and `all` (default).

Closes #4488
2014-01-09 10:01:46 +01:00
Martijn van Groningen 7e341cefd0 Change the `sort` boolean option in percolate api to the sort dsl available in search api.
Closes #4625
2014-01-09 09:58:34 +01:00
Martijn van Groningen 0973b2863c Added extra rest endpoint for get settings api.
Added rest test to also test the get settings' prefix option.
2014-01-09 09:44:40 +01:00
Clinton Gormley 2e4b70d40f [DOCS] Fixed duplicate ID in highlighting 2014-01-09 00:37:18 +01:00
Nik Everett bbf0ec52de Add warning phrase suggester's max_errors
large number can badly impact performance.
2014-01-08 23:06:41 +01:00
Igor Motov bec6527312 Add support for flat_settings flag to all REST APIs that output settings
Closes #4140
2014-01-08 10:36:36 -05:00
Martijn van Groningen 6dc434822c Changed get index settings api to use new internal get index settings api instead of relying on the cluster state api.
The new internal get index settings api is more efficient when it comes to sending the index settings from the master to the client via the
Also the get index settings support now all the indices options.

Closes #4620
2014-01-08 13:18:57 +01:00
Nik Everett 8bd9e34e39 Stop FVH from throwing away some query boosts
The FVH was throwing away some boosts on queries stopping a number of
ways to boost phrase matches to the top of the list of fragments from
working.

The plain highlighter also doesn't work for this but that is because it
doesn't support the concept of the same term having a different score at
different positions.

Also update documentation claiming that FHV is nicer for weighing terms
found by query combinations.

Closes #4351
2014-01-08 11:51:48 +01:00
Nik Everett 522d620eb6 Use FHV's phraseLimit
This prevents poisoning the FVH with documents that contain TONS of matches
which take tons of memory and time to highlight.

Closes #4645
2014-01-08 11:27:58 +01:00
Alexander Reelsen ad50afbec8 Simplify usage of nodes info API
Important: This breaks backwards compatibility with 0.90

* Removed endpoints: /_cluster/nodes, /_cluster/nodes/nodeId1,nodeId2
* Disallow usage of parameters, but make required metrics part of URI
* Changed NodesInfoRequest to return everything by default
* Fixed NPE in NodesInfoResponse

Closes #4055
2014-01-08 09:46:04 +01:00
Alexander Reelsen 6ef6bb993c Cluster state API: Improved consistency
Instead of specifying what kind of data should be filtered, this commit
streamlines the API to actually specify, what kind of data should be displayed.
This makes its behaviour similar to the other requests, like NodeIndicesStats.

A small feature has been added as well: If you specify an index to select on, not
only the metadata, but also the routing tables are filtered by index in order
to prevent too big cluster states to be returned.

Also the CAT apis have been changed to only return the wanted data in order to keep
network traffic as small as needed.

Tests for the cluster state API filtering have been added as well.

Note: This change breaks backwards compatibility with 0.90!

Closes #4065
2014-01-08 09:25:20 +01:00
Igor Motov 5d98341d11 Fix typo in snapshot/restore documentation 2014-01-07 14:03:12 -05:00
Shay Banon 4aa5ef139e randomize flush interval so multiple shards won't flush at the sam time
- also, allow to update interval using update settings on an index
2014-01-07 19:58:28 +01:00
markharwood 602de04692 A GeoHashGrid aggregation that buckets GeoPoints into cells whose dimensions are determined by a choice of GeoHash resolution.
Added a long-based representation of GeoHashes to GeoHashUtils for fast evaluation in aggregations.
The new BucketUtils provides a common heuristic for determining the number of results to obtain from each shard in "top N" type requests.
2014-01-07 18:03:33 +00:00
Lee Hinman 2cb40fcb17 Rename "exists" to "found" in TermVector and Get responses
- Adds the "created" field to the index action response
- Reverses Delete class' notFound to Found to avoid double negative
2014-01-07 09:47:07 -07:00
Simon Willnauer fa16969360 Cleanup comments and class names s/ElasticSearch/Elasticsearch
* Clean up s/ElasticSearch/Elasticsearch on docs/*
 * Clean up s/ElasticSearch/Elasticsearch on src/* bin/* & pom.xml
 * Clean up s/ElasticSearch/Elasticsearch on NOTICE.txt and README.textile

Closes #4634
2014-01-07 11:21:51 +01:00
Andrew Raines c46721a25f Document h/headers switcheroo. 2014-01-06 16:08:48 -06:00
Martijn van Groningen 32c5471d33 Rename `score` to `track_scores` in percolate api.
Closes #4624
2014-01-06 14:57:39 +01:00
Adrien Grand 9763d079b8 Eager norms loading options.
Norms can be eagerly loaded on a per-field basis by setting norms.loading to
`eager` instead of the default `lazy`:

```
"my_string_field" : {
  "type": "string",
  "norms": {
    "loading": "eager"
  }
}
```

In case this behavior should be applied to all fields, it is possible to change
the default value by setting `index.norms.loading` to `eager`.

Close #4079
2014-01-06 09:53:42 +01:00
Alexander Reelsen bb275166f1 Simplify nodes stats API
First, this breaks backwards compatibility!

* Removed /_cluster/nodes/stats endpoint
* Excpect the stats types not as parameters, but as part of the URL
* Returning all indices stats by default, returning all nodes stats by default
* Supporting groups & types in nodes stats now as well
* Updated documentation & tests accordingly
* Allow level parameter for "shards" and "indices" (cluster does not make sense here)

Closes #4057
2014-01-06 08:33:32 +01:00
Alexander Reelsen 33878be1e8 Simplify indices stats API
Note: This breaks backward compatibility

* Removed clear/all parameters, now all stats are returned by default
* Made the metrics part of the URL
* Removed a lot of handlers
* Added shards/indices/cluster level paremeter to change response serialization
* Returning translog statistics in IndicesStats
* Added TranslogStats class
* Added IndexShard.translogStats() method to get the stats from concrete implementation
* Updated documentation

Closes #4054
2014-01-06 07:27:03 +01:00
Lee Hinman 47607a69a1 Default the circuit breaker limit to 80% of the maximum JVM heap 2014-01-03 16:21:55 -07:00
Lee Hinman 5463f7953f Expose `simple_query_string` flags in `flags` parameter 2014-01-03 16:14:19 -07:00
Alexander Reelsen 811b7d7d78 Do not start packages on installation
The reason to not start packages on installation is to allow to configure
them before starting up (setting heap, cluster.name etc)

Also the documentation was updated in order to show, which statements need
to be executed.
In addition, these statements are also printed out when the package is
installed, depending on whether chkconfig, system or update-rc.d is used.

Closes #3722
2014-01-03 17:40:27 +01:00
Martijn van Groningen f1bf585089 The `fields` option should always return an array for json document fields and single valued field for metadata fields.
Also the `fields` option can only be used to fetch leaf fields, trying to do fetch object fields will return in a client error.

Closes #4542
2014-01-03 17:29:12 +01:00
David Pilato 0c7b494bb8 plugin manager: new `timeout` option
When testing plugin manager with real downloads, it could happen that the test run forever. Fortunately, test suite will be interrupted after 20 minutes, but it could be useful not to fail the whole test suite but only warn in that case.

By default, plugin manager still wait indefinitely but it can be modified using new `--timeout` option:

```sh
bin/plugin --install elasticsearch/kibana --timeout 30s

bin/plugin --install elasticsearch/kibana --timeout 1h
```

Closes #4603.
Closes #4600.
2014-01-03 16:48:18 +01:00
Britta Weber 9f54e9782d rename _shard -> _index and also rename classes and variables
closes #4584
2014-01-03 14:00:23 +01:00
Lee Hinman a754224751 Add field data memory circuit breaker.
This adds the field data circuit breaker, which is used to estimate
the amount of memory required to load field data before loading it. It
then raises a CircuitBreakingException if the limit is exceeded.

It is configured with two parameters:

`indices.fielddata.cache.breaker.limit` - the maximum number of bytes
of field data to be loaded before circuit breaking. Defaults to
`indices.fielddata.cache.size` if set, unbounded otherwise.

`indices.fielddata.cache.breaker.overhead` - a contast for all field
data estimations to be multiplied with before aggregation. Defaults to
1.03.

Both settings can be configured dynamically using the cluster update
settings API.
2014-01-02 15:04:47 -07:00
Martijn van Groningen aa548f5148 Remove GET `_aliases` api in favour for GET `_alias` api
Currently there are two get aliases apis that both have the same functionality, but have a different response structure. The reason for having 2 apis is historic.

The GET _alias api was added in 0.90.x and is more efficient since it only sends the needed alias data from the cluster state between the master node and the node that received the request. In the GET _aliases api the complete cluster state is send to the node that received the request and then the right information is filtered out and send back to the client.

The GET _aliases api should be removed in favour for the alias api

Closes to #4539
2014-01-02 13:56:11 +01:00
Martijn van Groningen f4bf0d5112 Replaced `ignore_indices` with `ignore_unavailable`, `expand_wildcards` and `allow_no_indices`.
* `ignore_unavailable` - Controls whether to ignore if any specified indices are unavailable, this includes indices that don't exist or closed indices. Either `true` or `false` can be specified.
* `allow_no_indices` - Controls whether to fail if a wildcard indices expressions results into no concrete indices. Either `true` or `false` can be specified. For example if the wildcard expression `foo*` is specified and no indices are available that start with `foo` then depending on this setting the request will fail. This setting is also applicable when `_all`, `*` or no index has been specified.
* `expand_wildcards` - Controls to what kind of concrete indices wildcard indices expression expand to. If `open` is specified then the wildcard expression if expanded to only open indices and if `closed` is specified then the wildcard expression if expanded only to closed indices. Also both values (`open,closed`) can be specified to expand to all indices.

Closes to #4436
2014-01-02 12:19:45 +01:00
Britta Weber 1ede9a5730 make term statistics accessible in scripts
term statistics can be accessed via the _shard variable.

Below is a minimal example. See documentation on details.

```

DELETE paytest

PUT paytest
{
    "mappings": {
        "test": {
            "_all": {
                "auto_boost": true,
                "enabled": true
            },
            "properties": {
                "text": {
                    "index_analyzer": "fulltext_analyzer",
                    "store": "yes",
                    "type": "string"
                }
            }
        }
    },
    "settings": {
        "analysis": {
            "analyzer": {
                "fulltext_analyzer": {
                    "filter": [
                        "my_delimited_payload_filter"
                    ],
                    "tokenizer": "whitespace",
                    "type": "custom"
                }
            },
            "filter": {
                "my_delimited_payload_filter": {
                    "delimiter": "+",
                    "encoding": "float",
                    "type": "delimited_payload_filter"
                }
            }
        },
        "index": {
            "number_of_replicas": 0,
            "number_of_shards": 1
        }
    }
}

POST paytest/test/1
{
    "text": "the+1 quick+2 brown+3 fox+4 is quick+10"
}

POST paytest/test/2
{
    "text": "the+1 quick+2 red+3 fox+4"
}

POST paytest/_refresh

POST paytest/_search
{
    "script_fields": {
       "ttf": {
          "script": "_shard[\"text\"][\"quick\"].ttf()"
       }
    }
}

POST paytest/_search
{
    "script_fields": {
       "freq": {
          "script": "_shard[\"text\"][\"quick\"].freq()"
       }
    }
}
POST paytest/test/2/_termvector
POST paytest/_search
{
    "script_fields": {
       "payloads": {
          "script": "term = _shard[\"text\"].get(\"red\",_PAYLOADS);payloads = []; for(pos : term){payloads.add(pos.payloadAsFloat(-1));} return payloads;"
       }
    }
}

POST paytest/_search
{
   "script_fields": {
      "tv": {
         "script": "_shard[\"text\"][\"quick\"].freq()"
      }
   },
   "query": {
      "function_score": {
         "functions": [
            {
               "script_score": {
                  "script": "_shard[\"text\"][\"quick\"].freq()"
               }
            }
         ]
      }
   }
}

```

closes #3772
2014-01-02 11:17:33 +01:00
Adrien Grand 1654ae8937 Explicit doc_values setting.
Once doc values are enabled on a field, they can't be disabled.

Close #4560
2013-12-30 11:10:52 +01:00
Adrien Grand 05448b6276 Doc values for geo points.
This commits add doc values support to geo point using the exact same approach
as for numeric data: geo points for a given document are stored uncompressed
and sequentially in a single binary doc values field.

Close #4207
2013-12-27 12:45:18 +01:00
Florian Schilling bc452dff84 * setup accurate GeoDistance Function
* adapt tests
* introduced default GeoDistance function
* Updated docs

closes #4498
2013-12-27 19:15:19 +09:00
Andrew Raines 69d88a1edd [DOCS] Add headers and help parameters. 2013-12-23 22:26:28 -06:00
Martijn van Groningen eb86a3a6fe [DOCS] Changed `shape_field_name` to `path` in geo_shape filter documentation.
Relates to #4486
2013-12-23 11:27:06 +01:00
Clinton Gormley dea6b112ae [DOCS] Corrected bloom loading docs 2013-12-20 11:20:54 +01:00
Clinton Gormley 2b8c82c883 [DOCS] Documented index.codec.bloom.load for #4525 2013-12-20 10:51:17 +01:00
Richard Pijnenburg df85fdf88f Add repository information to docs
This adds the apt and yum repo information to the setup docs.
2013-12-19 15:58:08 +01:00
Adrien Grand 52db8eb324 More documentation improvements for fielddata loading. 2013-12-18 16:05:35 +01:00
Adrien Grand 07443089ce Improve documentation of the new `disabled` field data format. 2013-12-18 15:44:57 +01:00
Boaz Leskes 3c5106ae98 Added cluster health status to the Cluster Stats API
Relates to #4460
2013-12-18 12:03:49 +01:00
Chris Simpson 4f8c916eed [Docs] Fix Typo
Fixes small typo in the geo_distance aggregation docs.
2013-12-18 11:21:21 +01:00
Boaz Leskes 2b6214cff7 Added Cluster Stats API
Closes #4460
2013-12-17 13:14:46 +01:00
Grégory Quatannens c64abaae7e Fixing typo and grammar 2013-12-17 11:39:02 +01:00
Adrien Grand 33599d9a34 Compressed geo-point field data.
This commit allows to trade precision for memory when storing geo points.
This new field data impl accepts a `precision` parameter that controls the
maximum expected error for storing coordinates. This option can be updated on
a live index with the PUT mapping API.

Default precision is 1cm, which requires 8 bytes per geo-point (50% memory
saving compared to using 2 doubles).

Close #4386
2013-12-17 11:29:48 +01:00
Clinton Gormley 684affa5c7 [DOCS] Removed unused file 2013-12-17 11:28:19 +01:00
Alexander Reelsen b713cf56ed Allow to provide parameters not only through -D but as long parameters
All getopt long style parameters are now set as es. properties,

elasticsearch --path.data=/some/path

results in -Des.path.data=/some/path

Closes #4393
2013-12-17 10:43:27 +01:00
Alexander Reelsen c30945a3d8 Start elasticsearch in the foreground by default
Instead of using the '-f' parameter to start elasticsearch in the
foreground, this is now the default modus.

In order to start elasticsearch in the background, the '-d' parameter
can be used.

Closes #4392
2013-12-17 10:39:22 +01:00
Clinton Gormley 34b9b16233 [DOCS] Fixed some bad link refs 2013-12-16 18:07:33 +01:00
Martijn van Groningen 23d2b1ea7b Renamed top level `filter` to `post_filter`.
Closes #4119
2013-12-16 17:10:14 +01:00
Lee Hinman db431b7cb3 Remove the `field` and `text` queries.
The `text` query was replaced by the `match` query and has been
deprecated for quite a while.

The `field` query should be replaced by a `query_string` query with
the `default_field` specified.

Fixes #4033
2013-12-16 08:59:36 -07:00
Adrien Grand 4e7ce4ee02 Make field data changes immediately taken into account and add the ability to disallow field data loading.
This commit changes field data configuration updates so that they are
immediately taken into account for loading new segments. The way it works
is that field data configuration is now cached separately from the field
data cache, meaning that it is now possible to clear the field data
configuration from IndexFieldDataService while the cache will stay around. On
the next time that Elasticsearch will reload field data configuration, it will
check if there is already a cache entry, and reuse it if it exists.

To disable field data loading, all that is required is to change the field
data format to "none" (supported by all field data types) using the update
mapping API. Elasticsearch will then refuse to load field data on any new
segment, but field data which has been loaded on the previous segments will
remain available. So you need to clear the field data cache in order to
reclaim memory (otherwise memory will be reclaimed slower, as segments get
merged).

Close #4430
Close #4431
2013-12-16 14:34:33 +01:00
Adrien Grand 36bd9cc432 Aggregations: Ordinals-based string bucketing support.
When the ValuesSource has ordinals, terms ordinals are used as a cache key to
bucket ordinals. This can make terms aggregations on String terms significantly
faster.

Close #4350
2013-12-13 15:34:02 +01:00
Martijn van Groningen 10e2528cce Added the `force_source` option to highlighting that enforces to use of the _source even if there are stored fields.
The percolator uses this option to deal with the fact that the MemoryIndex doesn't support stored fields,
this is possible b/c the _source of the document being percolated is always present.

Closes #4348
2013-12-13 13:39:53 +01:00
Lee Hinman 77fcf71338 Add new `simple_query_string` query type
This adds support for Lucene's SimpleQueryParser by adding a new type
of query called the `simple_query_string`. The `simple_query_string`
query is designed to be able to parse human-entered queries without
throwing any exceptions.

Resolves #4159
2013-12-12 12:09:32 -07:00
Alexander Reelsen 81e13a870b Packaging: Ensure setting of sysctl vm.max_map_count
In order to be sure that memory mapped lucene directories are working
one can configure the kernel about how many memory mapped areas
a process may have. This setting ensure for the debian and redhat initscripts
as well as the systemd startup, that this setting is set high enough.

Closes #4397
2013-12-11 09:19:22 +01:00
Boaz Leskes 99b421925f Add wildcard support to field resolving in the Get Field Mapping API
Closes #4367
2013-12-10 23:46:37 +01:00
Simon Willnauer 6c189310b9 Remove 'term_index_interval' and 'term_index_divisor'
These settings are no longer relevant since they are codec /
postingsformat level settings since Lucene 4.0

Closes #3912
2013-12-10 16:54:08 +01:00
Martijn van Groningen ebf6519965 Added aggs option to percolate api documentation. 2013-12-10 14:09:37 +01:00
Lee Hinman bc9698a347 Support 'yaml' as a format for the Analyze API
Fixes #4311
2013-12-08 15:08:00 -07:00
Martijn van Groningen 8c1de501e7 Update percolator highlighting docs. 2013-12-07 16:40:49 -05:00
Adrien Grand 32eb5ffa92 [Docs] Document which encoding should be used in order to make sense of the offsets returned by the term vectors API.
Close #4363
2013-12-06 22:39:08 +01:00
Shay Banon 28eff2ba29 remove help command, list all cat commands in /_cat?h endpoint 2013-12-05 14:36:27 +01:00
Markus Fischer 2da0611dfb [DOCS] Completion suggest: Clarify de-duplication, optimize/merge
This contribution is based on the feedback given in issue #4254 and
issue #4255, and should clear things up, when suggestions are being
removed and not displayed anymore after deletion of data.
2013-12-05 11:10:56 +01:00
Nik Everett 8e34057bc0 Add support for combining fields to the FVH
The Fast Vector Highlighter can combine matches on multiple fields to
highlight a single field using `matched_fields`.  This is most
intuitive for multifields that analyze the same string in different
ways.  Example:
{
    "query": {
        "query_string": {
            "query": "content.plain:running scissors",
            "fields": ["content"]
        }
    },
    "highlight": {
        "order": "score",
        "fields": {
            "content": {
                "matched_fields": ["content", "content.plain"],
                "type" : "fvh"
            }
        }
    }
}

Closes #3750
2013-12-03 11:10:01 +01:00
Yousef 302c762d5e Wrong link to Token Filter 2013-12-03 10:39:13 +01:00
Nik Everett 7690b40ec6 Allow string fields to store token counts
To use this one you send a string to a field of type 'token_count'.  This
makes the most sense with a multi-field.
2013-12-03 09:39:32 +01:00
Alexander Reelsen 6528df2764 [DOCS] Test framework documentation
The java test framework using randomized testing is explained with a couple of examples.
2013-12-02 18:01:45 +01:00
Clinton Gormley 7d993fd917 [DOCS] Another cat?v change 2013-12-02 15:30:49 +01:00
Clinton Gormley 5b15ed73fa [DOCS] Linked cat-pending to cluster-pending 2013-12-02 15:29:47 +01:00
Clinton Gormley 992b2d82b0 [DOCS] Changed the _cat docs to use ?v instead of ?v=true 2013-12-02 15:27:41 +01:00
Clinton Gormley d9a480c97a [DOCS] Typos in aggregations 2013-12-02 15:14:25 +01:00
Conrad Pankoff 87246af256 [DOCS] Fixed typos and corrected grammar 2013-12-02 10:08:26 +01:00
uboness cdc7dfbb2c Changed the "script_lang" parameter to "lang" in all value source based aggs - to be consistent with all other script based APIs. 2013-12-02 02:01:03 +01:00
Clinton Gormley bc393b6d79 Changed the minScore comparator from > to >=
Closes #4303
2013-11-29 20:29:20 +01:00
uboness 0d6a35b9a7 - Added support for term filtering based on include/exclude regex on the terms agg
- Added javadoc to the TermsBuilder

Closes #4267
2013-11-29 13:46:48 +01:00
uboness afb0d119e4 - Added docs for the value_count aggregation
- Fixed typos in the terms facets docs
- Fixed aggregation docs layout
- Added docs for shard_size in term aggregation
2013-11-29 12:35:42 +01:00
Clinton Gormley b48344f296 [DOCS] Doc'ed cluster pending tasks 2013-11-29 08:21:26 +01:00
Andrew Raines 91999e14ce Add _cat/pending_tasks.
Closes #4251.
2013-11-29 01:09:06 -06:00
Lee Hinman 9939e81d88 [DOCS] Fix porter stem filter name in other stemming docs 2013-11-28 22:14:47 -07:00
Lee Hinman fb4e903e35 [DOCS] Fix name of porter stemming token filter 2013-11-28 22:01:19 -07:00
Clinton Gormley 6ce3495029 [DOCS] Fixed a bad link 2013-11-27 17:54:25 +01:00
Clinton Gormley cdc1935b6e [DOCS] Documented rest.action.multi.allow_explicit_index 2013-11-27 17:33:09 +01:00
Boaz Leskes c63d8c4fb5 [Docs] Added _source filtering to documentation
Relates to #3301
2013-11-26 19:16:24 +01:00
Britta Weber dbef64009f [DOC] add doc for multi term vector api
closes #3998
2013-11-26 17:03:14 +01:00
Alexander Reelsen bf74f49fdd Updated Analyzing/Fuzzysuggester from lucene trunk
* Minor alignments (like setter to ctor)
* FuzzySuggester has a unicode aware flag, which is not exposed in the fuzzy completion request parameters
* Made XAnalyzingSuggester flags (PAYLOAD_SEP, END_BYTE, SEP_LABEL) to be written into the postings format, so we can retain backwards compatibility
* The above change also implies, that these flags can be set per instantiated XAnalyzingSuggester
* CompletionPostingsFormatTest now uses a randomProvider for writing data to check for bwc
2013-11-26 12:52:06 +01:00
Martijn van Groningen a03556daa0 Added execution option to `range` filter, with the `index` and `fielddata` as values.
Deprecated `numeric_range` filter in favor for the `range` filter with `fielddata` as execution.

Closes #4034
2013-11-25 23:43:40 +01:00
uboness c7f6c5266d initial commit of the aggregations module
Closes #3300
2013-11-24 03:13:08 -08:00
Jun Ohtani 7bbe453273 [DOCS] Added elasticsearch-extended-analyze plugin 2013-11-21 09:48:00 +01:00
Clinton Gormley 7c59ed4087 [DOCS] Fixed duplicate docs ID in delete 2013-11-21 17:38:51 +11:00
Shay Banon a9880dcbf1 add timeout doc to delete 2013-11-20 12:50:03 -08:00
Matt Weber a841a422f6 Add a field data based TermsFilter
Add FieldDataTermsFilter that compares terms out of
the fielddata cache. When filtering on a large
set of terms this filter can be considerably faster
than using a standard lucene terms filter.

Add the "fielddata" execution mode to the
terms filter parser to enable the use of
the new FieldDataTermsFilter.

Add supporting tests and documentation.

Closes #4209
2013-11-19 19:18:16 +01:00
Andrew Raines 8fabeb1c0b First pass at cat docs. 2013-11-14 21:37:02 -05:00
Andrew Raines 5c085c1204 Fix misspellings. 2013-11-14 20:10:36 -05:00
Luca Cavanna 0aaa39d00a Minor improvements to indices filter and query & updated docs
Slightly simplified indices filter and query parsers code
Trimmed down tests where possible
2013-11-14 17:25:34 +01:00
Olivier Favre fa80ca97b2 Indices query/filter skip parsing altogether for irrelevant indices when possible
Closes #2416
2013-11-14 17:24:49 +01:00
Igor Motov 510397aecd Initial implementation of Snapshot/Restore API
Closes #3826
2013-11-10 18:26:56 -05:00
Lee Hinman f7d5d1e5c9 [DOCS] Update store docs to indicate mmapfs is now the default on 64-bit Linux 2013-11-09 11:42:43 -07:00
Clinton Gormley 5af4e02d6c [DOCS] Fix link to statsd plugin
Fixes #4128
2013-11-08 20:29:51 +01:00
Clinton Gormley 7189310764 In ctor of GeoPointFieldMapper, geohash_prefix now implicitly enables geohash option
Also improved docs for geopoint type and geohash_cell filte

Closes #3951
2013-11-08 13:52:17 +01:00
Clinton Gormley b27976fbed [DOCS] Fixed the fielddata regex example on core mapping 2013-11-07 17:09:18 +01:00
Clinton Gormley 3465e69e83 [DOCS] Changed all store:yes/no to store:true/false
which is how this setting is stored internally
2013-11-07 16:57:18 +01:00
Simon Willnauer 77bc5d5ecf release [1.0.0.Beta1] 2013-11-06 15:32:43 +01:00
Simon Willnauer 9654631186 Change 'standart' analyzer to use emtpy stopword list by default.
The 'default' / 'standard' analyzer can be a trappy default sicne it filters
english stopwords by default. Yet a default should not be dedicated to a certain language
since elasticsearch is used in many different scenarios where a standard analysis chain
with specialization to english full-text might be rather counter productive.

This commit changes the 'standard' analyzer to use an empty stopword list for indices
that are created from 1.0.0.Beta1 version onwards but will maintain backwards compatibiliy
for older indices.

Closes #3775
2013-11-05 21:07:21 +01:00
Shay Banon 7c32269f4f Dist. Percolation: Use .percolator instead of _percolator for type name
Use .percolator as the internal (hidden) type name for percolators within the index. Seems nicer name to represent "hidden" types within an index.
closes #4090
2013-11-05 20:02:59 +01:00
Boaz Leskes a9fdcadf01 [DOCS] Added documentation for the keep word token filter 2013-11-04 18:38:44 +01:00
Clinton Gormley 356de95840 Added simplified range syntax to query string docs 2013-11-04 18:18:36 +01:00
Ben McCann 46edfc484a [DOCS] Add some documentation about the performance of `_source` usage in scripts. 2013-11-04 11:05:55 +01:00
Igor Motov c724f0de5d Initial implementation of ResourceWatcherService
Closes #4062
2013-11-03 21:55:54 -05:00
Dan Everton 6df60b7271 [DOC] Improve documentation on search stats groups
Document the ability to return all search statistics groups and provide examples of returning search statistics for groups.
2013-11-01 13:53:39 +01:00
Martijn van Groningen 30ab6f841d [DOCS] Fixed percolate docs errors 2013-11-01 11:44:07 +01:00
Clinton Gormley 4206cc988e [DOCS] Typo on shingle tokenfilter 2013-10-31 20:18:00 +01:00
Alexander Reelsen dfcb3ca2d4 RegexpQueryBuilder now implements MultiTermQueryBuilder
This allows the RegexpQueryBuilder to be used in span queries

Added tests for all span multi term queries.
Also updated the documentation and removed mentioning of numeric range
queries for span queries (they have to be terms).

Closes #3392
2013-10-31 09:12:57 +01:00
Boaz Leskes 8819f91d47 Add a GetFieldMapping API
This new API allows to get the mapping for a specific set of fields rather than get the whole index mapping and traverse it.
The fields to be retrieved can be specified by their full path, index name and field name and will be resolved in this order.
In case multiple field match, the first one will be returned.

Since we are now generating the output (rather then fall back to the stored mapping), you can specify `include_defaults`=true on the request to have default values returned.

Closes #3941
2013-10-30 16:16:36 +01:00
Clinton Gormley 8b2efd4849 [DOCS] Added a version flag to percolation 2013-10-30 13:59:03 +01:00
Clinton Gormley 0585890a5f [DOCS] Fixed a typo 2013-10-30 13:57:18 +01:00
Alexander Reelsen 2ec9742147 [DOCS] Extending setup as a service documentation
* Tell people to use ES_JAVA_OPTS for es.node.name or similar parameters
* Showing a simple way to install Oracle JDK on ubuntu/debian

Closes #3999
2013-10-29 13:58:06 +01:00
David Pilato 5d90abf701 mget API should support global routing parameter
mget API support `_routing` field but not `routing` parameter.

Reproduction here:

```sh
curl -XDELETE "http://localhost:9200/test/"; echo
curl -XPUT "http://localhost:9200/test/" -d'{
   "settings": {
      "number_of_replicas": 0,
      "number_of_shards": 5
   }
}'; echo

curl -XPUT 'http://localhost:9200/test/order/1-1?routing=key1' -d '{
   "productName":"doc 1"
}'; echo
curl -XPUT 'http://localhost:9200/test/order/1-2?routing=key1' -d '{
   "productName":"doc 2"
}'; echo
curl -XPUT 'http://localhost:9200/test/order/1-3?routing=key1&refresh=true' -d '{
   "productName":"doc 3"
}'; echo

curl -XPOST 'http://localhost:9200/test/order/_mget?pretty' -d '{
    "docs" : [
        {
            "_index" : "test",
            "_type" : "order",
            "_id" : "1-1",
            "_routing" : "key1"
        },
        {
            "_index" : "test",
            "_type" : "order",
            "_id" : "1-2",
            "_routing" : "key1"
        },
        {
            "_index" : "test",
            "_type" : "order",
            "_id" : "1-3",
            "_routing" : "key1"
        }
    ]
}'; echo

curl -XPOST 'http://localhost:9200/test/order/_mget?pretty&routing=key1' -d '{
	"ids": [
		"1-1",
		"1-2",
		"1-3"
	]
}'; echo
```

Closes #3996.
2013-10-28 21:05:55 +01:00
Britta Weber c9dab6991e rename and document "index.mapping.date.parse_upper_inclusive" setting for date fields
The setting causes the upper bound for a range query/filter to be rounded up,
therefore the name `round_ceil` seems to make more sense.

Also this commit removes the redundant fourth parameter to DateMathParser.parse(..)
which was never used.
was:    parse(String text, long now, boolean roundUp, boolean upperInclusive)
is now: parse(String text, long now, boolean roundCeil)

closes #3914
2013-10-28 15:48:31 +01:00
Ben McCann cc4bc7d57d Fix nonsensical sentence in standard analyzer documentation so that it is more understandable 2013-10-25 00:18:32 +02:00
Luca Cavanna 48ac9747a8 Added third highlighter type based on lucene postings highlighter
Requires field index_options set to "offsets" in order to store positions and offsets in the postings list.
Considerably faster than the plain highlighter since it doesn't require to reanalyze the text to be highlighted: the larger the documents the better the performance gain should be.
Requires less disk space than term_vectors, needed for the fast_vector_highlighter.
Breaks the text into sentences and highlights them. Uses a BreakIterator to find sentences in the text. Plays really well with natural text, not quite the same if the text contains html markup for instance.
Treats the document as the whole corpus, and scores individual sentences as if they were documents in this corpus, using the BM25 algorithm.

Uses forked version of lucene postings highlighter to support:
- per value discrete highlighting for fields that have multiple values, needed when number_of_fragments=0 since we want to return a snippet per value
- manually passing in query terms to avoid calling extract terms multiple times, since we use a different highlighter instance per doc/field, but the query is always the same

The lucene postings highlighter api is  quite different compared to the existing highlighters api, the main difference being that it allows to highlight multiple fields in multiple docs with a single call, ensuring sequential IO.
The way it is introduced in elasticsearch in this first round is a compromise trying not to change the current highlight api, which works per document, per field. The main disadvantage is that we lose the sequential IO, but we can always refactor the highlight api to work with multiple documents.

Supports pre_tag, post_tag, number_of_fragments (0 highlights the whole field), require_field_match, no_match_size, order by score and html encoding.

Closes #3704
2013-10-24 23:38:00 +02:00
Luca Cavanna e981e411d7 [DOCS] rephrased docs for highlight no_match_size parameter
(removed 0.90.6 coming tag as it's needed only in 0.90 branch)
2013-10-24 14:38:32 +02:00
Nik Everett 14a709f563 Highlighting can return excerpt with no highlights
You can configure the highlighting api to return an excerpt of a field
even if there wasn't a match on the field.

The FVH makes excerpts from the beginning of the string to the first
boundary character after the requested length or the boundary_max_scan,
whichever comes first.  The Plain highlighter makes excerpts from the
beginning of the string to the end of the last token before the requested
length.

Closes #1171
2013-10-24 14:38:32 +02:00
Boaz Leskes 0e6e6f97dc Merge pull request #3940 from rboulton/patch-1
[Docs] Clean up wording in cluster health api doc
2013-10-22 04:09:13 -07:00
Markus Fischer 782d315da3 Fix markup 2013-10-21 16:11:09 +02:00
Richard Boulton b62cc7c716 Clean up wording to reduce confusion
The description of the timeout parameter was worded misleadingly; it implied that the API would wait until the cluster reached the desired level and then stayed at that level for the timeout. I've tweaked the sentence to remove the risk of confusion.
2013-10-21 12:37:50 +01:00
Clinton Gormley b2d82d7e75 [DOCS] Reorganised the highlight_query docs and added a version flag 2013-10-18 18:03:31 +02:00
Matt Weber 1e0a834c68 Document strict dynamic type mapping. 2013-10-18 08:29:31 -07:00
Nik Everett 60550e4cc2 phrase_len is not called phrase_length 2013-10-18 09:29:53 -04:00
Clinton Gormley adf0c8424b [DOCS] How to check max_file_descriptors 2013-10-17 11:54:36 +02:00
Martijn van Groningen b7c4adeea3 [Docs] update reference to remove documentation about percolating during an index, bulk or update request. 2013-10-16 16:31:36 +02:00
Martijn van Groningen 1d0841e2b8 Added initial documentation for the redesigned percolator. 2013-10-16 14:12:19 +02:00
Boaz Leskes 18e12ef66c [Docs] updated refrences to dynamic_date_formats 2013-10-16 12:04:31 +02:00
Boaz Leskes 57b2d45142 [Docs] added document for the lenient option in match queries 2013-10-16 10:53:25 +02:00
Alexander Reelsen 4d19239ec4 Add support for Lucene SuggestStopFilter
The suggest stop filter is an improved version of the stop filter, which
takes stopwords only into account if the last char of a query is a
whitespace. This allows you to keep stopwords, but to allow suggesting for
"a".

Example: Index document content "a word". You are now able to suggest for
"a" and get back results in the completion suggester, if the suggest stop
filter is used on the query side, but will not get back any results for
"a " as this is identified as a stopword.

The implementation allows to set the `remove_trailing` parameter for a
custom stop filter and thus use the suggest stop filter instead of the
standard stop filter.
2013-10-15 16:12:02 +02:00
Clinton Gormley 870346070e [DOCS] Added compound_on_flush docs and updated compound_format
docs to include note about accepting a float
2013-10-15 13:30:56 +02:00
Clinton Gormley d67331b554 [DOCS] Added script.disable_dynamic to the scripting page 2013-10-15 12:25:07 +02:00
steve mayzak 48656fd1ed removed a duplicate paragraphin config docs 2013-10-14 15:33:56 -07:00
Britta Weber 34441f3897 fix naming in function_score
- "boost" should be "boost_factor"
    - "mult" should be "multiply"

Also, store combine function names in ImmutableMap instead of iterating
over all possible names each time.

closes #3872 for master
2013-10-14 14:56:59 +02:00
Simon Willnauer 25d6f04f13 [DOCS] Note that cutoff_frequency doesn't handle stacked tokens gracefully 2013-10-14 14:09:38 +02:00
Britta Weber c3ab79a10e [DOCS] Add doc for delimited payload token filter 2013-10-14 13:41:35 +02:00
Clinton Gormley 9a062e465c [DOCS] Reorganised common API conventions 2013-10-13 16:46:56 +02:00
Clinton Gormley 4316b13880 [DOCS] Render common options on the same page 2013-10-13 14:14:50 +02:00
Shay Banon 420b3396f4 Set queue sizes by default on bulk/index thread pools
Now that we properly fixed the ability to set the queue size on the index / bulk thread pool, we should actually set them to a somehow reasonable value to protect from users potentially overflowing our system.

I suggest defaults to be 50 for bulk, and 200 for indexing.

Also, set the thread pool for get, which we should set (in a similar value to a "read" queue size we have today).
closes #3888
2013-10-12 21:51:37 +02:00
Subhash Gopalakrishnan b758b76da4 Support year units in date math expressions
According to http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping-date-format.html, the date math expressions support M (month), w (week), h (hour), m (minute), and s (second) units. Why years are not supported? Please add support for year units.

Closes #3828.
Closes #3874.
2013-10-11 09:24:52 +02:00
Clinton Gormley 8462f88c39 [DOCS] Added more specific versions to the suggesters 2013-10-10 20:59:12 +02:00
Adrien Grand f2d75654bf Add clear warnings that only the default codec, postings format and doc values format have backward compatibility warranties. 2013-10-10 13:30:08 +02:00
Clinton Gormley ba1b4886e3 [DOCS] Moved "named filters/queries" up one level 2013-10-10 11:23:08 +02:00
Adrien Grand 4fa8f6f61f Doc values integration.
This commit allows for using Lucene doc values as a backend for field data,
moving the cost of building field data from the refresh operation to indexing.
In addition, Lucene doc values can be stored on disk (partially, or even
entirely), so that memory management is done at the operating system level
(file-system cache) instead of the JVM, avoiding long pauses during major
collections due to large heaps.

So far doc values are supported on numeric types and non-analyzed strings
(index:no or index:not_analyzed). Under the hood, it uses SORTED_SET doc values
which is the only type to support multi-valued fields. Since the field data API
set is a bit wider than the doc values API set, some operations are not
supported:
 - field data filtering: this will fail if doc values are enabled,
 - field data cache clearing, even for memory-based doc values formats,
 - getting the memory usage for a specific field,
 - knowing whether a field is actually multi-valued.

This commit also allows for configuring doc-values formats on a per-field basis
similarly to postings formats. In particular the doc values format of the
_version field can be configured through its own field mapper (it used to be
handled in UidFieldMapper previously).

Closes #3806
2013-10-09 16:34:30 +02:00
Lee Hinman dede6ee874 Remove extra 'processors' anchor in threadpool docs 2013-10-09 01:56:49 -06:00
Adrien Grand 97958ed02a Improved warm-up of new segments.
* Merged segments are now warmed-up at the end of the merge operation instead
  of _refresh, so that _refresh doesn't pay the price for the warm-up of merged
  segments, which is often higher than flushed segments because of their size.
* Even when no _warmer is registered, some basic warm-up of the segments is
  performed: norms, doc values (_version). This should help a bit people who
  forget to register warmers.
* Eager loading support for the parent id cache and field data: when one
  can't predict what terms will be present in the index, it is tempting to use
  a match_all query in a warmer, but in that case, query execution might not be
  much faster than field data loading so having a warmer that only loads field
  data without running a query can be useful.

Closes #3819
2013-10-08 23:06:55 +02:00
Clinton Gormley 264a00a40f [DOCS] Added pages explaining lucene query parser syntax and regular expression syntax 2013-10-07 14:42:49 +02:00
Clinton Gormley 7a53d41446 [DOCS] Changed capitalization of operator in rescore query 2013-10-05 17:18:15 +02:00
Clinton Gormley d062409309 [DOCS] Removed enable_position_increments in stop filter 2013-10-05 17:06:13 +02:00
Clinton Gormley ea05f4538c [DOCS] Updated ICU-Plugin docs from the repo README 2013-10-05 16:31:52 +02:00
Luca Cavanna b0fee6c01b Changed nested filter example to use an inner bool filter instead of a bool query, to demonstrate the usage of a filter rather than a query. 2013-10-04 14:08:37 +02:00
Clinton Gormley e53a26ff21 [DOCS] Fixed a typo in indices.get_templates 2013-10-03 11:40:29 +02:00
uboness f3c6108b71 introduced support for "shard_size" for terms & terms_stats facets. The "shard_size" is the number of term entries each shard will send back to the coordinating node. "shard_size" > "size" will increase the accuracy (both in terms of the counts associated with each term and the terms that will actually be returned the user) - of course, the higher "shard_size" is, the more expensive the processing becomes as bigger queues are maintained on a shard level and larger lists are streamed back from the shards.
closes #3821
2013-10-02 22:02:00 +02:00
Nik Everett 6b000d8c6d Support specifing score query on highlight.
This is useful if you want to highlight terms not in the search query or
you want sort highlighted snippets based on another query.

Closes #3630
2013-10-02 15:46:24 -04:00
Lee Hinman ba40aa374e Uniquify anchor links to fix asciidoc/docbook generation 2013-09-30 15:32:00 -06:00
Lee Hinman 0442b737be Add more anchor links to documentation
Related to #3679
2013-09-30 13:13:16 -06:00
Alexander Reelsen c63869b0be Documentation: Removed service wrapper, added rpm/deb package information 2013-09-26 14:30:25 +02:00
gtt116 6304d58e36 Remove a comma in doc to make example a valid json.
This will help reader to do a hurry up copy-paste test.
2013-09-24 15:23:23 +08:00
Costin Leau 3685a22e4a add docs on new service.bat facility 2013-09-23 18:24:31 +03:00
Martijn van Groningen d365a4ccba Added nested filter join option to the docs.
Closes #3738
2013-09-20 21:22:56 +02:00
Shay Banon 359d14ddc5 doc processors setting 2013-09-20 14:55:35 +02:00
Shay Banon 29c0f27a9e fix thread pool docs to remove blocking 2013-09-20 12:31:17 +02:00
Adrien Grand 90524d7ad2 Fix formatting of the documentation.
Remaining '@'s have been replaced with '`'s.
2013-09-18 12:35:44 +02:00
Britta Weber b7c3b50909 add date field to decay function doc 2013-09-17 19:54:31 +02:00
David Pilato 1e3ffa0df7 Add distance supported units 2013-09-17 14:21:45 +02:00
Clinton Gormley 85bba668f7 [DOCS] Tidied up various doc formatting errors 2013-09-16 16:13:01 +02:00
Clinton Gormley c2eb4a1c40 [DOCS] Tidied up function score 2013-09-16 15:57:08 +02:00
Clinton Gormley 422eed7985 [Docs] Added an added[0.90.4] flag to the disk based allocator 2013-09-16 15:57:07 +02:00
Simon Willnauer 85fcefc60d Allow include / exclude of completion stats via REST parameters
Stats can be retrieved on a per-feature / per-component  basis including the fields
they apply to. This commit add support for a 'completion' flag to include statistics
for the complition feature as well as 'completion_fields' to only
include certain fields into the returned statistics.
To disambiguate between 'fielddata' and 'completion' fields this commit
uses 'fields' as the default inclusion filter for stats fields only used
if not dedicated '[completion|fielddata]_fields' paramter is provided.

Relates to #3522
2013-09-16 11:28:32 +02:00
Martijn van Groningen f6f4b5014f Added docs for named queries.
Relates to #3581
2013-09-16 11:17:01 +02:00
Shay Banon 20745adadd Add dedicated Suggest Thread Pool
Add a dedicated suggest thread pool for the suggest API. With the new completion suggest type, which is purely CPU bounded, it makes more sense to have a dedicated thread pool for suggest compared to having it share the search thread pool and "competing" against other search operations.
closes #3698
2013-09-15 01:54:27 +02:00
Shay Banon df3f681ef0 Optimize API: Remove refresh flag
Refresh flag in optimize is problematic, since the shards refresh is allowed to execute on is different compared to the optimize shards. In order to do optimize and then refresh, they should be executed as separate APIs when needed.
closes #3690
2013-09-13 21:44:38 +02:00
Shay Banon 7cc48c8e87 Flush API: remove refresh flag
Refresh flag in flush is problematic, since the shards refresh is allowed to execute on is different compared to the flush shards. In order to do flush and then refresh, they should be executed as separate APIs when needed.
closes #3689
2013-09-13 21:09:45 +02:00
David Pilato ea4988e9dc Support for REST get ALL templates.
/_template shows: No handler found for uri [/_template] and method [GET]

It would make sense to list the templates as they are listed in the /_cluster/state call.

Closes #2532.
2013-09-13 15:08:59 +02:00
Clinton Gormley d6ecdecc19 [DOCS] Deprecated the from/to/include_lower/include_upper params
in the range query, range filter and numeric range filter.
Better to use gt/gte/lt/lte as they are explicit.
2013-09-12 15:07:36 +02:00
David Pilato 169cd007b5 Fix typo
Thanks to @ybonnel for finding it ;-)
2013-09-12 11:00:59 +02:00
Martijn van Groningen 8ddb809f98 If all scroll ids should be removed then the `_all` value should be used instead of not specifying any scroll ids. 2013-09-12 10:41:38 +02:00
Martijn van Groningen 0efa78710b Added clear scroll api.
The clear scroll api allows clear all resources associated with a `scroll_id` by deleting the `scroll_id` and its associated SearchContext.

Closes #3657
2013-09-10 21:17:34 +02:00
David Pilato fafc4eef98 Plugin Manager: add silent mode.
Now with have proper exit codes for elasticsearch plugin manager (see #3463), we can add a silent mode to plugin manager.

```sh
bin/plugin --install karmi/elasticsearch-paramedic --silent
```

Closes #3628.
2013-09-10 18:31:35 +02:00
David Pilato 764aa54f2d Plugin Manager should support -remove group/artifact/version naming
When installing a plugin, we use:

```sh
bin/plugin --install groupid/artifactid/version
```

But when removing the plugin, we only support:

```sh
bin/plugin --remove dirname
```

where `dirname` is the directory name of the plugin under `/plugins` dir.

Closes #3421.
2013-09-09 21:17:16 +02:00
Brad Fritz f3c0e39380 key is "index.store.type", not "index.storage.type" 2013-09-09 13:06:09 -04:00
Lee Hinman 7d52d58747 Add AllocationDecider that takes free disk space into account
This commit adds two main pieces, the first is a ClusterInfoService
that provides a service running on the master nodes that fetches the
total/free bytes for each data node in the cluster as well as the
sizes of all shards in the cluster. This information is gathered by
default every 30 seconds, and can be changed dynamically by setting
the `cluster.info.update.interval` setting. This ClusterInfoService
can hopefully be used in the future to weight nodes for allocation
based on their disk usage, if desired.

The second main piece is the DiskThresholdDecider, which can disallow
a shard from being allocated to a node, or from remaining on the node
depending on configuration parameters. There are three main
configuration parameters for the DiskThresholdDecider:

`cluster.routing.allocation.disk.threshold_enabled` controls whether
the decider is enabled. It defaults to false (disabled). Note that the
decider is also disabled for clusters with only a single data node.

`cluster.routing.allocation.disk.watermark.low` controls the low
watermark for disk usage. It defaults to 0.70, meaning ES will not
allocate new shards to nodes once they have more than 70% disk
used. It can also be set to an absolute byte value (like 500mb) to
prevent ES from allocating shards if less than the configured amount
of space is available.

`cluster.routing.allocation.disk.watermark.high` controls the high
watermark. It defaults to 0.85, meaning ES will attempt to relocate
shards to another node if the node disk usage rises above 85%. It can
also be set to an absolute byte value (similar to the low watermark)
to relocate shards once less than the configured amount of space is
available on the node.

Closes #3480
2013-09-09 09:49:30 -06:00
Clinton Gormley 9e6d30a14a [DOCS] Changed the deprecation of custom_boost/score/filters_score
queries to 0.90.4
2013-09-05 12:14:10 +02:00
Clinton Gormley 2b3a762c27 [DOCS] Function score was added in 0.90.4 not 1.00.Beta 2013-09-05 11:25:06 +02:00
Clinton Gormley 8257aba166 [DOCS] Fixed fielddata regex syntax 2013-09-04 23:20:56 +02:00
Clinton Gormley 6d667e5d41 [DOCS] Missing sort values now works for all field types 2013-09-04 23:20:55 +02:00
Clinton Gormley 765bd026f5 [DOCS] Added function score query 2013-09-04 23:20:55 +02:00
Clinton Gormley aa59ef2e84 [DOCS] Added the human flag 2013-09-04 23:20:55 +02:00
Clinton Gormley 9d0dd545cb [DOCS] Tidied up the plugins page and added Graphite and Statsd 2013-09-04 23:20:55 +02:00
Clinton Gormley e1c6f45ff0 [DOCS] Added clarification about global scope in facets 2013-09-04 23:20:55 +02:00
Clinton Gormley 08f8e77b8f [DOCS] Added fuzzy options to completion suggester 2013-09-04 23:20:55 +02:00
Clinton Gormley 047c86e3b2 [DOCS] Added wildcard template matching 2013-09-04 23:20:55 +02:00
Clinton Gormley 9f5d0b6e89 [DOCS] Added a few clarifications to the docs from the issues list 2013-09-04 23:20:55 +02:00
Clinton Gormley 94be785726 [DOCS] Added multi-index open/close 2013-09-04 23:20:55 +02:00
Clinton Gormley 5b60506b2e [DOCS] Added highlighting to the phrase suggester 2013-09-04 23:20:54 +02:00
Clinton Gormley 53ad7330fc [DOCS] Added docs for term vectors 2013-09-04 23:20:54 +02:00
Clinton Gormley eac2b3a52e [DOCS] Fixed typo 2013-09-04 23:20:54 +02:00
Clinton Gormley 393c28bee4 [DOCS] Removed outdated new/deprecated version notices 2013-09-03 21:28:31 +02:00
Simon Willnauer eb2fed85f1 Add 'min_input_len' to completion suggester
Restrict the size of the input length to a reasonable size otherwise very
long strings can cause StackOverflowExceptions deep down in lucene land.
Yet, this is simply a saftly limit set to `50` UTF-16 codepoints by default.
This limit is only present at index time and not at query time. If prefix
completions > 50 UTF-16 codepoints are expected / desired this limit should be raised.
Critical string sizes are beyone the 1k UTF-16 Codepoints limit.

Closes #3596
2013-09-03 10:26:37 +02:00
Boaz Leskes e807c99f27 Fixed a typo in the config of light finnish stemmer (old last_finish is still supported for backward compatibility)
Closes #3594
2013-08-29 10:15:40 +02:00
Clinton Gormley 822043347e Migrated documentation into the main repo 2013-08-29 01:24:34 +02:00