Commit Graph

1433 Commits

Author SHA1 Message Date
Tanguy Leroux 340b7ef6ef Add common SystemD file for RPM/DEB package 2015-05-27 11:51:58 +02:00
javanna fc28bc73f8 [DOCS] add kopf to site plugins 2015-05-27 10:28:53 +02:00
Ryan Schneider 8ec6bf7340 [DOCS] Update get.asciidoc
Updated to not mislead the reader that the data is actually gone when a document is updated. For example if you have 100GB of docs and update each one you'll only be able to access 100GB of the data, but there would theoretically be 200GB of doc data.

Closes #10375
2015-05-27 10:17:10 +02:00
Boaz Leskes 6d269cbf4d feedback 2015-05-27 10:29:37 +03:00
javanna 6c81a8daf3 Internal: count api to become a shortcut to the search api
The count api used to have its own execution path, although it would do the same (up to bugs!) of the search api. This commit makes it a shortcut to the search api with size set to 0. The change is made in a backwards compatible manner, by leaving all of the java api code around too, given that you may not want to get back a whole SearchResponse when asking only for number of hits matching a query, also cause migrating from countResponse.getCount() to searchResponse.getHits().totalHits() doesn't look great from a user perspective. We can always decide to drop more code around the count api if we want to break backwards compatibility on the java api, making it a shortcut on the rest layer only.

Closes #9117
Closes #11198
2015-05-26 19:12:11 +02:00
Alexander Reelsen 1fa21a76cf Documentation: Fix elasticsearch documentation build
The commit for closing #11033 was not building the asciidoc
documentation.
2015-05-26 18:16:12 +02:00
Alexander Reelsen 045f01c085 Infra for deprecation logging
Add support for a specific deprecation logging that can be used to turn
on in order to notify users of a specific feature, flag, setting,
parameter, ... being deprecated.

The deprecation logger logs with a "deprecation." prefix logge
(or "org.elasticsearch.deprecation." if full name is used), and outputs
the logging to a dedicated deprecation log file.

Deprecation logging are logged under the DEBUG category. The idea is not to
enabled them by default (under WARN or ERROR) when running embedded in
another application.

By default they are turned off (INFO), in order to turn it on, the
"deprecation" category need to be set to DEBUG. This can be set in the
logging file or using the cluster update settings API, see the documentation

Closes #11033
2015-05-26 17:44:52 +02:00
Tanguy Leroux ce63590bd6 API: Add response filtering with filter_path parameter
This change adds a new "filter_path" parameter that can be used to filter and reduce the responses returned by the REST API of elasticsearch.

For example, returning only the shards that failed to be optimized:
```
curl -XPOST 'localhost:9200/beer/_optimize?filter_path=_shards.failed'
{"_shards":{"failed":0}}%
```

It supports multiple filters (separated by a comma):
```
curl -XGET 'localhost:9200/_mapping?pretty&filter_path=*.mappings.*.properties.name,*.mappings.*.properties.title'
```

It also supports the YAML response format. Here it returns only the `_id` field of a newly indexed document:
```
curl -XPOST 'localhost:9200/library/book?filter_path=_id' -d '---hello:\n  world: 1\n'
---
_id: "AU0j64-b-stVfkvus5-A"
```

It also supports wildcards. Here it returns only the host name of every nodes in the cluster:
```
curl -XGET 'http://localhost:9200/_nodes/stats?filter_path=nodes.*.host*'
{"nodes":{"lvJHed8uQQu4brS-SXKsNA":{"host":"portable"}}}
```

And "**" can be used to include sub fields without knowing the exact path. Here it returns only the Lucene version of every segment:
```
curl 'http://localhost:9200/_segments?pretty&filter_path=indices.**.version'
{
  "indices" : {
    "beer" : {
      "shards" : {
        "0" : [ {
          "segments" : {
            "_0" : {
              "version" : "5.2.0"
            },
            "_1" : {
              "version" : "5.2.0"
            }
          }
        } ]
      }
    }
  }
}
```

Note that elasticsearch sometimes returns directly the raw value of a field, like the _source field. If you want to filter _source fields, you should consider combining the already existing _source parameter (see Get API for more details) with the filter_path parameter like this:

```
curl -XGET 'localhost:9200/_search?pretty&filter_path=hits.hits._source&_source=title'
{
  "hits" : {
    "hits" : [ {
      "_source":{"title":"Book #2"}
    }, {
      "_source":{"title":"Book #1"}
    }, {
      "_source":{"title":"Book #3"}
    } ]
  }
}
```
2015-05-26 13:51:04 +02:00
Britta Weber eeeb29f900 spell correct and add single quotes 2015-05-26 11:41:19 +02:00
Britta Weber 37782c1745 analyzers: custom analyzers names and aliases must not start with _
closes #9596
2015-05-26 11:38:15 +02:00
Boaz Leskes b376a3fbfb Move index sealing terminology to synced flush
#10032 introduced the notion of sealing an index by marking it with a special read only marker, allowing for a couple of optimization to happen. The most important one was to speed up recoveries of shards where we know nothing has changed since they were online by skipping the file based sync phase. During the implementation we came up with a light notion which achieves the same recovery benefits but without the read only aspects which we dubbed synced flush. The fact that it was light weight and didn't put the index in read only mode, allowed us to do it automatically in the background which has great advantage. However we also felt the need to allow users to manually trigger this operation.

 The implementation at #11179 added the sync flush internal logic and the manual (rest) rest API. The name of the API was modeled after the sealing terminology which may end up being confusing. This commit changes the API name to match the internal synced flush naming, namely `{index}/_flush/synced'.

  On top of that it contains a couple other changes:
   - Remove all java client API. This feature is not supposed to be called programtically by applications but rather by admins.
   - Improve rest responses making structure similar to other (flush) API
   - Change IndexShard#getOperationsCount to exclude the internal +1 on open shard . it's confusing to get 1 while there are actually no ongoing operations
   - Some minor other clean ups
2015-05-25 22:32:32 +03:00
Alex Chan e31049988b [Docs] Fix minor spelling errors
Closes #11320
2015-05-25 19:56:43 +02:00
Eduardo Gurgel 0f3b3c0787 Docs: Fix typo on percolate_format description
Closes #11215
2015-05-25 13:17:59 +02:00
Clinton Gormley 4d27d751fb Docs: Move the page on facets into redirects.asciidoc 2015-05-24 23:34:23 +02:00
Clinton Gormley 6171ae6cc4 Docs: Added stub entries for pages deleted from 1.x 2015-05-24 17:57:34 +02:00
Clinton Gormley 4b854d10bd Docs: Tidied up the field statistics docs 2015-05-24 15:12:44 +02:00
Britta Weber 4d0b40ca52 Merge pull request #11235 from nik9000/seal_docs
Rewrote some _seal documentation
2015-05-22 18:24:23 +02:00
Clinton Gormley cde2c91b5a Docs: Example blocks can't contain warnings 2015-05-22 17:37:58 +02:00
Clinton Gormley 631e03c872 Docs: Tidied up term vectors docs
Moved annotations out of titles
Made the example titles into example blocks
2015-05-22 17:19:12 +02:00
Nik Everett 6da1e858dc Rewrote some _seal documentation
The first two paragraphs were confusing to me so I tried to rewrite them.

I removed some passive voice because it irks me.
2015-05-22 10:51:21 -04:00
Clinton Gormley 20279a2556 Docs: Rename reference docs to Elasticsearch Reference 2015-05-22 14:49:11 +02:00
Adrien Grand 42f9053817 Merge pull request #11280 from jpountz/fix/remove_binary_compress
Mappings: Remove the `compress`/`compress_threshold` options of the BinaryFieldMapper.
2015-05-22 14:21:13 +02:00
Adrien Grand 461683ac58 Mappings: Remove the `compress`/`compress_threshold` options of the BinaryFieldMapper.
This option is broken currently since it potentially interprets an incoming
binary value as compressed while it just happens that the first bytes are the
same as the LZF header.
2015-05-22 14:20:42 +02:00
Colin Goodheart-Smithe 35deb7efea Aggregations: Renaming reducers to Pipeline Aggregators 2015-05-21 14:57:23 +01:00
Igor Motov dd41c68741 Snapshot/Restore: fix FSRepository location configuration
Closes #11068
2015-05-20 22:14:31 -04:00
Lee Hinman 0a6f7ef379 [DOCS] Mention Integer.MAX_VALUE limit for http.max_content_length
Fixes #11244
2015-05-20 13:08:59 -06:00
Clinton Gormley 5e4d5e1c64 Docs: Included the index-seal docs in the indices section 2015-05-20 11:20:12 +02:00
Simon Willnauer 488be75d19 Add some words about the purpose of a seal etc. 2015-05-19 12:26:08 +02:00
Simon Willnauer 9d2852f0ab Merge branch 'master' into feature/synced_flush
Conflicts:
	src/main/java/org/elasticsearch/index/engine/InternalEngine.java
	src/main/java/org/elasticsearch/index/shard/IndexShard.java
	src/main/java/org/elasticsearch/indices/recovery/RecoverySourceHandler.java
	src/test/java/org/elasticsearch/index/engine/InternalEngineTests.java
2015-05-19 12:16:22 +02:00
Adrien Grand 2c241e8a36 Mappings: Remove the `ignore_conflicts` option.
Mappings conflicts should not be ignored. If I read the history correctly, this
option was added when a mapping update to an existing field was considered a
conflict, even if the new mapping was exactly the same. Now that mapping updates
are smart enough to detect conflicting options, we don't need an option to
ignore conflicts.
2015-05-18 15:28:23 +02:00
javanna a843008b17 Highlighting: require_field_match set to true by default
The default `false` for `require_field_match` is a bit odd and confusing for users, given that field names get ignored by default and every field gets highlighted if it contains terms extracted out of the query, regardless of which fields were queries. Changed the default to `true`, it can always be changed per request.

Closes #10627
Closes #11067
2015-05-15 21:38:45 +02:00
Clinton Gormley 9d71816cd2 Docs: Fixed explanation of AUTO fuzziness
Closes #11186
2015-05-15 21:25:11 +02:00
javanna 46c521f7ec Highlighting: nuke XPostingsHighlighter
Our own fork of the lucene PostingsHighlighter is not easy to maintain and doesn't give us any added value at this point. In particular, it was introduced to support the require_field_match option and discrete per value highlighting, used in case one wants to highlight the whole content of a field, but get back one snippet per value. These two features won't
 make it into lucene as they slow things down and shouldn't have been supported from day one on our end probably.

One other customization we had was support for a wider range of queries via custom rewrite etc. (yet another way to slow
 things down), which got added to lucene and works much much better than what we used to do (instead of or rewrite, term
s are pulled out of the automata for multi term queries).

Removing our fork means the following in terms of features:
- dropped support for require_field_match: the postings highlighter will only highlight fields that were queried
- some custom es queries won't be supported anymore, meaning they won't be highlighted. The only one I found up until now is the phrase_prefix. Postings highlighter rewrites against an empty reader to avoid slow operations (like the ones that we were performing with the fork that we are removing here), thus the prefix will not be expanded to any term. What the postings highlighter does instead is pulling the automata out of multi term queries, but this is not supported at the moment with our MultiPhrasePrefixQuery.

Closes #10625
Closes #11077
2015-05-15 20:41:33 +02:00
Clinton Gormley 3a69b65e88 Docs: Fixed the backslash escaping on the pattern analyzer docs
Closes #11099
2015-05-15 18:40:16 +02:00
Jun Ohtani 597c53a0bb Add migrationi note for AnalyzeRequest 2015-05-16 00:25:53 +09:00
Adrien Grand bf599d68dd Merge pull request #11042 from jpountz/feature/aggs_missing
Aggs: Make it possible to configure missing values.
2015-05-15 16:33:29 +02:00
Adrien Grand 32e23b9100 Aggs: Make it possible to configure missing values.
Most aggregations (terms, histogram, stats, percentiles, geohash-grid) now
support a new `missing` option which defines the value to consider when a
field does not have a value. This can be handy if you eg. want a terms
aggregation to handle the same way documents that have "N/A" or no value
for a `tag` field.

This works in a very similar way to the `missing` option on the `sort`
element.

One known issue is that this option sometimes cannot make the right decision
in the unmapped case: it needs to replace all values with the `missing` value
but might not know what kind of values source should be produced (numerics,
strings, geo points?). For this reason, we might want to add an `unmapped_type`
option in the future like we did for sorting.

Related to #5324
2015-05-15 16:26:58 +02:00
Martijn van Groningen 719252a138 Merge pull request #11183 from martijnvg/parent-child/remove_id_cache_from_stats_and_clear_cache_apis
Removed `id_cache` from stats and cat apis.
2015-05-15 14:39:35 +02:00
Martijn van Groningen ece18f162e Removed `id_cache` from stats and cat apis.
Also removed the `id_cache` option from the clear cache api.

Closes #5269
2015-05-15 14:06:18 +02:00
Jun Ohtani 3a1a4d3e89 Analysis: Add multi-valued text support
Add support array text as a multi-valued for AnalyzeRequestBuilder
Add support array text as a multi-valued for Analyze REST API
Add docs

Closes #3023
2015-05-15 20:01:10 +09:00
Britta Weber 7a8d08a4a3 Merge remote-tracking branch 'origin/master' into feature/synced_flush 2015-05-15 10:35:36 +02:00
Lee Hinman 179dad69b6 [DOCS] Add DNS SRV discovery plugin 2015-05-14 16:02:59 -06:00
Areek Zillur 7efc43db25 Re-structure collate option in PhraseSuggester to only collate on local shard.
Previously, collate feature would be executed on all shards of an index using the client,
this leads to a deadlock when concurrent collate requests are run from the _search API,
due to the fact that both the external request and internal collate requests use the
same search threadpool.

As phrase suggestions are generated from the terms of the local shard, in most cases the
generated suggestion, which does not yield a hit for the collate query on the local shard
would not yield a hit for collate query on non-local shards.

Instead of using the client for collating suggestions, collate query is executed against
the ContextIndexSearcher. This PR removes the ability to specify a preference for a collate
query, as the collate query is only run on the local shard.

closes #9377
2015-05-14 17:21:53 -04:00
Jack Conradson a5c0ac0d67 Scripting: Add Multi-Valued Field Methods to Expressions
Add methods to operate on multi-valued fields in the expressions language.
Note that users will still not be able to access individual values
within a multi-valued field.

The following methods will be included:

* min
* max
* avg
* median
* count
* sum

Additionally, changes have been made to MultiValueMode to support the
new median method.

closes #11105
2015-05-14 08:27:24 -07:00
Britta Weber 2b03a03c0c Merge remote-tracking branch 'origin/master' into feature/synced_flush 2015-05-13 18:00:18 +02:00
Britta Weber f1948cf95c doc for seal api and doc for syned flush in general 2015-05-13 15:43:05 +02:00
Adrien Grand 630757906a Query DSL: Add `filter` clauses to `bool` queries.
These clauses filter the document space without affecting scoring and map to
Lucene's BooleanClause.Occur.FILTER. The `filtered` query is now deprecated and

```json
{
  "filtered": {
    "query": { //query },
    "filter": { //filter }
  }
}
```
should be replaced with
```json
{
  "bool": {
    "must": { //query },
    "filter": { //filter }
  }
}
```
2015-05-13 12:04:56 +02:00
Ryan Ernst f766b260ba Add tests for includeInObject backcompat 2015-05-12 23:11:15 -07:00
Ryan Ernst 565ffb16f1 Mappings: Remove ability to set meta fields inside documents
A few meta fields can currently be set within a document's source.
However, the recommended way to set meta fields like this is through
the api, and setting within the document can be a performance trap
(e.g. needing to find _id in order to route the document).

This change removes the ability to set meta fields within
a document source for 2.0+ indexes.

closes #11051
closes #11074
2015-05-12 23:09:03 -07:00
Igor Motov d6efe1e508 Docs: Add information about restoring to a different cluster 2015-05-12 20:59:24 -04:00
Ryan Ernst e7618b8528 Settings: Remove file based index templates
As a follow up to #10870, this removes support for
index templates on disk. It also removes a missed
place still allowing disk based mappings.

closes #11052
2015-05-11 12:51:22 -07:00
javanna 36c373e615 [DOCS] documented missing query_string parameters for count, exists, search & validate_query
relates to #11057
2015-05-11 12:58:30 +02:00
Martijn van Groningen acdd9a5dd9 parent/child: Removed the `top_children` query. 2015-05-10 16:30:19 +02:00
Lee Hinman 459a05168c Merge remote-tracking branch 'refs/remotes/dakrone/truncate-loglines' 2015-05-08 10:11:26 -06:00
Lee Hinman c6747ded16 Truncate log messages at 10,000 characters 2015-05-08 10:10:44 -06:00
Clinton Gormley a536bd5f81 Docs: Rewrote the term query docs to explain analyzed vs not_analyzed 2015-05-08 08:32:13 +02:00
Andrew Selden c953e99324 Merge pull request #10864 from aleph-zero/issues/9606
Remove (dfs_)query_and_fetch from the REST API
2015-05-07 12:51:28 -07:00
josephwolnskipn 7f064c592f Docs: Fix grammar and typos in percolate
Added commas, capitalized "JSON" and "API", capitalized titles, etc.

Closes #11023
2015-05-07 21:50:48 +02:00
Ryan Ernst e29492ce94 Docs: Cleanup meta field docs
Meta fields were locked down to not allow exotic options to the
underlying field types in #8143. This change fixes the docs
to no longer refer to the old settings.

closes #10879
2015-05-07 11:26:49 -07:00
Adrien Grand a0af88e996 Query DSL: Remove filter parsers.
This commit makes queries and filters parsed the same way using the
QueryParser abstraction. This allowed to remove duplicate code that we had
for similar queries/filters such as `range`, `prefix` or `term`.
2015-05-07 20:14:34 +02:00
Alex Ksikes 4787cf701f More Like This: remove percent_terms_to_match
Users should use minimum_should_match instead.

Closes #11030
2015-05-07 14:21:29 +02:00
Martijn van Groningen f7c29457d0 parent/child: Deprecated the `top_children` in favour of the `has_child` query. 2015-05-07 09:27:54 +02:00
Alexander Reelsen 82c21ff5b3 Documentation: Mention RPM repo does not work with older distributions
Getting this to work would be a lot of work (creating two different
repositories, having another GPG key, integrating this into our build).

Closes #6498
2015-05-07 08:20:06 +02:00
Alex Ksikes ec4f12f9ef More Like This: removal of the MLT API
Removes the More Like This API, users should now use the More Like This query.
The MLT API tests were converted to their query equivalent. Also some clean
ups in MLT tests.

Closes #10736
Closes #11003
2015-05-06 18:11:11 +02:00
Colin Goodheart-Smithe cf1251796f Aggregations: Adding Sum Bucket Aggregation
Closes #11007
2015-05-06 14:44:56 +01:00
Zachary Tong e70a8d4ee9 Merge pull request #10964 from polyfractal/feature/aggs_movavg_rename
Rename Moving Average models to their "common" names
2015-05-06 09:07:23 -04:00
Zachary Tong 3eb9cb913d Rename Moving Average models to their "common" names
Previously, we were using the "statistical", technically accurate name.  Instead, we
should probably use the name that people are familiar with, e.g. "Holt Winters" instead
of "triple exponential".  To that end:

- `single_exp` becomes `ewma` (exponentially weighted moving average)
- `double_exp` becomes `holt`

When the `triple_exp` is added, it will be called `holt_winters`.
2015-05-06 09:04:44 -04:00
Colin Goodheart-Smithe 72d99773dc Aggregations: Adding Average Bucket Aggregation
Also includes changes to the other bucket metric aggregations to share code

Closes #11006
2015-05-06 13:53:57 +01:00
Colin Goodheart-Smithe 644fd00714 Aggregations: x-axis units normalisation for derivative aggregation 2015-05-06 10:31:16 +01:00
Ryan Ernst 7a7bd6086a Mappings: Remove ability to disable _source field
Current features (eg. update API) and future features (eg. reindex API)
depend on _source. This change locks down the field so that
it can no longer be disabled. It also removes legacy settings
compress/compress_threshold.

closes #8142
closes #10915
2015-05-05 22:04:18 -07:00
Clinton Gormley 603a0c193b Docs: More translog doc improvements 2015-05-05 22:01:58 +02:00
Clinton Gormley a60251068c Docs: Improved the translog docs 2015-05-05 21:32:52 +02:00
Simon Willnauer fe5a35b68e Merge branch 'master' into pr-10624
Conflicts:
	src/main/java/org/elasticsearch/index/shard/IndexShard.java
2015-05-05 11:46:02 +02:00
Clinton Gormley e28ad853c7 Docs: Fixed bad asciidoc in migrate_2_0 2015-05-05 11:17:21 +02:00
Pascal Borreli af6d890ad5 Docs: Fixed typos
Closes #10973
2015-05-05 10:38:05 +02:00
aleph-zero 2b483cc806 Removed reference to search type 'count'
Removed reference to search type 'count' as this is now a deprecated
search type.
2015-05-04 14:48:40 -07:00
Shay Banon 187d79b6df Centralize admin implementations and action execution
This change removes the multiple implementations of different admin interfaces and centralizes it with AbstractClient. It also makes sure *all* executions of actions now go through a single AbstractClient#execute method, taking care of copying headers and wrapping listener.
This also has the side benefit of removing all the code around differnet possible clients, and removes quite a bit of code (most of the + code is actually removal of generics and such).

This change also changes how TransportClient is constructed, requiring a Builder to create it, its a breaking change and its noted in the migration guide.

Yea another step towards simplifying the action infra and making it simpler...
2015-05-04 23:40:17 +02:00
Zachary Tong f6d5167d41 Merge pull request #10929 from polyfractal/docs/aggs
Restructure Aggregation documentation
2015-05-04 13:28:47 -04:00
Ryan Ernst ba68d354c4 Merge pull request #10934 from mattweber/custom_analyzer_pos_offset_gap
document and test custom analyzer position offset gap
2015-05-04 08:56:50 -07:00
Matt Weber 63c4a214db document and test custom analyzer position offset gap 2015-05-04 08:53:45 -07:00
Clément Salaün c0659ce4d4 Docs: Update geo-distance-range-filter.asciidoc
missing comma

Closes #10957
2015-05-04 17:17:48 +02:00
Simon Willnauer 930eacd457 Merge branch 'master' into pr-10624 2015-05-04 17:06:05 +02:00
Clinton Gormley bffcf5af58 Docs: Update rolling upgrade
Added note about why replica shards may remain unassigned while there is only one node of the higher version in the cluster.

Closes #10951
2015-05-04 16:52:35 +02:00
Robert Muir 4b3672b7df Add migration note for hunspell dictionaries 2015-05-04 10:00:05 -04:00
Zachary Tong 967e05ea76 [DOCS] Fix section levels for Sampler agg 2015-05-04 09:18:24 -04:00
Simon Willnauer 7e5f9d5628 Merge branch 'master' into pr-10624
Conflicts:
	src/main/java/org/elasticsearch/index/engine/EngineConfig.java
	src/main/java/org/elasticsearch/index/shard/IndexShard.java
	src/test/java/org/elasticsearch/index/engine/InternalEngineTests.java
	src/test/java/org/elasticsearch/index/engine/ShadowEngineTests.java
2015-05-04 11:37:54 +02:00
Adrien Grand b72f27a410 Core: Cut over to the Lucene filter cache.
This removes Elasticsearch's filter cache and uses Lucene's instead. It has some
implications:
 - custom cache keys (`_cache_key`) are unsupported
 - decisions are made internally and can't be overridden by users ('_cache`)
 - not only filters can be cached but also all queries that do not need scores
 - parent/child queries can now be cached, however cached entries are only
   valid for the current top-level reader so in practice it will likely only
   be used on read-only indices
 - the cache deduplicates filters, which plays nicer with large keys (eg. `terms`)
 - better stats: we already had ram usage and evictions, but now also hit count,
   miss count, lookup count, number of cached doc id sets and current number of
   doc id sets in the cache
 - dynamically changing the filter cache size is not supported anymore

Internally, an important change is that it removes the NoCacheFilter infrastructure
in favour of making Query.rewrite specializing the query for the current reader so
that it will only be cached on this reader (look for IndexCacheableQuery).

Note that consuming filters with the query API (createWeight/scorer) instead of
the filter API (getDocIdSet) is important for parent/child queries because
otherwise a QueryWrapperFilter(ParentQuery) would run the wrapped query per
segment while relations might be cross segments.
2015-05-04 09:02:15 +02:00
Zachary Tong e3ae1df6f0 [DOCS] Restructure Aggs documentation 2015-05-01 16:04:55 -04:00
Clinton Gormley c28bf3bb3f Docs: Updated elasticsearch.org links to elastic.co 2015-05-01 20:46:12 +02:00
Robert Muir dfe1d1463c fix doc typo 2015-04-30 23:46:37 -04:00
Robert Muir aade6194b7 Add span within/containing queries.
Expose new span queries from https://issues.apache.org/jira/browse/LUCENE-6083

Within returns matches from 'little' that are enclosed inside of a match from 'big'.
Containing returns matches from 'big' that enclose matches from 'little'.
2015-04-30 23:31:31 -04:00
Jack Conradson aa968f6b65 Scripting: Add Field Methods
Added infrastructure to allow basic member methods in the expressions
language to be called.  The methods must have a signature with no arguments.  Also
added the following member methods for date fields (and it should be easy to add more)
* getYear
* getMonth
* getDayOfMonth
* getHourOfDay
* getMinutes
* getSeconds

Allow fields to be accessed without using the member variable [value].
(Note that both ways can be used to access fields for back-compat.)

closes #10890
2015-04-30 15:36:46 -07:00
Ryan Ernst d2b12e4fc2 Mappings: Remove docs for type level analyzer defaults
These settings were removed in #9430.
2015-04-30 13:57:55 -07:00
Ryan Ernst 4ef9f3ca63 Mappings: Remove file based default mappings
Using files that must be specified on each node is an anti-pattern
from the API based goal of ES. This change removes the ability
to specify the default mapping with a file on each node.

closes #10620
2015-04-30 13:50:35 -07:00
Boaz Leskes d596f5cc45 Decouple recoveries from engine flush
In order to safely complete recoveries / relocations we have to keep all operation done since the recovery start at available for replay. At the moment we do so by preventing the engine from flushing and thus making sure that the operations are kept in the translog. A side effect of this is that the translog keeps on growing until the recovery is done. This is not a problem as we do need these operations but if the another recovery starts concurrently it may have an unneededly long translog to replay. Also, if we shutdown the engine for some reason at this point (like when a node is restarted)  we have to recover a long translog when we come back.

To void this, the translog is changed to be based on multiple files instead of a single one. This allows recoveries to keep hold to the files they need while allowing the engine to flush and do a lucene commit (which will create a new translog files bellow the hood).

Change highlights:
- Refactor Translog file management to allow for multiple files.
- Translog maintains a list of referenced files, both by outstanding recoveries and files containing operations not yet committed to Lucene.
- A new Translog.View concept is introduced, allowing recoveries to get a reference to all currently uncommitted translog files plus all future translog files created until the view is closed. They can use this view to iterate over operations.
- Recovery phase3 is removed. That phase was replaying operations while preventing new writes to the engine. This is unneeded as standard indexing also send all operations from the start of the recovery  to the recovering shard. Replay all ops in the view acquired in recovery start is enough to guarantee no operation is lost.
- IndexShard now creates the translog together with the engine. The translog is closed by the engine on close. ShadowIndexShards do not open the translog.
- Moved the ownership of translog fsyncing to the translog it self, changing the responsible setting to `index.translog.sync_interval` (was `index.gateway.local.sync`)

Closes #10624
2015-04-30 23:42:50 +03:00
Adrien Grand e5be85d586 Aggs: Change the default `min_doc_count` to 0 on histograms.
The assumption is that gaps in histogram are generally undesirable, for instance
if you want to build a visualization from it. Additionally, we are building new
aggregations that require that there are no gaps to work correctly (eg.
derivatives).
2015-04-30 15:48:23 +02:00
Colin Goodheart-Smithe 969f53e399 fix typo in Min bucket aggregation docs 2015-04-30 14:41:01 +01:00
Colin Goodheart-Smithe d16bf992a9 Aggregations: min_bucket aggregation
An aggregation to calculate the minimum value in a set of buckets.

Closes #9999
2015-04-30 13:34:21 +01:00
Zachary Tong 351a4d3315 [DOCS] Fix movavg images and naming 2015-04-29 13:33:54 -04:00
Colin Goodheart-Smithe 57a8885964 Merge branch 'master' into feature/aggs_2_0
# Conflicts:
#	src/main/java/org/elasticsearch/index/query/CommonTermsQueryBuilder.java
#	src/main/java/org/elasticsearch/search/aggregations/AggregationModule.java
#	src/main/java/org/elasticsearch/search/aggregations/AggregatorFactories.java
#	src/main/java/org/elasticsearch/search/aggregations/AggregatorParsers.java
#	src/main/java/org/elasticsearch/search/aggregations/InternalMultiBucketAggregation.java
#	src/main/java/org/elasticsearch/search/aggregations/bucket/nested/NestedAggregator.java
#	src/main/java/org/elasticsearch/search/aggregations/metrics/InternalNumericMetricsAggregation.java
#	src/test/java/org/elasticsearch/search/aggregations/bucket/nested/NestedAggregatorTest.java
2015-04-29 15:49:41 +01:00
Adrien Grand 6e076efdb9 Docs: Add documentation for the `doc_values` setting on the `boolean` field type.
Close #10431
2015-04-29 15:59:24 +02:00
Clinton Gormley 7aa4c7e256 Docs: Removed a reference to index_name from the array mapping page 2015-04-29 15:12:31 +02:00
Antonio Bonuccelli ab83eb036b Docs: adding missing single quote on PUT index request
Closes #10876
2015-04-29 14:45:25 +02:00
Simon Willnauer 94d8b20611 Add multi data.path to migration guide
this commit removes the obsolete settings for distributors and updates
the documentation on multiple data.path. It also adds an explain to the
migration guide.

Relates to #9498
Closes #10770
2015-04-29 11:51:37 +02:00
aleph-zero 1d60f34944 Remove all doc references to (dfs_)query_and_fetch
Removes references to (dfs_)query_and_fetch as possible ‘search_type’
parameters for the REST API.
2015-04-28 15:57:46 -07:00
aleph-zero 89542facb3 Remove (dfs_)query_and_fetch from the REST API
Remove the ability to specify search type ‘query_and_fetch’ and
‘df_query_and_fetch’ from the REST API.

- Adds REST tests
- Updates REST API spec to remove ‘query_and_fetch’ and
‘df_query_and_fetch’ as options
- Removes documentation for these options

Closes #9606
2015-04-28 15:27:59 -07:00
Ryan Ernst bf09e58cb3 Mappings: Remove includes and excludes from _source
Regardless of the outcome of #8142, we should at least enforce that
when _source is enabled, it is sufficient to reindex. This change
removes the excludes and includes settings, since these modify
the source, causing us to lose the ability to reindex some fields.

closes #10814
2015-04-28 15:03:51 -07:00
Lee Hinman 04f6067c66 Merge branch 'pr/10845' 2015-04-28 09:13:26 -06:00
Nik Everett cb89a14010 Add default to field_value_factor
field_value_factor now takes a default that is used if the document doesn't
have a value for that field. It looks like:
"field_value_factor": {
  "field": "popularity",
  "missing": 1
}

Closes #10841
2015-04-28 11:06:24 -04:00
minde-eagleeye a1289b4ad5 Docs: Update cluster.asciidoc
added a missing comma in one of examples

Closes #10834
2015-04-28 11:48:08 +02:00
javanna c914134355 Scripting: remove groovy sandbox
Groovy sandboxing was disabled by default from 1.4.3 on though since we found out that it could be worked around, so it makes little sense to keep it and maintain it.

Closes #10156
Closes #10480
2015-04-28 11:27:50 +02:00
Jun Ohtani 933edf7bcc Analysis: Fix wrong position number by analyze API
Add breaking chages comment to migrate docs
Fix the stopword included text using stopword filter
2015-04-28 17:44:41 +09:00
Zachary Tong bf9739d0f0 [DOCS] review comment fixes 2015-04-27 14:40:04 -04:00
Simon Willnauer d164526d27 Remove `_shutdown` API
Thsi commit removes the `_shutdown` API entirely without any replacement.
Nodes should be managed from the operating system not via REST APIs
2015-04-27 17:19:36 +02:00
Clinton Gormley 089914dede Docs: Document `http.max_header_size`
Closes #10752
2015-04-27 15:59:27 +02:00
Clinton Gormley ba4ec6bca5 Docs: Updated current version 2015-04-27 13:45:35 +02:00
markharwood 1b8b993912 Query enhancement: Enable Lucene ranking behaviour for queries on numeric fields.
This changes the default ranking behaviour of single-term queries on numeric fields to use the usual Lucene TermQuery scoring logic rather than a constant-scoring wrapper.

Closes #10628
2015-04-27 09:42:55 +01:00
navins 84636557e1 Docs: correct three mis-match of brackets
Closes #10806
2015-04-26 19:43:14 +02:00
Christine 9e81e4c09b Docs: Update bool-filter.asciidoc
from, to deprecated in favour of gt, lt

Closes #10682
2015-04-26 19:23:11 +02:00
Clinton Gormley 37ed61807f Docs: Updated the experimental annotations in the docs as follows:
* Removed the docs for `index.compound_format` and `index.compound_on_flush` - these are expert settings which should probably be removed (see https://github.com/elastic/elasticsearch/issues/10778)
* Removed the docs for `index.index_concurrency` - another expert setting
* Labelled the segments verbose output as experimental
* Marked the `compression`, `precision_threshold` and `rehash` options as experimental in the cardinality and percentile aggs
* Improved the experimental text on `significant_terms`, `execution_hint` in the terms agg, and `terminate_after` param on count and search
* Removed the experimental flag on the `geobounds` agg
* Marked the settings in the `merge` and `store` modules as experimental, rather than the modules themselves

Closes #10782
2015-04-26 18:49:15 +02:00
Clinton Gormley f1a0e2216a Docs: Mentioned script_id and script_file parameters across all aggs
Closes #10760
2015-04-26 17:30:38 +02:00
Mark Mulder 690c16e81a Docs: Fix minor spelling mistakes in Match Query doc
Closes #10751
2015-04-26 16:29:41 +02:00
Clinton Gormley 7de8b7008e Docs: Tidied docs for field-stats 2015-04-26 15:52:02 +02:00
Mehdi Mollaverdi dce920b75f Docs: The name of scroll ID attribute in the response is "_scroll_id" rather than "scroll_id"
Closes #10691
2015-04-25 19:32:32 +02:00
Clinton Gormley cf177c32d4 Docs: Fixed pattern-capture token filter example
Closes #10690
2015-04-25 19:27:55 +02:00
Clinton Gormley 2579cc31b1 Docs: Note that include_in_parent/root does not apply to geo-shape fields
Closes #10653
2015-04-25 16:49:49 +02:00
Tanguy Leroux f7d4baacfb Remove working directory
This commit removes the working directory and its associated environment variable "WORK_DIR"
2015-04-25 13:08:36 +02:00
Oliver Eilhard 95e9b86505 Mustache tags syntax
Hi there. I've been experimenting with the search templates recently and I'm a bit confused. Shouldn't the Mustache tags be written like `{{tagname}}` instead of `{tagname}`? Your using `{{...}}` [here](http://www.elastic.co/guide/en/elasticsearch/reference/current/search-template.html) BTW.

Using the first example in that page  seems to indicate that something's wrong, or am I missing something?

```
$ curl 'localhost:9200/test/_search' -d '{"query":{"template":{"query":{"match":{"text":"{keywords}"}},"params":{"keywords":"value1_foo"}}}}'
{"took":1,"timed_out":false,"_shards":{"total":1,"successful":1,"failed":0},"hits":{"total":0,"max_score":null,"hits":[]}}

$ curl 'localhost:9200/test/_search' -d '{"query":{"template":{"query":{"match":{"text":"{{keywords}}"}},"params":{"keywords":"value1_foo"}}}}'
{"took":1,"timed_out":false,"_shards":{"total":1,"successful":1,"failed":0},"hits":{"total":1,"max_score":1.0,"hits":[{"_index":"test","_type":"testtype","_id":"1","_score":1.0,"_source":{"text":"value1_foo"}}]}}
```
2015-04-24 21:23:58 +02:00
Ryan Ernst 1f5bdca8cc Mappings: Restrict murmur3 field type to sane options
Disabling doc values or trying to index hash values are not
correct uses of this the murmur3 field type, and just cause
problems.  This disallows changing doc values or index options
for 2.0+.

closes #10465
2015-04-23 21:48:42 -07:00
Benoit Delbosc 4a94e1f14b Docs: Warning about the conflict with the Standard Tokenizer
The examples given requires a specific Tokenizer to work.

Closes: 10645
2015-04-23 21:16:30 +02:00
Igor Motov 60721b2a17 Snapshot/Restore: remove obsolete expand_wildcards_open and expand_wildcards_close options
In #6097 we made snapshot/restore index option consistent with other API. Now we can remove old style options from master.

Closes #10743
2015-04-23 13:29:24 -04:00
Mal Curtis 9eabcd7c0f Docs: Fix missing comma in context suggester docs
Closes #10623
2015-04-23 14:04:46 +02:00
Alexander dbbfe39415 [Docs] fix typo in scripting module
Closes #10622
2015-04-23 14:00:44 +02:00
Martijn van Groningen dbeb4aaacf docs: make sure that the options are rendered correctly 2015-04-23 10:50:01 +02:00
Martijn van Groningen 6a2f9c2682 docs: fixed title out of sequence 2015-04-23 09:57:31 +02:00
Martijn van Groningen 5705537ecf Added field stats api
The field stats api returns field level statistics such as lowest, highest values and number of documents that have at least one value for a field.

An api like this can be useful to explore a data set you don't know much about. For example you can figure at with the lowest and highest response times are, so that you can create a histogram or range aggregation with sane settings.

This api doesn't run a search to figure this statistics out, but rather use the Lucene index look these statics up (using Terms class in Lucene). So finding out these stats for fields is cheap and quick.

The min/max values are based on the type of the field. So for a numeric field min/max are numbers and date field the min/max date and other fields the min/max are term based.

Closes #10523
2015-04-23 08:52:34 +02:00
Zachary Tong e08e45cee8 [DOCS] Add link to movavg page 2015-04-22 18:59:39 -04:00
Zachary Tong a03cefcece [DOCS] Add documentation for moving average 2015-04-22 18:59:39 -04:00
Lee Hinman a4f98e7400 [DOCS] Add example of setting disk threshold decider settings
Fixes #10686
2015-04-22 11:53:19 -06:00
Clinton Gormley a60571c597 Docs: Removed some unused callout from the scroll docs 2015-04-22 12:49:06 +02:00
Jun Ohtani 0955c127c0 Rest: Add json in request body to scroll, clear scroll, and analyze API
Change analyze.asciidoc and scroll.asciidoc
Add json support to Analyze and Scroll, and clear scrollAPI
Add rest-api-spec/test

Closes #5866
2015-04-22 17:53:20 +09:00
Nicholas Knize 453217fd7a [GEO] Prioritize tree_level and precision parameters over default distance_error_pct
If a user explicitly defined the tree_level or precision parameter in a geo_shape mapping their specification was always overridden by the default_error_pct parameter (even though our docs say this parameter is a 'hint'). This lead to unexpected accuracy problems in the results of a geo_shape filter. (example provided in issue #9691)

This simple patch fixes the unexpected behavior by setting the default distance_error_pct parameter to zero when the tree_level or precision parameters are provided by the user. Under the covers the quadtree will now use the tree level defined by the user. The docs will be updated to alert the user to exercise caution with these parameters.  Specifying a precision of "1m" for an index using large complex shapes can quickly lead to OOM issues.

closes #9691
2015-04-21 14:42:10 -05:00
Colin Goodheart-Smithe bd28c9c44e Documentation for the max_bucket reducer 2015-04-21 15:06:20 +01:00
Colin Goodheart-Smithe be647a89d3 Documentation for the derivative reducer 2015-04-21 15:06:20 +01:00
Colin Goodheart-Smithe 0f4b7f3b5c Added section for reducer aggregations in the main aggregation docs page 2015-04-21 15:06:19 +01:00
Adrien Grand d7abb12100 Replace deprecated filters with equivalent queries.
In Lucene 5.1 lots of filters got deprecated in favour of equivalent queries.
Additionally, random-access to filters is now replaced with approximations on
scorers. This commit
 - replaces the deprecated NumericRangeFilter, PrefixFilter, TermFilter and
   TermsFilter with NumericRangeQuery, PrefixQuery, TermQuery and TermsQuery,
   wrapped in a QueryWrapperFilter
 - replaces XBooleanFilter, AndFilter and OrFilter with a BooleanQuery in a
   QueryWrapperFilter
 - removes DocIdSets.isBroken: the new two-phase iteration API will now help
   execute slow filters efficiently
 - replaces FilterCachingPolicy with QueryCachingPolicy

Close #8960
2015-04-21 15:32:43 +02:00
markharwood 63db34f649 New feature - Sampler aggregation used to limit any nested aggregations' processing to a sample of the top-scoring documents.
Optionally, a “diversify” setting can limit the number of collected matches that share a common value such as an "author".

Closes #8108
2015-04-21 10:22:05 +01:00
Adrien Grand f4d5914511 Docs: Warn about the fact that min_doc_count=0 might return terms that only belong to different types. 2015-04-21 00:57:57 +02:00
Honza Král e929c1560d [DOCS] Be explicit about scan doing no scoring 2015-04-20 18:05:45 +02:00
Tanguy Leroux b3d91b1cbb Doc: Change the wording a bit for the HOSTNAME environment variable
I should have done this while merging #9474.
2015-04-17 10:24:50 +02:00