1655 Commits

Author SHA1 Message Date
Alexander Reelsen
1fa21a76cf Documentation: Fix elasticsearch documentation build
The commit for closing #11033 was not building the asciidoc
documentation.
2015-05-26 18:16:12 +02:00
Alexander Reelsen
045f01c085 Infra for deprecation logging
Add support for a specific deprecation logging that can be used to turn
on in order to notify users of a specific feature, flag, setting,
parameter, ... being deprecated.

The deprecation logger logs with a "deprecation." prefix logge
(or "org.elasticsearch.deprecation." if full name is used), and outputs
the logging to a dedicated deprecation log file.

Deprecation logging are logged under the DEBUG category. The idea is not to
enabled them by default (under WARN or ERROR) when running embedded in
another application.

By default they are turned off (INFO), in order to turn it on, the
"deprecation" category need to be set to DEBUG. This can be set in the
logging file or using the cluster update settings API, see the documentation

Closes #11033
2015-05-26 17:44:52 +02:00
Tanguy Leroux
ce63590bd6 API: Add response filtering with filter_path parameter
This change adds a new "filter_path" parameter that can be used to filter and reduce the responses returned by the REST API of elasticsearch.

For example, returning only the shards that failed to be optimized:
```
curl -XPOST 'localhost:9200/beer/_optimize?filter_path=_shards.failed'
{"_shards":{"failed":0}}%
```

It supports multiple filters (separated by a comma):
```
curl -XGET 'localhost:9200/_mapping?pretty&filter_path=*.mappings.*.properties.name,*.mappings.*.properties.title'
```

It also supports the YAML response format. Here it returns only the `_id` field of a newly indexed document:
```
curl -XPOST 'localhost:9200/library/book?filter_path=_id' -d '---hello:\n  world: 1\n'
---
_id: "AU0j64-b-stVfkvus5-A"
```

It also supports wildcards. Here it returns only the host name of every nodes in the cluster:
```
curl -XGET 'http://localhost:9200/_nodes/stats?filter_path=nodes.*.host*'
{"nodes":{"lvJHed8uQQu4brS-SXKsNA":{"host":"portable"}}}
```

And "**" can be used to include sub fields without knowing the exact path. Here it returns only the Lucene version of every segment:
```
curl 'http://localhost:9200/_segments?pretty&filter_path=indices.**.version'
{
  "indices" : {
    "beer" : {
      "shards" : {
        "0" : [ {
          "segments" : {
            "_0" : {
              "version" : "5.2.0"
            },
            "_1" : {
              "version" : "5.2.0"
            }
          }
        } ]
      }
    }
  }
}
```

Note that elasticsearch sometimes returns directly the raw value of a field, like the _source field. If you want to filter _source fields, you should consider combining the already existing _source parameter (see Get API for more details) with the filter_path parameter like this:

```
curl -XGET 'localhost:9200/_search?pretty&filter_path=hits.hits._source&_source=title'
{
  "hits" : {
    "hits" : [ {
      "_source":{"title":"Book #2"}
    }, {
      "_source":{"title":"Book #1"}
    }, {
      "_source":{"title":"Book #3"}
    } ]
  }
}
```
2015-05-26 13:51:04 +02:00
Britta Weber
eeeb29f900 spell correct and add single quotes 2015-05-26 11:41:19 +02:00
Britta Weber
37782c1745 analyzers: custom analyzers names and aliases must not start with _
closes #9596
2015-05-26 11:38:15 +02:00
Boaz Leskes
b376a3fbfb Move index sealing terminology to synced flush
#10032 introduced the notion of sealing an index by marking it with a special read only marker, allowing for a couple of optimization to happen. The most important one was to speed up recoveries of shards where we know nothing has changed since they were online by skipping the file based sync phase. During the implementation we came up with a light notion which achieves the same recovery benefits but without the read only aspects which we dubbed synced flush. The fact that it was light weight and didn't put the index in read only mode, allowed us to do it automatically in the background which has great advantage. However we also felt the need to allow users to manually trigger this operation.

 The implementation at #11179 added the sync flush internal logic and the manual (rest) rest API. The name of the API was modeled after the sealing terminology which may end up being confusing. This commit changes the API name to match the internal synced flush naming, namely `{index}/_flush/synced'.

  On top of that it contains a couple other changes:
   - Remove all java client API. This feature is not supposed to be called programtically by applications but rather by admins.
   - Improve rest responses making structure similar to other (flush) API
   - Change IndexShard#getOperationsCount to exclude the internal +1 on open shard . it's confusing to get 1 while there are actually no ongoing operations
   - Some minor other clean ups
2015-05-25 22:32:32 +03:00
Alex Chan
e31049988b [Docs] Fix minor spelling errors
Closes #11320
2015-05-25 19:56:43 +02:00
Eduardo Gurgel
0f3b3c0787 Docs: Fix typo on percolate_format description
Closes #11215
2015-05-25 13:17:59 +02:00
Clinton Gormley
4d27d751fb Docs: Move the page on facets into redirects.asciidoc 2015-05-24 23:34:23 +02:00
Clinton Gormley
6171ae6cc4 Docs: Added stub entries for pages deleted from 1.x 2015-05-24 17:57:34 +02:00
Clinton Gormley
4b854d10bd Docs: Tidied up the field statistics docs 2015-05-24 15:12:44 +02:00
Britta Weber
4d0b40ca52 Merge pull request #11235 from nik9000/seal_docs
Rewrote some _seal documentation
2015-05-22 18:24:23 +02:00
Clinton Gormley
cde2c91b5a Docs: Example blocks can't contain warnings 2015-05-22 17:37:58 +02:00
Clinton Gormley
631e03c872 Docs: Tidied up term vectors docs
Moved annotations out of titles
Made the example titles into example blocks
2015-05-22 17:19:12 +02:00
Nik Everett
6da1e858dc Rewrote some _seal documentation
The first two paragraphs were confusing to me so I tried to rewrite them.

I removed some passive voice because it irks me.
2015-05-22 10:51:21 -04:00
Clinton Gormley
20279a2556 Docs: Rename reference docs to Elasticsearch Reference 2015-05-22 14:49:11 +02:00
Adrien Grand
42f9053817 Merge pull request #11280 from jpountz/fix/remove_binary_compress
Mappings: Remove the `compress`/`compress_threshold` options of the BinaryFieldMapper.
2015-05-22 14:21:13 +02:00
Adrien Grand
461683ac58 Mappings: Remove the compress/compress_threshold options of the BinaryFieldMapper.
This option is broken currently since it potentially interprets an incoming
binary value as compressed while it just happens that the first bytes are the
same as the LZF header.
2015-05-22 14:20:42 +02:00
Colin Goodheart-Smithe
35deb7efea Aggregations: Renaming reducers to Pipeline Aggregators 2015-05-21 14:57:23 +01:00
Igor Motov
dd41c68741 Snapshot/Restore: fix FSRepository location configuration
Closes #11068
2015-05-20 22:14:31 -04:00
Lee Hinman
0a6f7ef379 [DOCS] Mention Integer.MAX_VALUE limit for http.max_content_length
Fixes #11244
2015-05-20 13:08:59 -06:00
Clinton Gormley
5e4d5e1c64 Docs: Included the index-seal docs in the indices section 2015-05-20 11:20:12 +02:00
Simon Willnauer
488be75d19 Add some words about the purpose of a seal etc. 2015-05-19 12:26:08 +02:00
Simon Willnauer
9d2852f0ab Merge branch 'master' into feature/synced_flush
Conflicts:
	src/main/java/org/elasticsearch/index/engine/InternalEngine.java
	src/main/java/org/elasticsearch/index/shard/IndexShard.java
	src/main/java/org/elasticsearch/indices/recovery/RecoverySourceHandler.java
	src/test/java/org/elasticsearch/index/engine/InternalEngineTests.java
2015-05-19 12:16:22 +02:00
Adrien Grand
2c241e8a36 Mappings: Remove the ignore_conflicts option.
Mappings conflicts should not be ignored. If I read the history correctly, this
option was added when a mapping update to an existing field was considered a
conflict, even if the new mapping was exactly the same. Now that mapping updates
are smart enough to detect conflicting options, we don't need an option to
ignore conflicts.
2015-05-18 15:28:23 +02:00
javanna
a843008b17 Highlighting: require_field_match set to true by default
The default `false` for `require_field_match` is a bit odd and confusing for users, given that field names get ignored by default and every field gets highlighted if it contains terms extracted out of the query, regardless of which fields were queries. Changed the default to `true`, it can always be changed per request.

Closes #10627
Closes #11067
2015-05-15 21:38:45 +02:00
Clinton Gormley
9d71816cd2 Docs: Fixed explanation of AUTO fuzziness
Closes #11186
2015-05-15 21:25:11 +02:00
Scott Chamberlain
ae599f93a7 Added an R client to community clients page
I maintain an R client called `elastic` at https://github.com/ropensci/elastic
2015-05-15 21:25:11 +02:00
javanna
46c521f7ec Highlighting: nuke XPostingsHighlighter
Our own fork of the lucene PostingsHighlighter is not easy to maintain and doesn't give us any added value at this point. In particular, it was introduced to support the require_field_match option and discrete per value highlighting, used in case one wants to highlight the whole content of a field, but get back one snippet per value. These two features won't
 make it into lucene as they slow things down and shouldn't have been supported from day one on our end probably.

One other customization we had was support for a wider range of queries via custom rewrite etc. (yet another way to slow
 things down), which got added to lucene and works much much better than what we used to do (instead of or rewrite, term
s are pulled out of the automata for multi term queries).

Removing our fork means the following in terms of features:
- dropped support for require_field_match: the postings highlighter will only highlight fields that were queried
- some custom es queries won't be supported anymore, meaning they won't be highlighted. The only one I found up until now is the phrase_prefix. Postings highlighter rewrites against an empty reader to avoid slow operations (like the ones that we were performing with the fork that we are removing here), thus the prefix will not be expanded to any term. What the postings highlighter does instead is pulling the automata out of multi term queries, but this is not supported at the moment with our MultiPhrasePrefixQuery.

Closes #10625
Closes #11077
2015-05-15 20:41:33 +02:00
Clinton Gormley
3a69b65e88 Docs: Fixed the backslash escaping on the pattern analyzer docs
Closes #11099
2015-05-15 18:40:16 +02:00
Jun Ohtani
597c53a0bb Add migrationi note for AnalyzeRequest 2015-05-16 00:25:53 +09:00
Adrien Grand
bf599d68dd Merge pull request #11042 from jpountz/feature/aggs_missing
Aggs: Make it possible to configure missing values.
2015-05-15 16:33:29 +02:00
Adrien Grand
32e23b9100 Aggs: Make it possible to configure missing values.
Most aggregations (terms, histogram, stats, percentiles, geohash-grid) now
support a new `missing` option which defines the value to consider when a
field does not have a value. This can be handy if you eg. want a terms
aggregation to handle the same way documents that have "N/A" or no value
for a `tag` field.

This works in a very similar way to the `missing` option on the `sort`
element.

One known issue is that this option sometimes cannot make the right decision
in the unmapped case: it needs to replace all values with the `missing` value
but might not know what kind of values source should be produced (numerics,
strings, geo points?). For this reason, we might want to add an `unmapped_type`
option in the future like we did for sorting.

Related to #5324
2015-05-15 16:26:58 +02:00
Martijn van Groningen
719252a138 Merge pull request #11183 from martijnvg/parent-child/remove_id_cache_from_stats_and_clear_cache_apis
Removed `id_cache` from stats and cat apis.
2015-05-15 14:39:35 +02:00
Martijn van Groningen
ece18f162e Removed id_cache from stats and cat apis.
Also removed the `id_cache` option from the clear cache api.

Closes #5269
2015-05-15 14:06:18 +02:00
Jun Ohtani
3a1a4d3e89 Analysis: Add multi-valued text support
Add support array text as a multi-valued for AnalyzeRequestBuilder
Add support array text as a multi-valued for Analyze REST API
Add docs

Closes #3023
2015-05-15 20:01:10 +09:00
Britta Weber
7a8d08a4a3 Merge remote-tracking branch 'origin/master' into feature/synced_flush 2015-05-15 10:35:36 +02:00
Lee Hinman
179dad69b6 [DOCS] Add DNS SRV discovery plugin 2015-05-14 16:02:59 -06:00
Areek Zillur
7efc43db25 Re-structure collate option in PhraseSuggester to only collate on local shard.
Previously, collate feature would be executed on all shards of an index using the client,
this leads to a deadlock when concurrent collate requests are run from the _search API,
due to the fact that both the external request and internal collate requests use the
same search threadpool.

As phrase suggestions are generated from the terms of the local shard, in most cases the
generated suggestion, which does not yield a hit for the collate query on the local shard
would not yield a hit for collate query on non-local shards.

Instead of using the client for collating suggestions, collate query is executed against
the ContextIndexSearcher. This PR removes the ability to specify a preference for a collate
query, as the collate query is only run on the local shard.

closes #9377
2015-05-14 17:21:53 -04:00
Jack Conradson
a5c0ac0d67 Scripting: Add Multi-Valued Field Methods to Expressions
Add methods to operate on multi-valued fields in the expressions language.
Note that users will still not be able to access individual values
within a multi-valued field.

The following methods will be included:

* min
* max
* avg
* median
* count
* sum

Additionally, changes have been made to MultiValueMode to support the
new median method.

closes #11105
2015-05-14 08:27:24 -07:00
Britta Weber
2b03a03c0c Merge remote-tracking branch 'origin/master' into feature/synced_flush 2015-05-13 18:00:18 +02:00
Britta Weber
f1948cf95c doc for seal api and doc for syned flush in general 2015-05-13 15:43:05 +02:00
Adrien Grand
630757906a Query DSL: Add filter clauses to bool queries.
These clauses filter the document space without affecting scoring and map to
Lucene's BooleanClause.Occur.FILTER. The `filtered` query is now deprecated and

```json
{
  "filtered": {
    "query": { //query },
    "filter": { //filter }
  }
}
```
should be replaced with
```json
{
  "bool": {
    "must": { //query },
    "filter": { //filter }
  }
}
```
2015-05-13 12:04:56 +02:00
Ryan Ernst
f766b260ba Add tests for includeInObject backcompat 2015-05-12 23:11:15 -07:00
Ryan Ernst
565ffb16f1 Mappings: Remove ability to set meta fields inside documents
A few meta fields can currently be set within a document's source.
However, the recommended way to set meta fields like this is through
the api, and setting within the document can be a performance trap
(e.g. needing to find _id in order to route the document).

This change removes the ability to set meta fields within
a document source for 2.0+ indexes.

closes #11051
closes #11074
2015-05-12 23:09:03 -07:00
Igor Motov
d6efe1e508 Docs: Add information about restoring to a different cluster 2015-05-12 20:59:24 -04:00
Ryan Ernst
e7618b8528 Settings: Remove file based index templates
As a follow up to #10870, this removes support for
index templates on disk. It also removes a missed
place still allowing disk based mappings.

closes #11052
2015-05-11 12:51:22 -07:00
javanna
36c373e615 [DOCS] documented missing query_string parameters for count, exists, search & validate_query
relates to #11057
2015-05-11 12:58:30 +02:00
Martijn van Groningen
acdd9a5dd9 parent/child: Removed the top_children query. 2015-05-10 16:30:19 +02:00
Lee Hinman
459a05168c Merge remote-tracking branch 'refs/remotes/dakrone/truncate-loglines' 2015-05-08 10:11:26 -06:00