OpenSearch

Commit Graph

Author	SHA1	Message	Date
Lee Hinman	a754224751	Add field data memory circuit breaker. This adds the field data circuit breaker, which is used to estimate the amount of memory required to load field data before loading it. It then raises a CircuitBreakingException if the limit is exceeded. It is configured with two parameters: `indices.fielddata.cache.breaker.limit` - the maximum number of bytes of field data to be loaded before circuit breaking. Defaults to `indices.fielddata.cache.size` if set, unbounded otherwise. `indices.fielddata.cache.breaker.overhead` - a constant for all field data estimations to be multiplied with before aggregation. Defaults to 1.03. Both settings can be configured dynamically using the cluster update settings API.	2014-01-02 15:04:47 -07:00
Martijn van Groningen	aa548f5148	Remove GET `_aliases` api in favour of GET `_alias` api Currently there are two get aliases apis that both have the same functionality, but have a different response structure. The reason for having 2 apis is historic. The GET _alias api was added in 0.90.x and is more efficient since it only sends the needed alias data from the cluster state between the master node and the node that received the request. In the GET _aliases api the complete cluster state is sent to the node that received the request and then the right information is filtered out and sent back to the client. The GET _aliases api should be removed in favour of the alias api Closes #4539	2014-01-02 13:56:11 +01:00
Martijn van Groningen	f4bf0d5112	Replaced `ignore_indices` with `ignore_unavailable`, `expand_wildcards` and `allow_no_indices`. * `ignore_unavailable` - Controls whether to ignore if any specified indices are unavailable, this includes indices that don't exist or closed indices. Either `true` or `false` can be specified. * `allow_no_indices` - Controls whether to fail if a wildcard indices expression results in no concrete indices. Either `true` or `false` can be specified. For example if the wildcard expression `foo*` is specified and no indices are available that start with `foo` then depending on this setting the request will fail. This setting is also applicable when `_all`, `*` or no index has been specified. * `expand_wildcards` - Controls to what kind of concrete indices wildcard indices expressions expand. If `open` is specified then the wildcard expression is expanded to only open indices and if `closed` is specified then the wildcard expression is expanded only to closed indices. Also both values (`open,closed`) can be specified to expand to all indices. Closes #4436	2014-01-02 12:19:45 +01:00
Britta Weber	1ede9a5730	make term statistics accessible in scripts term statistics can be accessed via the _shard variable. Below is a minimal example. See documentation on details. ``` DELETE paytest PUT paytest { "mappings": { "test": { "_all": { "auto_boost": true, "enabled": true }, "properties": { "text": { "index_analyzer": "fulltext_analyzer", "store": "yes", "type": "string" } } } }, "settings": { "analysis": { "analyzer": { "fulltext_analyzer": { "filter": [ "my_delimited_payload_filter" ], "tokenizer": "whitespace", "type": "custom" } }, "filter": { "my_delimited_payload_filter": { "delimiter": "+", "encoding": "float", "type": "delimited_payload_filter" } } }, "index": { "number_of_replicas": 0, "number_of_shards": 1 } } } POST paytest/test/1 { "text": "the+1 quick+2 brown+3 fox+4 is quick+10" } POST paytest/test/2 { "text": "the+1 quick+2 red+3 fox+4" } POST paytest/_refresh POST paytest/_search { "script_fields": { "ttf": { "script": "_shard[\"text\"][\"quick\"].ttf()" } } } POST paytest/_search { "script_fields": { "freq": { "script": "_shard[\"text\"][\"quick\"].freq()" } } } POST paytest/test/2/_termvector POST paytest/_search { "script_fields": { "payloads": { "script": "term = _shard[\"text\"].get(\"red\",_PAYLOADS);payloads = []; for(pos : term){payloads.add(pos.payloadAsFloat(-1));} return payloads;" } } } POST paytest/_search { "script_fields": { "tv": { "script": "_shard[\"text\"][\"quick\"].freq()" } }, "query": { "function_score": { "functions": [ { "script_score": { "script": "_shard[\"text\"][\"quick\"].freq()" } } ] } } } ``` closes #3772	2014-01-02 11:17:33 +01:00
Adrien Grand	1654ae8937	Explicit doc_values setting. Once doc values are enabled on a field, they can't be disabled. Close #4560	2013-12-30 11:10:52 +01:00
Adrien Grand	05448b6276	Doc values for geo points. This commit adds doc values support to geo points using the exact same approach as for numeric data: geo points for a given document are stored uncompressed and sequentially in a single binary doc values field. Close #4207	2013-12-27 12:45:18 +01:00
Florian Schilling	bc452dff84	* setup accurate GeoDistance Function * adapt tests * introduced default GeoDistance function * Updated docs closes #4498	2013-12-27 19:15:19 +09:00
Andrew Raines	69d88a1edd	[DOCS] Add headers and help parameters.	2013-12-23 22:26:28 -06:00
Martijn van Groningen	eb86a3a6fe	[DOCS] Changed `shape_field_name` to `path` in geo_shape filter documentation. Relates to #4486	2013-12-23 11:27:06 +01:00
Clinton Gormley	998b7b3b86	[DOCS] Fixed community links to official clients	2013-12-20 12:16:58 +01:00
Clinton Gormley	dea6b112ae	[DOCS] Corrected bloom loading docs	2013-12-20 11:20:54 +01:00
Clinton Gormley	2b8c82c883	[DOCS] Documented index.codec.bloom.load for #4525	2013-12-20 10:51:17 +01:00
Clinton Gormley	51dc057244	[DOCS] Added the official PHP client to the community page.	2013-12-20 10:51:17 +01:00
Richard Pijnenburg	df85fdf88f	Add repository information to docs This adds the apt and yum repo information to the setup docs.	2013-12-19 15:58:08 +01:00
Adrien Grand	52db8eb324	More documentation improvements for fielddata loading.	2013-12-18 16:05:35 +01:00
Adrien Grand	07443089ce	Improve documentation of the new `disabled` field data format.	2013-12-18 15:44:57 +01:00
Boaz Leskes	3c5106ae98	Added cluster health status to the Cluster Stats API Relates to #4460	2013-12-18 12:03:49 +01:00
Chris Simpson	4f8c916eed	[Docs] Fix Typo Fixes small typo in the geo_distance aggregation docs.	2013-12-18 11:21:21 +01:00
spenceralger	89e6b9cfc4	Merge pull request #4494 from spenceralger/add_js_docs JavaScript client docs	2013-12-17 14:41:57 -08:00
Spencer Alger	a8ca8497c5	added doc page for the JavaScript client, and listed it in the clients list.	2013-12-17 15:26:29 -07:00
Boaz Leskes	2b6214cff7	Added Cluster Stats API Closes #4460	2013-12-17 13:14:46 +01:00
Grégory Quatannens	c64abaae7e	Fixing typo and grammar	2013-12-17 11:39:02 +01:00
Adrien Grand	33599d9a34	Compressed geo-point field data. This commit allows to trade precision for memory when storing geo points. This new field data impl accepts a `precision` parameter that controls the maximum expected error for storing coordinates. This option can be updated on a live index with the PUT mapping API. Default precision is 1cm, which requires 8 bytes per geo-point (50% memory saving compared to using 2 doubles). Close #4386	2013-12-17 11:29:48 +01:00
Clinton Gormley	684affa5c7	[DOCS] Removed unused file	2013-12-17 11:28:19 +01:00
Alexander Reelsen	b713cf56ed	Allow to provide parameters not only through -D but as long parameters All getopt long style parameters are now set as es. properties, elasticsearch --path.data=/some/path results in -Des.path.data=/some/path Closes #4393	2013-12-17 10:43:27 +01:00
Alexander Reelsen	c30945a3d8	Start elasticsearch in the foreground by default Instead of using the '-f' parameter to start elasticsearch in the foreground, this is now the default mode. In order to start elasticsearch in the background, the '-d' parameter can be used. Closes #4392	2013-12-17 10:39:22 +01:00
Clinton Gormley	34b9b16233	[DOCS] Fixed some bad link refs	2013-12-16 18:07:33 +01:00
Martijn van Groningen	23d2b1ea7b	Renamed top level `filter` to `post_filter`. Closes #4119	2013-12-16 17:10:14 +01:00
Lee Hinman	db431b7cb3	Remove the `field` and `text` queries. The `text` query was replaced by the `match` query and has been deprecated for quite a while. The `field` query should be replaced by a `query_string` query with the `default_field` specified. Fixes #4033	2013-12-16 08:59:36 -07:00
Adrien Grand	4e7ce4ee02	Make field data changes immediately taken into account and add the ability to disallow field data loading. This commit changes field data configuration updates so that they are immediately taken into account for loading new segments. The way it works is that field data configuration is now cached separately from the field data cache, meaning that it is now possible to clear the field data configuration from IndexFieldDataService while the cache will stay around. On the next time that Elasticsearch will reload field data configuration, it will check if there is already a cache entry, and reuse it if it exists. To disable field data loading, all that is required is to change the field data format to "none" (supported by all field data types) using the update mapping API. Elasticsearch will then refuse to load field data on any new segment, but field data which has been loaded on the previous segments will remain available. So you need to clear the field data cache in order to reclaim memory (otherwise memory will be reclaimed slower, as segments get merged). Close #4430 Close #4431	2013-12-16 14:34:33 +01:00
Adrien Grand	36bd9cc432	Aggregations: Ordinals-based string bucketing support. When the ValuesSource has ordinals, terms ordinals are used as a cache key to bucket ordinals. This can make terms aggregations on String terms significantly faster. Close #4350	2013-12-13 15:34:02 +01:00
Martijn van Groningen	10e2528cce	Added the `force_source` option to highlighting that forces use of the _source even if there are stored fields. The percolator uses this option to deal with the fact that the MemoryIndex doesn't support stored fields, this is possible b/c the _source of the document being percolated is always present. Closes #4348	2013-12-13 13:39:53 +01:00
Lee Hinman	77fcf71338	Add new `simple_query_string` query type This adds support for Lucene's SimpleQueryParser by adding a new type of query called the `simple_query_string`. The `simple_query_string` query is designed to be able to parse human-entered queries without throwing any exceptions. Resolves #4159	2013-12-12 12:09:32 -07:00
Alexander Reelsen	81e13a870b	Packaging: Ensure setting of sysctl vm.max_map_count In order to be sure that memory mapped lucene directories are working one can configure the kernel about how many memory mapped areas a process may have. This commit ensures, for the debian and redhat initscripts as well as the systemd startup, that this setting is set high enough. Closes #4397	2013-12-11 09:19:22 +01:00
Boaz Leskes	99b421925f	Add wildcard support to field resolving in the Get Field Mapping API Closes #4367	2013-12-10 23:46:37 +01:00
Simon Willnauer	6c189310b9	Remove 'term_index_interval' and 'term_index_divisor' These settings are no longer relevant since they are codec / postingsformat level settings since Lucene 4.0 Closes #3912	2013-12-10 16:54:08 +01:00
Martijn van Groningen	ebf6519965	Added aggs option to percolate api documentation.	2013-12-10 14:09:37 +01:00
Lee Hinman	bc9698a347	Support 'yaml' as a format for the Analyze API Fixes #4311	2013-12-08 15:08:00 -07:00
Martijn van Groningen	8c1de501e7	Update percolator highlighting docs.	2013-12-07 16:40:49 -05:00
Adrien Grand	32eb5ffa92	[Docs] Document which encoding should be used in order to make sense of the offsets returned by the term vectors API. Close #4363	2013-12-06 22:39:08 +01:00
Lee Hinman	a1d4731137	[DOCS] Fix outdated link to wonderdog in community integration	2013-12-06 12:05:43 -07:00
Shay Banon	28eff2ba29	remove help command, list all cat commands in /_cat?h endpoint	2013-12-05 14:36:27 +01:00
Markus Fischer	2da0611dfb	[DOCS] Completion suggest: Clarify de-duplication, optimize/merge This contribution is based on the feedback given in issue #4254 and issue #4255, and should clear things up, when suggestions are being removed and not displayed anymore after deletion of data.	2013-12-05 11:10:56 +01:00
Nik Everett	8e34057bc0	Add support for combining fields to the FVH The Fast Vector Highlighter can combine matches on multiple fields to highlight a single field using `matched_fields`. This is most intuitive for multifields that analyze the same string in different ways. Example: { "query": { "query_string": { "query": "content.plain:running scissors", "fields": ["content"] } }, "highlight": { "order": "score", "fields": { "content": { "matched_fields": ["content", "content.plain"], "type" : "fvh" } } } } Closes #3750	2013-12-03 11:10:01 +01:00
Yousef	302c762d5e	Wrong link to Token Filter	2013-12-03 10:39:13 +01:00
Nik Everett	7690b40ec6	Allow string fields to store token counts To use this one you send a string to a field of type 'token_count'. This makes the most sense with a multi-field.	2013-12-03 09:39:32 +01:00
Alexander Reelsen	6528df2764	[DOCS] Test framework documentation The java test framework using randomized testing is explained with a couple of examples.	2013-12-02 18:01:45 +01:00
Clinton Gormley	7d993fd917	[DOCS] Another cat?v change	2013-12-02 15:30:49 +01:00
Clinton Gormley	5b15ed73fa	[DOCS] Linked cat-pending to cluster-pending	2013-12-02 15:29:47 +01:00
Clinton Gormley	992b2d82b0	[DOCS] Changed the _cat docs to use ?v instead of ?v=true	2013-12-02 15:27:41 +01:00
Clinton Gormley	d9a480c97a	[DOCS] Typos in aggregations	2013-12-02 15:14:25 +01:00
Conrad Pankoff	87246af256	[DOCS] Fixed typos and corrected grammar	2013-12-02 10:08:26 +01:00
uboness	cdc7dfbb2c	Changed the "script_lang" parameter to "lang" in all value source based aggs - to be consistent with all other script based APIs.	2013-12-02 02:01:03 +01:00
Clinton Gormley	bc393b6d79	Changed the minScore comparator from > to >= Closes #4303	2013-11-29 20:29:20 +01:00
uboness	0d6a35b9a7	- Added support for term filtering based on include/exclude regex on the terms agg - Added javadoc to the TermsBuilder Closes #4267	2013-11-29 13:46:48 +01:00
uboness	afb0d119e4	- Added docs for the value_count aggregation - Fixed typos in the terms facets docs - Fixed aggregation docs layout - Added docs for shard_size in term aggregation	2013-11-29 12:35:42 +01:00
Clinton Gormley	b48344f296	[DOCS] Doc'ed cluster pending tasks	2013-11-29 08:21:26 +01:00
Andrew Raines	91999e14ce	Add _cat/pending_tasks. Closes #4251.	2013-11-29 01:09:06 -06:00
Lee Hinman	9939e81d88	[DOCS] Fix porter stem filter name in other stemming docs	2013-11-28 22:14:47 -07:00
Lee Hinman	fb4e903e35	[DOCS] Fix name of porter stemming token filter	2013-11-28 22:01:19 -07:00
Clinton Gormley	6ce3495029	[DOCS] Fixed a bad link	2013-11-27 17:54:25 +01:00
Clinton Gormley	cdc1935b6e	[DOCS] Documented rest.action.multi.allow_explicit_index	2013-11-27 17:33:09 +01:00
Boaz Leskes	c63d8c4fb5	[Docs] Added _source filtering to documentation Relates to #3301	2013-11-26 19:16:24 +01:00
Britta Weber	dbef64009f	[DOC] add doc for multi term vector api closes #3998	2013-11-26 17:03:14 +01:00
Alexander Reelsen	bf74f49fdd	Updated Analyzing/Fuzzysuggester from lucene trunk * Minor alignments (like setter to ctor) * FuzzySuggester has a unicode aware flag, which is not exposed in the fuzzy completion request parameters * Made XAnalyzingSuggester flags (PAYLOAD_SEP, END_BYTE, SEP_LABEL) to be written into the postings format, so we can retain backwards compatibility * The above change also implies, that these flags can be set per instantiated XAnalyzingSuggester * CompletionPostingsFormatTest now uses a randomProvider for writing data to check for bwc	2013-11-26 12:52:06 +01:00
Martijn van Groningen	a03556daa0	Added execution option to `range` filter, with the `index` and `fielddata` as values. Deprecated `numeric_range` filter in favor for the `range` filter with `fielddata` as execution. Closes #4034	2013-11-25 23:43:40 +01:00
uboness	c7f6c5266d	initial commit of the aggregations module Closes #3300	2013-11-24 03:13:08 -08:00
Jun Ohtani	7bbe453273	[DOCS] Added elasticsearch-extended-analyze plugin	2013-11-21 09:48:00 +01:00
Clinton Gormley	7c59ed4087	[DOCS] Fixed duplicate docs ID in delete	2013-11-21 17:38:51 +11:00
Shay Banon	a9880dcbf1	add timeout doc to delete	2013-11-20 12:50:03 -08:00
Matt Weber	a841a422f6	Add a field data based TermsFilter Add FieldDataTermsFilter that compares terms out of the fielddata cache. When filtering on a large set of terms this filter can be considerably faster than using a standard lucene terms filter. Add the "fielddata" execution mode to the terms filter parser to enable the use of the new FieldDataTermsFilter. Add supporting tests and documentation. Closes #4209	2013-11-19 19:18:16 +01:00
Andrew Raines	8fabeb1c0b	First pass at cat docs.	2013-11-14 21:37:02 -05:00
Andrew Raines	5c085c1204	Fix misspellings.	2013-11-14 20:10:36 -05:00
Luca Cavanna	0aaa39d00a	Minor improvements to indices filter and query & updated docs Slightly simplified indices filter and query parsers code Trimmed down tests where possible	2013-11-14 17:25:34 +01:00
Olivier Favre	fa80ca97b2	Indices query/filter skip parsing altogether for irrelevant indices when possible Closes #2416	2013-11-14 17:24:49 +01:00
Igor Motov	510397aecd	Initial implementation of Snapshot/Restore API Closes #3826	2013-11-10 18:26:56 -05:00
Lee Hinman	f7d5d1e5c9	[DOCS] Update store docs to indicate mmapfs is now the default on 64-bit Linux	2013-11-09 11:42:43 -07:00
Clinton Gormley	5af4e02d6c	[DOCS] Fix link to statsd plugin Fixes #4128	2013-11-08 20:29:51 +01:00
Clinton Gormley	7189310764	In ctor of GeoPointFieldMapper, geohash_prefix now implicitly enables geohash option Also improved docs for geopoint type and geohash_cell filter Closes #3951	2013-11-08 13:52:17 +01:00
Cory G Watson	6bbcc34061	Add wabisabi to Scala clients.	2013-11-08 10:34:14 +01:00
Clinton Gormley	b27976fbed	[DOCS] Fixed the fielddata regex example on core mapping	2013-11-07 17:09:18 +01:00
Clinton Gormley	3465e69e83	[DOCS] Changed all store:yes/no to store:true/false which is how this setting is stored internally	2013-11-07 16:57:18 +01:00
Simon Willnauer	77bc5d5ecf	release [1.0.0.Beta1]	2013-11-06 15:32:43 +01:00
Simon Willnauer	9654631186	Change 'standard' analyzer to use empty stopword list by default. The 'default' / 'standard' analyzer can be a trappy default since it filters english stopwords by default. Yet a default should not be dedicated to a certain language since elasticsearch is used in many different scenarios where a standard analysis chain with specialization to english full-text might be rather counter productive. This commit changes the 'standard' analyzer to use an empty stopword list for indices that are created from 1.0.0.Beta1 version onwards but will maintain backwards compatibility for older indices. Closes #3775	2013-11-05 21:07:21 +01:00
Shay Banon	7c32269f4f	Dist. Percolation: Use .percolator instead of _percolator for type name Use .percolator as the internal (hidden) type name for percolators within the index. Seems nicer name to represent "hidden" types within an index. closes #4090	2013-11-05 20:02:59 +01:00
Boaz Leskes	a9fdcadf01	[DOCS] Added documentation for the keep word token filter	2013-11-04 18:38:44 +01:00
Clinton Gormley	356de95840	Added simplified range syntax to query string docs	2013-11-04 18:18:36 +01:00
Karel Minarik	b93dac678f	[DOC] Added a link to the official Ruby client to the "Clients" page	2013-11-04 11:47:14 +01:00
Karel Minarik	7023ef2e3f	[DOCS] Added basic information about the official Ruby client to documentation	2013-11-04 11:46:36 +01:00
Ben McCann	46edfc484a	[DOCS] Add some documentation about the performance of `_source` usage in scripts.	2013-11-04 11:05:55 +01:00
Igor Motov	c724f0de5d	Initial implementation of ResourceWatcherService Closes #4062	2013-11-03 21:55:54 -05:00
Dan Everton	6df60b7271	[DOC] Improve documentation on search stats groups Document the ability to return all search statistics groups and provide examples of returning search statistics for groups.	2013-11-01 13:53:39 +01:00
Martijn van Groningen	30ab6f841d	[DOCS] Fixed percolate docs errors	2013-11-01 11:44:07 +01:00
Clinton Gormley	4206cc988e	[DOCS] Typo on shingle tokenfilter	2013-10-31 20:18:00 +01:00
Opak Alex	6856cfc5e3	add reference for ember-data-elasticsearch-kit to integrations page	2013-10-31 11:40:01 +01:00
Alexander Reelsen	dfcb3ca2d4	RegexpQueryBuilder now implements MultiTermQueryBuilder This allows the RegexpQueryBuilder to be used in span queries Added tests for all span multi term queries. Also updated the documentation and removed mentioning of numeric range queries for span queries (they have to be terms). Closes #3392	2013-10-31 09:12:57 +01:00
Boaz Leskes	8819f91d47	Add a GetFieldMapping API This new API allows getting the mapping for a specific set of fields rather than getting the whole index mapping and traversing it. The fields to be retrieved can be specified by their full path, index name and field name and will be resolved in this order. In case multiple fields match, the first one will be returned. Since we are now generating the output (rather than falling back to the stored mapping), you can specify `include_defaults`=true on the request to have default values returned. Closes #3941	2013-10-30 16:16:36 +01:00
Clinton Gormley	8b2efd4849	[DOCS] Added a version flag to percolation	2013-10-30 13:59:03 +01:00
Clinton Gormley	0585890a5f	[DOCS] Fixed a typo	2013-10-30 13:57:18 +01:00
Alexander Reelsen	2ec9742147	[DOCS] Extending setup as a service documentation * Tell people to use ES_JAVA_OPTS for es.node.name or similar parameters * Showing a simple way to install Oracle JDK on ubuntu/debian Closes #3999	2013-10-29 13:58:06 +01:00
David Pilato	5d90abf701	mget API should support global routing parameter mget API support `_routing` field but not `routing` parameter. Reproduction here: ```sh curl -XDELETE "http://localhost:9200/test/"; echo curl -XPUT "http://localhost:9200/test/" -d'{ "settings": { "number_of_replicas": 0, "number_of_shards": 5 } }'; echo curl -XPUT 'http://localhost:9200/test/order/1-1?routing=key1' -d '{ "productName":"doc 1" }'; echo curl -XPUT 'http://localhost:9200/test/order/1-2?routing=key1' -d '{ "productName":"doc 2" }'; echo curl -XPUT 'http://localhost:9200/test/order/1-3?routing=key1&refresh=true' -d '{ "productName":"doc 3" }'; echo curl -XPOST 'http://localhost:9200/test/order/_mget?pretty' -d '{ "docs" : [ { "_index" : "test", "_type" : "order", "_id" : "1-1", "_routing" : "key1" }, { "_index" : "test", "_type" : "order", "_id" : "1-2", "_routing" : "key1" }, { "_index" : "test", "_type" : "order", "_id" : "1-3", "_routing" : "key1" } ] }'; echo curl -XPOST 'http://localhost:9200/test/order/_mget?pretty&routing=key1' -d '{ "ids": [ "1-1", "1-2", "1-3" ] }'; echo ``` Closes #3996.	2013-10-28 21:05:55 +01:00
Britta Weber	c9dab6991e	rename and document "index.mapping.date.parse_upper_inclusive" setting for date fields The setting causes the upper bound for a range query/filter to be rounded up, therefore the name `round_ceil` seems to make more sense. Also this commit removes the redundant fourth parameter to DateMathParser.parse(..) which was never used. was: parse(String text, long now, boolean roundUp, boolean upperInclusive) is now: parse(String text, long now, boolean roundCeil) closes #3914	2013-10-28 15:48:31 +01:00
Ben McCann	cc4bc7d57d	Fix nonsensical sentence in standard analyzer documentation so that it is more understandable	2013-10-25 00:18:32 +02:00
Luca Cavanna	48ac9747a8	Added third highlighter type based on lucene postings highlighter Requires field index_options set to "offsets" in order to store positions and offsets in the postings list. Considerably faster than the plain highlighter since it doesn't require to reanalyze the text to be highlighted: the larger the documents the better the performance gain should be. Requires less disk space than term_vectors, needed for the fast_vector_highlighter. Breaks the text into sentences and highlights them. Uses a BreakIterator to find sentences in the text. Plays really well with natural text, not quite the same if the text contains html markup for instance. Treats the document as the whole corpus, and scores individual sentences as if they were documents in this corpus, using the BM25 algorithm. Uses forked version of lucene postings highlighter to support: - per value discrete highlighting for fields that have multiple values, needed when number_of_fragments=0 since we want to return a snippet per value - manually passing in query terms to avoid calling extract terms multiple times, since we use a different highlighter instance per doc/field, but the query is always the same The lucene postings highlighter api is quite different compared to the existing highlighters api, the main difference being that it allows to highlight multiple fields in multiple docs with a single call, ensuring sequential IO. The way it is introduced in elasticsearch in this first round is a compromise trying not to change the current highlight api, which works per document, per field. The main disadvantage is that we lose the sequential IO, but we can always refactor the highlight api to work with multiple documents. Supports pre_tag, post_tag, number_of_fragments (0 highlights the whole field), require_field_match, no_match_size, order by score and html encoding. Closes #3704	2013-10-24 23:38:00 +02:00
Luca Cavanna	e981e411d7	[DOCS] rephrased docs for highlight no_match_size parameter (removed 0.90.6 coming tag as it's needed only in 0.90 branch)	2013-10-24 14:38:32 +02:00
Nik Everett	14a709f563	Highlighting can return excerpt with no highlights You can configure the highlighting api to return an excerpt of a field even if there wasn't a match on the field. The FVH makes excerpts from the beginning of the string to the first boundary character after the requested length or the boundary_max_scan, whichever comes first. The Plain highlighter makes excerpts from the beginning of the string to the end of the last token before the requested length. Closes #1171	2013-10-24 14:38:32 +02:00
Boaz Leskes	0e6e6f97dc	Merge pull request #3940 from rboulton/patch-1 [Docs] Clean up wording in cluster health api doc	2013-10-22 04:09:13 -07:00
Markus Fischer	782d315da3	Fix markup	2013-10-21 16:11:09 +02:00
Richard Boulton	b62cc7c716	Clean up wording to reduce confusion The description of the timeout parameter was worded misleadingly; it implied that the API would wait until the cluster reached the desired level and then stayed at that level for the timeout. I've tweaked the sentence to remove the risk of confusion.	2013-10-21 12:37:50 +01:00
Clinton Gormley	b2d82d7e75	[DOCS] Reorganised the highlight_query docs and added a version flag	2013-10-18 18:03:31 +02:00
Matt Weber	1e0a834c68	Document strict dynamic type mapping.	2013-10-18 08:29:31 -07:00
Nik Everett	60550e4cc2	phrase_len is not called phrase_length	2013-10-18 09:29:53 -04:00
Clinton Gormley	adf0c8424b	[DOCS] How to check max_file_descriptors	2013-10-17 11:54:36 +02:00
David Pilato	4efd94e7cf	Java API Documentation (0.90+) needs update for accessors in Facets docs Closes #3921. (cherry picked from commit a753c48)	2013-10-17 09:50:15 +02:00
Honza Kral	dd43d932f1	Added a link to official Python client to the client list, fixed perl link	2013-10-16 17:51:50 +02:00
Honza Kral	4f3ad73854	Added brief overview of the python client to the guide	2013-10-16 17:45:05 +02:00
Martijn van Groningen	b7c4adeea3	[Docs] update reference to remove documentation about percolating during an index, bulk or update request.	2013-10-16 16:31:36 +02:00
Martijn van Groningen	1d0841e2b8	Added initial documentation for the redesigned percolator.	2013-10-16 14:12:19 +02:00
Boaz Leskes	18e12ef66c	[Docs] updated references to dynamic_date_formats	2013-10-16 12:04:31 +02:00
Boaz Leskes	57b2d45142	[Docs] added document for the lenient option in match queries	2013-10-16 10:53:25 +02:00
Clinton Gormley	f5e2cf9785	[Docs] Typo	2013-10-15 17:27:05 +02:00
Clinton Gormley	4798425da6	[Docs] Added a page for the Perl client	2013-10-15 17:22:34 +02:00
Alexander Reelsen	4d19239ec4	Add support for Lucene SuggestStopFilter The suggest stop filter is an improved version of the stop filter, which takes stopwords only into account if the last char of a query is a whitespace. This allows you to keep stopwords, but to allow suggesting for "a". Example: Index document content "a word". You are now able to suggest for "a" and get back results in the completion suggester, if the suggest stop filter is used on the query side, but will not get back any results for "a " as this is identified as a stopword. The implementation allows to set the `remove_trailing` parameter for a custom stop filter and thus use the suggest stop filter instead of the standard stop filter.	2013-10-15 16:12:02 +02:00
Clinton Gormley	870346070e	[DOCS] Added compound_on_flush docs and updated compound_format docs to include note about accepting a float	2013-10-15 13:30:56 +02:00
Clinton Gormley	d67331b554	[DOCS] Added script.disable_dynamic to the scripting page	2013-10-15 12:25:07 +02:00
steve mayzak	48656fd1ed	removed a duplicate paragraph in config docs	2013-10-14 15:33:56 -07:00
Britta Weber	34441f3897	fix naming in function_score - "boost" should be "boost_factor" - "mult" should be "multiply" Also, store combine function names in ImmutableMap instead of iterating over all possible names each time. closes #3872 for master	2013-10-14 14:56:59 +02:00
Simon Willnauer	25d6f04f13	[DOCS] Note that cutoff_frequency doesn't handle stacked tokens gracefully	2013-10-14 14:09:38 +02:00
Britta Weber	c3ab79a10e	[DOCS] Add doc for delimited payload token filter	2013-10-14 13:41:35 +02:00
Clinton Gormley	9a062e465c	[DOCS] Reorganised common API conventions	2013-10-13 16:46:56 +02:00
Clinton Gormley	4316b13880	[DOCS] Render common options on the same page	2013-10-13 14:14:50 +02:00
Shay Banon	420b3396f4	Set queue sizes by default on bulk/index thread pools Now that we properly fixed the ability to set the queue size on the index / bulk thread pool, we should actually set them to a somehow reasonable value to protect from users potentially overflowing our system. I suggest defaults to be 50 for bulk, and 200 for indexing. Also, set the thread pool for get, which we should set (in a similar value to a "read" queue size we have today). closes #3888	2013-10-12 21:51:37 +02:00
Subhash Gopalakrishnan	b758b76da4	Support year units in date math expressions According to http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping-date-format.html, the date math expressions support M (month), w (week), h (hour), m (minute), and s (second) units. Why are years not supported? Please add support for year units. Closes #3828. Closes #3874.	2013-10-11 09:24:52 +02:00
Clinton Gormley	8462f88c39	[DOCS] Added more specific versions to the suggesters	2013-10-10 20:59:12 +02:00
Adrien Grand	f2d75654bf	Add clear warnings that only the default codec, postings format and doc values format have backward compatibility warranties.	2013-10-10 13:30:08 +02:00
Clinton Gormley	ba1b4886e3	[DOCS] Moved "named filters/queries" up one level	2013-10-10 11:23:08 +02:00
Jonathan CHAMPION	278e99ef69	Fix small doc mistakes	2013-10-10 11:20:13 +02:00
Adrien Grand	4fa8f6f61f	Doc values integration. This commit allows for using Lucene doc values as a backend for field data, moving the cost of building field data from the refresh operation to indexing. In addition, Lucene doc values can be stored on disk (partially, or even entirely), so that memory management is done at the operating system level (file-system cache) instead of the JVM, avoiding long pauses during major collections due to large heaps. So far doc values are supported on numeric types and non-analyzed strings (index:no or index:not_analyzed). Under the hood, it uses SORTED_SET doc values which is the only type to support multi-valued fields. Since the field data API set is a bit wider than the doc values API set, some operations are not supported: - field data filtering: this will fail if doc values are enabled, - field data cache clearing, even for memory-based doc values formats, - getting the memory usage for a specific field, - knowing whether a field is actually multi-valued. This commit also allows for configuring doc-values formats on a per-field basis similarly to postings formats. In particular the doc values format of the _version field can be configured through its own field mapper (it used to be handled in UidFieldMapper previously). Closes #3806	2013-10-09 16:34:30 +02:00
Matt Weber	3225375a77	Add monitoring link for es2graphite.	2013-10-09 10:47:59 +02:00
Lee Hinman	dede6ee874	Remove extra 'processors' anchor in threadpool docs	2013-10-09 01:56:49 -06:00
Adrien Grand	97958ed02a	Improved warm-up of new segments. * Merged segments are now warmed-up at the end of the merge operation instead of _refresh, so that _refresh doesn't pay the price for the warm-up of merged segments, which is often higher than flushed segments because of their size. * Even when no _warmer is registered, some basic warm-up of the segments is performed: norms, doc values (_version). This should help a bit people who forget to register warmers. * Eager loading support for the parent id cache and field data: when one can't predict what terms will be present in the index, it is tempting to use a match_all query in a warmer, but in that case, query execution might not be much faster than field data loading so having a warmer that only loads field data without running a query can be useful. Closes #3819	2013-10-08 23:06:55 +02:00
Clinton Gormley	264a00a40f	[DOCS] Added pages explaining lucene query parser syntax and regular expression syntax	2013-10-07 14:42:49 +02:00
Alexander Reelsen	f0cf97c0ac	Changed documentation to use getter notation Updated some java documentation to reflect the use of getters instead of calling methods based on field names. Relates to #2657	2013-10-06 21:18:43 +02:00
Clinton Gormley	7a53d41446	[DOCS] Changed capitalization of operator in rescore query	2013-10-05 17:18:15 +02:00
Clinton Gormley	0aeac65424	[DOCS] Fixed typo	2013-10-05 17:10:30 +02:00
Clinton Gormley	d062409309	[DOCS] Removed enable_position_increments in stop filter	2013-10-05 17:06:13 +02:00
Clinton Gormley	ea05f4538c	[DOCS] Updated ICU-Plugin docs from the repo README	2013-10-05 16:31:52 +02:00
Luca Cavanna	b0fee6c01b	Changed nested filter example to use an inner bool filter instead of a bool query, to demonstrate the usage of a filter rather than a query.	2013-10-04 14:08:37 +02:00
Clinton Gormley	e53a26ff21	[DOCS] Fixed a typo in indices.get_templates	2013-10-03 11:40:29 +02:00
uboness	f3c6108b71	introduced support for "shard_size" for terms & terms_stats facets. The "shard_size" is the number of term entries each shard will send back to the coordinating node. "shard_size" > "size" will increase the accuracy (both in terms of the counts associated with each term and the terms that will actually be returned to the user) - of course, the higher "shard_size" is, the more expensive the processing becomes as bigger queues are maintained on a shard level and larger lists are streamed back from the shards. closes #3821	2013-10-02 22:02:00 +02:00
Nik Everett	6b000d8c6d	Support specifying score query on highlight. This is useful if you want to highlight terms not in the search query or you want to sort highlighted snippets based on another query. Closes #3630	2013-10-02 15:46:24 -04:00
Lee Hinman	b923c138b8	Uniquify more anchor links to fix asciidoc	2013-10-01 10:28:35 -06:00
Lee Hinman	ba40aa374e	Uniquify anchor links to fix asciidoc/docbook generation	2013-09-30 15:32:00 -06:00
Lee Hinman	0442b737be	Add more anchor links to documentation Related to #3679	2013-09-30 13:13:16 -06:00
Alexander Reelsen	c63869b0be	Documentation: Removed service wrapper, added rpm/deb package information	2013-09-26 14:30:25 +02:00
gtt116	6304d58e36	Remove a comma in doc to make the example valid JSON. This helps readers run a quick copy-paste test.	2013-09-24 15:23:23 +08:00
Costin Leau	3685a22e4a	add docs on new service.bat facility	2013-09-23 18:24:31 +03:00
Martijn van Groningen	d365a4ccba	Added nested filter join option to the docs. Closes #3738	2013-09-20 21:22:56 +02:00
Shay Banon	359d14ddc5	doc processors setting	2013-09-20 14:55:35 +02:00
Shay Banon	29c0f27a9e	fix thread pool docs to remove blocking	2013-09-20 12:31:17 +02:00
Martijn van Groningen	4958a6805f	Updated outdated default setting in doc.	2013-09-18 18:01:23 +02:00
Adrien Grand	90524d7ad2	Fix formatting of the documentation. Remaining '@'s have been replaced with '`'s.	2013-09-18 12:35:44 +02:00
Britta Weber	b7c3b50909	add date field to decay function doc	2013-09-17 19:54:31 +02:00
David Pilato	1e3ffa0df7	Add distance supported units	2013-09-17 14:21:45 +02:00
Clinton Gormley	85bba668f7	[DOCS] Tidied up various doc formatting errors	2013-09-16 16:13:01 +02:00
Clinton Gormley	c2eb4a1c40	[DOCS] Tidied up function score	2013-09-16 15:57:08 +02:00
Clinton Gormley	422eed7985	[Docs] Added an added[0.90.4] flag to the disk based allocator	2013-09-16 15:57:07 +02:00
Simon Willnauer	85fcefc60d	Allow include / exclude of completion stats via REST parameters Stats can be retrieved on a per-feature / per-component basis including the fields they apply to. This commit adds support for a 'completion' flag to include statistics for the completion feature as well as 'completion_fields' to only include certain fields into the returned statistics. To disambiguate between 'fielddata' and 'completion' fields this commit uses 'fields' as the default inclusion filter for stats fields only used if no dedicated '[completion|fielddata]_fields' parameter is provided. Relates to #3522	2013-09-16 11:28:32 +02:00
Martijn van Groningen	f6f4b5014f	Added docs for named queries. Relates to #3581	2013-09-16 11:17:01 +02:00
Shay Banon	20745adadd	Add dedicated Suggest Thread Pool Add a dedicated suggest thread pool for the suggest API. With the new completion suggest type, which is purely CPU bounded, it makes more sense to have a dedicated thread pool for suggest compared to having it share the search thread pool and "competing" against other search operations. closes #3698	2013-09-15 01:54:27 +02:00
Shay Banon	df3f681ef0	Optimize API: Remove refresh flag Refresh flag in optimize is problematic, since the shards refresh is allowed to execute on is different compared to the optimize shards. In order to do optimize and then refresh, they should be executed as separate APIs when needed. closes #3690	2013-09-13 21:44:38 +02:00
Shay Banon	7cc48c8e87	Flush API: remove refresh flag Refresh flag in flush is problematic, since the shards refresh is allowed to execute on is different compared to the flush shards. In order to do flush and then refresh, they should be executed as separate APIs when needed. closes #3689	2013-09-13 21:09:45 +02:00
Clinton Gormley	fd9f62b9b7	[DOCS] Added Cold Fusion client to community page	2013-09-13 16:20:38 +02:00
David Pilato	ea4988e9dc	Support for REST get ALL templates. /_template shows: No handler found for uri [/_template] and method [GET] It would make sense to list the templates as they are listed in the /_cluster/state call. Closes #2532.	2013-09-13 15:08:59 +02:00
Clinton Gormley	17fb10689c	The docs URLs have changed to include en/	2013-09-13 11:23:37 +02:00
Clinton Gormley	d6ecdecc19	[DOCS] Deprecated the from/to/include_lower/include_upper params in the range query, range filter and numeric range filter. Better to use gt/gte/lt/lte as they are explicit.	2013-09-12 15:07:36 +02:00
David Pilato	169cd007b5	Fix typo Thanks to @ybonnel for finding it ;-)	2013-09-12 11:00:59 +02:00
Martijn van Groningen	8ddb809f98	If all scroll ids should be removed then the `_all` value should be used instead of not specifying any scroll ids.	2013-09-12 10:41:38 +02:00
Martijn van Groningen	0efa78710b	Added clear scroll api. The clear scroll api allows clearing all resources associated with a `scroll_id` by deleting the `scroll_id` and its associated SearchContext. Closes #3657	2013-09-10 21:17:34 +02:00
David Pilato	fafc4eef98	Plugin Manager: add silent mode. Now that we have proper exit codes for elasticsearch plugin manager (see #3463), we can add a silent mode to plugin manager. ```sh bin/plugin --install karmi/elasticsearch-paramedic --silent ``` Closes #3628.	2013-09-10 18:31:35 +02:00
David Pilato	764aa54f2d	Plugin Manager should support -remove group/artifact/version naming When installing a plugin, we use: ```sh bin/plugin --install groupid/artifactid/version ``` But when removing the plugin, we only support: ```sh bin/plugin --remove dirname ``` where `dirname` is the directory name of the plugin under `/plugins` dir. Closes #3421.	2013-09-09 21:17:16 +02:00
Brad Fritz	f3c0e39380	key is "index.store.type", not "index.storage.type"	2013-09-09 13:06:09 -04:00
Lee Hinman	7d52d58747	Add AllocationDecider that takes free disk space into account This commit adds two main pieces, the first is a ClusterInfoService that provides a service running on the master nodes that fetches the total/free bytes for each data node in the cluster as well as the sizes of all shards in the cluster. This information is gathered by default every 30 seconds, and can be changed dynamically by setting the `cluster.info.update.interval` setting. This ClusterInfoService can hopefully be used in the future to weight nodes for allocation based on their disk usage, if desired. The second main piece is the DiskThresholdDecider, which can disallow a shard from being allocated to a node, or from remaining on the node depending on configuration parameters. There are three main configuration parameters for the DiskThresholdDecider: `cluster.routing.allocation.disk.threshold_enabled` controls whether the decider is enabled. It defaults to false (disabled). Note that the decider is also disabled for clusters with only a single data node. `cluster.routing.allocation.disk.watermark.low` controls the low watermark for disk usage. It defaults to 0.70, meaning ES will not allocate new shards to nodes once they have more than 70% disk used. It can also be set to an absolute byte value (like 500mb) to prevent ES from allocating shards if less than the configured amount of space is available. `cluster.routing.allocation.disk.watermark.high` controls the high watermark. It defaults to 0.85, meaning ES will attempt to relocate shards to another node if the node disk usage rises above 85%. It can also be set to an absolute byte value (similar to the low watermark) to relocate shards once less than the configured amount of space is available on the node. Closes #3480	2013-09-09 09:49:30 -06:00
Clinton Gormley	9e6d30a14a	[DOCS] Changed the deprecation of custom_boost/score/filters_score queries to 0.90.4	2013-09-05 12:14:10 +02:00
Clinton Gormley	2b3a762c27	[DOCS] Function score was added in 0.90.4 not 1.00.Beta	2013-09-05 11:25:06 +02:00
Clinton Gormley	8257aba166	[DOCS] Fixed fielddata regex syntax	2013-09-04 23:20:56 +02:00
Clinton Gormley	6d667e5d41	[DOCS] Missing sort values now works for all field types	2013-09-04 23:20:55 +02:00
Clinton Gormley	765bd026f5	[DOCS] Added function score query	2013-09-04 23:20:55 +02:00
Clinton Gormley	aa59ef2e84	[DOCS] Added the human flag	2013-09-04 23:20:55 +02:00
Clinton Gormley	9d0dd545cb	[DOCS] Tidied up the plugins page and added Graphite and Statsd	2013-09-04 23:20:55 +02:00
Clinton Gormley	e1c6f45ff0	[DOCS] Added clarification about global scope in facets	2013-09-04 23:20:55 +02:00
Clinton Gormley	08f8e77b8f	[DOCS] Added fuzzy options to completion suggester	2013-09-04 23:20:55 +02:00
Clinton Gormley	047c86e3b2	[DOCS] Added wildcard template matching	2013-09-04 23:20:55 +02:00
Clinton Gormley	9f5d0b6e89	[DOCS] Added a few clarifications to the docs from the issues list	2013-09-04 23:20:55 +02:00
Clinton Gormley	94be785726	[DOCS] Added multi-index open/close	2013-09-04 23:20:55 +02:00
Clinton Gormley	6568dae12c	[DOCS] Added "elastics" client for javascript	2013-09-04 23:20:55 +02:00
Clinton Gormley	5b60506b2e	[DOCS] Added highlighting to the phrase suggester	2013-09-04 23:20:54 +02:00
Clinton Gormley	53ad7330fc	[DOCS] Added docs for term vectors	2013-09-04 23:20:54 +02:00
Clinton Gormley	eac2b3a52e	[DOCS] Fixed typo	2013-09-04 23:20:54 +02:00
Clinton Gormley	393c28bee4	[DOCS] Removed outdated new/deprecated version notices	2013-09-03 21:28:31 +02:00
Clinton Gormley	69d1d35fc1	[DOCS] Fixed an out of sequence header in the Groovy docs	2013-09-03 16:28:44 +02:00
Clinton Gormley	17234fe454	[DOCS] link: prefix not required when using {ref} attributes	2013-09-03 16:16:15 +02:00
Clinton Gormley	e6127fc082	[DOCS] Chunk depth now configurable, so [float] not required	2013-09-03 16:15:50 +02:00
Simon Willnauer	eb2fed85f1	Add 'min_input_len' to completion suggester Restrict the size of the input length to a reasonable size otherwise very long strings can cause StackOverflowExceptions deep down in lucene land. Yet, this is simply a safety limit set to `50` UTF-16 codepoints by default. This limit is only present at index time and not at query time. If prefix completions > 50 UTF-16 codepoints are expected / desired this limit should be raised. Critical string sizes are beyond the 1k UTF-16 Codepoints limit. Closes #3596	2013-09-03 10:26:37 +02:00
Clinton Gormley	ca4b85edef	Added IDs to the community clients docs, to control HTML page names	2013-09-01 12:58:17 +02:00
Boaz Leskes	e807c99f27	Fixed a typo in the config of light finnish stemmer (old last_finish is still supported for backward compatibility) Closes #3594	2013-08-29 10:15:40 +02:00
Clinton Gormley	822043347e	Migrated documentation into the main repo	2013-08-29 01:24:34 +02:00

7257 Commits