OpenSearch

Commit Graph

Author	SHA1	Message	Date
Ben McCann	cc4bc7d57d	Fix nonsensical sentence in standard analyzer documentation so that it is more understandable	2013-10-25 00:18:32 +02:00
Luca Cavanna	48ac9747a8	Added third highlighter type based on lucene postings highlighter Requires field index_options set to "offsets" in order to store positions and offsets in the postings list. Considerably faster than the plain highlighter since it doesn't require to reanalyze the text to be highlighted: the larger the documents the better the performance gain should be. Requires less disk space than term_vectors, needed for the fast_vector_highlighter. Breaks the text into sentences and highlights them. Uses a BreakIterator to find sentences in the text. Plays really well with natural text, not quite the same if the text contains html markup for instance. Treats the document as the whole corpus, and scores individual sentences as if they were documents in this corpus, using the BM25 algorithm. Uses forked version of lucene postings highlighter to support: - per value discrete highlighting for fields that have multiple values, needed when number_of_fragments=0 since we want to return a snippet per value - manually passing in query terms to avoid calling extract terms multiple times, since we use a different highlighter instance per doc/field, but the query is always the same The lucene postings highlighter api is quite different compared to the existing highlighters api, the main difference being that it allows to highlight multiple fields in multiple docs with a single call, ensuring sequential IO. The way it is introduced in elasticsearch in this first round is a compromise trying not to change the current highlight api, which works per document, per field. The main disadvantage is that we lose the sequential IO, but we can always refactor the highlight api to work with multiple documents. Supports pre_tag, post_tag, number_of_fragments (0 highlights the whole field), require_field_match, no_match_size, order by score and html encoding. Closes #3704	2013-10-24 23:38:00 +02:00
Luca Cavanna	e981e411d7	[DOCS] rephrased docs for highlight no_match_size parameter (removed 0.90.6 coming tag as it's needed only in 0.90 branch)	2013-10-24 14:38:32 +02:00
Nik Everett	14a709f563	Highlighting can return excerpt with no highlights You can configure the highlighting api to return an excerpt of a field even if there wasn't a match on the field. The FVH makes excerpts from the beginning of the string to the first boundary character after the requested length or the boundary_max_scan, whichever comes first. The Plain highlighter makes excerpts from the beginning of the string to the end of the last token before the requested length. Closes #1171	2013-10-24 14:38:32 +02:00
Boaz Leskes	0e6e6f97dc	Merge pull request #3940 from rboulton/patch-1 [Docs] Clean up wording in cluster health api doc	2013-10-22 04:09:13 -07:00
Markus Fischer	782d315da3	Fix markup	2013-10-21 16:11:09 +02:00
Richard Boulton	b62cc7c716	Clean up wording to reduce confusion The description of the timeout parameter was worded misleadingly; it implied that the API would wait until the cluster reached the desired level and then stayed at that level for the timeout. I've tweaked the sentence to remove the risk of confusion.	2013-10-21 12:37:50 +01:00
Clinton Gormley	b2d82d7e75	[DOCS] Reorganised the highlight_query docs and added a version flag	2013-10-18 18:03:31 +02:00
Matt Weber	1e0a834c68	Document strict dynamic type mapping.	2013-10-18 08:29:31 -07:00
Nik Everett	60550e4cc2	phrase_len is not called phrase_length	2013-10-18 09:29:53 -04:00
Clinton Gormley	adf0c8424b	[DOCS] How to check max_file_descriptors	2013-10-17 11:54:36 +02:00
Martijn van Groningen	b7c4adeea3	[Docs] update reference to remove documentation about percolating during an index, bulk or update request.	2013-10-16 16:31:36 +02:00
Martijn van Groningen	1d0841e2b8	Added initial documentation for the redesigned percolator.	2013-10-16 14:12:19 +02:00
Boaz Leskes	18e12ef66c	[Docs] updated references to dynamic_date_formats	2013-10-16 12:04:31 +02:00
Boaz Leskes	57b2d45142	[Docs] added document for the lenient option in match queries	2013-10-16 10:53:25 +02:00
Alexander Reelsen	4d19239ec4	Add support for Lucene SuggestStopFilter The suggest stop filter is an improved version of the stop filter, which takes stopwords only into account if the last char of a query is a whitespace. This allows you to keep stopwords, but to allow suggesting for "a". Example: Index document content "a word". You are now able to suggest for "a" and get back results in the completion suggester, if the suggest stop filter is used on the query side, but will not get back any results for "a " as this is identified as a stopword. The implementation allows to set the `remove_trailing` parameter for a custom stop filter and thus use the suggest stop filter instead of the standard stop filter.	2013-10-15 16:12:02 +02:00
Clinton Gormley	870346070e	[DOCS] Added compound_on_flush docs and updated compound_format docs to include note about accepting a float	2013-10-15 13:30:56 +02:00
Clinton Gormley	d67331b554	[DOCS] Added script.disable_dynamic to the scripting page	2013-10-15 12:25:07 +02:00
steve mayzak	48656fd1ed	removed a duplicate paragraph in config docs	2013-10-14 15:33:56 -07:00
Britta Weber	34441f3897	fix naming in function_score - "boost" should be "boost_factor" - "mult" should be "multiply" Also, store combine function names in ImmutableMap instead of iterating over all possible names each time. closes #3872 for master	2013-10-14 14:56:59 +02:00
Simon Willnauer	25d6f04f13	[DOCS] Note that cutoff_frequency doesn't handle stacked tokens gracefully	2013-10-14 14:09:38 +02:00
Britta Weber	c3ab79a10e	[DOCS] Add doc for delimited payload token filter	2013-10-14 13:41:35 +02:00
Clinton Gormley	9a062e465c	[DOCS] Reorganised common API conventions	2013-10-13 16:46:56 +02:00
Clinton Gormley	4316b13880	[DOCS] Render common options on the same page	2013-10-13 14:14:50 +02:00
Shay Banon	420b3396f4	Set queue sizes by default on bulk/index thread pools Now that we properly fixed the ability to set the queue size on the index / bulk thread pool, we should actually set them to a somehow reasonable value to protect from users potentially overflowing our system. I suggest defaults to be 50 for bulk, and 200 for indexing. Also, set the thread pool for get, which we should set (in a similar value to a "read" queue size we have today). closes #3888	2013-10-12 21:51:37 +02:00
Subhash Gopalakrishnan	b758b76da4	Support year units in date math expressions According to http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping-date-format.html, the date math expressions support M (month), w (week), h (hour), m (minute), and s (second) units. Why years are not supported? Please add support for year units. Closes #3828. Closes #3874.	2013-10-11 09:24:52 +02:00
Clinton Gormley	8462f88c39	[DOCS] Added more specific versions to the suggesters	2013-10-10 20:59:12 +02:00
Adrien Grand	f2d75654bf	Add clear warnings that only the default codec, postings format and doc values format have backward compatibility warranties.	2013-10-10 13:30:08 +02:00
Clinton Gormley	ba1b4886e3	[DOCS] Moved "named filters/queries" up one level	2013-10-10 11:23:08 +02:00
Adrien Grand	4fa8f6f61f	Doc values integration. This commit allows for using Lucene doc values as a backend for field data, moving the cost of building field data from the refresh operation to indexing. In addition, Lucene doc values can be stored on disk (partially, or even entirely), so that memory management is done at the operating system level (file-system cache) instead of the JVM, avoiding long pauses during major collections due to large heaps. So far doc values are supported on numeric types and non-analyzed strings (index:no or index:not_analyzed). Under the hood, it uses SORTED_SET doc values which is the only type to support multi-valued fields. Since the field data API set is a bit wider than the doc values API set, some operations are not supported: - field data filtering: this will fail if doc values are enabled, - field data cache clearing, even for memory-based doc values formats, - getting the memory usage for a specific field, - knowing whether a field is actually multi-valued. This commit also allows for configuring doc-values formats on a per-field basis similarly to postings formats. In particular the doc values format of the _version field can be configured through its own field mapper (it used to be handled in UidFieldMapper previously). Closes #3806	2013-10-09 16:34:30 +02:00
Lee Hinman	dede6ee874	Remove extra 'processors' anchor in threadpool docs	2013-10-09 01:56:49 -06:00
Adrien Grand	97958ed02a	Improved warm-up of new segments. * Merged segments are now warmed-up at the end of the merge operation instead of _refresh, so that _refresh doesn't pay the price for the warm-up of merged segments, which is often higher than flushed segments because of their size. * Even when no _warmer is registered, some basic warm-up of the segments is performed: norms, doc values (_version). This should help a bit people who forget to register warmers. * Eager loading support for the parent id cache and field data: when one can't predict what terms will be present in the index, it is tempting to use a match_all query in a warmer, but in that case, query execution might not be much faster than field data loading so having a warmer that only loads field data without running a query can be useful. Closes #3819	2013-10-08 23:06:55 +02:00
Clinton Gormley	264a00a40f	[DOCS] Added pages explaining lucene query parser syntax and regular expression syntax	2013-10-07 14:42:49 +02:00
Clinton Gormley	7a53d41446	[DOCS] Changed capitalization of operator in rescore query	2013-10-05 17:18:15 +02:00
Clinton Gormley	d062409309	[DOCS] Removed enable_position_increments in stop filter	2013-10-05 17:06:13 +02:00
Clinton Gormley	ea05f4538c	[DOCS] Updated ICU-Plugin docs from the repo README	2013-10-05 16:31:52 +02:00
Luca Cavanna	b0fee6c01b	Changed nested filter example to use an inner bool filter instead of a bool query, to demonstrate the usage of a filter rather than a query.	2013-10-04 14:08:37 +02:00
Clinton Gormley	e53a26ff21	[DOCS] Fixed a typo in indices.get_templates	2013-10-03 11:40:29 +02:00
uboness	f3c6108b71	introduced support for "shard_size" for terms & terms_stats facets. The "shard_size" is the number of term entries each shard will send back to the coordinating node. "shard_size" > "size" will increase the accuracy (both in terms of the counts associated with each term and the terms that will actually be returned the user) - of course, the higher "shard_size" is, the more expensive the processing becomes as bigger queues are maintained on a shard level and larger lists are streamed back from the shards. closes #3821	2013-10-02 22:02:00 +02:00
Nik Everett	6b000d8c6d	Support specifying score query on highlight. This is useful if you want to highlight terms not in the search query or you want to sort highlighted snippets based on another query. Closes #3630	2013-10-02 15:46:24 -04:00
Lee Hinman	ba40aa374e	Uniquify anchor links to fix asciidoc/docbook generation	2013-09-30 15:32:00 -06:00
Lee Hinman	0442b737be	Add more anchor links to documentation Related to #3679	2013-09-30 13:13:16 -06:00
Alexander Reelsen	c63869b0be	Documentation: Removed service wrapper, added rpm/deb package information	2013-09-26 14:30:25 +02:00
gtt116	6304d58e36	Remove a comma in doc to make example a valid json. This will help reader to do a hurry up copy-paste test.	2013-09-24 15:23:23 +08:00
Costin Leau	3685a22e4a	add docs on new service.bat facility	2013-09-23 18:24:31 +03:00
Martijn van Groningen	d365a4ccba	Added nested filter join option to the docs. Closes #3738	2013-09-20 21:22:56 +02:00
Shay Banon	359d14ddc5	doc processors setting	2013-09-20 14:55:35 +02:00
Shay Banon	29c0f27a9e	fix thread pool docs to remove blocking	2013-09-20 12:31:17 +02:00
Adrien Grand	90524d7ad2	Fix formatting of the documentation. Remaining '@'s have been replaced with '`'s.	2013-09-18 12:35:44 +02:00
Britta Weber	b7c3b50909	add date field to decay function doc	2013-09-17 19:54:31 +02:00
David Pilato	1e3ffa0df7	Add distance supported units	2013-09-17 14:21:45 +02:00
Clinton Gormley	85bba668f7	[DOCS] Tidied up various doc formatting errors	2013-09-16 16:13:01 +02:00
Clinton Gormley	c2eb4a1c40	[DOCS] Tidied up function score	2013-09-16 15:57:08 +02:00
Clinton Gormley	422eed7985	[Docs] Added an added[0.90.4] flag to the disk based allocator	2013-09-16 15:57:07 +02:00
Simon Willnauer	85fcefc60d	Allow include / exclude of completion stats via REST parameters Stats can be retrieved on a per-feature / per-component basis including the fields they apply to. This commit adds support for a 'completion' flag to include statistics for the completion feature as well as 'completion_fields' to only include certain fields into the returned statistics. To disambiguate between 'fielddata' and 'completion' fields this commit uses 'fields' as the default inclusion filter for stats fields only used if no dedicated '[completion|fielddata]_fields' parameter is provided. Relates to #3522	2013-09-16 11:28:32 +02:00
Martijn van Groningen	f6f4b5014f	Added docs for named queries. Relates to #3581	2013-09-16 11:17:01 +02:00
Shay Banon	20745adadd	Add dedicated Suggest Thread Pool Add a dedicated suggest thread pool for the suggest API. With the new completion suggest type, which is purely CPU bounded, it makes more sense to have a dedicated thread pool for suggest compared to having it share the search thread pool and "competing" against other search operations. closes #3698	2013-09-15 01:54:27 +02:00
Shay Banon	df3f681ef0	Optimize API: Remove refresh flag Refresh flag in optimize is problematic, since the shards refresh is allowed to execute on is different compared to the optimize shards. In order to do optimize and then refresh, they should be executed as separate APIs when needed. closes #3690	2013-09-13 21:44:38 +02:00
Shay Banon	7cc48c8e87	Flush API: remove refresh flag Refresh flag in flush is problematic, since the shards refresh is allowed to execute on is different compared to the flush shards. In order to do flush and then refresh, they should be executed as separate APIs when needed. closes #3689	2013-09-13 21:09:45 +02:00
David Pilato	ea4988e9dc	Support for REST get ALL templates. /_template shows: No handler found for uri [/_template] and method [GET] It would make sense to list the templates as they are listed in the /_cluster/state call. Closes #2532.	2013-09-13 15:08:59 +02:00
Clinton Gormley	d6ecdecc19	[DOCS] Deprecated the from/to/include_lower/include_upper params in the range query, range filter and numeric range filter. Better to use gt/gte/lt/lte as they are explicit.	2013-09-12 15:07:36 +02:00
David Pilato	169cd007b5	Fix typo Thanks to @ybonnel for finding it ;-)	2013-09-12 11:00:59 +02:00
Martijn van Groningen	8ddb809f98	If all scroll ids should be removed then the `_all` value should be used instead of not specifying any scroll ids.	2013-09-12 10:41:38 +02:00
Martijn van Groningen	0efa78710b	Added clear scroll api. The clear scroll api allows clear all resources associated with a `scroll_id` by deleting the `scroll_id` and its associated SearchContext. Closes #3657	2013-09-10 21:17:34 +02:00
David Pilato	fafc4eef98	Plugin Manager: add silent mode. Now that we have proper exit codes for elasticsearch plugin manager (see #3463), we can add a silent mode to plugin manager: `bin/plugin --install karmi/elasticsearch-paramedic --silent`. Closes #3628.	2013-09-10 18:31:35 +02:00
David Pilato	764aa54f2d	Plugin Manager should support -remove group/artifact/version naming. When installing a plugin, we use: `bin/plugin --install groupid/artifactid/version`. But when removing the plugin, we only support: `bin/plugin --remove dirname`, where `dirname` is the directory name of the plugin under `/plugins` dir. Closes #3421.	2013-09-09 21:17:16 +02:00
Brad Fritz	f3c0e39380	key is "index.store.type", not "index.storage.type"	2013-09-09 13:06:09 -04:00
Lee Hinman	7d52d58747	Add AllocationDecider that takes free disk space into account This commit adds two main pieces, the first is a ClusterInfoService that provides a service running on the master nodes that fetches the total/free bytes for each data node in the cluster as well as the sizes of all shards in the cluster. This information is gathered by default every 30 seconds, and can be changed dynamically by setting the `cluster.info.update.interval` setting. This ClusterInfoService can hopefully be used in the future to weight nodes for allocation based on their disk usage, if desired. The second main piece is the DiskThresholdDecider, which can disallow a shard from being allocated to a node, or from remaining on the node depending on configuration parameters. There are three main configuration parameters for the DiskThresholdDecider: `cluster.routing.allocation.disk.threshold_enabled` controls whether the decider is enabled. It defaults to false (disabled). Note that the decider is also disabled for clusters with only a single data node. `cluster.routing.allocation.disk.watermark.low` controls the low watermark for disk usage. It defaults to 0.70, meaning ES will not allocate new shards to nodes once they have more than 70% disk used. It can also be set to an absolute byte value (like 500mb) to prevent ES from allocating shards if less than the configured amount of space is available. `cluster.routing.allocation.disk.watermark.high` controls the high watermark. It defaults to 0.85, meaning ES will attempt to relocate shards to another node if the node disk usage rises above 85%. It can also be set to an absolute byte value (similar to the low watermark) to relocate shards once less than the configured amount of space is available on the node. Closes #3480	2013-09-09 09:49:30 -06:00
Clinton Gormley	9e6d30a14a	[DOCS] Changed the deprecation of custom_boost/score/filters_score queries to 0.90.4	2013-09-05 12:14:10 +02:00
Clinton Gormley	2b3a762c27	[DOCS] Function score was added in 0.90.4 not 1.00.Beta	2013-09-05 11:25:06 +02:00
Clinton Gormley	8257aba166	[DOCS] Fixed fielddata regex syntax	2013-09-04 23:20:56 +02:00
Clinton Gormley	6d667e5d41	[DOCS] Missing sort values now works for all field types	2013-09-04 23:20:55 +02:00
Clinton Gormley	765bd026f5	[DOCS] Added function score query	2013-09-04 23:20:55 +02:00
Clinton Gormley	aa59ef2e84	[DOCS] Added the human flag	2013-09-04 23:20:55 +02:00
Clinton Gormley	9d0dd545cb	[DOCS] Tidied up the plugins page and added Graphite and Statsd	2013-09-04 23:20:55 +02:00
Clinton Gormley	e1c6f45ff0	[DOCS] Added clarification about global scope in facets	2013-09-04 23:20:55 +02:00
Clinton Gormley	08f8e77b8f	[DOCS] Added fuzzy options to completion suggester	2013-09-04 23:20:55 +02:00
Clinton Gormley	047c86e3b2	[DOCS] Added wildcard template matching	2013-09-04 23:20:55 +02:00
Clinton Gormley	9f5d0b6e89	[DOCS] Added a few clarifications to the docs from the issues list	2013-09-04 23:20:55 +02:00
Clinton Gormley	94be785726	[DOCS] Added multi-index open/close	2013-09-04 23:20:55 +02:00
Clinton Gormley	5b60506b2e	[DOCS] Added highlighting to the phrase suggester	2013-09-04 23:20:54 +02:00
Clinton Gormley	53ad7330fc	[DOCS] Added docs for term vectors	2013-09-04 23:20:54 +02:00
Clinton Gormley	eac2b3a52e	[DOCS] Fixed typo	2013-09-04 23:20:54 +02:00
Clinton Gormley	393c28bee4	[DOCS] Removed outdated new/deprecated version notices	2013-09-03 21:28:31 +02:00
Simon Willnauer	eb2fed85f1	Add 'min_input_len' to completion suggester Restrict the size of the input length to a reasonable size otherwise very long strings can cause StackOverflowExceptions deep down in lucene land. Yet, this is simply a safety limit set to `50` UTF-16 codepoints by default. This limit is only present at index time and not at query time. If prefix completions > 50 UTF-16 codepoints are expected / desired this limit should be raised. Critical string sizes are beyond the 1k UTF-16 Codepoints limit. Closes #3596	2013-09-03 10:26:37 +02:00
Boaz Leskes	e807c99f27	Fixed a typo in the config of light finnish stemmer (old last_finish is still supported for backward compatibility) Closes #3594	2013-08-29 10:15:40 +02:00
Clinton Gormley	822043347e	Migrated documentation into the main repo	2013-08-29 01:24:34 +02:00


3287 Commits