OpenSearch

Commit Graph

Author	SHA1	Message	Date
Jim Ferenczi	787acb14b9	Track total hits up to 10,000 by default (#37466 ) This commit changes the default for the `track_total_hits` option of the search request to `10,000`. This means that by default search requests will accurately track the total hit count up to `10,000` documents; requests that match more than this value will set the `"total.relation"` to `"gte"` (i.e. greater than or equal to) and the `"total.value"` to `10,000` in the search response. Scroll queries are not impacted; they will continue to count the total hits accurately. The default is set back to `true` (accurate hit count) if `rest_total_hits_as_int` is set in the search request. I chose `10,000` as the default because that's also the number we use to limit pagination. This means that users will be able to know how far they can jump (up to 10,000) even if the total number of hits is not accurate. Closes #33028	2019-01-25 13:45:39 +01:00
Christoph Büscher	34f2d2ec91	Remove remaining occurrences of "include_type_name=true" in docs (#37646 )	2019-01-22 15:13:52 +01:00
Christoph Büscher	25aac4f77f	Remove `include_type_name` in asciidoc where possible (#37568 ) The "include_type_name" parameter was temporarily introduced in #37285 to facilitate moving the default parameter setting to "false" in many places in the documentation code snippets. Most of the places can simply be reverted without causing errors. In this change I looked for asciidoc files that contained the "include_type_name=true" addition when creating new indices but didn't look like they made use of the "_doc" type for mappings. This is mostly the case e.g. in the analysis docs where index creation often only contains settings. I manually corrected the use of types in some places where the docs still used an explicit type name and not the dummy "_doc" type.	2019-01-18 09:34:11 +01:00
Julie Tibshirani	36a3b84fc9	Update the default for include_type_name to false. (#37285 ) * Default include_type_name to false for get and put mappings. * Default include_type_name to false for get field mappings. * Add a constant for the default include_type_name value. * Default include_type_name to false for get and put index templates. * Default include_type_name to false for create index. * Update create index calls in REST documentation to use include_type_name=true. * Some minor clean-ups around the get index API. * In REST tests, use include_type_name=true by default for index creation. * Make sure to use 'expression == false'. * Clarify the different IndexTemplateMetaData toXContent methods. * Fix FullClusterRestartIT#testSnapshotRestore. * Fix the ml_anomalies_default_mappings test. * Fix GetFieldMappingsResponseTests and GetIndexTemplateResponseTests. We make sure to specify include_type_name=true during xContent parsing, so we continue to test the legacy typed responses. XContent generation for the typeless responses is currently only covered by REST tests, but we will be adding unit test coverage for these as we implement each typeless API in the Java HLRC. This commit also refactors GetMappingsResponse to follow the same approach as the other mappings-related responses, where we read include_type_name out of the xContent params, instead of creating a second toXContent method. This gives better consistency in the response parsing code. * Fix more REST tests. * Improve some wording in the create index documentation. * Add a note about types removal in the create index docs. * Fix SmokeTestMonitoringWithSecurityIT#testHTTPExporterWithSSL. * Make sure to mention include_type_name in the REST docs for affected APIs. * Make sure to use 'expression == false' in FullClusterRestartIT. * Mention include_type_name in the REST templates docs.	2019-01-14 13:08:01 -08:00
Josh Soref	edb48321ba	[DOCS] Various spelling corrections (#37046 )	2019-01-07 14:44:12 +01:00
Daniel Mitterdorfer	75f3443c62	Rename setting to enable mmap With this commit we rename `node.store.allow_mmapfs` to `node.store.allow_mmap`. Previously this setting has controlled whether `mmapfs` could be used as a store type. With the introduction of `hybridfs` which also relies on memory-mapping, `node.store.allow_mmapfs` also applies to `hybridfs` and thus we rename it in order to convey that it is actually used to allow memory-mapping but not a specific store type. Relates #36668 Relates #37070	2019-01-03 07:10:34 +01:00
Daniel Mitterdorfer	f0052b1a7a	Add hybridfs store type With this commit we introduce a new store type `hybridfs` that is a hybrid between `mmapfs` and `niofs`. This store type chooses different strategies to read Lucene files based on the read access pattern (random or linear) in order to optimize performance. This store type has been available in earlier versions of Elasticsearch as `default_fs`. We have chosen a different name now in order to convey the intent of the store type instead of tying it to the fact whether it is the default choice. Relates #36668	2019-01-02 10:10:32 +01:00
debadair	c9e03e6ead	[DOCS] Reworked the shard allocation filtering info. (#36456 ) * [DOCS] Reworked the shard allocation filtering info. Closes #36079 * Added multiple index allocation settings example back. * Removed extraneous space	2018-12-11 07:44:57 -08:00
Jim Ferenczi	18866c4c0b	Make hits.total an object in the search response (#35849 ) This commit changes the format of the `hits.total` in the search response to be an object with a `value` and a `relation`. The `value` indicates the number of hits that match the query and the `relation` indicates whether the number is accurate (in which case the relation is equal to `eq`) or a lower bound of the total (in which case it is equal to `gte`). This change also adds a parameter called `rest_total_hits_as_int` that can be used in the search APIs to opt out from this change (retrieve the total hits as a number in the rest response). Note that currently all search responses are accurate (`track_total_hits: true`) or they don't contain `hits.total` (`track_total_hits: false`). We'll add a way to get a lower bound of the total hits in a follow up (to allow numbers to be passed to `track_total_hits`). Relates #33028	2018-12-05 19:49:06 +01:00
Jim Ferenczi	cfe8eab455	[DOCS] Removes beta label from index sorting (#34327 )	2018-10-05 19:44:25 +02:00
Vladimir Dolzhenko	2e2ae19b97	drop elasticsearch-translog for 7.0 (#33373 ) #32281 adds elasticsearch-shard to provide bwc version of elasticsearch-translog for 6.x; have to remove elasticsearch-translog for 7.0 Relates to #31389	2018-10-01 16:21:14 +02:00
Vladimir Dolzhenko	a3e8b831ee	add elasticsearch-shard tool (#32281 ) Relates #31389	2018-09-19 10:28:22 +02:00
Jim Ferenczi	7ad71f906a	Upgrade to a Lucene 8 snapshot (#33310 ) The main benefit of the upgrade for users is the search optimization for top scored documents when the total hit count is not needed. However this optimization is not activated in this change, there is another issue opened to discuss how it should be integrated smoothly. Some comments about the change: * Tests that can produce negative scores have been adapted but we need to forbid them completely: #33309 Closes #32899	2018-09-06 14:42:06 +02:00
Jason Tedor	bdfcc326d7	Enable avoiding mmap bootstrap check (#32421 ) The maximum map count bootstrap check can be a hindrance to users that do not own the underlying platform on which they are executing Elasticsearch. This is because addressing it requires tuning the kernel and a platform provider might not allow this, especially on shared infrastructure. However, this bootstrap check is not needed if mmapfs is not in use. Today we do not have a way for the user to communicate that they are not going to use mmapfs. This commit therefore adds a setting that enables the user to disallow mmapfs. When mmapfs is disallowed, the maximum map count bootstrap check is not enforced. Additionally, we fall back to a different default index store and prevent the explicit use of mmapfs for an index.	2018-08-21 11:02:25 -04:00
Nik Everett	22459576d7	Logging: Make node name consistent in logger (#31588 ) First, some background: we have 15 different methods to get a logger in Elasticsearch but they can be broken down into three broad categories based on what information is provided when building the logger. Just a class like: ``` private static final Logger logger = ESLoggerFactory.getLogger(ActionModule.class); ``` or: ``` protected final Logger logger = Loggers.getLogger(getClass()); ``` The class and settings: ``` this.logger = Loggers.getLogger(getClass(), settings); ``` Or more information like: ``` Loggers.getLogger("index.store.deletes", settings, shardId) ``` The goal of the "class and settings" variant is to attach the node name to the logger. Because we don't always have the settings available, we often use the "just a class" variant and get loggers without node names attached. There isn't any real consistency here. Some loggers get the node name because it is convenient and some do not. This change makes the node name available to all loggers all the time. Almost. There are some caveats around testing that I'll get to. But in production code the node name is now available to all loggers. This means we can stop using the "class and settings" variants to fetch loggers which was the real goal here, but a pleasant side effect is that the node name is now consistent on every log line and optional by editing the logging pattern. This is all powered by setting the node name statically on a logging formatter very early in initialization. Now to tests: tests can't set the node name statically because subclasses of `ESIntegTestCase` run many nodes in the same jvm, even in the same class loader. Also, lots of tests don't run with a real node so they don't have a node name at all. To support multiple nodes in the same JVM, tests suss out the node name from the thread name which works surprisingly well and is easy to test in a nice way. For those threads that are not part of an `ESIntegTestCase` node we stick whatever useful information we can get from the thread name in the place of the node name. This allows us to keep the logger format consistent.	2018-07-31 10:54:24 -04:00
Jason Tedor	588db621ac	Remove reference to non-existent store type (#32418 ) We removed the default_fs store type yet the docs still contain a reference to them. This commit addresses that by removing this reference, and changing a reference to this section of the docs to instead refer to mmapfs.	2018-07-27 11:24:03 -04:00
Adrien Grand	f5073813ef	Docs: Clarify constraints on scripted similarities. (#31076 ) Scripted similarities provide a lot of flexibility but they still need to obey some rules to not confuse Lucene.	2018-06-05 08:51:00 +02:00
srini-raman	0592b685b9	[Docs] Improve section detailing translog usage (#30573 )	2018-05-15 10:43:57 +02:00
Sue Gallagher	dd666599f7	[DOCS] Added 'on a single shard' to description of max_thread_count. Closes 28518 (#29686 )	2018-04-27 09:29:27 -07:00
Adrien Grand	569d0c0e89	Improve similarity integration. (#29187 ) This improves the way similarities are plugged in in order to: - reject the classic similarity on 7.x indices and emit a deprecation warning otherwise - reject unknown parameters on 7.x indices and emit a deprecation warning otherwise Even though this breaks the plugin API, I'd like to backport to 7.x so that users can get deprecation warnings when they are doing something that will become unsupported in the future. Closes #23208 Closes #29035	2018-04-03 16:45:25 +02:00
Adrien Grand	1d6ed824c7	Improve similarity docs. (#29089 ) This adds links to the relevant Lucene javadocs and warnings regarding similarities that might return 0 as a score. Close #29015	2018-03-21 10:41:10 +01:00
Jason Tedor	bddf9df8b4	Add search slowlog level to docs (#29040 ) This commit adds an indication how to set the search slowlog level to the docs.	2018-03-13 18:27:14 -04:00
David Turner	0a4a4c8a0e	Minor improvements to translog docs (#28237 ) The use of the phrase "translog" vs "transaction log" was inconsistent, and it was apparently unclear that the translog was stored on every shard copy.	2018-01-19 10:17:22 +00:00
Adrien Grand	1b660821a2	Allow `_doc` as a type. (#27816 ) Allowing `_doc` as a type will enable users to make the transition to 7.0 smoother since the index APIs will be `PUT index/_doc/id` and `POST index/_doc`. This also moves most of the documentation to `_doc` as a type name. Closes #27750 Closes #27751	2017-12-14 17:47:53 +01:00
Christoph Büscher	0d11b9fe34	[Docs] Unify spelling of Elasticsearch (#27567 ) Removes occurrences of "elasticsearch" or "ElasticSearch" in favour of "Elasticsearch" where appropriate.	2017-11-29 09:44:25 +01:00
Alexander Reelsen	66b5a43d0e	Logging: Unify log rotation for index/search slow log (#27298 ) The existing log rotation configuration allowed the index and search slow log to grow unbounded. This commit removes the date based rotation and adds the same size based rotation that the deprecation log already has.	2017-11-15 10:01:32 +01:00
agent5566	93a47cf860	Fix a typo in the similarity docs (#26970 )	2017-10-12 09:29:25 -07:00
Jim Ferenczi	74473c1c3d	Early termination with index sorting should not set terminated_early in the response (#26597 ) Early termination with index sorting always returns the best top N in the response but sets the flag `terminated_early` in the response. This can be confusing because we use the same flag for `terminate_after` which on the contrary returns partial results. This change removes the flag when results are not partial (early termination due to index sorting) and keeps it only when `terminate_after` is used. Closes #26408	2017-09-26 11:37:11 +02:00
Tanguy Leroux	db54c4dc7c	[Docs] Convert more doc snippets (#26404 ) This commit converts some remaining doc snippets so that they are now testable.	2017-08-30 09:30:36 +02:00
Adrien Grand	f0cba4fce5	Add a scripted similarity. (#25831 ) The goal of this similarity is to help users who would like to keep the functionality of the `tf-idf` similarity that we want to remove, or to allow for specific use cases (disabling idf, disabling tf, disabling length norm, etc.) to not have to build a custom plugin and familiarize themselves with the low-level Lucene API.	2017-08-08 08:55:12 +02:00
Adrien Grand	1b34f691e5	Remove reference to 32-bit systems. (#25971 ) They are not supported anymore as of #25435.	2017-07-31 09:55:09 +02:00
Clinton Gormley	ff4a2519f2	Update experimental labels in the docs (#25727 ) Relates https://github.com/elastic/elasticsearch/issues/19798 Removed experimental label from: * Painless * Diversified Sampler Agg * Sampler Agg * Significant Terms Agg * Terms Agg document count error and execution_hint * Cardinality Agg precision_threshold * Pipeline Aggregations * index.shard.check_on_startup * index.store.type (added warning) * Preloading data into the file system cache * foreach ingest processor * Field caps API * Profile API Added experimental label to: * Moving Average Agg Prediction Changed experimental to beta for: * Adjacency matrix agg * Normalizers * Tasks API * Index sorting Labelled experimental in Lucene: * ICU plugin custom rules file * Flatten graph token filter * Synonym graph token filter * Word delimiter graph token filter * Simple pattern tokenizer * Simple pattern split tokenizer Replaced experimental label with warning that details may change in the future: * Analysis explain output format * Segments verbose output format * Percentile Agg compression and HDR Histogram * Percentile Rank Agg HDR Histogram	2017-07-18 14:06:22 +02:00
Boaz Leskes	d963882053	Enable a long translog retention policy by default (#25294 ) #25147 added the translog deletion policy but didn't enable it by default. This PR enables a default retention of 512MB (same maximum size of the current translog) and an age of 12 hours (i.e., after 12 hours all translog files will be deleted). This increases the chance of having an ops based recovery, even if the primary flushed or the replica was offline for a few hours. In order to see which parts of the translog are committed into Lucene the translog stats are extended to include information about uncommitted operations. Views now include all translog ops and guarantee, as before, that those will not go away. Snapshotting a view allows to filter out generations that are not relevant based on a specific sequence number. Relates to #10708	2017-06-22 17:08:14 +02:00
Deb Adair	dbe2de0891	[DOCS] Fixed callout reference error.	2017-06-08 16:47:13 -07:00
Jim Ferenczi	36a5cf8f35	Automatically early terminate search query based on index sorting (#24864 ) This commit refactors the query phase in order to be able to automatically detect queries that can be early terminated. If the index sort matches the query sort, the top docs collection is early terminated on each segment and the computing of the total number of hits that match the query is delegated to a simple TotalHitCountCollector. This change also adds a new parameter to the search request called `track_total_hits`. It indicates if the total number of hits that match the query should be tracked. If false, queries sorted by the index sort will not try to compute this information and will limit the collection to the first N documents per segment. Aggregations are not impacted and will continue to see every document even when the index sort matches the query sort and `track_total_hits` is false. Relates #6720	2017-06-08 12:10:46 +02:00
Adrien Grand	bbdf50f6bd	Docs: More search speed advices. (#24802 )	2017-06-01 17:23:22 +02:00
Jim Ferenczi	f05af0a382	Enable index-time sorting (#24055 ) This change adds an index setting to define how the documents should be sorted inside each Segment. It allows any numeric, date, boolean or keyword field inside a mapping to be used to sort the index on disk. It is not allowed to use `nested` fields inside an index that defines an index sort since `nested` fields rely on the original sort of the index. This change does not add early termination capabilities in the search layer. This will be added in a follow up. Relates #6720	2017-04-19 14:36:11 +02:00
Adrien Grand	4632661bc7	Upgrade to a Lucene 7 snapshot (#24089 ) We want to upgrade to Lucene 7 ahead of time in order to be able to check whether it causes any trouble to Elasticsearch before Lucene 7.0 gets released. From a user perspective, the main benefit of this upgrade is the enhanced support for sparse fields, whose resource consumption is now a function of the number of docs that have a value rather than the total number of docs in the index. Some notes about the change: - it includes the deprecation of the `disable_coord` parameter of the `bool` and `common_terms` queries: Lucene has removed support for coord factors - it includes the deprecation of the `index.similarity.base` expert setting, since it was only useful to configure coords and query norms, which have both been removed - two tests have been marked with `@AwaitsFix` because of #23966, which we intend to address after the merge	2017-04-18 15:17:21 +02:00
Igor Motov	93b5e55660	Restores the original default format of search slow log In 5.0, the search slow log switched to the multi-line format with no option to get back to the original single-line format that was used prior to 5.0 by default. This commit removes the reformat option from the search slow log and returns the search slow log back to the single-line format. Closes #21711	2016-12-09 12:38:28 -05:00
Loek van Gool	1a23739211	Update store.asciidoc (#21353 ) * Update store.asciidoc * Update store.asciidoc * Update store.asciidoc	2016-11-05 14:58:16 +01:00
Adriel Dean-Hall	b72a708c0d	Add docs with up to date instructions on updating default similarity (#21242 ) * Add docs with up to date instructions on updating default similarity The default similarity can no longer be set in the configuration file (you will get an error on startup). Update the docs with the method that works. * Add instructions for changing similarity on index creation	2016-11-01 16:14:20 -04:00
Jason Tedor	96aa5e33ce	Fix slowlog docs This commit fixes two issues with the slow log docs: - clarifies that these settings are per index - updates index slow log configuration for Log4j 2 Relates #20976	2016-10-17 10:50:32 -04:00
Pascal Borreli	fcb01deb34	Fixed typos (#20843 )	2016-10-10 14:51:47 -06:00
Jason Tedor	750033dc4b	Update docs for Log4j 2 This commit updates the logging docs for Elasticsearch to reflect the migration to Log4j 2.	2016-08-31 15:51:52 -04:00
Lee Hinman	0ade5a207d	Add documentation for the 'elasticsearch-translog' tool This adds documentation to the translog page for the CLI truncation tool.	2016-08-02 16:26:28 -06:00
Sakthipriyan Vairamani	8d5a5e500a	file is -> file name (#18994 )	2016-06-21 13:20:56 +02:00
Jim Ferenczi	423291b6bc	Change default similarity to BM25 The default similarity was set to `classic` which refers to TFIDF and has not been moved after the upgrade to Lucene 6. Though moving to BM25 could have some downside for queries that rely on coordination factor (match_query, multi_match_query)? relates #18944	2016-06-21 11:29:36 +02:00
Adrien Grand	93415d4506	Expose MMapDirectory.preLoad(). #18880 The MMapDirectory has a switch that allows the content of files to be loaded into the filesystem cache upon opening. This commit exposes it with the new `index.store.pre_load` setting.	2016-06-20 13:42:56 +02:00
eratio08	26aacfff72	default values for BM25 Similarity (#18778 ) assuming elasticsearch uses the lucene default values	2016-06-13 18:57:44 +02:00
trangvh	c0da8e4060	Fix some typos (#18746 ) * Update java-doc of SearchResponse.getProfileResults() * Fix a trivial typo in Reference document	2016-06-07 16:41:39 +02:00
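
The two `hits.total` commits above (18866c4c0b and 787acb14b9) describe a response shape that client code has to handle in three variants: an object with `value` and `relation`, a plain number when `rest_total_hits_as_int` is set, or no `hits.total` at all when total hits are not tracked. The following is a minimal Python sketch of how a client might normalize these. The function name `read_total_hits` and the sample response dictionaries are hypothetical and exist only for illustration; the sketch also assumes that a plain-number total is always accurate, which the commit message states only for the default case.

```python
from typing import Any, Dict, Optional, Tuple


def read_total_hits(response: Dict[str, Any]) -> Optional[Tuple[int, bool]]:
    """Return (value, is_exact) for a search response dict.

    Returns None when hits.total is absent (total hits not tracked).
    Hypothetical helper; not part of any client library.
    """
    total = response.get("hits", {}).get("total")
    if total is None:
        return None
    if isinstance(total, int):
        # Plain number, as returned with rest_total_hits_as_int.
        # Assumption: treated as an accurate count.
        return total, True
    # Object form: relation "eq" means exact, "gte" means lower bound.
    return total["value"], total["relation"] == "eq"


# Illustrative (made-up) response fragments.
capped = {"hits": {"total": {"value": 10000, "relation": "gte"}}}
exact = {"hits": {"total": {"value": 42, "relation": "eq"}}}
as_int = {"hits": {"total": 42}}
untracked = {"hits": {}}

for sample in (capped, exact, as_int, untracked):
    print(read_total_hits(sample))
```

Returning the exactness flag alongside the value lets a caller show "10,000 or more" for a capped count instead of presenting a lower bound as an exact number.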

195 Commits