OpenSearch

Commit Graph

Author	SHA1	Message	Date
agent5566	93a47cf860	Fix a typo in the similarity docs (#26970)	2017-10-12 09:29:25 -07:00
Jim Ferenczi	74473c1c3d	Early termination with index sorting should not set terminated_early in the response (#26597) Early termination with index sorting always returns the best top N in the response but sets the flag `terminated_early` in the response. This can be confusing because we use the same flag for `terminate_after`, which on the contrary returns partial results. This change removes the flag when results are not partial (early termination due to index sorting) and keeps it only when `terminate_after` is used. Closes #26408	2017-09-26 11:37:11 +02:00
Tanguy Leroux	db54c4dc7c	[Docs] Convert more doc snippets (#26404) This commit converts some remaining doc snippets so that they are now testable.	2017-08-30 09:30:36 +02:00
Adrien Grand	f0cba4fce5	Add a scripted similarity. (#25831) The goal of this similarity is to help users who would like to keep the functionality of the `tf-idf` similarity that we want to remove, or to allow for specific use cases (disabling idf, disabling tf, disabling length norm, etc.) without having to build a custom plugin and familiarize themselves with the low-level Lucene API.	2017-08-08 08:55:12 +02:00
Adrien Grand	1b34f691e5	Remove reference to 32-bit systems. (#25971) They are not supported anymore as of #25435.	2017-07-31 09:55:09 +02:00
Clinton Gormley	ff4a2519f2	Update experimental labels in the docs (#25727) Relates https://github.com/elastic/elasticsearch/issues/19798 Removed experimental label from: * Painless * Diversified Sampler Agg * Sampler Agg * Significant Terms Agg * Terms Agg document count error and execution_hint * Cardinality Agg precision_threshold * Pipeline Aggregations * index.shard.check_on_startup * index.store.type (added warning) * Preloading data into the file system cache * foreach ingest processor * Field caps API * Profile API Added experimental label to: * Moving Average Agg Prediction Changed experimental to beta for: * Adjacency matrix agg * Normalizers * Tasks API * Index sorting Labelled experimental in Lucene: * ICU plugin custom rules file * Flatten graph token filter * Synonym graph token filter * Word delimiter graph token filter * Simple pattern tokenizer * Simple pattern split tokenizer Replaced experimental label with warning that details may change in the future: * Analysis explain output format * Segments verbose output format * Percentile Agg compression and HDR Histogram * Percentile Rank Agg HDR Histogram	2017-07-18 14:06:22 +02:00
Boaz Leskes	d963882053	Enable a long translog retention policy by default (#25294) #25147 added the translog deletion policy but didn't enable it by default. This PR enables a default retention of 512MB (same maximum size of the current translog) and an age of 12 hours (i.e., after 12 hours all translog files will be deleted). This increases the chance of having an ops-based recovery, even if the primary flushed or the replica was offline for a few hours. In order to see which parts of the translog are committed into Lucene, the translog stats are extended to include information about uncommitted operations. Views now include all translog ops and guarantee, as before, that those will not go away. Snapshotting a view allows filtering out generations that are not relevant based on a specific sequence number. Relates to #10708	2017-06-22 17:08:14 +02:00
Deb Adair	dbe2de0891	[DOCS] Fixed callout reference error.	2017-06-08 16:47:13 -07:00
Jim Ferenczi	36a5cf8f35	Automatically early terminate search query based on index sorting (#24864) This commit refactors the query phase in order to be able to automatically detect queries that can be early terminated. If the index sort matches the query sort, the top docs collection is early terminated on each segment and the computing of the total number of hits that match the query is delegated to a simple TotalHitCountCollector. This change also adds a new parameter to the search request called `track_total_hits`. It indicates if the total number of hits that match the query should be tracked. If false, queries sorted by the index sort will not try to compute this information and will limit the collection to the first N documents per segment. Aggregations are not impacted and will continue to see every document even when the index sort matches the query sort and `track_total_hits` is false. Relates #6720	2017-06-08 12:10:46 +02:00
Adrien Grand	bbdf50f6bd	Docs: More search speed advices. (#24802)	2017-06-01 17:23:22 +02:00
Jim Ferenczi	f05af0a382	Enable index-time sorting (#24055) This change adds an index setting to define how the documents should be sorted inside each Segment. It allows any numeric, date, boolean or keyword field inside a mapping to be used to sort the index on disk. It is not allowed to use `nested` fields inside an index that defines an index sorting since `nested` fields rely on the original sort of the index. This change does not add early termination capabilities in the search layer. This will be added in a follow up. Relates #6720	2017-04-19 14:36:11 +02:00
Adrien Grand	4632661bc7	Upgrade to a Lucene 7 snapshot (#24089) We want to upgrade to Lucene 7 ahead of time in order to be able to check whether it causes any trouble to Elasticsearch before Lucene 7.0 gets released. From a user perspective, the main benefit of this upgrade is the enhanced support for sparse fields, whose resource consumption is now a function of the number of docs that have a value rather than the total number of docs in the index. Some notes about the change: - it includes the deprecation of the `disable_coord` parameter of the `bool` and `common_terms` queries: Lucene has removed support for coord factors - it includes the deprecation of the `index.similarity.base` expert setting, since it was only useful to configure coords and query norms, which have both been removed - two tests have been marked with `@AwaitsFix` because of #23966, which we intend to address after the merge	2017-04-18 15:17:21 +02:00
Igor Motov	93b5e55660	Restores the original default format of search slow log In 5.0, the search slow log switched to the multi-line format with no option to get back to the original single-line format that was used prior to 5.0 by default. This commit removes the reformat option from the search slow log and returns the search slow log back to the single-line format. Closes #21711	2016-12-09 12:38:28 -05:00
Loek van Gool	1a23739211	Update store.asciidoc (#21353) * Update store.asciidoc * Update store.asciidoc * Update store.asciidoc	2016-11-05 14:58:16 +01:00
Adriel Dean-Hall	b72a708c0d	Add docs with up to date instructions on updating default similarity (#21242) * Add docs with up to date instructions on updating default similarity The default similarity can no longer be set in the configuration file (you will get an error on startup). Update the docs with the method that works. * Add instructions for changing similarity on index creation	2016-11-01 16:14:20 -04:00
Jason Tedor	96aa5e33ce	Fix slowlog docs This commit fixes two issues with the slow log docs: - clarifies that these settings are per index - updates index slow log configuration for Log4j 2 Relates #20976	2016-10-17 10:50:32 -04:00
Pascal Borreli	fcb01deb34	Fixed typos (#20843)	2016-10-10 14:51:47 -06:00
Jason Tedor	750033dc4b	Update docs for Log4j 2 This commit updates the logging docs for Elasticsearch to reflect the migration to Log4j 2.	2016-08-31 15:51:52 -04:00
Lee Hinman	0ade5a207d	Add documentation for the 'elasticsearch-translog' tool This adds documentation to the translog page for the CLI truncation tool.	2016-08-02 16:26:28 -06:00
Sakthipriyan Vairamani	8d5a5e500a	file is -> file name (#18994)	2016-06-21 13:20:56 +02:00
Jim Ferenczi	423291b6bc	Change default similarity to BM25 The default similarity was set to `classic` which refers to TFIDF and has not been moved after the upgrade to Lucene 6. Though moving to BM25 could have some downside for queries that rely on the coordination factor (match_query, multi_match_query)? relates #18944	2016-06-21 11:29:36 +02:00
Adrien Grand	93415d4506	Expose MMapDirectory.preLoad(). #18880 The MMapDirectory has a switch that allows the content of files to be loaded into the filesystem cache upon opening. This commit exposes it with the new `index.store.pre_load` setting.	2016-06-20 13:42:56 +02:00
eratio08	26aacfff72	default values for BM25 Similarity (#18778) assuming elasticsearch uses the lucene default values	2016-06-13 18:57:44 +02:00
trangvh	c0da8e4060	Fix some typos (#18746) * Update java-doc of SearchResponse.getProfileResults() * Fix a trivial typo in Reference document	2016-06-07 16:41:39 +02:00
Nik Everett	72eb621bce	Docs: Replace [source,json] with [source,js] The syntax highlighter only supports [source,js]. Also adds a check to the rest test generator that runs during the build that'll fail the build if it sees `[source,json]`.	2016-05-24 11:17:27 -04:00
eratio08	7e00a1c1a3	Added Type name for DFI (#18480)	2016-05-20 11:02:06 +02:00
Jason Tedor	c257e2c51f	Remove settings and system properties entanglement Today when parsing settings during bootstrap, we add a system property for every Elasticsearch setting. Additionally, settings can be set via system properties. This commit simplifies this situation. - settings are no longer propagated to system properties - system properties can not be used to set settings - the "es." prefix on settings is no longer required (nor permitted) - test logging has a dedicated system property (tests.logger.level) Relates #18198	2016-05-19 14:08:08 -04:00
Clinton Gormley	3f594089c2	Renamed all AUTOSENSE snippets to CONSOLE (#18210)	2016-05-09 15:42:23 +02:00
Nik Everett	4b1c116461	Generate and run tests from the docs Adds infrastructure so `gradle :docs:check` will extract tests from snippets in the documentation and execute the tests. This is included in `gradle check` so it should happen on CI and during a normal build. By default each `// AUTOSENSE` snippet creates a unique REST test. These tests are executed in a random order and the cluster is wiped between each one. If multiple snippets chain together into a test you can annotate all snippets after the first with `// TEST[continued]` to have the generated tests for both snippets joined. Snippets marked as `// TESTRESPONSE` are checked against the response of the last action. See docs/README.asciidoc for lots more. Closes #12583. That issue is about catching bugs in the docs during build. This catches some bugs in the docs during build which is a good start.	2016-05-05 13:58:03 -04:00
Adrien Grand	51a53c55cb	Update store documentation after #17616.	2016-05-04 08:53:11 +02:00
Simon Willnauer	2514681f66	updating filtering.asciidoc to also use 'node.attr' namespace	2016-03-30 14:11:59 +02:00
Adrien Grand	b42f66c8ac	Document 5.0 mapping changes.	2016-03-22 16:22:58 +01:00
Jason Tedor	8a05c2a2be	Bootstrap does not set system properties Today, certain bootstrap properties are set and read via system properties. This action-at-distance way of managing these properties is rather confusing, and completely unnecessary. But another problem exists with setting these as system properties. Namely, these system properties are interpreted as Elasticsearch settings, not all of which are registered. This leads to Elasticsearch failing to start up if any of these special properties are set. Instead, these properties should be kept as local as possible, and passed around as method parameters where needed. This eliminates the action-at-distance way of handling these properties, and eliminates the need to register these non-setting properties. This commit does exactly that. Additionally, today we use the "-D" command line flag to set the properties, but this is confusing because "-D" is a special flag to the JVM for setting system properties. This creates confusion because some "-D" properties should be passed via arguments to the JVM (so via ES_JAVA_OPTS), and some should be passed as arguments to Elasticsearch. This commit changes the "-D" flag for Elasticsearch settings to "-E".	2016-03-13 20:09:15 -04:00
Clinton Gormley	6d7e8814d6	Redocument the `index.merge.scheduler.max_thread_count` setting Closes #16961	2016-03-05 16:28:43 +01:00
Boaz Leskes	4a7980f96c	Merge pull request #16766 from rstruber/patch-1 fix grammar in Total Shards Per Node docs	2016-02-22 08:42:42 -08:00
Dongjoon Hyun	21ea552070	Fix typos in docs.	2016-02-09 02:07:32 -08:00
Robert Muir	d5dc05f69e	Upgrade to lucene 5.5.0-snapshot-1725675	2016-02-02 22:53:39 -05:00
Simon Willnauer	84ce9f3618	Remove the ability to fsync on every operation and only schedule fsync task if really needed This commit limits the `index.translog.sync_interval` to a value not less than `100ms` and removes the support for fsync on every operation which used to be enabled if `index.translog.sync_interval` was set to `0s`. Now this PR also only schedules an async fsync if the durability is set to `async`. By default no async task is scheduled. Closes #16152	2016-01-27 12:28:38 +01:00
Robert Muir	6e7e3a2274	Update lucene to r1725675 Adds DFI (divergence from independence) provider. Fixes test bugs passing invalid values for BM25 parameters.	2016-01-20 03:32:51 -05:00
Jim Ferenczi	992ffac509	Merge pull request #15446 from jimferenczi/classic_similarity Renames `default` similarity into `classic`	2015-12-30 08:42:20 -08:00
Simon Willnauer	fcfd98e9e8	Drop support for simple translog and hard-wire buffer to 8kb Today we have two variants of translogs for indexing. We only recommend the buffered one which also has a 20% advantage in indexing speed. This commit removes the option and defaults to the buffered case. It also hard-wires the translog buffer to 8kb instead of 64kb. We used to adjust that buffer based on if the shard is active or not, this code has also been removed and instead we just keep an 8kb buffer around.	2015-12-21 16:44:35 +01:00
Jim Ferenczi	81fd2169cf	Renames "default" similarity into "classic". Replaces deprecated DefaultSimilarity by ClassicSimilarity. Fixes #15102	2015-12-21 16:22:53 +01:00
Simon Willnauer	afc1cc19af	Simplify translog-based flush settings This commit removes `index.translog.flush_threshold_ops` and `index.translog.disable_flush` in favor of `index.translog.flush_threshold_size`. The number of operations is meaningless by itself and can easily be turned into a size value with knowledge of the data. Disabling the flush is only useful in tests and we can set the size value to a really high value. If users really need to do this they can also apply a very high value like `1PB`.	2015-12-21 15:15:00 +01:00
Clinton Gormley	f20f41e02e	Merge pull request #15405 from alexg-dev/patch-1 More detailed explanation of some similarity types	2015-12-14 14:28:43 +01:00
William	e042e06a5a	Update similarity.asciidoc	2015-11-19 16:41:29 -08:00
Yannick Welsch	2084df825f	Simplify delayed shard allocation - moves calculation of the delay to a single place (ReplicaShardAllocator) - reduces coupling between GatewayAllocator and RoutingService - in master failover situations, elapsed delay time is forgotten Closes #14808	2015-11-19 09:53:07 +01:00
Lee Hinman	145374b762	Add cluster-wide setting for total shard limit This adds the `cluster.routing.allocation.total_shards_per_node` setting, which limits the total number of shards across all indices on each node. It defaults to -1 and can be dynamically configured. Resolves #14456	2015-11-09 11:03:07 -07:00
Jason O'Donnell	c7060c1b63	Fixing typo	2015-10-26 16:48:20 -04:00
Jason O'Donnell	73f620907d	Fixing typo	2015-10-26 16:43:25 -04:00
Simon Willnauer	75e816400c	Remove TranslogService and fold it into synchronous IndexShard API This commit moves the size and ops based flush into a synchronous API into IndexShard and removes the time-based flush altogether since it's basically covered by the inactive async flush API we have today. The functionality doesn't need to be covered by scheduled task and async APIs while we can actually make all the decisions in a sync manner which is way easier to control and to test. Closes #13707	2015-09-23 12:39:06 +02:00

169 Commits