Commit Graph

36 Commits

Author SHA1 Message Date
Adam Locke 776e9507fb
[DOCS] Update similarity.asciidoc (#59400) (#59644)
Community contribution to fix linking issues in the Similarity module docs.

Co-authored-by: Xin Yan <SHU_Yanx@hotmail.com>
2020-07-15 14:12:00 -04:00
Ahmet Arslan 52062565a9 [DOCS] Correct DFI docs regarding stop word removal (#53836)
The documentation of DFI should recommend *not* to [remove stop words][1], since DFI is good at scoring queries that contain common terms: `the wall`, `the sun`, `the who`, etc.

[1]:https://lucene.apache.org/core/8_1_1/core/org/apache/lucene/search/similarities/DFISimilarity.html
2020-03-24 10:48:42 -04:00
James Rodewig d590150ca2 [DOCS] Fix indent issue in similarity snippet (#51107)
Updates snippet to consistently use 2-space indentation. The snippet
previously used a mix of tab/5-space and 2-space indents.

Co-authored-by: Peter Johnson <wiz@wiz.co.nz>

Co-authored-by: Peter Johnson <peter@geocode.earth>
2020-01-16 11:00:15 -05:00
blueSky1825821 5ff6eafb4b [Docs] Update similarity.asciidoc (#50719)
DFRSimilarity -> DFR similarity
2020-01-08 17:48:26 +01:00
James Rodewig f04573f8e8
[DOCS] [5 of 5] Change // TESTRESPONSE comments to [source,console-results] (#46449) (#46459) 2019-09-06 16:09:09 -04:00
James Rodewig c46c57d439
[DOCS] Change // CONSOLE comments to [source,console] (#46441) (#46451) 2019-09-06 11:31:13 -04:00
Jim Ferenczi ae65f77add Update docs for the DFR similarity (#40579)
The basic models `b, de, p` and the after effect `no`
are not available anymore in Lucene 8 but they are still
listed in the >7x documentation. This change removes these
references that should also be listed in the breaking change
of es 7.0.

Closes #40264
2019-03-29 10:44:24 +01:00
Christoph Büscher 34f2d2ec91
Remove remaining occurances of "include_type_name=true" in docs (#37646) 2019-01-22 15:13:52 +01:00
Julie Tibshirani 36a3b84fc9
Update the default for include_type_name to false. (#37285)
* Default include_type_name to false for get and put mappings.

* Default include_type_name to false for get field mappings.

* Add a constant for the default include_type_name value.

* Default include_type_name to false for get and put index templates.

* Default include_type_name to false for create index.

* Update create index calls in REST documentation to use include_type_name=true.

* Some minor clean-ups around the get index API.

* In REST tests, use include_type_name=true by default for index creation.

* Make sure to use 'expression == false'.

* Clarify the different IndexTemplateMetaData toXContent methods.

* Fix FullClusterRestartIT#testSnapshotRestore.

* Fix the ml_anomalies_default_mappings test.

* Fix GetFieldMappingsResponseTests and GetIndexTemplateResponseTests.

We make sure to specify include_type_name=true during xContent parsing,
so we continue to test the legacy typed responses. XContent generation
for the typeless responses is currently only covered by REST tests,
but we will be adding unit test coverage for these as we implement
each typeless API in the Java HLRC.

This commit also refactors GetMappingsResponse to follow the same appraoch
as the other mappings-related responses, where we read include_type_name
out of the xContent params, instead of creating a second toXContent method.
This gives better consistency in the response parsing code.

* Fix more REST tests.

* Improve some wording in the create index documentation.

* Add a note about types removal in the create index docs.

* Fix SmokeTestMonitoringWithSecurityIT#testHTTPExporterWithSSL.

* Make sure to mention include_type_name in the REST docs for affected APIs.

* Make sure to use 'expression == false' in FullClusterRestartIT.

* Mention include_type_name in the REST templates docs.
2019-01-14 13:08:01 -08:00
Jim Ferenczi 18866c4c0b
Make hits.total an object in the search response (#35849)
This commit changes the format of the `hits.total` in the search response to be an object with
a `value` and a `relation`. The `value` indicates the number of hits that match the query and the
`relation` indicates whether the number is accurate (in which case the relation is equals to `eq`)
or a lower bound of the total (in which case it is equals to `gte`).
This change also adds a parameter called `rest_total_hits_as_int` that can be used in the
search APIs to opt out from this change (retrieve the total hits as a number in the rest response).
Note that currently all search responses are accurate (`track_total_hits: true`) or they don't contain
`hits.total` (`track_total_hits: true`). We'll add a way to get a lower bound of the total hits in a
follow up (to allow numbers to be passed to `track_total_hits`).

Relates #33028
2018-12-05 19:49:06 +01:00
Jim Ferenczi 7ad71f906a
Upgrade to a Lucene 8 snapshot (#33310)
The main benefit of the upgrade for users is the search optimization for top scored documents when the total hit count is not needed. However this optimization is not activated in this change, there is another issue opened to discuss how it should be integrated smoothly.
Some comments about the change:
* Tests that can produce negative scores have been adapted but we need to forbid them completely: #33309

Closes #32899
2018-09-06 14:42:06 +02:00
Adrien Grand f5073813ef
Docs: Clarify constraints on scripted similarities. (#31076)
Scripted similarities provide a lot of flexibility but they still need to obey
some rules to not confuse Lucene.
2018-06-05 08:51:00 +02:00
Adrien Grand 569d0c0e89
Improve similarity integration. (#29187)
This improves the way similarities are plugged in in order to:
 - reject the classic similarity on 7.x indices and emit a deprecation
   warning otherwise
 - reject unkwown parameters on 7.x indices and emit a deprecation
   warning otherwise

Even though this breaks the plugin API, I'd like to backport to 7.x so
that users can get deprecation warnings when they are doing something
that will become unsupported in the future.

Closes #23208
Closes #29035
2018-04-03 16:45:25 +02:00
Adrien Grand 1d6ed824c7
Improve similarity docs. (#29089)
This adds links to the relevant Lucene javadocs and warnings regarding
similarities that might return 0 as a score.

Close #29015
2018-03-21 10:41:10 +01:00
Adrien Grand 1b660821a2
Allow `_doc` as a type. (#27816)
Allowing `_doc` as a type will enable users to make the transition to 7.0
smoother since the index APIs will be `PUT index/_doc/id` and `POST index/_doc`.
This also moves most of the documentation to `_doc` as a type name.

Closes #27750
Closes #27751
2017-12-14 17:47:53 +01:00
agent5566 93a47cf860 Fix a typo in the similarity docs (#26970) 2017-10-12 09:29:25 -07:00
Tanguy Leroux db54c4dc7c [Docs] Convert more doc snippets (#26404)
This commit converts some remaining doc snippets so that they are now
testable.
2017-08-30 09:30:36 +02:00
Adrien Grand f0cba4fce5 Add a scripted similarity. (#25831)
The goal of this similarity is to help users who would like to keep the
functionality of the `tf-idf` similarity that we want to remove, or to allow
for specific usec-cases (disabling idf, disabling tf, disabling length norm,
etc.) to not have to build a custom plugin and familiarize with the low-level
Lucene API.
2017-08-08 08:55:12 +02:00
Adrien Grand 4632661bc7 Upgrade to a Lucene 7 snapshot (#24089)
We want to upgrade to Lucene 7 ahead of time in order to be able to check whether it causes any trouble to Elasticsearch before Lucene 7.0 gets released. From a user perspective, the main benefit of this upgrade is the enhanced support for sparse fields, whose resource consumption is now function of the number of docs that have a value rather than the total number of docs in the index.

Some notes about the change:
 - it includes the deprecation of the `disable_coord` parameter of the `bool` and `common_terms` queries: Lucene has removed support for coord factors
 - it includes the deprecation of the `index.similarity.base` expert setting, since it was only useful to configure coords and query norms, which have both been removed
 - two tests have been marked with `@AwaitsFix` because of #23966, which we intend to address after the merge
2017-04-18 15:17:21 +02:00
Adriel Dean-Hall b72a708c0d Add docs with up to date instructions on updating default similarity (#21242)
* Add docs with up to date instructions on updating default similarity

The default similarity can no longer be set in the configuration file
(you will get an error on startup). Update the docs with the method
that works.

* Add instructions for changing similarity on index creation
2016-11-01 16:14:20 -04:00
Jim Ferenczi 423291b6bc Change default similarity to BM25
The default similarity was set to `classic` which refers to TFIDF and has not been moved after the upgrade to Lucene 6.

Though moving to BM25 could have some downside for queries that relies on coordination factor (match_query, multi_match_query) ?

relates #18944
2016-06-21 11:29:36 +02:00
eratio08 26aacfff72 default values for BM25 Similarity (#18778)
assuming elasticsearch uses the lucene default values
2016-06-13 18:57:44 +02:00
eratio08 7e00a1c1a3 Added Type name for DFI (#18480) 2016-05-20 11:02:06 +02:00
Adrien Grand b42f66c8ac Document 5.0 mapping changes. 2016-03-22 16:22:58 +01:00
Robert Muir d5dc05f69e Upgrade to lucene 5.5.0-snapshot-1725675 2016-02-02 22:53:39 -05:00
Robert Muir 6e7e3a2274 Update lucene to r1725675
Adds DFI (divergence from independence) provider.
Fixes test bugs passing invalid values for BM25 parameters.
2016-01-20 03:32:51 -05:00
Jim Ferenczi 81fd2169cf Renames "default" similarity into "classic".
Replaces deprecated DefaultSimilarity by ClassicSimilarity.
Fixes #15102
2015-12-21 16:22:53 +01:00
Clinton Gormley f20f41e02e Merge pull request #15405 from alexg-dev/patch-1
More detailed explanation of some similarity types
2015-12-14 14:28:43 +01:00
William e042e06a5a Update similarity.asciidoc 2015-11-19 16:41:29 -08:00
Clinton Gormley ac2b8951c6 Docs: Mapping docs completely rewritten for 2.0 2015-08-06 17:24:51 +02:00
Kevin Wang ecab74fe6c add lucene language model similarities (Dirichlet & JelinekMercer) 2014-04-07 10:48:03 +02:00
Oleg Anashkin eb0e1aa38f Fix typo in similarity docs
DRF similarity -> DFR similarity
2014-02-13 07:45:30 -08:00
Lee Hinman ba40aa374e Uniquify anchor links to fix asciidoc/docbook generation 2013-09-30 15:32:00 -06:00
Lee Hinman 0442b737be Add more anchor links to documentation
Related to #3679
2013-09-30 13:13:16 -06:00
Clinton Gormley 393c28bee4 [DOCS] Removed outdated new/deprecated version notices 2013-09-03 21:28:31 +02:00
Clinton Gormley 822043347e Migrated documentation into the main repo 2013-08-29 01:24:34 +02:00