OpenSearch

Commit Graph

Author	SHA1	Message	Date
Allen Torres	887fbb6387	Update lowercase-tokenizer.asciidoc (#21896) Fixed typo	2016-12-02 10:49:51 -05:00
Matt Weber	04e07bcdb6	Synonym Graph Support (LUCENE-6664) (#21517) Integrate the patch from LUCENE-6664 into elasticsearch and add support for handling a graph token stream in match/multi-match queries. This fixes longstanding bugs with multi-token synonyms returning incorrect results with proximity queries.	2016-11-28 09:25:49 -08:00
Achraf	d81a928b1f	Correction of the names of numirals (#21531) What was called Arabic numerals is actually Hindu - Eastern Arabic notation. And the Latin numerals you refer to is the Arabic numbers.	2016-11-25 14:30:49 +01:00
Pascal Borreli	fcb01deb34	Fixed typos (#20843)	2016-10-10 14:51:47 -06:00
Clinton Gormley	22f1acde94	Docs: Pattern analyzer does not support a max_token_length parameter Closes #20713	2016-10-08 12:27:33 +02:00
Alexander Lin	7cd0316b51	Fix minhash docs level Relates #20547	2016-09-19 07:54:04 -04:00
Clinton Gormley	2f6d0119f1	Added warning messages about the dangers of pathological regexes to: * pattern-replace charfilter * pattern-capture and pattern-replace token filters * pattern tokenizer * pattern analyzer Relates to #20038	2016-09-09 09:53:07 +02:00
Alexander Lin	f825e8f4cb	Exposing lucene 6.x minhash filter. (#20206) Exposing lucene 6.x minhash tokenfilter Generate min hash tokens from an incoming stream of tokens that can be used to estimate document similarity. Closes #20149	2016-09-07 09:38:12 +02:00
Jim Ferenczi	4682fc34ae	Add the ability to disable the retrieval of the stored fields entirely This change adds a special field named _none_ that allows to disable the retrieval of the stored fields in a search request or in a TopHitsAggregation. To completely disable stored fields retrieval (including disabling metadata fields retrieval such as _id or _type) use _none_ like this: ```` POST _search { "stored_fields": "_none_" } ````	2016-08-24 16:40:08 +02:00
markwalkom	f556424ab9	Update synonym-tokenfilter.asciidoc (#19988) * Update synonym-tokenfilter.asciidoc * Update synonym-tokenfilter.asciidoc	2016-08-17 13:39:22 +02:00
Nik Everett	7aeea764ba	Remove wait_for_status=yellow from the docs It is no longer required after `687e2e12b3`.	2016-07-15 16:02:07 -04:00
Clinton Gormley	6f17736eb1	Fixed asciidoc	2016-07-15 12:58:38 +02:00
Jim Ferenczi	881afcba60	Fixed tests that failed now that BM25 is the default similarity.	2016-06-21 15:42:42 +02:00
Nik Everett	a0585269be	[docs] s/lags/Flags/ Copy and paste lots an `F`.	2016-06-09 13:08:53 -04:00
Nik Everett	09cc4c449a	[docs] Pattern replace char filter now support flags	2016-06-09 12:41:20 -04:00
Clinton Gormley	5da9e5dcbc	Docs: Improved tokenizer docs (#18356) * Docs: Improved tokenizer docs Added descriptions and runnable examples * Addressed Nik's comments * Added TESTRESPONSEs for all tokenizer examples * Added TESTRESPONSEs for all analyzer examples too * Added docs, examples, and TESTRESPONSES for character filters * Skipping two tests: One interprets "$1" as a stack variable - same problem exists with the REST tests The other because the "took" value is always different * Fixed tests with "took" * Fixed failing tests and removed preserve_original from fingerprint analyzer	2016-05-19 19:42:23 +02:00
Nik Everett	8155e1efda	[docs] Add wait_for_status=yellow Another unstable snippet.... https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+master+multijob-os-compatibility/os=sles/402/console	2016-05-12 17:53:34 -04:00
Zachary Tong	5ee5cc25cc	Move AsciiFolding earlier in FingerprintAnalyzer filter chain Rearranges the FingerprintAnalyzer so that AsciiFolding comes earlier in the chain (after lowercasing, before stop removal, for maximum deduping power) Closes #18266	2016-05-12 09:34:15 -04:00
Clinton Gormley	97a41ee973	First pass at improving analyzer docs (#18269) * Docs: First pass at improving analyzer docs I've rewritten the intro to analyzers plus the docs for all analyzers to provide working examples. I've also removed: * analyzer aliases (see #18244) * analyzer versions (see #18267) * snowball analyzer (see #8690) Next steps will be tokenizers, token filters, char filters * Fixed two typos	2016-05-11 14:17:56 +02:00
Clinton Gormley	3f594089c2	Renamed all AUTOSENSE snippets to CONSOLE (#18210)	2016-05-09 15:42:23 +02:00
Nik Everett	3912761572	[docs] Add wait_until_yellow to fix build failure The snippet in the docs creates and index and uses it with the _analyze api. The trouble is that if the index hasn't been created fully the _analyze API will fail. This adds a GET _cluster/health?wait_for_status=yellow which fixes the issue. While this does make the docs more cluttered, it also makes the snippets actually runnable. Closes #18165	2016-05-05 16:02:00 -04:00
Nik Everett	4b1c116461	Generate and run tests from the docs Adds infrastructure so `gradle :docs:check` will extract tests from snippets in the documentation and execute the tests. This is included in `gradle check` so it should happen on CI and during a normal build. By default each `// AUTOSENSE` snippet creates a unique REST test. These tests are executed in a random order and the cluster is wiped between each one. If multiple snippets chain together into a test you can annotate all snippets after the first with `// TEST[continued]` to have the generated tests for both snippets joined. Snippets marked as `// TESTRESPONSE` are checked against the response of the last action. See docs/README.asciidoc for lots more. Closes #12583. That issue is about catching bugs in the docs during build. This catches some bugs in the docs during build which is a good start.	2016-05-05 13:58:03 -04:00
Zachary Tong	80288ad60c	Add `fingerprint` token filter and `fingerprint` analyzer Adds a `fingerprint` token filter which uses Lucene's FingerprintFilter, and a `fingerprint` analyzer that combines the Fingerprint filter with lowercasing, stop word removal and asciifolding. Closes #13325	2016-04-20 16:10:56 -04:00
Clinton Gormley	a62b9296c6	Docs: Fixed link to phonetic plugin	2016-04-13 10:17:46 +02:00
Adrien Grand	b42f66c8ac	Document 5.0 mapping changes.	2016-03-22 16:22:58 +01:00
Clinton Gormley	dc21ab7576	Docs: Corrected behaviour of max_token_length in standard tokenizer	2016-03-18 10:58:16 +01:00
Clinton Gormley	a5a9bbfe88	Update compound-word-tokenfilter.asciidoc Only FOP v1.2 compatible hyphenation files are supported by the hyphenation decompounder	2016-03-11 15:08:36 +01:00
Lee Hinman	6adbbff97c	Fix organization rename in all files in project Basically a query-replace of "https://github.com/elasticsearch/" with "https://github.com/elastic/"	2016-03-03 12:04:13 -07:00
Andrey Ryaguzov	f744c3f724	Docs: Added migration description for custom analysis file path Closes #15597 Closes #15556	2016-02-29 20:56:19 +01:00
Dongjoon Hyun	21ea552070	Fix typos in docs.	2016-02-09 02:07:32 -08:00
Adrien Grand	f8e802c028	Merge pull request #15794 from damienalexandre/french-doc [Doc] Fix french analyzer elision token filter doc	2016-01-06 18:39:26 +01:00
Damien Alexandre	23a64f8214	Fix french analyzer elision token filter doc Fix #15774	2016-01-06 18:26:03 +01:00
David Pilato	995e796eab	[doc] Fix cross link with ICU plugin Doc bug introduced with #15695	2015-12-30 12:07:33 +01:00
David Pilato	3076377fdb	Remove ICU Plugin in reference guide This documentation lives now in plugins documentation at https://www.elastic.co/guide/en/elasticsearch/plugins/current/analysis-icu.html. We don't need a copy in analysis reference guide.	2015-12-29 11:23:28 +01:00
socurites	485915bbe7	comma(,) was duplicated deleted it.	2015-12-24 14:31:26 +01:00
socurites	25d23091e2	Edge NGram: "side" setting was depercated Edge NGram: "side" setting was depercated	2015-12-24 14:26:24 +01:00
Jason Tedor	d9a24961c5	Fix minor issues in delimited payload token filter docs This commit addresses a few minor issues in the delimited payload token filter docs: - the provided example reversed the payloads associated with the tokens "the" and "fox" - two additional typos in the same sentence - "per default" -> "by default" - "default int to" -> "default into" - adds two serial commas	2015-12-16 13:00:20 -05:00
tomoya yokota	82d26c852a	property name is not right `ignore_script` is not right. `ignored_script' is right. See org.elasticsearch.index.analysis.CJKBigramFilterFactory	2015-11-26 14:22:23 +09:00
Clinton Gormley	98028419a5	Merge pull request #14610 from yokotaso/patch-1 Update snowball document page.	2015-11-17 14:17:30 +01:00
Jason O'Donnell	42fb690a1c	Fixing typo	2015-10-26 16:46:36 -04:00
Adrien Grand	d3aa3565db	Deprecate `index.analysis.analyzer.default_index` in favor of `index.analysis.analyzer.default`. Close #11861	2015-10-12 22:19:16 +02:00
Clinton Gormley	1f76f49003	Update compound-word-tokenfilter.asciidoc Improved the docs for compound work token filter. Closes #13670 Closes #13595	2015-09-21 11:22:14 +02:00
Robert Muir	f216d92d19	Upgrade to lucene 5.4-snapshot r1701068	2015-09-03 15:13:33 -04:00
Robert Muir	0d3e3f81fc	Lithuanian analysis	2015-09-01 08:52:10 -04:00
xuzha	fb2be6d6a1	The name "position_offset_gap" is confusing because Lucene has three similar sounding things: * Analyzer#getPositionIncrementGap * Analyzer#getOffsetGap * IndexOptions.DOCS_AND_FREQS_AND_POSITIONS_AND_OFFSETS and * FieldType#storeTermVectorOffsets Rename position_offset_gap to position_increment_gap closes #13056	2015-08-26 14:56:35 -07:00
Nik Everett	4b9664beeb	Mapping: Default position_offset_gap to 100 This is much more fiddly than you'd expect it to be because of the way position_offset_gap is applied in StringFieldMapper. Instead of setting the default to 100 its simpler to make sure that all the analyzers default to 100 and that StringFieldMapper doesn't override the default unless the user specifies something different. Unless the index was created before 2.1, in which case the old default of 0 has to take. Also postition_offset_gaps less than 0 aren't allowed at all. New tests test that: 1. the new default doesn't match phrases across values with reasonably low slop (5) 2. the new default doest match phrases across values with reasonably high slop (50) 3. you can override the value and phrases work as you'd expect 4. if you leave the value undefined in the mapping and define it on a custom analyzer the the value from the custom analyzer shines through Closes #7268	2015-08-25 14:21:50 -04:00
Clinton Gormley	2b512f1f29	Docs: Use "js" instead of "json" and "sh" instead of "shell" for source highlighting	2015-07-14 18:14:09 +02:00
Britta Weber	eeeb29f900	spell correct and add single quotes	2015-05-26 11:41:19 +02:00
Britta Weber	37782c1745	analyzers: custom analyzers names and aliases must not start with _ closes #9596	2015-05-26 11:38:15 +02:00
Clinton Gormley	3a69b65e88	Docs: Fixed the backslash escaping on the pattern analyzer docs Closes #11099	2015-05-15 18:40:16 +02:00


125 Commits