
[[breaking_70_analysis_changes]]
=== Analysis changes

==== The `delimited_payload_filter` is renamed

The `delimited_payload_filter` token filter is renamed to `delimited_payload`. The old name is
deprecated and will be removed in a future release, so replace it with
`delimited_payload`.
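
The rename can be applied mechanically to existing index settings before an index is recreated. The following is a minimal sketch, not an official migration tool: the helper `rename_delimited_payload`, the filter name `my_payloads`, and the analyzer name `payload_analyzer` are hypothetical, and the sketch assumes that analyzers list their filters as a list of names.

```python
import copy

OLD_NAME = "delimited_payload_filter"  # deprecated name
NEW_NAME = "delimited_payload"         # replacement name


def rename_delimited_payload(settings):
    """Return a copy of `settings` with the deprecated filter name replaced."""
    updated = copy.deepcopy(settings)
    analysis = updated.get("analysis", {})

    # Custom token filters refer to the filter through their "type".
    for definition in analysis.get("filter", {}).values():
        if definition.get("type") == OLD_NAME:
            definition["type"] = NEW_NAME

    # Custom analyzers may reference the filter directly by name.
    # Assumption: "filter" is a list of filter names.
    for definition in analysis.get("analyzer", {}).values():
        names = definition.get("filter")
        if isinstance(names, list):
            definition["filter"] = [NEW_NAME if n == OLD_NAME else n for n in names]

    return updated


# Hypothetical settings, used only for illustration.
old_settings = {
    "analysis": {
        "filter": {
            "my_payloads": {"type": "delimited_payload_filter", "delimiter": "|"}
        },
        "analyzer": {
            "payload_analyzer": {
                "type": "custom",
                "tokenizer": "whitespace",
                "filter": ["lowercase", "delimited_payload_filter"],
            }
        },
    }
}

new_settings = rename_delimited_payload(old_settings)
print(new_settings["analysis"]["filter"]["my_payloads"]["type"])
```

The helper works on a deep copy, so the original settings stay available for comparison.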


==== Limiting the number of tokens produced by _analyze

To safeguard against out-of-memory errors, the number of tokens that can be produced
using the `_analyze` endpoint is now limited to 10000. A request that generates more tokens
than this fails with an error. The default limit can be changed
for a particular index with the index setting `index.analyze.max_token_count`.
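
As an illustration, the body of a settings update that raises the limit for one index could be built as follows. This is a sketch under assumptions: the helper `max_token_count_body` is hypothetical, the check for a positive integer is a local sanity check rather than documented server behavior, and how the body is sent to the cluster is left to the client in use.

```python
import json

SETTING = "index.analyze.max_token_count"
DEFAULT_LIMIT = 10000  # default stated in this section


def max_token_count_body(limit):
    """Build a JSON settings body that sets the `_analyze` token limit."""
    # Local sanity check (an assumption, not documented server validation).
    if isinstance(limit, bool) or not isinstance(limit, int) or limit < 1:
        raise ValueError("limit must be a positive integer")
    return json.dumps({SETTING: limit})


# Double the default limit for an index that analyzes long inputs.
body = max_token_count_body(2 * DEFAULT_LIMIT)
print(body)
```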


==== Limiting the length of an analyzed text during highlighting

Highlighting text that was indexed without offsets or term vectors
requires analyzing that text in memory, in real time, during the search request.
For large texts this analysis may take a substantial amount of time and memory.
To protect against this, the maximum number of characters that will be analyzed is now
limited to 10000, and a request that requires analysis of a longer text fails with an error.
The default limit can be changed
for a particular index with the index setting `index.highlight.max_analyzed_offset`.
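
Before relying on the default, it can help to find documents whose text fields are longer than the limit. The sketch below is a rough client-side check only: the helper `fields_over_highlight_limit` and the sample document are hypothetical, Python's `len` counts code points and may not match how the server counts characters, and treating the limit as inclusive is an assumption.

```python
DEFAULT_MAX_ANALYZED_OFFSET = 10000  # default stated in this section


def fields_over_highlight_limit(doc, limit=DEFAULT_MAX_ANALYZED_OFFSET):
    """Return the names of string fields whose text is longer than `limit`."""
    return sorted(
        name
        for name, value in doc.items()
        if isinstance(value, str) and len(value) > limit
    )


# Hypothetical document, used only for illustration.
doc = {"title": "A short title", "body": "x" * 25000, "views": 42}
over_limit = fields_over_highlight_limit(doc)
print(over_limit)
```

Fields reported by this check are candidates for indexing with offsets or term vectors, or for a higher `index.highlight.max_analyzed_offset` on that index.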