lucene

Commit Graph

Author	SHA1	Message	Date
Robert Muir	c8f5b9127d	LUCENE-10243: increase unicode versions of tokenizers to 12.1 (#465) * Bump %unicode 9 -> %unicode 12.1 for the 3 unicode grammars * regenerate emoji conformance tests for unicode 12.1 * modify wordbreak conformance tests to use emoji data (which replaces old crazy E_base etc properties) * regenerate wordbreak conformance tests * Simplify grammar files and word-break conformance test generator, now that full-width numbers are WordBreak=Numeric * Use jflex emoji properties rather than ICU-generated ones	2021-12-03 20:20:57 -05:00
Dawid Weiss	f91700a713	LUCENE-9914: Modernize Emoji regeneration scripts (#78)	2021-04-12 20:16:43 +02:00

2 Commits