* Bump %unicode 9 -> %unicode 12.1 for the 3 unicode grammars
* regenerate emoji conformance tests for unicode 12.1
* modify wordbreak conformance tests to use emoji data (which replaces old crazy E_base etc properties)
* regenerate wordbreak conformance tests
* Simplify grammar files and word-break conformance test generator, now that full-width numbers are WordBreak=Numeric
* Use jflex emoji properties rather than ICU-generated ones
Upgrade jflex.
Change doesn't alter the behavior of any of the analyzers (unicode
version or grammar refactorings), just the minimal to get new tooling
working.
Most cases of C-style array declarations have been switched. The Google Java Format, that which we adhere to, disallows C-style array declarations: https://google.github.io/styleguide/javaguide.html#s4.8.3-arrays
Some cases (esp. Snowball) can't be updated.
* LUCENE-9909: Some jflex regeneration tasks should have proper dependencies and also check the checksums of included files.
* Force a dependency on low-level spotless tasks so that they're always properly ordered (hell!). Update ASCIITLD and regenerate the remaining code. Add cross-dependencies between generation tasks that take includes as input.
The compilation of the library is slow, disable optimization as it doesn't speed up our usage of the gennorm2 tool.
Use better heuristic for make parallelism (tests.jvms rather than just hardcoded value of four).
Add a simple regeneration help doc
Improve task help and checksum failure message (include corresponding regeneration task). Sorry for being verbose. Maybe somebody will read it. :)
Co-authored-by: Dawid Weiss <dawid.weiss@carrotsearch.com>
This adds a bit of simplicity as the file is a simple domain list,
rather than a DNS zone. So the regexes parsing DNS can be removed.
Also the file may change less often as it contains JUST the list of
TLDs, and not any additional DNS metadata.
Detects common cases of unreachable/dead code.
For generated javacc code, the check is disabled via
SuppressWarnings("unused") because javacc generates strange/bad code such as:
if ("" == null)
For TestStressNRTReplication's startNode() method, the check is also
disabled because analysis folds the "test evilness controls" which are
static final constants. This itself is a WTF, shouldn't we instead
randomize these evil things in our tests rather than hardcoding them to
specific values?
Requiring the annotation is helpful because if an abstract method is removed, the concrete methods will then show up as compile errors: preventing dead code from being accidentally left behind.
Co-authored-by: Robert Muir <rmuir@apache.org>
Enable ecj unused local variable, private instance and method detection. Allow SuppressWarnings("unused") to disable unused checks (e.g. for generated code or very special tests). Fix gradlew regenerate for python 3.9 SuppressWarnings("unused") for generated javacc and jflex code. Enable a few other easy ecj checks such as Deprecated annotation, hashcode/equals, equals across different types.
Co-authored-by: Mike McCandless <mikemccand@apache.org>
Enable ecj unused local variable, private instance and method detection. Allow SuppressWarnings("unused") to disable unused checks (e.g. for generated code or very special tests). Fix gradlew regenerate for python 3.9 SuppressWarnings("unused") for generated javacc and jflex code. Enable a few other easy ecj checks such as Deprecated annotation, hashcode/equals, equals across different types.
Co-authored-by: Mike McCandless <mikemccand@apache.org>
Upgrade from icu 62.2 to 68.2, with Unicode 13 support.
Modify GenerateUTR30DataFiles to take the release tag as a program
argument. Gradle populates this automatically, removing a manual step
from regeneration process.