Commit Graph

218 Commits

Author SHA1 Message Date
Robert Muir 9302eee1e0
LUCENE-9235: upgrade all python to python3
Die, python2, die.

Some generated .java files change (parameterized automata for
spell-correction).

This is because the order of python dictionaries was not well-defined
previously. A sort() was added so that the python code now generates
reproducible output (Thanks @mikemccand).

So we'll suffer a change once, but the automata are equivalent. If you
run the script again you should not see source code changes.

The relevant unit tests are exhaustive (if you trust the paper!), so we can
be confident it does not break things, even though it looks very scary.
2020-02-20 21:27:38 -05:00
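The reproducibility fix described in the commit above (adding a sort() so generated output no longer depends on dictionary order) can be illustrated with a minimal sketch. The function name `emit_table`, the array name `toStates`, and the output layout are hypothetical and are not taken from the actual Lucene generator scripts.

```python
def emit_table(name, transitions):
    """Render a dict as a Java-style int array initializer with stable ordering."""
    lines = [f"static final int[] {name} = {{"]
    # Sorting the items makes the emitted text independent of the order
    # in which the dict happened to be populated.
    for state, target in sorted(transitions.items()):
        lines.append(f"  {target}, // state {state}")
    lines.append("};")
    return "\n".join(lines)


# Two dicts with the same content but different insertion order
# produce byte-identical generated source.
first = emit_table("toStates", {2: 7, 0: 3, 1: 5})
second = emit_table("toStates", {0: 3, 1: 5, 2: 7})
```

Without the `sorted(...)` call, the two results could differ in line order even though they describe the same table, which is the kind of spurious source change the commit message refers to.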
Anshum Gupta cb18586ea0
LUCENE-9155: Add Apache License header to the Kuromoji dictionary compilation (#1271) 2020-02-20 14:59:06 -08:00
Dawid Weiss 62662e477a LUCENE-9155: Port Kuromoji dictionary compilation (regenerate). 2020-02-20 19:00:56 +01:00
Dawid Weiss 7604639b59 Move jgit version declaration to scriptDepVersions. 2020-02-20 13:54:07 +01:00
Robert Muir b9a569e7be
LUCENE-9230: explicitly call python version we want from builds
On newer linux distros, at least, 'python' now means python3. So
we can't rely on what version of python it will invoke (at least for a
few years).

For example in Fedora Linux:

https://fedoraproject.org/wiki/Changes/Python_means_Python3

For python2.x code, explicitly call 'python2.7' and for python3.x code,
explicitly call 'python3'.

Ant variable names are cleaned up, e.g. 'python.exe' is renamed to
'python2.exe' and 'python32.exe' is renamed to 'python3.exe'. This also
makes it easy to identify remaining python 2.x code that should be
migrated to python 3.x
2020-02-18 18:58:17 -05:00
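The rule stated in the commit above (call 'python2.7' for 2.x code and 'python3' for 3.x code, never the bare 'python') could be sketched as a small helper. The helper names `interpreter_for` and `build_command` and the script name `gen.py` are hypothetical; the real build uses Ant properties rather than Python for this.

```python
def interpreter_for(script_major):
    """Return the explicit interpreter name for a script's Python major version."""
    # Names follow the commit message: 'python2.7' for 2.x code, 'python3' for 3.x code.
    names = {2: "python2.7", 3: "python3"}
    if script_major not in names:
        raise ValueError(f"unsupported Python major version: {script_major}")
    return names[script_major]


def build_command(script, script_major):
    # Never rely on the bare 'python' name, whose meaning varies between distros.
    return [interpreter_for(script_major), script]


command = build_command("gen.py", 3)
```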
Dawid Weiss 491c99a3de LUCENE-9232: tone down daemon defaults in generated local settings. 2020-02-18 19:43:39 +01:00
Dawid Weiss 2a88aa9d0f LUCENE-9219: Port ECJ-based linter to gradle
Co-authored-by: Tomoko Uchida <tomoko@apache.org>
2020-02-19 02:43:47 +09:00
Robert Muir ccb390d4a6
LUCENE-9220: prevent zip file reproducibility issues based on users umask 2020-02-17 13:34:00 -05:00
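The commit above does not say how the umask dependency was removed. As a hedged illustration only, one common way to keep an archive independent of local file permissions is to write fixed permission bits and a fixed timestamp into every entry. The function name `build_zip` and the chosen mode `0o644` are assumptions made for this sketch.

```python
import io
import zipfile

FIXED_TIME = (1980, 1, 1, 0, 0, 0)  # earliest timestamp the zip format can store


def build_zip(files, mode=0o644):
    """Build a zip archive in memory from a {name: bytes} mapping."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name in sorted(files):  # stable entry order
            info = zipfile.ZipInfo(name, date_time=FIXED_TIME)
            # Fixed permission bits instead of whatever the local umask produced.
            info.external_attr = (mode & 0xFFFF) << 16
            zf.writestr(info, files[name])
    return buf.getvalue()


archive_a = build_zip({"b.txt": b"two", "a.txt": b"one"})
archive_b = build_zip({"a.txt": b"one", "b.txt": b"two"})
```

Because names, timestamps, and permissions are all fixed, the two archives are byte-identical on the same machine regardless of input order.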
Robert Muir 0203815ab2
LUCENE-9220: regenerate all stemmers/stopwords/test data from snowball 2.0 (#1262)
Previous situation:

* The snowball base classes (Among, SnowballProgram, etc) had accumulated local performance-related changes. There was a task that would also "patch" generated classes (e.g. GermanStemmer) after-the-fact.
* Snowball classes had many "non-changes" from the original such as removal of tabs, addition of javadocs, license headers, etc.
* Snowball test data (inputs and expected stems) was incorporated into lucene testing, but this was maintained manually. Also files had become large, making the test too slow (Nightly).
* Snowball stopwords lists from their website were manually maintained. In some cases encoding fixes were manually applied.
* Some generated stemmers (such as Estonian and Armenian) exist in lucene, but have no corresponding `.sbl` file in snowball sources at all.

Besides this mess, the snowball project is "moving along" and acquiring new languages, adding non-BSD-licensed test data, huge test data, and other complexity. So it is time to automate the integration better.

New situation:

* Lucene has a `gradle snowball` regeneration task. It works on Linux or Mac only. It checks out their repos, applies the `snowball.patch` in our repository, compiles snowball stemmers, regenerates all java code, applies any adjustments so that our build is happy.
* Test data is automatically regenerated from the commit hash of the snowball test data repository. Not all languages are tested from their data: only where the license is simple BSD. Test data is also (deterministically) sampled, so that we don't have huge files. We just want to make sure our integration works.
* Randomized tests are still set to test every language with generated fake words. The regeneration task ensures all languages get tested (it writes a simple text file list of them).
* Stopword files are automatically regenerated from the commit hash of the snowball website repository.
* The regeneration procedure is idempotent. This way when stuff does change, you know exactly what happened. For example if test data changes to a different license, you may see a git deletion. Or if a new language/stopwords/test data gets added, you will see git additions.
2020-02-17 12:38:01 -05:00
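The commit above says test data is deterministically sampled but does not describe the method. A minimal sketch of one way to do this, assuming a fixed seed and (word, stem) pairs as input, is shown here; the function name `sample_pairs`, the seed value, and the sample size are hypothetical.

```python
import random


def sample_pairs(pairs, k, seed=0):
    """Pick at most k (word, stem) pairs, identically on every run."""
    ordered = sorted(pairs)       # normalise input order first
    if len(ordered) <= k:
        return ordered
    rng = random.Random(seed)     # private generator with a fixed seed
    return sorted(rng.sample(ordered, k))


data = [(f"word{i}", f"stem{i}") for i in range(100)]
sample_a = sample_pairs(data, 10)
sample_b = sample_pairs(list(reversed(data)), 10)
```

Sorting before sampling means the result depends only on the content of the input and the seed, which is what makes a regeneration step repeatable. The exact sample may differ between Python versions, so a real task would pin its tooling.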
Dawid Weiss dcf448efeb LUCENE-9134: Minor cleanups. 2020-02-13 11:18:01 +01:00
Erick Erickson f9357ab0d2
LUCENE-9134: Port ant-regenerate tasks to Gradle build (util and packed) (#1251)
* LUCENE-9134: Port ant-regenerate tasks to Gradle build
2020-02-11 18:56:11 -05:00
Robert Muir f41eabdc5f
LUCENE-8279: fix javadocs wrong header levels and accessibility issues
Java 13 adds a new doclint check under "accessibility" that the html
header nesting level isn't crazy.

Many are incorrect because the html4-style javadocs had horrible
font-sizes, so developers used the wrong header level to work around it.
This is no issue in trunk (always html5).

Java recommends against using such structured tags at all in javadocs,
but that is a more involved change: this just "shifts" header levels
in documents to be correct.
2020-02-08 10:00:00 -05:00
Robert Muir a77bb1e6f5
LUCENE-9201: add overview.html from correct location to the javadocs in gradle build 2020-02-07 00:18:20 -05:00
Robert Muir 0d339043e3
LUCENE-9209: fix javadocs to be html5, enable doclint html checks, remove jtidy
Current javadocs declare an HTML5 doctype: !DOCTYPE HTML. Some HTML5
features are used, but unfortunately also some constructs that do not
exist in HTML5 are used as well.

Because of this, we have no checking of any html syntax. jtidy is
disabled because it works with html4. doclint is disabled because it
works with html5. our docs are neither.

javadoc "doclint" feature can efficiently check that the html isn't
crazy. we just have to fix really ancient removed/deprecated stuff
(such as use of tt tag).

This enables the html checking in both ant and gradle. The docs are
fixed via straightforward transformations.

One exception is table cellpadding, for this some helper CSS classes
were added to make the transition easier (since it must apply padding
to inner th/td, not possible inline). I added TODOs, we should clean
this up. Most problems look like they may have been generated from a
GUI or similar and not a human.
2020-02-06 22:30:52 -05:00
Tomoko Uchida f3cd1dbde3 LUCENE-9077: Force locale en_US on Javadoc task (workaround for JDK-8222793) 2020-02-07 01:36:45 +09:00
Erick Erickson b0bb299dc4
LUCENE-9134: Port ant-regenerate tasks to Gradle build (#1230)
LUCENE-9134: Port ant-regenerate tasks to Gradle build (Solr javacc)
2020-02-04 09:16:38 -05:00
Erick Erickson 5253c0cb74
LUCENE-9134 Port ant-regenerate tasks to Gradle build (#1226)
LUCENE-9134: Port ant-regenerate tasks to Gradle build Javacc sub-task. Closes #1226
2020-01-31 17:04:10 -05:00
Robert Muir 4b5105e167
LUCENE-9193: heap allocations for tests.profile
Can be a bit noisier than cpu sampling, due to how threads are allocated
in tests... maybe we can improve that in the future.
2020-01-30 08:29:10 -05:00
Dawid Weiss 3a8ed5e8ed LUCENE-9134: add python-based regeneration of HTMLCharacterEntities.jflex inside jflexHTMLStripCharFilter. 2020-01-30 13:48:16 +01:00
Dawid Weiss e25dac085f LUCENE-9134: this adds initial javacc support (without follow-up tweaks required to make the sources identical as those generated by ant). 2020-01-29 17:02:59 +01:00
Robert Muir e504798a44
LUCENE-9185: add "tests.profile" to gradle build to aid fixing slow tests
Run test(s) with -Ptests.profile=true to print a histogram at the end of
the build.
2020-01-28 11:27:18 -05:00
Jan Høydahl 53f7b394e4 SOLR-11207: Mute warnings for owasp false positives 2020-01-27 21:03:20 +01:00
Dawid Weiss ff635cf701 LUCENE-9184, LUCENE-9183: allow skipping git status check in precommit with -Pvalidation.git.failOnModified=false (or place this in gradle.properties to make it permanent). 2020-01-27 20:47:02 +01:00
Uwe Schindler 7dc35e3a62 Let precommit depend on generic forbiddenApis task 2020-01-27 19:47:54 +01:00
Robert Muir fd5a0ce7c2
LUCENE-9182: the rat-sources.gradle was the one .gradle file already with a license header, we don't need it twice 2020-01-27 12:11:44 -05:00
Robert Muir 975df9ddd3
LUCENE-9182: add apache license headers to all .gradle files and enforce in rat task 2020-01-27 12:05:34 -05:00
Dawid Weiss b420ef8f77 LUCENE-9179: don't invoke the same build recursively upon first run, just continue. Seems like gradle bug but let's not cry about it - it just happens once and CI defaults can be passed independently on command-line. 2020-01-27 17:34:13 +01:00
Robert Muir 8e357b167b
LUCENE-9180: dos2unix files that don't need dos line endings 2020-01-27 11:29:59 -05:00
Dawid Weiss 6bde0f3ec8 LUCENE-9134: UAX29URLEmailTokenizerImpl regeneration. This requires TONS
of memory and time... insane compared to the size of the input. None of my
machines pass it without at least 12 gigs of heap (!).
2020-01-27 12:36:13 +01:00
Jan Høydahl 39df74de37 SOLR-11207: Exclude configuration 'unifiedClasspath'
It is generated by consistent-versions plugin and triggers owasp warnings for deps even for excluded projects
2020-01-27 12:17:31 +01:00
Robert Muir 2bb63afdaf
LUCENE-9166: gradle build: test failures need stacktraces 2020-01-27 06:09:04 -05:00
Jan Høydahl 9ddd05cd14 SOLR-11207: Exclude solr-ref-guide from owasp check
It picked up log4j1 dependency only used during build
2020-01-27 09:55:12 +01:00
Dawid Weiss ae95f0ab68 LUCENE-9134: lucene:core:jflexStandardTokenizerImpl 2020-01-27 09:03:19 +01:00
Dawid Weiss 6f85ec0460 LUCENE-9174: Bump default gradle memory to 2g 2020-01-26 18:27:41 +01:00
Dawid Weiss 5ab59f59ac SOLR-11207: minor changes:
- added 'owasp' task to the root project. This depends on
dependencyCheckAggregate which seems to be a better fit for multi-module
projects than dependencyCheckAnalyze (the difference is vague to me
from plugin's documentation).

- you can run the "gradlew owasp" task explicitly and it'll run the
validation without any flags.

- the owasp task is only added to check if validation.owasp property
is true. I think this should stay as the default on non-CI systems
(developer defaults) because it's a significant chunk of time it takes
to download and validate dependencies.

- I'm not sure *all* configurations should be included in the check...
perhaps we should only limit ourselves to actual runtime dependencies,
not build dependencies, solr-ref-guide, etc.
2020-01-26 10:45:05 +01:00
Jan Høydahl 74a8d6d5ac SOLR-11207: Add OWASP dependency checker to gradle build (#1121)
* SOLR-11207: Add OWASP dependency checker to gradle build
2020-01-26 10:01:51 +01:00
Robert Muir f5e9bb9493
LUCENE-9165: explicitly cast with the horrible groovy language so that numbers above 9 don't fail 2020-01-24 09:53:47 -05:00
Robert Muir 4d61e4aaab
change generate-defaults.gradle not to cap testsJvms at 4 2020-01-24 08:49:17 -05:00
Robert Muir 9dae566ee7
LUCENE-9160: add params/docs to override jvm params in gradle build, default C2 off in tests.
Adds some build parameters to tune how tests run. There is an example
shown by "gradle helpLocalSettings"

Default C2 off in tests as it is wasteful locally and causes slowdown of
tests runs. You can override this by setting tests.jvmargs for gradle,
or args for ant.

Some crazy lucene stress tests may need to be toned down after the
change, as they may have been doing too many iterations by default...
but this is not a new problem.
2020-01-22 09:58:30 -05:00
Robert Muir 3ecd7a03aa
LUCENE-9159: merge gradle/ant test security policies (main file) 2020-01-21 23:43:31 -05:00
Robert Muir 7e0534d87c
LUCENE-9159: merge gradle/ant test security policies 2020-01-21 21:26:37 -05:00
Dawid Weiss 351b30489c LUCENE-9077: Enable javac linting as in ant. TONS of warnings are currently printed. 2020-01-20 10:10:48 +01:00
Dawid Weiss 1ad6bc9361 LUCENE-9077: Allow locally staged files in git status precommit check. 2020-01-20 09:36:14 +01:00
Nicholas Knize 78655239c5 LUCENE-8369: Remove obsolete spatial module 2020-01-16 11:22:05 -06:00
Dawid Weiss 087b2e1c0d LUCENE-9077: Emit the location of test output on failure. 2020-01-15 14:01:20 +01:00
Dawid Weiss 44c203d72f Add workaround for https://github.com/palantir/gradle-consistent-versions/issues/383 2020-01-15 11:44:21 +01:00
Dawid Weiss ae2e4f3ae9 Add git help to help/ 2020-01-15 10:40:41 +01:00
Dawid Weiss e6d85cd4bc Cleaning up minor things in rat task. 2020-01-15 10:07:24 +01:00
Mike c9e7eebe28 Add RAT check using Gradle (#1157)
Merging Apache rat checks.
2020-01-15 09:55:41 +01:00
Dawid Weiss 4a8762cc2c Add javadoc generation/linter to precommit. 2020-01-13 19:11:43 +01:00
Dawid Weiss d800b8060b Javadoc workarounds for LUCENE-9132 2020-01-13 19:11:01 +01:00
Dawid Weiss 3beb1cfd1e Add initial support for rendering javadocs. 2020-01-10 16:43:52 +01:00
Dawid Weiss 34aa8714d8 Correct class->classname. 2020-01-10 12:53:30 +01:00
Dawid Weiss b4d26f94d3 Don't load all of groovy's tasks, just groovy. 2020-01-10 12:51:46 +01:00
Dawid Weiss 39a5323999 Add config file sanity check for precommit. 2020-01-10 12:49:04 +01:00
Dawid Weiss 109444fc5b Add an equivalent of validate-source-patterns task, delegating to the same groovy script. 2020-01-10 12:02:30 +01:00
Dawid Weiss 4599c51f0d LUCENE-9122: add support for running tests against alternate jvms. 2020-01-09 19:00:32 +01:00
Dawid Weiss cf51dfdb37 Silence gradle warnings. We'll deal with them when we upgrade the wrapper. 2020-01-09 17:41:53 +01:00
Dawid Weiss 10baa68b60 Revert "Disable checkUnusedConstraints in palantir's plugin (bug)."
This reverts commit b32db8ee6a.
2020-01-09 17:40:51 +01:00
Dawid Weiss b32db8ee6a Disable checkUnusedConstraints in palantir's plugin (bug). 2020-01-09 17:05:02 +01:00
Dawid Weiss c7ed133910 LUCENE-9122: upgrade gradle wrapper to 6.0.1. Relax JVM requirement to require at least Java 11. We can't even check for higher bound because gradle itself breaks before it can execute the check script. I verified locally and it works with 11-13. 2020-01-09 14:13:32 +01:00
Dawid Weiss e713ca62b9 Remove buildscan configuration. 2020-01-08 20:17:20 +01:00
Dawid Weiss e87228982c Remove travis support for now. 2020-01-08 15:14:11 +01:00
Dawid Weiss 7a12c89ce6 Move precommit dependencies to precommit for clarity. 2020-01-08 14:20:16 +01:00
Dawid Weiss 7808dd5c3a Add minimum repro line at the end of the build. 2020-01-08 12:20:09 +01:00
Dawid Weiss d9e5daf01b Move printing tests.verbose to error reporting test listener since we're already catching the output and handle it there anyway. 2020-01-08 11:38:34 +01:00
Dawid Weiss 85d261339b Speed up spill writer. Echo failed test output to disk. 2020-01-08 10:55:07 +01:00
Dawid Weiss 14dd5a5e4d Initial error reporting test listener that mirrors failed suite's output. 2020-01-07 22:23:18 +01:00
Dawid Weiss c9c0bab2eb Ensure versions.props contains sorted entries (like check-lib-versions did for ant). 2020-01-03 16:04:12 +01:00
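A check like the one named in the commit above could look roughly like the following sketch. It assumes a `key=version` line format with `#` comments and plain lexicographic ordering; the function name `unsorted_entries` is hypothetical, and the actual ordering rule used by the build is not stated in the log.

```python
def unsorted_entries(text):
    """Return the keys of a versions.props-style text that are out of order."""
    keys = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        keys.append(line.split("=", 1)[0].strip())
    expected = sorted(keys)
    # Report every key that is not at the position sorting would give it.
    return [k for k, e in zip(keys, expected) if k != e]


good = "# deps\ncom.a:x=1.0\norg.b:y=2.0\n"
bad = "org.b:y=2.0\ncom.a:x=1.0\n"
problems = unsorted_entries(bad)
```

A build task could fail when the returned list is non-empty and print the offending keys.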
Dawid Weiss 37fb4a5f49 Verify lock state on precommit. 2020-01-03 15:53:29 +01:00
Dawid Weiss ae4a2e381d Hook up license checks to precommit. 2020-01-03 15:50:04 +01:00
Dawid Weiss 797f571fc3 Hook up forbidden apis to precommit. 2020-01-03 15:35:15 +01:00
Dawid Weiss 8b03a7104e Add a precommit placeholder task and working copy's git status check. 2020-01-03 15:22:36 +01:00
Dawid Weiss ca8661bc3a Reworked dependency resolution for license checks to work around a problem with gradle. Consolidated licenses with the ant build (excluding some jars from the ref-guide). 2019-12-30 14:05:08 +01:00
Dawid Weiss a96bf612d7 Merge with master. 2019-12-25 13:26:56 +01:00
Dawid Weiss 496b6b1d51 Follow-up to merge with master. 2019-12-20 17:38:04 +01:00
Dawid Weiss 7c762c969a Allow simultaneous call to sha regeneration and validation by introducing soft ordering constraint. 2019-12-18 14:54:13 +01:00
Dawid Weiss d2d28329ef Changed license checksum regeneration task name to updateLicenses. 2019-12-18 14:14:39 +01:00
Dawid Weiss 0e2a493446 Add transitive Lucene dependencies so that Solr licenses/ folder is (more) consistent with ant. This is an insane hack at the build-level. Mark it for removal once we get rid of ant. 2019-12-17 15:02:08 +01:00
Dawid Weiss faadb65202 Regenerate checksum for a single dependency once. Add trailing newline for consistency with ant code. 2019-12-17 14:27:25 +01:00
Dawid Weiss 8906c2ddbe Merge forbidden APIs rules. 2019-12-17 13:39:10 +01:00
Dawid Weiss 8ca1d4d144 Enable security manager by default. 2019-12-16 15:56:29 +01:00
Dawid Weiss 401ddc6dd1 Upgrade gradlew. Add environment sanity check. 2019-12-16 15:23:06 +01:00
Dawid Weiss b4a6a63949 Solr test policy changes. 2019-12-16 11:28:21 +01:00
Dawid Weiss 208d094262 Remove checksum files from ref guide. 2019-12-13 17:09:25 +01:00
Dawid Weiss 981ddb825b Remove leftover junk. 2019-12-13 15:24:59 +01:00
Dawid Weiss 3aff1664e5 updateChecksums, validation of dangling unreferenced files under licenses/. Separated licenses-gradle for Solr for now (doesn't include transitive Lucene dependencies). 2019-12-13 15:07:59 +01:00
Dawid Weiss d8cac07d2a Sort output of dangling license files. 2019-12-13 14:03:06 +01:00
Dawid Weiss 4500f0e327 Consolidating versions between gradle and ant. 2019-12-13 13:31:23 +01:00
Dawid Weiss 25fc0487a1 Working jar checksums and aligned with ant build. 2019-12-13 12:12:29 +01:00
Dawid Weiss 73e8b49f0d Align versions with ant build. 2019-12-13 12:01:26 +01:00
Dawid Weiss f97c276a81 Merge master changes to solr tests policy. 2019-12-13 10:53:59 +01:00
Dawid Weiss a392a83558 Add support for validating the presence of licenses and notices. 2019-12-12 19:25:46 +01:00
Dawid Weiss 6c7eb3ffe6 Merging changes to solr policy done on master. 2019-12-12 14:22:02 +01:00
Dawid Weiss 453eee3987 Initial work on jar checksums/ license file validation. 2019-12-11 18:41:27 +01:00
Dawid Weiss ddeb992fee SOLR-14053: remove tests.disableHdfs support 2019-12-11 15:05:36 +01:00
Dawid Weiss 9fad7b67b0 Follow-up to changes on master. 2019-12-11 09:01:37 +01:00
Dawid Weiss 564a2b7e07 Speed up test filtering by a lot by upgrading to rr 2.7.5. 2019-12-09 16:43:44 +01:00
Dawid Weiss 95bdda52aa Add solr properties back to sm policies 2019-12-09 11:15:22 +01:00
Dawid Weiss eeb1c9abf6 Only print the slowest tests at the end of a successful run. Correct verbose mode to parse string switch correctly. 2019-12-08 19:06:01 +01:00
Dawid Weiss 02c79dd211 Add testOpts task and info about it in tests.txt 2019-12-08 18:45:41 +01:00
Dawid Weiss 4d3040235e Add initial guidelines concerning dependency management. 2019-12-08 18:34:12 +01:00
Dawid Weiss 1021f04d1a Add some support for -Ptests.verbose mode when streams are dumped to the console. This is constrained by gradle's runner but is better than nothing. 2019-12-07 14:53:13 +01:00
Dawid Weiss 519ed997da Enable solr testing with solr security manager. 2019-12-06 19:25:57 +01:00
Dawid Weiss 37263176cb Enable security manager for the replicator module. The test policy for the replicator duplicates everything the regular policy has and just adds those nasty jetty-specific sections. Easier to diff/ spot the difference. 2019-12-06 19:04:07 +01:00
Dawid Weiss 3e4d8a17ac Initial support for running with security manager (lucene). 2019-12-06 17:08:14 +01:00
Dawid Weiss 226f5490a0 Correct lucene version passed to tests to be stripped of qualifiers. 2019-12-06 13:05:10 +01:00
Dawid Weiss cd7fd6d750 Clean up test property passing and move a number of properties and randomizations from common.build (ant counterpart) 2019-12-06 11:55:53 +01:00
Dawid Weiss 62a810cda7 Fail the build if --tests filter is applied and no tests execute during the entire build (this allows for an empty set of filtered tests at single project level). 2019-12-05 13:23:43 +01:00
Dawid Weiss bf7d115414 Generate hardware-specific defaults for gradle parallelism on the first build run (any task). Add some explanations on how to tweak local settings even further (gradlew :helpLocalSettings). 2019-12-05 11:14:09 +01:00
Dawid Weiss 64e1499bc7 Add verification check that gradle and ant rules are in sync. 2019-12-03 23:08:57 +01:00
Dawid Weiss 85e0e4fb75 Add a workaround for the problem of forbiddenApis not running upon changing just the rules/ rulesets. 2019-12-03 18:41:11 +01:00
Dawid Weiss 0247f02a70 Only apply log4j rules to Solr. 2019-12-03 15:18:10 +01:00
Dawid Weiss a6d6d633d5 Apply servlet APIs to just Solr. 2019-12-03 14:43:50 +01:00
Dawid Weiss 6461909129 Port forbidden APIs. See gradlew :helpForbiddenApis to see how rules are applied automatically based on the set of dependencies of a project. 2019-12-03 14:40:35 +01:00
Dawid Weiss 0d7336db9d Moved gradle fragments under ci/ and maven/ for clarity. 2019-12-03 12:10:13 +01:00
Dawid Weiss 27f4b02ab4 Correct helpAnt location and add a check to verify this in the future. 2019-12-03 09:27:52 +01:00
Dawid Weiss d4a9842375 Initial gradle build layer. 2019-12-02 15:34:57 +01:00