lucene

Commit Graph

Author	SHA1	Message	Date
Mike Drob	e25ab4204f	LUCENE-9266 remove gradle wrapper jar from source ASF Release Policy states that we cannot have binary JAR files checked in to our source releases, a few other projects have solved this by modifying their generated gradlew scripts to download a copy of the wrapper jar. We now have a version and checksum file in ./gradle/wrapper directory used for verifying the wrapper jar, and will take advantage of single source java execution to verify and download. The gradle wrapper jar will continue to be available in the git repository, but will be excluded from src tarball generation. This should not change workflows for any users, since we expect the gradlew script to get the jar when it is missing. Co-authored-by: Dawid Weiss <dweiss@apache.org>	2020-04-02 11:30:01 -05:00
Tomoko Uchida	d4a137d2b6	LUCENE-9242: generate javadocs by calling Ant javadoc task (#1304)	2020-03-12 00:09:26 +09:00
Tomoko Uchida	312d6b2a0d	LUCENE-9201: Add an equivalent to "check missing javadocs" task to gradle build Co-Authored-By: Dawid Weiss <dawid.weiss@carrotsearch.com>	2020-02-24 11:05:35 +09:00
Dawid Weiss	cb68d7d2c5	LUCENE-9232: add a script-hack check so that in case somebody upgrades the scripts automatically they'll know they need to add the hack.	2020-02-21 10:40:27 +01:00
Dawid Weiss	f8a2c39906	LUCENE-9155: add missing naist dictionary generation, clean up the code a bit.	2020-02-21 10:24:05 +01:00
Robert Muir	9302eee1e0	LUCENE-9235: upgrade all python to python3 Die, python2, die. Some generated .java files change (parameterized automata for spell-correction). This is because the order of python dictionaries was not well-defined previously. A sort() was added so that the python code now generates reproducible output (Thanks @mikemccand). So we'll suffer a change once, but the automata are equivalent. If you run the script again you should not see source code changes. The relevant unit tests are exhaustive (if you trust the paper!), so we can be confident it does not break things, even though it looks very scary.	2020-02-20 21:27:38 -05:00
Anshum Gupta	cb18586ea0	LUCENE-9155: Add Apache License header to the Kuromoji dictionary compilation (#1271)	2020-02-20 14:59:06 -08:00
Dawid Weiss	62662e477a	LUCENE-9155: Port Kuromoji dictionary compilation (regenerate).	2020-02-20 19:00:56 +01:00
Dawid Weiss	7604639b59	Move jgit version declaration to scriptDepVersions.	2020-02-20 13:54:07 +01:00
Robert Muir	b9a569e7be	LUCENE-9230: explicitly call python version we want from builds On newer linux distros, at least, 'python' now means python3. So we can't rely on what version of python it will invoke (at least for a few years). For example in Fedora Linux: https://fedoraproject.org/wiki/Changes/Python_means_Python3 For python2.x code, explicitly call 'python2.7' and for python3.x code, explicitly call 'python3'. Ant variable names are cleaned up, e.g. 'python.exe' is renamed to 'python2.exe' and 'python32.exe' is renamed to 'python3.exe'. This also makes it easy to identify remaining python 2.x code that should be migrated to python 3.x	2020-02-18 18:58:17 -05:00
Dawid Weiss	491c99a3de	LUCENE-9232: tone down daemon defaults in generated local settings.	2020-02-18 19:43:39 +01:00
Dawid Weiss	2a88aa9d0f	LUCENE-9219: Port ECJ-based linter to gradle Co-authored-by: Tomoko Uchida <tomoko@apache.org>	2020-02-19 02:43:47 +09:00
Robert Muir	ccb390d4a6	LUCENE-9220: prevent zip file reproducibility issues based on users umask	2020-02-17 13:34:00 -05:00
Robert Muir	0203815ab2	LUCENE-9220: regenerate all stemmers/stopwords/test data from snowball 2.0 (#1262) Previous situation: * The snowball base classes (Among, SnowballProgram, etc) had accumulated local performance-related changes. There was a task that would also "patch" generated classes (e.g. GermanStemmer) after-the-fact. * Snowball classes had many "non-changes" from the original such as removal of tabs addition of javadocs, license headers, etc. * Snowball test data (inputs and expected stems) was incorporated into lucene testing, but this was maintained manually. Also files had become large, making the test too slow (Nightly). * Snowball stopwords lists from their website were manually maintained. In some cases encoding fixes were manually applied. * Some generated stemmers (such as Estonian and Armenian) exist in lucene, but have no corresponding `.sbl` file in snowball sources at all. Besides this mess, snowball project is "moving along" and acquiring new languages, adding non-BSD-licensed test data, huge test data, and other complexity. So it is time to automate the integration better. New situation: * Lucene has a `gradle snowball` regeneration task. It works on Linux or Mac only. It checks out their repos, applies the `snowball.patch` in our repository, compiles snowball stemmers, regenerates all java code, applies any adjustments so that our build is happy. * Tests data is automatically regenerated from the commit hash of the snowball test data repository. Not all languages are tested from their data: only where the license is simple BSD. Test data is also (deterministically) sampled, so that we don't have huge files. We just want to make sure our integration works. * Randomized tests are still set to test every language with generated fake words. The regeneration task ensures all languages get tested (it writes a simple text file list of them). * Stopword files are automatically regenerated from the commit hash of the snowball website repository. * The regeneration procedure is idempotent. This way when stuff does change, you know exactly what happened. For example if test data changes to a different license, you may see a git deletion. Or if a new language/stopwords/test data gets added, you will see git additions.	2020-02-17 12:38:01 -05:00
Dawid Weiss	dcf448efeb	LUCENE-9134: Minor cleanups.	2020-02-13 11:18:01 +01:00
Erick Erickson	f9357ab0d2	LUCENE-9134: Port ant-regenerate tasks to Gradle build (util and packed) (#1251) * LUCENE-9134: Port ant-regenerate tasks to Gradle build	2020-02-11 18:56:11 -05:00
Robert Muir	f41eabdc5f	LUCENE-8279: fix javadocs wrong header levels and accessibility issues Java 13 adds a new doclint check under "accessibility" that the html header nesting level isn't crazy. Many are incorrect because the html4-style javadocs had horrible font-sizes, so developers used the wrong header level to work around it. This is no issue in trunk (always html5). Java recommends against using such structured tags at all in javadocs, but that is a more involved change: this just "shifts" header levels in documents to be correct.	2020-02-08 10:00:00 -05:00
Robert Muir	a77bb1e6f5	LUCENE-9201: add overview.html from correct location to the javadocs in gradle build	2020-02-07 00:18:20 -05:00
Robert Muir	0d339043e3	LUCENE-9209: fix javadocs to be html5, enable doclint html checks, remove jtidy Current javadocs declare an HTML5 doctype: !DOCTYPE HTML. Some HTML5 features are used, but unfortunately also some constructs that do not exist in HTML5 are used as well. Because of this, we have no checking of any html syntax. jtidy is disabled because it works with html4. doclint is disabled because it works with html5. our docs are neither. javadoc "doclint" feature can efficiently check that the html isn't crazy. we just have to fix really ancient removed/deprecated stuff (such as use of tt tag). This enables the html checking in both ant and gradle. The docs are fixed via straightforward transformations. One exception is table cellpadding, for this some helper CSS classes were added to make the transition easier (since it must apply padding to inner th/td, not possible inline). I added TODOs, we should clean this up. Most problems look like they may have been generated from a GUI or similar and not a human.	2020-02-06 22:30:52 -05:00
Tomoko Uchida	f3cd1dbde3	LUCENE-9077: Force locale en_US on Javadoc task (workaroud for JDK-8222793)	2020-02-07 01:36:45 +09:00
Erick Erickson	b0bb299dc4	LUCENE-9134: Port ant-regenerate tasks to Gradle build (#1230) LUCENE-9134: Port ant-regenerate tasks to Gradle build (Solr javacc)	2020-02-04 09:16:38 -05:00
Erick Erickson	5253c0cb74	LUCENE-9134 Port ant-regenerate tasks to Gradle build (#1226) LUCENE-9134: Port ant-regenerate tasks to Gradle build Javacc sub-task. Closes #1226	2020-01-31 17:04:10 -05:00
Robert Muir	4b5105e167	LUCENE-9193: heap allocations for tests.profile Can be a bit noisier than cpu sampling, due to how threads are allocated in tests... maybe we can improve that in the future.	2020-01-30 08:29:10 -05:00
Dawid Weiss	3a8ed5e8ed	LUCENE-9134: add python-based regeneration of HTMLCharacterEntities.jflex inside jflexHTMLStripCharFilter.	2020-01-30 13:48:16 +01:00
Dawid Weiss	e25dac085f	LUCENE-9134: this adds initial javacc support (without follow-up tweaks required to make the sources identical as those generated by ant).	2020-01-29 17:02:59 +01:00
Robert Muir	e504798a44	LUCENE-9185: add "tests.profile" to gradle build to aid fixing slow tests Run test(s) with -Ptests.profile=true to print a histogram at the end of the build.	2020-01-28 11:27:18 -05:00
Jan Høydahl	53f7b394e4	SOLR-11207: Mute warnings for owasp false positives	2020-01-27 21:03:20 +01:00
Dawid Weiss	ff635cf701	LUCENE-9184, LUCENE-9183: allow skipping git status check in precommit with -Pvalidation.git.failOnModified=false (or place this in gradle.properties to make it permanent).	2020-01-27 20:47:02 +01:00
Uwe Schindler	7dc35e3a62	Let precommit depend on generic forbiddenApis task	2020-01-27 19:47:54 +01:00
Robert Muir	fd5a0ce7c2	LUCENE-9182: the rat-sources.gradle was the one .gradle file already with a license header, we don't need it twice	2020-01-27 12:11:44 -05:00
Robert Muir	975df9ddd3	LUCENE-9182: add apache license headers to all .gradle files and enforce in rat task	2020-01-27 12:05:34 -05:00
Dawid Weiss	b420ef8f77	LUCENE-9179: don't invoke the same build recursively upon first run, just continue. Seems like gradle bug but let's not cry about it - it just happens once and CI defaults can be passed independently on command-line.	2020-01-27 17:34:13 +01:00
Robert Muir	8e357b167b	LUCENE-9180: dos2unix files that don't need dos line endings	2020-01-27 11:29:59 -05:00
Dawid Weiss	6bde0f3ec8	LUCENE-9134: UAX29URLEmailTokenizerImpl regeneration. This requires TONS of memory and time... insane compared to the size of the input. None of my machines pass it without at least 12 gigs of heap (!).	2020-01-27 12:36:13 +01:00
Jan Høydahl	39df74de37	SOLR-11207: Exclude configuration 'unifiedClasspath' It is generated by consistent-versions plugin and triggers owasp warnings for deps even for excluded projects	2020-01-27 12:17:31 +01:00
Robert Muir	2bb63afdaf	LUCENE-9166: gradle build: test failures need stacktraces	2020-01-27 06:09:04 -05:00
Jan Høydahl	9ddd05cd14	SOLR-11207: Exclude solr-ref-guide from owasp check It picked up log4j1 dependency only used during build	2020-01-27 09:55:12 +01:00
Dawid Weiss	ae95f0ab68	LUCENE-9134: lucene:core:jflexStandardTokenizerImpl	2020-01-27 09:03:19 +01:00
Dawid Weiss	6f85ec0460	LUCENE-9174: Bump default gradle memory to 2g	2020-01-26 18:27:41 +01:00
Dawid Weiss	5ab59f59ac	SOLR-11207: minor changes: - added 'owasp' task to the root project. This depends on dependencyCheckAggregate which seems to be a better fit for multi-module projects than dependencyCheckAnalyze (the difference is vague to me from plugin's documentation). - you can run the "gradlew owasp" task explicitly and it'll run the validation without any flags. - the owasp task is only added to check if validation.owasp property is true. I think this should stay as the default on non-CI systems (developer defaults) because it's a significant chunk of time it takes to download and validate dependencies. - I'm not sure all configurations should be included in the check... perhaps we should only limit ourselves to actual runtime dependencies not build dependencies, solr-ref-guide, etc.	2020-01-26 10:45:05 +01:00
Jan Høydahl	74a8d6d5ac	SOLR-11207: Add OWASP dependency checker to gradle build (#1121) * SOLR-11207: Add OWASP dependency checker to gradle build	2020-01-26 10:01:51 +01:00
Robert Muir	f5e9bb9493	LUCENE-9165: explicitly cast with the horrible groovy language so that numbers above 9 don't fail	2020-01-24 09:53:47 -05:00
Robert Muir	4d61e4aaab	change generate-defaults.gradle not to cap testsJvms at 4	2020-01-24 08:49:17 -05:00
Robert Muir	9dae566ee7	LUCENE-9160: add params/docs to override jvm params in gradle build, default C2 off in tests. Adds some build parameters to tune how tests run. There is an example shown by "gradle helpLocalSettings" Default C2 off in tests as it is wasteful locally and causes slowdown of tests runs. You can override this by setting tests.jvmargs for gradle, or args for ant. Some crazy lucene stress tests may need to be toned down after the change, as they may have been doing too many iterations by default... but this is not a new problem.	2020-01-22 09:58:30 -05:00
Robert Muir	3ecd7a03aa	LUCENE-9159: merge gradle/ant test security policies (main file)	2020-01-21 23:43:31 -05:00
Robert Muir	7e0534d87c	LUCENE-9159: merge gradle/ant test security policies	2020-01-21 21:26:37 -05:00
Dawid Weiss	351b30489c	LUCENE-9077: Enable javac linting as in ant. TONS of warnings are currently printed.	2020-01-20 10:10:48 +01:00
Dawid Weiss	1ad6bc9361	LUCENE-9077: Allow locally staged files in git status precommit check.	2020-01-20 09:36:14 +01:00
Nicholas Knize	78655239c5	LUCENE-8369: Remove obsolete spatial module	2020-01-16 11:22:05 -06:00
Dawid Weiss	087b2e1c0d	LUCENE-9077: Emit the location of test output on failure.	2020-01-15 14:01:20 +01:00

123 Commits