Commit Graph

150 Commits

Author SHA1 Message Date
Uwe Schindler 951efc95be
LUCENE-9278: Improved options file creation: All parameters are escaped automatically, arguments don't need to be strings (they are converted during building options file) (#1479) 2020-05-02 09:53:51 +02:00
Philippe Ouellet 7a849f6943
LUCENE-9354: Sync French stop words with latest version from Snowball. (#1474)
* Sync French stop words with latest version from Snowball.

This new version removed some French homonyms from the list.

* Use latest master commit from snowball-website

* LUCENE-9354: regenerate with 'gradle snowball'

* LUCENE-9354: add CHANGES.txt entry
2020-05-01 21:11:35 -04:00
Erick Erickson 217c2faa2c LUCENE-7788: fail precommit on unparameterised log messages and examine for wasted work/objects 2020-05-01 13:06:57 -04:00
Uwe Schindler 26b0b54bd3
LUCENE-9278: Fix javadocs task to work on windows and with whitespace in project folder (#1476) 2020-05-01 17:49:47 +02:00
Erick Erickson 9ae05e9b4f LUCENE-7788: fail precommit on unparameterised log messages and examine for wasted work/objects 2020-04-30 19:50:31 -04:00
Dawid Weiss 26c9fce5db LUCENE-9278: concatenate paths for sourcepath using path separator rather than whitespace (which causes invalid option to be passed to javadoc). 2020-04-30 10:24:23 +02:00
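The fix described in the entry above (joining source directories with the platform path separator instead of whitespace) can be illustrated with a minimal Python sketch. The helper name `join_sourcepath` and the example directories are hypothetical and are not taken from the Lucene build scripts.

```python
import os


def join_sourcepath(dirs):
    """Join source directories into a single sourcepath-style argument.

    Uses the platform path separator (':' on Unix-like systems, ';' on
    Windows) rather than a space, so directories that contain spaces are
    not split into separate, invalid options.
    """
    return os.pathsep.join(str(d) for d in dirs)


# Hypothetical directories; the first one contains a space on purpose.
example = join_sourcepath(["my project/core/src/java", "my project/util/src/java"])
print(example)
```

Joining on whitespace would turn the two directories above into four separate tokens, which is the kind of invalid option the commit message refers to.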
Tomoko Uchida 5354f7e88e
LUCENE-9333: Add gradle task to compile changes.txt to a html (#1468) 2020-04-30 17:21:53 +09:00
Erick Erickson 6e96d01efc LUCENE-7788: fail precommit on unparameterised log messages and examine for wasted work/objects 2020-04-29 10:56:54 -04:00
Erick Erickson 960610a615 LUCENE-7788: fail precommit on unparameterised log messages and examine for wasted work/objects 2020-04-27 20:45:57 -04:00
Erick Erickson ff4363675e LUCENE-7788: fail precommit on unparameterised log messages and examine for wasted work/objects 2020-04-27 08:34:10 -04:00
Tomoko Uchida f03e6aac59
SOLR-14429: Convert .txt files to properly formatted .md files (#1450) 2020-04-27 08:43:04 +09:00
Erick Erickson 8867f465dc LUCENE-7788: fail precommit on unparameterised log messages and examine for wasted work/objects 2020-04-26 09:15:21 -04:00
Erick Erickson ecc98e8698 LUCENE-7788: fail precommit on unparameterised log messages and examine for wasted work/objects 2020-04-24 13:34:03 -04:00
Erick Erickson e43b17962a LUCENE-7788: fail precommit on unparameterised log messages and examine for wasted work/objects 2020-04-22 22:32:49 -04:00
Erick Erickson c94770c2b9 LUCENE-7788: fail precommit on unparameterised log messages and examine for wasted work/objects 2020-04-20 21:08:15 -04:00
Erick Erickson f01c040ab3 LUCENE-7788: fail precommit on unparameterised log messages and examine for wasted work/objects 2020-04-19 15:58:50 -04:00
Erick Erickson 1f1cdbffdf LUCENE-7788: fail precommit on unparameterised log messages and examine for wasted work/objects 2020-04-18 19:08:15 -04:00
Erick Erickson 3af165b32a LUCENE-7788: fail precommit on unparameterised log messages and examine for wasted work/objects 2020-04-17 20:40:32 -04:00
Shalin Shekhar Mangar 13f19f6555 SOLR-9906: SolrjNamedThreadFactory is deprecated in favor of SolrNamedThreadFactory. DefaultSolrThreadFactory is removed from solr-core in favor of SolrNamedThreadFactory in solrj package and all solr-core classes now use SolrNamedThreadFactory 2020-04-13 08:16:35 +05:30
Dawid Weiss 7279190c89 LUCENE-9316: Incorporate all :precommit tasks into :check 2020-04-12 13:32:54 +02:00
Dawid Weiss 9244558752 LUCENE-9201: remove javadoc task remnants. Make javadoc depend on renderJavadoc and skip the default gradle's implementation. 2020-04-12 12:55:56 +02:00
Dawid Weiss fea1ce0062 LUCENE-9278: move declaration calling getTemporaryDir inside the execution block closure so that gradlew clean renderJavadoc doesn't wipe out the temporary directory before the task has a chance to run. 2020-04-12 12:27:14 +02:00
Dawid Weiss 2437f3f56c
LUCENE-9311: detect intellij reimport and modify sourceset to exclude solr-ref-guide/tools (#1422) 2020-04-09 13:55:42 +02:00
Tomoko Uchida 4f92cd414c
LUCENE-9278: Use -linkoffline instead of relative paths to make links to other projects (#1388) 2020-04-09 08:44:07 +09:00
Dawid Weiss dbb4be1ca9 LUCENE-9310: workaround for IntelliJ gradle import 2020-04-08 19:46:35 +02:00
Erick Erickson e1e2085e94 SOLR-14386: Update Jetty to 9.4.27 and dropwizard-metrics version to 4.1.5 2020-04-04 16:14:57 -04:00
Dawid Weiss d32858b127
LUCENE-9301: add manifest entries to JARs (gradle build). 2020-04-04 20:56:35 +02:00
Mike Drob e25ab4204f LUCENE-9266 remove gradle wrapper jar from source
ASF Release Policy states that we cannot have binary JAR files checked
in to our source releases. A few other projects have solved this by
modifying their generated gradlew scripts to download a copy of the
wrapper jar.

We now have a version and checksum file in ./gradle/wrapper directory
used for verifying the wrapper jar, and will take advantage of single
source java execution to verify and download.

The gradle wrapper jar will continue to be available in the git
repository, but will be excluded from src tarball generation. This
should not change workflows for any users, since we expect the gradlew
script to get the jar when it is missing.

Co-authored-by: Dawid Weiss <dweiss@apache.org>
2020-04-02 11:30:01 -05:00
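The verify-then-download idea in the entry above can be sketched as follows. This is only an illustration in Python: the commit itself relies on single-source Java execution, and the choice of SHA-256, the file layout, and the function names here are assumptions, not details confirmed by the commit message.

```python
import hashlib
from pathlib import Path


def sha256_of(path):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def jar_is_valid(jar_path, checksum_path):
    """Check a jar against the expected checksum stored in a text file.

    Returns False when the jar is missing, which is the case where a
    launcher script would go on to download it (download not shown).
    """
    jar = Path(jar_path)
    if not jar.exists():
        return False
    expected = Path(checksum_path).read_text().strip().lower()
    return sha256_of(jar) == expected
```

A launcher would call `jar_is_valid` first and fetch the jar only when the check fails, so an existing, untampered jar causes no network access.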
Tomoko Uchida d4a137d2b6
LUCENE-9242: generate javadocs by calling Ant javadoc task (#1304) 2020-03-12 00:09:26 +09:00
Tomoko Uchida 312d6b2a0d LUCENE-9201: Add an equivalent to "check missing javadocs" task to gradle build
Co-Authored-By: Dawid Weiss <dawid.weiss@carrotsearch.com>
2020-02-24 11:05:35 +09:00
Dawid Weiss cb68d7d2c5 LUCENE-9232: add a script-hack check so that in case somebody upgrades the scripts automatically they'll know they need to add the hack. 2020-02-21 10:40:27 +01:00
Dawid Weiss f8a2c39906 LUCENE-9155: add missing naist dictionary generation, clean up the code a bit. 2020-02-21 10:24:05 +01:00
Robert Muir 9302eee1e0
LUCENE-9235: upgrade all python to python3
Die, python2, die.

Some generated .java files change (parameterized automata for
spell-correction).

This is because the order of python dictionaries was not well-defined
previously. A sort() was added so that the python code now generates
reproducible output (Thanks @mikemccand).

So we'll suffer a change once, but the automata are equivalent. If you
run the script again you should not see source code changes.

The relevant unit tests are exhaustive (if you trust the paper!), so we can
be confident it does not break things, even though it looks very scary.
2020-02-20 21:27:38 -05:00
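The reproducibility point in the entry above (adding a sort so generated output no longer depends on dictionary ordering) can be shown with a small sketch. The `emit_table` function and its input shape are hypothetical and do not reproduce the actual automata generator.

```python
def emit_table(transitions):
    """Render a state -> targets mapping as text in a stable order.

    Sorting both the states and each target collection means the output
    depends only on the contents of the mapping, not on the order in
    which entries were inserted or on set iteration order.
    """
    lines = []
    for state in sorted(transitions):
        targets = ", ".join(str(t) for t in sorted(transitions[state]))
        lines.append(f"{state}: {targets}")
    return "\n".join(lines)


# The same mapping built in two different orders yields identical text.
first = emit_table({2: {3, 1}, 1: {2}})
second = emit_table({1: {2}, 2: {1, 3}})
print(first == second)
```

With this kind of ordering in place, rerunning a generator on unchanged input should leave the generated sources unchanged, which matches the behaviour the commit message describes.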
Anshum Gupta cb18586ea0
LUCENE-9155: Add Apache License header to the Kuromoji dictionary compilation (#1271) 2020-02-20 14:59:06 -08:00
Dawid Weiss 62662e477a LUCENE-9155: Port Kuromoji dictionary compilation (regenerate). 2020-02-20 19:00:56 +01:00
Dawid Weiss 7604639b59 Move jgit version declaration to scriptDepVersions. 2020-02-20 13:54:07 +01:00
Robert Muir b9a569e7be
LUCENE-9230: explicitly call python version we want from builds
On newer linux distros, at least, 'python' now means python3. So
we can't rely on what version of python it will invoke (at least for a
few years).

For example in Fedora Linux:

https://fedoraproject.org/wiki/Changes/Python_means_Python3

For python2.x code, explicitly call 'python2.7' and for python3.x code,
explicitly call 'python3'.

Ant variable names are cleaned up, e.g. 'python.exe' is renamed to
'python2.exe' and 'python32.exe' is renamed to 'python3.exe'. This also
makes it easy to identify remaining python 2.x code that should be
migrated to python 3.x
2020-02-18 18:58:17 -05:00
Dawid Weiss 491c99a3de LUCENE-9232: tone down daemon defaults in generated local settings. 2020-02-18 19:43:39 +01:00
Dawid Weiss 2a88aa9d0f LUCENE-9219: Port ECJ-based linter to gradle
Co-authored-by: Tomoko Uchida <tomoko@apache.org>
2020-02-19 02:43:47 +09:00
Robert Muir ccb390d4a6
LUCENE-9220: prevent zip file reproducibility issues based on users umask 2020-02-17 13:34:00 -05:00
Robert Muir 0203815ab2
LUCENE-9220: regenerate all stemmers/stopwords/test data from snowball 2.0 (#1262)
Previous situation:

* The snowball base classes (Among, SnowballProgram, etc) had accumulated local performance-related changes. There was a task that would also "patch" generated classes (e.g. GermanStemmer) after-the-fact.
* Snowball classes had many "non-changes" from the original such as removal of tabs, addition of javadocs, license headers, etc.
* Snowball test data (inputs and expected stems) was incorporated into lucene testing, but this was maintained manually. Also files had become large, making the test too slow (Nightly).
* Snowball stopwords lists from their website were manually maintained. In some cases encoding fixes were manually applied.
* Some generated stemmers (such as Estonian and Armenian) exist in lucene, but have no corresponding `.sbl` file in snowball sources at all.

Besides this mess, snowball project is "moving along" and acquiring new languages, adding non-BSD-licensed test data, huge test data, and other complexity. So it is time to automate the integration better.

New situation:

* Lucene has a `gradle snowball` regeneration task. It works on Linux or Mac only. It checks out their repos, applies the `snowball.patch` in our repository, compiles snowball stemmers, regenerates all java code, and applies any adjustments so that our build is happy.
* Test data is automatically regenerated from the commit hash of the snowball test data repository. Not all languages are tested from their data: only where the license is simple BSD. Test data is also (deterministically) sampled, so that we don't have huge files. We just want to make sure our integration works.
* Randomized tests are still set to test every language with generated fake words. The regeneration task ensures all languages get tested (it writes a simple text file list of them).
* Stopword files are automatically regenerated from the commit hash of the snowball website repository.
* The regeneration procedure is idempotent. This way when stuff does change, you know exactly what happened. For example if test data changes to a different license, you may see a git deletion. Or if a new language/stopwords/test data gets added, you will see git additions.
2020-02-17 12:38:01 -05:00
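The deterministic sampling mentioned in the entry above could look roughly like the sketch below. The commit message does not say how the sampling is done; the fixed-seed approach, the `sample_lines` name, and the parameters are assumptions made for illustration only.

```python
import random


def sample_lines(lines, limit, seed=0):
    """Pick at most `limit` lines, the same ones on every run.

    A private random generator seeded with a fixed value makes the
    selection repeatable, and sorting the chosen indices keeps the
    sampled lines in their original order.
    """
    if len(lines) <= limit:
        return list(lines)
    rng = random.Random(seed)
    keep = sorted(rng.sample(range(len(lines)), limit))
    return [lines[i] for i in keep]


# Hypothetical vocabulary; rerunning gives the same ten entries each time.
words = [f"word{i}" for i in range(1000)]
print(sample_lines(words, 10) == sample_lines(words, 10))
```

Because the selection is a pure function of the input and the seed, regenerating the sampled file from the same upstream commit produces no diff, which fits the idempotent regeneration the commit describes.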
Dawid Weiss dcf448efeb LUCENE-9134: Minor cleanups. 2020-02-13 11:18:01 +01:00
Erick Erickson f9357ab0d2
LUCENE-9134: Port ant-regenerate tasks to Gradle build (util and packed) (#1251)
* LUCENE-9134: Port ant-regenerate tasks to Gradle build
2020-02-11 18:56:11 -05:00
Robert Muir f41eabdc5f
LUCENE-8279: fix javadocs wrong header levels and accessibility issues
Java 13 adds a new doclint check under "accessibility" that the html
header nesting level isn't crazy.

Many are incorrect because the html4-style javadocs had horrible
font-sizes, so developers used the wrong header level to work around it.
This is no issue in trunk (always html5).

Java recommends against using such structured tags at all in javadocs,
but that is a more involved change: this just "shifts" header levels
in documents to be correct.
2020-02-08 10:00:00 -05:00
Robert Muir a77bb1e6f5
LUCENE-9201: add overview.html from correct location to the javadocs in gradle build 2020-02-07 00:18:20 -05:00
Robert Muir 0d339043e3
LUCENE-9209: fix javadocs to be html5, enable doclint html checks, remove jtidy
Current javadocs declare an HTML5 doctype: !DOCTYPE HTML. Some HTML5
features are used, but unfortunately also some constructs that do not
exist in HTML5 are used as well.

Because of this, we have no checking of any html syntax. jtidy is
disabled because it works with html4. doclint is disabled because it
works with html5. our docs are neither.

javadoc "doclint" feature can efficiently check that the html isn't
crazy. we just have to fix really ancient removed/deprecated stuff
(such as use of tt tag).

This enables the html checking in both ant and gradle. The docs are
fixed via straightforward transformations.

One exception is table cellpadding, for this some helper CSS classes
were added to make the transition easier (since it must apply padding
to inner th/td, not possible inline). I added TODOs, we should clean
this up. Most problems look like they may have been generated from a
GUI or similar and not a human.
2020-02-06 22:30:52 -05:00
Tomoko Uchida f3cd1dbde3 LUCENE-9077: Force locale en_US on Javadoc task (workaround for JDK-8222793) 2020-02-07 01:36:45 +09:00
Erick Erickson b0bb299dc4
LUCENE-9134: Port ant-regenerate tasks to Gradle build (#1230)
LUCENE-9134: Port ant-regenerate tasks to Gradle build (Solr javacc)
2020-02-04 09:16:38 -05:00
Erick Erickson 5253c0cb74
LUCENE-9134 Port ant-regenerate tasks to Gradle build (#1226)
LUCENE-9134: Port ant-regenerate tasks to Gradle build Javacc sub-task. Closes #1226
2020-01-31 17:04:10 -05:00
Robert Muir 4b5105e167
LUCENE-9193: heap allocations for tests.profile
Can be a bit noisier than cpu sampling, due to how threads are allocated
in tests... maybe we can improve that in the future.
2020-01-30 08:29:10 -05:00