The latest version of Lucene deprecated `Query#setBoost` and `Query#getBoost`, which makes queries effectively immutable. Any query that needs boosting must instead be wrapped in a `BoostQuery`.
This commit replaces usages of `setBoost` with `BoostQuery` and adds `setBoost` to forbidden-apis for production code.
Usages of `getBoost` are only partially removed, as some will have to stay for backwards compatibility.
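The wrapping pattern can be sketched roughly as follows. To keep the snippet runnable without Lucene on the classpath, `SketchQuery`, `SketchTermQuery`, and `SketchBoostQuery` are hypothetical stand-ins, not Lucene classes; in real code the wrapper is Lucene's `BoostQuery(Query, float)`. The `withBoost` and `effectiveBoost` helpers are likewise assumptions about how a caller might apply or recover a boost, not code from this commit.

```java
import java.util.Objects;

public class BoostWrappingSketch {

    /** Hypothetical stand-in for an immutable query: no setBoost, no getBoost. */
    interface SketchQuery {}

    /** Hypothetical stand-in for a leaf query such as a term query. */
    static final class SketchTermQuery implements SketchQuery {
        final String field;
        final String value;

        SketchTermQuery(String field, String value) {
            this.field = Objects.requireNonNull(field);
            this.value = Objects.requireNonNull(value);
        }
    }

    /** Hypothetical stand-in mirroring the shape of a boost wrapper. */
    static final class SketchBoostQuery implements SketchQuery {
        final SketchQuery query;
        final float boost;

        SketchBoostQuery(SketchQuery query, float boost) {
            this.query = Objects.requireNonNull(query);
            this.boost = boost;
        }
    }

    /** Before: query.setBoost(boost). After: wrap, leaving the original untouched. */
    static SketchQuery withBoost(SketchQuery query, float boost) {
        // A boost of 1 changes nothing, so skip the wrapper in that case.
        return boost == 1.0f ? query : new SketchBoostQuery(query, boost);
    }

    /** Multiplies the boosts of any nested wrappers; 1.0 when there is none. */
    static float effectiveBoost(SketchQuery query) {
        float boost = 1.0f;
        while (query instanceof SketchBoostQuery) {
            SketchBoostQuery wrapper = (SketchBoostQuery) query;
            boost *= wrapper.boost;
            query = wrapper.query;
        }
        return boost;
    }

    public static void main(String[] args) {
        SketchQuery term = new SketchTermQuery("title", "lucene");
        SketchQuery boosted = withBoost(withBoost(term, 2.0f), 1.5f);
        System.out.println(effectiveBoost(term));
        System.out.println(effectiveBoost(boosted));
    }
}
```

Because the wrapper holds the boost, the wrapped query never changes after construction, so it can be shared or cached safely.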
Closes #14264
If you run tests under a 32-bit JVM, you will get a test failure in IndexStoreTests:
the logic there is wrong in the case of 32-bit (it's NIOFSDirectory on Linux).
Also, if mlockall fails, you'll see huge bogus values (because of the use of `long` instead of `NativeLong`).
Finally, add seccomp support for 32-bit too, and clean up all its `long` usage as well.
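One plausible way the bogus values arise, sketched without JNA so it runs anywhere: on a 32-bit platform a native `long` is 4 bytes, so a struct of two native longs occupies 8 bytes, and reading its first field as an 8-byte Java `long` swallows the second field too. The two-field layout, the little-endian byte order, and the value 65536 are assumptions chosen for illustration; the commit itself only says that `long` was used where `NativeLong` was needed.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class NativeLongWidthSketch {

    /** Builds 8 bytes holding two 4-byte fields (assumed 32-bit little-endian layout). */
    static ByteBuffer twoFieldStruct(int first, int second) {
        ByteBuffer struct = ByteBuffer.allocate(8).order(ByteOrder.LITTLE_ENDIAN);
        struct.putInt(0, first);
        struct.putInt(4, second);
        return struct;
    }

    /** Wrong on 32-bit: treats the first field as 8 bytes wide, like a Java long. */
    static long readFirstFieldAsJavaLong(ByteBuffer struct) {
        return struct.getLong(0);
    }

    /** Right on 32-bit: reads only the 4 bytes the field occupies, then widens as unsigned. */
    static long readFirstFieldAsNativeLong(ByteBuffer struct) {
        return Integer.toUnsignedLong(struct.getInt(0));
    }

    public static void main(String[] args) {
        ByteBuffer struct = twoFieldStruct(65536, 65536);
        System.out.println("as 8-byte long:  " + readFirstFieldAsJavaLong(struct));
        System.out.println("as 4-byte field: " + readFirstFieldAsNativeLong(struct));
    }
}
```

JNA's `NativeLong` avoids the mismatch by sizing itself to the platform's native `long`, so the same mapping works on both 32-bit and 64-bit systems.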
There have been security issues with Tika's parsers in the past...
Let's take away the network, filesystem, everything we can.
In some way, parsing these docs is a lot like executing untrusted code.
I know it's not pretty, but I think it's worth it.
This patch adds a zip of about 200 files from Tika's test suite,
and we assert some content comes back from each. This is a good exercise
of the various formats.
I removed any huge files to try to keep size reasonable, but we want
a bit of a variety so we know stuff is working.
I fixed issues with the parser config by running this.
This removes a lot of obscure parsers and leaves us with the basics.
This includes at least all of the formats listed on
https://github.com/elastic/elasticsearch-mapper-attachments/issues/163
I will start adding tests for each one of these document formats,
and take it as it goes and see what trouble we run into.
Closes #163
The plugin name currently defaults to the Gradle project name. But the
Gradle project name for a standalone repo (like an external plugin would
be) defaults to the directory name of the repo. This is trappy, since it
depends on how the repo was checked out.
This change enforces that the plugin name is always set.
Closes #14603