Before we can expose options to configure this postings format
on a per-reader basis, we need to expose the option to load the terms
index FST on or off heap on the postings format itself. This already
allows an expert user to change the default in a per-field postings
format. It essentially provides a way to change the defaults globally,
although it still involves some glue code.
The recent change introducing ByteArrayUtf8CharSequence altered the
NamedLists produced by atomic-update requests so that they include
instances of this class for requests coming in as javabin. This is a
problem for 'remove' atomic-updates, which need to be able to compare
these ByteArrayUtf8CharSequence instances with existing field values
represented as Strings. equals() always returned false, so
'remove' operations had no effect.
This commit converts items as necessary to allow atomic-update
operations to work as expected.
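
A minimal sketch of the underlying problem, not the actual Solr code:
StringBuilder stands in for ByteArrayUtf8CharSequence as a non-String
CharSequence, and the normalize() and remove() helpers are hypothetical
names used only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

class RemoveValuesSketch {
    // Hypothetical helper: turn any CharSequence into a String so that
    // equals() compares contents rather than failing on the type.
    static Object normalize(Object o) {
        return (o instanceof CharSequence) ? o.toString() : o;
    }

    // Removes the first occurrence of each requested value from a copy
    // of the existing field values.
    static List<Object> remove(List<Object> existing, Collection<?> toRemove) {
        List<Object> result = new ArrayList<>(existing);
        for (Object o : toRemove) {
            result.remove(normalize(o));
        }
        return result;
    }

    public static void main(String[] args) {
        List<Object> existing = Arrays.asList("red", "green", "blue");
        // Assumption: StringBuilder plays the role of the javabin value.
        CharSequence incoming = new StringBuilder("green");

        // Without conversion the value is never found.
        System.out.println(existing.contains(incoming));
        // After conversion the matching value is removed.
        System.out.println(remove(existing, Arrays.asList(incoming)));
    }
}
```

Converting at the point of comparison keeps the stored values untouched;
only the incoming items are normalized.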
The previous version of this test had a chicken-and-egg problem: the
servers had to be initialized with the whitelist, but the port numbers
were not known until after init. The test therefore had to 'restart'
the servers, which could lead to 'Address already in use' errors on
Jenkins machines if the OS reclaimed the port between the stop and
start of the Jetty instance.
Prior to this commit, RuleBasedAuthorizationPlugin would check for the
predefined 'ALL' permission only when the endpoint being hit wasn't
associated with another predefined permission.
This resulted in some very unintuitive behavior. For example, the
permission {name:all, role:admin} would correctly prevent a
role:foo user from accessing /admin/info/properties, but would allow
write access to /admin/authorization because of the SECURITY_EDIT
predefined permission associated with that endpoint.
This commit fixes this bug so that the 'all' permission is always
consulted whether or not the endpoint is associated with other predefined
permissions.
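
The difference can be sketched with a simplified model. This is not
the plugin's real code: the Permission class, predefinedFor(), the
"security-edit" name and the alwaysConsultAll flag are all assumptions
made for illustration, and the model treats an endpoint with no
applicable permission as open.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Set;

class AllPermissionSketch {
    /** A configured permission: a name plus the role required to use it. */
    static final class Permission {
        final String name;
        final String role;
        Permission(String name, String role) { this.name = name; this.role = role; }
    }

    /** Hypothetical lookup of the predefined permission tied to an endpoint, or null. */
    static String predefinedFor(String path) {
        return path.equals("/admin/authorization") ? "security-edit" : null;
    }

    /**
     * The first applicable configured permission decides. With
     * alwaysConsultAll=false, 'all' is skipped for endpoints that have a
     * predefined permission, which models the old behavior.
     */
    static boolean isAllowed(List<Permission> configured, String path,
                             Set<String> userRoles, boolean alwaysConsultAll) {
        String predefined = predefinedFor(path);
        for (Permission p : configured) {
            boolean isAll = p.name.equals("all") && (alwaysConsultAll || predefined == null);
            boolean isPredefined = p.name.equals(predefined);
            if (isAll || isPredefined) {
                return userRoles.contains(p.role);
            }
        }
        return true; // nothing configured applies, so the model leaves it open
    }

    public static void main(String[] args) {
        List<Permission> configured = Arrays.asList(new Permission("all", "admin"));
        Set<String> fooUser = Collections.singleton("foo");
        for (String path : Arrays.asList("/admin/info/properties", "/admin/authorization")) {
            System.out.println(path
                + " old=" + isAllowed(configured, path, fooUser, false)
                + " new=" + isAllowed(configured, path, fooUser, true));
        }
    }
}
```

In this model the role:foo user is denied on both endpoints once 'all'
is always consulted, while the old check lets /admin/authorization
through.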
We have a number of IntervalsSource implementations where automatic minimization of
disjunctions can lead to surprising results:
* PHRASE queries can miss matches because a longer matching sub-source is minimized
away, leaving a gap
* MAXGAPS queries can miss matches for the same reason
* CONTAINING, NOT_CONTAINING, CONTAINED_BY and NOT_CONTAINED_BY queries
can miss matches if the 'big' interval gets minimized
The proper way to deal with this is to rewrite the queries by pulling disjunctions to the top
of the query tree, so that PHRASE("a", OR(PHRASE("b", "c"), "c")) is rewritten to
OR(PHRASE("a", "b", "c"), PHRASE("a", "c")). To be able to do this generally, we need to
add a new pullUpDisjunctions() method to IntervalsSource that performs this rewriting
for each source that it would apply to.
Because these rewritten queries will in general be less efficient due to the duplication of
effort (e.g. the rewritten PHRASE query above pulls 5 term iterators rather than 4 in the
original), we also add an option to Intervals.or() that will prevent this rewriting, so that
consumers can choose speed over accuracy if it suits their use case.
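
The rewrite and its cost can be illustrated on a toy query tree. This
is only a sketch and does not use the Lucene API: the Node class and
the alternatives(), render() and countTerms() helpers are hypothetical,
and only TERM, PHRASE and OR are modelled.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class PullUpSketch {
    /** Toy query node; kind is "TERM", "PHRASE" or "OR". */
    static final class Node {
        final String kind;
        final String term;
        final List<Node> children;
        private Node(String kind, String term, List<Node> children) {
            this.kind = kind; this.term = term; this.children = children;
        }
        static Node term(String t) { return new Node("TERM", t, Collections.<Node>emptyList()); }
        static Node phrase(Node... c) { return new Node("PHRASE", null, Arrays.asList(c)); }
        static Node or(Node... c) { return new Node("OR", null, Arrays.asList(c)); }
    }

    /** Expands a node into its disjunction-free alternatives, each a flat term sequence. */
    static List<List<String>> alternatives(Node n) {
        List<List<String>> out = new ArrayList<>();
        if (n.kind.equals("TERM")) {
            out.add(Collections.singletonList(n.term));
        } else if (n.kind.equals("OR")) {
            for (Node c : n.children) out.addAll(alternatives(c));
        } else { // PHRASE: cross product of the children's alternatives
            out.add(new ArrayList<String>());
            for (Node c : n.children) {
                List<List<String>> next = new ArrayList<>();
                for (List<String> prefix : out) {
                    for (List<String> alt : alternatives(c)) {
                        List<String> joined = new ArrayList<>(prefix);
                        joined.addAll(alt);
                        next.add(joined);
                    }
                }
                out = next;
            }
        }
        return out;
    }

    /** Renders the alternatives as a top-level OR of flat PHRASEs. */
    static String render(List<List<String>> alts) {
        List<String> phrases = new ArrayList<>();
        for (List<String> alt : alts) {
            List<String> quoted = new ArrayList<>();
            for (String t : alt) quoted.add('"' + t + '"');
            phrases.add("PHRASE(" + String.join(", ", quoted) + ")");
        }
        return "OR(" + String.join(", ", phrases) + ")";
    }

    /** Counts term leaves, i.e. the term iterators the query would pull. */
    static int countTerms(Node n) {
        if (n.kind.equals("TERM")) return 1;
        int sum = 0;
        for (Node c : n.children) sum += countTerms(c);
        return sum;
    }

    public static void main(String[] args) {
        Node q = Node.phrase(Node.term("a"),
                Node.or(Node.phrase(Node.term("b"), Node.term("c")), Node.term("c")));
        List<List<String>> alts = alternatives(q);
        int rewrittenTerms = 0;
        for (List<String> alt : alts) rewrittenTerms += alt.size();
        System.out.println(render(alts));
        System.out.println(countTerms(q) + " terms before, " + rewrittenTerms + " after");
    }
}
```

The cross product in the PHRASE branch is what duplicates the "a" and
"c" terms, which is the extra cost the Intervals.or() option lets
consumers avoid.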
FilterDirectory.getPendingDeletions() did not delegate the call, which
resulted in a new IndexWriter on the same directory not considering
files pending deletion. This could in turn result in a
FileAlreadyExistsException when running on Windows.
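
A minimal sketch of why the missing delegation hides pending deletions.
The classes below are hypothetical stand-ins rather than the Lucene
ones, and the empty-set default on the base class is an assumption made
for the illustration.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

class DelegationSketch {
    /** Stand-in base class; assumed to report no pending deletions by default. */
    static abstract class Dir {
        Set<String> getPendingDeletions() { return Collections.emptySet(); }
    }

    /** A directory that could not delete some files yet and remembers them. */
    static class TrackingDir extends Dir {
        private final Set<String> pending = new HashSet<>();
        void markPending(String file) { pending.add(file); }
        @Override
        Set<String> getPendingDeletions() { return Collections.unmodifiableSet(pending); }
    }

    /** Wrapper that forgets to forward the call, so callers see the empty default. */
    static class NonDelegatingFilter extends Dir {
        final Dir in;
        NonDelegatingFilter(Dir in) { this.in = in; }
    }

    /** Wrapper that forwards the call to the wrapped directory. */
    static class DelegatingFilter extends Dir {
        final Dir in;
        DelegatingFilter(Dir in) { this.in = in; }
        @Override
        Set<String> getPendingDeletions() { return in.getPendingDeletions(); }
    }

    public static void main(String[] args) {
        TrackingDir base = new TrackingDir();
        base.markPending("_0.cfs");
        System.out.println(new NonDelegatingFilter(base).getPendingDeletions());
        System.out.println(new DelegatingFilter(base).getPendingDeletions());
    }
}
```

A writer that asks the non-delegating wrapper sees no pending files and
may try to create a file name that still exists on disk.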