The `BigArrays` utility class creates arrays of various sizes: small arrays
are allocated directly on the heap, while larger arrays are paged and recycle
their pages through `PageCacheRecycler`. We already track pages, but that
tracking is rarely triggered: it only applies to large amounts of data, while
our tests work on small amounts of data in order to be fast.
Tracking arrays directly helps make sure that we never forget to release them.
This pull request also improves testing by:
- putting random content in the arrays upon release: since pages are recycled,
the content of a released array may be reused for another purpose, so this
makes sure that consumers stop using arrays once they have released them
- putting random content in the arrays upon creation and resize when
`clearOnResize` is `false`.
The major difference with `master` is that the `BigArrays` class is now
instantiable, injected via Guice and usually available through the
`SearchContext`. This way, it can be mocked in tests.
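The tracking and randomization described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual `BigArrays` code: `TrackingByteArrays` and its methods are hypothetical names, and plain `byte[]` stands in for the paged array types.

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Random;
import java.util.Set;

// Hypothetical stand-in for a mockable array factory; names are illustrative only.
final class TrackingByteArrays {
    // Arrays that were handed out and not yet released, compared by identity.
    private final Set<byte[]> live =
            Collections.newSetFromMap(new IdentityHashMap<byte[], Boolean>());
    private final Random random;

    TrackingByteArrays(long seed) {
        this.random = new Random(seed);
    }

    // Allocates an array. When clear is false the content is randomized so that
    // callers cannot accidentally rely on it being zeroed.
    synchronized byte[] newArray(int size, boolean clear) {
        byte[] array = new byte[size];
        if (!clear) {
            random.nextBytes(array);
        }
        live.add(array);
        return array;
    }

    // Releases an array and scrambles it so that use-after-release shows up quickly.
    synchronized void release(byte[] array) {
        if (!live.remove(array)) {
            throw new IllegalStateException("array released twice or never allocated");
        }
        random.nextBytes(array);
    }

    // A test would call this at teardown and fail if the result is non-zero.
    synchronized int unreleasedCount() {
        return live.size();
    }
}
```

Because the factory is an ordinary instance rather than a set of static helpers, a test can swap it in wherever the production factory would be injected.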
It was being invoked once per reader and parent type combination, resulting in more memory being reported to the circuit breaker than was actually being used by field data.
Lucene 4.7 supports a setter for the `filler_token` that is
inserted if there are gaps in the token stream. This change exposes
this setting.
Closes #4307
The clients raise an exception in case of failure rather than returning the whole JSON response containing the failures, so these tests can only work with the Java REST tests runner.
The `ShardOperationFailedException` is now created within `TransportIndexReplicationAction` passing in the current shard id as a constructor argument.
Also replaced `AtomicReferenceArray<Object>` with `AtomicReferenceArray<ShardActionResult>`, where `ShardActionResult` wraps the `ShardResponse` or the failure, containing all the needed info.
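A typed result holder of the kind described above might look like the sketch below. It is an assumption-laden illustration: the real `ShardActionResult` is not shown in this description, so the generic parameter, the field names and the `ShardResults.countFailures` helper are all hypothetical.

```java
import java.util.concurrent.atomic.AtomicReferenceArray;

// Hypothetical sketch: holds either a response or a failure for one shard.
final class ShardActionResult<R> {
    final int shardId;
    final R response;        // null when the shard failed
    final Throwable failure; // null when the shard succeeded

    private ShardActionResult(int shardId, R response, Throwable failure) {
        this.shardId = shardId;
        this.response = response;
        this.failure = failure;
    }

    static <R> ShardActionResult<R> success(int shardId, R response) {
        return new ShardActionResult<R>(shardId, response, null);
    }

    static <R> ShardActionResult<R> failure(int shardId, Throwable failure) {
        return new ShardActionResult<R>(shardId, null, failure);
    }

    boolean isFailure() {
        return failure != null;
    }
}

// Hypothetical helper showing how the typed array is consumed without casts.
final class ShardResults {
    static <R> int countFailures(AtomicReferenceArray<ShardActionResult<R>> results) {
        int failures = 0;
        for (int i = 0; i < results.length(); i++) {
            ShardActionResult<R> result = results.get(i);
            // A null slot means that shard has not reported yet.
            if (result != null && result.isFailure()) {
                failures++;
            }
        }
        return failures;
    }
}
```

Compared with `AtomicReferenceArray<Object>`, no `instanceof` checks are needed to tell responses from failures, and the shard id travels with the result.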
seed from the main master seed. Removed shared cluster's seed entirely.
The problem here is that if the cluster's seed is not given, test times
fluctuate oddly, even for a fixed -Dtests.seed=... This shouldn't be the
case: ideally, a test run with the same master seed should reproduce with
pretty much the same execution time (and internal logic, obviously).
From the code point of view "global" variables are indeed a problem
because JUnit has no notion of before-suite hooks. And RandomizedRunner
doesn't support context accesses in static class initializers (this is
intentional because there is no way to determine when such initializers
will be executed). A workaround is to move such static global variables to
lazily-initialized methods and invoke them (once) in @BeforeClass hooks.
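The workaround above can be sketched as follows. This is a minimal illustration, not the actual test infrastructure: `SharedTestState` is a hypothetical name, and the master seed is passed in explicitly instead of being read from the runner's context. In a real suite, `sharedRandom(...)` would be invoked once from a method annotated with JUnit's `@BeforeClass`.

```java
import java.util.Random;

// Hypothetical holder for suite-wide state that must not be created in a
// static initializer, because the runner context is not available there.
final class SharedTestState {
    private static Random sharedRandom; // deliberately left uninitialized

    // Lazily creates the shared state from the master seed on first use;
    // later calls return the same instance and ignore the argument.
    static synchronized Random sharedRandom(long masterSeed) {
        if (sharedRandom == null) {
            sharedRandom = new Random(masterSeed);
        }
        return sharedRandom;
    }

    // Clears the state so that the next suite derives it again.
    static synchronized void reset() {
        sharedRandom = null;
    }
}
```

Since the state is derived from the master seed at a well-defined point, two runs with the same seed see the same sequence of random values.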
The thread-local recycler requires obtain and recycle to be called on the same thread, while the other recyclers do not. It can also create heavy recycle usage, since that depends on the threads it is being used on. The concurrent / pinned-thread-based one (the default) is by far better than the pure thread-local one, since it bounds the recycled elements more easily while still allowing obtain and recycle to be mixed across threads.
We will end up using the paged recyclers more and more, for example in our networking output buffer, where obtaining happens on one thread while recycling can potentially occur on another (the callback thread). Since binding the two calls to the same thread is not really needed, and our best implementation supports crossing threads, there is no real need to impose this restriction.
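To illustrate what obtaining on one thread and recycling on another looks like, here is a minimal sketch of a bounded, thread-safe page pool. It is not the project's recycler implementation: `ConcurrentPageRecycler`, its constructor parameters and the choice of a `ConcurrentLinkedQueue` are assumptions made for the example.

```java
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical bounded pool whose obtain() and recycle() may run on different threads.
final class ConcurrentPageRecycler {
    private final ConcurrentLinkedQueue<byte[]> free = new ConcurrentLinkedQueue<byte[]>();
    private final AtomicInteger pooled = new AtomicInteger();
    private final int pageSize;
    private final int maxPages;

    ConcurrentPageRecycler(int pageSize, int maxPages) {
        this.pageSize = pageSize;
        this.maxPages = maxPages;
    }

    // Returns a pooled page if one is available, otherwise allocates a new one.
    byte[] obtain() {
        byte[] page = free.poll();
        if (page == null) {
            return new byte[pageSize];
        }
        pooled.decrementAndGet();
        return page;
    }

    // Returns true if the page was kept, false if the pool was full and the
    // page was dropped for the garbage collector to reclaim.
    boolean recycle(byte[] page) {
        if (pooled.incrementAndGet() > maxPages) {
            pooled.decrementAndGet();
            return false;
        }
        free.offer(page);
        return true;
    }
}
```

The counter is reserved before the page is queued, so the pool never holds more than `maxPages` pages regardless of which threads call `recycle`.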
Some of the highlighters require term extraction to be implemented in
order to work. BlendedTermQuery doesn't implement the trivial extraction.
Closes #5246
- introduced an additional destroy() callback that allows better control
over the internals of recycled data
- introduced AbstractRecyclerC as superclass for Recycler factories
(Adrien) with empty destroy() by default
- added tests for destroy()
- cleaned up Recycler tests (reduce copy&paste)
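The role of a destroy() callback with an empty default can be sketched as below. This is an illustration only: `RecyclerCallbacks` and `BoundedDequeRecycler` are hypothetical names, and the sketch assumes that destroy() is invoked when an instance is dropped instead of being kept in the pool. The pool itself is not thread-safe.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical callback base class with an empty destroy() by default.
abstract class RecyclerCallbacks<T> {
    // Creates a fresh instance when the pool is empty.
    abstract T newInstance();

    // Resets an instance before it goes back into the pool.
    abstract void recycle(T value);

    // Called when an instance is dropped instead of pooled; no-op unless overridden.
    void destroy(T value) {
    }
}

// Hypothetical single-threaded pool that keeps at most maxSize instances.
final class BoundedDequeRecycler<T> {
    private final Deque<T> pool = new ArrayDeque<T>();
    private final RecyclerCallbacks<T> callbacks;
    private final int maxSize;

    BoundedDequeRecycler(RecyclerCallbacks<T> callbacks, int maxSize) {
        this.callbacks = callbacks;
        this.maxSize = maxSize;
    }

    T obtain() {
        T value = pool.pollFirst();
        return value != null ? value : callbacks.newInstance();
    }

    void release(T value) {
        if (pool.size() < maxSize) {
            callbacks.recycle(value);
            pool.addFirst(value);
        } else {
            // The pool is full: give the owner a chance to free internal resources.
            callbacks.destroy(value);
        }
    }
}
```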
A Field instance can map to multiple actual fields when using wildcard expressions. Each actual field should use the proper highlighter depending on the available data structure (e.g. term_vectors), but we currently select the highlighter for the first field and keep using it for all the fields that match the wildcard expression.
Also modified how the PercolateContext sets the forceSource option: it is now set globally rather than per field.
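Choosing the highlighter per concrete field, rather than once for the whole wildcard expression, can be sketched as follows. This is a simplified illustration under assumptions: `FieldInfoSketch` and `HighlighterSelection` are hypothetical, and the returned names are used only as labels for the three kinds of highlighter.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical description of what a single concrete field has indexed.
final class FieldInfoSketch {
    final boolean termVectorsWithOffsets;
    final boolean offsetsInPostings;

    FieldInfoSketch(boolean termVectorsWithOffsets, boolean offsetsInPostings) {
        this.termVectorsWithOffsets = termVectorsWithOffsets;
        this.offsetsInPostings = offsetsInPostings;
    }
}

final class HighlighterSelection {
    // Picks a highlighter label from the data structures of one concrete field.
    static String choose(FieldInfoSketch field) {
        if (field.termVectorsWithOffsets) {
            return "fvh";
        }
        if (field.offsetsInPostings) {
            return "postings";
        }
        return "plain";
    }

    // Resolves the highlighter separately for every field matched by a wildcard,
    // instead of reusing the choice made for the first match.
    static Map<String, String> chooseAll(Map<String, FieldInfoSketch> matchedFields) {
        Map<String, String> result = new LinkedHashMap<String, String>();
        for (Map.Entry<String, FieldInfoSketch> entry : matchedFields.entrySet()) {
            result.put(entry.getKey(), choose(entry.getValue()));
        }
        return result;
    }
}
```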
Closes #5175
When starting elasticsearch as the wrong Linux user, a `NullPointerException` could be thrown when `PluginsService` tries to list the available plugins in the `./plugins` dir.
To reproduce:
* create a plugins directory with `rwx` rights for root user only
* launch elasticsearch from another account (elasticsearch for example)
It was supposed to be fixed with #4186, but sadly it's not :-(
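The underlying Java behaviour is that `File.listFiles()` returns `null`, not an empty array, when the path is not a directory or cannot be read. A defensive listing might look like the sketch below; `PluginDirs.listPluginNames` is a hypothetical helper for illustration, not the actual `PluginsService` code.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper showing a null-safe directory listing.
final class PluginDirs {
    static List<String> listPluginNames(File pluginsDir) {
        List<String> names = new ArrayList<String>();
        // listFiles() returns null if pluginsDir is not a directory or an I/O
        // error occurs, for example when the current user may not read it.
        File[] children = pluginsDir.listFiles();
        if (children == null) {
            return names;
        }
        for (File child : children) {
            if (child.isDirectory()) {
                names.add(child.getName());
            }
        }
        return names;
    }
}
```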
Closes #5195.
In #4052 we added support for highlighting multi-term queries using the postings highlighter. That only worked for top-level queries though, and not for multi-term queries nested within, for instance, a bool query, a filtered query or a constant score query.
The way we make this work is by walking the query structure and temporarily overriding the query rewrite method with a method that allows for multi-term extraction.
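The walk-and-override approach can be illustrated with a toy query tree. None of the types below are Lucene classes: `QueryNode`, `MultiTermNode`, `CompoundNode` and the rewrite method labels are hypothetical stand-ins used only to show the pattern of overriding a setting for the duration of an action and restoring it afterwards.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified query tree.
abstract class QueryNode {
}

final class MultiTermNode extends QueryNode {
    final String pattern;
    String rewriteMethod = "constant_score"; // illustrative default label

    MultiTermNode(String pattern) {
        this.pattern = pattern;
    }
}

final class CompoundNode extends QueryNode {
    final List<QueryNode> children;

    CompoundNode(QueryNode... children) {
        this.children = Arrays.asList(children);
    }
}

final class RewriteOverride {
    // Recursively collects the multi-term leaves, however deeply they are nested.
    static void collect(QueryNode node, List<MultiTermNode> out) {
        if (node instanceof MultiTermNode) {
            out.add((MultiTermNode) node);
        } else if (node instanceof CompoundNode) {
            for (QueryNode child : ((CompoundNode) node).children) {
                collect(child, out);
            }
        }
    }

    // Switches every multi-term leaf to an extraction-friendly rewrite method,
    // runs the action, then restores the original methods even if it throws.
    static void withExtractionRewrite(QueryNode root, Runnable action) {
        List<MultiTermNode> nodes = new ArrayList<MultiTermNode>();
        collect(root, nodes);
        Map<MultiTermNode, String> saved = new IdentityHashMap<MultiTermNode, String>();
        for (MultiTermNode node : nodes) {
            if (!saved.containsKey(node)) {
                saved.put(node, node.rewriteMethod);
            }
            node.rewriteMethod = "extract_terms";
        }
        try {
            action.run();
        } finally {
            for (Map.Entry<MultiTermNode, String> entry : saved.entrySet()) {
                entry.getKey().rewriteMethod = entry.getValue();
            }
        }
    }
}
```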
Closes #5102