This commit wraps the trace logging statements in
TransportBroadcastByNodeAction in trace enabled checks to avoid
unnecessarily allocating objects.
The most egregious offenders were the two trace logging statements in
BroadcastByNodeTransportRequestHandler#onShardOperation. Aside from the
usual object allocations that occur when invoking ESLogger#trace (the
allocated object array for the varargs Object... parameter), these two
logging statements were invoking ShardRouting#shortSummary generating a
bunch of char arrays and Strings (from the StringBuilder, and so a bunch
of array copies as well). In a scenario where there are a lot of shards
and this method is being invoked frequently (e.g., constantly hitting
the _stats endpoint), these two unprotected trace logging statements
were generating a lot of unnecessary allocations.
This commit modifies the handling of shard started cluster state updates
to use the general cluster state batching mechanism. An advantage of
this approach is we now get correct per-listener notification on
failures.
This commit removes a simple early-out check in
MetaDataMappingService#executeRefresh. The early-out is unnecessary
because the cluster state task execution framework will not invoke
ClusterStateTaskExecutor#execute if the list of tasks is empty.
This commit updates a stale Javadoc on
MetaDataMappingService#executeRefresh. Previously this method handled
refresh and update tasks. Update tasks have been removed and the method
was renamed, but the Javadoc was not updated to reflect this.
This is due to the fact that the query cache will still call the
onDocIdSetEviction callback in this case but with a number of entries equal to
zero.
Close#15043
This commit removes and now forbids all uses of the type-unsafe empty
Collections fields Collections#EMPTY_LIST, Collections#EMPTY_MAP, and
Collections#EMPTY_SET. The type-safe methods Collections#emptyList,
Collections#emptyMap, and Collections#emptySet should be used instead.
Today we try to have type-level granularity when dealing with mappings. This
does not play well with the cross-type validations that we are adding. For
instance we prevent the `_parent` field to point to an existing type. This
validation would be skipped today in the case of dedicated master nodes, since
those master nodes would only create the type that is being updated when
updating a mapping.
In 1.x it is possible via index templates to create an index with an alias with the same name as the index. The index name must match the index template and have an alias with the same name as the index being created.
This change attempts to simplify the gradle tasks for precommit. One
major part of that is using a "less groovy style", as well as being more
consistent about how tasks are created and where they are configured. It
also allows the things creating the tasks to set up inter task
dependencies, instead of assuming them (ie decoupling from tasks
eleswhere in the build).
This adds safety that you can't index into the `_default_` type (it was possible
before), and can't add default mappers to the field type lookups (was not
happening in tests but I think this is still a good check).
Also MapperService.types() now excludes `_default` so that eg. the `ids` query
does not try to search on this type anymore.
This commit modifies the handling of shard failure cluster state updates
to use the general cluster state batching mechanism. An advantage of
this approach is we now get correct per-listener notification on
failures.
This change pulls out the common fields that HighlighterBuilder shares with
its nested Field class into a new abstract CommonHighlighterOptions superclass
which also gets equals() and hashCode() method and methods to serialize the
common fields to a StreamOutput and read them from a stream.
Relates to #15044
Validation is not done as part of the distance setter method and tested in GeoDistanceQueryBuilderTests. Fixed GeoDistanceTests to adapt to the new validation.
Closes#15135
When creating an index on master for the purpose of updating mappings, a
mapping being updated could needlessly be merged multiple times. This
commit ensures that each mapping is merged at most once while preparing
to update mappings.
When creating an index on master for the purpose of updating mappings,
the default mapping could needlessly be added multiple times. This
commit ensures that the default mapping is added at most once while
preparing to update mappings.
This commit addresses an issues introduced in #14899 to apply mapping
updates in batches. The issue is that an existing mapping for a type
could be lost if that type came in a batch that already contained a
mapping update for another type on the same index. The underlying issue
was that the existing mapping would not be merged in because the merging
logic was only tripped once per index, rather than for all types seeing
updates for each index. Resolving this issue is simply a matter of
ensuring that all existing types seeing updates are merged in.
Closes#15129
The REST bulk API rejects use of `refresh` at the item level. But the Java API lets the user setting it.
We need to have the same behavior and don't let think the user he can define `refresh` per bulk item.
Note that the user can still define `refresh` on the bulk itself.
Also a user can create with Java API an IndexRequest without any source which is causing a NPE when evaluating the bulk item size.
Closes#7361.
Closes#15120.
I will followup with ITs and other modules. By fixing this, these tests become more reliable (will never sporatically
fail due to other stuff on your machine: ports are assigned by the OS), and it allows us to move forward with
gradle parallel builds, in my tests this is a nice speedup, but we can't do it until tests are cleaned up
This commit splits cluster state update tasks into roles. Those roles
are:
- task info
- task configuration
- task executor
- task listener
All tasks that have the same executor will be executed in batches. This
removes the need for local batching as was previously in
MetaDataMappingService.
Additionally, this commit reintroduces batching on mapping update calls.
Relates #13627
Do not to load fields from _source when using the `fields` option.
Non stored (non existing) fields are ignored by the fields visitor when using the `fields` option.
Fixes#10783
Support * wildcard to retrieve stored fields when using the `fields` option.
Supported pattern styles are "xxx*", "*xxx", "*xxx*" and "xxx*yyy".
Removed check that two query builder that are different according
to equals() have different hashCode since that is not required
by the contract of hashCode.
We used to check on several places if we are still open but non of these
places did the check under the lock which leaves a small window where we
potentially get closed but still access an already closed channel or another
IO resource.
This is a pretty trivial change that moves most of the monitor service related
object creation from guice into the monitor service. This is a babystep towards removing
guice on the node level as well. Instead of opening huge PRs I try to do this in baby-steps
that are easier to digest.
We handle AlreadyClosedExceptions gracefully wherever IndexShard / Engine
is used. In some cases, instead of throwing the appropriate exception we
bubble up ChannelClosedException instead which causes shard failures etc.
Today, it seems like that this can only happen if the engine is closed without
acquireing the lock which means that the shard has failed already so the impact is really
just a confusing log message. Yet, this change enforces throwing the right exception
if the translog is already closed.
Closes#14866
This moves the registration of field mappers from the index level to the node
level and also ensures that mappers coming from plugins are treated no
differently from core mappers.
This commit fixes some leniency in the parsing of CIDRs. The leniency
that existed before includes not validating the octet values nor
validating the network mask; in some cases issues with these values were
silently ignored. Parsing is now done according to the guidelines in RFC
4632, page 6.
Closes#14862
For example: if a node left the cluster and an async store fetch was triggered. In that time no shard is marked as delayed (and strictly speaking it's not yet delayed). This caused test for shard delays post node left to fail. see : http://build-us-00.elastic.co/job/es_core_master_windows-2012-r2/2074/testReport/
To fix this, the delay update is now done by the Allocation Service, based of a fixed time stamp that is determined at the beginning of the reroute.
Also, this commit fixes a bug where unassigned info instances were reused across shard routings, causing calculated delays to be leaked.
Closes#14890
This switches query parsing from manual field parsing to using ParseField.
Also adds unit tests for each query that check original json can be parsed
into query builders.
Relates to #8964
We recently refactored the queries to make them parsable on the
coordinating note and adding serialization and equals/hashCode
capability to them. So far ShapeBuilders nested inside queries
were still transported as a byte array that needs to be parsed
later on the shard receiving the query. To be able to also
serialize geo shapes this way, we also need to make all the
implementations of ShapeBuilder implement Writable.
This PR adds this to PointBuilder and also adds tests for
serialization, equality and hashCode.
The work for #10708 requires tighter integration with the current shard routing of a shard. As such, we need to make sure it is set before the IndexService exposes the shard to external operations.
Closes#14918
Index constraints should remove indices in the response if the field to evaluate if empty. Index constraints can't work with that and it is the same as if the field doesn't match.
Currently we use the "gradle project attachment plugin" to support
building elasticsearch as part of another project. However, this plugin
has a number of issues, a large part of which is requiring consistent
use of the projectsPrefix.
This change removes projectsPrefix, and adds support for a special
extra-plugins directory in the root of elasticsearch. Any projects
checked out within this directory will be automatically added to
elasticsearch.
* Forbid System.setProperties & co in forbidden APIs.
* Ban property write access at runtime with security manager.
Plugins that need to modify system properties will need to request permission in their plugin-security.policy
This commit adds an acquired flag to BulkProcessor#execute that is set
only after successful acquisition of a permit on the semaphore
there. This flag is used to ensure that we do not release a permit on
the semaphore when we did not obtain a permit on the semaphore.
Closes#14908
This makes AvgTests use a mock plugin engine. I also removed the
textScriptExplicit* methods for the base class since they only make sense for
a groovy script, not a mock script.
After the removal of some internal shape builders in #14482 the
BaseLineStringBuilder has only one implementation, the LineStringBuilder.
Same for the BasePolygonBuilder. This PR removes the abstract classes
and merges them with their concrete implementation to simplify the
inheritance hierarchy.
The Ring subclass is just a LineStringBuilder that has an additional
close() method and keeps a reference to a parent shape builder so
builders can be chained. This PR removes it and replaces it by
using LineStringBuilder instead. The close() method is moved there
and tests are adapted.
This is a first step in reducing the number of ShapeBuilders since
before we start making the remaining implement Writable for the
search request refactoring. This shape builder seems to have been
only used in tests, and those tests didn't do much to begin with,
so this removed them.
Relates to #14416
- moves calculation of the delay to a single place (ReplicaShardAllocator)
- reduces coupling between GatewayAllocator and RoutingService
- in master failover situations, elapsed delay time is forgotten
Closes#14808
At the time of geo_shape query conception, CONTAINS was not yet a supported spatial operation in Lucene. Since it is now available this commit adds ShapeRelation.CONTAINS to GeoShapeQuery. Randomized testing is included and documentation is updated.
This commit sets ReplicationRequest.internalShardId to null when the
stream indicates that no ShardId is present in the stream.
Additionally, the use of StreamOutput#writeOptionalStreamable is
changed to be explicit for clarity since the use of
StreamInput#readOptionalStreamable is not possible due to the
no-argument constructor on ShardId being private.
This commit replaces all occurrences of Thread.interrupted() with
Thread.currentThread().interrupt(). While the former checks and clears the current
thread's interrupt flag the latter sets it, which is actually intended.
Closes#14798
This commit adds a unit test for LinkedHashMap serialization that tests
that the method of serialization writes the entries in the LinkedHashMap
in iteration order and that the reconstructed LinkedHashMap preserves
that order. This test is randomized and tests iteration order is
preserved whether the LinkedHashMap is ordered by insertion order or
access order.
Closes#14743
This commit adds a timeout mechanism for sending shard failures. The
requesting thread can attach a listener to the timeout event so that
handling it is part of the event chain.
Relates #14252
This commit changes the signature of StreamInput#readOptionalStreamable
to accept a Supplier to create new streamables rather than requiring
callers to construct new instances. This has the advantage of avoiding
an allocation in cases when the stream indicates the resulting
streamable is null
If you build elasticsearch without a git repository it was creating a null
shortHash which was causing Elasticsearch not to be able to form transport
connections.
Closes#14748
This commit adds a method of encoding longs using a variable-length
representation. This encoding is an implementation of the zig-zag
encoding from protocol buffers. Numbers that have a small absolute value
will use a small number of bytes. This is achieved by zig-zagging
through the space of longs in order of increasing absolute value (0, -1,
1, -2, 2, …, Long.MAX_VALUE, Long.MIN_VALUE) -> (0, 1, 2, 3, 4, …, -2,
-1). The resulting values are then encoded as if they represent unsigned
numbers.
ClusterStatsIT#testClusterStatus() contained a race where the
test cluster might still be initializing while test already checks
for a green health status.
With this commit the test waits until the cluster status changed and
checks health afterwards.
Checked with @bleskes.
This issue occurs if the center latitude of the GeoPointDistance query is set to one of the poles. Since this issue is set to be fixed in LUCENE-6897 this commit temporarily limits the random latitudinal location to not include the poles.
_type should have got doc values with the change to default doc values.
However, due to how metadata fields have separate builders and special
constructors, it was not picking it up. This change updates the field
type for _type to have doc values.
closes#14781
If we run out of disk while recoverying the transaction log
we repeatedly fail since we expect the latest tranlog to be uncommitted.
This change adds 2 safety levels:
* uncommitted checkpoints are first written to a temp file and then atomically
renamed into a committed (recovered) checkpoint
* if the latest uncommitted checkpoints generation is already recovered it has to be
identical, if not the recovery fails
This allows to fail in between recovering the latest uncommitted checkpoint and moving
the checkpoint generation to N+1 which can for instance happen in a situation where
we can run out of disk. If we run out of disk while recovering the uncommitted checkpoint
either the temp file writing or the atomic rename will fail such that we never have a
half written or corrupted recovered checkpoint.
Close#14695
Currently the abstract ShapeBuilder class serves too many different
purposes, making it hard to refactor and maintain the code. In order
to reduce the size and responsibilities, this PR moved all the
static factory methods used as a shortcut to create new shape builders
out to a new ShapeBuilders class, similar to how QueryBuilders is
used already.
With this commit the cluster health status changes are logged
on INFO level. The change is only logged on master and actively
triggered in AllocationService in order to minimize the impact of
constantly reevaluating ClusterState in a ClusterStateListener
although we know that no health-relevant change happened.
Closes#11657
In AbstractQueryTestCase we randomly add the `_name` property to
some of the queries. While this generally works, there are exceptional
cases where we assign the same name to two queries in the setup which
leads to test failures later. This PR adds an increasing counter value
to the base tests that gets appended to all random query names to
avoid this name clashes.
ClusterRebalanceAllocationDecider did not take unassigned shards into account
that are temporarily marked as ingored. This can cause unexpected behavior
when gateway allocator is still fetching shards or has marked shareds as ignored
since their quorum is not met yet.
Closes#14670Closes#14678
Currently the next delay is calculated based on System.currentTimeMillis() but the actual shards to delay based on the last time the GatewayAllocator tried to assign/delay the shard.
This introduces an inconsistency for the case where shards should have been delay-allocated between the GatewayAllocator-based timestamp and System.currentTimeMillis().
Closes#14765
DateHistogramTests had some dependency on groovy scripts and
were moved to the lang-groovy module. This PR moves it back
and replaces use of groovy scripts by a mock script engine.
Removing three test cases that were testing doing some date
manipulation using script, since these are more groovy script
tests than testing the DateHistogram aggregation.
This commit changes TransportRequestOptions and TransportResponseOptions
to be immutable. This is to address an issue where the empty options
were being mutated permanently altering their state. Making these
objects immutable is just good, clean coding.
This fixes an issue where if the field for the aggregation was unmapped the extended bounds would get dropped and the resulting buckets would not cover the extended bounds requested.
Closes#14735
This fixes an issue where if the field for the aggregation was unmapped the extended bounds would get dropped and the resulting buckets would not cover the extended bounds requested.
Closes#14735
closes#14726
Squashed commit of the following:
commit 5b591e98570e3fa481b2816a44063b98bff36ddf
Author: Robert Muir <rmuir@apache.org>
Date: Fri Nov 13 00:54:08 2015 -0500
add assumption for self-signing in PluginManagerTests
commit ed11e5371b6f71591dc41c6f60d033502cfcf029
Author: Robert Muir <rmuir@apache.org>
Date: Fri Nov 13 00:20:59 2015 -0500
show error output from integ test startup
commit d8b187a10e95d89a0e775333dcbe1aaa903fb376
Author: Robert Muir <rmuir@apache.org>
Date: Thu Nov 12 22:14:11 2015 -0500
fix gradle check under jigsaw
This commit removes all noreleases and cuts over to Lucene 5.4 GeoPointField type. Included are randomized testing updates to unit and integration test suites for ensuring full backward compatability with existing geo_point indexes.
Just suck in the system policy, so its compatible with any version of java.
It means it also respects configuration (e.g. for monitoring agents)
Closes#14704
The disruption rules are changed to work on all transport addresses that are bound by a node (not only publish address).
This is important as UnicastZenPing creates fake DiscoveryNode instances which match one of the bound addresses and not necessarily the publish address.
Closes#14625Closes#14653
After a delayed reroute of a shard, RoutingService misses to schedule a new delayed reroute of other delayed shards.
Closes#14494Closes#14010Closes#14445