throw exception if a copy_to is within a multi field
Copy to within multi field is ignored from 2.0 on, see #10802.
Instead of just ignoring it, we should throw an exception if this
is found in the mapping when a mapping is added. For already
existing indices we should at least log a warning.
We remove the copy_to in any case.
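As a rough illustration of the check, here is a minimal sketch that walks a mapping's `properties` map and rejects `copy_to` inside any `fields` (multi field) entry. The class name, method name and error message are hypothetical and this is not the actual mapper parsing code.

```java
import java.util.Map;

final class CopyToCheckSketch {
    // Hypothetical helper: walks a "properties" map and rejects copy_to
    // declared inside any "fields" (multi field) entry.
    static void validate(Map<?, ?> properties) {
        for (Map.Entry<?, ?> field : properties.entrySet()) {
            if (!(field.getValue() instanceof Map)) {
                continue;
            }
            Map<?, ?> definition = (Map<?, ?>) field.getValue();
            Object multiFields = definition.get("fields");
            if (multiFields instanceof Map) {
                for (Map.Entry<?, ?> sub : ((Map<?, ?>) multiFields).entrySet()) {
                    if (sub.getValue() instanceof Map
                            && ((Map<?, ?>) sub.getValue()).containsKey("copy_to")) {
                        throw new IllegalArgumentException(
                            "copy_to found in multi field [" + field.getKey() + "." + sub.getKey() + "]");
                    }
                }
            }
            // Object fields carry their own nested "properties"; check those too.
            Object nested = definition.get("properties");
            if (nested instanceof Map) {
                validate((Map<?, ?>) nested);
            }
        }
    }
}
```

A top level `copy_to` on the field itself passes this check; only the one nested under `fields` is rejected.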
related to #14946
this ensures the codebase URL matches the permission grant (see matching toRealPath in Security.java)
in the case of symlinks or other shenanigans.
this is best effort, if we really want to support symlinks in any way, we need
e.g. qa or vagrant tests that configure a bunch of symlinks for things and ensure that in jenkins.
this should be easier to do with gradle, as we can just create a symlink'd home if we want
Today we only handle the `ExecutionCancelledException` correctly if it comes from the
local execution. Yet it can also come from a remote node and should be handled identically.
This commit restores the chunk size of 512kb lost in a previous but unreleased
refactoring. At the same time it removes the configurability of:
* `indices.recovery.file_chunk_size` - now fixed to 512kb
* `indices.recovery.translog_ops` - removed without replacement
* `indices.recovery.translog_size` - now fixed to 512kb
* `indices.recovery.compress` - file chunks are not compressed due to lucene's compression but translog operations are.
The compress option is gone entirely and compression is used where it makes sense. On sending files of the index
we don't compress as we rely on the lucene compression for stored fields etc.
Relates to #15161
This commit cherry picks some infrastructure changes from the `feature/seq_no` branch to make merging from master easier.
More explicitly, IndexShard currently has prepareIndex and prepareDelete methods that are called on both the primary and the replica, each passing a different origin parameter. Instead, this commit creates two explicit prepare*OnPrimary and prepare*OnReplica methods. This has the added value of not expecting the caller to use an Engine enum.
Also, the commit adds some code reuse between TransportIndexAction and TransportDeleteAction and their TransportShardBulkAction counterparts.
Closes #15282
The tribe node creates one local client node for each cluster it
connects to. Refactorings in #13383 broke this so that each local client
node now tries to load the full elasticsearch.yml that the real tribe
node uses.
This change fixes the problem by adding a TribeClientNode which is a
subclass of Node. The Environment the node uses is now passed in (in
place of Settings), and the TribeClientNode simply does not use
InternalSettingsPreparer.prepareEnvironment.
The tests around tribe nodes are not great. The existing tests pass, but
I also manually tested by creating 2 local clusters, and configuring and
starting a tribe node. With this I was able to see in the logs the tribe
node connecting to each cluster.
closes #13383
I don't recall this property being used by any of our field mappers and it's not in our
docs so I suspect it's very old. The removal of this property will not fail
version upgrades since none of the field mappers use it in toXContent.
This commit removes some unneeded null checks from
IndexingMemoryController that were left over from the work in #15251,
and simplifies the try-catch block in
IndexingMemoryController#updateShardBuffers.
For the search refactoring the HighlightBuilder needs a way to
create new instances by parsing xContent. For bwc this PR starts
by moving over and slightly modifying the parsing from
HighlighterParseElement and keeps parsing for top level highlighter
and field options separate. Also adding tests for roundtrip
of random builder (rendering it to xContent and parsing it and
making sure the original builder properties are preserved)
Since 2.2 we run all scripts with minimal privileges, similar to applets in your browser.
The problem is, they have unrestricted access to other things they can muck with (ES, JDK, whatever).
So they can still easily do tons of bad things.
This PR restricts what classes scripts can load via the classloader mechanism, to make life more difficult.
The "standard" list was populated from the old list used for the groovy sandbox: though
a few more were needed for tests to pass (java.lang.String, java.util.Iterator, nothing scary there).
Additionally, each scripting engine typically needs permissions to some runtime stuff.
That is the downside of this "good old classloader" approach, but I like the transparency and simplicity,
and I don't want to waste my time with any feature provided by the engine itself for this, I don't trust them.
This is not perfect and the engines are not perfect but you gotta start somewhere. For expert users that
need to tweak the permissions, we already support that via the standard java security configuration files, the
specification is simple, supports wildcards, etc (though we do not use them ourselves).
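To make the idea concrete, here is a much simplified sketch of restricting class loading to an allowed list. It uses a plain set of class names instead of the security policy described above, and the class name `RestrictedLoaderSketch` is made up for illustration.

```java
import java.util.Set;

final class RestrictedLoaderSketch extends ClassLoader {
    private final Set<String> allowed;

    RestrictedLoaderSketch(ClassLoader parent, Set<String> allowed) {
        super(parent);
        this.allowed = allowed;
    }

    // Every class a script resolves through this loader is checked
    // against the allowed list before delegating to the parent.
    @Override
    protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
        if (!allowed.contains(name)) {
            throw new ClassNotFoundException(name + " is not in the allowed list");
        }
        return super.loadClass(name, resolve);
    }
}
```

With only `java.lang.String` and `java.util.Iterator` in the set, loading `java.lang.Runtime` through this loader fails with a ClassNotFoundException.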
This commit simplifies shard inactive debug logging to only log when the
physical shard is marked as inactive. This eliminates duplicate logging
that existed in IndexShard#checkIdle and
IndexingMemoryController#checkIdle, and eliminates excessive logging
that was occurring when the shard was already inactive as a result of
the work in #15252.
Currently, when a user tries to install an old plugin (pre 2.x) on a 2.x
node, the error message is cryptic (just printing the file path that was
missing, when looking for the descriptor). This improves the message to
be more explicit that the descriptor is missing, and suggests the
problem might be the plugin was built before 2.0.
closes #15197
This commit addresses some issues that arose during the review of #14899
but were lost during squash while integrating into master.
- the number of test threads is dropped to at most eight
- a local variable is renamed for clarity
- task priorities are randomized
This commit fixes a test bug in
ClusterService#testClusterStateBatchedUpdates. In particular, in the
case that an executor did not receive a task assignment from the random
assignments, it would not have an entry in the map of executors to
counts of assigned tasks. The fix is to just check if each executor has
an entry in the counts map.
This commit modifies IndexingMemoryController to be stateless. Rather
than statefully tracking the indexing status of shards,
IndexingMemoryController can grab all available shards, check their idle
state, and then resize the buffers based on the number of and which
shards are not idle.
The driver for this change is a performance regression that can arise in
some scenarios after #13918. One scenario under which this performance
regression can arise is if an index is deleted and then created
again. Because IndexingMemoryController was previously statefully
tracking the state of shards via a map of ShardIds, the new shards with
the same ShardIds as previously existing shards would not be detected
and therefore their version maps would never be resized from the
defaults. This led to an explosion in the number of merges causing a
degradation in performance.
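The stateless approach can be sketched roughly as follows. This is a simplification that assumes an even split of a total budget across non-idle shards; the class name, method name and the even split are assumptions for illustration, not the actual IndexingMemoryController logic.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class StatelessBufferSketch {
    // idleByShard: shard id -> whether the shard is idle, queried fresh on
    // every call, so no state about previously seen shards is kept around.
    static Map<String, Long> resize(Map<String, Boolean> idleByShard, long totalBufferBytes) {
        long active = idleByShard.values().stream().filter(idle -> !idle).count();
        Map<String, Long> sizes = new LinkedHashMap<>();
        if (active == 0) {
            return sizes;
        }
        long perShard = totalBufferBytes / active;
        for (Map.Entry<String, Boolean> shard : idleByShard.entrySet()) {
            if (!shard.getValue()) {
                sizes.put(shard.getKey(), perShard);
            }
        }
        return sizes;
    }
}
```

Because nothing is remembered between calls, a recreated index whose shards reuse old ShardIds is treated like any other set of shards.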
Closes #15225
Today we only check mapping compatibility when adding mappers to the
lookup structure. However, at this stage, the mapping has already been merged
partially, so we can leave mappings in a bad state. This commit removes the
compatibility check from Mapper.merge entirely and performs it _before_ we
call Mapper.merge.
One minor regression is that the exception messages don't group together errors
that come from MappedFieldType.checkCompatibility and Mapper.merge. Since we
run the former before the latter, Mapper.merge won't even have a chance to let
the user know about conflicts if conflicts were discovered by
MappedFieldType.checkCompatibility.
Close #15049
The `translated` flag makes LineStringBuilder stateful and gets set
to true under certain conditions when building a Shape or Geometry
from the ShapeBuilder. This makes building operations not be idempotent,
so calling build() more than once on a LineStringBuilder might change the
builder itself. This PR fixes this by replacing the instance variable by
a local `translated` flag that is only updated internally during the
building process and created again on any subsequent calls to build()
or buildGeometry().
Failures to merge a mapping can either come as a MergeMappingException if they
come from Mapper.merge or as an IllegalArgumentException if they come from
FieldTypeLookup.checkCompatibility. I think we should settle on one: this pull
request replaces all usage of MergeMappingException with
IllegalArgumentException.
The ttl could be specified as a time value only via the REST layer. That is now possible via the Java API too, either as a string or as a proper TimeValue. The internal format in IndexRequest now becomes TimeValue, which will still be converted to a long before storing the document.
Closes #15047
- Supports ImmutableOpenIntMap besides java.util.Map and ImmutableOpenMap
- Map keys can be any value (not only String)
- Map values do not have to implement Diffable interface. In that case custom value serializer needs to be provided.
Several settings have been deprecated or are replaced with new settings after refactorings
in version 1.x. This commit removes the support for these settings.
The settings are:
* `index.shard.recovery.translog_size`
* `index.shard.recovery.translog_ops`
* `index.shard.recovery.file_chunk_size`
* `index.shard.recovery.concurrent_streams`
* `index.shard.recovery.concurrent_small_file_streams`
* `indices.recovery.max_size_per_sec`
When not in debug mode, we currently only print the message of an
exception. However, this is not usually useful without knowing what the
exception type was. This change makes cli tools use toString() on the
exception so we get the type + message.
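A small sketch of the difference, using made-up method names rather than the actual cli tool code:

```java
import java.io.FileNotFoundException;

final class CliErrorMessages {
    // Before: only the message, which can be ambiguous on its own.
    static String messageOnly(Exception e) {
        return e.getMessage();
    }

    // After: Throwable#toString yields "<class name>: <message>".
    static String typeAndMessage(Exception e) {
        return e.toString();
    }

    public static void main(String[] args) {
        Exception e = new FileNotFoundException("/tmp/missing");
        System.out.println(messageOnly(e));    // /tmp/missing
        System.out.println(typeAndMessage(e)); // java.io.FileNotFoundException: /tmp/missing
    }
}
```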
This commit wraps the trace logging statements in
TransportBroadcastByNodeAction in trace enabled checks to avoid
unnecessarily allocating objects.
The most egregious offenders were the two trace logging statements in
BroadcastByNodeTransportRequestHandler#onShardOperation. Aside from the
usual object allocations that occur when invoking ESLogger#trace (the
allocated object array for the varargs Object... parameter), these two
logging statements were invoking ShardRouting#shortSummary generating a
bunch of char arrays and Strings (from the StringBuilder, and so a bunch
of array copies as well). In a scenario where there are a lot of shards
and this method is being invoked frequently (e.g., constantly hitting
the _stats endpoint), these two unprotected trace logging statements
were generating a lot of unnecessary allocations.
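The guard pattern looks roughly like this. The sketch uses java.util.logging as a stand-in for ESLogger, and `expensiveSummary` is a hypothetical placeholder for something like ShardRouting#shortSummary.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.logging.Level;
import java.util.logging.Logger;

final class GuardedTraceLogging {
    // Counts how often the expensive summary is built (for demonstration only).
    static final AtomicInteger summaries = new AtomicInteger();

    // Stand-in for an allocation-heavy description of a shard.
    static String expensiveSummary(int shard) {
        summaries.incrementAndGet();
        return new StringBuilder().append("[shard ").append(shard).append(']').toString();
    }

    static void onShardOperation(Logger logger, int shard) {
        // The summary and the varargs array are only created when trace is on.
        if (logger.isLoggable(Level.FINEST)) {
            logger.log(Level.FINEST, "executing operation for {0}", expensiveSummary(shard));
        }
    }
}
```

With the logger at a coarser level than FINEST, the call returns without building the summary at all.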
This commit modifies the handling of shard started cluster state updates
to use the general cluster state batching mechanism. An advantage of
this approach is we now get correct per-listener notification on
failures.
This commit removes a simple early-out check in
MetaDataMappingService#executeRefresh. The early-out is unnecessary
because the cluster state task execution framework will not invoke
ClusterStateTaskExecutor#execute if the list of tasks is empty.
This commit updates a stale Javadoc on
MetaDataMappingService#executeRefresh. Previously this method handled
refresh and update tasks. Update tasks have been removed and the method
was renamed, but the Javadoc was not updated to reflect this.
This is due to the fact that the query cache will still call the
onDocIdSetEviction callback in this case but with a number of entries equal to
zero.
Close #15043
This commit removes and now forbids all uses of the type-unsafe empty
Collections fields Collections#EMPTY_LIST, Collections#EMPTY_MAP, and
Collections#EMPTY_SET. The type-safe methods Collections#emptyList,
Collections#emptyMap, and Collections#emptySet should be used instead.
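For illustration, a short sketch contrasting the two forms (the class and method names are made up):

```java
import java.util.Collections;
import java.util.List;
import java.util.Map;

final class EmptyCollectionsExample {
    // Raw-typed constant: only compiles with an unchecked warning, so the
    // compiler does not verify the element type.
    @SuppressWarnings("unchecked")
    static List<String> unsafeEmpty() {
        return Collections.EMPTY_LIST;
    }

    // Generic factory method: the element type is inferred and checked.
    static List<String> safeEmpty() {
        return Collections.emptyList();
    }

    static Map<String, Integer> safeEmptyMap() {
        return Collections.emptyMap();
    }
}
```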
Today we try to have type-level granularity when dealing with mappings. This
does not play well with the cross-type validations that we are adding. For
instance we prevent the `_parent` field from pointing to an existing type. This
validation would be skipped today in the case of dedicated master nodes, since
those master nodes would only create the type that is being updated when
updating a mapping.
In 1.x it is possible via index templates to create an index with an alias with the same name as the index. The index name must match the index template and have an alias with the same name as the index being created.
This change attempts to simplify the gradle tasks for precommit. One
major part of that is using a "less groovy style", as well as being more
consistent about how tasks are created and where they are configured. It
also allows the things creating the tasks to set up inter task
dependencies, instead of assuming them (ie decoupling from tasks
elsewhere in the build).
This adds safety that you can't index into the `_default_` type (it was possible
before), and can't add default mappers to the field type lookups (was not
happening in tests but I think this is still a good check).
Also MapperService.types() now excludes `_default_` so that eg. the `ids` query
does not try to search on this type anymore.
This commit modifies the handling of shard failure cluster state updates
to use the general cluster state batching mechanism. An advantage of
this approach is we now get correct per-listener notification on
failures.
This change pulls out the common fields that HighlighterBuilder shares with
its nested Field class into a new abstract CommonHighlighterOptions superclass
which also gets equals() and hashCode() methods as well as methods to serialize the
common fields to a StreamOutput and read them from a stream.
Relates to #15044
Validation is now done as part of the distance setter method and tested in GeoDistanceQueryBuilderTests. Fixed GeoDistanceTests to adapt to the new validation.
Closes #15135
When creating an index on master for the purpose of updating mappings, a
mapping being updated could needlessly be merged multiple times. This
commit ensures that each mapping is merged at most once while preparing
to update mappings.
When creating an index on master for the purpose of updating mappings,
the default mapping could needlessly be added multiple times. This
commit ensures that the default mapping is added at most once while
preparing to update mappings.
This commit addresses an issue introduced in #14899 to apply mapping
updates in batches. The issue is that an existing mapping for a type
could be lost if that type came in a batch that already contained a
mapping update for another type on the same index. The underlying issue
was that the existing mapping would not be merged in because the merging
logic was only tripped once per index, rather than for all types seeing
updates for each index. Resolving this issue is simply a matter of
ensuring that all existing types seeing updates are merged in.
Closes #15129
The REST bulk API rejects use of `refresh` at the item level, but the Java API lets the user set it.
We need the same behavior in both and should not let users think they can define `refresh` per bulk item.
Note that the user can still define `refresh` on the bulk itself.
Also, a user can create an IndexRequest without any source via the Java API, which causes an NPE when evaluating the bulk item size.
Closes #7361.
Closes #15120.
I will follow up with ITs and other modules. By fixing this, these tests become more reliable (they will never sporadically
fail due to other stuff on your machine: ports are assigned by the OS), and it allows us to move forward with
gradle parallel builds. In my tests this is a nice speedup, but we can't do it until tests are cleaned up.
This commit splits cluster state update tasks into roles. Those roles
are:
- task info
- task configuration
- task executor
- task listener
All tasks that have the same executor will be executed in batches. This
removes the need for local batching as was previously in
MetaDataMappingService.
Additionally, this commit reintroduces batching on mapping update calls.
Relates #13627
Do not load fields from _source when using the `fields` option.
Non stored (non existing) fields are ignored by the fields visitor when using the `fields` option.
Fixes #10783
Support * wildcard to retrieve stored fields when using the `fields` option.
Supported pattern styles are "xxx*", "*xxx", "*xxx*" and "xxx*yyy".
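A minimal sketch of matching these pattern styles follows. It translates `*` into a regular expression as a stand-in and is not the matcher used by the fields visitor; the class and method names are hypothetical.

```java
import java.util.regex.Pattern;

final class FieldPatternSketch {
    // Treats '*' as "any run of characters" and everything else literally.
    static boolean matches(String pattern, String field) {
        String[] literals = pattern.split("\\*", -1);
        StringBuilder regex = new StringBuilder();
        for (int i = 0; i < literals.length; i++) {
            if (i > 0) {
                regex.append(".*");
            }
            regex.append(Pattern.quote(literals[i]));
        }
        return Pattern.matches(regex.toString(), field);
    }
}
```

For example, `user*name` matches `user.first.name` but not `user.id`.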
Removed check that two query builders that are different according
to equals() have different hashCode since that is not required
by the contract of hashCode.
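A quick illustration of why such a check can fail spuriously: unequal objects are allowed to share a hash code. The helper name is made up; the string pair is a well-known collision.

```java
final class HashCodeContractExample {
    // True when two objects are unequal yet collide on hashCode,
    // which the hashCode contract explicitly permits.
    static boolean unequalButSameHash(Object a, Object b) {
        return !a.equals(b) && a.hashCode() == b.hashCode();
    }

    public static void main(String[] args) {
        // "Aa" and "BB" both hash to 2112.
        System.out.println(unequalButSameHash("Aa", "BB")); // true
    }
}
```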
We used to check in several places if we are still open but none of these
places did the check under the lock which leaves a small window where we
potentially get closed but still access an already closed channel or another
IO resource.
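The pattern can be sketched like this. `GuardedChannel` is a hypothetical class and IllegalStateException is a stand-in for whatever exception the real code throws.

```java
import java.util.concurrent.locks.ReentrantLock;

final class GuardedChannel {
    private final ReentrantLock lock = new ReentrantLock();
    private boolean closed;
    private long bytesWritten;

    long write(byte[] data) {
        lock.lock();
        try {
            // Checked while holding the lock, so close() cannot slip in
            // between the check and the access to the resource.
            ensureOpen();
            bytesWritten += data.length;
            return bytesWritten;
        } finally {
            lock.unlock();
        }
    }

    void close() {
        lock.lock();
        try {
            closed = true;
        } finally {
            lock.unlock();
        }
    }

    private void ensureOpen() {
        if (closed) {
            throw new IllegalStateException("already closed");
        }
    }
}
```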
This is a pretty trivial change that moves most of the monitor service related
object creation from guice into the monitor service. This is a babystep towards removing
guice on the node level as well. Instead of opening huge PRs I try to do this in baby-steps
that are easier to digest.
We handle AlreadyClosedExceptions gracefully wherever IndexShard / Engine
is used. In some cases, instead of throwing the appropriate exception we
bubble up ChannelClosedException instead which causes shard failures etc.
Today, it seems that this can only happen if the engine is closed without
acquiring the lock which means that the shard has failed already so the impact is really
just a confusing log message. Yet, this change enforces throwing the right exception
if the translog is already closed.
Closes #14866
This moves the registration of field mappers from the index level to the node
level and also ensures that mappers coming from plugins are treated no
differently from core mappers.
This commit fixes some leniency in the parsing of CIDRs. The leniency
that existed before includes not validating the octet values nor
validating the network mask; in some cases issues with these values were
silently ignored. Parsing is now done according to the guidelines in RFC
4632, page 6.
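As an illustration of the stricter validation, here is a minimal sketch that checks octet values and the network mask and returns the address range. The class name, return shape and error messages are assumptions, and it does not cover every rule in the RFC.

```java
final class CidrSketch {
    // Parses "a.b.c.d/n" and returns {first address, last address} as longs.
    static long[] parse(String cidr) {
        String[] parts = cidr.split("/");
        if (parts.length != 2) {
            throw new IllegalArgumentException("expected [a.b.c.d/n] but was [" + cidr + "]");
        }
        String[] octets = parts[0].split("\\.");
        if (octets.length != 4) {
            throw new IllegalArgumentException("expected four octets in [" + cidr + "]");
        }
        long address = 0;
        for (String octet : octets) {
            // NumberFormatException is itself an IllegalArgumentException.
            int value = Integer.parseInt(octet);
            if (value < 0 || value > 255) {
                throw new IllegalArgumentException("invalid octet [" + octet + "] in [" + cidr + "]");
            }
            address = (address << 8) | value;
        }
        int prefix = Integer.parseInt(parts[1]);
        if (prefix < 0 || prefix > 32) {
            throw new IllegalArgumentException("invalid network mask [" + parts[1] + "] in [" + cidr + "]");
        }
        long mask = (0xFFFFFFFFL << (32 - prefix)) & 0xFFFFFFFFL;
        long first = address & mask;
        long last = first | (~mask & 0xFFFFFFFFL);
        return new long[] { first, last };
    }
}
```

Inputs such as `256.0.0.0/8` or `10.0.0.0/33` are rejected with an IllegalArgumentException instead of being silently accepted.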
Closes #14862
For example, if a node left the cluster and an async store fetch was triggered, no shard is marked as delayed in that time (and strictly speaking it's not yet delayed). This caused tests for shard delays after a node left to fail; see: http://build-us-00.elastic.co/job/es_core_master_windows-2012-r2/2074/testReport/
To fix this, the delay update is now done by the Allocation Service, based on a fixed time stamp that is determined at the beginning of the reroute.
Also, this commit fixes a bug where unassigned info instances were reused across shard routings, causing calculated delays to be leaked.
Closes #14890
This switches query parsing from manual field parsing to using ParseField.
Also adds unit tests for each query that check original json can be parsed
into query builders.
Relates to #8964
We recently refactored the queries to make them parsable on the
coordinating node and added serialization and equals/hashCode
capability to them. So far ShapeBuilders nested inside queries
were still transported as a byte array that needs to be parsed
later on the shard receiving the query. To be able to also
serialize geo shapes this way, we also need to make all the
implementations of ShapeBuilder implement Writable.
This PR adds this to PointBuilder and also adds tests for
serialization, equality and hashCode.
The work for #10708 requires tighter integration with the current shard routing of a shard. As such, we need to make sure it is set before the IndexService exposes the shard to external operations.
Closes #14918
Index constraints should remove indices from the response if the field to evaluate is empty. Index constraints can't work with that, and it is the same as if the field doesn't match.
Currently we use the "gradle project attachment plugin" to support
building elasticsearch as part of another project. However, this plugin
has a number of issues, a large part of which is requiring consistent
use of the projectsPrefix.
This change removes projectsPrefix, and adds support for a special
extra-plugins directory in the root of elasticsearch. Any projects
checked out within this directory will be automatically added to
elasticsearch.
* Forbid System.setProperties & co in forbidden APIs.
* Ban property write access at runtime with security manager.
Plugins that need to modify system properties will need to request permission in their plugin-security.policy
This commit adds an acquired flag to BulkProcessor#execute that is set
only after successful acquisition of a permit on the semaphore
there. This flag is used to ensure that we do not release a permit on
the semaphore when we did not obtain a permit on the semaphore.
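The pattern can be sketched as follows. This is a simplified synchronous version with a hypothetical class name, not the actual BulkProcessor code.

```java
import java.util.concurrent.Semaphore;

final class GuardedPermit {
    private final Semaphore semaphore;

    GuardedPermit(int permits) {
        this.semaphore = new Semaphore(permits);
    }

    // Runs the task while holding a permit and releases the permit only
    // if it was actually acquired.
    void execute(Runnable task) throws InterruptedException {
        boolean acquired = false;
        try {
            semaphore.acquire(); // may throw before a permit is held
            acquired = true;
            task.run();
        } finally {
            if (acquired) {
                semaphore.release();
            }
        }
    }

    int availablePermits() {
        return semaphore.availablePermits();
    }
}
```

Without the flag, an interrupted `acquire()` would still reach `release()` in the finally block and leave the semaphore with one permit more than it started with.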
Closes #14908
This makes AvgTests use a mock plugin engine. I also removed the
textScriptExplicit* methods from the base class since they only make sense for
a groovy script, not a mock script.
After the removal of some internal shape builders in #14482 the
BaseLineStringBuilder has only one implementation, the LineStringBuilder.
Same for the BasePolygonBuilder. This PR removes the abstract classes
and merges them with their concrete implementation to simplify the
inheritance hierarchy.