Commit Graph

2785 Commits

Author SHA1 Message Date
Andrey Ershov 287e334ef3 Do not perform cleanup if Manifest write fails with dirty exception (#40519)
Currently, if Manifest write is unsuccessful (i.e. WriteStateException
is thrown) we perform cleanup of newly created metadata files.
However, this is wrong.
Consider the following sequence (caught by CI here
https://github.com/elastic/elasticsearch/issues/39077):

- cluster global data is written **successful**
- the associated manifest write **fails** (during the fsync, ie files
have been written)
- deleting (revert) the manifest files, **fails**, metadata is
therefore persisted
- deleting (revert) the cluster global data is **successful**

In this case, when trying to load metadata (after node restart
because of dirty WriteStateException),  the following exception will
happen
```
java.io.IOException: failed to find global metadata [generation: 0]
```
because the manifest file is referencing missing global metadata file.

This commit checks if thrown WriteStateException is dirty and if its
we don't perform any cleanup, because new Manifest file might be
created, but its deletion has failed.
In the future, we might add more fine-grained check - perform the
clean up if WriteStateException is dirty, but Manifest deletion is
successful.

Closes https://github.com/elastic/elasticsearch/issues/39077

(cherry picked from commit 1fac56916bb3c4f3333c639e59188dbe743e385b)
2019-04-01 12:52:32 +03:00
Jim Ferenczi 7cc79123df Fix merging of text field mapper (#40627)
On mapping updates the `text` field mapper does not update
the field types for the underlying prefix and phrase fields.
In practice this shouldn't be considered as a bug but we have
an assert in the code that check that field types in the mapper service
are identical to the ones present in field mappers.
2019-04-01 08:41:42 +02:00
Jason Tedor cebe509460
Fix bug in detecting use of bundled JDK on macOS
This commit fixes a bug in detecting the use of the bundled JDK on
macOS. This bug arose because the path of Java home is different on
macOS.
2019-03-31 19:43:17 -04:00
Henning Andersen 92d07e9377 Geo Point parse error fix (#40447)
When geo point parsing threw a parse exception, it did not consume
remaining tokens from the parser. This in turn meant that
indexing documents with malformed geo points into mappings with
ignore_malformed=true would fail in some cases, since DocumentParser
expects geo_point parsing to end on the END_OBJECT token.

Related to #17617
2019-03-29 17:39:12 +01:00
Luca Cavanna a0b02ce6ef Move top-level pipeline aggs out of QuerySearchResult (#40319)
As part of #40177 we have added top-level pipeline aggs to
`InternalAggregations`. Given that `QuerySearchResult` holds an
`InternalAggregations` instance, there is no need to keep on setting
top-level pipeline aggs separately. Top-level pipeline aggs can then
always be transported through `InternalAggregations`. Such change is
made in a backwards compatible manner.
2019-03-29 17:01:14 +01:00
Oghenovo Usiwoma 444b4c4136 Improve error message for absence of indices (#39789)
"no indices exist" has been added to the error message for absence of indices
2019-03-29 17:01:14 +01:00
Luca Cavanna 48b0deef4f Remove throws IOException from PipelineAggregationBuilder#create (#40222)
IOException are never thrown in any of the existing pipeline aggregation
builders. Removing the throws IOException from the create method allows
to remove it also from a couple of other methods which ends up simplifying
 AggregationPhase (one less catch).
2019-03-29 17:01:14 +01:00
Jason Tedor 585f38787c
Add usage indicators for the bundled JDK (#40616)
This commit adds indications whether or not a distribution is from the
bundled JDK, and whether or not we are using the bundled JDK.
2019-03-29 08:25:32 -04:00
Martijn van Groningen be31800154
Update ingest jdocs that a null return value will drop the current document. (#40359) 2019-03-29 09:46:54 +01:00
Jason Tedor 7255562afd
Add start and stop time to cat recovery API (#40378)
The cat recovery API is incredibly useful. Yet it is missing the start
and stop time as an option from the output. This commit adds these as
options to the cat recovery API. We elect to make these not visible by
default to avoid breaking the output that users might rely on.
2019-03-28 16:23:37 -04:00
Mayya Sharipova 24755209b4 Add randomScore function in script_score query (#40186)
To make script_score query to have the same features
as function_score query, we need to add randomScore
function.

This function produces different
random scores on different index shards.
It is also able to produce random scores
based on the internal Lucene Document Ids.
2019-03-28 13:23:47 -04:00
David Turner 1a3916a8de Optimise rejection of out-of-range `long` values (#40325)
Today if you try and insert a very large number like `1e9999999` into a long
field we first construct this number as a `BigDecimal`, convert this to a
`BigInteger` and then reject it because it is out of range. Unfortunately
making such a large `BigInteger` is rather expensive.

We can avoid this expense by performing a (weaker) range check on the
`BigDecimal` representation of incoming `long`s too.

Relates #26137
Closes #40323
2019-03-28 12:27:34 +00:00
David Turner 073b13f5b0 Add docs for cluster.remote.*.proxy setting (#40281)
In #33062 we introduced the `cluster.remote.*.proxy` setting for proxied
connections to remote clusters, but left it deliberately undocumented since it
needed followup work so that it could work with SNI. However, since #32517 is
now closed we can add this documentation and remove the comment about its lack
of documentation.
2019-03-28 12:11:24 +00:00
jimczi 8775e37d03 Fix SearchResponseMerger#testMergeSearchHits
This commit fixes an edge case in tests where search hits are empty
after the merge but some shards returned hits. This can happen if
the total number of merged hits is less than the provided `from`.

Closes #40553
2019-03-28 09:57:21 +01:00
Adrien Grand 65a35c985c
Remove type from VersionConflictEngineException. (#37490) (#40514)
It initially mentioned the type in the exception because the type used to be
required to uniquely identify a document. This is not necessary anymore given
that indices have at most one type.
2019-03-28 09:32:09 +01:00
Adrien Grand 2326a3dccb
Remove String interning from `o.e.index.Index`. (#40350) (#40517)
`Index` interns its name and uuid. My guess is that the main goal is to avoid
having duplicate strings in the representation of the cluster state. However
I doubt it helps much given that we have many other objects in the cluster state
that we don't try to reuse, and interning has some cost. When looking into
 #40263 my profiler pointed to string interning because of the `Index` object
that is created in `QueryShardContext` as one of the bottlenecks of the
`can_match` phase.
2019-03-28 09:31:42 +01:00
Andy Bristol 23395a9b9f
search as you type fieldmapper (#35600)
Adds the search_as_you_type field type that acts like a text field optimized
for as-you-type search completion. It creates a couple subfields that analyze
the indexed terms as shingles, against which full terms are queried, and a
prefix subfield that analyze terms as the largest shingle size used and
edge-ngrams, against which partial terms are queried

Adds a match_bool_prefix query type that creates a boolean clause of a term
query for each term except the last, for which a boolean clause with a prefix
query is created.

The match_bool_prefix query is the recommended way of querying a search as you
type field, which will boil down to term queries for each shingle of the input
text on the appropriate shingle field, and the final (possibly partial) term
as a term query on the prefix field. This field type also supports phrase and
phrase prefix queries however
2019-03-27 13:29:13 -07:00
Benjamin Trent 6563dc7ed9
Muting test for #40553 (#40555) 2019-03-27 14:52:12 -05:00
Tim Brooks ab44f5fd5d
Add InboundHandler for inbound message handling (#40430)
This commit adds an InboundHandler to handle inbound message processing.
With this commit, this code is moved out of the TcpTransport.
Additionally, finer grained unit tests are added to ensure that the
inbound processing works as expected
2019-03-27 12:33:26 -06:00
Yannick Welsch 64b31f44af No mapper service and index caches for replicated closed indices (#40423)
Replicated closed indices can't be indexed into or searched, and therefore don't need a shard with
full indexing and search capabilities allocated. We can save on a lot of heap memory for those
indices by not allocating a mapper service and caching infrastructure (which preallocates a constant
amount per instance). Before this change, a 1GB ES instance could host 250 replicated closed
metricbeat indices (each index with one shard). After this change, the same instance can host 7300
replicated closed metricbeat instances (not that this would be a recommended configuration). Most
of the remaining memory is in the cluster state and the IndexSettings object.
2019-03-27 19:04:24 +01:00
Yannick Welsch 8f7c5732f1 Use default discovery implementation for single-node discovery (#40036)
Switches "discovery.type: single-node" from using a separate implementation for single-node discovery to using the existing standard discovery implementation, with two small adaptions:

-  auto-bootstrapping, but requiring initial_master_nodes not to be set.
- not actively pinging other nodes using the Peerfinder
- not allowing other nodes to join its single-node cluster (if they have e.g. been set up using regular discovery and connect to the single-disco node).
2019-03-27 19:04:24 +01:00
Tim Brooks 3860ddd1a4
Move outbound message handling to OutboundHandler (#40336)
Currently there are some components of message serializer and sending
that still occur in TcpTransport. This commit makes it possible to
send a message without the TcpTransport by moving all of the remaining
application logic to the OutboundHandler. Additionally, it adds unit
tests to ensure that this logic works as expected.
2019-03-27 11:47:36 -06:00
David Turner 707d40ce06 Stabilise testStaleMasterNotHijackingMajority (#40253)
This test inadvertently asserts that the election occurs after a master failure
is clean. However, messy elections are a fact of life so we should not fail on
a messy election.

This change moves this test away from an `AbstractDisruptionTestCase` since it
does not need the fault detector to be so enthusiastic, and weakens the
assertions to merely say that we ignore states published by the old master
without saying anything about the cleanliness of the election.

Closes #36556
2019-03-27 16:00:14 +00:00
Tim Brooks 760cfffe4b
Move TransportMessageListener to TransportService (#40474)
Currently the TransportMessageListener is applied and used in the
Transport class. However, local requests and responses never make it to
this class. This PR moves the listener add/remove methods to the
TransportService. After this change the Transport can only have one
listener set with it. This one listener is the TransportService, which
will then propogate the events to the external listeners.

Additionally this commit back ports #40237

Remove Tracer from MockTransportService

Currently the TransportMessageListener is applied and used in the
Transport class. However, local requests and responses never make it to
this class. This PR moves the listener add/remove methods to the
TransportService. After this change the Transport can only have one
listener set with it. This one listener is the TransportService, which
will then propogate the events to the external listeners.
2019-03-27 09:24:20 -06:00
Przemyslaw Gomulka 65f01277ed
Parse composite patterns using ClassicFormat.parseObject backport(#40100) (#40501)
Java-time fails parsing composite patterns when first pattern matches only the prefix of the input. It expects pattern in longest to shortest order. Because of this constructing just one DateTimeFormatter with appendOptional is not sufficient. Parsers have to be iterated and if the parsing fails, the next one in order should be used. In order to not degrade performance parsing should not be throw exceptions on failure. Format.parseObject was used as it only returns null when parsing failed and allows to check if full input was read.
 
closes #39916
backport #40100
2019-03-27 13:51:44 +01:00
Yannick Welsch b4b17e16e0 Remove TransportSingleItemBulkWriteAction as replication action (#40424)
The implementation of TransportIndexAction and TransportDeleteAction as
TransportReplicationAction existed for interoperability with older 5.x nodes, as these older nodes
coordinated single index / deletes as replication requests. This BWC layer is no longer needed in 7.x,
where these single actions are now mapped to bulk requests. Completely removing the deprecated
transport actions is not possible yet if we want to keep BWC with a 6.x transport client. The best
way here is to wait for the transport client to go away and then just remove the actions.
2019-03-27 13:16:58 +01:00
Jim Ferenczi fe05a4d511 Fix random failures in SearchResponseMerger#testMergeSearchHits (#40223)
This commit fixes the expectation in the test when the search hits are empty.

Closes #40214
2019-03-27 11:17:10 +01:00
alex101101 fb8ad0cf30 Add a soft limit to the field name length (#40309)
Adds an optional limit to the length of field names, throws an IllegalArgumentException if the limit is breached. 
Closes #33651
2019-03-26 17:58:32 +01:00
Daniel Mitterdorfer f2b5960f90 Add version 6.7.1 2019-03-26 17:38:14 +01:00
Yannick Welsch bf7b167bba Remove timeout task after completing cluster state publication (#40411)
Each cluster state publication schedules a cancellation task with the provided publication timeout
(30s by default). This scheduled cancellation keeps a reference to the publication, and therefore the
full cluster state that was published. In case of frequently updating a large cluster state, this results
in a large number of cancellation tasks keeping references to all previously published cluster states.
2019-03-26 17:13:57 +01:00
Henning Andersen bf444b9f02 Store Pending Deletions Fix (#40345)
FilterDirectory.getPendingDeletions does not delegate, fixed
temporarily by overriding in StoreDirectory.

This in turn caused duplicate file name use after a trimUnsafeCommits
had been done, since a new IndexWriter would not consider the pending
deletes in IndexFileDeleter. This should only happen on windows (AFAIK).

Reenabled doing index updates for all tests using
IndexShardTests.indexOnReplicaWithGaps (which could fail due to above
when using mocked WindowsFS).

Added getPendingDeletions delegation to all elasticsearch
FilterDirectory subclasses that were not trivial test-only overrides to
minimize the risk of hitting this issue in another case.
2019-03-26 15:30:44 +01:00
Alan Woodward 12634850d6 IntervalQueryBuilderTests#testNonIndexedFields test fix (#40418)
This test checks that interval queries constructed against a field with no indexed
positions will throw exceptions. It uses a randomly-build IntervalsSourceProvider
against a fixed set of fields; however, the random source builder can occasionally
provide a source with a fixed field, meaning that even if the top-level query asks
for a set of intervals over a non-indexed field, the source will delegate to another
field, and no exception will be thrown.

This commit changes the test to always use a simple Match provider.

Fixes #40436
2019-03-26 08:33:42 +00:00
Nhat Nguyen 495dc11c9c Mute testPendingRefreshWithIntervalChange
Tracked at #39565
2019-03-25 11:47:08 -04:00
Armin Braun 3968d46a17
Remove Redundant Request Wrappers from RepositoryService (#40192) (#40404) 2019-03-25 16:36:02 +01:00
Armin Braun dc5ff0fffc
Log Warning on Failed Blob Deletes in BlobStoreRepository (#40188) (#40340)
* Log Warning on Failed Blob Deletes in BlobStoreRepository
* We should not just debug log these spots, they all can and will lead to leaked files when snapshot deletion fails
2019-03-25 08:52:09 +01:00
Nhat Nguyen b9f96a8e1f
Expose external refreshes through the stats API (#38643)
Right now, the stats API only provides refresh metrics regarding
internal refreshes. This isn't very useful and somewhat misleading for
cluster administrators since the internal refreshes are not indicative
of documents being available for search.

In this PR I added a new metric for collecting external refreshes as
they occur and exposing them through the stats API. Now, calling an
endpoint for stats will yield external refresh metrics as well.

Relates #36712
2019-03-24 22:21:00 -04:00
Armin Braun 13d76239a0
Use Netty ByteBuf Bulk Operations for Faster Deserialization (#40158) (#40339)
* Use bulk methods to read numbers faster from byte buffers
2019-03-24 19:08:51 +01:00
Jason Tedor 10bbb082a4
Only run retention lease actions on active primary (#40386)
In some cases, a request to perform a retention lease action can arrive
on a primary shard before it is active. In this case, the primary shard
would not yet be in primary mode, tripping an assertion in the
replication tracker. Instead, we should not attempt to perform such
actions on an initializing shard. This commit addresses this by not
returning the primary shard in the single shard iterator if the primary
shard is not yet active.
2019-03-23 09:39:39 -04:00
Zachary Tong 78f737dad3
Map value field to double in MovavgIT (#40230)
We were accidentally not mapping the index, which meant dynamic mapping
was choosing floats for the values.  This led to enough loss of precision
for the aggregated values to differ slightly from the test doubles,
which accumulated into large differences in the holt output.

This test fix adds an explicit mapping.
2019-03-21 14:03:14 -04:00
Jason Tedor 1e6941b138
Reduce retention lease sync intervals (#40302)
This commit adjusts the frequency with which CCR renews retention leases
and with which primaries sync retention leases to replicas. This helps
Lucene reclaim soft-deleted documents more aggressively, which we have
found in some use-cases can help improve performance, and either way
will help keep disk space under more control.
2019-03-21 07:37:44 -04:00
Alan Woodward 83d2870308 Add `use_field` option to intervals query (#40157)
This is the equivalent of the `field_masking_span` query, allowing users to
merge intervals from multiple fields - for example, to search for stemmed tokens
near unstemmed tokens.
2019-03-20 16:26:04 +00:00
Like 6f64267626 Make setting index.translog.sync_interval be dynamic (#37382)
Currently, we cannot update index setting index.translog.sync_interval if index is open, because it's
not dynamic which can be updated for closed index only.

Closes #32763
2019-03-20 17:12:45 +01:00
Yannick Welsch a5fb7fb17c Fix snapshot restore logging on fresh restore (#40252)
A recent refactoring (#37130) where imports got mixed up (changing Lucene's
IndexNotFoundException to Elasticsearch's IndexNotFoundException) led to many warnings being
logged in case of restoring a fresh snapshot.
2019-03-20 16:51:44 +01:00
Jim Ferenczi 3400483af4
Add date and date_nanos conversion to the numeric_type sort option (#40199) (#40224)
This change adds an option to convert a `date` field to nanoseconds resolution
 and a `date_nanos` field to millisecond resolution when sorting.
The resolution of the sort can be set using the `numeric_type` option of the
field sort builder. The conversion is done at the shard level and is restricted
to dates from 1970 to 2262 for the nanoseconds resolution in order to avoid
numeric overflow.
2019-03-20 16:50:28 +01:00
Nhat Nguyen efaf95628b Use separate translog dir in testDeleteWithFatalError
This test currently opens a new engine but shares the same translog
directory of the previous opening engine.
2019-03-20 10:22:27 -04:00
Mayya Sharipova 49a7c6e0e8
Expose proximity boosting (#39385) (#40251)
Expose DistanceFeatureQuery for geo, date and date_nanos types

Closes #33382
2019-03-20 09:24:41 -04:00
Henning Andersen 4c2a8638ca Cascading primary failure lead to MSU too low (#40249)
If a replica were first reset due to one primary failover and then
promoted (before resync completes), its MSU would not include changes
since global checkpoint, leading to errors during translog replay.

Fixed by re-initializing MSU before restoring local history.
2019-03-20 14:00:43 +01:00
Simon Willnauer 235f57989f Return cached segments stats if `include_unloaded_segments` is true (#39698)
Today we don't return segments stats for closed indices which makes it
hard to tell how much memory such an index would require. With this change
we return the statistics if requested by setting `include_unloaded_segments` to
true on the rest request.

Relates to #39512
2019-03-20 12:08:41 +01:00
Jason Tedor 9ce740a2eb
Modfiy casing in JVM home log message
This makes the log message consistent with the following line that shows
the JVM arguments.
2019-03-20 00:06:16 -04:00
Zachary Tong 69f5869707 Mute SearchResponseMergerTests#testMergeSearchHits
Tracking issue: https://github.com/elastic/elasticsearch/issues/40214
2019-03-19 13:40:38 -04:00