Commit Graph

9054 Commits

Author SHA1 Message Date
Jason Tedor 766d29e7cf
Correctly encode warning headers
The warnings headers have a fairly limited set of valid characters
(cf. quoted-text in RFC 7230). While we have assertions that we adhere
to this set of valid characters ensuring that our warning messages do
not violate the specificaion, we were neglecting the possibility that
arbitrary user input would trickle into these warning headers. Thus,
missing here was tests for these situations and encoding of characters
that appear outside the set of valid characters. This commit addresses
this by encoding any characters in a deprecation message that are not
from the set of valid characters.

Relates #27269
2017-11-06 13:20:30 -05:00
kel d7fa09153a Remove duplicated SnapshotStatus (#27276) 2017-11-06 16:19:16 +01:00
Simon Willnauer bd7efa908a Add ability to split shards (#26931)
This change adds a new `_split` API that allows to split indices into a new
index with a power of two more shards that the source index.  This API works
alongside the `_shrink` API but doesn't require any shard relocation before
indices can be split.

The split operation is conceptually an inverse `_shrink` operation since we
initialize the index with a _syntetic_ number of routing shards that are used
for the consistent hashing at index time. Compared to indices created with
earlier versions this might produce slightly different shard distributions but
has no impact on the per-index backwards compatibility.  For now, the user is
required to prepare an index to be splittable by setting the
`index.number_of_routing_shards` at index creation time.  The setting allows the
user to prepare the index to be splittable in factors of
`index.number_of_routing_shards` ie. if the index is created with
`index.number_of_routing_shards: 16` and `index.number_of_shards: 2` it can be
split into `4, 8, 16` shards. This is an intermediate step until we can make
this the default. This also allows us to safely backport this change to 6.x.

The `_split` operation is implemented internally as a DeleteByQuery on the
lucene level that is executed while the primary shards execute their initial
recovery. Subsequent merges that are triggered due to this operation will not be
executed immediately. All merges will be deferred unti the shards are started
and will then be throttled accordingly.

This change is intended for the 6.1 feature release but will not support pre-6.1
indices to be split unless these indices have been shrunk before. In that case
these indices can be split backwards into their original number of shards.
2017-11-06 11:37:55 +01:00
Tanguy Leroux 43e7a4a349
Upgrade to Jackson 2.8.10 (#27230)
While it's not possible to upgrade the Jackson dependencies 
to their latest versions yet (see #27032 (comment) for more) 
it's still possible to upgrade to the latest 2.8.x version.
2017-11-06 10:20:05 +01:00
kel 76f81e002c Remove unused parameters in AnalysisRegistry (#27232)
Removes unused parameters for AnalysisRegistry#processAnalyzerFactory and AnalysisRegistry#processNormalizerFactory.
2017-11-06 09:48:57 +01:00
kel 5d661df174 Add more information on `_failed_to_convert_` exception (#27034) 2017-11-06 09:40:28 +01:00
Jim Ferenczi 429275a773
Remove ElasticsearchQueryCachingPolicy (#27190)
We have an hidden setting called `index.queries.cache.term_queries` that disables caching of term queries in the query cache.
Though term queries are not cached in the Lucene UsageTrackingQueryCachingPolicy since version 6.5.
This makes the es policy useless but also makes it impossible to re-enable caching for term queries.
This change appeared in Lucene 6.5 so this setting is no-op since version 5.4 of Elasticsearch
The change in this PR removes the setting and the custom policy.
2017-11-06 08:26:24 +01:00
Nhat fd3fac9565 Backport the size-based index rollver to v6.1.0
Relates #27004
2017-11-04 20:14:59 -04:00
Nhat c7ce5a07f2
Add size-based condition to the index rollover API (#27160)
This is to add a max_size condition to the index rollover API. We use
a totalSizeInBytes from DocsStats to evaluate this condition.

Closes #27004
2017-11-04 19:51:48 -04:00
David Roberts 749c3ec716
Remove the single argument Environment constructor (#27235)
Only tests should use the single argument Environment constructor.  To
enforce this the single arg Environment constructor has been replaced with
a test framework factory method.

Production code (beyond initial Bootstrap) should always use the same
Environment object that Node.getEnvironment() returns.  This Environment
is also available via dependency injection.
2017-11-04 13:25:09 +00:00
Chris Earle 964016e228 Fix RestGetAction name typo
This changes the name from docuemnt_get_action to document_get_action.

Relates #27266
2017-11-04 08:29:00 -04:00
Igor Motov 117f0f3a44
Fix snapshot getting stuck in INIT state (#27214)
If the master disconnects from the cluster after initiating snapshot, but just before the snapshot switches from INIT to STARTED state, the snapshot can get indefinitely stuck in the INIT state. This error is specific to v5.x+ and was triggered by keeping the master node that stepped down in the node list, the cleanup logic in snapshot/restore assumed that if master steps down it is always removed from the the node list. This commit changes the logic to trigger cleanup even if no nodes left the cluster.

Closes #27180
2017-11-03 19:36:08 -04:00
Colin Goodheart-Smithe 20e8005859
Fixes QueryStringQueryBuilderTests
Closes #27246
2017-11-03 13:24:56 +00:00
Jim Ferenczi 262422375e [Test] Fix QueryStringQueryBuilderTests.testExistsFieldQuery BWC
Handle BWC version in this test.

Closes #27246
2017-11-03 14:17:11 +01:00
kel 0f21262b36 Do not create directories if repository is readonly (#26909)
For FsBlobStore and HdfsBlobStore, if the repository is read only, the blob store should be aware of the readonly setting and do not create directories if they don't exist.

Closes #21495
2017-11-03 13:10:50 +01:00
Christoph Büscher 9abc26ee92 [Test] Fix InternalStatsTests
After recent changes in InternalStats#doXContentBody the corresponding xContent
output of the parsed aggregation needed to be changed in a similar way.
2017-11-03 11:26:41 +01:00
Jim Ferenczi d503782699 [Test] Fix QueryStringQueryBuilderTests.testExistsFieldQuery
Adapt the test to check for the new NormsFieldExistsQuery.

Closes #27246
2017-11-03 11:24:45 +01:00
Colin Goodheart-Smithe 28b4d95cf5
Uses norms for exists query if enabled (#27237)
* Uses norms for exists query if enabled

This change means that for indexes created from 6.1.0, if normas are enabled we will not write the field name to the `_field_names` field and for an exists query we will instead use the NormsFieldExistsQuery which was added in Lucene 7.1.0. If norms are not enabled or if the index was created before 6.1.0 `_field_names` will be used as before.

* Fixes tests
2017-11-03 08:51:40 +00:00
Mathias Fußenegger 827ba7f82d Avoid uid creation in ParsedDocument (#27241)
The uid bytes (as the type#id) were needlessly being created even though
they are no longer needed after the move to single type per index. This
commit avoids creating these when parsed documents are constructed.

Relates #27241
2017-11-02 20:10:07 -04:00
kel 55b9dfdd52 Rander sum as zero if count is zero for stats aggregation (#26893) (#27193) 2017-11-02 16:02:47 +00:00
Simon Willnauer b294250aba
Remove unused searcher parameter in SearchService#createContext (#27227)
This parameter isn't used anywhere and just adds complexity.
2017-11-02 14:58:34 +01:00
Colin Goodheart-Smithe c1b8140c83
Upgrade to Lucene 7.1 (#27225) 2017-11-02 13:25:33 +00:00
Simon Willnauer f928d613ad
Move IndexShard#getWritingBytes() under InternalEngine (#27209)
We do some accounting in IndexShard that is not necessarily correct since
we maintain two different index readers. This change moves the accounting under
the engine which knows what reader we are refreshing.

Relates to #26972
2017-11-02 10:43:17 +01:00
olcbean b9896465cd Introducing took time for _msearch
This commit adds the took time to the response for _msearch.

Relates #23767
2017-11-01 21:39:04 -04:00
Jason Tedor 59657ad1cb
Lazy initialize checkpoint tracker bit sets
This local checkpoint tracker uses collections of bit sets to track
which sequence numbers are complete, eventually removing these bit sets
when the local checkpoint advances. However, these bit sets were eagerly
allocated so that if a sequence number far ahead of the checkpoint was
marked as completed, all bit sets between the "last" bit set and the bit
set needed to track the marked sequence number were allocated. If this
sequence number was too far ahead, the memory requirements could be
excessive. This commit opts for a different strategy for holding on to
these bit sets and enables them to be lazily allocated.

Relates #27179
2017-11-01 21:26:52 -04:00
Jason Tedor 90d6317437
Remove checkpoint tracker bit sets setting
We added an index-level setting for controlling the size of the bit sets
used to back the local checkpoint tracker. This setting is really only
needed to control the memory footprint of the bit sets but we do not
think this setting is going to be needed. This commit removes this
setting before it is released to the wild after which we would have to
worry about BWC implications.

Relates #27191
2017-11-01 21:13:01 -04:00
Colin Goodheart-Smithe 99aca9cdfc
Enhances exists queries to reduce need for `_field_names` (#26930)
* Enhances exists queries to reduce need for `_field_names`

Before this change we wrote the name all the fields in a document to a `_field_names` field and then implemented exists queries as a term query on this field. The problem with this approach is that it bloats the index and also affects indexing performance.

This change adds a new method `existsQuery()` to `MappedFieldType` which is implemented by each sub-class. For most field types if doc values are available a `DocValuesFieldExistsQuery` is used, falling back to using `_field_names` if doc values are disabled. Note that only fields where no doc values are available are written to `_field_names`.

Closes #26770

* Addresses review comments

* Addresses more review comments

* implements existsQuery explicitly on every mapper

* Reinstates ability to perform term query on `_field_names`

* Added bwc depending on index created version

* Review Comments

* Skips tests that are not supported in 6.1.0

These values will need to be changed after backporting this PR to 6.x
2017-11-01 10:46:59 +00:00
Martijn van Groningen d805c41b28
Added new terms_set query
This query returns documents that match with at least one ore more
of the provided terms. The number of terms that must match varies
per document and is either controlled by a minimum should match
field or computed per document in a minimum should match script.

Closes #26915
2017-11-01 10:55:18 +01:00
Jack Conradson fd73e5fa41 Add version 6.0.0 2017-10-31 17:49:52 -07:00
Tanguy Leroux 13cd08b1e6
Convert index blocks to cluster block exceptions (#27050) 2017-10-31 16:11:18 +01:00
Shai Erera bd0261916c Fix Laplace scorer to multiply by alpha (and not add) (#27125) 2017-10-31 13:08:44 +01:00
javanna 34666844b3 [DOCS] Clarify migrate guide and search request validation
Relates to  #26811
2017-10-31 12:36:00 +01:00
kel c3e2bdf20c Raise IllegalArgumentException if query validation failed (#26811)
Closes #26799
2017-10-31 12:17:27 +01:00
Armin Braun a4c159e91e prevent duplicate fields when mixing parent and root nested includes (#27072)
Closes #26990
2017-10-31 10:01:33 +01:00
Adrien Grand 3812d3cb43
TopHitsAggregator must propagate calls to `setScorer`. (#27138)
It is required in order to work correctly with bulk scorer implementations
that change the scorer during the collection process. Otherwise sub collectors
might call `Scorer.score()` on the wrong scorer.

Closes #27131
2017-10-31 09:59:06 +01:00
Jason Tedor a566942219
Refactor internal engine
This commit is a minor refactoring of internal engine to move hooks for
generating sequence numbers into the engine itself. As such, we refactor
tests that relied on this hook to use the new hook, and remove the hook
from the sequence number service itself.

Relates #27082
2017-10-30 13:10:20 -04:00
Martijn van Groningen c406a91158
Fix division by zero in phrase suggester that causes assertion to fail 2017-10-30 09:04:56 +01:00
Nhat d01ad9367e Enable Docstats with totalSizeInBytes for 6.1.0
Relates https://github.com/elastic/elasticsearch/pull/27117
2017-10-28 14:54:53 -04:00
Nhat 07d270b45f
Adds average document size to DocsStats (#27117)
This change is required in order to support a size based check for the
index rollover.

The index size is estimated by sampling the existing segments only. We
prefer using segments to StoreStats because StoreStats is not reliable
if indexing or merging operations are in progress.

Relates #27004
2017-10-28 12:47:08 -04:00
Jim Ferenczi 6625ecfff4 Fix max score tracking with field collapsing (#27122)
This change makes sure that we track score when sort is set to relevancy only.
In this case we always track max score like normal search does.

Closes #23840
2017-10-27 09:18:34 +02:00
olcbean 35a2cc1003 fixed typo in ConstructingObjectParse (#27129) 2017-10-26 13:14:56 -06:00
Jim Ferenczi d1acf449f5 Apply missing request options to the expand phase (#27118)
* Apply missing request options to the expand phase

This change adds some missing options to the expand query that builds the inner hits for field collapsing.
The following options are now applied to the inner_hits query:
 * post_filters
 * preferences
 * routing

Closes #27079
Closes #26649
2017-10-26 17:01:57 +02:00
Simon Willnauer 1460a3feac Only pull SegmentReader once in getSegmentInfo (#27121) 2017-10-26 14:56:14 +02:00
Jason Tedor 0174d13ca2 Fix BWC for discovery stats
The new discovery stats were pushed to the 6.x branch (currently
versioned at 6.1.0) but master was not updated to reflect this. This
impacts the mixed-cluster BWC tests because a 6.1.0 node will be trying
to send a 7.0.0 node the new discovery stats but the 7.0.0 did not yet
understand that it should be reading these when talking to a 6.1.0
node. This commit addresses this, and changes the skip version on the
discovery stats REST tests.
2017-10-26 07:53:18 -04:00
Catalin Ursachi 8bf33241ed Add Delete Index API support to high-level REST client (#27019)
Relates to #25847
2017-10-26 09:52:46 +02:00
Jason Tedor 77f87732ef Adjust .DS_Store test assertions on Windows
Windows handles trying to read a file that does not exist because a
component of the path is not a directory differently than other OS
handle this situation. This commit adjusts these assertions for Windows.
2017-10-25 22:36:53 -04:00
Jason Tedor 17d6820a4b Emit settings deprecation logging on empty update
When executing a cluster settings update that leaves the cluster state
unchanged, we skip validation and this avoids deprecation logging for
deprecated settings in the cluster state. This commit addresses this by
running validation even if the settings are unchanged.

Relates #27017
2017-10-25 22:15:38 -04:00
Jason Tedor 9aae2f593a Avoid stack overflow on search phases
When a search is executing locally over many shards, we can stack
overflow during query phase execution. This happens due to callbacks
that occur after a phase completes for a shard and we move to the same
phase on another shard. If all the shards for the query are local to the
local node then we will never go async and these callbacks will end up
as recursive calls. With sufficiently many shards, this will end up as a
stack overflow. This commit addresses this by truncating the stack by
forking to another thread on the executor for the phase.

Relates #27069
2017-10-25 22:05:46 -04:00
Nhat adc195e30c Fix error message for a put index template request without index_patterns (#27102)
Just correct the error message from "Validation Failed: 1: pattern is
missing;" to "Validation Failed: 1: index_patterns is missing;".

Closes #27100
2017-10-25 18:54:40 -04:00
Armin Braun 6533b165d6 #25601 Add pipeline support for REST API bulk upsert (#27075) 2017-10-25 19:03:25 +02:00
Jason Tedor 6722b9c4a2 Ignore .DS_Store files on macOS
Finder creates these files if you browse a directory there. These files
are really annoying, but it's an incredible pain for users that these
files are created unbeknownst to them, and then they get in the way of
Elasticsearch starting. This commit adds leniency on macOS only to skip
these files.

Relates #27108
2017-10-25 11:25:29 -04:00
Luca Cavanna 5818ff6b56 Make ShardSearchTarget optional when parsing ShardSearchFailure (#27078)
Turns out that `ShardSearchTarget` is nullable, hence its fields may not be printed out as part of `ShardSearchFailure#toXContent`, in which case `fromXContent` cannot parse it back. We would previously try to create the object with all of its fields set to null, but `Index` complains about it in the constructor. Also made sure that this code path is covered by our unit tests in `ShardSearchFailureTests`.

Closes #27055
2017-10-25 13:26:06 +02:00
Luca Cavanna 8caf7d4ff8 Decouple BulkProcessor from ThreadPool (#26727)
Introduce minimal thread scheduler as a base class for `ThreadPool`. Such a class can be used from the `BulkProcessor` to schedule retries and the flush task. This allows to remove the `ThreadPool` dependency from `BulkProcessor`, which requires to provide settings that contain `node.name` and also needed log4j for logging. Instead, it needs now a `Scheduler` that is much lighter and gets automatically created and shut down on close.

Closes #26028
2017-10-25 10:30:23 +02:00
David Turner cc3364e4f8 Stats to record how often the ClusterState diff mechanism is used successfully (#26973)
It's believed that using diffs obsoletes the other mechanism for reusing the
bits of the ClusterState that didn't change between updates, but in fact we
don't know for sure how often the diff mechanism works successfully. The stats
collected here will tell us.
2017-10-25 07:35:25 +01:00
Lee Hinman 6bc7024f26 Tie-break shard path decision based on total number of shards on path (#27039)
Right now if the number of shards for a particular index is equal across the
data paths, we tie-break on space. This changes to tie-break first on the total
number of shards for each path, and then, if that is the same, on the usable
bytes.

Relates to #26654 (it's a follow-up)
2017-10-24 16:12:47 -06:00
Jason Tedor 7a792d2c1f Timed runnable should delegate to abstract runnable
If timed runnable wraps an abstract runnable, then it should delegate to
the abstract runnable otherwise force execution and handling rejections
is dropped on the floor. Thus, timed runnable should itself be an
abstract runnable delegating all methods to the wrapped runnable in
cases when it is an abstract runnable. This commit causes this to be the
case.

Relates #27095
2017-10-24 11:36:50 -04:00
Lee Hinman fcfbdf1f37 Expose adaptive replica selection stats in /_nodes/stats API
This exposes the collected metrics we store for ARS in the nodes stats, as well
as the computed rank of nodes. Each node exposes its perspective about the
cluster.

Here's an example output (with `?human`):

```json
...
"adaptive_selection" : {
  "_k6v1-wERxyUd5ke6s-D0g" : {
    "outgoing_searches" : 0,
    "avg_queue_size" : 0,
    "avg_service_time" : "7.8ms",
    "avg_service_time_ns" : 7896963,
    "avg_response_time" : "9ms",
    "avg_response_time_ns" : 9095598,
    "rank" : "9.1"
  },
  "VJiCUFoiTpySGmO00eWmtQ" : {
    "outgoing_searches" : 0,
    "avg_queue_size" : 0,
    "avg_service_time" : "1.3ms",
    "avg_service_time_ns" : 1330240,
    "avg_response_time" : "4.5ms",
    "avg_response_time_ns" : 4524154,
    "rank" : "4.5"
  },
  "DHNGTdzyT9iiaCpEUsIAKA" : {
    "outgoing_searches" : 0,
    "avg_queue_size" : 0,
    "avg_service_time" : "2.1ms",
    "avg_service_time_ns" : 2113164,
    "avg_response_time" : "6.3ms",
    "avg_response_time_ns" : 6375810,
    "rank" : "6.4"
  }
}
...
```
2017-10-24 08:58:42 -06:00
David Turner cf2d0834f5 Remove duplicated test (#27091) 2017-10-24 11:52:01 +01:00
Nhat bf557fd886 test: avoid generating duplicate multiple fields (#27080)
Multifields parser does not allow duplicate values, however the
MultiFieldTests may produce duplicate field values.

See https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+master+release-tests/132/console.
2017-10-23 09:59:40 -04:00
Adrien Grand d0104c22a5 Reduce the default number of cached queries. (#26949)
Memory usage of queries can't be properly accounted, which can be an issue when
large queries are cached since the actual memory usage will be much higher than
what the cache thinks. This problem is very hard if not impossible to fix so as
a workaround I would like to decrease the maximum number of cached queries so
that this problem is less likely to cause trouble in practice.

For the record, this problem is more likely to occur in envirenments that have
small shards or don't give much memory to the JVM.

Closes #26938
2017-10-23 14:11:35 +02:00
Jason Tedor 35984a616e Keep cumulative elapsed scroll time in microseconds
Today we internally accumulate elapsed scroll time in nanoseconds. The
problem here is that this can reasonably overflow. For example, on a
system with scrolls that are open for ten minutes on average, after
sixteen million scrolls the largest value that can be represented by a
long will be executed. To address this, we switch to internally
representing scrolls using microseconds as this enables with the same
number of scrolls scrolls that are open for seven days on average, or
with the same average elapsed time sixteen billion scrolls which will
never happen (executing one scroll a second until sixteen billion have
executed would not occur until more than five-hundred years had
elapsed).

Relates #27068
2017-10-21 13:18:28 +02:00
Tanguy Leroux 463e7e6fa3 Revert "Upgrade to Jackson 2.9.2 (#27032)"
This reverts commit 0b9acc5ace.
2017-10-20 08:25:41 +02:00
Tanguy Leroux 0b9acc5ace Upgrade to Jackson 2.9.2 (#27032)
Upgrade to Jackson 2.9.2 and also use a boolean `closed` flag to
indicate that a FastStringReader instance is closed, so that length
is still correctly reported after the reader is closed.
2017-10-19 15:15:02 +02:00
Martijn van Groningen 87c9b79b10
Return the _source of inner hit nested as is without wrapping it into its full path context
Due to a change happened via #26102 to make the nested source consistent
with or without source filtering, the _source of a nested inner hit was
always wrapped in the parent path. This turned out to be not ideal for
users relying on the nested source, as it would require additional parsing
on the client side. This change fixes this, the _source of nested inner hits
is now no longer wrapped by parent json objects, irregardless of whether
the _source is included as is or source filtering is used.

Internally source filtering and highlighting relies on the fact that the
_source of nested inner hits are accessible by its full field path, so
in order to now break this, the conversion of the _source into its binary
form is performed in FetchSourceSubPhase, after any potential source filtering
is performed to make sure the structure of _source of the nested inner hit
is consistent irregardless if source filtering is performed.

PR for #26944

Closes #26944
2017-10-19 12:04:56 +02:00
Alexander Kazakov 9a3a1cd1b7 Handle leniency for cross_fields type in multi_match query (#27045) 2017-10-19 10:29:28 +02:00
Stephen Yeargin 8a05e5b92c Fix typo in thrown exception in IndicesAliasesRequest (#27025)
There is a typo in the exception thrown in `IndicesAliasesRequest`. This PR corrects the spelling and removes extraneous word.
2017-10-18 13:54:16 +00:00
Lee Hinman 78c54c4560 Balance shards for an index more evenly across multiple data paths (#26654)
* Balance shards for an index more evenly across multiple data paths

When a node has multiple data paths configured, and is assigned all of the
shards for a particular index, it's possible now that all shards will be
assigned to the same path (see #16763).

This change keeps the same behavior around determining the "best" path for a
shard based on space, however, it enforces limits for the number of shards on a
path for an index from the single-node perspective. For example:

Assume you had a node with 4 data paths, where `/path1` has a tremendously high
amount of disk space available compared to the other paths. If you create an
index with 5 primary shards, the previous behavior would be to assign all 5
shards to `/path1`.

This change would enforce a limit of 2 shards to each data path for that
particular node, so you would end up with the following distribution:

- `/path1` - 2 shards (because it has the most usable space)
- `/path2` - 1 shard
- `/path3` - 1 shard
- `/path4` - 1 shard

Note, however, that this limit is only enforced at the local node level for
simplicity in implementation, so if you had multiple nodes, the "limit" for the
node is still 2, so assuming you had enough nodes that there was only 2 shards
for this index assigned to this node, they would still both be assigned to
`/path1`.

* Switch from ObjectLongHashMap to regular HashMap

* Remove unneeded Files.isDirectory check

* Skip iterating directories when not necessary

* Add message to assert

* Implement different (better) ranking for node paths

This is the method we discussed

* Remove unused pathHasEnoughSpace method

* Use findFirst instead of .get(0);

* Update for master merge to fix compilation

Settings.putArray -> Settings.putList
2017-10-17 05:49:24 -06:00
Jason Tedor 62bf3c11a9 Stop invoking non-existant syscall
Today when getting ready to enter seccomp, we do some probes to ensure
that we are really talking to seccomp, etc. One of these probes is pure
paranoia. The paranoia was driven by a kernel bug
(https://lkml.org/lkml/2014/7/20/222) that only impacted 32-bit x86
kernels wherein invoking a non-existant syscall was not returning ENOSYS
(as it should). This probe causes problems though, for example in
containers with syscall filters, invoking a non-existant syscall will
lead to the process being sent SIGSYS and terminated. We do not need
this paranoid, we do not support 32-bit, and our other probes give us
enough of a defense to ensure that we are talking to seccomp (and we
hardcode the seccomp syscall number for platforms that we
support). Given that this probe offers us little value, but does cause
problems in valid use-cases, this commit removes this paranoia.

Relates #27016
2017-10-17 11:34:44 +02:00
Jason Tedor 3664ede9b5 Remove unnecessary exception for engine constructor
The internal engine constructor declares a checked engine exception yet
this constructor does not actually throw this exception. This commit
removes this declaration from the internal engine constructor.

Relates #27022
2017-10-16 10:17:37 -04:00
Simon Willnauer 8dda827ff4 Don't refresh on `_flush` `_force_merge` and `_upgrade` (#27000)
Today all these API calls have a sideeffect of making documents visible
to search requests. While this is sometimes desired it's an unnecessary sideeffect
and now that we have an internal (engine-private) index reader (#26972) we artificially
add a refresh call for bwc. This change removes this sideeffect in 7.0.
2017-10-16 10:16:35 +02:00
Tim Brooks 277637f42f Do not set SO_LINGER on server channels (#26997)
Right now we are attempting to set SO_LINGER to 0 on server channels
when we are stopping the tcp transport. This is not a supported socket
option and throws an exception. This also prevents the channels from
being closed.

This commit 1. doesn't set SO_LINGER for server channges, 2. checks
that it is a supported option in nio, and 3. changes the log message
to warn for server channel close exceptions.
2017-10-13 13:06:38 -06:00
olcbean bb013c60b5 Fix inconsistencies in the rest api specs for *_script (#26971) 2017-10-13 11:20:34 -07:00
Simon Willnauer cae1790492 [TEST] Add test that replicates versioned updates with random flushes 2017-10-13 11:54:51 +02:00
Simon Willnauer a517758432 Use internal searcher for all indexing related operations in the engine
The changes introduced in #26972 missed two places where an internal searcher should be used.
2017-10-13 11:41:00 +02:00
Tim Brooks e40c597b67
Fix reference to TcpTransport in documentation 2017-10-12 13:27:24 -06:00
Simon Willnauer 047a916169 Allow Uid#decodeId to decode from a byte array slice (#26987)
Today we only allow to decode byte arrays where the data has a 0 offset
and the same length as the array. Allowing to decode stuff from a slice will
make decoding IDs cheaper if the the ID is for instance coming from a term dictionary
or BytesRef.

Relates to #26931
2017-10-12 20:19:14 +02:00
Simon Willnauer 21eb9bdf6a Use separate searchers for "search visibility" vs "move indexing buffer to disk (#26972)
Today, when ES detects it's using too much heap vs the configured indexing
buffer (default 10% of JVM heap) it opens a new searcher to force Lucene to move
the bytes to disk, clear version map, etc.

But this has the unexpected side effect of making newly indexed/deleted
documents visible to future searches, which is not nice for users who are trying
to prevent that, e.g. #3593.

This is also an indirect spinoff from #26802 where we potentially pay a big
price on rebuilding caches etc. when updates / realtime-get is used. We are
refreshing the internal reader for realtime gets which causes for instance
global ords to be rebuild. I think we can gain quite a bit if we'd use a reader
that is only used for GETs and not for searches etc. that way we can also solve
problems of searchers being refreshed unexpectedly aside of replica recovery /
relocation.

Closes #15768
Closes #26912
2017-10-12 17:19:43 +02:00
Colin Goodheart-Smithe e1679bfe5e Create weights lazily in filter and filters aggregation (#26983)
Previous to this change the weights for the filter and filters aggregation were created in the `Filter(s)AggregatorFactory` which meant that they were created regardless of whether the aggregator actually collects any documents. This meant that for filters that are expensive to initialise, requests would not be quick when the query of the request was (or effectively was) a `match_none` query.

This change maintains a single Weight instance for each filter across parent buckets but passes a weight supplier to the aggregator instances which will create the weight on first call and then return that instance for subsequent calls.
2017-10-12 14:56:32 +01:00
Jason Tedor f81ee225ff Fire global checkpoint sync under system context
The global checkpoint sync action should fire under the system context
since it is not a user-facing management action.

Relates #26984
2017-10-12 09:18:58 -04:00
kel 2e36f19051 Add support for parsing inline script (#23824) (#26846)
* Add support for parsing inline script (#23824)

* Fix test
2017-10-11 09:15:37 -07:00
Alexander Kazakov 592ab043dd Change default value to true for transpositions parameter of fuzzy query (#26901) 2017-10-11 15:31:48 +02:00
Yannick Welsch d97b21d1da Adding unreleased 5.6.4 version number to Version.java 2017-10-11 08:40:30 +02:00
Tim Brooks 4a870d8c63 Rename TCPTransportTests to TcpTransportTests (#26954)
Our convention is to use lower case when naming things "Tcp". For
example, `TcpTransport`. This commit renames the outlier
(TcpTransportTests) to use lower case.
2017-10-10 20:15:14 -06:00
Nhat b63f718a0b Fix NPE for /_cat/indices when no primary shard (#26953)
When a node which contains the primary shard is unavailable, the primary
stats (and the total stats) of an `IndexStats` will be empty for a short
moment (while the primary shard is being relocated). However, we assume
that these stats are always non-empty when handling `_cat/indices` in
RestIndicesAction. This commit checks the content of these stats before
accessing.

Closes #26942
2017-10-10 17:04:55 -04:00
Jason Tedor 4c06b8f1d2 Check for closed connection while opening
While opening a connection to a node, a channel can subsequently
close. If this happens, a future callback whose purpose is to close all
other channels and disconnect from the node will fire. However, this
future will not be ready to close all the channels because the
connection will not be exposed to the future callback yet. Since this
callback is run once, we will never try to disconnect from this node
again and we will be left with a closed channel. This commit adds a
check that all channels are open before exposing the channel and throws
a general connection exception. In this case, the usual connection retry
logic will take over.

Relates #26932
2017-10-10 13:34:51 -04:00
Tanguy Leroux 6658ff0fd6 Don't detect source's XContentType in DocumentParser.parseDocument() (#26880)
DocumentParser.parseDocument() auto detects the XContentType of the
document to parse, but this information is already provided by SourceToParse.
2017-10-10 15:31:56 +02:00
hanbj 3ab27d16ad Fix thread context handling of headers overriding (#26068)
Previously collisions in headers between old and new contexts could be silently ignored, allowing the original context's headers to "win". This commit fixes the headers to require they are disjoint.
2017-10-09 14:41:09 -07:00
Boaz Leskes 84742690cd SearchWhileCreatingIndexIT: remove usage of _only_nodes
the only nodes preference was used as a replacement of `_primary` which was removed. Sadly, it's not the same as we also check that it makes sense - i.e., that the given node has a shard copy. Since the test uses indices with >1 shards, the primaries may be spread to multiple nodes. Using one (like it currently does) will fail for some primaries. Using all will probably end up hitting all nodes.

This commit removed the `_only_nodes` usage in favor a simple search

Relates to #26791
2017-10-09 19:37:19 +02:00
Martijn van Groningen 96823b0480
update Lucene version for 6.0-RC2 version 2017-10-09 15:27:06 +02:00
kel 1d4f70210f Calculate and cache result when advanceExact is called (#26920)
Cache final result instead of result of advanceExact.
Fix SortedNumericDoubleValues does not test MEDIAN mode
Replace deprecated random string generation method
2017-10-09 14:02:38 +02:00
Simon Willnauer cdd7c1e6c2 Return List instead of an array from settings (#26903)
Today we return a `String[]` that requires copying values for every
access. Yet, we already store the setting as a list so we can also directly
return the unmodifiable list directly. This makes list / array access in settings
a much cheaper operation especially if lists are large.
2017-10-09 09:52:08 +02:00
Nhat bf4c3642b2 remove _primary and _replica shard preferences (#26791)
The shard preference _primary, _replica and its variants were useful
for the asynchronous replication. However, with the current impl, they
are no longer useful and should be removed.

Closes #26335
2017-10-08 11:03:06 -04:00
kel 100e3c9a8a Remove UnsortedNumericDoubleValues (#26817)
Closes #24086
2017-10-06 16:31:50 +02:00
Thomas Kappler 16431a6601 Fix IndexOutOfBoundsException in histograms for NaN doubles (#26787) (#26856) 2017-10-06 16:27:01 +02:00
Jim Ferenczi e8f72353d8 Fix search_after with geo distance sorting (#26891)
Support for search_after and geo distance sorting is broken when the optimized LatLonDocValuesField.distanceSort is used.
This commit fixes the parsing of the search_after value for this case.
2017-10-06 11:34:33 +02:00
Jason Tedor 470e5e7cfc Add additional low-level logging handler ()
* Add additional low-level logging handler

We have the trace handler which is useful for recording sent messages
but there are times where it would be useful to have more low-level
logging about the events occurring on a channel. This commit adds a
logging handler that can be enabled by setting a certain log level
(org.elasticsearch.transport.netty4.ESLoggingHandler) to trace that
provides trace logging on low-level channel events and includes some
information about the request/response read/write events on the channel
as well.

* Remove imports

* License header

* Remove redundant

* Add test

* More assertions
2017-10-05 12:10:58 -04:00
Simon Willnauer 8583727590 [TEST] add test to ensure legacy list syntax in yml works fine 2017-10-05 14:41:51 +02:00
Simon Willnauer 41925e1171 Bump BWC version for settings serialization to 6.1.0 2017-10-05 14:07:05 +02:00
Md. Abdulla-Al-Sun a40c474e10
Added Bengali Analyzer to Elasticsearch with respect to the lucene update(PR#238) 2017-10-05 13:25:05 +02:00
kel a978ddf37b Fix toString() in SnapshotStatus (#26852)
Closes #26851
2017-10-05 12:57:46 +02:00