38943 Commits

Author SHA1 Message Date
Ioannis Kakavas febb46b702 [SAML] Saml metadata signing (elastic/x-pack-elasticsearch#4184)
Adds option to sign generated Service Provider SAML metadata
- Using a (possibly password protected) PEM encoded keypair
- Using a keypair stored in a (possibly password protected) PKCS#12 keystore

Resolves elastic/x-pack-elasticsearch#3982


Original commit: elastic/x-pack-elasticsearch@7b806d76f8
2018-03-28 13:43:29 +03:00
Yannick Welsch cacf759213
Remove RELOCATED index shard state (#29246)
as this information is already covered by ReplicationTracker.primaryMode.
2018-03-28 12:25:46 +02:00
David Roberts c63d32482f [ML] Avoid timeout if ML persistent task assignment fails on master node (elastic/x-pack-elasticsearch#4236)
The ML open_job and start_datafeed endpoints start persistent tasks and
wait for these to be successfully assigned before returning.  Since the
setup sequence is complex they do a "fast fail" validation step on the
coordinating node before the setup sequence.  However, this leads to the
possibility of the "fast fail" validation succeeding and the eventual
persistent task assignment failing due to other changes during the setup
sequence.  Previously when this happened the endpoints would time out,
which in the case of the open_job action takes 30 minutes by default.
The start_datafeed endpoint has a shorter default timeout of 20 seconds,
but in both cases the result of a timeout is an unfriendly HTTP 500
status.

This change adjusts the criteria used to wait for the persistent tasks to
be assigned to account for the possibility of assignment failure and, if
this happens, return an error identical to what the "fast fail"
validation would have returned.  Additionally in this case the unassigned
persistent task is cancelled, leaving the system in the same state as if
the "fast fail" validation had failed.

Original commit: elastic/x-pack-elasticsearch@16916cbc13
2018-03-28 10:06:14 +01:00
Robin Neatherway ea8e3661d0 Fix a type check that is always false (#27726)
DocumentParser: The checks for Text and Keyword were masked by the
earlier check for String, which they are child classes of. As String
field types are no longer supported, this check can be removed.
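
The masking described above can be illustrated with a small sketch. The class names are hypothetical stand-ins, not the actual Elasticsearch field types; the only point is that a subclass check placed after its parent-class check can never be reached.

```python
class StringFieldType:
    """Hypothetical parent type (stands in for the removed String type)."""


class TextFieldType(StringFieldType):
    """Hypothetical child type."""


class KeywordFieldType(StringFieldType):
    """Hypothetical child type."""


def classify_masked(field_type):
    # The parent check comes first, so it also matches every subclass.
    if isinstance(field_type, StringFieldType):
        return "string"
    # Never reached for Text or Keyword instances: they already matched above.
    if isinstance(field_type, TextFieldType):
        return "text"
    if isinstance(field_type, KeywordFieldType):
        return "keyword"
    return "other"
```

Calling `classify_masked(TextFieldType())` returns `"string"`, which is why the later checks were effectively dead code.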
2018-03-28 10:20:20 +02:00
Tanguy Leroux 36f8531bf4
Don't load global state when only restoring indices (#29239)
Restoring a snapshot, or getting the status of finished
snapshots, currently always loads the global state metadata
file from the repository even if it is not required. This
slows down the restore process (or listing statuses process)
and can also be an issue if the global state cannot be
deserialized (because it has unknown customs for example).

This commit splits the Repository.getSnapshotMetadata()
method into two distinct methods: getGlobalMetadata()
and getIndexMetadata() that are now called only when needed.
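
A rough sketch of the idea, assuming a hypothetical in-memory repository: the class, the `restore` helper and its `include_global_state` flag are illustrative and do not mirror the real Java interfaces. The global metadata is read only when the caller asks for it.

```python
class InMemoryRepository:
    """Hypothetical repository that counts how often global metadata is read."""

    def __init__(self, global_metadata, index_metadata):
        self._global = global_metadata
        self._indices = index_metadata
        self.global_reads = 0

    def get_global_metadata(self, snapshot_id):
        self.global_reads += 1
        return self._global

    def get_index_metadata(self, snapshot_id, index_name):
        return self._indices[index_name]


def restore(repository, snapshot_id, indices, include_global_state=False):
    # Index metadata is always needed for the indices being restored.
    restored = {
        "indices": {
            name: repository.get_index_metadata(snapshot_id, name)
            for name in indices
        }
    }
    # The global state is loaded only on request, so a broken or large
    # global metadata file cannot affect an indices-only restore.
    if include_global_state:
        restored["global"] = repository.get_global_metadata(snapshot_id)
    return restored
```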
2018-03-28 09:35:05 +02:00
Albert Zaharovits b7515f03cf LdapUserSearch rebind with bind DN after user bind (elastic/x-pack-elasticsearch#4209)
Fixes an inconsistency bug in which `LdapSession`s built by
`LdapUserSearchSessionFactory` are different if the factory is
configured to use a connection pool or not. The bind status of the
connection, or the connection(s) from the pool, passed through to
the newly minted `LdapSession` are now identical. Connections are
bound to the bind_dn configuration entry in the realm config.

Original commit: elastic/x-pack-elasticsearch@094af063ea
2018-03-28 09:36:02 +03:00
Ioannis Kakavas d1ed4e0bff Adds missing SAML Realm Settings (elastic/x-pack-elasticsearch#4221)
Adds idp.use_single_logout and populate_user_metadata 
in the SAML Realm Settings Set.

Resolves elastic/x-pack-elasticsearch#4219

Original commit: elastic/x-pack-elasticsearch@360f1f744e
2018-03-28 09:20:28 +03:00
Jason Tedor 1f6a3c1d80
Fix building Javadoc JARs on JDK for client JARs (#29274)
When a module or plugin registers that it has a client JAR, we copy
artifacts like the Javadoc and sources JARs as the JARs for the client
as well (with -client added to the name). I previously had to disable
the Javadoc task on JDK 10 due to a bug in bin/javadoc. After JDK 10
went GA without a fix for this bug, I added a workaround to fix the
Javadoc task on JDK 10. However, I made a mistake reverting the
previously skipped Javadoc tasks and missed the one that copies the
Javadoc JAR for client JARs. This commit fixes that issue.
2018-03-27 22:58:44 -04:00
Jason Tedor 38fd9998e7
Require JDK 10 to build Elasticsearch (#29174)
This commit bumps the minimum compiler version required to build
Elasticsearch from JDK 9 to JDK 10.
2018-03-27 19:45:13 -04:00
Lee Hinman eebda6974d
Decouple NamedXContentRegistry from ElasticsearchException (#29253)
* Decouple NamedXContentRegistry from ElasticsearchException

This commit decouples `NamedXContentRegistry` from using either
`ElasticsearchException`, `ParsingException`, or `UnknownNamedObjectException`.

This will allow us to move NamedXContentRegistry to its own lib as part of the
xcontent extraction work.

Relates to #28504
2018-03-27 16:51:31 -06:00
Bart van Oort 67a6a76aad Docs: Update generating test coverage reports (#29255)
Old docs said to use maven. That doesn't work. We can't generate the
reports right now.
2018-03-27 17:29:19 -04:00
Andy Bristol 77614658d5 Revert "[TEST] mute CoreWithSecurityClientYamlTestSuiteIT"
This reverts commit elastic/x-pack-elasticsearch@3cdc3e4b6d.

Original commit: elastic/x-pack-elasticsearch@82de67cbc8
2018-03-27 14:21:40 -07:00
Lee Hinman 7df66abaf5 [TEST] Fix issue with HttpInfo passed invalid parameter
HttpInfo is passed the maxContentLength as a parameter, but this value should
never be negative. This fixes the test to only pass a positive random value.
2018-03-27 14:20:06 -06:00
Andy Bristol 98b48b3a61 [TEST] mute CoreWithSecurityClientYamlTestSuiteIT
For elastic/x-pack-elasticsearch#4164

Original commit: elastic/x-pack-elasticsearch@3cdc3e4b6d
2018-03-27 13:02:45 -07:00
Lee Hinman b646abd12c Adjust to XContentBuilder decoupling (elastic/x-pack-elasticsearch#4212)
This is the x-pack side of https://github.com/elastic/elasticsearch/pull/29225
where some methods were renamed or take different arguments.

Original commit: elastic/x-pack-elasticsearch@525e118381
2018-03-27 12:58:26 -06:00
Lee Hinman b4c78019b0
Remove all dependencies from XContentBuilder (#29225)
* Remove all dependencies from XContentBuilder

This commit removes all of the non-JDK dependencies from XContentBuilder, with
the exception of `CollectionUtils.ensureNoSelfReferences`. It adds a third
extension point around dealing with time-based fields and formatters to work
around the Joda dependency.

This decoupling allows us to be able to move XContentBuilder to a separate lib
so it can be available for things like the high level rest client.

Relates to #28504
2018-03-27 12:58:22 -06:00
Jim Ferenczi 3db6f1c9d5 Fix sporadic failure in CompositeValuesCollectorQueueTests
This commit fixes a test bug that causes an NPE on empty segments.

Closes #29269
2018-03-27 20:11:21 +02:00
Zachary Tong 9cc33f4e29 [Rollup] Select best jobs then execute msearch-per-job (elastic/x-pack-elasticsearch#4152)
If there are multiple jobs that are all the "best" (e.g. share the
best interval) we have no way of knowing which is actually the best.
Unfortunately, we cannot just filter for all the jobs in a single
search because their doc_counts can potentially overlap.

To solve this, we execute an msearch-per-job so that the results
stay isolated.  When rewriting the response, we iteratively
unroll and reduce the independent msearch responses into a single
"working tree".  This allows us to intervene if there are
overlapping buckets and manually choose a doc_count.

Job selection is found by recursively descending through the aggregation
tree and independently pruning the list of valid job caps in each branch.
When a leaf node is reached in the branch, the remaining jobs are
sorted by "best'ness" (see comparator in RollupJobIdentifierUtils for the
implementation) and added to a global set of "best jobs". Once
all branches have been evaluated, the final set is returned to the
calling code.

Job "best'ness" is, briefly, the job(s) that have
 - The largest compatible date interval
 - Fewer and larger interval histograms
 - Fewer terms groups

Note: the final set of "best" jobs is not guaranteed to be minimal,
there may be redundant effort due to independent branches choosing
jobs that are subsets of other branches.
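
The ordering above could be sketched as a sort key. This is only an approximation under stated assumptions: `JobCaps` and its fields are hypothetical, the jobs are assumed to be already filtered for compatibility, and "larger interval histograms" is approximated by the sum of the histogram intervals. The real comparator lives in RollupJobIdentifierUtils.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JobCaps:
    """Hypothetical summary of a rollup job's capabilities."""
    job_id: str
    date_interval_ms: int
    histogram_intervals: tuple = ()
    terms_fields: tuple = ()


def bestness_key(job):
    # Smaller tuples sort first, so "larger is better" fields are negated.
    return (
        -job.date_interval_ms,          # largest date interval first
        len(job.histogram_intervals),   # fewer histograms
        -sum(job.histogram_intervals),  # larger histogram intervals (assumption)
        len(job.terms_fields),          # fewer terms groups
    )


def best_jobs(jobs):
    """Return every job that ties for the best key, in stable order."""
    if not jobs:
        return []
    ranked = sorted(jobs, key=bestness_key)
    best = bestness_key(ranked[0])
    return [job for job in ranked if bestness_key(job) == best]
```

Returning all tied jobs rather than a single winner matches the note that several jobs can be equally "best" and must each be queried separately.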

Related changes:
- We have to include the job's ID in the rollup doc's
hash, so that different jobs don't overwrite the same summary
document.
- Now that we iteratively reduce the agg tree, the agg framework
injects empty buckets while we're working.  In most cases this
is harmless, but for `avg` aggs the empty bucket is a SumAgg while
any unrolled versions are converted into AvgAggs... causing a cast
exception.  To get around this, avg's are renamed to
`{source_name}.value` to prevent a conflict
- The job filtering has been pushed up into a query filter, since it
applies to the entire msearch rather than just individual agg components
- We no longer add a filter agg clause about the date_histo's interval, because 
that is handled by the job validation and pruning.

Original commit: elastic/x-pack-elasticsearch@995be2a039
2018-03-27 10:33:59 -07:00
Jim Ferenczi 2aaa057387
Propagate ignore_unmapped to inner_hits (#29261)
In 5.2 `ignore_unmapped` was added to `inner_hits` in order to ignore invalid mapping.
This value was automatically set to the value defined in the parent query (`nested`, `has_child`, `has_parent`) but the refactoring of the parent/child in 5.6 removed this behavior unintentionally.
This commit restores this behavior but also makes sure that we always automatically enforce this value when the query builder is used directly (previously this was only done by the XContent deserialization).

Closes #29071
2018-03-27 18:55:42 +02:00
Nhat Nguyen dfc9e721d8 TEST: Increase timeout for testPrimaryReplicaResyncFailed
The default timeout (e.g. 10 seconds) may not be enough for CI to
re-allocate shards after the partition is healed. This commit increases
the timeout to 30 seconds and enables logging in order to have more
detailed information in case this test fails again.

Closes #29060
2018-03-27 12:18:09 -04:00
David Roberts f1a948bc54 [ML] More corrections to BWC version for model min version in job serialization
Original commit: elastic/x-pack-elasticsearch@408f35f784
2018-03-27 16:19:39 +01:00
Dimitris Athanasiou e34cb2085f [ML] Also adjust bwc version in job serialization
Original commit: elastic/x-pack-elasticsearch@e6231ba6d3
2018-03-27 16:13:51 +01:00
Luca Cavanna 13f9e922f3
REST client: hosts marked dead for the first time should not be immediately retried (#29230)
This was the plan from day one but due to a silly bug nodes were immediately retried after they were marked as dead for the first time. From the second time on, the expected backoff was applied.
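
A minimal sketch of the intended behaviour, not the client's actual implementation: the class name, the doubling formula and the default timeouts are assumptions chosen for illustration. The point is that the backoff already applies after the first failure.

```python
class DeadHostTracker:
    """Hypothetical tracker that backs off retries to hosts marked dead."""

    def __init__(self, base_timeout=60.0, max_timeout=1800.0):
        self.base_timeout = base_timeout
        self.max_timeout = max_timeout
        self._dead = {}  # host -> (failed_attempts, retry_after)

    def mark_dead(self, host, now):
        failed_attempts = self._dead.get(host, (0, 0.0))[0] + 1
        # The first failure already waits base_timeout; later ones double it.
        timeout = min(self.base_timeout * 2 ** (failed_attempts - 1),
                      self.max_timeout)
        self._dead[host] = (failed_attempts, now + timeout)

    def mark_alive(self, host):
        self._dead.pop(host, None)

    def is_retryable(self, host, now):
        if host not in self._dead:
            return True
        return now >= self._dead[host][1]
```

Passing `now` explicitly instead of reading a clock keeps the sketch deterministic and easy to test.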
2018-03-27 16:15:44 +02:00
Nhat Nguyen d1d3edf156 TEST: Use different translog dir for a new engine
In #testPruneOnlyDeletesAtMostLocalCheckpoint, we create a new engine
but mistakenly use the same translog directory as the existing engine.
This prevents translog files from being cleaned up when closing the engines.

ERROR   0.12s J2 | InternalEngineTests.testPruneOnlyDeletesAtMostLocalCheckpoint <<< FAILURES!
   > Throwable #1: java.io.IOException: could not remove the following files (in the order of attempts):
   >    translog-primary-060/translog-2.tlog:  java.io.IOException: access denied:

This commit makes sure to use a separate directory for each engine in
this test.
2018-03-27 09:45:51 -04:00
Christoph Büscher 8d6832c5ee
Make SearchStats implement Writeable (#29258)
Moves another class over from Streamable to Writeable. As a result,
some constructors can also be removed or made private.
2018-03-27 15:21:11 +02:00
Andrew Banchich d2baf4b191 [Docs] Spelling and grammar changes to reindex.asciidoc (#29232) 2018-03-27 12:17:46 +02:00
Nhat Nguyen 0ac89a32cc
Do not optimize append-only if seen normal op with higher seqno (#28787)
When processing an append-only operation, primary knows that operations 
can only conflict with another instance of the same operation. This is
true as the id was freshly generated. However this property doesn't hold
for replicas. As soon as an auto-generated ID was indexed into the
primary, it can be exposed to a search and users can issue a follow up
operation on it. In extremely rare cases, the follow up operation can
arrive and be processed on a replica before the original append-only
request. In this case we can't simply proceed with the append-only
request and blindly add it to the index without consulting the version
map. 

The following scenario can cause a difference between primary and
replica.

1. Primary indexes an auto-gen-id doc. (id=X, v=1, s#=20)
2. A refresh cycle happens on primary
3. The new doc is picked up and modified - say by a delete by query
   request - Primary gets a delete doc (id=X, v=2, s#=30)
4. Delete doc is processed first on the replica (id=X, v=2, s#=30)
5. Indexing operation arrives on the replica, since it's an auto-gen-id
   request and the retry marker is lower, we put it into lucene without 
   any check. Replica has a doc the primary doesn't have.

To deal with a potential conflict between an append-only operation and a 
normal operation on replicas, we need to rely on sequence numbers. This
commit maintains the max seqno of non-append-only operations on the replica
and applies the optimization to an append-only operation only if its
seq# is higher than the seq# of all non-append-only operations.
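
The rule can be sketched roughly as follows. This is a simplified model, not the engine code: the class and method names are hypothetical, a dict stands in for Lucene, and the version map is never trimmed.

```python
class ReplicaEngineSketch:
    """Hypothetical replica that gates the append-only fast path on seq#."""

    def __init__(self):
        self.max_seqno_of_non_append_only = -1
        self.version_map = {}  # doc_id -> (seqno, is_delete)
        self.docs = {}         # stands in for Lucene: doc_id -> seqno

    def _is_stale(self, doc_id, seqno):
        seen = self.version_map.get(doc_id)
        return seen is not None and seen[0] >= seqno

    def apply_normal(self, doc_id, seqno, is_delete):
        # Track the highest seq# seen for any non-append-only operation.
        self.max_seqno_of_non_append_only = max(
            self.max_seqno_of_non_append_only, seqno)
        if self._is_stale(doc_id, seqno):
            return "discarded"
        self.version_map[doc_id] = (seqno, is_delete)
        if is_delete:
            self.docs.pop(doc_id, None)
        else:
            self.docs[doc_id] = seqno
        return "applied"

    def apply_append_only(self, doc_id, seqno):
        if seqno > self.max_seqno_of_non_append_only:
            # Fast path: no normal operation with a higher seq# has been
            # seen, so the lookup is skipped (simplification: the op is
            # still recorded so later lookups in this sketch can see it).
            self.docs[doc_id] = seqno
            self.version_map[doc_id] = (seqno, False)
            return "optimized"
        # Slow path: consult the version map before indexing.
        if self._is_stale(doc_id, seqno):
            return "discarded"
        self.version_map[doc_id] = (seqno, False)
        self.docs[doc_id] = seqno
        return "applied"
```

Replaying the scenario above, a delete with s#=30 followed by the append-only index with s#=20 takes the slow path and is discarded, so the replica does not end up with a document the primary lacks.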
2018-03-26 16:56:12 -04:00
Andy Bristol f3cd9a69a2 [test] packaging: renamed packaging configuration (elastic/x-pack-elasticsearch#4112)
For elastic/elasticsearch#26741

Original commit: elastic/x-pack-elasticsearch@401e9bb0e4
2018-03-26 13:43:29 -07:00
Andy Bristol 7bf9091942
[test] packaging: gradle tasks for groovy tests (#29046)
The vagrant test plugin adds tasks for the groovy packaging tests,
which run after the bats packaging test tasks. Rename the 'bats'
configuration to 'packaging' and remove the option to inherit
archives from this configuration.
2018-03-26 13:43:09 -07:00
Nhat Nguyen 87957603c0
Prune only gc deletes below local checkpoint (#28790)
Once a document is deleted and Lucene is refreshed, we will not be able 
to look up the `version/seq#` associated with that delete in Lucene. As
conflicting operations can still be indexed, we need another mechanism
to remember these deletes. Therefore deletes should still be stored in
the Version Map, even after Lucene is refreshed. Obviously, we can't
remember all deletes forever so a trimming mechanism is needed.
Currently, we remember deletes for at least 1 minute (the default GC
deletes cycle) and clean them periodically. This is, at the moment, the
best we can do on the primary for user facing APIs but this arbitrary
time limit is problematic for replicas. Furthermore, we can't rely on
the primary and replicas doing the trimming in a synchronized manner,
and failing to do so results in the replica and primary making different
decisions. 

The following scenario can cause inconsistency between
primary and replica.

1. Primary indexes doc (index, id=1, v2)
2. Network packet issue causes index operation to back off and wait
3. Primary deletes doc (delete, id=1, v3)
4. Replica processes delete (delete, id=1, v3)
5. 1+ minute passes (GC deletes runs on replica)
6. Indexing op is finally sent to the replica which now processes it
   because it forgot about the delete.

We can rely on sequence-numbers to prevent this issue. If we prune only
deletes whose seqno is at most the local checkpoint, a replica will
correctly remember what it needs. The correctness is explained as
follows:

Suppose o1 and o2 are two operations on the same document with seq#(o1) 
< seq#(o2), and o2 arrives before o1 on the replica. o2 is processed
normally since it arrives first; when o1 arrives it should be discarded:
 
1. If seq#(o1) <= LCP, then it will not be added to Lucene, as it was
  already previously added.

2. If seq#(o1)  > LCP, then it depends on the nature of o2:
  - If o2 is a delete then its seq# is recorded in the VersionMap,
    since seq#(o2) > seq#(o1) > LCP, so a lookup can find it and
    determine that o1 is stale.
  
  - If o2 is an indexing then its seq# is either in Lucene (if
    refreshed) or the VersionMap (if not refreshed yet), so a 
    real-time lookup can find it and determine that o1 is stale.

In this PR, we prefer to deploy a single trimming strategy, which 
satisfies both requirements, on primary and replicas because:

- It's simpler - no need to distinguish if an engine is running at
primary mode or replica mode or being promoted.

- If a replica subsequently is promoted, user experience is fully
maintained as that replica remembers deletes for the last GC cycle.

However, the version map may consume less memory if we deploy two 
different trimming strategies for primary and replicas.
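
A minimal sketch of such a single trimming rule, assuming hypothetical names (`Tombstone`, `prune_tombstones`) and a plain dict in place of the Version Map: a delete is forgotten only when it is both older than the GC deletes interval and at or below the local checkpoint.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tombstone:
    """Hypothetical record of a delete kept in the version map."""
    seqno: int
    deleted_at: float  # seconds, same clock as `now`


def prune_tombstones(tombstones, now, gc_deletes_seconds, local_checkpoint):
    """Return the tombstones that must still be remembered."""
    kept = {}
    for doc_id, tombstone in tombstones.items():
        old_enough = now - tombstone.deleted_at > gc_deletes_seconds
        below_checkpoint = tombstone.seqno <= local_checkpoint
        # Both conditions must hold before a delete can be dropped.
        if not (old_enough and below_checkpoint):
            kept[doc_id] = tombstone
    return kept
```

With a 60 second interval and a local checkpoint of 10, a two minute old delete with seq# 12 is kept, because an operation with a lower seq# may still arrive.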
2018-03-26 13:42:08 -04:00
Dimitris Athanasiou afb6a06f61 [ML] Model snapshot min_version is now present since 7.0.0
Original commit: elastic/x-pack-elasticsearch@39d193461d
2018-03-26 17:09:11 +01:00
Chris Earle a600350d4c [Monitoring] Remove 202 responses in favor of 200 responses (elastic/x-pack-elasticsearch#4213)
This changes `_xpack/monitoring/_bulk` to fundamentally behave in the same
way as `_bulk` and never return 202 when data is ignored (something
`_bulk` cannot do). Instead, anyone interested will have to inspect the
returned response for the ignored flag.

Original commit: elastic/x-pack-elasticsearch@07254a006d
2018-03-26 11:36:04 -04:00
Boaz Leskes bca264699a remove testUnassignedShardAndEmptyNodesInRoutingTable
testUnassignedShardAndEmptyNodesInRoutingTable is as old as time and does a very bogus thing.
It is an IT test which extracts the GatewayAllocator from the node and tells it to allocate unassigned
shards, while giving it a conjured cluster state with no nodes in it (it uses DiscoveryNodes.EMPTY_NODES).
This is never a cluster state we want to reroute on (we always have at least a master node in it).
I'm going to just delete the test as I don't think it adds much value.

Closes #21463
2018-03-26 17:10:57 +02:00
Alexander Reelsen 67badaadb0 Docs: Fix secure settings link
Original commit: elastic/x-pack-elasticsearch@f98a8dabc6
2018-03-26 15:32:27 +02:00
Jim Ferenczi dd77d7fd0a #28745: remove extra option in the composite rest tests
`allow_partial_search_results` is not needed for these tests.
2018-03-26 14:32:59 +02:00
Alexander Reelsen c2764cef98 Docs: Fix deprecation notices and typo to build docs
Original commit: elastic/x-pack-elasticsearch@6e5504efd9
2018-03-26 14:25:42 +02:00
Boaz Leskes f5d4550e93
Fold EngineDiskUtils into Store, for better lock semantics (#29156)
#28245 has introduced the utility class `EngineDiskUtils` with a set of methods to prepare/change
translog and lucene commit points. That util class bundled everything that's needed to create an
empty shard, bootstrap a shard from a lucene index that was just restored etc.

In order to safely do these manipulations, the util methods acquired the IndexWriter's lock. That
would sometimes fail due to concurrent shard store fetching or other short activities that require the
files not to be changed while they read from them. 

Since there is no way to wait on the index writer lock, the `Store` class has other locks to make
sure that once we try to acquire the IW lock, it will succeed. To side step this waiting problem, this
PR folds `EngineDiskUtils` into `Store`. Sadly this comes with a price - the store class doesn't and
shouldn't know about the translog. As such the logic is slightly less tight and callers have to do the
translog manipulations on their own.
2018-03-26 14:08:03 +02:00
Christoph Büscher a9392f6d42 Add file permissions checks to precommit task
This adds a check for source files that have the execute bit set to the
precommit task.
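
As an illustration only (the actual check is part of the build's precommit task, not this script), such a check could be sketched as below. The function names and the default `.java` suffix are assumptions.

```python
import os
import stat

# Any of the user, group or other execute bits counts as "executable".
EXECUTE_BITS = stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH


def has_execute_bit(mode):
    """Return True if a file mode (as in os.stat().st_mode) is executable."""
    return bool(mode & EXECUTE_BITS)


def find_executable_sources(root, suffixes=(".java",)):
    """Walk `root` and list source files that have an execute bit set."""
    offenders = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(suffixes):
                path = os.path.join(dirpath, name)
                if has_execute_bit(os.stat(path).st_mode):
                    offenders.append(path)
    return sorted(offenders)
```

A build step could fail whenever `find_executable_sources` returns a non-empty list.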
2018-03-26 13:37:55 +02:00
Christoph Büscher 318b0af953 Remove execute mode bit from source files
Some source files seem to have the execute bit (a+x) set, which doesn't
really seem to hurt but is a bit odd. This change removes those, making
the permissions similar to other source files in the repository.
2018-03-26 13:37:55 +02:00
Jim Ferenczi 3a75435980 Fix IndexerUtilsTests that relies on indexed fields
This test creates doc values fields only but does not set the index options to none.
This commit fixes this discrepancy by adding an indexed point field for each doc values field.

relates elastic/x-pack-elasticsearch#4223

Original commit: elastic/x-pack-elasticsearch@8adab7c849
2018-03-26 13:37:18 +02:00
David Turner 8c8de0a774 Mute failing IndexerUtilsTests
Awaiting a fix of elastic/x-pack-elasticsearch#4223

Original commit: elastic/x-pack-elasticsearch@d385099719
2018-03-26 10:57:34 +01:00
Jim Ferenczi 5288235ca3
Optimize the composite aggregation for match_all and range queries (#28745)
This change refactors the composite aggregation to add an execution mode that visits documents in the order of the values
present in the leading source of the composite definition. This mode does not need to visit all documents since it can early terminate
the collection when the leading source value is greater than the lowest value in the queue.
Instead of collecting the documents in the order of their doc_id, this mode uses the inverted lists (or the bkd tree for numerics) to collect documents
in the order of the values present in the leading source.
For instance the following aggregation:

```
"composite" : {
  "sources" : [
    { "value1": { "terms" : { "field": "timestamp", "order": "asc" } } }
  ],
  "size": 10
}
```
... can use the field `timestamp` to collect the documents with the 10 lowest values for the field instead of visiting all documents.
For composite aggregations with more than one source the execution can early terminate as soon as one of the 10 lowest values produces enough
composite buckets. For instance if visiting the two lowest timestamps created 10 composite buckets we can early terminate the collection since it
is guaranteed that the third lowest timestamp cannot create a composite key that compares lower than the ones already visited.

This mode can execute iff:
 * The leading source in the composite definition uses an indexed field of type `date` (works also with `date_histogram` source), `integer`, `long` or `keyword`.
 * The query is a match_all query or a range query over the field that is used as the leading source in the composite definition.
 * The sort order of the leading source is the natural order (ascending since postings and numerics are sorted in ascending order only).

If these conditions are not met this aggregation visits each document like any other agg.
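
The early termination can be sketched outside of Lucene as follows. This is a toy model under stated assumptions: documents are `(leading, secondary)` tuples, a sorted list stands in for the inverted lists or BKD tree, and the function name is hypothetical.

```python
from itertools import groupby


def top_composite_buckets(docs, size):
    """Collect the `size` lowest composite buckets, visiting as few docs as possible.

    Returns (buckets, visited), where buckets is a list of
    ((leading, secondary), doc_count) in ascending key order.
    """
    # Stand-in for iterating the index in leading-source value order.
    ordered = sorted(docs, key=lambda doc: doc[0])
    buckets = {}
    visited = 0
    for _leading, group in groupby(ordered, key=lambda doc: doc[0]):
        # Every doc sharing this leading value must be visited, since any of
        # them may produce a composite key among the lowest ones.
        for doc in group:
            visited += 1
            buckets[doc] = buckets.get(doc, 0) + 1
        # Keys from later leading values always compare higher, so once
        # enough buckets exist the collection can stop.
        if len(buckets) >= size:
            break
    return sorted(buckets.items())[:size], visited
```

For five documents with leading values 1, 1, 2, 3, 3 and `size=2`, only the two documents with leading value 1 are visited.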
2018-03-26 09:51:37 +02:00
Alexander Reelsen 6eeacf339c Build: Use environment variables for credentials (elastic/x-pack-elasticsearch#4058)
The credentials now get injected via environment variables, so that
external services can pull those.

As soon as the specified environment variables are set, the tests are run. No need to check for the @Network annotation.

This also introduces new secret store settings for the secure settings in order to be sure not to leak them in the configuration files that get dumped.

Relates elastic/x-pack-elasticsearch#3800

Original commit: elastic/x-pack-elasticsearch@a2cfb9cb86
2018-03-26 09:10:04 +02:00
Jason Tedor e66072c09f Enable security in packaging tests (elastic/x-pack-elasticsearch#4216)
Now that security is not enabled by default for a trial license, the
packaging tests are failing because they expect security to be
enabled. This commit adds enabling security in all instances started
during the packaging tests.

Original commit: elastic/x-pack-elasticsearch@9838393ecb
2018-03-24 15:36:05 -04:00
Tim Sullivan 05a0d6273c [Monitoring/Beats] Add new CPU fields, remove old CPU fields (elastic/x-pack-elasticsearch#3991)
* [Monitoring/Beats] Add new CPU fields, remove old CPU fields

* use long instead of double for cpu counters

* time => time.ms

Original commit: elastic/x-pack-elasticsearch@244b08a574
2018-03-23 16:19:40 -07:00
Dimitris Athanasiou 67c64a6dfd [ML] Return error when process cause has been killed (elastic/x-pack-elasticsearch#4211)
relates elastic/x-pack-elasticsearch#4210

Original commit: elastic/x-pack-elasticsearch@c5169328ee
2018-03-23 17:30:10 +00:00
Christoph Büscher afe95a7738
[Docs] Add rank_eval size parameter k (#29218)
The rank_eval documentation was missing an explanation of the parameter
`k` that controls the number of top hits that are used in the ranking evaluation.

Closes #29205
2018-03-23 18:04:32 +01:00
Nicholas Knize d400a08788 [DOCS] Remove ignore_z_value parameter link
Removes invalid ignore_z_value parameter link in geo-point.asciidoc.
2018-03-23 11:07:24 -05:00
Jean-Charles Legras 687fe860ac Docs: Update docs/index_.asciidoc (#29172)
Use `_doc` in the routing example instead of `tweet` to agree with the
text and line up with the other examples.
2018-03-23 11:35:10 -04:00
Petr Novák 16bffc7394 Docs: Link C++ client lib elasticlient (#28949)
elasticlient is a simple library that simplifies working with Elasticsearch in C++.
2018-03-23 11:30:01 -04:00