Commit Graph

1510 Commits

Author SHA1 Message Date
Zachary Tong 299d044bfc
Collapse pipeline aggs into single package (#34658)
- Restrict visibility of Aggregators and Factories
- Move PipelineAggregatorBuilders up a level so it is consistent with
AggregatorBuilders
- Checkstyle line length fixes for a few classes
- Minor odds/ends (swapping to method references, formatting, etc)
2018-10-23 16:01:01 -04:00
Jake Landis 89dc07bdd9
ingest: better support for conditionals with simulate?verbose (#34155)
This commit introduces two corrections to the way simulate?verbose
handles conditionals on processors.

1) Prior to this change when executing simulate?verbose for
processors with conditionals that evaluate to false, that processor
would still be displayed in the result set. What was displayed was
correct, such that no changes to the document occurred. However, if the
conditional evaluates to false, the processor should not even be
displayed.

2) Prior to this change when executing simulate?verbose for
pipeline processors with conditionals, the individual steps would no
longer be displayed. Commit e37e5df addressed the issue, but
failed account for a conditional on the pipeline processor. Since
a pipeline processor can introduce cycles and is effectively a
single processor that encapsulates multiple other processors that
are potentially guarded by a single conditional, special handling is
needed to for pipeline and conditional pipeline processors.
2018-10-23 11:33:48 -05:00
Zachary Tong 4dbf498721
[Rollup] Job deletion should be invoked on the allocated task (#34574)
We should delete a job by directly talking to the allocated 
task and telling it to shutdown. Today we shut down a job 
via the persistent task framework. This is not ideal because, 
while the job has been removed from the persistent task 
CS, the allocated task continues to live until it gets the 
shutdown message.

This means a user can delete a job, immediately delete 
the rollup index, and then see new documents appear in
 the just-deleted index. This happens because the indexer
 in the allocated task is still running and indexes a few 
more documents before getting the shutdown command.

In this PR, the transport action is changed to a TransportTasksAction, 
and we invoke onCancelled() directly on the matching job. 
The race condition still exists after this PR (albeit less likely), 
but this was a precursor to fixing the issue and a self-contained
chunk of code. A second PR will followup to fix the race itself.
2018-10-23 12:23:22 -04:00
Albert Zaharovits 11881e7b50
Empty GetAliases authorization fix (#34444)
This fixes a bug about aliases authorization.
That is, a user might see aliases which he is not authorized to see.
This manifests when the user is not authorized to see any aliases
and the `GetAlias` request is empty which normally is a marking
that all aliases are requested. In this case, no aliases should be
returned, but due to this bug, all aliases will have been returned.
2018-10-23 18:50:20 +03:00
Christoph Büscher 583f2852f0
[Test] Remove dead code from ExceptionSerializationTests (#34713)
The `ignore` set contains entries of type Class<?>, but the check is performed
on Path objects. This always returns false so is useless currently. Looking at
the first commit of this test that already shows this behaviour this never
excluded anything, so it can be removed.
2018-10-23 15:44:47 +02:00
Jake Landis ad94e79350
ingest: processor stats (#34724)
This change introduces stats per processors. Total, time, failed,
current are currently supported. All pipelines will now show all
top level processors that belong to it. Failure processors are not
displayed, however, the time taken to execute the failure chain is part
of the stats for the top level processor.

The processor name is the type of the processor, ordered as defined in
the pipeline. If a tag for the processor is found, then the tag is
appended to the type.

Pipeline processors will have the pipeline name appended to the name of
the name of the processors (before the tag if one exists). If more
then one pipeline is used to process the document, then each pipeline
will carry its own stats. The outer most pipeline will also include the
inner most pipeline stats.

Conditional processors will only included in the stats if the condition evaluates
to true.
2018-10-23 07:30:52 -05:00
Igor Motov 123f784e32
Tests: Add checks to GeoDistanceQueryBuilderTests (#34273)
Adds checks for parsed geo distance query. It is a bit hack-ish since it
compares with query's toString() output, but it is better than no
checks. The parsed query itself has default visibility, so we cannot
access it here unless we move the test to org.apache.lucene.document
package.

Fixes #34043
2018-10-23 07:55:41 -04:00
Armin Braun 8e155b8430
INGEST: Rename Pipeline Processor Param. (#34733)
* `name` is more readable/ergnomic than having `pipeline` twice
2018-10-23 13:43:26 +02:00
Alexander Reelsen 83fd93b2fd
Core: Move IndexNameExpressionResolver to java time (#34507)
This switches from joda time to java time when resolving index names
using date math. This commit also removes two non registered settings
from the code, which could not be used anyway. An unused method was
removed as well.

Relates #27330
2018-10-23 13:26:02 +02:00
Alpar Torok 0536635c44
Upgrade forbiddenapis to 2.6 (#33809)
* Upgrade forbiddenapis to 2.6

Closes #33759

* Switch forbiddenApis back to official plugin

* Remove CLI based task

* Fix forbiddenApisJava9
2018-10-23 12:06:46 +03:00
Julie Tibshirani f854330e06
Make sure to use the type _doc in the REST documentation. (#34662)
* Replace custom type names with _doc in REST examples.
* Avoid using two mapping types in the percolator docs.
* Rename doc -> _doc in the main repository README.
* Also replace some custom type names in the HLRC docs.
2018-10-22 11:54:04 -07:00
Lee Hinman 5dd79bf58c
Make accounting circuit breaker settings dynamic (#34372)
* Make accounting circuit breaker settings dynamic

These missed the original property making them dynamic. This fixes the issue so
these can now be set at any time.

Resolves #34368
2018-10-22 09:55:00 -06:00
Julie Tibshirani fbb9ac34f9
Deprecate type exists requests. (#34663) 2018-10-22 08:46:11 -07:00
Jason Tedor 0577703183
Revert "ingest: processor stats (#34202)"
This reverts commit 6567729600.
2018-10-21 13:16:15 -04:00
Ryan Ernst 222652dfce
Scripting: Convert script fields to use script context (#34164)
This commit removes the use of SearchScript for script fields and adds
a new FieldScript.
2018-10-20 16:33:49 -07:00
Nhat Nguyen 7ab464807d TEST: Mute testDedupByPrimaryTerm
Should be fixed by #34667
2018-10-20 18:23:02 -04:00
Jake Landis 6567729600
ingest: processor stats (#34202)
This change introduces stats per processors. Total, time, failed,
current are currently supported. All pipelines will now show all
top level processors that belong to it. Failure processors are not
displayed, however, the time taken to execute the failure chain is part
of the stats for the top level processor.

The processor name is the type of the processor, ordered as defined in
the pipeline. If a tag for the processor is found, then the tag is
appended to the type.

Pipeline processors will have the pipeline name appended to the name of
the name of the processors (before the tag if one exists). If more
then one pipeline is used to process the document, then each pipeline
will carry its own stats. The outer most pipeline will also include the
inner most pipeline stats.

Conditional processors will only included in the stats if the condition evaluates
to true.
2018-10-20 16:01:01 -05:00
Nhat Nguyen d90b6730c7
CCR: Following primary should process NoOps once (#34408)
This is a follow-up for #34288.

Relates #34412
2018-10-19 21:10:13 -04:00
Jim Ferenczi ba87c543c0 [TEST] Fix sporadic failures in CompletionSuggestSearchIT#testTiebreak
Relates #34508
2018-10-20 01:05:48 +02:00
Nhat Nguyen bd92a28cfc
CCR: Replicate existing ops with old term on follower (#34412)
Since #34288, we might hit deadlock if the FollowTask has more fetchers
than writers. This can happen in the following scenario:

Suppose the leader has two operations [seq#0, seq#1]; the FollowTask has
two fetchers and one writer.

1. The FollowTask issues two concurrent fetch requests: {from_seq_no: 0,
num_ops:1} and {from_seq_no: 1, num_ops:1} to read seq#0 and seq#1
respectively.

2. The second request which fetches seq#1 completes before, and then it
triggers a write request containing only seq#1.

3. The primary of a follower fails after it has replicated seq#1 to
replicas.

4. Since the old primary did not respond, the FollowTask issues another
write request containing seq#1 (resend the previous write request).

5. The new primary has seq#1 already; thus it won't replicate seq#1 to
replicas but will wait for the global checkpoint to advance at least
seq#1.

The problem is that the FollowTask has only one writer and that writer
is waiting for seq#0 which won't be delivered until the writer completed.

This PR proposes to replicate existing operations with the old primary
term (instead of the current term) on the follower. In particular, when
the following primary detects that it has processed an process already,
it will look up the term of an existing operation with the same seq_no
in the Lucene index, then rewrite that operation with the old term
before replicating it to the following replicas. This approach is
wait-free but requires soft-deletes on the follower.

Relates #34288
2018-10-19 13:56:00 -04:00
Igor Motov 94bde37bcf
Geo: Don't flip longitude of envelopes crossing dateline (#34535)
When a envelope that crosses the dateline is specified as a part of
geo_shape query is parsed it shouldn't have its left and right points
flipped.

Fixes #34418
2018-10-19 13:53:54 -04:00
Jim Ferenczi fba5d39bbb
Fix completion suggester's score tie-break (#34508)
The shard suggestion sort uses a different tie-break than the one that is used
to merge different shards responses. The former uses the internal document identifier
when scores are the same whereas the latter compares the surface form first.
Because of this discrepancy some suggestion outputs are linked to the wrong documents
because the merge sort reorders the shard suggestions differently. This change
fixes this bug by duplicating the Lucene collector in order to be able to apply the
same tiebreak strategy than the merge sort. This logic will be removed when
https://issues.apache.org/jira/browse/LUCENE-8529 is fixed.

Closes #34378
2018-10-19 19:46:55 +02:00
Nhat Nguyen 90ca5b1fde
Fill LocalCheckpointTracker with Lucene commit (#34474)
Today we rely on the LocalCheckpointTracker to ensure no duplicate when
enabling optimization using max_seq_no_of_updates. The problem is that
the LocalCheckpointTracker is not fully reloaded when opening an engine
with an out-of-order index commit. Suppose the starting commit has seq#0
and seq#2, then the current LocalCheckpointTracker would return "false"
when asking if seq#2 was processed before although seq#2 in the commit.

This change scans the existing sequence numbers in the starting commit,
then marks these as completed in the LocalCheckpointTracker to ensure
the consistent state between LocalCheckpointTracker and Lucene commit.
2018-10-19 12:38:06 -04:00
Christophe Bismuth 3036ab1048 Don't omit default values when updating routing exclusions (#33638)
Exclusion setting `cluster.routing.allocation.exclude._host` default value
is an empty string.

When an exclusion setting is sent with a null value the
o.e.c.s.Setting#innerGetRaw API return an empty string (probably to
avoid a NullPointerException to be raised).

The o.e.c.r.a.d.FilterAllocationDecider class is developed to omit
updates of default values for exclusion setting.

That's why a null exclusion setting value is translated to an empty
string which is equals to the exclusion default value which is
configured to be ignored.

A simple fix would be to not omit default values for exclusion setting
and keep the NullPointerException guard. This is the purpose of this
commit.

Closes #32721
2018-10-19 13:57:41 +02:00
Jim Ferenczi 7b49beb9b0
Fix threshold frequency computation in Suggesters (#34312)
The `term` and `phrase` suggesters have different options to filter candidates
based on their frequencies. The `popular` mode for instance filters candidate
terms that occur in less docs than the original term. However when we compute this threshold
we use the total term frequency of a term instead of the document frequency. This is not inline
with the actual filtering which is always based on the document frequency. This change fixes
this discrepancy and clarifies the meaning of the different frequencies in use in the suggesters.
It also ensures that the threshold doesn't overflow the maximum allowed value (Integer.MAX_VALUE).

Closes #34282
2018-10-19 13:33:19 +02:00
markharwood fe623acf66
Docs - removed experimental/beta markers from adjacency matrix aggregation (#34599) 2018-10-19 09:33:59 +01:00
Daniel Mitterdorfer dbb6fe58fa
Remove hand-coded XContent duplicate checks
With this commit we cleanup hand-coded duplicate checks in XContent
parsing. They were necessary previously but since we reconfigured the
underlying parser in #22073 and #22225, these checks are obsolete and
were also ineffective unless an undocumented system property has been
set. As we also remove this escape hatch, we can remove the additional
checks as well.

Closes #22253
Relates #34588
2018-10-19 10:13:13 +02:00
Alexander Reelsen e498b7d437
Core: Parse floats in epoch millis parser (#34504)
In order to stay BWC compatible with joda time, the epoch millis date
formatter needs to parse dates with a dot like `123.45`. This
adds this functionality for the epoch millis parser in the same way as
for the epoch seconds parser. It also adds support for scientific
notations like `1.0e3` and fixes parsing of negative values for epoch
seconds and epoch millis.
2018-10-19 10:02:45 +02:00
Christoph Büscher 4f7895800e
Remove unused methods in ValueType (#34624)
The removed methods seem unused in the rest of the project.
2018-10-19 09:50:45 +02:00
Christoph Büscher 7bcf496315
[Tests] Correct map lookup in ReplicationTrackerTests (#34565) 2018-10-18 11:23:53 +02:00
Ryan Ernst 8734540345
Ensure map keys cannot be self referencing (#34569)
This commit improves self reference checking to map keys, as well as
adds it to ingest script processing.
2018-10-17 15:16:13 -07:00
Jason Tedor 9be87adb95
Increment settings version when upgrading index (#34566)
When we upgrade an index, we set the settings version upgraded
setting. This should be considered a settings change, and therefore we
need to increment the settings version. This commit addresses that.
2018-10-17 18:00:17 -04:00
Nik Everett b6aa42777a
Search: Wrap lucene classes at 140 columns (#34491)
Applies our line length guidance for all classes in the server in `lucene`
directories *except* `XMoreLikeThis`. The only long line in
`XMoreLikeThis` says "remove this when we upgrade to Lucene 5. Given
that we're on Lucene 8, this is a little terrifying and deserves another
look.
2018-10-17 15:54:35 -04:00
Armin Braun 08d4bf6e84
TESTS: Remove Dead Code in Test Infra. (#34548)
* None of this infrastructure is used
* Some redundant throws and resulting catch code removed
2018-10-17 20:08:39 +01:00
Simon Willnauer b0e98cbce2
Pass the host name on as `server_name` if proxy mode is on (#34559)
In remote cluster setup if we see a configured proxy we should set
the seed nodes host name as the `server_name` to trigger SNI based
routing even for seed nodes. Since remote cluster connections are
plain TCP connections we have to set the host manually since the other
side can't take it from the request URL like in the HTTP case.
This also adds some more informative logging to remote cluster connection.
2018-10-17 19:11:50 +02:00
Armin Braun 3954d041a0
SCRIPTING: Move sort Context to its Own Class (#33717)
* SCRIPTING: Move sort Context to its own Class
2018-10-17 10:02:44 +01:00
Simon Willnauer a93aefb4a4
Assume that rollover datemath tests run on the same day. (#34527)
in #28741 RolloverIT fails because we are cutting over to the
next day while the test executes. We assume that this doesn't happen
based on the assertions in the test. This adds a assumeTrue to ensure
we are at least 5 min away form a date-flip.

Closes #28741
2018-10-16 20:22:32 +02:00
Armin Braun ea576a8ca2
Disc: Move AbstractDisruptionTC to filebased D. (#34461)
* Discovery: Move AbstractDisruptionTestCase to file-based discovery.
* Relates #33675
* Simplify away ClusterDiscoveryConfiguration
2018-10-16 15:28:40 +01:00
Simon Willnauer d43a1fac33
Lock down Engine.Searcher (#34363)
`Engine.Searcher` is non-final today which makes it error prone
in the case of wrapping the underlying reader or lucene `IndexSearcher`
like we do in `IndexSearcherWrapper`. Yet, there is no subclass of it yet
that would be dramatic to just drop on the floor. With the start of development
of frozen indices this changed since in #34357 functionality was added to
a subclass which would be dropped if a `IndexSearcherWrapper` is installed on an index.
This change locks down the `Engine.Searcher` to prevent such a functionality trap.
2018-10-16 14:53:07 +02:00
Martijn van Groningen a1ec91395c
Changed CCR internal integration tests to use a leader and follower cluster instead of a single cluster (#34344)
The `AutoFollowTests` needs to restart the clusters between each tests, because
it is using auto follow stats in assertions. Auto follow stats are only reset
by stopping the elected master node.

Extracted the `testGetOperationsBasedOnGlobalSequenceId()` test to its own test, because it just tests the shard changes api.

* Renamed AutoFollowTests to AutoFollowIT, because it is an integration test.
Renamed ShardChangesIT to IndexFollowingIT, because shard changes it the name
of an internal api and isn't a good name for an integration test.

* move creation of NodeConfigurationSource to a seperate method

* Fixes issues after merge, moved assertSeqNos() and assertSameDocIdsOnShards() methods from ESIntegTestCase to InternalTestCluster, so that ccr tests can use these methods too.
2018-10-16 14:45:46 +02:00
Jason Tedor 05911fb499
Adjust settings version BWC version after backport
This commit adjusts the settings version BWC version after backporting
the change to the 6.x branch which currently is versioned as 6.5.0.
2018-10-16 06:38:38 -04:00
Jim Ferenczi 544de13d8e
Disallow negative query boost (#34486)
This change disallows negative query boosts. Negative scores are not allowed in Lucene 8 so
it is easier to just disallow negative boosts entirely. We should also deprecate negative boosts
in 6x in order to ensure that users are aware when they'll upgrade to ES 7.

Relates #33309
2018-10-16 11:31:53 +01:00
Jason Tedor 4b2052c683
Introduce index settings version (#34429)
This commit introduces settings version to index metadata. This value is
monotonically increasing and is updated on settings updates. This will
be useful in cross-cluster replication so that we can request settings
updates from the leader only when there is a settings update.
2018-10-16 06:22:20 -04:00
Daniel Mitterdorfer 92b2e1a209
Remove lenient boolean handling
With this commit we remove some leftovers from #26389 which cleaned up
lenient boolean handling.

Relates #26389
Relates #22298
Relates #34467
2018-10-16 06:30:00 +02:00
Jason Tedor 55dee53046
Do not update number of replicas on no indices (#34481)
Today when submitting an update settings request to update the number of
replicas with a wildcard that does not match any indices and allow no
indices is set to true, the request ends up being interpreted as
updating the number of replicas for all indices. That is, consider the
following sequence:

PUT /test-index
{
  "settings": {
    "index.number_of_replicas": 0
  }
}

PUT /non-existent-*/_settings?expand_wildcards=open&allow_no_indices=true
{
  "settings": {
    "index.number_of_replicas": 1
  }
}

GET /test-index/_settings

The latter will show that the number of replicas on test-index is now
one. This is surprising, and should be considered a bug.

The underlying problem here is treating no indices in the underlying
methods used to update the routing table and the metadata as meaning all
indices. This commit takes away this assumption. Tests that relied on
this behavior have been changed to no longer rely on this.

A test for this situation is added in UpdateNumberOfReplicasIT.
2018-10-15 19:49:58 -04:00
Nik Everett 23ece922c9
Core: Remove two methods from AbstractComponent (#34336)
This removes another two methods from `AbstractComponent`. One isn't
used at all and another is only used in a single class in watcher. I've
moved the method that watcher uses into the single class that uses it.
2018-10-15 16:05:14 -04:00
Nik Everett a6d1cc6ca9 Revert "Search: Fix spelling mistake in Javadoc (#34480)"
This reverts commit 4e1d7baed0.
2018-10-15 15:42:11 -04:00
fonxian 4e1d7baed0 Search: Fix spelling mistake in Javadoc (#34480)
"iff" -> "if".
2018-10-15 15:38:37 -04:00
Ryan Ernst 26f1d7fc94
Tests: Handle epoch date formatters edge cases (#34437)
This commit handles cases testing withLocale and withZone when the zone
and locale in question is the same as the special base case. This can
happen sometimes since the locale and zoneids are randomized.
2018-10-15 12:18:18 -07:00
Jim Ferenczi 67577fca56
Fix handling of empty keyword in terms aggregation (#34457)
Empty values on keyword fields are filtered by the `map` execution mode
of the `terms` aggregation. This commit restores them as valid buckets.

Closes #34434
2018-10-15 19:33:52 +01:00