15329 Commits

Author SHA1 Message Date
Marshall Bockrath-Vandegrift 1c773e235a Estimate HyperLogLog bias via k-NN regression
The implementation this commit replaces was almost k-NN regression with
k=2, but had two bugs: (a) it depends on the empirical raw estimates
being in strictly non-decreasing order for the binary search (which they
are not); and (b) it weights the biases positively with increased
distance from the corresponding raw estimate.

“HyperLogLog in Practice” leaves the choice of exact algorithm here
fairly vague, just noting: “We use k-nearest neighbor interpolation to
get the bias for a given raw estimate (for k = 6).”  The majority of
other open source HyperLogLog++ implementations appear to use k-NN
regression with uniform weights (and generally k = 6).  Uniform
weighting does decrease variance, but also introduces bias at the domain
extrema.  This problem, plus the use of the word “interpolation” in the
original paper, suggests (inverse) distance-weighted k-NN, as
implemented here.
2015-09-01 08:42:29 -04:00
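
The inverse-distance-weighted k-NN regression described in commit 1c773e235a can be sketched roughly as follows. This is an illustration in Python, not the code from the commit: the function name `estimate_bias`, its parameters, and the brute-force neighbor search are all assumptions made for the example, and the two input lists are assumed to be non-empty and of equal length.

```python
def estimate_bias(raw_estimate, raw_estimates, biases, k=6):
    """Inverse-distance-weighted k-NN estimate of the bias at raw_estimate.

    raw_estimates and biases are parallel lists of empirical points
    (hypothetical inputs; the real tables come from the HyperLogLog++ paper).
    """
    # Rank every empirical point by distance to the query. Unlike a binary
    # search, this does not assume the raw estimates are sorted.
    neighbors = sorted(
        zip(raw_estimates, biases),
        key=lambda point: abs(point[0] - raw_estimate),
    )[:k]

    weighted_sum = 0.0
    weight_total = 0.0
    for estimate, bias in neighbors:
        distance = abs(estimate - raw_estimate)
        if distance == 0.0:
            # Exact hit on an empirical point: use its bias directly.
            return bias
        # Closer neighbors get larger weights (inverse distance).
        weight = 1.0 / distance
        weighted_sum += weight * bias
        weight_total += weight
    return weighted_sum / weight_total
```

With uniform weights every neighbor would contribute equally; here a query that sits close to one empirical point is pulled toward that point's bias, which is the behavior the commit message argues for near the ends of the domain.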
Adrien Grand 7bc1acf956 Merge pull request #13239 from jpountz/upgrade/lucene-5.3.0
Upgrade to lucene-5.3.0.
2015-09-01 14:03:29 +02:00
Clinton Gormley 20921fcc3d Document transport.ping_schedule
Closes #13241
2015-09-01 13:43:17 +02:00
Britta Weber d386d909fc rename actions back to admin/* and add suffix [s] instead 2015-09-01 12:53:07 +02:00
Jason Tedor edac404f22 Merge pull request #13023 from jasontedor/enforce-maven-version
Enforce supported Maven versions
2015-09-01 06:37:47 -04:00
Britta Weber 05b48b904d set timeout for refresh and flush to default
Since #13068 refresh and flush requests go to the primary first and are then replicated.
One difference from before, though, is that if a shard is not available (INITIALIZING, for example),
we wait a little for an indexing request, but for refresh we don't and just give up immediately.
Before, refresh requests were just sent to the shards regardless of their state.

In tests we sometimes create an index, issue an indexing request, refresh and
then get the document. But we do not wait until all nodes know that all primaries have been assigned.
Now potentially one node can be one cluster state behind and not know yet that
the shards have been started. If the refresh is executed through this node then the
refresh request will silently fail on shards that are started already because from
the node's perspective they are still initializing. As a consequence, documents
that are expected to be available in the test are not.
Example test failures are here: http://build-us-00.elastic.co/job/elasticsearch-20-oracle-jdk7/395/

This commit changes the timeout to 1m (default) to make sure we don't miss shards
when we refresh. This will trigger the same retry mechanism as for indexing requests.
We still have to make a decision if this change of behavior is acceptable.

see #13238
2015-09-01 12:20:05 +02:00
Robert Muir 7caed74d5d Merge pull request #13232 from rmuir/nullcheck_policy
Add missing null check in ESPolicy.
2015-09-01 06:03:35 -04:00
Adrien Grand 5d9fb2e8a6 Upgrade to lucene-5.3.0.
From a user perspective, the main benefit from this upgrade is that the new
Lucene53Codec has disk-based norms. The elasticsearch directory has been fixed
to load these norms through mmap instead of nio.

Other changes include the removal of `max_thread_states`, the fact that
PhraseQuery and BooleanQuery are now immutable, and that deleted docs are now
applied on top of the Scorer API.

This change introduces a couple of `AwaitsFix`s but I don't think it should
hold us back from merging.
2015-09-01 11:58:45 +02:00
Isabel Drost-Fromm 8cd86a615a Adds template support to _msearch resource
Much like we already do with search this adds templating support to the _msearch resource.

Closes #10885
2015-09-01 11:54:43 +02:00
Clinton Gormley 1ee6ea9247 Docs: index.codec is static, not dynamic
The `index.codec` setting can only be set on a closed index, not dynamically
2015-09-01 11:49:42 +02:00
Adrien Grand f0b7fa2f31 Merge pull request #13060 from andrestc/enhancement/functionscore-unmapped
Make FunctionScore work on unmapped field with `missing` parameter
2015-09-01 11:05:30 +02:00
Simon Willnauer 7571276b84 Pass in relevant disk usage map for early termination 2015-09-01 10:35:56 +02:00
Ivannikov Kirill 2fe2c7fef8 Add listeners to postCreate etc 2015-09-01 12:45:40 +05:00
xuzha f46e66e7d0 Remove the experimental indices.fielddata.cache.expire
closes #10781
2015-09-01 00:40:04 -07:00
Britta Weber 333831c126 Merge pull request #13068 from brwe/broadcast_replication
Make refresh a replicated action
2015-09-01 09:21:54 +02:00
Robert Muir a58c5dba89 Add missing null check in ESPolicy.
This allows reducing privileges with doPrivileged to work,
otherwise it will fail with NPE.

In general, if some code wants to do that, let it. The null
check is needed, even though ProtectionDomain(CodeSource, PermissionCollection)
is more than a bit misleading: "the current Policy will not be consulted".

Additionally add a defensive check for location, since the docs
there are even more confusing: https://bugs.openjdk.java.net/browse/JDK-8129972

The jdk policy impl has both these checks.
2015-09-01 00:34:34 -04:00
Jason Tedor aea00a62f3 Merge pull request #13227 from jasontedor/immutable-lists-be-gone
Remove and forbid use of com.google.common.collect.ImmutableList
2015-08-31 15:29:35 -04:00
Martijn van Groningen 238b56dedf Merge pull request #13046 from jimhooker2002/issue-4665-clean
Turn DestructiveOperations into a Guice module.

To share the same instance between components inside a node.

Closes #4665
2015-08-31 21:22:55 +02:00
Martijn van Groningen 51d052c32a Merge pull request #13215 from martijnvg/tests/enable_mock_modules
Allow tests to override whether mock modules are used
2015-08-31 21:11:34 +02:00
Martijn van Groningen 30ffa9a61b test: Allow tests to override whether mock modules are used 2015-08-31 21:02:49 +02:00
Jason Tedor a8bace9f97 Remove and forbid final uses of ImmutableList 2015-08-31 14:35:23 -04:00
Jason Tedor b0af7a1426 Fix NettyTransport 2015-08-31 14:29:00 -04:00
Jason Tedor e39a3bae2c Merge branch 'master' into lists_are_simple 2015-08-31 14:07:00 -04:00
Nik Everett da16dcf527 [docs] Fix docs for position_increment_gap
Closes #13207
2015-08-31 14:05:55 -04:00
Britta Weber d81f426b68 Make refresh a replicated action
prerequisite to #9421
see also #12600
2015-08-31 19:44:00 +02:00
Martijn van Groningen 1230cb0278 Merge pull request #13222 from martijnvg/tests/external_tests_provide_transport_client_with_plugins
Provide the plugins to transport client communicating with the external cluster
2015-08-31 17:06:53 +02:00
Tanguy Leroux db7aecab4d update list of available os stats
os cpu information is no longer exposed through the nodes stats api
2015-08-31 17:03:45 +02:00
Martijn van Groningen 1b84cadb7b test: The transport client that interacts with the external cluster should be provided with a list of transport client plugins. 2015-08-31 16:58:03 +02:00
Nik Everett a3616e9aec Merge pull request #13221 from nik9000/force_virtualbox
Lock vagrant to virtualbox
2015-08-31 10:54:26 -04:00
Nik Everett fb561ce228 [docs] Lock vagrant to virtualbox 2015-08-31 10:52:09 -04:00
Nik Everett 23c1766cdc [packaging] Lock vagrant to virtualbox
Virtualbox is the default virtualization provider for vagrant but folks
override that from time to time. If they do then the build will fail because
the boxes used by the build don't usually support non-virtualbox providers.

Closes #13217
2015-08-31 10:45:46 -04:00
Simon Willnauer 66b78341e4 Add note about multi data path and disk threshold deciders
Prior to 2.0 we summed up the available space on all disks on a node
due to the RAID-0-like behavior. Now we no longer do this and use the
min & max disk space to make decisions.

Closes #13106
2015-08-31 16:23:54 +02:00
Britta Weber a7e240077d Merge pull request #13218 from brwe/resolve-index-default-impl
add default impl for resolveIndex()
2015-08-31 15:53:57 +02:00
Michael McCandless a49217949f Merge pull request #13199 from mikemccand/remove_merge_docs
Move expert segment merge settings documentation off site into javadocs.
2015-08-31 09:52:19 -04:00
Britta Weber 73785e075e add default impl for resolveIndex() 2015-08-31 15:48:32 +02:00
Tanguy Leroux dbbecce8f2 Sort thread pools by name in Nodes Stats 2015-08-31 14:30:43 +02:00
Clinton Gormley aa52c4f712 Docs: Fixed variations of spelling of buckets_path
Closes #13201
2015-08-31 13:47:40 +02:00
Martijn Laarman a80317c4b3 cmd /C needs to be quoted as a whole
To support spaces in both the command and its arguments, cmd needs
to be called like this:

cmd /C ""c:\a b\c.bat" "argument 1" "argument2""

ant was running

cmd /C "c:\a b\c.bat" "argument 1" "argument2"

which Windows preprocesses to

cmd /C c:\a b\c.bat" "argument 1" "argument2

Which would make it appear as though ant was not properly quoting (which
it did sort of).
2015-08-31 13:33:10 +02:00
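
The quoting rule from commit a80317c4b3 can be sketched in Python. Both helper names below are hypothetical, and `strip_outer_quotes` only models the preprocessing as the commit message describes it (first and last quote removed); it is not a full reproduction of how `cmd.exe` parses its arguments, and arguments that themselves contain quotes are not handled.

```python
def quote_for_cmd_c(command, args):
    """Build a `cmd /C` line with each part quoted and the whole wrapped again."""
    # Quote the command and every argument so embedded spaces survive.
    quoted_parts = " ".join('"{}"'.format(part) for part in [command, *args])
    # Wrap everything in one extra pair of quotes for cmd to strip.
    return 'cmd /C "{}"'.format(quoted_parts)


def strip_outer_quotes(text):
    """Model of the preprocessing described above: drop first and last quote."""
    if not text.startswith('"'):
        return text
    text = text[1:]
    last = text.rfind('"')
    if last == -1:
        return text
    return text[:last] + text[last + 1:]


line = quote_for_cmd_c(r"c:\a b\c.bat", ["argument 1", "argument2"])
print(line)
# What is left after stripping is the individually quoted command line.
print(strip_outer_quotes(line[len("cmd /C "):]))
```

Without the extra wrapping pair, the same stripping step eats the opening quote of the command and the closing quote of the last argument, which is the broken form shown in the commit message.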
Jason Tedor 6e2dc73023 Merge pull request #13205 from jasontedor/feature/13204
Convert upgrade action to broadcast by node
2015-08-31 06:02:08 -04:00
Jason Tedor d1223b7369 Convert upgrade action to broadcast by node
Several shard-level operations that previously broadcasted a request
per shard were converted to broadcast a request per node. This commit
converts upgrade action to this new model as well.

Closes #13204
2015-08-31 05:59:57 -04:00
Alexander Reelsen 856b040a0a Plugins: Replace HTTP urls with HTTPS
Switch to use HTTPS by default for all hardcoded plugin URLs.
If users want to install via HTTP they can still specify a HTTP
URL manually.

Closes #12748
2015-08-31 11:45:38 +02:00
Alexander Reelsen 00902207a6 Tests: Ensure binding on localhost host is consistently ipv4/v6
The current netty multiport tests bind on localhost and then try to connect
to 127.0.0.1, which may fail, if localhost is resolved to ipv6 by default.

This randomly chooses between 127.0.0.1, localhost and ::1 (if available) for
binding and then uses this throughout the test.
2015-08-31 10:56:42 +02:00
Simon Willnauer a17d7500d3 Take Shard data path into account in DiskThresholdDecider
The path that a shard is allocated on is not taken into account when
we decide to move a shard away from a node because it passed a watermark.
Even worse we potentially moved away (relocated) a shard that was not even
allocated on that disk but on another on the node in question. This commit
adds a ShardRouting -> dataPath mapping to ClusterInfo that makes it possible to
identify which disk the shards are allocated on.

Relates to #13106
2015-08-31 10:40:42 +02:00
Ryan Ernst c01b377ea8 Mappings: Fix numerous checks for equality and compatibility
The field type tests for mappings had a huge hole: check compatibility
was not tested directly at all! I had meant for this to happen in a
follow up after #8871, and was relying on existing mapping tests.
However, there were a number of issues.

This change reworks the fieldtype tests so they can check that all settable
properties on a field type work with checkCompatibility. It fixes a
handful of small bugs in various field types. In particular, analyzer
comparison was just wrong: it was comparing reference equality for
search analyzer instead of the analyzer name. There was also no check
for search quote analyzer.

closes #13112
2015-08-30 23:05:38 -07:00
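
The analyzer comparison bug mentioned in commit c01b377ea8 (reference equality instead of comparing analyzer names) can be illustrated with a small Python sketch. The `NamedAnalyzer` class and both comparison helpers here are hypothetical stand-ins written for this example; they are not the Elasticsearch classes or the code changed by the commit.

```python
from dataclasses import dataclass


@dataclass(eq=False)
class NamedAnalyzer:
    """Hypothetical stand-in for an analyzer identified by its name."""
    name: str


def same_analyzer_by_reference(a, b):
    # The buggy style of check: two separately built analyzers with the
    # same name are treated as different.
    return a is b


def same_analyzer_by_name(a, b):
    # The intended style of check: compare names, treating two missing
    # analyzers as equal and one missing analyzer as a mismatch.
    if a is None or b is None:
        return a is b
    return a.name == b.name


first = NamedAnalyzer("standard")
second = NamedAnalyzer("standard")
print(same_analyzer_by_reference(first, second))
print(same_analyzer_by_name(first, second))
```

Two mappings that configure the same analyzer are usually parsed into distinct objects, so a reference check can report a conflict where none exists.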
Ryan Ernst fc840407db Merge pull request #13055 from rjernst/tell_me_your_plugins
Plugins: Removed plugin.types
2015-08-30 16:40:49 -07:00
Michael McCandless 7ad2222ccc copy over merge docs as javadocs 2015-08-30 18:14:47 -04:00
Ryan Ernst 6295f8e795 Merge branch 'master' into tell_me_your_plugins 2015-08-30 14:20:54 -07:00
Ryan Ernst 2539b779c8 Merge pull request #13137 from rjernst/empty_doc_again
Fix doc parser to still pre/post process metadata fields on disabled type
2015-08-30 12:14:18 -07:00
Ivannikov Kirill 38805f3cbd Fix 13202 2015-08-30 23:56:34 +05:00
Jason Tedor aa26b66e96 Remove leftover debugging statement 2015-08-30 14:19:30 -04:00