10241 Commits

Author SHA1 Message Date
Adrien Grand d9515e9717 Tests: Fix more bad assumptions about routing in TransportTwoNodesSearchTests. 2014-11-05 09:15:43 +01:00
Adrien Grand fc84666756 Tests: Fix GroovyScriptTests to not depend on the way documents are routed to shards. 2014-11-04 20:12:12 +01:00
Adrien Grand dfeb12996b Gateway: Prefer recovering the state file that uses the latest format.
Currently MetaDataStateFormat loads the first available state file that has
the latest version. In case several files are available and some of them use
the new format while other ones use the legacy format, it should also prefer
the new format. This is typically useful when we upgrade the metadata when
recovering from the gateway: we might write the upgraded state with the new
format while the previous state used the legacy format, so we end up with
two files having the same version but using different formats.

Close #8343
2014-11-04 19:58:08 +01:00
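Note on dfeb12996b: the selection rule described in this message (highest version first, new format winning on ties) can be sketched as below. This is only an illustration, not the MetaDataStateFormat code: the `StateFile` record, `pick_state_file`, and the file names are hypothetical.

```python
from typing import NamedTuple, Optional, Sequence


class StateFile(NamedTuple):
    # Hypothetical record for illustration only.
    name: str
    version: int
    legacy: bool  # True if the file uses the legacy format


def pick_state_file(candidates: Sequence[StateFile]) -> Optional[StateFile]:
    """Return the highest-version file, preferring the new format on ties."""
    if not candidates:
        return None
    # Tuples compare element-wise: version first, then "not legacy"
    # (True sorts above False), so the new format wins among equal versions.
    return max(candidates, key=lambda f: (f.version, not f.legacy))


# Two files share version 7 but differ in format (made-up names).
files = [
    StateFile("global-7", 7, legacy=True),
    StateFile("global-7.st", 7, legacy=False),
    StateFile("global-6.st", 6, legacy=False),
]
chosen = pick_state_file(files)
```

Here `chosen` is the version 7 file in the new format, which is the upgrade case the message describes.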
Adrien Grand 6523cd9377 Tests: Fix SimpleQueryStringTests.testSimpleQueryString assumption that depends on how documents are routed. 2014-11-04 18:07:33 +01:00
Adrien Grand 181bd6e56a Tests: Temporarily ignore RoutingBackwardCompatibilityUponUpgradeTests. 2014-11-04 18:01:35 +01:00
Adrien Grand 3501e32dce Mappings: Generate dynamic mappings for empty strings.
This will help the exists/missing filters behave as expected in presence of
empty strings, as well as when using a default analyzer that would generate
tokens for an empty string (uncommon).

Close #8198
2014-11-04 17:15:48 +01:00
javanna ab0bee47c5 [TEST] assign a name to the transport client created within ExternalTestCluster
The transport client created within ExternalTestCluster needs a name that follows our naming convention otherwise the thread leak filter barfs when running tests against an external cluster. Used "transport_client_external_{n}" where n gets incremented every time a new external cluster gets created. Updated thread leak filters rules to ignore threads created by such transport client.
2014-11-04 17:08:03 +01:00
Adrien Grand 9ea25df649 Switch to murmurhash3 to route documents to shards.
We currently use the djb2 hash function in order to compute the shard a
document should go to. Unfortunately this hash function is not very
sophisticated and you can sometimes hit adversarial cases, such as numeric ids
on 33 shards.

Murmur3 generates hashes with a better distribution, which should avoid the
adversarial cases.

Here are some examples of how 100000 incremental ids are distributed to shards
using either djb2 or murmur3.

5 shards:
Murmur3: [19933, 19964, 19940, 20030, 20133]
DJB:     [20000, 20000, 20000, 20000, 20000]

3 shards:
Murmur3: [33185, 33347, 33468]
DJB:     [30100, 30000, 39900]

33 shards:
Murmur3: [2999, 3096, 2930, 2986, 3070, 3093, 3023, 3052, 3112, 2940, 3036, 2985, 3031, 3048, 3127, 2961, 2901, 3105, 3041, 3130, 3013, 3035, 3031, 3019, 3008, 3022, 3111, 3086, 3016, 2996, 3075, 2945, 2977]
DJB:     [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 900, 900, 900, 900, 1000, 1000, 10000, 10000, 10000, 10000, 9100, 9100, 9100, 9100, 9000, 9000, 0, 0, 0, 0, 0, 0]

Even if djb2 looks ideal in some cases (5 shards), the fact that the
distribution of its hashes has some patterns can raise issues with some shard
counts (eg. 3, or even worse 33).

Some tests have been modified because they relied on implementation details of
the routing hash function.

Close #7954
2014-11-04 16:32:42 +01:00
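Note on 9ea25df649: the djb2 skew reported in this message can be reproduced with a short sketch. It assumes the classic djb2 definition (seed 5381, multiply by 33, add each character) with Java-style 32-bit overflow, and ids that are the decimal strings "0" through "99999"; neither detail is spelled out in the commit message. Murmur3 is left out because the Python standard library does not ship it.

```python
from typing import List


def djb2(value: str) -> int:
    """Classic djb2 string hash, wrapped like a Java 32-bit signed int."""
    h = 5381
    for ch in value:
        h = (h * 33 + ord(ch)) & 0xFFFFFFFF  # keep the low 32 bits
    # Reinterpret the 32-bit pattern as a signed integer.
    return h - 0x100000000 if h >= 0x80000000 else h


def shard_counts(num_ids: int, num_shards: int) -> List[int]:
    """Count how many ids in 0..num_ids-1 land on each shard."""
    counts = [0] * num_shards
    for i in range(num_ids):
        # Python's % is a floor modulo, so the index is never negative.
        counts[djb2(str(i)) % num_shards] += 1
    return counts


three = shard_counts(100_000, 3)
thirty_three = shard_counts(100_000, 33)
empty_shards = sum(1 for c in thirty_three if c == 0)
```

Under these assumptions the sketch gives the same djb2 counts as the message for 3 and 33 shards, with 17 of the 33 shards left empty. The reason is visible in the loop: every step multiplies by 33, so the hash modulo 33 depends only on the last character of the id, shifted by an offset that comes from the 32-bit wraparound.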
Britta Weber 8ef6e7e7ec geo sort: remove unneeded code from geo distance builder
The if statements are unneeded and also wrong (second
else if can never be reached).

closes #8338
2014-11-04 16:26:42 +01:00
Britta Weber f9b7fe136a [TEST] add description of -Dvalidate.skip parameter 2014-11-04 16:20:01 +01:00
Clinton Gormley 5797682bd0 Update cluster.asciidoc - fix invalid asciidoc 2014-11-04 15:22:36 +01:00
Simon Willnauer 8163107be5 Catch NoSuchDirectoryException on consistency check - the directory might not be there anymore 2014-11-04 14:34:28 +01:00
Simon Willnauer 7a6fb892c9 [TEST] only assert consistency before closing 2014-11-04 14:34:28 +01:00
Clinton Gormley 60eaeb5052 Update cluster.asciidoc
Fixed asciidoc on cluster module page
2014-11-04 14:32:05 +01:00
Clinton Gormley b0e5fb7823 Update zen.asciidoc
Tidied up the "No master block" asciidoc
2014-11-04 14:27:22 +01:00
javanna ac2ee35c22 [TEST] move ClusterDiscoveryConfiguration to org.elasticsearch.test.discovery
ClusterDiscoveryConfiguration is part of the test infra and should get exported as part of the test jar. This is achieved by moving the class to org.elasticsearch.test.discovery

Closes #8337
2014-11-04 13:56:24 +01:00
Brian Murphy 792b25e857 [TEST] Fix the throttle test. 2014-11-04 12:36:52 +00:00
javanna 8997dba52f [TEST] move NettyTransport*Tests to org.elasticsearch.transport.netty package
NettyTransport*Tests were previously in org.elasticsearch.test.transport and ended up being exported with the test jar. org.elasticsearch.transport.netty should be a better place for them together with existing tests.
2014-11-04 13:21:37 +01:00
Boaz Leskes 4396e6b48e Test: ClusterServiceTests.testLocalNodeMasterListenerCallbacks should verify cluster state is applied
The test verifies the correct behavior of a listener but we only call the listener after publishing a new cluster state. Only checking on the publishing of the state introduces a race condition.
2014-11-04 12:38:45 +01:00
Boaz Leskes 1c66317443 Test: MinimumMasterNodesTests.testCanNotBringClusterDown didn't check for cluster health properly
Also reduced the number of nodes the test uses
2014-11-04 12:17:19 +01:00
Simon Willnauer 44e24d3916 [STORE] Remove special file handling from DistributorDirectory
This commit removes all special file handling from DistributorDirectory
that assigned certain files to the primary directory. This special handling
was added to ensure that files that are written more than once are essentially
overwritten. Yet this implementation is consistent all the time and doesn't need
this special handling for files that are written through this directory. Writes
to the underlying directory not going through the distributor directory are not
and have never been supported.

Note: this commit also fixes the problem of adding directories to the distributor
during restart where the primary can suddenly change and file mappings are by-passed.

Closes #8276
2014-11-04 11:31:18 +01:00
javanna f2d545c40e [TEST] exclude org.elasticsearch.test.test package from test-jar
The package was only excluded during test-jar sources generation but ended up in the actual jar.
2014-11-04 10:46:41 +01:00
Adrien Grand 3e50bce822 Tests: Do not index dummy documents in ExistsMissingTests.
This way we make sure that there is only one mapping for
_field_names.
2014-11-04 09:49:05 +01:00
Martijn van Groningen 4ddb0575b5 Discovery: Improve the lifecycle management of the join control thread in zen discovery.
Also added:
* Better exception handling in UnicastZenPing#ping and MulticastZenPing#ping
* In the join thread that runs the innerJoinCluster loop, remember the last known exception and throw that when assertions are enabled. We loop until inner join has completed and if exceptions are thrown continuously we should fail the test, because the exception shouldn't occur in production (at least not too often).
Applied feedback 3

Closes #8327
2014-11-04 09:45:03 +01:00
Martijn Laarman 82278bb7bc [Aggregations] Meta data support
This commit adds the ability to associate a bit of state with each
individual aggregation.

The aggregation response can be hard to stitch back together without
having a reference to the aggregation request. In many cases this is not
available, many json serializer frameworks cache types globally or have a
static deserialisation override mechanism. In these cases making the
original request available, if at all possible, would be a hack.

The old facets returned `_type` which was just enough metadata to know
what the originating facet type in the request was.

This PR takes `_type` one step further by introducing ANY arbitrary meta
data. This could be further <strike>ab</strike>used for instance by
generic/automated aggregations that include UI state (color information,
thresholds, user input states, etc) per aggregation.
2014-11-03 22:32:23 +01:00
Ryan Ernst 7ec31abbb7 Fix missing word in upgrade docs. 2014-11-03 11:44:41 -08:00
Robert Muir 3c720730c9 Internal: when corruption strikes, don't create exceptions with circular references
Closes #8331
2014-11-03 14:18:18 -05:00
Lee Hinman a6d7742cb5 Return 0 instead of -1 for unknown/non-exposed ramBytesUsed()
The accountable interface specifies that such values are illegal

Fixes #8239
2014-11-03 17:07:12 +01:00
Ryan Ernst 8aff3b6273 FunctionScore: RandomScoreFunction now accepts long, as well as strings.
closes #8267
closes #8311
2014-11-03 07:53:12 -08:00
Boaz Leskes f1f50ac423 Discovery: don't accept a dynamic update to min_master_nodes which is larger than current master node count
The discovery.zen.minimum_master_nodes setting can be updated dynamically. Setting it to a value higher than the current number of master nodes will cause the current master to step down. This is dangerous because if done by mistake (typo) there is no way to restore the settings (this requires an active master).

Closes #8321
2014-11-03 14:53:12 +01:00
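Note on f1f50ac423: the guard described in this message amounts to comparing the requested value with the number of master nodes currently present. A minimal sketch, assuming hypothetical names (`validate_min_master_nodes`, `try_update`) and an invented error message; the actual validation code is not shown in this log.

```python
def validate_min_master_nodes(requested: int, current_master_nodes: int) -> None:
    """Reject a dynamic update the current cluster could not satisfy."""
    if requested > current_master_nodes:
        raise ValueError(
            f"cannot set minimum_master_nodes to {requested}: "
            f"only {current_master_nodes} master nodes are currently present"
        )


def try_update(requested: int, current_master_nodes: int) -> bool:
    """Return True if the update would be accepted, False if rejected."""
    try:
        validate_min_master_nodes(requested, current_master_nodes)
        return True
    except ValueError:
        return False


accepted = try_update(2, 3)   # a sane value on a 3-master cluster
rejected = try_update(30, 3)  # a typo such as 30 instead of 3
```

Rejecting the update up front matters because, as the message notes, once the master has stepped down there is no active master left to apply a correction.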
Adrien Grand 2b639ae1b5 Geo: Fix IndexedGeoBoundingBoxFilter to not modify the bits of other filters.
Close #8325
2014-11-03 11:06:16 +01:00
printercu 695cd31678 Docs: Add elastics-rb to the list of community clients
Closes #8319
2014-11-02 13:55:21 +01:00
Alexander Reelsen c04fa43587 Docs: Convert markdown to asciidoc in transport profile docs 2014-11-02 08:25:45 +01:00
Aarni Koskela 6011a18381 Docs: Add mention of `hyphenation_patterns_path`
Refs ElasticSearch's HyphenationCompoundWordTokenFilterFactory.java.

Closes #8305
2014-11-01 15:47:53 +01:00
Alexander Reelsen f50deecf12 Tests: Stop measuring request time in HTTP pipelining tests
This destabilizes tests on virtualized hardware. Functionality
testing is sufficient here. Performance tests should be conducted
elsewhere.
2014-11-01 09:03:59 +01:00
Alex Leonhardt 443c98477f Packaging: Export JAVA_HOME in RPM init script
Closes #5433
Closes #5434
2014-11-01 08:38:46 +01:00
Ryan Ernst 02debfd127 Tests: Remove accidentally added bwc behavior for auto choosing a
version.

An early version of #7966 had the ability to choose a bwc version
automatically, but this was removed before the change was committed.
However, the change was not removed from the ongoing work in #7922
and it made it in unknowingly.
2014-10-31 19:09:38 -07:00
Martijn van Groningen 7761154e83 Core: Allow to configure custom thread pools
Closes #8247
2014-10-31 23:32:09 +01:00
Ryan Ernst 2ebf34b93e Tests: Move logSegmentsState to shared location, and remove no longer
needed verbose logging from upgrade test.
2014-10-31 15:04:05 -07:00
Mathias Fussenegger b4cad96597 Search: Reduce memory usage during fetch source sub phase.
If includes or excludes are set
XContentFactory.xcontentBuilder() allocates a new
BytesStreamOutput using the default page size which is 16kb.

Can be optimized to use the length of the sourceRef because
that is the maximum possible size that the streamOutput will
use.

This reduces the amount of memory allocated for a request
that is fetching 200.000 small documents (~150 bytes each)
by about 300 MB

Close #8138
2014-10-31 18:32:19 +01:00
Alexander Reelsen 5eeac2fdf6 Netty: Add HTTP pipelining support
This adds HTTP pipelining support to netty. Previously pipelining was not
supported due to the asynchronous nature of elasticsearch. Whichever request
Elasticsearch finished first was returned as the first response,
regardless of the correct order.

The solution to this problem is to add a handler to the netty pipeline
that maintains an ordered list and thus orders the responses before
returning them to the client. This means, we will always have some state
on the server side and also requires some memory in order to keep the
responses there.

Pipelining is enabled by default, but can be configured by setting the
http.pipelining property to true|false. In addition the maximum size of
the event queue can be configured.

The initial netty handler is copied from this repo
https://github.com/typesafehub/netty-http-pipelining

Closes #2665
2014-10-31 16:30:11 +01:00
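Note on 5eeac2fdf6: the ordering handler described in this message can be pictured as a small reorder buffer. The sketch below is not the netty handler itself; `ResponseReorderBuffer`, its sequence numbers, and the `max_held` limit of 10000 are assumptions made for illustration.

```python
import heapq
from typing import List, Tuple


class ResponseReorderBuffer:
    """Hold responses that finish early until all earlier ones are written."""

    def __init__(self, max_held: int = 10000):
        self._next_seq = 0                       # next sequence number to write
        self._held: List[Tuple[int, str]] = []   # min-heap keyed by sequence
        self._max_held = max_held                # illustrative bound on server-side state

    def on_response(self, seq: int, body: str) -> List[str]:
        """Accept one finished response; return the bodies now safe to write."""
        heapq.heappush(self._held, (seq, body))
        ready = []
        # Drain every response that is next in request order.
        while self._held and self._held[0][0] == self._next_seq:
            ready.append(heapq.heappop(self._held)[1])
            self._next_seq += 1
        if len(self._held) > self._max_held:
            raise OverflowError("too many out-of-order responses held")
        return ready


buf = ResponseReorderBuffer()
written = []
# Responses complete in the order 1, 2, 0, 3.
for seq, body in [(1, "b"), (2, "c"), (0, "a"), (3, "d")]:
    written.append(buf.on_response(seq, body))
```

Nothing is written until response 0 arrives, at which point "a", "b" and "c" are released together. The held list is the server-side state and memory cost the message refers to, which is why a bound on its size is useful.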
Clinton Gormley e56d85439c Update search-template.asciidoc
Clarified using the conditional clause template example as a string
2014-10-31 15:32:14 +01:00
Oliver Eilhard a239935f90 Docs: Add elastic client for Google Go.
Add elastic, an Elasticsearch client for Google Go.

Closes #8302
2014-10-31 14:42:27 +01:00
Clinton Gormley 2569188d25 Update search-template.asciidoc
Fixed asciidoc typo

Closes #8308
2014-10-31 14:40:32 +01:00
astefan 4049154dbc Docs: Document action.replication_type setting
Document action.replication_type setting

Closes #8290
2014-10-31 13:53:34 +01:00
David Pilato 5bd720b259 Logs: Change log level for mpercolate
When using _mpercolate API we log by default a lot of DEBUG `Percolate shard response`.
They should be in TRACE level instead of DEBUG.
2014-10-31 13:22:56 +01:00
Lee Hinman 42b6e01a37 Use a 1024 byte minimum weight for filter cache entries
This changes the weighing function for the filter cache to use a
configurable minimum weight for each filter cached. This value defaults
to 1kb and can be configured with the
`indices.cache.filter.minimum_entry_weight` setting.

This also fixes an issue with the filter cache where the concurrency
level of the cache was exposed as a setting, but not used in cache
construction.

Relates to #8268
2014-10-31 12:54:41 +01:00
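Note on 42b6e01a37: a weighing function with a minimum weight can be sketched as a simple floor. The function names below are hypothetical, and the byte sizes are made up; only the 1024 byte default comes from the message.

```python
DEFAULT_MINIMUM_ENTRY_WEIGHT = 1024  # bytes, the default stated in the commit message


def entry_weight(size_in_bytes: int,
                 minimum_entry_weight: int = DEFAULT_MINIMUM_ENTRY_WEIGHT) -> int:
    """Weight charged to the cache for one entry: its size, but never below the floor."""
    return max(size_in_bytes, minimum_entry_weight)


def max_entries(cache_budget_bytes: int,
                minimum_entry_weight: int = DEFAULT_MINIMUM_ENTRY_WEIGHT) -> int:
    """Upper bound on the number of entries a byte budget can hold with the floor applied."""
    return cache_budget_bytes // minimum_entry_weight


tiny = entry_weight(16)       # a very small filter is charged the minimum
large = entry_weight(4096)    # a larger filter is charged its real size
bound = max_entries(10 * 1024 * 1024)
```

With the floor in place, a 10 MB budget can hold at most 10240 entries, however small the individual filters are.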
Lee Hinman 4ac7b02ce7 Reroute shards automatically when high disk watermark is exceeded
This adds a Listener interface to the ClusterInfoService, this is used
by the DiskThresholdDecider, which adds a listener to check for nodes
passing the high watermark. If a node is past the high watermark an
empty reroute is issued so shards can be reallocated if desired.

A reroute will only be issued once every
`cluster.routing.allocation.disk.reroute_interval`, which is "60s" by
default.

Refactors InternalClusterInfoService to delegate the nodes stats and
indices stats gathering into separate methods so they can be overridden
by extending classes. Each stat gathering method returns a
CountDownLatch that can be used to wait until processing for that part
is successful before calling the listeners.

Fixes #8146
2014-10-31 11:58:22 +01:00
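Note on 4ac7b02ce7: the listener behaviour described in this message (issue a reroute when a node passes the high watermark, at most once per interval) can be sketched as follows. `RerouteThrottle`, the usage fractions, and the 0.90 watermark are illustrative assumptions; only the "60s" default interval comes from the message.

```python
import time
from typing import Callable, Dict, Optional


class RerouteThrottle:
    """Decide whether new disk usage info should trigger a reroute."""

    def __init__(self, reroute_interval_s: float = 60.0,
                 clock: Callable[[], float] = time.monotonic):
        self._interval = reroute_interval_s
        self._clock = clock
        self._last_reroute: Optional[float] = None

    def on_new_info(self, usage_by_node: Dict[str, float],
                    high_watermark: float) -> bool:
        """Return True if a reroute should be issued for this update."""
        over = any(used >= high_watermark for used in usage_by_node.values())
        if not over:
            return False
        now = self._clock()
        if self._last_reroute is None or now - self._last_reroute >= self._interval:
            self._last_reroute = now
            return True
        return False  # a reroute was issued too recently


# Drive the throttle with a fake clock so the example is deterministic.
fake_now = [0.0]
throttle = RerouteThrottle(60.0, clock=lambda: fake_now[0])
first = throttle.on_new_info({"node1": 0.95}, 0.90)
fake_now[0] = 30.0
second = throttle.on_new_info({"node1": 0.95}, 0.90)
fake_now[0] = 61.0
third = throttle.on_new_info({"node1": 0.95}, 0.90)
fourth = throttle.on_new_info({"node1": 0.50}, 0.90)
```

The first and third updates trigger a reroute; the second is suppressed by the interval, and the fourth is ignored because no node is above the watermark.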
Martijn van Groningen 1645434af5 Forgot to cut over the child filter in nested filter to use fixed bitset cache. 2014-10-31 11:00:32 +01:00
Simon Willnauer f6b37a31c7 [STORE] Cut over MetaDataStateFormat to NIO Path API
This class already uses Path most of the time since it
uses ATOMIC_MOVE. This commit makes it a bit more consistent.
2014-10-30 18:03:35 +01:00