Commit Graph

5488 Commits

Author SHA1 Message Date
Simon Willnauer 76cc8c3549 Only pull searcher once during completion stats
The inner call to the completion stats pulled a second searcher
that never got released causing the underlying readers to never
get closed unless the node is shut down. This was triggered
with literally each stats call including the completion stats
even if no completion service was used on the index.

Closes #3652
2013-09-09 17:50:41 +02:00
Lee Hinman 7d52d58747 Add AllocationDecider that takes free disk space into account
This commit adds two main pieces, the first is a ClusterInfoService
that provides a service running on the master nodes that fetches the
total/free bytes for each data node in the cluster as well as the
sizes of all shards in the cluster. This information is gathered by
default every 30 seconds, and can be changed dynamically by setting
the `cluster.info.update.interval` setting. This ClusterInfoService
can hopefully be used in the future to weight nodes for allocation
based on their disk usage, if desired.

The second main piece is the DiskThresholdDecider, which can disallow
a shard from being allocated to a node, or from remaining on the node
depending on configuration parameters. There are three main
configuration parameters for the DiskThresholdDecider:

`cluster.routing.allocation.disk.threshold_enabled` controls whether
the decider is enabled. It defaults to false (disabled). Note that the
decider is also disabled for clusters with only a single data node.

`cluster.routing.allocation.disk.watermark.low` controls the low
watermark for disk usage. It defaults to 0.70, meaning ES will not
allocate new shards to nodes once they have more than 70% disk
used. It can also be set to an absolute byte value (like 500mb) to
prevent ES from allocating shards if less than the configured amount
of space is available.

`cluster.routing.allocation.disk.watermark.high` controls the high
watermark. It defaults to 0.85, meaning ES will attempt to relocate
shards to another node if the node disk usage rises above 85%. It can
also be set to an absolute byte value (similar to the low watermark)
to relocate shards once less than the configured amount of space is
available on the node.

Closes #3480
2013-09-09 09:49:30 -06:00
Luca Cavanna 563111f0f9 Fixed typo: renamed test wamer package to warmer 2013-09-09 14:24:21 +02:00
Luca Cavanna 8ad583b35e Fixed typo: renamed test.listerners package to test.listener 2013-09-09 14:24:21 +02:00
Luca Cavanna 5fdae8ba10 Restored log lines to TRACE for notification post alias creation and open/close index
Introduced specific log level for those in OpenCloseIndexTests#testCloseOpenAliasMultipleIndices
2013-09-09 14:24:21 +02:00
Luca Cavanna 9e72683ba4 Added @TestLogging annotation to set a specific level per test method
It supports multiple logger:level comma separated key value pairs
 Use the _root keyword to set the root logger level
 e.g. @Logging("_root:DEBUG,org.elasticsearch.cluster.metadata:TRACE")
 or just @TestLogging("_root:DEBUG,cluster.metadata:TRACE") since we start the test with -Des.logger.prefix=
2013-09-09 14:24:21 +02:00
Simon Willnauer 453e7c1510 Use same index name for indexing that is used for creating the index 2013-09-09 13:01:53 +02:00
Simon Willnauer 732e38b8c7 Throw IAE if reserved completion suggester chars are used in input
The completion suggester reserves 0x00 and 0xFF as for internal use.
If those chars are used in the input string an IAE is thrown and the
input is rejected.

Closes #3648
2013-09-09 11:57:01 +02:00
Simon Willnauer d7b3ed7e8b Only use one document to test empty shards 2013-09-09 10:53:53 +02:00
Simon Willnauer a5bf8824d2 Remove createMapped and mapping helpers from AbstractSharedClusterTest
our tests should use the API we have in favor of hard to read / understand
test helpers.
2013-09-09 09:29:22 +02:00
Nik Everett da4c58d853 Beautify SuggestSearchTests.
SuggestSearchTests had tons of duplicate code and didn't use all of the
fancy new integration test helper method.  I've removed a ton of duplicate
code and used as many of the nice test helper method I could think of.

Closes #3611
2013-09-09 09:16:47 +02:00
Shay Banon 3c5dd43928 fix logging level 2013-09-08 00:32:44 +02:00
Martijn van Groningen 86d69bb41c field rename 2013-09-06 21:35:53 +02:00
Martijn van Groningen b01dececac Removed matched_filters in favor for matched_queries.
Closes #3644
2013-09-06 21:34:08 +02:00
Martijn van Groningen 29ea6b3071 Deprecate the method names `namedFilter` in favor of `namedQueries`.
The reason behind this is that the `_name` support has been extended to queries as well and the name `namedQueries` suggests better that it is applicable to the whole query dsl.

Relates to #3581
2013-09-06 21:15:22 +02:00
Shay Banon 2767c081cd Allow to control the number of processors sizes are based on
Sometimes, one wants to just control the number of processors our different size base calculations for thread pools and network workers are based on, and not use the reported available processor by the OS. Add processors setting, where it can be controlled.

closes #3643
2013-09-06 19:04:15 +02:00
Adrien Grand c9a7bb26ba Remember to think about filter caching when upgrading to Lucene 4.5. 2013-09-06 18:23:16 +02:00
Simon Willnauer 8203d4dbcf fix potential blocking of NettyTransport connect and disconnect methods
Currently, in NettyTransport the locks for connecting and disconnecting channels are stored in a ConcurrentMap that has 500 entries. A tread can acquire a lock for an id and the lock returned is chosen on a hash function computed from the id.
Unfortunately, a collision of two ids can cause a deadlock as follows:

Scenario: one master (no data), one datanode (only data node)

DiscoveryNode id of master is X
DiscoveryNode id of datanode is Y

Both X and Y cause the same lock to be returned by NettyTransport#connectLock()

Both are up and running, all is fine until master stops.

Thread 1: The master fault detection of the datanode is notified (onNodeDisconnected()), which in turn leads the node to try and reconnect to master via the callstack titled "Thread 1" below.

-> connectToNode() is called and lock for X is acquired. The method waits for 45s for the cannels to reconnect.

Furthermore, Thread 1 holds the NettyTransport#masterNodeMutex.

Thread 2: The connection fails with an exception (connection refused, see callstack below), because the master shut down already. The exception is handled in NettyTransport#exceptionCaught which calls NettyTransport#disconnectFromNodeChannel. This method acquires the lock for Y (see Thread 2 below).

Now, if Y and X have two different locks, this would get the lock, disconnect the channels and notify thread 1. But since X and Y have the same locks, thread 2 is deadlocked with thread 1 which waits for 45s.

In this time, no thread can acquire the masterNodeMutex (held by thread 1), so the node can, for example, not stop.

This commit introduces a mechanism that assures unique locks for unique ids. This lock is not reentrant and therfore assures that threads can not end up in an infinite recursion (see Thread 3 below for an example on how a thread can aquire a lock twice).
While this is not a problem right now, it is potentially dangerous to have it that way, because the callstacks are complex as is and slight changes might cause unecpected recursions.

Thread 1
----

	owns: Object  (id=114)
	owns: Object  (id=118)
	waiting for: DefaultChannelFuture  (id=140)
	Object.wait(long) line: not available [native method]
	DefaultChannelFuture(Object).wait(long, int) line: 461
	DefaultChannelFuture.await0(long, boolean) line: 311
	DefaultChannelFuture.awaitUninterruptibly(long) line: 285
	NettyTransport.connectToChannels(NettyTransport$NodeChannels, DiscoveryNode) line: 672
	NettyTransport.connectToNode(DiscoveryNode, boolean) line: 609
	NettyTransport.connectToNode(DiscoveryNode) line: 579
	TransportService.connectToNode(DiscoveryNode) line: 129
	MasterFaultDetection.handleTransportDisconnect(DiscoveryNode) line: 195
	MasterFaultDetection.access$0(MasterFaultDetection, DiscoveryNode) line: 188
	MasterFaultDetection$FDConnectionListener.onNodeDisconnected(DiscoveryNode) line: 245
	TransportService$Adapter$2.run() line: 298
	EsThreadPoolExecutor(ThreadPoolExecutor).runWorker(ThreadPoolExecutor$Worker) line: 1145
	ThreadPoolExecutor$Worker.run() line: 615
	Thread.run() line: 724

Thread 2
-------

	waiting for: Object  (id=114)
	NettyTransport.disconnectFromNodeChannel(Channel, Throwable) line: 790
	NettyTransport.exceptionCaught(ChannelHandlerContext, ExceptionEvent) line: 495
	MessageChannelHandler.exceptionCaught(ChannelHandlerContext, ExceptionEvent) line: 228
	MessageChannelHandler(SimpleChannelUpstreamHandler).handleUpstream(ChannelHandlerContext, ChannelEvent) line: 112
	DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline$DefaultChannelHandlerContext, ChannelEvent) line: 564
	DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(ChannelEvent) line: 791
	SizeHeaderFrameDecoder(FrameDecoder).exceptionCaught(ChannelHandlerContext, ExceptionEvent) line: 377
	SizeHeaderFrameDecoder(SimpleChannelUpstreamHandler).handleUpstream(ChannelHandlerContext, ChannelEvent) line: 112
	DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline$DefaultChannelHandlerContext, ChannelEvent) line: 564
	DefaultChannelPipeline.sendUpstream(ChannelEvent) line: 559
	Channels.fireExceptionCaught(Channel, Throwable) line: 525
	NioClientBoss.processSelectedKeys(Set<SelectionKey>) line: 110
	NioClientBoss.process(Selector) line: 79
	NioClientBoss(AbstractNioSelector).run() line: 312
	NioClientBoss.run() line: 42
	ThreadRenamingRunnable.run() line: 108
	DeadLockProofWorker$1.run() line: 42
	ThreadPoolExecutor.runWorker(ThreadPoolExecutor$Worker) line: 1145
	ThreadPoolExecutor$Worker.run() line: 615
	Thread.run() line: 724

Thread 3
---------

        org.elasticsearch.transport.netty.NettyTransport.disconnectFromNode(NettyTransport.java:772)
        org.elasticsearch.transport.netty.NettyTransport.access$1200(NettyTransport.java:92)
        org.elasticsearch.transport.netty.NettyTransport$ChannelCloseListener.operationComplete(NettyTransport.java:830)
        org.jboss.netty.channel.DefaultChannelFuture.notifyListener(DefaultChannelFuture.java:427)
        org.jboss.netty.channel.DefaultChannelFuture.notifyListeners(DefaultChannelFuture.java:418)
        org.jboss.netty.channel.DefaultChannelFuture.setSuccess(DefaultChannelFuture.java:362)
        org.jboss.netty.channel.AbstractChannel$ChannelCloseFuture.setClosed(AbstractChannel.java:355)
        org.jboss.netty.channel.AbstractChannel.setClosed(AbstractChannel.java:185)
        org.jboss.netty.channel.socket.nio.AbstractNioChannel.setClosed(AbstractNioChannel.java:197)
        org.jboss.netty.channel.socket.nio.NioSocketChannel.setClosed(NioSocketChannel.java:84)
        org.jboss.netty.channel.socket.nio.AbstractNioWorker.close(AbstractNioWorker.java:357)
        org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink.eventSunk(NioClientSocketPipelineSink.java:58)
        org.jboss.netty.channel.Channels.close(Channels.java:812)
        org.jboss.netty.channel.AbstractChannel.close(AbstractChannel.java:197)
        org.elasticsearch.transport.netty.NettyTransport$NodeChannels.closeChannelsAndWait(NettyTransport.java:892)
        org.elasticsearch.transport.netty.NettyTransport$NodeChannels.close(NettyTransport.java:879)
        org.elasticsearch.transport.netty.NettyTransport.disconnectFromNode(NettyTransport.java:778)
        org.elasticsearch.transport.netty.NettyTransport.access$1200(NettyTransport.java:92)
        org.elasticsearch.transport.netty.NettyTransport$ChannelCloseListener.operationComplete(NettyTransport.java:830)
        org.jboss.netty.channel.DefaultChannelFuture.notifyListener(DefaultChannelFuture.java:427)
        org.jboss.netty.channel.DefaultChannelFuture.notifyListeners(DefaultChannelFuture.java:418)
        org.jboss.netty.channel.DefaultChannelFuture.setSuccess(DefaultChannelFuture.java:362)
        org.jboss.netty.channel.AbstractChannel$ChannelCloseFuture.setClosed(AbstractChannel.java:355)
        org.jboss.netty.channel.AbstractChannel.setClosed(AbstractChannel.java:185)
        org.jboss.netty.channel.socket.nio.AbstractNioChannel.setClosed(AbstractNioChannel.java:197)
        org.jboss.netty.channel.socket.nio.NioSocketChannel.setClosed(NioSocketChannel.java:84)
        org.jboss.netty.channel.socket.nio.AbstractNioWorker.close(AbstractNioWorker.java:357)
        org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:93)
        org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:109)
        org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:312)
        org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:90)
        org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
        java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        java.lang.Thread.run(Thread.java:724)
2013-09-06 17:56:18 +02:00
Shay Banon 4155741f7f BytesStreamOutput default size should be 2k instead of 32k
We changed the default of BytesStreamOutput (used in various places in ES) to 32k from 1k with the assumption that most stream tend to be large. This doesn't hold for example when indexing small documents and adding them using XContentBuilder (which will have a large overhead).

Default the buffer size to 2k now, but be relatively aggressive in expanding the buffer when below 256k (double it), and just use oversize (1/8th) when larger to try and minimize garbage and buffer copies.

relates to #3624
closes #3638
2013-09-06 02:25:17 +02:00
Shay Banon 3e92b1d6b8 Not allowing index names in request body for multi-get/search/bulk when indices are already given in url
closes #3636
2013-09-06 01:46:24 +02:00
Shay Banon 95b894e6f6 On Solaris, default LZF compress type (for transport) can cause segfault
closes #3634
2013-09-06 00:41:14 +02:00
Shay Banon b3d51df493 change to trace level logging 2013-09-05 20:30:22 +02:00
Shay Banon cebd3ca9c3 improve setting response/failure on nodes actions
we can use the index in the node ids list as the index for the array when we set the response or the exception, removing the need for an index AtomicInteger
2013-09-05 17:59:52 +02:00
Simon Willnauer f968ca317a Use lucene test framework if low level lucene parts are involved in tests.
Test should never write outside of the test directories.
2013-09-05 17:21:59 +02:00
Simon Willnauer 1b756ba23a Revert "Add FastVectorHighlighter support for more complex queries."
This reverts commit e943cc81a5.

The more complex queries support causes StackOverflowErrors that
can influence the cluster performance and stability dramatically.
This commit backs out this change to reduce the risk for deep
stacks.

Reverts #3357
2013-09-05 17:19:28 +02:00
Shay Banon 623e340d4f ignore rejected exception when shutting down in cluster service 2013-09-05 16:52:57 +02:00
Luca Cavanna 81d70b1ef8 Added tests for count api
testDateRangeInQueryString awaits fix (#3625)
2013-09-05 16:01:09 +02:00
Boaz Leskes c6ac5ac433 Make the acceptable compression overhead used by MultiOrdinals configurable and default to PackedInts.FASTEST (causing it to byte align).
Closes #3623

Before this commit , this was the output of TermsFacetSearchBenchmark, on my MacBookAir:

```
------------------ SUMMARY -------------------------------
                     name      took    millis
                  terms_s      7.3s        36
              terms_map_s     28.8s       144
                  terms_l     15.9s        79
              terms_map_l     15.5s        77
                 terms_sm        1m       319
             terms_map_sm      4.9m      1491
                 terms_lm      2.7m       825
             terms_map_lm      2.7m       829
          terms_stats_s_l     37.6s       188
         terms_stats_s_lm      2.4m       722
         terms_stats_sm_l      6.5m      1958
------------------ SUMMARY -------------------------------
```

After the change to FASTEST, we have:

```
------------------ SUMMARY -------------------------------
                     name      took    millis
                  terms_s      6.9s        34
              terms_map_s     28.8s       144
                  terms_l     17.4s        87
              terms_map_l     17.6s        88
                 terms_sm       42s       210
             terms_map_sm      4.2m      1287
                 terms_lm      2.3m       714
             terms_map_lm      2.3m       716
          terms_stats_s_l     37.5s       187
         terms_stats_s_lm      1.6m       482
         terms_stats_sm_l      6.1m      1852
------------------ SUMMARY -------------------------------
```
2013-09-05 15:05:37 +02:00
Boaz Leskes e33107d493 TransportSearchTypeAction logs used shards under TRACE level 2013-09-05 13:34:12 +02:00
Boaz Leskes 1de7b1a70c Minor cleanup to logging messages. 2013-09-05 12:42:38 +02:00
Boaz Leskes 1262ba4912 Removed port range option from discovery.zen.ping.unicast.hosts example. 2013-09-05 12:37:33 +02:00
Clinton Gormley 9e6d30a14a [DOCS] Changed the deprecation of custom_boost/score/filters_score
queries to 0.90.4
2013-09-05 12:14:10 +02:00
Clinton Gormley 2b3a762c27 [DOCS] Function score was added in 0.90.4 not 1.00.Beta 2013-09-05 11:25:06 +02:00
Simon Willnauer 764a519627 Return empty completion stats if engine throws an exception
If the index shard is not started yet the underlying engine can
throw an exception since not everything is initialized. This can
happen if a shard is just about starting up or recovering / initalizing
and stats are requested.

Closes #3619
2013-09-05 10:10:40 +02:00
Clinton Gormley 8257aba166 [DOCS] Fixed fielddata regex syntax 2013-09-04 23:20:56 +02:00
Clinton Gormley 6d667e5d41 [DOCS] Missing sort values now works for all field types 2013-09-04 23:20:55 +02:00
Clinton Gormley 765bd026f5 [DOCS] Added function score query 2013-09-04 23:20:55 +02:00
Clinton Gormley aa59ef2e84 [DOCS] Added the human flag 2013-09-04 23:20:55 +02:00
Clinton Gormley 9d0dd545cb [DOCS] Tidied up the plugins page and added Graphite and Statsd 2013-09-04 23:20:55 +02:00
Clinton Gormley e1c6f45ff0 [DOCS] Added clarification about global scope in facets 2013-09-04 23:20:55 +02:00
Clinton Gormley 08f8e77b8f [DOCS] Added fuzzy options to completion suggester 2013-09-04 23:20:55 +02:00
Clinton Gormley 047c86e3b2 [DOCS] Added wildcard template matching 2013-09-04 23:20:55 +02:00
Clinton Gormley 9f5d0b6e89 [DOCS] Added a few clarifications to the docs from the issues list 2013-09-04 23:20:55 +02:00
Clinton Gormley 94be785726 [DOCS] Added multi-index open/close 2013-09-04 23:20:55 +02:00
Clinton Gormley 6568dae12c [DOCS] Added "elastics" client for javascript 2013-09-04 23:20:55 +02:00
Clinton Gormley 5b60506b2e [DOCS] Added highlighting to the phrase suggester 2013-09-04 23:20:54 +02:00
Clinton Gormley 53ad7330fc [DOCS] Added docs for term vectors 2013-09-04 23:20:54 +02:00
Clinton Gormley eac2b3a52e [DOCS] Fixed typo 2013-09-04 23:20:54 +02:00
Simon Willnauer e68303d6a6 Allow test output to be configurable via CMD 2013-09-04 12:46:43 +02:00
Boaz Leskes f93efc3605 Added cluster state version to the debug logging of shards instances used in search. 2013-09-04 11:08:00 +02:00