Commit Graph

5333 Commits

Author SHA1 Message Date
Boaz Leskes dbdef00a88 Added a needed ensureYellow to UpdateMappingTests.updateMappingConcurrently
Also:
more cluster state related debug logging.
get cluster state api didn't populate cluster state version in response (was always 0)
added logging when testing ends and before indices cleanup start.
2013-08-15 20:27:02 +02:00
Martijn van Groningen c43d0d1746 Added infrastructure to figure out the number of unique values for a field on the atomic level and the highest number of atomic field values for all segments. This can be used as a heuristic for initializing data structures.
Also moved the load method from concrete classes to AbstractIndexFieldData class.
2013-08-15 19:24:02 +02:00
Simon Willnauer 1e7c0d69ff Add test method information to log output
Often it's hard to tell which test methods were already executed
when a particular test fails. This commit adds a log output when a
new test is started to better differentiate which log output belongs
to which test.

This commit also moves the reproduce line to logging output to gain
timestamps for the failure.
2013-08-15 14:57:00 +02:00
Simon Willnauer 43fcc55625 Remove 'concrete_bytes' fielddata impl from tests
We don't have this implementation anymore; tests will just fall back
to the default and issue a warning.
2013-08-15 14:57:00 +02:00
Boaz Leskes 9869427ef6 Allow to configure root logging level using system properties. Ex. -Des.logger.level=DEBUG . Defaults to INFO as before. 2013-08-15 14:36:00 +02:00
Alexander Reelsen f644ae5550 CompletionSuggester cleanups
* Fuzzy Suggester parameter names are now easier to understand
  * non_prefix_length became prefix_length
  * min_prefix_length became min_length
* Instead of specifying search_analyzer and index_analyzer, using analyzer for both is supported
* CompletionSuggester now uses the CharsRef spare instead of calling toString() too often
2013-08-15 13:29:09 +02:00
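The two parameter renames listed in commit f644ae5550 above could be applied to existing suggester option maps with a small helper. This is a hypothetical Python migration sketch, not code from Elasticsearch: `migrate_fuzzy_options` is an illustrative name, the option maps are assumed to be plain dictionaries, and the sample values are made up.

```python
# Old fuzzy suggester option names mapped to the new names from the commit message.
RENAMED_OPTIONS = {
    "non_prefix_length": "prefix_length",
    "min_prefix_length": "min_length",
}


def migrate_fuzzy_options(options):
    """Return a copy of `options` with the old parameter names replaced."""
    return {RENAMED_OPTIONS.get(name, name): value for name, value in options.items()}


if __name__ == "__main__":
    # Hypothetical option values, only for illustration.
    old = {"non_prefix_length": 1, "min_prefix_length": 3, "edit_distance": 2}
    print(migrate_fuzzy_options(old))
```

Options that were not renamed pass through unchanged, so the helper can be run over maps that already use the new names.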
Martijn van Groningen bfac2f575e Always catch exceptions from TransportBroadcastOperationAction#newResponse (reduce phase) 2013-08-15 13:15:52 +02:00
Martijn van Groningen 174707061c Moved the reduce logic to the percolator type. 2013-08-15 13:14:34 +02:00
Simon Willnauer 27b973830d Use ClusterService#localNode instead of checking the cluster state.
The ClusterState can hold an 'invalid' local 'DiscoveryNode' during
node startup and rare race conditions can cause NPEs if an 'invalid'
'DiscoveryNode' is serialized.

Closes #3515
2013-08-15 11:41:54 +02:00
Martijn van Groningen cbdaf4950b Added percolator improvements:
* The _percolator type now always has the _id field enabled (index=not_analyzed, store=no)
* During shard initialization the query ids are loaded from field data; previously the ids were fetched from stored values.
* Moved internal percolator query map storage from Text to HashedBytesRef based keys.
2013-08-15 10:58:40 +02:00
Simon Willnauer 0472bac2ef Limit the number of threads created on machines with a large number of cores
For machines with lots of cores, i.e. >= 48, the number of threads
created by default might cause unnecessary memory pressure on the system
and can even lead to OOM where the system is not able to create any
native threads anymore. This commit limits the number of available
CPUs on the system used for thread pool initialization to at most
24 cores.

Closes #3478
2013-08-15 00:19:47 +02:00
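The cap described in commit 0472bac2ef above amounts to taking the minimum of the detected core count and 24 before sizing thread pools. A minimal Python sketch of that idea follows; the function and constant names are hypothetical, and the actual change lives in Elasticsearch's Java thread pool setup, which is not shown in this log.

```python
import os

# Upper bound stated in the commit message.
MAX_CORES_FOR_THREAD_POOLS = 24


def bounded_processors(available=None):
    """Return the core count to use for thread pool sizing, capped at 24."""
    if available is None:
        # os.cpu_count() may return None; fall back to a single core.
        available = os.cpu_count() or 1
    return min(available, MAX_CORES_FOR_THREAD_POOLS)


if __name__ == "__main__":
    print(bounded_processors(48))  # a 48-core machine is treated as 24 cores
    print(bounded_processors(8))   # smaller machines are unaffected
```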
Shay Banon 28ae4d6393 Errors (like StackOverflow) can cause a search context to not be released
It will eventually time out (with the default 5 minutes timeout), but we should properly handle it, and also, properly propagate the failure.
closes #3513
2013-08-14 23:42:05 +02:00
Simon Willnauer 4a15106d6a Improve backwards compatibility handling for NGram / EdgeNGram analysis
The '"side" : "back"' parameter is not supported anymore on
EdgeNGramTokenizer if the mapping is created with 0.90.2 / Lucene 4.3.
The 'EdgeNgramTokenFilter' handles this automatically by wrapping the
'EdgeNgramTokenFilter' in a 'ReverseTokenFilter', yet with tokenizers this
optimization is not possible. This commit also adds a more verbose error message
explaining how to work around this limitation.

Closes #3489
2013-08-14 23:19:16 +02:00
Nik Everett becbbf53d5 Correctly apply boosts in query string.
This applies boosts to phrase queries generated by query string queries
both in boolean and dismax mode.
2013-08-14 21:49:47 +02:00
Simon Willnauer ddad4fe2f7 Add more information to asserts and assert on the result of refresh. 2013-08-14 21:49:00 +02:00
Boaz Leskes 34442c8d0a Added a timeout check to searchWhileCreatingIndex with cluster state dump on failure. 2013-08-14 20:21:00 +02:00
Shay Banon 594e03b695 expose simplified field methods for custom scripts
also, add respective iter methods to the script values to be used in custom scripts
2013-08-14 18:21:27 +02:00
Boaz Leskes 3ac3c7d12c Put Mappings CountDownListener validates cluster state version of incoming change confirmations.
Closes #3508
2013-08-14 17:05:55 +02:00
Boaz Leskes 3eed2625e2 Small protection against a high number of nodes in UpdateMappingTests.updateMappingConcurrently 2013-08-14 16:23:53 +02:00
Boaz Leskes 256bf1f4bc Added index and type checks to MetaDataMappingService.CountDownListener
Closes #3507
2013-08-14 16:15:53 +02:00
Britta Weber ebbd00acc2 Fix some minor things in function score parser/builder
- remove default scale weight in builder
- make parameters object/double instead of string
- do not convert number to string and back again, parse double instead
- remove javadoc reference to test classes
- Set parameters in constructor instead of in method
2013-08-14 14:21:07 +02:00
Britta Weber 592e637293 remove check and test for more than one mapper per field 2013-08-14 14:21:07 +02:00
Martijn van Groningen 691ac8e105 Added scoring support to percolate api
Scoring support allows the percolate matches to be sorted, or just assigns a score to each percolate match. Sorting by score can be very useful when millions of matching percolate queries are being returned.

The scoring support hooks into the percolate query option and adds two new boolean options:
* `sort` - Whether to sort the matches based on the score. This will also include the score for each match. The `size` option is a required option when sorting percolate matches is enabled.
* `score` - Whether to compute the score and include it with each match. This will not sort the matches.

For both new options the `query` option needs to be specified, which is used to produce the scores. The `query` option is normally used to control which percolate queries are evaluated. To give meaning to these scores, the `function_score` query recently added in #3423 can be used to wrap the percolate query.

Closes #3506
2013-08-14 13:51:13 +02:00
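The option rules in commit 691ac8e105 above (`query` is needed for both options, and `size` is required when `sort` is enabled) can be sketched as a small request-body builder. This is an illustrative Python helper, not part of any Elasticsearch client: the function name is hypothetical, and the top-level `doc` key for the document to percolate is an assumption that the commit message does not state.

```python
def build_percolate_body(doc, query=None, sort=False, score=False, size=None):
    """Assemble a percolate request body that uses the scoring options.

    `sort`, `score`, `size` and `query` are the option names from the commit
    message; the "doc" key is an assumption made for this sketch.
    """
    if (sort or score) and query is None:
        raise ValueError("`query` is required when `sort` or `score` is set")
    if sort and size is None:
        raise ValueError("`size` is required when `sort` is enabled")

    body = {"doc": doc}
    if query is not None:
        body["query"] = query
    if sort:
        body["sort"] = True
    if score:
        body["score"] = True
    if size is not None:
        body["size"] = size
    return body


if __name__ == "__main__":
    # Hypothetical document and query, only for illustration.
    print(build_percolate_body({"message": "hello"}, {"match_all": {}}, sort=True, size=10))
```

Validating on the client side only mirrors the constraints described in the commit message; the server remains the authority on what it accepts.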
Britta Weber 32cdddb671 remove sysout 2013-08-14 10:38:02 +02:00
Shay Banon 2f1680839f empty double/long values should return 0
to conform with all other implementations (non empty), getValue should return 0 when there is no value associated with a doc
2013-08-14 00:06:40 +02:00
Shay Banon eb9c0d077b no need for the doc action test to check count in before class
- also, since we randomize client transports, no need for specific classes to test for it, we test different clients across all our tests
2013-08-14 00:02:29 +02:00
Shay Banon 3db8be6c77 rename class to conform with Tests suffix 2013-08-13 20:38:57 +02:00
Shay Banon f1467dbde2 empty numeric field data should retain the correct num docs
the fact that there are no values in the numeric field data doesn't mean there are no docs, behavior should match the bytes variant
2013-08-13 18:52:35 +02:00
Simon Willnauer 8774c46cc5 Fix assert to check the deviation rather than the absolute difference.
Deviation should be less or equal to 0.01 ~ 1% after the cast.
2013-08-13 17:46:57 +02:00
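The difference between the two checks in commit 8774c46cc5 above is that a deviation check scales with the magnitude of the expected value, while an absolute difference does not. A minimal Python sketch, assuming the 0.01 bound from the commit message; the helper names are hypothetical and the original assertion is in the project's Java tests.

```python
def relative_deviation(expected, actual):
    """Return |actual - expected| relative to |expected|."""
    if expected == 0:
        # No meaningful relative deviation from zero; only an exact match passes.
        return 0.0 if actual == 0 else float("inf")
    return abs(actual - expected) / abs(expected)


def within_deviation(expected, actual, max_deviation=0.01):
    """True when `actual` deviates from `expected` by at most `max_deviation` (1% by default)."""
    return relative_deviation(expected, actual) <= max_deviation


if __name__ == "__main__":
    # An absolute difference of 5 is large, but it is only 0.5% of 1000.
    print(within_deviation(1000.0, 1005.0))
    # A small absolute difference of 0.02 is still 2% of 1.0.
    print(within_deviation(1.0, 1.02))
```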
Simon Willnauer ba13930b32 Fix test include pattern to include *Test.class
We missed *Test.class which is not our convention but we could miss
some tests. We should better include the *Test.class tests as well.
2013-08-13 17:46:48 +02:00
Simon Willnauer 7e1d8a6ca3 Raise default DeleteIndex Timeout
Currently the timeout for a delete index operation is set to 10 seconds.
Yet, if a full flush is running while we delete an index this can
easily exceed 10 seconds. The timeout is not dramatic, i.e. the index
will be deleted eventually but the client request is not acked which
can cause confusion. We should raise it to prevent unnecessary confusion
especially in client tests where this can happen if the machine is pretty busy.

The new timeout is set to 60 seconds.

Closes #3498
2013-08-13 17:28:19 +02:00
Shay Banon 534299a27c function score test cleanup
- also, properly report on the failed assertion in toFloat
- use function score in the explain compared to custom score
- use the Tests suffix convention
2013-08-13 17:23:38 +02:00
Boaz Leskes 9d28002077 UpdateMappingTests - updateDefaultMappingSettings now creates the index with a mapping. 2013-08-13 17:16:11 +02:00
Boaz Leskes e5f459af83 Allow to update the _source mapping exclude/include dynamically when we merge mappings.
Closes #3491
2013-08-13 17:09:02 +02:00
David Pilato 328608f55f Make RestSearchAction#parseSearchXXX(RestRequest) public
When building a plugin with a new search endpoint, you need to parse the request as a searchRequest.

Methods exist in RestSearchAction class but are private.

We will modify them to be public static. This applies to:

* `RestSearchAction#parseSearchRequest(RestRequest)`
* `RestSearchAction#parseSearchSource(RestRequest)`

 Closes #3499.
2013-08-13 17:06:04 +02:00
Shay Banon e111a7da62 lazily create the no shard available exception 2013-08-13 16:41:16 +02:00
Boaz Leskes acf17b4e39 Added some comments regarding the acknowledgement logic in MetaDataMappingService.putMapping
Made left over cluster state debug log entry less verbose.
2013-08-13 14:32:51 +02:00
Martijn van Groningen e8909396b4 Removed todo 2013-08-13 14:26:19 +02:00
Shay Banon af17ae55ab remove the assert on AnalyzerWrapper
see https://issues.apache.org/jira/browse/LUCENE-5170
2013-08-13 12:15:05 +02:00
Shay Banon 9126d11824 better log message for none gateway, also make it debug level 2013-08-13 00:19:50 +02:00
Simon Willnauer c6a803b677 Also catch EsRejectedExecutionException next to
RejectedExecutionException
2013-08-12 21:25:40 +02:00
Martijn van Groningen bc0abd8226 Added multi percolate api
The multi percolate api allows bundling multiple percolate requests into one request. This api works similarly to the multi search api. The request body format is line based. Each percolate request item takes two lines: the first line is the header and the second line is the body.

The header can contain any parameter that normally would be set via the request path or query string parameters. There are several percolate actions, because there are multiple types of percolate requests:
* `percolate` - Action for defining a regular percolate request.
* `count_percolate` - Action for defining a count percolate request.
* `percolate_existing_doc` - Action for defining a percolate existing document request.
* `count_percolate_existing_doc` - Action for defining a count percolate existing document request.

Each action has its own set of parameters that need to be specified in the percolate action.
Format:
```
{"[header_type]" : {[options...]}}
{[body]}
```

Depending on the percolate action different parameters can be specified. For example the percolate and percolate existing document actions support different parameters.

The following endpoints are supported:
```
POST localhost:9200/[index]/[type]/_mpercolate
POST localhost:9200/[index]/_mpercolate
POST localhost:9200/_mpercolate
```

The `index` and `type` defined in the url path are the default index and type.

Closes #3488
2013-08-12 18:32:28 +02:00
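The line-based format described in commit bc0abd8226 above (one header line followed by one body line per item) can be assembled as newline-delimited JSON. The sketch below is a hypothetical Python helper, not an official client: the `index` and `type` header options, the `doc` body key, the sample names, and the trailing newline are all assumptions; only the action names come from the commit message.

```python
import json


def build_mpercolate_body(items):
    """Serialize (action, header_options, body) tuples into the two-lines-per-item format."""
    lines = []
    for action, header_options, body in items:
        lines.append(json.dumps({action: header_options}))  # header line
        lines.append(json.dumps(body))                      # body line
    # Ending with a newline is an assumption borrowed from other line-based bulk formats.
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical index, type and documents, only for illustration.
    payload = build_mpercolate_body([
        ("percolate", {"index": "my-index", "type": "my-type"}, {"doc": {"message": "hello"}}),
        ("count_percolate", {"index": "my-index", "type": "my-type"}, {"doc": {"message": "bye"}}),
    ])
    print(payload, end="")
```

The resulting string would be sent as the body of a POST to one of the `_mpercolate` endpoints listed in the commit message.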
Simon Willnauer 82d3693a91 Catch Throwable rather than Exception if latches are present. 2013-08-12 17:46:44 +02:00
Simon Willnauer 8a876ea80e Limit the number of extracted token instance per query token.
FVH deploys some recursive logic to extract terms from documents
that need to be highlighted. For documents that have terms with a very
large term frequency, such as a document that repeats a term very
often, this can produce some very large stacks when extracting
the terms. Taken to an extreme, this causes stack overflow errors
once the term frequency is >= 6000.

The ultimate solution is an iterative implementation of the extract
logic, but until then we should protect users from these massive
term extractions, which might not be very useful in the first place.

Closes #3486
2013-08-12 17:46:44 +02:00
Boaz Leskes ab6163898f Postponed acknowledging put mapping requests until after the master has finished processing them
Also - TransportMasterNodeOperationAction was potentially using a stale cluster state

Closes #3487
2013-08-12 17:00:47 +02:00
Martijn van Groningen 4b25e6b63e Changed default operation_threading from single_thread to thread_per_shard.
Closes #3483
2013-08-12 15:30:09 +02:00
Simon Willnauer 8a48e2f969 Use awaitBusy rather than hand crafted version in tests. 2013-08-12 15:17:48 +02:00
Alexander Reelsen f58f165522 Support fuzzy queries in CompletionSuggest
Added the FuzzySuggester in order to support completion queries

The following options have been added for the fuzzy suggester:

* edit_distance: Maximum edit distance
* transpositions: Sets if transpositions should be counted as one or two changes
* min_prefix_len: Minimum length of the input before fuzzy suggestions are returned
* non_prefix_len: Minimum length of the input, which is not checked for fuzzy alternatives

Closes #3465
2013-08-12 15:07:07 +02:00
Alexander Reelsen a7b643305a Fix debian init script to not depend on new start-stop-daemon
By making use of the lsb provided functions, one does not depend on the start-stop-daemon version to test if elasticsearch is running.
This ensures that the init script works on debian wheezy, squeeze, current ubuntu and LTS versions.

Closes #3452
2013-08-12 15:03:42 +02:00
Alexander Reelsen 5c853fb22d Use TransportMasterNodeOperationAction in TransportGetIndexTemplatesAction
No need to use ClusterInfoRequest, as we do not need to access any indices.
2013-08-12 14:48:20 +02:00