Commit Graph

6071 Commits

Author SHA1 Message Date
Alexander Reelsen 210683d70b Checking for payloads as JSON objects in completion suggester
If the payload is not a JSON object at index time, an exception will be thrown.
2013-08-21 14:25:49 +02:00
Alexander Reelsen cdddbb7585 Expose size statistics for completion suggest
In order to determine how much RAM the completion suggest structures will eat up, this data should be exposed.

Closes #3522
2013-08-21 13:15:38 +02:00
David Pilato ac3d5d67be Adding Contributing guidelines 2013-08-21 11:27:32 +02:00
Shay Banon b9c8ca8071 Smarter default to `index.index_concurrency`
closes #3546
2013-08-20 22:26:42 +02:00
Shay Banon d442e089ac Bound processor size based calculations to 32
We use the number of processors to choose default thread pool sizes and the number of workers in networking (for HTTP and transport). Bound it to a maximum of 32 by default as a safety measure against creating too many threads.

This relates to #3478, where we set the default to 24, but 32 is probably a better default.

closes #3545
2013-08-20 22:17:26 +02:00
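The bound described in d442e089ac can be illustrated with a minimal sketch. The class and method names below are hypothetical and are not taken from the Elasticsearch source; only the default of 32 comes from the commit message.

```java
// Sketch only: BoundedProcessors and boundedProcessors are hypothetical names,
// not the actual Elasticsearch implementation.
final class BoundedProcessors {
    // Default upper bound stated in the commit message.
    static final int DEFAULT_MAX_PROCESSORS = 32;

    // Returns the processor count to use for sizing thread pools and
    // network workers, capped at maxProcessors.
    static int boundedProcessors(int available, int maxProcessors) {
        if (maxProcessors < 1) {
            throw new IllegalArgumentException("maxProcessors must be >= 1");
        }
        return Math.min(available, maxProcessors);
    }

    public static void main(String[] args) {
        int available = Runtime.getRuntime().availableProcessors();
        System.out.println(boundedProcessors(available, DEFAULT_MAX_PROCESSORS));
    }
}
```

On a 64-core machine this would size pools as if 32 cores were present, while smaller machines are unaffected.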
Simon Willnauer 9af7a850e9 Prevent FVH from entering a very long running loop on large docs with high freq phrase terms.
Terminate phrase searches early if max phrase window is exceeded in
FastVectorHighlighter to prevent very long running phrase
extraction if phrase terms are highly frequent. See LUCENE-5182

Closes #3543
2013-08-20 17:54:34 +02:00
Benjamin Devèze 65056a63a1 Share shards computation logic between searchShards and searchShardsCount 2013-08-20 15:52:14 +02:00
uboness 7dab7ab585 when doing range queries within a query_string query, the range terms should only be lowercased if the field is not numeric
fixes #3540
2013-08-20 14:19:08 +02:00
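The rule in 7dab7ab585 (lowercase range terms only for non-numeric fields) might look roughly like the following. This is a sketch under assumptions: the class, the method, the set of numeric field names, and the `lowercaseExpandedTerms` flag are all hypothetical stand-ins, not the actual query parser code.

```java
import java.util.Locale;
import java.util.Set;

// Sketch only: all names here are hypothetical.
final class RangeTermNormalizer {
    // Lowercases a range bound unless lowercasing is disabled or the
    // target field is numeric, in which case the term is passed through.
    static String normalize(String term, String field,
                            Set<String> numericFields,
                            boolean lowercaseExpandedTerms) {
        if (term == null || !lowercaseExpandedTerms || numericFields.contains(field)) {
            return term;
        }
        return term.toLowerCase(Locale.ROOT);
    }
}
```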
Luca Cavanna 3b03bc65b9 Renamed readable_format flag to human
Closes #3541
2013-08-20 13:43:36 +02:00
Lee Hinman 81bee86778 Add the /_cat/count and /_cat/count/{index} API.
The cat count API allows retrieving the count of all docs in the
cluster, or docs in specific indices, in a human-readable format.
2013-08-19 17:18:36 -06:00
Martijn van Groningen 5ec0276fc5 Improved multi percolate api, to bundle all the requests by shard into one shard level request instead of sending each percolate request separately to the shards. 2013-08-19 18:59:48 +02:00
Boaz Leskes 11fab5c66f XContentHelper.mergeDefaults with single key object list ([ { key: .. } , {key : ... } .. ] ) didn't add defaults which were not already in content
Closes #3538
2013-08-19 17:57:22 +02:00
Luca Cavanna 7f7f79d622 Added test for #3268 and comments to test for #2682 2013-08-19 17:39:24 +02:00
Benjamin Devèze 8e137b1450 Fix search shards count method when targeting concrete and routing-aliased indices 2013-08-19 17:39:24 +02:00
Boaz Leskes 28e867b5c1 Change response format of term vector endpoint
This commit changes the response format of the term vectors
to be consistent with the response format of the analyze endpoint.

```
{
   "_index": "test",
   "_type": "type1",
   "_id": "1",
   "_version": 1,
   "exists": true,
   "term_vectors": {
      "field_with_positions_offsets": {
         "field_statistics": {..},
         "terms": {
            "evil": {
               "term_freq": 2,
               "pos": [ 4 , 7 ],
               "start": [ 17, 40 ],
               "end": [ 21 , 44 ]
            },
            "orthodontist": {
               "term_freq": 1,
               "pos": [ 5 ],
               "start": [ 22 ],
               "end": [ 34]
            }
         }
      }
   }
}
```

becomes

```
{
   "_index": "test",
   "_type": "type1",
   "_id": "1",
   "_version": 1,
   "exists": true,
   "term_vectors": {
      "field_with_positions_offsets": {
         "field_statistics": {..},
         "terms": {
            "evil": {
               "term_freq": 2,
               "tokens": [
                   { "position": 4, "start_offset": 17, "end_offset" : 21 },
                   { "position": 7, "start_offset": 40, "end_offset" : 44 }
               ]
            },
            "orthodontist": {
               "term_freq": 1,
               "tokens" : [ { "position": 5 , "start_offset" : 22, "end_offset" : 34 } ]
            }
         }
      }
   }
}
```

Closes issue #3484
2013-08-19 16:31:59 +02:00
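The format change in 28e867b5c1 amounts to turning three parallel arrays (`pos`, `start`, `end`) into one list of per-token objects. A minimal sketch of that conversion follows; the class and method names are hypothetical, and only the JSON key names come from the commit message.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Sketch only: TermVectorTokens is a hypothetical helper, not Elasticsearch code.
final class TermVectorTokens {
    // Zips the old parallel arrays into the new "tokens" structure.
    static List<Map<String, Integer>> toTokens(int[] pos, int[] start, int[] end) {
        if (pos.length != start.length || pos.length != end.length) {
            throw new IllegalArgumentException("arrays must have equal length");
        }
        List<Map<String, Integer>> tokens = new ArrayList<>(pos.length);
        for (int i = 0; i < pos.length; i++) {
            // LinkedHashMap keeps the key order used in the new response format.
            Map<String, Integer> token = new LinkedHashMap<>();
            token.put("position", pos[i]);
            token.put("start_offset", start[i]);
            token.put("end_offset", end[i]);
            tokens.add(token);
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Values for the term "evil" from the example in the commit message.
        System.out.println(toTokens(new int[] {4, 7}, new int[] {17, 40}, new int[] {21, 44}));
        // prints [{position=4, start_offset=17, end_offset=21}, {position=7, start_offset=40, end_offset=44}]
    }
}
```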
Boaz Leskes cd5ebac7dd GetMapping failed when index had no mapping (yet)
Closes #3534
2013-08-19 15:51:10 +02:00
Shay Banon 103059d9ef use the now publicly available getDisjuncts 2013-08-19 15:01:14 +02:00
Boaz Leskes b1ac8e9027 Switched all catch (Exception e) to catch (Throwable e) 2013-08-19 14:07:13 +02:00
Boaz Leskes 097b4078a4 added a top level try catch for threaded operations. Exceptions that bubbled up could cause requests to hang. 2013-08-19 13:41:21 +02:00
Shay Banon 67787421b2 increase (use default) timeout to green in test 2013-08-19 10:18:51 +02:00
Shay Banon 616b09e9b4 Thread Pool: Remove blocking type option
The blocking thread pool type is not recommended, since it will usually end up blocking the IO thread when executing, which causes other operations to stall as well.
closes #3531
2013-08-18 22:28:51 +02:00
Boaz Leskes 766c787737 Changed the default of SourceFieldMapper's includes & excludes to be null, to support a "not specified" state, which is important now that they are update-able & merge-able 2013-08-18 11:05:21 +02:00
Simon Willnauer e83eb49a80 Fix RandomScoreFunctionTests to be more reproducible. 2013-08-18 08:39:11 +02:00
uboness 91e78dcbe8 fixed RandomScoreFunctionTests 2013-08-17 23:42:01 +02:00
uboness c93eae8545 Added support for random_score function:
* can be used to return matching results in random order

 Closes #1170
2013-08-17 22:25:35 +02:00
Benjamin Devèze 610f262aac Fix small javadoc typos for IndexShardRoutingTable 2013-08-17 13:43:23 +02:00
Simon Willnauer 1257109944 reduce max replica in test to stabilize with low number of available nodes 2013-08-17 08:25:34 +02:00
Boaz Leskes 2db9c2df83 Added a debug log when sending a mapping updated request to master 2013-08-16 21:58:38 +02:00
Shay Banon ad0eeef859 Better exception handling in actions when forking to a thread pool
An execution on a thread pool might be rejected due to its settings; handle those cases better across the actions we have.
closes #3524
2013-08-16 21:56:39 +02:00
Simon Willnauer b11f81d744 Removed static versions of MatchAllDocsQuery
If a static cached version of MatchAllDocsQuery is used through, for
instance, the `query_string` together with a boost like `*:*^2.0`, the
globally used version is modified, since queries are not immutable and its
boost variable can change at any time. Holding on to queries that are modifiable
is risky and should not be done in a global scope.
This commit also adds tests for constant scores from `constant_score` query.

Closes #3521
2013-08-16 21:18:37 +02:00
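The hazard described in b11f81d744 can be reproduced without Lucene. In the sketch below, `MutableQuery` is a hypothetical stand-in for a query object with a settable boost; it is not the Lucene `MatchAllDocsQuery` class, and the helper methods are illustrative only.

```java
// Sketch only: MutableQuery is a hypothetical stand-in for a mutable query.
final class SharedQueryHazard {
    static final class MutableQuery {
        private float boost = 1.0f;
        float getBoost() { return boost; }
        void setBoost(float boost) { this.boost = boost; }
    }

    // Risky: one instance shared by every caller.
    static final MutableQuery SHARED = new MutableQuery();

    // Mutates the shared instance, so earlier callers see the new boost too.
    static MutableQuery sharedWithBoost(float boost) {
        SHARED.setBoost(boost);
        return SHARED;
    }

    // Safer: each caller gets its own instance.
    static MutableQuery freshWithBoost(float boost) {
        MutableQuery query = new MutableQuery();
        query.setBoost(boost);
        return query;
    }

    public static void main(String[] args) {
        MutableQuery first = sharedWithBoost(2.0f);
        sharedWithBoost(1.0f);                     // a second, unrelated request
        System.out.println(first.getBoost());      // prints 1.0: the first query's boost was overwritten

        MutableQuery isolated = freshWithBoost(2.0f);
        freshWithBoost(1.0f);
        System.out.println(isolated.getBoost());   // prints 2.0
    }
}
```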
Simon Willnauer 57c0d29114 Prevent Phrase Suggester from failing on missing fields.
Unless the field is not mapped, the phrase suggester should return
empty results or skip candidate generation if a field is not in
the index rather than failing hard with an illegal argument exception.
Some shards might not have a value in a certain field.

Closes #3469
2013-08-16 20:14:24 +02:00
Martijn van Groningen 5d91bb04b6 * Check if size is set if sort is used, if not throw exception.
* Fix reduce error when reducing shard responses.
* Short circuit reduce phase when possible (single matches & no matches)
2013-08-16 18:55:28 +02:00
David Pilato cca84431f5 Fix typo 2013-08-16 15:11:59 +02:00
David Pilato ac06722e32 Suggest should ignore empty shards
From #3469.
When running suggest on empty shards, it raises an error like:

```
"failures" : [ {
      "status" : 400,
      "reason" : "ElasticSearchIllegalArgumentException[generator field [title] doesn't exist]"
    } ]
```

We should ignore empty shards.

Closes #3473.
2013-08-16 11:29:33 +02:00
Simon Willnauer 459d59a04a Add assertion that index creation & deletion is acknowledged 2013-08-16 10:59:20 +02:00
Shay Banon f0914d13af refactor cache recycler, introduce thread local one, and default to it 2013-08-16 03:15:25 +02:00
Shay Banon fdd5e53aa7 better exception handling when building the search response 2013-08-16 01:10:50 +02:00
Simon Willnauer a16d1142a3 Cleanup Exception Handling in RobinEngine & raise write lock timeout
The write-lock timeout on the index writer is 1s by default. Given the
default lock poll interval of 1s this gives an upper bound of 2 obtain
checks for a write lock which might not be enough under load.
This commit adds cleaned up exception handling and more warn logging
related to obtaining locks under load.
2013-08-15 22:12:57 +02:00
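The "upper bound of 2 obtain checks" in a16d1142a3 follows from one initial attempt plus one retry per elapsed poll interval. The arithmetic can be sketched as below; the class and method are hypothetical, and the 5000 ms value in `main` is an illustrative assumption, not the timeout chosen by the commit.

```java
// Sketch only: models the arithmetic, not the actual lock implementation.
final class LockObtainBound {
    // One immediate attempt, then one more after each full poll interval
    // that fits into the timeout.
    static long maxObtainChecks(long timeoutMillis, long pollIntervalMillis) {
        if (timeoutMillis < 0 || pollIntervalMillis <= 0) {
            throw new IllegalArgumentException("timeout must be >= 0 and poll interval > 0");
        }
        return timeoutMillis / pollIntervalMillis + 1;
    }

    public static void main(String[] args) {
        System.out.println(maxObtainChecks(1000, 1000)); // prints 2 (the defaults named in the commit)
        System.out.println(maxObtainChecks(5000, 1000)); // prints 6 (hypothetical raised timeout)
    }
}
```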
Boaz Leskes dbdef00a88 Added a needed ensureYellow to UpdateMappingTests.updateMappingConcurrently
Also:
more cluster state related debug logging.
get cluster state api didn't populate cluster state version in response (was always 0)
added logging when testing ends and before indices cleanup start.
2013-08-15 20:27:02 +02:00
Martijn van Groningen c43d0d1746 Added infrastructure to figure out the number of unique values for a field on the atomic level and the highest number of atomic field values for all segments. This can be used as a heuristic for initializing data structures.
Also moved the load method from concrete classes to AbstractIndexFieldData class.
2013-08-15 19:24:02 +02:00
Simon Willnauer 1e7c0d69ff Add test method information to log output
Often it's hard to tell which test methods were already executed
when a particular test fails. This commit adds a log output when a
new test is started to better differentiate which log output belongs
to which test.

This commit also moves the reproduce line to logging output to gain
timestamps for the failure.
2013-08-15 14:57:00 +02:00
Simon Willnauer 43fcc55625 Remove 'concrete_bytes' fielddata impl from tests
We don't have this implementation anymore; tests will just fall back
to the default and issue a warning.
2013-08-15 14:57:00 +02:00
Boaz Leskes 9869427ef6 Allow to configure root logging level using system properties. Ex. -Des.logger.level=DEBUG . Defaults to INFO as before. 2013-08-15 14:36:00 +02:00
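Reading the root level from a system property, as described in 9869427ef6, can be sketched as follows. The property name `es.logger.level` and the `INFO` default come from the commit message; the class and method names are hypothetical, and the real wiring into the logging configuration is not shown.

```java
import java.util.Properties;

// Sketch only: RootLogLevel is a hypothetical helper.
final class RootLogLevel {
    // Returns the configured root level, falling back to INFO when unset.
    static String resolve(Properties properties) {
        return properties.getProperty("es.logger.level", "INFO");
    }

    public static void main(String[] args) {
        // Run with -Des.logger.level=DEBUG to see the override take effect.
        System.out.println(resolve(System.getProperties()));
    }
}
```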
Alexander Reelsen f644ae5550 CompletionSuggester cleanups
* Fuzzy Suggester parameter names are now easier to understand
  * non_prefix_length became prefix_length
  * min_prefix_length became min_length
* Instead of specifying search_analyzer and index_analyzer separately, using analyzer for both is supported
* CompletionSuggester now uses the CharsRef spare instead of calling toString() too often
2013-08-15 13:29:09 +02:00
Martijn van Groningen bfac2f575e Always catch exceptions from TransportBroadcastOperationAction#newResponse (reduce phase) 2013-08-15 13:15:52 +02:00
Martijn van Groningen 174707061c Moved the reduce logic to the percolator type. 2013-08-15 13:14:34 +02:00
Simon Willnauer 27b973830d Use ClusterService#localNode instead of checking the cluster state.
The ClusterState can hold an 'invalid' local 'DiscoveryNode' during
node startup and rare race conditions can cause NPEs if an 'invalid'
'DiscoveryNode' is serialized.

Closes #3515
2013-08-15 11:41:54 +02:00
Martijn van Groningen cbdaf4950b Added percolator improvements:
* The _percolator type now always has the _id field enabled (index=not_analyzed, store=no)
* During shard initialization the query ids are now fetched from field data; previously they were fetched from stored values.
* Moved internal percolator query map storage from Text to HashedBytesRef based keys.
2013-08-15 10:58:40 +02:00
Simon Willnauer 0472bac2ef Limit the number created threads for machines with large number of cores
For machines with lots of cores, i.e. >= 48, the number of threads
created by default might cause unnecessary memory pressure on the system
and can even lead to OOM where the system is not able to create any
native threads anymore. This commit limits the number of available
CPUs on the system used for thread pool initialization to at most
24 cores.

Closes #3478
2013-08-15 00:19:47 +02:00
Shay Banon 28ae4d6393 Errors (like StackOverflow) can cause a search context to not be released
It will eventually time out (with the default 5 minutes timeout), but we should properly handle it, and also, properly propagate the failure.
closes #3513
2013-08-14 23:42:05 +02:00