5507 Commits

Author SHA1 Message Date
Boaz Leskes 28e867b5c1 Change response format of term vector endpoint
This commit changes the response format of the term vectors
to be consistent with the response format of the analyze endpoint.

```
{
   "_index": "test",
   "_type": "type1",
   "_id": "1",
   "_version": 1,
   "exists": true,
   "term_vectors": {
      "field_with_positions_offsets": {
         "field_statistics": {..},
         "terms": {
            "evil": {
               "term_freq": 2,
               "pos": [ 4 , 7 ],
               "start": [ 17, 40 ],
               "end": [ 21 , 44 ]
            },
            "orthodontist": {
               "term_freq": 1,
               "pos": [ 5 ],
               "start": [ 22 ],
               "end": [ 34 ]
            }
         }
      }
   }
}
```

becomes

```
{
   "_index": "test",
   "_type": "type1",
   "_id": "1",
   "_version": 1,
   "exists": true,
   "term_vectors": {
      "field_with_positions_offsets": {
         "field_statistics": {..},
         "terms": {
            "evil": {
               "term_freq": 2,
               "tokens": [
                   { "position": 4, "start_offset": 17, "end_offset" : 21 },
                   { "position": 7, "start_offset": 40, "end_offset" : 44 }
               ]
            },
            "orthodontist": {
               "term_freq": 1,
               "tokens" : [ { "position": 5 , "start_offset" : 22, "end_offset" : 34 } ]
            }
         }
      }
   }
}
```

Closes issue #3484
2013-08-19 16:31:59 +02:00
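
The format change above can be illustrated with a small conversion sketch. This is not code from the commit: the function name `to_token_list` is hypothetical, and it assumes a term entry is a plain dict in which `pos`, `start` and `end` are parallel lists, as in the first JSON example.

```python
def to_token_list(term_entry):
    """Convert one term entry from parallel arrays to a list of token objects.

    Assumption: "pos", "start" and "end" are parallel lists, any of which may
    be absent when that information was not stored for the field.
    """
    positions = term_entry.get("pos", [])
    starts = term_entry.get("start", [])
    ends = term_entry.get("end", [])
    count = max(len(positions), len(starts), len(ends))

    tokens = []
    for i in range(count):
        token = {}
        if i < len(positions):
            token["position"] = positions[i]
        if i < len(starts):
            token["start_offset"] = starts[i]
        if i < len(ends):
            token["end_offset"] = ends[i]
        tokens.append(token)

    # Keep every other key (e.g. "term_freq") untouched.
    converted = {k: v for k, v in term_entry.items()
                 if k not in ("pos", "start", "end")}
    if tokens:
        converted["tokens"] = tokens
    return converted
```

Applied to the `evil` entry above, this yields two token objects that carry the same positions and offsets as the second JSON example.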
Boaz Leskes cd5ebac7dd GetMapping failed when index had no mapping (yet)
Closes #3534
2013-08-19 15:51:10 +02:00
Shay Banon 103059d9ef use the now publicly available getDisjuncts 2013-08-19 15:01:14 +02:00
Boaz Leskes b1ac8e9027 Switched all catch (Exception e) to catch (Throwable e) 2013-08-19 14:07:13 +02:00
Boaz Leskes 097b4078a4 added a top level try catch for threaded operations. Exceptions that bubbled up could cause requests to hang. 2013-08-19 13:41:21 +02:00
Shay Banon 67787421b2 increase (use default) timeout to green in test 2013-08-19 10:18:51 +02:00
Shay Banon 616b09e9b4 Thread Pool: Remove blocking type option
The blocking thread pool type is not recommended, since it will usually end up blocking the IO thread when executing (other operations will then stall as well).
closes #3531
2013-08-18 22:28:51 +02:00
Boaz Leskes 766c787737 Changed the default of SourceFieldMapper's includes & excludes to be null, to support a "not specified" state, which is important now that they are update-able & merge-able 2013-08-18 11:05:21 +02:00
Simon Willnauer e83eb49a80 Fix RandomScoreFunctionTests to be more reproducible. 2013-08-18 08:39:11 +02:00
uboness 91e78dcbe8 fixed RandomScoreFunctionTests 2013-08-17 23:42:01 +02:00
uboness c93eae8545 Added support for random_score function:
* can be used to return matching results in random order

 Closes #1170
2013-08-17 22:25:35 +02:00
Benjamin Devèze 610f262aac Fix small javadoc typos for IndexShardRoutingTable 2013-08-17 13:43:23 +02:00
Simon Willnauer 1257109944 reduce max replica in test to stabilize with low number of available nodes 2013-08-17 08:25:34 +02:00
Boaz Leskes 2db9c2df83 Added a debug log when sending a mapping updated request to master 2013-08-16 21:58:38 +02:00
Shay Banon ad0eeef859 Better exception handling in actions when forking to a thread pool
An execution on a thread pool might be rejected due to its settings; handle those cases better across the actions we have.
closes #3524
2013-08-16 21:56:39 +02:00
Simon Willnauer b11f81d744 Removed static versions of MatchAllDocsQuery
If a static cached version of MatchAllDocsQuery is used, for
instance through the `query_string` query together with a boost like `*:*^2.0`, the
globally used version is modified, since queries are not immutable and its
boost variable can change at any time. Holding on to queries that are modifiable
is risky and should not be done in a global scope.
This commit also adds tests for constant scores from `constant_score` query.

Closes #3521
2013-08-16 21:18:37 +02:00
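
The hazard described in the commit above can be reproduced in miniature. The sketch below is a language-neutral illustration in Python, not the Lucene or Elasticsearch code: `Query`, `SHARED_MATCH_ALL`, `parse_with_shared` and `parse_with_fresh` are all hypothetical names standing in for a mutable query object and a parser that sets a boost on it.

```python
class Query:
    """Stand-in for a mutable query object that carries a boost."""

    def __init__(self):
        self.boost = 1.0


# Hypothetical globally cached instance, analogous to a static query.
SHARED_MATCH_ALL = Query()


def parse_with_shared(boost):
    # Risky: every caller receives the same object, so setting the boost
    # here silently changes it for all earlier callers as well.
    query = SHARED_MATCH_ALL
    query.boost = boost
    return query


def parse_with_fresh(boost):
    # Safe: each caller gets its own instance, so boosts cannot leak.
    query = Query()
    query.boost = boost
    return query
```

With the shared variant, a query parsed with boost 2.0 reports whatever boost the most recent caller set. Creating a new instance per parse avoids that, at the cost of one small allocation.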
Simon Willnauer 57c0d29114 Prevent Phrase Suggester from failing on missing fields.
Unless the field is not mapped, the phrase suggester should return
empty results or skip candidate generation if a field is not in
the index, rather than failing hard with an illegal argument exception.
Some shards might not have a value in a certain field.

Closes #3469
2013-08-16 20:14:24 +02:00
Martijn van Groningen 5d91bb04b6 * Check if size is set if sort is used, if not throw exception.
* Fix reduce error when reducing shard responses.
* Short circuit reduce phase when possible (single matches & no matches)
2013-08-16 18:55:28 +02:00
David Pilato cca84431f5 Fix typo 2013-08-16 15:11:59 +02:00
David Pilato ac06722e32 Suggest should ignore empty shards
From #3469.
When running suggest on empty shards, it raises an error like:

```
"failures" : [ {
      "status" : 400,
      "reason" : "ElasticSearchIllegalArgumentException[generator field [title] doesn't exist]"
    } ]
```

We should ignore empty shards.

Closes #3473.
2013-08-16 11:29:33 +02:00
Simon Willnauer 459d59a04a Add assertion that index creation & deletion is acknowledged 2013-08-16 10:59:20 +02:00
Shay Banon f0914d13af refactor cache recycler, introduce thread local one, and default to it 2013-08-16 03:15:25 +02:00
Shay Banon fdd5e53aa7 better exception handling when building the search response 2013-08-16 01:10:50 +02:00
Simon Willnauer a16d1142a3 Cleanup Exception Handling in RobinEngine & raise write lock timeout
The write-lock timeout on the index writer is 1s by default. Given the
default lock poll interval of 1s this gives an upper bound of 2 obtain
checks for a write lock which might not be enough under load.
This commit adds cleaned up exception handling and more warn logging
related to obtaining locks under load.
2013-08-15 22:12:57 +02:00
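
The "upper bound of 2 obtain checks" in the commit above follows from simple arithmetic on the timeout and the poll interval. The sketch below is a simplified model of a polling lock loop, not the actual Lucene code: the function name, the `try_obtain` callable and the injectable `sleep` are assumptions made for illustration.

```python
def obtain_with_timeout(try_obtain, timeout_ms, poll_interval_ms,
                        sleep=lambda ms: None):
    """Poll try_obtain() until it succeeds or the timeout budget is used up.

    Returns (locked, checks), where checks counts the calls to try_obtain().
    Assumed model: one immediate check, then one further check after each
    poll interval that fits into the timeout.
    """
    max_sleeps = timeout_ms // poll_interval_ms
    sleeps = 0
    checks = 1
    locked = try_obtain()
    while not locked:
        if sleeps >= max_sleeps:
            return False, checks
        sleeps += 1
        sleep(poll_interval_ms)  # no-op by default so the sketch runs instantly
        locked = try_obtain()
        checks += 1
    return True, checks
```

Under this model, a 1000 ms timeout with a 1000 ms poll interval allows at most two checks on a lock that is never released, which is why raising the timeout gives a busy index writer more chances to obtain it.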
Boaz Leskes dbdef00a88 Added a needed ensureYellow to UpdateMappingTests.updateMappingConcurrently
Also:
more cluster state related debug logging.
get cluster state api didn't populate cluster state version in response (was always 0)
added logging when testing ends and before indices cleanup start.
2013-08-15 20:27:02 +02:00
Martijn van Groningen c43d0d1746 Added infrastructure to figure out the number of unique values for a field on the atomic level and the highest number of atomic field values for all segments. This can be used as a heuristic for initializing data structures.
Also moved the load method from concrete classes to AbstractIndexFieldData class.
2013-08-15 19:24:02 +02:00
Simon Willnauer 1e7c0d69ff Add test method information to log output
Often it's hard to tell which test methods were already executed
when a particular test fails. This commit adds a log output when a
new test is started to better differentiate which log output belongs
to which test.

This commit also moves the reproduce line to logging output to gain
timestamps for the failure.
2013-08-15 14:57:00 +02:00
Simon Willnauer 43fcc55625 Remove 'concrete_bytes' fielddata impl from tests
We don't have this implementation anymore; tests will just fall back
to default and issue a warning.
2013-08-15 14:57:00 +02:00
Boaz Leskes 9869427ef6 Allow to configure root logging level using system properties. Ex. -Des.logger.level=DEBUG . Defaults to INFO as before. 2013-08-15 14:36:00 +02:00
Alexander Reelsen f644ae5550 CompletionSuggester cleanups
* Fuzzy Suggester parameter names are now easier to understand
  * non_prefix_length became prefix_length
  * min_prefix_length became min_length
* Instead of specifying search_analyzer and index_analyzer, using analyzer for both is supported
* CompletionSuggester now uses the CharsRef spare instead of too many toString() calls
2013-08-15 13:29:09 +02:00
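
For readers updating existing requests, the two renames listed in the commit above can be applied mechanically. This is only a sketch: the helper name `migrate_fuzzy_params` is hypothetical, parameters are assumed to be a flat dict, and values are carried over unchanged, which the commit message does not explicitly confirm.

```python
# Old name -> new name, taken from the commit message above.
RENAMED_FUZZY_PARAMS = {
    "non_prefix_length": "prefix_length",
    "min_prefix_length": "min_length",
}


def migrate_fuzzy_params(params):
    """Return a copy of params with the old fuzzy parameter names replaced."""
    return {RENAMED_FUZZY_PARAMS.get(name, name): value
            for name, value in params.items()}
```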
Martijn van Groningen bfac2f575e Always catch exceptions from TransportBroadcastOperationAction#newResponse (reduce phase) 2013-08-15 13:15:52 +02:00
Martijn van Groningen 174707061c Moved the reduce logic to the percolator type. 2013-08-15 13:14:34 +02:00
Simon Willnauer 27b973830d Use ClusterService#localNode instead of checking the cluster state.
The ClusterState can hold an 'invalid' local 'DiscoveryNode' during
node startup and rare race conditions can cause NPEs if an 'invalid'
'DiscoveryNode' is serialized.

Closes #3515
2013-08-15 11:41:54 +02:00
Martijn van Groningen cbdaf4950b Added percolator improvements:
* The _percolator type now always has the _id field enabled (index=not_analyzed, store=no)
* During shard initialization the query ids are fetched from field data; previously the ids were fetched from stored values.
* Moved internal percolator query map storage from Text to HashedBytesRef based keys.
2013-08-15 10:58:40 +02:00
Simon Willnauer 0472bac2ef Limit the number of created threads for machines with a large number of cores
For machines with lots of cores, i.e. >= 48, the number of threads
created by default might cause unnecessary memory pressure on the system
and can even lead to OOM where the system is not able to create any
native threads anymore. This commit limits the number of available
CPUs on the system used for thread pool initialization to at most
24 cores.

Closes #3478
2013-08-15 00:19:47 +02:00
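
The cap described in the commit above amounts to clamping the detected processor count before sizing thread pools. A minimal sketch, assuming a hypothetical helper name and constant; only the limit of 24 comes from the commit message:

```python
import os

# Upper bound on the CPU count used for thread pool sizing (from the commit).
MAX_CORES_FOR_POOLS = 24


def bounded_processor_count(available=None):
    """Return the CPU count to use for pool sizing, capped at 24.

    available defaults to the CPU count reported by the operating system.
    """
    if available is None:
        available = os.cpu_count() or 1
    return min(available, MAX_CORES_FOR_POOLS)
```

On a 48-core machine this returns 24, while an 8-core machine still gets 8, so smaller systems are unaffected.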
Shay Banon 28ae4d6393 Errors (like StackOverflow) can cause a search context to not be released
It will eventually time out (with the default 5 minutes timeout), but we should properly handle it, and also, properly propagate the failure.
closes #3513
2013-08-14 23:42:05 +02:00
Simon Willnauer 4a15106d6a Improve backwards compatibility handling for NGram / EdgeNGram analysis
The '"side" : "back"' parameter is not supported anymore on
EdgeNGramTokenizer if the mapping is created with 0.90.2 / Lucene 4.3.
The 'EdgeNgramTokenFilter' handles this automatically by wrapping the
'EdgeNgramTokenFilter' in a 'ReverseTokenFilter', yet with tokenizers this
optimization is not possible. This commit also adds a more verbose error message
on how to work around this limitation.

Closes #3489
2013-08-14 23:19:16 +02:00
Nik Everett becbbf53d5 Correctly apply boosts in query string.
This applies boosts to phrase queries generated by query string queries
both in boolean and dismax mode.
2013-08-14 21:49:47 +02:00
Simon Willnauer ddad4fe2f7 Add more information to asserts and assert on the result of refresh. 2013-08-14 21:49:00 +02:00
Boaz Leskes 34442c8d0a Added a timeout check to searchWhileCreatingIndex with cluster state dump on failure. 2013-08-14 20:21:00 +02:00
Shay Banon 594e03b695 expose simplified field methods for custom scripts
also, add respective iter methods to the script values to be used in custom scripts
2013-08-14 18:21:27 +02:00
Boaz Leskes 3ac3c7d12c Put Mappings CountDownListener validates cluster state version of incoming change confirmations.
Closes #3508
2013-08-14 17:05:55 +02:00
Boaz Leskes 3eed2625e2 Small protection against a high number of nodes in UpdateMappingTests.updateMappingConcurrently 2013-08-14 16:23:53 +02:00
Boaz Leskes 256bf1f4bc Added index and type checks to MetaDataMappingService.CountDownListener
Closes #3507
2013-08-14 16:15:53 +02:00
Britta Weber ebbd00acc2 Fix some minor things in function score parser/builder
- remove default scale weight in builder
- make parameters object/double instead of string
- do not convert number to string and back again, parse double instead
- remove javadoc reference to test classes
- Set parameters in constructor instead of in method
2013-08-14 14:21:07 +02:00
Britta Weber 592e637293 remove check and test for more than one mapper per field 2013-08-14 14:21:07 +02:00
Martijn van Groningen 691ac8e105 Added scoring support to percolate api
Scoring support will allow the percolate matches to be sorted, or just assign scores to percolate matches. Sorting by score can be very useful when millions of matching percolate queries are being returned.

The scoring support hooks into the percolate query option and adds two new boolean options:
* `sort` - Whether to sort the matches based on the score. This will also include the score for each match. The `size` option is a required option when sorting percolate matches is enabled.
* `score` - Whether to compute the score and include it with each match. This will not sort the matches.

For both new options the `query` option needs to be specified, which is used to produce the scores. The `query` option is normally used to control which percolate queries are evaluated. To give meaning to these scores, the `function_score` query recently added in #3423 can be used to wrap the percolate query.

Closes #3506
2013-08-14 13:51:13 +02:00
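
The option rules stated in the commit above (`sort` requires `size`, and both `sort` and `score` require `query`) can be summarised as a small validation sketch. This is not the Elasticsearch implementation: the function name, the dict-based request shape, the returned keys and the error messages are assumptions for illustration.

```python
def validate_percolate_options(request):
    """Check the sort/score rules on a percolate request given as a dict.

    Returns which behaviours are enabled; raises ValueError when a rule
    from the commit message is violated.
    """
    wants_sort = bool(request.get("sort", False))
    wants_score = bool(request.get("score", False))

    # Both options need a query to produce the scores.
    if (wants_sort or wants_score) and "query" not in request:
        raise ValueError("'query' is required when 'sort' or 'score' is enabled")

    # Sorting additionally needs an explicit size.
    if wants_sort and "size" not in request:
        raise ValueError("'size' is required when 'sort' is enabled")

    # Sorting also includes the score with each match.
    return {"sort": wants_sort, "include_score": wants_sort or wants_score}
```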
Britta Weber 32cdddb671 remove sysout 2013-08-14 10:38:02 +02:00
Shay Banon 2f1680839f empty double/long values should return 0
to conform with all other (non-empty) implementations, getValue should return 0 when there is no value associated with a doc
2013-08-14 00:06:40 +02:00
Shay Banon eb9c0d077b no need for doc action test to check count in before class
- also, since we randomize client transports, no need for specific classes to test for it, we test different clients across all our tests
2013-08-14 00:02:29 +02:00