Commit Graph

5412 Commits

Author SHA1 Message Date
Martijn van Groningen 9f2e615ed9 Added extra logging 2013-08-26 09:48:04 +02:00
Boaz Leskes a96ecea653 Multi term vector request
--------------------------

This feature allows retrieving [term vectors](https://github.com/elasticsearch/elasticsearch/issues/3114) for a list of documents. The JSON request has exactly the same [format](https://github.com/elasticsearch/elasticsearch/issues/3484) as the `_termvectors` endpoint.

To use it, call:

```
curl -XGET 'http://localhost:9200/index/type/_mtermvectors' -d '{
    "fields": [
        "field1",
        "field2",
        ...
    ],
    "ids": [
        "docId1",
        "docId2",
        ...
    ],
    "offsets": false|true,
    "payloads": false|true,
    "positions": false|true,
    "term_statistics": false|true,
    "field_statistics": false|true
}'

```

The response contains a `docs` array, each entry of which contains the term vector response for one document:

```
{
   "docs": [
      {
         "_index": "index",
         "_type": "type",
         "_id": "docId1",
         "_version": 1,
         "exists": true,
         "term_vectors": {
         	...
         }
      },
      {
         "_index": "index",
         "_type": "type",
         "_id": "docId2",
         "_version": 1,
         "exists": true,
         "term_vectors": {
         ...
         }
      }
   ]
}
```

Note that, like the term vectors request, the multi term vectors request silently skips documents that have no term vectors stored in the index and simply returns an empty response for them.
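
As a rough illustration of the request and response shapes above, here is a minimal Python sketch that builds the request body and lists the documents that came back without term vectors. The helper names are hypothetical, and treating a missing or empty `term_vectors` object as the "empty response" is an assumption, not something this commit specifies.

```python
import json

# Boolean options shown in the request example above.
ALLOWED_FLAGS = {"offsets", "payloads", "positions",
                 "term_statistics", "field_statistics"}


def build_mtermvectors_body(ids, fields, **flags):
    """Build the JSON body for a multi term vectors request (hypothetical helper)."""
    unknown = set(flags) - ALLOWED_FLAGS
    if unknown:
        raise ValueError(f"unknown options: {sorted(unknown)}")
    body = {"fields": list(fields), "ids": list(ids)}
    body.update({name: bool(value) for name, value in flags.items()})
    return json.dumps(body)


def docs_without_term_vectors(response_text):
    """Return the ids of documents whose entry carries no term vectors.

    Assumption: an "empty" entry has a missing or empty `term_vectors` object.
    """
    response = json.loads(response_text)
    return [doc["_id"] for doc in response.get("docs", [])
            if not doc.get("term_vectors")]
```

The body returned by `build_mtermvectors_body` is what would be passed to `curl -d` in the call shown earlier.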

Closes #3536
2013-08-26 09:25:21 +02:00
Boaz Leskes 18c71b16b5 Refactor term vector api
This is necessary to allow adding a multi term vector request
2013-08-26 09:25:21 +02:00
Martijn van Groningen a09f217b45 Don't reduce twice if failure occurs. 2013-08-25 21:50:04 +02:00
Simon Willnauer 020e68f2a0 Make sure all shards are refreshed during recovery test 2013-08-24 09:08:39 +02:00
Shay Banon f3a35ccc90 upgrade to joda 2.3 2013-08-24 00:07:57 +02:00
Shay Banon e892ec2c37 use our abort policy in the scheduler thread pool 2013-08-23 22:03:14 +02:00
Shay Banon 7791f2612d remove unused variable 2013-08-23 21:33:14 +02:00
Simon Willnauer c688ed6c9f Don't assert a second time if awaitBusy returned successfully 2013-08-23 20:57:53 +02:00
Shay Banon 86c95ab2ab Setting index/bulk thread pools with queue_size can cause replica shard failures
closes #3526
2013-08-23 19:24:47 +02:00
Simon Willnauer 19cce0b329 Use awaitBusy rather than a tight loop in IndicesStoreTests 2013-08-23 18:04:53 +02:00
Simon Willnauer a943135ef6 Improve assertion and busy waiting for RecoveryWhileUnderLoadTests 2013-08-23 18:04:53 +02:00
Simon Willnauer 6c24a0af3e Wait for consistent view on both clients in ClusterServiceTests 2013-08-23 18:04:53 +02:00
Simon Willnauer 6d49170509 Introduce base test classes to share thread scope annotations
Currently we run unit tests with clusters running in the background
that can potentially spawn threads, causing the thread leak control
to fire in tests that don't use the test cluster. This commit
introduces base classes for that purpose, shadowing the Lucene test
framework classes and adding the appropriate ThreadScope.
2013-08-23 18:04:53 +02:00
Simon Willnauer 71ebb14b58 Add ESAbortPolicy to cached pools
All ES ThreadPools / Executors should use the ESAbortPolicy or at least
one that throws the ESRejectedExecutionException.
2013-08-23 18:04:53 +02:00
Simon Willnauer 7c76819040 Randomize numeric types in MLT test and apply mapping ahead of time. 2013-08-23 18:04:52 +02:00
Britta Weber 8b9396b6da add additional method for setting combine mode with CombineFunction parameter 2013-08-23 16:04:00 +02:00
Britta Weber 8d6dc5908e add builders for nicer java api
This commit and the previous 10 commits close issue 3533.

Closes #3533
2013-08-23 16:03:23 +02:00
Britta Weber 5258940d9e format code 2013-08-23 14:44:34 +02:00
Britta Weber 1b085b069b rename reference -> origin 2013-08-23 14:44:34 +02:00
Britta Weber 9e7ad7249f rename scale_weight -> decay 2013-08-23 14:44:34 +02:00
Britta Weber 41b4a14933 Add offset to decay function score
Docs within the offset will be scored with 1.0, decay only starts after
offset is reached.
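
A minimal Python sketch of the offset behaviour described above, using a Gaussian-shaped decay. The function name, the `scale` parameter, and the exact curve are assumptions made for illustration; only `origin`, `decay`, and `offset` are named in these commits.

```python
import math


def gauss_decay(value, origin, scale, decay=0.5, offset=0.0):
    """Score `value` by its distance from `origin` (illustrative sketch).

    Values within `offset` of `origin` score 1.0. Beyond that, the score
    falls off so that it equals `decay` at a distance of `offset + scale`.
    """
    if scale <= 0 or not 0.0 < decay < 1.0:
        raise ValueError("scale must be > 0 and decay must be in (0, 1)")
    # Distance only starts counting once the offset has been exceeded.
    distance = max(0.0, abs(value - origin) - offset)
    sigma_sq = -scale ** 2 / (2.0 * math.log(decay))
    return math.exp(-(distance ** 2) / (2.0 * sigma_sq))
```

With `origin=10`, `scale=5`, and `offset=2`, any value between 8 and 12 keeps the full score, and the decay curve applies only outside that band.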
2013-08-23 14:44:22 +02:00
Britta Weber c0288a62e6 rename 'total' to 'sum', both enum and for query 2013-08-23 14:44:01 +02:00
Britta Weber 6035134047 add more combine functions and rename PLAIN to REPLACE 2013-08-23 14:44:01 +02:00
Britta Weber db100aa2de make GeoPoint parsable in lat/lon json format 2013-08-23 14:43:54 +02:00
Britta Weber f125ac122c format code 2013-08-23 13:55:14 +02:00
Britta Weber 41c59c6b49 make mult default boost mode
Always multiply the query score with the function score. For script score
functions, this means that boost_mode has to be set to `plain` if
'function_score' should behave like 'custom_score'.
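
The difference between the two modes can be sketched in a few lines of Python. This is an illustration only: the `combine` helper is hypothetical, and reading `plain` as "use the function score and ignore the query score" is an assumption based on the description above.

```python
def combine(query_score, function_score, boost_mode="mult"):
    """Combine a query score with a function score (hypothetical helper)."""
    if boost_mode == "mult":
        # Default: the query score is multiplied with the function score.
        return query_score * function_score
    if boost_mode == "plain":
        # Assumed behaviour: the function score is used as-is.
        return function_score
    raise ValueError(f"unsupported boost_mode: {boost_mode!r}")
```

Under this reading, a script that already computes the final score would be multiplied by the query score a second time unless `plain` is selected.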
2013-08-23 13:55:14 +02:00
Britta Weber 634f1036a0 add boost_mode to rest interface
Allow the user to set combine functions explicitly via the boost_mode variable.
2013-08-23 13:55:14 +02:00
Britta Weber b007af1f46 Fix inconsistent usage of ScriptScoreFunction in FiltersFunctionScoreQuery
This commit fixes inconsistencies in `function_score` and `filters_function_score`
using scripts, see issue #3464

The method 'ScoreFunction.factor(docId)' is removed completely, since the name
suggests that this method actually computes a factor which was not the case.
Multiplying the computed score is now handled by 'FiltersFunctionScoreQuery'
and 'FunctionScoreQuery' and not implicitly performed in
'ScoreFunction.factor(docId, subQueryScore)' as was the case for 'BoostScoreFunction'
and 'DecayScoreFunctions'.

This commit also fixes the explain function for FiltersFunctionScoreQuery. Here,
the influence of the maxBoost was never printed. Furthermore, the queryBoost was
printed as being multiplied to the filter score.

Closes #3464
2013-08-23 13:55:14 +02:00
Alexander Reelsen 2b03bc83a4 Don't write pidfile twice on startup
There is no need to write the pidfile in the bin/elasticsearch shell script
as this happens already in the java code.

Also cleaning up the bin/elasticsearch shell script a bit (no need to return
an error code when exec is called, as this forks and exits the shell script
immediately).

Closes #3529
Closes #1745
2013-08-23 13:20:29 +02:00
Shay Banon 1ac00a13fb cleanup removal of reject_policy 2013-08-23 12:15:47 +02:00
Louis Gueye 048a02eebc Debian init script: Add debian default java location
Adding /usr/lib/jvm/default-java to JAVA_HOME candidates to check

Closes #3500
2013-08-23 11:27:12 +02:00
Martijn van Groningen e173f9a369 Added test for verifying the id cache size if the clear cache api is invoked. 2013-08-23 10:04:13 +02:00
Dan Everton c76e589bc5 Call onRemoval of shard IdCache during clear.
This looks like a copy/paste issue where onCached was being called
rather than onRemoval. This should fix the ID cache stats not being
correct after a call to /_cache/clear?id_cache=true
2013-08-23 09:55:51 +02:00
uboness 4b3a883111 random_score function - Added the index name and shard id to the randomization, and improved the PRNG itself
Closes #3559
2013-08-23 02:48:43 +02:00
Boaz Leskes 109e2944f2 ClusterUpdateSettingsAction will hang if no changes were made
Closes #3560
2013-08-22 20:59:31 +02:00
Simon Willnauer fc3133d087 Prevent NPE if all docs for the field are pruned during merge
During segment merges FieldConsumers for all fields from the source
segments are created. Yet, if all documents that have a value in a
certain field are deleted / pruned during the merge the FieldConsumer
will not be called at all, causing the internally used FST Builder to
return `null` from its `build()` method. This means we can't store
it and potentially run into an NPE. This commit adds handling for
this corner case both during merge and during suggest phase since
we also don't have a Lookup instance for this segment's field.

Closes #3555
2013-08-22 11:59:25 +02:00
Martijn van Groningen 7fda12316a Properly reduce in onFailure 2013-08-21 23:57:15 +02:00
David Pilato 352d2aaf18 Plugin Manager can not download _site plugins from github
It looks like GitHub slightly changed the download URL for the master zip file.

From `https://github.com/username/reponame/zipball/master` to `https://github.com/username/reponame/archive/master.zip`.

We need to update plugin manager to reflect that change.

In the meantime, we invite users having this issue to use:

```sh
bin/plugin -install reponame -url https://github.com/username/reponame/archive/master.zip
```

For example:

```sh
bin/plugin -install paramedic -url https://github.com/karmi/elasticsearch-paramedic/archive/master.zip
```
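
The URL change described above amounts to a suffix rewrite. A minimal Python sketch, where the helper name is hypothetical and only the two URL patterns quoted in this message are handled:

```python
OLD_SUFFIX = "/zipball/master"
NEW_SUFFIX = "/archive/master.zip"


def rewrite_master_zip_url(url):
    """Map the old GitHub master zip URL to the new one (hypothetical helper).

    URLs that do not end in the old suffix are returned unchanged.
    """
    if url.endswith(OLD_SUFFIX):
        return url[:-len(OLD_SUFFIX)] + NEW_SUFFIX
    return url
```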

Closes #3551
2013-08-21 17:44:08 +02:00
David Pilato 8668479b92 Plugin Manager can not download _site plugins from github
It looks like GitHub slightly changed the download URL for the master zip file.

From `https://github.com/username/reponame/zipball/master` to `https://codeload.github.com/username/reponame/zip/master`.

We need to update plugin manager to reflect that change.

In the meantime, we invite users having this issue to use:

```sh
bin/plugin -install reponame -url https://codeload.github.com/username/reponame/zip/master
```

For example:

```sh
bin/plugin -install paramedic -url https://codeload.github.com/karmi/elasticsearch-paramedic/zip/master
```

Closes #3551
2013-08-21 16:08:00 +02:00
Shay Banon 25d28f8afa Completion Suggester: Allow payload to be a value
closes #3550
2013-08-21 15:33:52 +02:00
Alexander Reelsen 210683d70b Checking for payloads as JSON objects in completion suggester
If the payload is not a JSON object at index time, an exception will be thrown.
2013-08-21 14:25:49 +02:00
Alexander Reelsen cdddbb7585 Expose size statistics for completion suggest
In order to determine how much RAM the completion suggest structures will consume, this data should be exposed.

Closes #3522
2013-08-21 13:15:38 +02:00
David Pilato ac3d5d67be Adding Contributing guidelines 2013-08-21 11:27:32 +02:00
Shay Banon b9c8ca8071 Smarter default to `index.index_concurrency`
closes #3546
2013-08-20 22:26:42 +02:00
Shay Banon d442e089ac Bound processor size based calcs to 32
We use the number of processors to choose default thread pool sizes and the number of workers in networking (for HTTP and transport). Bound it to a maximum of 32 by default as a safety measure against creating too many threads.

This relates to #3478, where we set the default to 24, but 32 is probably a better default.
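
The bound itself is a simple clamp. A minimal Python sketch, assuming a hypothetical `bounded_processors` helper; how the project actually reads the processor count or exposes the limit as a setting is not shown here.

```python
import os

DEFAULT_PROCESSOR_CAP = 32  # the default bound named in this commit


def bounded_processors(available=None, cap=DEFAULT_PROCESSOR_CAP):
    """Return the processor count used for sizing, clamped to `cap`.

    When `available` is not given, fall back to what the OS reports.
    """
    if available is None:
        available = os.cpu_count() or 1
    return min(available, cap)
```

On a 64-core machine this yields 32, so pool sizes derived from it stop growing past that point, while smaller machines are unaffected.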

closes #3545
2013-08-20 22:17:26 +02:00
Simon Willnauer 9af7a850e9 Prevent FVH from entering a very long running loop on large docs with high freq phrase terms.
Terminate phrase searches early if the max phrase window is exceeded in
FastVectorHighlighter to prevent very long running phrase
extraction if phrase terms are highly frequent. See LUCENE-5182

Closes #3543
2013-08-20 17:54:34 +02:00
Benjamin Devèze 65056a63a1 Share shards computation logic between searchShards and searchShardsCount 2013-08-20 15:52:14 +02:00
uboness 7dab7ab585 when doing range queries within a query_string query, the range terms should only be lowercased if the field is not numeric
fixes #3540
2013-08-20 14:19:08 +02:00
Luca Cavanna 3b03bc65b9 Renamed readable_format flag to human
Closes #3541
2013-08-20 13:43:36 +02:00