Commit Graph

5196 Commits

Author SHA1 Message Date
Britta Weber 720b550a94 Unify custom scores
===================

The custom boost factor, custom script boost and the filters function query all do the same thing: they take a query and, for each document found, compute a new score based on the query score and some script, some custom boost factor, or a combination of the two. However, the JSON format for these three functionalities is very different, which makes it hard to add new functions.

This commit introduces one keyword <code>function_score</code> for all three functions.

The new format can be used to either compute a new score with one function:

    "function_score": {
        "(query|filter)": {},
        "boost": "boost for the whole query",
        "function": {}
    }

or to combine the newly computed scores of several functions:

    "function_score": {
        "(query|filter)": {},
        "boost": "boost for the whole query",
        "functions": [
            {
                "filter": {},
                "function": {}
            },
            {
                "function": {}
            }
        ],
        "score_mode": "(mult|max|...)"
    }

<code>function</code> here can be either

    "script_score": {
        "lang": "lang",
        "params": {
            "param1": "value1",
            "param2": "value2"
        },
        "script": "some script"
    }

or

	"boost_factor" : number

New custom functions can be added via the function score module.

Changes
---------

The custom boost factor query

    "custom_boost_factor" : {
        "query" : {
            ....
        },
        "boost_factor" : 5.2
    }

becomes

    "function_score" : {
        "query" : {
            ....
        },
        "boost_factor" : 5.2
    }

The custom script score

    "custom_score" : {
        "query" : {
            ....
        },
        "params" : {
            "param1" : 2,
            "param2" : 3.1
        },
        "script" : "_score * doc['my_numeric_field'].value / pow(param1, param2)"
    }

becomes

    "function_score" : {
        "query" : {
            ....
        },
        "script_score" : {
            "params" : {
                "param1" : 2,
                "param2" : 3.1
            },
            "script" : "_score * doc['my_numeric_field'].value / pow(param1, param2)"
        }
    }

and the custom filters score query

    "custom_filters_score" : {
        "query" : {
            "match_all" : {}
        },
        "filters" : [
            {
                "filter" : { "range" : { "age" : {"from" : 0, "to" : 10} } },
                "boost" : "3"
            },
            {
                "filter" : { "range" : { "age" : {"from" : 10, "to" : 20} } },
                "script" : "_score * doc['my_numeric_field'].value / pow(param1, param2)"
            }
        ],
        "score_mode" : "first",
        "params" : {
            "param1" : 2,
            "param2" : 3.1
        }
    }

becomes:

    "function_score" : {
        "query" : {
            "match_all" : {}
       	},
        "functions" : [
            {
                "filter" : { "range" : { "age" : {"from" : 0, "to" : 10} } },
                "boost_factor" : "3"
            },
            {
                "filter" : { "range" : { "age" : {"from" : 10, "to" : 20} } },
                "script_score" : {
                    "script" : "_score * doc['my_numeric_field'].value / pow(param1, param2)",
                    "params" : {
                        "param1" : 2,
                        "param2" : 3.1
                    }
                }
            }
        ],
        "score_mode" : "first"
    }
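
The mapping described above can be sketched as a small rewrite helper. This is only an illustration in Python, not code from this commit: the name `to_function_score` is hypothetical, and it assumes that a per-filter `boost` in the old `custom_filters_score` corresponds to `boost_factor` in the new format (inferred from the list of allowed functions, not confirmed here) and that the shared `params` move into each `script_score`.

```python
import copy

SCRIPT_KEYS = ("script", "params", "lang")


def _script_score(source):
    """Collect the script-related keys of an old body into a script_score body."""
    return {key: source[key] for key in SCRIPT_KEYS if key in source}


def to_function_score(old):
    """Rewrite {"custom_*": {...}} into {"function_score": {...}} (hypothetical helper)."""
    [(kind, body)] = old.items()
    body = copy.deepcopy(body)
    new = {"query": body["query"]}

    if kind == "custom_boost_factor":
        new["boost_factor"] = body["boost_factor"]
    elif kind == "custom_score":
        new["script_score"] = _script_score(body)
    elif kind == "custom_filters_score":
        shared_params = body.get("params")
        functions = []
        for entry in body["filters"]:
            function = {"filter": entry["filter"]}
            if "script" in entry:
                script = _script_score(entry)
                if shared_params is not None:
                    # Assumption: query-level params are copied into each script.
                    script.setdefault("params", copy.deepcopy(shared_params))
                function["script_score"] = script
            else:
                # Assumption: the old per-filter "boost" maps to "boost_factor".
                function["boost_factor"] = entry["boost"]
            functions.append(function)
        new["functions"] = functions
        if "score_mode" in body:
            new["score_mode"] = body["score_mode"]
    else:
        raise ValueError("unsupported query type: " + kind)
    return {"function_score": new}
```

Passing the old `custom_boost_factor` body from above through this helper yields the `function_score` body with the same `query` and `boost_factor`.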

Partially closes issue #3423
2013-08-06 18:37:34 +02:00
Luca Cavanna e1c739fe6f Improved test, printed out potential shard failures 2013-08-06 16:24:29 +02:00
Alexander Reelsen 0db2db612b RPM Init script bugfix, which might prevent startup
Removed dangerous set calls, which might not restore the previous state but set something invalid, causing the script to stop when proceeding
2013-08-06 16:19:53 +02:00
Luca Cavanna a3071540d7 Added support for readable_format parameter when printing out time and size values
The following APIs are affected by this change and now support the readable_format flag (default false when not specified):
- indices segments
- indices stats
- indices status
- cluster nodes stats
- cluster nodes info

Closes #3432
2013-08-06 16:08:47 +02:00
Shay Banon ebb4bcd45e add 0.90.4 2013-08-06 15:28:02 +02:00
Alexander Reelsen 68b77c1ae3 Included only runtime dependencies when copying
This makes sure that no test dependencies are placed in the distribution
2013-08-06 15:13:25 +02:00
Martijn van Groningen fec196b8d8 Better check for verifying that the _percolator type is removed 2013-08-06 14:20:36 +02:00
Boaz Leskes 43e374f793 Maxing out retries on conflict in bulk update causes null pointer exceptions
Also:
Bulk update performed one less retry than requested
Documentation for retries on conflict said it defaults to 1 (but the default is 0)
TransportShardReplicationOperationAction methods now catch Throwables instead of exceptions
Added a little extra check to UpdateTests.concurrentUpdateWithRetryOnConflict
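
The off-by-one in the retry count can be illustrated with a small sketch. It is written in Python rather than taken from the Elasticsearch sources, and it assumes that "N retries on conflict" means one initial attempt plus up to N further attempts; `VersionConflict` and `update_with_retries` are hypothetical names.

```python
class VersionConflict(Exception):
    """Hypothetical stand-in for a version conflict raised by an update."""


def update_with_retries(attempt_update, retry_on_conflict=0):
    """Run attempt_update once, then retry up to retry_on_conflict times.

    Returns (result, number_of_attempts). The default of 0 means a
    conflict on the first attempt is raised immediately.
    """
    attempts = 0
    while True:
        attempts += 1
        try:
            return attempt_update(), attempts
        except VersionConflict:
            # attempts - 1 retries have been used so far; give up once
            # the requested number of retries is exhausted.
            if attempts > retry_on_conflict:
                raise
```

With `retry_on_conflict=2`, an update that conflicts twice and then succeeds is attempted three times in total; a loop bounded by `attempts >= retry_on_conflict` would stop one retry early.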

Closes #3447 & #3448
2013-08-06 13:06:06 +02:00
Luca Cavanna 636c35d0d4 Added missing metadata fields to upserted documents (parent, routing, ttl, timestamp, version and versionType)
Closes #3444
2013-08-06 12:00:44 +02:00
Simon Willnauer 88a0e4628a Catch RejectedExecutionException in outer ping request 2013-08-05 23:33:38 +02:00
Martijn van Groningen a237eead55 If the _percolator has been removed then also remove percolator queries. 2013-08-05 18:43:11 +02:00
Simon Willnauer 1983a3676a Use domain specific assertions for shard failures across tests 2013-08-05 17:50:24 +02:00
Simon Willnauer df747836d8 Use busy sleeps in NoMasterNodeTests
The busy sleep is less prone to slow tests / machines while still
failing if the actual condition isn't met.
2013-08-05 16:50:45 +02:00
Simon Willnauer d949f67241 Add better assertion reporting if nodes are not present in the ClusterState 2013-08-05 15:40:54 +02:00
Martijn van Groningen e55dab94ea the ttl purger might have already deleted the documents. 2013-08-05 14:22:47 +02:00
Shay Banon d7922b8554 Streamline Search / Broadcast (count, suggest, refresh, ...) APIs header
closes #3441
2013-08-05 12:55:38 +02:00
Simon Willnauer 539ffb9ef5 Fix occasionally hanging test moving away from timeouts.
Fixes EsExecutorTests to use latches and a busy wait util from
ElasticsearchTestCase. This commit also adds some minor randomization
to the test.
2013-08-05 11:43:48 +02:00
Simon Willnauer 094c10d62d Added busy waiting util and add suite timeout.
Some rare tests need to busy-wait for a short time until a given
condition occurs, for instance until a threadpool has scaled down the
number of threads. This commit adds a util that waits up to a given time
until a condition is met. In contrast to Thread.sleep, this method
increases the wait time iteratively by doubling it, to prevent fast
tests from always waiting a given sleep interval.

This commit also adds a suite timeout to fail a test if the test
times out. The test infrastructure will provide thread stack traces
if the timeout kicks in. The default timeout is set to 1h.
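
A busy-wait helper with a doubling sleep interval might look roughly like the sketch below. It is a Python illustration of the idea only; the name `await_busy`, its parameters and the 1 ms starting interval are assumptions, not the util added by this commit.

```python
import time


def await_busy(condition, max_wait_seconds=10.0, initial_sleep=0.001):
    """Poll condition() until it holds or max_wait_seconds have been slept.

    The sleep between polls starts at initial_sleep and doubles each
    round, so fast conditions return quickly while slow ones are not
    polled in a tight loop.
    """
    slept = 0.0
    interval = initial_sleep
    while slept < max_wait_seconds:
        if condition():
            return True
        # Never sleep past the overall budget.
        pause = min(interval, max_wait_seconds - slept)
        time.sleep(pause)
        slept += pause
        interval *= 2
    # One last check after the budget is used up.
    return condition()
```

A test would then assert on the return value, for example `assert await_busy(lambda: pool_size() == 1)`, where `pool_size` stands for whatever the test is waiting on.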
2013-08-05 11:43:47 +02:00
Alexander Reelsen 9c7a87f118 Overwriting pidfile on startup
The current implementation does not overwrite the pidfile, but only writes the new PID over the beginning of the existing content.
So if the new PID is 4 digits long, but the file is already there with a 5 digit number, the file will contain 5 digits after the write.

Note: If the pidfile still exists, this usually means that either an instance using this pidfile is already running or the process has not finished correctly.
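
The effect of writing without truncating can be reproduced in a few lines. This is a Python illustration of the file behaviour only, not the code changed by this commit; the function names and the PIDs are made up.

```python
import os
import tempfile


def write_pid_in_place(path, pid):
    """Problematic variant: writes at offset 0 and keeps any leftover bytes."""
    with open(path, "r+") as handle:
        handle.write(str(pid))


def write_pid_truncating(path, pid):
    """Fixed variant: mode "w" truncates the file before writing."""
    with open(path, "w") as handle:
        handle.write(str(pid))


def read_text(path):
    with open(path) as handle:
        return handle.read()


def demo():
    """Return the pidfile content after an in-place and a truncating write."""
    with tempfile.TemporaryDirectory() as tmp:
        pidfile = os.path.join(tmp, "example.pid")
        write_pid_truncating(pidfile, 12345)   # leftover 5 digit PID
        write_pid_in_place(pidfile, 6789)      # new 4 digit PID, no truncation
        stale = read_text(pidfile)
        write_pid_truncating(pidfile, 6789)    # overwrite properly
        clean = read_text(pidfile)
        return stale, clean
```

Here `demo()` returns `("67895", "6789")`: the in-place write leaves the last digit of the old PID behind, while the truncating write does not.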

Closes #3425
2013-08-05 11:28:37 +02:00
Alexander Reelsen 94d3e27940 Added index templates REST support for HEAD and proper 404
* Added HEAD support for index templates to find out if they exist
* Returning a 404 instead of a 200 if a GET hits on a non-existing index template

Closes #3434
2013-08-04 13:51:34 +02:00
Lukas Vlcek f2168d32c1 Make (HighlightBuilder|SearchContextHighlight).Field consistent
Update the HighlightBuilder.Field API so that it allows the same settings
as SearchContextHighlight.Field. In other words, what can be set up
using the DSL in highlighting at the field level is also
possible via the Java API.

Closes #3435
2013-08-02 22:01:35 +02:00
Martijn van Groningen 5cf429d144 Wait for green status when index is created 2013-08-02 20:56:54 +02:00
Simon Willnauer 263c5808bb Don't cache BytesRef in ThreadLocal 2013-08-02 20:30:52 +02:00
Luca Cavanna 85b7efa08b Added support for named filters in top-level filter
Closes #3097
2013-08-02 17:13:46 +02:00
Martijn van Groningen bd324676bc Removed AliasMissingException, get alias api will now just return an empty map. In the rest layer a 404 is returned when map is empty. 2013-08-02 17:10:16 +02:00
Martijn van Groningen 1f71890e10 Use assertions that print out shard failures, if there are any 2013-08-02 16:31:00 +02:00
Shay Banon 1a6514c413 mark bool field type as not tokenized
even though we use keyword analyzer for the bool type, we should mark it as not tokenized in the lucene field type as well, no reason to take it through the analysis phase to begin with
2013-08-02 14:44:00 +02:00
Simon Willnauer 012d47b500 Use debug logging rather than info for rejected ping task
This exception is thrown on node shutdown and doesn't indicate
a critical situation but rather is caught for consistency reasons.
2013-08-02 14:10:55 +02:00
Martijn van Groningen 890d06f018 Added count percolate api
Added a new percolate api that only returns the number of percolate queries that have matched with the document being percolated. The actual query ids are not included. The percolate total count will be put in the total field and is the only result that will be returned from the dedicated count apis.

The total field will also be included in the already existing percolate and percolating existing document apis and is equal to the number of matches.

Closes #3430
2013-08-02 12:30:20 +02:00
Simon Willnauer 2a211705a3 Catch and Log RejectedExecutionException in async ping 2013-08-02 11:32:15 +02:00
Shay Banon a8dcfa5deb Search on a shard group while relocation final flip happens might fail
single shard read operations should have the same override exception logic as search and broadcast

relates to #3427
2013-08-02 09:56:56 +02:00
Alexander Reelsen 343871fcf5 Allow bin/plugin to set -D JVM parameters
Previously, the bin/plugin command did not allow setting JVM parameters
at startup. Usually these parameters are not needed (no need to configure
heap sizes for such a short running process), but one could not set the
configuration path either. And that one is important for plugins in order
to find out where the plugin directory is.

This is especially problematic when elasticsearch is installed as
debian/rpm package, because the configuration file is not placed in the
same directory structure as the plugin shell script.

This pull request allows calling bin/plugin like this

bin/plugin -Des.default.config=/etc/elasticsearch/elasticsearch.yml -install mobz/elasticsearch-head

As a last small improvement, the PluginManager now outputs the directory
the plugin was installed to in order to avoid confusion.

Closes #3304
2013-08-02 09:19:57 +02:00
Shay Banon 235b3a3635 Search on a shard group while relocation final flip happens might fail
make sure relocating shards add their corresponding initializing shard routing when searching across initializing shards

also, make shardFailures lazy again

closes #3427
2013-08-02 00:20:10 +02:00
Shay Banon ebda203ce6 less aggressive timeout to catch it on the pending check 2013-08-01 19:52:37 +02:00
Shay Banon 192025401b improve test to wait for nodes before getting the local node id 2013-08-01 19:45:08 +02:00
Shay Banon f3d3a8bd58 Search on a shard group while relocation final flip happens might fail
At the final stage of a relocation, during the final flip of the states, a search request might hit a node that would then execute it on a shard that has already relocated.

For this, we need to execute broadcast and search operations against initializing shards as well, but only as a last resort. The operation will be rejected if not applicable (i.e. IndexShard#searcher() checked for read allowed).

Note, this requires careful thought about which failures we send back. If we try an initializing shard and it fails, its failure should not override an actual failure of an active shard.

Also, removed an atomic integer used in broadcast request and use a similar shard index trick we now have in our search execution.
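
The "last resort" ordering and the failure precedence described above can be sketched as follows. This is a simplified Python model under stated assumptions, not the actual search or broadcast code: shard copies are plain `(name, state)` tuples, `run` is any callable that raises when a copy rejects the operation, and `execute_on_shard_group` is a hypothetical name.

```python
def execute_on_shard_group(copies, run):
    """Try active copies first and initializing copies only as a last resort.

    copies is a list of (name, state) tuples with state "active" or
    "initializing". If every copy fails, the failure of an active copy is
    reported in preference to the failure of an initializing one.
    """
    # sorted() is stable, so copies keep their relative order per state.
    ordered = sorted(copies, key=lambda copy: copy[1] != "active")
    reported = None
    reported_from_active = False
    for name, state in ordered:
        try:
            return run(name)
        except Exception as failure:
            is_active = state == "active"
            # An initializing copy's failure never replaces an active one's.
            if reported is None or is_active or not reported_from_active:
                reported = failure
                reported_from_active = is_active
    if reported is None:
        raise ValueError("no shard copies to execute on")
    raise reported
```

If the active copy has already relocated and rejects the request while the initializing target accepts it, the call returns the target's result; if both fail, the caller sees the active copy's failure.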

closes #3427
2013-08-01 18:35:58 +02:00
Luca Cavanna 60bddc28eb Modified test to make failures clearer
Added shard failure check when sorting on unmapped field, could be any SearchPhaseExecutionException otherwise (e.g. missing shards)
2013-08-01 17:07:52 +02:00
Simon Willnauer f2f70a415a Take fragile test out of the loop
UpdateNumberOfReplicasTests#simpleUpdateNumberOfReplicasTests is very
fragile due to executing searches based on dated knowledge of
the cluster state and calling shards that have been relocated away in
the meantime. A fix is on the way.
2013-08-01 15:40:04 +02:00
Martijn van Groningen 300db594aa Run refresh before executing non realtime get 2013-08-01 15:12:15 +02:00
Martijn van Groningen a95ce1987e Made use of the static client() method instead of the client field. 2013-08-01 15:07:35 +02:00
Martijn van Groningen 31fd7764e7 If no mapping can be found for value field, throw a proper exception. 2013-08-01 14:09:41 +02:00
Shay Banon 13845e47d6 cleanup ShardIter#reset method
don't have the reset method return an instance, as it might confuse usage into thinking it might be a different instance
2013-08-01 13:23:39 +02:00
Shay Banon 074b89b7ad better failure message in test 2013-08-01 12:41:42 +02:00
Martijn van Groningen 56227cc141 Improved alias support in the percolate api
* Changed the response to include the alias as part of each match.
* Added `percolate_format=ids` query string option to just serialize the ids in the rest response.
* Added support for multiple indices in the percolate api.

Closes #3420
2013-08-01 10:42:14 +02:00
Alexander Reelsen 5cd01461f6 Improving CompletionSuggestSearch test stability
Ensuring that the maximum number of replicas is less than the number of nodes.
2013-08-01 10:14:18 +02:00
Alexander Reelsen 4f4f3a2b10 Added prefix suggestions based on AnalyzingSuggester
This commit introduces near realtime suggestions. For more information about
its usage refer to github issue #3376

From the implementation point of view, a custom AnalyzingSuggester is used
in combination with a custom postingsformat (which is not exposed to the user
anywhere for him to use).

Closes #3376
2013-08-01 08:44:09 +02:00
Shay Banon fd15b6278b Query/Filter Facet should support 64bit counter, not 32
closes #3419
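
Why the counter width matters can be shown with a short sketch. It emulates the wraparound of a Java `int` (32 bit) and `long` (64 bit) in Python; the helper names are made up for the illustration and the code is not taken from the facet implementation.

```python
def as_int32(value):
    """Wrap value the way a signed 32 bit counter would."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value >= 0x80000000 else value


def as_int64(value):
    """Wrap value the way a signed 64 bit counter would."""
    value &= 0xFFFFFFFFFFFFFFFF
    return value - 0x10000000000000000 if value >= 0x8000000000000000 else value


MAX_INT32 = 2**31 - 1

# One more hit than a 32 bit counter can hold.
hits = MAX_INT32 + 1
count_32 = as_int32(hits)  # wraps to a negative number
count_64 = as_int64(hits)  # still the true count
```

With these definitions `count_32` is `-2147483648` while `count_64` is `2147483648`, which is the kind of overflow a 64 bit counter avoids.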
2013-07-31 22:10:43 +02:00
Shay Banon e3480a1c0a Reroute eagerly on shard started events
We have an optimization where we try to delay reroute after we processed the shard started events to try and combine a few into the same event. With the queueing of shard started events in place, we don't need to do it, and we can reroute right away, which will actually reduce the amount of cluster state events we send.

This will also have a nice side effect of not missing on "waitForRelocatingShards(0)" on cluster health checks since relocations will happen right away.

closes #3417
2013-07-31 16:58:26 +02:00
Luca Cavanna 2f8a397aa5 Improved test
- checking routing table taken from same (up-to-date) cluster state
- added @Slow annotation
- forced cluster reroute when needed
- changed order of assertions so that if it fails again it's easier to understand why
2013-07-31 14:26:44 +02:00
Shay Banon 433f0cc86c process deleted index events on a node even if it has no local FS
this will not happen now, but in the future, if data nodes will only be in memory (including translog and such), then we need to fire the deleted events
2013-07-31 13:59:53 +02:00