The multi percolate API allows one to bundle multiple percolate requests into a single request. This API works similarly to the multi search API. The request body format is line based: each percolate request item takes two lines, where the first line is the header and the second line is the body.
The header can contain any parameter that would normally be set via the request path or query string parameters. There are several percolate actions, because there are multiple types of percolate requests:
* `percolate` - Action for defining a regular percolate request.
* `count_percolate` - Action for defining a count percolate request.
* `percolate_existing_doc` - Action for defining a percolate existing document request.
* `count_percolate_existing_doc` - Action for defining a count percolate existing document request.
Each action has its own set of parameters that need to be specified in the action's header.
Format:
```
{"[header_type]" : {[options...]}
{[body]}
```
Depending on the percolate action, different parameters can be specified. For example, the percolate and percolate existing document actions support different parameters.
The following endpoints are supported:
```
POST localhost:9200/[index]/[type]/_mpercolate
POST localhost:9200/[index]/_mpercolate
POST localhost:9200/_mpercolate
```
The `index` and `type` defined in the url path are the default index and type.
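For example, a request that bundles a regular percolate item and a count percolate item might look like this (index, type and document fields are illustrative):
```bash
curl -XGET 'localhost:9200/my-index/my-type/_mpercolate' --data-binary '
{"percolate" : {}}
{"doc" : {"message" : "some text"}}
{"count_percolate" : {"index" : "other-index", "type" : "other-type"}}
{"doc" : {"message" : "some other text"}}
'
```
The first item inherits the index and type from the url path, while the second item overrides both in its header.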
Closes#3488
FVH deploys some recursive logic to extract terms from documents
that need to be highlighted. For documents that have terms with a very
large term frequency, e.g. a document that repeats a term very
often, this can produce some very large stacks when extracting
the terms. Taken to an extreme, this causes stack overflow errors
once the term frequency grows beyond 6000.
The ultimate solution is an iterative implementation of the extract
logic, but until then we should protect users from these massive
term extractions, which are likely not very useful in the first place.
Closes#3486
Added the FuzzySuggester in order to support fuzzy completion queries
The following options have been added for the fuzzy suggester
* edit_distance: Maximum edit distance
* transpositions: Sets if transpositions should be counted as one or two changes
* min_prefix_len: Minimum length of the input before fuzzy suggestions are returned
* non_prefix_len: Minimum length of the input, which is not checked for fuzzy alternatives
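A minimal sketch of what a fuzzy completion request could look like (index, suggestion and field names are illustrative, and the exact request shape is an assumption):
```bash
curl -XPOST 'localhost:9200/my-index/_suggest' -d '{
    "my-suggestion" : {
        "text" : "elasticsaerch",
        "completion" : {
            "field" : "suggest",
            "fuzzy" : {
                "edit_distance" : 2,
                "transpositions" : true,
                "min_prefix_len" : 1
            }
        }
    }
}'
```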
Closes#3465
By making use of the LSB-provided functions, one does not depend on the start-stop-daemon version to test whether elasticsearch is running.
This ensures that the init script works on Debian wheezy and squeeze as well as on current Ubuntu and LTS versions.
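A minimal sketch of the approach, assuming the standard Debian LSB helper and illustrative pid file and daemon paths:
```bash
# Source the LSB-provided helper functions instead of probing
# start-stop-daemon directly.
. /lib/lsb/init-functions

# status_of_proc comes from the LSB init-functions and reports whether
# the daemon behind the given pid file is running.
status_of_proc -p /var/run/elasticsearch.pid /usr/share/elasticsearch/bin/elasticsearch elasticsearch
```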
Closes#3452
This commit adds support for failing fast when running a test
case with `-Dtests.iters=N` and uses some goodness from LuceneTestCase
in a new base class `AbstractRandomizedTest`. This class checks, among other
things, that tests call `super.setUp` / `super.tearDown` when they
should, and that large static resources, e.g. a running node, are
cleaned up after the tests.
Retrieving term vectors for a document that does not have the requested field
caused a null pointer exception. The same happened if the field had no term vectors,
for example because the field only contains "?".
Now, an empty response is returned.
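For example, a request like the following (index, type, id and field name are illustrative) used to trip the NPE and now yields an empty term vector response:
```bash
curl -XGET 'localhost:9200/my-index/my-type/1/_termvector?fields=message'
```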
Closes#3471
MultiOrdinals.MultiDocs returned `null` ordinals, which caused
an NPE if the field was single valued and would thus allow a significantly
smaller in-memory representation than generic packed int ordinals.
Closes#3470
We currently return status code 0 when an IOException occurs.
The plugin manager should in any case return a nonzero status if
the operation was not successful. Now the PluginManager uses the
following response codes, based on `sysexits.h`:
* `0` on success
* `64` command line usage error
* `70` internal software error
* `74` input/output error
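This makes failed installations scriptable, e.g. (plugin name is illustrative):
```bash
bin/plugin -install mobz/elasticsearch-head || echo "installation failed with status $?"
```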
Closes#3463
Lucene 4.4 shipped with a fundamental change in how the decision
on when to write compound files is made. During segment flush the
compound files are written by default, which relies solely on a flag
in the IndexWriterConfig. The merge policy has been factored out to
only make decisions about merges and not about IndexWriter flushes. The default
is now to always write CFS on flush to reduce resource usage, like open files,
if segments are flushed regularly. While this provides a sensible
default, certain users / use cases might need to change this setting if
re-packing flushed segments into CFS is not desired.
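For instance, assuming the flag is exposed as an `index.compound_on_flush` index setting, it could be turned off like this (index name is illustrative):
```bash
curl -XPUT 'localhost:9200/my-index/_settings' -d '{
    "index.compound_on_flush" : false
}'
```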
Closes#3461
The ClusterService might not see the latest cluster state and therefore
might not contain the local node id. Discovery will always see the local
node id since it's set on startup.
Today, due to the optimizations in the boolean query builder, we adjust
a pure negative query by adding a 'match_all'. This is not the desired
behavior in the MLT API if all the fields in a document are unsupported.
If that happens today we return all documents but the one MLT is
executed on.
Closes#3453
In the `_parent` field, the type and id of the parent are stored as `type#id`. Because of this, a term filter on the `_parent` field with just the parent id is always resolved to a terms filter with a type / id combination for each type in the mapping.
This can be improved by automatically using the most optimized filter (either term or terms) based on the number of parent types in the mapping.
Also added support for using the parent type in the term filter for the `_parent` field, like this:
```json
{
"term" : {
"_parent" : "parent_type#1"
}
}
```
This will then always automatically use the term filter.
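For comparison, a term filter that only specifies the parent id keeps working and is rewritten based on the number of parent types in the mapping:
```json
{
    "term" : {
        "_parent" : "1"
    }
}
```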
Closes#3454
The `size` option in the percolate api will limit the number of matches being returned:
```bash
curl -XGET 'localhost:9200/my-index/my-type/_percolate' -d '{
"size" : 10,
"doc" : {...}
}'
```
In the above request, no more than 10 matches will be returned. The `count` field will still return the total number of queries the document matched.
The `size` option is not applicable for the count percolate api.
Closes#3440
This change requires different request processing on the binary protocol
level, since we provide compatibility across minor versions.
The suggest feature is still experimental, but we make a best effort
to keep upgrades as seamless as possible.
This commit adds general highlighting support to the suggest feature.
The only suggester that implements this functionality at this
point is the phrase suggester.
The API supports a 'pre_tag' and a 'post_tag' that are used
to wrap the parts of the given user input that were changed by the
suggester.
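A minimal sketch of a phrase suggest request using the new options (index, field and tag values are illustrative, and the exact nesting of the highlight options is an assumption):
```bash
curl -XPOST 'localhost:9200/my-index/_suggest' -d '{
    "my-suggestion" : {
        "text" : "noble prize",
        "phrase" : {
            "field" : "body",
            "highlight" : {
                "pre_tag" : "<em>",
                "post_tag" : "</em>"
            }
        }
    }
}'
```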
Closes#3442
ScoreFunction scoring might result in underflow or overflow, for example if a user
decides to use the timestamp as a boost in the script scorer. Therefore, check
if the cast causes a huge precision loss. Note that this does not always detect
casting issues. For example, in `ScriptFunction.score()` the function
`SearchScript.runAsDouble()` is called, which `AbstractFloatSearchScript` implements as follows:
`@Override public double runAsDouble() { return runAsFloat(); }`
In this case the cast happens before the assertion, and therefore precision
loss or over/underflows cannot be detected by the assertion.
It might sometimes be desirable to have a tool available that allows one to multiply the original score for a document with a function that decays depending on the distance of a numeric field value of the document from a user-given reference.
These functions could be computed for several numeric fields and eventually be combined as a sum or a product and multiplied with the score of the original query.
This commit adds new score functions, similar to boost factor and custom script scoring, that can be used together with the <code>function_score</code> keyword in a query.
To use distance scoring, the user has to define
1. a reference and
2. a scale
for each field the function should be applied on. A reference is needed to define a distance for the document and a scale to define the rate of decay.
Example use case
----------------
Suppose you are searching for a hotel in a certain town. Your budget is limited. Also, you would like the hotel to be close to the town center, so the farther the hotel is from the desired location the less likely you are to check in.
You would like the query results that match your criterion (for example, "hotel, Berlin, non-smoker") to be scored with respect to distance to the town center and also the price.
Intuitively, you would like to define the town center as the origin and maybe you are willing to walk 2km to the town center from the hotel.
In this case your *reference* for the location field is the town center and the *scale* is ~2km.
If your budget is low, you would probably prefer something cheap over something expensive.
For the price field, the *reference* would be 0 Euros and the *scale* depends on how much you are willing to pay, for example 20 Euros.
Usage
----------------
The distance score functions can be applied in two ways:
In the simplest case, only one numeric field is to be evaluated. To do so, call <code>function_score</code> with the appropriate function. In the above example, this might be:
    curl 'localhost:9200/hotels/_search/' -d '{
        "query": {
            "function_score": {
                "gauss": {
                    "location": {
                        "reference": [
                            52.516272,
                            13.377722
                        ],
                        "scale": "2km"
                    }
                },
                "query": {
                    "bool": {
                        "must": {
                            "match": {
                                "city": "Berlin"
                            }
                        }
                    }
                }
            }
        }
    }'
which would then search for hotels in Berlin and weight them depending on how far they are from the Brandenburg Gate.
If you have more than one numeric field, you can combine them by defining a series of functions and filters, for example:
    curl 'localhost:9200/hotels/_search/' -d '{
        "query": {
            "function_score": {
                "functions": [
                    {
                        "filter": {
                            "match_all": {}
                        },
                        "gauss": {
                            "location": {
                                "reference": "11,12",
                                "scale": "2km"
                            }
                        }
                    },
                    {
                        "filter": {
                            "match_all": {}
                        },
                        "linear": {
                            "price": {
                                "reference": "0",
                                "scale": "20"
                            }
                        }
                    }
                ],
                "query": {
                    "bool": {
                        "must": {
                            "match": {
                                "city": "Berlin"
                            }
                        }
                    }
                },
                "score_mode": "multiply"
            }
        }
    }'
This would effectively compute the decay function for "location" and "price" and multiply them onto the score. See <code>function_score</code> for the different options for combining functions.
Supported fields
----------------
Only single valued numeric fields, including time and geo locations, are supported.
What if a field is missing?
----------------
If the numeric field is missing in the document, that field will not be taken into account for this document; the function value for this field is set to 1 for this document. Suppose you have two hotels, both of which are in Berlin and cost the same. If one of the documents does not have a "location", this document would get a higher score than the document that has the "location" field set.
To avoid this, you could, for example, use the exists or the missing filter and add a custom boost factor to the functions.
    …
    "functions": [
        {
            "filter": {
                "match_all": {}
            },
            "gauss": {
                "location": {
                    "reference": "11, 12",
                    "scale": "2km"
                }
            }
        },
        {
            "filter": {
                "match_all": {}
            },
            "linear": {
                "price": {
                    "reference": "0",
                    "scale": "20"
                }
            }
        },
        {
            "boost_factor": 0.001,
            "filter": {
                "bool": {
                    "must_not": {
                        "missing": {
                            "existence": true,
                            "field": "coordinates",
                            "null_value": true
                        }
                    }
                }
            }
        }
    ],
    ...
Closes#3423