OpenSearch

Commit Graph

Author	SHA1	Message	Date
Christoph Büscher	619e4c1a44	Merge branch 'master' into feature/rank-eval	2016-11-04 12:48:42 +01:00
Adrien Grand	2a70f6e7b1	Upgrade to lucene-6.3.0-snapshot-a66a445. (#21309) This addresses a bug that was introduced with https://issues.apache.org/jira/browse/LUCENE-7501.	2016-11-04 10:34:04 +01:00
Christoph Büscher	2dad72e68c	Rank Eval: Handle precision@ edge case There's a currently unhandled edge case for the precision@ metric. When none of the search hits in the result are rated, we have neither true nor false positives, which currently leads to division by zero. We should return a precision of 0.0 in this case.	2016-11-03 12:59:36 +01:00
Nik Everett	24d5f31a54	Make painless's assertion about out of bound less brittle Instead of asserting that the message is shaped a certain way we cause the exception and catch it and assert that the messages are the same. This is the way to go because the exception message from the jvm is both local and jvm dependent. This is the CI failure that found this: https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+5.x+java9-periodic/515/consoleFull	2016-11-02 12:38:51 -04:00
Christoph Büscher	b3370de715	Tests: Add warning header checks to QueryBuilder tests and QueryParseContextTests This adds checks for expected warning headers to the query builder test infrastructure. Tests that are adding deprecation warnings to the response headers need to check those, otherwise the abstract base class for the test class will complain at teardown.	2016-11-02 15:45:33 +01:00
Adrien Grand	aa6cd93e0f	Require arguments for QueryShardContext creation. (#21196) The `IndexService#newQueryShardContext()` method creates a QueryShardContext on shard `0`, with a `null` reader and that uses `System.currentTimeMillis()` to resolve `now`. This may hide bugs, since the shard id is sometimes used for query parsing (it is used to salt random score generation in `function_score`), passing a `null` reader disables query rewriting and for some use-cases, it is simply not ok to rely on the current timestamp (e.g. percolation). So this pull request removes this method and instead requires that all call sites provide these parameters explicitly.	2016-11-02 09:48:49 +01:00
Nik Everett	a612e5988e	Bump reindex-from-remote's buffer to 200mb It was 10mb and that was causing trouble when folks reindex-from-remoted with large documents. We also improve the error reporting so it tells folks to use a smaller batch size if they hit a buffer size exception. Finally, adds some docs to reindex-from-remote mentioning the buffer and giving an example of lowering the size. Closes #21185	2016-11-01 13:19:28 -04:00
Christoph Büscher	25565b9baa	RankEval: Check for duplicate keys in rated documents When multiple ratings for the same document (identified by _index, _type, _id) are specified in the request we should throw an error. This change adds a check for this in the RatedRequest setter (and ctor that uses that setter). Closes #20997	2016-11-01 14:54:05 +01:00
Christoph Büscher	51102ee91c	Fixing compile issue with ScriptType after merge with master	2016-11-01 14:42:24 +01:00
Isabel Drost-Fromm	0b8a2e40cb	First step towards supporting templating in rank eval requests. (#20374) This adds support for templating in rank eval requests. Relates to #20231 Problem: In its current state the rank-eval request API forces the user to repeat complete queries for each test request. In most use cases the structure of the query to test will be stable with only parameters changing across requests, so this looks like lots of boilerplate json for something that could be expressed in a more concise way. Uses templating/ScriptServices to enable users to submit only one test request template and let them only specify template parameters on a per test request basis.	2016-11-01 11:36:22 +01:00
Jason Tedor	38663351dc	Fix logger names for Netty Previously Elasticsearch would only use the package name for logging levels, truncating the package prefix and the class name. This meant that logger names for Netty were just prefixed by netty3 and netty. We changed this for Elasticsearch so that it's the fully-qualified class name now, but never corrected this for Netty. This commit fixes the logger names for the Netty modules so that their levels are controlled by the fully-qualified class name. Relates #21223	2016-10-31 17:23:21 -04:00
Jack Conradson	185dff7346	Cleanup ScriptType (#21179) Refactored ScriptType to clean up some of the variable and method names. Added more documentation. Deprecated the 'in' ParseField in favor of 'stored' to match the indexed scripts being replaced by stored scripts.	2016-10-31 13:48:51 -07:00
Nik Everett	1bbd3c5400	Fix painless's out of bounds assertions in java 9 Java 9's exception message when lists have an out of bounds index is much better than java 8 but the painless code asserted on the java 8 message. Now it'll accept either. I'm tempted to weaken the assertion but I like asserting that the message is readable.	2016-10-29 22:21:57 -04:00
Nik Everett	3a7a218e8f	Support negative array offsets in painless Adds support for indexing into lists and arrays with negative indexes meaning "counting from the back". So if `x = ["cat", "dog", "chicken"]` then `x[-1] == "chicken"`. This adds an extra branch to every array and list access but some performance testing makes it look like the branch predictor successfully predicts the branch every time so there isn't a penalty in execution time for this feature when the index is positive. When the index is negative performance testing showed the runtime is the same as writing `x[x.length - 1]`, again, presumably thanks to the branch predictor. Those performance metrics were calculated for lists and arrays but `def`s get roughly the same treatment though instead of inlining the test they need to make an invoke dynamic call so we don't screw up maps. Closes #20870	2016-10-29 16:12:40 -04:00
Adrien Grand	b3cc54cf0d	Upgrade to lucene-6.3.0-snapshot-ed102d6 (#21150) Lucene 6.3 is expected to be released in the next weeks so it'd be good to give it some integration testing. I had to upgrade randomized-testing too so that both Lucene and Elasticsearch are on the same version.	2016-10-28 14:47:15 +02:00
Christoph Büscher	51a2e3bf1e	Merge branch 'master' into feature/rank-eval	2016-10-27 11:11:37 +02:00
Jack Conradson	512a77a633	Refactor ScriptType to be a top-level class.	2016-10-26 10:21:22 -07:00
Jason Tedor	9c3e4d6e22	Add correct Content-Length on HEAD requests This commit fixes responses to HEAD requests so that the value of the Content-Length is correct per the HTTP spec. Namely, the value of this header should be equal to the Content-Length if the request were not a HEAD request. This commit also fixes a memory leak on HEAD requests to the main action that arose from the bytes on a builder not being released due to them being dropped on the floor to ensure that the response to the main action did not have a body. Relates #21123	2016-10-25 23:08:19 -04:00
Christoph Büscher	67e2de6702	Merge branch 'master' into feature/rank-eval	2016-10-25 11:05:31 +02:00
Nik Everett	18393a06f3	Fix reindex-from-remote for parent/child from <2.0 Versions before 2.0 needed to be told to return interesting fields like `_parent`, `_routing`, `_ttl`, and `_timestamp`. And they come back inside a `fields` block which we need to parse. Closes #21044	2016-10-21 13:14:33 -04:00
Jason Tedor	f51bf8ee47	Upgrade to Netty 4.1.6 This commit upgrades the transport-netty4 module dependency from Netty version 4.1.5 to version 4.1.6. This is a bug fix release of Netty. Relates #21051	2016-10-20 20:13:29 -04:00
Jack Conradson	ceaae47d38	Remove more equivalents of the now method from the Painless whitelist.	2016-10-20 10:35:26 -07:00
Nik Everett	b5da42905f	Remove publishAddress from reindex whitelist Removes the `publishAddress` parameter from the reindex-from-remote whitelist checking because it isn't in use after #21004.	2016-10-20 12:51:10 -04:00
Fanfan	043a45746c	some misspelled words in code (#21012) as the title mentioned, misspelling as follows, "construct" to "constrcut", "cumulation" to "cumalation", "initialize" to "intialize".	2016-10-19 11:42:38 -04:00
Christoph Büscher	e8e65c3a1e	Merge branch 'master' into feature/rank-eval	2016-10-19 11:37:29 +02:00
Nik Everett	acf7c7430b	Add "simple match" support for reindex-from-remote whitelist This allows you to whitelist `localhost:*` or `127.0.10.*:9200`. It explicitly checks for patterns like `*` in the whitelist and refuses to start if the whitelist would match everything. Beyond that the user is on their own designing a secure whitelist.	2016-10-18 21:47:21 -04:00
Tal Levy	38c650f376	make painless the default scripting language for ScriptProcessor (#20981) - fixes a bug in the docs that mentions `lang` as optional - now `lang` defaults to "painless"	2016-10-18 16:22:01 -07:00
Ryan Ernst	dca614aa3b	Build: Change `gradle run` to use zip distribution (#21001) When running `gradle run`, a developer usually intends to get a running instance as if they had run elasticsearch from the command line. This is different than the isolated environment we use for integration testing plugins. This change switches the run task to use the zip distribution, so that all modules included in the normal distribution are included.	2016-10-18 11:48:58 -07:00
Ryan Ernst	3d3dd7185d	Add support for booleans in scripts (#20950) * Scripting: Add support for booleans in scripts Since 2.0, booleans have been represented as numeric fields (longs). However, in scripts, this is odd, since you expect doing a comparison against a boolean to work. While languages like groovy will auto convert between booleans and longs, painless does not. This changes the doc values accessor for boolean fields in scripts to return Boolean objects instead of Long objects. closes #20949 * Make Booleans final and remove wrapping of `this` for getValues()	2016-10-17 11:11:42 -07:00
Christoph Büscher	f927a235b3	Merge branch 'master' into feature/rank-eval	2016-10-17 14:50:10 +02:00
Jason Tedor	c1bdaaf80f	Fix connection keep-alive header handling This commit fixes an issue with the handling of the value "keep-alive" on the Connection header in the Netty 4 HTTP implementation while handling an HTTP 1.0 request. The issue was using the wrong equals method to compare an AsciiString instance and a String instance (they could never be equal). This commit fixes this to use the correct equals method to compare for content equality.	2016-10-16 19:51:00 -04:00
Jason Tedor	cd5777593a	Fix connection close header handling This commit fixes an issue with the handling of the value "close" on the Connection header in the Netty 4 HTTP implementation. The issue was using the wrong equals method to compare an AsciiString instance and a String instance (they could never be equal). This commit fixes this to use the correct equals method to compare for content equality. Relates #20956	2016-10-16 13:18:09 -04:00
Christoph Büscher	dfc6d1f369	Remove unknown docs from EvalQueryQuality The unknown document section in the response for each query can be rendered using the rated hits that are now also part of the response by just filtering the documents without a rating.	2016-10-14 17:14:05 +02:00
Christoph Büscher	9e394b0644	Pull common operations into RankedListQualityMetric interface Currently each implementation of RankedListQualityMetric does some initial joining operation that links the input search hits with a rated document rating, if available. Also all metrics collect unknown docs and now also need to add the list of rated search hits to the partial query evaluation. This change centralizes this work in some new helper methods in RankedListQualityMetric.	2016-10-14 17:14:05 +02:00
Christoph Büscher	ebe13100df	Add `hits` section to response for each ranking evaluation query This change adds a `hits` section to the response part for each ranking evaluation query, containing a list of documents (index/type/id) and ratings (if the document was rated in the request). This section can be used to better understand the calculation of the ranking quality of this particular query, but it can also be used to identify the "unknown" (that is unrated) documents that were part of the search hits, for example because a UI later wants to present those documents to the user to get a rating for them. If the user specifies a set of field names using a parameter called `summary_fields` in the request, those fields are also included as part of the response in addition to "_index", "_type", "_id".	2016-10-14 17:14:05 +02:00
Christoph Büscher	cd9d07b91b	Merge branch 'master' into feature/rank-eval	2016-10-14 17:03:30 +02:00
Jason Tedor	595ec8c948	Remove artificial default processors limit Today Elasticsearch limits the number of processors used in computing thread counts to 32. This was from a time when Elasticsearch created more threads than it does now and users would run into out of memory errors. It appears the real cause of these out of memory errors was not well understood (it's often due to ulimit settings) and so users were left hitting these out of memory errors on boxes with high core counts. Today Elasticsearch creates fewer threads (but still a lot) and we have a bootstrap check in place to ensure that the relevant ulimit is not too low. There are some caveats still to having too many concurrent indexing threads as it can lead to too many little segments, and it's not a magical go faster knob if indexing is already bottlenecked by disk, but this limitation is artificial and surprising to users and so it should be removed. This commit also increases the lower bound of the max processes ulimit, to prepare for a world where Elasticsearch instances might be running with more than the previous cap of 32 processors. With the current settings, Elasticsearch wants to create roughly 576 + 25 * p / 2 threads, where p is the number of processors. Add in roughly 7 * p / 8 threads for the GC threads and a fudge factor, and 4096 should cover us pretty well up to 256 cores. Relates #20874	2016-10-14 05:47:26 -04:00
Tanguy Leroux	e71c30c71d	Mustache: Add {{#url}}{{/url}} function to URL encode strings (#20838) This commit adds a new Mustache function (codename: url) and a new URLEncoder that can be used to URL encode strings in mustache templates.	2016-10-13 16:17:28 +02:00
Simon Willnauer	12392b5425	Ensure port range is readable in the exception message (#20893) Both netty3 and netty4 http implementation printed the default toString representation of PortRange if ports couldn't be bound. This commit adds a better default toString method to PortRange and uses the string representation for the error message in the http implementations.	2016-10-12 22:33:47 +02:00
Areek Zillur	133be6631d	Merge branch 'master' into cleanup/transport_bulk	2016-10-12 13:09:29 -04:00
Isabel Drost-Fromm	20c1e25609	Merge branch 'master' into feature/rank-eval	2016-10-12 14:41:17 +02:00
Tanguy Leroux	44ac5d057a	Remove empty javadoc (#20871) This commit removes as many empty javadoc comments as my regexp has found	2016-10-12 10:27:09 +02:00
Areek Zillur	481f7909ae	Merge branch 'master' into cleanup/transport_bulk	2016-10-11 16:04:47 -04:00
Areek Zillur	0e8b6532ec	rename DocumentRequest to DocWriteRequest	2016-10-11 16:00:10 -04:00
Tanguy Leroux	e4c7d8183e	XContentBuilder: Avoid building self-referencing objects (#20550) Some objects like maps, iterables or arrays of objects can self-reference themselves. This is mostly due to a bug in code but the XContentBuilder should be able to detect such situations and throw an IllegalArgumentException instead of building objects over and over until a stackoverflow occurs. closes #20540 closes #19475	2016-10-11 11:41:54 +02:00
Simon Willnauer	37ca38df3d	Expose `ctx._now` in update scripts (#20835) Update scripts might want to update the document's `_timestamp` but need a notion of `now()`. Painless doesn't support any notion of now() since it would make scripts non-pure functions. Yet, in the update case this is a valid value and we can pass it with the context together to allow the script to record the timestamp the document was updated. Relates to #17895	2016-10-10 21:14:14 +02:00
Jim Ferenczi	c80a563a71	Replace org.elasticsearch.common.lucene.search.MatchNoDocsQuery with its Lucene version (org.apache.lucene.search.MatchNoDocsQuery) (#20832) * Replace org.elasticsearch.common.lucene.search.MatchNoDocsQuery with its Lucene version (org.apache.lucene.search.MatchNoDocsQuery) This change removes the ES version of the match no docs query and replaces it with the Lucene version. relates #18030 * Add missing change	2016-10-10 17:45:19 +02:00
Simon Willnauer	4fd1276542	Prevent AbstractArrays from releasing bytes more than once (#20819) Today we throw an assertion error if we release an AbstractArray more than once. Yet, it's recommended to implement close methods such that they can be invoked more than once. Guaranteed single release calls are hard to implement and some situations might not be tested causing for instance `CircuitBreaker` to operate on corrupted memory stats.	2016-10-10 17:30:37 +02:00
Christoph Büscher	c3380863be	Adapting RestRankEvalAction to changes in API on master	2016-10-10 12:57:18 +02:00
Christoph Büscher	0c25cfbd16	Merge branch 'master' into feature/rank-eval	2016-10-10 12:10:34 +02:00
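
Note on commit 2dad72e68c (precision@ edge case): the division by zero described there can be illustrated with a minimal sketch. This is not the Elasticsearch implementation; the function name `precision_over_rated_hits`, the `None` marker for unrated hits, and the `relevant_threshold` parameter are all assumptions made for illustration.

```python
from typing import Optional, Sequence


def precision_over_rated_hits(
    ratings: Sequence[Optional[int]],
    relevant_threshold: int = 1,  # hypothetical cutoff for "relevant"
) -> float:
    """Precision computed over rated hits only; None marks an unrated hit."""
    true_positives = sum(
        1 for r in ratings if r is not None and r >= relevant_threshold
    )
    false_positives = sum(
        1 for r in ratings if r is not None and r < relevant_threshold
    )
    rated = true_positives + false_positives
    if rated == 0:
        # No rated hits at all: neither true nor false positives exist,
        # so return 0.0 instead of dividing by zero.
        return 0.0
    return true_positives / rated
```

Without the `rated == 0` guard, a result list consisting only of unrated hits would raise `ZeroDivisionError` in this sketch.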

3785 Commits
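
Note on commit 595ec8c948 (processors limit): the thread estimate quoted in that message can be checked with a few lines of arithmetic. The formula terms come from the commit message; the function name `estimated_threads` is hypothetical, and treating the remainder up to 4096 as the "fudge factor" is an assumption.

```python
def estimated_threads(p: int) -> float:
    """Rough thread count for p processors, per the commit message."""
    application_threads = 576 + 25 * p / 2  # threads Elasticsearch wants
    gc_threads = 7 * p / 8                  # approximate GC threads
    return application_threads + gc_threads


if __name__ == "__main__":
    for cores in (32, 64, 128, 256):
        print(cores, estimated_threads(cores))
```

At 256 cores the estimate is 576 + 3200 + 224 = 4000 threads, which leaves a margin of 96 below the 4096 limit mentioned in the commit.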