OpenSearch

Commit Graph

Author	SHA1	Message	Date
Robin Neatherway	6dadce4761	Painless: Correct ClassToName string conversion (#28997 ) A typo of 'dimensions' rather than 'dimension' caused an infinite loop.	2018-03-13 13:16:48 -07:00
Jason Tedor	5904d936fa	Copy Lucene IOUtils (#29012 ) As we have factored Elasticsearch into smaller libraries, we have ended up in a situation that some of the dependencies of Elasticsearch are not available to code that depends on these smaller libraries but not server Elasticsearch. This is a good thing, this was one of the goals of separating Elasticsearch into smaller libraries, to shed some of the dependencies from other components of the system. However, this now means that simple utility methods from Lucene that we rely on are no longer available everywhere. This commit copies IOUtils (with some small formatting changes for our codebase) into the fold so that other components of the system can rely on these methods where they no longer depend on Lucene.	2018-03-13 12:49:33 -04:00
Martijn van Groningen	beb22d89c8	percolator: Take `matchAllDocs` and `verified` of the sub result into account when analyzing a function_score query. Before the `matchAllDocs` was ignored and this could lead to percolator queries not matching when the inner query was a match_all query and min_score was specified. Before when `verified` was not taken into account if the function_score query wrapped an unverified query this could lead to matching percolator queries that shouldn't match at all.	2018-03-09 07:16:21 +01:00
Lee Hinman	46a79127ed	Remove FastStringReader in favor of vanilla StringReader (#28944 ) This allows us to remove another dependency in the decoupling of the XContent code. Rather than move this class over or decouple it, it can simply be removed. Relates tangentially to #28504	2018-03-08 17:17:36 -07:00
Tal Levy	7784c1bff9	Continue registering pipelines after one pipeline parse failure. (#28752 ) Ingest has been failing to apply existing pipelines from cluster-state into the in-memory representation that are no longer valid. One example of this is a pipeline with a script processor. If a cluster starts up with scripting disabled, these pipelines will not be loaded. Even though GETing a pipeline worked, indexing operations claimed that this pipeline did not exist. This is because one gets information from cluster-state and the other is from an in-memory data-structure. Now, two things happen 1. suppress the exceptions until after other successful pipelines are loaded 2. replace failed pipelines with a placeholder pipeline If the pipeline execution service encounters the stubbed pipeline, it is known that something went wrong at the time of pipeline creation and an exception was thrown to the user at some point at start-up. closes #28269.	2018-03-08 15:22:59 -08:00
Martijn van Groningen	bcfb7ab591	Improved percolator's random candidate query duel test and fixed bugs that were exposed by this: * Duplicates query leafs were not detected in a multi level boolean query * Tracking fields for numeric range queries did not work properly. * The sorting that was used to find the less restrictive clauses in disjunction query did not work too.	2018-03-08 11:39:03 +01:00
Lee Hinman	818920a281	Decouple XContentType from StreamInput/Output (#28927 ) This removes the readFrom and writeTo methods from XContentType, instead using the more generic `readEnum` and `writeEnum` methods. Luckily they are both encoded exactly the same way, so there is no compatibility layer needed for backwards compatibility. Relates to #28504	2018-03-07 14:50:30 -07:00
Lee Hinman	e7d1e12675	Wrap stream passed to createParser in try-with-resources (#28897 ) * Wrap stream passed to createParser in try-with-resources This wraps the stream (`.streamInput()`) that is passed to many of the `createParser` instances in the enclosing (or a new) try-with-resources block. This ensures the `BytesReference.streamInput()` is closed. Relates to #28504 * Use try-with-resources instead of closing in a finally block	2018-03-04 16:48:03 -07:00
Luca Cavanna	1df711c5b7	Remove AcknowledgedRestListener in favour of RestToXContentListener (#28724 ) This commit makes AcknowledgedResponse implement ToXContentObject, so that the response knows how to print its own content out to XContent, which allows us to remove AcknowledgedRestListener.	2018-02-22 09:13:30 +01:00
Lee Hinman	d7eae4b90f	Pass InputStream when creating XContent parser (#28754 ) * Pass InputStream when creating XContent parser Rather than passing the raw `BytesReference` in when creating the xcontent parser, this passes the StreamInput (which is an InputStream), this allows us to decouple XContent from BytesReference. This also removes the use of `commons.Booleans` so it doesn't require more external commons classes. Related to #28504 * Undo boolean removal * Enhance deprecation javadoc	2018-02-21 11:03:25 -07:00
Martijn van Groningen	793cbc651a	Moved Grok helper code to a separate Gradle module and let ingest-common module depend on it.	2018-02-21 11:18:08 +01:00
Yu	7d8fb69d50	version set in ingest pipeline (#27573 ) Add support version and version_type in ingest pipelines Add support for setting document version and version type in set processor of an ingest pipeline.	2018-02-21 09:34:51 +01:00
Lee Hinman	d4fddfa2a0	Remove log4j dependency from elasticsearch-core (#28705 ) * Remove log4j dependency from elasticsearch-core This removes the log4j dependency from our elasticsearch-core project. It was originally necessary only for our jar classpath checking. It is now replaced by a `Consumer<String>` so that the es-core dependency doesn't have external dependencies. The parts of #28191 which were moved in conjunction (like `ESLoggerFactory` and `Loggers`) have been moved back where appropriate, since they are not required in the core jar. This is tangentially related to #28504 * Add javadocs for `output` parameter * Change @code to @link	2018-02-20 09:15:54 -07:00
Martijn van Groningen	9c405e8595	made load method private and add another static getter that users of Grok can use to get the builtin patterns.	2018-02-20 08:09:24 +01:00
Martijn van Groningen	3fad16e76c	renamed module	2018-02-20 08:02:02 +01:00
Martijn van Groningen	9e13cb59a2	Moved Grok helper code to a separate Gradle module and let ingest-common module depend on it.	2018-02-19 09:49:07 +01:00
Lee Hinman	0dd79028c9	Remove deprecated createParser methods (#28697 ) * Remove deprecated createParser methods This removes the final instances of the callers of `XContent.createParser` and `XContentHelper.createParser` that did not pass in the `DeprecationHandler`. It also removes the now-unused deprecated methods and fully removes any mention of Log4j or LoggingDeprecationHandler from the XContent code. Relates to #28504 * Add comments in JsonXContentGenerator	2018-02-16 08:26:30 -07:00
Jason Tedor	671e7e2f00	Lift error finding utility to exceptions helpers We have code used in the networking layer to search for errors buried in other exceptions. This code will be useful in other locations so with this commit we move it to our exceptions helpers. Relates #28691	2018-02-15 09:48:52 -05:00
Lee Hinman	b59b1cf59d	Move more XContent.createParser calls to non-deprecated version (#28672 ) * Move more XContent.createParser calls to non-deprecated version Part 2 This moves more of the callers to pass in the DeprecationHandler. Relates to #28504 * Use parser's deprecation handler where appropriate * Use logging handler in test that uses deprecated field on purpose	2018-02-14 11:24:48 -07:00
Lee Hinman	7c1f5f5054	Move more XContent.createParser calls to non-deprecated version (#28670 ) * Move more XContent.createParser calls to non-deprecated version This moves more of the callers to pass in the DeprecationHandler. Relates to #28504 * Use parser's deprecation handler where available	2018-02-14 09:01:40 -07:00
Robin Neatherway	8d0f976024	Removed unused local variable from ScriptClassInfo (#28617 ) Following [`8999104`](`8999104b14`) the local variable `argumentNames` is no longer used, so removed.	2018-02-12 15:58:09 -08:00
Boaz Leskes	4aece92b2c	IndexShardOperationPermits: shouldn't use new Throwable to capture stack traces (#28598 ) The is a follow up to #28567 changing the method used to capture stack traces, as requested during the review. Instead of creating a throwable, we explicitly capture the stack trace of the current thread. This should Make Jason Happy Again ™️ .	2018-02-12 10:33:13 +01:00
Ryan Ernst	20c37efea2	Build: Replace provided configuration with compileOnly (#28564 ) When elasticsearch was originally moved to gradle, the "provided" equivalent in maven had to be done through a plugin. Since then, gradle added the "compileOnly" configuration. This commit removes the provided plugin and replaces all uses with compileOnly.	2018-02-09 11:30:24 -08:00
Martijn van Groningen	766b9d600e	Fixed a bug that prevents pipelines to load that use stored scripts after a restart. The bug was caused because the ScriptService had no reference to a ClusterState instance, because it received the ClusterState after the PipelineStore. This only is the case after a restart. A bad side effect is that during a restart, any pipeline to be loaded after the pipeline that uses a stored script, was never loaded, which caused many pipeline to be missing in bulk / index request api calls.	2018-02-09 17:14:00 +01:00
Christoph Büscher	01791277cb	Test that rank_eval request parsing is not lenient (#28516 ) Parsing of a ranking evaluation request and its subcomponents should throw parsing errors on unknown fields. This change adds tests for this and changes the parser behaviour in cases where it is needed.	2018-02-08 17:38:45 +01:00
Lee Hinman	eebff4d2b3	Use non deprecated xcontenthelper (#28503 ) * Move to non-deprecated XContentHelper.createParser(...) This moves away from one of the now-deprecated XContentHelper.createParser methods in favor of specifying the deprecation logger at parser creation time. Relates to #28449 Note that this doesn't move all the `createParser` calls because some of them use the already-deprecated method that doesn't specify the XContentType. * Remove the deprecated (and now non-needed) createParser method	2018-02-05 16:18:18 -07:00
Jack Conradson	5c1d3aa2f0	Painless: Fixes a null pointer exception in certain cases of for loop usage (#28506 ) The initializer and afterthought were not having their types appropriately cast which is necessary with expressions which in turn caused values to be popped off the stack that were null.	2018-02-05 11:57:21 -08:00
Nik Everett	5003ef18ac	Scripts: Fix security for deprecation warning (#28485 ) If you call `getDates()` on a long or date type field add a deprecation warning to the response and log something to the deprecation logger. This mostly worked just fine but if the deprecation logger happens to roll then the roll will be performed with the script's permissions rather than the permissions of the server. And scripts don't have permissions to, say, open files. So the rolling failed. This fixes that by wrapping the call the deprecation logger in `doPriviledged`. This is a strange `doPrivileged` call because it doens't check Elasticsearch's `SpecialPermission`. `SpecialPermission` is a permission that no-script code has and that scripts never have. Usually all `doPrivileged` calls check `SpecialPermission` to make sure that they are not accidentally acting on behalf of a script. But in this case we are intentionally acting on behalf of a script. Closes #28408	2018-02-03 14:56:08 -05:00
Jack Conradson	90c74a7e09	Remove RuntimeClass from Painless Definition in favor of just Painless Struct. (#28486 )	2018-02-02 10:26:02 -08:00
Lee Hinman	3ddea8d8d2	Start switching to non-deprecated ParseField.match method (#28488 ) This commit switches all the modules and server test code to use the non-deprecated `ParseField.match` method, passing in the parser's deprecation handler or the logging deprecation handler when a parser is not available (like in tests). Relates to #28449	2018-02-02 10:10:13 -07:00
Jack Conradson	df1c696e1d	Remove Painless Type From Locals, Variables, Params, and ScriptInfo (#28471 )	2018-02-01 12:46:20 -08:00
Martijn van Groningen	ecb1d07d00	percolator: remove deprecated map_unmapped_fields_as_string setting	2018-02-01 11:11:22 +01:00
Jim Ferenczi	dd40b984c4	Add a shallow copy method to aggregation builders (#28430 ) This change adds a shallow copy method for aggregation builders. This method returns a copy of the builder replacing the factoriesBuilder and metaDada This method is used when the builder is rewritten (AggregationBuilder#rewrite) in order to make sure that we create a new instance of the parent builder when sub aggregations are rewritten. Relates #27782	2018-02-01 09:22:32 +01:00
Martijn van Groningen	9bada306dc	Improved percolator candidate query tests.	2018-02-01 07:43:03 +01:00
Jack Conradson	e281d57d82	Remove Painless Type From Painless Method/Field (#28466 )	2018-01-31 16:27:52 -08:00
markharwood	77d2dd203e	Search - add allow_partial_search_results flag with default setting false (#28440 ) Adds allow_partial_search_results flag to search requests with default setting = true. When false, will error if search either timeouts, has partial errors or has missing shards rather than returning partial search results. A cluster-level setting provides a default for search requests with no flag. Closes #27435	2018-01-31 15:51:29 +00:00
Nik Everett	3b6af15a60	XContent: Factor deprecation handling into callback (#28449 ) Factors the way in which XContent parsing handles deprecated fields into a callback that is set at parser construction time. The goals here are: 1. Remove Log4J as a dependency of XContent so that XContent can be used by clients without forcing log4j and our particular deprecation handling scheme. 2. Simplify handling of deprecated fields in tests. Now tests can listen directly for the deprecation callback rather than digging through a ThreadLocal. More accurately, this change begins this work. It deprecates a number of methods, pointing folks to the new versions of those methods that take `DeprecationHandler`. The plan is to slowly drop these deprecated methods. Once they are entirely removed we can remove Log4j as dependency of XContent.	2018-01-30 18:21:10 -05:00
Nik Everett	9aeed74fe9	Reindex: Raise timeout on flaky test This gives the test longer to block its updates. Now that we're checking if the updates actually blocked saw that they may not do so in the normal 10 seconds on a highly loaded system. And our jenkins machines often function like highly loaded systems. Maybe this fixes #26758!	2018-01-30 17:59:58 -05:00
Nik Everett	6f64e9728b	Reindex: More digging on flaky test This adds more logging and a missing assertion to a flaky reindex test.	2018-01-30 17:42:40 -05:00
Christoph Büscher	1c296fe7ed	Update bwc version for rank_eval rest tests	2018-01-30 21:02:19 +01:00
Simon Willnauer	3bf8554114	Remove tribe node support (#28443 ) Tribe node has been superseeded by Cross-Cluster-Search. This change removes the tribe node support entirely.	2018-01-30 20:40:19 +01:00
Christoph Büscher	6731c76900	Add ranking evaluation API to High Level Rest Client (#28357 ) This change adds support for the new ranking evaluation API to the High Level Rest Client. This mostly means adding support for parsing the various response objects back from the REST representation. It includes one change to the response syntax where previously we didn't print the type of the metric details section but we now need it to pick the right parser to parse this section back. Closes #28198	2018-01-30 17:48:09 +01:00
Jack Conradson	fa8e62d48f	Removes Painless Type in favor of Java Class from the expression nodes related to function references and lambdas. (#28433 )	2018-01-30 08:18:14 -08:00
Martijn van Groningen	204f4022c2	percolator: Do not take duplicate query extractions into account for minimum_should_match attribute If a percolator query contains duplicate query clauses somewhere in the query tree then when these clauses are extracted then they should not affect the msm. This can lead a percolator query that should be a valid match not become a candidate match, because at query time, the msm that is being used by the CoveringQuery would never match with the msm used at index time. Closes #28315	2018-01-30 07:25:33 +01:00
Jack Conradson	1d01bcf421	Remove Painless Type in favor of Java Class in FunctionRef. (#28429 )	2018-01-29 16:43:36 -08:00
Jack Conradson	f13da9f534	Remove Painless Type from e-nodes in favor of Java Class (#28364 )	2018-01-29 12:44:50 -08:00
Simon Willnauer	7957c9751c	[TEST] fix test to use a dedicated index to ensure lucene docIDs are guaranteed	2018-01-26 14:52:39 +01:00
Jay Modi	e59f14d139	Update Netty to 4.1.16.Final (#28345 ) This commit updates netty to 4.1.16.Final. This is the latest version that we can have work without extra permissions. This updated version of netty fixes issues seen with Java 9 and some data not being sent, which results in timeouts.	2018-01-25 12:48:43 -07:00
Jack Conradson	a57a0ae78b	Remove Painless Type from MethodWriter in favor of Java Class. (#28346 )	2018-01-24 11:02:46 -08:00
Nik Everett	2eede9b876	Reindex: Shore up rethrottle test The rethrottle test fails from time to time because one of the child task that want to be rethrottled hasn't properly started yet. We retry in this case but it looks like the retry either isn't long enough or something else strange is happening. This change adds yet more logging so future failure of this kind will be easier to track down and it adds an extra wait condition: this waits for all child tasks to be running or completed before rethrottling. This might avoid the failure because once a child task is properly started it should be quite ok to rethrottle. Relates to #26192	2018-01-24 11:03:30 -05:00
Jack Conradson	b945006938	Completely remove Painless Type from AnalyzerCaster in favor of Java Class. (#28329 ) Second part in a series of PR's to remove Painless Type in favor of Java Class. This completely removes the Painless Type dependency from AnalyzerCaster. Both casting and promotion are now based on Java Class exclusively. This also allows AnalyzerCaster to be decoupled from Definition and make cast checks be static calls again.	2018-01-23 12:38:31 -08:00
Nik Everett	eded5bc4f3	Reindex: Wait for deletion in test The test failure tracked by #28053 occurs because we fail to get the failure response from the reindex on the first try and on our second try the delete index API call that was supposed to trigger the failure actually deletes the index during document creation. This causes the test to fail catastrophically. This PR attempts to wait for the failure to finish before the test moves on to the second attempt. The failure doesn't reproduce locally for me so I can't be sure that this helps at all with the failure, but it certainly feels like it should help some. Here is hoping this prevents similar failures in the future.	2018-01-23 13:35:23 -05:00
Nik Everett	049f29710e	Reindex: log more on rare test failure The test failure tracked by #26758 occurs when we cancel a running reindex request that has been sliced into many children. The main reindex response looks canceled but none of the children look canceled. This is super strange because for the main request to look canceled for any length of time one of the children has to be canceled. This change adds additional logging to the test so we have more to go on to debug this the next time it fails.	2018-01-23 12:21:28 -05:00
Simon Willnauer	4d3f7a7695	Ensure we protect Collections obtained from scripts from self-referencing (#28335 ) Self referencing maps can cause SOE if they are iterated ie. in their toString methods. This chance adds some protected to the usage of those collections.	2018-01-23 16:57:26 +01:00
Jack Conradson	ef5c041819	Painless: Replace Painless Type with Java Class during Casts (#27847 ) This is the first step in a series to replace Painless Type with Java Class for any casting done during compilation. There should be no behavioural change.	2018-01-22 13:01:13 -08:00
Christoph Büscher	a6bfe67f8b	[Test] Lower bwc version for rank-eval rest tests The API was backported to 6.2 so the version we test against on master can be lowered to that.	2018-01-22 13:33:42 +01:00
Adrien Grand	700d9ecc95	Remove the `update_all_types` option. (#28288 ) This option is not useful in 7.x since no indices may have more than one type anymore.	2018-01-22 12:03:07 +01:00
Ryan Ernst	ba9c9e08e7	Painless: Add spi jar that will be published for extending whitelists (#28302 ) In order to build a plugin that extends the painless whitelist, the spi classes must be available to the plugin at compile time. This commit moves the spi classes into a separate jar which will be published. Any plugin authors whiching to extend painless through spi would then add a compileOnly dependency on this jar.	2018-01-18 19:16:26 -08:00
Ryan Ernst	19a2b01e43	Build: Omit dependency licenses check for elasticsearch deps (#28304 ) Sometimes modules/plugins depend on locally built elasticsearch jars. This means not only that the jar is constantly changing (so no need for a sha check), but also that the license falls under the Elasticsearch license, and there is no need to keep another copy. This commit updates the dependencies checked by dependencyLicenses to exclude those that are built by elasticsearch.	2018-01-18 14:15:44 -08:00
Christoph Büscher	77dcaab34f	Simplify RankEvalResponse output (#28266 ) Currenty the rest response of the ranking evaluation API wraps all inside an enclosing `rank_eval` object. This is redundant since it is clear from the API call and it doesn't provide any other useful information. This change removes this.	2018-01-18 09:32:27 +01:00
Ryan Ernst	18463e7e9f	Painless: Add whitelist extensions (#28161 ) This commit adds a PainlessExtension which may be plugged in via SPI to add additional classes, methods and members to the painless whitelist on a per context basis. An example plugin adding and using a whitelist is also added.	2018-01-15 11:28:31 -08:00
Tim Brooks	ee7eac8dc1	`MockTcpTransport` to connect asynchronously (#28203 ) The method `initiateChannel` on `TcpTransport` is explicit in that channels can be connect asynchronously. All production implementations do connect asynchronously. Only the blocking `MockTcpTransport` connects in a synchronous manner. This avoids testing some of the blocking code in `TcpTransport` that waits on connections to complete. Additionally, it requires a more extensive method signature than required for other transports. This commit modifies the `MockTcpTransport` to make these connections asynchronously on a different thread. Additionally, it simplifies that `initiateChannel` method signature.	2018-01-15 10:20:30 -07:00
Tim Brooks	3895add2ca	Introduce elasticsearch-core jar (#28191 ) This is related to #27933. It introduces a jar named elasticsearch-core in the lib directory. This commit moves the JarHell class from server to elasticsearch-core. Additionally, PathUtils and some of Loggers are moved as JarHell depends on them.	2018-01-15 09:59:01 -07:00
Jim Ferenczi	be012b1326	upgrade to lucene 7.2.1 (#28218 )	2018-01-15 16:47:46 +01:00
Igor Motov	c75ac319a6	Add ability to associate an ID with tasks (#27764 ) Adds support for capturing the X-Opaque-Id header from a REST request and storing it's value in the tasks that this request started. It works for all user-initiated tasks (not only search). Closes #23250 Usage: ``` $ curl -H "X-Opaque-Id: imotov" -H "foo:bar" "localhost:9200/_tasks?pretty&group_by=parents" { "tasks" : { "7qrTVbiDQKiZfubUP7DPkg:6998" : { "node" : "7qrTVbiDQKiZfubUP7DPkg", "id" : 6998, "type" : "transport", "action" : "cluster:monitor/tasks/lists", "start_time_in_millis" : 1513029940042, "running_time_in_nanos" : 266794, "cancellable" : false, "headers" : { "X-Opaque-Id" : "imotov" }, "children" : [ { "node" : "V-PuCjPhRp2ryuEsNw6V1g", "id" : 6088, "type" : "netty", "action" : "cluster:monitor/tasks/lists[n]", "start_time_in_millis" : 1513029940043, "running_time_in_nanos" : 67785, "cancellable" : false, "parent_task_id" : "7qrTVbiDQKiZfubUP7DPkg:6998", "headers" : { "X-Opaque-Id" : "imotov" } }, { "node" : "7qrTVbiDQKiZfubUP7DPkg", "id" : 6999, "type" : "direct", "action" : "cluster:monitor/tasks/lists[n]", "start_time_in_millis" : 1513029940043, "running_time_in_nanos" : 98754, "cancellable" : false, "parent_task_id" : "7qrTVbiDQKiZfubUP7DPkg:6998", "headers" : { "X-Opaque-Id" : "imotov" } } ] } } } ```	2018-01-12 15:34:17 -05:00
Yu	228f7ffcdf	Add scroll parameter to _reindex API (#28041 ) Be able to change scroll timeout in _reindex API (by default: 5m)	2018-01-11 14:40:04 +01:00
Martijn van Groningen	73f6857dff	test: ensure we endup with a single segment Closes #28127	2018-01-10 15:14:26 +01:00
Jack Conradson	5d795afddb	Painless: Add public member read/write access test. (#28156 )	2018-01-09 15:06:51 -08:00
Jack Conradson	1d1dcd4ae7	Painless: Add a simple cache for whitelist methods and fields. (#28142 ) With support for multiple contexts we are adding some caching to the whitelist to keep the memory footprint for definitions from exploding.	2018-01-08 17:54:45 -08:00
Jack Conradson	b5377d294f	Painless: Modify Loader to Load Classes Directly from Definition (#28088 )	2018-01-05 13:06:36 -08:00
Tim Brooks	38701fb6ee	Create nio-transport plugin for NioTransport (#27949 ) This is related to #27260. This commit moves the NioTransport from :test:framework to a new nio-transport plugin. Additionally, supporting tcp decoding classes are moved to this plugin. Generic byte reading and writing contexts are moved to the nio library. Additionally, this commit adds a basic MockNioTransport to :test:framework that is a TcpTransport implementation for testing that is driven by nio.	2018-01-05 09:41:29 -07:00
Ryan Ernst	d36ec18029	Plugins: Add plugin extension capabilities (#27881 ) This commit adds the infrastructure to plugin building and loading to allow one plugin to extend another. That is, one plugin may extend another by the "parent" plugin allowing itself to be extended through java SPI. When all plugins extending a plugin are finished loading, the "parent" plugin has a callback (through the ExtensiblePlugin interface) allowing it to reload SPI. This commit also adds an example plugin which uses as-yet implemented extensibility (adding to the painless whitelist).	2018-01-03 11:12:43 -08:00
Christoph Büscher	29b07bb6c4	[Test] Fix scores for dcg in RankEvalRequestIT and RankEvalYamlIT Allow small deviations when asserting ranking scores, otherwise some tests break on floating point calculation differences e.g. when running on ARM.	2018-01-03 17:24:10 +01:00
Sian Lerk Lau	5e3ba8a88d	Enable convert processor to support Long and Double. (#27957 ) Closes #23085	2018-01-03 11:27:55 +01:00
Christoph Büscher	8925dabcb8	[Test] Fix allowed delta for calculated scores in DiscountedCumulativeGainTests	2018-01-02 16:46:31 +01:00
Tanguy Leroux	d2939a9daa	[Test] Mute DiscountedCumulativeGainTests on ARM These tests fail on ARM architectures. This is tracked in https://github.com/elastic/elasticsearch/issues/28048	2018-01-02 16:16:43 +01:00
Adrien Grand	7d88851766	Upgrade beats templates that we use for bwc testing. (#27929 ) These templates were generated with 5.0. We need those generated with 6.0 since we do not guarantee compatibility with previous versions of the template anyway. I removed the winlogbeat template which is a bit harder to generate as it requires Windows, since we do not aim to be exhaustive.	2017-12-21 08:50:14 +01:00
Andy Bristol	863432668b	[TEST] logging for update by query test #27820	2017-12-20 18:27:16 -05:00
Sian Lerk Lau	47eefbe889	Enable grok processor to support long, double and boolean (#27896 )	2017-12-20 11:19:49 -08:00
Adrien Grand	77711508b0	Upgrade to Lucene 7.2.0. (#27910 )	2017-12-20 14:17:40 +01:00
Alan Woodward	af3f63616b	Allow TrimFilter to be used in custom normalizers (#27758 ) AnalysisFactoryTestCase checks that the ES custom token filter multi-term awareness matches the underlying lucene factory. For the trim filter this won't be the case until LUCENE-8093 is released in 7.3, so we add a temporary exclusion Closes #27310	2017-12-18 14:27:03 +00:00
Jason Tedor	75c0cd0672	Move range field mapper back to core This commit moves the range field mapper back to core so that we can remove the compile-time dependency of percolator on mapper-extras which compilcates dependency management for the percolator client JAR, and modules should not be intertwined like this anyway. Relates #27854	2017-12-17 14:27:10 -05:00
Martijn van Groningen	e9160fc014	percolator: also extract match_all queries I've seen several cases where match_all queries were being used inside percolator queries, because these queries were created generated by other systems. Extracting these queries will allow the percolator at query time in a filter context to skip over these queries without parsing or validating that these queries actually match with the document being percolated.	2017-12-15 08:50:29 +01:00
Christoph Büscher	c541a0c60e	Add skip versions for rank_eval yaml tests	2017-12-14 22:18:37 +01:00
Jack Conradson	1de927c80d	Painless: Clean Up Painless Cast Object (#27794 ) Added static methods to make creating Painless casts obvious as to what is being boxed/unboxed.	2017-12-14 09:08:10 -08:00
Adrien Grand	1b660821a2	Allow `_doc` as a type. (#27816 ) Allowing `_doc` as a type will enable users to make the transition to 7.0 smoother since the index APIs will be `PUT index/_doc/id` and `POST index/_doc`. This also moves most of the documentation to `_doc` as a type name. Closes #27750 Closes #27751	2017-12-14 17:47:53 +01:00
Christoph Büscher	bb14b8f7c5	Merge branch 'rankeval' This commit adds a new module that provides an endpoint that can be used to evaluate search ranking results. Closes #19195	2017-12-14 16:45:03 +01:00
Christoph Büscher	33bcfddb54	Use SPI to provide named XContent parsers for ranking evaluation	2017-12-12 18:39:01 +01:00
Alan Woodward	77617c8e62	[TEST] Add test for _range fields in query_string queries (#27756 ) [TEST] Add test for _range fields in query_string queries Closes #26555	2017-12-12 13:33:37 +00:00
Jack Conradson	8188d9f7e5	Painless: Only allow Painless type names to be the same as the equivalent Java class. (#27264 ) Also adds a parameter called only_fqn to the whitelist to enforce that a painless type must be specified as the fully-qualifed java class name.	2017-12-11 16:37:35 -08:00
Christoph Büscher	97b25f3b0c	Merge branch 'master' into rankeval	2017-12-11 15:19:16 +01:00
Tim Brooks	d1acb7697b	Remove internal channel tracking in transports (#27711 ) This commit attempts to continue unifying the logic between different transport implementations. As transports call a `TcpTransport` callback when a new channel is accepted, there is no need to internally track channels accepted. Instead there is a set of accepted channels in `TcpTransport`. This set is used for metrics and shutting down channels.	2017-12-08 16:56:53 -07:00
Tim Brooks	d82c40d35c	Implement byte array reusage in `NioTransport` (#27696 ) This is related to #27563. This commit modifies the InboundChannelBuffer to support releasable byte pages. These byte pages are provided by the PageCacheRecycler. The PageCacheRecycler must be passed to the Transport with this change.	2017-12-08 10:39:30 -07:00
Tim Brooks	ad8a571677	Add read timeouts to http module (#27713 ) We currently do not have any server-side read timeouts implemented in elasticsearch. This commit adds a read timeout setting that defaults to 30 seconds. If after 30 seconds a read has not occurred, the channel will be closed. A timeout of value of 0 will disable the timeout.	2017-12-08 09:32:09 -07:00
Christoph Büscher	b83e14858a	Correcting some minor typos in comments	2017-12-07 16:39:23 +01:00
Christoph Büscher	52cb6c8ef2	Merge branch 'master' into rankeval	2017-12-07 14:22:46 +01:00
Robin Neatherway	057efea893	Correct two equality checks on incomparable types (#27688 )	2017-12-07 14:18:11 +01:00
Martijn van Groningen	4d78e1a9ad	Added msearch api to high level client	2017-12-05 10:17:47 +01:00
Christoph Büscher	bbec33d35c	Merge branch 'master' into rankeval	2017-12-04 12:57:19 +01:00
Christoph Büscher	72d0de4197	Add search window parameter k to MRR and DCG metric (#27595 )	2017-12-04 10:54:03 +01:00
Christoph Büscher	c4fe7d3f72	[Docs] add deprecation warning for `delimited_payload_filter` renaming	2017-12-04 10:22:05 +01:00
Adrien Grand	6323bb0d97	Upgrade to lucene-7.2.0-snapshot-8c94404. (#27619 ) This new snapshot mostly brings a change to TopFieldCollector which can now early terminate collection when trackTotalHits is `false`. As a follow-up, we should replace our usage of `EarlyTerminatingSortingCollector` with this new option.	2017-12-04 09:40:08 +01:00
Christoph Büscher	35688f6441	Merge branch 'master' into rankeval	2017-11-29 15:24:06 +01:00
Christoph Büscher	7bfb273763	Add k parameter to PrecisionAtK metric (#27569 )	2017-11-29 15:19:16 +01:00
Jack Conradson	2d927fabab	Painless: Fix errors allowing void to be assigned to def. (#27460 )	2017-11-28 13:44:52 -08:00
Jack Conradson	9e42b77f7e	Painless: Fix variable scoping issue in lambdas not including captured variables. (#27571 )	2017-11-28 13:30:13 -08:00
Adrien Grand	996990ad1f	Upgrade to lucene-7.2.0-snapshot-8c94404. (#27496 ) The main highlight of this new snapshot is that it introduces the opportunity for queries to opt out of caching. In case a query opts out of caching, not only will it never be cached, but also no compound query that wraps it will be cached.	2017-11-28 14:52:42 +01:00
Martijn van Groningen	cb1204774b	Include the _index, _type and _id to nested search hits in the top_hits and inner_hits response. Also include _type and _id for parent/child hits inside inner hits. In the case of top_hits aggregation the nested search hits are directly returned and are not grouped by a root or parent document, so it is important to include the _id and _index attributes in order to know to what documents these nested search hits belong to. Closes #27053	2017-11-28 14:05:29 +01:00
Christoph Büscher	1352b7c6ea	Use msearch instead of single search (#27520 ) Change TransportRankEvalAction to use one MultiSearchRequest instead of issuing several parallel search requests to simplify the transport action.	2017-11-27 10:15:59 +01:00
Martijn van Groningen	4ab638b71d	percolator: Avoid TooManyClauses exception if number of terms / ranges is exactly equal to 1024 The logic whether to use CoveringQuery was in two places which is why this bug snug in.	2017-11-27 08:55:11 +01:00
Christoph Büscher	5661b1c3df	Merge branch 'master' into rankeval	2017-11-24 16:25:05 +01:00
kel	4885acb048	Replace `delimited_payload_filter` by `delimited_payload` (#26625 ) The `delimited_payload_filter` is renamed to `delimited_payload`, the old name is deprecated and should be replaced by `delimited_payload`. Closes #21978	2017-11-24 13:03:19 +01:00
Simon Willnauer	fadbe0de08	Automatically prepare indices for splitting (#27451 ) Today we require users to prepare their indices for split operations. Yet, we can do this automatically when an index is created which would make the split feature a much more appealing option since it doesn't have any 3rd party prerequisites anymore. This change automatically sets the number of routinng shards such that an index is guaranteed to be able to split once into twice as many shards. The number of routing shards is scaled towards the default shard limit per index such that indices with a smaller amount of shards can be split more often than larger ones. For instance an index with 1 or 2 shards can be split 10x (until it approaches 1024 shards) while an index created with 128 shards can only be split 3x by a factor of 2. Please note this is just a default value and users can still prepare their indices with `index.number_of_routing_shards` for custom splitting. NOTE: this change has an impact on the document distribution since we are changing the hash space. Documents are still uniformly distributed across all shards but since we are artificually changing the number of buckets in the consistent hashign space document might be hashed into different shards compared to previous versions. This is a 7.0 only change.	2017-11-23 09:48:54 +01:00
Simon Willnauer	5a0b6d1977	Use the primary_term field to identify parent documents (#27469 ) This change stops indexing the `_primary_term` field for nested documents to allow fast retrieval of parent documents. Today we create a docvalues field for children to ensure we have a dense datastructure on disk. Yet, since we only use the primary term to tie-break on when we see the same seqID on indexing having a dense datastructure is less important. We can use this now to improve the nested docs performance and it's memory footprint. Relates to #24362	2017-11-21 15:14:03 +01:00
Christoph Büscher	d979ccace9	Merge branch 'master' into rankeval	2017-11-21 14:11:02 +01:00
Christoph Büscher	94a0631a3e	[Tests] Add testToXContent() RankEvalResponseTests	2017-11-21 14:09:50 +01:00
Christoph Büscher	35fabdaf8a	Parse EvluationMetrics as named Objects	2017-11-21 14:09:38 +01:00
Christoph Büscher	fdb24cd3e4	Fixing occasional test failure in RankEvalSpecTests	2017-11-21 14:09:13 +01:00
Christoph Büscher	3348d2317f	Reworking javadocs, minor changes in some implementation classes	2017-11-21 14:09:04 +01:00
Christoph Büscher	e278c1d17d	Improving and cleaning up tests Removing the unnecessary RankEvalTestHelper, making use of the common test infra in ESTestCase, also hardening a few of the classes by making more fields final.	2017-11-21 14:08:53 +01:00
Christoph Büscher	5c65a59369	Extending rank_eval asciidocs	2017-11-21 14:08:42 +01:00
Christoph Büscher	d9e67a2c95	Extending `_rank_eval` documentation	2017-11-21 14:08:28 +01:00
Christoph Büscher	0a6c6ac360	Remove usage of types in rank_eval endpoint	2017-11-21 14:07:41 +01:00
Jim Ferenczi	6319424e4a	Move composite aggregation to core (#27474 ) This change removes the module named aggs-composite and adds the `composite` aggs as a core aggregation. This allows other plugins to use this new aggregation and simplifies the integration in the HL rest client.	2017-11-21 13:31:01 +01:00
Luca Cavanna	29450de7b5	Cross Cluster Search: make remote clusters optional (#27182 ) Today Cross Cluster Search requires at least one node in each remote cluster to be up once the cross cluster search is run. Otherwise the whole search request fails despite some of the data (either local and/or remote) is available. This happens when performing the _search/shards calls to find out which remote shards the query has to be executed on. This scenario is different from shard failures that may happen later on when the query is actually executed, in case e.g. remote shards are missing, which is not going to fail the whole request but rather yield partial results, and the _shards section in the response will indicate that. This commit introduces a boolean setting per cluster called search.remote.$cluster_alias.skip_if_disconnected, set to false by default, which allows to skip certain clusters if they are down when trying to reach them through a cross cluster search requests. By default all clusters are mandatory. Scroll requests support such setting too when they are first initiated (first search request with scroll parameter), but subsequent scroll rounds (_search/scroll endpoint) will fail if some of the remote clusters went down meanwhile. The search API response contains now a new _clusters section, similar to the _shards section, that gets returned whenever one or more clusters were disconnected and got skipped: "_clusters" : { "total" : 3, "successful" : 2, "skipped" : 1 } Such section won't be part of the response if no clusters have been skipped. The per cluster skip_unavailable setting value has also been added to the output of the remote/info API.	2017-11-21 11:41:47 +01:00
Tim Brooks	4e04f95ab4	Fix issue where pages aren't released (#27459 ) This is related to #27422. Right now when we send a write to the netty transport, we attach a listener to the future. When you submit a write on the netty event loop and the event loop is shutdown, the onFailure method is called. Unfortunately, netty then tries to notify the listener which cannot be done without dispatching to the event loop. In this case, the dispatch fails and netty logs and error and does not tell us. This commit checks that netty is still not shutdown after sending a message. If netty is shutdown, we complete the listener.	2017-11-20 14:53:08 -07:00
Tim Brooks	0a8f48d592	Transition transport apis to use void listeners (#27440 ) Currently we use ActionListener<TcpChannel> for connect, close, and send message listeners in TcpTransport. However, all of the listeners have to capture a reference to a channel in the case of the exception api being called. This commit changes these listeners to be type <Void> as passing the channel to onResponse is not necessary. Additionally, this change makes it easier to integrate with low level transports (which use different implementations of TcpChannel).	2017-11-20 10:47:47 -07:00
Tim Brooks	80ef9bbdb1	Remove parameterization from TcpTransport (#27407 ) This commit is a follow up to the work completed in #27132. Essentially it transitions two more methods (sendMessage and getLocalAddress) from Transport to TcpChannel. With this change, there is no longer a need for TcpTransport to be aware of the specific type of channel a transport returns. So that class is no longer parameterized by channel type.	2017-11-16 11:19:36 -07:00
Jim Ferenczi	623367d793	Add composite aggregator (#26800 ) * This change adds a module called `aggs-composite` that defines a new aggregation named `composite`. The `composite` aggregation is a multi-buckets aggregation that creates composite buckets made of multiple sources. The sources for each bucket can be defined as: * A `terms` source, values are extracted from a field or a script. * A `date_histogram` source, values are extracted from a date field and rounded to the provided interval. This aggregation can be used to retrieve all buckets of a deeply nested aggregation by flattening the nested aggregation in composite buckets. A composite buckets is composed of one value per source and is built for each document as the combinations of values in the provided sources. For instance the following aggregation: ```` "test_agg": { "terms": { "field": "field1" }, "aggs": { "nested_test_agg": "terms": { "field": "field2" } } } ```` ... which retrieves the top N terms for `field1` and for each top term in `field1` the top N terms for `field2`, can be replaced by a `composite` aggregation in order to retrieve all the combinations of `field1`, `field2` in the matching documents: ```` "composite_agg": { "composite": { "sources": [ { "field1": { "terms": { "field": "field1" } } }, { "field2": { "terms": { "field": "field2" } } }, } } ```` The response of the aggregation looks like this: ```` "aggregations": { "composite_agg": { "buckets": [ { "key": { "field1": "alabama", "field2": "almanach" }, "doc_count": 100 }, { "key": { "field1": "alabama", "field2": "calendar" }, "doc_count": 1 }, { "key": { "field1": "arizona", "field2": "calendar" }, "doc_count": 1 } ] } } ```` By default this aggregation returns 10 buckets sorted in ascending order of the composite key. Pagination can be achieved by providing `after` values, the values of the composite key to aggregate after. For instance the following aggregation will aggregate all composite keys that sorts after `arizona, calendar`: ```` "composite_agg": { "composite": { "after": {"field1": "alabama", "field2": "calendar"}, "size": 100, "sources": [ { "field1": { "terms": { "field": "field1" } } }, { "field2": { "terms": { "field": "field2" } } } } } ```` This aggregation is optimized for indices that set an index sorting that match the composite source definition. For instance the aggregation above could run faster on indices that defines an index sorting like this: ```` "settings": { "index.sort.field": ["field1", "field2"] } ```` In this case the `composite` aggregation can early terminate on each segment. This aggregation also accepts multi-valued field but disables early termination for these fields even if index sorting matches the sources definition. This is mandatory because index sorting picks only one value per document to perform the sort.	2017-11-16 15:13:36 +01:00
Tim Brooks	ca11085bb6	Add TcpChannel to unify Transport implementations (#27132 ) Right now our different transport implementations must duplicate functionality in order to stay compliant with the requirements of TcpTransport. They must all implement common logic to open channels, close channels, keep track of channels for eventual shutdown, etc. Additionally, there is a weird and complicated relationship between Transport and TransportService. We eventually want to start merging some of the functionality between these classes. This commit starts moving towards a world where TransportService retains all the application logic and channel state. Transport implementations in this world will only be tasked with returning a channel when one is requested, calling transport service when a channel is accepted from a server, and starting / stopping itself. Specifically this commit changes how channels are opened and closed. All Transport implementations now return a channel type that must comply with the new TcpChannel interface. This interface has the methods necessary for TcpTransport to completely manage the lifecycle of a channel. This includes setting the channel up, waiting for connection, adding close listeners, and eventually closing.	2017-11-15 12:38:39 -07:00
Clinton Gormley	1caa5c8e32	Rest test fixes (#27354 ) * REST: Rename ingest.processor.grok to ingest.processor_grok * REST: Rename remote.info to cluster.remote_info * REST: Fixed bad YAML comments * REST: Force dummy scripts to be strings, not numbers * REST: Fix bad YAML in search/110_field_collapsing.yml * REST: Adjust percentile tests to work with Perl number handling	2017-11-14 11:14:14 +01:00
Tal Levy	5c34533761	add json-processor support for non-map json types (#27335 ) The Json Processor originally only supported parsing field values into Maps even though the JSON spec specifies that strings, null-values, numbers, booleans, and arrays are also valid JSON types. This commit enables parsing these values now. response to #25972.	2017-11-13 10:28:19 -08:00
Martijn van Groningen	7c056f4523	reword comment	2017-11-13 08:00:34 +01:00
Ryan Ernst	8b9e23de93	Plugins: Add versionless alias to all security policy codebase properties (#26756 ) This is a followup to #26521. This commit expands the alias added for the elasticsearch client codebase to all codebases. The original full jar name property is left intact. This only adds an alias without the version, which should help ease the pain in updating any versions (ES itself or dependencies).	2017-11-10 11:00:09 -08:00
Martijn van Groningen	1bd31e9b53	percolator: fixed issue where in indices created before 6.1 if minimum should match has been specified on a disjunction, the query would be marked as verified candidate match. This is wrong as it can only marked as verified candidate match on indices created on or after 6.1, due to the use of the CoveringQuery.	2017-11-10 12:02:33 +01:00
Martijn van Groningen	b4048b4e7f	Use CoveringQuery to select percolate candidate matches and extract all clauses from a conjunction query. When clauses from a conjunction are extracted the number of clauses is also stored in an internal doc values field (minimum_should_match field). This field is used by the CoveringQuery and allows the percolator to reduce the number of false positives when selecting candidate matches and in certain cases be absolutely sure that a conjunction candidate match will match and then skip MemoryIndex validation. This can greatly improve performance. Before this change only a single clause was extracted from a conjunction query. The percolator tried to extract the clauses that was rarest in order (based on term length) to attempt less candidate queries to be selected in the first place. However this still method there is still a very high chance that candidate query matches are false positives. This change also removes the influencing query extraction added via #26081 as this is no longer needed because now all conjunction clauses are extracted. https://www.elastic.co/guide/en/elasticsearch/reference/6.x/percolator.html#_influencing_query_extraction Closes #26307	2017-11-10 07:44:42 +01:00
Tal Levy	d22fd4ea58	Introduce templating support to timezone/locale in DateProcessor (#27089 ) Sometimes systems like Beats would want to extract the date's timezone and/or locale from a value in a field of the document. This PR adds support for mustache templating to extract these values. Closes #24024.	2017-11-09 09:45:32 -08:00
Mayya Sharipova	148376c2c5	Add limits for ngram and shingle settings (#27211 ) * Add limits for ngram and shingle settings (#27211) Create index-level settings: max_ngram_diff - maximum allowed difference between max_gram and min_gram in NGramTokenFilter/NGramTokenizer. Default is 1. max_shingle_diff - maximum allowed difference between max_shingle_size and min_shingle_size in ShingleTokenFilter. Default is 3. Throw an IllegalArgumentException when trying to create NGramTokenFilter, NGramTokenizer, ShingleTokenFilter where difference between max_size and min_size exceeds the settings value. Closes #25887	2017-11-07 08:14:55 -05:00
David Roberts	749c3ec716	Remove the single argument Environment constructor (#27235 ) Only tests should use the single argument Environment constructor. To enforce this the single arg Environment constructor has been replaced with a test framework factory method. Production code (beyond initial Bootstrap) should always use the same Environment object that Node.getEnvironment() returns. This Environment is also available via dependency injection.	2017-11-04 13:25:09 +00:00
Armin Braun	3deba0ed1f	#26260 Allow ip_range to accept CIDR notation (#27192 ) * #26260 Allow ip_range to accept CIDR notation * #26260 added non-byte-alligned cidr test cases	2017-11-03 13:34:48 -06:00
Armin Braun	8f0f024507	#27189 Fixed rounding of bounds in scaled float comparison (#27207 ) * #27189 Fixed rounding of bounds in scaled float comparison * #27189 more assertions from CR	2017-11-03 13:23:07 -06:00
Armin Braun	f9e755f980	Fixed byte buffer leak in Netty4 request handler If creating the REST request throws an exception (for example, because of invalid headers), we leak the request due to failure to release the buffer (which would otherwise happen after replying on the channel). This commit addresses this leak by handling the failure case. Relates #27222	2017-11-02 20:22:19 -04:00
Colin Goodheart-Smithe	c1b8140c83	Upgrade to Lucene 7.1 (#27225 )	2017-11-02 13:25:33 +00:00
Colin Goodheart-Smithe	99aca9cdfc	Enhances exists queries to reduce need for `_field_names` (#26930 ) * Enhances exists queries to reduce need for `_field_names` Before this change we wrote the name all the fields in a document to a `_field_names` field and then implemented exists queries as a term query on this field. The problem with this approach is that it bloats the index and also affects indexing performance. This change adds a new method `existsQuery()` to `MappedFieldType` which is implemented by each sub-class. For most field types if doc values are available a `DocValuesFieldExistsQuery` is used, falling back to using `_field_names` if doc values are disabled. Note that only fields where no doc values are available are written to `_field_names`. Closes #26770 * Addresses review comments * Addresses more review comments * implements existsQuery explicitly on every mapper * Reinstates ability to perform term query on `_field_names` * Added bwc depending on index created version * Review Comments * Skips tests that are not supported in 6.1.0 These values will need to be changed after backporting this PR to 6.x	2017-11-01 10:46:59 +00:00
Jack Conradson	abaede2373	Upgrade Painless from ANTLR 4.5.1-1 to ANTLR 4.5.3. (#27153 )	2017-10-27 11:07:49 -07:00
Christoph Büscher	b88dbe8f49	[Tests] Fix occasional test failure due to two random values being the same	2017-10-27 12:06:16 +02:00
Jack Conradson	dda5d1af29	Allow for the Painless Definition to have multiple instances (#27096 )	2017-10-26 08:33:55 -07:00
Jason Tedor	9aae2f593a	Avoid stack overflow on search phases When a search is executing locally over many shards, we can stack overflow during query phase execution. This happens due to callbacks that occur after a phase completes for a shard and we move to the same phase on another shard. If all the shards for the query are local to the local node then we will never go async and these callbacks will end up as recursive calls. With sufficiently many shards, this will end up as a stack overflow. This commit addresses this by truncating the stack by forking to another thread on the executor for the phase. Relates #27069	2017-10-25 22:05:46 -04:00
Ryan Ernst	2a8452b513	Reindex: Fix headers in reindex action (#26937 ) The headers passed to reindex were skipped except for the last one. This commit fixes the copying of the headers, as well as adds a base test case for rest client builders to access the headers within the built rest client. relates #22976	2017-10-25 16:37:01 -07:00
Tim Brooks	a7fa5d3335	Remove dangerous `ByteBufStreamInput` methods (#27076 ) This commit removes the `ByteBufStreamInput` `readBytesReference` and `readBytesRef` methods. These methods are zero-copy which means that they retain a reference to the underlying netty buffer. The problem is that our `TcpTransport` is not designed to handle zero-copy. The netty implementation sets the read index past the current message once it has been deserialized, handled, and mostly likely dispatched to another thread. This means that netty is free to release this buffer. So it is unsafe to retain a reference to it without calling `retain`. And we cannot call `retain` because we are not currently designed to handle reference counting past the transport level. This should not currently impact us as we wrap the `ByteBufStreamInput` in `NamedWriteableAwareStreamInput` in the `TcpTransport`. This stream essentially delegates to the underling stream. However, in the case of `readBytesReference` and `readBytesRef` it leaves thw implementations to the standard `StreamInput` methods. These methods call the read byte array method which delegates to `ByteBufStreamInput`. The read byte array method on `ByteBufStreamInput` copies so it is safe. The only impact of this commit should be removing methods that could be dangerous if they were eventually called due to some refactoring.	2017-10-24 08:51:14 -06:00
Martijn van Groningen	93107f8466	removed unused import	2017-10-23 10:00:54 +02:00
Martijn van Groningen	141d1b62e9	ingest: date processor should not fail if timestamp is specified as json number Closes #26967	2017-10-23 09:32:44 +02:00
Tanguy Leroux	463e7e6fa3	Revert "Upgrade to Jackson 2.9.2 (#27032 )" This reverts commit `0b9acc5ace`.	2017-10-20 08:25:41 +02:00
Tanguy Leroux	0b9acc5ace	Upgrade to Jackson 2.9.2 (#27032 ) Upgrade to Jackson 2.9.2 and also use a boolean `closed` flag to indicate that a FastStringReader instance is closed, so that length is still correctly reported after the reader is closed.	2017-10-19 15:15:02 +02:00
Simon Willnauer	8dda827ff4	Don't refresh on `_flush` `_force_merge` and `_upgrade` (#27000 ) Today all these API calls have a sideeffect of making documents visible to search requests. While this is sometimes desired it's an unnecessary sideeffect and now that we have an internal (engine-private) index reader (#26972) we artificially add a refresh call for bwc. This change removes this sideeffect in 7.0.	2017-10-16 10:16:35 +02:00
Tim Brooks	277637f42f	Do not set SO_LINGER on server channels (#26997 ) Right now we are attempting to set SO_LINGER to 0 on server channels when we are stopping the tcp transport. This is not a supported socket option and throws an exception. This also prevents the channels from being closed. This commit 1. doesn't set SO_LINGER for server channges, 2. checks that it is a supported option in nio, and 3. changes the log message to warn for server channel close exceptions.	2017-10-13 13:06:38 -06:00
Anton Pozhidaev	cee9640c20	Update by Query is modified to accept short `script` parameter. (#26841 ) Update by Query is modified to accept short `script` parameter. Closes issue #24898	2017-10-11 21:57:46 +00:00
kel	2e36f19051	Add support for parsing inline script (#23824 ) (#26846 ) * Add support for parsing inline script (#23824) * Fix test	2017-10-11 09:15:37 -07:00
Jason Tedor	4c06b8f1d2	Check for closed connection while opening While opening a connection to a node, a channel can subsequently close. If this happens, a future callback whose purpose is to close all other channels and disconnect from the node will fire. However, this future will not be ready to close all the channels because the connection will not be exposed to the future callback yet. Since this callback is run once, we will never try to disconnect from this node again and we will be left with a closed channel. This commit adds a check that all channels are open before exposing the channel and throws a general connection exception. In this case, the usual connection retry logic will take over. Relates #26932	2017-10-10 13:34:51 -04:00
Nik Everett	4a06dd919a	Painless: add tests for cached boxing (#24163 ) We had a TODO about adding tests around cached boxing. In #24077 I tracked down the uncached boxing tests and saw the TODO. Cached boxing testing is a fairly small extension to that work.	2017-10-10 10:34:03 -04:00
Tanguy Leroux	6658ff0fd6	Don't detect source's XContentType in DocumentParser.parseDocument() (#26880 ) DocumentParser.parseDocument() auto detects the XContentType of the document to parse, but this information is already provided by SourceToParse.	2017-10-10 15:31:56 +02:00
Daniel Mitterdorfer	e22844bd2a	Allow only a fixed-size receive predictor (#26165 ) With this commit we simplify our network layer by only allowing to define a fixed receive predictor size instead of a minimum and maximum value. This also means that the following (previously undocumented) settings are removed: * http.netty.receive_predictor_min * http.netty.receive_predictor_max Using an adaptive sizing policy in the receive predictor is a very low-level optimization. The implications on allocation behavior are extremely hard to grasp (see our previous work in #23185) and adaptive sizing does not provide a lot of benefits (see benchmarks in #26165 for more details).	2017-10-10 13:29:45 +02:00
Martijn van Groningen	bba70205e3	ingest: Fix bug that prevent date_index_name processor from accepting timestamps specified as a json number Closes #26890	2017-10-10 10:04:29 +02:00
Ryan Ernst	6b53dadcf9	Scripting: Fix expressions to temporarily support filter scripts (#26824 ) This commit adds a hack converting 0.0 to false and non-zero to true for expressions operating under a filter context. closes #26429	2017-10-09 17:02:21 -07:00
Simon Willnauer	cdd7c1e6c2	Return List instead of an array from settings (#26903 ) Today we return a `String[]` that requires copying values for every access. Yet, we already store the setting as a list so we can also directly return the unmodifiable list directly. This makes list / array access in settings a much cheaper operation especially if lists are large.	2017-10-09 09:52:08 +02:00
Nik Everett	38989191e7	Use LF line endings in Painless generated files (#26822 ) Right now if you run `gradle regen` on Windows you'll get `CRLF` line endings on all the ANTLR generated files because we run ``` ant.fixcrlf(srcdir: outputPath) { patternset(includes: 'Painless*.java') } ``` The docs for fixcrlf say that the default line endings that it corrects to is based on the OS: https://ant.apache.org/manual/Tasks/fixcrlf.html This change locks it to `LF`.	2017-10-06 16:44:03 -04:00
Yannick Welsch	c1666f4a22	Use proper logging placeholder for Netty logging	2017-10-06 10:02:51 +02:00
Yannick Welsch	ec6ea9b403	Add Netty channel information on write and flush failure	2017-10-06 09:16:58 +02:00
Jason Tedor	470e5e7cfc	Add additional low-level logging handler () * Add additional low-level logging handler We have the trace handler which is useful for recording sent messages but there are times where it would be useful to have more low-level logging about the events occurring on a channel. This commit adds a logging handler that can be enabled by setting a certain log level (org.elasticsearch.transport.netty4.ESLoggingHandler) to trace that provides trace logging on low-level channel events and includes some information about the request/response read/write events on the channel as well. * Remove imports * License header * Remove redundant * Add test * More assertions	2017-10-05 12:10:58 -04:00
Jason Tedor	597187048b	Unwrap causes when maybe dying We should unwrap the cause looking for any suppressed errors or root causes that are errors when checking if we should maybe die. This commit causes that to be the case. Relates #26884	2017-10-05 12:00:30 -04:00
Jason Tedor	4835d61a48	Change log level on write and flush failure to warn This commit changes the log level on a write and flush failure to warn as this is not necessarily an Elasticsearch problem but more likely indicative of an infrastructure problem.	2017-10-05 11:18:43 -04:00
Martijn van Groningen	b27e408ed2	Removed void token filter entries and added two tests	2017-10-05 13:25:05 +02:00
Md. Abdulla-Al-Sun	a40c474e10	Added Bengali Analyzer to Elasticsearch with respect to the lucene update(PR#238)	2017-10-05 13:25:05 +02:00
Simon Willnauer	00dfdf50cf	Represent lists as actual lists inside Settings (#26878 ) Today we represent each value of a list setting with it's own dedicated key that ends with the index of the value in the list. Aside of the obvious weirdness this has several issues especially if lists are massive since it causes massive runtime penalties when validating settings. Like a list of 100k words will literally cause a create index call to timeout and in-turn massive slowdown on all subsequent validations runs. With this change we use a simple string list to represent the list. This change also forbids to add a settings that ends with a .0 which was internally used to detect a list setting. Once this has been rolled out for an entire major version all the internal .0 handling can be removed since all settings will be converted. Relates to #26723	2017-10-05 09:27:08 +02:00
Martijn van Groningen	dca787ed8a	upgrade to Lucene 7.1.0 snapshot version	2017-10-05 09:06:56 +02:00
Simon Willnauer	d1533e2397	Remove Settings#getAsMap() (#26845 ) Since `#getAsMap` exposes internal representation we are trying to remove it step by step. This commit is cleaning up some xcontent writing as well as usage in tests	2017-10-04 01:21:38 -06:00
Tim Brooks	d80ad7f097	Check channel i open before setting SO_LINGER (#26857 ) This commit fixes a #26855. Right now we set SO_LINGER to 0 if we are stopping the transport. This can throw a ChannelClosedException if the raw channel is already closed. We have a number of scenarios where it is possible this could be called with a channel that is already closed. This commit fixes the issue be checking that the channel is not closed before attempting to set the socket option.	2017-10-02 15:09:52 -06:00
Jason Tedor	5869a7482b	Maybe die before trying to log cause This commit reorders a maybe die check and a logging statement for the following reasons: - we should die as quickly as possible if the cause is fatal - we do not want the JVM to be so broken that when we try to log another exception is thrown (maybe another out of memory exception) and then the maybe die is never invoked - maybe die will log the cause anyway if the cause is fatal so we only need to log if the cause is not fatal	2017-10-01 09:45:36 -04:00
Jason Tedor	1084c7b6b2	Log cause when a write and flush fails This commit logs the cause of a write and flush operation on the network layer that failed.	2017-10-01 09:41:13 -04:00
Jason Tedor	f79842be6f	Die if write listener fails due to fatal error This commit performs a maybe die check after a write listener fails.	2017-09-30 18:52:54 -04:00
Simon Willnauer	25d6778d31	Add comment to TCP transport impls why we set SO_LINGER on close	2017-09-28 13:07:01 +02:00
Martijn van Groningen	805437b8bc	percolator: Also support query extraction for queries wrapped inside a ESToParentBlockJoinQuery	2017-09-28 09:28:50 +02:00
Armin Braun	af06231d4c	#26701 Close TcpTransport on RST in some Spots to Prevent Leaking TIME_WAIT Sockets (#26764 ) #26701 Added option to RST instead of FIN to TcpTransport#closeChannels	2017-09-26 19:58:11 +00:00
Christoph Büscher	6189c54c84	Reject the `index_options` parameter for numeric fields (#26668 ) Numeric fields no longer support the index_options parameter. This changes the parameter to be rejected in numeric field types after it was deprecated in 6.0. Closes #21475	2017-09-25 23:43:14 +02:00
Nik Everett	eb754a71be	Fix update_by_query's default size parameter (#26784 ) We were accidentally defaulting it to the scroll size. Untwists some of the tricks that we play with parsing so that the size is no longer scrambled. Closes #26761	2017-09-25 16:25:27 -04:00
Jiri Tyr	76f8701eec	Fixing Grok pattern for Apache 2.4 (#26635 )	2017-09-25 07:59:37 -07:00
Simon Willnauer	aab4655e63	Unify Settings xcontent reading and writing (#26739 ) This change adds a fromXContent method to Settings that allows to read the xcontent that is produced by toXContent. It also replaces the entire settings loader infrastructure and removes the structured map representation. Future PRs will also tackle the `getAsMap` that exposes the internal represenation of settings for better encapsulation.	2017-09-25 13:23:01 +02:00
Jason Tedor	e0db89bc35	Upgrade to Lucene 7.0.0 This commit upgrades to the GA release of Luence 7! Relates #26744	2017-09-21 19:19:33 -04:00
Christoph Büscher	86b00b84bc	Remove parse field deprecations in query builders (#26711 ) The `fielddata` field and the use of the `_name` field in the short syntax of the range query have been deprecated in 5.0 and can be removed. The same goes for the deprecated `score_mode` field in HasParentQueryBuilder, the deprecated `like_text`, `ids` and `docs` parameter in the `more_like_this` query, the deprecated query name in the short version of the `regexp` query, and several deprecated alternative field names in other query builders.	2017-09-20 16:22:21 +02:00
Christoph Büscher	22e200e79a	Remove deprecated type and slop field in MatchQueryBuilder (#26720 ) The `type` field has been deprecated in 5.0 and can be removed. It has been replaced by using the MatchPhraseQueryBuilder or the MatchPhrasePrefixQueryBuilder. The `slop` field has also been deprecated and can be removed, the phrase and phrase prefix query builders still provide this parameter.	2017-09-20 14:24:30 +02:00
Yannick Welsch	ff1e26276d	Deguice ActionFilter (#26691 ) Allows to instantiate TransportAction instances without Guice.	2017-09-20 10:30:21 +02:00
Jack Conradson	c3746b268c	Separate Painless Whitelist Loading from the Painless Definition (#26540 ) Adds several small whitelist data structures and a new Whitelist class to separate the idea of loading a whitelist from the actual Painless Definition class. This is the first step of many in allowing users to define custom whitelists per context. Also supports the idea of loading multiple whitelists from different sources for a single context.	2017-09-18 15:51:07 -07:00
Simon Willnauer	9f97f9072a	Allow `InputStreamStreamInput` array size validation where applicable (#26692 ) Today we can't validate the array length in `InputStreamStreamInput` since we can't rely on `InputStream.available` yet in some situations we know the size of the stream and can apply additional validation.	2017-09-18 17:52:36 +02:00
Christoph Büscher	c83ec1f133	Fixing test after merging in master	2017-09-15 13:44:40 +02:00
Christoph Büscher	bea8451b2f	Merge branch 'master' into feature/rank-eval	2017-09-15 11:44:51 +02:00
Michael Basnight	f385e0cf26	Add bad_request to the rest-api-spec catch params (#26539 ) This adds another request to the catch params. It also makes sure that the generic request param does not allow 400 either.	2017-09-14 14:24:03 -05:00
Christoph Büscher	c7c6443b10	[Docs] "The the" is a great band, but ... (#26644 ) Removing several occurrences of this typo in the docs and javadocs, seems to be a common mistake. Corrections turn up once in a while in PRs, better to correct some of this in one sweep.	2017-09-14 15:08:20 +02:00
Jim Ferenczi	401f4ba2ce	Fix percolator highlight sub fetch phase to not highlight query twice (#26622 ) * Fix percolator highlight sub fetch phase to not highlight query twice The PercolatorHighlightSubFetchPhase does not override hitExecute and since it extends HighlightPhase the search hits are highlighted twice (by the highlight phase and then by the percolator). This does not alter the results, the second highlighting just overrides the first one but this slow down the request because it duplicates the work.	2017-09-14 09:31:14 +02:00
Adrien Grand	93da7720ff	Move non-core mappers to a module. (#26549 ) Today we have all non-plugin mappers in core. I'd like to start moving those that neither map to json datatypes nor are very frequently used like `date` or `ip` to a module. This commit creates a new module called `mappers-extra` and moves the `scaled_float` and `token_count` mappers to it. I'd like to eventually move `range` fields there but it's more complicated due to their intimate relationship with range queries. Relates #10368	2017-09-13 17:58:53 +02:00
Ryan Ernst	9834081254	Fix reference to painless inside expression engine (#26528 ) This was a simple copy/paste bug in an earlier refactoring.	2017-09-11 15:44:23 -07:00
Adrien Grand	1adee8b5a8	Fix the MapperFieldType.rangeQuery API. (#26552 ) RangeQueryBuilder needs to perform too many `instanceof` checks in order to check for `date` or `range` fields in order to know what it should do with the shape relation, time zone and date format. This commit adds those 3 parameters to the `rangeQuery` factory method so that those instanceof checks are not necessary anymore.	2017-09-11 11:02:05 +02:00
Martijn van Groningen	b391425da1	Added support to the percolate query to percolate multiple documents The percolator will add a `_percolator_document_slot` field to all percolator hits to indicate with what document it has matched. This number matches with the order in which the documents have been specified in the percolate query. Also improved the support for multiple percolate queries in a search request.	2017-09-08 17:28:39 +02:00
Ryan Ernst	c9964d17bf	Internal: Add versionless alias for rest client codebase in policy files (#26521 ) Security manager policy files contains grants for specific codebases, where a codebase is a jar file. We use a system property containing the name of the jar file to resolve the jar file location when parsing the policy file. However, this means the version of the jars must be modified when versions of dependencies change. This is particularly messy for elasticsearch, where we now have a dependency on the rest client, and need to support both a snapshot version for testing and non snapshot for release. This commit adds an alias for the elasticsearch rest client without a version to be used in policy files. That allows the policy files to not care whether the rest client is a snapshot or release.	2017-09-06 18:57:10 -07:00
Martijn van Groningen	6bdf591193	removed unused import	2017-09-06 07:01:58 +02:00
Martijn van Groningen	77bbe99102	Fix two unreleased percolator query analyze bugs * If in a range query upper is smaller than lower then ignore the range query * If two empty range extractions are compared don't fail with NoSuchElementException	2017-09-06 06:47:01 +02:00
Martijn van Groningen	2ad3608245	percolator: handle point queries with 2 or more dimensions correctly	2017-09-06 06:36:47 +02:00
Martijn van Groningen	78e9c96d7f	Added a limit to from + size in top_hits and inner hits. Relates to #11511	2017-09-05 08:44:45 +02:00
Martijn van Groningen	a4d5c6418e	percolator: Rename map_unmapped_fields_as_string setting to map_unmapped_fields_as_text The `index.percolator.map_unmapped_fields_as_text` is a more better name, because unmapped fields are mapped to a text field with default settings and string is no longer a field type (it is either keyword or text).	2017-09-04 14:12:44 +02:00
Alexander Reelsen	80d0a32f8e	ScriptService: Replace max compilation per minute setting with max compilation rate (#26399 ) The current script service has a script compilation limit for a one minute window. This is set to a small default value of 15. Instead of increasing that default value, this commit introduces a new setting that allows to configure a rate per time unit, so that the script service can deal with bursts better. The new setting is named `script.max_compilations_rate`, requires a nonnegative number and a positive time value. The default is `75/5m`, which is equivalent to the existing 15 per minute.	2017-09-01 10:15:27 +02:00
Adrien Grand	78681bc9e5	Upgrade to lucene-7.0.0-snapshot-d94a5f0. (#26441 )	2017-08-31 09:06:40 +02:00
Tal Levy	ed151d829d	Migrate Search requests to use Writeable reading strategies (#26428 ) Migrates many SearchRequest objects to use Writeable conventions and rejects usage of `readFrom` in these new classes.	2017-08-30 11:00:33 -07:00
Colin Goodheart-Smithe	ce1d85d7d0	Moves deferring code into its own subclass (#26421 ) * Moves deferring code into its own subclass This change moves the code that deals with deferring collection to a subclass of BucketAggregator called DeferringBucketAggregator. This means that the code in AggregatorBase is simplified and also means that the code for deferring colleciton is in one place and easier to maintain. * Makes SIngleBucketAggregator an interface This is so aggregators that extend BucketsAggregator directly and those that extend DeferringBucketAggregator can be a single bucket aggregator * review comments * More review comments	2017-08-30 11:15:40 +01:00
Adrien Grand	34a6c7af26	Consolidate locale parsing. (#26400 ) Mappings and ingest have different locale parsing code.	2017-08-30 10:58:33 +02:00
Adrien Grand	06b7f9c78e	Do not test the ingest date processor against random locales. Random locales include locales whose country name is obsolete like `CS` or have usage restrictions like `DG`. Closes #26425	2017-08-30 09:48:26 +02:00
Ryan Ernst	b56615ef46	Test: disable locale parsing test that is broken with some randomized values See https://github.com/elastic/elasticsearch/issues/26425	2017-08-29 11:57:57 -07:00
Andy Bristol	791cbc2ba7	[TEST] test logging for reindex RethrottleTests #26192	2017-08-28 15:50:38 -07:00
Jim Ferenczi	86d97971a4	Remove the _all metadata field (#26356 ) * Remove the _all metadata field This change removes the `_all` metadata field. This field is deprecated in 6 and cannot be activated for indices created in 6 so it can be safely removed in the next major version (e.g. 7).	2017-08-28 17:43:59 +02:00
Stuart Neivandt	f842ff1ae1	Simple verification of the format of the language tag used in DateProcessor. (#25513 ) Closes #26186	2017-08-28 10:59:00 +02:00
Adrien Grand	eb782492be	Remove support for lenient booleans. Closes #22298	2017-08-28 09:56:01 +02:00
Michael Basnight	cfd14cd2b8	Revert shading for the low level rest client (#26367 ) At current, we do not feel there is enough of a reason to shade the low level rest client. It caused problems with commons logging and IDE's during the brief time it was used. We did not know exactly how many users will need this, and decided that leaving shading out until we gather more information is best. Users can still shade the jar themselves. For information and feeback, see issue #26366. Closes #26328 This reverts commit `3a20922046`. This reverts commit `2c271f0f22`. This reverts commit `9d10dbea39`. This reverts commit `e816ef89a2`.	2017-08-25 14:13:12 -05:00
Tim Brooks	0551d2ff68	Move generic http settings out of netty module (#26310 ) There is a group of five settings relating to raw tcp configurations (no_delay, buffer sizes, etc) that we have for the http transport. These currently live in the netty module. As they are unrelated to netty specifically, this commit moves these settings to the `HttpTransportSettings` class in core.	2017-08-24 19:27:56 -05:00
Christoph Büscher	cb4fd3bac6	Fix more tests	2017-08-23 13:14:48 +02:00
Christoph Büscher	56360ecfb5	Fix failing tests due to xContent changes	2017-08-23 12:22:07 +02:00
Christoph Büscher	bc544e2d1b	Adapt branch to changes on master	2017-08-23 12:05:52 +02:00
Christoph Büscher	62a7cac3a0	Merge branch 'master' into feature/rank-eval	2017-08-23 11:19:16 +02:00
Yannick Welsch	0dbebd9177	Set classpath on Painless apiJavadoc task so it also works on JDK 9	2017-08-23 14:45:40 +09:30
Sergey Galkin	9a3216dfee	Stricter validation for min/max values for whole numbers (#26137 )	2017-08-21 12:16:45 +02:00
Andy Bristol	6eef6c4f7a	[TEST] wait until reindex tasks ready for rethrottle (#26250 ) When slices is set as auto, there's an additional network call needed for the reindex tasks to know how to rethrottle. Sometimes the rethrottle action happens before the reindex task is fully initialized, so in the test we wait for the task to be ready. This commit also adds some safeguards to ensure that cancel and rethrottle operations are handled correctly Closes #26192	2017-08-18 11:01:27 -07:00
Jack Conradson	23858789f0	Remove Sort enum from Painless Definition (#26179 ) This is step toward making Definition instanceable which is necessary for custom whitelists in different contexts.	2017-08-17 12:02:11 -07:00
Nik Everett	b840fa3117	Fix some links in Painless method reference Links to inner classes were using `$` in urls instead of `.`, causing them to 404. Also fixes the doc generation code to generate docs into the correct directory. We moved the docs but never updated the generation code.	2017-08-17 10:50:38 -04:00
Tim Brooks	f69cc78b67	Release pipelined http responses on close (#26226 ) Right now it is possible for the `HttpPipeliningHandler` to queue pipelined responses. On channel close, we do not clear and release these responses. This commit releases the responses and completes the promise.	2017-08-16 13:23:32 -05:00
Yannick Welsch	01f6851691	Serialize and expose timeout of acknowledged requests in REST layer (#26189 ) Due to the weird way of structuring the serialization code in AcknowledgedRequest, many request types forgot to properly serialize the request timeout, for example "index deletion", "index rollover", "index shrink", "putting pipeline", and other requests. This means that if those requests were not directly sent to the master node, the acknowledgement timeout information would be lost (and the default used instead). Some requests also don't properly expose the timeout mechanism in the REST layer, such as put / delete stored script. This commit fixes all that.	2017-08-16 07:43:05 +08:00
desmorto	292dd8f992	(refactor) some opportunities to use diamond operator (#25585 ) * (refactor) some opportunities to use diamond operator * Update ExceptionRetryIT.java update typo	2017-08-15 16:36:42 -06:00
Tal Levy	0c76d17fe1	fix targetField randomization in JoinProcessorTests (#26206 ) Closes #26203.	2017-08-14 09:26:47 -07:00
Andy Bristol	fcd8483505	AwaitsFix failing RethrottleTests	2017-08-14 08:10:47 -07:00
Tal Levy	10c3c1aef0	fix SplitProcessor targetField test (#26178 ) This test was too lenient with its randomization of targetFieldName and resulting in a conflict with the original existing fields. This commit fixes that. Closes #26177.	2017-08-11 16:18:04 -07:00
Martijn van Groningen	1146a35870	Move more token filters to analysis-common module The following token filters were moved: arabic_stem, brazilian_stem, czech_stem, dutch_stem, french_stem, german_stem and russian_stem. Relates to #23658	2017-08-11 17:39:24 +02:00
Andy Bristol	7e3cd6a019	reindex: automatically choose the number of slices (#26030 ) In reindex APIs, when using the `slices` parameter to choose the number of slices, adds the option to specify `slices` as "auto" which will choose a reasonable number of slices. It uses the number of shards in the source index, up to a ceiling. If there is more than one source index, it uses the smallest number of shards among them. This gives users an easy way to use slicing in these APIs without having to make decisions about how to configure it, as it provides a good-enough configuration for them out of the box. This may become the default behavior for these APIs in the future.	2017-08-11 08:25:25 -07:00
Martijn van Groningen	636e85e5b7	percolator: Hint what clauses are important in a conjunction query based on fields The percolator field mapper doesn't need to extract all terms and ranges from a bool query with must or filter clauses. In order to help to default extraction behavior, boost fields can be configured, so that fields that are known for not being selective enough can be ignored in favor for other fields or clauses with specific fields can forcefully take precedence over other clauses. This can help selecting clauses for fields that don't match with a lot of percolator queries over other clauses and thus improving performance of the percolate query. For example a status like field is something that should configured as an ignore field. Queries on this field tend to match with more documents and so if clauses for this fields get selected as best clause then that isn't very helpful for the candidate query that the percolate query generates to filter out percolator queries that are likely not going to match.	2017-08-11 15:32:01 +02:00
Daniel Mitterdorfer	637cc872f4	Remove unused Netty-related settings (#26161 ) With this commit we remove the following three previously unused (and undocumented) Netty 4 related settings: * transport.netty.max_cumulation_buffer_capacity, * transport.netty.max_composite_buffer_components and * http.netty.max_cumulation_buffer_capacity from Elasticsearch.	2017-08-11 12:03:00 +02:00
Martijn van Groningen	8285a0f399	percolator: Use correct version for bwc checking now that the change has been backported to 6.0 branch	2017-08-09 13:49:20 +02:00
Adrien Grand	f0c1e30544	Upgrade to lucene-7.0.0-snapshot-a128fcb. (#26090 )	2017-08-08 13:03:19 +02:00
olcbean	5c4c1c5e15	Verify that _bulk and _msearch requests are terminated by a newline (#25740 )	2017-08-08 10:45:44 +02:00
Simon Willnauer	82fa531ab4	Remove `_index` fielddata hack if cluster alias is present (#26082 ) We introduced a hack in #25885 to respect the cluster alias if available on the `_index` field. This is important if aggregations or other field data related operations are executed. Yet, we added a small hack that duplicated an implementation detail from the `_index` field data builder to make this work. This change adds a necessary but simple API change that allows us to remove the hack and only have a single implementation.	2017-08-08 09:24:24 +02:00
Adrien Grand	f0cba4fce5	Add a scripted similarity. (#25831 ) The goal of this similarity is to help users who would like to keep the functionality of the `tf-idf` similarity that we want to remove, or to allow for specific usec-cases (disabling idf, disabling tf, disabling length norm, etc.) to not have to build a custom plugin and familiarize with the low-level Lucene API.	2017-08-08 08:55:12 +02:00
Tal Levy	872526cad3	add URL-Decode Processor to Ingest (#26045 ) closes #25837 Adds a URL Decoder Processor to Ingest this will decode urls like: https%3a%2f%2felastic.co%2 to https://elastic.co/	2017-08-07 10:26:11 -07:00
Christoph Büscher	18155ed69a	Merge branch 'master' into feature/rank-eval	2017-08-07 16:07:34 +02:00
Luca Cavanna	14ba36977e	[TEST] prevent yaml tests from using raw requests (#26044 ) Raw requests are supported only by the java yaml test runner and were introduced to test docs snippets. Some yaml tests ended up using them (see #23497) which causes failures for other language clients. This commit migrates those yaml tests to Java tests that send requests through the Java low-level REST client, and also moves the ability to send raw requests to a special client that's only available when testing docs snippets. Closes #25694	2017-08-07 11:02:16 +02:00
Martijn van Groningen	11ce6b91a4	test: Do not use random index writer as test expects a single segment check against right version	2017-08-07 09:40:54 +02:00
Colin Goodheart-Smithe	bb3d5b7426	[TEST] Fix internalMatrixStatsTests failure	2017-08-02 16:36:34 +01:00
Colin Goodheart-Smithe	87c6e63e73	Adds mutate function to various tests (#25999 ) * Adds mutate function to various tests Relates to #25929 * fix test * implements mutate function for all single bucket aggs * review comments * convert getMutateFunction to mutateIInstance	2017-08-02 11:38:31 +01:00
Martijn van Groningen	53dd8afaea	fix test	2017-08-02 11:25:03 +02:00
Martijn van Groningen	a3d1248014	percolator: use correct version.	2017-08-02 10:37:59 +02:00
Adrien Grand	88d456989e	Make FieldMapper.copyTo() always non-null. (#25994 ) Otherwise it is confusing that both a null copyTo and an empty copyTo should be treated the same.	2017-08-02 10:07:29 +02:00
Tim Brooks	0f4f49496f	Use nio transport in test clusters (#25986 ) This commit adds the nio transport as an option in place of the mock tcp transport for tests. Each test will only use one transport type. The transport type is decided by a random boolean generated inside of the `ESTestCase` class.	2017-08-01 16:19:31 -05:00
Ryan Ernst	072281d5aa	Update version to 7.0.0-alpha1 (#25876 ) This commit updates the version for master to 7.0.0-alpha1. It also adds the 6.1 version constant, and fixes many tests, as well as marking some as awaits fix. Closes #25893 Closes #25870	2017-08-01 15:47:48 -04:00
Adrien Grand	53c829b6bc	Painless: allow doubles to be casted to longs. (#25936 ) Running `(long) someDoubleValue` currently throws a `ClassCastException` while eg. `(int) someDoubleValue` is accepted.	2017-08-01 16:22:55 +02:00
Jason Tedor	764f7ef2ef	Fix Netty 4 multi-port test This commit fixes an issue with the Netty 4 multi-port test that a transport client can connect. The problem here is that in case the bottom of the random port range was already bound to (for example, by another JVM) then then transport client could not connect to the data node. This is because the transport client was in fact using the bottom of the port range only. Instead, we simply try all the ports that the data node might be bound to. Closes #24441	2017-08-01 19:47:20 +09:00
Martijn van Groningen	5f36bdfda0	percolator: Also support IndexOrDocValuesQuery Otherwise ranges are never extracted properly.	2017-08-01 09:44:42 +02:00
Martijn van Groningen	ff3b909a83	Moved HtmlStripCharFilterFactory to analyis.common package like the other factories.	2017-07-31 15:34:54 +02:00
Martijn van Groningen	0b776a1de0	Move more token filters to analysis-common module The following token filters were moved: delimited_payload_filter, keep, keep_types, classic, apostrophe, decimal_digit, fingerprint, min_hash and scandinavian_folding. Relates to #23658	2017-07-31 15:15:04 +02:00
Martijn van Groningen	7c3735bdc4	percolator: Store the QueryBuilder's Writable representation instead of its XContent representation. The Writeble representation is less heavy to parse and that will benefit percolate performance and throughput. The query builder's binary format has now the same bwc guarentees as the xcontent format. Added a qa test that verifies that percolator queries written in older versions are still readable by the current version.	2017-07-28 12:24:10 +02:00
Yannick Welsch	1a01514081	Move tribe to a module (#25778 ) This commit moves tribe to a module, stripping core from the tribe functionality.	2017-07-28 11:23:50 +02:00
Jim Ferenczi	562c3744ca	Merge FunctionScoreQuery and FiltersFunctionScoreQuery (#25889 ) This change merges the functionality of the FiltersFunctionScoreQuery in the FunctionScoreQuery. It also ensures that an exception is thrown when the computed score is equals to Float.NaN or Float.NEGATIVE_INFINITY. These scores are invalid for TopDocsCollectors that relies on score comparison. Fixes #15709 Fixes #23628	2017-07-28 09:22:20 +02:00
Martijn van Groningen	edad7b4737	Add support for selecting percolator query candidate matches containing range queries. Extracts ranges from range queries on byte, short, integer, long, half_float, scaled_float, float, double, date and ip fields. byte, short, integer and date ranges are normalized to Lucene's LongRange. half_float and float are normalized to Lucene's DoubleRange. When extracting range queries, the QueryAnalyzer computes the width of the range. This width is used to determine what range should be preferred in a conjunction query. The QueryAnalyzer prefers the smaller ranges, because these ranges tend to match with less documents. Closes #21040	2017-07-26 21:25:45 +02:00
Simon Willnauer	b72c71083c	Cleanup IndexFieldData visibility (#25900 ) Today we expose `IndexFieldDataService` outside of IndexService to do maintenance or lookup field data in different ways. Yet, we have a streamlined way to access IndexFieldData via `QueryShardContext` that should encapsulate all access to it. This also ensures that we control all other functionality like cache clearing etc. This change also removes the `recycler` option from `ClearIndicesCacheRequest` this option is a no-op and should have been removed long ago.	2017-07-26 20:03:42 +02:00
Simon Willnauer	634ce90dc0	Respect cluster alias in `_index` aggs and queries (#25885 ) Today when we aggregate on the `_index` field the cross cluster search alias is not taken into account. Neither is it respected when we search on the field. This change adds support for cluster alias when the cluster alias is present on the `_index` field. Closes #25606	2017-07-26 09:16:52 +02:00
Michael Basnight	e816ef89a2	Shade external dependencies in the rest client jar This commit removes all external dependencies from the rest client jar and shades them in an 'org.elasticsearch.client' package within the jar using shadowJar gradle plugin. All projects that depended on the existing jar have been converted to using the 'org.elasticsearch.client' package prefixes to interact with the rest client. Closes #25208	2017-07-24 12:55:43 -05:00
Jim Ferenczi	ab3b5c695a	Pre-configured shingle filter should disable graph analysis (#25853 ) This change disables the graph analysis on default `shingle` filter. The pre-configured shingle filter produces shingles of different size. Graph analysis on such token stream is useless and dangerous as it may create too many paths. Fixes #25555	2017-07-24 18:42:15 +02:00
Simon Willnauer	0e3ad522a2	Rewrite search requests on the coordinating nodes (#25814 ) This change rewrites search requests on the coordinating node before we send requests to the individual shards. This will reduce the rewrite load and object creation for each rewrite on the executing nodes and will fetch resources only once instead of N times once per shard for queries like `terms` query with index lookups. (among percolator and geo-shape) Relates to #25791	2017-07-21 09:38:38 +02:00
Jack Conradson	9f7463e796	remove lang url parameter from stored script requests (#25779 ) Also has updates to ScriptMetaData for allowing the old namespace format to be loaded all the way back through 5.0; however, it will throw an exception if two scripts share the same id but different languages.	2017-07-20 08:51:08 -07:00
Simon Willnauer	5e629cfba0	Ensure query resources are fetched asynchronously during rewrite (#25791 ) The `QueryRewriteContext` used to provide a client object that can be used to fetch geo-shapes, terms or documents for percolation. Unfortunately all client calls used to be blocking calls which can have significant impact on the rewrite phase since it occupies an entire search thread until the resource is received. In the case that the index the resource is fetched from isn't on the local node this can have significant impact on query throughput. Note: this doesn't fix MLT since it fetches stuff in doQuery which is a different beast. Yet, it is a huge step in the right direction	2017-07-20 15:37:50 +02:00
Jay Modi	3e4bc027eb	RestClient uses system properties and system default SSLContext (#25757 ) This commit calls the `useSystemProperties` method on the HttpAsyncClientBuilder so that the jvm system properties are used. The primary reason for doing this is to ensure the builder uses the system default SSLContext rather than the default instance created by the http client library. Closes #23231	2017-07-20 07:36:56 -06:00
Simon Willnauer	4d78935df7	Introduce a new Rewriteable interface to streamline rewriting (#25788 ) Today we have duplicated code that is quite complicated to iterate over rewriteable (`QueryBuilders` mainly) This change introduces a `Rewriteable` interface that allow to share code to do the rewriting as well as encapsulation and composition of queries.	2017-07-19 15:06:49 +02:00
Adrien Grand	f1ff7f2454	Require a field when a `seed` is provided to the `random_score` function. (#25594 ) We currently use fielddata on the `_id` field which is trappy, especially as we do it implicitly. This changes the `random_score` function to use doc ids when no seed is provided and to suggest a field when a seed is provided. For now the change only emits a deprecation warning when no field is supplied but this should be replaced by a strict check on 7.0. Closes #25240	2017-07-19 14:11:15 +02:00
Martijn van Groningen	8003171a0c	Move more token filters to analysis-common module The following token filters were moved: arabic_normalization, german_normalization, hindi_normalization, indic_normalization, persian_normalization, scandinavian_normalization, serbian_normalization, sorani_normalization, cjk_width and cjk_width Relates to #23658	2017-07-17 08:29:44 +02:00
Ryan Ernst	072402463b	Scripting: Remove search template actions (#25717 ) The dedicated search template put/get/delete actions are deprecated in 5.6. This commit removes them from 6.0.	2017-07-14 23:12:05 -07:00
Christoph Büscher	887ed68cf2	Fixing compilation issues and tests after merging in master	2017-07-14 19:23:35 +02:00
Christoph Büscher	6d999f074a	Merge branch 'master' into feature/rank-eval	2017-07-14 18:36:08 +02:00
Jim Ferenczi	13da3eb53e	Refactor QueryStringQuery for 6.0 (#25646 ) This change refactors the query_string query to analyze the query text around logical operators of the query string the same way than a match_query/multi_match_query. It also adds a type parameter that can be used to change the way multi fields query are built the same way than a multi_match query does. Now that these queries share the same behavior regarding text analysis, some parameters are obsolete and have been deprecated: split_on_whitespace: This setting is now ignored with a deprecation notice if it is used explicitely. With this PR The query_string always splits on logical operator. It simplifies the understanding of the other parameters that can have different meanings depending on the value of split_on_whitespace. auto_generate_phrase_queries: This setting is now ignored with a deprecation notice if it is used explicitely. This setting only makes sense when the parser splits on whitespace. use_dismax: This setting is now ignored with a deprecation notice if it is used explicitely. The tie_breaker parameter is sufficient to handle best_fields/most_fields. Fixes #25574	2017-07-13 15:32:17 +02:00
Luca Cavanna	ec66d655b5	Rename client artifacts (#25693 ) It was brought up that our current client artifacts have generic names like 'rest' that may cause conflicts with other artifacts. This commit renames: - rest -> elasticsearch-rest-client - sniffer -> elasticsearch-rest-client-sniffer - rest-high-level -> elasticsearch-rest-high-level-client A couple of small changes are also preparing the high level client for its first release. Closes #20248	2017-07-13 09:44:25 +02:00
Ryan Ernst	70b2897bdf	Scripting: Deprecate stored search template apis (#25437 ) This commit deprecates the PUT, GET and DELETE search template apis. Instead, the stored script api should be used. closes #24596	2017-07-12 16:07:28 -07:00
Simon Willnauer	e81804cfa4	Add a shard filter search phase to pre-filter shards based on query rewriting (#25658 ) Today if we search across a large amount of shards we hit every shard. Yet, it's quite common to search across an index pattern for time based indices but filtering will exclude all results outside a certain time range ie. `now-3d`. While the search can potentially hit hundreds of shards the majority of the shards might yield 0 results since there is not document that is within this date range. Kibana for instance does this regularly but used `_field_stats` to optimize the indexes they need to query. Now with the deprecation of `_field_stats` and it's upcoming removal a single dashboard in kibana can potentially turn into searches hitting hundreds or thousands of shards and that can easily cause search rejections even though the most of the requests are very likely super cheap and only need a query rewriting to early terminate with 0 results. This change adds a pre-filter phase for searches that can, if the number of shards are higher than a the `pre_filter_shard_size` threshold (defaults to 128 shards), fan out to the shards and check if the query can potentially match any documents at all. While false positives are possible, a negative response means that no matches are possible. These requests are not subject to rejection and can greatly reduce the number of shards a request needs to hit. The approach here is preferable to the kibana approach with field stats since it correctly handles aliases and uses the correct threadpools to execute these requests. Further it's completely transparent to the user and improves scalability of elasticsearch in general on large clusters.	2017-07-12 22:19:20 +02:00
Jack Conradson	d2b4f7ac5a	Disallow lang to be used with Stored Scripts (#25610 ) Requests that execute a stored script will no longer be allowed to specify the lang of the script. This information is stored in the cluster state making only an id necessary to execute against. Putting a stored script will still require a lang.	2017-07-12 07:55:57 -07:00
Tim Brooks	a3ade99fcf	Fix BytesReferenceStreamInput#skip with offset (#25634 ) There is a bug when a call to `BytesReferenceStreamInput` skip is made on a `BytesReference` that has an initial offset. The offset for the current slice is added to the current index and then subtracted from the length. This introduces the possibility of a negative number of bytes to skip. This happens inside a loop, which leads to an infinte loop. This commit correctly subtracts the current slice index from the slice.length. Additionally, the `BytesArrayTests` are modified to test instances that include an offset.	2017-07-11 09:54:29 -05:00
Adrien Grand	481d5d09b2	Upgrade to lucene-7.0.0-snapshot-00142c9. (#25641 ) Lucene 7.0 is feature-frozen now, so there should not be many changes until GA.	2017-07-11 13:58:55 +02:00
Tim Brooks	b22bbf94da	Avoid blocking on channel close on network thread (#25521 ) Currently when we close a channel in Netty4Utils.closeChannels we block until the closing is complete. This introduces the possibility that a network selector thread will block while waiting until a separate network selector thread closes a channel. For instance: T1 closes channel 1 (which is assigned to a T1 selector). Channel 1's close listener executes the closing of the node. That means that T1 now tries to close channel 2. However, channel 2 is assigned to a selector that is running on T2. T1 now must wait until T2 closes that channel at some point in the future. This commit addresses this by adding a boolean to closeChannels indicating if we should block on close. We only set this boolean to true if we are closing down the server channels at shutdown. This call is never made from a network thread. When we call the closeChannels method with that boolean set to false, we do not block on close.	2017-07-10 10:50:51 -05:00
Jason Tedor	c084542731	Bump version to 6.0.0-beta1 This commit does two things: - bumps the version from 6.0.0-alpha3 to 6.0.0-beta1 - renames the 6.0.0-alpha3 version constant to 6.0.0-beta1 Relates #25621	2017-07-09 18:12:50 -04:00
Adrien Grand	40bb1663ee	Index ids in binary form. (#25352 ) Indexing ids in binary form should help with indexing speed since we would have to compare fewer bytes upon sorting, should help with memory usage of the live version map since keys will be shorter, and might help with disk usage depending on how efficient the terms dictionary is at compressing terms. Since we can only expect base64 ids in the auto-generated case, this PR tries to use an encoding that makes the binary id equal to the base64-decoded id in the majority of cases (253 out of 256). It also specializes numeric ids, since this seems to be common when content that is stored in Elasticsearch comes from another database that uses eg. auto-increment ids. Another option could be to require base64 ids all the time. It would make things simpler but I'm not sure users would welcome this requirement. This PR should bring some benefits, but I expect it to be mostly useful when coupled with something like #24615. Closes #18154	2017-07-07 14:22:47 +02:00
Martijn van Groningen	6db708ef75	Move more token filters to analysis-common module The following token filters were moved: common grams, limit token, pattern capture and pattern raplace. Relates to #23658	2017-07-07 10:02:52 +02:00
Simon Willnauer	1f67d079b1	Validate `transport.profiles.` settings (#25508 ) Transport profiles unfortunately have never been validated. Yet, it's very easy to make a mistake when configuring profiles which will most likely stay undetected since we don't validate the settings but allow almost everything based on the wildcard in `transport.profiles.`. This change removes the settings subset based parsing of profiles but rather uses concrete affix settings for the profiles which makes it easier to fall back to higher level settings since the fallback settings are present when the profile setting is parsed. Previously, it was unclear in the code which setting is used ie. if the profiles settings (with removed prefixes) or the global node setting. There is no distinction anymore since we don't pull prefix based settings.	2017-07-07 09:40:59 +02:00
Jason Tedor	c96257ca73	Upgrade to Netty 4.1.13.Final This commit upgrades the Netty dependency from version 4.1.11.Final to 4.1.13.Final. Relates #25581	2017-07-06 15:37:00 -04:00
Martijn van Groningen	d0f9f425bd	parent/child: Removed ParentJoinFieldSubFetchPhase	2017-07-06 13:15:02 +02:00
Martijn van Groningen	407273f81d	parent/child: Support parent id being specified as number in the _source	2017-07-06 11:48:57 +02:00
Jun Ohtani	6894ef6057	[Analysis] Support normalizer in request param (#24767 ) * [Analysis] Support normalizer in request param Support normalizer param Support custom normalizer with char_filter/filter param Closes #23347	2017-07-04 19:16:56 +09:00
Colin Goodheart-Smithe	43efcffcc2	Adds check for negative search request size (#25397 ) * Adds check for negative search request size This change adds a check to `SearchSourceBuilder` to throw and exception if the size set on it is set to a negative value. Closes #22530 * fix error in reindex * update re-index tests * Addresses review comment * Fixed tests * Added random negative size test * Fixes test	2017-07-04 10:51:38 +01:00
Christoph Büscher	f576c987ce	Remove QueryParseContext (#25486 ) QueryParseContext is currently only used as a wrapper for an XContentParser, so this change removes it entirely and changes the appropriate APIs that use it so far to only accept a parser instead.	2017-07-03 17:30:40 +02:00
Simon Willnauer	5a7c8bb04e	Cleanup network / transport related settings (#25489 ) This commit makes the use of the global network settings explicit instead of implicit within NetworkService. It cleans up several places where we fall back to the global settings while we should have used tcp or http ones. In addition this change also removes unnecessary settings classes	2017-07-02 10:16:50 +02:00
Simon Willnauer	6f131a63d3	Remove unregistered `transport.netty.*` settings (#25476 ) These settings have not be working for a full major version since they are not registered. Given that they are simply duplicates we can just remove them.	2017-06-29 20:56:18 +02:00
Christoph Büscher	927111c91d	Remove QueryParseContext from parsing QueryBuilders (#25448 ) Currently QueryParseContext is only a thin wrapper around an XContentParser that adds little functionality of its own. I provides helpers for long deprecated field names which can be removed and two helper methods that can be made static and moved to other classes. This is a first step in helping to remove QueryParseContext entirely.	2017-06-29 17:10:20 +02:00

... 4 5 6 7 8 ...

4819 Commits