OpenSearch

Commit Graph

Author	SHA1	Message	Date
Nik Everett	3adefb7b4a	Begin centralizing XContentParser creation into RestRequest (#22041 ) To get #22003 in cleanly we need to centralize as much `XContentParser` creation as possible into `RestRequest`. That'll mean we have to plumb the `NamedXContentRegistry` into fewer places. This removes `RestAction.hasBody`, `RestAction.guessBodyContentType`, and `RestActions.getRestContent`, moving callers over to `RestRequest.hasContentOrSourceParam`, `RestRequest.contentOrSourceParam`, and `RestRequest.contentOrSourceParamParser` and `RestRequest.withContentOrSourceParamParserOrNull`. The idea is to use `withContentOrSourceParamParserOrNull` if you need to handle requests without any sort of body content and to use `contentOrSourceParamParser` otherwise. I believe the vast majority of this PR to be purely mechanical but I know I've made the following behavioral change (I'll add more if I think of more): * If you make a request to an endpoint that requires a request body and has cut over to the new APIs instead of getting `Failed to derive xcontent` you'll get `Body required`. * Template parsing is now non-strict by default. This is important because we need to be able to deprecate things without requests failing.	2016-12-09 20:23:02 -05:00
Lee Hinman	ef64d230e7	Merge remote-tracking branch 'dakrone/index-seq-id-and-primary-term'	2016-12-08 19:47:21 -07:00
Lee Hinman	ee22a477df	Add internal _primary_term doc values field, fix _seq_no indexing This adds the `_primary_term` field internally to the mappings. This field is populated with the current shard's primary term. It is intended to be used for collision resolution when two document copies have the same sequence id, therefore, doc_values for the field are stored but the filed itself is not indexed. This also fixes the `_seq_no` field so that doc_values are retrievable (they were previously stored but irretrievable) and changes the `stats` implementation to more efficiently use the points API to retrieve the min/max instead of iterating on each doc_value value. Additionally, even though we intend to be able to search on the field, it was previously not searchable. This commit makes it searchable. There is no user-visible `_primary_term` field. Instead, the fields are updated by calling: ```java index.parsedDoc().updateSeqID(seqNum, primaryTerm); ``` This includes example methods in `Versions` and `Engine` for retrieving the sequence id values from the index (see `Engine.getSequenceID`) that are only used in unit tests. These will be extended/replaced by actual implementations once we make use of sequence numbers as a conflict resolution measure. Relates to #10708 Supercedes #21480 P.S. As a side effect of this commit, `SlowCompositeReaderWrapper` cannot be used for documents that contain `_seq_no` because it is a Point value and SCRW cannot wrap documents with points, so the tests have been updated to loop through the `LeafReaderContext`s now instead.	2016-12-08 19:47:03 -07:00
Christoph Büscher	7454a9647b	Add fromXContent to HighlightField This adds a fromXContent method and unit test to the HighlightField class so we can parse it as part of a serch response. This is part of the preparation for parsing search responses on the client side.	2016-12-07 16:32:44 +01:00
Luca Cavanna	5b8bdba12e	Remove subrequests method from CompositeIndicesRequest (#21873 )	2016-11-30 15:03:58 +01:00
Adrien Grand	6231009a8f	Remove 2.x backward compatibility of mappings. (#21670 ) For the record, I also had to remove the geo-hash cell and geo-distance range queries to make the code compile. These queries already throw an exception in all cases with 5.x indices, so that does not hurt any more. I also had to rename all 2.x bwc indices from `index-${version}` to `unsupported-${version}` to make `OldIndexBackwardCompatibilityIT` happy.	2016-11-30 13:34:46 +01:00
Nicholas Knize	af1ab68b64	Add RangeFieldMapper for numeric and date range types Lucene 6.2 added index and query support for numeric ranges. This commit adds a new RangeFieldMapper for indexing numeric (int, long, float, double) and date ranges and creating appropriate range and term queries. The design is similar to NumericFieldMapper in that it uses a RangeType enumerator for implementing the logic specific to each type. The following range types are supported by this field mapper: int_range, float_range, long_range, double_range, date_range. Lucene does not provide a DocValue field specific to RangeField types so the RangeFieldMapper implements a CustomRangeDocValuesField for handling doc value support. When executing a Range query over a Range field, the RangeQueryBuilder has been enhanced to accept a new relation parameter for defining the type of query as one of: WITHIN, CONTAINS, INTERSECTS. This provides support for finding all ranges that are related to a specific range in a desired way. As with other spatial queries, DISJOINT can be achieved as a MUST_NOT of an INTERSECTS query.	2016-11-29 10:10:14 -06:00
Ryan Ernst	6940b2b8c7	Remove groovy scripting language (#21607 ) * Scripting: Remove groovy scripting language Groovy was deprecated in 5.0. This change removes it, along with the legacy default language infrastructure in scripting.	2016-11-22 19:24:12 -08:00
Boaz Leskes	2c0338fa87	Merge remote-tracking branch 'upstream/master' into feature/seq_no	2016-11-15 17:09:08 +00:00
Adrien Grand	df4482fdc8	Do not cache the QueryShardContext in PercolatorFieldMapper: it is cheap to create.	2016-11-15 15:45:18 +01:00
Adrien Grand	54809065a6	Make PercolatorFieldMapper get a QueryShardContext lazily.	2016-11-15 12:02:40 +01:00
Boaz Leskes	c9f49039d3	Merge remote-tracking branch 'upstream/master' into feature/seq_no	2016-11-15 10:14:47 +00:00
Ryan Ernst	d14c470b89	Remove generics from ActionRequest closes #21368	2016-11-14 15:32:01 -08:00
Jason Tedor	d3417fb022	Merge branch 'master' into feature/seq_no * master: (516 commits) Avoid angering Log4j in TransportNodesActionTests Add trace logging when aquiring and releasing operation locks for replication requests Fix handler name on message not fully read Remove accidental import. Improve log message in TransportNodesAction Clean up of Script. Update Joda Time to version 2.9.5 (#21468) Remove unused ClusterService dependency from SearchPhaseController (#21421) Remove max_local_storage_nodes from elasticsearch.yml (#21467) Wait for all reindex subtasks before rethrottling Correcting a typo-Maan to Man-in README.textile (#21466) Fix InternalSearchHit#hasSource to return the proper boolean value (#21441) Replace all index date-math examples with the URI encoded form Fix typos (#21456) Adapt ES_JVM_OPTIONS packaging test to ubuntu-1204 Add null check in InternalSearchHit#sourceRef to prevent NPE (#21431) Add VirtualBox version check (#21370) Export ES_JVM_OPTIONS for SysV init Skip reindex rethrottle tests with workers Make forbidden APIs be quieter about classpath warnings (#21443) ...	2016-11-10 23:40:33 -05:00
Jack Conradson	aeb97ff412	Clean up of Script. Closes #21321	2016-11-10 09:59:13 -08:00
Ryan Ernst	7a2c984bcc	Test: Remove multi process support from rest test runner (#21391 ) At one point in the past when moving out the rest tests from core to their own subproject, we had multiple test classes which evenly split up the tests to run. However, we simplified this and went back to a single test runner to have better reproduceability in tests. This change removes the remnants of that multiplexing support.	2016-11-07 15:07:34 -08:00
Adrien Grand	aa6cd93e0f	Require arguments for QueryShardContext creation. (#21196 ) The `IndexService#newQueryShardContext()` method creates a QueryShardContext on shard `0`, with a `null` reader and that uses `System.currentTimeMillis()` to resolve `now`. This may hide bugs, since the shard id is sometimes used for query parsing (it is used to salt random score generation in `function_score`), passing a `null` reader disables query rewriting and for some use-cases, it is simply not ok to rely on the current timestamp (eg. percolation). So this pull request removes this method and instead requires that all call sites provide these parameters explicitly.	2016-11-02 09:48:49 +01:00
Jack Conradson	512a77a633	Refactor ScriptType to be a top-level class.	2016-10-26 10:21:22 -07:00
Jim Ferenczi	c80a563a71	Replace org.elasticsearch.common.lucene.search.MatchNoDocsQuery with its Lucene version (org.apache.lucene.search.MatchNoDocsQuery) (#20832 ) * Replace org.elasticsearch.common.lucene.search.MatchNoDocsQuery with its Lucene version (org.apache.lucene.search.MatchNoDocsQuery) This change removes the ES version of the match no docs query and replaces it with the Lucene version. relates #18030 * Add missing change	2016-10-10 17:45:19 +02:00
Simon Willnauer	9c9afe3f01	Remove SearchContext#current and all it's threadlocals (#20778 ) Today SearchContext expose the current context as a thread local which makes any kind of sane interface design very very hard. This PR removes the thread local entirely and instead passes the relevant context anywhere needed. This simplifies state management dramatically and will allow for a much leaner SearchContext interface down the road.	2016-10-06 19:51:54 +02:00
Simon Willnauer	ce21b607bb	move test to a single node test	2016-10-05 21:55:50 +02:00
Simon Willnauer	838c28eeb4	add percolate with script query test	2016-10-05 20:43:46 +02:00
Simon Willnauer	57afbadf33	PercolateQuery is never cacheable	2016-10-05 16:38:47 +02:00
Colin Goodheart-Smithe	7bffe95025	Fix percolator queries to not be cacheable	2016-10-05 15:03:29 +01:00
Jason Tedor	51d53791fe	Remove lenient URL parameter parsing Today when parsing a request, Elasticsearch silently ignores incorrect (including parameters with typos) or unused parameters. This is bad as it leads to requests having unintended behavior (e.g., if a user hits the _analyze API and misspell the "tokenizer" then Elasticsearch will just use the standard analyzer, completely against intentions). This commit removes lenient URL parameter parsing. The strategy is simple: when a request is handled and a parameter is touched, we mark it as such. Before the request is actually executed, we check to ensure that all parameters have been consumed. If there are remaining parameters yet to be consumed, we fail the request with a list of the unconsumed parameters. An exception has to be made for parameters that format the response (as opposed to controlling the request); for this case, handlers are able to provide a list of parameters that should be excluded from tripping the unconsumed parameters check because those parameters will be used in formatting the response. Additionally, some inconsistencies between the parameters in the code and in the docs are corrected. Relates #20722	2016-10-04 12:45:29 -04:00
Jason Tedor	25fd9e26c4	Merge branch 'master' into feature/seq_no * master: (1199 commits) [DOCS] Remove non-valid link to mapping migration document Revert "Default `include_in_all` for numeric-like types to false" test: add a test with ipv6 address docs: clearify that both ip4 and ip6 addresses are supported Include complex settings in settings requests Add production warning for pre-release builds Clean up confusing error message on unhandled endpoint [TEST] Increase logging level in testDelayShards() change health from string to enum (#20661) Provide error message when plugin id is missing Document that sliced scroll works for reindex Make reindex-from-remote ignore unknown fields Remove NoopGatewayAllocator in favor of a more realistic mock (#20637) Remove Marvel character reference from guide Fix documentation for setting Java I/O temp dir Update client benchmarks to log4j2 Changes the API of GatewayAllocator#applyStartedShards and (#20642) Removes FailedRerouteAllocation and StartedRerouteAllocation IndexRoutingTable.initializeEmpty shouldn't override supplied primary RecoverySource (#20638) Smoke tester: Adjust to latest changes (#20611) ...	2016-09-29 00:22:31 +02:00
Simon Willnauer	fe1803c957	Remove AnalysisService and reduce it to a simple name to analyzer mapping (#20627 ) Today we hold on to all possible tokenizers, tokenfilters etc. when we create an index service on a node. This was mainly done to allow the `_analyze` API to directly access all these primitive. We fixed this in #19827 and can now get rid of the AnalysisService entirely and replace it with a simple map like class. This ensures we don't create a gazillion long living objects that are entirely useless since they are never used in most of the indices. Also those objects might consume a considerable amount of memory since they might load stopwords or synonyms etc. Closes #19828	2016-09-23 08:53:50 +02:00
Jim Ferenczi	1764ec56b3	Fixed naming inconsistency for fields/stored_fields in the APIs (#20166 ) This change replaces the fields parameter with stored_fields when it makes sense. This is dictated by the renaming we made in #18943 for the search API. The following list of endpoint has been changed to use `stored_fields` instead of `fields`: * get * mget * explain The documentation and the rest API spec has been updated to cope with the changes for the following APIs: * delete_by_query * get * mget * explain The `fields` parameter has been deprecated for the following APIs (it is replaced by _source filtering): * update: the fields are extracted from the _source directly. * bulk: the fields parameter is used but fields are extracted from the source directly so it is allowed to have non-stored fields. Some APIs still have the `fields` parameter for various reasons: * cat.fielddata: the fields paramaters relates to the fielddata fields that should be printed. * indices.clear_cache: used to indicate which fielddata fields should be cleared. * indices.get_field_mapping: used to filter fields in the mapping. * indices.stats: get stats on fields (stored or not stored). * termvectors: fields are retrieved from the stored fields if possible and extracted from the _source otherwise. * mtermvectors: * nodes.stats: the fields parameter is used to concatenate completion_fields and fielddata_fields so it's not related to stored_fields at all. Fixes #20155	2016-09-13 20:54:41 +02:00
Lee Hinman	94625d74e4	No longer allow cluster name in data path In 5.x we allowed this with a deprecation warning. This removes the code added for that deprecation, requiring the cluster name to not be in the data path. Resolves #20391	2016-09-12 15:47:01 -06:00
javanna	90ab460fcc	move parsing of search ext sections to the coordinating node	2016-09-09 19:10:42 +02:00
Martijn van Groningen	245882cde3	* Removed `script.default_lang` setting and made `painless` the hardcoded default script language. ** The default script language is now maintained in `Script` class. * Added `script.legacy.default_lang` setting that controls the default language for scripts that are stored inside documents (for example percolator queries). This defaults to groovy. Added `QueryParseContext#getDefaultScriptLanguage()` that manages the default scripting language. Returns always `painless`, unless loading query/search request in legacy mode then the returns what is configured in `script.legacy.default_lang` setting. In the aggregation parsing code added `ParserContext` that also holds the default scripting language like `QueryParseContext`. Most parser don't have access to `QueryParseContext`. This is for scripts in aggregations. * The `lang` script field is always serialized (toXContent). Closes #20122	2016-09-06 18:44:48 +02:00
Jason Tedor	e166459bbe	Merge branch 'master' into log4j2 * master: Increase visibility of deprecation logger Skip transport client plugin installed on JDK 9 Explicitly disable Netty key set replacement percolator: Fail indexing percolator queries containing either a has_child or has_parent query. Make it possible for Ingest Processors to access AnalysisRegistry Allow RestClient to send array-based headers Silence rest util tests until the bogusness can be simplified Remove unknown HttpContext-based test as it fails unpredictably on different JVMs Tests: Improve rest suite names and generated test names for docs tests Add support for a RestClient base path	2016-08-31 10:59:27 -04:00
Martijn van Groningen	3fcb95b814	percolator: Fail indexing percolator queries containing either a has_child or has_parent query. Closes #2960	2016-08-31 07:46:17 +02:00
Jason Tedor	7da0cdec42	Introduce Log4j 2 This commit introduces Log4j 2 to the stack.	2016-08-30 13:31:24 -04:00
Jun Ohtani	450f47d5b5	Validate blank field name add validation and validate only 5.0+ Add tests before 5.0 Closes #19251	2016-08-26 20:10:33 +09:00
Ryan Ernst	743d9fd008	Merge branch 'master' into search_parser	2016-08-16 11:28:59 -07:00
Ryan Ernst	7fde410586	Internal: Consolidate search parser registries Parsing a search request is currently split up among a number of classes, using multiple public static methods, which take multiple regstries of elements that may appear in the search request like query parsers and aggregations. This change begins consolidating all this code by collapsing the registries normally used for parsing search requests into a single SearchRequestParsers class. It is also made available to plugin services to enable templating of search requests. Eventually all of the actual parsing logic should move to the class, and the registries should be hidden, but for now they are at least co-located to reduce the number of objects that must be passed around.	2016-08-16 01:59:24 -07:00
Nik Everett	1452ab4b9f	Squash the rest of o.e.rest.action Squashes all the subpackages of `org.elasticsearch.rest.action` down to the following: * `o.e.rest.action.admin` - Administrative actions * `o.e.rest.action.cat` - Actions that make tables for `grep`ing * `o.e.rest.action.document` - Actions that act on documents * `o.e.rest.action.ingest` - Actions that act on ingest pipelines * `o.e.rest.action.search` - Actions that search I'm tempted to merge `search` into `document` but the `document` package feels fairly complete as is and `Suggest` isn't actually always about documents either.... I'm also tempted to merge `ingest` into `admin.cluster` because the latter contains the actions for dealing with stored scripts. I've moved the `o.e.rest.action.support` into `o.e.rest.action`. I've also added `package-info.java`s to all packges in `o.e.rest`. I figure if the package is too small to deserve a `package-info.java` file then it is too small to deserve to be a package.... Also fixes checkstyle in all moved classes.	2016-08-15 21:06:32 -04:00
Nik Everett	cf6e1a4362	Move all FetchSubPhases to `o.e.search.fetch.subphase` As the most complicated `FetchSubPhase` highlighting gets its own package (`o.e.seach.fetch.subphase.highlight`. No other `FetchSubPhase`s get their own package. Instead they all reside together in `o.e.search.fetch.subphase`. Add package descriptions to `o.e.search.fetch` and subpackages.	2016-08-12 18:21:15 -04:00
Adrien Grand	0d6ac57acf	Collapse o.e.index.mapper packages. #19921 I also reduced the visibility of a couple classes and renamed/consolidated some test classes for consistency, eg. removing the `Simple` prefix or using the `<Type>FieldMapperTests` convention for testing field mappers.	2016-08-10 17:51:11 +02:00
javanna	2c44278ce8	[TEST] use ParseField instead of plain strings in query tests	2016-08-10 12:21:25 +02:00
javanna	0a98b5e56e	[TEST] make AbstractQueryTestCase#testUnknownObjectException more accurate testUnknownObjectException used to generate malformed json objects in some cases, due to the existence of arrays as it was not closing the injected object correctly. That is why the test was catching JsonParseException among the exception that are expected to be thrown. That is fixed by tracking where the new object is placed and placing its end object marker to the right level rather than always at the end. Also introduced a mechanism to explicitly declare objects that won't cause any exception when they get additional objects injected, so that there is no need to override the method anymore as that caused copy pasting of the whole test method. This also makes sure that changes are reflected in tests, as those inner objects are not skipped but we actually check that what is declared is true (no exceptions get thrown when an additional object is added within them.	2016-08-10 11:48:51 +02:00
javanna	2437226802	[TEST] restore tests repeatability in AbstractQueryTestCase Some random operations were conditionally performed in the before test, which made tests not repeatable. For instance take the seed chain to repeat a specific iteration and try to reproduce it, this conditional code would get executed in both cases when trying to isolate the failure, but not among the different iterations (as only the first method/iteration executes it), hence the failure will not reproduce. Moved the random operations to beforeClass and left the non random part in the before method, which is needed as it depends on some method that can be overridden by subclasses.	2016-08-05 22:38:31 +02:00
Nik Everett	9270e8b22b	Rename client yaml test infrastructure This makes it obvious that these tests are for running the client yaml suites. Now that there are other ways of running tests using the REST client against a running cluster we can't go on calling the shared client yaml tests "REST tests". They are rest tests, but they aren't the rest tests.	2016-07-26 13:53:44 -04:00
Nik Everett	a95d4f4ee7	Add Location header and improve REST testing This adds a header that looks like `Location: /test/test/1` to the response for the index/create/update API. The requirement for the header comes from https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html https://tools.ietf.org/html/rfc7231#section-7.1.2 claims that relative URIs are OK. So we use an absolute path which should resolve to the appropriate location. Closes #19079 This makes large changes to our rest test infrastructure, allowing us to write junit tests that test a running cluster via the rest client. It does this by splitting ESRestTestCase into two classes: * ESRestTestCase is the superclass of all tests that use the rest client to interact with a running cluster. * ESClientYamlSuiteTestCase is the superclass of all tests that use the rest client to run the yaml tests. These tests are shared across all official clients, thus the `ClientYamlSuite` part of the name.	2016-07-25 17:02:40 -04:00
Nik Everett	3a82c613e4	Migrate query registration from push to pull Remove `ParseField` constants used for names where there are no deprecated names and just use the `String` version of the registration method instead. This is step 2 in cleaning up the plugin interface for extending search time actions. Aggregations are next. This is breaking for plugins because those that register a new query should now implement `SearchPlugin` rather than `onModule(SearchModule)`.	2016-07-20 12:33:51 -04:00
Martijn van Groningen	e0ebf5da1c	Template cleanup: * Removed `Template` class and unified script & template parsing logic. Templates are scripts, so they should be defined as a script. Unless there will be separate template infrastructure, templates should share as much code as possible with scripts. * Removed ScriptParseException in favour for ElasticsearchParseException * Moved TemplateQueryBuilder to lang-mustache module because this query is hard coded to work with mustache only	2016-07-18 10:16:01 +02:00
Ryan Ernst	c36850f114	Add client flag for percolator module build	2016-07-14 02:41:58 -07:00
Nik Everett	8263873783	Switch search extension from push to pull Switches most search behavior extensions from push (`onModule(SearchModule)`) to pull (`implements SearchPlugin`). This effort in general gives plugin authors a much cleaner view of how to extend Elasticsearch and starts to set up portions of Elasticsearch as "the plugin API". This commit in particular does that for search-time behavior like customized suggesters, highlighters, score functions, and significance heuristics. It also switches most such customization to being done at search module construction time which is much, much easier to reason about from a testing perspective. It also helps significantly in the process of de-guice-ing Elasticsearch's startup. There are at least two major search time extensions that aren't covered in this commit that will simply have to wait for the next commit on the topic because this one has already grown large: custom aggregations and custom queries. These will likely live in the same SearchPlugin interface as well.	2016-07-11 18:49:05 -04:00
Martijn van Groningen	ff5527f037	percolator: Forbid the usage or `range` queries with a range based on the current time If there are percolator queries containing `range` queries with ranges based on the current time then this can lead to incorrect results if the `percolate` query gets cached. These ranges are changing each time the `percolate` query gets executed and if this query gets cached then the results will be based on how the range was at the time when the `percolate` query got cached. The ExtractQueryTermsService has been renamed `QueryAnalyzer` and now only deals with analyzing the query (extracting terms and deciding if the entire query is a verified match) . The `PercolatorFieldMapper` is responsible for adding the right fields based on the analysis the `QueryAnalyzer` has performed, because this is highly dependent on the field mappings. Also the `PercolatorFieldMapper` is responsible for creating the percolate query.	2016-07-08 14:20:56 +02:00

1 2

93 Commits