Commit Graph

68 Commits

Author SHA1 Message Date
Jim Ferenczi c80a563a71 Replace org.elasticsearch.common.lucene.search.MatchNoDocsQuery with its Lucene version (org.apache.lucene.search.MatchNoDocsQuery) (#20832)
* Replace org.elasticsearch.common.lucene.search.MatchNoDocsQuery with its Lucene version (org.apache.lucene.search.MatchNoDocsQuery)

This change removes the ES version of the match no docs query and replaces it with the Lucene version.

relates #18030

* Add missing change
2016-10-10 17:45:19 +02:00
Simon Willnauer 9c9afe3f01 Remove SearchContext#current and all it's threadlocals (#20778)
Today SearchContext expose the current context as a thread local which makes any kind of sane interface design very very hard. This PR removes the thread local entirely and instead passes the relevant context anywhere needed. This simplifies state management dramatically and will allow for a much leaner SearchContext interface down the road.
2016-10-06 19:51:54 +02:00
Simon Willnauer ce21b607bb move test to a single node test 2016-10-05 21:55:50 +02:00
Simon Willnauer 838c28eeb4 add percolate with script query test 2016-10-05 20:43:46 +02:00
Simon Willnauer 57afbadf33 PercolateQuery is never cacheable 2016-10-05 16:38:47 +02:00
Colin Goodheart-Smithe 7bffe95025 Fix percolator queries to not be cacheable 2016-10-05 15:03:29 +01:00
Jason Tedor 51d53791fe Remove lenient URL parameter parsing
Today when parsing a request, Elasticsearch silently ignores incorrect
(including parameters with typos) or unused parameters. This is bad as
it leads to requests having unintended behavior (e.g., if a user hits
the _analyze API and misspell the "tokenizer" then Elasticsearch will
just use the standard analyzer, completely against intentions).

This commit removes lenient URL parameter parsing. The strategy is
simple: when a request is handled and a parameter is touched, we mark it
as such. Before the request is actually executed, we check to ensure
that all parameters have been consumed. If there are remaining
parameters yet to be consumed, we fail the request with a list of the
unconsumed parameters. An exception has to be made for parameters that
format the response (as opposed to controlling the request); for this
case, handlers are able to provide a list of parameters that should be
excluded from tripping the unconsumed parameters check because those
parameters will be used in formatting the response.

Additionally, some inconsistencies between the parameters in the code
and in the docs are corrected.

Relates #20722
2016-10-04 12:45:29 -04:00
Simon Willnauer fe1803c957 Remove AnalysisService and reduce it to a simple name to analyzer mapping (#20627)
Today we hold on to all possible tokenizers, tokenfilters etc. when we create
an index service on a node. This was mainly done to allow the `_analyze` API to
directly access all these primitive. We fixed this in #19827 and can now get rid of
the AnalysisService entirely and replace it with a simple map like class. This
ensures we don't create a gazillion long living objects that are entirely useless since
they are never used in most of the indices. Also those objects might consume a considerable
amount of memory since they might load stopwords or synonyms etc.

Closes #19828
2016-09-23 08:53:50 +02:00
Jim Ferenczi 1764ec56b3 Fixed naming inconsistency for fields/stored_fields in the APIs (#20166)
This change replaces the fields parameter with stored_fields when it makes sense.
This is dictated by the renaming we made in #18943 for the search API.

The following list of endpoint has been changed to use `stored_fields` instead of `fields`:
* get
* mget
* explain

The documentation and the rest API spec has been updated to cope with the changes for the following APIs:
* delete_by_query
* get
* mget
* explain

The `fields` parameter has been deprecated for the following APIs (it is replaced by _source filtering):
* update: the fields are extracted from the _source directly.
* bulk: the fields parameter is used but fields are extracted from the source directly so it is allowed to have non-stored fields.

Some APIs still have the `fields` parameter for various reasons:
* cat.fielddata: the fields paramaters relates to the fielddata fields that should be printed.
* indices.clear_cache: used to indicate which fielddata fields should be cleared.
* indices.get_field_mapping: used to filter fields in the mapping.
* indices.stats: get stats on fields (stored or not stored).
* termvectors: fields are retrieved from the stored fields if possible and extracted from the _source otherwise.
* mtermvectors:
* nodes.stats: the fields parameter is used to concatenate completion_fields and fielddata_fields so it's not related to stored_fields at all.

Fixes #20155
2016-09-13 20:54:41 +02:00
Lee Hinman 94625d74e4 No longer allow cluster name in data path
In 5.x we allowed this with a deprecation warning. This removes the code
added for that deprecation, requiring the cluster name to not be in the
data path.

Resolves #20391
2016-09-12 15:47:01 -06:00
javanna 90ab460fcc move parsing of search ext sections to the coordinating node 2016-09-09 19:10:42 +02:00
Martijn van Groningen 245882cde3 * Removed `script.default_lang` setting and made `painless` the hardcoded default script language.
** The default script language is now maintained in `Script` class.
* Added `script.legacy.default_lang` setting that controls the default language for scripts that are stored inside documents (for example percolator queries).  This defaults to groovy.
** Added `QueryParseContext#getDefaultScriptLanguage()` that manages the default scripting language. Returns always `painless`, unless loading query/search request in legacy mode then the returns what is configured in `script.legacy.default_lang` setting.
** In the aggregation parsing code added `ParserContext` that also holds the default scripting language like `QueryParseContext`. Most parser don't have access to `QueryParseContext`. This is for scripts in aggregations.
* The `lang` script field is always serialized (toXContent).

Closes #20122
2016-09-06 18:44:48 +02:00
Jason Tedor e166459bbe Merge branch 'master' into log4j2
* master:
  Increase visibility of deprecation logger
  Skip transport client plugin installed on JDK 9
  Explicitly disable Netty key set replacement
  percolator: Fail indexing percolator queries containing either a has_child or has_parent query.
  Make it possible for Ingest Processors to access AnalysisRegistry
  Allow RestClient to send array-based headers
  Silence rest util tests until the bogusness can be simplified
  Remove unknown HttpContext-based test as it fails unpredictably on different JVMs
  Tests: Improve rest suite names and generated test names for docs tests
  Add support for a RestClient base path
2016-08-31 10:59:27 -04:00
Martijn van Groningen 3fcb95b814 percolator: Fail indexing percolator queries containing either a has_child or has_parent query.
Closes #2960
2016-08-31 07:46:17 +02:00
Jason Tedor 7da0cdec42 Introduce Log4j 2
This commit introduces Log4j 2 to the stack.
2016-08-30 13:31:24 -04:00
Jun Ohtani 450f47d5b5 Validate blank field name
add validation and validate only 5.0+
Add tests before 5.0

Closes #19251
2016-08-26 20:10:33 +09:00
Ryan Ernst 743d9fd008 Merge branch 'master' into search_parser 2016-08-16 11:28:59 -07:00
Ryan Ernst 7fde410586 Internal: Consolidate search parser registries
Parsing a search request is currently split up among a number of
classes, using multiple public static methods, which take multiple
regstries of elements that may appear in the search request like query
parsers and aggregations. This change begins consolidating all this code
by collapsing the registries normally used for parsing search requests
into a single SearchRequestParsers class. It is also made available to
plugin services to enable templating of search requests.  Eventually all
of the actual parsing logic should move to the class, and the registries
should be hidden, but for now they are at least co-located to reduce the
number of objects that must be passed around.
2016-08-16 01:59:24 -07:00
Nik Everett 1452ab4b9f Squash the rest of o.e.rest.action
Squashes all the subpackages of `org.elasticsearch.rest.action` down to
the following:
* `o.e.rest.action.admin` - Administrative actions
* `o.e.rest.action.cat` - Actions that make tables for `grep`ing
* `o.e.rest.action.document` - Actions that act on documents
* `o.e.rest.action.ingest` - Actions that act on ingest pipelines
* `o.e.rest.action.search` - Actions that search

I'm tempted to merge `search` into `document` but the `document`
package feels fairly complete as is and `Suggest` isn't actually always
about documents either....

I'm also tempted to merge `ingest` into `admin.cluster` because the
latter contains the actions for dealing with stored scripts.

I've moved the `o.e.rest.action.support` into `o.e.rest.action`.

I've also added `package-info.java`s to all packges in `o.e.rest`. I
figure if the package is too small to deserve a `package-info.java` file
then it is too small to deserve to be a package....

Also fixes checkstyle in all moved classes.
2016-08-15 21:06:32 -04:00
Nik Everett cf6e1a4362 Move all FetchSubPhases to `o.e.search.fetch.subphase`
As the most complicated `FetchSubPhase` highlighting gets its own package
(`o.e.seach.fetch.subphase.highlight`. No other `FetchSubPhase`s get their
own package. Instead they all reside together in `o.e.search.fetch.subphase`.

Add package descriptions to `o.e.search.fetch` and subpackages.
2016-08-12 18:21:15 -04:00
Adrien Grand 0d6ac57acf Collapse o.e.index.mapper packages. #19921
I also reduced the visibility of a couple classes and renamed/consolidated some
test classes for consistency, eg. removing the `Simple` prefix or using the
`<Type>FieldMapperTests` convention for testing field mappers.
2016-08-10 17:51:11 +02:00
javanna 2c44278ce8 [TEST] use ParseField instead of plain strings in query tests 2016-08-10 12:21:25 +02:00
javanna 0a98b5e56e [TEST] make AbstractQueryTestCase#testUnknownObjectException more accurate
testUnknownObjectException used to generate malformed json objects in some cases, due to the existence of arrays as it was not closing the injected object correctly. That is why the test was catching JsonParseException among the exception that are expected to be thrown. That is fixed by tracking where the new object is placed and placing its end object marker to the right level rather than always at the end.

Also introduced a mechanism to explicitly declare objects that won't cause any exception when they get additional objects injected, so that there is no need to override the method anymore as that caused copy pasting of the whole test method. This also makes sure that changes are reflected in tests, as those inner objects are not skipped but we actually check that what is declared is true (no exceptions get thrown when an additional object is added within them.
2016-08-10 11:48:51 +02:00
javanna 2437226802 [TEST] restore tests repeatability in AbstractQueryTestCase
Some random operations were conditionally performed in the before test, which made tests not repeatable. For instance take the seed chain to repeat a specific iteration and try to reproduce it, this conditional code would get executed in both cases when trying to isolate the failure, but not among the different iterations (as only the first method/iteration executes it), hence the failure will not reproduce.

Moved the random operations to beforeClass and left the non random part in the before method, which is needed as it depends on some method that can be overridden by subclasses.
2016-08-05 22:38:31 +02:00
Nik Everett 9270e8b22b Rename client yaml test infrastructure
This makes it obvious that these tests are for running the client yaml
suites. Now that there are other ways of running tests using the REST
client against a running cluster we can't go on calling the shared
client yaml tests "REST tests". They are rest tests, but they aren't
**the** rest tests.
2016-07-26 13:53:44 -04:00
Nik Everett a95d4f4ee7 Add Location header and improve REST testing
This adds a header that looks like `Location: /test/test/1` to the
response for the index/create/update API. The requirement for the header
comes from https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html

https://tools.ietf.org/html/rfc7231#section-7.1.2 claims that relative
URIs are OK. So we use an absolute path which should resolve to the
appropriate location.

Closes #19079

This makes large changes to our rest test infrastructure, allowing us
to write junit tests that test a running cluster via the rest client.
It does this by splitting ESRestTestCase into two classes:
* ESRestTestCase is the superclass of all tests that use the rest client
to interact with a running cluster.
* ESClientYamlSuiteTestCase is the superclass of all tests that use the
rest client to run the yaml tests. These tests are shared across all
official clients, thus the `ClientYamlSuite` part of the name.
2016-07-25 17:02:40 -04:00
Nik Everett 3a82c613e4 Migrate query registration from push to pull
Remove `ParseField` constants used for names where there are no deprecated
names and just use the `String` version of the registration method instead.

This is step 2 in cleaning up the plugin interface for extending
search time actions. Aggregations are next.

This is breaking for plugins because those that register a new query should
now implement `SearchPlugin` rather than `onModule(SearchModule)`.
2016-07-20 12:33:51 -04:00
Martijn van Groningen e0ebf5da1c Template cleanup:
* Removed `Template` class and unified script & template parsing logic. Templates are scripts, so they should be defined as a script. Unless there will be separate template infrastructure, templates should share as much code as possible with scripts.
* Removed ScriptParseException in favour for ElasticsearchParseException
* Moved TemplateQueryBuilder to lang-mustache module because this query is hard coded to work with mustache only
2016-07-18 10:16:01 +02:00
Ryan Ernst c36850f114 Add client flag for percolator module build 2016-07-14 02:41:58 -07:00
Nik Everett 8263873783 Switch search extension from push to pull
Switches most search behavior extensions from push (`onModule(SearchModule)`)
to pull (`implements SearchPlugin`). This effort in general gives plugin
authors a much cleaner view of how to extend Elasticsearch and starts to
set up portions of Elasticsearch as "the plugin API". This commit in
particular does that for search-time behavior like customized suggesters,
highlighters, score functions, and significance heuristics.

It also switches most such customization to being done at search module
construction time which is much, much easier to reason about from a testing
perspective. It also helps significantly in the process of de-guice-ing
Elasticsearch's startup.

There are at least two major search time extensions that aren't covered in
this commit that will simply have to wait for the next commit on the topic
because this one has already grown large: custom aggregations and custom
queries. These will likely live in the same SearchPlugin interface as well.
2016-07-11 18:49:05 -04:00
Martijn van Groningen ff5527f037 percolator: Forbid the usage or `range` queries with a range based on the current time
If there are percolator queries containing `range` queries with ranges based on the current time then this can lead to incorrect results if the `percolate` query gets cached.  These ranges are changing each time the `percolate` query gets executed and if this query gets cached then the results will be based on how the range was at the time when the `percolate` query got cached.

The ExtractQueryTermsService has been renamed `QueryAnalyzer` and now only deals with analyzing the query (extracting terms and deciding if the entire query is a verified match) . The `PercolatorFieldMapper` is responsible for adding the right fields based on the analysis the `QueryAnalyzer` has performed, because this is highly dependent on the field mappings. Also the `PercolatorFieldMapper` is responsible for creating the percolate query.
2016-07-08 14:20:56 +02:00
Martijn van Groningen 7b8ae54f0f percolator: Also support query term extract for queries wrapped inside a FunctionScoreQuery
Additionally for highlighting percolator hits, also extract percolator query from FunctionScoreQuery and DisjunctionMaxQuery
2016-07-08 10:51:48 +02:00
Jason Tedor 96f283c195 Rename writeThrowable to writeException
This commit renames writeThrowable to writeException. The situation here
stems from the fact that the StreamOutput method for serializing
Exceptions needs to accept Throwables too as Throwables can be the cause
of serialized Exceptions. Yet, we do not serialize Throwables in the
Error sub-hierarchy in a way that they can be deserialized into their
initial type. This leads to an asymmetry in the StreamOutput method for
serializing Exceptions and the StreamInput method for writing
Excpetions. Namely, the former will accept Throwables but the latter
will only return Exceptions. A goal with the stream methods has always
been symmetry in the method names so that serialization/deserialization
routines appear symmetrical in code. It is this asymmetry on the
input/output types for Exceptions on StreamOutput/StreamInput that
clashes with the desired symmetry of naming. Despite this, we should
favor symmetry in the naming of the methods. This commit renames
StreamOutput#writeThrowable to StreamOutput#writeException which leaves
us with Exception StreamInput#readException and void
StreamOutput#writeException(Throwable).
2016-07-05 14:37:01 -04:00
Jason Tedor 3343ceeae4 Do not catch throwable
Today throughout the codebase, catch throwable is used with reckless
abandon. This is dangerous because the throwable could be a fatal
virtual machine error resulting from an internal error in the JVM, or an
out of memory error or a stack overflow error that leaves the virtual
machine in an unstable and unpredictable state. This commit removes
catch throwable from the codebase and removes the temptation to use it
by modifying listener APIs to receive instances of Exception instead of
the top-level Throwable.

Relates #19231
2016-07-04 08:41:06 -04:00
Tanguy Leroux 8c40b2b54e Fix order of modifiers 2016-07-01 16:57:14 +02:00
Simon Willnauer 5c8164a561 Clean up BytesReference (#19196)
BytesReference should be a really simple interface, yet it has a gazillion
ways to achieve the same this. Methods like `#hasArray`, `#toBytesArray`, `#copyBytesArray`
`#toBytesRef` `#bytes` are all really duplicates. This change simplifies the interface
dramatically and makes implementations of it much simpler. All array access has been removed
and is streamlined through a single `#toBytesRef` method. Utility methods to materialize a
compact byte array has been added too for convenience.
2016-07-01 16:09:31 +02:00
Ryan Ernst c762e7aa15 Merge branch 'master' into rest_handler_client 2016-06-30 08:16:25 -07:00
Nik Everett e359be7632 Don't inject TransportPercolateAction into RestPercolateAction
Instead use the client. This will help us build the actions more
easily in the future.
2016-06-30 09:36:31 -04:00
Martijn van Groningen 4c2d6cf538 percolator: removed unused code 2016-06-30 14:43:28 +02:00
Ryan Ernst 865b951b7d Internal: Changed rest handler interface to take NodeClient
Previously all rest handlers would take Client in their injected ctor.
However, it was only to hold the client around for runtime. Instead,
this can be done just once in the HttpService which handles rest
requests, and passed along through the handleRequest method. It also
should always be a NodeClient, and other types of Clients (eg a
TransportClient) would not work anyways (and some handlers can be
simplified in follow ups like reindex by taking NodeClient).
2016-06-29 18:02:18 -07:00
Nik Everett 8db43c0107 Move RestHandler registration to ActionModule and ActionPlugin
`RestHandler`s are highly tied to actions so registering them in the
same place makes sense.

Removes the need to for plugins to check if they are in transport client
mode before registering a RestHandler - `getRestHandlers` isn't called
at all in transport client mode.

This caused guice to throw a massive fit about the circular dependency
between NodeClient and the allocation deciders. I broke the circular
dependency by registering the actions map with the node client after
instantiation.
2016-06-29 18:31:44 -04:00
Martijn van Groningen b97ea9954c percolator: Use RamDirectory for percolating nested document instead of using multiple MemoryIndex instances with SlowCompositeReaderWrapper workaround 2016-06-29 08:50:01 +02:00
Nik Everett fa4844c3f4 Pull actions from plugins
Instead of implementing onModule(ActionModule) to register actions,
this has plugins implement ActionPlugin to declare actions. This is
yet another step in cleaning up the plugin infrastructure.

While I was in there I switched AutoCreateIndex and DestructiveOperations
to be eagerly constructed which makes them easier to use when
de-guice-ing the code base.
2016-06-28 08:36:24 -04:00
Ryan Ernst 33ccc5aead Merge branch 'master' into mapper_plugin_api 2016-06-27 11:19:59 -07:00
Martijn van Groningen d3cd58eb2f Merges PR #18957
This commit fixes several NPEs caused by implicitly performing a get request for a document that exists with its _source disabled and then trying to access the source. Instead of causing an NPE the following queries will throw an exception with a "source disabled" message (similar behavior as if the document does not exist).:
- GeoShape query for pre-indexed shape (throws IllegalArgumentException)
- Percolate query for an existing document (throws IllegalArgumentException)

A Terms query with a lookup will ignore the document if the source does not exist (same as if the document does not exist).

GET and HEAD requests for the document _source will return a 404 if the source is disabled (even if the document exists).
2016-06-27 09:37:28 +02:00
Martijn van Groningen 9a0ce62550 percolator: Add support for the synonym query. 2016-06-27 07:42:44 +02:00
Alex Benusovich 3ca909dfea Fix NPEs due to disabled source
This commit fixes several NPEs caused by implicitly performing a get request for a document that exists with its _source disabled and then trying to access the source. Instead of causing an NPE the following queries will throw an exception with a "source disabled" message (similar behavior as if the document does not exist).:
- GeoShape query for pre-indexed shape (throws IllegalArgumentException)
- Percolate query for an existing document (throws IllegalArgumentException)

A Terms query with a lookup will ignore the document if the source does not exist (same as if the document does not exist).

GET and HEAD requests for the document _source will return a 404 if the source is disabled (even if the document exists).
2016-06-24 22:03:03 -07:00
Ryan Ernst 6995bde710 Merge branch 'master' into mapper_plugin_api 2016-06-24 11:15:06 -07:00
Martijn van Groningen 599a548998 percolator: Don't verify candidate matches with MemoryIndex that are verified matches
If we don't care about scoring then for certain candidate matches we can be certain, that if they are a candidate match,
then they will always match. So verifying these queries with the MemoryIndex can be skipped.
2016-06-24 15:46:55 +02:00
Ryan Ernst e817b5daa3 Plugins: Remove guice from Mapper plugins
This changes adds a MapperPlugin interface which allows pull style
retrieval of mappers and metadata mappers added by plugins. For now, I
have kept the MapperRegistry, but this should be removed in the future
as it is just a silly container for 2 maps which could themselves be
passed around.
2016-06-21 22:50:39 -07:00