OpenSearch

Commit Graph

Author	SHA1	Message	Date
David Turner	9766b858d0	Prepare for bump to 6.0.1 on the master branch (#27391 ) An assortment of fixes, particularly to version number calculations, in preparation for the bump to 6.0.1.	2017-11-16 18:38:54 +00:00
Tim Brooks	80ef9bbdb1	Remove parameterization from TcpTransport (#27407 ) This commit is a follow up to the work completed in #27132. Essentially it transitions two more methods (sendMessage and getLocalAddress) from Transport to TcpChannel. With this change, there is no longer a need for TcpTransport to be aware of the specific type of channel a transport returns. So that class is no longer parameterized by channel type.	2017-11-16 11:19:36 -07:00
kel	6b817489f3	Fix default value of ignore_unavailable for snapshot REST API (#27056 ) The default value for ignore_unavailable did not match what was documented when using the REST APIs for snapshot creation and restore. This commit sets the default value of ignore_unavailable to false, the way it is documented and ensures it's the same when using either REST API or transport client. Closes #25359	2017-11-16 16:03:09 +01:00
Jim Ferenczi	623367d793	Add composite aggregator (#26800 ) * This change adds a module called `aggs-composite` that defines a new aggregation named `composite`. The `composite` aggregation is a multi-bucket aggregation that creates composite buckets made of multiple sources. The sources for each bucket can be defined as: * A `terms` source, values are extracted from a field or a script. * A `date_histogram` source, values are extracted from a date field and rounded to the provided interval. This aggregation can be used to retrieve all buckets of a deeply nested aggregation by flattening the nested aggregation in composite buckets. A composite bucket is composed of one value per source and is built for each document as the combinations of values in the provided sources. For instance the following aggregation: ```` "test_agg": { "terms": { "field": "field1" }, "aggs": { "nested_test_agg": { "terms": { "field": "field2" } } } } ```` ... which retrieves the top N terms for `field1` and for each top term in `field1` the top N terms for `field2`, can be replaced by a `composite` aggregation in order to retrieve all the combinations of `field1`, `field2` in the matching documents: ```` "composite_agg": { "composite": { "sources": [ { "field1": { "terms": { "field": "field1" } } }, { "field2": { "terms": { "field": "field2" } } } ] } } ```` The response of the aggregation looks like this: ```` "aggregations": { "composite_agg": { "buckets": [ { "key": { "field1": "alabama", "field2": "almanach" }, "doc_count": 100 }, { "key": { "field1": "alabama", "field2": "calendar" }, "doc_count": 1 }, { "key": { "field1": "arizona", "field2": "calendar" }, "doc_count": 1 } ] } } ```` By default this aggregation returns 10 buckets sorted in ascending order of the composite key. Pagination can be achieved by providing `after` values, the values of the composite key to aggregate after.
For instance the following aggregation will aggregate all composite keys that sort after `alabama, calendar`: ```` "composite_agg": { "composite": { "after": {"field1": "alabama", "field2": "calendar"}, "size": 100, "sources": [ { "field1": { "terms": { "field": "field1" } } }, { "field2": { "terms": { "field": "field2" } } } ] } } ```` This aggregation is optimized for indices that set an index sorting that matches the composite source definition. For instance the aggregation above could run faster on indices that define an index sorting like this: ```` "settings": { "index.sort.field": ["field1", "field2"] } ```` In this case the `composite` aggregation can early terminate on each segment. This aggregation also accepts multi-valued fields but disables early termination for these fields even if index sorting matches the sources definition. This is mandatory because index sorting picks only one value per document to perform the sort.	2017-11-16 15:13:36 +01:00
Simon Willnauer	303e0c0e86	Fix `ShardSplittingQuery` to respect nested documents. (#27398 ) Today if nested docs are used in an index that is split the operation will only work correctly if the index is not routing partitioned or unless routing is used. This change fixes the query that selects the docs to delete to also select all parents nested docs. Closes #27378	2017-11-16 11:35:42 +01:00
Tim Brooks	ca11085bb6	Add TcpChannel to unify Transport implementations (#27132 ) Right now our different transport implementations must duplicate functionality in order to stay compliant with the requirements of TcpTransport. They must all implement common logic to open channels, close channels, keep track of channels for eventual shutdown, etc. Additionally, there is a weird and complicated relationship between Transport and TransportService. We eventually want to start merging some of the functionality between these classes. This commit starts moving towards a world where TransportService retains all the application logic and channel state. Transport implementations in this world will only be tasked with returning a channel when one is requested, calling transport service when a channel is accepted from a server, and starting / stopping itself. Specifically this commit changes how channels are opened and closed. All Transport implementations now return a channel type that must comply with the new TcpChannel interface. This interface has the methods necessary for TcpTransport to completely manage the lifecycle of a channel. This includes setting the channel up, waiting for connection, adding close listeners, and eventually closing.	2017-11-15 12:38:39 -07:00
Tim Brooks	a8f916911a	Remove implementations of `TransportChannel` (#27388 ) Right now we have unnecessary implementations of `TransportChannel`. Additionally, there are methods on the interface that are not used. This commit removes unnecessary implementations and methods.	2017-11-15 09:48:07 -07:00
olcbean	5ce407e26f	wildcard query on _index (#27334 )	2017-11-14 10:22:21 -07:00
Jason Tedor	be399965e3	Revert "Reduce synchronization on field data cache" This reverts commit `2e863572f4`. Relates #27365	2017-11-14 05:57:51 -05:00
tinder-xli	2e863572f4	Reduce synchronization on field data cache The field data cache can come under heavy contention in cases when lots of search threads are hitting it for doc values. This commit reduces the amount of contention here by using a double-checked locking strategy to only lock when the cache needs to be initialized. Relates #27365	2017-11-13 23:35:46 -05:00
Yannick Welsch	6d30fd5ac0	Properly format IndexGraveyard deletion date as date (#27362 ) The toXContent method for IndexGraveYard (which is a collection of tombstones for explicitly marking indices as deleted in the cluster state) confused timeValue with dateField, resulting in output of the form "delete_date" : "23424.3d" instead of "delete_date":"2017-11-13T15:50:51.614Z".	2017-11-13 18:05:58 +01:00
Yannick Welsch	c83f112b1a	Stop responding to ping requests before master abdication (#27329 ) When the current master node is shutting down, it sends a leave request to the other nodes so that they can eagerly start a fresh master election. Unfortunately, it was still possible for the master node that was shutting down to respond to ping requests, possibly influencing the election decision as it still appeared as an active master in the ping responses. This commit ensures that UnicastZenPing does not respond to ping requests once it's been closed. ZenDiscovery.doStop() continues to ensure that the pinging component is first closed before it triggers a master election. Closes #27328	2017-11-13 15:18:59 +01:00
Simon Willnauer	2299c70371	Allow affix settings to specify dependencies (#27161 ) We use affix settings to group settings / values under a certain namespace. In some cases, such as login information, a setting is only valid if one or more other settings are present. For instance `x.test.user` is only valid if there is an `x.test.passwd` present and vice versa. This change allows specifying such a dependency to prevent settings updates that leave settings in an inconsistent state.	2017-11-13 12:06:36 +01:00
tinder-xli	1e99195743	Remove unnecessary logger creation for doc values field data This commit removes an unnecessary logger instance creation from the constructor for doc values field data. This construction is expensive for this oft-created class because of a synchronized block in the constructor for the logger. Relates #27349	2017-11-10 22:28:58 -05:00
Nicholas Knize	8904fc8210	[Geo] Decouple geojson parse logic from ShapeBuilders This is the first step to supporting WKT (and other future) format(s). The ShapeBuilders are quite messy and can be simplified by decoupling the parse logic from the build logic. This commit refactors the parsing logic into its own package separate from the Shape builders. It also decouples the GeoShapeType into a standalone enumerator that is responsible for validating the parsed data and providing the appropriate builder. This future-proofs the code making it easier to maintain and add new shape types.	2017-11-10 14:37:58 -06:00
Ryan Ernst	8b9e23de93	Plugins: Add versionless alias to all security policy codebase properties (#26756 ) This is a followup to #26521. This commit expands the alias added for the elasticsearch client codebase to all codebases. The original full jar name property is left intact. This only adds an alias without the version, which should help ease the pain in updating any versions (ES itself or dependencies).	2017-11-10 11:00:09 -08:00
Jim Ferenczi	bec5d43228	[Test] #27342 Fix SearchRequests#testValidate	2017-11-10 18:51:58 +01:00
Jim Ferenczi	29331f1127	Fail queries with scroll that explicitly set request_cache (#27342 ) Queries that create a scroll context cannot use the cache. They modify the search context during their execution so using the cache can lead to duplicate results for the next scroll query. This change fails the entire request if the request_cache option is explicitly set on a query that creates a scroll context (`scroll=1m`) and makes sure internally that we never use the cache for these queries when the option is not explicitly used. For 6.x a deprecation log will be printed instead of failing the entire request and the request_cache hint will be ignored (forced to false).	2017-11-10 16:02:06 +01:00
Christoph Büscher	4fa33e7111	[Tests] Relax allowed delta in extended_stats aggregation (#27171 ) The order in which double values are added in java can give different results for the sum, so we need to allow a certain delta in the test assertions. The current value was still a bit too low, which manifested itself in occasional test failures.	2017-11-10 14:37:26 +01:00
Nicholas Knize	06ff92d237	Add ignore_malformed to geo_shape fields This commit adds ignore_malformed support to geo_shape field types to skip malformed geoJson fields. closes #23747	2017-11-09 17:59:05 -06:00
Dimitris Athanasiou	66bef26495	Aggregations: bucket_sort pipeline aggregation (#27152 ) This commit adds a parent pipeline aggregation that allows sorting the buckets of a parent multi-bucket aggregation. The aggregation also offers [from] and [size] parameters in order to truncate the result as desired. Closes #14928	2017-11-09 17:59:57 +00:00
David Turner	1c6f5ce9cb	Improve error message for parse failures of completion fields (#27297 ) Fix spacing/grammar/punctuation, and include the field name and location in the source document.	2017-11-09 10:45:44 +00:00
Simon Willnauer	a34c2f0b8d	Ensure external refreshes will also refresh internal searcher to minimize segment creation (#27253 ) We cut over to internal and external IndexReader/IndexSearcher in #26972 which uses two independent searcher managers. This has the downside that refreshes of the external reader will never clear the internal version map which in turn will trigger additional and potentially unnecessary segment flushes since memory must be freed. Under heavy indexing load with low refresh intervals this can cause excessive segment creation which causes high GC activity and significantly increases the required segment merges. This change adds a dedicated external reference manager that delegates refreshes to the internal reference manager that then `steals` the refreshed reader from the internal reference manager for external usage. This ensures that external and internal readers are consistent on an external refresh. As a side effect this also releases old segments referenced by the internal reference manager which can potentially hold on to already merged away segments until it is refreshed due to a flush or indexing activity.	2017-11-09 08:40:22 +00:00
David Turner	4abb5fa297	Remove optimisations to reuse objects when applying a new `ClusterState` (#27317 ) In order to avoid churn when applying a new `ClusterState`, there are some checks that compare parts of the old and new states and, if equal, the new object is discarded and the old one reused. Since `ClusterState` updates are now largely diff-based, this code is unnecessary: applying a diff also reuses any old objects if unchanged. Moreover, the code compares the parts of the `ClusterState` using their `version()` values which is not guaranteed to be correct, because of a lack of consensus. This change removes this optimisation, and tests that objects are still reused as expected via the diff mechanism.	2017-11-09 08:09:14 +00:00
Costin Leau	6f04b8c9be	Add unreleased 5.6.5 version number	2017-11-08 18:19:04 +02:00
Boaz Leskes	229bf29ba1	testCreateSplitIndexToN: do not set `routing_partition_size` to >= `number_of_routing_shards` It's an illegal value	2017-11-08 17:09:33 +01:00
Igor Motov	0fe2003ae6	Snapshot/Restore: better handle incorrect chunk_size settings in FS repo (#26844 ) Specifying a negative value or null as a chunk_size in FS repository can lead to corrupt snapshots. Closes #26843	2017-11-08 10:43:28 -05:00
Jason Tedor	6810aa8452	Correct comment in index shard test This commit fixes a comment in an index shard test which was inaccurate after it was copied from another test and not modified to reflect the reasoning in the test that it was copied into.	2017-11-08 09:33:12 -05:00
Jason Tedor	927d7f6b6c	Roll translog generation on primary promotion When a primary is promoted, rolling the translog generation here makes it simpler to reason about the relationship between primary terms and translog generation. Note that this is not strictly necessary for correctness (e.g., to avoid duplicate operations with the same sequence number within a single generation). Relates #27313	2017-11-08 09:14:08 -05:00
olcbean	bd5e7002be	ObjectParser: Replace IllegalStateException with ParsingException (#27302 ) Relates to #27147	2017-11-08 14:10:11 +01:00
Reese Levine	74b1e7db51	scripted_metric _agg parameter disappears if params are provided (#27159 ) * Fixes #19768: scripted_metric _agg parameter disappears if params are provided * Test case for #19768 * Compare boolean to false instead of negating it * Added mocked script in ScriptedMetricIT * Fix test in ScriptedMetricIT for implicit _agg map	2017-11-08 08:45:47 +00:00
Boaz Leskes	ace446f335	Update shrink's bwc version to 6.1.0 and enabled bwc tests	2017-11-07 15:35:46 +01:00
Mayya Sharipova	148376c2c5	Add limits for ngram and shingle settings (#27211 ) * Add limits for ngram and shingle settings (#27211) Create index-level settings: max_ngram_diff - maximum allowed difference between max_gram and min_gram in NGramTokenFilter/NGramTokenizer. Default is 1. max_shingle_diff - maximum allowed difference between max_shingle_size and min_shingle_size in ShingleTokenFilter. Default is 3. Throw an IllegalArgumentException when trying to create NGramTokenFilter, NGramTokenizer, ShingleTokenFilter where difference between max_size and min_size exceeds the settings value. Closes #25887	2017-11-07 08:14:55 -05:00
Boaz Leskes	95cf3df6ac	TemplateUpgradeService should only run on the master (#27294 ) The `TemplateUpgradeService` allows plugins to register a callback that mutates index templates upon recovery. This is handy for upgrade logic that needs to make sure that an existing index template is updated once the cluster is upgraded to a new version of the plugin (and ES). Currently, the service has complicated logic to decide which node should perform the upgrade. It will prefer the master node, if it is of the highest version of the cluster and otherwise it will fall back to one of the non-coordinating nodes which are on the latest version. While this attempts to make sure that new nodes can assume their template version is in place (but old nodes still need to be able to operate under both old and new templates), it has an inherent problem in that the master (on an old version) may not be able to process the put template request with the new template - it may miss certain features. This PR changes the logic to be simpler and always rely on the current master node. This comes at the price that new nodes need to operate both with old templates and new. That price is small as they need to operate with old indices regardless of the template. On the flip side we reduce a lot of complexity in what can happen in the cluster.	2017-11-07 08:35:00 +01:00
Jason Tedor	d5451b2037	Die with dignity while merging If an out of memory error is thrown while merging, today we quietly rewrap it into a merge exception and the out of memory error is lost. Instead, we need to rethrow out of memory errors, and in fact any fatal error here, and let those go uncaught so that the node is torn down. This commit causes this to be the case. Relates #27265	2017-11-06 17:55:11 -05:00
Zachary Tong	6e9e07d6f8	Fix profiling naming issues (#27133 ) Some code-paths use anonymous classes (such as NonCollectingAggregator in terms agg), which messes up the display name of the profiler. If we encounter an anonymous class, we need to grab the super's name. Another naming issue was that ProfileAggs were not delegating to the wrapped agg's name for toString(), leading to ugly display. This PR also fixes up the profile documentation. Some of the examples were executing against empty indices, which shows different profile results than a populated index (and made for confusing examples). Finally, I switched the agg display names from the fully qualified name to the simple name, so that it's similar to how the query profiles work. Closes #26405	2017-11-06 16:37:33 -05:00
Jason Tedor	766d29e7cf	Correctly encode warning headers The warnings headers have a fairly limited set of valid characters (cf. quoted-text in RFC 7230). While we have assertions that we adhere to this set of valid characters ensuring that our warning messages do not violate the specification, we were neglecting the possibility that arbitrary user input would trickle into these warning headers. Thus, missing here were tests for these situations and encoding of characters that appear outside the set of valid characters. This commit addresses this by encoding any characters in a deprecation message that are not from the set of valid characters. Relates #27269	2017-11-06 13:20:30 -05:00
kel	d7fa09153a	Remove duplicated SnapshotStatus (#27276 )	2017-11-06 16:19:16 +01:00
Simon Willnauer	bd7efa908a	Add ability to split shards (#26931 ) This change adds a new `_split` API that allows splitting indices into a new index with a power of two more shards than the source index. This API works alongside the `_shrink` API but doesn't require any shard relocation before indices can be split. The split operation is conceptually an inverse `_shrink` operation since we initialize the index with a _synthetic_ number of routing shards that are used for the consistent hashing at index time. Compared to indices created with earlier versions this might produce slightly different shard distributions but has no impact on the per-index backwards compatibility. For now, the user is required to prepare an index to be splittable by setting the `index.number_of_routing_shards` at index creation time. The setting allows the user to prepare the index to be splittable in factors of `index.number_of_routing_shards`, i.e. if the index is created with `index.number_of_routing_shards: 16` and `index.number_of_shards: 2` it can be split into `4, 8, 16` shards. This is an intermediate step until we can make this the default. This also allows us to safely backport this change to 6.x. The `_split` operation is implemented internally as a DeleteByQuery on the lucene level that is executed while the primary shards execute their initial recovery. Subsequent merges that are triggered due to this operation will not be executed immediately. All merges will be deferred until the shards are started and will then be throttled accordingly. This change is intended for the 6.1 feature release but will not support pre-6.1 indices to be split unless these indices have been shrunk before. In that case these indices can be split backwards into their original number of shards.	2017-11-06 11:37:55 +01:00
Tanguy Leroux	43e7a4a349	Upgrade to Jackson 2.8.10 (#27230 ) While it's not possible to upgrade the Jackson dependencies to their latest versions yet (see #27032 (comment) for more) it's still possible to upgrade to the latest 2.8.x version.	2017-11-06 10:20:05 +01:00
kel	76f81e002c	Remove unused parameters in AnalysisRegistry (#27232 ) Removes unused parameters for AnalysisRegistry#processAnalyzerFactory and AnalysisRegistry#processNormalizerFactory.	2017-11-06 09:48:57 +01:00
kel	5d661df174	Add more information on `_failed_to_convert_` exception (#27034 )	2017-11-06 09:40:28 +01:00
Jim Ferenczi	429275a773	Remove ElasticsearchQueryCachingPolicy (#27190 ) We have a hidden setting called `index.queries.cache.term_queries` that disables caching of term queries in the query cache. However, term queries have not been cached by the Lucene UsageTrackingQueryCachingPolicy since version 6.5. This makes the es policy useless but also makes it impossible to re-enable caching for term queries. This change appeared in Lucene 6.5 so this setting has been a no-op since version 5.4 of Elasticsearch. The change in this PR removes the setting and the custom policy.	2017-11-06 08:26:24 +01:00
Nhat	fd3fac9565	Backport the size-based index rollover to v6.1.0 Relates #27004	2017-11-04 20:14:59 -04:00
Nhat	c7ce5a07f2	Add size-based condition to the index rollover API (#27160 ) This is to add a max_size condition to the index rollover API. We use a totalSizeInBytes from DocsStats to evaluate this condition. Closes #27004	2017-11-04 19:51:48 -04:00
David Roberts	749c3ec716	Remove the single argument Environment constructor (#27235 ) Only tests should use the single argument Environment constructor. To enforce this the single arg Environment constructor has been replaced with a test framework factory method. Production code (beyond initial Bootstrap) should always use the same Environment object that Node.getEnvironment() returns. This Environment is also available via dependency injection.	2017-11-04 13:25:09 +00:00
Chris Earle	964016e228	Fix RestGetAction name typo This changes the name from docuemnt_get_action to document_get_action. Relates #27266	2017-11-04 08:29:00 -04:00
Igor Motov	117f0f3a44	Fix snapshot getting stuck in INIT state (#27214 ) If the master disconnects from the cluster after initiating snapshot, but just before the snapshot switches from INIT to STARTED state, the snapshot can get indefinitely stuck in the INIT state. This error is specific to v5.x+ and was triggered by keeping the master node that stepped down in the node list; the cleanup logic in snapshot/restore assumed that if the master steps down it is always removed from the node list. This commit changes the logic to trigger cleanup even if no nodes left the cluster. Closes #27180	2017-11-03 19:36:08 -04:00
Colin Goodheart-Smithe	20e8005859	Fixes QueryStringQueryBuilderTests Closes #27246	2017-11-03 13:24:56 +00:00
Jim Ferenczi	262422375e	[Test] Fix QueryStringQueryBuilderTests.testExistsFieldQuery BWC Handle BWC version in this test. Closes #27246	2017-11-03 14:17:11 +01:00
kel	0f21262b36	Do not create directories if repository is readonly (#26909 ) For FsBlobStore and HdfsBlobStore, if the repository is read only, the blob store should be aware of the readonly setting and not create directories if they don't exist. Closes #21495	2017-11-03 13:10:50 +01:00
Christoph Büscher	9abc26ee92	[Test] Fix InternalStatsTests After recent changes in InternalStats#doXContentBody the corresponding xContent output of the parsed aggregation needed to be changed in a similar way.	2017-11-03 11:26:41 +01:00
Jim Ferenczi	d503782699	[Test] Fix QueryStringQueryBuilderTests.testExistsFieldQuery Adapt the test to check for the new NormsFieldExistsQuery. Closes #27246	2017-11-03 11:24:45 +01:00
Colin Goodheart-Smithe	28b4d95cf5	Uses norms for exists query if enabled (#27237 ) * Uses norms for exists query if enabled This change means that for indexes created from 6.1.0, if norms are enabled we will not write the field name to the `_field_names` field and for an exists query we will instead use the NormsFieldExistsQuery which was added in Lucene 7.1.0. If norms are not enabled or if the index was created before 6.1.0 `_field_names` will be used as before. * Fixes tests	2017-11-03 08:51:40 +00:00
Mathias Fußenegger	827ba7f82d	Avoid uid creation in ParsedDocument (#27241 ) The uid bytes (as the type#id) were needlessly being created even though they are no longer needed after the move to single type per index. This commit avoids creating these when parsed documents are constructed. Relates #27241	2017-11-02 20:10:07 -04:00
kel	55b9dfdd52	Render sum as zero if count is zero for stats aggregation (#26893 ) (#27193 )	2017-11-02 16:02:47 +00:00
Simon Willnauer	b294250aba	Remove unused searcher parameter in SearchService#createContext (#27227 ) This parameter isn't used anywhere and just adds complexity.	2017-11-02 14:58:34 +01:00
Colin Goodheart-Smithe	c1b8140c83	Upgrade to Lucene 7.1 (#27225 )	2017-11-02 13:25:33 +00:00
Simon Willnauer	f928d613ad	Move IndexShard#getWritingBytes() under InternalEngine (#27209 ) We do some accounting in IndexShard that is not necessarily correct since we maintain two different index readers. This change moves the accounting under the engine which knows what reader we are refreshing. Relates to #26972	2017-11-02 10:43:17 +01:00
olcbean	b9896465cd	Introducing took time for _msearch This commit adds the took time to the response for _msearch. Relates #23767	2017-11-01 21:39:04 -04:00
Jason Tedor	59657ad1cb	Lazy initialize checkpoint tracker bit sets This local checkpoint tracker uses collections of bit sets to track which sequence numbers are complete, eventually removing these bit sets when the local checkpoint advances. However, these bit sets were eagerly allocated so that if a sequence number far ahead of the checkpoint was marked as completed, all bit sets between the "last" bit set and the bit set needed to track the marked sequence number were allocated. If this sequence number was too far ahead, the memory requirements could be excessive. This commit opts for a different strategy for holding on to these bit sets and enables them to be lazily allocated. Relates #27179	2017-11-01 21:26:52 -04:00
Jason Tedor	90d6317437	Remove checkpoint tracker bit sets setting We added an index-level setting for controlling the size of the bit sets used to back the local checkpoint tracker. This setting is really only needed to control the memory footprint of the bit sets but we do not think this setting is going to be needed. This commit removes this setting before it is released to the wild after which we would have to worry about BWC implications. Relates #27191	2017-11-01 21:13:01 -04:00
Colin Goodheart-Smithe	99aca9cdfc	Enhances exists queries to reduce need for `_field_names` (#26930 ) * Enhances exists queries to reduce need for `_field_names` Before this change we wrote the names of all the fields in a document to a `_field_names` field and then implemented exists queries as a term query on this field. The problem with this approach is that it bloats the index and also affects indexing performance. This change adds a new method `existsQuery()` to `MappedFieldType` which is implemented by each sub-class. For most field types if doc values are available a `DocValuesFieldExistsQuery` is used, falling back to using `_field_names` if doc values are disabled. Note that only fields where no doc values are available are written to `_field_names`. Closes #26770 * Addresses review comments * Addresses more review comments * implements existsQuery explicitly on every mapper * Reinstates ability to perform term query on `_field_names` * Added bwc depending on index created version * Review Comments * Skips tests that are not supported in 6.1.0 These values will need to be changed after backporting this PR to 6.x	2017-11-01 10:46:59 +00:00
Martijn van Groningen	d805c41b28	Added new terms_set query This query returns documents that match one or more of the provided terms. The number of terms that must match varies per document and is either controlled by a minimum should match field or computed per document in a minimum should match script. Closes #26915	2017-11-01 10:55:18 +01:00
Jack Conradson	fd73e5fa41	Add version 6.0.0	2017-10-31 17:49:52 -07:00
Tanguy Leroux	13cd08b1e6	Convert index blocks to cluster block exceptions (#27050 )	2017-10-31 16:11:18 +01:00
Shai Erera	bd0261916c	Fix Laplace scorer to multiply by alpha (and not add) (#27125 )	2017-10-31 13:08:44 +01:00
javanna	34666844b3	[DOCS] Clarify migrate guide and search request validation Relates to #26811	2017-10-31 12:36:00 +01:00
kel	c3e2bdf20c	Raise IllegalArgumentException if query validation failed (#26811 ) Closes #26799	2017-10-31 12:17:27 +01:00
Armin Braun	a4c159e91e	prevent duplicate fields when mixing parent and root nested includes (#27072 ) Closes #26990	2017-10-31 10:01:33 +01:00
Adrien Grand	3812d3cb43	TopHitsAggregator must propagate calls to `setScorer`. (#27138 ) It is required in order to work correctly with bulk scorer implementations that change the scorer during the collection process. Otherwise sub collectors might call `Scorer.score()` on the wrong scorer. Closes #27131	2017-10-31 09:59:06 +01:00
Jason Tedor	a566942219	Refactor internal engine This commit is a minor refactoring of internal engine to move hooks for generating sequence numbers into the engine itself. As such, we refactor tests that relied on this hook to use the new hook, and remove the hook from the sequence number service itself. Relates #27082	2017-10-30 13:10:20 -04:00
Martijn van Groningen	c406a91158	Fix division by zero in phrase suggester that causes assertion to fail	2017-10-30 09:04:56 +01:00
Nhat	d01ad9367e	Enable Docstats with totalSizeInBytes for 6.1.0 Relates https://github.com/elastic/elasticsearch/pull/27117	2017-10-28 14:54:53 -04:00
Nhat	07d270b45f	Adds average document size to DocsStats (#27117 ) This change is required in order to support a size based check for the index rollover. The index size is estimated by sampling the existing segments only. We prefer using segments to StoreStats because StoreStats is not reliable if indexing or merging operations are in progress. Relates #27004	2017-10-28 12:47:08 -04:00
Jim Ferenczi	6625ecfff4	Fix max score tracking with field collapsing (#27122 ) This change makes sure that we track score when sort is set to relevancy only. In this case we always track max score like normal search does. Closes #23840	2017-10-27 09:18:34 +02:00
olcbean	35a2cc1003	fixed typo in ConstructingObjectParser (#27129 )	2017-10-26 13:14:56 -06:00
Jim Ferenczi	d1acf449f5	Apply missing request options to the expand phase (#27118 ) * Apply missing request options to the expand phase This change adds some missing options to the expand query that builds the inner hits for field collapsing. The following options are now applied to the inner_hits query: * post_filters * preferences * routing Closes #27079 Closes #26649	2017-10-26 17:01:57 +02:00
Simon Willnauer	1460a3feac	Only pull SegmentReader once in getSegmentInfo (#27121 )	2017-10-26 14:56:14 +02:00
Jason Tedor	0174d13ca2	Fix BWC for discovery stats The new discovery stats were pushed to the 6.x branch (currently versioned at 6.1.0) but master was not updated to reflect this. This impacts the mixed-cluster BWC tests because a 6.1.0 node will be trying to send a 7.0.0 node the new discovery stats but the 7.0.0 node did not yet understand that it should be reading these when talking to a 6.1.0 node. This commit addresses this, and changes the skip version on the discovery stats REST tests.	2017-10-26 07:53:18 -04:00
Catalin Ursachi	8bf33241ed	Add Delete Index API support to high-level REST client (#27019 ) Relates to #25847	2017-10-26 09:52:46 +02:00
Jason Tedor	77f87732ef	Adjust .DS_Store test assertions on Windows Windows handles trying to read a file that does not exist because a component of the path is not a directory differently than other operating systems do. This commit adjusts these assertions for Windows.	2017-10-25 22:36:53 -04:00
Jason Tedor	17d6820a4b	Emit settings deprecation logging on empty update When executing a cluster settings update that leaves the cluster state unchanged, we skip validation and this avoids deprecation logging for deprecated settings in the cluster state. This commit addresses this by running validation even if the settings are unchanged. Relates #27017	2017-10-25 22:15:38 -04:00
Jason Tedor	9aae2f593a	Avoid stack overflow on search phases When a search is executing locally over many shards, we can stack overflow during query phase execution. This happens due to callbacks that occur after a phase completes for a shard and we move to the same phase on another shard. If all the shards for the query are local to the local node then we will never go async and these callbacks will end up as recursive calls. With sufficiently many shards, this will end up as a stack overflow. This commit addresses this by truncating the stack by forking to another thread on the executor for the phase. Relates #27069	2017-10-25 22:05:46 -04:00
Nhat	adc195e30c	Fix error message for a put index template request without index_patterns (#27102 ) Just correct the error message from "Validation Failed: 1: pattern is missing;" to "Validation Failed: 1: index_patterns is missing;". Closes #27100	2017-10-25 18:54:40 -04:00
Armin Braun	6533b165d6	#25601 Add pipeline support for REST API bulk upsert (#27075 )	2017-10-25 19:03:25 +02:00
Jason Tedor	6722b9c4a2	Ignore .DS_Store files on macOS Finder creates these files if you browse a directory there. These files are really annoying, but it's an incredible pain for users that these files are created unbeknownst to them, and then they get in the way of Elasticsearch starting. This commit adds leniency on macOS only to skip these files. Relates #27108	2017-10-25 11:25:29 -04:00
Luca Cavanna	5818ff6b56	Make ShardSearchTarget optional when parsing ShardSearchFailure (#27078 ) Turns out that `ShardSearchTarget` is nullable, hence its fields may not be printed out as part of `ShardSearchFailure#toXContent`, in which case `fromXContent` cannot parse it back. We would previously try to create the object with all of its fields set to null, but `Index` complains about it in the constructor. Also made sure that this code path is covered by our unit tests in `ShardSearchFailureTests`. Closes #27055	2017-10-25 13:26:06 +02:00
Luca Cavanna	8caf7d4ff8	Decouple BulkProcessor from ThreadPool (#26727 ) Introduce minimal thread scheduler as a base class for `ThreadPool`. Such a class can be used from the `BulkProcessor` to schedule retries and the flush task. This allows removing the `ThreadPool` dependency from `BulkProcessor`, which required providing settings that contain `node.name` and also needed log4j for logging. Instead, it now needs a `Scheduler`, which is much lighter and gets automatically created and shut down on close. Closes #26028	2017-10-25 10:30:23 +02:00
David Turner	cc3364e4f8	Stats to record how often the ClusterState diff mechanism is used successfully (#26973 ) It's believed that using diffs obsoletes the other mechanism for reusing the bits of the ClusterState that didn't change between updates, but in fact we don't know for sure how often the diff mechanism works successfully. The stats collected here will tell us.	2017-10-25 07:35:25 +01:00
Lee Hinman	6bc7024f26	Tie-break shard path decision based on total number of shards on path (#27039 ) Right now if the number of shards for a particular index is equal across the data paths, we tie-break on space. This changes to tie-break first on the total number of shards for each path, and then, if that is the same, on the usable bytes. Relates to #26654 (it's a follow-up)	2017-10-24 16:12:47 -06:00
Jason Tedor	7a792d2c1f	Timed runnable should delegate to abstract runnable If timed runnable wraps an abstract runnable, then it should delegate to the abstract runnable; otherwise force execution and rejection handling are dropped on the floor. Thus, timed runnable should itself be an abstract runnable delegating all methods to the wrapped runnable in cases when it is an abstract runnable. This commit causes this to be the case. Relates #27095	2017-10-24 11:36:50 -04:00
Lee Hinman	fcfbdf1f37	Expose adaptive replica selection stats in /_nodes/stats API This exposes the collected metrics we store for ARS in the nodes stats, as well as the computed rank of nodes. Each node exposes its perspective about the cluster. Here's an example output (with `?human`): ```json ... "adaptive_selection" : { "_k6v1-wERxyUd5ke6s-D0g" : { "outgoing_searches" : 0, "avg_queue_size" : 0, "avg_service_time" : "7.8ms", "avg_service_time_ns" : 7896963, "avg_response_time" : "9ms", "avg_response_time_ns" : 9095598, "rank" : "9.1" }, "VJiCUFoiTpySGmO00eWmtQ" : { "outgoing_searches" : 0, "avg_queue_size" : 0, "avg_service_time" : "1.3ms", "avg_service_time_ns" : 1330240, "avg_response_time" : "4.5ms", "avg_response_time_ns" : 4524154, "rank" : "4.5" }, "DHNGTdzyT9iiaCpEUsIAKA" : { "outgoing_searches" : 0, "avg_queue_size" : 0, "avg_service_time" : "2.1ms", "avg_service_time_ns" : 2113164, "avg_response_time" : "6.3ms", "avg_response_time_ns" : 6375810, "rank" : "6.4" } } ... ```	2017-10-24 08:58:42 -06:00
David Turner	cf2d0834f5	Remove duplicated test (#27091 )	2017-10-24 11:52:01 +01:00
Nhat	bf557fd886	test: avoid generating duplicate multiple fields (#27080 ) Multifields parser does not allow duplicate values, however the MultiFieldTests may produce duplicate field values. See https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+master+release-tests/132/console.	2017-10-23 09:59:40 -04:00
Adrien Grand	d0104c22a5	Reduce the default number of cached queries. (#26949 ) Memory usage of queries can't be properly accounted, which can be an issue when large queries are cached since the actual memory usage will be much higher than what the cache thinks. This problem is very hard if not impossible to fix so as a workaround I would like to decrease the maximum number of cached queries so that this problem is less likely to cause trouble in practice. For the record, this problem is more likely to occur in environments that have small shards or don't give much memory to the JVM. Closes #26938	2017-10-23 14:11:35 +02:00
Jason Tedor	35984a616e	Keep cumulative elapsed scroll time in microseconds Today we internally accumulate elapsed scroll time in nanoseconds. The problem here is that this can reasonably overflow. For example, on a system with scrolls that are open for ten minutes on average, after sixteen million scrolls the largest value that can be represented by a long will be exceeded. To address this, we switch to internally representing elapsed scroll time using microseconds. With the same number of scrolls, this allows scrolls that are open for seven days on average; with the same average elapsed time, it allows sixteen billion scrolls, which will never happen (executing one scroll a second until sixteen billion have executed would not occur until more than five hundred years had elapsed). Relates #27068	2017-10-21 13:18:28 +02:00
Tanguy Leroux	463e7e6fa3	Revert "Upgrade to Jackson 2.9.2 (#27032 )" This reverts commit `0b9acc5ace`.	2017-10-20 08:25:41 +02:00
Tanguy Leroux	0b9acc5ace	Upgrade to Jackson 2.9.2 (#27032 ) Upgrade to Jackson 2.9.2 and also use a boolean `closed` flag to indicate that a FastStringReader instance is closed, so that length is still correctly reported after the reader is closed.	2017-10-19 15:15:02 +02:00
Martijn van Groningen	87c9b79b10	Return the _source of inner hit nested as is without wrapping it into its full path context Due to a change made via #26102 to make the nested source consistent with or without source filtering, the _source of a nested inner hit was always wrapped in the parent path. This turned out to be not ideal for users relying on the nested source, as it would require additional parsing on the client side. This change fixes this: the _source of nested inner hits is no longer wrapped by parent json objects, regardless of whether the _source is included as is or source filtering is used. Internally, source filtering and highlighting rely on the fact that the _source of nested inner hits is accessible by its full field path, so in order to not break this, the conversion of the _source into its binary form is performed in FetchSourceSubPhase, after any potential source filtering is performed, to make sure the structure of the _source of the nested inner hit is consistent regardless of whether source filtering is performed. PR for #26944 Closes #26944	2017-10-19 12:04:56 +02:00
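
The overflow estimate in commit 35984a616e (cumulative elapsed scroll time) can be checked with a few lines of arithmetic. The sketch below is not code from the repository; the function and variable names are hypothetical, and it only assumes that the counter is a signed 64-bit long and that scrolls stay open for ten minutes on average, as the commit message states. The exact limits come out near 15.4 million and 15.4 billion scrolls, which the commit message rounds to sixteen million and sixteen billion.

```python
# Back-of-the-envelope check of the overflow figures quoted in the commit message.
# All names here are illustrative, not taken from the codebase.

LONG_MAX = 2**63 - 1  # largest value a signed 64-bit long can hold

NANOS_PER_SECOND = 1_000_000_000
MICROS_PER_SECOND = 1_000_000
SECONDS_PER_DAY = 24 * 60 * 60
SECONDS_PER_YEAR = 365 * SECONDS_PER_DAY


def scrolls_until_overflow(avg_scroll_seconds, units_per_second):
    """How many scrolls of the given average duration can be summed
    before the cumulative counter no longer fits in a 64-bit long."""
    return LONG_MAX // (avg_scroll_seconds * units_per_second)


ten_minutes = 10 * 60

# Counter kept in nanoseconds (the old behaviour described in the message).
nanos_limit = scrolls_until_overflow(ten_minutes, NANOS_PER_SECOND)

# Counter kept in microseconds (the new behaviour).
micros_limit = scrolls_until_overflow(ten_minutes, MICROS_PER_SECOND)

# With microseconds and the *same number* of scrolls as the nanosecond limit,
# this is the average scroll duration (in days) that still fits.
max_avg_days = LONG_MAX / nanos_limit / MICROS_PER_SECOND / SECONDS_PER_DAY

# Time needed to reach the microsecond limit at one scroll per second.
years_at_one_per_second = micros_limit / SECONDS_PER_YEAR

print(f"nanoseconds:  overflow after ~{nanos_limit:,} scrolls")
print(f"microseconds: overflow after ~{micros_limit:,} scrolls")
print(f"same scroll count, microseconds: ~{max_avg_days:.1f} days average")
print(f"one scroll per second: ~{years_at_one_per_second:.0f} years to overflow")
```

Switching the unit from nanoseconds to microseconds simply buys a factor of 1000 in headroom, which can be spent either on scroll count or on average scroll duration.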


9090 Commits