8587 Commits

Author SHA1 Message Date
Simon Willnauer a3d5cdcda8 [TEST] Wait for yellow since some shards might not be started
In this test we only index a handful of docs so if we have more shards
than docs we might fail on the `assertSearchResult` since not all shards
are started but results are just fine.
2014-07-04 09:53:48 +02:00
Shay Banon 5249005578 More resource efficient analysis wrapping usage
Today, we take great care to try and share the same analyzer instances across shards and indices (global analyzer). The idea is to share the same analyzer so the thread local resource it has will not be allocated per analyzer instance per thread.
The problem is that AnalyzerWrapper keeps its resources in its own per-thread storage, and with the per-field reuse strategy this causes token stream components to be created per field per thread. This is very evident with the StandardTokenizer that uses a buffer...
This came out of a test with "many fields", where the majority of a 1GB heap was consumed by StandardTokenizer instances...
closes #6714
2014-07-03 21:03:08 +02:00
Brusic 388fddb3d9 Fix github download link when using specific version 2014-07-03 15:40:16 +02:00
David Pilato 162c62dbcc [DOCS] Add information regarding _type parameter requirement for _mget
Change ID to `[[mget-type]]`

Closes #6670.
2014-07-03 15:38:06 +02:00
David Pilato de48d7f94c [DOCS] Add information regarding _type parameter requirement for _mget
Closes #6670.
2014-07-03 15:23:35 +02:00
Jun Ohtani 0c6a859357 Docs: fixed ICU plugin documentation
add ICU Normalization CharFilter to docs

Closes #6711
2014-07-03 15:21:51 +02:00
Martijn van Groningen 7fbfbabfd3 [TEST] Include mapping in failure 2014-07-03 14:54:20 +02:00
Boaz Leskes ae16956e07 [Discovery] immediately start Master|Node fault detection pinging
After a node joins the cluster, it starts pinging the master to verify its health. Before, the cluster join request was processed asynchronously and we had to give it some time to complete. With #6480 we changed this to wait for the join process to complete on the master. We can therefore start pinging immediately for fast detection of failures. A similar change can be made to the Node fault detection from the master side.

Closes #6706
2014-07-03 14:51:11 +02:00
Simon Willnauer f22e51ae81 [TEST] only call TestCluster#afterTest() if cluster was successfully initialized 2014-07-03 14:42:08 +02:00
Simon Willnauer 0475a052b0 [TEST] disable BWC tests for version < 1.1.0 2014-07-03 14:42:08 +02:00
Simon Willnauer 6d2077b0a3 [TEST] Split up random bulks more often and also if the document set is smallish 2014-07-03 13:56:31 +02:00
Boaz Leskes 7beac4ddbf [Discovery] Fault detection should also check cause exceptions for disconnects
The change introduced in #6686 checks for ConnectionTransportException during pinging. However, the transport layer wraps it in SendRequestTransportException.
2014-07-03 13:42:53 +02:00
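The cause-chain check described in commit 7beac4ddbf can be illustrated with a minimal sketch. The two exception classes below are hypothetical stand-ins named after the ones in the commit message, not the real Elasticsearch types, and `hasCause` is an illustrative helper rather than the method the commit actually added.

```java
// Sketch only: stand-in exception types and a helper that walks the cause chain.
final class CauseChain {

    // Hypothetical stand-in for the disconnect exception named in the commit.
    static class ConnectionTransportException extends RuntimeException {
        ConnectionTransportException(String message) {
            super(message);
        }
    }

    // Hypothetical stand-in for the wrapper the transport layer throws.
    static class SendRequestTransportException extends RuntimeException {
        SendRequestTransportException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    /**
     * Returns true if the given throwable, or any exception in its cause
     * chain, is an instance of the requested type. The depth limit guards
     * against pathological (cyclic) cause chains.
     */
    static boolean hasCause(Throwable t, Class<? extends Throwable> type) {
        int depth = 0;
        for (Throwable current = t; current != null && depth < 32; current = current.getCause()) {
            if (type.isInstance(current)) {
                return true;
            }
            depth++;
        }
        return false;
    }
}
```

Checking only the top-level exception would miss the wrapped disconnect; walking the chain catches both the direct and the wrapped case.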
Mikhail Korobov 955473f475 Docs: unescape regexes in Pattern Tokenizer docs
Currently regexes in Pattern Tokenizer docs are escaped (it seems according to Java rules). I think it is better not to escape them because JSON escaping should be automatic in client libraries, and string escaping depends on a client language used. The default pattern is `\W+`, not `\\W+`.

Closes #6615
2014-07-03 13:34:13 +02:00
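The escaping distinction raised in commit 955473f475 can be seen in a small Java sketch. The class and method names are illustrative assumptions; only `java.util.regex.Pattern` is a real API here. In Java source the regex `\W+` has to be written as the string literal `"\\W+"`, but the string itself contains a single backslash.

```java
import java.util.regex.Pattern;

// Sketch only: shows that "\\W+" in Java source is the three-character regex \W+.
final class PatternEscaping {

    // Written with two backslashes in source; the runtime value is \W+ (3 characters).
    static final String DEFAULT_PATTERN = "\\W+";

    // Splits text on runs of non-word characters, using the pattern above.
    static String[] tokenize(String text) {
        return Pattern.compile(DEFAULT_PATTERN).split(text);
    }
}
```

The doubled backslash is an artifact of the host language's string syntax, which is why documentation that shows the raw pattern is less likely to mislead users of other client languages.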
hanneskaeufler 6e6f4def5d Docs: Fix typo in timestamp-field.asciidoc
Closes #6661
2014-07-03 13:27:37 +02:00
Simon Willnauer 95b6822f46 [TEST] Exclude SORANI analyzer if compatibility version is < 1.3.0 2014-07-03 13:16:19 +02:00
Robert Muir 2935b751e9 Fix doc formatting. Norwegian stemmers and Scandinavian normalizers
were missing commas between entries.
2014-07-03 07:08:33 -04:00
Simon Willnauer 97793358ea [TEST] Wait for cluster consistency before tests starts 2014-07-03 11:56:05 +02:00
Robert Muir b9a09c2b06 Analysis: Add additional Analyzers, Tokenizers, and TokenFilters from Lucene
Add `irish` analyzer
Add `sorani` analyzer (Kurdish)

Add `classic` tokenizer: specific to English text and tries to recognize hostnames, companies, acronyms, etc.
Add `thai` tokenizer: segments Thai text into words.

Add `classic` tokenfilter: cleans up acronyms and possessives from classic tokenizer
Add `apostrophe` tokenfilter: removes text after apostrophe and the apostrophe itself
Add `german_normalization` tokenfilter: umlaut/sharp S normalization
Add `hindi_normalization` tokenfilter: accounts for Hindi spelling differences
Add `indic_normalization` tokenfilter: accounts for different Unicode representations in Indian languages
Add `sorani_normalization` tokenfilter: normalizes Kurdish text
Add `scandinavian_normalization` tokenfilter: normalizes Norwegian, Danish, Swedish text
Add `scandinavian_folding` tokenfilter: much more aggressive form of `scandinavian_normalization`
Add additional languages to stemmer tokenfilter: `galician`, `minimal_galician`, `irish`, `sorani`, `light_nynorsk`, `minimal_nynorsk`

Add support for accessing the default Thai stopword set "_thai_"

Fix some bugs and broken links in documentation.

Closes #5935
2014-07-03 05:47:49 -04:00
Simon Willnauer 9ddfaf3aaf [TEST] Expose `tests.filter` for elasticsearch tests.
`-Dtests.filter` allows passing filter expressions to the elasticsearch
tests. This allows filtering tests annotated with TestGroup annotations
like @Slow, @Nightly, @Backwards, @Integration with a boolean expression like:

 * to run only backwards tests run:
     `mvn -Dtests.bwc.version=X.Y.Z -Dtests.filter="@backwards"`
 * to run all integration tests but skip slow tests run:
     `mvn -Dtests.filter="@integration and not @slow"`
 * to take defaults into account, i.e. run all tests as well as backwards:
     `mvn -Dtests.filter="default and @backwards"`

This feature is a more powerful alternative to flags like
`-Dtests.nightly=true|false` etc.

Closes #6703
2014-07-03 11:40:49 +02:00
Matthew L Daniel 53f2301eea Docs: Add clarifying text about regexp and terms
For the casual reader, the reference to "term queries" may be glossed over, yielding an unexpected result when using `regexp` queries.
This attempts to make that distinction more prominent.

Closes #6698
2014-07-03 11:39:57 +02:00
Simon Willnauer 38e9942bd6 [TEST] Stabilize BWC tests for version < 1.1.0 2014-07-03 11:12:43 +02:00
Boaz Leskes 6e9a1f82b6 [Tests] remove bigArrays which were not fully released from watch list (and still fail the test)
This is to prevent future tests from failing due to these arrays
2014-07-03 11:10:44 +02:00
jnguyenx 1883f74cc0 Docs: Fixed missing comma in multi match query example 2014-07-03 08:17:09 +02:00
Shay Banon c1bc269de9 clean shard bulk mapping update to only use type
today we track both the index name and type for mapping updates in the shard bulk action, but we only work against one index at this level, so there is no need to track the index name itself
closes #6695
2014-07-03 00:38:52 +02:00
Simon Willnauer a960d17d09 [TEST] use pre 1.2.0 MATCH_ALL version if we test BWC for pre 1.2.0 2014-07-02 23:17:56 +02:00
Martijn van Groningen 20a55c05df Percolator: improve logging and cleanup try-catch statement for percolator query loading. 2014-07-02 22:36:02 +02:00
Martijn van Groningen 63eaec6f48 [TEST] Also do waitForConcreteMappingsOnAll() call for the .percolator type. 2014-07-02 22:30:51 +02:00
Simon Willnauer fd19b42cbb [TEST] Don't wait for relocations - the ensureYellow() call does that already 2014-07-02 22:13:44 +02:00
Boaz Leskes 8909a77724 [Discovery] Handle ConnectionTransportException during a Master/Node fault detection ping
Both the Master and Node fault detection register themselves to be notified when a node disconnects, so that they can respond to it accordingly. As such, when a ConnectionTransportException was raised on a ping request, it was not handled, as it is already handled somewhere else. However, this does introduce a race condition if the disconnect happens during a period where there is no current master (minimum_master_node breach), at which time the fault detection is not active. In this case, we will only discover the disconnect error during the ping request, so we have to respond accordingly.

Closes #6686
2014-07-02 20:49:48 +02:00
Simon Willnauer 3b959706b3 [TEST] Take compatibility version into account for XContentType
randomization

We randomize the XContentType to test deriving the content type on all
APIs. Yet, BWC tests run against versions where CBOR wasn't around;
this commit ensures we don't use CBOR when compatibility version is
less than `1.2.0`

Closes #6691
2014-07-02 20:06:03 +02:00
Martijn van Groningen 0ccc4c7c05 [TEST] Also wait for fields to have been applied in the mapping in cluster state during the waitForConcreteMappingsOnAll call
The concrete DocMapper on the master will be updated before the mapping in the cluster state. The DocMapper is updated during the cluster update task. This can lead to occasional assertion failures on the mapping response, because that is based on the mapping in the cluster state, which may not yet have been updated (there is a time window in which the DocMapper is updated but the mapping in the cluster state isn't).
2014-07-02 17:35:35 +02:00
Shay Banon ccd54dae2d better logic on sending mapping update new type introduction
When an indexing request introduces a new mapping, today we rely on the parsing logic to mark it as modified on the "first" parsing phase. This can cause mapping updates to be sent to the master even when the mapping has already been introduced via create index or put mapping, i.e. without needing to.
This bubbled up in the disabled field data format test, where we explicitly define mappings so that the update mapping behavior does not happen, yet it still happens because of the current logic. Because our test randomly delays the introduction of mapping updates, such an update can get in and override updated ones.
closes #6669
2014-07-02 17:30:56 +02:00
Alexander Reelsen 4091162d91 Refactoring: Replaced string values with static constants
in TransportShardBulkAction after fixing an issue.
2014-07-02 12:37:40 +02:00
Alexander Reelsen b46d017e5c Bulk API: Fix return of wrong request type on failed updates
In case an update request failed (for example when updating with a
wrongly formatted date), the returned index operation type was index
instead of update.

Closes #6630
2014-07-02 12:37:39 +02:00
Boaz Leskes 7119ffa7bc IndexingMemoryController should only update buffer settings of recovered shards
At the moment the IndexingMemoryController can try to update the index buffer memory of shards at any given moment. This update involves a flush, which may cause a FlushNotAllowedEngineException to be thrown in a concurrently finalizing recovery.

Closes #6642, closes #6667
2014-07-02 12:23:10 +02:00
Adrien Grand b0c21d751d [TEST] Fix SimpleDeleteMappingTests.
The failure was hard to reproduce but it looked to me like dynamic mapping
updates were overriding the delete mappings request.
2014-07-02 12:12:04 +02:00
Adrien Grand 356349599f [TEST] Fix PercolatorTests to wait for mappings on master. 2014-07-02 11:51:58 +02:00
Alexander Reelsen 16fe44c7ec JAVA API: Fix source excludes setting if no includes were provided
Due to a bogus if-check in SearchSourceBuilder.fetchSource(String include, String exclude)
the excludes only got set when the includes were not null. Fixed this and added some
basic tests.

Closes #6632
2014-07-02 11:48:05 +02:00
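The kind of if-check bug described in commit 16fe44c7ec can be sketched as follows. This is a simplified, hypothetical builder, not the real `SearchSourceBuilder`; the buggy variant shows one plausible shape that matches the described behavior (excludes only set when the include is non-null), and the actual code in the commit may differ.

```java
// Sketch only: a hypothetical builder contrasting a nested null check with independent checks.
final class FetchSourceSketch {
    private String[] includes = new String[0];
    private String[] excludes = new String[0];

    // Buggy shape (assumed): the exclude is only applied inside the include check.
    FetchSourceSketch fetchSourceBuggy(String include, String exclude) {
        if (include != null) {
            includes = new String[] { include };
            if (exclude != null) {
                excludes = new String[] { exclude };
            }
        }
        return this;
    }

    // Fixed shape: include and exclude are handled independently.
    FetchSourceSketch fetchSource(String include, String exclude) {
        if (include != null) {
            includes = new String[] { include };
        }
        if (exclude != null) {
            excludes = new String[] { exclude };
        }
        return this;
    }

    String[] includes() {
        return includes;
    }

    String[] excludes() {
        return excludes;
    }
}
```

Calling `fetchSourceBuggy(null, "secret")` leaves the excludes empty, while `fetchSource(null, "secret")` records the exclude as intended.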
Simon Willnauer dbd372cd61 [TEST] Added IntegrationTest to reproduce #6614 2014-07-02 11:45:58 +02:00
Simon Willnauer 06918d547a [TEST] Wait for yellow after enable allocation on all nodes in BWC tests 2014-07-02 11:38:52 +02:00
Adrien Grand e76eb228b2 [TEST] Fix IndexLookupTests.testCallWithDifferentFlagsFails. 2014-07-02 10:09:29 +02:00
Adrien Grand 309a284e8d [TEST] Fix failure in SearchFieldsTests.testUidBasedScriptFields.
Sorting fails on unmapped fields so the new propagation delay of the mappings
exposed this issue. I added explicit mappings as part of index creation to fix it.
2014-07-02 09:40:49 +02:00
Adrien Grand a96f9a7c83 Templates: GET templates doesn't honor the `flat_settings` parameter.
Close #6671
2014-07-02 08:42:31 +02:00
Igor Motov 67882d78aa [TEST] Remove RANDOM_NO_DELETE_OPEN_FILE and RANDOM_PREVENT_DOUBLE_WRITE settings from snapshot/restore tests 2014-07-01 15:55:53 -04:00
Boaz Leskes b2b443130f Fix forbidden API syntax error 2014-07-01 19:49:57 +02:00
Shay Banon 2b1823cf02 wait for mapping updates during local recovery
when the primary shard is recovering its translog, make sure to wait for new mapping introductions till the mappings have been updated on the master before finalizing the recovery itself
also, this change performs the mapping updates in a more optimized manner by batching the types to change into a single set and sending after the translog has been replayed

also, remove the wait for mapping on master in the local state tests since this new behavior covers it

closes #6666

remove waiting for mapping on master since we do it in recovery
2014-07-01 19:36:26 +02:00
Boaz Leskes 72d2ac1328 Better support for partial buffer reads/writes in translog infrastructure
Some IO APIs can return after writing or reading only a part of the requested data. On these rare occasions, we should call the methods again to read/write the rest of the data. This has caused rare translog corruption while writing huge documents on Windows.

Notable parts of the commit:
- A new Channels class with utility methods for reading and writing to channels
- Writing or reading to channels is added to the forbidden API list
- Added locking to SimpleFsTranslogFile
- Removed FileChannelInputStream which was not used

Closes #6441, #6576
2014-07-01 19:11:36 +02:00
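The partial read/write handling described in commit 72d2ac1328 follows a common NIO pattern, sketched below. The class and method names are illustrative assumptions and are not claimed to match the `Channels` utility class the commit introduced; only the `java.nio` types are real APIs.

```java
import java.io.EOFException;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.ReadableByteChannel;
import java.nio.channels.WritableByteChannel;

// Sketch only: loop until the whole buffer has been written or filled.
final class ChannelLoops {

    // A single write() may consume only part of the buffer, so repeat until it is drained.
    static void writeFully(WritableByteChannel channel, ByteBuffer source) throws IOException {
        while (source.hasRemaining()) {
            channel.write(source);
        }
    }

    // A single read() may fill only part of the buffer, so repeat until it is full.
    static void readFully(ReadableByteChannel channel, ByteBuffer dest) throws IOException {
        while (dest.hasRemaining()) {
            if (channel.read(dest) < 0) {
                throw new EOFException("channel ended with " + dest.remaining() + " bytes still expected");
            }
        }
    }
}
```

Both `WritableByteChannel.write` and `ReadableByteChannel.read` return the number of bytes actually transferred, which may be fewer than requested; ignoring that return value is what makes a single call unsafe.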
Martijn van Groningen 5668b1cfc5 Core: cancel entire recovery if shard closes on target node during the recovery operations.
Closes #6645
2014-07-01 18:16:41 +02:00
Simon Willnauer fd1d02fd07 [TEST] Prevent usage of System Properties in the InternalTestCluster
All settings should be passed as settings and the environment should not
influence the test cluster settings. The settings we care about, i.e.
`es.node.mode` and `es.logger.level` should be passed via settings.
This allows tests to override these settings if they for instance need
`network` transport to operate at all.

Closes #6663
2014-07-01 18:05:44 +02:00
Simon Willnauer c9b7bec3cc [INDEX] Ensure `index.version.created` is consistent
Today `index.version.created` depends on the version of the master
node in the cluster. This is potentially causing new features to be
expected on shards that didn't exist when the index was created.
There is no notion of `where was the shard allocated first` such that
`index.version.created` can't be reliably used as a feature flag.

With this change the `index.version.created` can be reliably used to
determine the smallest node version at the point in time when the index
was created. This means we can safely use certain features that would
for instance require reindexing and / or would not work unless the
entire index (all shards and segments) has been created with a certain
version or newer.

Closes #6660
2014-07-01 18:00:13 +02:00
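The idea behind commit c9b7bec3cc, recording the smallest node version at index creation and using it as a feature gate, can be sketched as below. Representing versions as plain integer IDs, and the names `smallestNodeVersion` and `supportsFeature`, are assumptions made for illustration; the real implementation is not shown in the commit message.

```java
import java.util.List;

// Sketch only: versions are modeled as hypothetical integer IDs (larger means newer).
final class CreatedVersionSketch {

    // The created version is the minimum over all node versions in the cluster.
    static int smallestNodeVersion(List<Integer> nodeVersionIds) {
        if (nodeVersionIds.isEmpty()) {
            throw new IllegalArgumentException("at least one node version is required");
        }
        int min = Integer.MAX_VALUE;
        for (int id : nodeVersionIds) {
            min = Math.min(min, id);
        }
        return min;
    }

    // A feature is safe to use only if every node at creation time already had it.
    static boolean supportsFeature(int indexVersionCreated, int featureIntroducedIn) {
        return indexVersionCreated >= featureIntroducedIn;
    }
}
```

Taking the minimum rather than the master's version means a mixed-version cluster never advertises a feature that some node holding a shard could not have written.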