* Added percolator field mapper that extracts the query terms and indexes these terms with the percolator query.
* At percolate time these extracted terms are used to select the percolator queries that are likely to match. This can significantly cut down the time it takes to percolate, whereas before all percolator queries were evaluated to see whether they matched the document being percolated.
* Changes made to percolator queries are no longer immediately visible; a refresh needs to happen before the changes are visible.
* By default the percolate API only returns up to 10 matches instead of returning all matching percolator queries.
* Made percolate more modular, so that it is easier to add unit tests.
* Added unit tests for the percolator.
Closes #12664
Closes #13646
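The term-based pre-selection described in this change can be illustrated with a small sketch. It is not the percolator field mapper itself: the `QueryTermIndex` class, its method names, and the string query ids are hypothetical, and real query term extraction is considerably more involved.

```java
import java.util.Collection;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch: maps each extracted term to the ids of the registered queries containing it.
class QueryTermIndex {
    private final Map<String, Set<String>> termToQueryIds = new HashMap<>();

    // Register a query under every term that was extracted from it.
    void register(String queryId, Collection<String> extractedTerms) {
        for (String term : extractedTerms) {
            termToQueryIds.computeIfAbsent(term, t -> new TreeSet<>()).add(queryId);
        }
    }

    // Return only the queries that share at least one term with the document.
    // Only these candidates would then need to be fully evaluated.
    Set<String> candidates(Collection<String> documentTerms) {
        Set<String> result = new TreeSet<>();
        for (String term : documentTerms) {
            result.addAll(termToQueryIds.getOrDefault(term, Collections.emptySet()));
        }
        return result;
    }
}
```

With two registered queries, one on "quick fox" and one on "lazy dog", a document containing "quick" yields only the first query as a candidate, so the second is never evaluated.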
Today we delete the translog-N.tlog file if any subsequent operation fails,
but we might actually be in a good state if, for instance, the creation of the writer
fails after we successfully baked the new translog generation into the checkpoint. In this situation
we used to delete the translog-N.tlog file and then failed on the next recovery of the translog with a
NoSuchFileException | FileNotFoundException, just like in https://discuss.elastic.co/t/cannot-recover-index-because-of-missing-tanslog-files/38336
This commit changes the behavior and cleans up that limbo state on recovery: if we already have a generation+1 file written but not baked into
the checkpoint, we remove that file, but only if the previous ckp file has already been renamed; otherwise we know we can't be in this state.
We have a postIndex|DeleteUnderLock listener callback to load percolator
queries, which by now is entirely private to the index shard. Yet,
it still calls an external callback while holding an indexing lock, which is scary
since we have no control over how long the operation could possibly take.
This commit decouples the percolator registry entirely from the ShardIndexingService
by pessimistically fetching percolator documents from the engine using realtime get.
Even in situations where the same document is changed concurrently we will eventually end up
in the correct state without losing an update. This also moves the index throttling stats directly into
the engine to entirely remove the need for the dependency between InternalEngine and ShardIndexingService.
Relocating a non-primary shard from one node to another is actually done by recovering from the active
primary shard in the cluster, and not the node that we are logically relocating from.
Closes #15775
This commit addresses an issue where a cluster state task listener
throwing an exception could prevent other listeners from being notified,
and could prevent the executor from receiving notifications that a new
cluster state was published. Additionally, this commit also addresses a
similar issue for executors handling cluster state publication
notifications.
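The failure mode fixed here (one throwing listener starving the others) can be sketched in isolation. The `SafeNotifier` class and `notifyListeners` method below are hypothetical names and not part of the cluster service; the sketch only shows the general pattern of isolating each callback.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical sketch: notify every listener even if some of them throw.
class SafeNotifier {
    // Returns the exceptions that were caught so the caller can log them.
    static <T> List<Exception> notifyListeners(List<Consumer<T>> listeners, T event) {
        List<Exception> failures = new ArrayList<>();
        for (Consumer<T> listener : listeners) {
            try {
                listener.accept(event);
            } catch (Exception e) {
                // A misbehaving listener must not prevent the remaining ones from running.
                failures.add(e);
            }
        }
        return failures;
    }
}
```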
Adds a task manager class and enables all activities to register with the task manager. Currently, the immutable Transport*Activity class represents the activity itself and is shared across all requests. This PR adds an additional structure, Task, that keeps track of currently running requests and can be used to communicate with these requests using TransportTaskAction.
Related to #15117
We used to write into an in-memory buffer and, if necessary, also allowed reading
from the memory buffer if some translog locations that were not yet flushed to
the channel needed to be read. This commit hides all writing behind a buffered output
stream and, if necessary, flushes all buffered data to the channel for reading. This allows
for several simplifications, like reusing Java's built-in BufferedOutputStream, and removes the
need for read/write locks on the translog writer. All thread safety is now achieved using
the synchronized primitive.
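A minimal sketch of the flush-before-read idea follows. It is not the translog writer: `BufferedLogWriter` is a hypothetical class, a `ByteArrayOutputStream` stands in for the file channel, and the buffer size is an arbitrary choice for the example.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.util.Arrays;

// Hypothetical sketch: all writes go through a BufferedOutputStream and readers
// force a flush first; thread safety relies only on `synchronized`.
class BufferedLogWriter {
    private final ByteArrayOutputStream channel = new ByteArrayOutputStream(); // stands in for the file channel
    private final OutputStream out = new BufferedOutputStream(channel, 8 * 1024); // arbitrary buffer size
    private long written = 0;

    // Appends the bytes and returns the offset at which they start.
    synchronized long add(byte[] data) {
        try {
            long offset = written;
            out.write(data);
            written += data.length;
            return offset;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Flushes buffered data so that the requested range can be read from the channel.
    synchronized byte[] read(long offset, int length) {
        try {
            out.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        byte[] all = channel.toByteArray();
        return Arrays.copyOfRange(all, (int) offset, (int) offset + length);
    }
}
```

Because both methods are synchronized on the same instance, a reader can never observe a location that was handed out by `add` but is still sitting in the buffer.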
Several IOExceptions are always wrapped in a NotSerializableWrapper, which is
annoying to read. These exceptions are important to get right across the network
and we should support the important ones that indicate problems on the filesystem.
This commit also adds general support for IOException to preserve the parent type
across the network if no specific type is serializable.
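One way to picture "preserving the parent type" is a lookup that walks up the class hierarchy until it finds a type that can be sent over the wire. The sketch below is an illustration only: `ExceptionTypeResolver`, `wireType`, and the contents of the registry are assumptions, not the actual serialization code.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch: pick the closest registered supertype for an exception.
class ExceptionTypeResolver {
    // Assumed registry of types that have a wire representation.
    private static final Set<Class<?>> SERIALIZABLE = new HashSet<>();
    static {
        SERIALIZABLE.add(FileNotFoundException.class);
        SERIALIZABLE.add(IOException.class);
    }

    // Walks up the superclass chain; returns null if nothing is registered,
    // in which case a caller would fall back to a generic wrapper.
    static Class<?> wireType(Throwable t) {
        for (Class<?> c = t.getClass(); c != null; c = c.getSuperclass()) {
            if (SERIALIZABLE.contains(c)) {
                return c;
            }
        }
        return null;
    }
}
```

In this sketch a `java.nio.file.NoSuchFileException`, which is not registered itself, resolves to `IOException`, so the receiving side still sees an I/O failure rather than an opaque wrapper.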
As a default in V2, the GeoPointField.stored option was set to true. Since this consumes disk space with no benefit, the default stored option is being reverted to false.
This commit restores logging the ShardRouting#shardId at the front of
the log messages in ShardStateAction. The reason for this is so that
shard-level log messages have the format "[component][node][shard]
message".
There are two bugs:
- the 'global_ordinals_low_cardinality' mode requires a fielddata-based impl so
that it can extract the segment to global ordinal mapping
- the 'global_ordinals_hash' mode abusively casts the values source to a
fielddata-based impl while it is not needed
Closes #14882
This commit modifies the handling of cluster states in
o.e.c.a.s.ShardStateAction so that all necessary state is obtained
externally to the ShardStateAction#shardFailed and
ShardStateAction#shardStarted methods. This refactoring permits the
removal of the ClusterService field from ShardStateAction.
This commit applies a minor code cleanup to
o/e/c/ClusterStateObserver.java. In particular
- employ the diamond operator instead of explicitly specifying a
generic type parameter
- use 'L' instead of 'l' for specifying a long literal
- remove redundant static modifier on a nested interface
- remove redundant public access modifiers on interface methods
- reformat the declaration of the four-argument ChangePredicate#apply
- simplify the bodies of ValidationPredicate#apply
This commit fixes multiField support for GeoPointFieldMapper by passing an externalValueContext to the multiField parser. Unit testing is added for multi field coverage.
The MapperService doesn't currently check the
index.mapper.dynamic setting during index creation,
so indices can be created with dynamic mappings even
if this setting is false. Add a check that throws an
exception in this case. Fixes #15381
DedicatedClusterSnapshotRestoreIT#testRestoreIndexWithMissingShards took ~1.5 min to finish
due to timeouts that are applied if not all shards are allocated. Now that the index that has
unallocated shards is not refreshed, the test is more reasonable and runs in 15 sec.
With this commit we check the result of a bulk request more
precisely. It could either succeed, fail or be rejected due to resource
constraints. Previously, we relied on never being rejected by
default.
However, rejection is a valid condition even when retrying. With this
commit we check that either we retried often enough that we were not
rejected, or, if we were rejected, that we exhausted the specified
number of retries.
When specifying a string field, you can either do:
```
{
  "foo": "bar"
}
```
or
```
{
  "foo": {
    "value": "bar",
    "boost": 42
  }
}
```
The latter option is now removed.
Closes #15388
Removes the pattern node.addShard() -> calculate weight -> node.removeShard(), which is expensive because, besides the map lookups, it invalidates the caching of precomputed values in ModelNode and ModelIndex. It is replaced by an additional parameter to the weight function that accounts for the added / removed shard.
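The idea of passing the pending change into the weight function, instead of mutating the node and undoing it afterwards, can be sketched as follows. The formula, the `WeightSketch` class, and all parameter names are illustrative assumptions and not the allocator's actual weight function.

```java
// Hypothetical sketch of a weight function that takes the simulated shard change as a parameter.
class WeightSketch {
    // delta is +1 to simulate adding a shard, -1 to simulate removing one, 0 for the current state.
    // shardFactor and indexFactor are assumed balance factors for this example.
    static float weight(float shardFactor, float indexFactor,
                        int shardsOnNode, float avgShardsPerNode,
                        int indexShardsOnNode, float avgIndexShardsPerNode,
                        int delta) {
        float nodeTerm = (shardsOnNode + delta) - avgShardsPerNode;
        float indexTerm = (indexShardsOnNode + delta) - avgIndexShardsPerNode;
        return shardFactor * nodeTerm + indexFactor * indexTerm;
    }
}
```

Since nothing is added to or removed from the node model, any values cached on the node stay valid while the candidate move is being evaluated.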
This removes the backward compatibility layer with pre-2.0 indices, notably
the extraction of _id, _routing or _timestamp from the source document when a
path is defined.
Today when dynamically mapping a field that is already defined in another type,
we use the regular dynamic mapping logic and try to copy some settings to avoid
introducing conflicts. However, this is quite fragile as we don't deal with every
existing setting. This proposes a different approach that will just reuse the
shared field type.
Close #15568
FunctionScoreQuery should do two things that it doesn't do today:
- propagate the two-phase iterator from the wrapped scorer so that things are
still executed efficiently, e.g. if a phrase or geo-distance query is wrapped
- filter out docs that don't have a high enough score using two-phase
iteration: this way the score is only checked when everything else matches
While doing these changes, I noticed that minScore was ignored when scores were
not needed and that explain did not take it into account, so I fixed these
issues as well.
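The benefit of checking the score only in the second phase can be shown with a toy model. This sketch does not use Lucene's TwoPhaseIterator; `MinScoreTwoPhase`, its parameters, and the functional interfaces standing in for the approximation and the scorer are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntPredicate;
import java.util.function.IntToDoubleFunction;

// Hypothetical sketch: the (expensive) score is computed only for documents
// that already passed the (cheap) approximation.
class MinScoreTwoPhase {
    static List<Integer> matches(int maxDoc, IntPredicate approximation,
                                 IntToDoubleFunction scorer, double minScore) {
        List<Integer> hits = new ArrayList<>();
        for (int doc = 0; doc < maxDoc; doc++) {
            if (!approximation.test(doc)) {
                continue; // phase 1: cheap check, no score computed
            }
            if (scorer.applyAsDouble(doc) >= minScore) {
                hits.add(doc); // phase 2: confirm the match by checking the score
            }
        }
        return hits;
    }
}
```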
This changes a couple of things:
Mappings are truly immutable. Before, each field mapper stored a
MappedFieldTypeReference that was shared across fields that have the same name
across types. This means that a mapping update could have the side-effect of
changing the field type in other types when updateAllTypes is true. This works
differently now: after a mapping update, a new copy of the mappings is created
in such a way that fields across different types have the same MappedFieldType.
See the new Mapper.updateFieldType API which replaces MappedFieldTypeReference.
DocumentMapper is now immutable and MapperService.merge has been refactored in
such a way that if an exception is thrown while e.g. lookup structures are being
updated, then the whole mapping update will be aborted. As a consequence,
FieldTypeLookup's checkCompatibility has been folded into copyAndAddAll.
Synchronization was simplified: given that mappings are truly immutable, we
don't need the read/write lock so that no documents can be parsed while a
mapping update is being processed. Document parsing is not performed under a
lock anymore, and mapping merging uses a simple synchronized block.
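The synchronization scheme described above (lock-free reads of an immutable snapshot, merges under a simple synchronized block, and an aborted update leaving everything untouched) can be sketched generically. `MappingRegistry` and its string-to-string "field types" are hypothetical simplifications, not MapperService.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: readers see an immutable snapshot, writers publish a new copy.
class MappingRegistry {
    private volatile Map<String, String> fieldTypes = Collections.emptyMap();

    // Lock-free read of the current immutable snapshot.
    String fieldType(String field) {
        return fieldTypes.get(field);
    }

    // Builds a new copy; if validation throws, the published snapshot is left untouched.
    synchronized void merge(Map<String, String> update) {
        Map<String, String> copy = new HashMap<>(fieldTypes);
        for (Map.Entry<String, String> e : update.entrySet()) {
            String existing = copy.get(e.getKey());
            if (existing != null && !existing.equals(e.getValue())) {
                throw new IllegalArgumentException("conflicting type for field [" + e.getKey() + "]");
            }
            copy.put(e.getKey(), e.getValue());
        }
        // Publish the new snapshot only after the whole update has been validated.
        fieldTypes = Collections.unmodifiableMap(copy);
    }
}
```

If an update contains both a new field and a conflicting one, the exception is thrown before the copy is published, so the new field does not become visible either.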
This adds the required changes/checks so that the build can run on
FreeBSD.
There are a few things that differ between FreeBSD and Linux:
- CPU probes return -1 for CPU usage
- `hot_threads` cannot be supported on FreeBSD
From OpenJDK's `os_bsd.cpp`:
```c++
bool os::is_thread_cpu_time_supported() {
#ifdef __APPLE__
  return true;
#else
  return false;
#endif
}
```
So this API now returns (for each FreeBSD node):
```
curl -s localhost:9200/_nodes/hot_threads
::: {Devil Hunter Gabriel}{q8OJnKCcQS6EB9fygU4R4g}{127.0.0.1}{127.0.0.1:9300}
hot_threads is not supported on FreeBSD
```
- multicast fails in native `join` method - known bug:
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=193246
Which causes:
```
1> Caused by: java.net.SocketException: Invalid argument
1> at java.net.PlainDatagramSocketImpl.join(Native Method)
1> at java.net.AbstractPlainDatagramSocketImpl.join(AbstractPlainDatagramSocketImpl.java:179)
1> at java.net.MulticastSocket.joinGroup(MulticastSocket.java:323)
1> at org.elasticsearch.plugin.discovery.multicast.MulticastChannel$Plain.buildMulticastSocket(MulticastChannel.java:309)
```
So these tests are skipped on FreeBSD.
Resolves #15562
In this commit we increase the queue size of the bulk pool in
BulkProcessorRetryIT to make it less sensitive.
This test case should stress the pool enough that the bulk processor needs to
back off, but not so much that the backoff policy gives up at
some point (which is a valid condition), so we still keep the queue size below the
default of 50.
Some tests, but in particular CodecTests, are slow because they test all
versions that ever existed even though they should only test supported
versions.
Today we throttle recoveries only for incoming recoveries. Nodes that have a lot
of primaries can get overloaded due to too many recoveries. To keep that at bay
we currently limit the number of threads that are sending files to the target.
The right solution here is to also throttle the outgoing recoveries, which are unbounded today, on
the master and not start a recovery until we have enough resources on both source and target nodes.
The concurrency aspects of the recovery source also added a lot of complexity and additional threadpools
that are hard to configure. This commit removes the concurrent streams notion completely and sends files
in the thread that drives the recovery, simplifying the recovery code considerably.
Outgoing recoveries are now throttled on the master via an allocation decider.
We added this undocumented realtime setting as a backup plan long ago,
but to date we haven't had a situation where it was a problem. It dramatically reduces
the number of file handles in the NRT case and should always be enabled.
Today the logic to async-commit the translog is in every translog instance
itself. While the setting is a per-index setting, we are managing it per shard. This
pollutes the translog code and can more easily be managed in IndexService.
Today we have two variants of translogs for indexing. We only recommend the buffered
one which also has a 20% advantage in indexing speed. This commit removes the option and defaults
to the buffered case. It also hard-wires the translog buffer to 8kb instead of 64kb. We used to
adjust that buffer based on whether the shard is active or not; this code has also been removed and
instead we just keep an 8kb buffer around.
This commit removes `index.translog.flush_threshold_ops` and `index.translog.disable_flush`
in favor of `index.translog.flush_threshold_size`. The number of operations is meaningless by itself and
can easily be turned into a size value with knowledge of the data. Disabling the flush is only useful in
tests and we can set the size value to a really high value. If users really need to do this they can
also apply a very high value like `1PB`.
DocumentMapperParser has both parse and parseCompressed methods, but the
parse methods are ONLY used from the unit tests. This commit removes the parse
method and moves all tests to parseCompressed so that they test more
realistically how mappings are managed.
Then I renamed parseCompressed to parse given that this is the only alternative
anyway.
Resolves conflicts between parent routing and alias routing with the following rule:
* The parent routing is ignored if there is an alias routing that matches the request.
Closes #3068
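The conflict rule stated in this change is small enough to express directly. The sketch below only restates the rule; `RoutingResolver` and its `resolve` method are hypothetical names, and `null` is assumed to mean "no routing of this kind applies to the request".

```java
// Hypothetical sketch of the rule: alias routing, when it applies, wins over parent routing.
class RoutingResolver {
    static String resolve(String aliasRouting, String parentRouting) {
        if (aliasRouting != null) {
            return aliasRouting; // the parent routing is ignored
        }
        return parentRouting; // may be null when neither is set
    }
}
```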
With this commit we implement a cancellation policy in
BulkProcessor which is aligned for the sync and the async case
and also document it.
Closes #14833.
It's important to close no matter what exception caused a tragic event. Today
we only check on IOException and AlreadyClosedExceptions. The test had a bug and
threw an IAE instead causing the translog not to be closed.
This commit addresses a potential race condition in
ClusterServiceIT#testClusterStateBatchedUpdates. The potential race
condition is that the main test thread could be released to execute the
final test assertions before the cluster state publication callbacks had
completed, thereby causing a situation where the test assertions could
be executed before the final test state had been realized.
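A common way to close this kind of race in a test is to have the main thread wait on a latch that is counted down only after each callback has finished. The commit message does not say which mechanism was used, so the sketch below is an assumption; `CallbackBarrier` and `runAndAwait` are hypothetical names and the pool size and timeout are arbitrary.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: the test thread waits until every callback has run before asserting.
class CallbackBarrier {
    static int runAndAwait(int callbacks) {
        AtomicInteger completed = new AtomicInteger();
        CountDownLatch done = new CountDownLatch(callbacks);
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            for (int i = 0; i < callbacks; i++) {
                pool.execute(() -> {
                    try {
                        completed.incrementAndGet(); // stands in for the publication callback
                    } finally {
                        done.countDown(); // signalled only after the callback has finished
                    }
                });
            }
            // Assertions must not run before all callbacks have completed.
            if (!done.await(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("callbacks did not complete in time");
            }
            return completed.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```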
Provides a new flag which can be enabled on a per-request basis.
When `"profile": true` is set, the search request will execute in a
mode that collects detailed timing information for query components.
```
GET /test/test/_search
{
  "profile": true,
  "query": {
    "match": {
      "foo": "bar"
    }
  }
}
```
Closes #14889
Squashed commit of the following:
commit a92db5723d2c61b8449bd163d2f006d12f9889ad
Merge: 117dd99 3f87b08
Author: Zachary Tong <zacharyjtong@gmail.com>
Date: Thu Dec 17 09:44:10 2015 -0500
Merge remote-tracking branch 'upstream/master' into query_profiler
commit 117dd9992e8014b70203c6110925643769d80e62
Merge: 9b29d68 82a64fd
Author: Zachary Tong <zacharyjtong@gmail.com>
Date: Tue Dec 15 13:27:18 2015 -0500
Merge remote-tracking branch 'upstream/master' into query_profiler
Conflicts:
core/src/main/java/org/elasticsearch/search/SearchService.java
commit 9b29d6823a71140ecd872df25ff9f7478e7fe766
Author: Zachary Tong <zacharyjtong@gmail.com>
Date: Mon Dec 14 16:13:23 2015 -0500
[TEST] Profile flag needs to be set, ensure searches go against primary only for consistency
commit 4d602d8ad1f8cbc7b475450921fa3bc7d395b34f
Merge: 8b48e87 7742c1e
Author: Zachary Tong <zacharyjtong@gmail.com>
Date: Mon Dec 14 10:56:25 2015 -0500
Merge remote-tracking branch 'upstream/master' into query_profiler
commit 8b48e876348b163ab730eeca7fa35142165b05f9
Author: Zachary Tong <zacharyjtong@gmail.com>
Date: Mon Dec 14 10:56:01 2015 -0500
Delegate straight to in.matchCost, no need for profiling
commit fde3b0587911f0b5f15e779c671d0510cbd568a9
Author: Zachary Tong <zacharyjtong@gmail.com>
Date: Mon Dec 14 10:28:23 2015 -0500
Documentation tweaks, renaming build_weight -> create_weight
commit 46f5e011ee23fe9bb8a1f11ceb4fa9d19fe48e2e
Author: Zachary Tong <zacharyjtong@gmail.com>
Date: Mon Dec 14 10:27:52 2015 -0500
Profile TwoPhaseIterator should override matchCost()
commit b59f894ddb11b2a7beebba06c4ec5583ff91a7b2
Merge: 9aa1a3a b4e0c87
Author: Zachary Tong <zacharyjtong@gmail.com>
Date: Wed Dec 9 14:23:26 2015 -0500
Merge remote-tracking branch 'upstream/master' into query_profiler
commit 9aa1a3a25c34c9cd9fffaa6114c25a0ec791307d
Author: Zachary Tong <zacharyjtong@gmail.com>
Date: Wed Dec 9 13:41:48 2015 -0500
Revert "Move some of the collector wrapping logic into ProfileCollectorBuilder"
This reverts commit 02cc31767fb76a7ecd44a302435e93a05fb4220e.
commit 57f7c04cea66b3f98ba2bec4879b98e4fba0b3c0
Author: Zachary Tong <zacharyjtong@gmail.com>
Date: Wed Dec 9 13:41:31 2015 -0500
Revert "Rearrange if/else to make intent clearer"
This reverts commit 59b63c533fcaddcdfe4656e86a6f6c4cb1bc4a00.
commit 2874791b9c9cd807113e75e38be465f3785c154e
Author: Zachary Tong <zacharyjtong@gmail.com>
Date: Wed Dec 9 13:38:13 2015 -0500
Revert "Move state into ProfileCollectorBuilder"
This reverts commit 0bb3ee0dd96170b06f07ec9e2435423d686a5ae6.
commit 0bb3ee0dd96170b06f07ec9e2435423d686a5ae6
Author: Zachary Tong <zacharyjtong@gmail.com>
Date: Thu Dec 3 11:21:55 2015 -0500
Move state into ProfileCollectorBuilder
commit 59b63c533fcaddcdfe4656e86a6f6c4cb1bc4a00
Author: Zachary Tong <zacharyjtong@gmail.com>
Date: Wed Dec 2 17:21:12 2015 -0500
Rearrange if/else to make intent clearer
commit 511db0af2f3a86328028b88a6b25fa3dfbab963b
Author: Zachary Tong <zacharyjtong@gmail.com>
Date: Wed Dec 2 17:12:06 2015 -0500
Rename WEIGHT -> BUILD_WEIGHT
commit 02cc31767fb76a7ecd44a302435e93a05fb4220e
Author: Zachary Tong <zacharyjtong@gmail.com>
Date: Wed Dec 2 17:11:22 2015 -0500
Move some of the collector wrapping logic into ProfileCollectorBuilder
commit e69356d3cb4c60fa281dad36d84faa64f5c32bc4
Author: Zachary Tong <zacharyjtong@gmail.com>
Date: Mon Nov 30 15:12:35 2015 -0500
Cleanup imports
commit c1b4f284f16712be60cd881f7e4a3e8175667d62
Author: Zachary Tong <zacharyjtong@gmail.com>
Date: Mon Nov 30 15:11:25 2015 -0500
Review cleanup: Make InternalProfileShardResults writeable
commit 9e61c72f7e1787540f511777050a572b7d297636
Author: Zachary Tong <zacharyjtong@gmail.com>
Date: Mon Nov 30 15:01:22 2015 -0500
Review cleanup: Merge ProfileShardResult, InternalProfileShardResult. Convert to writeable
commit 709184e1554f567c645690250131afe8568a5799
Author: Zachary Tong <zacharyjtong@gmail.com>
Date: Mon Nov 30 14:38:08 2015 -0500
Review cleanup: Merge ProfileResult, InternalProfileResult. Convert to writeable
commit 7d72690c44f626c34e9c608754bc7843dd7fd8fe
Author: Zachary Tong <zacharyjtong@gmail.com>
Date: Mon Nov 30 14:01:34 2015 -0500
Review cleanup: use primitive (and default) for profile flag
commit 97d557388541bbd3388cdcce7d9718914d88de6d
Author: Zachary Tong <zacharyjtong@gmail.com>
Date: Mon Nov 30 13:09:12 2015 -0500
Review cleanup: Use Collections.emptyMap() instead of building an empty one explicitly
commit 219585b8729a8b0982e653d99eb959efd0bef84e
Author: Zachary Tong <zacharyjtong@gmail.com>
Date: Mon Nov 30 13:08:12 2015 -0500
Add todo to revisit profiler architecture in the future
commit b712edb2160e032ee4b2f2630fadf131a0936886
Author: Zachary Tong <zacharyjtong@gmail.com>
Date: Mon Nov 30 13:05:32 2015 -0500
Split collector serialization from timing, use writeable instead of streamable
Previously, the collector timing was done in the same class that was serialized, which required
leaving the collector null when serializing. Besides being a bit gross, this made it difficult to
change the class to Writeable.
This splits up the timing (InternalProfileCollector + ProfileCollector) and the serialization of
the times (CollectorResult). CollectorResult is writeable, and also acts as the public interface.
commit 6ddd77d066262d4400e3d338b11cebe7dd27ca78
Author: Zachary Tong <zacharyjtong@gmail.com>
Date: Wed Nov 25 13:15:12 2015 -0500
Remove dead code
commit 06033f8a056e2121d157654a65895c82bbe93a51
Author: Zachary Tong <zacharyjtong@gmail.com>
Date: Wed Nov 25 12:49:51 2015 -0500
Review cleanup: Delegate to in.getProfilers()
Note: Need to investigate how this is used exactly so we can add a test, it isn't touched by a
normal inner_hits query...
commit a77e13da21b4bad1176ca2b5d5b76034fb12802f
Author: Zachary Tong <zacharyjtong@gmail.com>
Date: Wed Nov 25 11:59:58 2015 -0500
Review cleanup: collapse to single `if` statement
commit e97bb6215a5ebb508b0293ac3acd60d5ae479be1
Author: Zachary Tong <zacharyjtong@gmail.com>
Date: Wed Nov 25 11:39:43 2015 -0500
Review cleanup: Return empty map instead of null for profile results
Note: we still need to check for nullness in SearchPhaseController, since an empty/no-hits result
won't have profiling instantiated (or any other component like aggs or suggest). Therefore
QuerySearchResult.profileResults() is still @Nullable
commit db8e691de2a727389378b459fa76c942572e6015
Author: Zachary Tong <zacharyjtong@gmail.com>
Date: Wed Nov 25 10:14:47 2015 -0500
Review cleanup: renaming, formatting fixes, assertions
commit 9011775fe80ba22c2fd948ca64df634b4e32772d
Author: Zachary Tong <zacharyjtong@gmail.com>
Date: Thu Nov 19 20:09:52 2015 -0500
[DOCS] Add missing annotation
commit 4b58560b06f08d4b99b149af20916ee839baabd7
Author: Zachary Tong <zacharyjtong@gmail.com>
Date: Thu Nov 19 20:07:17 2015 -0500
[DOCS] Update documentation for new format
commit f0458c58e5538ed8ec94849d4baf3250aa9ec841
Author: Adrien Grand <jpountz@gmail.com>
Date: Tue Nov 17 10:14:09 2015 +0100
Reduce visibility of internal classes.
commit d0a7d319098e60b028fa772bf8a99b2df9cf6146
Merge: e158070 1bdf29e
Author: Adrien Grand <jpountz@gmail.com>
Date: Tue Nov 17 10:09:18 2015 +0100
Merge branch 'master' into query_profiler
commit e158070a48cb096551f3bb3ecdcf2b53bbc5e3c5
Author: Adrien Grand <jpountz@gmail.com>
Date: Tue Nov 17 10:08:48 2015 +0100
Fix compile error due to bad html in javadocs.
commit a566b5d08d659daccb087a9afbe908ec3d96cd6e
Author: Zachary Tong <zacharyjtong@gmail.com>
Date: Mon Nov 16 17:48:37 2015 -0500
Remove unused collector
commit 4060cd72d150cc68573dbde62ca7321c47f75703
Author: Zachary Tong <zacharyjtong@gmail.com>
Date: Mon Nov 16 17:48:10 2015 -0500
Comment cleanup
commit 43137952bf74728f5f5d5a8d1bfc073e0f9fe4f9
Author: Zachary Tong <zacharyjtong@gmail.com>
Date: Mon Nov 16 17:32:06 2015 -0500
Fix negative formatted time
commit 5ef3a980266326aff12d4fe380f73455ff28209f
Author: Adrien Grand <jpountz@gmail.com>
Date: Fri Nov 13 17:10:17 2015 +0100
Fix javadocs.
commit 276114d29e4b17a0cc0982cfff51434f712dc59e
Author: Adrien Grand <jpountz@gmail.com>
Date: Fri Nov 13 16:25:23 2015 +0100
Fix: include rewrite time as well...
commit 21d9e17d05487bf4903ae3d2ab6f429bece2ffef
Author: Adrien Grand <jpountz@gmail.com>
Date: Fri Nov 13 15:10:15 2015 +0100
Remove TODO about profiling explain.
commit 105a31e8e570efb879447159c3852871f5cf7db4
Author: Adrien Grand <jpountz@gmail.com>
Date: Fri Nov 13 14:59:30 2015 +0100
Fix nocommit now that the global collector is a root collector.
commit 2e8fc5cf84adb1bfaba296808c329e5f982c9635
Author: Adrien Grand <jpountz@gmail.com>
Date: Fri Nov 13 14:53:38 2015 +0100
Make collector wrapping more explicit/robust (and a bit less magical).
commit 5e30b570b0835e1ce79a57933a31b6a2d0d58e2d
Author: Adrien Grand <jpountz@gmail.com>
Date: Fri Nov 13 12:44:03 2015 +0100
Simplify recording API a bit.
commit 9b453afced6adc0a59ca1d67d90c28796b105185
Author: Adrien Grand <jpountz@gmail.com>
Date: Fri Nov 13 10:54:25 2015 +0100
Fix serialization-related nocommits.
commit ad97b200bb123d4e9255e7c8e02f7e43804057a5
Author: Adrien Grand <jpountz@gmail.com>
Date: Fri Nov 13 10:46:30 2015 +0100
Fix DFS.
commit a6de06986cd348a831bd45e4f524d2e14d9e03c3
Author: Adrien Grand <jpountz@gmail.com>
Date: Thu Nov 12 19:29:16 2015 +0100
Remove forbidden @Test annotation.
commit 4991a28e19501109af98026e14756cb25a56f4f4
Author: Adrien Grand <jpountz@gmail.com>
Date: Thu Nov 12 19:25:59 2015 +0100
Limit the impact of query profiling on the SearchContext API.
Rule is: we can put as much as we want in the search.profile package but should
aim at touching as little as possible other areas of the code base.
commit 353d8d75a5ce04d9c3908a0a63d4ca6e884c519a
Author: Adrien Grand <jpountz@gmail.com>
Date: Thu Nov 12 18:05:09 2015 +0100
Remove dead code.
commit a3ffafb5ddbb5a2acf43403c946e5ed128f47528
Author: Adrien Grand <jpountz@gmail.com>
Date: Thu Nov 12 15:30:35 2015 +0100
Remove call to forbidden String.toLowerCase() API.
commit 1fa8c7a00324fa4e32bd24135ebba5ecf07606f1
Author: Adrien Grand <jpountz@gmail.com>
Date: Thu Nov 12 15:30:27 2015 +0100
Fix compilation.
commit 2067f1797e53bef0e1a8c9268956bc5fb8f8ad97
Merge: 22e631f fac472f
Author: Adrien Grand <jpountz@gmail.com>
Date: Thu Nov 12 15:21:12 2015 +0100
Merge branch 'master' into query_profiler
commit 22e631fe6471fed19236578e97c628d5cda401a9
Author: Zachary Tong <zacharyjtong@gmail.com>
Date: Tue Nov 3 18:52:05 2015 -0500
Fix and simplify serialization of shard profile results
commit 461da250809451cd2b47daf647343afbb4b327f2
Author: Zachary Tong <zacharyjtong@gmail.com>
Date: Tue Nov 3 18:32:22 2015 -0500
Remove helper methods, simpler without them
commit 5687aa1c93d45416d895c2eecc0e6a6b302139f2
Author: Zachary Tong <zacharyjtong@gmail.com>
Date: Tue Nov 3 18:29:32 2015 -0500
[TESTS] Fix tests for new rewrite format
commit ba9e82857fc6d4c7b72ef4d962d2102459365299
Author: Zachary Tong <zacharyjtong@gmail.com>
Date: Fri Oct 30 15:28:14 2015 -0400
Rewrites begone! Record all rewrites as a single time metric
commit 5f28d7cdff9ee736651d564f71f713bf45fb1d91
Author: Zachary Tong <zacharyjtong@gmail.com>
Date: Thu Oct 29 15:36:06 2015 -0400
Merge duplicate rewrites into one entry
By using the Query as the key in a map, we can easily merge rewrites together. This means
the preProcess(), assertion and main query rewrites all get merged together. Downside is that
rewrites of the same Query (hashcode) but in different places get lumped together. I think the
simplicity of the solution is better than the slight loss in output fidelity.
commit 9a601ea46bb21052746157a45dcc6de6bc350e9c
Author: Zachary Tong <zacharyjtong@gmail.com>
Date: Thu Oct 29 15:28:27 2015 -0400
Allow multiple "searches" per profile (e.g. query + global agg)
commit ee30217328381cd83f9e653d3a4d870c1d2bdfce
Author: Zachary Tong <zacharyjtong@gmail.com>
Date: Thu Oct 29 11:29:18 2015 -0400
Update comment, add nullable annotation
commit 405c6463a64e118f170959827931e8c6a1661f13
Author: Zachary Tong <zacharyjtong@gmail.com>
Date: Thu Oct 29 11:04:30 2015 -0400
remove out-dated comment
commit 2819ae8f4cf1bfd5670dbd1c0e06195ae457b58f
Author: Adrien Grand <jpountz@gmail.com>
Date: Tue Oct 27 19:50:47 2015 +0100
Don't render children in the profiles when there are no children.
commit 7677c2ddefef321bbe74660471603d202a4ab66f
Author: Adrien Grand <jpountz@gmail.com>
Date: Tue Oct 27 19:50:35 2015 +0100
Set the profiler on the ContextIndexSearcher.
commit 74a4338c35dfed779adc025ec17cfd4d1c9f66f5
Author: Adrien Grand <jpountz@gmail.com>
Date: Tue Oct 27 19:50:01 2015 +0100
Fix json rendering.
commit 6674d5bebe187b0b0d8b424797606fdf2617dd27
Author: Adrien Grand <jpountz@gmail.com>
Date: Tue Oct 27 19:20:19 2015 +0100
Revert "nocommit - profiling never enabled because setProfile() on ContextIndexSearcher never called"
This reverts commit d3dc10949024342055f0d4fb7e16c7a43423bfab.
commit d3dc10949024342055f0d4fb7e16c7a43423bfab
Author: Zachary Tong <zacharyjtong@gmail.com>
Date: Fri Oct 23 17:20:57 2015 -0400
nocommit - profiling never enabled because setProfile() on ContextIndexSearcher never called
Previously, it was enabled by using DefaultSearchContext as a third-party "proxy", but since
the refactor to make it unit testable, setProfile() needs to be called explicitly. Unfortunately,
this is not possible because SearchService only has access to an IndexSearcher. And it is not
cast'able to a DefaultIndexSearcher.
commit b9ba9c5d1f93b9c45e97b0a4e35da6f472c9ea53
Author: Zachary Tong <zacharyjtong@gmail.com>
Date: Fri Oct 23 16:27:00 2015 -0400
[TESTS] Fix unit tests
commit cf5d1e016b2b4a583175e07c16c7152f167695ce
Author: Zachary Tong <zacharyjtong@gmail.com>
Date: Fri Oct 23 16:22:34 2015 -0400
Increment token after recording a rewrite
commit b7d08f64034e498533c4a81bff8727dd8ac2843e
Author: Zachary Tong <zacharyjtong@gmail.com>
Date: Fri Oct 23 16:14:09 2015 -0400
Fix NPE if a top-level root doesn't have children
commit e4d3b514bafe2a3a9db08438c89f0ed68628f2d6
Author: Zachary Tong <zacharyjtong@gmail.com>
Date: Fri Oct 23 16:05:47 2015 -0400
Fix NPE when profiling is disabled
commit 445384fe48ed62fdd01f7fc9bf3e8361796d9593
Author: Zachary Tong <zacharyjtong@gmail.com>
Date: Fri Oct 23 16:05:37 2015 -0400
[TESTS] Fix integration tests
commit b478296bb04fece827a169e7522df0a5ea7840a3
Author: Zachary Tong <zacharyjtong@gmail.com>
Date: Fri Oct 23 15:43:24 2015 -0400
Move rewrites to their own section, remove reconciliation
Big commit because the structural change affected a lot of the wrapping code. Major changes:
- Rewrites are now in their own section in the response
- Reconciliation is gone...we don't attempt to roll the rewrites into the query tree structure
- InternalProfileShardResults (plural) simply holds a Map<String, InternalProfileShardResult> and
helps to serialize / ToXContent
- InternalProfileShardResult (singular) is now the holder for all shard-level profiling details. Currently
this includes query, collectors and rewrite. In the future it would hold suggest, aggs, etc
- When the user requests the profiled results, they get back a Map<String, ProfileShardResult>
instead of doing silly helper methods to convert to maps, etc
- Shard details are baked into a string instead of serializing the object
commit 24819ad094b208d0e94f17ce9c3f7c92f7414124
Author: Zachary Tong <zacharyjtong@gmail.com>
Date: Fri Oct 23 10:25:38 2015 -0400
Make Profile results immutable by removing relative_time
commit bfaf095f45fed74194ef78160a8e5dcae1850f9e
Author: Adrien Grand <jpountz@gmail.com>
Date: Fri Oct 23 10:54:59 2015 +0200
Add nocommits.
commit e9a128d0d26d5b383b52135ca886f2c987850179
Author: Adrien Grand <jpountz@gmail.com>
Date: Fri Oct 23 10:39:37 2015 +0200
Move all profile-related classes to the same package.
commit f20b7c7fdf85384ecc37701bb65310fb8c20844f
Author: Adrien Grand <jpountz@gmail.com>
Date: Fri Oct 23 10:33:14 2015 +0200
Reorganize code a bit to ease unit testing of ProfileCollector.
commit 3261306edad6a0c70f59eaee8fe58560f61a75fd
Author: Adrien Grand <jpountz@gmail.com>
Date: Thu Oct 22 18:07:28 2015 +0200
Remove irrelevant nocommit.
commit a6ac868dad12a2e17929878681f66dbd0948d322
Author: Adrien Grand <jpountz@gmail.com>
Date: Thu Oct 22 18:06:45 2015 +0200
Make CollectorResult's reason a free-text field to ease bw compat.
commit 5d0bf170781a950d08b81871cd1e403e49f3cc12
Author: Adrien Grand <jpountz@gmail.com>
Date: Thu Oct 22 16:50:52 2015 +0200
Add unit tests for ProfileWeight/ProfileScorer.
commit 2cd88c412c6e62252504ef69a59216adbb574ce4
Author: Adrien Grand <jpountz@gmail.com>
Date: Thu Oct 22 15:55:17 2015 +0200
Rename InternalQueryProfiler to Profiler.
commit 84f5718fa6779f710da129d9e0e6ff914fd85e36
Author: Adrien Grand <jpountz@gmail.com>
Date: Thu Oct 22 15:53:58 2015 +0200
Merge InternalProfileBreakdown into ProfileBreakdown.
commit 135168eaeb8999c8117ea25288104b0961ce9b35
Author: Adrien Grand <jpountz@gmail.com>
Date: Thu Oct 22 13:56:57 2015 +0200
Make it possible to instantiate a ContextIndexSearcher without SearchContext.
commit 5493fb52376b48460c4ce2dedbe00cc5f6620499
Author: Adrien Grand <jpountz@gmail.com>
Date: Thu Oct 22 11:53:29 2015 +0200
Move ProfileWeight/Scorer to their own files.
commit bf2d917b9dae3b32dfc29c35a7cac4ccb7556cce
Author: Adrien Grand <jpountz@gmail.com>
Date: Thu Oct 22 11:38:24 2015 +0200
Fix bug that caused phrase queries to fail.
commit b2bb0c92c343334ec1703a221af24a1b55e36d53
Author: Adrien Grand <jpountz@gmail.com>
Date: Thu Oct 22 11:36:17 2015 +0200
Parsing happens on the coordinating node now.
commit 416cabb8621acb5cd8dfa77374fd23e428f52fe9
Author: Adrien Grand <jpountz@gmail.com>
Date: Thu Oct 22 11:22:17 2015 +0200
Fix compilation (in particular remove guava deps).
commit f996508645f842629d403fc2e71c1890c0e2cac9
Merge: 4616a25 bc3b91e
Author: Adrien Grand <jpountz@gmail.com>
Date: Thu Oct 22 10:44:38 2015 +0200
Merge branch 'master' into query_profiler
commit 4616a25afffe9c24c6531028f7fccca4303d2893
Author: Zachary Tong <zacharyjtong@gmail.com>
Date: Tue Oct 20 12:11:32 2015 -0400
Make Java Count API compatible with profiling
commit cbfba74e16083d719722500ac226efdb5cb2ff55
Author: Zachary Tong <zacharyjtong@gmail.com>
Date: Tue Oct 20 12:11:19 2015 -0400
Fix serialization of profile query param, NPE
commit e33ffac383b03247046913da78c8a27e457fae78
Author: Zachary Tong <zacharyjtong@gmail.com>
Date: Tue Oct 20 11:17:48 2015 -0400
TestSearchContext should return null Profiler instead of exception
commit 73a02d69b466dc1a5b8a5f022464d6c99e6c2ac3
Author: Zachary Tong <zacharyjtong@gmail.com>
Date: Mon Oct 19 12:07:29 2015 -0400
[DOCS] Update docs to reflect new ID format
commit 36248e388c354f954349ecd498db7b66f84ce813
Author: Zachary Tong <zacharyjtong@gmail.com>
Date: Mon Oct 19 12:03:03 2015 -0400
Use the full [node][index][shard] string as profile result ID
commit 5cfcc4a6a6b0bcd6ebaa7c8a2d0acc32529a80e1
Author: Zachary Tong <zacharyjtong@gmail.com>
Date: Thu Oct 15 17:51:40 2015 -0400
Add failing test for phrase matching
Stack trace generated:
[2015-10-15 17:50:54,438][ERROR][org.elasticsearch.search.profile] shard [[JNj7RX_oSJikcnX72aGBoA][test][2]], reason [RemoteTransportException[[node_s0][local[1]][indices:data/read/search[phase/query]]]; nested: QueryPhaseExecutionException[Query Failed [Failed to execute main query]]; nested: AssertionError[nextPosition() called more than freq() times!]; ], cause [java.lang.AssertionError: nextPosition() called more than freq() times!
at org.apache.lucene.index.AssertingLeafReader$AssertingPostingsEnum.nextPosition(AssertingLeafReader.java:353)
at org.apache.lucene.search.ExactPhraseScorer.phraseFreq(ExactPhraseScorer.java:132)
at org.apache.lucene.search.ExactPhraseScorer.access$000(ExactPhraseScorer.java:27)
at org.apache.lucene.search.ExactPhraseScorer$1.matches(ExactPhraseScorer.java:69)
at org.elasticsearch.common.lucene.search.ProfileQuery$ProfileScorer$2.matches(ProfileQuery.java:226)
at org.apache.lucene.search.ConjunctionDISI$TwoPhaseConjunctionDISI.matches(ConjunctionDISI.java:175)
at org.apache.lucene.search.ConjunctionDISI$TwoPhase.matches(ConjunctionDISI.java:213)
at org.apache.lucene.search.ConjunctionDISI.doNext(ConjunctionDISI.java:128)
at org.apache.lucene.search.ConjunctionDISI.nextDoc(ConjunctionDISI.java:151)
at org.apache.lucene.search.ConjunctionScorer.nextDoc(ConjunctionScorer.java:62)
at org.elasticsearch.common.lucene.search.ProfileQuery$ProfileScorer$1.nextDoc(ProfileQuery.java:205)
at org.apache.lucene.search.Weight$DefaultBulkScorer.scoreAll(Weight.java:224)
at org.apache.lucene.search.Weight$DefaultBulkScorer.score(Weight.java:169)
at org.apache.lucene.search.BulkScorer.score(BulkScorer.java:39)
at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:795)
at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:509)
at org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:347)
at org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:111)
at org.elasticsearch.search.SearchService.loadOrExecuteQueryPhase(SearchService.java:366)
at org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:378)
at org.elasticsearch.search.action.SearchServiceTransportAction$SearchQueryTransportHandler.messageReceived(SearchServiceTransportAction.java:368)
at org.elasticsearch.search.action.SearchServiceTransportAction$SearchQueryTransportHandler.messageReceived(SearchServiceTransportAction.java:365)
at org.elasticsearch.transport.local.LocalTransport$2.doRun(LocalTransport.java:280)
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
commit 889fe6383370fe919aaa9f0af398e3040209e40b
Author: Zachary Tong <zacharyjtong@gmail.com>
Date: Thu Oct 15 17:30:38 2015 -0400
[DOCS] More docs
commit 89177965d031d84937753538b88ea5ebae2956b0
Author: Zachary Tong <zacharyjtong@gmail.com>
Date: Thu Oct 15 09:59:09 2015 -0400
Fix multi-stage rewrites to recursively find most appropriate descendant rewrite
Previously, we chose the first rewrite that matched. But in situations where a query may
rewrite several times, this won't build the tree correctly. Instead we need to recurse
down all the rewrites until we find the most appropriate "leaf" rewrite
The implementation of this is kinda gross: we recursively call getRewrittenParentToken(),
which does a linear scan over the rewriteMap and tries to find rewrites with a larger token
value (since we know child tokens are always larger). Can almost certainly find a better
way to do this...
commit 0b4d782b5348e5d03fd26f7d91bc4a3fbcb7f6a5
Author: Zachary Tong <zacharyjtong@gmail.com>
Date: Wed Oct 14 19:30:06 2015 -0400
[Docs] Documentation checkpoint
commit 383636453f6610fcfef9070c21ae7ca11346793e
Author: Zachary Tong <zacharyjtong@gmail.com>
Date: Wed Sep 16 16:02:22 2015 -0400
Comments
commit a81e8f31e681be16e89ceab9ba3c3e0a018f18ef
Author: Zachary Tong <zacharyjtong@gmail.com>
Date: Wed Sep 16 15:48:49 2015 -0400
[TESTS] Ensure all tests use QUERY_THEN_FETCH, DFS does not profile
commit 1255c2d790d85fcb9cbb78bf2a53195138c6bc24
Author: Zachary Tong <zacharyjtong@gmail.com>
Date: Tue Sep 15 16:43:46 2015 -0400
Refactor rewrite handling to handle identical rewrites
commit 85b7ec82eb0b26a6fe87266b38f5f86f9ac0c44f
Author: Zachary Tong <zacharyjtong@gmail.com>
Date: Tue Sep 15 08:51:14 2015 -0400
Don't update parent when a token is added as root -- Fixes NPE
commit 109d02bdbc49741a3b61e8624521669b0968b839
Author: Zachary Tong <zacharyjtong@gmail.com>
Date: Tue Sep 15 08:50:40 2015 -0400
Don't set the rewritten query if not profiling -- Fixes NPE
commit 233cf5e85f6f2c39ed0a2a33d7edd3bbd40856e8
Author: Zachary Tong <zacharyjtong@gmail.com>
Date: Mon Sep 14 18:04:51 2015 -0400
Update tests to new response format
commit a930b1fc19de3a329abc8ffddc6711c1246a4b15
Author: Zachary Tong <zacharyjtong@gmail.com>
Date: Mon Sep 14 18:03:58 2015 -0400
Fix serialization
commit 69afdd303660510c597df9bada5531b19d134f3d
Author: Zachary Tong <zacharyjtong@gmail.com>
Date: Mon Sep 14 15:11:31 2015 -0400
Comments and cleanup
commit 64e7ca7f78187875378382ec5d5aa2462ff71df5
Author: Zachary Tong <zacharyjtong@gmail.com>
Date: Mon Sep 14 14:40:21 2015 -0400
Move timing into dedicated class, add proper rewrite integration
commit b44ff85ddbba0a080e65f2e7cc8c50d30e95df8e
Author: Zachary Tong <zacharyjtong@gmail.com>
Date: Mon Sep 14 12:00:38 2015 -0400
Checkpoint - Refactoring to use a token-based dependency tree
commit 52cedd5266d6a87445c6a4cff3be8ff2087cd1b7
Author: Zachary Tong <zacharyjtong@gmail.com>
Date: Fri Sep 4 19:18:19 2015 -0400
Need to set context profiling flag before calling queryPhase.preProcess
commit c524670cb1ce29b4b3a531fa2bff0c403b756f46
Author: Adrien Grand <jpountz@gmail.com>
Date: Fri Sep 4 18:00:37 2015 +0200
Reduce profiling overhead a bit.
This removes hash-table lookups every time we start/stop a profiling clock.
commit 111444ff8418737082236492b37321fc96041e09
Author: Adrien Grand <jpountz@gmail.com>
Date: Fri Sep 4 16:18:59 2015 +0200
Add profiling of two-phase iterators.
This is useful for eg. phrase queries or script filters, since they are
typically consumed through their two-phase iterator instead of the scorer.
commit f275e690459e73211bc8494c6de595c0320f4c0b
Author: Adrien Grand <jpountz@gmail.com>
Date: Fri Sep 4 16:03:21 2015 +0200
Some more improvements.
I changed profiling to disable bulk scoring, since it makes it impossible to
know where time is spent. Also I removed profiling of operations that are
always fast (eg. normalization) and added nextDoc/advance.
commit 3c8dcd872744de8fd76ce13b6f18f36f8de44068
Author: Adrien Grand <jpountz@gmail.com>
Date: Fri Sep 4 14:39:50 2015 +0200
Remove println.
commit d68304862fb38a3823aebed35a263bd9e2176c2f
Author: Adrien Grand <jpountz@gmail.com>
Date: Fri Sep 4 14:36:03 2015 +0200
Fix some test failures introduced by the rebase...
commit 04d53ca89fb34b7a21515d770c32aaffcc513b90
Author: Adrien Grand <jpountz@gmail.com>
Date: Fri Sep 4 13:57:35 2015 +0200
Reconcile conflicting changes after rebase
commit fed03ec8e2989a0678685cd6c50a566cec42ea4f
Author: Zachary Tong <zacharyjtong@gmail.com>
Date: Thu Aug 20 22:40:39 2015 -0400
Add Collectors to profile results
Profile response element has now been re-arranged so that everything is listed per-shard to
facilitate grouping elements together. The new `collector` element looks like this:
```
"profile": {
"shards": [
{
"shard_id": "keP4YFywSXWALCl4m4k24Q",
"query": [...],
"collector": [
{
"name": "MultiCollector",
"purpose": "search_multi",
"time": "16.44504400ms",
"relative_time": "100.0000000%",
"children": [
{
"name": "FilteredCollector",
"purpose": "search_post_filter",
"time": "4.556013000ms",
"relative_time": "27.70447437%",
"children": [
{
"name": "SimpleTopScoreDocCollector",
"purpose": "search_sorted",
"time": "1.352166000ms",
"relative_time": "8.222331299%",
"children": []
}
]
},
{
"name": "BucketCollector: [[non_global_term, another_agg]]",
"purpose": "aggregation",
"time": "10.40379400ms",
"relative_time": "63.26400829%",
"children": []
},
...
```
commit 1368b495c934be642c00f6cbf9fc875d7e6c07ff
Author: Zachary Tong <zacharyjtong@gmail.com>
Date: Wed Aug 19 12:43:03 2015 -0400
Move InternalProfiler to profile package
commit 53584de910db6d4a6bb374c9ebb954f204882996
Author: Zachary Tong <zacharyjtong@gmail.com>
Date: Tue Aug 18 18:34:58 2015 -0400
Only reconcile rewrite timing when rewritten query is different from original
commit 9804c3b29d2107cd97f1c7e34d77171b62cb33d0
Author: Zachary Tong <zacharyjtong@gmail.com>
Date: Tue Aug 18 16:40:15 2015 -0400
Comments and cleanup
commit 8e898cc7c59c0c1cc5ed576dfed8e3034ca0967f
Author: Zachary Tong <zacharyjtong@gmail.com>
Date: Tue Aug 18 14:19:07 2015 -0400
[TESTS] Fix comparison test to ensure results sort identically
commit f402a29001933eef29d5a62e81c8563f1c8d0969
Author: Zachary Tong <zacharyjtong@gmail.com>
Date: Tue Aug 18 14:17:59 2015 -0400
Add note about DFS being unable to profile
commit d446e08d3bc91cd85b24fc908e2d82fc5739d598
Author: Zachary Tong <zacharyjtong@gmail.com>
Date: Tue Aug 18 14:17:23 2015 -0400
Implement some missing methods
commit 13ca94fb86fb037a30d181b73d9296153a63d6e4
Author: Zachary Tong <zacharyjtong@gmail.com>
Date: Tue Aug 18 13:10:54 2015 -0400
[TESTS] Comments & cleanup
commit c76c8c771fdeee807761c25938a642612a6ed8e7
Author: Zachary Tong <zacharyjtong@gmail.com>
Date: Tue Aug 18 13:06:08 2015 -0400
[TESTS] Fix profileMatchesRegular to handle NaN scores and nearlyEqual floats
commit 7e7a10ecd26677b2239149468e24938ce5cc18e1
Author: Zachary Tong <zacharyjtong@gmail.com>
Date: Tue Aug 18 12:22:16 2015 -0400
Move nearlyEquals() utility function to shared location
commit 842222900095df4b27ff3593dbb55a42549f2697
Author: Zachary Tong <zacharyjtong@gmail.com>
Date: Tue Aug 18 12:04:35 2015 -0400
Fixup rebase conflicts
commit 674f162d7704dd2034b8361358decdefce1f76ce
Author: Zachary Tong <zacharyjtong@gmail.com>
Date: Mon Aug 17 15:29:35 2015 -0400
[TESTS] Update match and bool tests
commit 520380a85456d7137734aed0b06a740e18c9cdec
Author: Zachary Tong <zacharyjtong@gmail.com>
Date: Mon Aug 17 15:28:09 2015 -0400
Make naming consistent re: plural
commit b9221501d839bb24d6db575d08e9bee34043fc65
Author: Zachary Tong <zacharyjtong@gmail.com>
Date: Mon Aug 17 15:27:39 2015 -0400
Children need to be added to list after serialization
commit 05fa51df940c332fbc140517ee56e849f2d40a72
Author: Zachary Tong <zacharyjtong@gmail.com>
Date: Mon Aug 17 15:22:41 2015 -0400
Re-enable bypass for non-profiled queries
commit f132204d264af77a75bd26a02d4e251a19eb411d
Author: Zachary Tong <zacharyjtong@gmail.com>
Date: Mon Aug 17 15:21:14 2015 -0400
Fix serialization of QuerySearchResult, InternalProfileResult
commit 27b98fd475fc2e9508c91436ef30624bdbee54ba
Author: Zachary Tong <zacharyjtong@gmail.com>
Date: Mon Aug 10 17:39:17 2015 -0400
Start to add back tests, refactor Java api
commit bcfc9fefd49307045108408dc160774666510e85
Author: Zachary Tong <zacharyjtong@gmail.com>
Date: Tue Aug 4 17:08:10 2015 -0400
Checkpoint
commit 26a530e0101ce252450eb23e746e48c2fd1bfcae
Author: Zachary Tong <zacharyjtong@gmail.com>
Date: Tue Jul 14 13:30:32 2015 -0400
Add createWeight() checkpoint
commit f0dd61de809c5c13682aa213c0be65972537a0df
Author: Zachary Tong <zacharyjtong@gmail.com>
Date: Mon Jul 13 12:36:27 2015 -0400
checkpoint
commit 377ee8ce5729b8d388c4719913b48fae77a16686
Author: Zachary Tong <zacharyjtong@gmail.com>
Date: Wed Mar 18 10:45:01 2015 -0400
checkpoint
Removing the `orientation` field from ShapeBuilder, only leaving it in
PolygonBuilder and MultiPolygonBuilder which are the only places where
it is actually used and parsed at the moment.
This is the second part of making ShapeBuilders implement Writable.
This PR adds serialization, equality and hashCode to all remaining
ShapeBuilders that don't implement them yet.
We have the Text API, which is essentially a wrapper around a String and a
BytesReference and then we have 3 implementations depending on whether the
String view should be cached, the BytesReference view should be cached, or both
should be cached.
This commit merges everything into a single Text that is essentially the old
StringAndBytesText impl.
Long term we should look into whether this API has any performance benefit or
if we could just use plain strings. This would greatly simplify all our other
APIs that currently use Text.
This is a bug that I introduced in #13239 while thinking that the differences
were due to changes in Lucene: extractUnknownQuery is also called when span
extraction already succeeded, so we should only fall back to Weight.extractTerms
if no spans have been extracted yet.
Close #15291
With this commit we change the default behavior of
BulkProcessor from not backing off when getting
EsRejectedExecutionException to backing off exponentially.
This commit modifies the behavior after publication of a new cluster
state to only invoke the reroute logic once per batch of shard failures
rather than once per shard failure.
This commit modifies the internal representation of the JVM flag
UseCompressedOops to just be a String. This means we can just store the
value of the flag or "unknown" directly so that we do not have to engage
in shenanigans with three-valued logic around a boxed boolean.
Relates #15489
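As a rough illustration of storing the flag as a plain String, the sketch below reads UseCompressedOops through the HotSpot-specific HotSpotDiagnosticMXBean and falls back to "unknown" when the value cannot be determined. This is an assumption-laden example, not the code in JvmInfo; the class name CompressedOopsSketch is hypothetical.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

/** Hypothetical helper; not the JvmInfo implementation. */
final class CompressedOopsSketch {
    private CompressedOopsSketch() {}

    /**
     * Returns the flag value as reported by the JVM ("true" or "false" on HotSpot),
     * or "unknown" if the flag or the MXBean is not available on this JVM.
     */
    static String useCompressedOops() {
        try {
            HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
            return bean.getVMOption("UseCompressedOops").getValue();
        } catch (RuntimeException | LinkageError e) {
            // No such option (e.g. 32-bit JVM) or no HotSpot diagnostics: no third boolean state needed.
            return "unknown";
        }
    }
}
```

Callers can log or serialize the returned string directly, which avoids a nullable Boolean.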
With this commit we introduce limited retries with backoff logic to BulkProcessor
when a bulk request has been rejected with an EsRejectedExecutionException.
Fixes #14620.
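The idea of limited retries with exponential backoff can be sketched as follows. This is a minimal illustration under stated assumptions, not the BulkProcessor code: the class BackoffRetrySketch, its parameters, and the use of the JDK's RejectedExecutionException in place of EsRejectedExecutionException are all hypothetical.

```java
import java.util.List;
import java.util.concurrent.RejectedExecutionException;
import java.util.function.Supplier;

/** Hypothetical helper; not the BulkProcessor implementation. */
final class BackoffRetrySketch {
    private BackoffRetrySketch() {}

    /** Delay before retry number attempt (0-based): initialMillis * 2^attempt. Assumes small attempt counts. */
    static long delayMillis(long initialMillis, int attempt) {
        return initialMillis << attempt;
    }

    /**
     * Runs the task and retries at most maxRetries times when it is rejected.
     * Every delay that was waited is appended to delays so the schedule can be inspected.
     */
    static <T> T callWithBackoff(Supplier<T> task, long initialMillis, int maxRetries, List<Long> delays) {
        for (int attempt = 0; ; attempt++) {
            try {
                return task.get();
            } catch (RejectedExecutionException e) {
                if (attempt >= maxRetries) {
                    throw e; // retries exhausted: surface the rejection to the caller
                }
                long delay = delayMillis(initialMillis, attempt);
                delays.add(delay);
                try {
                    Thread.sleep(delay);
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("interrupted while backing off", ie);
                }
            }
        }
    }
}
```

Bounding the number of retries keeps a persistently overloaded cluster from stalling the caller forever; the doubling delay gives the rejecting thread pool time to drain.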
Currently MetaDataMappingService parses the mapping update, reserializes it and
finally calls MapperService.merge with the serialized mapping. Given that mapping
serialization only writes differences from the default, this is a bit unfair to
parsers since they can't know whether some option has been explicitly set or not.
Furthermore this can cause bugs with metadata fields given that these fields use
existing field types as defaults.
This commit changes MetaDataMappingService to call MapperService.merge with the
original mapping update.
This fixes the `lenient` parameter to be `missingClasses`. I will remove this boolean and we can handle them via the normal whitelist.
It also adds a check for sheisty classes (jar hell with the jdk).
This is inspired by the lucene "sheisty" classes check, but it has false positives. This check is more evil, it validates every class file against the extension classloader as a resource, to see if it exists there. If so: jar hell.
This jar hell is a problem for several reasons:
1. causes insanely-hard-to-debug problems (like bugs in forbidden-apis)
2. hides problems (like internal api access)
3. the code you think is executing, is not really executing
4. security permissions are not what you think they are
5. brings in unnecessary dependencies
6. it's jar hell
The more difficult problems are stuff like jython, where these classes are simply 'uberjared' directly in, so you can't just fix them by removing a bogus dependency. And there is a legit reason for them to do that, they want to support java 1.4.
We currently randomly add a set of mock plugins to integ tests.
Sometimes it is necessary to omit these mock plugins, but other times you
may just want to suppress a particular mock plugin. For example, if you
have your own transport, you want to omit the asserting local transport
mock, since they would both try to set the transport.type.
This commit adds a callback for a cluster state task executor that will
be invoked if the execution of a batch of cluster state update tasks
led to a new cluster state and that new cluster state was successfully
published.
Closes #15482
This commit adds to JvmInfo the status of whether or not compressed
ordinary object pointers are enabled. Additionally, logging of the max
heap size and the status of the compressed ordinary object pointers flag
are provided on startup.
Relates #13187, relates elastic/elasticsearch-definitive-guide#455
When a client sends a request to fail a shard to the master, the current
behavior is that the master will submit the cluster state update task
and then immediately send a successful response back to the client;
additionally, if there are any failures while processing the cluster
state update task to fail the shard, then the client will never be
notified of these failures.
This commit modifies the master behavior when handling requests to fail
a shard. In particular, the master will now wait until successful
publication of the cluster state update before notifying the request
client that the shard is marked as failed; additionally, the client is
now notified of any failures during the execution of the cluster state
update task.
Relates #14252
When creating a metadata mapper for a new type, we reuse an existing
configuration from an existing type (if any) in order to avoid introducing
conflicts. However this field type that is provided is considered as both an
initial configuration and the default configuration. So at serialization time,
we might only serialize the difference between the current configuration and
this default configuration, which might be different to what is actually
considered the default configuration.
This does not cause bugs today because metadata mappers usually override the
toXContent method and compare the current field type with Defaults.FIELD_TYPE
instead of defaultFieldType(), but I would still like to make this change to
avoid future bugs.
Today we only test this when writing sequentially, yet in practice we mainly
write concurrently. This commit adds a test that checks that concurrent writes with
a sudden fatal failure will not corrupt our translog.
Relates to #15420
Before, we only evaluated segments that yielded matches in parent aggs, which caused us to miss child docs in segments that had no parent matches.
The fix is to stop remembering which segments have matches and simply evaluate all segments. This makes the code simpler and we can still quickly see if a segment doesn't hold child docs like we did before.
This commit is a trivial reorganization of
o/e/c/a/s/ShardStateAction.java. The primary motive is to have all of the
shard failure handling grouped together, and all of the shard started
handling grouped together.
The `path` option allowed indexing/storing a field `a.b.c` under just `c` when
set to `just_name`. This "feature" was removed in 2.0 in favor of `copy_to`
so we can remove the back compat in 3.x.
There are two ways that a field can be defined twice:
- by reusing the name of a meta mapper in the root object (`_id`, `_routing`,
etc.)
- by defining a sub-field both explicitly in the mapping and through the code
in a field mapper (like ExternalMapper does)
This commit adds new checks in order to make sure this never happens.
Close #15057
Today mappings are mutable because of two APIs:
- Mapper.merge, which expects changes to be performed in-place
- IncludeInAll, which allows changing whether values should be put in the
`_all` field in place.
This commit changes both APIs to return a modified copy instead of modifying in
place so that mappings can be immutable. For now, only the type-level object is
immutable, but in the future we can imagine making them immutable at the
index-level so that mapping updates could be completely atomic at the index
level.
Close #9365
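The difference between merging in place and returning a modified copy can be sketched like this. It is a minimal illustration only: ImmutableMappingSketch is a hypothetical stand-in, not the real Mapper or its merge signature.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical stand-in for a mapping; not the real Mapper class. */
final class ImmutableMappingSketch {
    private final Map<String, String> fieldTypes;

    ImmutableMappingSketch(Map<String, String> fieldTypes) {
        // Defensive copy plus unmodifiable view: instances never change after construction.
        this.fieldTypes = Collections.unmodifiableMap(new LinkedHashMap<>(fieldTypes));
    }

    /** Returns a new mapping containing the fields of both; neither input is modified. */
    ImmutableMappingSketch merge(ImmutableMappingSketch update) {
        Map<String, String> merged = new LinkedHashMap<>(fieldTypes);
        merged.putAll(update.fieldTypes);
        return new ImmutableMappingSketch(merged);
    }

    Map<String, String> fieldTypes() {
        return fieldTypes;
    }
}
```

Because merge never touches the receiver, a reader holding the old instance keeps seeing a consistent mapping while the caller swaps in the merged one.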
This change adds back the http.type setting. It also cleans up all the
transport related guice code to be consolidated within the
NetworkModule (as transport and http related stuff is what and how ES
exposes over the network). The setter methods previously used by some
plugins to override eg the TransportService or HttpServerTransport are
removed, and those plugins should now register a custom implementation
of the class with a name and set that using the appropriate config
setting. Note that I think ActionModule should also be moved into here,
to sit alongside the rest actions, but I left that for a followup.
closes #14148
This commit addresses two type inference issues that the IntelliJ source
editor struggles with when registering query builder prototypes in
o/e/i/q/IndicesQueriesRegistry.java and
o/e/i/q/f/ScoreFunctionParserMapper.java.
This commit adds explicit logging at the DEBUG level for cluster state
update failures. Currently this responsibility is left to the cluster
state task listener, but we should explicitly log these with a generic
message to address cases where the listener might not.
Relates #14899, relates #15016, relates #15023
This commit changes the behavior of the logging in
TransportBroadcastByNodeAction#onNodeFailure to only trace log
exceptions that are considered shard-not-available exceptions. This
makes the logging consistent with how these exceptions are handled in
the response.
Relates #14927
Migrated from ES-Hadoop. Contains several improvements regarding:
* Security
Takes advantage of the pluggable security in ES 2.2 and uses that in order
to grant the necessary permissions to the Hadoop libs. It relies on a
dedicated DomainCombiner to grant permissions only when needed and only to the
libraries installed in the plugin folder
Add security checks for SpecialPermission/scripting and provides out of
the box permissions for the latest Hadoop 1.x (1.2.1) and 2.x (2.7.1)
* Testing
Uses a customized Local FS to perform actual integration testing of the
Hadoop stack (and thus to make sure the proper permissions and ACC blocks
are in place) however without requiring extra permissions for testing.
If needed, a MiniDFS cluster is provided (though it requires extra
permissions to bind ports)
Provides a RestIT test
* Build system
Picks the build system used in ES (still Gradle)
Now that HighlightBuilder implements Writable, we can remove
the temporary solution for transporting the highlight section in
SearchSourceBuilder from the coordinating node to the shard as
BytesReference and use HighlightBuilder instead.
The top-level highlighter has many options that can be overwritten per
field. Currently there is very similar code for this in two places.
This PR pulls out the parsing of the common parameters into
AbstractHighlighterBuilder for better reuse and to keep parsing of
common parameters more consistent.
Today we are super lenient (how could I have missed that for f**k sake) with failing
/ closing the translog writer when we hit an exception. It's actually worse, we allow
further writes to it and don't care what has already been written to disk and what hasn't.
We keep the buffer in memory and try to write it again on the next operation.
When we hit a disk-full exception due to, for instance, a big merge, we are likely adding documents to the
translog but fail to write them to disk. Once the merge failed and freed up its disk space (note this is
a small window when concurrently indexing and failing the shard due to out of space exceptions) we will
allow in-flight operations to add to the translog and then once we fail the shard fsync it. These operations
are written to disk and fsynced which is fine but the previous buffer flush might have written some bytes
to disk which are now corrupting the translog. That wouldn't be an issue if we prevented the fsync.
Closes #15333
This change removes hardcoded ports from cluster formation. It passes
port 0 for http and transport, and then uses a special property to have
the node log the ports used for http and transport (just for tests).
This does not yet work for multi node tests. This brings us one step
closer to working with --parallel.
This commit improves the handling of ThreadLocal Random instance
allocation in o.e.c.Randomness.
- the seed per instance is no longer fixed
- a non-dangerous race to create the ThreadLocal instance has been
removed
- encapsulated all state into a static nested class for safe and lazy
instantiation
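The "static nested class for safe and lazy instantiation" pattern mentioned above is the initialization-on-demand holder idiom. A minimal sketch, assuming a hypothetical class ThreadLocalRandomSketch and a hypothetical seed source (this is not the code in o.e.c.Randomness):

```java
import java.util.Random;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical illustration of the holder idiom; not o.e.c.Randomness. */
final class ThreadLocalRandomSketch {
    private ThreadLocalRandomSketch() {}

    /** The JVM initializes this nested class lazily and exactly once, on first access, so no race is possible. */
    private static final class Holder {
        // Assumed seed source: each thread's instance draws a distinct seed instead of a fixed one.
        private static final AtomicLong NEXT_SEED = new AtomicLong(System.nanoTime());
        private static final ThreadLocal<Random> LOCAL =
            ThreadLocal.withInitial(() -> new Random(NEXT_SEED.getAndIncrement()));
    }

    /** Returns the calling thread's Random, creating it on first use. */
    static Random get() {
        return Holder.LOCAL.get();
    }
}
```

Class initialization is guarded by the JVM, so the holder needs no explicit locking or double-checked creation.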
This commit adds the following:
* SpatialStrategy documentation to the geo-shape reference docs.
* Updates relation documentation to geo-shape-query reference docs.
* Updates GeoShapeFieldMapper to set points_only to true if TERM strategy is used (to be consistent with documentation)
This option allows forcing the xcontent type used to store the `_source`
document. The default is to use the same format as the input format.
This commit makes this option ignored for 2.x indices and rejected for 3.0
indices.
This commit removes and now forbids all uses of
Collections#shuffle(List) and Random#<init>() across the codebase. The
rationale for removing and forbidding these methods is to increase test
reproducibility. As these methods use non-reproducible seeds, production
code and tests that rely on these methods contribute to
non-reproducibility of tests.
Instead of Collections#shuffle(List) the method
Collections#shuffle(List, Random) can be used. All that is required then
is a reproducible source of randomness. Consequently, the utility class
Randomness has been added to assist in creating reproducible sources of
randomness.
Instead of Random#<init>(), Random#<init>(long) with a reproducible seed
or the aforementioned Randomness class can be used.
Closes #15287
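A short sketch of why Collections#shuffle(List, Random) with an explicit seed is reproducible. The class ReproducibleShuffleSketch and its method are hypothetical and do not use the Randomness utility itself; they only show the JDK calls named above.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

/** Hypothetical illustration; does not use the Randomness utility class. */
final class ReproducibleShuffleSketch {
    private ReproducibleShuffleSketch() {}

    /** Returns a shuffled copy of items; the same seed always yields the same order. */
    static <T> List<T> shuffled(List<T> items, long seed) {
        List<T> copy = new ArrayList<>(items);
        // Explicit, reproducible source of randomness instead of Collections.shuffle(copy).
        Collections.shuffle(copy, new Random(seed));
        return copy;
    }
}
```

Calling shuffled(list, seed) twice with the same seed produces the same permutation, so a failing test can be replayed by reusing its seed.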
In commit fafeb3a, we've refactored REST response handling logic
and returned HTTP status names instead of HTTP status codes for
bulk item responses. With this commit we restore the original
behavior.
Checked with @bleskes.