717 Commits

Author SHA1 Message Date
Zachary Tong 9a70dbb51a Add ability to profile query and collectors
Provides a new flag which can be enabled on a per-request basis.
When `"profile": true` is set, the search request will execute in a
mode that collects detailed timing information for query components.

```
GET /test/test/_search
{
   "profile": true,
   "query": {
      "match": {
         "foo": "bar"
      }
   }
}
```

Closes #14889

Squashed commit of the following:

commit a92db5723d2c61b8449bd163d2f006d12f9889ad
Merge: 117dd99 3f87b08
Author: Zachary Tong <zacharyjtong@gmail.com>
Date:   Thu Dec 17 09:44:10 2015 -0500

    Merge remote-tracking branch 'upstream/master' into query_profiler

commit 117dd9992e8014b70203c6110925643769d80e62
Merge: 9b29d68 82a64fd
Author: Zachary Tong <zacharyjtong@gmail.com>
Date:   Tue Dec 15 13:27:18 2015 -0500

    Merge remote-tracking branch 'upstream/master' into query_profiler

    Conflicts:
    	core/src/main/java/org/elasticsearch/search/SearchService.java

commit 9b29d6823a71140ecd872df25ff9f7478e7fe766
Author: Zachary Tong <zacharyjtong@gmail.com>
Date:   Mon Dec 14 16:13:23 2015 -0500

    [TEST] Profile flag needs to be set, ensure searches go against primary only for consistency

commit 4d602d8ad1f8cbc7b475450921fa3bc7d395b34f
Merge: 8b48e87 7742c1e
Author: Zachary Tong <zacharyjtong@gmail.com>
Date:   Mon Dec 14 10:56:25 2015 -0500

    Merge remote-tracking branch 'upstream/master' into query_profiler

commit 8b48e876348b163ab730eeca7fa35142165b05f9
Author: Zachary Tong <zacharyjtong@gmail.com>
Date:   Mon Dec 14 10:56:01 2015 -0500

    Delegate straight to in.matchCost, no need for profiling

commit fde3b0587911f0b5f15e779c671d0510cbd568a9
Author: Zachary Tong <zacharyjtong@gmail.com>
Date:   Mon Dec 14 10:28:23 2015 -0500

    Documentation tweaks, renaming build_weight -> create_weight

commit 46f5e011ee23fe9bb8a1f11ceb4fa9d19fe48e2e
Author: Zachary Tong <zacharyjtong@gmail.com>
Date:   Mon Dec 14 10:27:52 2015 -0500

    Profile TwoPhaseIterator should override matchCost()

commit b59f894ddb11b2a7beebba06c4ec5583ff91a7b2
Merge: 9aa1a3a b4e0c87
Author: Zachary Tong <zacharyjtong@gmail.com>
Date:   Wed Dec 9 14:23:26 2015 -0500

    Merge remote-tracking branch 'upstream/master' into query_profiler

commit 9aa1a3a25c34c9cd9fffaa6114c25a0ec791307d
Author: Zachary Tong <zacharyjtong@gmail.com>
Date:   Wed Dec 9 13:41:48 2015 -0500

    Revert "Move some of the collector wrapping logic into ProfileCollectorBuilder"

    This reverts commit 02cc31767fb76a7ecd44a302435e93a05fb4220e.

commit 57f7c04cea66b3f98ba2bec4879b98e4fba0b3c0
Author: Zachary Tong <zacharyjtong@gmail.com>
Date:   Wed Dec 9 13:41:31 2015 -0500

    Revert "Rearrange if/else to make intent clearer"

    This reverts commit 59b63c533fcaddcdfe4656e86a6f6c4cb1bc4a00.

commit 2874791b9c9cd807113e75e38be465f3785c154e
Author: Zachary Tong <zacharyjtong@gmail.com>
Date:   Wed Dec 9 13:38:13 2015 -0500

    Revert "Move state into ProfileCollectorBuilder"

    This reverts commit 0bb3ee0dd96170b06f07ec9e2435423d686a5ae6.

commit 0bb3ee0dd96170b06f07ec9e2435423d686a5ae6
Author: Zachary Tong <zacharyjtong@gmail.com>
Date:   Thu Dec 3 11:21:55 2015 -0500

    Move state into ProfileCollectorBuilder

commit 59b63c533fcaddcdfe4656e86a6f6c4cb1bc4a00
Author: Zachary Tong <zacharyjtong@gmail.com>
Date:   Wed Dec 2 17:21:12 2015 -0500

    Rearrange if/else to make intent clearer

commit 511db0af2f3a86328028b88a6b25fa3dfbab963b
Author: Zachary Tong <zacharyjtong@gmail.com>
Date:   Wed Dec 2 17:12:06 2015 -0500

    Rename WEIGHT -> BUILD_WEIGHT

commit 02cc31767fb76a7ecd44a302435e93a05fb4220e
Author: Zachary Tong <zacharyjtong@gmail.com>
Date:   Wed Dec 2 17:11:22 2015 -0500

    Move some of the collector wrapping logic into ProfileCollectorBuilder

commit e69356d3cb4c60fa281dad36d84faa64f5c32bc4
Author: Zachary Tong <zacharyjtong@gmail.com>
Date:   Mon Nov 30 15:12:35 2015 -0500

    Cleanup imports

commit c1b4f284f16712be60cd881f7e4a3e8175667d62
Author: Zachary Tong <zacharyjtong@gmail.com>
Date:   Mon Nov 30 15:11:25 2015 -0500

    Review cleanup: Make InternalProfileShardResults writeable

commit 9e61c72f7e1787540f511777050a572b7d297636
Author: Zachary Tong <zacharyjtong@gmail.com>
Date:   Mon Nov 30 15:01:22 2015 -0500

    Review cleanup: Merge ProfileShardResult, InternalProfileShardResult.  Convert to writeable

commit 709184e1554f567c645690250131afe8568a5799
Author: Zachary Tong <zacharyjtong@gmail.com>
Date:   Mon Nov 30 14:38:08 2015 -0500

    Review cleanup: Merge ProfileResult, InternalProfileResult.  Convert to writeable

commit 7d72690c44f626c34e9c608754bc7843dd7fd8fe
Author: Zachary Tong <zacharyjtong@gmail.com>
Date:   Mon Nov 30 14:01:34 2015 -0500

    Review cleanup: use primitive (and default) for profile flag

commit 97d557388541bbd3388cdcce7d9718914d88de6d
Author: Zachary Tong <zacharyjtong@gmail.com>
Date:   Mon Nov 30 13:09:12 2015 -0500

    Review cleanup: Use Collections.emptyMap() instead of building an empty one explicitly

commit 219585b8729a8b0982e653d99eb959efd0bef84e
Author: Zachary Tong <zacharyjtong@gmail.com>
Date:   Mon Nov 30 13:08:12 2015 -0500

    Add todo to revisit profiler architecture in the future

commit b712edb2160e032ee4b2f2630fadf131a0936886
Author: Zachary Tong <zacharyjtong@gmail.com>
Date:   Mon Nov 30 13:05:32 2015 -0500

    Split collector serialization from timing, use writeable instead of streamable

    Previously, the collector timing was done in the same class that was serialized, which required
    leaving the collector null when serializing.  Besides being a bit gross, this made it difficult to
    change the class to Writeable.

    This splits up the timing (InternalProfileCollector + ProfileCollector) and the serialization of
    the times (CollectorResult).  CollectorResult is writeable, and also acts as the public interface.
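
The split described above can be sketched roughly as follows. This is a minimal illustration, not the actual Elasticsearch classes: `TimingCollectorSketch` and `CollectorResultSketch` are hypothetical names, and a plain `IntConsumer` stands in for a Lucene collector. The timing wrapper holds the live delegate; the result object holds only the data that would be serialized.

```java
import java.util.function.IntConsumer;

public class CollectorTimingSketch {

    /** Immutable result (hypothetical): only the data to ship, never the live collector. */
    static final class CollectorResultSketch {
        private final String name;
        private final long timeNanos;

        CollectorResultSketch(String name, long timeNanos) {
            this.name = name;
            this.timeNanos = timeNanos;
        }

        String name() { return name; }
        long timeNanos() { return timeNanos; }
    }

    /** Timing wrapper (hypothetical): delegates each call and accumulates elapsed time. */
    static final class TimingCollectorSketch implements IntConsumer {
        private final String name;
        private final IntConsumer delegate;
        private long elapsedNanos;

        TimingCollectorSketch(String name, IntConsumer delegate) {
            this.name = name;
            this.delegate = delegate;
        }

        @Override
        public void accept(int doc) {
            long start = System.nanoTime();
            try {
                delegate.accept(doc);
            } finally {
                elapsedNanos += System.nanoTime() - start;
            }
        }

        /** Snapshot the timing into a result object with no reference to the delegate. */
        CollectorResultSketch toResult() {
            return new CollectorResultSketch(name, elapsedNanos);
        }
    }

    public static void main(String[] args) {
        int[] seen = {0};
        TimingCollectorSketch timed = new TimingCollectorSketch("counting", doc -> seen[0]++);
        for (int doc = 0; doc < 1000; doc++) {
            timed.accept(doc);
        }
        CollectorResultSketch result = timed.toResult();
        System.out.println(result.name() + " saw " + seen[0] + " docs in " + result.timeNanos() + " ns");
    }
}
```

Because the result never references the delegate, nothing has to be nulled out before serialization.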

commit 6ddd77d066262d4400e3d338b11cebe7dd27ca78
Author: Zachary Tong <zacharyjtong@gmail.com>
Date:   Wed Nov 25 13:15:12 2015 -0500

    Remove dead code

commit 06033f8a056e2121d157654a65895c82bbe93a51
Author: Zachary Tong <zacharyjtong@gmail.com>
Date:   Wed Nov 25 12:49:51 2015 -0500

    Review cleanup:  Delegate to in.getProfilers()

    Note:  Need to investigate how this is used exactly so we can add a test; it isn't touched by a
    normal inner_hits query...

commit a77e13da21b4bad1176ca2b5d5b76034fb12802f
Author: Zachary Tong <zacharyjtong@gmail.com>
Date:   Wed Nov 25 11:59:58 2015 -0500

    Review cleanup:  collapse to single `if` statement

commit e97bb6215a5ebb508b0293ac3acd60d5ae479be1
Author: Zachary Tong <zacharyjtong@gmail.com>
Date:   Wed Nov 25 11:39:43 2015 -0500

    Review cleanup: Return empty map instead of null for profile results

    Note: we still need to check for nullness in SearchPhaseController, since an empty/no-hits result
    won't have profiling instantiated (or any other component like aggs or suggest).  Therefore
    QuerySearchResult.profileResults() is still @Nullable

commit db8e691de2a727389378b459fa76c942572e6015
Author: Zachary Tong <zacharyjtong@gmail.com>
Date:   Wed Nov 25 10:14:47 2015 -0500

    Review cleanup: renaming, formatting fixes, assertions

commit 9011775fe80ba22c2fd948ca64df634b4e32772d
Author: Zachary Tong <zacharyjtong@gmail.com>
Date:   Thu Nov 19 20:09:52 2015 -0500

    [DOCS] Add missing annotation

commit 4b58560b06f08d4b99b149af20916ee839baabd7
Author: Zachary Tong <zacharyjtong@gmail.com>
Date:   Thu Nov 19 20:07:17 2015 -0500

    [DOCS] Update documentation for new format

commit f0458c58e5538ed8ec94849d4baf3250aa9ec841
Author: Adrien Grand <jpountz@gmail.com>
Date:   Tue Nov 17 10:14:09 2015 +0100

    Reduce visibility of internal classes.

commit d0a7d319098e60b028fa772bf8a99b2df9cf6146
Merge: e158070 1bdf29e
Author: Adrien Grand <jpountz@gmail.com>
Date:   Tue Nov 17 10:09:18 2015 +0100

    Merge branch 'master' into query_profiler

commit e158070a48cb096551f3bb3ecdcf2b53bbc5e3c5
Author: Adrien Grand <jpountz@gmail.com>
Date:   Tue Nov 17 10:08:48 2015 +0100

    Fix compile error due to bad html in javadocs.

commit a566b5d08d659daccb087a9afbe908ec3d96cd6e
Author: Zachary Tong <zacharyjtong@gmail.com>
Date:   Mon Nov 16 17:48:37 2015 -0500

    Remove unused collector

commit 4060cd72d150cc68573dbde62ca7321c47f75703
Author: Zachary Tong <zacharyjtong@gmail.com>
Date:   Mon Nov 16 17:48:10 2015 -0500

    Comment cleanup

commit 43137952bf74728f5f5d5a8d1bfc073e0f9fe4f9
Author: Zachary Tong <zacharyjtong@gmail.com>
Date:   Mon Nov 16 17:32:06 2015 -0500

    Fix negative formatted time

commit 5ef3a980266326aff12d4fe380f73455ff28209f
Author: Adrien Grand <jpountz@gmail.com>
Date:   Fri Nov 13 17:10:17 2015 +0100

    Fix javadocs.

commit 276114d29e4b17a0cc0982cfff51434f712dc59e
Author: Adrien Grand <jpountz@gmail.com>
Date:   Fri Nov 13 16:25:23 2015 +0100

    Fix: include rewrite time as well...

commit 21d9e17d05487bf4903ae3d2ab6f429bece2ffef
Author: Adrien Grand <jpountz@gmail.com>
Date:   Fri Nov 13 15:10:15 2015 +0100

    Remove TODO about profiling explain.

commit 105a31e8e570efb879447159c3852871f5cf7db4
Author: Adrien Grand <jpountz@gmail.com>
Date:   Fri Nov 13 14:59:30 2015 +0100

    Fix nocommit now that the global collector is a root collector.

commit 2e8fc5cf84adb1bfaba296808c329e5f982c9635
Author: Adrien Grand <jpountz@gmail.com>
Date:   Fri Nov 13 14:53:38 2015 +0100

    Make collector wrapping more explicit/robust (and a bit less magical).

commit 5e30b570b0835e1ce79a57933a31b6a2d0d58e2d
Author: Adrien Grand <jpountz@gmail.com>
Date:   Fri Nov 13 12:44:03 2015 +0100

    Simplify recording API a bit.

commit 9b453afced6adc0a59ca1d67d90c28796b105185
Author: Adrien Grand <jpountz@gmail.com>
Date:   Fri Nov 13 10:54:25 2015 +0100

    Fix serialization-related nocommits.

commit ad97b200bb123d4e9255e7c8e02f7e43804057a5
Author: Adrien Grand <jpountz@gmail.com>
Date:   Fri Nov 13 10:46:30 2015 +0100

    Fix DFS.

commit a6de06986cd348a831bd45e4f524d2e14d9e03c3
Author: Adrien Grand <jpountz@gmail.com>
Date:   Thu Nov 12 19:29:16 2015 +0100

    Remove forbidden @Test annotation.

commit 4991a28e19501109af98026e14756cb25a56f4f4
Author: Adrien Grand <jpountz@gmail.com>
Date:   Thu Nov 12 19:25:59 2015 +0100

    Limit the impact of query profiling on the SearchContext API.

    Rule is: we can put as much as we want in the search.profile package but should
    aim at touching as little as possible other areas of the code base.

commit 353d8d75a5ce04d9c3908a0a63d4ca6e884c519a
Author: Adrien Grand <jpountz@gmail.com>
Date:   Thu Nov 12 18:05:09 2015 +0100

    Remove dead code.

commit a3ffafb5ddbb5a2acf43403c946e5ed128f47528
Author: Adrien Grand <jpountz@gmail.com>
Date:   Thu Nov 12 15:30:35 2015 +0100

    Remove call to forbidden String.toLowerCase() API.

commit 1fa8c7a00324fa4e32bd24135ebba5ecf07606f1
Author: Adrien Grand <jpountz@gmail.com>
Date:   Thu Nov 12 15:30:27 2015 +0100

    Fix compilation.

commit 2067f1797e53bef0e1a8c9268956bc5fb8f8ad97
Merge: 22e631f fac472f
Author: Adrien Grand <jpountz@gmail.com>
Date:   Thu Nov 12 15:21:12 2015 +0100

    Merge branch 'master' into query_profiler

commit 22e631fe6471fed19236578e97c628d5cda401a9
Author: Zachary Tong <zacharyjtong@gmail.com>
Date:   Tue Nov 3 18:52:05 2015 -0500

    Fix and simplify serialization of shard profile results

commit 461da250809451cd2b47daf647343afbb4b327f2
Author: Zachary Tong <zacharyjtong@gmail.com>
Date:   Tue Nov 3 18:32:22 2015 -0500

    Remove helper methods, simpler without them

commit 5687aa1c93d45416d895c2eecc0e6a6b302139f2
Author: Zachary Tong <zacharyjtong@gmail.com>
Date:   Tue Nov 3 18:29:32 2015 -0500

    [TESTS] Fix tests for new rewrite format

commit ba9e82857fc6d4c7b72ef4d962d2102459365299
Author: Zachary Tong <zacharyjtong@gmail.com>
Date:   Fri Oct 30 15:28:14 2015 -0400

    Rewrites begone! Record all rewrites as a single time metric

commit 5f28d7cdff9ee736651d564f71f713bf45fb1d91
Author: Zachary Tong <zacharyjtong@gmail.com>
Date:   Thu Oct 29 15:36:06 2015 -0400

    Merge duplicate rewrites into one entry

    By using the Query as the key in a map, we can easily merge rewrites together.  This means
    the preProcess(), assertion and main query rewrites all get merged together.  Downside is that
    rewrites of the same Query (hashcode) but in different places get lumped together.  I think the
    simplicity of the solution is better than the slight loss in output fidelity.
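
A rough sketch of the merging idea, under stated assumptions: `QueryKey` is a hypothetical stand-in for a Lucene `Query` (anything with value-based `equals`/`hashCode` would do), and `RewriteTimes` is an illustrative name, not the class used in this change. Rewrites of equal queries accumulate into one entry regardless of where they were recorded, which is exactly the fidelity trade-off mentioned above.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Objects;

public class RewriteMergeSketch {

    /** Hypothetical query stand-in: two keys are equal when field and term match. */
    static final class QueryKey {
        private final String field;
        private final String term;

        QueryKey(String field, String term) {
            this.field = field;
            this.term = term;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof QueryKey)) return false;
            QueryKey other = (QueryKey) o;
            return field.equals(other.field) && term.equals(other.term);
        }

        @Override
        public int hashCode() {
            return Objects.hash(field, term);
        }
    }

    /** Accumulates rewrite time per distinct query. */
    static final class RewriteTimes {
        private final Map<QueryKey, Long> nanosByQuery = new LinkedHashMap<>();

        void record(QueryKey query, long nanos) {
            // Equal queries collapse into one entry; their times are summed.
            nanosByQuery.merge(query, nanos, Long::sum);
        }

        long nanosFor(QueryKey query) {
            return nanosByQuery.getOrDefault(query, 0L);
        }

        int entries() {
            return nanosByQuery.size();
        }
    }

    public static void main(String[] args) {
        RewriteTimes times = new RewriteTimes();
        times.record(new QueryKey("foo", "bar"), 100L); // e.g. recorded during pre-processing
        times.record(new QueryKey("foo", "bar"), 250L); // e.g. recorded for the main query
        times.record(new QueryKey("foo", "baz"), 40L);
        System.out.println(times.entries() + " entries, foo:bar took "
            + times.nanosFor(new QueryKey("foo", "bar")) + " ns");
    }
}
```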

commit 9a601ea46bb21052746157a45dcc6de6bc350e9c
Author: Zachary Tong <zacharyjtong@gmail.com>
Date:   Thu Oct 29 15:28:27 2015 -0400

    Allow multiple "searches" per profile (e.g. query + global agg)

commit ee30217328381cd83f9e653d3a4d870c1d2bdfce
Author: Zachary Tong <zacharyjtong@gmail.com>
Date:   Thu Oct 29 11:29:18 2015 -0400

    Update comment, add nullable annotation

commit 405c6463a64e118f170959827931e8c6a1661f13
Author: Zachary Tong <zacharyjtong@gmail.com>
Date:   Thu Oct 29 11:04:30 2015 -0400

    Remove outdated comment

commit 2819ae8f4cf1bfd5670dbd1c0e06195ae457b58f
Author: Adrien Grand <jpountz@gmail.com>
Date:   Tue Oct 27 19:50:47 2015 +0100

    Don't render children in the profiles when there are no children.

commit 7677c2ddefef321bbe74660471603d202a4ab66f
Author: Adrien Grand <jpountz@gmail.com>
Date:   Tue Oct 27 19:50:35 2015 +0100

    Set the profiler on the ContextIndexSearcher.

commit 74a4338c35dfed779adc025ec17cfd4d1c9f66f5
Author: Adrien Grand <jpountz@gmail.com>
Date:   Tue Oct 27 19:50:01 2015 +0100

    Fix json rendering.

commit 6674d5bebe187b0b0d8b424797606fdf2617dd27
Author: Adrien Grand <jpountz@gmail.com>
Date:   Tue Oct 27 19:20:19 2015 +0100

    Revert "nocommit - profiling never enabled because setProfile() on ContextIndexSearcher never called"

    This reverts commit d3dc10949024342055f0d4fb7e16c7a43423bfab.

commit d3dc10949024342055f0d4fb7e16c7a43423bfab
Author: Zachary Tong <zacharyjtong@gmail.com>
Date:   Fri Oct 23 17:20:57 2015 -0400

    nocommit - profiling never enabled because setProfile() on ContextIndexSearcher never called

    Previously, it was enabled by using DefaultSearchContext as a third-party "proxy", but since
    the refactor to make it unit testable, setProfile() needs to be called explicitly.  Unfortunately,
    this is not possible because SearchService only has access to an IndexSearcher.  And it is not
    cast'able to a DefaultIndexSearcher.

commit b9ba9c5d1f93b9c45e97b0a4e35da6f472c9ea53
Author: Zachary Tong <zacharyjtong@gmail.com>
Date:   Fri Oct 23 16:27:00 2015 -0400

    [TESTS] Fix unit tests

commit cf5d1e016b2b4a583175e07c16c7152f167695ce
Author: Zachary Tong <zacharyjtong@gmail.com>
Date:   Fri Oct 23 16:22:34 2015 -0400

    Increment token after recording a rewrite

commit b7d08f64034e498533c4a81bff8727dd8ac2843e
Author: Zachary Tong <zacharyjtong@gmail.com>
Date:   Fri Oct 23 16:14:09 2015 -0400

    Fix NPE if a top-level root doesn't have children

commit e4d3b514bafe2a3a9db08438c89f0ed68628f2d6
Author: Zachary Tong <zacharyjtong@gmail.com>
Date:   Fri Oct 23 16:05:47 2015 -0400

    Fix NPE when profiling is disabled

commit 445384fe48ed62fdd01f7fc9bf3e8361796d9593
Author: Zachary Tong <zacharyjtong@gmail.com>
Date:   Fri Oct 23 16:05:37 2015 -0400

    [TESTS] Fix integration tests

commit b478296bb04fece827a169e7522df0a5ea7840a3
Author: Zachary Tong <zacharyjtong@gmail.com>
Date:   Fri Oct 23 15:43:24 2015 -0400

    Move rewrites to their own section, remove reconciliation

    Big commit because the structural change affected a lot of the wrapping code.  Major changes:

    - Rewrites are now in their own section in the response
    - Reconciliation is gone...we don't attempt to roll the rewrites into the query tree structure
    - InternalProfileShardResults (plural) simply holds a Map<String, InternalProfileShardResult> and
    helps to serialize / ToXContent
    - InternalProfileShardResult (singular) is now the holder for all shard-level profiling details. Currently
    this includes query, collectors and rewrite.  In the future it would hold suggest, aggs, etc
    - When the user requests the profiled results, they get back a Map<String, ProfileShardResult>
    instead of doing silly helper methods to convert to maps, etc
    - Shard details are baked into a string instead of serializing the object

commit 24819ad094b208d0e94f17ce9c3f7c92f7414124
Author: Zachary Tong <zacharyjtong@gmail.com>
Date:   Fri Oct 23 10:25:38 2015 -0400

    Make Profile results immutable by removing relative_time

commit bfaf095f45fed74194ef78160a8e5dcae1850f9e
Author: Adrien Grand <jpountz@gmail.com>
Date:   Fri Oct 23 10:54:59 2015 +0200

    Add nocommits.

commit e9a128d0d26d5b383b52135ca886f2c987850179
Author: Adrien Grand <jpountz@gmail.com>
Date:   Fri Oct 23 10:39:37 2015 +0200

    Move all profile-related classes to the same package.

commit f20b7c7fdf85384ecc37701bb65310fb8c20844f
Author: Adrien Grand <jpountz@gmail.com>
Date:   Fri Oct 23 10:33:14 2015 +0200

    Reorganize code a bit to ease unit testing of ProfileCollector.

commit 3261306edad6a0c70f59eaee8fe58560f61a75fd
Author: Adrien Grand <jpountz@gmail.com>
Date:   Thu Oct 22 18:07:28 2015 +0200

    Remove irrelevant nocommit.

commit a6ac868dad12a2e17929878681f66dbd0948d322
Author: Adrien Grand <jpountz@gmail.com>
Date:   Thu Oct 22 18:06:45 2015 +0200

    Make CollectorResult's reason a free-text field to ease bw compat.

commit 5d0bf170781a950d08b81871cd1e403e49f3cc12
Author: Adrien Grand <jpountz@gmail.com>
Date:   Thu Oct 22 16:50:52 2015 +0200

    Add unit tests for ProfileWeight/ProfileScorer.

commit 2cd88c412c6e62252504ef69a59216adbb574ce4
Author: Adrien Grand <jpountz@gmail.com>
Date:   Thu Oct 22 15:55:17 2015 +0200

    Rename InternalQueryProfiler to Profiler.

commit 84f5718fa6779f710da129d9e0e6ff914fd85e36
Author: Adrien Grand <jpountz@gmail.com>
Date:   Thu Oct 22 15:53:58 2015 +0200

    Merge InternalProfileBreakdown into ProfileBreakdown.

commit 135168eaeb8999c8117ea25288104b0961ce9b35
Author: Adrien Grand <jpountz@gmail.com>
Date:   Thu Oct 22 13:56:57 2015 +0200

    Make it possible to instantiate a ContextIndexSearcher without SearchContext.

commit 5493fb52376b48460c4ce2dedbe00cc5f6620499
Author: Adrien Grand <jpountz@gmail.com>
Date:   Thu Oct 22 11:53:29 2015 +0200

    Move ProfileWeight/Scorer to their own files.

commit bf2d917b9dae3b32dfc29c35a7cac4ccb7556cce
Author: Adrien Grand <jpountz@gmail.com>
Date:   Thu Oct 22 11:38:24 2015 +0200

    Fix bug that caused phrase queries to fail.

commit b2bb0c92c343334ec1703a221af24a1b55e36d53
Author: Adrien Grand <jpountz@gmail.com>
Date:   Thu Oct 22 11:36:17 2015 +0200

    Parsing happens on the coordinating node now.

commit 416cabb8621acb5cd8dfa77374fd23e428f52fe9
Author: Adrien Grand <jpountz@gmail.com>
Date:   Thu Oct 22 11:22:17 2015 +0200

    Fix compilation (in particular remove guava deps).

commit f996508645f842629d403fc2e71c1890c0e2cac9
Merge: 4616a25 bc3b91e
Author: Adrien Grand <jpountz@gmail.com>
Date:   Thu Oct 22 10:44:38 2015 +0200

    Merge branch 'master' into query_profiler

commit 4616a25afffe9c24c6531028f7fccca4303d2893
Author: Zachary Tong <zacharyjtong@gmail.com>
Date:   Tue Oct 20 12:11:32 2015 -0400

    Make Java Count API compatible with profiling

commit cbfba74e16083d719722500ac226efdb5cb2ff55
Author: Zachary Tong <zacharyjtong@gmail.com>
Date:   Tue Oct 20 12:11:19 2015 -0400

    Fix serialization of profile query param, NPE

commit e33ffac383b03247046913da78c8a27e457fae78
Author: Zachary Tong <zacharyjtong@gmail.com>
Date:   Tue Oct 20 11:17:48 2015 -0400

    TestSearchContext should return null Profiler instead of exception

commit 73a02d69b466dc1a5b8a5f022464d6c99e6c2ac3
Author: Zachary Tong <zacharyjtong@gmail.com>
Date:   Mon Oct 19 12:07:29 2015 -0400

    [DOCS] Update docs to reflect new ID format

commit 36248e388c354f954349ecd498db7b66f84ce813
Author: Zachary Tong <zacharyjtong@gmail.com>
Date:   Mon Oct 19 12:03:03 2015 -0400

    Use the full [node][index][shard] string as profile result ID

commit 5cfcc4a6a6b0bcd6ebaa7c8a2d0acc32529a80e1
Author: Zachary Tong <zacharyjtong@gmail.com>
Date:   Thu Oct 15 17:51:40 2015 -0400

    Add failing test for phrase matching

    Stack trace generated:

    [2015-10-15 17:50:54,438][ERROR][org.elasticsearch.search.profile] shard [[JNj7RX_oSJikcnX72aGBoA][test][2]], reason [RemoteTransportException[[node_s0][local[1]][indices:data/read/search[phase/query]]]; nested: QueryPhaseExecutionException[Query Failed [Failed to execute main query]]; nested: AssertionError[nextPosition() called more than freq() times!]; ], cause [java.lang.AssertionError: nextPosition() called more than freq() times!
    	at org.apache.lucene.index.AssertingLeafReader$AssertingPostingsEnum.nextPosition(AssertingLeafReader.java:353)
    	at org.apache.lucene.search.ExactPhraseScorer.phraseFreq(ExactPhraseScorer.java:132)
    	at org.apache.lucene.search.ExactPhraseScorer.access$000(ExactPhraseScorer.java:27)
    	at org.apache.lucene.search.ExactPhraseScorer$1.matches(ExactPhraseScorer.java:69)
    	at org.elasticsearch.common.lucene.search.ProfileQuery$ProfileScorer$2.matches(ProfileQuery.java:226)
    	at org.apache.lucene.search.ConjunctionDISI$TwoPhaseConjunctionDISI.matches(ConjunctionDISI.java:175)
    	at org.apache.lucene.search.ConjunctionDISI$TwoPhase.matches(ConjunctionDISI.java:213)
    	at org.apache.lucene.search.ConjunctionDISI.doNext(ConjunctionDISI.java:128)
    	at org.apache.lucene.search.ConjunctionDISI.nextDoc(ConjunctionDISI.java:151)
    	at org.apache.lucene.search.ConjunctionScorer.nextDoc(ConjunctionScorer.java:62)
    	at org.elasticsearch.common.lucene.search.ProfileQuery$ProfileScorer$1.nextDoc(ProfileQuery.java:205)
    	at org.apache.lucene.search.Weight$DefaultBulkScorer.scoreAll(Weight.java:224)
    	at org.apache.lucene.search.Weight$DefaultBulkScorer.score(Weight.java:169)
    	at org.apache.lucene.search.BulkScorer.score(BulkScorer.java:39)
    	at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:795)
    	at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:509)
    	at org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:347)
    	at org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:111)
    	at org.elasticsearch.search.SearchService.loadOrExecuteQueryPhase(SearchService.java:366)
    	at org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:378)
    	at org.elasticsearch.search.action.SearchServiceTransportAction$SearchQueryTransportHandler.messageReceived(SearchServiceTransportAction.java:368)
    	at org.elasticsearch.search.action.SearchServiceTransportAction$SearchQueryTransportHandler.messageReceived(SearchServiceTransportAction.java:365)
    	at org.elasticsearch.transport.local.LocalTransport$2.doRun(LocalTransport.java:280)
    	at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
    	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    	at java.lang.Thread.run(Thread.java:745)

commit 889fe6383370fe919aaa9f0af398e3040209e40b
Author: Zachary Tong <zacharyjtong@gmail.com>
Date:   Thu Oct 15 17:30:38 2015 -0400

    [DOCS] More docs

commit 89177965d031d84937753538b88ea5ebae2956b0
Author: Zachary Tong <zacharyjtong@gmail.com>
Date:   Thu Oct 15 09:59:09 2015 -0400

    Fix multi-stage rewrites to recursively find most appropriate descendant rewrite

    Previously, we chose the first rewrite that matched.  But in situations where a query may
    rewrite several times, this won't build the tree correctly.  Instead we need to recurse
    down all the rewrites until we find the most appropriate "leaf" rewrite

    The implementation of this is kinda gross: we recursively call getRewrittenParentToken(),
    which does a linear scan over the rewriteMap and tries to find rewrites with a larger token
    value (since we know child tokens are always larger).  Can almost certainly find a better
    way to do this...

commit 0b4d782b5348e5d03fd26f7d91bc4a3fbcb7f6a5
Author: Zachary Tong <zacharyjtong@gmail.com>
Date:   Wed Oct 14 19:30:06 2015 -0400

    [Docs] Documentation checkpoint

commit 383636453f6610fcfef9070c21ae7ca11346793e
Author: Zachary Tong <zacharyjtong@gmail.com>
Date:   Wed Sep 16 16:02:22 2015 -0400

    Comments

commit a81e8f31e681be16e89ceab9ba3c3e0a018f18ef
Author: Zachary Tong <zacharyjtong@gmail.com>
Date:   Wed Sep 16 15:48:49 2015 -0400

    [TESTS] Ensure all tests use QUERY_THEN_FETCH, DFS does not profile

commit 1255c2d790d85fcb9cbb78bf2a53195138c6bc24
Author: Zachary Tong <zacharyjtong@gmail.com>
Date:   Tue Sep 15 16:43:46 2015 -0400

    Refactor rewrite handling to handle identical rewrites

commit 85b7ec82eb0b26a6fe87266b38f5f86f9ac0c44f
Author: Zachary Tong <zacharyjtong@gmail.com>
Date:   Tue Sep 15 08:51:14 2015 -0400

    Don't update parent when a token is added as root -- Fixes NPE

commit 109d02bdbc49741a3b61e8624521669b0968b839
Author: Zachary Tong <zacharyjtong@gmail.com>
Date:   Tue Sep 15 08:50:40 2015 -0400

    Don't set the rewritten query if not profiling -- Fixes NPE

commit 233cf5e85f6f2c39ed0a2a33d7edd3bbd40856e8
Author: Zachary Tong <zacharyjtong@gmail.com>
Date:   Mon Sep 14 18:04:51 2015 -0400

    Update tests to new response format

commit a930b1fc19de3a329abc8ffddc6711c1246a4b15
Author: Zachary Tong <zacharyjtong@gmail.com>
Date:   Mon Sep 14 18:03:58 2015 -0400

    Fix serialization

commit 69afdd303660510c597df9bada5531b19d134f3d
Author: Zachary Tong <zacharyjtong@gmail.com>
Date:   Mon Sep 14 15:11:31 2015 -0400

    Comments and cleanup

commit 64e7ca7f78187875378382ec5d5aa2462ff71df5
Author: Zachary Tong <zacharyjtong@gmail.com>
Date:   Mon Sep 14 14:40:21 2015 -0400

    Move timing into dedicated class, add proper rewrite integration

commit b44ff85ddbba0a080e65f2e7cc8c50d30e95df8e
Author: Zachary Tong <zacharyjtong@gmail.com>
Date:   Mon Sep 14 12:00:38 2015 -0400

    Checkpoint - Refactoring to use a token-based dependency tree

commit 52cedd5266d6a87445c6a4cff3be8ff2087cd1b7
Author: Zachary Tong <zacharyjtong@gmail.com>
Date:   Fri Sep 4 19:18:19 2015 -0400

    Need to set context profiling flag before calling queryPhase.preProcess

commit c524670cb1ce29b4b3a531fa2bff0c403b756f46
Author: Adrien Grand <jpountz@gmail.com>
Date:   Fri Sep 4 18:00:37 2015 +0200

    Reduce profiling overhead a bit.

    This removes hash-table lookups every time we start/stop a profiling clock.
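
One common way to avoid a hash-table lookup per start/stop is to index a plain array by enum ordinal. The sketch below illustrates that general technique only; it is an assumption about the approach, not a copy of this commit. The `TimingType` constants, the `Breakdown` class, and the injectable clock are all hypothetical.

```java
import java.util.function.LongSupplier;

public class ProfileClockSketch {

    /** Hypothetical set of timed operations. */
    enum TimingType { CREATE_WEIGHT, BUILD_SCORER, NEXT_DOC, ADVANCE, MATCH, SCORE }

    static final class Breakdown {
        // One slot per timing type, addressed by ordinal: no hashing on the hot path.
        private final long[] totals = new long[TimingType.values().length];
        private final LongSupplier clock;
        private TimingType running;
        private long startedAt;

        Breakdown(LongSupplier clock) {
            this.clock = clock;
        }

        void start(TimingType type) {
            running = type;
            startedAt = clock.getAsLong();
        }

        void stop() {
            totals[running.ordinal()] += clock.getAsLong() - startedAt;
            running = null;
        }

        long total(TimingType type) {
            return totals[type.ordinal()];
        }
    }

    public static void main(String[] args) {
        Breakdown breakdown = new Breakdown(System::nanoTime);
        breakdown.start(TimingType.NEXT_DOC);
        breakdown.stop();
        System.out.println("next_doc nanos: " + breakdown.total(TimingType.NEXT_DOC));
    }
}
```

Passing the clock in as a `LongSupplier` is purely a testing convenience in this sketch; production code would more likely call `System.nanoTime()` directly.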

commit 111444ff8418737082236492b37321fc96041e09
Author: Adrien Grand <jpountz@gmail.com>
Date:   Fri Sep 4 16:18:59 2015 +0200

    Add profiling of two-phase iterators.

    This is useful for e.g. phrase queries or script filters, since they are
    typically consumed through their two-phase iterator instead of the scorer.

commit f275e690459e73211bc8494c6de595c0320f4c0b
Author: Adrien Grand <jpountz@gmail.com>
Date:   Fri Sep 4 16:03:21 2015 +0200

    Some more improvements.

    I changed profiling to disable bulk scoring, since it makes it impossible to
    know where time is spent. Also I removed profiling of operations that are
    always fast (e.g. normalization) and added nextDoc/advance.

commit 3c8dcd872744de8fd76ce13b6f18f36f8de44068
Author: Adrien Grand <jpountz@gmail.com>
Date:   Fri Sep 4 14:39:50 2015 +0200

    Remove println.

commit d68304862fb38a3823aebed35a263bd9e2176c2f
Author: Adrien Grand <jpountz@gmail.com>
Date:   Fri Sep 4 14:36:03 2015 +0200

    Fix some test failures introduced by the rebase...

commit 04d53ca89fb34b7a21515d770c32aaffcc513b90
Author: Adrien Grand <jpountz@gmail.com>
Date:   Fri Sep 4 13:57:35 2015 +0200

    Reconcile conflicting changes after rebase

commit fed03ec8e2989a0678685cd6c50a566cec42ea4f
Author: Zachary Tong <zacharyjtong@gmail.com>
Date:   Thu Aug 20 22:40:39 2015 -0400

    Add Collectors to profile results

    Profile response element has now been re-arranged so that everything is listed per-shard to
    facilitate grouping elements together.  The new `collector` element looks like this:

    ```
    "profile": {
      "shards": [
         {
            "shard_id": "keP4YFywSXWALCl4m4k24Q",
            "query": [...],
            "collector": [
               {
                  "name": "MultiCollector",
                  "purpose": "search_multi",
                  "time": "16.44504400ms",
                  "relative_time": "100.0000000%",
                  "children": [
                     {
                        "name": "FilteredCollector",
                        "purpose": "search_post_filter",
                        "time": "4.556013000ms",
                        "relative_time": "27.70447437%",
                        "children": [
                           {
                              "name": "SimpleTopScoreDocCollector",
                              "purpose": "search_sorted",
                              "time": "1.352166000ms",
                              "relative_time": "8.222331299%",
                              "children": []
                           }
                        ]
                     },
                     {
                        "name": "BucketCollector: [[non_global_term, another_agg]]",
                        "purpose": "aggregation",
                        "time": "10.40379400ms",
                        "relative_time": "63.26400829%",
                        "children": []
                     },
           ...
    ```

commit 1368b495c934be642c00f6cbf9fc875d7e6c07ff
Author: Zachary Tong <zacharyjtong@gmail.com>
Date:   Wed Aug 19 12:43:03 2015 -0400

    Move InternalProfiler to profile package

commit 53584de910db6d4a6bb374c9ebb954f204882996
Author: Zachary Tong <zacharyjtong@gmail.com>
Date:   Tue Aug 18 18:34:58 2015 -0400

    Only reconcile rewrite timing when rewritten query is different from original

commit 9804c3b29d2107cd97f1c7e34d77171b62cb33d0
Author: Zachary Tong <zacharyjtong@gmail.com>
Date:   Tue Aug 18 16:40:15 2015 -0400

    Comments and cleanup

commit 8e898cc7c59c0c1cc5ed576dfed8e3034ca0967f
Author: Zachary Tong <zacharyjtong@gmail.com>
Date:   Tue Aug 18 14:19:07 2015 -0400

    [TESTS] Fix comparison test to ensure results sort identically

commit f402a29001933eef29d5a62e81c8563f1c8d0969
Author: Zachary Tong <zacharyjtong@gmail.com>
Date:   Tue Aug 18 14:17:59 2015 -0400

    Add note about DFS being unable to profile

commit d446e08d3bc91cd85b24fc908e2d82fc5739d598
Author: Zachary Tong <zacharyjtong@gmail.com>
Date:   Tue Aug 18 14:17:23 2015 -0400

    Implement some missing methods

commit 13ca94fb86fb037a30d181b73d9296153a63d6e4
Author: Zachary Tong <zacharyjtong@gmail.com>
Date:   Tue Aug 18 13:10:54 2015 -0400

    [TESTS] Comments & cleanup

commit c76c8c771fdeee807761c25938a642612a6ed8e7
Author: Zachary Tong <zacharyjtong@gmail.com>
Date:   Tue Aug 18 13:06:08 2015 -0400

    [TESTS] Fix profileMatchesRegular to handle NaN scores and nearlyEqual floats
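
A score comparison that tolerates NaN and small float differences might look like the sketch below. It is an assumption about the general shape of such a check, not the helper used in these tests; the method name, signature, and epsilon handling are illustrative.

```java
public class NearlyEqualSketch {

    /**
     * Returns true when both scores are NaN, are exactly equal, or differ by
     * at most epsilon relative to the larger magnitude.
     */
    static boolean nearlyEqual(float a, float b, float epsilon) {
        if (Float.isNaN(a) || Float.isNaN(b)) {
            // NaN != NaN under IEEE 754, so treat "both NaN" as a match explicitly.
            return Float.isNaN(a) && Float.isNaN(b);
        }
        if (a == b) {
            return true; // exact matches, including equal infinities
        }
        if (Float.isInfinite(a) || Float.isInfinite(b)) {
            return false; // an infinity only matches itself
        }
        float diff = Math.abs(a - b);
        float scale = Math.max(Math.abs(a), Math.abs(b));
        // Clamp the scale so values near zero are not compared against a zero tolerance.
        return diff <= epsilon * Math.max(scale, Float.MIN_NORMAL);
    }

    public static void main(String[] args) {
        System.out.println(nearlyEqual(1.0f, 1.0000001f, 1e-5f));
        System.out.println(nearlyEqual(Float.NaN, Float.NaN, 1e-5f));
        System.out.println(nearlyEqual(1.0f, 1.1f, 1e-5f));
    }
}
```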

commit 7e7a10ecd26677b2239149468e24938ce5cc18e1
Author: Zachary Tong <zacharyjtong@gmail.com>
Date:   Tue Aug 18 12:22:16 2015 -0400

    Move nearlyEquals() utility function to shared location

commit 842222900095df4b27ff3593dbb55a42549f2697
Author: Zachary Tong <zacharyjtong@gmail.com>
Date:   Tue Aug 18 12:04:35 2015 -0400

    Fixup rebase conflicts

commit 674f162d7704dd2034b8361358decdefce1f76ce
Author: Zachary Tong <zacharyjtong@gmail.com>
Date:   Mon Aug 17 15:29:35 2015 -0400

    [TESTS] Update match and bool tests

commit 520380a85456d7137734aed0b06a740e18c9cdec
Author: Zachary Tong <zacharyjtong@gmail.com>
Date:   Mon Aug 17 15:28:09 2015 -0400

    Make naming consistent re: plural

commit b9221501d839bb24d6db575d08e9bee34043fc65
Author: Zachary Tong <zacharyjtong@gmail.com>
Date:   Mon Aug 17 15:27:39 2015 -0400

    Children need to be added to list after serialization

commit 05fa51df940c332fbc140517ee56e849f2d40a72
Author: Zachary Tong <zacharyjtong@gmail.com>
Date:   Mon Aug 17 15:22:41 2015 -0400

    Re-enable bypass for non-profiled queries

commit f132204d264af77a75bd26a02d4e251a19eb411d
Author: Zachary Tong <zacharyjtong@gmail.com>
Date:   Mon Aug 17 15:21:14 2015 -0400

    Fix serialization of QuerySearchResult, InternalProfileResult

commit 27b98fd475fc2e9508c91436ef30624bdbee54ba
Author: Zachary Tong <zacharyjtong@gmail.com>
Date:   Mon Aug 10 17:39:17 2015 -0400

    Start to add back tests, refactor Java api

commit bcfc9fefd49307045108408dc160774666510e85
Author: Zachary Tong <zacharyjtong@gmail.com>
Date:   Tue Aug 4 17:08:10 2015 -0400

    Checkpoint

commit 26a530e0101ce252450eb23e746e48c2fd1bfcae
Author: Zachary Tong <zacharyjtong@gmail.com>
Date:   Tue Jul 14 13:30:32 2015 -0400

    Add createWeight() checkpoint

commit f0dd61de809c5c13682aa213c0be65972537a0df
Author: Zachary Tong <zacharyjtong@gmail.com>
Date:   Mon Jul 13 12:36:27 2015 -0400

    checkpoint

commit 377ee8ce5729b8d388c4719913b48fae77a16686
Author: Zachary Tong <zacharyjtong@gmail.com>
Date:   Wed Mar 18 10:45:01 2015 -0400

    checkpoint
2015-12-17 15:29:00 -05:00
Jason Bryan 9a1133ca50 Fix typo in scroll.asciidoc
Fix scroll request with sort.

Closes #15493
2015-12-16 20:31:46 -05:00
Clinton Gormley 4597a22ace Merge pull request #15473 from jmluy/patch-1
Update sample in sort for consistency
2015-12-16 12:53:09 +01:00
Clinton Gormley 259c6eeb59 Merge pull request #15274 from murnieza/patch-1
[Doc] Redundant indefinite article removed
2015-12-11 14:38:44 +01:00
Clinton Gormley 1685126bb6 Merge pull request #15085 from kaneshin/docs/modify/post_filter
Remove a trailing comma from an example data of JSON
2015-11-30 08:05:10 +01:00
Shintaro Kaneko d7baeb1e7b Remove a trailing comma from an example data of JSON 2015-11-28 16:50:28 +00:00
Clinton Gormley 174b4bacbe Merge pull request #14871 from jamiemccarthy/doc-fix
Fix doc of nested_path sort option
2015-11-28 17:46:47 +01:00
Martijn van Groningen 48771f1a76 field stats: Added `min_value_as_string` and `max_value_as_string` response elements for all number based fields. The existing `min_value` and `max_value` will return the values as numbers instead.
Closes #14404
2015-11-23 08:48:28 +01:00
Jamie McCarthy ce20337d03 Fix doc of nested_path sort option 2015-11-19 12:22:00 -05:00
Martijn van Groningen d2ae3ffa36 docs: can't use same call out twice 2015-11-18 15:16:13 +01:00
Martijn van Groningen e443bfc492 docs: fix callouts 2015-11-18 15:13:38 +01:00
Martijn van Groningen 8a454dae33 field stats: Added a `format` option to index constraints that allows specifying date index constraint values in a different format than the one specified in the mapping.
Closes #14804
2015-11-18 14:19:07 +01:00
javanna ca980b7a83 [DOCS] document replacement for search exists
Relates to #13910
Closes #14393
2015-11-09 15:05:07 +01:00
Adrien Grand 81767bc639 Merge pull request #14474 from jpountz/doc/search_body_fields_warning
Add a warning about fields vs. source filtering.
2015-11-09 11:52:58 +01:00
Clinton Gormley 1220b89a60 Docs: Fixed bad link in completion suggester 2015-11-08 09:51:14 +01:00
Areek Zillur dd1c687ace Completion Suggester V2
The completion suggester provides auto-complete/search-as-you-type functionality.
This is a navigational feature to guide users to relevant results as they are typing, improving search precision.
It is not meant for spell correction or did-you-mean functionality like the term or phrase suggesters.

The completions are indexed as a weighted FST (finite state transducer) to provide fast Top N prefix-based
searches suitable for serving relevant results as a user types.

closes #10746
2015-11-07 17:46:27 -05:00
Adrien Grand c9120c5c2a Docs: Add a warning about fields vs. source filtering.
Close #14470
2015-11-03 11:18:00 +01:00
Lee Hinman 3b5058017e Merge branch 'remove-optimize-rest' 2015-10-29 15:18:03 -06:00
javanna 49f5757ae2 Remove support for multiple highlighter names
The only way to refer to the plain highlighter is now `plain`, the only way to refer to the fast vector highlighter is `fvh` and the only way to refer to the postings highlighter is `postings`. The name variants like `highlighter`, `postings-highlighter` and `fast-vector-highlighter` have been removed.
2015-10-28 10:50:29 +01:00
Lee Hinman 3a458af0b7 Remove /_optimize REST API endpoint
The `/_optimize` endpoint was deprecated in 2.1.0 and can now be removed
entirely.
2015-10-27 10:17:16 -06:00
javanna 75cedca0da Remove search exists api
Closes #13682
Closes #13911
2015-10-21 17:39:32 +02:00
javanna c5152c7ecb [DOCS] terminate_after is not experimental anymore
We are relying on terminate_after more and more: it replaced the limit filter and soon it will also replace the search_exists api. At that point we should make it a stable api rather than experimental.

Closes #14183
2015-10-19 13:56:42 +02:00
Clinton Gormley dc018cf622 Updated docs for 3.0.0-beta 2015-10-07 13:27:46 +02:00
Thomas Cucchietti ecc2985b84 Update inner-hits.asciidoc
Fix a glitch in inner_hits feature documentation (though I'm not absolutely sure of the final version)
2015-09-30 11:07:51 +02:00
ulkas e133fdd49f Update phrase-suggest.asciidoc
small sentence fix
2015-09-23 09:24:23 -04:00
Clinton Gormley fa77cf6f6f Docs: Always quote "@file" argument to --data-binary
Closes #13500
2015-09-19 17:28:15 +02:00
Alexander Pepper df9d4eca66 [docs] Document meaning of "FST" and "FSTs".
Conflicts:
	docs/reference/index-modules/fielddata.asciidoc
2015-09-11 05:34:41 -04:00
Adrien Grand 86f1b07df0 Docs: Remove docs for the `filtered`, `and`, `or` and `(f)query` queries. 2015-09-11 11:00:54 +02:00
Nik Everett e4981968ad [search] Limit the size of the result window
Requesting a million hits, or page 100,000 is always a bad idea, but users
may not be aware of this. This adds a per-index limit on the maximum size +
from that can be requested which defaults to 10,000.

This should not interfere with deep-scrolling.

Closes #9311
2015-09-10 15:38:29 -04:00
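
The result window limit from the commit above can be sketched roughly as follows. This is a minimal illustration in Python, not the Elasticsearch implementation: the function name, the constant name, and the error message are hypothetical, and only the rule itself (`from` + `size` must not exceed a per-index maximum that defaults to 10,000) comes from the commit message.

```python
# Hypothetical names; only the default of 10,000 is taken from the commit message.
DEFAULT_MAX_RESULT_WINDOW = 10_000


def check_result_window(from_: int, size: int,
                        max_window: int = DEFAULT_MAX_RESULT_WINDOW) -> None:
    """Raise ValueError when from + size exceeds the allowed result window."""
    if from_ < 0 or size < 0:
        raise ValueError("from and size must be non-negative")
    if from_ + size > max_window:
        raise ValueError(
            f"result window is too large: from + size = {from_ + size}, "
            f"limit = {max_window}"
        )


# Page 1,000 with 10 hits per page ends exactly at the default limit.
check_result_window(from_=9_990, size=10)
```

Requests that need to go deeper than the window would use scrolling instead, which the commit says is not affected by this limit.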
Martijn van Groningen 2eadc6d595 nested sorting: If sorting by nested field then the `nested_path` should always be specified.
Closes #13420
2015-09-10 12:21:12 +02:00
Martijn van Groningen 11c87106ce docs: inner hits is no longer experimental 2015-09-07 16:58:46 +02:00
Adrien Grand 0c26e7cd83 Remove the scan and count search types.
These search types have been deprecated in 2.1 and 2.0 respectively, and will
be removed in 3.0.
2015-09-07 15:18:45 +02:00
Britta Weber 2b27bc11b6 [doc] remove comment about function_score faster than script sort. It is not so. 2015-09-03 12:33:00 +02:00
Adrien Grand bd44dbe5cd Docs: Insist that setting size=0 will help performance. 2015-09-03 09:36:34 +02:00
Michael McCandless 1c85b68674 Don't document expert segment merge settings 2015-08-29 17:21:46 -04:00
Adrien Grand 7b878b5b5c Docs: Document the `_doc` sort order. 2015-08-24 15:39:50 +02:00
Clinton Gormley fb632d5dbe Update completion-suggest.asciidoc
Corrected "length" in result output

Closes #13011
2015-08-24 13:32:49 +02:00
Adrien Grand 6fa258b8fa Deprecate the `scan` search type.
This commit deprecates the `scan` search type in favour of regular scroll
requests sorted by `_doc`.

Related to #12983
2015-08-20 12:47:23 +02:00
Adrien Grand 551e92ec71 Fix documentation: scrolls are not closed automatically.
The documentation states that scrolls are automatically closed when all
documents are consumed, but this is not the case. I first tried to fix
the code to close scrolls automatically but this made REST tests fail
because clearing a scroll that is already closed returned a 4xx error
instead of a 2xx code, so this has probably been this way for a very long
time.
2015-08-20 09:20:40 +02:00
Clinton Gormley c6c3a40cb6 Docs: Updated annotations for 2.0.0-beta1 2015-08-14 10:51:09 +02:00
Clinton Gormley ac2b8951c6 Docs: Mapping docs completely rewritten for 2.0 2015-08-06 17:24:51 +02:00
Michael McCandless ac2e0fd6a0 Remove delete-by-query core docs
We moved delete-by-query from core to a plugin, but forgot to remove the core docs.

Closes #12585
2015-08-01 05:14:46 -04:00
Martijn van Groningen a14913f7b6 Left over from the `query_cache` to `request_cache` rename. 2015-07-27 13:28:15 +02:00
Lee Hinman a8391fcae9 Add _replica and _replica_first as search preference.
Just like specifying `?preference=_primary`, this adds the ability to
specify `?preference=_replica` or `?preference=_replica_first` on
requests that support it.

Resolves #12222
2015-07-16 09:25:23 -06:00
Areek Zillur c62d0b9ee3 Merge pull request #12249 from areek/fix/12228
Clarify docs for transpositions setting in completion suggester
closes #12228
2015-07-15 15:45:52 -04:00
Areek Zillur 8bbd57bcb0 Clarify docs for transpositions setting in completion suggester
closes #12228
2015-07-15 15:43:51 -04:00
markharwood 52fb3c3a09 Docs fix- added performance note about plain highlighter
Closes #11442
2015-07-15 14:28:28 +01:00
Clinton Gormley 2b512f1f29 Docs: Use "js" instead of "json" and "sh" instead of "shell" for source highlighting 2015-07-14 18:14:09 +02:00
Clinton Gormley d9dfa9a24c Merge pull request #12183 from erichard/patch-1
Fix documentation typo
2015-07-10 19:15:56 +02:00
Adrien Grand d7af88631f Merge pull request #11538 from Collaborne/docs-sort-sr-typo
Fix a typo in the documentation: six_hun -> "narrower"
2015-07-08 19:22:03 +02:00
Martijn van Groningen 74cf05595e docs: Fix field stats docs. 2015-07-01 12:05:26 +02:00
Ruslan Boyarskiy e5e422b880 Docs: Update post-filter.asciidoc
Removing useless comma

Closes #11912
2015-07-01 09:32:39 +02:00
Martijn van Groningen ef9d70b9b3 field stats: added index constraints
Field stats index constraints allow omitting all field stats for indices that don't match the constraint. An index
constraint can exclude indices' field stats based on the `min_value` and `max_value` statistics. This option is only
useful if the `level` option is set to `indices`.

For example index constraints can be useful to find out the min and max value of a particular property of your data in
a time based scenario. The following request only returns field stats for the `answer_count` property for indices
holding questions created in the year 2014:

curl -XPOST 'http://localhost:9200/_field_stats?level=indices' -d '{
   "fields" : ["answer_count"], <1>
   "index_constraints" : { <2>
      "creation_date" : { <3>
         "min_value" : { <4>
            "gte" : "2014-01-01T00:00:00.000Z"
         },
         "max_value" : {
            "lt" : "2015-01-01T00:00:00.000Z"
         }
      }
   }
}'

Closes #11187
2015-07-01 08:47:03 +02:00
Colin Goodheart-Smithe d9ab3cba77 Search Templates: Adds API endpoint to render search templates as a response
Closes #6821
2015-06-30 16:57:23 +01:00
Adrien Grand d2f86933cc Merge pull request #11893 from jpountz/fix/rename_cache
Rename caches.
2015-06-29 10:21:18 +02:00
Adrien Grand 38f5cc236a Rename caches.
In order to be more consistent with what they do, the query cache has been
renamed to request cache and the filter cache has been renamed to query
cache.

A known issue is that package/logger names no longer match settings names,
please speak up if you think this is an issue.

Here are the settings for which I kept backward compatibility. Note that they
are a bit different from what was discussed on #11569 but putting `cache` before
the name of what is cached has the benefit of making these settings consistent
with the fielddata cache whose size is configured by
`indices.fielddata.cache.size`:
 * index.cache.query.enable -> index.requests.cache.enable
 * indices.cache.query.size -> indices.requests.cache.size
 * indices.cache.filter.size -> indices.queries.cache.size

Close #11569
2015-06-29 10:15:27 +02:00
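
The three backward-compatible renames listed in the commit above can be expressed as a lookup table. The sketch below is only an illustration in Python: `RENAMED_SETTINGS` and `migrate_settings` are hypothetical names, and the setting keys are copied from the commit message.

```python
# Old setting name -> new setting name, as listed in the commit message.
RENAMED_SETTINGS = {
    "index.cache.query.enable": "index.requests.cache.enable",
    "indices.cache.query.size": "indices.requests.cache.size",
    "indices.cache.filter.size": "indices.queries.cache.size",
}


def migrate_settings(settings: dict) -> dict:
    """Return a copy of `settings` with renamed keys replaced by their new names."""
    return {RENAMED_SETTINGS.get(key, key): value for key, value in settings.items()}


old = {"indices.cache.filter.size": "10%", "indices.fielddata.cache.size": "20%"}
new = migrate_settings(old)
```

Keys that are not in the table, such as `indices.fielddata.cache.size`, pass through unchanged.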
Clinton Gormley f19a748d3c Docs: Move field highlight order to the highlight page 2015-06-26 17:36:48 +02:00
Martijn van Groningen fe330b868a percolator: Fail nicely if `nested` query with `inner_hits` is used in a percolator query.
Closes #11672
2015-06-23 15:03:31 +02:00
Clinton Gormley f123a53d72 Docs: Refactored modules and index modules sections 2015-06-22 23:49:45 +02:00
Clinton Gormley d6ba3226d6 Docs: Add missing quotes in phrase suggest 2015-06-19 16:56:25 +02:00
Nirmal Chidambaram 72a9d34eb8 5925 - Allow node specification in preference
-Allow node selector api's with new preference
ONLY_NODES ( selector apis like https://www.elastic.co/guide/en/elasticsearch/reference/current/cluster.html)

-Update documentation
2015-06-16 11:49:12 -05:00
Andreas Kohn 1c0ad8c724 Fix a typo in the documentation: six_hun -> "narrower"
This was introduced in https://github.com/elastic/elasticsearch.github.com/commit/defaf4f0, probably
as a search-and-replace mistake.
2015-06-08 18:07:52 +02:00
Areek Zillur fb8cd53582 This commit removes the ability to use `filter` for PhraseSuggester collate.
Only `query` can be used for collation.

Internally, a collate query is executed as an exists query. So specifying a
filter does not have any benefits.
2015-05-29 12:26:08 -04:00
Colin Goodheart-Smithe 35a58d874e Scripting: Unify script and template requests across codebase
This change unifies the way scripts and templates are specified for all instances in the codebase. It builds on the Script class added previously and adds request building and parsing support as well as the ability to transfer script objects between nodes. It also adds a Template class which aims to provide the same functionality for template APIs

Closes #11091
2015-05-29 16:52:04 +01:00
Eduardo Gurgel 0f3b3c0787 Docs: Fix typo on percolate_format description
Closes #11215
2015-05-25 13:17:59 +02:00
Clinton Gormley 4d27d751fb Docs: Move the page on facets into redirects.asciidoc 2015-05-24 23:34:23 +02:00
Clinton Gormley 4b854d10bd Docs: Tidied up the field statistics docs 2015-05-24 15:12:44 +02:00
javanna a843008b17 Highlighting: require_field_match set to true by default
The default `false` for `require_field_match` is a bit odd and confusing for users, given that field names get ignored by default and every field gets highlighted if it contains terms extracted out of the query, regardless of which fields were queried. Changed the default to `true`; it can always be changed per request.

Closes #10627
Closes #11067
2015-05-15 21:38:45 +02:00
javanna 46c521f7ec Highlighting: nuke XPostingsHighlighter
Our own fork of the lucene PostingsHighlighter is not easy to maintain and doesn't give us any added value at this point. In particular, it was introduced to support the require_field_match option and discrete per value highlighting, used in case one wants to highlight the whole content of a field, but get back one snippet per value. These two features won't make it into lucene as they slow things down and shouldn't have been supported from day one on our end probably.

One other customization we had was support for a wider range of queries via custom rewrite etc. (yet another way to slow things down), which got added to lucene and works much much better than what we used to do (instead of or rewrite, terms are pulled out of the automata for multi term queries).

Removing our fork means the following in terms of features:
- dropped support for require_field_match: the postings highlighter will only highlight fields that were queried
- some custom es queries won't be supported anymore, meaning they won't be highlighted. The only one I found up until now is the phrase_prefix. Postings highlighter rewrites against an empty reader to avoid slow operations (like the ones that we were performing with the fork that we are removing here), thus the prefix will not be expanded to any term. What the postings highlighter does instead is pulling the automata out of multi term queries, but this is not supported at the moment with our MultiPhrasePrefixQuery.

Closes #10625
Closes #11077
2015-05-15 20:41:33 +02:00
Areek Zillur 7efc43db25 Re-structure collate option in PhraseSuggester to only collate on local shard.
Previously, the collate feature was executed on all shards of an index using the client.
This led to a deadlock when concurrent collate requests were run from the _search API,
because both the external request and the internal collate requests use the
same search threadpool.

As phrase suggestions are generated from the terms of the local shard, in most cases a
generated suggestion that does not yield a hit for the collate query on the local shard
would not yield a hit for the collate query on non-local shards either.

Instead of using the client for collating suggestions, collate query is executed against
the ContextIndexSearcher. This PR removes the ability to specify a preference for a collate
query, as the collate query is only run on the local shard.

closes #9377
2015-05-14 17:21:53 -04:00
Jack Conradson a5c0ac0d67 Scripting: Add Multi-Valued Field Methods to Expressions
Add methods to operate on multi-valued fields in the expressions language.
Note that users will still not be able to access individual values
within a multi-valued field.

The following methods will be included:

* min
* max
* avg
* median
* count
* sum

Additionally, changes have been made to MultiValueMode to support the
new median method.

closes #11105
2015-05-14 08:27:24 -07:00
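
As a rough illustration of what the six methods from the commit above compute over one multi-valued field, here is a Python sketch. It is not the expressions language itself, and `summarize` is a hypothetical helper. How the real `median` treats an even number of values is not stated in the commit message, so the example uses an odd count.

```python
from statistics import mean, median


def summarize(values: list) -> dict:
    """Compute the six aggregate functions over the values of one field."""
    return {
        "min": min(values),
        "max": max(values),
        "avg": mean(values),
        "median": median(values),
        "count": len(values),
        "sum": sum(values),
    }


# Three values of a single multi-valued numeric field in one document.
stats = summarize([1.0, 2.0, 6.0])
```

Each result is a single number, which fits the commit's note that individual values inside the field remain inaccessible.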
javanna 36c373e615 [DOCS] documented missing query_string parameters for count, exists, search & validate_query
relates to #11057
2015-05-11 12:58:30 +02:00
Martijn van Groningen acdd9a5dd9 parent/child: Removed the `top_children` query. 2015-05-10 16:30:19 +02:00
Andrew Selden c953e99324 Merge pull request #10864 from aleph-zero/issues/9606
Remove (dfs_)query_and_fetch from the REST API
2015-05-07 12:51:28 -07:00
josephwolnskipn 7f064c592f Docs: Fix grammar and typos in percolate
Added commas, capitalized "JSON" and "API", capitalized titles, etc.

Closes #11023
2015-05-07 21:50:48 +02:00
Alex Ksikes ec4f12f9ef More Like This: removal of the MLT API
Removes the More Like This API, users should now use the More Like This query.
The MLT API tests were converted to their query equivalent. Also some clean
ups in MLT tests.

Closes #10736
Closes #11003
2015-05-06 18:11:11 +02:00
Pascal Borreli af6d890ad5 Docs: Fixed typos
Closes #10973
2015-05-05 10:38:05 +02:00
aleph-zero 2b483cc806 Removed reference to search type 'count'
Removed reference to search type 'count' as this is now a deprecated
search type.
2015-05-04 14:48:40 -07:00
Zachary Tong e3ae1df6f0 [DOCS] Restructure Aggs documentation 2015-05-01 16:04:55 -04:00
Adrien Grand e5be85d586 Aggs: Change the default `min_doc_count` to 0 on histograms.
The assumption is that gaps in histogram are generally undesirable, for instance
if you want to build a visualization from it. Additionally, we are building new
aggregations that require that there are no gaps to work correctly (eg.
derivatives).
2015-04-30 15:48:23 +02:00
Colin Goodheart-Smithe 969f53e399 fix typo in Min bucket aggregation docs 2015-04-30 14:41:01 +01:00
Colin Goodheart-Smithe d16bf992a9 Aggregations: min_bucket aggregation
An aggregation to calculate the minimum value in a set of buckets.

Closes #9999
2015-04-30 13:34:21 +01:00
Zachary Tong 351a4d3315 [DOCS] Fix movavg images and naming 2015-04-29 13:33:54 -04:00
Colin Goodheart-Smithe 57a8885964 Merge branch 'master' into feature/aggs_2_0
# Conflicts:
#	src/main/java/org/elasticsearch/index/query/CommonTermsQueryBuilder.java
#	src/main/java/org/elasticsearch/search/aggregations/AggregationModule.java
#	src/main/java/org/elasticsearch/search/aggregations/AggregatorFactories.java
#	src/main/java/org/elasticsearch/search/aggregations/AggregatorParsers.java
#	src/main/java/org/elasticsearch/search/aggregations/InternalMultiBucketAggregation.java
#	src/main/java/org/elasticsearch/search/aggregations/bucket/nested/NestedAggregator.java
#	src/main/java/org/elasticsearch/search/aggregations/metrics/InternalNumericMetricsAggregation.java
#	src/test/java/org/elasticsearch/search/aggregations/bucket/nested/NestedAggregatorTest.java
2015-04-29 15:49:41 +01:00
Antonio Bonuccelli ab83eb036b Docs: adding missing single quote on PUT index request
Closes #10876
2015-04-29 14:45:25 +02:00
aleph-zero 1d60f34944 Remove all doc references to (dfs_)query_and_fetch
Removes references to (dfs_)query_and_fetch as possible ‘search_type’
parameters for the REST API.
2015-04-28 15:57:46 -07:00
aleph-zero 89542facb3 Remove (dfs_)query_and_fetch from the REST API
Remove the ability to specify search type ‘query_and_fetch’ and
‘dfs_query_and_fetch’ from the REST API.

- Adds REST tests
- Updates REST API spec to remove ‘query_and_fetch’ and
‘dfs_query_and_fetch’ as options
- Removes documentation for these options

Closes #9606
2015-04-28 15:27:59 -07:00
Zachary Tong bf9739d0f0 [DOCS] review comment fixes 2015-04-27 14:40:04 -04:00
Clinton Gormley 37ed61807f Docs: Updated the experimental annotations in the docs as follows:
* Removed the docs for `index.compound_format` and `index.compound_on_flush` - these are expert settings which should probably be removed (see https://github.com/elastic/elasticsearch/issues/10778)
* Removed the docs for `index.index_concurrency` - another expert setting
* Labelled the segments verbose output as experimental
* Marked the `compression`, `precision_threshold` and `rehash` options as experimental in the cardinality and percentile aggs
* Improved the experimental text on `significant_terms`, `execution_hint` in the terms agg, and `terminate_after` param on count and search
* Removed the experimental flag on the `geobounds` agg
* Marked the settings in the `merge` and `store` modules as experimental, rather than the modules themselves

Closes #10782
2015-04-26 18:49:15 +02:00
Clinton Gormley f1a0e2216a Docs: Mentioned script_id and script_file parameters across all aggs
Closes #10760
2015-04-26 17:30:38 +02:00
Clinton Gormley 7de8b7008e Docs: Tidied docs for field-stats 2015-04-26 15:52:02 +02:00
Mehdi Mollaverdi dce920b75f Docs: The name of scroll ID attribute in the response is "_scroll_id" rather than "scroll_id"
Closes #10691
2015-04-25 19:32:32 +02:00
Mal Curtis 9eabcd7c0f Docs: Fix missing comma in context suggester docs
Closes #10623
2015-04-23 14:04:46 +02:00
Martijn van Groningen dbeb4aaacf docs: make sure that the options are rendered correctly 2015-04-23 10:50:01 +02:00
Martijn van Groningen 6a2f9c2682 docs: fixed title out of sequence 2015-04-23 09:57:31 +02:00
Martijn van Groningen 5705537ecf Added field stats api
The field stats api returns field level statistics such as lowest, highest values and number of documents that have at least one value for a field.

An api like this can be useful to explore a data set you don't know much about. For example you can figure out what the lowest and highest response times are, so that you can create a histogram or range aggregation with sane settings.

This api doesn't run a search to figure these statistics out, but rather uses the Lucene index to look them up (using the Terms class in Lucene). So finding out these stats for fields is cheap and quick.

The min/max values are based on the type of the field. For a numeric field the min/max are numbers, for a date field they are dates, and for other fields they are term based.

Closes #10523
2015-04-23 08:52:34 +02:00
Zachary Tong e08e45cee8 [DOCS] Add link to movavg page 2015-04-22 18:59:39 -04:00
Zachary Tong a03cefcece [DOCS] Add documentation for moving average 2015-04-22 18:59:39 -04:00
Clinton Gormley a60571c597 Docs: Removed some unused callout from the scroll docs 2015-04-22 12:49:06 +02:00
Jun Ohtani 0955c127c0 Rest: Add json in request body to scroll, clear scroll, and analyze API
Change analyze.asciidoc and scroll.asciidoc
Add json support to Analyze and Scroll, and clear scroll API
Add rest-api-spec/test

Closes #5866
2015-04-22 17:53:20 +09:00
Colin Goodheart-Smithe bd28c9c44e Documentation for the max_bucket reducer 2015-04-21 15:06:20 +01:00
Colin Goodheart-Smithe be647a89d3 Documentation for the derivative reducer 2015-04-21 15:06:20 +01:00
Colin Goodheart-Smithe 0f4b7f3b5c Added section for reducer aggregations in the main aggregation docs page 2015-04-21 15:06:19 +01:00
markharwood 63db34f649 New feature - Sampler aggregation used to limit any nested aggregations' processing to a sample of the top-scoring documents.
Optionally, a “diversify” setting can limit the number of collected matches that share a common value such as an "author".

Closes #8108
2015-04-21 10:22:05 +01:00
Adrien Grand f4d5914511 Docs: Warn about the fact that min_doc_count=0 might return terms that only belong to different types. 2015-04-21 00:57:57 +02:00
Honza Král e929c1560d [DOCS] Be explicit about scan doing no scoring 2015-04-20 18:05:45 +02:00
Alex Ksikes c347dfe91c Validate API: support for verbose explanation of successfully validated queries
This commit adds a `rewrite` parameter to the validate API in order to show
how the given query is re-written into primitive queries. For example, an MLT
query is re-written into a disjunction of the selected terms. Other use cases
include `fuzzy`, `common_terms`, or `match` query especially with a
`cutoff_frequency` parameter. Note that the explanation is given for a
single randomly chosen shard only, so the output may vary from one shard to
another.

Relates #1412
Closes #10147
2015-04-13 19:17:58 +02:00
Clinton Gormley abc7de96ae Docs: Updated version annotations in master 2015-04-09 14:50:11 +02:00
Adrien Grand aecd9ac515 Aggregations: Speed up include/exclude in terms aggregations with regexps.
Today we check every regular expression eagerly against every possible term.
This can be very slow if you have lots of unique terms, and even the bottleneck
if your query is selective.

This commit switches to Lucene regular expressions instead of Java (not exactly
the same syntax yet most existing regular expressions should keep working) and
uses the same logic as RegExpQuery to intersect the regular expression with the
terms dictionary. I wrote a quick benchmark (in the PR) to make sure it made
things faster and the same request that took 750ms on master now takes 74ms with
this change.

Close #7526
2015-04-09 12:12:56 +02:00
marko asplund 5585175173 Docs: fix typos in example JSON data
Closes #10479
2015-04-08 13:40:35 +02:00
Adrien Grand a608db122d Search: Remove the `count` search type.
This commit brings the benefits of the `count` search type to search requests
that have a `size` of 0:
 - a single round-trip to shards (no fetch phase)
 - ability to use the query cache

Since `count` now provides no benefits over `query_then_fetch`, it has been
deprecated.

Close #7630
2015-03-31 11:31:49 +02:00
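
Following the commit above, a request that used to rely on the `count` search type can set `size` to 0 in the request body instead. The Python sketch below only builds such a body. `as_size_zero_request` is a hypothetical helper, and the example query is illustrative rather than taken from the commit.

```python
import copy


def as_size_zero_request(body: dict) -> dict:
    """Return a copy of a search body that asks for no hits, only totals/aggregations."""
    request = copy.deepcopy(body)
    request["size"] = 0
    return request


original = {"query": {"match": {"foo": "bar"}}, "size": 10}
counting = as_size_zero_request(original)
```

The original body is left untouched, so the same query can still be sent with its regular `size` when hits are needed.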
olivier bourgain 00a9db73ae [DOCS] Fix multi percolate response sample in percolate.asciidoc 2015-03-30 11:32:41 +02:00
javanna d9d1e6a67a Scripting: add support for fine-grained settings
Allow turning scripting on or off based on a script's source (where it gets loaded from), the operation that executes it and its language.

The settings cover the following combinations:

- mode: on, off, sandbox
- source: indexed, dynamic, file
- engine: groovy, expressions, mustache, etc
- operation: update, search, aggs, mapping

The following settings are supported for every engine:

script.engine.groovy.indexed.update:    sandbox/on/off
script.engine.groovy.indexed.search:    sandbox/on/off
script.engine.groovy.indexed.aggs:      sandbox/on/off
script.engine.groovy.indexed.mapping:   sandbox/on/off
script.engine.groovy.dynamic.update:    sandbox/on/off
script.engine.groovy.dynamic.search:    sandbox/on/off
script.engine.groovy.dynamic.aggs:      sandbox/on/off
script.engine.groovy.dynamic.mapping:   sandbox/on/off
script.engine.groovy.file.update:       sandbox/on/off
script.engine.groovy.file.search:       sandbox/on/off
script.engine.groovy.file.aggs:         sandbox/on/off
script.engine.groovy.file.mapping:      sandbox/on/off

For ease of use, the following more generic settings are supported too:

script.indexed: sandbox/on/off
script.dynamic: sandbox/on/off
script.file:    sandbox/on/off

script.update:  sandbox/on/off
script.search:  sandbox/on/off
script.aggs:    sandbox/on/off
script.mapping: sandbox/on/off

These will be used to calculate the more specific settings, using the stricter setting of each combination. Operation based settings have precedence over conflicting source based ones.

Note that the `mustache` engine is affected by generic settings applied to any language, while native scripts aren't as they are static by definition.

Also, the previous `script.disable_dynamic` setting can now be deprecated.

Closes #6418
Closes #10116
Closes #10274
2015-03-26 19:56:55 +01:00
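
The twelve engine-specific settings listed in the commit above follow the pattern `script.engine.<engine>.<source>.<operation>`. A small Python sketch can enumerate them. `engine_setting_keys` is a hypothetical helper, and the source and operation names are copied from the commit message.

```python
from itertools import product

# Sources and operations as listed in the commit message.
SOURCES = ("indexed", "dynamic", "file")
OPERATIONS = ("update", "search", "aggs", "mapping")


def engine_setting_keys(engine: str) -> list:
    """List every fine-grained setting key for one script engine."""
    return [
        f"script.engine.{engine}.{source}.{operation}"
        for source, operation in product(SOURCES, OPERATIONS)
    ]


groovy_keys = engine_setting_keys("groovy")
```

Each key accepts `sandbox`, `on`, or `off`. How the generic `script.<source>` and `script.<operation>` settings combine into these specific ones is described in the commit message and is not modelled here.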
Boaz Leskes 4970e3e225 Revert "Rest: Add json in request body to scroll, clear scroll, and analyze API"
This reverts commit 16083d454c.
2015-03-23 12:57:19 +01:00
Jun Ohtani 16083d454c Rest: Add json in request body to scroll, clear scroll, and analyze API
Add json support to scroll, clear scroll, and analyze

Closes #5866
2015-03-23 15:35:38 +09:00
Simon Willnauer 7257345db9 Revert Benchmark API
The benchmark api is being worked on feature/bench branch and will be merged from there when ready.
2015-03-21 10:36:04 +01:00
Asimov4 649e3aa4c5 [DOCS] Fix typos in percolate.asciidoc 2015-03-21 10:23:15 +01:00
Martijn van Groningen 4393939f5e inner_hits: Nested parent field should be resolved based on the parent inner hit definition, instead of the nested parent field in the mapping.
The behaviour is better in the case someone has multiple levels of nested object fields defined in the mapping and would like to define a single inner_hits definition that is two or more levels deep.

If someone wants inner hits on a nested field that is 2 levels deep the following would need to be defined:

```
{
  ...
  "inner_hits" : {
     "path" : {
        "level1" : {
            "inner_hits" : {
               "path" : {
                  "level2" : {
                     "query" : { .... }
                  }
               }
            }
        }
     }
  }
}
```

With this change the above can be defined as:

```
{
  ...
  "inner_hits" : {
     "path" : {
        "level1.level2" : {
            "query" : { .... }
        }
     }
  }
}
```

Closes #9251
2015-03-16 16:31:03 -07:00
Lee Hinman 6aec68cd29 Revert "[QUERY] Remove lowercase_expanded_terms and locale options"
This reverts commit d1f7bd97cb.

Ryan pointed out that this needs to work with the multi term query, so
additional analysis and tests should be added.
2015-03-13 13:51:44 -06:00
Lee Hinman d1f7bd97cb [QUERY] Remove lowercase_expanded_terms and locale options
The analysis chain should be used instead of relying on this, as it is
confusing when dealing with different per-field analysers.

The `locale` option was only used for `lowercase_expanded_terms`, which,
once removed, is no longer needed, so it was removed as well.

Fixes #9978
Relates to #9973
2015-03-13 13:17:27 -06:00
olivier bourgain bcb4decca9 [DOCS] add missing comma in percentile_rank aggregation example 2015-03-10 08:21:06 -07:00
olivier bourgain fb7cd2ea9a [DOCS] Adjusted geo_distance aggregation example
unit is not returned in the response, but we have key and an implicit from starting at 0 for the first bucket
2015-03-10 08:20:20 -07:00
olivier bourgain eaeddc6bd4 [DOCS] missing curly brace in ip_range aggregation example 2015-03-10 08:19:57 -07:00
Britta Weber 580728dfd6 significant terms: add scriptable significance heuristic
This commit adds scripting capability to significant_terms.
Custom heuristics can be implemented with a script that provides
parameters subset_freq, superset_freq, subset_size, superset_size.

closes #7850
2015-03-06 17:06:04 +01:00
Clinton Gormley c223ed0db4 Update search-type.asciidoc
Changed search_type docs to reflect that the `(dfs_)query_and_fetch` modes are an internal optimization and should not be specified explicitly by the user.

Relates to #9606
2015-03-02 10:55:22 +01:00
Geoff Bourne 0e09c02c56 Spelling out the sort order options
Closes #9768
2015-03-01 21:05:52 +01:00
Clinton Gormley e194fb3a07 Docs: Default distance unit in geo distance agg is metres, not km
Closes #9812
2015-02-28 01:45:29 +01:00
Colin Goodheart-Smithe 2520dc78ec [DOCS] added a note for the default shard_size value 2015-02-25 11:00:55 +00:00
markharwood 29b1902cfb New aggregations feature - “PercentageScore” heuristic for significant_terms aggregation provides simple “per-capita” type measures.
Closes #9720
2015-02-20 13:22:08 +00:00
Christoph Büscher 30fd70f07b Aggregations: Simplify time zone option in `date_histogram`
Removed the existing `pre_zone` and `post_zone` option in `date_histogram` in favor of
the simpler `time_zone` option. Previously, specifying different values for these could
lead to confusing scenarios where ES would return bucket keys that are not UTC.
Now `time_zone` is the only time zone option: date buckets are calculated in the
preferred time zone, and after rounding the bucket key values are converted back to UTC.
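
The round-in-local-time, report-in-UTC behaviour can be sketched in Python as
follows. This is only an illustration for day-sized buckets with a fixed UTC
offset; the helper name `day_bucket_key_utc` is hypothetical and the real
implementation is not shown here:

```python
from datetime import datetime, timedelta, timezone

def day_bucket_key_utc(ts_utc, tz):
    """Round `ts_utc` down to midnight in `tz`, then express that instant in UTC."""
    local = ts_utc.astimezone(tz)
    local_midnight = local.replace(hour=0, minute=0, second=0, microsecond=0)
    return local_midnight.astimezone(timezone.utc)

# 23:30 UTC is already 00:30 the next day at UTC+01:00, so the bucket
# starts at that local midnight, which is 23:00 UTC.
plus_one = timezone(timedelta(hours=1))
key = day_bucket_key_utc(datetime(2015, 2, 16, 23, 30, tzinfo=timezone.utc), plus_one)
```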

Closes #9062
Closes #9637
2015-02-16 16:54:06 +01:00
Clinton Gormley 6fadeeca56 Updated doc annotations for 1.4.3 2015-02-11 17:54:53 +01:00
Christoph Büscher d2f852a274 Aggregations: Add 'offset' option to date_histogram, replacing 'pre_offset' and 'post_offset'
Add offset option to 'date_histogram' replacing and simplifying the previous 'pre_offset' and 'post_offset' options.
This change is part of a larger clean up task for `date_histogram` from issue #9062.
2015-02-09 14:03:28 +01:00
Adrien Grand 95f46f1212 Docs: Use the new experimental annotation.
We now have a very useful annotation to mark features or parameters as
experimental. Let's use it! This commit replaces some custom text warnings with
this annotation and adds this annotation to some existing features/parameters:
 - inner_hits (unreleased yet)
 - terminate_after (released in 1.4)
 - per-bucket doc count errors in the terms agg (released in 1.4)

I also tagged with this annotation settings which should either be not needed
(like the ability to evict entries from the filter cache based on time) or that
are too deep into the way that Elasticsearch works like the Directory
implementation or merge settings.

Close #9563
2015-02-05 15:29:45 +01:00
Adrien Grand 3a486066fd Docs: Remove the experimental status of the cardinality and percentiles(-ranks) aggregations
These aggregations are not experimental anymore but some of their parameters
still are:
 - `precision_threshold` and `rehash` on `cardinality`
 - `compression` on percentiles(-ranks)

Close #9560
2015-02-05 15:18:40 +01:00
Christoph Büscher 44193e7ba5 Aggregations: Add 'offset' option to histogram aggregation
Histogram aggregation supports an 'offset' option to move bucket boundaries.
In a histogram with buckets of size X these can be moved from 0, X, 2X, 3X,...
by an offset value of Y to Y, X+Y, 2X+Y, 3X+Y... by using the 'offset' option.
The previous 'pre_offset' and 'post_offset' options are removed in favour of
the simplified 'offset' option.
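
The boundary shift described above can be sketched in Python. This is a
minimal illustration of the arithmetic, not the actual implementation; the
function name `bucket_key` is made up for the example:

```python
import math

def bucket_key(value, interval, offset=0):
    """Lower bound of the bucket containing `value` for buckets of size `interval`."""
    # Shift by the offset, round down to a multiple of the interval, shift back.
    return math.floor((value - offset) / interval) * interval + offset
```

With `interval=5`, a value of 7 falls in the bucket starting at 5; with
`offset=3` the boundaries become 3, 8, 13, ... and the same value falls in the
bucket starting at 3.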

Closes #9417
Closes #9505
2015-02-02 18:23:01 +01:00
Oliver e412dab63a Docs: Fix sample query
Closes #9472
2015-01-29 15:56:24 +01:00
Ryan Ernst afcedb94ed Mappings: Remove `index_analyzer` setting to simplify analyzer logic
The `analyzer` setting is now the base setting, and `search_analyzer`
is simply an override of the search time analyzer.  When setting
`search_analyzer`, `analyzer` must be set.

closes #9371
2015-01-28 13:43:15 -08:00
Zachary Tong a4eb1d5505 Aggregations: Add standard deviation bounds to extended_stats
Extended_stats now displays the upper and lower bounds on standard deviations (e.g. avg +/- std).
Default is to show 2 std above/below, but can be changed using the `sigma` parameter.
Accepts non-negative doubles
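
A minimal Python sketch of the bounds calculation, for illustration only. It
assumes the population standard deviation is used, and the function and key
names are hypothetical:

```python
import statistics

def std_deviation_bounds(values, sigma=2.0):
    """Return avg +/- sigma * std for the given values."""
    if sigma < 0:
        raise ValueError("sigma must be non-negative")
    avg = statistics.fmean(values)
    std = statistics.pstdev(values)  # assumption: population standard deviation
    return {"upper": avg + sigma * std, "lower": avg - sigma * std}
```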

Closes #9356
2015-01-28 11:47:20 -05:00
eBuildy 85ef44fd73 Docs: Fix missing comma and boolean true
Closes #9350
2015-01-19 21:31:29 +01:00
Martijn van Groningen 8e0292b1aa docs: fix inner hits snippet 2015-01-19 18:56:45 +01:00
sweetest eaa1674d6d Introduce index option named 'index.percolator.map_unmapped_fields_as_string', that handles unmapped fields in percolator queries as type string.
Closes #9053
Closes #9054
2015-01-19 09:51:10 +01:00
David Pilato fc7a0d3a4a [Docs] fix three to four 2015-01-12 12:13:23 +01:00
Martijn van Groningen d8054ec299 inner_hits: Added another more compact syntax for inner hits.
Closes #8770
2014-12-24 17:41:35 +01:00
Ryan Ernst 39b3613420 Fix date histogram docs grammar. 2014-12-23 10:19:55 -08:00
Yasir Bamarni 5059d6fe1c Update percolate.asciidoc
wrong type used in the -GET request

Closes #8942
2014-12-17 14:05:27 +01:00
Ayush 23dbecf3e7 Update percolate.asciidoc
Updating the `associated` spelling

Closes #8907
2014-12-15 14:12:03 +01:00
Adam Menges 3a3030e217 Docs: Fix the wording for inner hits a bit
Closes #8747
2014-12-09 13:36:26 +01:00
Martijn van Groningen d7e224da04 Added `inner_hits` feature that allows to include nested hits.
Inner hits allows to embed nested inner objects, children documents or the parent document that contributed to the matching of the returned search hit as inner hits, which would otherwise be hidden.

Closes #8153
Closes #3022
Closes #3152
2014-12-02 12:01:01 +01:00
Clinton Gormley 88e06cba80 Update daterange-aggregation.asciidoc
Clarified the date-math expressions on date range aggregations

Closes #8703
2014-11-28 16:53:33 +01:00
David Pilato 43a1435d3b [Docs] fix consistency between examples 2014-11-27 20:29:34 +01:00
David Pilato 40f0e07db3 [Docs] Fix missing new line 2014-11-27 19:39:12 +01:00
David Pilato da27c2104a [Docs] Fix missing comma in mapping 2014-11-27 11:03:19 +01:00
David Haney 2c429452e9 Typo: changed "5% or the real words" to "5% of the real words"
Closes #8582
2014-11-25 13:15:33 +01:00
barbasa fd6c41bfbf Missing quote in the example 2014-11-23 14:03:58 +01:00
Boaz Leskes 1e16375d04 Docs: Update execution hint docs for Significant terms agg
copied over the relevant pieces from the terms agg

Closes #8532
2014-11-18 20:54:26 +01:00
Joel Taddei 7e72800c83 [DOCS] Corrected syntax error in search curl cmd
Closes #8447
2014-11-12 17:21:19 +01:00
Clinton Gormley cff544dcc2 Docs: Removed old coming/added tags 2014-11-10 14:41:24 +01:00
Veres Lajos 4059e4ac86 typo fixes - https://github.com/vlajos/misspell_fixer
Closes #8323
2014-11-08 18:55:57 +01:00
Clinton Gormley 08aa715d2e Update datehistogram-aggregation.asciidoc
Clarified use of fractional time units in the date histo agg.

Closes #7957
2014-11-08 17:49:34 +01:00
Martijn Laarman 82278bb7bc [Aggregations] Meta data support
This commit adds the ability to associate a bit of state with each
individual aggregation.

The aggregation response can be hard to stitch back together without
having a reference to the aggregation request. In many cases this is not
available, many json serializer frameworks cache types globally or have a
static deserialisation override mechanism. In these cases making the
original request available, if at all possible, would be a hack.

The old facets returned `_type` which was just enough metadata to know
what the originating facet type in the request was.

This PR takes `_type` one step further by introducing ANY arbitrary meta
data. This could be further <strike>ab</strike>used for instance by
generic/automated aggregations that include UI state (color information,
thresholds, user input states, etc) per aggregation.
2014-11-03 22:32:23 +01:00
Clinton Gormley e56d85439c Update search-template.asciidoc
Clarified using the conditional clause template example as a string
2014-10-31 15:32:14 +01:00
Clinton Gormley 2569188d25 Update search-template.asciidoc
Fixed asciidoc typo

Closes #8308
2014-10-31 14:40:32 +01:00
Areek Zillur 96f1606cdc Completion Suggester: Fix CompletionFieldMapper to correctly parse weight
- Allows weight to be defined as a string representation of a positive integer

closes #8090
2014-10-28 18:39:02 -04:00
Adrien Grand 7ea490dfd1 Aggregations: Return the sum of the doc counts of other buckets.
This commit adds a new field to the response of the terms aggregation called
`sum_other_doc_count` which is equal to the sum of the doc counts of the buckets
that did not make it to the list of top buckets. It is typically useful to have
a sector called eg. `other` when using terms aggregations to build pie charts.

Example query and response:

```json
GET test/_search?search_type=count
{
  "aggs": {
    "colors": {
      "terms": {
        "field": "color",
        "size": 3
      }
    }
  }
}
```

```json
{
   [...],
   "aggregations": {
      "colors": {
         "doc_count_error_upper_bound": 0,
         "sum_other_doc_count": 4,
         "buckets": [
            {
               "key": "blue",
               "doc_count": 65
            },
            {
               "key": "red",
               "doc_count": 14
            },
            {
               "key": "brown",
               "doc_count": 3
            }
         ]
      }
   }
}
```
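
On the client side, building the pie chart sectors from such a response might
look like this Python sketch. The helper name `pie_slices` and the `other`
label are illustrative choices, not part of the API:

```python
def pie_slices(terms_agg, other_label="other"):
    """Map each returned bucket to its doc count and add a sector for the rest."""
    slices = {bucket["key"]: bucket["doc_count"] for bucket in terms_agg["buckets"]}
    other = terms_agg.get("sum_other_doc_count", 0)
    if other > 0:
        slices[other_label] = other
    return slices

colors = {
    "doc_count_error_upper_bound": 0,
    "sum_other_doc_count": 4,
    "buckets": [
        {"key": "blue", "doc_count": 65},
        {"key": "red", "doc_count": 14},
        {"key": "brown", "doc_count": 3},
    ],
}
slices = pie_slices(colors)
```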

Close #8213
2014-10-27 12:11:26 +01:00
Brian Kim 58086dd08b Docs: missing quote
fix missing quote

Closes #8176
2014-10-21 12:52:12 +02:00
Michael McCandless 85065f9c8e Core: cutover to Lucene's query rescorer
This is functionally equivalent to before, so there should be no
user-visible impact, except I added a NOTE in the docs warning about
the interaction of pagination and rescoring.

Closes #6232

Closes #7707
2014-10-18 05:25:50 -04:00
Sergii Golubev 028a2b732a Docs: Percolate reference - a typo and a misused word
Closes #8116
2014-10-17 15:26:29 +02:00
Sergii Golubev ae923a81b9 Docs: Percolate `_score` reference
Added missing `_score` word, made the sentence less ambiguous.

Closes #8115
2014-10-17 15:25:02 +02:00
Andrew O'Brien 33097d901b Docs: Typo: s/by/be/
Closes #8114
2014-10-16 20:51:58 +02:00
Son 6f3227db01 Docs: Fix order for PUT _mapping docs
Closes #8083
2014-10-16 18:49:36 +02:00
Clinton Gormley 6a180d1803 Docs: Update highlighting.asciidoc
Added note about how to highlight on the `_all` field

Closes #7991
2014-10-15 13:45:56 +02:00
Clinton Gormley 7e916d0b8b Update completion-suggest.asciidoc
Documented the `size` parameter in the completion suggester query
2014-10-14 18:47:32 +02:00
Martijn van Groningen 5763b24686 Core: Make fetch phase nested doc aware
By letting the fetch phase understand the nested docs structure we can serve nested docs as hits.
The `top_hits` aggregation can because of this commit be placed in a `nested` or `reverse_nested` aggregation.

Closes #7164
2014-10-08 22:21:30 +02:00
Colin Goodheart-Smithe 6cf371395a Aggregations: makes script params consistent with other APIs in scripted_metric
This change removes the script_type parameter from the Scripted Metric Aggregation and adds support for _file and _id suffixes to the init_script, map_script, combine_script and reduce_script parameters to make defining the source of the script consistent with the other APIs which use the ScriptService
2014-10-06 09:07:25 +01:00
mdzor 4b3f66e585 Update suggesters.asciidoc
A request was malformed

Closes #7867
2014-09-28 11:04:28 +02:00
Clinton Gormley cb00d4a542 Docs: Removed all the added/deprecated tags from 1.x 2014-09-26 21:04:42 +02:00
Colin Goodheart-Smithe 8a70b115f2 Aggregations: More consistent response format for scripted metrics aggregation
Changes the name of the field in the scripted metrics aggregation from 'aggregation' to 'value' to be more in line with the other metrics aggregations like 'avg'
2014-09-17 11:46:26 +01:00
Jordan Snodgrass 6246aac9ab Docs: Indicate that the Children Aggregation is coming in 1.4.0 2014-09-17 09:22:02 +02:00
Colin Goodheart-Smithe d4e83df3b8 Aggregations: Adds ability to sort on multiple criteria
The terms aggregation can now support sorting on multiple criteria by replacing the sort object with an array of sort objects whose order signifies the priority of the sort. The existing syntax for sorting on a single criterion also still works.
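
The priority semantics can be illustrated with a small Python sketch that
orders buckets by several criteria. The data layout and the function name
`sort_buckets` are assumptions made for the example, not the aggregation's
internals:

```python
def sort_buckets(buckets, order):
    """Sort buckets by a list of (field, "asc" | "desc") pairs, highest priority first."""
    result = list(buckets)
    # Python's sort is stable, so applying the criteria from lowest to
    # highest priority leaves the first criterion as the dominant one.
    for field, direction in reversed(order):
        result.sort(key=lambda b: b[field], reverse=(direction == "desc"))
    return result

buckets = [
    {"key": "a", "doc_count": 5, "avg_price": 2.0},
    {"key": "b", "doc_count": 5, "avg_price": 3.0},
    {"key": "c", "doc_count": 9, "avg_price": 1.0},
]
ordered = sort_buckets(buckets, [("doc_count", "desc"), ("avg_price", "desc")])
```

Here the second criterion only breaks the tie between the two buckets that
share the same `doc_count`.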

Contributes to #6917
Replaces #7588
2014-09-15 11:08:29 +01:00
markharwood 3c8f8cc090 Aggs enhancement - allow Include/Exclude clauses to use array of terms as alternative to a regex
Closes #6782
2014-09-12 15:28:03 +01:00
Lee Hinman 1dd26888f6 [DOCS] Additional documentation for _score accessing
Closes #7043
2014-09-11 12:53:25 +02:00
smayzak 65a0ca021d The description was incorrect
Looked like a copy and paste from another aggregation
2014-09-10 16:05:03 +02:00
smayzak 6416f5d3d0 Fixing some grammar 2014-09-10 16:05:03 +02:00
David Pilato 7fdd3651fa [docs] Fix typo: resonable - reasonable 2014-09-10 15:57:57 +02:00
Martijn van Groningen 52f1ab6e16 Core: Added the `index.query.parse.allow_unmapped_fields` setting to fail queries if they refer to unmapped fields.
The percolator and filters in aliases by default enforce strict query parsing.

Closes #7335
2014-09-09 15:00:47 +02:00
Colin Goodheart-Smithe b127b52fd3 Revert "Aggregations: Adds ability to sort on multiple criteria"
This reverts commit bfedd11ffa.
2014-09-08 20:27:19 +01:00
Colin Goodheart-Smithe bfedd11ffa Aggregations: Adds ability to sort on multiple criteria
The terms aggregation can now support sorting on multiple criteria by replacing the sort object with an array of sort objects whose order signifies the priority of the sort. The existing syntax for sorting on a single criterion also still works.

Contributes to #6917
2014-09-08 15:20:33 +01:00
Clinton Gormley 1bdf79e527 Docs: Added explanation of how to do multi-field terms agg
Closes #5100
2014-09-07 11:09:52 +02:00
shrinidhichaudhari 13e3a5e99c Docs: Update cardinality-aggregation.asciidoc
Closes #7516
2014-09-06 20:45:45 +02:00
Adrien Grand 4bfad644b3 Aggregations: Forbid usage of aggregations in conjunction with search_type=SCAN.
Aggregations are collection-wide statistics, which is incompatible with the
collection mode of search_type=SCAN since it doesn't collect all matches on
calls to the search API.

Close #7429
2014-09-03 09:03:01 +02:00
Adrien Grand 203e80e650 Aggregations: Only return aggregations on the first page when scrolling.
Aggregations are collection-wide statistics so they would always be the same.
In order to save CPU/bandwidth, we can just return them on the first page.

Same as #1642 but for aggregations.
2014-09-03 09:03:01 +02:00
Clinton Gormley a059a6574a Update reverse-nested-aggregation.asciidoc
Fixed reverse nested example

Closes #7463
2014-09-02 11:40:41 +01:00
Adrien Grand 8e1d3d56b3 Docs: Replace added[1.4.0] with coming[1.4.0] since 1.4 is not released yet. 2014-08-29 11:57:22 +02:00
londocr 1213eec834 Spelling error of aggregation 2014-08-28 08:57:12 +02:00
Adrien Grand ea96359d82 Facets: Removal from master.
Close #7337
2014-08-21 10:34:39 +02:00
Colin Goodheart-Smithe 7f943f0296 Aggregations: Scriptable Metrics Aggregation
A metrics aggregation which runs specified scripts at the init, collect, combine, and reduce phases

Closes #5923
2014-08-20 18:17:27 +01:00
Martijn van Groningen 383e64bd5c Aggregations: Add `children` bucket aggregator that is able to map buckets between parent types and child types using the already builtin parent/child support.
Closes #6936
2014-08-19 12:40:51 +02:00
Britta Weber 639692943f Docs: Document distance type and sort mode for many to many geo_points
closes #7280
2014-08-18 16:15:55 +02:00
Konrad Feldmeier 3b3e2ed5e9 Docs: Remove the 'Factor' paragraph to reflect #6490
The current implementation of 'date_histogram' does not understand
the `factor` parameter. Since the docs shouldn't raise false hopes,
I removed the section.

Closes #7277
2014-08-18 13:02:15 +02:00
Mpampis Kostas 55b642abc5 Docs: Fix typo in phrase-suggest.asciidoc
Closes #7262
2014-08-18 13:00:30 +02:00
Clinton Gormley 9dfede8cbb Update search-template.asciidoc
Remove extra commas in template query ;-)

Closes #7033
2014-08-18 12:35:18 +02:00
Clinton Gormley 6477e13c77 Typo 2014-08-18 12:30:49 +02:00
smayzak 8449128032 error in code
The top-tags and terms were reversed.
2014-08-18 12:28:53 +02:00
Areek Zillur 0b6734aa40 [DOCS] Clarify Completion Suggester output deduplication 2014-08-13 11:09:18 -04:00
Colin Goodheart-Smithe 36083cb27f [DOCS] Added section describing how to return only agg results
Closes #5875
2014-08-11 11:31:01 +01:00
Britta Weber d49ed93488 Docs: md -> asciidoc 2014-08-08 11:25:14 +02:00
Colin Goodheart-Smithe e6632ec63e [DOCS] fixed title for filters aggregation documentation 2014-08-07 08:37:43 +01:00
Clinton Gormley 7b0b315b71 Tidied up the filters agg docs and added a coming[] tag 2014-08-07 09:03:23 +02:00
Clinton Gormley e7f1aa4f4f Documented the query cache module
Related to #7161 and #7167
2014-08-06 11:55:11 +02:00
Britta Weber a3cefd919e significant terms: add google normalized distance, add chi square
closes #6858
2014-08-04 08:15:26 +02:00
uboness 3c9c9f33e2 Aggregations Added Filters aggregation
A multi-bucket aggregation where multiple filters can be defined (each filter defines a bucket). The buckets will collect all the documents that match their associated filter.

This aggregation can be very useful when one wants to compare analytics between different criteria. It can also be accomplished using multiple definitions of the single filter aggregation, but here, the user will need to define the sub-aggregations only once.

Closes #6118
2014-08-01 16:01:08 +01:00
Adrien Grand d9d5b35be9 Sort: Make `ignore_unmapped` work for cross-index queries.
Close #2255
2014-08-01 15:30:17 +02:00
Stefan Antoni 8e862f15c1 [DOCS] fixed small typo in percolate.asciidoc 2014-08-01 12:38:35 +02:00
Britta Weber d6a18ab2ba Docs: add 1.4.0 label to many to many geo distance sort 2014-08-01 12:30:08 +02:00
Kurt Hurtado 66560acebb Update fielddata-fields.asciidoc 2014-08-01 09:20:19 +02:00
Areek Zillur 1d581e6286 Search Exists API: Checks if any matching documents exist for a given query
Implements a new Exists API allowing users to do fast exists check on any matched documents for a given query.
This API should be faster than using the Count API as it will:
 - early terminate the search execution once any document is found to exist
 - return the response as soon as the first shard reports matched documents

closes #6995
2014-07-31 15:42:30 -04:00
Britta Weber fe86c8bc88 _geo_distance sort: allow many to many geo point distance
Add computation of distance to many geo points. Example request:

```
{
  "sort": [
    {
      "_geo_distance": {
        "location": [
          {
            "lat":1.2,
            "lon":3
          },
          {
            "lat":1.2,
            "lon":3
          }
        ],
        "order": "desc",
        "unit": "km",
        "sort_mode": "max"
      }
    }
  ]
}
```

closes #3926
2014-07-31 17:33:45 +02:00
Clinton Gormley 36e1c7928c Rewrote post-filter.asciidoc
Closes #5166
2014-07-31 12:56:11 +02:00
Adrien Grand 1fe76b891b Docs: Add links to the equivalent aggs in facets documentation. 2014-07-28 15:22:49 +02:00
Clinton Gormley be86556946 Update request-body.asciidoc
Added link from `timeout` to time-units

Closes #6361
2014-07-28 11:08:59 +02:00
Clinton Gormley 10b4177def Docs: Fixed path to search-shards 2014-07-26 15:05:53 +02:00
Clinton Gormley 88c8754a3c Docs: Removed search-shards from request-body 2014-07-26 14:52:50 +02:00
Colin Goodheart-Smithe 655157c83a Aggregations: Added an option to show the upper bound of the error for the terms aggregation.
This is only applicable when the order is set to _count.  The upper bound of the error in the doc count is calculated by summing the doc count of the last term on each shard which did not return the term.  The implementation calculates the error by summing the doc count for the last term on each shard for which the term IS returned and then subtracts this value from the sum of the doc counts for the last term from ALL shards.
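
The calculation described above can be sketched in Python. This is a
simplified model in which each shard's result is a list of `(term, doc_count)`
pairs ordered by descending count; the function name and data layout are
assumptions made for the example:

```python
def doc_count_error_upper_bound(term, shard_results):
    """Upper bound on the doc count error for `term` across shards."""
    # Sum of the last (smallest) returned doc count from ALL shards.
    total_last = sum(buckets[-1][1] for buckets in shard_results if buckets)
    # Sum of the last doc count from shards that DID return the term.
    returned_last = sum(
        buckets[-1][1]
        for buckets in shard_results
        if buckets and any(t == term for t, _ in buckets)
    )
    # What remains is the contribution of shards that did not return the term.
    return total_last - returned_last

shards = [
    [("a", 10), ("b", 5)],
    [("a", 8), ("c", 4)],
    [("b", 7), ("c", 2)],
]
```

For `"a"`, only the third shard did not return the term, so the bound is that
shard's last doc count.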

Closes #6696
2014-07-25 14:24:24 +01:00
Areek Zillur 5487c56c70 Search & Count: Add option to early terminate doc collection
Allow users to control document collection termination, if a specified terminate_after number is
set. Upon setting the newly added parameter, the response will include a boolean terminated_early
flag, indicating if the document collection for any shard terminated early.

closes #6876
2014-07-23 15:10:15 -04:00
Clinton Gormley 0f943850a0 Update named-queries-and-filters.asciidoc 2014-07-23 17:28:49 +02:00
Simon Willnauer 5bfea56457 [DOCS] move all coming tags to added in master 2014-07-23 16:37:19 +02:00
Areek Zillur f39d4e1f89 PhraseSuggester: Collate option should allow returning phrases with no matching docs
A new option `prune` has been added to allow users to control phrase suggestion pruning when `collate`
is set. If the new option is set, the phrase suggestion option will contain a boolean `collate_match`
indicating whether the respective result had hits in collation.

Closes #6927
2014-07-22 17:17:15 -04:00
Adrien Grand abeefbddea Docs: Update documentation about execution hints for the terms aggregation. 2014-07-21 11:55:57 +02:00
Clinton Gormley 6a7a77eada Docs: Add links to client helper classes for bulk/scroll/reindexing 2014-07-18 13:55:47 +02:00
Simon Willnauer f9a9348508 [DOCS] Move benchmark API to 1.4 2014-07-16 15:02:20 +02:00
Brian Murphy d6cd2c2b73 [DOCS][FIX] Fix reference check in indexed scripts/templates doc. 2014-07-16 11:24:18 +01:00
Brian Murphy bc570919ee [DOCS][FIX] Fix doc parsing, broken closing block 2014-07-16 11:18:21 +01:00
Brian Murphy cbd2a97abd [DOCS] : Indexed scripts/templates
These are the docs for the indexed scripts/templates feature.
Also moved the namespace for the REST endpoints.

Closes #6851
2014-07-16 10:49:02 +01:00
Areek Zillur 76343899ea Phrase Suggester: Add collate option to PhraseSuggester
The newly added collate option will let the user provide a template query/filter which will be executed for every phrase suggestion generated to ensure that the suggestion matches at least one document for the filter/query.
The user can also add routing preference `preference` to route the collate query/filter and additional `params` to inject into the collate template.

Closes #3482
2014-07-14 16:07:52 -04:00
Britta Weber 74927adced significant terms: infrastructure for changing easily the significance heuristic
This commit adds the infrastructure to allow plugging in different
measures for computing the significance of a term.
Significance measures can be provided externally by overriding

- SignificanceHeuristic
- SignificanceHeuristicBuilder
- SignificanceHeuristicParser

closes #6561
2014-07-14 11:00:50 +02:00
Florian Hopf 3689f67a76 Docs: Fixed invalid word count in geodistance agg doc
Closes #6838
2014-07-11 18:35:36 +02:00
Clinton Gormley b6baa4be4a Update preference.asciidoc
Clarify that `preference` is a query string parameter only
and provide an example.
2014-07-09 11:13:17 +02:00
Clinton Gormley feb81e228b Docs: Rewrote the scroll/scan docs
Closes #6774
2014-07-08 11:54:53 +02:00
Andrii Gakhov 80321d89d9 Docs: Update histogram-aggregation.asciidoc
filter in a filtered query should be under "filter" key

Closes #6738
2014-07-07 10:44:11 +02:00
Carsten Brandt bd4699da7e Docs: fixed a typo in the docs
Closes: #6718
2014-07-07 10:41:36 +02:00
Duncan Angus Wilkie 60a8515fb7 Update histogram-facet.asciidoc
Spotted a typo, which I've fixed.
2014-07-01 10:49:43 +02:00
Clinton Gormley 64a4acc49b Docs: Added IDs to the highlighters for linking 2014-06-22 16:46:42 +02:00
Chris 011e20678d [DOCS] Fixed json example in nested-aggregation.asciidoc 2014-06-18 19:38:02 +02:00
Colin Goodheart-Smithe 7423ce0560 Aggregations: Added percentile rank aggregation
Percentile Rank Aggregation is the reverse of the Percentiles aggregation.  It determines the percentile rank (the proportion of values less than a given value) of the provided array of values.
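
The definition can be illustrated with an exact Python computation. The real
aggregation may approximate the result and treat ties differently, so this is
only a sketch; the function name is hypothetical:

```python
def percentile_ranks(observed, values):
    """Percentage of `observed` values strictly less than each entry of `values`."""
    n = len(observed)
    if n == 0:
        return {v: None for v in values}
    return {v: 100.0 * sum(1 for x in observed if x < v) / n for v in values}

ranks = percentile_ranks(list(range(1, 11)), [5, 11])
```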

Closes #6386
2014-06-18 12:02:08 +01:00
stephlag 13d910f016 Added missing comma in suggester example 2014-06-13 16:01:04 +02:00
Adrien Grand 01327d7136 Facets: deprecation.
Users are encouraged to move to the new aggregation framework that was
introduced in Elasticsearch 1.0.

Close #6485
2014-06-13 13:13:44 +02:00
Luke Fender f9da5259bc [DOCS] Fixed typo in post-filter.asciidoc
Remove 'be' where it is not needed
2014-06-12 12:09:19 +02:00
Martijn van Groningen 5e408f3d40 Change the top_hits to be a metric aggregation instead of a bucket aggregation (which can't have any sub aggs)
Closes #6395
Closes #6434
2014-06-10 09:09:50 +02:00
markharwood 724129e6ce Aggregations optimisation for memory usage. Added changes to core Aggregator class to support a new mode of deferred collection.
A new "breadth_first" results collection mode allows upper branches of aggregation tree to be calculated and then pruned
to a smaller selection before advancing into executing collection on child branches.

Closes #6128
2014-06-06 15:59:51 +01:00
fransflippo cdbde4a578 [DOCS] Reworded note about shorthand suggest syntax
The existing Note about the shorthand suggest syntax was poorly worded and confusing. Please check whether the way I've phrased it now is still correct as to what the shorthand form actually does and doesn't do: the original wording did not provide me enough information to be sure.
Thanks!
2014-06-06 10:21:01 +02:00
Jad Naous 5aa84c9aab [DOCS] Fixed typos in aggregations.asciidoc
Fix plural/singular forms.
2014-06-05 19:47:01 +02:00
Colin Goodheart-Smithe b9f4d44b14 Aggregations: Adds GeoBounds Aggregation
The GeoBounds Aggregation is a new single bucket aggregation which outputs the coordinates of a bounding box containing all the points from all the documents passed to the aggregation as well as the doc count. The GeoBounds Aggregation also uses a wrap_longitude parameter which specifies whether the resulting bounding box is permitted to overlap the international date line.  This option defaults to true.
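
A minimal Python sketch of the bounding box idea, for illustration only. It
ignores the date line handling controlled by wrap_longitude, and the function
name and output keys are assumptions made for the example:

```python
def geo_bounds(points):
    """Smallest lat/lon box containing all (lat, lon) points, or None if empty."""
    points = list(points)
    if not points:
        return None
    lats = [lat for lat, _ in points]
    lons = [lon for _, lon in points]
    return {
        "top_left": {"lat": max(lats), "lon": min(lons)},
        "bottom_right": {"lat": min(lats), "lon": max(lons)},
    }
```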

This aggregation introduces the idea of MetricsAggregation which do not return double values and cannot be used for sorting.  The existing MetricsAggregation has been renamed to NumericMetricsAggregation and is a subclass of MetricsAggregation.  MetricsAggregations do not store doc counts and do not support child aggregations.

Closes #5634
2014-06-03 15:59:56 +01:00
javanna 5a1ad7b42e [DOCS] fixed curl requests in benchmark docs 2014-06-03 11:47:13 +02:00
leonardo menezes f3eca05c3b [DOCS] removed slowest on single query benchmark requests
Relates to #5904
2014-06-03 11:47:13 +02:00
Clinton Gormley 7fff6f1f43 Docs: Tidied percolate.asciidoc 2014-05-30 11:56:06 +02:00
Martijn van Groningen aab38fb2e6 Aggregations: added pagination support to `top_hits` aggregation by adding `from` option.
Closes #6299
2014-05-30 11:45:31 +02:00
Martijn van Groningen 5fafd2451a Added `top_hits` aggregation that keeps track of the most relevant document being aggregated per bucket.
Closes #6124
2014-05-23 16:01:18 +02:00
Nik Everett 3573822b7e Highlight fields in request order
Because json objects are unordered this also adds an explicit order syntax
that looks like
    "highlight": {
        "fields": [
            {"title":{ /*params*/ }},
            {"text":{ /*params*/ }}
        ]
    }

This is not useful for any of the builtin highlighters but will be useful
in plugins.

Closes #4649
2014-05-22 16:44:14 +02:00
Simon Willnauer 9d5507047f Update Documentation Feature Flags [1.2.0] 2014-05-22 15:06:42 +02:00
Clinton Gormley f950344546 [DOCS] Fixed title levels in context suggester 2014-05-21 20:47:25 +02:00
Simon Willnauer ec3b1c57ac Move Benchmark release to 1.3 2014-05-21 10:17:59 +02:00
Britta Weber 08e57890f8 use shard_min_doc_count also in TermsAggregation
This was discussed in issue #6041 and #5998 .

closes #6143
2014-05-14 14:10:04 +02:00
Clinton Gormley ff12585fea Improved wording in search-type.asciidoc
Closes #5951
2014-05-14 12:15:48 +02:00
David Pilato 1cb2c3bdd3 [DOCS] reverse-nested aggs are added in 1.2.0 2014-05-13 20:00:42 +02:00
Tiago Alves Macambira a8242e6c8c Clarify `missing` behavior. 2014-05-13 15:49:46 +02:00
Adrien Grand cc530b9037 Use t-digest as a dependency.
Our improvements to t-digest have been pushed upstream and t-digest also got
some additional nice improvements around memory usage and speedups of quantile
estimation. So it makes sense to use it as a dependency now.

This also allows to remove the test dependency on Apache Mahout.

Close #6142
2014-05-13 10:38:08 +02:00
Clinton Gormley 3aac594503 [DOCS] Fix typos in context suggest 2014-05-13 10:34:16 +02:00
markharwood 1e560b0d92 Significant_terms agg: added option for a background_filter to define background context for analysis of term frequencies
Closes #5944
2014-05-13 09:10:30 +01:00
Clinton Gormley 5b93255ec8 [DOCS] Added "Aggregation" to all aggs titles 2014-05-13 01:35:58 +02:00
Rashid Khan 233aaa63c9 Change key to keyed 2014-05-12 13:15:07 -07:00
Alex Ksikes dae48d9fe8 Added the ability to include the queried document for More Like This API.
By default More Like This API excludes the queried document from the response.
However, when debugging or when comparing scores across different queries, it
could be useful to have the best possible matched hit. So this option lets users
explicitly specify the desired behavior.

Closes #6067
2014-05-09 12:59:39 +02:00
Alex Ksikes 48b7172ee7 Provided some insights as to how More Like This works internally.
In the Google Groups forum there appears to be some confusion as to what mlt
does. This documentation update should hopefully help demystifying this
feature, and provide some understanding as to how to use its parameters.

Closes #6092
2014-05-09 12:13:29 +02:00
Andrew Selden f23274523a Integration tests for benchmark API.
- Randomized integration tests for the benchmark API.
- Negative tests for cases where the cluster cannot run benchmarks.
- Return 404 on missing benchmark name.
- Allow to specify 'types' as an array in the JSON syntax when describing a benchmark competition.
- Don't record slowest for single-request competitions.

Closes #6003, #5906, #5903, #5904
2014-05-07 14:14:54 -07:00
uboness fc52db1209 Changed the response structure of the percentiles aggregation where now all the percentiles are placed under a `values` object (or `values` array in case the `keyed` flag is set to `false`)
Closes #5870
2014-05-07 18:35:24 +02:00
Britta Weber 7944369fd1 Add `shard_min_doc_count` parameter for significant terms similar to `shard_size`
Significant terms internally maintain a priority queue per shard with a size potentially
lower than the number of terms. This queue uses the score as criterion to determine if
a bucket is kept or not. If many terms with low subsetDF score very high
but the `min_doc_count` is set high, this might result in no terms being
returned because the pq is filled with low frequent terms which are all sorted
out in the end.

This can be avoided by increasing the `shard_size` parameter to a higher value.
However, it is not immediately clear to which value this parameter must be set
because we can not know how many terms with low frequency are scored higher that
the high frequent terms that we are actually interested in.

On the other hand, if there is no routing of docs to shards involved, we can maybe
assume that the documents of classes and also the terms therein are distributed evenly
across shards. In that case it might be easier to not add documents to the pq that have
subsetDF <= `shard_min_doc_count` which can be set to something like
`min_doc_count`/number of shards  because we would assume that even when summing up
the subsetDF across shards `min_doc_count` will not be reached.
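
The per-shard pruning idea could be sketched in Python as follows. This is an
illustration of the reasoning above only; both function names are made up, and
the comparison follows the wording of this message rather than the actual
code:

```python
def suggested_shard_min_doc_count(min_doc_count, number_of_shards):
    """Rule of thumb from above: min_doc_count divided by the number of shards."""
    return min_doc_count // number_of_shards

def shard_candidates(subset_df_by_term, shard_min_doc_count):
    """Keep only terms whose subsetDF on this shard exceeds the threshold."""
    return {
        term: df
        for term, df in subset_df_by_term.items()
        if df > shard_min_doc_count
    }

threshold = suggested_shard_min_doc_count(min_doc_count=10, number_of_shards=5)
kept = shard_candidates({"rare": 1, "borderline": 2, "common": 7}, threshold)
```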

closes #5998
closes #6041
2014-05-07 18:02:56 +02:00
gabriel-tessier 7b0efcbd96 fix typo 2014-05-06 15:54:36 +02:00
Audrey 52d2f2d229 [DOCS] Update phrase-suggest.asciidoc
Grammatical error

Close #5993
2014-05-06 10:28:13 +02:00
Martijn van Groningen 013b319415 Added `reverse_nested` aggregation.
The `reverse_nested` aggregation allows to aggregate on properties outside of the nested scope of a `nested` aggregation.

Closes #5507
2014-05-01 00:23:05 +07:00
Lee Hinman 57bee03193 [DOCS] Add /_search_shards documentation 2014-04-22 08:54:32 -06:00
Clinton Gormley 3ba8fbbef8 Update benchmark.asciidoc
Fixed incorrect parameter spec for benchmark nodes
2014-04-22 14:16:10 +02:00
Clinton Gormley 0e782331be Update benchmark.asciidoc 2014-04-21 20:39:33 +02:00
David Pilato f3fe50aac4 [DOCS] fix typo 2014-04-19 22:44:44 +02:00
Scott Wilkerson 9ea0e3a95b Update percolate.asciidoc
fix typo
2014-04-15 16:01:44 +02:00
Andrew Selden 2cf66c4115 Benchmark documentation
Moving benchmark documentation under the search section.

Closes #5786
2014-04-14 14:08:41 -07:00
Malte Schirnacher 8ce3bba010 Fix typos in percolate.asciidoc
Close #5762 #5763 #5764
2014-04-11 18:09:16 +02:00
Andrew O'Brien 48031b6236 Fixes typo in "Scan" search type documentation 2014-04-07 16:01:37 -06:00
gabriel-tessier 000c33aac3 fix typo 2014-04-07 09:23:46 +02:00
Martijn van Groningen ade1d0ef57 Added global ordinals (unique incremental numbering for terms) to fielddata.
Added terms aggregation implementations that work on global ordinals, which are also the default.
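
To illustrate the idea of global ordinals, here is a small Python sketch. It assumes each segment exposes its terms as a sorted, de-duplicated list (so a segment ordinal is just a list index); `build_global_ordinals` and the sample data are hypothetical and do not mirror the fielddata classes.

```python
def build_global_ordinals(segment_terms):
    """Number all terms across segments and map segment ords to global ords.

    segment_terms: one sorted, de-duplicated list of terms per segment.
    Returns (global_terms, mappings) with mappings[seg][segment_ord] -> global_ord.
    """
    global_terms = sorted(set().union(*segment_terms))
    global_ord = {term: i for i, term in enumerate(global_terms)}
    mappings = [[global_ord[term] for term in terms] for terms in segment_terms]
    return global_terms, mappings


segments = [["apple", "cherry"], ["banana", "cherry", "date"]]
terms, maps = build_global_ordinals(segments)

# Count documents per term using integers only; the term strings are
# looked up once at the end. Each inner list holds per-document segment ords.
docs_per_segment = [[0, 1, 1], [2, 1]]
counts = [0] * len(terms)
for seg, doc_ords in enumerate(docs_per_segment):
    for segment_ord in doc_ords:
        counts[maps[seg][segment_ord]] += 1

print(dict(zip(terms, counts)))
```

The point of the mapping is that a terms aggregation can bucket on small integers that are comparable across segments, instead of comparing term bytes for every document.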

Closes #5672
2014-04-07 11:06:41 +07:00
Karl Meisterheim 6d993bc810 [DOCS] A few grammar and word use corrections 2014-04-04 19:26:38 +02:00
Alexander Reelsen e547e113e1 Geo context suggester: Require precision in mapping
The default precision was far too exact and could lead people to
think that geo context suggestions are not working. This patch now
requires you to set the precision in the mapping, as Elasticsearch itself
can never tell exactly what precision the user's
suggestions require.

Closes #5621
2014-04-02 23:51:14 +02:00
Hannes Korte c11293ad78 Fix some typos in documentation. 2014-03-31 13:48:17 +02:00
bleskes 5d832374dd Update Documentation Feature Flags [1.1.0] 2014-03-25 17:51:30 +01:00
Boaz Leskes fc8dc3f733 [Docs] updated the search template and query template docs 2014-03-25 15:25:02 +01:00
Alexander Reelsen 4fc461a97c [DOCS] Moved the template query documentation into search section 2014-03-25 10:01:41 +01:00
Simon Willnauer b4e504df99 [Docs] Add coming tag for context suggester docs 2014-03-25 09:46:49 +01:00
uboness 7d6ad8d91c Added extended_bounds support for date_/histogram aggs
By default the date_/histogram returns all the buckets within the range of the data itself; that is, the documents with the smallest values (in the field the histogram is built on) determine the min bucket (the bucket with the smallest key) and the documents with the highest values determine the max bucket (the bucket with the highest key). Often, when requesting empty buckets (min_doc_count : 0), this causes confusion, specifically when the data is also filtered.

To understand why, let's look at an example:

Let's say you're filtering your request to get all docs from the last month, and in the date_histogram aggs you'd like to slice the data per day. You also specify min_doc_count:0 so that you still get empty buckets for those days to which no document belongs. By default, if the first document that falls in this last month also happens to fall on the first day of the **second week** of the month, the date_histogram will **not** return empty buckets for the days prior to that second week. The reason is that by default the histogram aggregations only start building buckets when they encounter documents (hence missing all the days of the first week in our example).

With extended_bounds, you can now "force" the histogram aggregations to start building buckets at a specific min value and to keep building buckets up to a max value (even if there are no more documents). Using extended_bounds only makes sense when min_doc_count is 0 (the empty buckets will never be returned if min_doc_count is greater than 0).

Note that (as the name suggests) extended_bounds does **not** filter buckets. That is, if extended_bounds.min is higher than the values extracted from the documents, the documents will still dictate what the min bucket will be (and the same goes for extended_bounds.max and the max bucket). To filter buckets, nest the histogram agg under a range filter agg with the appropriate min/max.
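
The behaviour described above can be sketched in a few lines of Python. This is an illustration only, using integer keys instead of dates; `histogram_buckets` is a hypothetical helper, not the aggregator code.

```python
def histogram_buckets(values, interval, min_doc_count=1, extended_bounds=None):
    """Bucket integer values; optionally fill and extend empty buckets.

    extended_bounds is an optional (min, max) pair. It can only widen the
    range of returned buckets, never narrow it.
    """
    counts = {}
    for v in values:
        key = (v // interval) * interval
        counts[key] = counts.get(key, 0) + 1

    if min_doc_count == 0:
        edges = list(counts)
        if extended_bounds is not None:
            edges += [(b // interval) * interval for b in extended_bounds]
        if edges:
            for key in range(min(edges), max(edges) + interval, interval):
                counts.setdefault(key, 0)

    return [(k, counts[k]) for k in sorted(counts) if counts[k] >= min_doc_count]


days = [14, 15, 22]  # day-of-period for three documents, weekly buckets

print(histogram_buckets(days, 7, min_doc_count=0))
print(histogram_buckets(days, 7, min_doc_count=0, extended_bounds=(0, 28)))
print(histogram_buckets(days, 7, min_doc_count=0, extended_bounds=(20, 21)))
```

The second call returns empty buckets before and after the data; the third shows that bounds lying inside the data range do not remove any bucket.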

Closes #5224
2014-03-20 14:48:27 +01:00
markharwood 5f1d9af9fe Documentation fix for significant_terms heading levels 2014-03-17 12:17:54 +00:00
Randy Stauner 1486188a3b [DOCS] Reword clear-scroll sentence 2014-03-17 12:08:49 +01:00
Boaz Leskes ee8743f3f2 [Docs] added a missing reference to significantterms-aggergations
Also fix header level mismatch issue reported by the build
2014-03-17 11:45:55 +01:00
rphadake 36a0cb99d7 [Doc] doc updates for date histogram interval
Close #5308
2014-03-14 18:55:32 +01:00
Adrien Grand eef71da650 [Doc] Add a chart about the relative error of the percentiles aggregation. 2014-03-14 12:23:23 +01:00
markharwood 767bef0596 Significant_terms aggregation identifies terms that are significant rather than merely popular in a set.
Significance is related to the changes in document frequency observed between everyday use in the corpus and
frequency observed in the result set. The asciidocs include extensive details on the applications of this feature.

Closes #5146
2014-03-14 10:34:24 +00:00
Adrien Grand 5821fa042c Cardinality aggregation.
This aggregation computes unique term counts using the hyperloglog++ algorithm
which uses linear counting to estimate low cardinalities and hyperloglog on
higher cardinalities.

Since this algorithm works on hashes, it is useful for high-cardinality fields
to store the hash of values directly in the index, which is the purpose of
the new `murmur3` field type. This is less necessary on low-cardinality
string fields because the aggregator is smart enough to only compute the hash
once per unique value per segment thanks to ordinals, or on numeric fields
since hashing them is very fast.
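
As a rough illustration of the linear counting part only (not HyperLogLog++ itself), the following Python sketch estimates a distinct count from a bitmap of hashed values. The bitmap size and the use of SHA-1 instead of murmur3 are assumptions made to keep the example self-contained.

```python
import hashlib
import math


def linear_count(values, m=1 << 14):
    """Estimate the number of distinct values using a bitmap with m slots."""
    bitmap = bytearray(m)
    for v in values:
        digest = hashlib.sha1(str(v).encode("utf-8")).digest()
        slot = int.from_bytes(digest[:8], "big") % m
        bitmap[slot] = 1

    empty = m - sum(bitmap)
    if empty == 0:
        raise ValueError("bitmap saturated; cardinality too high for linear counting")
    # Linear counting estimator: m * ln(m / number_of_empty_slots)
    return m * math.log(m / empty)


values = [i % 1000 for i in range(5000)]  # 5000 values, 1000 distinct
estimate = linear_count(values)
print(round(estimate))
```

Because the estimator only looks at hashes, repeated values cost nothing extra in memory, which is why pre-computed hashes in the index help on high-cardinality fields.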

Close #5426
2014-03-13 19:19:56 +01:00
Florian Schilling 81e537bd5e ContextSuggester
================

This commit extends the `CompletionSuggester` with context
information. For example, such context information can
be a simple string representing a category, restricting the
suggestions to that category.

Three base implementations of this context information
have been set up in this commit.

- a Category Context
- a Geo Context

All mappings for this context information are
specified within a context field in the completion
field that should use this kind of information.
2014-03-13 11:24:46 +01:00
Kurt Hurtado ca6a2bb790 [DOCS] Various aggregation doc fixes 2014-03-13 09:05:25 +01:00
Boaz Leskes b7a95d11a7 Introduced VersionType.FORCE & VersionType.EXTERNAL_GTE
Also added "external_gt" as an alias name for VersionType.EXTERNAL , accessible for the rest layer.

Closes #4213 , Closes #2946
2014-03-10 21:07:17 +01:00
Simon Willnauer fbb8c0fafa [DOCS] Add `coming` tag to multiple rescores
Closes #5365
2014-03-10 09:27:44 +01:00
Benjamin Devèze 2affa5004f Fix small typo in percentiles doc 2014-03-07 10:10:19 +01:00
Adrien Grand f359b7f38b [DOC] The percentiles aggregation is coming in 1.1.0. 2014-03-07 10:03:15 +01:00
uboness 9d0fc76f54 Added support for sorting buckets based on sub aggregations
Supports sorting on sub-aggs down the current hierarchy. This is supported as long as the aggregations in the specified order path are of a single-bucket type, where the last aggregation in the path points to either a single-bucket aggregation or a metrics one. If it's a single-bucket aggregation, the sort will be applied on the document count in the bucket (i.e. doc_count), and if it is a metrics type, the sort will be applied on the pointed-out metric (in the case of a single-metric aggregation, such as avg, the sort will be applied on the single metric value).

 NOTE: this commit adds a constraint on what should be considered a valid aggregation name. Aggregations names must be alpha-numeric and may contain '-' and '_'.
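
 A small Python sketch of the two rules above: resolving an order path through single-bucket sub-aggregations, and validating aggregation names. The path is modelled as a plain list of names (the actual path syntax is not shown here), and `resolve`, `sort_buckets` and the sample buckets are hypothetical.

```python
import re

# Alphanumeric names that may also contain '-' and '_'.
VALID_NAME = re.compile(r"[A-Za-z0-9_-]+")


def resolve(bucket, path):
    """Follow `path` down the sub-aggregations of one bucket.

    Ends on doc_count for a single-bucket aggregation, or on the single
    metric value for a metrics aggregation.
    """
    node = bucket
    for name in path:
        if not VALID_NAME.fullmatch(name):
            raise ValueError(f"invalid aggregation name: {name!r}")
        node = node[name]
    return node["doc_count"] if "doc_count" in node else node["value"]


def sort_buckets(buckets, path, descending=True):
    return sorted(buckets, key=lambda b: resolve(b, path), reverse=descending)


buckets = [
    {"key": "a", "doc_count": 5,
     "recent": {"doc_count": 4, "avg_price": {"value": 10.0}}},
    {"key": "b", "doc_count": 3,
     "recent": {"doc_count": 3, "avg_price": {"value": 25.0}}},
]

by_metric = [b["key"] for b in sort_buckets(buckets, ["recent", "avg_price"])]
by_count = [b["key"] for b in sort_buckets(buckets, ["recent"])]
print(by_metric, by_count)
```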

 Closes #5253
2014-03-06 00:05:27 +01:00
Zachary Tong 7b16c5857d Percentiles aggregation.
A new metric aggregation that can compute approximate values of arbitrary
percentiles.

Close #5323
2014-03-03 18:06:14 +01:00
Binh Ly 7e49848697 Clarify range aggregations 2014-02-28 14:38:57 -05:00
Clinton Gormley 53ce0e8e27 [DOCS] Fixed added[] tag version number 2014-02-28 15:29:43 +01:00
Luca Cavanna 4e6610a798 Fixed multi term queries support in postings highlighter for non top-level queries
In #4052 we added support for highlighting multi term queries using the postings highlighter. That worked only for top-level queries though, and not for multi term queries that are nested for instance within a bool query, or filtered query, or a constant score query.

The way we make this work is by walking the query structure and temporarily overriding the query rewrite method with a method that allows for multi terms extraction.

Closes #5102
2014-02-21 21:43:40 +01:00
Britta Weber db3c6c2a8e Enable percolation for nested documents
closes #5082
2014-02-14 22:42:33 +01:00
uboness d335630e57 [docs] fixed errors in aggs docs
- error in nested aggs example
- error in terms aggs example
2014-02-13 20:36:02 +01:00
Luca Cavanna 179750f0f5 [DOCS] fixed count docs, it now requires a top-level query object, same as other apis
Relates to #4074
2014-02-13 13:36:20 +01:00
Luca Cavanna 01abea5945 [DOCS] fixed count and validate query docs, they now require a top-level query object, same as other apis
Relates to #4074
Closes #5111
2014-02-13 11:42:04 +01:00
Simon Willnauer 990ce658a4 [Docs] Remove `custom_score` from documentation and add a migration
section.
2014-02-11 14:59:15 +01:00
Clinton Gormley 93930d6dc7 Removed 0.90.* deprecation and addition notifications
Closes #5052
2014-02-07 20:52:49 +01:00
Adrien Grand 9cb17408cb Make size=0 return all buckets for the geohash_grid aggregation.
Close #4875
2014-02-07 09:55:10 +01:00
Boaz Leskes 9bf263c741 [DOCS] Fix terms agg value script example 2014-02-06 16:35:49 +01:00
Boaz Leskes ae4ed29f9b [Docs] value_count supports script per 1.1 2014-02-06 15:04:50 +01:00
Clinton Gormley 6238d406b5 [DOCS] Removed the experimental label from Tribe, Hot Threads
and Completion Suggester
2014-02-06 14:19:17 +01:00
Adrien Grand 6777be60ce Add script support to value_count aggregations.
Close #5001
2014-02-04 14:29:32 +01:00
Clinton Gormley 238b26a466 [DOC] Tidied up geohashgrid aggregations 2014-02-04 11:54:32 +01:00
Jun Ohtani ba415b8ad2 Does not support "script" in value_count aggregation. 2014-02-04 10:26:07 +01:00
Adrien Grand cc1ff560df Rename `geohashgrid` to `geohash_grid` in documentation.
It was renamed in fc6bc4c477.

Close #4997
2014-02-04 09:39:55 +01:00
Lars Francke 1bd9dc129b Fix confusing sentence
The original sentence didn't make much sense. I hope this is a bit better. Took heavy inspiration from c63d8c4fb5
2014-02-03 17:20:40 +01:00
Lars Francke 7cbd0962b5 Improve Aggregations documentation
* Mostly minor things like typos and grammar stuff
* Some clarifications
* The note on the deprecation was ambiguous. I've removed the problematic part so that it now definitely says it's deprecated
2014-02-03 17:16:52 +01:00
uboness d3f2173ef9 fixed date_/histogram aggregation documentation - added documentation for the `min_doc_count` setting
Closes #4944
2014-01-29 20:55:26 +01:00
uboness 9f04e5fe38 fixed nested example response in docs
Closes #4935
2014-01-29 13:09:12 +01:00
uboness dd389d1cc5 Made all multi-bucket aggs return consistent response format
Closes #4926
2014-01-28 17:46:57 +01:00
Nik Everett 93a8e80aff Support multiple rescores
Detects if rescores arrive as an array instead of a plain object.  If so
then parse each element of the array as a separate rescore to be executed
one after another.  It looks like this:
   "rescore" : [ {
      "window_size" : 100,
      "query" : {
         "rescore_query" : {
            "match" : {
               "field1" : {
                  "query" : "the quick brown",
                  "type" : "phrase",
                  "slop" : 2
               }
            }
         },
         "query_weight" : 0.7,
         "rescore_query_weight" : 1.2
      }
   }, {
      "window_size" : 10,
      "query" : {
         "score_mode": "multiply",
         "rescore_query" : {
            "function_score" : {
               "script_score": {
                  "script": "log10(doc['numeric'].value + 2)"
               }
            }
         }
      }
   } ]

Rescores as a single object are still supported.
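
The detection step can be pictured with this short Python sketch, which normalizes both request shapes to a list that is then executed in order. `normalize_rescore` is a hypothetical helper, not the parser in this commit.

```python
def normalize_rescore(rescore):
    """Accept a single rescore object or an array of them; return a list."""
    if isinstance(rescore, dict):
        return [rescore]
    if isinstance(rescore, list):
        return list(rescore)
    raise TypeError("rescore must be an object or an array of objects")


single = {"window_size": 100, "query": {"rescore_query": {"match_all": {}}}}
multiple = [
    {"window_size": 100, "query": {"rescore_query": {"match_all": {}}}},
    {"window_size": 10, "query": {"rescore_query": {"match_all": {}}}},
]

# Both shapes yield a list, so the executing code has a single code path.
for rescorer in normalize_rescore(multiple):
    print(rescorer["window_size"])
```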

Closes #4748
2014-01-23 16:29:07 +01:00
Nik Everett 37f80c8d80 Documentation for score_mode
Closes #4742
2014-01-23 16:24:48 +01:00
Clinton Gormley 8685818ad3 [DOCS] Moved termvector and mtermvectors from search to docs 2014-01-22 14:10:26 +01:00
Simon Willnauer cb3bcb05be [DOCS]: Fix added version termvectors.asciidoc 2014-01-22 12:08:13 +01:00
Adrien Grand 9282ae4ffd Terms aggregations: make size=0 return all terms.
Terms aggregations return up to `size` terms, so up to now, the way to get all
matching terms back was to set `size` to an arbitrary high number that would be
larger than the number of unique terms.

Terms aggregators already made sure to not allocate memory based on the `size`
parameter so this commit mostly consists in making `0` an alias for the
maximum integer value in the TermsParser.
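
A Python sketch of the aliasing, assuming "maximum integer value" means Java's `Integer.MAX_VALUE`; `effective_size` and `top_terms` are hypothetical names for illustration.

```python
JAVA_INT_MAX = 2**31 - 1  # assumed to correspond to Java's Integer.MAX_VALUE


def effective_size(size):
    """Treat size=0 as 'return everything'."""
    return JAVA_INT_MAX if size == 0 else size


def top_terms(counts, size):
    """Return up to `size` (term, count) pairs, most frequent first."""
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:effective_size(size)]


counts = {"a": 3, "b": 1, "c": 2}
print(top_terms(counts, 2))
print(top_terms(counts, 0))
```

This only works cheaply because nothing is allocated up front based on `size`; the limit is applied when results are cut off.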

Close #4837
2014-01-22 11:05:10 +01:00
Lee Hinman 2c289fb538 Add the ability to retrieve fields from field data
Adds a new FetchSubPhase, FieldDataFieldsFetchSubPhase, which loads the
field data cache for a field and returns an array of values for the
field.

Also removes `doc['<field>']` and `_source.<field>` workaround no longer
needed in field name resolving.

Closes #4492
2014-01-21 09:13:32 -07:00
Martijn van Groningen 9bc3d996ff [SPECS] Updated percolator specs. 2014-01-20 18:18:27 +01:00
Florian Gilcher eed079aaac Reference docs fixes
* Make it clearer that `aggs` is an allowed synonym
  for the `aggregations` key
* Fix broken example for datehistogram, `1.5M` is
  not an allowed interval
* Make use of colon before examples consistent
* Fix typos
2014-01-20 12:14:17 +01:00
Dawid Weiss ae71b25145 Documentation typo. 2014-01-20 11:51:08 +01:00
Luca Cavanna 4126ae2631 [DOCS] updated json responses after #4310 and #4480
- Removed "ok": true from response examples
- Added "created" flag to index response examples
- Replaced exists flag with found in delete response examples
2014-01-16 12:01:39 +01:00
markharwood 2795f4e55d Standardized use of “*_length” for parameter names rather than “*_len”.
Java Builder apis drop old “len” methods in favour of new “length”
Rest APIs support both old “len” and new “length” forms using new ParseField class to a) provide compiler-checked consistency between Builder and Parser classes and
b) a common means of handling deprecated syntax in the DSL.
Documentation and rest specs only document the new “*length” forms
Closes #4083
2014-01-13 15:59:15 +00:00
Adrien Grand 5c237fe834 Add new option `min_doc_count` to terms and histogram aggregations.
`min_doc_count` is the minimum number of hits that a term or histogram key
should match in order to appear in the response.

`min_doc_count=0` replaces `compute_empty_buckets` for histograms and will
behave exactly like facets' `all_terms=true` for terms aggregations.

Close #4662
2014-01-13 10:09:38 +01:00
Martijn van Groningen 943b62634c Replaced the multi-field type in favour of the multi fields option that can be set on any core field.
When upgrading to ES 1.0 the existing mappings with a multi-field type automatically get replaced to a core field with the new `fields` option.

If a `multi_field` typed field doesn't have a main / default field, a default field will be chosen for the multi fields syntax. The new main field type
will be equal to the type of the first field of the `multi_field`, or type string if no fields have been configured for the `multi_field` field, and in both cases
the default field will not be indexed (`index=no` is set on the default field).

If a `multi_field` typed field has a default field, that field will replace the `multi_field` typed field.
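
The upgrade rules above can be sketched as a small Python transformation. This is an illustration under one assumption: the default field is taken to be the sub-field that has the same name as the `multi_field` itself. `upgrade_multi_field` is a hypothetical helper, not the upgrade code.

```python
def upgrade_multi_field(name, mapping):
    """Rewrite a `multi_field` mapping into a core field with `fields`."""
    if mapping.get("type") != "multi_field":
        return mapping

    sub_fields = dict(mapping.get("fields", {}))
    if name in sub_fields:
        # A default field exists: it replaces the multi_field itself.
        main = dict(sub_fields.pop(name))
    else:
        # No default field: use the first sub-field's type (or string)
        # and do not index the new main field.
        first = next(iter(sub_fields.values()), None)
        main = {"type": first["type"] if first else "string", "index": "no"}

    if sub_fields:
        main["fields"] = sub_fields
    return main


old = {
    "type": "multi_field",
    "fields": {
        "title": {"type": "string"},
        "raw": {"type": "string", "index": "not_analyzed"},
    },
}
print(upgrade_multi_field("title", old))
```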

Closes to #4521
2014-01-13 09:21:53 +01:00
Florian Schilling 464037e0c1 Geo clean Up
============
The default unit for measuring distances is *MILES* in most cases. This commit moves ES
over to the *International System of Units* and makes it work with a default that relates
to *METERS*. The structure of the `GeoBoundingBox Filter` also changed in
order to define the bounding box by setting arbitrary corners.

Distances
---------
Since the default unit for measuring distances has changed to a default unit
`DistanceUnit.DEFAULT` relating to *meters*, the **REST API** has changed at the
following places:

  * `ScriptDocValues.factorDistance()` returns *meters* instead of *miles*
  * `ScriptDocValues.factorDistanceWithDefault()` returns *meters* instead of *miles*
  * `ScriptDocValues.arcDistance()` returns *meters* instead of *miles*
        one might use `ScriptDocValues.arcDistanceInMiles()`
  * `ScriptDocValues.arcDistanceWithDefault()` returns *meters* instead of *miles*
  * `ScriptDocValues.distance()` returns *meters* instead of *miles*
        one might use `ScriptDocValues.distanceInMiles()`
  * `ScriptDocValues.distanceWithDefault()` returns *meters* instead of *miles*
        one might use `ScriptDocValues.distanceInMilesWithDefault()`
  * `GeoDistanceFilter` default unit changes from *kilometers* to *meters*
  * `GeoDistanceRangeFilter` default unit changes from *miles* to *meters*
  * `GeoDistanceFacet` default unit changes from *miles* to *meters*
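
Callers that relied on the old defaults need to convert values themselves. A minimal Python sketch of that conversion, using the international mile (1609.344 m); the function names are hypothetical and unrelated to `DistanceUnit`.

```python
METERS_PER_MILE = 1609.344  # international mile


def miles_to_meters(miles):
    return miles * METERS_PER_MILE


def meters_to_miles(meters):
    return meters / METERS_PER_MILE


# A threshold that used to be written as "5" (implicitly miles) must now
# be expressed in meters to keep the same behaviour.
old_threshold_miles = 5
print(miles_to_meters(old_threshold_miles))
```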

Geo Bounding Box Filter
-----------------------
The naming of the GeoBoundingBoxFilter properties allows setting arbitrary corners
(see #4084), namely `top_right`, `top_left`, `bottom_right` and `bottom_left`. This
change also includes the fields `topRight` and `bottomLeft`. It is also possible to
set the single values by using just the `top`, `bottom`, `left` and `right` parameters.

Closes #4515, #4084
2014-01-11 21:30:29 +09:00
Simon Willnauer bc5a9ca342 Rename edit_distance/min_similarity to fuzziness
A lot of different APIs currently use different names for the
same logical parameter. Since Lucene moved away from the notion
of a `similarity` and now uses a `fuzziness`, we should generalize
this and encapsulate the generation, parsing and creation of these
settings across all queries.

This commit adds a new `Fuzziness` class that handles the renaming
and generalization in a backwards compatible manner.

This commit also added a ParseField class to better support deprecated
Query DSL parameters

The ParseField class allows specifying parameters that have been deprecated.
Those parameters can be more easily tracked and removed in a future version.
This also allows running queries in `strict` mode per index to throw
exceptions if a query is executed with deprecated keys.
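
The idea can be sketched in Python as follows. This is a hypothetical analogue for illustration; the class and method shapes here are assumptions and do not reproduce the Java `ParseField` API.

```python
class DeprecatedKeyError(ValueError):
    """Raised in strict mode when a deprecated key is used."""


class ParseField:
    def __init__(self, name, *deprecated_names):
        self.name = name
        self.deprecated_names = deprecated_names

    def match(self, key, strict=False):
        """Return True if `key` refers to this field.

        Deprecated names still match, unless strict mode is on, in which
        case they raise instead.
        """
        if key == self.name:
            return True
        if key in self.deprecated_names:
            if strict:
                raise DeprecatedKeyError(
                    f"[{key}] is deprecated, use [{self.name}] instead"
                )
            return True
        return False


FUZZINESS = ParseField("fuzziness", "min_similarity", "edit_distance")

print(FUZZINESS.match("fuzziness"))       # current name
print(FUZZINESS.match("min_similarity"))  # old name, lenient mode
```

Keeping the current and deprecated names in one object gives the parser a single place to consult, and gives maintainers a single place to delete from when the old names are removed.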

Closes #4082
2014-01-09 15:14:51 +01:00
Martijn van Groningen 7e341cefd0 Change the `sort` boolean option in percolate api to the sort dsl available in search api.
Closes #4625
2014-01-09 09:58:34 +01:00
Clinton Gormley 2e4b70d40f [DOCS] Fixed duplicate ID in highlighting 2014-01-09 00:37:18 +01:00