1782 Commits

Author SHA1 Message Date
Shay Banon c449fbdd68 missing/exists filters should also work for objects
closes #3141
2013-06-12 04:42:23 +02:00
Simon Willnauer 66cd74d2df Always create index with mapping in test to ensure shards are available 2013-06-11 19:08:33 +02:00
Shay Banon dac2c559d4 remove the index level class support
fix the test that relies on it, just index the data for each test case
2013-06-11 16:35:13 +02:00
Shay Banon 78fb12bcaa fix the type of the mapping 2013-06-11 14:49:34 +02:00
Shay Banon 3a0f9c6ea3 fix shared cluster to delete templates as well per test run 2013-06-11 14:43:18 +02:00
Shay Banon 1d63ff64c7 simplify parsing code 2013-06-11 13:19:54 +02:00
Shay Banon 41e4ee22e6 Thread pool: rename `capacity` to `queue_size`
fixes #3161
2013-06-11 13:07:07 +02:00
Simon Willnauer 7afffbe13b Cleanup String to UTF-8 conversion
Currently we have many different places that convert String to UTF-8
bytes and back. We shouldn't maintain more code than necessary to
do this conversion and rather use Lucene's support for it.
2013-06-10 21:56:24 +02:00
Alexander Reelsen 9323e677bd Cleaning up some tests by using assertHitCount assertion 2013-06-10 16:57:09 +02:00
Simon Willnauer 21945e5060 Ensure all shards return comparable scores for rescore tests 2013-06-10 16:50:10 +02:00
Simon Willnauer 314a3343f9 Add more verbose matchers / asserts to tests 2013-06-10 16:06:04 +02:00
Florian Schilling f64f7c0c08 Fixed the `GeoPointFieldMapper` to parse `geohashes` correctly.
Closes #3073
2013-06-10 12:13:43 +02:00
Simon Willnauer b9feaa9999 Simplify TestCluster
TestCluster now doesn't use any reference counting anymore and
testcluster names are based on creation time to prevent conflicts if
builds hang.
2013-06-10 12:07:11 +02:00
Britta Weber 11d08ac436 term vector request
================================

Returns information and statistics on terms in the fields of a particular document as stored in the index.

        curl -XGET 'http://localhost:9200/twitter/tweet/1/_termvector?pretty=true'

Three types of values can be requested: term information, term statistics and field statistics.
By default, all term information and field statistics are returned for all fields but no term statistics.

Optionally, you can specify the fields for which the information is retrieved either with a parameter in the url

	curl -XGET 'http://localhost:9200/twitter/tweet/1/_termvector?fields=text,...'

or by adding the requested fields in the request body (see example below).

Term information
-------------------------

- term frequency in the field (always returned)
- term positions ("positions" : true)
- start and end offsets ("offsets" : true)
- term payloads ("payloads" : true), as base64 encoded bytes
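
Because payloads are returned as base64 strings, a client has to decode them before using them. The following is a minimal sketch of that step, assuming Java 8 or later (`java.util.Base64`) and assuming the payload bytes hold UTF-8 text. `PayloadDecoder` is a hypothetical helper name, not an Elasticsearch class.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical client-side helper; not part of Elasticsearch.
class PayloadDecoder {
    // Decodes one base64-encoded payload into text, assuming UTF-8 content.
    static String decode(String base64Payload) {
        byte[] raw = Base64.getDecoder().decode(base64Payload);
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "d29yZA==" is the payload value shown in the example response.
        System.out.println(decode("d29yZA==")); // prints "word"
    }
}
```

The decoded value `word` matches what a `type_as_payload` filter would be expected to store for plain word tokens.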

If the requested information wasn't stored in the index, it will be omitted without further warning.
See [mapping](http://www.elasticsearch.org/guide/reference/mapping/core-types/) on how to configure your index to store term vectors.

Term statistics
-------------------------

Setting "term_statistics" to "true" (default is "false") will return

- total term frequency (how often a term occurs in all documents)
- document frequency (the number of documents containing the current term)

By default these values are not returned since term statistics can have a serious performance impact.

Field statistics
-------------------------

Setting "field_statistics" to "false" (default is "true") will omit

- document count (how many documents contain this field)
- sum of document frequencies (the sum of document frequencies for all terms in this field)
- sum of total term frequencies (the sum of total term frequencies of each term in this field)

Behavior
-------------------------

The term and field statistics are not accurate. Deleted documents are not taken into account. The information is only retrieved for the shard the requested document resides in. The term and field statistics are therefore only useful as relative measures whereas the absolute numbers have no meaning in this context.

Example
-------------------------

First, we create an index that stores term vectors, payloads etc. :

    curl -s -XPUT 'http://localhost:9200/twitter/' -d '{
        "mappings": {
            "tweet": {
                "properties": {
                    "text": {
                                "type": "string",
                                "term_vector": "with_positions_offsets_payloads",
                                "store" : "yes",
                                "index_analyzer" : "fulltext_analyzer"
                         },
                     "fullname": {
                                "type": "string",
                                "term_vector": "with_positions_offsets_payloads",
                                "index_analyzer" : "fulltext_analyzer"
                         }
                 }
            }
        },
        "settings" : {
            "index" : {
                "number_of_shards" : 1,
                "number_of_replicas" : 0
            },
            "analysis": {
                    "analyzer": {
                        "fulltext_analyzer": {
                            "type": "custom",
                            "tokenizer": "whitespace",
                            "filter": [
                                "lowercase",
                                "type_as_payload"
                            ]
                        }
                    }
            }
         }
    }'

Second, we add some documents:

    curl -XPUT 'http://localhost:9200/twitter/tweet/1?pretty=true' -d '{
      "fullname" : "John Doe",
      "text" : "twitter test test test "

    }'

    curl -XPUT 'http://localhost:9200/twitter/tweet/2?pretty=true' -d '{
      "fullname" : "Jane Doe",
      "text" : "Another twitter test ..."

    }'

The following request returns all information and statistics for field "text" in document "1" (John Doe):

     curl -XGET 'http://localhost:9200/twitter/tweet/1/_termvector?pretty=true' -d '{
                    "fields" : ["text"],
                    "offsets" : true,
                    "payloads" : true,
                    "positions" : true,
                    "term_statistics" : true,
                    "field_statistics" : true
            }'

Equivalently, all parameters can be passed as URI parameters:

     curl -XGET 'http://localhost:9200/twitter/tweet/1/_termvector?pretty=true&fields=text&offsets=true&payloads=true&positions=true&term_statistics=true&field_statistics=true'

Response:

  {
    "_index" : "twitter",
    "_type" : "tweet",
    "_id" : "1",
    "_version" : 1,
    "exists" : true,
    "term_vectors" : {
      "text" : {
        "field_statistics" : {
          "sum_doc_freq" : 6,
          "doc_count" : 2,
          "sum_ttf" : 8
        },
        "terms" : {
          "test" : {
            "doc_freq" : 2,
            "ttf" : 4,
            "term_freq" : 3,
            "pos" : [ 1, 2, 3 ],
            "start" : [ 8, 13, 18 ],
            "end" : [ 12, 17, 22 ],
            "payload" : [ "d29yZA==", "d29yZA==", "d29yZA==" ]
          },
          "twitter" : {
            "doc_freq" : 2,
            "ttf" : 2,
            "term_freq" : 1,
            "pos" : [ 0 ],
            "start" : [ 0 ],
            "end" : [ 7 ],
            "payload" : [ "d29yZA==" ]
          }
        }
      }
    }
  }
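
The numbers in this response can be checked by hand. The sketch below recomputes them in memory for the two example documents, assuming a whitespace tokenizer followed by lowercasing and assuming every document contains the field. `TermStatsSketch` and its methods are illustrative names only and do not reflect how the statistics are read from the index.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;

// Illustrative only: mimics whitespace tokenization plus a lowercase filter.
class TermStatsSketch {
    static List<String> tokenize(String text) {
        return Arrays.asList(text.trim().toLowerCase(Locale.ROOT).split("\\s+"));
    }

    // Returns {doc_count, sum_doc_freq, sum_ttf} for one field across all documents.
    static long[] fieldStatistics(List<String> docs) {
        long docCount = 0, sumDocFreq = 0, sumTtf = 0;
        for (String doc : docs) {
            List<String> tokens = tokenize(doc);
            docCount++;                                  // one more document with this field
            sumTtf += tokens.size();                     // every token occurrence counts
            sumDocFreq += new HashSet<>(tokens).size();  // each distinct term counts once per document
        }
        return new long[] {docCount, sumDocFreq, sumTtf};
    }

    // Returns {doc_freq, ttf} for a single term.
    static long[] termStatistics(List<String> docs, String term) {
        long docFreq = 0, ttf = 0;
        for (String doc : docs) {
            long count = tokenize(doc).stream().filter(term::equals).count();
            if (count > 0) {
                docFreq++;
            }
            ttf += count;
        }
        return new long[] {docFreq, ttf};
    }

    public static void main(String[] args) {
        List<String> docs = Arrays.asList("twitter test test test ", "Another twitter test ...");
        System.out.println(Arrays.toString(fieldStatistics(docs)));        // [2, 6, 8]
        System.out.println(Arrays.toString(termStatistics(docs, "test"))); // [2, 4]
    }
}
```

With a single shard and no deleted documents, these values agree with `doc_count`, `sum_doc_freq`, `sum_ttf`, `doc_freq` and `ttf` in the response.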

Further changes:
-------------------------

XContentBuilder
new method
public XContentBuilder field(XContentBuilderString name, int offset, int length, int... value)
to put an integer array.

IndicesAnalysisService
make token filter for saving payloads available in elasticsearch

AbstractFieldMapper/TypeParser
make term vector options string available and also fix the parsing of this string:
with_positions_payloads is actually allowed as can be seen in TermVectorsConsumerPerFields.

Closes #3114
2013-06-10 11:09:11 +02:00
Simon Willnauer 945b89fd80 Don't test the test - who tests the test for the test? ;) 2013-06-07 20:40:50 +02:00
Simon Willnauer b222e83d2b Stabilize more tests 2013-06-07 20:33:17 +02:00
Britta Weber ac75b1bcae Fix addMapping() in AbstractSharedClusterTest for more than one field 2013-06-07 19:05:13 +02:00
Alexander Reelsen a5f9173e14 Making deb installable by being lintian compatible
According to #2515 the ubuntu software center does not allow to install
debian packages which are not lintian compatible

I worked on the package and made it lintian compatible by doing

* Ignoring errors about arch dependent binaries as we will not split
  this package. The arch dependent libraries are used correctly.
* Added a copyright file pointing to the apache license in debian

Closes #2515
Closes #2320
2013-06-07 13:53:14 +02:00
Simon Willnauer 962e3d58f7 Added shortcuts for several common commands
added simple way to add more complex mappings as well as shortcuts for flush
and status etc. all checking if requests return failed shards
2013-06-07 12:30:30 +02:00
Martijn van Groningen 8016d32a0e Fixed minor issue in ASCT#indexExists(...) 2013-06-06 21:42:42 +02:00
Martijn van Groningen e218ead19e ChildrenQuery and ParentQuery now take into account documents that have been marked.
Closes #3144
2013-06-06 17:13:49 +02:00
Simon Willnauer 3b01f812d6 Stabilize more tests
Wait for relocation before checking statistics or run refresh / optimize.
2013-06-06 17:03:36 +02:00
Simon Willnauer 1c513bc262 Fallback to extract terms if MultiPhraseQuery is large
Currently if MPQ is very large highlighting can take down a node
or cause high CPU / RAM consumption. If the query grows > 16 terms
we just extract the terms and do term by term highlighting.

Closes #3142 #3128
2013-06-06 11:22:49 +02:00
Simon Willnauer f995c9c130 Correct offsets in FVH also if stored field is used for highlighting
The SimpleFragmentsBuilder did not correct offsets if the used
analysis chain could produce broken offsets that could lead to
StringArrayIndexOutOfBounds Exceptions

Closes #3140
2013-06-06 10:23:09 +02:00
Simon Willnauer 00c13532a9 report details if shard response has failed shards 2013-06-06 00:54:34 +02:00
Martijn van Groningen 7936417270 Added a benchmark for parent/child queries while indexing at the same time. 2013-06-05 22:27:18 +02:00
Martijn van Groningen 82ff1c6802 Fixed `has_parent` query and filter returning no results with multi level child docs. 2013-06-05 22:12:26 +02:00
Simon Willnauer 56dfa96851 More test cleanups 2013-06-05 15:45:03 +02:00
Simon Willnauer 23ad8401d0 Fix SearchStatsTest
Use actual node in test instead of the first node in the array
2013-06-05 09:47:37 +02:00
Simon Willnauer 4ff471ff82 Stabilize more failing tests.
- SimpleSortTests#testSortScript which was not using the mapping correctly
- SearchStatsTests#testSimpleStats which didn't clear the stats before
  running the test and a previous run could have added queries
2013-06-04 08:32:48 +02:00
Adrien Grand 85f54edf66 Fix AbstractSimpleEngineTests versioning tests.
Version is now stored on a distinct field, that AbstractSimpleEngineTests
didn't correctly add before running tests. This generated a test failure
when the version needed to be loaded from the index.
2013-06-04 00:58:54 +02:00
Simon Willnauer 07546d4d8d Stabilize SearchStatsTests
Query stats are only present (not 0) on nodes that hold a shard of the index.
2013-06-03 15:05:57 +02:00
Christoph Kempen 9f43814a86 Changed Java dependency from Depends to Suggest.
Since people are using the Oracle JAVA distribution and not the OpenJDK.
You can suggest it of course. Now the installation will at least continue.

If the init script is called, it will exit with a useful error message, that
no JDK is available via the JAVA_HOME variable.
2013-06-02 15:09:29 +02:00
Alexander Reelsen 609ad0e572 Changing version semantics to be more readable
The Version class had hard to understand semantics when two versions were
compared against each other.

Sample of the new logic:
* V_0_20_0.before(V_0_90_0) => true
* V_0_90_0.after(V_0_20_0)  => true
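
The semantics listed above can be sketched in a few lines. This is only an illustration, assuming each version is encoded as a single integer that orders naturally. `VersionSketch` and its id encoding are hypothetical and are not the actual `Version` class.

```java
// Hypothetical stand-in for the real Version class; only the comparison
// semantics from the commit message are sketched here.
final class VersionSketch {
    static final VersionSketch V_0_20_0 = new VersionSketch(0, 20, 0);
    static final VersionSketch V_0_90_0 = new VersionSketch(0, 90, 0);

    // Assumed encoding: one integer per version, so that newer versions compare greater.
    private final int id;

    VersionSketch(int major, int minor, int revision) {
        this.id = major * 1_000_000 + minor * 10_000 + revision * 100;
    }

    // True if this version is strictly older than the other one.
    boolean before(VersionSketch other) {
        return id < other.id;
    }

    // True if this version is strictly newer than the other one.
    boolean after(VersionSketch other) {
        return id > other.id;
    }

    public static void main(String[] args) {
        System.out.println(V_0_20_0.before(V_0_90_0)); // true
        System.out.println(V_0_90_0.after(V_0_20_0));  // true
    }
}
```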

Closes #3124
2013-06-02 14:58:36 +02:00
Simon Willnauer 3417b945dd stabilize SimpleQueryTest 2013-06-02 13:02:36 +02:00
Simon Willnauer a3f4d33aaa Stabilize MoreLikeThisActionTest
Ensure test sends mapping with createIndex
2013-06-02 10:45:30 +02:00
Simon Willnauer a5837b0f8d Stabilize SearchStatsTest after search refactoring
SearchStatsTest depends on a given set of nodes and shards. The test
needed to be adjusted to reflect a possibly random number of nodes.
2013-06-02 10:04:47 +02:00
Simon Willnauer 2682c24975 Add Test for simple allocation scenario
This test checks for the "perfect" or a "sane" allocation
when the total number of shards is divisible by the total number of nodes
the index can be allocated on.
2013-06-02 08:06:40 +02:00
Alexander Reelsen 183bb76371 Ensure config files are not overwritten in RPM upgrade
In order to ensure that configuration files do not get overwritten when
upgrading an RPM, it is not sufficient to mark them as configuration. You
have to use the 'noreplace' parameter to make sure, they are never
overwritten. Added this parameter for the /etc/elasticsearch directory
as well as the /etc/sysconfig/elasticsearch file.

In addition, the post remove script now only deletes the user in case of
a package removal (and does nothing on package upgrade).

Closes #3123
2013-05-31 16:26:10 +02:00
Simon Willnauer e6a3c9c153 Improve integration testing by reusing an abstracted cluster across tests
The new AbstractSharedClusterTest abstracts integration testing further to
reduce the overhead of writing tests that don't rely on explicit control over
the cluster. For instance tests that run query, facets or that test highlighting
don't need to explicitly start and stop nodes. Testing features like the ones
just mentioned are based on the assumption that the underlying cluster can
be arbitrary. Based on this assumption this base class allows to:

 * randomize cluster and index settings if not explicitly specified
 * transparently test transport & node clients
 * test features like search or highlighting on different cluster sizes
 * allow reuse of node instances across tests
 * provide utility methods that act as upper or lower bounds that a test must pass with
   i.e. if a test requires at least 3 nodes then it should also pass with 4 nodes
 * given a cluster has unmodified cluster settings (persistent and transient) the cluster
   should not differ from a freshly started cluster when reused across nodes.
 * within a test the client implementation and the clients associated node can be changed
   at any time and should return a valid result.

This patch also prepares some redundant tests like 'RelocationTests.java' for randomized
testing. Tests like this are very long-running on some machines and run the same test with
different parameters like 'number of writers' or 'number of relocations' which can easily
be chosen with a random number and run only once during development but multiple times
during CI builds.

All the improvements in this change reduce the test time by ~30%
2013-05-31 10:23:45 +02:00
iksnalybok 47154a79c5 Allow negative slops in SpanNearQueryParser
This is mainly due to the fact that SpanNearQuery allows some neat
tricks with negative slops to run zero-sloped near queries across
2 or more SpanTermQueries.

Closes #3079
2013-05-31 09:35:46 +02:00
David Pilato 663f653ced Add more information and options in PluginManager
New option -l, --list displays list of existing plugins
New option -h, --help displays help
Deprecate options:
   -install is now -i, --install
   -remove is now -r, --remove
   -url is now -u, --url
Catch ArraysOutOfBoundException when no arg given to install, remove or url option
Add description on plugin name structure:
- elasticsearch/plugin/version for official elasticsearch plugins (download from download.elasticsearch.org)
- groupId/artifactId/version   for community plugins (download from maven central or oss sonatype)
- username/repository          for site plugins (download from github master)
Closes #3112.
2013-05-30 22:26:05 +02:00
Adrien Grand c16a46e15c Make it easier to get started with Eclipse.
This patch makes mvn eclipse:eclipse generate additional eclipse configuration
files so that Eclipse:
 - uses Java 1.6 compliance level,
 - truncates lines after 140 chars,
 - uses 4 spaces for indentation,
 - automatically adds a license header when creating a new class file,
 - organizes imports the same way as Intellij Idea (which makes sense I guess
   since most of the code base has been written with Intellij, this will prevent
   from having large diffs due to the fact that the order of imports has
   changed).
2013-05-30 16:48:58 +02:00
Shay Banon 7931add154 add 0.90.2 version 2013-05-30 14:27:33 +02:00
Adrien Grand 490c7103ae Store _version as a numeric doc values field.
Doc values can be expected to be more compact than payloads and should provide
better flexibility since doc values formats can be picked on a per-field basis.
This patch:
 - makes _version stored as a numeric doc values field,
 - manages backwards compatibility: if a version is not found in doc values,
   then it will look into payloads,
 - uses background merges to upgrade old segments and move _version from
   payloads to doc values.

Closes #3103
2013-05-30 11:28:54 +02:00
Adrien Grand 5ea6c77dad Highlighting shouldn't fail when the field to highlight is absent.
PlainHighlighter fails with a NPE when the field to highlight is marked as
stored in the mapping but doesn't exist in a hit. This patch makes
FieldsVisitor.fields less error-prone by returning an empty list instead
of null when no matching stored field was found.
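
The idea of returning an empty list instead of null can be sketched as follows. `StoredFieldsSketch` is a hypothetical class written for illustration, assuming Java 8 or later. It does not mirror the real `FieldsVisitor` API.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration; not the real FieldsVisitor.
class StoredFieldsSketch {
    private final Map<String, List<Object>> values = new HashMap<>();

    // Records one stored value for the given field name.
    void add(String name, Object value) {
        values.computeIfAbsent(name, k -> new ArrayList<>()).add(value);
    }

    // Never returns null: callers can iterate without a null check.
    List<Object> fields(String name) {
        List<Object> found = values.get(name);
        return found != null ? found : Collections.emptyList();
    }

    public static void main(String[] args) {
        StoredFieldsSketch hit = new StoredFieldsSketch();
        hit.add("title", "some stored text");
        // The "body" field is mapped as stored but absent from this hit.
        for (Object value : hit.fields("body")) {
            System.out.println(value); // never reached, and no NPE is thrown
        }
        System.out.println(hit.fields("title").size()); // prints 1
    }
}
```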

Closes #3109
2013-05-30 10:56:36 +02:00
Alexander Reelsen 03a86604a4 Reuse suggester implementations in suggest parsers 2013-05-29 15:07:23 +02:00
Alexander Reelsen 8a5b7b21df Make suggester implementation pluggable
This patch tries to make the suggester implementation as pluggable as
facets or highlight implementations. The goal is to be able to create
own suggest implementations in a suggest query.

Closes #3089
2013-05-28 08:59:50 +02:00
Martijn van Groningen 8b95c5fab8 Added indices aliases exists api.
Added indices aliases exists api that allows checking the existence of an index alias. This api redirects to the master to check for the existence of one or multiple index aliases.

Possible options:
* `index` - The index name to check index aliases for. Partial names are supported via wildcards; multiple index names can also be specified, separated by commas. The alias name for an index can also be used.
* `alias` - The name of the alias to check the existence of. Like the index option, this option supports wildcards and multiple alias names separated by commas. This is a required option.
* `ignore_indices` - What to do if a specified index name doesn't exist. If set to `missing` then those indices are ignored.

The rest head endpoint is: `/{index}/_alias/{alias}`

Examples:
Check existence for any aliases with the name 2013 in any index:
```
curl -XHEAD 'localhost:9200/_alias/2013'
```
Check existence for any aliases that start with 2013_01 in any index
```
curl -XHEAD 'localhost:9200/_alias/2013_01*'
```
Check existence for any aliases in the users index.
```
curl -XHEAD 'localhost:9200/users/_alias/*'
```

Closes #3100
2013-05-27 20:40:11 +02:00
Shay Banon b4d75a50bf Dates accessed from scripts should use UTC timezone
this was broken in the field data refactoring we did in 0.90, fixes #3091
2013-05-25 22:43:48 +02:00
Alexander Reelsen 24fccc91d8 Stabilized SimplePercolaterTests 2013-05-24 22:22:43 +02:00
Alexander Reelsen 2e4d18b519 Fixing percolation of documents with TTL set
When a type is configured with a TTL, percolation of documents of this type
was not possible. This fix ignores the TTL for percolation instead of
throwing an exception that the document is already expired.

Closes #2975
2013-05-24 17:58:02 +02:00
Simon Willnauer 6e366bae34 Never throw an IAE if the IndexMapper isn't present in PostingsFormat
If we throw an exception in the PostingsFormat during a merge we essentially
fail the entire merge which can lead to a corrupt index. We should rather
return the default postings format for the new segment and log a warning.

Closes #3088
2013-05-24 17:40:36 +02:00
Martijn van Groningen 9ed274822d wait for green/yellow status 2013-05-24 17:32:45 +02:00
Simon Willnauer d5ca1be34e Added testcase to ensure #3078 doesn't fail 2013-05-23 23:18:45 +02:00
Simon Willnauer 13c1145548 Fix String.format to use Locale.ROOT 2013-05-23 18:26:12 +02:00
Martijn van Groningen ffdebe9bc3 Added three new index alias related apis.
Added apis to get specific index aliases based on filtering by alias name and index name:
```
curl -XGET 'localhost:9200/{index_or_alias}/_alias/{alias_name}'
```

Added delete index alias api for deleting a single index alias:
```
curl -XDELETE 'localhost:9200/{index}/_alias/{alias_name}'
```

Added create index alias api for adding a single index alias:
```
curl -XPUT 'localhost:9200/{index}/_alias/{alias_name}'

curl -XPUT 'localhost:9200/{index}/_alias/{alias_name}' -d '{
	"routing" : {routing},
	"filter" : {filter}
}'

```

Closes #3075 #3076 #3077
2013-05-23 09:18:17 +02:00
Simon Willnauer 841c2d1e14 Fix bug in DateFieldMapper where format is serialized instead of locale
This fix adds a default serialization step in the SimpleDateMappingTests
that parses the mapping, builds the mapper, serializes the mapper and
rebuilds the actual mapper from the serialization result. The contained
information must be equivalent to the original mapping.

The fixed bug has no issue assigned to it since the code is unreleased yet.
2013-05-22 21:14:19 +02:00
Clinton Gormley bb9871bcb5 Changed common terms query to also support camelCased parameters
and renamed disable_coords to disable_coord, to be consistent with
the bool query

Closes #3074
2013-05-22 16:52:32 +02:00
Matt Weber 927fda8a61 Apply QueryParser boost to top level query if applicable.
Set the query boost of a parsed query string query to the product of
the parsed query boost and the boost value specified in the "boost"
query string parameter. This only applies if the top level query returned
from the query parser has a boost assigned to it. In such a case we must
multiply the boost with the top level query boost otherwise the boost
will be overwritten ie. 'foo^2' has a top-level boost of 2 while
'foo^2 OR bar^3' has a top level boost of 1.0 (default) since the
boolean query is the top level query.
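
The arithmetic described above is a plain product. A minimal sketch, with `BoostSketch` and `effectiveBoost` as hypothetical names rather than the query parser's API:

```java
// Hypothetical helper that only illustrates the arithmetic from the commit message.
class BoostSketch {
    // Multiplies the boost parsed from the query text by the "boost" request
    // parameter, so neither value overwrites the other.
    static float effectiveBoost(float parsedTopLevelBoost, float boostParameter) {
        return parsedTopLevelBoost * boostParameter;
    }

    public static void main(String[] args) {
        // 'foo^2' has a top-level boost of 2; with "boost": 3 the result is 6.
        System.out.println(effectiveBoost(2.0f, 3.0f)); // prints 6.0
        // 'foo^2 OR bar^3' has a top-level boost of 1.0; with "boost": 3 the result is 3.
        System.out.println(effectiveBoost(1.0f, 3.0f)); // prints 3.0
    }
}
```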

Closes #3024
2013-05-21 10:29:28 +02:00
Simon Willnauer af4205fd30 Fix method name typo & beef up tests
s/DateFieldMapper#parseLocal/DateFieldMapper#parseLocal/
SimpleDateMappingTests now also checks locale-dependent
patterns with the root locale.
2013-05-21 09:37:35 +02:00
Shay Banon e0825686f3 rollback multi get fields change
seems like it still fails while serializing with sporadic failures in the tests (due to routing on serialization), need to test it in a consistent manner
2013-05-20 06:56:37 -07:00
Shay Banon 0066566357 External terms doesn't work with _id field
fixes #3063
2013-05-20 06:17:18 -07:00
Shay Banon d983389090 multi get semantics of empty/null fields missing
the semantics between null fields (asking for source), and empty fields (not asking for anything) is missing
also exposes the items in the request, relates to #3061
2013-05-20 05:09:48 -07:00
Simon Willnauer 31f0aca65d Integrate forbiddenAPI checks into Maven build.
This commit integrates the forbiddenAPI checks that checks
Java byte code against a list of "forbidden" API signatures.
The commit also contains the fixes of the current source code
that didn't pass the default API checks.

See https://code.google.com/p/forbidden-apis/ for details.

Closes #3059
2013-05-19 23:25:44 +02:00
Simon Willnauer c4db582f26 Allow Date Fields to have a locale for date parsing
Currently, if somebody uses a date format that is locale-dependent,
date fields can only parse a single format depending on the node's
host locale. This can cause lots of problems since nodes might have
different locales. ie. "E, d MMM yyyy HH:mm:ss Z" where you have
"Wed, 06 Dec 2000 02:55:00 -0800" for en_EN while
"Mi, 06 Dez 2000 02:55:00 -0800" for de_DE.

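The locale dependence can be reproduced outside Elasticsearch. The sketch below uses the JDK's `SimpleDateFormat` purely as a stand-in, which is an assumption made for illustration and not necessarily the date library the mapper uses. `DateLocaleSketch` is a hypothetical name.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Locale;

// Illustration with the JDK date formatter as a stand-in for the mapper's parser.
class DateLocaleSketch {
    static final String PATTERN = "E, d MMM yyyy HH:mm:ss Z";

    // Returns true if the text can be parsed with the pattern under the given locale.
    static boolean parses(String text, Locale locale) {
        try {
            new SimpleDateFormat(PATTERN, locale).parse(text);
            return true;
        } catch (ParseException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String english = "Wed, 06 Dec 2000 02:55:00 -0800";
        // The same input is accepted or rejected depending only on the locale.
        System.out.println(parses(english, Locale.ENGLISH));
        System.out.println(parses(english, Locale.GERMAN));
    }
}
```

Passing the locale explicitly, rather than relying on the host default, is what keeps the result identical on every node.
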
Closes #3047
2013-05-19 23:25:44 +02:00
Shay Banon e580507fbe wait for yellow after the index is created
also, remove starting one node, it is not useful for the test, and slows down the execution
2013-05-17 18:21:51 +02:00
Simon Willnauer 17681d7104 Grow array buffer in ScriptDocValues if needed
The buffer in ScriptDocValues for Strings was never grown, causing
NPE in scripts if a document has > 10 distinct values in a field.
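
A grow-on-demand buffer of the kind this fix calls for might look like the sketch below. It is a generic illustration using `Arrays.copyOf`. `GrowableStringBuffer`, its growth factor and its initial capacity of 10 (chosen to match the number in the commit message) are assumptions, not the actual ScriptDocValues code.

```java
import java.util.Arrays;

// Hypothetical buffer; not the actual ScriptDocValues implementation.
class GrowableStringBuffer {
    // Assumed initial capacity, picked to match the "10 distinct values" above.
    private String[] values = new String[10];
    private int size;

    // Appends a value, growing the backing array first when it is full.
    void add(String value) {
        if (size == values.length) {
            values = Arrays.copyOf(values, size + (size >> 1) + 1); // grow by about 50%
        }
        values[size++] = value;
    }

    String get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index + ", size " + size);
        }
        return values[index];
    }

    int size() {
        return size;
    }

    public static void main(String[] args) {
        GrowableStringBuffer buffer = new GrowableStringBuffer();
        for (int i = 0; i < 25; i++) {
            buffer.add("v" + i); // the 11th value would overflow a fixed-size buffer
        }
        System.out.println(buffer.size());  // prints 25
        System.out.println(buffer.get(24)); // prints "v24"
    }
}
```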

Closes #3051
2013-05-17 15:16:59 +02:00
Alexander Reelsen bd857d6d2e Ensuring test isolation when jvm plugins are loaded
Instead of specifying 'path.plugins' configuration option, 'plugin.types'
is used to load plugins in integration tests. This makes sure the JVM
plugins are not loaded in all following tests from then.

Also removed the now unneeded es-plugin.properties files from JVM test
plugins.
2013-05-17 13:21:26 +02:00
Alexander Reelsen 2485c4890c Packaging improvements & bugfixes
* RPM: Use the ES_USER variable to set the user (same name as in the debian package
  now), while retaining backwards compatibility to existing /etc/sysconfig/elasticsearch
* RPM: Bugfix: Remove the user when uninstalling the package
* RPM: Set an existing homedir when adding the user (allows one to run cronjobs as this user)
* DEB & RPM: Unify Required-Start/Required-Stop fields in initscripts
2013-05-17 11:14:44 +02:00
Alexander Reelsen 2e07af63ba Allowing pluggable highlighter implementations.
Currently elasticsearch ships with the plain and the fast-vector highlighter.
In order to support arbitrary highlighters via plugins, you only need to
implement a Highlighter interface and register your implementation in your
plugin at the HighlightModule.

In addition you can also add arbitrary options via the 'options' field in
the highlight request, which can be parsed in the highlighter implementation.

In order to find out how to add your own highlighter, check out the test
classes (CustomHighlighterSearchTests and CustomHighlighter).

Closes #2828
2013-05-17 09:07:13 +02:00
Martijn van Groningen db421742f7 Added support for nested sorting for script sorting and geo sorting.
Closes #3044
2013-05-16 18:45:00 +02:00
Martijn van Groningen 42d5bdd337 If matching root doc's inner objects don't match the `nested_filter` then the `missing` value should be used to sort the root doc.
Closes #3020
2013-05-16 10:12:02 +02:00
Shay Banon 2779967279 fix package name... 2013-05-15 17:06:42 +02:00
Martijn van Groningen bc0c7f8f28 Added simple id loading test.
Relates to #3028
2013-05-15 16:10:22 +02:00
Simon Willnauer 8235b89e9c Don't apply min frequency smoothing if suggest type is 'always'
Using an automatically detected 'min_doc_freq' if suggest type is set to
'always' is counter intuitive. If we suggest always ignore the frequency and
set threshold frequency to 0 to allow all possible candidates to be drawn if
they are within the given bounds.

Closes #3037
2013-05-15 15:17:49 +02:00
Martijn van Groningen 48cb06c9cf Keep backwards compatible with 0.90.0 on the transport layer.
Relates to #3039
2013-05-15 13:28:55 +02:00
Martijn van Groningen 585cbf6886 Routing value not serialized on transport layer.
Closes #3039
2013-05-15 13:09:13 +02:00
Clinton Gormley db805cf5a9 Corrected English in a shard error message 2013-05-15 12:41:49 +02:00
Clinton Gormley 4d09e7562a Corrected a typo and improved the English in a master-discovery error 2013-05-15 12:39:31 +02:00
Shay Banon f92eed8591 clean thread locals without needing a wrapper
clean thread locals smartly by identifying "our" classes, and removing them, so there is no need to wrap them in our own cleanable values
2013-05-15 12:13:13 +02:00
Shay Banon 4d357660ca reuse version key in an actual operation
no need to compute the hash several times
2013-05-15 00:27:48 +02:00
Shay Banon 1fb78c53b8 remove unused class 2013-05-14 20:21:37 +02:00
Shay Banon 1c7d2442c8 use bytes instead of String as key in versionMap
no need to create a String every time we put or get a value from the version map
2013-05-14 20:18:54 +02:00
Martijn van Groningen 15fcb17a81 During parent uid loading seek to next parent type when child type is encountered.
Relates to #3028
2013-05-14 16:22:05 +02:00
Simon Willnauer 6d5805c901 Use Recovery Throttling by default.
To prevent excessive resource use during recovery we use
recovery throttling by default to prevent unexpected peak load
on clusters. The default is set to 20 MB/sec.

Closes #3035
2013-05-14 15:10:03 +02:00
Simon Willnauer 6624949501 Use Merge Throttling by default on node level.
Merge Throttling is one of the most recommended settings and crucial in the
RealTime indexing case. We should set the default to a reasonable setting
that allows folks to index in a production index and not see large merge
peaks by default. The default is set to 20 MB/sec on the node level.

Closes #3033
2013-05-14 15:10:03 +02:00
Simon Willnauer 09fb2264d0 Raise search threadpool default size.
The default size used to be 2x availableProcessors which seemed to
be too low a value in practice. 3x appeared to be a sweet spot for
most applications. The default is now 3 x availableProcessors

Closes #3023
2013-05-14 15:10:03 +02:00
uboness d06a15ec3e Support for term facets on unmapped fields
Added support for unmapped & partially mapped fields (partially mapped fields may occur when searching across multiple indices where the faceted field is mapped on some and unmapped on others). If a shard doesn't have mappings for a field, the matching documents count on that shard will be added to the missing count for that facet.
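
How the missing count is combined across shards can be sketched as below. `MissingCountSketch` and `ShardResult` are hypothetical types used only to show the rule described above. They are not the facet implementation.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical model of per-shard facet input; not the actual facet code.
class MissingCountSketch {
    static final class ShardResult {
        final boolean fieldMapped;    // does this shard have a mapping for the faceted field?
        final long matchingDocs;      // documents matching the query on this shard
        final long docsWithoutValue;  // matching documents lacking a value (mapped shards only)

        ShardResult(boolean fieldMapped, long matchingDocs, long docsWithoutValue) {
            this.fieldMapped = fieldMapped;
            this.matchingDocs = matchingDocs;
            this.docsWithoutValue = docsWithoutValue;
        }
    }

    // On an unmapped shard every matching document counts as missing.
    static long totalMissing(List<ShardResult> shards) {
        long missing = 0;
        for (ShardResult shard : shards) {
            missing += shard.fieldMapped ? shard.docsWithoutValue : shard.matchingDocs;
        }
        return missing;
    }

    public static void main(String[] args) {
        List<ShardResult> shards = Arrays.asList(
                new ShardResult(true, 10, 2),   // mapped: 2 of 10 matches have no value
                new ShardResult(false, 5, 0));  // unmapped: all 5 matches are missing
        System.out.println(totalMissing(shards)); // prints 7
    }
}
```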
2013-05-14 13:53:41 +02:00
Martijn van Groningen 906f278896 Make sure only relevant documents are evaluated in the second round lookup phase.
Both has_parent and has_child filters are internally executed in two rounds. In the second round all documents are evaluated whilst only specific documents need to be checked. In the has_child case only documents belonging to a specific parent type need to be checked and in the has_parent case only child documents need to be checked.

Closes #3034
2013-05-14 11:02:03 +02:00
Shay Banon ae6c1b345f Allow to disable allocation on the index level
Similar to the global cluster wide disable allocation flags, allow to set those on a specific index by updating its settings. The keys are the same as the cluster one, except they start with an index, for example: index.routing.allocation.disable_allocation set to true.
closes #3031
2013-05-14 10:25:23 +02:00
Simon Willnauer 7b437e801a Added test for LimitTokenCountFilterFactory 2013-05-14 09:58:43 +02:00
Brusic 183ac1e04c Expose LimitTokenCountFilter as a TokenFilter
Closes #3013
2013-05-14 09:58:42 +02:00
Martijn van Groningen 669cf90d0c Not load the ids of child documents into memory.
Closes #3028
2013-05-14 09:46:43 +02:00
Alexander Reelsen 31b4b7ea58 Renaming span_multi_term query to span_multi
... due to discussing this on #2610 in order to have a more concise name
2013-05-13 12:32:57 +02:00
Simon Willnauer cffe333fe3 Ensure tests pass if store dir is a soft-link 2013-05-13 12:08:41 +02:00
Simon Willnauer a3a2ca0ad3 Reduce branches in TopChildrenQuery
The branches used in the score method can be moved into the
scorer call and be essentially a constant operation rather than
a linear operation depending on the number of parent docs.
2013-05-13 12:08:41 +02:00
Alexander Reelsen 52654179e7 Fix for RPM postinstall on old OpenSUSE distributions
Older OpenSUSE distributions do not ship with systemd and therefore are
using chkconfig, but do not have their scripts placed at /etc/init.d/
This patch is more defensive and adds additional checks in the postinstall
script to prevent aborted post install scripts, which makes the RPM
uninstallable.
2013-05-13 11:48:04 +02:00
Martijn van Groningen 3c58176d29 Also support `sum` as `score_mode` option for the nested query.
Relates to #3026
2013-05-13 10:38:20 +02:00
Martijn van Groningen 6eaad25621 Made all the queries support `score_mode` parameter name in addition to the existing parameter name for score mode.
Closes #3026
2013-05-13 10:30:01 +02:00