Commit Graph

3294 Commits

Author SHA1 Message Date
Martijn van Groningen 0973b2863c Added extra rest endpoint for get settings api.
Added rest test to also test the get settings' prefix option.
2014-01-09 09:44:40 +01:00
David Pilato 36e58c092b Add more traces in case of failure when testing with actual plugins
(cherry picked from commit 0b2ff1e)
2014-01-09 09:27:44 +01:00
Shay Banon bc0909b232 move RestRequest to be an abstract class, and expose local/remote address 2014-01-09 00:05:03 +01:00
Leonardo Menezes 8686ffe761 Expose headers list in RestRequest
Closes #4609
2014-01-08 23:13:58 +01:00
Shay Banon 8f2b8ec8a7 use proper logging 2014-01-08 21:17:49 +01:00
Shay Banon efa59f37a8 Create standard gc and memory_pool names for Jvm stats
fixes #4661
2014-01-08 21:16:10 +01:00
ohnorobo b7a5537d83 Cleaning up nodenames
Common changes:
Lastname, Firstname -> Firstname Lastname
Name I -> Name
Title-style capitalization
Removed L-to-R characters ( <e200> )
Removed duplicates
Alphabetical order
2014-01-08 19:23:24 +01:00
Shay Banon e9f5e5a8b3 when specifying size 0, just use the total hits collector in query phase
no need for the (old) hack of setting the size to 1 anymore really...
2014-01-08 18:28:58 +01:00
Costin Leau 7ae4d101ab add support for tera and peta units 2014-01-08 18:13:49 +02:00
Igor Motov bec6527312 Add support for flat_settings flag to all REST APIs that output settings
Closes #4140
2014-01-08 10:36:36 -05:00
Luca Cavanna 6c23ace68f Fixed open/close index api when using wildcard only
Named wildcards were not always properly replaced with proper values by PathTrie.
Delete index (curl -XDELETE localhost:9200/*) worked anyway as the named wildcard is the last path element (and even if {index} didn't get replaced with '*', the empty string would have mapped to all indices anyway). When the named wildcard wasn't the last path element (e.g. curl -XPOST localhost:29200/*/_close), the variable didn't get replaced with the current '*' value, but with the empty string, which leads to an error as empty index is not allowed by open/close index.

Closes #4564
2014-01-08 15:01:49 +01:00
Shay Banon c6fefacb2f move ack tests for cluster update settings to its own test with scope test, and all other ack tests to scope suite for faster execution 2014-01-08 14:39:43 +01:00
Martijn van Groningen 6dc434822c Changed get index settings api to use new internal get index settings api instead of relying on the cluster state api.
The new internal get index settings api is more efficient when it comes to sending the index settings from the master to the client via the
Also, the get index settings API now supports all the indices options.

Closes #4620
2014-01-08 13:18:57 +01:00
Adrien Grand 0a36d6da26 Fix compilation under javac 1.6. 2014-01-08 12:17:55 +01:00
Nik Everett 8bd9e34e39 Stop FVH from throwing away some query boosts
The FVH was throwing away some boosts on queries stopping a number of
ways to boost phrase matches to the top of the list of fragments from
working.

The plain highlighter also doesn't work for this but that is because it
doesn't support the concept of the same term having a different score at
different positions.

Also update documentation claiming that FVH is nicer for weighing terms
found by query combinations.

Closes #4351
2014-01-08 11:51:48 +01:00
Nik Everett 522d620eb6 Use FVH's phraseLimit
This prevents poisoning the FVH with documents that contain TONS of matches
which take tons of memory and time to highlight.

Closes #4645
2014-01-08 11:27:58 +01:00
Martijn van Groningen 2a0842f1b2 Multi-percolate respects the `rest.action.multi.allow_explicit_index` setting
Closes #4284
2014-01-08 11:18:50 +01:00
Alexander Reelsen ad50afbec8 Simplify usage of nodes info API
Important: This breaks backwards compatibility with 0.90

* Removed endpoints: /_cluster/nodes, /_cluster/nodes/nodeId1,nodeId2
* Disallow usage of parameters, but make required metrics part of URI
* Changed NodesInfoRequest to return everything by default
* Fixed NPE in NodesInfoResponse

Closes #4055
2014-01-08 09:46:04 +01:00
Alexander Reelsen 6ef6bb993c Cluster state API: Improved consistency
Instead of specifying what kind of data should be filtered, this commit
streamlines the API to specify what kind of data should be displayed.
This makes its behaviour similar to the other requests, like NodeIndicesStats.

A small feature has been added as well: If you specify an index to select on, not
only the metadata, but also the routing tables are filtered by index in order
to prevent overly large cluster states from being returned.

Also the CAT apis have been changed to only return the wanted data in order to keep
network traffic as small as needed.

Tests for the cluster state API filtering have been added as well.

Note: This change breaks backwards compatibility with 0.90!

Closes #4065
2014-01-08 09:25:20 +01:00
Igor Motov b9aaa79afd Add missing license headers, move package names after license headers 2014-01-07 14:28:09 -05:00
Igor Motov 2b49ec138c Fix mixed up duration and duration_in_millis fields in snapshot information output 2014-01-07 14:24:02 -05:00
Shay Banon 4aa5ef139e randomize flush interval so multiple shards won't flush at the same time
- also, allow to update interval using update settings on an index
2014-01-07 19:58:28 +01:00
markharwood 602de04692 A GeoHashGrid aggregation that buckets GeoPoints into cells whose dimensions are determined by a choice of GeoHash resolution.
Added a long-based representation of GeoHashes to GeoHashUtils for fast evaluation in aggregations.
The new BucketUtils provides a common heuristic for determining the number of results to obtain from each shard in "top N" type requests.
2014-01-07 18:03:33 +00:00
Shay Banon abf68c472e don't schedule a flush if there are no operations in the translog
this can happen when the 30m time-based setting expires
2014-01-07 18:53:58 +01:00
Lee Hinman 2cb40fcb17 Rename "exists" to "found" in TermVector and Get responses
- Adds the "created" field to the index action response
- Reverses Delete class' notFound to Found to avoid double negative
2014-01-07 09:47:07 -07:00
Lee Hinman d23f640cd1 Remove hard-coded "ok": true from REST responses 2014-01-07 09:27:07 -07:00
Luca Cavanna 7166ab6b6f Fixed match assertion that didn't run any assert with object of different types that don't extend Number 2014-01-07 16:21:33 +01:00
Adrien Grand 107ae66a60 Replace RecyclerUtils with Releasables. 2014-01-07 14:44:24 +01:00
Adrien Grand 3413eedd8d Fix license header of BytesRefComparisonsBenchmark. 2014-01-07 12:09:40 +01:00
Adrien Grand 1ddc0493df Add a benchmark for BytesRef comparisons. 2014-01-07 12:03:24 +01:00
Simon Willnauer a4b2366e1e Add missing license headers 2014-01-07 11:41:01 +01:00
Simon Willnauer 10ec2e948a Fix ASL Header in source files to reflect s/ElasticSearch/Elasticsearch
This commit also removes the license to Shay Banon in favor of solely
Elasticsearch. Thanks Shay for this awesome product you took it far!

Closes #4636
2014-01-07 11:22:01 +01:00
Simon Willnauer fa16969360 Cleanup comments and class names s/ElasticSearch/Elasticsearch
* Clean up s/ElasticSearch/Elasticsearch on docs/*
 * Clean up s/ElasticSearch/Elasticsearch on src/* bin/* & pom.xml
 * Clean up s/ElasticSearch/Elasticsearch on NOTICE.txt and README.textile

Closes #4634
2014-01-07 11:21:51 +01:00
Igor Motov 56b3941706 Make partial dates without year to be 1970 based instead of 2000
Fixes #4451

Date fields without date (HH:mm:ss, for example) are parsed as time on Jan 1, 1970 UTC. However, before this change partial dates without year (MMM dd HH:mm:ss, for example) were parsed as days of the year 2000. This change makes all partial dates be treated as based on the year 1970. This is a breaking change: before this change "Dec 15, 10:00:00" in most cases was parsed (and indexed) as "2000-12-15T10:00:00Z". After this change, it will be consistently parsed and indexed as "1970-12-15T10:00:00Z".
2014-01-06 19:51:08 -05:00
Andrew Raines 5a02ec86a8 cat stats: remove "total" prefix and shorten "primaries" to "pri"
total is the normal case and primaries just shows up in _cat/indices.
2014-01-06 18:50:03 -06:00
Andrew Raines 27a84afe24 Update primaryOrReplica column. 2014-01-06 17:24:33 -06:00
Andrew Raines 19e26122a4 Add rest of index stats to cat/nodes and cat/shards.
Closes #4607.
2014-01-06 17:24:31 -06:00
Andrew Raines 5ca0d47fa4 Add total stats to cat/indices. 2014-01-06 16:08:53 -06:00
Andrew Raines 97b51723f4 Add remaining aliases to cat/indices. 2014-01-06 16:08:08 -06:00
Andrew Raines 38894632c3 Swap help and headers params. 2014-01-06 16:08:08 -06:00
Andrew Raines cf81e19cce Add alias sample. 2014-01-06 16:08:08 -06:00
Andrew Raines 03c068fc8a Add rest of indices primaries stats. 2014-01-06 16:08:08 -06:00
Andrew Raines 125ac4f92f Add ByteSizeValue to SegmentStats. 2014-01-06 16:08:08 -06:00
Andrew Raines d657dc49b2 Add more info to table row size errors. 2014-01-06 16:08:08 -06:00
Shay Banon 106b747a08 *a*wait... 2014-01-06 21:45:35 +01:00
Shay Banon 6d1fe75799 simplify by directly getting the search TP info 2014-01-06 21:45:21 +01:00
Shay Banon 01c5be1da3 generic thread pool should always be cached
when using generic thread pool, some elements in the code base rely on the fact that it will always be able to fork it
2014-01-06 21:39:20 +01:00
Adrien Grand 4271d573d6 Page-based cache recycling.
Refactor cache recycling so that it only caches large arrays (pages) that can
later be used to build more complex data-structures such as hash tables.

 - QueueRecycler now takes a limit like other non-trivial recyclers.
 - New PageCacheRecycler (inspired by CacheRecycler) has the ability to cache
   byte[], int[], long[], double[] or Object[] arrays using a fixed amount of
   memory (either globally or per-thread depending on the Recycler impl, eg.
   queue is global while thread_local is per-thread).
 - Paged arrays in o.e.common.util can now optionally take a PageCacheRecycler
   to reuse existing pages.
 - All aggregators' data-structures now use PageCacheRecycler:
   - for all arrays (counts, mins, maxes, ...)
   - LongHash can now take a PageCacheRecycler
   - there is a new BytesRefHash (inspired by Lucene but quite different,
     still; for instance it cheats on BytesRef comparisons by using Unsafe)
     that also takes a PageCacheRecycler

Close #4557
2014-01-06 19:02:00 +01:00
Simon Willnauer 80de40f195 Use a tolerance to decide if a value is less than the threshold
Adding a small value to the threshold prevents weight deltas that are
very close to the threshold from triggering relocations. These
deltas can be rounding errors that lead to unnecessary relocations. In
practice this might only happen under very rare circumstances.
In general it's a good idea for the shard allocator to be a bit
more conservative in terms of rebalancing since relocation
costs are pretty high.

Closes #4630
2014-01-06 17:47:51 +01:00
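The tolerance check described in commit 80de40f195 can be sketched as follows. This is a minimal Python illustration, not the project's Java code; the function names and the `0.001` tolerance value are assumptions chosen for the example.

```python
# Illustrative tolerance; the actual value used by the allocator is not
# stated in the commit message and is an assumption here.
TOLERANCE = 0.001


def less_than(delta: float, threshold: float, tolerance: float = TOLERANCE) -> bool:
    """Treat delta as below the threshold unless it exceeds it by more than the tolerance."""
    return delta <= threshold + tolerance


def should_rebalance(delta: float, threshold: float) -> bool:
    """Relocate only when the weight delta clearly exceeds the threshold."""
    return not less_than(delta, threshold)
```

With this check, a delta that exceeds the threshold only by a rounding error does not trigger a relocation, while a clearly larger delta still does.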
Simon Willnauer f6cf5e2e6f s/StringBuffer/StringBuilder 2014-01-06 17:44:56 +01:00
Martijn van Groningen 32c5471d33 Rename `score` to `track_scores` in percolate api.
Closes #4624
2014-01-06 14:57:39 +01:00
Martijn van Groningen e3327f5271 Brought back the deprecated _aliases api. 2014-01-06 11:01:52 +01:00
Adrien Grand 9763d079b8 Eager norms loading options.
Norms can be eagerly loaded on a per-field basis by setting norms.loading to
`eager` instead of the default `lazy`:

```
"my_string_field" : {
  "type": "string",
  "norms": {
    "loading": "eager"
  }
}
```

In case this behavior should be applied to all fields, it is possible to change
the default value by setting `index.norms.loading` to `eager`.

Close #4079
2014-01-06 09:53:42 +01:00
Luca Cavanna 6e4d33bb4d Refactored create index api to make use of the new recently introduced generic ack mechanism
Closes #4421
2014-01-06 09:04:57 +01:00
Alexander Reelsen bb275166f1 Simplify nodes stats API
First, this breaks backwards compatibility!

* Removed /_cluster/nodes/stats endpoint
* Expect the stats types not as parameters, but as part of the URL
* Returning all indices stats by default, returning all nodes stats by default
* Supporting groups & types in nodes stats now as well
* Updated documentation & tests accordingly
* Allow level parameter for "shards" and "indices" (cluster does not make sense here)

Closes #4057
2014-01-06 08:33:32 +01:00
Alexander Reelsen 33878be1e8 Simplify indices stats API
Note: This breaks backward compatibility

* Removed clear/all parameters, now all stats are returned by default
* Made the metrics part of the URL
* Removed a lot of handlers
* Added shards/indices/cluster level parameter to change response serialization
* Returning translog statistics in IndicesStats
* Added TranslogStats class
* Added IndexShard.translogStats() method to get the stats from concrete implementation
* Updated documentation

Closes #4054
2014-01-06 07:27:03 +01:00
Martijn van Groningen 3024cc24a6 Added missing indices options to rest spec and rest actions. 2014-01-05 23:32:44 +01:00
Martijn van Groningen 61535bd1b4 Renamed old setIgnoreIndices to setIndicesOptions. 2014-01-05 23:11:34 +01:00
Simon Willnauer 312d2348f7 use trace logging to print RoutingNodes in test 2014-01-04 23:18:21 +01:00
Lee Hinman 47607a69a1 Default the circuit breaker limit to 80% of the maximum JVM heap 2014-01-03 16:21:55 -07:00
Lee Hinman 5463f7953f Expose `simple_query_string` flags in `flags` parameter 2014-01-03 16:14:19 -07:00
Simon Willnauer 602c63d2aa pass on node seed to the node level settings in TestCluster 2014-01-03 21:48:43 +01:00
Igor Motov 49d0ced16c Fix potential infinite loop in double wildcard processing
Fixes #4610
2014-01-03 12:44:44 -05:00
Simon Willnauer 8fba11dd74 Fix typo in JavaDoc -- s/note/not 2014-01-03 17:55:53 +01:00
Alexander Reelsen 811b7d7d78 Do not start packages on installation
The reason to not start packages on installation is to allow to configure
them before starting up (setting heap, cluster.name etc)

Also the documentation was updated in order to show, which statements need
to be executed.
In addition, these statements are also printed out when the package is
installed, depending on whether chkconfig, system or update-rc.d is used.

Closes #3722
2014-01-03 17:40:27 +01:00
Martijn van Groningen f1bf585089 The `fields` option should always return an array for json document fields and single valued field for metadata fields.
Also the `fields` option can only be used to fetch leaf fields; trying to fetch object fields will result in a client error.

Closes #4542
2014-01-03 17:29:12 +01:00
Andrew Raines fdfc7d7460 Add cache stats to cat/nodes.
Closes #4543.
2014-01-03 10:22:00 -06:00
David Pilato 0c7b494bb8 plugin manager: new `timeout` option
When testing plugin manager with real downloads, the test could run forever. Fortunately, the test suite will be interrupted after 20 minutes, but it could be useful not to fail the whole test suite but only warn in that case.

By default, plugin manager still waits indefinitely, but this can be changed using the new `--timeout` option:

```sh
bin/plugin --install elasticsearch/kibana --timeout 30s

bin/plugin --install elasticsearch/kibana --timeout 1h
```

Closes #4603.
Closes #4600.
2014-01-03 16:48:18 +01:00
Martijn van Groningen 2cb5cfecec Fixed issue where the parentTypes set would be updated when a new parent type is being added or removed during a refresh, which would have led to concurrency issues. 2014-01-03 16:37:29 +01:00
Simon Willnauer 911ef6a058 Pass correct number of expected shards to assertion 2014-01-03 15:51:33 +01:00
Simon Willnauer 5b5b2e6c85 Add NodeVersionAllocationDecider that prevents allocations that require forward compatibility.
Today during restart scenarios it is possible that we recover from a node that
has already been upgraded to version N+1. The node that we relocate to is
on version N and might not be able to read the index format from the node
we relocate from. This causes `IndexFormatTooNewException` during
recovery but only after recovery has finished which can cause large
load spikes during the upgrade period.

Closes #4588
2014-01-03 15:51:33 +01:00
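The rule described in commit 5b5b2e6c85 can be sketched in Python (the project itself is Java). The function name and the tuple representation of versions are hypothetical, and the sketch assumes the decider simply compares the version of the node holding the data with the version of the target node.

```python
from typing import Tuple

# Hypothetical (major, minor, patch) representation of a node version.
Version = Tuple[int, int, int]


def can_allocate(source_node: Version, target_node: Version) -> bool:
    """Allow recovery only when the target is at least as new as the source.

    A target on an older version might not be able to read the index format
    written by the newer source, so such allocations are refused up front
    instead of failing after the recovery has already copied the data.
    """
    return target_node >= source_node
```

Refusing the allocation before recovery starts avoids the load spike of copying a shard that the older node cannot open.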
Martijn van Groningen d5c440cd2e Fixed SimpleIdCache#clear() to not invoke onRemoval twice, which can happen in rare cases. 2014-01-03 15:24:27 +01:00
Martijn van Groningen 48c63c137a IdCache shouldn't implement Iterable 2014-01-03 15:24:27 +01:00
Martijn van Groningen 38f038f899 Already loaded SimpleIdReaderCache should be reloaded when a new `_parent` has been introduced.
Closes #4595
Relates #4568
2014-01-03 15:24:27 +01:00
Simon Willnauer fbae6e940b Fix transient settings assertions in ElasticsearchIntegrationTest
The test did not fail when transient settings were modified, since we
compared against persistent settings.
2014-01-03 15:06:03 +01:00
sam 87947cb006 Raise visibility of `#types()` to public in request classes
`CountRequest` and `ValidateQueryRequest` have package
private accessors for `#types()` which is inconsistent with
other getters.
2014-01-03 14:15:15 +01:00
Simon Willnauer 65c4282bb9 Check if node is still present when collecting attribute shard routings
The node we need to look up for attribute collection might not be part
of the `DiscoveryNodes` anymore due to node failure or shutdown. This
commit adds a check and removes the shard from the iteration.

Closes #4589
2014-01-03 14:03:24 +01:00
Britta Weber 9f54e9782d rename _shard -> _index and also rename classes and variables
closes #4584
2014-01-03 14:00:23 +01:00
Florian Schilling 611dd0a396 Setup an accurate version of Haversine closes #4596 2014-01-03 17:41:36 +09:00
Shay Banon 2a73cf4f82 support aliases for columns in cat API
use it as an example in nodes for now for some columns, though we need to go over all the columns and properly name them and alias them
2014-01-03 00:41:26 +01:00
Lee Hinman a754224751 Add field data memory circuit breaker.
This adds the field data circuit breaker, which is used to estimate
the amount of memory required to load field data before loading it. It
then raises a CircuitBreakingException if the limit is exceeded.

It is configured with two parameters:

`indices.fielddata.cache.breaker.limit` - the maximum number of bytes
of field data to be loaded before circuit breaking. Defaults to
`indices.fielddata.cache.size` if set, unbounded otherwise.

`indices.fielddata.cache.breaker.overhead` - a constant for all field
data estimations to be multiplied with before aggregation. Defaults to
1.03.

Both settings can be configured dynamically using the cluster update
settings API.
2014-01-02 15:04:47 -07:00
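The behaviour described in commit a754224751 (estimate, multiply by the overhead, trip when the limit would be exceeded) can be sketched as below. This is a simplified Python illustration; the class and method names are hypothetical and do not mirror the Java implementation. Only the two settings and the 1.03 default come from the commit message.

```python
from typing import Optional


class CircuitBreakingError(Exception):
    """Raised when loading more field data would exceed the limit."""


class FieldDataBreaker:
    def __init__(self, limit_bytes: Optional[int], overhead: float = 1.03):
        self.limit_bytes = limit_bytes  # None models the "unbounded" default
        self.overhead = overhead        # multiplier applied to every estimate
        self.used_bytes = 0

    def add_estimate(self, estimated_bytes: int) -> int:
        """Account for an estimate before the field data is actually loaded."""
        adjusted = int(estimated_bytes * self.overhead)
        new_used = self.used_bytes + adjusted
        if self.limit_bytes is not None and new_used > self.limit_bytes:
            # Trip before loading, so the memory is never allocated.
            raise CircuitBreakingError(
                f"would use {new_used} bytes, limit is {self.limit_bytes}"
            )
        self.used_bytes = new_used
        return new_used
```

Because the check runs on the estimate, a request that trips the breaker leaves the accounted total unchanged.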
Simon Willnauer edb3e5f0f4 s/similariry/similarity in AllFieldMapper 2014-01-02 17:53:43 +01:00
Simon Willnauer beaa9153a6 Simulate the entire toXContent instead of special casing
Today we try to detect if we need to generate the mapping or not in
the all mapper. This is error prone since it misses conditions if not
explicitly added. We should rather simulate the generation instead.

This commit also adds a random test to check if the settings
of the all field mapper are correctly applied.

Closes #4579
Closes #4581
2014-01-02 17:15:51 +01:00
Simon Willnauer 79f676e45e Term Vector settings should be treated like flags without propagation
Today if a specific feature is disabled for term vectors with something
like 'store_term_vector_positions = false' term vectors might be disabled
altogether even if 'store_term_vectors=true' in the mapping. This depends on the
order of the values in the mapping since the more specific one might override
the less specific one.

Closes #4582
2014-01-02 17:15:51 +01:00
Shay Banon c12427d047 remove double check for null in value source 2014-01-02 17:03:24 +01:00
Martijn van Groningen aa548f5148 Remove GET `_aliases` api in favour for GET `_alias` api
Currently there are two get aliases apis that both have the same functionality, but have a different response structure. The reason for having 2 apis is historic.

The GET _alias api was added in 0.90.x and is more efficient since it only sends the needed alias data from the cluster state between the master node and the node that received the request. In the GET _aliases api the complete cluster state is sent to the node that received the request and then the right information is filtered out and sent back to the client.

The GET _aliases api should be removed in favour for the alias api

Closes to #4539
2014-01-02 13:56:11 +01:00
Alexander Reelsen 8d4be46e59 Made parsing of ByteSizeValue case independent
This allows to parse '12GB' as well as '12gb'

Closes #4442
2014-01-02 13:00:41 +01:00
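Case-independent parsing as described in commit 8d4be46e59, together with the tera and peta units added in 7ae4d101ab, can be sketched in Python. The function name, the unit table and the use of binary (1024-based) multiples are assumptions for illustration; the sketch does not reproduce every form the Java `ByteSizeValue` parser accepts.

```python
import re

# Assumed unit table with binary multiples; keys are lower case so that
# lookups after lower-casing the input are case independent.
_UNITS = {
    "b": 1,
    "k": 1024, "kb": 1024,
    "m": 1024 ** 2, "mb": 1024 ** 2,
    "g": 1024 ** 3, "gb": 1024 ** 3,
    "t": 1024 ** 4, "tb": 1024 ** 4,
    "p": 1024 ** 5, "pb": 1024 ** 5,
}


def parse_byte_size(text: str) -> int:
    """Parse strings such as '12GB' or '12gb' into a number of bytes."""
    match = re.fullmatch(r"\s*(\d+(?:\.\d+)?)\s*([a-zA-Z]*)\s*", text)
    if match is None:
        raise ValueError(f"cannot parse byte size: {text!r}")
    number, unit = match.groups()
    unit = unit.lower() or "b"  # a bare number is taken as bytes
    if unit not in _UNITS:
        raise ValueError(f"unknown unit in byte size: {text!r}")
    return int(float(number) * _UNITS[unit])
```

Lower-casing the unit before the lookup is what makes `12GB` and `12gb` equivalent.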
Martijn van Groningen f4bf0d5112 Replaced `ignore_indices` with `ignore_unavailable`, `expand_wildcards` and `allow_no_indices`.
* `ignore_unavailable` - Controls whether to ignore any specified indices that are unavailable; this includes indices that don't exist and closed indices. Either `true` or `false` can be specified.
* `allow_no_indices` - Controls whether to fail if a wildcard index expression resolves to no concrete indices. Either `true` or `false` can be specified. For example, if the wildcard expression `foo*` is specified and no indices are available that start with `foo`, then depending on this setting the request will fail. This setting is also applicable when `_all`, `*` or no index has been specified.
* `expand_wildcards` - Controls which kind of concrete indices wildcard index expressions expand to. If `open` is specified, the wildcard expression is expanded to only open indices, and if `closed` is specified, the wildcard expression is expanded only to closed indices. Both values (`open,closed`) can also be specified to expand to all indices.

Closes to #4436
2014-01-02 12:19:45 +01:00
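How the three options from commit f4bf0d5112 interact can be sketched in Python. The function, the exception name, the dictionary model of index states and the default values are all hypothetical; the real defaults differ per API and are not stated in the commit message.

```python
from fnmatch import fnmatchcase
from typing import Dict, List, Optional, Sequence


class IndexMissingError(Exception):
    """Raised when an index expression cannot be resolved."""


def resolve_indices(
    expressions: Optional[Sequence[str]],
    indices: Dict[str, str],  # index name -> "open" or "closed"
    ignore_unavailable: bool = False,
    allow_no_indices: bool = True,
    expand_wildcards: Sequence[str] = ("open",),
) -> List[str]:
    resolved: List[str] = []
    for expr in expressions or ["_all"]:  # no index specified means all
        if expr == "_all":
            expr = "*"
        if "*" in expr:
            # Wildcards only expand to indices in the requested states.
            matches = sorted(
                name for name, state in indices.items()
                if fnmatchcase(name, expr) and state in expand_wildcards
            )
            if not matches and not allow_no_indices:
                raise IndexMissingError(f"no indices match {expr!r}")
            resolved.extend(matches)
        elif indices.get(expr) == "open":
            resolved.append(expr)
        elif not ignore_unavailable:
            # Missing and closed indices both count as unavailable.
            raise IndexMissingError(f"index {expr!r} is unavailable")
    return resolved
```

In this model `ignore_unavailable` applies to explicitly named indices, while `allow_no_indices` and `expand_wildcards` apply to wildcard expressions.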
Alexander Reelsen 040719f337 Allow GetAliasRequest to retrieve all aliases
Results in less data being sent over the wire, as the Cat API does not
need to have the whole cluster state.

Also added matchers for hasKey() for immutable open map (I think we should
add more of those to have map style assertions).

Closes #4455
2014-01-02 12:06:29 +01:00
Britta Weber 1ede9a5730 make term statistics accessible in scripts
term statistics can be accessed via the _shard variable.

Below is a minimal example. See the documentation for details.

```

DELETE paytest

PUT paytest
{
    "mappings": {
        "test": {
            "_all": {
                "auto_boost": true,
                "enabled": true
            },
            "properties": {
                "text": {
                    "index_analyzer": "fulltext_analyzer",
                    "store": "yes",
                    "type": "string"
                }
            }
        }
    },
    "settings": {
        "analysis": {
            "analyzer": {
                "fulltext_analyzer": {
                    "filter": [
                        "my_delimited_payload_filter"
                    ],
                    "tokenizer": "whitespace",
                    "type": "custom"
                }
            },
            "filter": {
                "my_delimited_payload_filter": {
                    "delimiter": "+",
                    "encoding": "float",
                    "type": "delimited_payload_filter"
                }
            }
        },
        "index": {
            "number_of_replicas": 0,
            "number_of_shards": 1
        }
    }
}

POST paytest/test/1
{
    "text": "the+1 quick+2 brown+3 fox+4 is quick+10"
}

POST paytest/test/2
{
    "text": "the+1 quick+2 red+3 fox+4"
}

POST paytest/_refresh

POST paytest/_search
{
    "script_fields": {
       "ttf": {
          "script": "_shard[\"text\"][\"quick\"].ttf()"
       }
    }
}

POST paytest/_search
{
    "script_fields": {
       "freq": {
          "script": "_shard[\"text\"][\"quick\"].freq()"
       }
    }
}
POST paytest/test/2/_termvector
POST paytest/_search
{
    "script_fields": {
       "payloads": {
          "script": "term = _shard[\"text\"].get(\"red\",_PAYLOADS);payloads = []; for(pos : term){payloads.add(pos.payloadAsFloat(-1));} return payloads;"
       }
    }
}

POST paytest/_search
{
   "script_fields": {
      "tv": {
         "script": "_shard[\"text\"][\"quick\"].freq()"
      }
   },
   "query": {
      "function_score": {
         "functions": [
            {
               "script_score": {
                  "script": "_shard[\"text\"][\"quick\"].freq()"
               }
            }
         ]
      }
   }
}

```

closes #3772
2014-01-02 11:17:33 +01:00
Britta Weber df9b8ae02e do not call score() twice 2014-01-02 11:16:55 +01:00
Martijn van Groningen a7bb28c0e7 Made single shards APIs fail if routing is configured to be required in the mapping.
This change makes single shard requests fail when no routing is specified and routing has been configured to be required in the mapping. Thi

 Closes #4506
2014-01-02 10:47:53 +01:00
Simon Willnauer c78f517d36 Allow 'omit_norms' on the '_all' field
The '_all' field doesn't allow to omit norms. In certain scenarios
omitting the norm values makes a lot of sense to get sensible scoring.

Closes #3734
2014-01-02 10:27:53 +01:00
Martijn van Groningen bb01995722 Made APIs consistently accept a query in the request body's `query` field.
The following APIs now accept the query in a top level `query` field like:
* delete_by_query
* validate_query
* count

These APIs used to accept the query directly in the request body which was inconsistent with the search and explain APIs. For this reason t

Closes #4074
2014-01-02 10:06:01 +01:00
Alexander Reelsen dee325de79 Packaging: Increasing default for max mapped pages to 262144 2014-01-02 09:10:46 +01:00
Simon Willnauer e7a84d744a Add ability to run certain packages with assertions disabled
Test can be run with `-Dtests.assertion.disabled=org.elasticsearch`
to run the tests without assertions to make sure assertions
don't hide any assignments etc. that introduce bugs in production.
2013-12-30 19:36:02 +01:00
Shay Banon e6e1a3463a more cleanup of cat API, fix index lookup failure count/health 2013-12-30 16:12:15 +01:00
David Pilato b29f89f7f9 We run PluginManagerTests using only node client.
We also add some debug logs and fix `tests.network` (setting it to true was not working from jenkins)
2013-12-30 15:40:52 +01:00
Shay Banon 05c5804341 Expose filtered nodes on TransportClient
Expose the list of nodes that were filtered out with the TransportClient, for example, due to different cluster name. Relates to #4569
closes #4571
2013-12-30 15:27:50 +01:00
Shay Banon 95abbe2057 mark abstract class as abstract 2013-12-30 14:40:01 +01:00
Shay Banon debfb0e996 move helper class for allocation tests to base class 2013-12-30 14:23:34 +01:00
Shay Banon e67cad3127 Add build hash to nodes info API
also, add it to the cat nodes api
2013-12-30 13:59:56 +01:00
Adrien Grand 96cca039e9 Honor `includeDefaults` in GeoPointFieldMapper.
Close #4563
2013-12-30 13:46:19 +01:00
Adrien Grand 1654ae8937 Explicit doc_values setting.
Once doc values are enabled on a field, they can't be disabled.

Close #4560
2013-12-30 11:10:52 +01:00
Simon Willnauer 11c4218566 Start Test nodes sometimes without mock modules
We are mocking out some functionality to add assertions etc. or
randomize store types. We should randomly run with our defaults to make
sure we don't hide any potential problems.
2013-12-29 00:50:10 +01:00
Simon Willnauer a1e4258b21 Add @Slow annotation to bad apples 2013-12-29 00:03:14 +01:00
Simon Willnauer 3113203e9e Add test that throws exceptions during search execution
Currently we only test if readers are correctly released when exceptions
occur during reopen or flush. This commit adds a test that
randomly throws exceptions during the search execution ie. when Terms
are pulled or if a docs enum is created.
2013-12-28 23:58:02 +01:00
Luca Cavanna 08a077ffae re-enabled FileUtilsTests and REST tests as rest-api-spec has been added back
fixed rest-api-spec paths in TESTING docs

Relates to #4540 & #4376
2013-12-27 20:43:16 +01:00
Luca Cavanna 63cbc84393 removed rest-spec submodule and prepared project for same files added directly to the codebase (no submodule) within rest-api-spec
(temporarily disabled FileUtilsTests & REST tests as there's temporarily no rest-spec dir)

Relates to #4540 #4376
2013-12-27 20:36:12 +01:00
Adrien Grand 51bec4ec6c Add SLOPPY_ARC to GeoDistanceSearchBenchmark. 2013-12-27 15:48:24 +01:00
Simon Willnauer 1b35ae11bc Fix SuggestSearchTests to expect any order in the error message 2013-12-27 14:07:04 +01:00
Adrien Grand 55a5c26de8 Fix NPE in RangeAggregator 2013-12-27 12:48:55 +01:00
Adrien Grand 05448b6276 Doc values for geo points.
This commit adds doc values support to geo points using the exact same approach
as for numeric data: geo points for a given document are stored uncompressed
and sequentially in a single binary doc values field.

Close #4207
2013-12-27 12:45:18 +01:00
Adrien Grand 9eb7441543 Make RangeAggregator a MULTI_BUCKETS aggregator.
Until now, RangeAggregator was a PER_BUCKET aggregator, expecting to be always
collected with owningBucketOrdinal == 0. However, since the number of buckets
it creates is known in advance, it can be changed to a MULTI_BUCKETS aggregator
by just multiplying the bucket ordinal by the number of ranges.

This makes aggregations that have ranges as sub aggregations of PER_BUCKET
aggregators more efficient.

Close #4550
2013-12-27 12:43:25 +01:00
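The ordinal arithmetic described in commit 9eb7441543 can be sketched in Python. The function names and the dictionary used for counts are hypothetical. The only part taken from the commit message is the idea of multiplying the owning bucket ordinal by the number of ranges.

```python
from typing import Dict, List, Tuple


def sub_bucket_ordinal(owning_bucket_ordinal: int, range_index: int, num_ranges: int) -> int:
    """Map (parent bucket, range) to one flat ordinal.

    Each parent bucket owns a contiguous block of num_ranges ordinals, so a
    single aggregator instance can serve every parent bucket.
    """
    return owning_bucket_ordinal * num_ranges + range_index


def collect(value: float, owning_bucket_ordinal: int,
            ranges: List[Tuple[float, float]], counts: Dict[int, int]) -> None:
    """Count value in every [lo, hi) range it falls into, under its parent bucket."""
    for i, (lo, hi) in enumerate(ranges):
        if lo <= value < hi:
            ordinal = sub_bucket_ordinal(owning_bucket_ordinal, i, len(ranges))
            counts[ordinal] = counts.get(ordinal, 0) + 1
```

Since the number of ranges is fixed up front, the flat ordinal is unique for each pair of parent bucket and range.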
Simon Willnauer 1c2cb99751 Use RandomPicks to select a random array element 2013-12-27 12:35:57 +01:00
Simon Willnauer 11ceaccc20 Randomize node level setting per node not per cluster 2013-12-27 12:21:41 +01:00
Simon Willnauer f52a080eec Randomize CacheRecycler instance in TestCluster 2013-12-27 12:21:25 +01:00
Florian Schilling bc452dff84 * setup accurate GeoDistance Function
* adapt tests
* introduced default GeoDistance function
* Updated docs

closes #4498
2013-12-27 19:15:19 +09:00
Shay Banon 5821f90b2c cleanup cat nodes 2013-12-26 17:22:30 +01:00
Adrien Grand d0143703a1 Fix Aggregator.buildAggregation on MULTI_BUCKETS aggregators. 2013-12-26 11:38:14 +01:00
Adrien Grand f3c1a885fb Fix QueueRecycler.
Double-release protection added in 1c758b0b made QueueRecycler throw NPEs when
trying to recycle existing instances.
2013-12-26 10:49:58 +01:00
Adrien Grand a04d18d2d2 Use BINARY doc values instead of SORTED_SET doc values to store numeric data.
Although SORTED_SET doc values make things like terms aggregations very fast
thanks to the use of ordinals, ordinals are usually not that useful on numeric
data. We are more interested in the values themselves in order to be able to
compute sums, averages, etc. on these values. However, SORTED_SET is quite slow
at accessing values, so BINARY doc values are better suited to storing numeric
data.

floats and doubles are encoded without compression with little-endian byte order
(so that it may be optimizable through sun.misc.Unsafe in the future given that
most computers nowadays use the little-endian byte order) and byte, short, int,
and long are encoded using vLong encoding: they first encode the minimum value
using zig-zag encoding (so that negative values become positive) and then deltas
between successive values.

Close #3993
2013-12-26 09:58:00 +01:00
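The integer encoding described in commit a04d18d2d2 (zig-zag for the minimum value, then deltas, each written as a variable-length long) can be sketched in Python. The function names are hypothetical, the sketch assumes the values of a document are sorted before encoding, and the byte layout of the Java implementation may differ in detail.

```python
from typing import Iterable


def zigzag(n: int) -> int:
    """Map a signed 64-bit value to an unsigned one (0, -1, 1, -2 -> 0, 1, 2, 3)."""
    return (n << 1) ^ (n >> 63)


def write_vlong(n: int, out: bytearray) -> None:
    """Append a non-negative integer using 7 bits per byte, low-order group first."""
    while n > 0x7F:
        out.append((n & 0x7F) | 0x80)  # high bit set: more bytes follow
        n >>= 7
    out.append(n)


def encode_longs(values: Iterable[int]) -> bytes:
    """Encode the minimum with zig-zag, then non-negative deltas between successive values."""
    ordered = sorted(values)
    out = bytearray()
    if not ordered:
        return bytes(out)
    write_vlong(zigzag(ordered[0]), out)
    for prev, cur in zip(ordered, ordered[1:]):
        write_vlong(cur - prev, out)  # deltas of sorted values are never negative
    return bytes(out)
```

Only the first value needs zig-zag encoding, because it is the only one that can be negative once the values are sorted.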
Boaz Leskes 6fbcd8f8ff Start the task timeout checking *after* adding it to the execution queue.
This prevents missing very short timeouts which fire before the calling thread had the chance to add the task to the queue and are therefore ignored. This is mostly of importance for testing where we explicitly want tasks to timeout and set it to a very low value.
2013-12-24 20:23:31 +01:00
Andrew Raines c6199b611e Need to make sure we always end up with a Cell, even if it's null.
Fixes #4544.
2013-12-24 11:01:01 -06:00
Shay Banon 1c758b0bb0 clear the list of releases once released
also double check that once a recycler is released, it can't be released again or used
2013-12-24 10:06:35 +01:00
Andrew Raines df39016e58 Add column separator for help output. 2013-12-23 22:26:28 -06:00
Andrew Raines 498e0b418a Round load avg in _cat/nodes. 2013-12-23 22:26:28 -06:00
Andrew Raines dd75020709 Turn off some columns by default in cat/nodes. 2013-12-23 22:26:28 -06:00
Andrew Raines 08ddfcd731 Add associative lookup of columns for arbitrary (and more intuitive) ordering.
% curl 'localhost:9200/_cat/nodes?v&headers=jdk,ip,name'
  jdk      ip        name
  1.7.0_40 127.0.0.1 Mordo, Karl
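
The idea of looking columns up by name, so that the caller decides the order, might look like the sketch below. It is written in Python for brevity and is not the actual cat API code; `select_columns` and its arguments are hypothetical names.

```python
def select_columns(header, rows, requested):
    """Return the rows restricted to the requested columns, in the requested order."""
    # Associative lookup: column name -> position in the full table.
    index = {name: position for position, name in enumerate(header)}
    picked = [index[name] for name in requested]  # raises KeyError for unknown names
    return [[row[position] for position in picked] for row in rows]
```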

Closes #4433.
2013-12-23 22:26:27 -06:00
Andrew Raines c2b41f8ad9 Add table concatenation. 2013-12-23 22:26:27 -06:00
uboness 7b13f1932e - currently we make a few assumptions in the field data based aggregations, based on which we decide on execution paths, e.g. initial buffer sizes we use for ordinal arrays.
- also, in geo distance, because it's based on range agg and by default the order of the geo points per doc is unknown, we always wrap it in a dedicated field data source which sorts the values if needed. But most of the time, a doc will be associated with a single point and therefore most of this wrapping is redundant and adds perf. cost for nothing.

- the idea here is for every request that "hits" a field data agg, we'll first iterate over the searchable segments and load their field data and compute the cross-segment info out of them. This info will be placed in the field context with which the value sources are created.

- we currently have some of this info on the IndexFieldData, but the problem with getting it from there is that we may easily end up getting wrong info that originates in unsearchable segments.
2013-12-24 00:40:20 +01:00
Martijn van Groningen 682e9548c1 Release SearchContext releaseables also post match filters / queries. 2013-12-23 18:35:54 +01:00
Martijn van Groningen 27b53b8edf Fixed named filter and query support for the top_children, has_child and has_parent queries and filters.
Closes #4534
2013-12-23 13:53:20 +01:00
Martijn van Groningen 9c67be5181 Release parentDocs in TopChildrenQuery. 2013-12-23 13:12:50 +01:00
Simon Willnauer 27e89c2427 Added v0.90.9 2013-12-23 12:13:09 +01:00
Martijn van Groningen a3d6216f40 Make doc lookup in geo_shape filter and query consistent with terms lookup.
The `geo_shape filter and query` option in geo_shape filter and query has been replaced with the `path` option, which allows the filter and query to fetch shapes from within objects as well.

Closes #4486
2013-12-23 11:20:43 +01:00
Simon Willnauer 2d77e2a37e Disable SegmentReader ram usage by default even if -ea is provided 2013-12-23 11:01:11 +01:00
Alexander Reelsen e4244268fa Fix loading templates in config/ directory
The fixes introduced in #4235 and #4411 do not take into account that a
template JSON in the config/ directory includes a template name, as opposed
to when calling the Put Template API.

This PR allows putting both formats (either specifying a template name or not)
into files. However, your template name/id may not be one of the template
element names like "template", "settings", "order" or "mapping".

Closes #4511
2013-12-22 21:37:50 +01:00
Boaz Leskes 5698f9d794 Added asserts to test validation failures presence in ClusterHealthResponse response 2013-12-22 10:21:07 +01:00
Igor Motov d92f573404 Optimize restore source JSON serialization
Don't print "restore_source":  null if restore source is null, omit entire line instead
2013-12-21 22:12:12 -05:00
Shay Banon 9b9ad1a603 fix forbidden API on lower case... 2013-12-22 00:14:06 +01:00
Shay Banon e5b19087cb fix lower case for windows tests 2013-12-22 00:08:14 +01:00
Shay Banon f47f224d33 make sure all integration tests use ElasticsearchIntegrationTest
- move ClusterSettingsTests to ElasticsearchIntegrationTest
- remove InternalNodeTests, we already have separate plugin tests that verify it
2013-12-21 23:26:32 +01:00
Shay Banon 30a0fc30d5 add randomized multi data path nodes tests 2013-12-21 23:01:22 +01:00
Boaz Leskes be27ed3a25 Added unit tests for ClusterIndexHealth and ClusterHealthResponse
Relates to #4528
2013-12-20 22:00:24 +01:00
Alexander Reelsen 0bef2c66a9 Reverting back to 0.90.7 config/templates loading behaviour
Closes #4511
2013-12-20 14:58:46 +01:00
Boaz Leskes bbffeb1b39 Counting shards was wrong if one of the indices was in the RED status
Closes #4528
2013-12-20 13:17:18 +01:00
Adrien Grand a7cfae4e7a Stricter parsing of aggregations.
- Only one aggregation type is allowed per aggregation definition.
- Return an error in the parsers when an unknown field is encountered.

Close #4464
2013-12-20 09:51:25 +01:00
Shay Banon 5bf4e74647 Failed search on a shard tries a local replica on a network thread
When a search on a shard on a remote node fails, and a replica exists on the local node, the execution of the search is done on the network thread. This is problematic since we need to execute it on the actual search thread pool, but it can also explain #4519, where the get happens on the network thread and it waits to send the get request till the network thread we use is freed (deadlock...)
fixes #4526

note, re-enable the geo shape fetch test, this fix should solve it as well
2013-12-19 22:19:20 +01:00
Shay Banon 0c1c2dc671 Allow to enable / disable bloom filter loading on an index
Allow to have a new index level setting index.codec.bloom.load (defaults to true), that can control if the bloom filters will be loaded or not. This is an updateable setting, that can be updated on a live index using the update settings API.

Note though, when this setting is updated, a fresh Lucene index will be reopened, potentially causing associated caches to be dropped.

closes #4525

Note, this change also disables returning the Lucene RAM usage stats, due to a bug in Lucene, relates to #4512
2013-12-19 21:32:14 +01:00
Simon Willnauer 80ed3d05bc Use a List of shards per shard ID rather than a set.
The shards in the set are mutated after they are added to the
set such that the hashcode doesn't fit anymore. For this reason
this used an identity hashset before but the downside of this is
that the iteration order is not deterministic. We can just use a list
since shard removal is a very rare action and the size of the list is
very small such that iteration is fast.
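
The hazard described above (mutating an element after it was added to a hash-based set) is not specific to Java. The Python sketch below reproduces it with a hypothetical `Shard` class whose hash depends on a mutable field; integer state codes are an assumption made here to keep the hashes deterministic.

```python
INITIALIZING, STARTED = 0, 1  # hypothetical state codes

class Shard:
    """A shard whose equality and hash depend on mutable state."""

    def __init__(self, shard_id, state):
        self.shard_id = shard_id
        self.state = state

    def __eq__(self, other):
        return (self.shard_id, self.state) == (other.shard_id, other.state)

    def __hash__(self):
        return hash((self.shard_id, self.state))

shard = Shard(0, INITIALIZING)
as_set = {shard}
as_list = [shard]

# Mutating the shard changes its hash, so the set looks in the wrong bucket.
shard.state = STARTED

found_in_set = shard in as_set    # lookup uses the new hash and misses the entry
found_in_list = shard in as_list  # a list only compares elements, so it still works
```

A list avoids the stale-hash problem and iterates in insertion order, at the cost of linear-time removal, which matches the trade-off the commit message describes.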
2013-12-19 19:55:29 +01:00
Shay Banon a92907c47e fix bloom filter posting format to get the fpp from the correct settings 2013-12-19 14:35:38 +01:00
Shay Banon 46d191c8d4 try and extract store directory also if its wrapped in a compound dir 2013-12-19 14:21:27 +01:00
Igor Motov 8c1073bb6e Update snapshot list when snapshot is deleted 2013-12-18 20:34:05 -05:00
Igor Motov aafd4ddfbd Add ability to specify base directory on the repository level
This change is needed to support multiple repositories per S3 bucket
2013-12-18 20:34:05 -05:00
Shay Banon 95ca06cf09 Add the memory used on segment/segments stats
The memory used for the Lucene index (term dict, bloom filter, ...) can now be reported per segment using the segments API, and on the segments flag on node/indices stats
closes #4512
2013-12-18 22:21:53 +01:00
Shay Banon 0a016716ed fix computation of ram bytes used in bloom filter posting format 2013-12-18 22:01:59 +01:00
Simon Willnauer 7969a719f7 s/he/it 2013-12-18 21:46:43 +01:00
Martijn van Groningen e7e1667a26 Make parsing strict for `geo_shape` query & filter and stricter for `common` query.
Closes #4508
2013-12-18 17:56:38 +01:00
Shay Banon bb4d3f55c0 Fix compilation on Java 8 + tests that rely on ordering
Note, we still have tests failing because of mvel compilation bugs, see more here: http://jira.codehaus.org/browse/MVEL-299
closes #4510
2013-12-18 17:52:19 +01:00
Simon Willnauer 9d8ab56c9b Add [0.90.8] release 2013-12-18 17:30:28 +01:00
Simon Willnauer d8dee92f98 Make BalancedAllocationDecider assignments deterministic
A previous change introduced an identity hashset that has non-deterministic
iteration order, which kills the reproducibility of our unit tests if they fail.
This patch adds back deterministic allocations.
2013-12-18 16:10:28 +01:00
Boaz Leskes 17e7d01753 Move XContent Rendering and Cluster Health Status calculations to ClusterHealthResponse 2013-12-18 15:24:01 +01:00
Shay Banon d5192ecd31 use the computed data structure to optimize the awareness allocation decider 2013-12-18 14:29:41 +01:00
Shay Banon 5827170d42 use the computed data structure to optimize the same shard allocation decider 2013-12-18 14:08:59 +01:00
Shay Banon f5d217c08e On node join, evict existing node(s) with the same transport address
Make sure to evict an existing node with the same transport address as a new node that joins. This can happen for example when there is a bug in a cluster state event handler, which causes the "old" node to not be evicted, or a load on the master node that will take time for the "old" node leaving to be processed.
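
A minimal sketch of the eviction rule, assuming the cluster's node list can be modeled as a mapping from node id to transport address. `join_node` is a hypothetical helper, not an Elasticsearch API.

```python
def join_node(nodes, node_id, address):
    """Add a node, first evicting any other node that claims the same transport address.

    `nodes` maps node id -> transport address and is modified in place.
    Returns the list of evicted node ids.
    """
    evicted = [other for other, addr in nodes.items()
               if addr == address and other != node_id]
    for other in evicted:
        del nodes[other]
    nodes[node_id] = address
    return evicted
```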
closes #4503
2013-12-18 12:21:59 +01:00
Boaz Leskes 3c5106ae98 Added cluster health status to the Cluster Stats API
Relates to #4460
2013-12-18 12:03:49 +01:00
Simon Willnauer 314499cee0 Use existing datastructures from RoutingNodes to elect unassigned primaries
Currently we try to find a replica for a primary that is allocated by
running through all shards in the cluster while RoutingNodes already has
a datastructure keyed by shard ID for this. We should lookup this
directly rather than using linear probing. This improves shard allocation performance
by 5x.
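
The difference between scanning every shard and using a structure keyed by shard ID can be sketched as below. This is a Python illustration with hypothetical dictionaries standing in for shard routing entries; it does not mirror the RoutingNodes API.

```python
from collections import defaultdict

def find_replica_linear(all_shards, shard_id):
    """Scan every shard in the cluster: cost grows with the total shard count."""
    for shard in all_shards:
        if shard["id"] == shard_id and not shard["primary"] and shard["active"]:
            return shard
    return None

def index_by_shard_id(all_shards):
    """Build the lookup structure once: shard id -> all copies of that shard."""
    by_id = defaultdict(list)
    for shard in all_shards:
        by_id[shard["id"]].append(shard)
    return by_id

def find_replica_indexed(by_id, shard_id):
    """Only inspect the copies of the requested shard."""
    for shard in by_id.get(shard_id, []):
        if not shard["primary"] and shard["active"]:
            return shard
    return None
```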
2013-12-18 11:48:21 +01:00
Shay Banon f0356b2126 Don't delete local shard data when it's allocated on a node that doesn't exist
This is an extreme case, exposed by a bug we had in our allocation in local gateway, causing a cluster state that doesn't include a node in the nodes list, but still has the shard in the routing table pointing at the non-existent node. Then, when a node on the same box comes back, it will cause the local shard data to be deleted because it thinks it's fully allocated on other nodes.
fixes #4502
2013-12-18 11:37:00 +01:00
Boaz Leskes b865047125 Removed exception handling in InternalIndexShard.docStats() & storeStats()
This is already caught at another level, see #4203
2013-12-18 11:10:35 +01:00
Luca Cavanna fffa6a21dc Fixed FileUtilsTests, used wrong path separator (worked only on *nix) 2013-12-18 11:05:30 +01:00
Boaz Leskes 5475ccf738 Move XContent rendering of ClusterIndexHealth to the class itself 2013-12-18 11:00:52 +01:00
Alexander Reelsen 8dce82d64d Removed exception handling in InternalIndexShard.completionStats()
This is already caught at another level, see #4203
2013-12-18 10:52:23 +01:00
Martijn van Groningen 40ec7116d8 Removed unnecessary get call 2013-12-17 22:45:05 +01:00
Boaz Leskes ae09f85c9e removed a left over debug log 2013-12-17 16:29:02 +01:00
Boaz Leskes 33bb2ecfa8 Added a time stamp to the cluster stats response
Making it consistent with NodeStats
2013-12-17 16:06:52 +01:00
Luca Cavanna d97a00d4a7 added REST test suites runner
The REST layer can now be tested through tests that are shared between all the elasticsearch official clients.
The tests are based on REST specification that can be found on the elasticsearch-rest-api-spec project and consist of YAML files that describe the operations to be executed and the obtained results that need to be tested.

REST tests can be executed through the ElasticsearchRestTests class, which relies on the rest-spec git submodule that contains the rest spec and tests pulled from the elasticsearch-rest-api-spec project. The rest-spec submodule gets automatically initialized and updated through maven (generate-test-resources phase).

The REST runner and the needed classes are distributed within the test artifact.

The following are the options supported by the REST tests runner:

- tests.rest[true|false|host:port]: determines whether the REST tests need to be run and if so whether to rely on an external cluster (providing host and port) or fire a test cluster (default)
- tests.rest.suite: comma separated paths of the test suites to be run (by default loaded from /rest-spec/test classpath). it is possible to run only a subset of the tests providing a sub-folder or even a single yaml file (the default /rest-spec/test prefix is optional when files are loaded from classpath) e.g. -Dtests.rest.suite=index,get,create/10_with_id
- tests.rest.spec: REST spec path (default /rest-spec/api from classpath)
- tests.iters: runs multiple iterations
- tests.seed: seed to base the random behaviours on
- tests.appendseed[true|false]: enables adding the seed to each test section's description (default false)
- tests.cluster_seed: seed used to create the test cluster (if enabled)

Closes #4469
2013-12-17 15:36:16 +01:00
Shay Banon 3fed65e486 reuse shard identifier if possible 2013-12-17 15:35:01 +01:00
Alexander Reelsen 59cedea010 Fix parsing of file based template loading
We support three different settings in templates

* "settings" : { "index" : { "number_of_shards" : 12 } }
* "settings" : { "index.number_of_shards" : 12 }
* "settings" : { "number_of_shards" : 12 }

The latter one was not supported by the fix in #4235

This commit fixes this issue and uses randomized testing to test any of the three cases above when running integration tests.
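
One way to treat the three forms uniformly is to flatten nested objects into dotted keys and then add the `index.` prefix where it is missing. The sketch below is a Python illustration of that idea only; the helper names are hypothetical and the actual fix may work differently.

```python
def flatten(settings, prefix=""):
    """Turn nested objects into dotted keys, e.g. {"index": {"a": 1}} -> {"index.a": 1}."""
    flat = {}
    for key, value in settings.items():
        path = prefix + key
        if isinstance(value, dict):
            flat.update(flatten(value, path + "."))
        else:
            flat[path] = value
    return flat

def normalize_index_settings(settings):
    """Make sure every flattened key carries the "index." prefix."""
    return {
        key if key.startswith("index.") else "index." + key: value
        for key, value in flatten(settings).items()
    }
```

All three forms listed above normalize to the same single entry, `index.number_of_shards` set to 12.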

Closes #4411
2013-12-17 14:34:56 +01:00
Shay Banon be860c8004 take into account field mapped analyzers in simple_query_string
need to use the correct analyzer here, that will automatically choose the correct analyzer per field
2013-12-17 14:34:20 +01:00
Simon Willnauer a4f97bed9d Randomize AllocationDecider order in tests 2013-12-17 13:55:43 +01:00
Simon Willnauer 79ab05cdcf Improve allocation of unassigned shards with early termination
When we allocate unassigned shards we can terminate early for some
shards like if we already tried to allocate a replica we don't need
to try the same replica if the first one got rejected. We also
can check if certain nodes can't allocate any primaries or shards
at all and take those nodes out of the picture for the current round
since it will not change in the current round.
2013-12-17 13:55:43 +01:00
Boaz Leskes 2b6214cff7 Added Cluster Stats API
Closes #4460
2013-12-17 13:14:46 +01:00
Simon Willnauer 75b6415b1a Fail test with timeout and stack dump after 20 min rather than 1h 2013-12-17 12:26:14 +01:00
Boaz Leskes 9fb361cea1 Move index health calculations to ClusterIndexHealth so it can be reused. 2013-12-17 11:31:22 +01:00
Adrien Grand 33599d9a34 Compressed geo-point field data.
This commit allows to trade precision for memory when storing geo points.
This new field data impl accepts a `precision` parameter that controls the
maximum expected error for storing coordinates. This option can be updated on
a live index with the PUT mapping API.

Default precision is 1cm, which requires 8 bytes per geo-point (50% memory
saving compared to using 2 doubles).
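
The commit message does not spell out the encoding, so the following is only an assumed illustration of how fixed-point quantization can fit a point into 8 bytes: each coordinate is scaled onto a 32-bit unsigned integer. At that resolution the longitude step is 360 / 2^32 degrees, which is just under 1cm at the equator. The function names are hypothetical.

```python
import struct

SCALE = 0xFFFFFFFF  # largest 32-bit unsigned value

def encode_point(lat, lon):
    """Quantize latitude and longitude to 32 bits each (8 bytes in total)."""
    lat_q = round((lat + 90.0) / 180.0 * SCALE)
    lon_q = round((lon + 180.0) / 360.0 * SCALE)
    return struct.pack("<II", lat_q, lon_q)

def decode_point(data):
    """Recover approximate coordinates from the 8-byte encoding."""
    lat_q, lon_q = struct.unpack("<II", data)
    return lat_q / SCALE * 180.0 - 90.0, lon_q / SCALE * 360.0 - 180.0
```

Two doubles would need 16 bytes, so this layout gives the 50% saving mentioned above; a coarser precision would allow fewer bits per coordinate.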

Close #4386
2013-12-17 11:29:48 +01:00
Shay Banon a1ee68a145 fix usage of deprecated netty header method 2013-12-17 10:59:27 +01:00
Alexander Reelsen c30945a3d8 Start elasticsearch in the foreground by default
Instead of using the '-f' parameter to start elasticsearch in the
foreground, this is now the default mode.

In order to start elasticsearch in the background, the '-d' parameter
can be used.

Closes #4392
2013-12-17 10:39:22 +01:00
Shay Banon 809e870b8d introduce a native int/long open immutable map, and use it in routing table 2013-12-16 20:33:38 +01:00
Simon Willnauer 1dc8c079da Wait until index is in the clusterstate after restart 2013-12-16 19:52:00 +01:00
Martijn van Groningen 23d2b1ea7b Renamed top level `filter` to `post_filter`.
Closes #4119
2013-12-16 17:10:14 +01:00
Lee Hinman db431b7cb3 Remove the `field` and `text` queries.
The `text` query was replaced by the `match` query and has been
deprecated for quite a while.

The `field` query should be replaced by a `query_string` query with
the `default_field` specified.

Fixes #4033
2013-12-16 08:59:36 -07:00
Simon Willnauer 3e321972cc Throw IAE if suggest results return differently sized results.
If the term suggester is used the results are merged depending on
the number of terms produced by the tokenizer / tokenfilter. If a
term suggester is executed across multiple indices that share the
same field but with different analysis chains we can't merge the
result anymore since tokens are out of order or have a different size.

This commit throws ESIllegalArgumentException if the number of entries
is not the same across all results.
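
The check can be pictured as in the sketch below, where each result is a list of entries (one per analyzed term) and each entry is a list of suggested options. This is a simplified Python illustration; `merge_suggest_entries` is a hypothetical name and `ValueError` stands in for ESIllegalArgumentException.

```python
def merge_suggest_entries(results):
    """Merge per-index suggest results entry by entry.

    Raises ValueError when the results do not all have the same number of
    entries, since they can then no longer be aligned by position.
    """
    sizes = {len(entries) for entries in results}
    if len(sizes) > 1:
        raise ValueError("suggest entries differ in size: %s" % sorted(sizes))
    merged = []
    for same_position in zip(*results):
        # Union of the options proposed for this term, de-duplicated and sorted.
        merged.append(sorted({option for entry in same_position for option in entry}))
    return merged
```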

Closes #3196
2013-12-16 15:31:44 +01:00
Shay Banon 2f2b95a6b8 better cluster reroute allocation benchmark 2013-12-16 15:29:50 +01:00
Luca Cavanna 173a91bb46 Added new IndicesLifecycle.Listener method that allows to listen for any IndexShardState internal change.
Closes #4413
2013-12-16 15:00:15 +01:00
Adrien Grand 4e7ce4ee02 Make field data changes immediately taken into account and add the ability to disallow field data loading.
This commit changes field data configuration updates so that they are
immediately taken into account for loading new segments. The way it works
is that field data configuration is now cached separately from the field
data cache, meaning that it is now possible to clear the field data
configuration from IndexFieldDataService while the cache will stay around. On
the next time that Elasticsearch will reload field data configuration, it will
check if there is already a cache entry, and reuse it if it exists.

To disable field data loading, all that is required is to change the field
data format to "none" (supported by all field data types) using the update
mapping API. Elasticsearch will then refuse to load field data on any new
segment, but field data which has been loaded on the previous segments will
remain available. So you need to clear the field data cache in order to
reclaim memory (otherwise memory will be reclaimed slower, as segments get
merged).

Close #4430
Close #4431
2013-12-16 14:34:33 +01:00
Simon Willnauer 8d321530de Reset source shards to `started` if canceling relocation.
Currently we fail to reset the source shard's status to ACTIVE if we cancel
a relocation. If the shard is RELOCATING we need to reset to state ACTIVE.

Closes #4457
2013-12-16 11:52:16 +01:00
Simon Willnauer 30c6f2fa23 Improve RoutingNodes API
Currently the RoutingNodes API allows modification of its internal state outside of the class.
This commit improves the APIs of `RoutingNode` and `RoutingNodes` to change internal state
only within the classes themselves.

Closes #4458
2013-12-16 11:50:45 +01:00
Sebastian Geidies 6af80d5017 Optimizes performance of AllocationDecider execution. Instead of using loops over all ShardRoutings, do accounting in RoutingNodes.
Speeds up recalculating cluster state on large clusters.
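
The general idea of replacing repeated loops with bookkeeping can be sketched as follows: counts are updated whenever a shard is added or removed, so a decider can read them in constant time. `NodeShardCounts` is a hypothetical Python class for illustration and does not reflect the actual RoutingNodes implementation.

```python
from collections import Counter

class NodeShardCounts:
    """Keeps the number of shards per (node, index) pair up to date incrementally."""

    def __init__(self):
        self._counts = Counter()

    def add(self, node, index):
        self._counts[(node, index)] += 1

    def remove(self, node, index):
        self._counts[(node, index)] -= 1

    def count(self, node, index):
        # Constant-time read instead of looping over all shard routings.
        return self._counts[(node, index)]
```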
2013-12-16 11:35:45 +01:00
Alexander Reelsen 6a856c86e8 Cat API: Add endpoint to show aliases
This endpoint allows to check aliases, their indices, if a filter is
configured along with routing values for searching and indexing.

Closes #4414
2013-12-16 10:37:06 +01:00