6785 Commits

Author SHA1 Message Date
Florian Schilling 611dd0a396 Setup an accurate version of Haversine closes #4596 2014-01-03 17:41:36 +09:00
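For readers unfamiliar with the Haversine formula named in 611dd0a396, here is a minimal sketch of the great-circle distance it computes. The function name, the kilometre unit, and the mean Earth radius constant are assumptions for illustration; this is not the code from the commit.

```python
import math

# Assumption: mean Earth radius in kilometres; the commit may use another value.
EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two (lat, lon) points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    # Clamp to 1.0 so rounding noise cannot push asin out of its domain.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))
```

Two antipodal points on the equator, for example, come out at half the circumference of the assumed sphere.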
Shay Banon 2a73cf4f82 support aliases for columns in cat API
use it as an example in nodes for now for some columns, though we need to go over all the columns and properly name them and alias them
2014-01-03 00:41:26 +01:00
Lee Hinman a754224751 Add field data memory circuit breaker.
This adds the field data circuit breaker, which is used to estimate
the amount of memory required to load field data before loading it. It
then raises a CircuitBreakingException if the limit is exceeded.

It is configured with two parameters:

`indices.fielddata.cache.breaker.limit` - the maximum number of bytes
of field data to be loaded before circuit breaking. Defaults to
`indices.fielddata.cache.size` if set, unbounded otherwise.

`indices.fielddata.cache.breaker.overhead` - a constant for all field
data estimations to be multiplied with before aggregation. Defaults to
1.03.

Both settings can be configured dynamically using the cluster update
settings API.
2014-01-02 15:04:47 -07:00
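The behaviour described in a754224751 can be pictured with a small sketch: each estimate is multiplied by the overhead constant, added to a running total, and rejected if the total would exceed the limit. The class and method names below are hypothetical and only mirror the two settings from the commit message; the actual implementation is Java and may differ in detail.

```python
class CircuitBreakingError(Exception):
    """Stand-in for the CircuitBreakingException named in the commit message."""


class FieldDataBreaker:
    # Hypothetical sketch: limit_bytes=None models the "unbounded" default.
    def __init__(self, limit_bytes=None, overhead=1.03):
        self.limit_bytes = limit_bytes
        self.overhead = overhead
        self.used_bytes = 0

    def add_estimate(self, raw_bytes):
        """Reserve memory for field data that is about to be loaded."""
        estimated = int(raw_bytes * self.overhead)
        new_used = self.used_bytes + estimated
        if self.limit_bytes is not None and new_used > self.limit_bytes:
            # Break before loading; the running total is left unchanged.
            raise CircuitBreakingError(
                f"estimated {new_used} bytes exceeds limit {self.limit_bytes}")
        self.used_bytes = new_used
        return new_used
```

The point of the design is that the check happens on the estimate, before any field data is loaded.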
Dawid Weiss 84565c2951 In the spirit of the soon-to-be New Year 2014? 2014-01-02 22:07:53 +01:00
Honza Král 8517d8954e [TEST] add name parameter to get_alias in update_alias tests
to avoid failure on older es versions since get_alias without name was
only introduced in #4539
2014-01-02 20:04:24 +01:00
Honza Král 076a24af14 [TEST] split tests with parent to pre/post 1.0 in the yaml test suite
See #4506 for details
2014-01-02 20:04:24 +01:00
Honza Král d5efb54785 [TEST] Split delete by query tests pre-1.0 and post-1.0
See #4074 for details
2014-01-02 20:04:24 +01:00
Simon Willnauer edb3e5f0f4 s/similariry/similarity in AllFieldMapper 2014-01-02 17:53:43 +01:00
Simon Willnauer beaa9153a6 Simulate the entire toXContent instead of special casing
Today we try to detect if we need to generate the mapping or not in
the all mapper. This is error-prone since it misses conditions if not
explicitly added. We should rather simulate the generation instead.

This commit also adds a random test to check if the settings
of the all field mapper are correctly applied.

Closes #4579
Closes #4581
2014-01-02 17:15:51 +01:00
Simon Willnauer 79f676e45e Term Vector settings should be treated like flags without propagation
Today, if a specific feature is disabled for term vectors with something
like 'store_term_vector_positions = false', term vectors might be disabled
altogether even if 'store_term_vectors=true' in the mapping. This depends on the
order of the values in the mapping since the more specific one might override
the less specific one.

Closes #4582
2014-01-02 17:15:51 +01:00
Shay Banon c12427d047 remove double check for null in value source 2014-01-02 17:03:24 +01:00
Martijn van Groningen aa548f5148 Remove GET `_aliases` api in favour of GET `_alias` api
Currently there are two get aliases apis that both have the same functionality, but have different response structures. The reason for having two apis is historical.

The GET _alias api was added in 0.90.x and is more efficient since it only sends the needed alias data from the cluster state between the master node and the node that received the request. In the GET _aliases api the complete cluster state is sent to the node that received the request and then the right information is filtered out and sent back to the client.

The GET _aliases api should be removed in favour of the GET _alias api.

Closes to #4539
2014-01-02 13:56:11 +01:00
Alexander Reelsen 8d4be46e59 Made parsing of ByteSizeValue case independent
This allows parsing '12GB' as well as '12gb'

Closes #4442
2014-01-02 13:00:41 +01:00
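The idea behind 8d4be46e59 is simply to normalise the unit suffix before looking it up. A rough sketch follows; the function name and the unit table are assumptions for illustration and do not reproduce the set of suffixes that ByteSizeValue accepts.

```python
import re

# Assumed unit table for illustration only (binary multiples).
_UNITS = {
    "b": 1,
    "k": 1024, "kb": 1024,
    "m": 1024 ** 2, "mb": 1024 ** 2,
    "g": 1024 ** 3, "gb": 1024 ** 3,
}


def parse_byte_size(text):
    """Parse strings such as '12GB' or '12gb' into a number of bytes."""
    match = re.fullmatch(r"\s*(\d+(?:\.\d+)?)\s*([a-zA-Z]*)\s*", text)
    if match is None:
        raise ValueError(f"cannot parse byte size: {text!r}")
    number, unit = match.group(1), match.group(2).lower()  # case independent
    if unit == "":
        unit = "b"
    if unit not in _UNITS:
        raise ValueError(f"unknown unit in byte size: {text!r}")
    return int(float(number) * _UNITS[unit])
```

Lowercasing once at the boundary keeps the lookup table small, instead of listing every case variant of every suffix.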
Martijn van Groningen f4bf0d5112 Replaced `ignore_indices` with `ignore_unavailable`, `expand_wildcards` and `allow_no_indices`.
* `ignore_unavailable` - Controls whether to ignore any specified indices that are unavailable; this includes indices that don't exist and closed indices. Either `true` or `false` can be specified.
* `allow_no_indices` - Controls whether to fail if a wildcard indices expression results in no concrete indices. Either `true` or `false` can be specified. For example, if the wildcard expression `foo*` is specified and no indices are available that start with `foo`, then depending on this setting the request will fail. This setting is also applicable when `_all`, `*` or no index has been specified.
* `expand_wildcards` - Controls what kind of concrete indices wildcard indices expressions expand to. If `open` is specified then the wildcard expression is expanded to only open indices, and if `closed` is specified then the wildcard expression is expanded only to closed indices. Both values (`open,closed`) can also be specified to expand to all indices.

Closes to #4436
2014-01-02 12:19:45 +01:00
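How the three options from f4bf0d5112 interact can be sketched as a small resolver over a map of index names to states. Everything here is hypothetical: the function name, the default values, the error type, and the use of `fnmatchcase` for wildcard matching are assumptions for illustration, not the Elasticsearch implementation.

```python
from fnmatch import fnmatchcase


def resolve_indices(expressions, index_states, ignore_unavailable=False,
                    allow_no_indices=True, expand_wildcards=("open",)):
    """Resolve index expressions against index_states (name -> 'open'/'closed')."""
    resolved = []
    for expr in expressions or ["_all"]:
        if expr == "_all":
            expr = "*"
        if "*" in expr:
            # expand_wildcards decides which index states a wildcard may match.
            matches = sorted(name for name, state in index_states.items()
                             if fnmatchcase(name, expr) and state in expand_wildcards)
            if not matches and not allow_no_indices:
                raise LookupError(f"no indices match {expr!r}")
            resolved.extend(matches)
        else:
            # A concrete name is unavailable if it is missing or closed.
            if index_states.get(expr) != "open":
                if ignore_unavailable:
                    continue
                raise LookupError(f"index {expr!r} is unavailable")
            resolved.append(expr)
    return resolved
```

With indices `foo1` (open) and `foo2` (closed), the expression `foo*` resolves to `foo1` alone unless `closed` is added to `expand_wildcards`.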
Alexander Reelsen 040719f337 Allow GetAliasRequest to retrieve all aliases
Results in less data being sent over the wire, as the Cat API does not
need to have the whole cluster state.

Also added matchers for hasKey() for immutable open map (I think we should
add more of those to have map style assertions).

Closes #4455
2014-01-02 12:06:29 +01:00
Britta Weber 1ede9a5730 make term statistics accessible in scripts
term statistics can be accessed via the _shard variable.

Below is a minimal example. See the documentation for details.

```

DELETE paytest

PUT paytest
{
    "mappings": {
        "test": {
            "_all": {
                "auto_boost": true,
                "enabled": true
            },
            "properties": {
                "text": {
                    "index_analyzer": "fulltext_analyzer",
                    "store": "yes",
                    "type": "string"
                }
            }
        }
    },
    "settings": {
        "analysis": {
            "analyzer": {
                "fulltext_analyzer": {
                    "filter": [
                        "my_delimited_payload_filter"
                    ],
                    "tokenizer": "whitespace",
                    "type": "custom"
                }
            },
            "filter": {
                "my_delimited_payload_filter": {
                    "delimiter": "+",
                    "encoding": "float",
                    "type": "delimited_payload_filter"
                }
            }
        },
        "index": {
            "number_of_replicas": 0,
            "number_of_shards": 1
        }
    }
}

POST paytest/test/1
{
    "text": "the+1 quick+2 brown+3 fox+4 is quick+10"
}

POST paytest/test/2
{
    "text": "the+1 quick+2 red+3 fox+4"
}

POST paytest/_refresh

POST paytest/_search
{
    "script_fields": {
       "ttf": {
          "script": "_shard[\"text\"][\"quick\"].ttf()"
       }
    }
}

POST paytest/_search
{
    "script_fields": {
       "freq": {
          "script": "_shard[\"text\"][\"quick\"].freq()"
       }
    }
}

POST paytest/test/2/_termvector

POST paytest/_search
{
    "script_fields": {
       "payloads": {
          "script": "term = _shard[\"text\"].get(\"red\",_PAYLOADS);payloads = []; for(pos : term){payloads.add(pos.payloadAsFloat(-1));} return payloads;"
       }
    }
}

POST paytest/_search
{
   "script_fields": {
      "tv": {
         "script": "_shard[\"text\"][\"quick\"].freq()"
      }
   },
   "query": {
      "function_score": {
         "functions": [
            {
               "script_score": {
                  "script": "_shard[\"text\"][\"quick\"].freq()"
               }
            }
         ]
      }
   }
}

```

closes #3772
2014-01-02 11:17:33 +01:00
Britta Weber df9b8ae02e do not call score() twice 2014-01-02 11:16:55 +01:00
Martijn van Groningen a7bb28c0e7 Made single shards APIs fail if routing is configured to be required in the mapping.
This change makes single shard requests fail when no routing is specified and routing has been configured to be required in the mapping. Thi

Closes #4506
2014-01-02 10:47:53 +01:00
Simon Willnauer c78f517d36 Allow 'omit_norms' on the '_all' field
The '_all' field doesn't allow omitting norms. In certain scenarios
omitting the norm values makes a lot of sense to get sensible scoring.

Closes #3734
2014-01-02 10:27:53 +01:00
Martijn van Groningen bb01995722 Made APIs consistently accept a query in the request body's `query` field.
The following APIs now accept the query in a top level `query` field like:
* delete_by_query
* validate_query
* count

These APIs used to accept the query directly in the request body which was inconsistent with the search and explain APIs. For this reason t

Closes #4074
2014-01-02 10:06:01 +01:00
Alexander Reelsen dee325de79 Packaging: Increasing default for max mapped pages to 262144 2014-01-02 09:10:46 +01:00
Simon Willnauer e7a84d744a Add ability to run certain packages with assertions disabled
Tests can be run with `-Dtests.assertion.disabled=org.elasticsearch`
to run the tests without assertions to make sure assertions
don't hide any assignments etc. that introduce bugs in production.
2013-12-30 19:36:02 +01:00
Shay Banon e6e1a3463a more cleanup of cat API, fix index lookup failure count/health 2013-12-30 16:12:15 +01:00
David Pilato b29f89f7f9 We run PluginManagerTests using only node client.
We also add some debug logs and fix `tests.network` (setting it to true was not working from jenkins)
2013-12-30 15:40:52 +01:00
Shay Banon 05c5804341 Expose filtered nodes on TransportClient
Expose the list of nodes that were filtered out with the TransportClient, for example, due to different cluster name. Relates to #4569
closes #4571
2013-12-30 15:27:50 +01:00
Shay Banon 95abbe2057 mark abstract class as abstract 2013-12-30 14:40:01 +01:00
Shay Banon debfb0e996 move helper class for allocation tests to base class 2013-12-30 14:23:34 +01:00
Shay Banon e67cad3127 Add build hash to nodes info API
also, add it to the cat nodes api
2013-12-30 13:59:56 +01:00
Adrien Grand 96cca039e9 Honor `includeDefaults` in GeoPointFieldMapper.
Close #4563
2013-12-30 13:46:19 +01:00
Adrien Grand 1654ae8937 Explicit doc_values setting.
Once doc values are enabled on a field, they can't be disabled.

Close #4560
2013-12-30 11:10:52 +01:00
David Pilato 7694f0b7a0 Increase MaxPermSize to 128m for tests 2013-12-30 09:56:47 +01:00
Simon Willnauer 11c4218566 Start Test nodes sometimes without mock modules
We are mocking out some functionality to add assertions etc. or
randomize store types. We should randomly run with our defaults to make
sure we don't hide any potential problems.
2013-12-29 00:50:10 +01:00
Simon Willnauer a1e4258b21 Add @Slow annotation to bad apples 2013-12-29 00:03:14 +01:00
Simon Willnauer 3113203e9e Add test that throws exceptions during search execution
Currently we only test if readers are correctly released when exceptions
occur during reopen or flush. This commit adds a test that
randomly throws exceptions during the search execution, i.e. when Terms
are pulled or if a docs enum is created.
2013-12-28 23:58:02 +01:00
Luca Cavanna 08a077ffae re-enabled FileUtilsTests and REST tests as rest-api-spec has been added back
fixed rest-api-spec paths in TESTING docs

Relates to #4540 & #4376
2013-12-27 20:43:16 +01:00
Luca Cavanna 63a9ae4e2b merged rest-api-spec repo into es core
Closes #4540
Relates to #4376
2013-12-27 20:38:51 +01:00
Luca Cavanna 63cbc84393 removed rest-spec submodule and prepared project for same files added directly to the codebase (no submodule) within rest-api-spec
(temporarily disabled FileUtilsTests & REST tests as there's temporarily no rest-spec dir)

Relates to #4540 #4376
2013-12-27 20:36:12 +01:00
Adrien Grand 51bec4ec6c Add SLOPPY_ARC to GeoDistanceSearchBenchmark. 2013-12-27 15:48:24 +01:00
Simon Willnauer 1b35ae11bc Fix SuggestSearchTests to expect any order in the error message 2013-12-27 14:07:04 +01:00
Adrien Grand 55a5c26de8 Fix NPE in RangeAggregator 2013-12-27 12:48:55 +01:00
Adrien Grand 05448b6276 Doc values for geo points.
This commit adds doc values support to geo points using the exact same approach
as for numeric data: geo points for a given document are stored uncompressed
and sequentially in a single binary doc values field.

Close #4207
2013-12-27 12:45:18 +01:00
Adrien Grand 9eb7441543 Make RangeAggregator a MULTI_BUCKETS aggregator.
Until now, RangeAggregator was a PER_BUCKET aggregator, expecting to be always
collected with owningBucketOrdinal == 0. However, since the number of buckets
it creates is known in advance, it can be changed to a MULTI_BUCKETS aggregator
by just multiplying the bucket ordinal by the number of ranges.

This makes aggregations that have ranges as sub aggregations of PER_BUCKET
aggregators more efficient.

Close #4550
2013-12-27 12:43:25 +01:00
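The ordinal arithmetic described in 9eb7441543 can be illustrated as follows. Because the number of ranges is fixed up front, each (owning bucket, range) pair maps to a unique flat ordinal. The function names and the half-open range convention are assumptions for illustration only.

```python
def sub_bucket_ordinal(owning_bucket_ordinal, range_index, num_ranges):
    """Flat ordinal of a range bucket nested under a parent bucket."""
    return owning_bucket_ordinal * num_ranges + range_index


def collect(docs, ranges):
    """Count docs per flat ordinal.

    docs: iterable of (owning_bucket_ordinal, value) pairs.
    ranges: list of (low, high) pairs, treated as half-open [low, high).
    """
    counts = {}
    for owning, value in docs:
        for i, (low, high) in enumerate(ranges):
            if low <= value < high:
                ordinal = sub_bucket_ordinal(owning, i, len(ranges))
                counts[ordinal] = counts.get(ordinal, 0) + 1
    return counts
```

A single collector can therefore serve every parent bucket, which is where the efficiency gain mentioned in the commit message comes from.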
Simon Willnauer 1c2cb99751 Use RandomPicks to select a random array element 2013-12-27 12:35:57 +01:00
Simon Willnauer 11ceaccc20 Randomize node level setting per node not per cluster 2013-12-27 12:21:41 +01:00
Simon Willnauer f52a080eec Randomize CacheRecycler instance in TestCluster 2013-12-27 12:21:25 +01:00
Florian Schilling bc452dff84 * setup accurate GeoDistance Function
* adapt tests
* introduced default GeoDistance function
* Updated docs

closes #4498
2013-12-27 19:15:19 +09:00
Shay Banon 5821f90b2c cleanup cat nodes 2013-12-26 17:22:30 +01:00
Adrien Grand d0143703a1 Fix Aggregator.buildAggregation on MULTI_BUCKETS aggregators. 2013-12-26 11:38:14 +01:00
Adrien Grand f3c1a885fb Fix QueueRecycler.
Double-release protection added in 1c758b0b made QueueRecycler throw NPEs when
trying to recycle existing instances.
2013-12-26 10:49:58 +01:00
Adrien Grand a04d18d2d2 Use BINARY doc values instead of SORTED_SET doc values to store numeric data.
Although SORTED_SET doc values make things like terms aggregations very fast
thanks to the use of ordinals, ordinals are usually not that useful on numeric
data. We are more interested in the values themselves in order to be able to
compute sums, averages, etc. on these values. However, SORTED_SET is quite slow
at accessing values, so BINARY doc values are better suited to storing numeric
data.

floats and doubles are encoded without compression with little-endian byte order
(so that it may be optimizable through sun.misc.Unsafe in the future given that
most computers nowadays use the little-endian byte order) and byte, short, int,
and long are encoded using vLong encoding: they first encode the minimum value
using zig-zag encoding (so that negative values become positive) and then deltas
between successive values.

Close #3993
2013-12-26 09:58:00 +01:00
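The integer encoding described in a04d18d2d2 can be sketched roughly as below: the minimum value is zig-zag encoded so it becomes non-negative, and the remaining values are stored as deltas, each written as a variable-length integer. This is an illustrative sketch with hypothetical function names; it assumes the values are sorted first and that inputs fit in 64-bit signed integers, and the real on-disk format may carry additional information.

```python
def zigzag_encode(n):
    """Map a signed 64-bit integer to a non-negative one (-1 -> 1, 1 -> 2)."""
    return (n << 1) ^ (n >> 63)


def zigzag_decode(z):
    return (z >> 1) ^ -(z & 1)


def write_vlong(value, out):
    """Append a non-negative integer, 7 bits per byte, low bits first."""
    while value >= 0x80:
        out.append((value & 0x7F) | 0x80)  # high bit set: more bytes follow
        value >>= 7
    out.append(value)


def read_vlong(data, pos):
    result = shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if byte < 0x80:
            return result, pos
        shift += 7


def encode_longs(values):
    """Encode integers as zig-zag(minimum) followed by deltas, all as vLongs."""
    ordered = sorted(values)
    out = bytearray()
    if ordered:
        write_vlong(zigzag_encode(ordered[0]), out)
        for prev, cur in zip(ordered, ordered[1:]):
            write_vlong(cur - prev, out)  # deltas are non-negative once sorted
    return bytes(out)


def decode_longs(data):
    if not data:
        return []
    first, pos = read_vlong(data, 0)
    values = [zigzag_decode(first)]
    while pos < len(data):
        delta, pos = read_vlong(data, pos)
        values.append(values[-1] + delta)
    return values
```

Only the first value needs zig-zag encoding, since every delta between sorted values is already non-negative.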