Commit Graph

227 Commits

Author SHA1 Message Date
Nik Everett 8bd9e34e39 Stop FVH from throwing away some query boosts
The FVH was throwing away some boosts on queries stopping a number of
ways to boost phrase matches to the top of the list of fragments from
working.

The plain highlighter also doesn't work for this but that is because it
doesn't support the concept of the same term having a different score at
different positions.

Also update documentation claiming that FHV is nicer for weighing terms
found by query combinations.

Closes #4351
2014-01-08 11:51:48 +01:00
Nik Everett 522d620eb6 Use FHV's phraseLimit
This prevents poisoning the FVH with documents that contain TONS of matches
which take tons of memory and time to highlight.

Closes #4645
2014-01-08 11:27:58 +01:00
Alexander Reelsen ad50afbec8 Simplify usage of nodes info API
Important: This breaks backwards compatibility with 0.90

* Removed endpoints: /_cluster/nodes, /_cluster/nodes/nodeId1,nodeId2
* Disallow usage of parameters, but make required metrics part of URI
* Changed NodesInfoRequest to return everything by default
* Fixed NPE in NodesInfoResponse

Closes #4055
2014-01-08 09:46:04 +01:00
Alexander Reelsen 6ef6bb993c Cluster state API: Improved consistency
Instead of specifying what kind of data should be filtered, this commit
streamlines the API to actually specify, what kind of data should be displayed.
This makes its behaviour similar to the other requests, like NodeIndicesStats.

A small feature has been added as well: If you specify an index to select on, not
only the metadata, but also the routing tables are filtered by index in order
to prevent too big cluster states to be returned.

Also the CAT apis have been changed to only return the wanted data in order to keep
network traffic as small as needed.

Tests for the cluster state API filtering have been added as well.

Note: This change breaks backwards compatibility with 0.90!

Closes #4065
2014-01-08 09:25:20 +01:00
Igor Motov 5d98341d11 Fix typo in snapshot/restore documentation 2014-01-07 14:03:12 -05:00
Shay Banon 4aa5ef139e randomize flush interval so multiple shards won't flush at the sam time
- also, allow to update interval using update settings on an index
2014-01-07 19:58:28 +01:00
markharwood 602de04692 A GeoHashGrid aggregation that buckets GeoPoints into cells whose dimensions are determined by a choice of GeoHash resolution.
Added a long-based representation of GeoHashes to GeoHashUtils for fast evaluation in aggregations.
The new BucketUtils provides a common heuristic for determining the number of results to obtain from each shard in "top N" type requests.
2014-01-07 18:03:33 +00:00
Lee Hinman 2cb40fcb17 Rename "exists" to "found" in TermVector and Get responses
- Adds the "created" field to the index action response
- Reverses Delete class' notFound to Found to avoid double negative
2014-01-07 09:47:07 -07:00
Simon Willnauer fa16969360 Cleanup comments and class names s/ElasticSearch/Elasticsearch
* Clean up s/ElasticSearch/Elasticsearch on docs/*
 * Clean up s/ElasticSearch/Elasticsearch on src/* bin/* & pom.xml
 * Clean up s/ElasticSearch/Elasticsearch on NOTICE.txt and README.textile

Closes #4634
2014-01-07 11:21:51 +01:00
Andrew Raines c46721a25f Document h/headers switcheroo. 2014-01-06 16:08:48 -06:00
Martijn van Groningen 32c5471d33 Rename `score` to `track_scores` in percolate api.
Closes #4624
2014-01-06 14:57:39 +01:00
Adrien Grand 9763d079b8 Eager norms loading options.
Norms can be eagerly loaded on a per-field basis by setting norms.loading to
`eager` instead of the default `lazy`:

```
"my_string_field" : {
  "type": "string",
  "norms": {
    "loading": "eager"
  }
}
```

In case this behavior should be applied to all fields, it is possible to change
the default value by setting `index.norms.loading` to `eager`.

Close #4079
2014-01-06 09:53:42 +01:00
Alexander Reelsen bb275166f1 Simplify nodes stats API
First, this breaks backwards compatibility!

* Removed /_cluster/nodes/stats endpoint
* Excpect the stats types not as parameters, but as part of the URL
* Returning all indices stats by default, returning all nodes stats by default
* Supporting groups & types in nodes stats now as well
* Updated documentation & tests accordingly
* Allow level parameter for "shards" and "indices" (cluster does not make sense here)

Closes #4057
2014-01-06 08:33:32 +01:00
Alexander Reelsen 33878be1e8 Simplify indices stats API
Note: This breaks backward compatibility

* Removed clear/all parameters, now all stats are returned by default
* Made the metrics part of the URL
* Removed a lot of handlers
* Added shards/indices/cluster level paremeter to change response serialization
* Returning translog statistics in IndicesStats
* Added TranslogStats class
* Added IndexShard.translogStats() method to get the stats from concrete implementation
* Updated documentation

Closes #4054
2014-01-06 07:27:03 +01:00
Lee Hinman 47607a69a1 Default the circuit breaker limit to 80% of the maximum JVM heap 2014-01-03 16:21:55 -07:00
Lee Hinman 5463f7953f Expose `simple_query_string` flags in `flags` parameter 2014-01-03 16:14:19 -07:00
Alexander Reelsen 811b7d7d78 Do not start packages on installation
The reason to not start packages on installation is to allow to configure
them before starting up (setting heap, cluster.name etc)

Also the documentation was updated in order to show, which statements need
to be executed.
In addition, these statements are also printed out when the package is
installed, depending on whether chkconfig, system or update-rc.d is used.

Closes #3722
2014-01-03 17:40:27 +01:00
Martijn van Groningen f1bf585089 The `fields` option should always return an array for json document fields and single valued field for metadata fields.
Also the `fields` option can only be used to fetch leaf fields, trying to do fetch object fields will return in a client error.

Closes #4542
2014-01-03 17:29:12 +01:00
David Pilato 0c7b494bb8 plugin manager: new `timeout` option
When testing plugin manager with real downloads, it could happen that the test run forever. Fortunately, test suite will be interrupted after 20 minutes, but it could be useful not to fail the whole test suite but only warn in that case.

By default, plugin manager still wait indefinitely but it can be modified using new `--timeout` option:

```sh
bin/plugin --install elasticsearch/kibana --timeout 30s

bin/plugin --install elasticsearch/kibana --timeout 1h
```

Closes #4603.
Closes #4600.
2014-01-03 16:48:18 +01:00
Britta Weber 9f54e9782d rename _shard -> _index and also rename classes and variables
closes #4584
2014-01-03 14:00:23 +01:00
Lee Hinman a754224751 Add field data memory circuit breaker.
This adds the field data circuit breaker, which is used to estimate
the amount of memory required to load field data before loading it. It
then raises a CircuitBreakingException if the limit is exceeded.

It is configured with two parameters:

`indices.fielddata.cache.breaker.limit` - the maximum number of bytes
of field data to be loaded before circuit breaking. Defaults to
`indices.fielddata.cache.size` if set, unbounded otherwise.

`indices.fielddata.cache.breaker.overhead` - a contast for all field
data estimations to be multiplied with before aggregation. Defaults to
1.03.

Both settings can be configured dynamically using the cluster update
settings API.
2014-01-02 15:04:47 -07:00
Martijn van Groningen aa548f5148 Remove GET `_aliases` api in favour for GET `_alias` api
Currently there are two get aliases apis that both have the same functionality, but have a different response structure. The reason for having 2 apis is historic.

The GET _alias api was added in 0.90.x and is more efficient since it only sends the needed alias data from the cluster state between the master node and the node that received the request. In the GET _aliases api the complete cluster state is send to the node that received the request and then the right information is filtered out and send back to the client.

The GET _aliases api should be removed in favour for the alias api

Closes to #4539
2014-01-02 13:56:11 +01:00
Martijn van Groningen f4bf0d5112 Replaced `ignore_indices` with `ignore_unavailable`, `expand_wildcards` and `allow_no_indices`.
* `ignore_unavailable` - Controls whether to ignore if any specified indices are unavailable, this includes indices that don't exist or closed indices. Either `true` or `false` can be specified.
* `allow_no_indices` - Controls whether to fail if a wildcard indices expressions results into no concrete indices. Either `true` or `false` can be specified. For example if the wildcard expression `foo*` is specified and no indices are available that start with `foo` then depending on this setting the request will fail. This setting is also applicable when `_all`, `*` or no index has been specified.
* `expand_wildcards` - Controls to what kind of concrete indices wildcard indices expression expand to. If `open` is specified then the wildcard expression if expanded to only open indices and if `closed` is specified then the wildcard expression if expanded only to closed indices. Also both values (`open,closed`) can be specified to expand to all indices.

Closes to #4436
2014-01-02 12:19:45 +01:00
Britta Weber 1ede9a5730 make term statistics accessible in scripts
term statistics can be accessed via the _shard variable.

Below is a minimal example. See documentation on details.

```

DELETE paytest

PUT paytest
{
    "mappings": {
        "test": {
            "_all": {
                "auto_boost": true,
                "enabled": true
            },
            "properties": {
                "text": {
                    "index_analyzer": "fulltext_analyzer",
                    "store": "yes",
                    "type": "string"
                }
            }
        }
    },
    "settings": {
        "analysis": {
            "analyzer": {
                "fulltext_analyzer": {
                    "filter": [
                        "my_delimited_payload_filter"
                    ],
                    "tokenizer": "whitespace",
                    "type": "custom"
                }
            },
            "filter": {
                "my_delimited_payload_filter": {
                    "delimiter": "+",
                    "encoding": "float",
                    "type": "delimited_payload_filter"
                }
            }
        },
        "index": {
            "number_of_replicas": 0,
            "number_of_shards": 1
        }
    }
}

POST paytest/test/1
{
    "text": "the+1 quick+2 brown+3 fox+4 is quick+10"
}

POST paytest/test/2
{
    "text": "the+1 quick+2 red+3 fox+4"
}

POST paytest/_refresh

POST paytest/_search
{
    "script_fields": {
       "ttf": {
          "script": "_shard[\"text\"][\"quick\"].ttf()"
       }
    }
}

POST paytest/_search
{
    "script_fields": {
       "freq": {
          "script": "_shard[\"text\"][\"quick\"].freq()"
       }
    }
}
POST paytest/test/2/_termvector
POST paytest/_search
{
    "script_fields": {
       "payloads": {
          "script": "term = _shard[\"text\"].get(\"red\",_PAYLOADS);payloads = []; for(pos : term){payloads.add(pos.payloadAsFloat(-1));} return payloads;"
       }
    }
}

POST paytest/_search
{
   "script_fields": {
      "tv": {
         "script": "_shard[\"text\"][\"quick\"].freq()"
      }
   },
   "query": {
      "function_score": {
         "functions": [
            {
               "script_score": {
                  "script": "_shard[\"text\"][\"quick\"].freq()"
               }
            }
         ]
      }
   }
}

```

closes #3772
2014-01-02 11:17:33 +01:00
Adrien Grand 1654ae8937 Explicit doc_values setting.
Once doc values are enabled on a field, they can't be disabled.

Close #4560
2013-12-30 11:10:52 +01:00
Adrien Grand 05448b6276 Doc values for geo points.
This commits add doc values support to geo point using the exact same approach
as for numeric data: geo points for a given document are stored uncompressed
and sequentially in a single binary doc values field.

Close #4207
2013-12-27 12:45:18 +01:00
Florian Schilling bc452dff84 * setup accurate GeoDistance Function
* adapt tests
* introduced default GeoDistance function
* Updated docs

closes #4498
2013-12-27 19:15:19 +09:00
Andrew Raines 69d88a1edd [DOCS] Add headers and help parameters. 2013-12-23 22:26:28 -06:00
Martijn van Groningen eb86a3a6fe [DOCS] Changed `shape_field_name` to `path` in geo_shape filter documentation.
Relates to #4486
2013-12-23 11:27:06 +01:00
Clinton Gormley 998b7b3b86 [DOCS] Fixed community links to official clients 2013-12-20 12:16:58 +01:00
Clinton Gormley dea6b112ae [DOCS] Corrected bloom loading docs 2013-12-20 11:20:54 +01:00
Clinton Gormley 2b8c82c883 [DOCS] Documented index.codec.bloom.load for #4525 2013-12-20 10:51:17 +01:00
Clinton Gormley 51dc057244 [DOCS] Added the official PHP client to the community page. 2013-12-20 10:51:17 +01:00
Richard Pijnenburg df85fdf88f Add repository information to docs
This adds the apt and yum repo information to the setup docs.
2013-12-19 15:58:08 +01:00
Adrien Grand 52db8eb324 More documentation improvements for fielddata loading. 2013-12-18 16:05:35 +01:00
Adrien Grand 07443089ce Improve documentation of the new `disabled` field data format. 2013-12-18 15:44:57 +01:00
Boaz Leskes 3c5106ae98 Added cluster health status to the Cluster Stats API
Relates to #4460
2013-12-18 12:03:49 +01:00
Chris Simpson 4f8c916eed [Docs] Fix Typo
Fixes small typo in the geo_distance aggregation docs.
2013-12-18 11:21:21 +01:00
spenceralger 89e6b9cfc4 Merge pull request #4494 from spenceralger/add_js_docs
JavaScript client docs
2013-12-17 14:41:57 -08:00
Spencer Alger a8ca8497c5 added doc page for the JavaScipt client, and listed it in the clients list. 2013-12-17 15:26:29 -07:00
Boaz Leskes 2b6214cff7 Added Cluster Stats API
Closes #4460
2013-12-17 13:14:46 +01:00
Grégory Quatannens c64abaae7e Fixing typo and grammar 2013-12-17 11:39:02 +01:00
Adrien Grand 33599d9a34 Compressed geo-point field data.
This commit allows to trade precision for memory when storing geo points.
This new field data impl accepts a `precision` parameter that controls the
maximum expected error for storing coordinates. This option can be updated on
a live index with the PUT mapping API.

Default precision is 1cm, which requires 8 bytes per geo-point (50% memory
saving compared to using 2 doubles).

Close #4386
2013-12-17 11:29:48 +01:00
Clinton Gormley 684affa5c7 [DOCS] Removed unused file 2013-12-17 11:28:19 +01:00
Alexander Reelsen b713cf56ed Allow to provide parameters not only through -D but as long parameters
All getopt long style parameters are now set as es. properties,

elasticsearch --path.data=/some/path

results in -Des.path.data=/some/path

Closes #4393
2013-12-17 10:43:27 +01:00
Alexander Reelsen c30945a3d8 Start elasticsearch in the foreground by default
Instead of using the '-f' parameter to start elasticsearch in the
foreground, this is now the default modus.

In order to start elasticsearch in the background, the '-d' parameter
can be used.

Closes #4392
2013-12-17 10:39:22 +01:00
Clinton Gormley 34b9b16233 [DOCS] Fixed some bad link refs 2013-12-16 18:07:33 +01:00
Martijn van Groningen 23d2b1ea7b Renamed top level `filter` to `post_filter`.
Closes #4119
2013-12-16 17:10:14 +01:00
Lee Hinman db431b7cb3 Remove the `field` and `text` queries.
The `text` query was replaced by the `match` query and has been
deprecated for quite a while.

The `field` query should be replaced by a `query_string` query with
the `default_field` specified.

Fixes #4033
2013-12-16 08:59:36 -07:00
Adrien Grand 4e7ce4ee02 Make field data changes immediately taken into account and add the ability to disallow field data loading.
This commit changes field data configuration updates so that they are
immediately taken into account for loading new segments. The way it works
is that field data configuration is now cached separately from the field
data cache, meaning that it is now possible to clear the field data
configuration from IndexFieldDataService while the cache will stay around. On
the next time that Elasticsearch will reload field data configuration, it will
check if there is already a cache entry, and reuse it if it exists.

To disable field data loading, all that is required is to change the field
data format to "none" (supported by all field data types) using the update
mapping API. Elasticsearch will then refuse to load field data on any new
segment, but field data which has been loaded on the previous segments will
remain available. So you need to clear the field data cache in order to
reclaim memory (otherwise memory will be reclaimed slower, as segments get
merged).

Close #4430
Close #4431
2013-12-16 14:34:33 +01:00