According to #2515, the Ubuntu Software Center does not allow installing
Debian packages that are not lintian compatible.
I worked on the package and made it lintian compatible with the following changes:
* Ignored errors about arch-dependent binaries, as we will not split
this package. The arch-dependent libraries are used correctly.
* Added a copyright file pointing to the Apache license in debian
Closes #2515
Closes #2320
Currently, if the MPQ is very large, highlighting can take down a node
or cause high CPU / RAM consumption. If the query grows beyond 16 terms,
we just extract the terms and do term-by-term highlighting.
Closes #3142, #3128
The SimpleFragmentsBuilder did not correct offsets when the analysis
chain in use produced broken offsets, which could lead to
StringArrayIndexOutOfBounds exceptions.
Closes #3140
- SimpleSortTests#testSortScript which was not using the mapping correctly
- SearchStatsTests#testSimpleStats which didn't clear the stats before
running the test and a previous run could have added queries
Version is now stored on a distinct field, that AbstractSimpleEngineTests
didn't correctly add before running tests. This generated a test failure
when the version needed to be loaded from the index.
People are using the Oracle Java distribution and not only OpenJDK.
The package can still suggest OpenJDK, of course. The installation will now
at least continue. If the init script is called, it exits with a useful error
message stating that no JDK is available via the JAVA_HOME variable.
The Version class had hard to understand semantics when two versions were
compared against each other.
Sample of the new logic:
* V_0_20_0.before(V_0_90_0) => true
* V_0_90_0.after(V_0_20_0) => true
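The semantics listed above can be sketched as follows, in Python rather than the project's Java. The `Version` class here and its integer `id` ordering key are illustrative assumptions, not the actual implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Version:
    """Hypothetical stand-in for the Version class; `id` is an assumed ordering key."""

    id: int

    def before(self, other: "Version") -> bool:
        # True when this version is strictly older than `other`.
        return self.id < other.id

    def after(self, other: "Version") -> bool:
        # True when this version is strictly newer than `other`.
        return self.id > other.id


# The ids below are arbitrary; only their relative order matters.
V_0_20_0 = Version(2000)
V_0_90_0 = Version(9000)
```

Reading `a.before(b)` as "a is older than b" keeps both calls consistent with the two sample lines.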
Closes #3124
This test checks for the "perfect" or a "sane" allocation
when the total number of shards is divisible by the total number of nodes
the index can be allocated on.
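One way to read the "perfect" allocation is that every node holds the same number of shards. A minimal Python sketch under that assumption follows; `is_perfectly_balanced` and the node-to-count mapping are hypothetical and do not come from the test itself.

```python
from typing import Mapping


def is_perfectly_balanced(shards_per_node: Mapping[str, int]) -> bool:
    """Return True if every node holds exactly total_shards / node_count shards."""
    node_count = len(shards_per_node)
    total_shards = sum(shards_per_node.values())
    if node_count == 0 or total_shards % node_count != 0:
        # The check only makes sense when the shards divide evenly across nodes.
        raise ValueError("total shards must be divisible by the number of nodes")
    expected = total_shards // node_count
    return all(count == expected for count in shards_per_node.values())
```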
In order to ensure that configuration files do not get overwritten when
upgrading an RPM, it is not sufficient to mark them as configuration. You
have to use the 'noreplace' parameter to make sure they are never
overwritten. Added this parameter for the /etc/elasticsearch directory
as well as the /etc/sysconfig/elasticsearch file.
In addition, the post remove script now only deletes the user in case of
a package removal (and does nothing on package upgrade).
Closes #3123
The new AbstractSharedClusterTest abstracts integration testing further to
reduce the overhead of writing tests that don't rely on explicit control over
the cluster. For instance, tests that run queries or facets, or that test highlighting,
don't need to explicitly start and stop nodes. Tests of features like the ones
just mentioned are based on the assumption that the underlying cluster can
be arbitrary. Based on this assumption, this base class makes it possible to:
* randomize cluster and index settings if not explicitly specified
* transparently test transport & node clients
* test features like search or highlighting on different cluster sizes
* reuse node instances across tests
* provide utility methods that act as upper or lower bounds that a test must pass with,
i.e. if a test requires at least 3 nodes then it should also pass with 4 nodes
* given a cluster has unmodified cluster settings (persistent and transient), the cluster
should not differ from a freshly started cluster when reused across nodes.
* within a test, the client implementation and the client's associated node can be changed
at any time and should return a valid result.
This patch also prepares some redundant tests like 'RelocationTests.java' for randomized
testing. Tests like this are very long-running on some machines and run the same test with
different parameters like 'number of writers' or 'number of relocations', which can easily
be chosen with a random number and run only once during development but multiple times
during CI builds.
All the improvements in this change reduce the test time by ~30%
This is mainly due to the fact that SpanNearQuery allows some neat
tricks with negative slops to run zero-sloped near queries across
2 or more SpanTermQueries.
Closes #3079
New option -l, --list displays list of existing plugins
New option -h, --help displays help
Deprecate options:
-install is now -i, --install
-remove is now -r, --remove
-url is now -u, --url
Catch ArrayIndexOutOfBoundsException when no argument is given to the install, remove or url option
Add description on plugin name structure:
- elasticsearch/plugin/version for official elasticsearch plugins (download from download.elasticsearch.org)
- groupId/artifactId/version for community plugins (download from maven central or oss sonatype)
- username/repository for site plugins (download from github master)
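The three name structures above can be told apart by the number of path segments and the first segment. A minimal Python sketch follows; `classify_plugin_name` and its return labels are hypothetical and are not part of the plugin manager.

```python
def classify_plugin_name(name: str) -> str:
    """Classify a plugin name as 'official', 'community' or 'site' (labels are assumed)."""
    parts = name.split("/")
    if any(not part for part in parts):
        raise ValueError("empty segment in plugin name: %r" % name)
    if len(parts) == 2:
        # username/repository: site plugin downloaded from github master.
        return "site"
    if len(parts) == 3:
        # elasticsearch/plugin/version is official; any other groupId is community.
        return "official" if parts[0] == "elasticsearch" else "community"
    raise ValueError("unsupported plugin name structure: %r" % name)
```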
Closes #3112.
This patch makes mvn eclipse:eclipse generate additional eclipse configuration
files so that Eclipse:
- uses Java 1.6 compliance level,
- truncates lines after 140 chars,
- uses 4 spaces for indentation,
- automatically adds a license header when creating a new class file,
- organizes imports the same way as IntelliJ IDEA (which makes sense I guess
since most of the code base has been written with IntelliJ; this will prevent
large diffs caused by changes in the order of imports).
Doc values can be expected to be more compact than payloads and should provide
better flexibility since doc values formats can be picked on a per-field basis.
This patch:
- makes _version stored as a numeric doc values field,
- manages backwards compatibility: if a version is not found in doc values,
then it will look into payloads,
- uses background merges to upgrade old segments and move _version from
payloads to doc values.
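The backwards-compatible lookup order described above can be sketched in Python. The two dictionaries stand in for the doc values and payload stores; `load_version` is a hypothetical name, not the actual code path.

```python
from typing import Mapping, Optional


def load_version(
    doc_id: str,
    doc_values: Mapping[str, int],
    payloads: Mapping[str, int],
) -> Optional[int]:
    """Look up _version in doc values first, then fall back to payloads."""
    if doc_id in doc_values:
        # New-style segments store the version as a numeric doc value.
        return doc_values[doc_id]
    # Old segments that have not been upgraded by a merge still use payloads.
    return payloads.get(doc_id)
```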
Closes #3103
PlainHighlighter fails with an NPE when the field to highlight is marked as
stored in the mapping but doesn't exist in a hit. This patch makes
FieldsVisitor.fields less error-prone by returning an empty list instead
of null when no matching stored field was found.
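The idea of returning an empty list instead of null can be sketched in Python. `lookup_stored_field` is a hypothetical helper, not the actual FieldsVisitor API.

```python
from typing import Any, List, Mapping


def lookup_stored_field(stored: Mapping[str, List[Any]], name: str) -> List[Any]:
    """Return the stored values for `name`, or an empty list when there are none."""
    # Callers can iterate over the result without a separate null check.
    return list(stored.get(name, []))


# A hit that lacks the mapped, stored field simply yields no fragments.
fragments = [str(value) for value in lookup_stored_field({"title": ["foo"]}, "body")]
```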
Closes #3109
This patch tries to make the suggester implementation as pluggable as
facets or highlight implementations. The goal is to be able to create
custom suggest implementations for use in a suggest query.
Closes #3089
Added an indices aliases exists API that allows checking the existence of an index alias. This API redirects to the master to check for the existence of one or multiple index aliases.
Possible options:
* `index` - The index name to check index aliases for. Partial names are supported via wildcards, and multiple index names can be specified separated with a comma. The alias name for an index can also be used.
* `alias` - The name of the alias to check the existence of. Like the index option, this option supports wildcards and the option to specify multiple alias names separated by a comma. This is a required option.
* `ignore_indices` - What to do if a specified index name doesn't exist. If set to `missing`, those indices are ignored.
The REST HEAD endpoint is: `/{index}/_alias/{alias}`
Examples:
Check existence for any aliases with the name 2013 in any index:
```
curl -XHEAD 'localhost:9200/_alias/2013'
```
Check existence for any aliases that start with 2013_01 in any index:
```
curl -XHEAD 'localhost:9200/_alias/2013_01*'
```
Check existence for any aliases in the users index:
```
curl -XHEAD 'localhost:9200/users/_alias/*'
```
```
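The wildcard and comma-separated matching used in the examples above can be sketched in Python with `fnmatch`. The `alias_exists` function and the in-memory mapping are hypothetical; the sketch does not cover using an alias name in place of the index, nor `ignore_indices`.

```python
from fnmatch import fnmatchcase
from typing import List, Mapping


def _matches_any(name: str, patterns: str) -> bool:
    # `patterns` is a comma-separated list; each entry may contain wildcards.
    return any(fnmatchcase(name, pattern) for pattern in patterns.split(","))


def alias_exists(
    aliases_by_index: Mapping[str, List[str]], alias: str, index: str = "*"
) -> bool:
    """Return True if any matching index has an alias matching `alias`."""
    for index_name, aliases in aliases_by_index.items():
        if not _matches_any(index_name, index):
            continue
        if any(_matches_any(candidate, alias) for candidate in aliases):
            return True
    return False


# Assumed sample data: index name -> alias names.
sample = {"logs_jan": ["2013", "2013_01_a"], "users": ["people"], "empty": []}
```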
Closes #3100
When a type is configured with a TTL, percolation of documents of this type
was not possible. This fix ignores the TTL for percolation instead of
throwing an exception that the document is already expired.
Closes #2975
If we throw an exception in the PostingsFormat during a merge we essentially
fail the entire merge which can lead to a corrupt index. We should rather
return the default postings format for the new segment and log a warning.
Closes #3088
Added apis to get specific index aliases based on filtering by alias name and index name:
```
curl -XGET 'localhost:9200/{index_or_alias}/_alias/{alias_name}'
```
Added delete index alias api for deleting a single index alias:
```
curl -XDELETE 'localhost:9200/{index}/_alias/{alias_name}'
```
Added create index alias api for adding a single index alias:
```
curl -XPUT 'localhost:9200/{index}/_alias/{alias_name}'
curl -XPUT 'localhost:9200/{index}/_alias/{alias_name}' -d '{
"routing" : {routing},
"filter" : {filter}
}'
```
Closes #3075, #3076, #3077
This fix adds a default serialization step in the SimpleDateMappingTests
that parses the mapping, builds the mapper, serializes the mapper and
rebuilds the actual mapper from the serialization result. The contained
information must be equivalent to the original mapping.
The fixed bug has no issue assigned to it since the code is not yet released.
Set the query boost of a parsed query string query to the product of
the parsed query boost and the boost value specified in the "boost"
query string parameter. This only applies if the top level query returned
from the query parser has a boost assigned to it. In such a case we must
multiply the boost with the top-level query boost; otherwise the boost
will be overwritten, i.e. 'foo^2' has a top-level boost of 2 while
'foo^2 OR bar^3' has a top-level boost of 1.0 (default) since the
boolean query is the top-level query.
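The arithmetic can be sketched in Python. `effective_boost` is a hypothetical helper and the "boost" parameter value of 3 is an assumed example, not taken from the change itself.

```python
def effective_boost(parsed_top_level_boost: float, boost_param: float) -> float:
    """Combine the parsed top-level query boost with the "boost" parameter."""
    # Multiplying keeps the parsed boost instead of overwriting it.
    return parsed_top_level_boost * boost_param


# 'foo^2': the term query is the top-level query, so its boost is 2.
single_term = effective_boost(2.0, 3.0)
# 'foo^2 OR bar^3': the boolean query is top level with the default boost of 1.0.
boolean_query = effective_boost(1.0, 3.0)
```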
Closes #3024
seems like it still fails while serializing with sporadic failures in the tests (due to routing on serialization), need to test it in a consistent manner
the semantics between null fields (asking for source), and empty fields (not asking for anything) is missing
also exposes the items in the request, relates to #3061
This commit integrates the forbiddenAPI checks, which check
Java byte code against a list of "forbidden" API signatures.
The commit also contains the fixes of the current source code
that didn't pass the default API checks.
See https://code.google.com/p/forbidden-apis/ for details.
Closes #3059
Currently, if somebody uses a date format that is locale dependent,
date fields can only parse a single format depending on the node's
host locale. This can cause lots of problems since nodes might have
different locales, e.g. "E, d MMM yyyy HH:mm:ss Z", where you have
"Wed, 06 Dec 2000 02:55:00 -0800" for en_EN while
"Mi, 06 Dez 2000 02:55:00 -0800" for de_DE.
Closes #3047
Instead of specifying 'path.plugins' configuration option, 'plugin.types'
is used to load plugins in integration tests. This makes sure the JVM
plugins are not loaded in all following tests from then on.
Also removed the now unneeded es-plugin.properties files from JVM test
plugins.
* RPM: Use the ES_USER variable to set the user (same name as in the debian package
now), while retaining backwards compatibility to existing /etc/sysconfig/elasticsearch
* RPM: Bugfix: Remove the user when uninstalling the package
* RPM: Set an existing homedir when adding the user (allows one to run cronjobs as this user)
* DEB & RPM: Unify Required-Start/Required-Stop fields in initscripts
Currently elasticsearch ships with the plain and the fast-vector highlighter.
In order to support arbitrary highlighters via plugins, you only need to
implement a Highlighter interface and register your implementation in your
plugin at the HighlightModule.
In addition you can also add arbitrary options via the 'options' field in
the highlight request, which can be parsed in the highlighter implementation.
In order to find out how to add your own highlighter, check out the test
classes (CustomHighlighterSearchTests and CustomHighlighter).
Closes #2828
Using an automatically detected 'min_doc_freq' if suggest type is set to
'always' is counterintuitive. If we always suggest, ignore the frequency and
set the threshold frequency to 0 to allow all possible candidates to be drawn if
they are within the given bounds.
Closes #3037
To prevent excessive resource use during recovery, we use
recovery throttling by default to prevent unexpected peak load
on clusters. The default is set to 20 MB/sec.
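As an illustration of what a 20 MB/sec limit means, the Python sketch below computes how long a writer would have to pause to stay under the limit. It is a generic rate-limiting calculation, not the project's throttling implementation; `pause_seconds` is a hypothetical name and 1 MB is assumed to be 1024 * 1024 bytes.

```python
MB = 1024 * 1024  # assumption: binary megabytes


def pause_seconds(
    bytes_written: int,
    elapsed_seconds: float,
    max_bytes_per_sec: float = 20 * MB,
) -> float:
    """Return how long to sleep so the average rate stays at or below the limit."""
    # Minimum time the transfer should have taken at the maximum allowed rate.
    min_duration = bytes_written / max_bytes_per_sec
    # No pause is needed if the transfer was already slower than the limit.
    return max(0.0, min_duration - elapsed_seconds)
```

For example, writing 40 MB in one second at the default limit calls for a one second pause, while 10 MB in one second needs none.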
Closes #3035
Merge throttling is one of the most recommended settings and crucial in the
real-time indexing case. We should set the default to a reasonable setting
that allows folks to index into a production index without seeing large merge
peaks by default. The default is set to 20 MB/sec on the node level.
Closes #3033
The default size used to be 2x availableProcessors, which seemed to
be too low a value in practice. 3x appeared to be a sweet spot for
most applications. The default is now 3 x availableProcessors.
Closes #3023
Added support for unmapped & partially mapped fields (partially mapped fields may occur when searching across multiple indices where the faceted field is mapped on some and unmapped on others). If a shard doesn't have mappings for a field, the matching documents count on that shard will be added to the missing count for that facet.
Both has_parent and has_child filters are internally executed in two rounds. In the second round all documents are evaluated whilst only specific documents need to be checked. In the has_child case only documents belonging to a specific parent type need to be checked and in the has_parent case only child documents need to be checked.
Closes #3034
Similar to the global cluster-wide disable allocation flags, allow setting those on a specific index by updating its settings. The keys are the same as the cluster ones, except they start with index, for example: index.routing.allocation.disable_allocation set to true.
Closes #3031
The branches used in the score method can be moved into the
scorer call and be essentially a constant operation rather than
a linear operation depending on the number of parent docs.
Older OpenSUSE distributions do not ship with systemd and therefore are
using chkconfig, but do not have their scripts placed at /etc/init.d/
This patch is more defensive and adds additional checks in the postinstall
script to prevent aborted post install scripts, which makes the RPM
uninstallable.
When resolving settings, settings with empty values should be removed. For example, when using ${env.ENV_VAR} and ENV_VAR is not set, the setting should be removed.
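A minimal Python sketch of this behaviour follows. `resolve_settings` is a hypothetical function, and treating an unset variable as an empty string before dropping the setting is an assumption about how partial values are handled.

```python
import re
from typing import Dict, Mapping

# Matches placeholders of the form ${env.NAME}.
_ENV_PLACEHOLDER = re.compile(r"\$\{env\.([^}]+)\}")


def resolve_settings(
    settings: Mapping[str, str], env: Mapping[str, str]
) -> Dict[str, str]:
    """Substitute ${env.NAME} placeholders and drop settings that end up empty."""
    resolved = {}
    for key, value in settings.items():
        # Assumption: an unset variable is replaced by an empty string.
        new_value = _ENV_PLACEHOLDER.sub(lambda m: env.get(m.group(1), ""), value)
        if new_value == "":
            # The setting resolved to nothing, so it is removed entirely.
            continue
        resolved[key] = new_value
    return resolved
```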
This commit allows setting custom headers in HTTP responses (like
setting the WWW-Authenticate header for basic auth) by adding the
RestRequest.addHeader() method.
Closes #2936
Closes #2540
To get the history right: This is based on PR #2723
Update requests can now be put in the bulk api. All update request options are supported.
Example usage:
```
curl -XPOST 'localhost:9200/_bulk' --data-binary @bulk.json
```
Contents of bulk.json that contains two update request items:
```
{ "update" : {"_id" : "1", "_type" : "type1", "_index" : "index1", "_retry_on_conflict" : 3} }
{ "doc" : {"field" : "value"} }
{ "update" : { "_id" : "0", "_type" : "type1", "_index" : "index1", "_retry_on_conflict" : 3} }
{ "script" : "counter += param1", "lang" : "js", "params" : {"param1" : 1}, "upsert" : {"counter" : 1}}
```
The `doc`, `upsert` and all script related options are part of the payload. The `_retry_on_conflict` option is part of the header.
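The header/payload split can also be seen when the file is generated programmatically. The Python sketch below builds the same two update items as newline-delimited JSON; `build_bulk_body` is a hypothetical helper, not part of any client library.

```python
import json
from typing import Iterable, Tuple


def build_bulk_body(items: Iterable[Tuple[dict, dict]]) -> str:
    """Serialize (header, payload) pairs as newline-delimited JSON."""
    lines = []
    for header, payload in items:
        # Action line: the header carries _index, _type, _id and _retry_on_conflict.
        lines.append(json.dumps({"update": header}))
        # Payload line: doc, script, params, upsert and similar options.
        lines.append(json.dumps(payload))
    # Every line, including the last one, is terminated by a newline.
    return "\n".join(lines) + "\n"


body = build_bulk_body([
    ({"_id": "1", "_type": "type1", "_index": "index1", "_retry_on_conflict": 3},
     {"doc": {"field": "value"}}),
    ({"_id": "0", "_type": "type1", "_index": "index1", "_retry_on_conflict": 3},
     {"script": "counter += param1", "lang": "js",
      "params": {"param1": 1}, "upsert": {"counter": 1}}),
])
```

Writing `body` to bulk.json gives a file that can be posted with the curl command shown earlier.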
Closes #2982
Before this change, the GetField#getValue() method was returning a list of values of a multivalued field if the field values were obtained from source or if the field was stored and real-time get was used. If the field was stored but non-realtime get was used, GetField#getValue() was returning only the first element and GetField#getValues() was returning a list of elements. This change makes the behavior consistent: GetField#getValue() now always returns only the first value of the field and GetField#getValues() returns the entire list.
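The consistent contract can be sketched in Python. The class below is a hypothetical stand-in for GetField, and returning None for a field without values is an assumption.

```python
from typing import Any, List, Optional


class GetFieldSketch:
    """Hypothetical stand-in illustrating the getValue()/getValues() contract."""

    def __init__(self, name: str, values: List[Any]) -> None:
        self.name = name
        self._values = list(values)

    def get_value(self) -> Optional[Any]:
        # Always the first value only, regardless of where the field was loaded from.
        return self._values[0] if self._values else None

    def get_values(self) -> List[Any]:
        # Always the entire list of values.
        return list(self._values)


tags = GetFieldSketch("tags", ["a", "b", "c"])
```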
Typically, the main reason a reroute allocation command is issued with allow_primary enabled is to force the creation of an empty new shard because a shard (and its replicas) were lost. This can't be done today because the shard expects to have a valid index where it's allocated; we need to clear its post allocation flag to make sure it is allowed to create a fresh index.