This option allows to force the xcontent type to use to store the `_source`
document. The default is to use the same format as the input format.
This commit makes this option ignored for 2.x indices and rejected for 3.0
indices.
As a replacement use ExistsQueryBuilder inside a mustNot() clause.
So instead of using `new ExistsQueryBuilder(name)` now use:
`new BoolQueryBuilder().mustNot(new ExistsQueryBuilder(name))`.
Closes#14112
This makes some minor improvements (does not fix all problems!)
It reorders unicast disco in elasticsearch.yml to be right after the network host,
for better locality.
It removes the warning (unreleased) about publish addresses, lets try to really discourage setting
that unless you need to (behind a proxy server). Most people should be fine with `network.host`
Finally it reorganizes the network docs page a bit:
We add a table of 4 "basic" settings at the very beginning:
* network.host
* discovery.zen.ping.unicast.hosts
* http.port
* transport.tcp.port
The first two being the most important, which addresses to bind and talk to, and the other two
being the port numbers.
The rest of the stuff I tried to simplify and reorder under "advanced" headers.
This is just a quick stab, I still think we need more effort into this thing, but we gotta start somewhere.
The NodeBuilder is currently used to construct a Node. However, this is
really just yet-another-builder that wraps around a Settings.Builder
witha couple convenience methods. But there are very few uses of these
convenience methods. This change removes NodeBuilder, in favor of just
using the Node constructor.
throw exception if a copy_to is within a multi field
Copy to within multi field is ignored from 2.0 on, see #10802.
Instead of just ignoring it, we should throw an exception if this
is found in the mapping when a mapping is added. For already
existing indices we should at least log a warning.
We remove the copy_to in any case.
related to #14946
When using S3 or EC2, it was possible to use a proxy to access EC2 or S3 API but username and password were not possible to be set.
This commit adds support for this. Also, to make all that consistent, proxy settings for both plugins have been renamed:
* from `cloud.aws.proxy_host` to `cloud.aws.proxy.host`
* from `cloud.aws.ec2.proxy_host` to `cloud.aws.ec2.proxy.host`
* from `cloud.aws.s3.proxy_host` to `cloud.aws.s3.proxy.host`
* from `cloud.aws.proxy_port` to `cloud.aws.proxy.port`
* from `cloud.aws.ec2.proxy_port` to `cloud.aws.ec2.proxy.port`
* from `cloud.aws.s3.proxy_port` to `cloud.aws.s3.proxy.port`
New settings are `proxy.username` and `proxy.password`.
```yml
cloud:
aws:
protocol: https
proxy:
host: proxy1.company.com
port: 8083
username: myself
password: theBestPasswordEver!
```
You can also set different proxies for `ec2` and `s3`:
```yml
cloud:
aws:
s3:
proxy:
host: proxy1.company.com
port: 8083
username: myself1
password: theBestPasswordEver1!
ec2:
proxy:
host: proxy2.company.com
port: 8083
username: myself2
password: theBestPasswordEver2!
```
Note that `password` is filtered with `SettingsFilter`.
We also fix a potential issue in S3 repository. We were supposed to accept key/secret either set under `cloud.aws` or `cloud.aws.s3` but the actual code never implemented that.
It was:
```java
account = settings.get("cloud.aws.access_key");
key = settings.get("cloud.aws.secret_key");
```
We replaced that by:
```java
String account = settings.get(CLOUD_S3.KEY, settings.get(CLOUD_AWS.KEY));
String key = settings.get(CLOUD_S3.SECRET, settings.get(CLOUD_AWS.SECRET));
```
Also, we extract all settings for S3 in `AwsS3Service` as it's already the case for `AwsEc2Service` class.
Closes#15268.
Rename processor now checks whether the field to rename exists and throws exception if it doesn't. It also checks that the new field to rename to doesn't exist yet, and throws exception otherwise. Also we make sure that the rename operation is atomic, otherwise things may break between the remove and the set and we'd leave the document in an inconsistent state.
Note that the requirement for the new field name to not exist simplifies the usecase for e.g. { "rename" : { "list.1": "list.2"} } as such a rename wouldn't be accepted if list is actually a list given that either list.2 already exists or the index is out of bounds for the existing list. If one really wants to replace an existing field, that field needs to be removed first through remove processor and then rename can be used.
Do not to load fields from _source when using the `fields` option.
Non stored (non existing) fields are ignored by the fields visitor when using the `fields` option.
Fixes#10783
Support * wildcard to retrieve stored fields when using the `fields` option.
Supported pattern styles are "xxx*", "*xxx", "*xxx*" and "xxx*yyy".
In the example we show an `exists` query inside a constant score query. While this is possible, it can mislead users to think it is necessary so we should remove it.
# Please enter a commit message to explain why this merge is necessary,
# especially if it merges an updated upstream into a topic branch.
#
# Lines starting with '#' will be ignored, and an empty message aborts
# the commit.
- moves calculation of the delay to a single place (ReplicaShardAllocator)
- reduces coupling between GatewayAllocator and RoutingService
- in master failover situations, elapsed delay time is forgotten
Closes#14808
At the time of geo_shape query conception, CONTAINS was not yet a supported spatial operation in Lucene. Since it is now available this commit adds ShapeRelation.CONTAINS to GeoShapeQuery. Randomized testing is included and documentation is updated.
We actually want to keep the test when using deprecated setting in 3.0.
We will keep this setting deprecated as well so people will be able to update in a smoother way.
Also add the deprecating information to the migration documentation.
Follow up for #13228.
This commit adds support for a secondary storage account:
```yml
cloud:
azure:
storage:
my_account1:
account: your_azure_storage_account1
key: your_azure_storage_key1
default: true
my_account2:
account: your_azure_storage_account2
key: your_azure_storage_key2
```
When creating a repository, you can choose which azure account you want to use for it:
```sh
curl -XPUT localhost:9200/_snapshot/my_backup1?pretty -d '{
"type": "azure"
}'
curl -XPUT localhost:9200/_snapshot/my_backup2?pretty -d '{
"type": "azure",
"settings": {
"account" : "my_account2",
"location_mode": "secondary_only"
}
}'
```
`location_mode` supports `primary_only` or `secondary_only`. Defaults to `primary_only`. Note that if you set it
to `secondary_only`, it will force `read_only` to true.
We already introduced the MatchNoneQueryBuilder query that does not
return any documents, mainly because we needed it for internal
representation of the NONE option in the IndicesQueryBuilder.
However, the query was requested at least once also for the query dsl,
and since we can parser it already we should document it as
`match_none` query in the relevant reference docs as well.
When passing the example json snippets through the query parser while working
on #14249 some of the examples could not be parsed. This PR fixes those
examples.
Relates to #14249
geoip: add a `fields` option to control what fields are added by geoip processor
geoip: instead of adding all fields, only `country_code`, `city_name`, `location`, `continent_name` and `region_name` fields are added.
fix forbiddenapis
update
clean up and add rest test
update mutate factory to use configuration utilities
compile gsub pattern
cleanup, update parseBooleans, null tests
Some users may already be familiar with column stores, so saying more explicitly
that doc values are a columnar representation of the data may help them better
and/or more quickly understand what doc values are about.
This adds the `cluster.routing.allocation.total_shards_per_node`
setting, which limits the total number of shards across all indices on
each node. It defaults to -1 and can be dynamically configured.
Resolves#14456
The completion suggester provides auto-complete/search-as-you-type functionality.
This is a navigational feature to guide users to relevant results as they are typing, improving search precision.
It is not meant for spell correction or did-you-mean functionality like the term or phrase suggesters.
The completions are indexed as a weighted FST (finite state transducer) to provide fast Top N prefix-based
searches suitable for serving relevant results as a user types.
closes#10746
Current processors setting is not reflected in nodes info API
("os.available_processors"). Add os.allocated_processors to shows
actual number of processors that we are using.
query_binary and filter_binary are unused at this point, as we only parse on the coordinating node and the java api only holds structured java objects for queries and filters, meaning they all implement Writeable and get natively serialized.
Relates to #14308Closes#14433
Also moved all processor classes into a subdirectory and introduced a
ConfigException class to be a catch-all for things that can go wrong
when constructing new processors with configurations that possibly throw
exceptions. The GrokProcessor loads patterns from the resources
directory.
fix resource path issue, and add rest-api-spec test for grok
fix rest-spec tests
changes: license, remove configexception, throw IOException
add more tests and fix iso8601-hour pattern
move grok patterns from resources to config
fix tests with pom changes, updated IngestClientIT with grok processor
update gradle build script for grok deps and test configuration
move config files to src/main/packaging
move Env out of Processor, fix test for src/main/packaging change
add docs
clean up test resources task
update Grok to be immutable
- Updated the Grok class to be immutable. This means that all the
pattern bank loading is handled by an external utility class called
PatternUtils.
- fixed tabs in the nagios patterns file's comments
This commit forbids the changing of thread pool types for any thread
pool. The motivation here is that these are expert settings with
little practical advantage.
Closes#14294, relates #2509, relates #2858, relates #5152
Removes the mapping transform feature which when used made debugging very
difficult. Users should transform their documents on the way into
Elasticsearch rather than having Elasticsearch do it.
Closes#12674
The only way to refer to the plain highlighter is now `plain`, the only way to refer to the fast vector highlighter is `fvh` and the only way to refer to the postings highlighter is `postings`. The name variants like `highlighter`, `postings-highlighter` and `fast-vector-highlighter` have been removed.
We have two types of parse methods for queries: one for the inner query, to be used once the parser is positioned within the query element, and one for the whole query source, including the query element that wraps the actual query.
With the search refactoring we ended up using the former in count, cat count and delete by query, whereas we should have used the former. It ends up working properly given that we have a registered (deprecated) query called "query", which used to allow to wrap a filter into a query, but this has the following downsides:
1) prevents us from removing the deprecated "query" query
2) we end up supporting a top level query that is not wrapped within a query element (pre 1.0 syntax iirc that shouldn't be supported anymore)
This commit finally removes the "query" query and fixes the related parsing bugs. We also had some tests that were providing queries in the wrong format, those have been fixed too.
Closes#13326Closes#14304
* Allow for multiple host specifications (e.g. _en0_,192.168.1.2,_site_).
* Add _site_ and _global_ scopes as counterparts to _local_.
* Warn on heuristic selection of publish address.
* Remove the arbitrary _non_loopback_ setting.
Closes#13954
The NotQueryBuilder has been deprecated on the 2.x branches
and can be removed with the next major version. It can be
replaced by boolean query with added mustNot() clause.
Closes#13761
This adds an API for force merging lucene segments. The `/_optimize` API is now
deprecated and replaced by the `/_forcemerge` API, which has all the same flags
and action, just a different name.
This commit removes some cache concurrency level settings that were
applicable when the cache was backed by the Guava cache implementation,
but no longer apply with the cache implementation completed in #13717.
Relates #7836, relates #13224, relates #13717
we are relying on terminate_after more and more, replaced the limit filter with it and soon it will also replace the search_exists api. At that point we should make it a stable api rather than experimental.
Closes#14183
Closed indices are currently out of scope for snapshots and shard migration,
and can cause issues in managed environments – where closing an index does
not necessarily make sense, as it still consumes the managed environment's storage quota.
This commit adds an option to dynamically disable closing indices via node or cluster settings.
Closes#14168
Hi,
I belive indexRouting should be changed to routing.index and searchRouting to routing.search.
This is what my ES 2.0.0-rc1 instance returns when I hit cat alias api:
Let me know if I should change other versions of documentation for this issue as well(Looks to me like it always was incorrect in docs).
This commit adds a new metric aggregator for computing the geo_centroid over a set of geo_point fields. This can be combined with other aggregators (e.g., geohash_grid, significant_terms) for computing the geospatial centroid based on the document sets from other aggregation results.
* Add ability for plugins to declare additional permissions with a custom plugin-security.policy file and corresponding AccessController logic. See the plugin author's guide for more information.
* Add warning messages to users for extra plugin permissions in bin/plugin.
* When bin/plugin is run interactively (stdin is a controlling terminal and -b/--batch not supplied), require user confirmation.
* Improve unit test and IDE support for plugins with additional permissions by exposing plugin's metadata as a maven test resource.
Closes#14108
Squashed commit of the following:
commit cf8ace65a7397aaccd356bf55f95d6fbb8bb571c
Author: Robert Muir <rmuir@apache.org>
Date: Wed Oct 14 13:36:05 2015 -0400
fix new unit test from master merge
commit 9be3c5aa38f2d9ae50f3d54924a30ad9cddeeb65
Merge: 2f168b8 7368231
Author: Robert Muir <rmuir@apache.org>
Date: Wed Oct 14 12:58:31 2015 -0400
Merge branch 'master' into off_my_back
commit 2f168b8038e32672f01ad0279fb5db77ba902ae8
Author: Robert Muir <rmuir@apache.org>
Date: Wed Oct 14 12:56:04 2015 -0400
improve plugin author documentation
commit 6e6c2bfda68a418d92733ac22a58eec35508b2d0
Author: Robert Muir <rmuir@apache.org>
Date: Wed Oct 14 12:52:14 2015 -0400
move security confirmation after 'plugin already installed' check, to prevent user from answering unnecessary questions.
commit 08233a2972554afef2a6a7521990283102e20d92
Author: Robert Muir <rmuir@apache.org>
Date: Wed Oct 14 05:36:42 2015 -0400
Add documentation and pluginmanager support
commit 05dad86c51488ba43ccbd749f0164f3fbd3aee62
Author: Robert Muir <rmuir@apache.org>
Date: Wed Oct 14 02:22:24 2015 -0400
Decentralize plugin permissions (modulo docs and pluginmanager work)
When running in GCE platform, an instance has access to:
http://metadata.google.internal/computeMetadata/v1/instance/network-interfaces/0/ip
Which gives back the private IP address, for example `10.240.0.2`.
http://metadata.google.internal/computeMetadata/v1/instance/network-interfaces/0/externalIp
Gives back the public Ip address, for example `130.211.108.21`.
As we have for `ec2`, we can support new network host settings:
* `_gce:privateIp:X_`: The private IP address of the machine for a given network interface.
* `_gce:hostname_`: The hostname of the machine.
* `_gce_`: Same as `_gce:privateIp:0_` (recommended).
Closes#13605.
Closes#13590.
BTW resolveIfPossible now throws IOException so code is also updated for ec2 discovery and
some basic tests have been added.
The `_create` API is handy way to specify an index operation should only be done if the document doesn't exist. This is currently implemented in explicit code paths all the way down to the engine. However, conceptually this is no different than any other versioned operation - instead of requiring a document is on a specific version, we require it to be deleted (or non-existent). This PR removes Engine.Create in favor of a slight extension in the VersionType logic.
There are however a couple of side effects:
- DocumentAlreadyExistsException is removed and VersionConflictException is used instead (with an improved error message)
- Update will reject version parameters if the upsert option is used (it doesn't compute anyway).
- Translog.Create is also removed infavor of Translog.Index (that's OK because their binary format was the same, so we can just read Translog.Index of the translog file)
Closes#13955
It is rarely used and was not consistently handled by different distributions anyway.
This commit also adds a test for specifying CONF_DIR when installing plugins and
starting elasticsearch.
relates to #12712 and #12954closes#5329closes#13715
With 2.0, we now bind to `localhost` by default instead of binding to the network card and use its IP address.
When the discovery plugin gets from AWS API the list of nodes that should form the cluster, this list is pinged then. But as each node is bound to `localhost`, ping does not get an answer and the node elects itself as the master node.
`network.host` must be set.
Closes#13589.
Types are still optional, but if you do provide them, they can't be null. Split the existing constructor that accepted nnull into two, one that accepts no arguments, and another one that accepts the types argument, which must be not null.
Also trimmed down different ways of setting ids, some were misleading as they would always add the ids to the existing ones and not set them, the add prefix makes that clear. Left `addIds` method that accepts a varargs argument. Added check for ids not be null.
TermsLookupQueryBuilder was left around only for bw comp reasons, but TermsQueryBuilder is its replacement. We can remove it now that it is clear query refactoring goes in master (3.0).
This commit fixes ping timeout settings inconsistencies in
ZenDiscovery. In particular, the documentation refers to the ping
timeout setting as discovery.zen.ping_timeout but the code was
ultimately using discovery.zen.ping.timeout if this was set.
This commit also changes all instances of the raw string
“discovery.zen.ping_timeout” to the constant
o.e.d.z.ZenDiscovery.SETTING_PING_TIMEOUT.
Finally, this commit removes the legacy setting
"discovery.zen.initial_ping_timeout".
Closes#6579, #9581, #9908
The current MoreLikeThisQueryBuilder validation checks for existence of at
least one `like` text or item. This is hard to check in setters, so this PR
tries to change the construction of the query so that we can do these checks
already at construction time.
Changing to using arrays for fieldnames, likeTexts, likeItems, unlikeTexts
and unlikeItems. `likeTexts` and/or `likeItems` need to be specified at
construction time to validate we have at least one item there.
Relates to #10217
This commit moves the size and ops based flush into a synchronous API into
IndexShard and removes the time-based flush alltogether since it' basically
covered by the inactive async flush API we have today. The functionality doesn't
need to be covered by scheduled task and async APIs while we can actually make all
the decisions in a sync manner which is way easier to control and to test.
Closes#13707
Refactor the function_score query so it can be parsed on the coordinating node, split parse into fromXContent and toQuery, make FunctionScoreQueryBuilder Writeable.
Closes#13653
Before this commit he tests always run bin/plugin as root which is somewhat
unrealistic and causes trouble (log files owned by root instead of
elasticsearch). After this commit `bin/plugin` runs as root when elasticsearch
is installed via the repository and as elasticsearch otherwise which is much
more realistic.
This also adds extra timeout to starting elasticsearch which is required
when all the plugins are installed. And it fixes up a problem with logging
elasticsearch's log if elasticsearch doesn't start which came up multiple
time while debugging this problem.
Also adds docs recommending running `bin/plugin` as the user that owns the
Elasticsearch files or root if installed with the packages.
Closes#13557
Moving validation from validate() to constructors and setters for the
following query builders:
* GeoDistanceQueryBuilder
* GeoDistanceRangeQueryBuilder
* GeoPolygonQueryBuilder
* GeoShapeQueryBuilder
* GeohashCellQuery
* TermsQueryBuilder
Relates to #10217
Until now we had a cloud-azure plugin which is providing 3 distinct features:
* discovery on Azure
* snapshot/restore on Aure
* SMB store
This commit splits the plugin by feature so people can use either one or the other or both features.
Doc is updated accordingly.
This PR is the second batch in moving the query validation we started
to collect in the validate() method to the corresponding setters
and constructors.
With 2.0, we now bind to `localhost` by default instead of binding to the network card and use its IP address.
When the discovery plugin gets from Azure API the list of nodes that should form the cluster, this list is pinged then. But as each node is bound to `localhost`, ping does not get an answer and the node elects itself as the master node.
Closes#13591
Requesting a million hits, or page 100,000 is always a bad idea, but users
may not be aware of this. This adds a per-index limit on the maximum size +
from that can be requested which defaults to 10,000.
This should not interfere with deep-scrolling.
Closes#9311
* Dropped ScoreType in favour of Lucene's ScoreMode
* Removed `score_type` option from `has_child` and `has_parent` queries in favour for the already existing `score_mode` option.
* Removed the score mode `sum` in favour for the already existing `total` score mode. (`sum` doesn't exist in Lucene's ScoreMode class)
* If `max_children` is set to `0` it now really means that zero children are allowed to match.
This add equals, hashcode, read/write methods, separates toQuery and JSON parsing and adds tests.
Also moving MatchQueryBuilder.Type to MatchQuery to MatchQuery, adding serialization and hashcode,
equals there.
Relates to #10217
Allocation filtering by IP only works today using the node host address. But in some cases, you might want to filter using the publish address which could be different.
This commit splits HasParentQueryParser into toQuery and fromXContent.
This change also deprecates several keys in favor of simplified settings
and adds basic unittests for HasParentQueryParser.
Relates to #10217
Previously the parser could take any Term Vectors request, but this would be
not the case of the builder which would still use MultiGetRequest.Item. This
introduces a new Item class which is used by both the builder and parser.
Beyond that the rest is mostly cleanups such as:
1) Deprecating the ignoreLike methods, in favor to using unlike.
2) Deprecating and renaming MoreLikeThisBuilder#addItem to addLikeItem.
3) Ordering the methods of MoreLikeThisBuilder more logically.
This change is needed for the upcoming query refactoring of MLT.
Closes#13372
This is an intial commit that splits HasChildQueryParser / Builder into
the two seperate steps. This one is particularly nasty since it transports
a pretty wild InnerHits object that needs heavy refactoring. Yet, this commit
has still some nocommits and needs more tests and maybe another cleanup but
it's a start to get the code out there.
This pipeline will calculate percentiles over a set of sibling buckets. This is an exact
implementation, meaning it needs to cache a copy of the series in memory and sort it to determine
the percentiles.
This comes with a few limitations: to prevent serializing data around, only the requested percentiles
are calculated (unlike the TDigest version, which allows the java API to ask for any percentile).
It also needs to store the data in-memory, resulting in some overhead if the requested series is
very large.
Until now we had a cloud-aws plugin which is providing 2 disctinct features:
* discovery on EC2
* snapshot/restore on S3
This commit splits the plugin by feature so people can use either one or the other or both features.
Doc is updated accordingly.
The shaded version of elasticsearch was built at the very beginning to avoid dependency conflicts in a specific case where:
* People use elasticsearch from Java
* People needs to embed elasticsearch jar within their own application (as it's today the only way to get a `TransportClient`)
* People also embed in their application another (most of the time older) version of dependency we are using for elasticsearch, such as: Guava, Joda, Jackson...
This conflict issue can be solved within the projects themselves by either upgrade the dependency version and use the one provided by elasticsearch or by shading elasticsearch project and relocating some conflicting packages.
Example
-------
As an example, let's say you want to use within your project `Joda 2.1` but elasticsearch `2.0.0-beta1` provides `Joda 2.8`.
Let's say you also want to run all that with shield plugin.
Create a new maven project or module with:
```xml
<groupId>fr.pilato.elasticsearch.test</groupId>
<artifactId>es-shaded</artifactId>
<version>1.0-SNAPSHOT</version>
<properties>
<elasticsearch.version>2.0.0-beta1</elasticsearch.version>
</properties>
<dependencies>
<dependency>
<groupId>org.elasticsearch</groupId>
<artifactId>elasticsearch</artifactId>
<version>${elasticsearch.version}</version>
</dependency>
<dependency>
<groupId>org.elasticsearch.plugin</groupId>
<artifactId>shield</artifactId>
<version>${elasticsearch.version}</version>
</dependency>
</dependencies>
```
And now shade and relocate all packages which conflicts with your own application:
```xml
<build>
<plugins>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-shade-plugin</artifactId>
<version>2.4.1</version>
<executions>
<execution>
<phase>package</phase>
<goals>
<goal>shade</goal>
</goals>
<configuration>
<relocations>
<relocation>
<pattern>org.joda</pattern>
<shadedPattern>fr.pilato.thirdparty.joda</shadedPattern>
</relocation>
</relocations>
</configuration>
</execution>
</executions>
</plugin>
</plugins>
</build>
```
You can create now a shaded version of elasticsearch + shield by running `mvn clean install`.
In your project, you can now depend on:
```xml
<dependency>
<groupId>fr.pilato.elasticsearch.test</groupId>
<artifactId>es-shaded</artifactId>
<version>1.0-SNAPSHOT</version>
</dependency>
<dependency>
<groupId>joda-time</groupId>
<artifactId>joda-time</artifactId>
<version>2.1</version>
</dependency>
```
Build then your TransportClient as usual:
```java
TransportClient client = TransportClient.builder()
.settings(Settings.builder()
.put("path.home", ".")
.put("shield.user", "username:password")
.put("plugin.types", "org.elasticsearch.shield.ShieldPlugin")
)
.build();
client.addTransportAddress(new InetSocketTransportAddress(new InetSocketAddress("localhost", 9300)));
// Index some data
client.prepareIndex("test", "doc", "1").setSource("foo", "bar").setRefresh(true).get();
SearchResponse searchResponse = client.prepareSearch("test").get();
```
If you want to use your own version of Joda, then import for example `org.joda.time.DateTime`. If you want to access to the shaded version (not recommended though), import `fr.pilato.thirdparty.joda.time.DateTime`.
You can run a simple test to make sure that both classes can live together within the same JVM:
```java
CodeSource codeSource = new org.joda.time.DateTime().getClass().getProtectionDomain().getCodeSource();
System.out.println("unshaded = " + codeSource);
codeSource = new fr.pilato.thirdparty.joda.time.DateTime().getClass().getProtectionDomain().getCodeSource();
System.out.println("shaded = " + codeSource);
```
It will print:
```
unshaded = (file:/path/to/joda-time-2.1.jar <no signer certificates>)
shaded = (file:/path/to/es-shaded-1.0-SNAPSHOT.jar <no signer certificates>)
```
This PR also removes fully-loaded module.
By the way, the project can now build with Maven 3.3.3 so we can relax a bit our maven policy.
Prior to 2.0 we summed up the available space on all disk on a node
due to the raid-0 like behavior. Now we don't do this anymore and use the
min & max disk space to make decisions.
Closes#13106
detect_noop is pretty cheap and noop updates compartively expensive so this
feels like a sensible default.
Also had to do some testing and documentation around how _ttl works with
detect_noop.
Closes#11282
Until a couple of hours ago we expected the position_offset_gap to default
to 0 in 2.0 and 100 in 2.1. We decided it was worth backporting that new
default to 2.0. So now that its backported we need to teach 2.1 that 2.0
also defaults to 100.
Closes#7268
This is much more fiddly than you'd expect it to be because of the way
position_offset_gap is applied in StringFieldMapper. Instead of setting
the default to 100 its simpler to make sure that all the analyzers default
to 100 and that StringFieldMapper doesn't override the default unless the
user specifies something different. Unless the index was created before
2.1, in which case the old default of 0 has to take.
Also postition_offset_gaps less than 0 aren't allowed at all.
New tests test that:
1. the new default doesn't match phrases across values with reasonably low
slop (5)
2. the new default doest match phrases across values with reasonably high
slop (50)
3. you can override the value and phrases work as you'd expect
4. if you leave the value undefined in the mapping and define it on a
custom analyzer the the value from the custom analyzer shines through
Closes#7268
The setting `plugin.types` is currently used to load plugins from the
classpath. This is necessary in tests, as well as the transport client.
This change removes the setting, and replaces it with the ability to
directly add plugins when building a transport client, as well as
infrastructure in the integration tests to specify which plugin classes
should be loaded on each node.
Multicast has known issues (see #12999 and #12993). This change moves
multicast into a plugin, and deprecates it in the docs. It also allows
for plugging in multiple zen ping implementations.
closes#13019
Fix unicast discovery to work when a host has multiple addresses.
Ban dangerous methods in java.net with forbidden APIs.
Fix ipv6 bugs and formatting of network addresses everywhere.
Closes#12999Closes#12993
Squashed commit of the following:
commit 6c1aa001d091c5cf25212a53dc701fb704337f1e
Author: Robert Muir <rmuir@apache.org>
Date: Thu Aug 20 14:25:43 2015 -0400
Fix these to be correct with addresses just in case
commit 648215627e84abf58a71400e7dc9ae775efb71d6
Merge: d00561b 41d8fbe
Author: Robert Muir <rmuir@apache.org>
Date: Thu Aug 20 13:23:09 2015 -0400
Merge branch 'master' into unicast_all_the_way_down
commit d00561b76fd1aa5850699f7901f3dae3d4d402b7
Author: Simon Willnauer <simonw@apache.org>
Date: Thu Aug 20 16:38:50 2015 +0200
limit local ports to 5 in UnicastZenPing
commit e2e15c594006746cbe24432694294a71cc99deb8
Author: Robert Muir <rmuir@apache.org>
Date: Thu Aug 20 10:32:47 2015 -0400
fix port limiting
commit 10153cb7adadda81a1f482445e703836b65cf5e2
Author: Robert Muir <rmuir@apache.org>
Date: Thu Aug 20 10:18:37 2015 -0400
don't serialize scopeids: that's broken
commit 2aa63d43db2baec68a2e9bc227cfeb85dfeb4f83
Author: Simon Willnauer <simonw@apache.org>
Date: Thu Aug 20 16:06:51 2015 +0200
restore @Network
commit c840f1d1ef438826ae1ecfd5e45942a0e30dc9c0
Author: Simon Willnauer <simonw@apache.org>
Date: Thu Aug 20 16:02:30 2015 +0200
Use NetworkAddress.formatAddress where applicable in plugins
commit 374ce878852b35d626b7a29c8c4773545b0e9ddd
Author: Simon Willnauer <simonw@apache.org>
Date: Thu Aug 20 15:34:06 2015 +0200
Use NetworkAddress.formatAddress where applicable
commit e7a606d63f1bc43c1b62b6e17adf707c76d43a15
Author: Simon Willnauer <simonw@apache.org>
Date: Thu Aug 20 10:17:57 2015 +0200
Add @Multicast annotation to disable multicast tests by default.
We only run multicast tests now when we explicitly state it. A working
multicast env is required which is not always the case.
commit 2d7d2d0347179696ab41f71f048b13305014c85b
Author: Simon Willnauer <simonw@apache.org>
Date: Thu Aug 20 09:51:28 2015 +0200
Remove extra check for local mode in InternalTestCluster
commit dda59ac39aa136d4687b9274c2692cd77f8b8f66
Author: Simon Willnauer <simonw@apache.org>
Date: Thu Aug 20 09:37:03 2015 +0200
Handle node mode across entire test cluster
We used static methods reading sys properties to define the node mode
per cluster. this had lots of problems when tests couldn't cope with
mixed or only local mode. Now we are passing it down to the cluster from the test
which allows to @SuppressNetworkMode / @SupressLocalMode on the test to force
consistent node configurations.
commit 058197b7a408318995c88ce7f6762e32348de0de
Author: Robert Muir <rmuir@apache.org>
Date: Thu Aug 20 03:19:14 2015 -0400
really ban InetSocketAddress's trappy method and break build and go to sleep, sorry
commit ac8779185aee1e17e6f5a81766290fdfc9c603ba
Author: Robert Muir <rmuir@apache.org>
Date: Thu Aug 20 03:16:52 2015 -0400
Ban methods that might surprisingly cause DNS lookups
commit e64fe3dff2b11503e5f2831eb9863d64f56c5538
Author: Robert Muir <rmuir@apache.org>
Date: Thu Aug 20 02:59:05 2015 -0400
Add unit test
commit f15434f20fb1a3691b1cc16028597d8fae937e05
Author: Robert Muir <rmuir@apache.org>
Date: Thu Aug 20 02:39:02 2015 -0400
fix ipv6 formatting bugs
commit 05c2c74098052c75fbb79ea1818a295ef2e03e30
Author: Robert Muir <rmuir@apache.org>
Date: Thu Aug 20 02:12:05 2015 -0400
format addresses correctly so I can actually read what comes out of our logs and stats apis
commit 4f9389dcf1e8925f23153c5eb271b4ce2294dbaf
Author: Robert Muir <rmuir@apache.org>
Date: Wed Aug 19 21:26:52 2015 -0400
ban dangerous methods in java.net
commit 6aacd4d9925f324903d1d099a6cf5f862aeaf677
Author: Robert Muir <rmuir@apache.org>
Date: Wed Aug 19 20:59:24 2015 -0400
ban lenient method
commit f466a842c60163d1f4554bdce8a4163edb534c2c
Author: Simon Willnauer <simonw@apache.org>
Date: Thu Aug 20 00:29:00 2015 +0200
fix tests to not mix local transport and zen unicast disco
commit 0de007a33b33fb68cf85cd86db4ca4f8ce10bbc9
Author: Simon Willnauer <simonw@apache.org>
Date: Thu Aug 20 00:10:07 2015 +0200
fix tests to not mix local transport and zen unicast disco
commit 539f6ca6e5137e0d496239adc8684688dedcc824
Author: Simon Willnauer <simonw@apache.org>
Date: Thu Aug 20 00:02:01 2015 +0200
fix tests to not mix local transport and zen unicast disco
commit 004c2881b25467f332acc8c9f9e92b1f0f9d314e
Author: Robert Muir <rmuir@apache.org>
Date: Wed Aug 19 17:51:45 2015 -0400
Fix multinode
commit 54113af325ce31571811c49fdaae89d5687be4ba
Author: Robert Muir <rmuir@apache.org>
Date: Wed Aug 19 17:36:45 2015 -0400
fix integration tests
commit 0156a77a56319d6b9737ec6a531992052e50bd59
Author: Simon Willnauer <simonw@apache.org>
Date: Wed Aug 19 23:32:18 2015 +0200
enable multicast in MulticastZenPingIT.java
commit 1791caa35da853ce0122485fa3fd4674c671ec6e
Author: Robert Muir <rmuir@apache.org>
Date: Wed Aug 19 17:23:16 2015 -0400
Fix constant
commit 22820b53e0b2dc9fd47145c2bc29ce912a8fd484
Author: Simon Willnauer <simonw@apache.org>
Date: Wed Aug 19 22:59:09 2015 +0200
give it some extra ids for local transport crazyness
commit b2138fafa94a8a085813fd48356df63e57ade5b3
Author: Simon Willnauer <simonw@apache.org>
Date: Wed Aug 19 22:51:42 2015 +0200
pass on local addresses from configured transport rather than hard code IP addresses
commit 1bf5de1f457b081e0ce262b57d2b55d39c434156
Author: Simon Willnauer <simonw@apache.org>
Date: Wed Aug 19 22:04:31 2015 +0200
fix PluggableTransportModuleIT.java to use local disco and detach port limit for node local disco
commit b6706eddfa04c43947c16551359ae98a463d34aa
Author: Robert Muir <rmuir@apache.org>
Date: Wed Aug 19 14:16:03 2015 -0400
Default to unicast discovery, with default host list of 127.0.0.1, [::1]
The documentation states that scrolls are automatically closed when all
documents are consumed, but this is not the case. I first tried to fix
the code to close scrolls automatically but this made REST tests fail
because clearing a scroll that is already closed returned a 4xx error
instead of a 2xx code, so this has probably been this way for a very long
time.
At the moment, when installing from an url, a user provides the plugin name on
the command line like:
* bin/plugin install [plugin-name] --url [url]
This can lead to problems when picking an already existing name from another
plugin, and can potentially overwrite plugins already installed with that name.
This, this PR introduces a mandatory `name` property to the plugin descriptor
file which replaces the name formerly provided by the user.
With the addition of the `name` property to the plugin descriptor file, the user
does not need to specify the plugin name any longer when installing from a file
or url. Because of this, all arguments to `plugin install` command are now
either treated as a symbolic name, a URL or a file without the need to specify
this with an explicit option.
The new syntax for `plugin install` is now:
bin/plugin install [name or url]
* downloads official plugin
bin/plugin install analysis-kuromoji
* downloads github plugin
bin/plugin install lmenezes/elasticsearch-kopf
* install from URL or file
bin/plugin install http://link.to/foo.zip
bin/plugin install file:/path/to/foo.zip
If the argument does not parse to a valid URL, it is assumed to be a name and the
download location is resolved like before. Regardless of the source location of
the plugin, it is extracted to a temporary directory and the `name` property from
the descriptor file is used to determine the final install location.
Relates to #12715
This move the `murmur3` field to the `mapper-murmur3` plugin and fixes its
defaults so that values will not be indexed by default, as the only purpose
of this field is to speed up `cardinality` aggregations on high-cardinality
string fields, which only requires doc values.
I also removed the `rehash` option from the `cardinality` aggregation as it
doesn't bring much value (rehashing is cheap) and allowed to remove the
coupling between the `cardinality` aggregation and the `murmur3` field.
Close#12874
When elasticsearch is configured by interface (or default: loopback interfaces),
bind to all addresses on the interface rather than an arbitrary one.
If the publish address is not specified, default it from the bound addresses
based on the following sort ordering:
* ipv4/ipv6 (java.net.preferIPv4Stack, defaults to true)
* ordinary addresses
* site-local addresses
* link local addresses
* loopback addresses
One one address is published, and multicast is still always over ipv4: these
need to be future improvements.
Closes#12906Closes#12915
Squashed commit of the following:
commit 7e60833312f329a5749f9a256b9c1331a956d98f
Author: Robert Muir <rmuir@apache.org>
Date: Mon Aug 17 14:45:33 2015 -0400
fix java 7 compilation oops
commit c7b9f3a42058beb061b05c6dd67fd91477fd258a
Author: Robert Muir <rmuir@apache.org>
Date: Mon Aug 17 14:24:16 2015 -0400
Cleanup/fix logic around custom resolvers
commit bd7065f1936e14a29c9eb8fe4ecab0ce512ac08e
Author: Robert Muir <rmuir@apache.org>
Date: Mon Aug 17 13:29:42 2015 -0400
Add some unit tests for utility methods
commit 0faf71cb0ee9a45462d58af3d1bf214e8a79347c
Author: Robert Muir <rmuir@apache.org>
Date: Mon Aug 17 12:11:48 2015 -0400
localhost all the way down
commit e198bb2bc0d1673288b96e07e6e6ad842179978c
Merge: b55d092 b93a75f
Author: Robert Muir <rmuir@apache.org>
Date: Mon Aug 17 12:05:02 2015 -0400
Merge branch 'master' into network_cleanup
commit b55d092811d7832bae579c5586e171e9cc1ebe9d
Author: Robert Muir <rmuir@apache.org>
Date: Mon Aug 17 12:03:03 2015 -0400
fix docs, fix another bug in multicast (publish host = bad here!)
commit 88c462eb302b30a82585f95413927a5cbb7d54c4
Author: Robert Muir <rmuir@apache.org>
Date: Mon Aug 17 11:50:49 2015 -0400
remove nocommit
commit 89547d7b10d68b23d7f24362e1f4782f5e1ca03c
Author: Robert Muir <rmuir@apache.org>
Date: Mon Aug 17 11:49:35 2015 -0400
fix http too
commit 9b9413aca8a3f6397b5031831f910791b685e5be
Author: Robert Muir <rmuir@apache.org>
Date: Mon Aug 17 11:06:02 2015 -0400
Fix transport / interface code
Next up: multicast and then http
* Centralised plugin docs in docs/plugins/
* Moved integrations into same docs
* Moved community clients into the clients section of the docs
* Removed docs/community
Closes#11734Closes#11724Closes#11636Closes#11635Closes#11632Closes#11630Closes#12046Closes#12438Closes#12579
This allows `path.shared_data` to be added to the security manager while
still allowing a custom `data_path` for indices using shadow replicas.
For example, configuring `path.shared_data: /tmp/foo`, then created an
index with:
```
POST /myindex
{
"index": {
"number_of_shards": 1,
"number_of_replicas": 1,
"data_path": "/tmp/foo/bar/baz",
"shadow_replicas": true
}
}
```
The index will then reside in `/tmp/foo/bar/baz`.
`path.shared_data` defaults to `null` if not specified.
Resolves#12714
Relates to #11065
Instead of logging the entire `_source` in the indexing slowlog we log by
default just the first 1000 characters - this is controlled by the
`index.indexing.slowlog.source` settings and can be set to `true` to log the
whole `_source`, `false` to log none of it, and a number to log at most that
many characters.
Closes#4485
Due to the limited abilities of parsing of dynamic (not configured) arguments
like `http.cors.enabled`, that dont map to a command line argument but will
become configuration, we need to mention explicitely, that those dynamic arguments
must come last.
Also fixed some mentions of a memory index setting, that does not exist anymore.
Closes#12758
This commit adds basic support to track the number of times scripts are
compiled and compiled scripts are evicted from the script cache. These
statistics are tracked at the node level.
Closes#12673
This commit updates the Zen Discovery documentation to explain which
nodes partcipate in master election (by default) as well as the
configuration parameters for controlling this.
Closes#12727
The `multi_match` query groups terms that have the same analyzer together and
then applies the boost of the first query in each group. This is not necessary
given that boosts for each term are already applied another way.
Today this is "unofficial" as conf/scripts, but some people
want to share scripts across different nodes and so on. Because
they cannot configure it, they are forced to use dirty hacks
like symbolic links, which isnt going to work: we aren't going
to recursively scan conf/ and add permissions to all link targets
underneath it, thats crazy.
I really hate adding yet another configuration knob here, but
users resorting to using symlinks are going to be frustrated,
and do things in a more insecure way.
In order to ensure, we have the same experience across operating systems
and shells, this commit uses the java CLI parser instead of the shell
getopt parsing to parse arguments.
This also allows for support for paths, which contain spaces.
Also commons-cli depdency was upgraded to 1.3.1 and tests have been added.
Changes
* new exit code, OK_AND_EXIT, allowing to tell the caller to exit, as everything
went as expected (e.g. when running a version output)
BWC breaking:
* execute() returns an ExitStatus instead of an integer, otherwise there is no
possibility to signal by a command, if the JVM should be exited after a run.
This affects plugins, that have command line tools
* -v used to be version, but is a verbose flag by default in the current CLI infra,
must be -V or --version now
* -X has been removed - the current implementation was useless anyway, as
it prefixed those properties with "es.". You should use
ES_JAVA_OPTS/JAVA_OPTS for JVM configuration
Date math index name resolution enables you to search a range of time-series indices, rather than searching all of your time-series indices and filtering the the results or maintaining aliases. Limiting the number of indices that are searched reduces the load on the cluster and improves execution performance. For example, if you are searching for errors in your daily logs, you can use a date math name template to restrict the search to the past two days.
The added `ExpressionResolver` implementation that is responsible for resolving date math expressions in index names. This resolver is evaluated before wildcard expressions are evaluated.
The supported format: `<static_name{date_math_expr{date_format|timezone_id}}>` and the date math expressions must be enclosed within angle brackets. The `date_format` is optional and defaults to `YYYY.MM.dd`. The `timezone_id` id is optional too and defaults to `utc`.
The `{` character can be escaped by places `\\` before it.
Closes#12059
HDRHistogram has been added as an option in the percentiles and percentile_ranks aggregation. It has one option `number_significant_digits` which controls the accuracy and memory size for the algorithm
Closes#8324
The TransportSingleCustomOperationAction `prefer_local` option has been removed as it isn't worth the effort.
The TransportSingleShardAction will execute the operation on the receiving node if a concrete list doesn't provide a list of candite shards routings to perform the operation on.
Moving the query building functionality from the parser to the builders
new toQuery() method analogous to other recent query refactorings.
Relates to #10217
Moving the query building functionality from the parser to the builders
new doToQuery() method analogous to other recent query refactorings.
Relates to #10217Closes#12365
The release and smoke test python scripts used to install
plugins in the old fashion.
Also the BATS testing suite installed/removed plugins in that
way. Here the marvel tests have been removed, as marvel currently
does not work with the master branch.
In addition documentation has been updated as well, where it was
still missing.
In order to unify the handling and reuse the CLITool infrastructure
the plugin manager should make use of this as well.
This obsolets the -i and --install options but requires the user
to use `install` as the first argument of the CLI.
This is basically just a port of the existing functionality, which
is also the reason why this is not a refactoring of the plugin manager,
which will come in a separate commit.
The `_index` field is now a completely virtual field thanks
to #12027. It is no longer necessary to index the actual value
of the index name.
closes#12329
Store information reports on which nodes shard copies exist, the shard
copy version, indicating how recent they are, and any exceptions
encountered while opening the shard index or from earlier engine failure.
closes#10952
Just like specifying `?preference=_primary`, this adds the ability to
specify `?preference=_replica` or `?preference=_replica_first` on
requests that support it.
Resolves#12222
During master election each node pings in order to discover other nodes and validate the liveness of existing nodes. Based on this information the node either discovers an existing master or, if enough nodes are found (based on `discovery.zen.minimum_master_nodes>>) a new master will be elected.
Currently, the node that is elected as master will currently update it the cluster state to indicate the result of the election. Other nodes will submit a join request to the newly elected master node. Instead of immediately processing the election result, the elected master
node should wait for the incoming joins from other nodes, thus validating the elections result is properly applied. As soon as enough nodes have sent their joins request (based on the `minimum_master_nodes` settings) the cluster state is modified.
Note that if `minimum_master_nodes` is not set, this change has no effect.
Closes#12161
Require urls for URL repository to be listed in repositories.url.allowed_urls setting. This change ensures that only authorized URLs can be accessed by elasticsearch
Lucene deprecated this in 4.0 and we only try best effort to support it.
Folks should only use edit distance rather than some length based
similarity. Yet the formular is simple enough such that users can
still do it in the client if they really need to.
Closes#10638
Change the default delayed allocation timeout from 0 (no delayed allocation) to 1m. The value came from a test of having a node with 50 shards being indexed into (so beefy translog requiring flush on shutdown), then shutting it down and starting it back up and waiting for it to join the cluster. This took, on a slow machine, about 30s.
The value is conservatively low and does not try to address a virtual machine / OS restart for now, in order to not have the affect of node going away and users being concerned that shards are not being allocated to the rest of the cluster as a result of that. The setting can always be changed in order to increase the delayed allocation if needed.
closes#12166
This PR is a simple doc patch to explicitly mention with an example of
how to create an alias using a glob pattern. This comes up from
time-to-time with our customers and in the community and although
mentioned in the documentation already, is not obvious.
Also mention that the alias will not auto-update as indices matching the
glob change.
Closes#12175Closes#12176
ignore_above is used to guard against the lucene limitation
that a term cannot exceed 32766 bytes.
However, the implementation just used the character count, which
doesn't take into account the fact that some characters have
multi-byte utf-8 encodings.
This commit updates the docs to make this relationship clear.
Closes#11563
Moving the query building functionality from the parser to the builders
new toQuery() method analogous to other recent query refactorings.
Relates to #10217
Add support for retrieving fields in bulk updates
This commit adds support to retrieve fields when using the bulk update API. This functionality was previously available for the update API
but not for the bulk update API.
Closes#11527
This commit adds support to retrieve fields when using the bulk update API. This functionality was previously available for the update API
but not for the bulk update API.
Closes#11527
Fixed documentation since the default rewrite method for fuzzy queries is to
select top terms, fixed usage of the fuzzy rewrite method, and removed unused
`rewrite` parameter.
Close#6932
This rewrite method is interesting because it computes scores as if all terms
had the same frequencies, which avoids disappointments with ranking when a fuzzy
query ranks typos first given that they are less frequent than the correct term.
Plugin Manager can now use another simplified form when a user wants to install an official plugin hosted at elasticsearch download service.
The form we use is:
```sh
bin/plugin install pluginname
```
As plugins share now the same version as elasticsearch, we can automatically guess what is the exact current version of the plugin manager script.
Also, download service will now use `/org.elasticsearch.plugins/pluginName/pluginName-version.zip` URL path to download a plugin.
If the older form is provided (`user/plugin/version` or `user/plugin`), we will still use:
* elasticsearch download service at `/user/plugin/plugin-version.zip`
* maven central with groupIp=user, artifactId=plugin and version=version
* github with user=user, repoName=plugin and tag=version
* github with user=user, repoName=plugin and branch=master if no version is set
Note that community plugin providers can use other download services by using `--url` option.
If you try to use the new form with a non core elasticsearch plugin, the plugin manager will reject
it and will give you all known core plugins.
```
Usage:
-u, --url [plugin location] : Set exact URL to download the plugin from
-i, --install [plugin name] : Downloads and installs listed plugins [*]
-t, --timeout [duration] : Timeout setting: 30s, 1m, 1h... (infinite by default)
-r, --remove [plugin name] : Removes listed plugins
-l, --list : List installed plugins
-v, --verbose : Prints verbose messages
-s, --silent : Run in silent mode
-h, --help : Prints this help message
[*] Plugin name could be:
elasticsearch-plugin-name for Elasticsearch 2.0 Core plugin (download from download.elastic.co)
elasticsearch/plugin/version for elasticsearch commercial plugins (download from download.elastic.co)
groupId/artifactId/version for community plugins (download from maven central or oss sonatype)
username/repository for site plugins (download from github master)
Elasticsearch Core plugins:
- elasticsearch-analysis-icu
- elasticsearch-analysis-kuromoji
- elasticsearch-analysis-phonetic
- elasticsearch-analysis-smartcn
- elasticsearch-analysis-stempel
- elasticsearch-cloud-aws
- elasticsearch-cloud-azure
- elasticsearch-cloud-gce
- elasticsearch-delete-by-query
- elasticsearch-lang-javascript
- elasticsearch-lang-python
```
This pipeline aggregation runs a script on each bucket in the parent aggregation to determine whether the bucket is kept in the final aggregation tree. If the script returns true the bucket is retained, if it returns false the bucket is dropped
If you are using the default date or the named identifiers of dates,
the current implementation was allowed to read a year with only one
digit. In order to make this more strict, this fixes a year to be at
least 4 digits. Same applies for month, day, hour, minute, seconds.
Also the new default is `strictDateOptionalTime` for indices created
with Elasticsearch 2.0 or newer.
In addition a couple of not exposed date formats have been exposed, as they
have been mentioned in the documentation.
Closes#6158
Moving the query building functionality from the parser to the builders
new toQuery() method analogous to other recent query refactorings.
Relates to #10217
Currently parsing inner queries can return `null` which leads to unnecessary complicated
null checks further up the query hierarchy. By introducing a special EmptyQueryBuilder
that can be used to signal that this query clause should be ignored upstream where possible,
we can avoid additional null checks in parent query builders and still allow for this query
to be ignored when creating the lucene query later.
This new builder returns `null` when calling its `toQuery()` method, so at this point
we still need to handle it, but having an explicit object for the intermediate query
representation makes it easier to differentiate between cases where inner query clauses are
empty (legal) or are not set at all (illegal).
Also added check for empty inner builder list to BoolQueryBuilder and OrQueryBuilder.
Removed setters for positive/negatice query in favour of constructor with
mandatory non-null arguments in BoostingQueryBuilder.
Closes#11915
The `:ref:` link in java-api doc is connected to `current` version which is at this time `1.6`.
This commit patch this.
That being said, we might have to change it again once master will become `current` doc.
This commit reorganizes the docs to make Java API docs looking more like the REST docs.
Also, with 2.0.0, FilterBuilders don't exist anymore but only QueryBuilders.
Also, all docs api move now to docs/java-api/docs dir as for REST doc.
Remove removed queries/filters
-----
* Remove Constant Score Query with filter
* Remove Fuzzy Like This (Field) Query (flt and flt_field)
* Remove FilterBuilders
Move filters to queries
-----
* Move And Filter to And Query
* Move Bool Filter to Bool Query
* Move Exists Filter to Exists Query
* Move Geo Bounding Box Filter to Geo Bounding Box Query
* Move Geo Distance Filter to Geo Distance Query
* Move Geo Distance Range Filter to Geo Distance Range Query
* Move Geo Polygon Filter to Geo Polygon Query
* Move Geo Shape Filter to Geo Shape Query
* Move Has Child Filter by Has Child Query
* Move Has Parent Filter by Has Parent Query
* Move Ids Filter by Ids Query
* Move Limit Filter to Limit Query
* Move MatchAll Filter to MatchAll Query
* Move Missing Filter to Missing Query
* Move Nested Filter to Nested Query
* Move Not Filter to Not Query
* Move Or Filter to Or Query
* Move Range Filter to Range Query
* Move Ids Filter to Ids Query
* Move Term Filter to Term Query
* Move Terms Filter to Terms Query
* Move Type Filter to Type Query
Add missing queries
-----
* Add Common Terms Query
* Add Filtered Query
* Add Function Score Query
* Add Geohash Cell Query
* Add Regexp Query
* Add Script Query
* Add Simple Query String Query
* Add Span Containing Query
* Add Span Multi Term Query
* Add Span Within Query
Reorganize the documentation
-----
* Organize by full text queries
* Organize by term level queries
* Organize by compound queries
* Organize by joining queries
* Organize by geo queries
* Organize by specialized queries
* Organize by span queries
* Move Boosting Query
* Move DisMax Query
* Move Fuzzy Query
* Move Indices Query
* Move Match Query
* Move Mlt Query
* Move Multi Match Query
* Move Prefix Query
* Move Query String Query
* Move Span First Query
* Move Span Near Query
* Move Span Not Query
* Move Span Or Query
* Move Span Term Query
* Move Template Query
* Move Wildcard Query
Add some missing pages
----
* Add multi get API
* Add indexed-scripts link
Also closes#7826
Related to https://github.com/elastic/elasticsearch/pull/11477#issuecomment-114745934
The work around for resolving `now` doesn't need to be used for aliases, becuase alias filters are parsed at search time. However it can't be removed, because the percolator relies on it.
Parent/child can be specified again in alias filters, this now works again because alias filters are parsed at search time. Parent/child will also use the late query parse work around, to make sure to do the final preparations when the search context is around. This allows the aliases api to validate the parent/child queries without failing because there is no search context.
Closes#10485
Following the discussion in #11744, move boost and query _name to base class AbstractQueryBuilder with their getters and setters. Unify their serialization code and equals/hashcode handling in the base class too. This guarantess that every query supports both _name and boost and nothing needs to be done around those in subclasses besides properly parsing the fields in the parsers and printing them out as part of the doXContent method in the builders. More specifically, these are the performed changes:
- Introduced printBoostAndQueryName utility method in AbstractQueryBuilder that subclasses can use to print out _name and boost in their doXContent method.
- readFrom and writeTo are now final methods that take care of _name and boost serialization. Subclasses have to implement doReadFrom and doWriteTo instead.
- toQuery is a final method too that takes care of properly applying _name and boost to the lucene query. Subclasses have to implement doToQuery instead. The query returned will have boost and queryName applied automatically.
- Removed BoostableQueryBuilder interface, given that every query is boostable after this change. This won't have any negative effect on filters, as the boost simply gets ignored in that case.
- Extended equals and hashcode to handle queryName and boost automatically as well.
- Update the query test infra so that queryName and boost are tested automatically, and whenever they are forgotten in parser or doXContent tests fail, so this makes things a lot less error-prone
- Introduced DEFAULT_BOOST constant to make sure we don't repeat 1.0f all the time for default boost values.
SpanQueryBuilder is again a marker interface only. The convenient toQuery that allowed us to override the return type to SpanQuery cannot be supported anymore due to a clash with the toQuery implementation from AbstractQueryBuilder. We have to go back to castin lucene Query to SpanQuery when dealing with span queries unfortunately.
Note that this change touches not only the already refactored queries but also the untouched ones, by making sure that we parse _name and boost whenever we need to and that we print them out as part of QueryBuilder#doXContent. This will result in printing out the default boost all the time rather than skipping it in non refactored queries, something that we would have changed anyway as part of the query refactoring.
The following are the queries that support boost now while previously they didn't (parser now parses it and builder prints it out): and, exists, fquery, geo_bounding_box, geo_distance, geo_distance_range, geo_hash_cell, geo_polygon, indices, limit, missing, not, or, script, type.
The following are the queries that support _name now while previously they didn't (parser now parses it and builder prints it out): boosting, constant_score, function_score, limit, match_all, type.
Range query parser supports now _name at the same level as boost too (_name is still supported on the outer object though for bw comp).
There are two exceptions that despite have getters and setters for queryName and boost don't really support boost and queryName: query filter and span multi term query. The reason for this is that they only support a single inner object which is another query that they wrap, no other elements.
Relates to #11744Closes#10776Closes#11974
The filters aggregation now has an option to add an 'other' bucket which will, when turned on, contain all documents which do not match any of the defined filters. There is also an option to change the name of the 'other' bucket from the default of '_other_'
Closes#11289
Field stats index constraints allows to omit all field stats for indices that don't match with the constraint. An index
constraint can exclude indices' field stats based on the `min_value` and `max_value` statistic. This option is only
useful if the `level` option is set to `indices`.
For example index constraints can be useful to find out the min and max value of a particular property of your data in
a time based scenario. The following request only returns field stats for the `answer_count` property for indices
holding questions created in the year 2014:
curl -XPOST 'http://localhost:9200/_field_stats?level=indices' -d '{
"fields" : ["answer_count"] <1>
"index_constraints" : { <2>
"creation_date" : { <3>
"min_value" : { <4>
"gte" : "2014-01-01T00:00:00.000Z",
},
"max_value" : {
"lt" : "2015-01-01T00:00:00.000Z"
}
}
}
}'
Closes#11187