This commit adds a shell script which:
* moves the current elasticsearch core source into `/core`
* fetches the `elasticsearch-parent` project into `/`
* fetches plugins into `/plugins`
* changes the `groupId` of all plugins to `org.elasticsearch.plugin` so versions won't conflict in maven central
* removes the `plugins/name/dev-tools` dir, which is not needed anymore
* removes the `plugins/name/.git` dir
* removes the `plugins/name/LICENSE` and `plugins/name/CONTRIBUTING` files
* cleans `core/pom.xml` of useless settings that are inherited from the parent project
* adapts `core/pom.xml` to the new location of the rest tests definition (`../`)
* changes the core name to `Elasticsearch Core`
* removes the `plugins` dir from `.gitignore`
Plugins added:
* Analysis
  * analysis-kuromoji
  * analysis-smartcn
  * analysis-stempel
  * analysis-phonetic
  * analysis-icu
* Mapper
  * mapper-attachments
* Language
  * lang-python
  * lang-mvel
  * lang-javascript
* Cloud
  * cloud-gce
  * cloud-azure
  * cloud-aws
River plugins are ignored but might be added if we want to.
Todo:
* check and adapt our release tool. It now has to upload all submodules as well.
* adapt Jenkins jobs
This catches UnsatisfiedLinkError when attempting to load the JNA Native
class, in cases where there are errors loading the native libraries that JNA
needs to function.
Today, JNA is an optional dependency in the build, but when running tests or running
with mlockall set to true, JNA must be on the classpath on Windows systems, since
we always try to load JNA classes when using mlockall.
The old Natives class was renamed to JNANatives, and a new Natives class is
introduced without any direct imports of JNA classes. The Natives class checks
whether the JNA classes are available at startup. If they are, Natives
delegates to JNANatives; if they are not, Natives never touches JNANatives,
which results in no additional attempts to load JNA classes.
Additionally, all of the JNA classes were moved to the bootstrap package and made
package private as this is the only place they should be called from.
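A minimal sketch of that delegation pattern, with an illustrative tryMlockall method; the real bootstrap code is more involved:

```java
// Sketch only: probes for JNA via reflection so this class never imports
// JNA types directly. Method names here are illustrative.
final class Natives {

    private static final boolean JNA_AVAILABLE;

    static {
        boolean available = false;
        try {
            // Probing for the class may throw ClassNotFoundException (JNA
            // absent) or UnsatisfiedLinkError (native libraries failed).
            Class.forName("com.sun.jna.Native");
            available = true;
        } catch (ClassNotFoundException | UnsatisfiedLinkError e) {
            // leave available = false; no further JNA loading is attempted
        }
        JNA_AVAILABLE = available;
    }

    static void tryMlockall() {
        if (JNA_AVAILABLE) {
            // only this branch references the JNA-importing class
            JNANatives.tryMlockall();
        }
    }
}

final class JNANatives {
    // stands in for the real class, which calls into JNA
    static void tryMlockall() { /* com.sun.jna.* calls live here */ }
}
```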
Closes #11360
The tests for the recently added wildcard feature relied on the iteration
order of the HashMap being used, which could differ between runs.
The implementation now ensures that the header fields are
parsed in the order they were added.
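A minimal illustration of the underlying problem, assuming the fix swaps a HashMap for an insertion-ordered map such as LinkedHashMap:

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

public class HeaderOrderDemo {
    public static void main(String[] args) {
        // HashMap iteration order depends on hash buckets, not insertion order
        Map<String, String> unordered = new HashMap<>();
        // LinkedHashMap iterates in the order entries were put(), giving
        // deterministic results for order-sensitive assertions
        Map<String, String> ordered = new LinkedHashMap<>();
        for (String field : new String[] { "zeta", "alpha", "mid" }) {
            unordered.put(field, "v");
            ordered.put(field, "v");
        }
        System.out.println("HashMap:       " + unordered.keySet()); // arbitrary
        System.out.println("LinkedHashMap: " + ordered.keySet());   // [zeta, alpha, mid]
    }
}
```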
When we highlight on fields using wildcards, the pattern might match fields
that cannot be highlighted by the specified highlighter, and the whole search
request then failed. Instead, check that the field can be highlighted and
ignore the field if it can't; a sketch of this follows below.
In addition, ignore the exception thrown by the plain highlighter if a field
contains terms larger than 32766 bytes.
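A hypothetical sketch of that skip-instead-of-fail behavior; canHighlight and highlightField are illustrative names, not the actual highlighter API:

```java
import java.util.ArrayList;
import java.util.List;

class HighlightFieldsSketch {
    interface Highlighter {
        boolean canHighlight(String field);
        String highlightField(String field); // may throw at runtime
    }

    static List<String> highlight(Highlighter highlighter, List<String> wildcardMatches) {
        List<String> fragments = new ArrayList<>();
        for (String field : wildcardMatches) {
            // a wildcard may match fields this highlighter cannot handle:
            // skip them instead of failing the whole search request
            if (!highlighter.canHighlight(field)) {
                continue;
            }
            try {
                fragments.add(highlighter.highlightField(field));
            } catch (RuntimeException e) {
                // e.g. the plain highlighter rejecting a term over 32766
                // bytes; ignore this field rather than propagating the failure
            }
        }
        return fragments;
    }
}
```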
Closes #9881
The moving average predict code generated incorrect keys if the key for the first bucket of the histogram was < 0. This fix makes the moving average use the rounding class from the histogram to generate the keys for the new buckets, as sketched below.
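A minimal sketch of the idea, under the assumption that the histogram's rounding exposes a next-bucket-key operation (the interface here is illustrative):

```java
// Illustrative stand-in for the histogram's rounding helper.
interface Rounding {
    long nextRoundingValue(long key);
}

class MovAvgPredictSketch {
    // Generates keys for the predicted buckets by repeatedly asking the
    // rounding for the next bucket key, instead of deriving them from the
    // (possibly negative) first bucket key, which produced wrong keys.
    static long[] predictKeys(long lastBucketKey, int numPredictions, Rounding rounding) {
        long[] keys = new long[numPredictions];
        long key = lastBucketKey;
        for (int i = 0; i < numPredictions; i++) {
            key = rounding.nextRoundingValue(key);
            keys[i] = key;
        }
        return keys;
    }
}
```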
Closes #11369
Updated to not mislead the reader into thinking that the data is actually gone when a document is updated. For example, if you have 100GB of docs and update each one, you'll only be able to access 100GB of the data, but there would theoretically be 200GB of doc data.
Closes #10375
As a follow up to #11332, this commit simplifies more class names by removing the superfluous Operation:
TransportBroadcastOperationAction -> TransportBroadcastAction
TransportMasterNodeOperationAction -> TransportMasterNodeAction
TransportMasterNodeReadOperationAction -> TransportMasterNodeReadAction
TransportShardSingleOperationAction -> TransportSingleShardAction
Closes #11349
Instead of listing the directory to find the latest segments_N file, we
should re-use the generation/filename from the last commit. This allows
us to avoid potential race conditions on the filesystem as well as
reduce the number of directory listings performed.
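A sketch of the idea using Lucene's SegmentInfos, assuming the infos from the last commit are kept around as described:

```java
import org.apache.lucene.index.SegmentInfos;

class LastCommitSketch {
    private volatile SegmentInfos lastCommittedSegmentInfos;

    void onCommit(SegmentInfos committed) {
        // remember the segment infos of the commit we just made
        this.lastCommittedSegmentInfos = committed;
    }

    String latestSegmentsFileName() {
        // re-use the filename/generation recorded at commit time instead of
        // calling Directory#listAll() and scanning for the highest segments_N
        return lastCommittedSegmentInfos.getSegmentsFileName();
    }

    long latestGeneration() {
        return lastCommittedSegmentInfos.getGeneration();
    }
}
```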
The count request now acts like search and barfs if all shards fail.
This behavior changed, and some tests like RecoveryWhileUnderLoadTests
relied on the lenient behavior of the old count API. This might be
a temporary solution to stop the current test failures.
Relates to #11198
The count api used to have its own execution path, although it did the same thing (up to bugs!) as the search api. This commit makes it a shortcut to the search api with size set to 0. The change is made in a backwards compatible manner by leaving all of the java api code around too, given that you may not want to get back a whole SearchResponse when asking only for the number of hits matching a query, and also because migrating from countResponse.getCount() to searchResponse.getHits().totalHits() doesn't look great from a user perspective. We can always decide to drop more code around the count api if we want to break backwards compatibility on the java api, making it a shortcut on the rest layer only.
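For reference, what the migration looks like with the Java client, assuming the contemporary client API as named in the text above:

```java
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.Client;
import org.elasticsearch.index.query.QueryBuilder;

class CountViaSearch {
    // Counts matching documents by running a search with size 0, replacing
    // countResponse.getCount() with searchResponse.getHits().totalHits().
    static long count(Client client, String index, QueryBuilder query) {
        SearchResponse response = client.prepareSearch(index)
                .setSize(0)       // no hits returned, only the total count
                .setQuery(query)
                .get();
        return response.getHits().totalHits();
    }
}
```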
Closes #9117
Closes #11198
Add support for dedicated deprecation logging that can be turned on
in order to notify users of a specific feature, flag, setting,
parameter, ... being deprecated.
The deprecation logger logs with a "deprecation." prefixed logger name
(or "org.elasticsearch.deprecation." if the full name is used), and outputs
the logging to a dedicated deprecation log file.
Deprecation messages are logged under the DEBUG category. The idea is not to
enable them by default (under WARN or ERROR) when running embedded in
another application.
By default they are turned off (INFO). In order to turn them on, the
"deprecation" category needs to be set to DEBUG. This can be set in the
logging file or using the cluster update settings API; see the documentation.
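A hypothetical, dependency-free sketch of the mechanism (the class name, method name, and the java.util.logging mapping of DEBUG to FINE are illustrative, not the actual Elasticsearch API):

```java
import java.util.logging.Level;
import java.util.logging.Logger;

final class DeprecationLoggerSketch {
    private final Logger logger;

    DeprecationLoggerSketch(String name) {
        // prefix the category so deprecation output can be routed to its own
        // log file and enabled independently of the regular logs
        this.logger = Logger.getLogger("deprecation." + name);
    }

    void deprecated(String message) {
        // DEBUG-level (FINE here), so messages are off by default and only
        // show up once the "deprecation" category is set to DEBUG
        this.logger.log(Level.FINE, message);
    }
}

// usage (placeholder names):
// new DeprecationLoggerSketch("org.elasticsearch.index")
//         .deprecated("[some_setting] is deprecated, use [new_setting] instead");
```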
Closes #11033
In order to be sure that a release can be executed on the local machine,
the build_release script now checks for environment variables and tries
to execute a couple of commands.
In order to easily check for a correctly set up environment, you can
run the following command, which exits early and does not trigger the
release process.
```
python3 dev-tools/build_release.py --check-only
```
This change adds a new "filter_path" parameter that can be used to filter and reduce the responses returned by the REST API of elasticsearch.
For example, returning only the shards that failed to be optimized:
```
curl -XPOST 'localhost:9200/beer/_optimize?filter_path=_shards.failed'
{"_shards":{"failed":0}}%
```
It supports multiple filters (separated by a comma):
```
curl -XGET 'localhost:9200/_mapping?pretty&filter_path=*.mappings.*.properties.name,*.mappings.*.properties.title'
```
It also supports the YAML response format. Here it returns only the `_id` field of a newly indexed document:
```
curl -XPOST 'localhost:9200/library/book?filter_path=_id' -d '---hello:\n world: 1\n'
---
_id: "AU0j64-b-stVfkvus5-A"
```
It also supports wildcards. Here it returns only the host name of every node in the cluster:
```
curl -XGET 'http://localhost:9200/_nodes/stats?filter_path=nodes.*.host*'
{"nodes":{"lvJHed8uQQu4brS-SXKsNA":{"host":"portable"}}}
```
And "**" can be used to include sub fields without knowing the exact path. Here it returns only the Lucene version of every segment:
```
curl 'http://localhost:9200/_segments?pretty&filter_path=indices.**.version'
{
  "indices" : {
    "beer" : {
      "shards" : {
        "0" : [ {
          "segments" : {
            "_0" : {
              "version" : "5.2.0"
            },
            "_1" : {
              "version" : "5.2.0"
            }
          }
        } ]
      }
    }
  }
}
```
Note that elasticsearch sometimes returns the raw value of a field directly, like the `_source` field. If you want to filter `_source` fields, you should consider combining the already existing `_source` parameter (see Get API for more details) with the `filter_path` parameter like this:
```
curl -XGET 'localhost:9200/_search?pretty&filter_path=hits.hits._source&_source=title'
{
  "hits" : {
    "hits" : [ {
      "_source":{"title":"Book #2"}
    }, {
      "_source":{"title":"Book #1"}
    }, {
      "_source":{"title":"Book #3"}
    } ]
  }
}
```
We used to keep track of the rewritten query in the highlighter context to support custom rewriting done by our own postings highlighter fork. Now that we rely on the lucene implementation, no rewrite happens, so we can simply keep track of the original query and simplify the code around it.
Closes #11317