Now the blob size information is available before writing anything,
the repository implementation can know upfront what will be the
more suitable API to upload the blob to S3.
This commit removes the DefaultS3OutputStream and S3OutputStream
classes and moves the implementation of the upload logic directly in the
S3BlobContainer.
related #26993closes#26969
You can define a proxy using the following settings:
```yml
azure.client.default.proxy.host: proxy.host
azure.client.default.proxy.port: 8888
azure.client.default.proxy.type: http
```
Supported values for `proxy.type` are `direct`, `http` or `socks`. Defaults to `direct` (no proxy).
Closes#23506
BTW I changed a test `testGetSelectedClientBackoffPolicyNbRetries` as it was using an old setting name `cloud.azure.storage.azure.max_retries` instead of `azure.client.azure1.max_retries`.
Follow up for #23405.
We remove azure deprecated settings in 7.0:
* The legacy azure settings which where starting with `cloud.azure.storage.` prefix have been removed.
This includes `account`, `key`, `default` and `timeout`.
You need to use settings which are starting with `azure.client.` prefix instead.
* Global timeout setting `cloud.azure.storage.timeout` has been removed.
You must set it per azure client instead. Like `azure.client.default.timeout: 10s` for example.
All of the snippets in our docs marked with `// TESTRESPONSE` are
checked against the response from Elasticsearch but, due to the
way they are implemented they are actually parsed as YAML instead
of JSON. Luckilly, all valid JSON is valid YAML! Unfurtunately
that means that invalid JSON has snuck into the exmples!
This adds a step during the build to parse them as JSON and fail
the build if they don't parse.
But no! It isn't quite that simple. The displayed text of some of
these responses looks like:
```
{
...
"aggregations": {
"range": {
"buckets": [
{
"to": 1.4436576E12,
"to_as_string": "10-2015",
"doc_count": 7,
"key": "*-10-2015"
},
{
"from": 1.4436576E12,
"from_as_string": "10-2015",
"doc_count": 0,
"key": "10-2015-*"
}
]
}
}
}
```
Note the `...` which isn't valid json but we like it anyway and want
it in the output. We use substitution rules to convert the `...`
into the response we expect. That yields a response that looks like:
```
{
"took": $body.took,"timed_out": false,"_shards": $body._shards,"hits": $body.hits,
"aggregations": {
"range": {
"buckets": [
{
"to": 1.4436576E12,
"to_as_string": "10-2015",
"doc_count": 7,
"key": "*-10-2015"
},
{
"from": 1.4436576E12,
"from_as_string": "10-2015",
"doc_count": 0,
"key": "10-2015-*"
}
]
}
}
}
```
That is what the tests consume but it isn't valid JSON! Oh no! We don't
want to go update all the substitution rules because that'd be huge and,
ultimately, wouldn't buy much. So we quote the `$body.took` bits before
parsing the JSON.
Note the responses that we use for the `_cat` APIs are all converted into
regexes and there is no expectation that they are valid JSON.
Closes#26233
The environment variable CONF_DIR was previously inconsistently used in
our packaging to customize the location of Elasticsearch configuration
files. The importance of this environment variable has increased
starting in 6.0.0 as it's now used consistently to ensure Elasticsearch
and all secondary scripts (e.g., elasticsearch-keystore) all use the
same configuration. The name CONF_DIR is there for legacy reasons yet
it's too generic. This commit renames CONF_DIR to ES_PATH_CONF.
Relates #26197
We should have the same behavior for Azure repositories as we have for S3 (see #22762).
Instead of:
```yml
cloud:
azure:
storage:
my_account1:
account: your_azure_storage_account1
key: your_azure_storage_key1
default: true
my_account2:
account: your_azure_storage_account2
key: your_azure_storage_key2
```
Support something like:
```
azure.client:
default:
account: your_azure_storage_account1
key: your_azure_storage_key1
my_account2:
account: your_azure_storage_account2
key: your_azure_storage_key2
```
Then instead of:
```
PUT _snapshot/my_backup3
{
"type": "azure",
"settings": {
"account": "my_account2"
}
}
```
Use:
```
PUT _snapshot/my_backup3
{
"type": "azure",
"settings": {
"config": "my_account2"
}
}
```
If someone uses:
```
PUT _snapshot/my_backup3
{
"type": "azure"
}
```
It will use the `default` azure repository settings.
And mark as deprecated old settings.
Closes#22763.
This commit updates the s3 repository docs to clearly mark settings as
part of the s3 client settings, as well as those that are secure and
must be stored in the elasticsearch keystore.
relates #25619
The s3 repository plugin has "third party" integ tests which rely
on external service and configuration setup. These tests are really
internal verification of the plugin (and should be moved to real integ
tests). Running them is not something a user should do, and the
documentation has been out of date for all of 5.x. This commit removes
the docs, removing potential confusion for users.
This commit adds the min wire/index compat versions to the main action
output. Not only will this make the compatility expected more
transparent, but it also allows to test which version others think the
compat versions are, similar to how we test the lucene version.
If a cluster is configured with an HDFS repository and a node is started, that node must be able
to reach HDFS, or else when it attempts to add the repository from the cluster state at start up
it will fail to connect and the repository will be left in an inconsistent state. Adding a blurb in the
docs to outline the expected availability for HDFS when using the repository plugin.
UnicodeSetFilter was only allowed in the icu_folding token filter.
It seems useful to expose this setting in icu_normalizer token filter
and char filter.
By default, the remove plugin CLI command preserves configuration
files. This is so that if a user is upgrading the plugin (which is done
by first removing the old version and then installing the new version)
they do not lose their configuration file. Yet, there are circumstances
where preserving the configuration file is not desired. This commit adds
a purge option to the remove plugin CLI command.
Relates #24981
This commit fixes an issue with the plugin docs incorrectly specifying
how to set a custom configuration directory. The correct way is to use
the environment variable CONF_DIR.
This commit adds gcs credential settings to the elasticsearch keystore.
The setting name follows the same pattern as the s3 client settings,
beginning with `gcs.client.`, followed by the client name, and then the
setting name, in this case, `credentials_file`. Using the legacy service
file setting is also deprecated.
Adds a new "icu_collation" field type that exposes lucene's
ICUCollationDocValuesField. ICUCollationDocValuesField is the replacement
for ICUCollationKeyFilter which has been deprecated since Lucene 5.
Added missing permissions required for authenticating with Kerberos to HDFS. Also implemented
code to support authentication in the form of using a Kerberos keytab file. In order to support
HDFS authentication, users must install a Kerberos keytab file on each node and transfer it to the
configuration directory. When a user specifies a Kerberos principal in the repository settings the
plugin automatically enables security for Hadoop and begins the login process. There will be a
separate PR and commit for the testing infrastructure to support these changes.
They needed to be updated now that Painless is the default and
the non-sandboxed scripting languages are going away or gone.
I dropped the entire section about customizing the classloader
whitelists. In master this barely does anything (exposes more
things to expressions).
With this commit, Azure repositories are now using an Exponential Backoff policy before failing the backup.
It uses Azure SDK default values for this policy:
* `30s` delta backoff base with
* `3s` min
* `90s` max
* `3` retries max
Users can define the number of retries they wish by setting `cloud.azure.storage.xxx.max_retries` where `xxx` is the azure named account.
Closes#22728.
This commit addresses an issue with the docs for plugin install via a
proxy on Windows where the HTTP proxy options were incorrectly
specified.
Relates #23757
In this repository, `Settings.builder` is used everywhere although it does exactly same as `Settings.settingsBuilder`. With the reference of the commit 42526ac28e , I think mistakenly this `Settings.settingsBuilder` remains in.
Today we have multiple ways to define settings when a user needs to create a repository:
* in `elasticsearch.yml` file using `repositories.azure` prefix
* when creating the repository itself with `PUT _snaphot/repo`
The plan is to:
* Deprecate `repositories.azure` settings in 5.x (done with #22856)
* Remove in 6.x (this PR)
Related to #22800
Follow up of #22857 where we deprecate automatic creation of azure containers.
BTW I found that the `AzureSnapshotRestoreServiceIntegTests` does not bring any value because it runs basically a Snapshot/Restore operation on local files which we already test in core.
So instead of trying to fix it to make it pass with this PR, I simply removed it.
This PR adds a new option for `host_type`: `tag:TAGNAME` where `TAGNAME` is the tag field you defined for your ec2 instance.
For example if you defined a tag `my-elasticsearch-host` in ec2 and set it to `myhostname1.mydomain.com`, then
setting `host_type: tag:my-elasticsearch-host` will tell Discovery Ec2 plugin to read the host name from the
`my-elasticsearch-host` tag. In this case, it will be resolved to `myhostname1.mydomain.com`.
Closes#22566.
Reported at: https://discuss.elastic.co/t/combine-elasticsearch-5-1-1-and-repository-hdfs/69659
If you define as described [in our docs](https://www.elastic.co/guide/en/elasticsearch/plugins/current/repository-hdfs-config.html) the following `elasticsearch.yml` settings:
```yml
repositories:
hdfs:
uri: "hdfs://es-master:9000/" # optional - Hadoop file-system URI
path: "some/path" # required - path with the file-system where data is stored/loaded
```
It fails at startup because we don't register the global setting `repositories.hdfs.path` in `HdfsPlugin`.
This PR removes that from our docs so people must provide those settings only when registering the repository with:
```
PUT _snapshot/my_hdfs_repository
{
"type": "hdfs",
"settings": {
"uri": "hdfs://namenode:8020/",
"path": "elasticsearch/respositories/my_hdfs_repository",
"conf.dfs.client.read.shortcircuit": "true"
}
}
```
Based on issue #22800.
Closes#22301
Added a new section detailing how to use the attachment processor
within an array.
This reverts commit #22296 and instead links to the foreach processor.
With this commit, we introduce a cache to the geoip ingest processor.
The cache is enabled by default and caches the 1000 most recent items.
The cache size is controlled by the setting `ingest.geoip.cache_size`.
Closes#22074
* [DOCS] Show EC2's auto attribute
This documents the `aws_availability_zone` node attribute as part of the `discovery-ec2` plugin. Also fixes outdated usage of "cloud aws".
Plugins: Remove pluggability of ZenPing
ZenPing is the part of zen discovery which knows how to ping nodes.
There is only one alternative implementation, which is just for testing.
This change removes the ability to add custom zen pings, and instead
hooks in the MockZenPing for tests through an overridden method in
MockNode. This also folds in the ZenPingService (which was really just a
single method) into ZenDiscovery, and removes the idea of having
multiple ZenPing instances. Finally, this was the last usage of the
ExtensionPoint classes, so that is also removed here.
Currently the default S3 buffer size is 100MB, which can be a lot for small
heaps. This pull request updates the default to be 100MB for heaps that are
greater than 2GB and 5% of the heap size otherwise.
Refactored ScriptType to clean up some of the variable and method names. Added more documentation. Deprecated the 'in' ParseField in favor of 'stored' to match the indexed scripts being replaced by stored scripts.
Lucene 6.3 is expected to be released in the next weeks so it'd be good to give
it some integration testing. I had to upgrade randomized-testing too so that
both Lucene and Elasticsearch are on the same version.
* plugins/discovery-azure-class.asciidoc
* reference/cluster.asciidoc
* reference/modules/cluster/misc.asciidoc
* reference/modules/indices/request_cache.asciidoc
After this is merged there will be no unconvereted snippets outside
of `reference`.
Related to #18160
Currently, we check if a node has the same set of custom metadata as the master
before joining the cluster. This implies freshly installing a plugin that has its
custom metadata requires a full cluster restart.