Commit Graph

405 Commits

Author SHA1 Message Date
Jim Ferenczi a53e8653f2
Add support for inlined user dictionary in Nori (#36123)
Add support for inlined user dictionary in Nori

This change adds a new option called `user_dictionary_rules` to the
Nori a tokenizer`. It can be used to set additional tokenization rules
to the Korean tokenizer directly in the settings (instead of using a file).

Closes #35842
2018-12-07 15:26:08 +01:00
Jim Ferenczi 18866c4c0b
Make hits.total an object in the search response (#35849)
This commit changes the format of the `hits.total` in the search response to be an object with
a `value` and a `relation`. The `value` indicates the number of hits that match the query and the
`relation` indicates whether the number is accurate (in which case the relation is equals to `eq`)
or a lower bound of the total (in which case it is equals to `gte`).
This change also adds a parameter called `rest_total_hits_as_int` that can be used in the
search APIs to opt out from this change (retrieve the total hits as a number in the rest response).
Note that currently all search responses are accurate (`track_total_hits: true`) or they don't contain
`hits.total` (`track_total_hits: true`). We'll add a way to get a lower bound of the total hits in a
follow up (to allow numbers to be passed to `track_total_hits`).

Relates #33028
2018-12-05 19:49:06 +01:00
Alpar Torok e0a678f0c4
Remove version.qualified from MainResponse (#35412)
The fully qualified version will be returned as `version.number`
2018-11-29 08:41:39 +02:00
Alan Woodward f6a43b5939
Add a prebuilt ICU Analyzer (#34958)
The ICU plugin provides the building blocks of an analysis chain, but doesn't actually have a prebuilt analyzer. It would be a better for users if there was a simple analyzer that they could use out of the box, and also something we can point to from the CJK Analyzer docs as a superior alternative.

Relates to #34285
2018-11-21 09:00:48 +00:00
Clinton Gormley cb8bdeae68 Fixed bad link in ingest-geo-point 2018-11-15 09:20:04 +01:00
Tal Levy dc1821c707
explain geo_point mapping in geoip-processor (#29114)
simple docs change to add missing mapping explanation. Users may not be aware this is
a prerequisite for doing geo-queries on this enriched data.
2018-11-14 20:05:45 -08:00
Peter Dyson c8e685eb34 Clarify S3 repository storage class parameter (#35400)
Today it is unclear that the `storage_class` parameter to an S3 repository only
affects new objects and does not rewrite any existing objects. This commit
clarifies this point.
2018-11-12 12:04:33 +00:00
Armin Braun 02b4e28534
#31608 Add S3 Setting to Force Path Type Access (#34721)
* SNAPSHOTS: Use Path Style Access in S3

* Use path style access pattern to fix #31608
* closes #31608
2018-11-09 05:07:26 +01:00
Alpar Torok 5ae03195d3
Make version field names more meaningful (#35334)
* Consolidate the name of the qualified build version

* Field name in response should not be redundant
2018-11-07 18:36:02 +02:00
Alpar Torok 8a85b2eada
Remove build qualifier from server's Version (#35172)
With this change, `Version` no longer carries information about the qualifier,
we still need a way to show the "display version" that does have both
qualifier and snapshot. This is now stored  by the build and red from `META-INF`.
2018-11-07 14:01:05 +02:00
Itamar Syn-Hershko d46f3528c0 [Docs] Updating URL for Openstack Swift plugin (#34136)
The repository plugin for Openstack Swift was developed originally by Wikimedia foundation but is now retired. 
This changes the link to the repo where the actively maintained fork lives now.
2018-11-01 11:57:55 +01:00
Ke Li 14f540e8e6 Deprecate unicodeSetFilter in favour of unicode_set_filter (#29215) 2018-11-01 10:06:51 +00:00
Julie Tibshirani f854330e06
Make sure to use the type _doc in the REST documentation. (#34662)
* Replace custom type names with _doc in REST examples.
* Avoid using two mapping types in the percolator docs.
* Rename doc -> _doc in the main repository README.
* Also replace some custom type names in the HLRC docs.
2018-10-22 11:54:04 -07:00
Kazuhiro Sera d45fe43a68 Fix a variety of typos and misspelled words (#32792) 2018-10-03 18:11:38 +01:00
Vladimir Dolzhenko a7f62ee902
[GCE Discovery] Automatically set project-id and zone (#33721)
Fetch default values for project-id and zone from metadata server

Closes #13618
2018-10-03 11:37:36 +02:00
Andriy 6b714c9e1e [Docs] Updated link to kafka-elasticsearch-consumer project (#34234) 2018-10-02 17:46:38 +02:00
David Turner 421f58e172
Remove discovery-file plugin (#33257)
In #33241 we moved the file-based discovery functionality to core
Elasticsearch, but preserved the `discovery-file` plugin, and support for the
existing location of the `unicast_hosts.txt` file, for BWC reasons. This commit
completes the removal of this plugin.
2018-09-18 12:01:16 +01:00
markharwood 2fa09f062e
New plugin - Annotated_text field type (#30364)
New plugin for annotated_text field type.
Largely a copy of `text` field type but adds ability to include markdown-like syntax in the text.
The “AnnotatedText” class parses text+markup and converts into plain text and AnnotationTokens.
The annotation token values are injected unchanged alongside the regular text tokens to provide a
form of additional indexed overlay useful in positional searches and highlighting.
Annotated_text fields do not support fielddata as we want to phase this out.
Also includes a new "annotated" highlighter type that retains annotations and merges in search
hits as additional annotation markup.

Closes #29467
2018-09-18 10:25:27 +01:00
Or Bin a5bad4d92c Docs: Fixed a grammatical mistake: 'a HTTP ...' -> 'an HTTP ...' (#33744)
Fixed a grammatical mistake: 'a HTTP ...' -> 'an HTTP ...'

Closes #33728
2018-09-17 15:35:54 -04:00
Jim Ferenczi 7ad71f906a
Upgrade to a Lucene 8 snapshot (#33310)
The main benefit of the upgrade for users is the search optimization for top scored documents when the total hit count is not needed. However this optimization is not activated in this change, there is another issue opened to discuss how it should be integrated smoothly.
Some comments about the change:
* Tests that can produce negative scores have been adapted but we need to forbid them completely: #33309

Closes #32899
2018-09-06 14:42:06 +02:00
Albert Zaharovits c31c51dc80
[DOC] Repository GCS ADC not supported (#33238)
Make it clear that automatic default credentials (ADC)
is not supported for the repository-gcs plugin.
"Service Account" method is the only alternative
to authn requests to Google Cloud Storage.
2018-08-30 10:32:08 +03:00
David Turner 47859e56ac
Move file-based discovery to core (#33241)
Today we support a static list of seed hosts in core Elasticsearch, and allow a
dynamic list of seed hosts to be provided via a file using the `discovery-file`
plugin. In fact the ability to provide a dynamic list of seed hosts is
increasingly useful, so this change moves this functionality to core
Elasticsearch to avoid the need for a plugin.

Furthermore, in order to start up nodes in integration tests we currently
assign a known port to each node before startup, which unfortunately sometimes
fails if another process grabs the selected port in the meantime. By moving the
`discovery-file` functionality into the core product we can use it to avoid
this race.

This change also moves the expected path to the file from
`$ES_PATH_CONF/discovery-file/unicast_hosts.txt` to
`$ES_PATH_CONF/unicast_hosts.txt`. An example of this file is not included in
distributions.

For BWC purposes the plugin still exists, but does nothing more than create the
example file in the old location, and issue a warning when it is used. We also
continue to support the old location for the file, but warn about its
deprecation.

Relates #29244
Closes #33030
2018-08-30 06:43:04 +01:00
Hazem Khaled b87f3062b7 [DOCS] Update WordPress plugins links (#32194) 2018-08-16 13:55:40 +02:00
Albert Zaharovits 2d87287c0d
[DOCS] Reloadable Secure Settings (#31713)
Docs on reloadable secure settings for plugins #29135 .
2018-08-01 12:07:23 +03:00
Russ Cam e2b665c2e6
Consistent encoder names (#29492)
This commit updates encoder names to be consistent within documentation
and align with snake casing convention.
2018-07-24 09:21:43 +10:00
Nick Peihl ac63408655
Add region ISO code to GeoIP Ingest plugin (#31669) 2018-07-20 11:23:29 -07:00
Vladimir Dolzhenko 7c0fc209bf
ECS Task IAM profile credentials ignored in repository-s3 plugin (#31864)
ECS Task IAM profile credentials ignored in repository-s3 plugin (#31864)

Closes #26913
2018-07-19 12:54:38 +02:00
Jim Ferenczi 573613e7ca
[Docs] Fix wrong link in Korean analyzer docs (#31815) 2018-07-06 09:30:48 +02:00
David Turner 4108722052
Add support for AWS session tokens (#30414)
AWS supports the creation and use of credentials that are only valid for a
fixed period of time. These credentials comprise three parts: the usual access
key and secret key, together with a session token. This commit adds support for
these three-part credentials to the EC2 discovery plugin and the S3 repository
plugin.

Note that session tokens are only valid for a limited period of time and yet
there is no mechanism for refreshing or rotating them when they expire without
restarting Elasticsearch.  Nonetheless, this feature is already useful for
nodes that need only run for a few days, such as for training, testing or
evaluation. #29135 tracks the work towards allowing these credentials to be
refreshed at runtime.

Resolves #16428
2018-07-03 14:12:07 +01:00
Stéphane Campinas 1dd10fe69f [Docs] Correct typos (#31720) 2018-07-02 15:17:31 +02:00
ritesh-kapoor 2a3a86bb5e [DOCS] Add PQL language Plugin (#31237)
Add PQL language Plugin to community plugin page
2018-06-29 11:37:09 +02:00
Nik Everett 73549281e8
Docs: Use the default distribution to test docs (#31251)
This switches the docs tests from the `oss-zip` distribution to the
`zip` distribution so they have xpack installed and configured with the
default basic license. The goal is to be able to merge the
`x-pack/docs` directory into the `docs` directory, marking the x-pack
docs with some kind of marker. This is the first step in that process.

This also enables `-Dtests.distribution` support for the `docs`
directory so you can run the tests against the `oss-zip` distribution
with something like
```
./gradlew -p docs check -Dtests.distribution=oss-zip
```

We can set up Jenkins to run both.

Relates to #30665
2018-06-18 12:06:42 -04:00
Tanguy Leroux 3274e7fd1a
[Docs] Remove reference to repository-s3 plugin creating an S3 bucket (#31359)
Closes #30910
2018-06-15 14:55:14 +02:00
Tanguy Leroux c351b51ac4
[Docs] Fix inconsistencies in snapshot/restore doc (#30480)
Closes #30444
2018-05-22 09:19:07 +02:00
Ryan Ernst b3f3a4312b
Plugins: Remove meta plugins (#30670)
Meta plugins existed only for a short time, in order to enable breaking
up x-pack into multiple plugins. However, now that x-pack is no longer
installed as a plugin, the need for them has disappeared. This commit
removes the meta plugins infrastructure.
2018-05-18 10:56:08 -07:00
Albert Zaharovits 801973fa9f
Repository GCS plugin new client library (#30168)
This does away with the deprecated `com.google.api-client:google-api-client:1.23`
and replaces it with `com.google.cloud:google-cloud-storage:1.28.0`.
It also changes security permissions for the repository-gcs plugin.
2018-05-15 18:22:58 +03:00
Dave Moore 391bcbcbe1 Added zentity to the list of API extension plugins (#29143) 2018-05-07 14:46:47 +02:00
Jim Ferenczi d3ee35ef18 [Docs] Add snippets for POS stop tags default value
relates #30397
2018-05-05 07:53:50 +02:00
Jim Ferenczi ec187ed3be [Docs] Fix bad link
relates #30397
2018-05-04 22:07:12 +02:00
Jim Ferenczi d7c2a99347 [Docs] Fix end of section in the korean plugin docs
relates #30397
2018-05-04 21:41:50 +02:00
Jim Ferenczi 891d3bd9c3
Expose the Lucene Korean analyzer module in a plugin (#30397)
This change adds a new plugin called `analysis-nori` that exposes
Korean text analysis in es using the new Lucene Korean analyzer module named (`nori`).
The plugin adds:
* a Korean analyzer: `nori`
* a Korean tokenizer: `nori_tokenizer`
* a part of speech stop filter: `nori_part_of_speech`
* a filter that can replace Hanja characters with their Hangul transcription: `nori_readingform`
2018-05-04 20:46:13 +02:00
ZarHenry96 d9ea0dd6c3 [Docs] Add community analysis plugin (#29612) 2018-04-25 14:11:35 +02:00
Jason Tedor d99d0fa669 Add distribution type to startup scripts
This commit adds the distribution type to the startup scripts so that we
can discern from log output and the main response the type of the
distribution (deb/rpm/tar/zip).
2018-04-20 15:34:01 -07:00
Jason Tedor e64e6d8996 Add distribution flavor to startup scripts
This commit adds the distribution flavor (default versus oss) to the
build process which is passed through the startup scripts to
Elasticsearch. This change will be used to customize the message on
attempting to install/remove x-pack based on the distribution flavor.
2018-04-20 15:33:58 -07:00
Bolarinwa Saheed Olayemi a3e5773522 Docs: Link to Ansible playbook for Elasticsearch (#29238)
Links to the official Ansible playbook for Elasticsearch.
2018-03-28 18:18:42 -04:00
David Pilato 87553bba16
Add ingest-attachment support for per document `indexed_chars` limit (#28977)
We today support a global `indexed_chars` processor parameter. But in some cases, users would like to set this limit depending on the document itself.
It used to be supported in mapper-attachments plugin by extracting the limit value from a meta field in the document sent to indexation process.

We add an option which reads this limit value from the document itself
by adding a setting named `indexed_chars_field`.

Which allows running:

```
PUT _ingest/pipeline/attachment
{
  "description" : "Extract attachment information. Used to parse pdf and office files",
  "processors" : [
    {
      "attachment" : {
        "field" : "data",
        "indexed_chars_field" : "size"
      }
    }
  ]
}
```

Then index either:

```
PUT index/doc/1?pipeline=attachment
{
  "data": "BASE64"
}
```

Which will use the default value (or the one defined by `indexed_chars`)

Or

```
PUT index/doc/2?pipeline=attachment
{
  "data": "BASE64",
  "size": 1000
}
```

Closes #28942
2018-03-14 19:07:20 +01:00
Shane O'Grady 7dcd48a0b5 [discovery-gce] Align code examples and documentation (#28876)
The docs state that `_gce_` is recommended but the code sample states
that `_gce:hostname_` is recommended. This aligns the code sample with
the documentation. Also replace `type` with `zen.hosts_provider` as 
discovery.type was removed in #25080.
2018-03-12 15:37:11 +01:00
Daniel Mitterdorfer 0d78a5890e
Reduce heap-memory usage of ingest-geoip plugin (#28963)
With this commit we reduce heap usage of the ingest-geoip plugin by
memory-mapping the database files. Previously, we have stored these
files gzip-compressed but this has resulted that data are loaded on the
heap.

Closes #28782
2018-03-12 08:07:33 +01:00
Tanguy Leroux a6a138905d
Use client settings in repository-gcs (#28575)
Similarly to what has been done for s3 and azure, this commit removes
the repository settings `application_name` and `connect/read_timeout`
in favor of client settings. It introduce a GoogleCloudStorageClientSettings
class (similar to S3ClientSettings) and a bunch of unit tests for that,
it aligns the documentation to be more coherent with the S3 one, it
documents the connect/read timeouts that were not documented at all and
also adds a new client setting that allows to define a custom endpoint.
2018-02-22 15:40:20 +01:00
Ryan Ernst 9a5199fae3
Docs: Remove references to elasticsearch directory in plugins (#28647)
This directory was removed from plugins in #28589, but docs still
referenced it. This commit cleans up the plugin author docs to no longer
refer to it.
2018-02-13 09:15:57 -08:00