Docs: Prepare plugin and integration docs for 2.0
* Centralised plugin docs in docs/plugins/
* Moved integrations into same docs
* Moved community clients into the clients section of the docs
* Removed docs/community

Closes #11734
Closes #11724
Closes #11636
Closes #11635
Closes #11632
Closes #11630
Closes #12046
Closes #12438
Closes #12579
parent 42300938aa
commit e143c6e460
|
@ -1,165 +1,46 @@
|
|||
[[clients]]
|
||||
== Clients
|
||||
= Community Contributed Clients
|
||||
|
||||
:client: https://www.elastic.co/guide/en/elasticsearch/client
|
||||
|
||||
Besides the link:/guide[officially supported Elasticsearch clients], there are
|
||||
a number of clients that have been contributed by the community for various languages:
|
||||
|
||||
* <<clojure>>
|
||||
* <<cold-fusion>>
|
||||
* <<erlang>>
|
||||
* <<go>>
|
||||
* <<groovy>>
|
||||
* <<haskell>>
|
||||
* <<java>>
|
||||
* <<javascript>>
|
||||
* <<dotnet>>
|
||||
* <<ocaml>>
|
||||
* <<perl>>
|
||||
* <<php>>
|
||||
* <<python>>
|
||||
* <<r>>
|
||||
* <<ruby>>
|
||||
* <<scala>>
|
||||
* <<smalltalk>>
|
||||
* <<vertx>>
|
||||
|
||||
|
||||
[[community-perl]]
|
||||
=== Perl
|
||||
|
||||
See the {client}/perl-api/current/index.html[official Elasticsearch Perl client].
|
||||
|
||||
[[community-python]]
|
||||
=== Python
|
||||
|
||||
See the {client}/python-api/current/index.html[official Elasticsearch Python client].
|
||||
|
||||
* http://github.com/elasticsearch/elasticsearch-dsl-py[elasticsearch-dsl-py]
|
||||
chainable query and filter construction built on top of official client.
|
||||
|
||||
* http://github.com/rhec/pyelasticsearch[pyelasticsearch]:
|
||||
Python client.
|
||||
|
||||
* https://github.com/eriky/ESClient[ESClient]:
|
||||
A lightweight and easy to use Python client for Elasticsearch.
|
||||
|
||||
* https://github.com/humangeo/rawes[rawes]:
|
||||
Python low level client.
|
||||
|
||||
* https://github.com/mozilla/elasticutils/[elasticutils]:
|
||||
A friendly chainable Elasticsearch interface for Python.
|
||||
|
||||
* http://intridea.github.io/surfiki-refine-elasticsearch/[Surfiki Refine]:
|
||||
Python Map-Reduce engine targeting Elasticsearch indices.
|
||||
|
||||
* http://github.com/aparo/pyes[pyes]:
|
||||
Python client.
|
||||
|
||||
|
||||
[[community-ruby]]
|
||||
=== Ruby
|
||||
|
||||
See the {client}/ruby-api/current/index.html[official Elasticsearch Ruby client].
|
||||
|
||||
* http://github.com/karmi/retire[Retire]:
|
||||
Ruby API & DSL, with ActiveRecord/ActiveModel integration (retired since Sep 2013).
|
||||
|
||||
* https://github.com/PoseBiz/stretcher[stretcher]:
|
||||
Ruby client.
|
||||
|
||||
* https://github.com/wireframe/elastic_searchable/[elastic_searchable]:
|
||||
Ruby client + Rails integration.
|
||||
|
||||
* https://github.com/ddnexus/flex[Flex]:
|
||||
Ruby Client.
|
||||
|
||||
* https://github.com/printercu/elastics-rb[elastics]:
|
||||
Tiny client with built-in zero-downtime migrations and ActiveRecord integration.
|
||||
|
||||
* https://github.com/toptal/chewy[chewy]:
|
||||
Chewy is ODM and wrapper for official elasticsearch client
|
||||
|
||||
* https://github.com/ankane/searchkick[Searchkick]:
|
||||
Intelligent search made easy
|
||||
|
||||
|
||||
[[community-php]]
|
||||
=== PHP
|
||||
|
||||
See the {client}/php-api/current/index.html[official Elasticsearch PHP client].
|
||||
|
||||
* http://github.com/ruflin/Elastica[Elastica]:
|
||||
PHP client.
|
||||
|
||||
* http://github.com/nervetattoo/elasticsearch[elasticsearch] PHP client.
|
||||
|
||||
* http://github.com/polyfractal/Sherlock[Sherlock]:
|
||||
PHP client, one-to-one mapping with query DSL, fluid interface.
|
||||
|
||||
* https://github.com/nervetattoo/elasticsearch[elasticsearch]
|
||||
PHP 5.3 client
|
||||
|
||||
[[community-java]]
|
||||
=== Java
|
||||
|
||||
* https://github.com/searchbox-io/Jest[Jest]:
|
||||
Java Rest client.
|
||||
* There is of course the {client}/java-api/current/index.html[native ES Java client]
|
||||
|
||||
[[community-javascript]]
|
||||
=== JavaScript
|
||||
|
||||
See the {client}/javascript-api/current/index.html[official Elasticsearch JavaScript client].
|
||||
|
||||
* https://github.com/fullscale/elastic.js[Elastic.js]:
|
||||
A JavaScript implementation of the Elasticsearch Query DSL and Core API.
|
||||
|
||||
* https://github.com/phillro/node-elasticsearch-client[node-elasticsearch-client]:
|
||||
A NodeJS client for Elasticsearch.
|
||||
|
||||
* https://github.com/ramv/node-elastical[node-elastical]:
|
||||
Node.js client for the Elasticsearch REST API
|
||||
|
||||
* https://github.com/printercu/elastics[elastics]: Simple tiny client that just works
|
||||
|
||||
[[community-groovy]]
|
||||
=== Groovy
|
||||
|
||||
See the {client}/groovy-api/current/index.html[official Elasticsearch Groovy client]
|
||||
|
||||
[[community-dotnet]]
|
||||
=== .NET
|
||||
|
||||
See the {client}/net-api/current/index.html[official Elasticsearch .NET client].
|
||||
|
||||
* https://github.com/Yegoroff/PlainElastic.Net[PlainElastic.Net]:
|
||||
.NET client.
|
||||
|
||||
* https://github.com/medcl/ElasticSearch.Net[ElasticSearch.NET]:
|
||||
.NET client.
|
||||
|
||||
|
||||
[[community-haskell]]
|
||||
=== Haskell
|
||||
* https://github.com/bitemyapp/bloodhound[bloodhound]:
|
||||
Haskell client and DSL.
|
||||
|
||||
|
||||
[[community-scala]]
|
||||
=== Scala
|
||||
|
||||
* https://github.com/sksamuel/elastic4s[elastic4s]:
|
||||
Scala DSL.
|
||||
|
||||
* https://github.com/scalastuff/esclient[esclient]:
|
||||
Thin Scala client.
|
||||
|
||||
* https://github.com/bsadeh/scalastic[scalastic]:
|
||||
Scala client.
|
||||
|
||||
* https://github.com/gphat/wabisabi[wabisabi]:
|
||||
Asynchronous REST API Scala client.
|
||||
|
||||
|
||||
[[community-clojure]]
|
||||
=== Clojure
|
||||
[[clojure]]
|
||||
== Clojure
|
||||
|
||||
* http://github.com/clojurewerkz/elastisch[Elastisch]:
|
||||
Clojure client.
|
||||
|
||||
[[cold-fusion]]
|
||||
== Cold Fusion
|
||||
|
||||
[[community-go]]
|
||||
=== Go
|
||||
The following project appears to be abandoned:
|
||||
|
||||
* https://github.com/mattbaird/elastigo[elastigo]:
|
||||
Go client.
|
||||
* https://github.com/jasonfill/ColdFusion-ElasticSearch-Client[ColdFusion-Elasticsearch-Client]
|
||||
Cold Fusion client for Elasticsearch
|
||||
|
||||
* https://github.com/belogik/goes[goes]:
|
||||
Go lib.
|
||||
|
||||
* https://github.com/olivere/elastic[elastic]:
|
||||
Elasticsearch client for Google Go.
|
||||
|
||||
[[community-erlang]]
|
||||
=== Erlang
|
||||
[[erlang]]
|
||||
== Erlang
|
||||
|
||||
* http://github.com/tsloughter/erlastic_search[erlastic_search]:
|
||||
Erlang client using HTTP.
|
||||
|
@ -173,51 +54,181 @@ See the {client}/net-api/current/index.html[official Elasticsearch .NET client].
|
|||
environment.
|
||||
|
||||
|
||||
[[community-eventmachine]]
|
||||
=== EventMachine
|
||||
[[go]]
|
||||
== Go
|
||||
|
||||
* http://github.com/vangberg/em-elasticsearch[em-elasticsearch]:
|
||||
elasticsearch library for eventmachine.
|
||||
* https://github.com/mattbaird/elastigo[elastigo]:
|
||||
Go client.
|
||||
|
||||
* https://github.com/belogik/goes[goes]:
|
||||
Go lib.
|
||||
|
||||
* https://github.com/olivere/elastic[elastic]:
|
||||
Elasticsearch client for Google Go.
|
||||
|
||||
|
||||
[[community-command-line]]
|
||||
=== Command Line
|
||||
[[groovy]]
|
||||
== Groovy
|
||||
|
||||
* https://github.com/elasticsearch/es2unix[es2unix]:
|
||||
Elasticsearch API consumable by the Linux command line.
|
||||
See the {client}/groovy-api/current/index.html[official Elasticsearch Groovy client].
|
||||
|
||||
* https://github.com/javanna/elasticshell[elasticshell]:
|
||||
command line shell for elasticsearch.
|
||||
[[haskell]]
|
||||
== Haskell
|
||||
* https://github.com/bitemyapp/bloodhound[bloodhound]:
|
||||
Haskell client and DSL.
|
||||
|
||||
|
||||
[[community-ocaml]]
|
||||
=== OCaml
|
||||
[[java]]
|
||||
== Java
|
||||
|
||||
Also see the {client}/java-api/current/index.html[official Elasticsearch Java client].
|
||||
|
||||
* https://github.com/searchbox-io/Jest[Jest]:
|
||||
Java Rest client.
|
||||
|
||||
[[javascript]]
|
||||
== JavaScript
|
||||
|
||||
Also see the {client}/javascript-api/current/index.html[official Elasticsearch JavaScript client].
|
||||
|
||||
* https://github.com/fullscale/elastic.js[Elastic.js]:
|
||||
A JavaScript implementation of the Elasticsearch Query DSL and Core API.
|
||||
|
||||
* https://github.com/printercu/elastics[elastics]: Simple tiny client that just works
|
||||
|
||||
* https://github.com/roundscope/ember-data-elasticsearch-kit[ember-data-elasticsearch-kit]:
|
||||
An ember-data kit for both pushing objects to and querying objects from an Elasticsearch cluster
|
||||
|
||||
The following project appears to be abandoned:
|
||||
|
||||
* https://github.com/ramv/node-elastical[node-elastical]:
|
||||
Node.js client for the Elasticsearch REST API
|
||||
|
||||
|
||||
[[dotnet]]
|
||||
== .NET
|
||||
|
||||
Also see the {client}/net-api/current/index.html[official Elasticsearch .NET client].
|
||||
|
||||
* https://github.com/Yegoroff/PlainElastic.Net[PlainElastic.Net]:
|
||||
.NET client.
|
||||
|
||||
[[ocaml]]
|
||||
== OCaml
|
||||
|
||||
The following project appears to be abandoned:
|
||||
|
||||
* https://github.com/tovbinm/ocaml-elasticsearch[ocaml-elasticsearch]:
|
||||
OCaml client for Elasticsearch
|
||||
|
||||
[[perl]]
|
||||
== Perl
|
||||
|
||||
[[community-smalltalk]]
|
||||
=== Smalltalk
|
||||
Also see the {client}/perl-api/current/index.html[official Elasticsearch Perl client].
|
||||
|
||||
* https://metacpan.org/pod/Elastijk[Elastijk]: A low level minimal HTTP client.
|
||||
|
||||
|
||||
[[php]]
|
||||
== PHP
|
||||
|
||||
Also see the {client}/php-api/current/index.html[official Elasticsearch PHP client].
|
||||
|
||||
* http://github.com/ruflin/Elastica[Elastica]:
|
||||
PHP client.
|
||||
|
||||
* http://github.com/nervetattoo/elasticsearch[elasticsearch] PHP client.
|
||||
|
||||
[[python]]
|
||||
== Python
|
||||
|
||||
Also see the {client}/python-api/current/index.html[official Elasticsearch Python client].
|
||||
|
||||
* http://github.com/elasticsearch/elasticsearch-dsl-py[elasticsearch-dsl-py]
|
||||
Chainable query and filter construction built on top of the official client.
|
||||
|
||||
* http://github.com/rhec/pyelasticsearch[pyelasticsearch]:
|
||||
Python client.
|
||||
|
||||
* https://github.com/eriky/ESClient[ESClient]:
|
||||
A lightweight and easy to use Python client for Elasticsearch.
|
||||
|
||||
* https://github.com/mozilla/elasticutils/[elasticutils]:
|
||||
A friendly chainable Elasticsearch interface for Python.
|
||||
|
||||
* http://github.com/aparo/pyes[pyes]:
|
||||
Python client.
|
||||
|
||||
The following projects appear to be abandoned:
|
||||
|
||||
* https://github.com/humangeo/rawes[rawes]:
|
||||
Python low level client.
|
||||
|
||||
* http://intridea.github.io/surfiki-refine-elasticsearch/[Surfiki Refine]:
|
||||
Python Map-Reduce engine targeting Elasticsearch indices.
|
||||
|
||||
[[r]]
|
||||
== R
|
||||
* https://github.com/Tomesch/elasticsearch[elasticsearch]
|
||||
R client for Elasticsearch
|
||||
|
||||
* https://github.com/ropensci/elastic[elastic]:
|
||||
A general purpose R client for Elasticsearch
|
||||
|
||||
[[ruby]]
|
||||
== Ruby
|
||||
|
||||
Also see the {client}/ruby-api/current/index.html[official Elasticsearch Ruby client].
|
||||
|
||||
* https://github.com/PoseBiz/stretcher[stretcher]:
|
||||
Ruby client.
|
||||
|
||||
* https://github.com/printercu/elastics-rb[elastics]:
|
||||
Tiny client with built-in zero-downtime migrations and ActiveRecord integration.
|
||||
|
||||
* https://github.com/toptal/chewy[chewy]:
|
||||
Chewy is an ODM and wrapper for the official Elasticsearch client
|
||||
|
||||
* https://github.com/ankane/searchkick[Searchkick]:
|
||||
Intelligent search made easy
|
||||
|
||||
The following projects appear to be abandoned:
|
||||
|
||||
* https://github.com/wireframe/elastic_searchable/[elastic_searchable]:
|
||||
Ruby client + Rails integration.
|
||||
|
||||
* https://github.com/ddnexus/flex[Flex]:
|
||||
Ruby Client.
|
||||
|
||||
|
||||
|
||||
[[scala]]
|
||||
== Scala
|
||||
|
||||
* https://github.com/sksamuel/elastic4s[elastic4s]:
|
||||
Scala DSL.
|
||||
|
||||
* https://github.com/scalastuff/esclient[esclient]:
|
||||
Thin Scala client.
|
||||
|
||||
* https://github.com/gphat/wabisabi[wabisabi]:
|
||||
Asynchronous REST API Scala client.
|
||||
|
||||
The following project appears to be abandoned:
|
||||
|
||||
* https://github.com/bsadeh/scalastic[scalastic]:
|
||||
Scala client.
|
||||
|
||||
|
||||
[[smalltalk]]
|
||||
== Smalltalk
|
||||
|
||||
* http://ss3.gemstone.com/ss/Elasticsearch.html[Elasticsearch] -
|
||||
Smalltalk client for Elasticsearch
|
||||
|
||||
[[community-cold-fusion]]
|
||||
=== Cold Fusion
|
||||
|
||||
* https://github.com/jasonfill/ColdFusion-ElasticSearch-Client[ColdFusion-Elasticsearch-Client]
|
||||
Cold Fusion client for Elasticsearch
|
||||
[[vertx]]
|
||||
== Vert.x
|
||||
|
||||
[[community-nodejs]]
|
||||
=== NodeJS
|
||||
* https://github.com/phillro/node-elasticsearch-client[Node-Elasticsearch-Client]
|
||||
A node.js client for elasticsearch
|
||||
|
||||
[[community-r]]
|
||||
=== R
|
||||
* https://github.com/Tomesch/elasticsearch[elasticsearch]
|
||||
R client for Elasticsearch
|
||||
|
||||
* https://github.com/ropensci/elastic[elastic]:
|
||||
A general purpose R client for Elasticsearch
|
||||
* https://github.com/goodow/realtime-search[realtime-search]:
|
||||
Elasticsearch module for Vert.x
|
|
@ -1,20 +0,0 @@
|
|||
[[front-ends]]
|
||||
== Front Ends
|
||||
|
||||
* https://github.com/mobz/elasticsearch-head[elasticsearch-head]:
|
||||
A web front end for an Elasticsearch cluster.
|
||||
|
||||
* https://github.com/OlegKunitsyn/elasticsearch-browser[browser]:
|
||||
Web front-end over elasticsearch data.
|
||||
|
||||
* https://github.com/polyfractal/elasticsearch-inquisitor[Inquisitor]:
|
||||
Front-end to help debug/diagnose queries and analyzers
|
||||
|
||||
* http://elastichammer.exploringelasticsearch.com/[Hammer]:
|
||||
Web front-end for elasticsearch
|
||||
|
||||
* https://github.com/romansanchez/Calaca[Calaca]:
|
||||
Simple search client for Elasticsearch
|
||||
|
||||
* https://github.com/rdpatil4/ESClient[ESClient]:
|
||||
Simple search, update, delete client for Elasticsearch
|
|
@ -1,6 +0,0 @@
|
|||
[[github]]
|
||||
== GitHub
|
||||
|
||||
GitHub is a place where a lot of development is done around
|
||||
*elasticsearch*, here is a simple search for
|
||||
https://github.com/search?q=elasticsearch&type=Repositories[repositories].
|
|
@ -1,17 +0,0 @@
|
|||
= Community Supported Clients
|
||||
|
||||
:client: http://www.elastic.co/guide/en/elasticsearch/client
|
||||
|
||||
|
||||
include::clients.asciidoc[]
|
||||
|
||||
include::frontends.asciidoc[]
|
||||
|
||||
include::integrations.asciidoc[]
|
||||
|
||||
include::misc.asciidoc[]
|
||||
|
||||
include::monitoring.asciidoc[]
|
||||
|
||||
include::github.asciidoc[]
|
||||
|
|
@ -1,102 +0,0 @@
|
|||
[[integrations]]
|
||||
== Integrations
|
||||
|
||||
|
||||
* http://grails.org/plugin/elasticsearch[Grails]:
|
||||
Elasticsearch Grails plugin.
|
||||
|
||||
* https://github.com/carrot2/elasticsearch-carrot2[carrot2]:
|
||||
Results clustering with carrot2
|
||||
|
||||
* https://github.com/angelf/escargot[escargot]:
|
||||
Elasticsearch connector for Rails (WIP).
|
||||
|
||||
* https://metacpan.org/module/Catalyst::Model::Search::Elasticsearch[Catalyst]:
|
||||
Elasticsearch and Catalyst integration.
|
||||
|
||||
* http://github.com/aparo/django-elasticsearch[django-elasticsearch]:
|
||||
Django Elasticsearch Backend.
|
||||
|
||||
* http://github.com/Aconex/elasticflume[elasticflume]:
|
||||
http://github.com/cloudera/flume[Flume] sink implementation.
|
||||
|
||||
* http://code.google.com/p/terrastore/wiki/Search_Integration[Terrastore Search]:
|
||||
http://code.google.com/p/terrastore/[Terrastore] integration module with elasticsearch.
|
||||
|
||||
* https://github.com/infochimps-labs/wonderdog[Wonderdog]:
|
||||
Hadoop bulk loader into elasticsearch.
|
||||
|
||||
* http://geeks.aretotally.in/play-framework-module-elastic-search-distributed-searching-with-json-http-rest-or-java[Play!Framework]:
|
||||
Integrate with Play! Framework Application.
|
||||
|
||||
* https://github.com/Exercise/FOQElasticaBundle[ElasticaBundle]:
|
||||
Symfony2 Bundle wrapping Elastica.
|
||||
|
||||
* https://drupal.org/project/elasticsearch_connector[Drupal]:
|
||||
Drupal Elasticsearch integration (1.0.0 and later).
|
||||
|
||||
* http://drupal.org/project/search_api_elasticsearch[Drupal]:
|
||||
Drupal Elasticsearch integration via Search API (1.0.0 and earlier).
|
||||
|
||||
* https://github.com/refuge/couch_es[couch_es]:
|
||||
elasticsearch helper for couchdb based products (apache couchdb, bigcouch & refuge)
|
||||
|
||||
* https://github.com/sonian/elasticsearch-jetty[Jetty]:
|
||||
Jetty HTTP Transport
|
||||
|
||||
* https://github.com/dadoonet/spring-elasticsearch[Spring Elasticsearch]:
|
||||
Spring Factory for Elasticsearch
|
||||
|
||||
* https://github.com/spring-projects/spring-data-elasticsearch[Spring Data Elasticsearch]:
|
||||
Spring Data implementation for Elasticsearch
|
||||
|
||||
* https://camel.apache.org/elasticsearch.html[Apache Camel Integration]:
|
||||
An Apache camel component to integrate elasticsearch
|
||||
|
||||
* https://github.com/tlrx/elasticsearch-test[elasticsearch-test]:
|
||||
Elasticsearch Java annotations for unit testing with
|
||||
http://www.junit.org/[JUnit]
|
||||
|
||||
* http://searchbox-io.github.com/wp-elasticsearch/[Wp-Elasticsearch]:
|
||||
Elasticsearch WordPress Plugin
|
||||
|
||||
* https://github.com/wallmanderco/elasticsearch-indexer[Elasticsearch Indexer]:
|
||||
Elasticsearch WordPress Plugin
|
||||
|
||||
* https://github.com/OlegKunitsyn/eslogd[eslogd]:
|
||||
Linux daemon that replicates events to a central Elasticsearch server in real-time
|
||||
|
||||
* https://github.com/drewr/elasticsearch-clojure-repl[elasticsearch-clojure-repl]:
|
||||
Plugin that embeds nREPL for run-time introspective adventure! Also
|
||||
serves as an nREPL transport.
|
||||
|
||||
* http://haystacksearch.org/[Haystack]:
|
||||
Modular search for Django
|
||||
|
||||
* https://github.com/cleverage/play2-elasticsearch[play2-elasticsearch]:
|
||||
Elasticsearch module for Play Framework 2.x
|
||||
|
||||
* https://github.com/goodow/realtime-search[realtime-search]:
|
||||
Elasticsearch module for Vert.x
|
||||
|
||||
* https://github.com/fullscale/dangle[dangle]:
|
||||
A set of AngularJS directives that provide common visualizations for elasticsearch based on
|
||||
D3.
|
||||
|
||||
* https://github.com/roundscope/ember-data-elasticsearch-kit[ember-data-elasticsearch-kit]:
|
||||
An ember-data kit for both pushing and querying objects to Elasticsearch cluster
|
||||
|
||||
* https://github.com/kzwang/elasticsearch-osem[elasticsearch-osem]:
|
||||
A Java Object Search Engine Mapping (OSEM) for Elasticsearch
|
||||
|
||||
* https://github.com/twitter/storehaus[Twitter Storehaus]:
|
||||
Thin asynchronous Scala client for Storehaus.
|
||||
|
||||
* https://doc.tiki.org/Elasticsearch[Tiki Wiki CMS Groupware]:
|
||||
Tiki has native support for Elasticsearch. This provides faster & better search (facets, etc), along with some Natural Language Processing features (ex.: More like this)
|
||||
|
||||
* https://github.com/reachkrishnaraj/kafka-elasticsearch-standalone-consumer[Kafka Standalone Consumer]:
|
||||
Easily scalable and extendable Kafka standalone consumer that reads messages from Kafka, then processes and indexes them in Elasticsearch
|
||||
|
||||
* http://www.searchtechnologies.com/aspire-for-elasticsearch[Aspire for Elasticsearch]:
|
||||
Aspire, from Search Technologies, is a powerful connector and processing framework designed for unstructured data. It has connectors to internal and external repositories including SharePoint, Documentum, Jive, RDB, file systems, websites and more, and can transform and normalize this data before indexing in Elasticsearch.
|
|
@ -1,31 +0,0 @@
|
|||
[[misc]]
|
||||
== Misc
|
||||
|
||||
|
||||
* https://github.com/elasticsearch/puppet-elasticsearch[Puppet]:
|
||||
Elasticsearch puppet module.
|
||||
|
||||
* http://github.com/elasticsearch/cookbook-elasticsearch[Chef]:
|
||||
Chef cookbook for Elasticsearch
|
||||
|
||||
* https://github.com/medcl/salt-elasticsearch[SaltStack]:
|
||||
SaltStack Module for Elasticsearch
|
||||
|
||||
* http://www.github.com/neogenix/daikon[daikon]:
|
||||
Daikon Elasticsearch CLI
|
||||
|
||||
* https://github.com/Aconex/scrutineer[Scrutineer]:
|
||||
A high performance consistency checker to compare what you've indexed
|
||||
with your source of truth content (e.g. DB)
|
||||
|
||||
* https://www.wireshark.org/[Wireshark]:
|
||||
Protocol dissection for Zen discovery, HTTP and the binary protocol
|
||||
|
||||
* https://github.com/sscarduzio/elasticsearch-readonlyrest-plugin[Readonly REST]:
|
||||
High performance access control for Elasticsearch native REST API.
|
||||
|
||||
* https://github.com/kodcu/pes[Pes]:
|
||||
A pluggable elastic query DSL builder for Elasticsearch
|
||||
|
||||
* https://github.com/ozlerhakan/mongolastic[Mongolastic]:
|
||||
A tool that clones data from Elasticsearch to MongoDB and vice versa
|
|
@ -1,40 +0,0 @@
|
|||
[[health]]
|
||||
== Health and Performance Monitoring
|
||||
|
||||
* https://github.com/lukas-vlcek/bigdesk[bigdesk]:
|
||||
Live charts and statistics for elasticsearch cluster.
|
||||
|
||||
* https://github.com/lmenezes/elasticsearch-kopf/[Kopf]:
|
||||
Live cluster health and shard allocation monitoring with administration toolset.
|
||||
|
||||
* https://github.com/karmi/elasticsearch-paramedic[paramedic]:
|
||||
Live charts with cluster stats and indices/shards information.
|
||||
|
||||
* http://www.elastichq.org/[ElasticsearchHQ]:
|
||||
Free cluster health monitoring tool
|
||||
|
||||
* http://sematext.com/spm/index.html[SPM for Elasticsearch]:
|
||||
Performance monitoring with live charts showing cluster and node stats, integrated
|
||||
alerts, email reports, etc.
|
||||
|
||||
* https://github.com/radu-gheorghe/check-es[check-es]:
|
||||
Nagios/Shinken plugins for checking on elasticsearch
|
||||
|
||||
* https://github.com/anchor/nagios-plugin-elasticsearch[check_elasticsearch]:
|
||||
An Elasticsearch availability and performance monitoring plugin for
|
||||
Nagios.
|
||||
|
||||
* https://github.com/rbramley/Opsview-elasticsearch[opsview-elasticsearch]:
|
||||
Opsview plugin written in Perl for monitoring Elasticsearch
|
||||
|
||||
* https://github.com/polyfractal/elasticsearch-segmentspy[SegmentSpy]:
|
||||
Plugin to watch Lucene segment merges across your cluster
|
||||
|
||||
* https://github.com/mattweber/es2graphite[es2graphite]:
|
||||
Send cluster and indices stats and status to Graphite for monitoring and graphing.
|
||||
|
||||
* https://scoutapp.com[Scout]: Provides plugins for monitoring Elasticsearch https://scoutapp.com/plugin_urls/1331-elasticsearch-node-status[nodes], https://scoutapp.com/plugin_urls/1321-elasticsearch-cluster-status[clusters], and https://scoutapp.com/plugin_urls/1341-elasticsearch-index-status[indices].
|
||||
|
||||
* https://itunes.apple.com/us/app/elasticocean/id955278030?ls=1&mt=8[ElasticOcean]:
|
||||
Elasticsearch & DigitalOcean iOS real-time monitoring tool to keep an eye on DigitalOcean Droplets, Elasticsearch instances, or both while on the go.
|
||||
|
|
@ -0,0 +1,18 @@
|
|||
[[alerting]]
|
||||
== Alerting Plugins
|
||||
|
||||
Alerting plugins allow Elasticsearch to monitor indices and to trigger alerts when thresholds are breached.
|
||||
|
||||
[float]
|
||||
=== Core alerting plugins
|
||||
|
||||
The core alerting plugins are:
|
||||
|
||||
link:/products/watcher[Watcher]::
|
||||
|
||||
Watcher is the alerting and notification product for Elasticsearch that lets
|
||||
you take action based on changes in your data. It is designed around the
|
||||
principle that if you can query something in Elasticsearch, you can alert on
|
||||
it. Simply define a query, condition, schedule, and the actions to take, and
|
||||
Watcher will do the rest.
|
||||
|
|
@ -0,0 +1,438 @@
|
|||
[[analysis-icu]]
|
||||
=== ICU Analysis Plugin
|
||||
|
||||
The ICU Analysis plugin integrates the Lucene ICU module into elasticsearch,
|
||||
adding extended Unicode support using the http://site.icu-project.org/[ICU]
|
||||
libraries, including better analysis of Asian languages, Unicode
|
||||
normalization, Unicode-aware case folding, collation support, and
|
||||
transliteration.
|
||||
|
||||
[[analysis-icu-install]]
|
||||
[float]
|
||||
==== Installation
|
||||
|
||||
This plugin can be installed using the plugin manager:
|
||||
|
||||
[source,sh]
|
||||
----------------------------------------------------------------
|
||||
sudo bin/plugin install analysis-icu
|
||||
----------------------------------------------------------------
|
||||
|
||||
The plugin must be installed on every node in the cluster, and each node must
|
||||
be restarted after installation.
|
||||
|
||||
[[analysis-icu-remove]]
|
||||
[float]
|
||||
==== Removal
|
||||
|
||||
The plugin can be removed with the following command:
|
||||
|
||||
[source,sh]
|
||||
----------------------------------------------------------------
|
||||
sudo bin/plugin remove analysis-icu
|
||||
----------------------------------------------------------------
|
||||
|
||||
The node must be stopped before removing the plugin.
|
||||
|
||||
[[analysis-icu-normalization-charfilter]]
|
||||
==== ICU Normalization Character Filter
|
||||
|
||||
Normalizes characters as explained
|
||||
http://userguide.icu-project.org/transforms/normalization[here].
|
||||
It registers itself as the `icu_normalizer` character filter, which is
|
||||
available to all indices without any further configuration. The type of
|
||||
normalization can be specified with the `name` parameter, which accepts `nfc`,
|
||||
`nfkc`, and `nfkc_cf` (default). Set the `mode` parameter to `decompose` to
|
||||
convert `nfc` to `nfd` or `nfkc` to `nfkd` respectively.
|
||||
|
||||
Here are two examples, the default usage and a customised character filter:
|
||||
|
||||
|
||||
[source,json]
|
||||
--------------------------------------------------
|
||||
PUT icu_sample
|
||||
{
|
||||
"settings": {
|
||||
"index": {
|
||||
"analysis": {
|
||||
"analyzer": {
|
||||
"nfkc_cf_normalized": { <1>
|
||||
"tokenizer": "icu_tokenizer",
|
||||
"char_filter": [
|
||||
"icu_normalizer"
|
||||
]
|
||||
},
|
||||
"nfd_normalized": { <2>
|
||||
"tokenizer": "icu_tokenizer",
|
||||
"char_filter": [
|
||||
"nfd_normalizer"
|
||||
]
|
||||
}
|
||||
},
|
||||
"char_filter": {
|
||||
"nfd_normalizer": {
|
||||
"type": "icu_normalizer",
|
||||
"name": "nfc",
|
||||
"mode": "decompose"
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
--------------------------------------------------
|
||||
// AUTOSENSE
|
||||
|
||||
<1> Uses the default `nfkc_cf` normalization.
|
||||
<2> Uses the customized `nfd_normalizer` character filter, which is set to use `nfc` normalization with decomposition.
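
To make the effect of the `name` and `mode` parameters concrete, here is a minimal Python sketch that mimics them with the standard library's `unicodedata` module. It is an illustration only, not how the plugin is implemented: the `normalize` helper is hypothetical, and `nfkc_cf` is approximated as NFKC plus Python's `str.casefold()`, which may differ from ICU's mapping for some characters.

```python
import unicodedata


def normalize(text: str, name: str = "nfkc_cf", mode: str = "compose") -> str:
    """Hypothetical stand-in for the `name` and `mode` settings of icu_normalizer."""
    forms = {
        ("nfc", "compose"): "NFC",
        ("nfc", "decompose"): "NFD",
        ("nfkc", "compose"): "NFKC",
        ("nfkc", "decompose"): "NFKD",
    }
    if name == "nfkc_cf":
        # Approximation: compatibility composition plus full case folding.
        folded = unicodedata.normalize("NFKC", text).casefold()
        return unicodedata.normalize("NFKC", folded)
    return unicodedata.normalize(forms[(name, mode)], text)


composed = "\u00e9"      # "é" as a single code point
decomposed = "e\u0301"   # "e" followed by a combining acute accent

print(normalize(decomposed, "nfc") == composed)               # True
print(normalize(composed, "nfc", "decompose") == decomposed)  # True
print(normalize("\ufb01", "nfkc"))                            # the "fi" ligature becomes "fi"
```

The two spellings of "é" look identical but compare as different strings until they are normalized to the same form, which is why normalization matters for matching at search time.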
|
||||
|
||||
[[analysis-icu-tokenizer]]
|
||||
==== ICU Tokenizer
|
||||
|
||||
Tokenizes text into words on word boundaries, as defined in
|
||||
http://www.unicode.org/reports/tr29/[UAX #29: Unicode Text Segmentation].
|
||||
It behaves much like the {ref}/analysis-standard-tokenizer.html[`standard` tokenizer],
|
||||
but adds better support for some Asian languages by using a dictionary-based
|
||||
approach to identify words in Thai, Lao, Chinese, Japanese, and Korean, and
|
||||
using custom rules to break Myanmar and Khmer text into syllables.
|
||||
|
||||
[source,json]
|
||||
--------------------------------------------------
|
||||
PUT icu_sample
|
||||
{
|
||||
"settings": {
|
||||
"index": {
|
||||
"analysis": {
|
||||
"analyzer": {
|
||||
"my_icu_analyzer": {
|
||||
"tokenizer": "icu_tokenizer"
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
--------------------------------------------------
|
||||
// AUTOSENSE
|
||||
|
||||
|
||||
[[analysis-icu-normalization]]
|
||||
==== ICU Normalization Token Filter
|
||||
|
||||
Normalizes characters as explained
|
||||
http://userguide.icu-project.org/transforms/normalization[here]. It registers
|
||||
itself as the `icu_normalizer` token filter, which is available to all indices
|
||||
without any further configuration. The type of normalization can be specified
|
||||
with the `name` parameter, which accepts `nfc`, `nfkc`, and `nfkc_cf`
|
||||
(default).
|
||||
|
||||
You should probably prefer the <<analysis-icu-normalization-charfilter,Normalization character filter>>.
|
||||
|
||||
Here are two examples, the default usage and a customised token filter:
|
||||
|
||||
[source,json]
|
||||
--------------------------------------------------
|
||||
PUT icu_sample
|
||||
{
|
||||
"settings": {
|
||||
"index": {
|
||||
"analysis": {
|
||||
"analyzer": {
|
||||
"nfkc_cf_normalized": { <1>
|
||||
"tokenizer": "icu_tokenizer",
|
||||
"filter": [
|
||||
"icu_normalizer"
|
||||
]
|
||||
},
|
||||
"nfc_normalized": { <2>
|
||||
"tokenizer": "icu_tokenizer",
|
||||
"filter": [
|
||||
"nfc_normalizer"
|
||||
]
|
||||
}
|
||||
},
|
||||
"filter": {
|
||||
"nfc_normalizer": {
|
||||
"type": "icu_normalizer",
|
||||
"name": "nfc"
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
--------------------------------------------------
|
||||
// AUTOSENSE
|
||||
|
||||
<1> Uses the default `nfkc_cf` normalization.
|
||||
<2> Uses the customized `nfc_normalizer` token filter, which is set to use `nfc` normalization.
|
||||
|
||||
|
||||
[[analysis-icu-folding]]
|
||||
==== ICU Folding Token Filter
|
||||
|
||||
Case folding of Unicode characters based on `UTR#30`, like the
|
||||
{ref}/analysis-asciifolding-tokenfilter.html[ASCII-folding token filter]
|
||||
on steroids. It registers itself as the `icu_folding` token filter and is
|
||||
available to all indices:
|
||||
|
||||
[source,json]
|
||||
--------------------------------------------------
|
||||
PUT icu_sample
|
||||
{
|
||||
"settings": {
|
||||
"index": {
|
||||
"analysis": {
|
||||
"analyzer": {
|
||||
"folded": {
|
||||
"tokenizer": "icu_tokenizer",
|
||||
"filter": [
|
||||
"icu_folding"
|
||||
]
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
--------------------------------------------------
|
||||
// AUTOSENSE
|
||||
|
||||
The ICU folding token filter already does Unicode normalization, so there is
|
||||
no need to use the normalization character filter or token filter as well.
|
||||
|
||||
Which letters are folded can be controlled by specifying the
|
||||
`unicodeSetFilter` parameter, which accepts a
|
||||
http://icu-project.org/apiref/icu4j/com/ibm/icu/text/UnicodeSet.html[UnicodeSet].
|
||||
|
||||
The following example exempts Swedish characters from folding. It is important
|
||||
to note that both upper and lowercase forms should be specified, and that
|
||||
these filtered characters are not lowercased, which is why we add the
|
||||
`lowercase` filter as well:
|
||||
|
||||
[source,json]
|
||||
--------------------------------------------------
|
||||
PUT icu_sample
|
||||
{
|
||||
"settings": {
|
||||
"index": {
|
||||
"analysis": {
|
||||
"analyzer": {
|
||||
"swedish_analyzer": {
|
||||
"tokenizer": "icu_tokenizer",
|
||||
"filter": [
|
||||
"swedish_folding",
|
||||
"lowercase"
|
||||
]
|
||||
}
|
||||
},
|
||||
"filter": {
|
||||
"swedish_folding": {
|
||||
"type": "icu_folding",
|
||||
"unicodeSetFilter": "[^åäöÅÄÖ]"
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
--------------------------------------------------
|
||||
// AUTOSENSE
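
The interaction between the exempted characters and the `lowercase` filter can be sketched in plain Python. This is a rough approximation under stated assumptions, not the ICU algorithm: the hypothetical `fold` helper strips combining marks after a compatibility decomposition and case-folds the result, and its `exempt` argument lists the characters to leave alone (the complement of the set passed to `unicodeSetFilter`).

```python
import unicodedata


def fold(text: str, exempt: str = "") -> str:
    """Hypothetical folding step: strip diacritics and case from non-exempt characters."""
    text = unicodedata.normalize("NFC", text)
    exempt = unicodedata.normalize("NFC", exempt)
    out = []
    for ch in text:
        if ch in exempt:
            # Exempt characters pass through untouched, including their case.
            out.append(ch)
            continue
        decomposed = unicodedata.normalize("NFKD", ch)
        base = "".join(c for c in decomposed if not unicodedata.combining(c))
        out.append(base.casefold())
    return "".join(out)


swedish = "åäöÅÄÖ"
folded = fold("Älgen Äter Éclair", exempt=swedish)
print(folded)          # Älgen Äter eclair
print(folded.lower())  # älgen äter eclair
```

The first line of output shows why the extra step is needed: the exempted "Ä" keeps its case, so only a separate lowercasing pass makes "Älgen" and "älgen" match.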
|
||||
|
||||
[[analysis-icu-collation]]
|
||||
==== ICU Collation Token Filter
|
||||
|
||||
Collations are used for sorting documents in a language-specific word order.
|
||||
The `icu_collation` token filter is available to all indices and defaults to
|
||||
using the
|
||||
https://www.elastic.co/guide/en/elasticsearch/guide/current/sorting-collations.html#uca[DUCET collation],
|
||||
which is a best-effort attempt at language-neutral sorting.
|
||||
|
||||
Below is an example of how to set up a field for sorting German names in
|
||||
``phonebook'' order:
|
||||
|
||||
[source,json]
|
||||
--------------------------------------------------
|
||||
PUT /my_index
|
||||
{
|
||||
"settings": {
|
||||
"analysis": {
|
||||
"filter": {
|
||||
"german_phonebook": {
|
||||
"type": "icu_collation",
|
||||
"language": "de",
|
||||
"country": "DE",
|
||||
"variant": "@collation=phonebook"
|
||||
}
|
||||
},
|
||||
"analyzer": {
|
||||
"german_phonebook": {
|
||||
"tokenizer": "keyword",
|
||||
"filter": [ "german_phonebook" ]
|
||||
}
|
||||
}
|
||||
}
|
||||
},
|
||||
"mappings": {
|
||||
"user": {
|
||||
"properties": {
|
||||
"name": { <1>
|
||||
"type": "string",
|
||||
"fields": {
|
||||
"sort": { <2>
|
||||
"type": "string",
|
||||
"analyzer": "german_phonebook"
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
GET _search <3>
|
||||
{
|
||||
"query": {
|
||||
"match": {
|
||||
"name": "Fritz"
|
||||
}
|
||||
},
|
||||
"sort": "name.sort"
|
||||
}
|
||||
|
||||
--------------------------------------------------
|
||||
// AUTOSENSE
|
||||
|
||||
<1> The `name` field uses the `standard` analyzer, and so supports full text queries.
|
||||
<2> The `name.sort` field uses the `keyword` analyzer to preserve the name as
|
||||
a single token, and applies the `german_phonebook` token filter to index
|
||||
the value in German phonebook sort order.
|
||||
<3> An example query which searches the `name` field and sorts on the `name.sort` field.
|
||||
|
||||
===== Collation options
|
||||
|
||||
`strength`::
|
||||
|
||||
The strength property determines the minimum level of difference considered
|
||||
significant during comparison. Possible values are: `primary`, `secondary`,
|
||||
`tertiary`, `quaternary` or `identical`. See the
|
||||
http://icu-project.org/apiref/icu4j/com/ibm/icu/text/Collator.html[ICU Collation documentation]
|
||||
for a more detailed explanation for each value. Defaults to `tertiary`
|
||||
unless otherwise specified in the collation.
|
||||
|
||||
`decomposition`::
|
||||
|
||||
Possible values: `no` (default, but collation-dependent) or `canonical`.
|
||||
Setting this decomposition property to `canonical` allows the Collator to
|
||||
handle unnormalized text properly, producing the same results as if the text
|
||||
were normalized. If `no` is set, it is the user's responsibility to ensure
|
||||
that all text is already in the appropriate form before a comparison or before
|
||||
getting a CollationKey. Adjusting decomposition mode allows the user to select
|
||||
between faster and more complete collation behavior. Since a great many of the
|
||||
world's languages do not require text normalization, most locales set `no` as
|
||||
the default decomposition mode.
|
||||
|
||||
The following options are expert only:
|
||||
|
||||
`alternate`::
|
||||
|
||||
Possible values: `shifted` or `non-ignorable`. Sets the alternate handling for
|
||||
strength `quaternary` to be either shifted or non-ignorable, which boils down
|
||||
to ignoring punctuation and whitespace.
|
||||
|
||||
`caseLevel`::
|
||||
|
||||
Possible values: `true` or `false` (default). Whether case level sorting is
|
||||
required. When strength is set to `primary` this will ignore accent
|
||||
differences.
|
||||
|
||||
|
||||
`caseFirst`::
|
||||
|
||||
Possible values: `lower` or `upper`. Useful to control which case is sorted
|
||||
first when case is not ignored for strength `tertiary`. The default depends on
|
||||
the collation.
|
||||
|
||||
`numeric`::
|
||||
|
||||
Possible values: `true` or `false` (default). Whether digits are sorted
|
||||
according to their numeric representation. For example the value `egg-9` is
|
||||
sorted before the value `egg-21`.
|
||||
|
||||
|
||||
`variableTop`::
|
||||
|
||||
Single character or contraction. Controls what is variable for `alternate`.
|
||||
|
||||
`hiraganaQuaternaryMode`::
|
||||
|
||||
Possible values: `true` or `false`. Distinguishing between Katakana and
|
||||
Hiragana characters in `quaternary` strength.
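
As an illustration of the options above, the following sketch defines a
collation filter that compares at `primary` strength and sorts embedded digits
numerically. The index name `collation_sample` and the filter and analyzer
names are hypothetical; only the option names are taken from the list above.

[source,json]
--------------------------------------------------
PUT collation_sample
{
  "settings": {
    "index": {
      "analysis": {
        "filter": {
          "numeric_collation": {
            "type": "icu_collation",
            "language": "en",
            "strength": "primary",
            "numeric": true
          }
        },
        "analyzer": {
          "numeric_sort": {
            "tokenizer": "keyword",
            "filter": [ "numeric_collation" ]
          }
        }
      }
    }
  }
}
--------------------------------------------------
// AUTOSENSE

As in the German phonebook example, the `keyword` tokenizer keeps the whole
value as a single token so that it can be used as a sort key.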
|
||||
|
||||
|
||||
[[analysis-icu-transform]]
|
||||
==== ICU Transform Token Filter
|
||||
|
||||
Transforms are used to process Unicode text in many different ways, such as
|
||||
case mapping, normalization, transliteration and bidirectional text handling.
|
||||
|
||||
You can define which transformation you want to apply with the `id` parameter
|
||||
(defaults to `Null`), and specify text direction with the `dir` parameter
|
||||
which accepts `forward` (default) for LTR and `reverse` for RTL. Custom
|
||||
rulesets are not yet supported.
|
||||
|
||||
For example:
|
||||
|
||||
[source,json]
|
||||
--------------------------------------------------
|
||||
PUT icu_sample
|
||||
{
|
||||
"settings": {
|
||||
"index": {
|
||||
"analysis": {
|
||||
"analyzer": {
|
||||
"latin": {
|
||||
"tokenizer": "keyword",
|
||||
"filter": [
|
||||
"myLatinTransform"
|
||||
]
|
||||
}
|
||||
},
|
||||
"filter": {
|
||||
"myLatinTransform": {
|
||||
"type": "icu_transform",
|
||||
"id": "Any-Latin; NFD; [:Nonspacing Mark:] Remove; NFC" <1>
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
GET icu_sample/_analyze?analyzer=latin
|
||||
{
|
||||
"text": "你好" <2>
|
||||
}
|
||||
|
||||
GET icu_sample/_analyze?analyzer=latin
|
||||
{
|
||||
"text": "здравствуйте" <3>
|
||||
}
|
||||
|
||||
GET icu_sample/_analyze?analyzer=latin
|
||||
{
|
||||
"text": "こんにちは" <4>
|
||||
}
|
||||
|
||||
--------------------------------------------------
|
||||
// AUTOSENSE
|
||||
|
||||
<1> This transform transliterates characters to Latin, separates accents
|
||||
from their base characters, removes the accents, and then puts the
|
||||
remaining text into an unaccented form.
|
||||
|
||||
<2> Returns `ni hao`.
|
||||
<3> Returns `zdravstvujte`.
|
||||
<4> Returns `kon'nichiha`.
|
||||
|
||||
For more documentation, please see the http://userguide.icu-project.org/transforms/general[ICU Transform user guide].
|
|
@ -0,0 +1,454 @@
|
|||
[[analysis-kuromoji]]
|
||||
=== Japanese (kuromoji) Analysis Plugin
|
||||
|
||||
The Japanese (kuromoji) Analysis plugin integrates the Lucene kuromoji analysis
module into Elasticsearch.
|
||||
|
||||
[[analysis-kuromoji-install]]
|
||||
[float]
|
||||
==== Installation
|
||||
|
||||
This plugin can be installed using the plugin manager:
|
||||
|
||||
[source,sh]
|
||||
----------------------------------------------------------------
|
||||
sudo bin/plugin install analysis-kuromoji
|
||||
----------------------------------------------------------------
|
||||
|
||||
The plugin must be installed on every node in the cluster, and each node must
|
||||
be restarted after installation.
|
||||
|
||||
[[analysis-kuromoji-remove]]
|
||||
[float]
|
||||
==== Removal
|
||||
|
||||
The plugin can be removed with the following command:
|
||||
|
||||
[source,sh]
|
||||
----------------------------------------------------------------
|
||||
sudo bin/plugin remove analysis-kuromoji
|
||||
----------------------------------------------------------------
|
||||
|
||||
The node must be stopped before removing the plugin.
|
||||
|
||||
[[analysis-kuromoji-analyzer]]
|
||||
==== `kuromoji` analyzer
|
||||
|
||||
The `kuromoji` analyzer consists of the following tokenizer and token filters:
|
||||
|
||||
* <<analysis-kuromoji-tokenizer,`kuromoji_tokenizer`>>
|
||||
* <<analysis-kuromoji-baseform,`kuromoji_baseform`>> token filter
|
||||
* <<analysis-kuromoji-speech,`kuromoji_part_of_speech`>> token filter
|
||||
* {ref}/analysis-cjk-width-tokenfilter.html[`cjk_width`] token filter
|
||||
* <<analysis-kuromoji-stop,`ja_stop`>> token filter
|
||||
* <<analysis-kuromoji-stemmer,`kuromoji_stemmer`>> token filter
|
||||
* {ref}/analysis-lowercase-tokenfilter.html[`lowercase`] token filter
|
||||
|
||||
It supports the `mode` and `user_dictionary` settings from
|
||||
<<analysis-kuromoji-tokenizer,`kuromoji_tokenizer`>>.
|
||||
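
A minimal sketch of configuring the `kuromoji` analyzer with a non-default
`mode` might look like the following. The analyzer name `my_kuromoji` is
hypothetical, and the `search` mode is chosen only for illustration.

[source,json]
--------------------------------------------------
PUT kuromoji_sample
{
  "settings": {
    "index": {
      "analysis": {
        "analyzer": {
          "my_kuromoji": {
            "type": "kuromoji",
            "mode": "search"
          }
        }
      }
    }
  }
}
--------------------------------------------------
// AUTOSENSE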
|
||||
[[analysis-kuromoji-charfilter]]
|
||||
==== `kuromoji_iteration_mark` character filter
|
||||
|
||||
The `kuromoji_iteration_mark` character filter normalizes Japanese horizontal iteration marks
|
||||
(_odoriji_) to their expanded form. It accepts the following settings:
|
||||
|
||||
`normalize_kanji`::
|
||||
|
||||
Indicates whether kanji iteration marks should be normalized. Defaults to `true`.
|
||||
|
||||
`normalize_kana`::
|
||||
|
||||
Indicates whether kana iteration marks should be normalized. Defaults to `true`.
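
As a sketch of how these settings might be combined, the following defines a
character filter that normalizes kana iteration marks only and attaches it to
a custom analyzer. The names `kana_iteration_mark` and `my_analyzer` are
hypothetical.

[source,json]
--------------------------------------------------
PUT kuromoji_sample
{
  "settings": {
    "index": {
      "analysis": {
        "char_filter": {
          "kana_iteration_mark": {
            "type": "kuromoji_iteration_mark",
            "normalize_kanji": false,
            "normalize_kana": true
          }
        },
        "analyzer": {
          "my_analyzer": {
            "tokenizer": "kuromoji_tokenizer",
            "char_filter": [ "kana_iteration_mark" ]
          }
        }
      }
    }
  }
}
--------------------------------------------------
// AUTOSENSE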
|
||||
|
||||
|
||||
[[analysis-kuromoji-tokenizer]]
|
||||
==== `kuromoji_tokenizer`
|
||||
|
||||
The `kuromoji_tokenizer` accepts the following settings:
|
||||
|
||||
`mode`::
|
||||
+
|
||||
--
|
||||
|
||||
The tokenization mode determines how the tokenizer handles compound and
|
||||
unknown words. It can be set to:
|
||||
|
||||
`normal`::
|
||||
|
||||
Normal segmentation, no decomposition for compounds. Example output:
|
||||
|
||||
関西国際空港
|
||||
アブラカダブラ
|
||||
|
||||
`search`::
|
||||
|
||||
Segmentation geared towards search. This includes a decompounding process
|
||||
for long nouns, also including the full compound token as a synonym.
|
||||
Example output:
|
||||
|
||||
関西, 関西国際空港, 国際, 空港
|
||||
アブラカダブラ
|
||||
|
||||
`extended`::
|
||||
|
||||
Extended mode outputs unigrams for unknown words. Example output:
|
||||
|
||||
関西, 国際, 空港
|
||||
ア, ブ, ラ, カ, ダ, ブ, ラ
|
||||
--
|
||||
|
||||
`discard_punctuation`::
|
||||
|
||||
Whether punctuation should be discarded from the output. Defaults to `true`.
|
||||
|
||||
`user_dictionary`::
|
||||
+
|
||||
--
|
||||
The Kuromoji tokenizer uses the MeCab-IPADIC dictionary by default. A `user_dictionary`
|
||||
may be appended to the default dictionary. The dictionary should have the following CSV format:
|
||||
|
||||
[source,csv]
|
||||
-----------------------
|
||||
<text>,<token 1> ... <token n>,<reading 1> ... <reading n>,<part-of-speech tag>
|
||||
-----------------------
|
||||
--
|
||||
|
||||
As a demonstration of how the user dictionary can be used, save the following
|
||||
dictionary to `$ES_HOME/config/userdict_ja.txt`:
|
||||
|
||||
[source,csv]
|
||||
-----------------------
|
||||
東京スカイツリー,東京 スカイツリー,トウキョウ スカイツリー,カスタム名詞
|
||||
-----------------------
|
||||
|
||||
Then create an analyzer as follows:
|
||||
|
||||
[source,json]
|
||||
--------------------------------------------------
|
||||
PUT kuromoji_sample
|
||||
{
|
||||
"settings": {
|
||||
"index": {
|
||||
"analysis": {
|
||||
"tokenizer": {
|
||||
"kuromoji_user_dict": {
|
||||
"type": "kuromoji_tokenizer",
|
||||
"mode": "extended",
|
||||
"discard_punctuation": "false",
|
||||
"user_dictionary": "userdict_ja.txt"
|
||||
}
|
||||
},
|
||||
"analyzer": {
|
||||
"my_analyzer": {
|
||||
"type": "custom",
|
||||
"tokenizer": "kuromoji_user_dict"
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
POST kuromoji_sample/_analyze?analyzer=my_analyzer&text=東京スカイツリー
|
||||
--------------------------------------------------
|
||||
// AUTOSENSE
|
||||
|
||||
The above `analyze` request returns the following:
|
||||
|
||||
[source,json]
|
||||
--------------------------------------------------
|
||||
# Result
|
||||
{
|
||||
"tokens" : [ {
|
||||
"token" : "東京",
|
||||
"start_offset" : 0,
|
||||
"end_offset" : 2,
|
||||
"type" : "word",
|
||||
"position" : 1
|
||||
}, {
|
||||
"token" : "スカイツリー",
|
||||
"start_offset" : 2,
|
||||
"end_offset" : 8,
|
||||
"type" : "word",
|
||||
"position" : 2
|
||||
} ]
|
||||
}
|
||||
--------------------------------------------------
|
||||
|
||||
[[analysis-kuromoji-baseform]]
|
||||
==== `kuromoji_baseform` token filter
|
||||
|
||||
The `kuromoji_baseform` token filter replaces terms with their
|
||||
BaseFormAttribute. This acts as a lemmatizer for verbs and adjectives.
|
||||
|
||||
[source,json]
|
||||
--------------------------------------------------
|
||||
PUT kuromoji_sample
|
||||
{
|
||||
"settings": {
|
||||
"index": {
|
||||
"analysis": {
|
||||
"analyzer": {
|
||||
"my_analyzer": {
|
||||
"tokenizer": "kuromoji_tokenizer",
|
||||
"filter": [
|
||||
"kuromoji_baseform"
|
||||
]
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
POST kuromoji_sample/_analyze?analyzer=my_analyzer&text=飲み
|
||||
--------------------------------------------------
|
||||
// AUTOSENSE
|
||||
|
||||
[source,text]
|
||||
--------------------------------------------------
|
||||
# Result
|
||||
{
|
||||
"tokens" : [ {
|
||||
"token" : "飲む",
|
||||
"start_offset" : 0,
|
||||
"end_offset" : 2,
|
||||
"type" : "word",
|
||||
"position" : 1
|
||||
} ]
|
||||
}
|
||||
--------------------------------------------------
|
||||
|
||||
[[analysis-kuromoji-speech]]
|
||||
==== `kuromoji_part_of_speech` token filter
|
||||
|
||||
The `kuromoji_part_of_speech` token filter removes tokens that match a set of
|
||||
part-of-speech tags. It accepts the following setting:
|
||||
|
||||
`stoptags`::
|
||||
|
||||
An array of part-of-speech tags that should be removed. It defaults to the
|
||||
`stoptags.txt` file embedded in the `lucene-analyzer-kuromoji.jar`.
|
||||
|
||||
[source,json]
|
||||
--------------------------------------------------
|
||||
PUT kuromoji_sample
|
||||
{
|
||||
"settings": {
|
||||
"index": {
|
||||
"analysis": {
|
||||
"analyzer": {
|
||||
"my_analyzer": {
|
||||
"tokenizer": "kuromoji_tokenizer",
|
||||
"filter": [
|
||||
"my_posfilter"
|
||||
]
|
||||
}
|
||||
},
|
||||
"filter": {
|
||||
"my_posfilter": {
|
||||
"type": "kuromoji_part_of_speech",
|
||||
"stoptags": [
|
||||
"助詞-格助詞-一般",
|
||||
"助詞-終助詞"
|
||||
]
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
POST kuromoji_sample/_analyze?analyzer=my_analyzer&text=寿司がおいしいね
|
||||
|
||||
--------------------------------------------------
|
||||
// AUTOSENSE
|
||||
|
||||
[source,text]
|
||||
--------------------------------------------------
|
||||
# Result
|
||||
{
|
||||
"tokens" : [ {
|
||||
"token" : "寿司",
|
||||
"start_offset" : 0,
|
||||
"end_offset" : 2,
|
||||
"type" : "word",
|
||||
"position" : 1
|
||||
}, {
|
||||
"token" : "おいしい",
|
||||
"start_offset" : 3,
|
||||
"end_offset" : 7,
|
||||
"type" : "word",
|
||||
"position" : 3
|
||||
} ]
|
||||
}
|
||||
--------------------------------------------------
|
||||
|
||||
[[analysis-kuromoji-readingform]]
|
||||
==== `kuromoji_readingform` token filter
|
||||
|
||||
The `kuromoji_readingform` token filter replaces the token with its reading
|
||||
form in either katakana or romaji. It accepts the following setting:
|
||||
|
||||
`use_romaji`::
|
||||
|
||||
Whether romaji reading form should be output instead of katakana. Defaults to `false`.
|
||||
|
||||
When using the pre-defined `kuromoji_readingform` filter, `use_romaji` is set
|
||||
to `true`. The default when defining a custom `kuromoji_readingform`, however,
|
||||
is `false`. The only reason to use the custom form is if you need the
|
||||
katakana reading form:
|
||||
|
||||
[source,json]
|
||||
--------------------------------------------------
|
||||
PUT kuromoji_sample
|
||||
{
|
||||
"settings": {
|
||||
"index":{
|
||||
"analysis":{
|
||||
"analyzer" : {
|
||||
"romaji_analyzer" : {
|
||||
"tokenizer" : "kuromoji_tokenizer",
|
||||
"filter" : ["romaji_readingform"]
|
||||
},
|
||||
"katakana_analyzer" : {
|
||||
"tokenizer" : "kuromoji_tokenizer",
|
||||
"filter" : ["katakana_readingform"]
|
||||
}
|
||||
},
|
||||
"filter" : {
|
||||
"romaji_readingform" : {
|
||||
"type" : "kuromoji_readingform",
|
||||
"use_romaji" : true
|
||||
},
|
||||
"katakana_readingform" : {
|
||||
"type" : "kuromoji_readingform",
|
||||
"use_romaji" : false
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
POST kuromoji_sample/_analyze?analyzer=katakana_analyzer&text=寿司 <1>
|
||||
|
||||
POST kuromoji_sample/_analyze?analyzer=romaji_analyzer&text=寿司 <2>
|
||||
|
||||
--------------------------------------------------
|
||||
// AUTOSENSE
|
||||
|
||||
<1> Returns `スシ`.
|
||||
<2> Returns `sushi`.
|
||||
|
||||
[[analysis-kuromoji-stemmer]]
|
||||
==== `kuromoji_stemmer` token filter
|
||||
|
||||
The `kuromoji_stemmer` token filter normalizes common katakana spelling
|
||||
variations ending in a long sound character by removing this character
|
||||
(U+30FC). Only full-width katakana characters are supported.
|
||||
|
||||
This token filter accepts the following setting:
|
||||
|
||||
`minimum_length`::
|
||||
|
||||
Katakana words shorter than the `minimum_length` are not stemmed (default
|
||||
is `4`).
|
||||
|
||||
|
||||
[source,json]
|
||||
--------------------------------------------------
|
||||
PUT kuromoji_sample
|
||||
{
|
||||
"settings": {
|
||||
"index": {
|
||||
"analysis": {
|
||||
"analyzer": {
|
||||
"my_analyzer": {
|
||||
"tokenizer": "kuromoji_tokenizer",
|
||||
"filter": [
|
||||
"my_katakana_stemmer"
|
||||
]
|
||||
}
|
||||
},
|
||||
"filter": {
|
||||
"my_katakana_stemmer": {
|
||||
"type": "kuromoji_stemmer",
|
||||
"minimum_length": 4
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
POST kuromoji_sample/_analyze?analyzer=my_analyzer&text=コピー <1>
|
||||
|
||||
POST kuromoji_sample/_analyze?analyzer=my_analyzer&text=サーバー <2>
|
||||
|
||||
--------------------------------------------------
|
||||
// AUTOSENSE
|
||||
|
||||
<1> Returns `コピー`.
|
||||
<2> Returns `サーバ`.
|
||||
|
||||
|
||||
[[analysis-kuromoji-stop]]
|
||||
==== `ja_stop` token filter
|
||||
|
||||
The `ja_stop` token filter filters out Japanese stopwords (`_japanese_`), and
|
||||
any other custom stopwords specified by the user. This filter only supports
|
||||
the predefined `_japanese_` stopwords list. If you want to use a different
|
||||
predefined list, then use the
|
||||
{ref}/analysis-stop-tokenfilter.html[`stop` token filter] instead.
|
||||
|
||||
[source,json]
|
||||
--------------------------------------------------
|
||||
PUT kuromoji_sample
|
||||
{
|
||||
"settings": {
|
||||
"index": {
|
||||
"analysis": {
|
||||
"analyzer": {
|
||||
"analyzer_with_ja_stop": {
|
||||
"tokenizer": "kuromoji_tokenizer",
|
||||
"filter": [
|
||||
"ja_stop"
|
||||
]
|
||||
}
|
||||
},
|
||||
"filter": {
|
||||
"ja_stop": {
|
||||
"type": "ja_stop",
|
||||
"stopwords": [
|
||||
"_japanese_",
|
||||
"ストップ"
|
||||
]
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
POST kuromoji_sample/_analyze?analyzer=analyzer_with_ja_stop&text=ストップは消える
|
||||
--------------------------------------------------
|
||||
// AUTOSENSE
|
||||
|
||||
The above request returns:
|
||||
|
||||
[source,text]
|
||||
--------------------------------------------------
|
||||
# Result
|
||||
{
|
||||
"tokens" : [ {
|
||||
"token" : "消える",
|
||||
"start_offset" : 5,
|
||||
"end_offset" : 8,
|
||||
"type" : "word",
|
||||
"position" : 3
|
||||
} ]
|
||||
}
|
||||
--------------------------------------------------
|
||||
|
|
@ -0,0 +1,120 @@
|
|||
[[analysis-phonetic]]
|
||||
=== Phonetic Analysis Plugin
|
||||
|
||||
The Phonetic Analysis plugin provides token filters which convert tokens to
|
||||
their phonetic representation using Soundex, Metaphone, and a variety of other
|
||||
algorithms.
|
||||
|
||||
[[analysis-phonetic-install]]
|
||||
[float]
|
||||
==== Installation
|
||||
|
||||
This plugin can be installed using the plugin manager:
|
||||
|
||||
[source,sh]
|
||||
----------------------------------------------------------------
|
||||
sudo bin/plugin install analysis-phonetic
|
||||
----------------------------------------------------------------
|
||||
|
||||
The plugin must be installed on every node in the cluster, and each node must
|
||||
be restarted after installation.
|
||||
|
||||
[[analysis-phonetic-remove]]
|
||||
[float]
|
||||
==== Removal
|
||||
|
||||
The plugin can be removed with the following command:
|
||||
|
||||
[source,sh]
|
||||
----------------------------------------------------------------
|
||||
sudo bin/plugin remove analysis-phonetic
|
||||
----------------------------------------------------------------
|
||||
|
||||
The node must be stopped before removing the plugin.
|
||||
|
||||
[[analysis-phonetic-token-filter]]
|
||||
==== `phonetic` token filter
|
||||
|
||||
The `phonetic` token filter takes the following settings:
|
||||
|
||||
`encoder`::
|
||||
|
||||
Which phonetic encoder to use. Accepts `metaphone` (default),
|
||||
`doublemetaphone`, `soundex`, `refinedsoundex`, `caverphone1`,
|
||||
`caverphone2`, `cologne`, `nysiis`, `koelnerphonetik`, `haasephonetik`,
|
||||
`beidermorse`.
|
||||
|
||||
`replace`::
|
||||
|
||||
Whether or not the original token should be replaced by the phonetic
|
||||
token. Accepts `true` (default) and `false`. Not supported by
|
||||
`beidermorse` encoding.
|
||||
|
||||
[source,json]
|
||||
--------------------------------------------------
|
||||
PUT phonetic_sample
|
||||
{
|
||||
"settings": {
|
||||
"index": {
|
||||
"analysis": {
|
||||
"analyzer": {
|
||||
"my_analyzer": {
|
||||
"tokenizer": "standard",
|
||||
"filter": [
|
||||
"standard",
|
||||
"lowercase",
|
||||
"my_metaphone"
|
||||
]
|
||||
}
|
||||
},
|
||||
"filter": {
|
||||
"my_metaphone": {
|
||||
"type": "phonetic",
|
||||
"encoder": "metaphone",
|
||||
"replace": false
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
POST phonetic_sample/_analyze?analyzer=my_analyzer&text=Joe Bloggs <1>
|
||||
--------------------------------------------------
|
||||
// AUTOSENSE
|
||||
|
||||
<1> Returns: `J`, `joe`, `BLKS`, `bloggs`
|
||||
|
||||
|
||||
[float]
|
||||
===== Double metaphone settings
|
||||
|
||||
If the `double_metaphone` encoder is used, then this additional setting is
|
||||
supported:
|
||||
|
||||
`max_code_len`::
|
||||
|
||||
The maximum length of the emitted metaphone token. Defaults to `4`.
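
For instance, a filter using the double metaphone encoder with a longer code
length could be sketched as follows. The filter name `my_double_metaphone`,
the analyzer name, and the value `6` are illustrative assumptions.

[source,json]
--------------------------------------------------
PUT phonetic_sample
{
  "settings": {
    "index": {
      "analysis": {
        "analyzer": {
          "my_analyzer": {
            "tokenizer": "standard",
            "filter": [ "lowercase", "my_double_metaphone" ]
          }
        },
        "filter": {
          "my_double_metaphone": {
            "type": "phonetic",
            "encoder": "double_metaphone",
            "max_code_len": 6
          }
        }
      }
    }
  }
}
--------------------------------------------------
// AUTOSENSE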
|
||||
|
||||
[float]
|
||||
===== Beider Morse settings
|
||||
|
||||
If the `beider_morse` encoder is used, then these additional settings are
|
||||
supported:
|
||||
|
||||
`rule_type`::
|
||||
|
||||
Whether matching should be `exact` or `approx` (default).
|
||||
|
||||
`name_type`::
|
||||
|
||||
Whether names are `ashkenazi`, `sephardic`, or `generic` (default).
|
||||
|
||||
`languageset`::
|
||||
|
||||
An array of languages to check. If not specified, then the language will
|
||||
be guessed. Accepts: `any`, `common`, `cyrillic`, `english`, `french`,
|
||||
`german`, `hebrew`, `hungarian`, `polish`, `romanian`, `russian`,
|
||||
`spanish`.
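
The Beider Morse settings above might be combined as in the following sketch.
The filter and analyzer names and the chosen values are hypothetical; they are
picked only to show where each setting goes.

[source,json]
--------------------------------------------------
PUT phonetic_sample
{
  "settings": {
    "index": {
      "analysis": {
        "analyzer": {
          "my_name_analyzer": {
            "tokenizer": "standard",
            "filter": [ "lowercase", "my_beider_morse" ]
          }
        },
        "filter": {
          "my_beider_morse": {
            "type": "phonetic",
            "encoder": "beider_morse",
            "rule_type": "exact",
            "name_type": "ashkenazi",
            "languageset": [ "german", "polish" ]
          }
        }
      }
    }
  }
}
--------------------------------------------------
// AUTOSENSE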
|
||||
|
||||
|
|
@ -0,0 +1,48 @@
|
|||
[[analysis-smartcn]]
|
||||
=== Smart Chinese Analysis Plugin
|
||||
|
||||
The Smart Chinese Analysis plugin integrates Lucene's Smart Chinese analysis
|
||||
module into Elasticsearch.
|
||||
|
||||
It provides an analyzer for Chinese or mixed Chinese-English text. This
|
||||
analyzer uses probabilistic knowledge to find the optimal word segmentation
|
||||
for Simplified Chinese text. The text is first broken into sentences, then
|
||||
each sentence is segmented into words.
|
||||
|
||||
|
||||
[[analysis-smartcn-install]]
|
||||
[float]
|
||||
==== Installation
|
||||
|
||||
This plugin can be installed using the plugin manager:
|
||||
|
||||
[source,sh]
|
||||
----------------------------------------------------------------
|
||||
sudo bin/plugin install analysis-smartcn
|
||||
----------------------------------------------------------------
|
||||
|
||||
The plugin must be installed on every node in the cluster, and each node must
|
||||
be restarted after installation.
|
||||
|
||||
[[analysis-smartcn-remove]]
|
||||
[float]
|
||||
==== Removal
|
||||
|
||||
The plugin can be removed with the following command:
|
||||
|
||||
[source,sh]
|
||||
----------------------------------------------------------------
|
||||
sudo bin/plugin remove analysis-smartcn
|
||||
----------------------------------------------------------------
|
||||
|
||||
The node must be stopped before removing the plugin.
|
||||
|
||||
[[analysis-smartcn-tokenizer]]
|
||||
[float]
|
||||
==== `smartcn` analyzer and tokenizer
|
||||
|
||||
The plugin provides the `smartcn` analyzer and `smartcn_tokenizer` tokenizer,
|
||||
which are not configurable.
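
Although neither component takes settings, the tokenizer can still be used as
a building block in a custom analyzer. The following is a sketch only; the
index name `smartcn_sample` and the analyzer name `my_smartcn` are
hypothetical.

[source,json]
--------------------------------------------------
PUT smartcn_sample
{
  "settings": {
    "index": {
      "analysis": {
        "analyzer": {
          "my_smartcn": {
            "tokenizer": "smartcn_tokenizer",
            "filter": [ "lowercase" ]
          }
        }
      }
    }
  }
}
--------------------------------------------------
// AUTOSENSE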
|
||||
|
||||
NOTE: The `smartcn_word` token filter and `smartcn_sentence` have been deprecated.
|
||||
|
|
@ -0,0 +1,43 @@
|
|||
[[analysis-stempel]]
|
||||
=== Stempel Polish Analysis Plugin
|
||||
|
||||
The Stempel Analysis plugin integrates Lucene's Stempel analysis
|
||||
module for Polish into Elasticsearch.
|
||||
|
||||
It provides high quality stemming for Polish, based on the
|
||||
http://www.egothor.org/[Egothor project].
|
||||
|
||||
[[analysis-stempel-install]]
|
||||
[float]
|
||||
==== Installation
|
||||
|
||||
This plugin can be installed using the plugin manager:
|
||||
|
||||
[source,sh]
|
||||
----------------------------------------------------------------
|
||||
sudo bin/plugin install analysis-stempel
|
||||
----------------------------------------------------------------
|
||||
|
||||
The plugin must be installed on every node in the cluster, and each node must
|
||||
be restarted after installation.
|
||||
|
||||
[[analysis-stempel-remove]]
|
||||
[float]
|
||||
==== Removal
|
||||
|
||||
The plugin can be removed with the following command:
|
||||
|
||||
[source,sh]
|
||||
----------------------------------------------------------------
|
||||
sudo bin/plugin remove analysis-stempel
|
||||
----------------------------------------------------------------
|
||||
|
||||
The node must be stopped before removing the plugin.
|
||||
|
||||
[[analysis-stempel-tokenizer]]
|
||||
[float]
|
||||
==== `polish` analyzer and `polish_stem` token filter
|
||||
|
||||
The plugin provides the `polish` analyzer and `polish_stem` token filter,
|
||||
which are not configurable.
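
The `polish_stem` token filter can also be combined with other components in a
custom analyzer. A minimal sketch, assuming a hypothetical index
`stempel_sample` and analyzer name `my_polish`:

[source,json]
--------------------------------------------------
PUT stempel_sample
{
  "settings": {
    "index": {
      "analysis": {
        "analyzer": {
          "my_polish": {
            "tokenizer": "standard",
            "filter": [ "lowercase", "polish_stem" ]
          }
        }
      }
    }
  }
}
--------------------------------------------------
// AUTOSENSE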
|
||||
|
|
@ -0,0 +1,69 @@
|
|||
[[analysis]]
|
||||
== Analysis Plugins
|
||||
|
||||
Analysis plugins extend Elasticsearch by adding new analyzers, tokenizers,
|
||||
token filters, or character filters.
|
||||
|
||||
[float]
|
||||
==== Core analysis plugins
|
||||
|
||||
The core analysis plugins are:
|
||||
|
||||
<<analysis-icu,ICU>>::
|
||||
|
||||
Adds extended Unicode support using the http://site.icu-project.org/[ICU]
|
||||
libraries, including better analysis of Asian languages, Unicode
|
||||
normalization, Unicode-aware case folding, collation support, and
|
||||
transliteration.
|
||||
|
||||
<<analysis-kuromoji,Kuromoji>>::
|
||||
|
||||
Advanced analysis of Japanese using the http://www.atilika.org/[Kuromoji analyzer].
|
||||
|
||||
<<analysis-phonetic,Phonetic>>::
|
||||
|
||||
Analyzes tokens into their phonetic equivalent using Soundex, Metaphone,
|
||||
Caverphone, and other codecs.
|
||||
|
||||
<<analysis-smartcn,SmartCN>>::
|
||||
|
||||
An analyzer for Chinese or mixed Chinese-English text. This analyzer uses
|
||||
probabilistic knowledge to find the optimal word segmentation for Simplified
|
||||
Chinese text. The text is first broken into sentences, then each sentence is
|
||||
segmented into words.
|
||||
|
||||
<<analysis-stempel,Stempel>>::
|
||||
|
||||
Provides high quality stemming for Polish.
|
||||
|
||||
[float]
|
||||
==== Community contributed analysis plugins
|
||||
|
||||
A number of analysis plugins have been contributed by our community:
|
||||
|
||||
* https://github.com/yakaz/elasticsearch-analysis-combo/[Combo Analysis Plugin] (by Olivier Favre, Yakaz)
|
||||
* https://github.com/synhershko/elasticsearch-analysis-hebrew[Hebrew Analysis Plugin] (by Itamar Syn-Hershko)
|
||||
* https://github.com/medcl/elasticsearch-analysis-ik[IK Analysis Plugin] (by Medcl)
|
||||
* https://github.com/medcl/elasticsearch-analysis-mmseg[Mmseg Analysis Plugin] (by Medcl)
|
||||
* https://github.com/chytreg/elasticsearch-analysis-morfologik[Morfologik (Polish) Analysis plugin] (by chytreg)
|
||||
* https://github.com/imotov/elasticsearch-analysis-morphology[Russian and English Morphological Analysis Plugin] (by Igor Motov)
|
||||
* https://github.com/medcl/elasticsearch-analysis-pinyin[Pinyin Analysis Plugin] (by Medcl)
|
||||
* https://github.com/duydo/elasticsearch-analysis-vietnamese[Vietnamese Analysis Plugin] (by Duy Do)
|
||||
|
||||
These community plugins appear to have been abandoned:
|
||||
|
||||
* https://github.com/barminator/elasticsearch-analysis-annotation[Annotation Analysis Plugin] (by Michal Samek)
|
||||
* https://github.com/medcl/elasticsearch-analysis-string2int[String2Integer Analysis Plugin] (by Medcl)
|
||||
|
||||
include::analysis-icu.asciidoc[]
|
||||
|
||||
include::analysis-kuromoji.asciidoc[]
|
||||
|
||||
include::analysis-phonetic.asciidoc[]
|
||||
|
||||
include::analysis-smartcn.asciidoc[]
|
||||
|
||||
include::analysis-stempel.asciidoc[]
|
||||
|
||||
|
||||
|
|
@ -0,0 +1,65 @@
|
|||
[[api]]
|
||||
== API Extension Plugins
|
||||
|
||||
API extension plugins add new functionality to Elasticsearch by adding new APIs or features, usually related to search or mapping.
|
||||
|
||||
[float]
|
||||
=== Core API extension plugins
|
||||
|
||||
The core API extension plugins are:
|
||||
|
||||
<<plugins-delete-by-query,Delete by Query>>::
|
||||
|
||||
The delete by query plugin adds support for deleting all of the documents
|
||||
(from one or more indices) which match the specified query. It is a
|
||||
replacement for the problematic _delete-by-query_ functionality which has been
|
||||
removed from Elasticsearch core.
|
||||
|
||||
https://github.com/elasticsearch/elasticsearch-mapper-attachments[Mapper Attachments Type plugin]::
|
||||
|
||||
Integrates http://lucene.apache.org/tika/[Apache Tika] to provide a new field
|
||||
type `attachment` to allow indexing of documents such as PDFs and Microsoft
|
||||
Word.
|
||||
|
||||
[float]
|
||||
=== Community contributed API extension plugins
|
||||
|
||||
A number of plugins have been contributed by our community:
|
||||
|
||||
* https://github.com/carrot2/elasticsearch-carrot2[carrot2 Plugin]:
|
||||
Results clustering with http://project.carrot2.org/[carrot2] (by Dawid Weiss)
|
||||
|
||||
* https://github.com/wikimedia/search-extra[Elasticsearch Trigram Accelerated Regular Expression Filter]:
|
||||
(by Wikimedia Foundation/Nik Everett)
|
||||
|
||||
* https://github.com/kzwang/elasticsearch-image[Elasticsearch Image Plugin]:
|
||||
Uses https://code.google.com/p/lire/[Lire (Lucene Image Retrieval)] to allow users
|
||||
to index images and search for similar images (by Kevin Wang)
|
||||
|
||||
* https://github.com/wikimedia/search-highlighter[Elasticsearch Experimental Highlighter]:
|
||||
(by Wikimedia Foundation/Nik Everett)
|
||||
|
||||
* https://github.com/YannBrrd/elasticsearch-entity-resolution[Entity Resolution Plugin]:
|
||||
Uses http://github.com/larsga/Duke[Duke] for duplicate detection (by Yann Barraud)
|
||||
|
||||
* https://github.com/NLPchina/elasticsearch-sql/[SQL language Plugin]:
|
||||
Allows Elasticsearch to be queried with SQL (by nlpcn)
|
||||
|
||||
* https://github.com/codelibs/elasticsearch-taste[Elasticsearch Taste Plugin]:
|
||||
Mahout Taste-based Collaborative Filtering implementation (by CodeLibs Project)
|
||||
|
||||
* https://github.com/hadashiA/elasticsearch-flavor[Elasticsearch Flavor Plugin]:
Collaborative filtering using http://mahout.apache.org/[Mahout] (by hadashiA)
|
||||
|
||||
These community plugins appear to have been abandoned:
|
||||
|
||||
* https://github.com/derryx/elasticsearch-changes-plugin[Elasticsearch Changes Plugin] (by Thomas Peuss)
|
||||
|
||||
* https://github.com/mattweber/elasticsearch-mocksolrplugin[Elasticsearch Mock Solr Plugin] (by Matt Weber)
|
||||
|
||||
* http://siren.solutions/siren/downloads/[Elasticsearch SIREn Plugin]: Nested data search (by SIREn Solutions)
|
||||
|
||||
* https://github.com/endgameinc/elasticsearch-term-plugin[Terms Component Plugin] (by Endgame Inc.)
|
||||
|
||||
|
||||
include::delete-by-query.asciidoc[]
|
|
@ -0,0 +1,62 @@
|
|||
[[plugin-authors]]
|
||||
== Help for plugin authors
|
||||
|
||||
The Elasticsearch repository contains examples of:
|
||||
|
||||
* a https://github.com/elastic/elasticsearch/tree/master/plugins/site-example[site plugin]
|
||||
for serving static HTML, JavaScript, and CSS.
|
||||
* a https://github.com/elastic/elasticsearch/tree/master/plugins/jvm-example[Java plugin]
|
||||
which contains Java code.
|
||||
|
||||
These examples provide the bare bones needed to get started. For more
|
||||
information about how to write a plugin, we recommend looking at the plugins
|
||||
listed in this documentation for inspiration.
|
||||
|
||||
[NOTE]
|
||||
.Site plugins
|
||||
====================================
|
||||
|
||||
The example site plugin mentioned above contains all of the scaffolding needed
|
||||
for integrating with Maven builds. If you don't plan on using Maven, then all
|
||||
you really need in your plugin is:
|
||||
|
||||
* The `plugin-descriptor.properties` file
|
||||
* The `_site/` directory
|
||||
* The `_site/index.html` file
|
||||
|
||||
====================================
|
||||
|
||||
[float]
|
||||
=== Plugin descriptor file
|
||||
|
||||
All plugins, be they site or Java plugins, must contain a file called
|
||||
`plugin-descriptor.properties` in the root directory. The format for this file
|
||||
is described in detail here:
|
||||
|
||||
https://github.com/elastic/elasticsearch/blob/master/dev-tools/src/main/resources/plugin-metadata/plugin-descriptor.properties[`dev-tools/src/main/resources/plugin-metadata/plugin-descriptor.properties`].
|
||||
|
||||
Either fill in this template yourself (see
|
||||
https://github.com/lmenezes/elasticsearch-kopf/blob/master/plugin-descriptor.properties[elasticsearch-kopf]
|
||||
as an example) or, if you are using Elasticsearch's Maven build system, you
|
||||
can fill in the necessary values in the `pom.xml` for your plugin. For
|
||||
instance, see
|
||||
https://github.com/elastic/elasticsearch/blob/master/plugins/site-example/pom.xml[`plugins/site-example/pom.xml`].
|
||||
|
||||
[float]
|
||||
=== Loading plugins from the classpath
|
||||
|
||||
When testing a Java plugin, it will only be auto-loaded if it is in the
|
||||
`plugins/` directory. If, instead, it is in your classpath, you can tell
|
||||
Elasticsearch to load it with the `plugin.types` setting:
|
||||
|
||||
[source,java]
|
||||
--------------------------
|
||||
settingsBuilder()
        .put("cluster.name", cluster)
        .put("path.home", getHome())
        .put("plugin.types", MyCustomPlugin.class.getName()) <1>
        .build();
|
||||
--------------------------
|
||||
<1> Tells Elasticsearch to load your plugin.
|
||||
|
||||
|
|
@ -0,0 +1,468 @@
|
|||
[[cloud-aws]]
|
||||
=== AWS Cloud Plugin
|
||||
|
||||
The Amazon Web Services (AWS) Cloud plugin uses the
|
||||
https://github.com/aws/aws-sdk-java[AWS API] for unicast discovery, and adds
|
||||
support for using S3 as a repository for
|
||||
{ref}/modules-snapshots.html[Snapshot/Restore].
|
||||
|
||||
[[cloud-aws-install]]
|
||||
[float]
|
||||
==== Installation
|
||||
|
||||
This plugin can be installed using the plugin manager:
|
||||
|
||||
[source,sh]
|
||||
----------------------------------------------------------------
|
||||
sudo bin/plugin install cloud-aws
|
||||
----------------------------------------------------------------
|
||||
|
||||
The plugin must be installed on every node in the cluster, and each node must
|
||||
be restarted after installation.
|
||||
|
||||
[[cloud-aws-remove]]
|
||||
[float]
|
||||
==== Removal
|
||||
|
||||
The plugin can be removed with the following command:
|
||||
|
||||
[source,sh]
|
||||
----------------------------------------------------------------
|
||||
sudo bin/plugin remove cloud-aws
|
||||
----------------------------------------------------------------
|
||||
|
||||
The node must be stopped before removing the plugin.
|
||||
|
||||
[[cloud-aws-usage]]
|
||||
==== Getting started with AWS
|
||||
|
||||
The plugin will default to using
|
||||
http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/iam-roles-for-amazon-ec2.html[IAM Role]
|
||||
credentials for authentication. These can be overridden by, in increasing
|
||||
order of precedence, system properties `aws.accessKeyId` and `aws.secretKey`,
|
||||
environment variables `AWS_ACCESS_KEY_ID` and `AWS_SECRET_KEY`, or the
|
||||
elasticsearch config using `cloud.aws.access_key` and `cloud.aws.secret_key`:
|
||||
|
||||
[source,yaml]
|
||||
----
|
||||
cloud:
    aws:
        access_key: AKVAIQBF2RECL7FJWGJQ
        secret_key: vExyMThREXeRMm/b/LRzEB8jWwvzQeXgjqMX+6br
|
||||
----
|
||||
|
||||
[[cloud-aws-usage-security]]
|
||||
===== Transport security
|
||||
|
||||
By default this plugin uses HTTPS for all API calls to AWS endpoints. If you wish to configure HTTP you can set
|
||||
`cloud.aws.protocol` in the elasticsearch config. You can optionally override this setting per individual service
|
||||
via: `cloud.aws.ec2.protocol` or `cloud.aws.s3.protocol`.
|
||||
|
||||
[source,yaml]
|
||||
----
|
||||
cloud:
    aws:
        protocol: https
        s3:
            protocol: http
        ec2:
            protocol: https
|
||||
----
|
||||
|
||||
In addition, a proxy can be configured with the `proxy_host` and `proxy_port` settings (note that protocol can be
|
||||
`http` or `https`):
|
||||
|
||||
[source,yaml]
|
||||
----
|
||||
cloud:
    aws:
        protocol: https
        proxy_host: proxy1.company.com
        proxy_port: 8083
|
||||
----
|
||||
|
||||
You can also set different proxies for `ec2` and `s3`:
|
||||
|
||||
[source,yaml]
|
||||
----
|
||||
cloud:
    aws:
        s3:
            proxy_host: proxy1.company.com
            proxy_port: 8083
        ec2:
            proxy_host: proxy2.company.com
            proxy_port: 8083
|
||||
----
|
||||
|
||||
[[cloud-aws-usage-region]]
|
||||
===== Region
|
||||
|
||||
The `cloud.aws.region` can be set to a region and will automatically use the relevant settings for both `ec2` and `s3`.
|
||||
The available values are:
|
||||
|
||||
* `us-east` (`us-east-1`)
|
||||
* `us-west` (`us-west-1`)
|
||||
* `us-west-1`
|
||||
* `us-west-2`
|
||||
* `ap-southeast` (`ap-southeast-1`)
|
||||
* `ap-southeast-1`
|
||||
* `ap-southeast-2`
|
||||
* `ap-northeast` (`ap-northeast-1`)
|
||||
* `eu-west` (`eu-west-1`)
|
||||
* `eu-central` (`eu-central-1`)
|
||||
* `sa-east` (`sa-east-1`)
|
||||
* `cn-north` (`cn-north-1`)
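
As an illustration, a node whose EC2 and S3 calls should both go to a single region could set the value once. This is a sketch only; `us-west-2` is simply one of the values listed above, picked for the example:

[source,yaml]
----
cloud:
    aws:
        region: us-west-2
----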
|
||||
|
||||
[[cloud-aws-usage-signer]]
|
||||
===== EC2/S3 Signer API
|
||||
|
||||
If you are using a compatible EC2 or S3 service, it might use an older API to sign requests.
You can choose the signer with `cloud.aws.signer` (or, per service, with `cloud.aws.ec2.signer` and
`cloud.aws.s3.signer`). Defaults to `AWS4SignerType`.
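
A configuration along these lines would keep the default signer globally and override it for S3 only. The placeholder stands for whichever signer type your S3-compatible provider documents; it is not a value taken from this plugin:

[source,yaml]
----
cloud:
    aws:
        signer: AWS4SignerType    # the default, shown explicitly
        s3:
            signer: <signer type required by your provider>
----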
|
||||
|
||||
[[cloud-aws-discovery]]
|
||||
==== EC2 Discovery
|
||||
|
||||
EC2 discovery uses the EC2 APIs to perform automatic discovery (similar to multicast in non-hostile multicast
|
||||
environments). Here is a simple sample configuration:
|
||||
|
||||
[source,yaml]
|
||||
----
|
||||
discovery:
    type: ec2
|
||||
----
|
||||
|
||||
EC2 discovery uses the same credentials as the rest of the AWS services provided by this plugin (`repositories`).
|
||||
See <<cloud-aws-usage>> for details.
|
||||
|
||||
The following is a list of settings (prefixed with `discovery.ec2`) that can further control the discovery:
|
||||
|
||||
`groups`::
|
||||
|
||||
Either a comma separated list or array based list of (security) groups.
|
||||
Only instances with the provided security groups will be used in the
|
||||
cluster discovery. (NOTE: You could provide either group NAME or group
|
||||
ID.)
|
||||
|
||||
`host_type`::
|
||||
|
||||
The host type to use to communicate with other instances. Can be
|
||||
one of `private_ip`, `public_ip`, `private_dns`, `public_dns`. Defaults to
|
||||
`private_ip`.
|
||||
|
||||
`availability_zones`::
|
||||
|
||||
Either a comma separated list or array based list of availability zones.
|
||||
Only instances within the provided availability zones will be used in the
|
||||
cluster discovery.
|
||||
|
||||
`any_group`::
|
||||
|
||||
If set to `false`, will require all security groups to be present for the
|
||||
instance to be used for the discovery. Defaults to `true`.
|
||||
|
||||
`ping_timeout`::
|
||||
|
||||
How long to wait for existing EC2 nodes to reply during discovery.
|
||||
Defaults to `3s`. If no unit like `ms`, `s` or `m` is specified,
|
||||
milliseconds are used.
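
Put together, the settings above might be combined as in the following sketch. The security group names and availability zones are hypothetical placeholders, and `30s` is an arbitrary illustrative timeout rather than a recommendation:

[source,yaml]
----
discovery:
    type: ec2
    ec2:
        groups: es-nodes,es-internal                 # hypothetical security group names
        any_group: false                             # an instance must belong to both groups
        host_type: private_ip                        # the default, shown explicitly
        availability_zones: us-east-1a,us-east-1b    # hypothetical zones
        ping_timeout: 30s
----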
|
||||
|
||||
[[cloud-aws-discovery-permissions]]
|
||||
===== Recommended EC2 Permissions
|
||||
|
||||
EC2 discovery requires making a call to the EC2 service. You'll want to set up
|
||||
an IAM policy to allow this. You can create a custom policy via the IAM
|
||||
Management Console. It should look similar to this.
|
||||
|
||||
[source,js]
|
||||
----
|
||||
{
  "Statement": [
    {
      "Action": [
        "ec2:DescribeInstances"
      ],
      "Effect": "Allow",
      "Resource": [
        "*"
      ]
    }
  ],
  "Version": "2012-10-17"
}
|
||||
----
|
||||
|
||||
[[cloud-aws-discovery-filtering]]
|
||||
===== Filtering by Tags
|
||||
|
||||
The ec2 discovery can also filter machines to include in the cluster based on tags (and not just groups). The settings
|
||||
to use include the `discovery.ec2.tag.` prefix. For example, setting `discovery.ec2.tag.stage` to `dev` will only
|
||||
include instances with a tag key set to `stage` and a value of `dev`. Setting several tags will require all of those tags
|
||||
to be set for the instance to be included.
|
||||
|
||||
One practical use for tag filtering is when an ec2 cluster contains many nodes that are not running elasticsearch. In
|
||||
this case (particularly with high `ping_timeout` values) there is a risk that a new node's discovery phase will end
|
||||
before it has found the cluster (which will result in it declaring itself master of a new cluster with the same name
|
||||
- highly undesirable). Tagging elasticsearch ec2 nodes and then filtering by that tag will resolve this issue.
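
For instance, the `stage`/`dev` example above corresponds to a configuration like the following. The second tag, `role: elasticsearch`, is a hypothetical tag added to show that every listed tag must match:

[source,yaml]
----
discovery:
    type: ec2
    ec2:
        tag:
            stage: dev
            role: elasticsearch    # hypothetical extra tag; an instance must carry both
----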
|
||||
|
||||
[[cloud-aws-discovery-attributes]]
|
||||
===== Automatic Node Attributes
|
||||
|
||||
Though not dependent on actually using `ec2` as discovery (but still requires the cloud aws plugin installed), the
|
||||
plugin can automatically add node attributes relating to ec2 (for example, availability zone, that can be used with
|
||||
the awareness allocation feature). In order to enable it, set `cloud.node.auto_attributes` to `true` in the settings.
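
A minimal sketch of enabling this in `elasticsearch.yml`:

[source,yaml]
----
cloud:
    node:
        auto_attributes: true
----

The names of the attributes the plugin adds are not listed here, so it is worth confirming them on a running node before referencing them in allocation awareness settings.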
|
||||
|
||||
[[cloud-aws-discovery-endpoint]]
|
||||
===== Using another EC2 endpoint
|
||||
|
||||
If you are using any EC2 api compatible service, you can set the endpoint you want to use by setting
|
||||
`cloud.aws.ec2.endpoint` to your provider's URL.
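
As a sketch, pointing the plugin at a compatible service could look like this. The hostname is a made-up placeholder, not a real endpoint:

[source,yaml]
----
cloud:
    aws:
        ec2:
            endpoint: ec2.my-private-cloud.example.com    # hypothetical endpoint
----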
|
||||
|
||||
[[cloud-aws-repository]]
|
||||
==== S3 Repository
|
||||
|
||||
The S3 repository uses S3 to store snapshots. The S3 repository can be created using the following command:
|
||||
|
||||
[source,json]
|
||||
----
|
||||
PUT _snapshot/my_s3_repository
|
||||
{
  "type": "s3",
  "settings": {
    "bucket": "my_bucket_name",
    "region": "us-west"
  }
}
|
||||
----
|
||||
// AUTOSENSE
|
||||
|
||||
The following settings are supported:
|
||||
|
||||
`bucket`::
|
||||
|
||||
The name of the bucket to be used for snapshots. (Mandatory)
|
||||
|
||||
`region`::
|
||||
|
||||
The region where the bucket is located. Defaults to US Standard.
|
||||
|
||||
`endpoint`::
|
||||
|
||||
The endpoint to the S3 API. Defaults to AWS's default S3 endpoint. Note
|
||||
that setting a region overrides the endpoint setting.
|
||||
|
||||
`protocol`::
|
||||
|
||||
The protocol to use (`http` or `https`). Defaults to value of
|
||||
`cloud.aws.protocol` or `cloud.aws.s3.protocol`.
|
||||
|
||||
`base_path`::
|
||||
|
||||
Specifies the path within the bucket to repository data. Defaults to root
|
||||
directory.
|
||||
|
||||
`access_key`::
|
||||
|
||||
The access key to use for authentication. Defaults to value of
|
||||
`cloud.aws.access_key`.
|
||||
|
||||
`secret_key`::
|
||||
|
||||
The secret key to use for authentication. Defaults to value of
|
||||
`cloud.aws.secret_key`.
|
||||
|
||||
`chunk_size`::
|
||||
|
||||
Big files can be broken down into chunks during snapshotting if needed.
|
||||
The chunk size can be specified in bytes or by using size value notation,
|
||||
e.g. `1g`, `10m`, `5k`. Defaults to `100m`.
|
||||
|
||||
`compress`::
|
||||
|
||||
When set to `true` metadata files are stored in compressed format. This
|
||||
setting doesn't affect index files that are already compressed by default.
|
||||
Defaults to `false`.
|
||||
|
||||
`server_side_encryption`::
|
||||
|
||||
When set to `true` files are encrypted on server side using AES256
|
||||
algorithm. Defaults to `false`.
|
||||
|
||||
`buffer_size`::
|
||||
|
||||
Minimum threshold below which the chunk is uploaded using a single
|
||||
request. Beyond this threshold, the S3 repository will use the
|
||||
http://docs.aws.amazon.com/AmazonS3/latest/dev/uploadobjusingmpu.html[AWS Multipart Upload API]
|
||||
to split the chunk into several parts, each of `buffer_size` length, and
|
||||
to upload each part in its own request. Note that setting a buffer
size lower than `5mb` is not allowed since it will prevent the use of the
Multipart API and may result in upload errors. Defaults to `5mb`.
|
||||
|
||||
`max_retries`::
|
||||
|
||||
Number of retries in case of S3 errors. Defaults to `3`.
|
||||
|
||||
|
||||
The S3 repositories use the same credentials as the rest of the AWS services
|
||||
provided by this plugin (`discovery`). See <<cloud-aws-usage>> for details.
|
||||
|
||||
Multiple S3 repositories can be created. If the buckets require different
|
||||
credentials, then define them as part of the repository settings.
|
||||
|
||||
[[cloud-aws-repository-permissions]]
|
||||
===== Recommended S3 Permissions
|
||||
|
||||
In order to restrict the Elasticsearch snapshot process to the minimum required resources, we recommend using Amazon
|
||||
IAM in conjunction with pre-existing S3 buckets. Here is an example policy which will allow the snapshot access to an
|
||||
S3 bucket named "snaps.example.com". This may be configured through the AWS IAM console, by creating a Custom Policy,
|
||||
and using a Policy Document similar to this (changing snaps.example.com to your bucket name).
|
||||
|
||||
[source,js]
|
||||
----
|
||||
{
  "Statement": [
    {
      "Action": [
        "s3:ListBucket",
        "s3:GetBucketLocation",
        "s3:ListBucketMultipartUploads",
        "s3:ListBucketVersions"
      ],
      "Effect": "Allow",
      "Resource": [
        "arn:aws:s3:::snaps.example.com"
      ]
    },
    {
      "Action": [
        "s3:GetObject",
        "s3:PutObject",
        "s3:DeleteObject",
        "s3:AbortMultipartUpload",
        "s3:ListMultipartUploadParts"
      ],
      "Effect": "Allow",
      "Resource": [
        "arn:aws:s3:::snaps.example.com/*"
      ]
    }
  ],
  "Version": "2012-10-17"
}
|
||||
----
|
||||
|
||||
You may further restrict the permissions by specifying a prefix within the bucket, in this example, named "foo".
|
||||
|
||||
[source,js]
|
||||
----
|
||||
{
  "Statement": [
    {
      "Action": [
        "s3:ListBucket",
        "s3:GetBucketLocation",
        "s3:ListBucketMultipartUploads",
        "s3:ListBucketVersions"
      ],
      "Condition": {
        "StringLike": {
          "s3:prefix": [
            "foo/*"
          ]
        }
      },
      "Effect": "Allow",
      "Resource": [
        "arn:aws:s3:::snaps.example.com"
      ]
    },
    {
      "Action": [
        "s3:GetObject",
        "s3:PutObject",
        "s3:DeleteObject",
        "s3:AbortMultipartUpload",
        "s3:ListMultipartUploadParts"
      ],
      "Effect": "Allow",
      "Resource": [
        "arn:aws:s3:::snaps.example.com/foo/*"
      ]
    }
  ],
  "Version": "2012-10-17"
}
|
||||
----
|
||||
|
||||
The bucket needs to exist to register a repository for snapshots. If you did not create the bucket then the repository
|
||||
registration will fail. If you want elasticsearch to create the bucket instead, you can add the permission to create a
|
||||
specific bucket like this:
|
||||
|
||||
[source,js]
|
||||
----
|
||||
{
  "Action": [
    "s3:CreateBucket"
  ],
  "Effect": "Allow",
  "Resource": [
    "arn:aws:s3:::snaps.example.com"
  ]
}
|
||||
----
|
||||
|
||||
[[cloud-aws-repository-endpoint]]
|
||||
===== Using another S3 endpoint
|
||||
|
||||
If you are using any S3 api compatible service, you can set a global endpoint by setting `cloud.aws.s3.endpoint`
|
||||
to your provider's URL. Note that this setting will be used for all S3 repositories.
|
||||
|
||||
Different `endpoint`, `region` and `protocol` settings can be set on a per-repository basis.
|
||||
See <<cloud-aws-repository>> for details.
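
The global setting described above could be sketched as follows. The hostname is a hypothetical placeholder for your provider's S3-compatible endpoint:

[source,yaml]
----
cloud:
    aws:
        s3:
            endpoint: s3.my-private-cloud.example.com    # hypothetical endpoint
            protocol: https
----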
|
||||
|
||||
[[cloud-aws-testing]]
|
||||
==== Testing AWS
|
||||
|
||||
Integration tests in this plugin require a working AWS configuration and are therefore disabled by default. Three buckets
and two IAM users have to be created. The first IAM user needs access to two buckets in different regions, and the final
bucket is exclusive to the other IAM user. To enable the tests, prepare a config file `elasticsearch.yml` with the following
content:
|
||||
|
||||
[source,yaml]
|
||||
----
|
||||
cloud:
    aws:
        access_key: AKVAIQBF2RECL7FJWGJQ
        secret_key: vExyMThREXeRMm/b/LRzEB8jWwvzQeXgjqMX+6br

repositories:
    s3:
        bucket: "bucket_name"
        region: "us-west-2"
        private-bucket:
            bucket: <bucket not accessible by default key>
            access_key: <access key>
            secret_key: <secret key>
        remote-bucket:
            bucket: <bucket in other region>
            region: <region>
        external-bucket:
            bucket: <bucket>
            access_key: <access key>
            secret_key: <secret key>
            endpoint: <endpoint>
            protocol: <protocol>
|
||||
|
||||
----
|
||||
|
||||
Replace all occurrences of `access_key`, `secret_key`, `endpoint`, `protocol`, `bucket` and `region` with your settings.
|
||||
Please note that the tests will delete all snapshot/restore related files in the specified buckets.
|
||||
|
||||
To run the tests:
|
||||
|
||||
[source,sh]
|
||||
----
|
||||
mvn -Dtests.aws=true -Dtests.config=/path/to/config/file/elasticsearch.yml clean test
|
||||
----
|
||||
|
|
@ -0,0 +1,667 @@
|
|||
[[cloud-azure]]
|
||||
=== Azure Cloud Plugin
|
||||
|
||||
The Azure Cloud plugin uses the Azure API for unicast discovery, and adds
|
||||
support for using Azure as a repository for
|
||||
{ref}/modules-snapshots.html[Snapshot/Restore].
|
||||
|
||||
[[cloud-azure-install]]
|
||||
[float]
|
||||
==== Installation
|
||||
|
||||
This plugin can be installed using the plugin manager:
|
||||
|
||||
[source,sh]
|
||||
----------------------------------------------------------------
|
||||
sudo bin/plugin install cloud-azure
|
||||
----------------------------------------------------------------
|
||||
|
||||
The plugin must be installed on every node in the cluster, and each node must
|
||||
be restarted after installation.
|
||||
|
||||
[[cloud-azure-remove]]
|
||||
[float]
|
||||
==== Removal
|
||||
|
||||
The plugin can be removed with the following command:
|
||||
|
||||
[source,sh]
|
||||
----------------------------------------------------------------
|
||||
sudo bin/plugin remove cloud-azure
|
||||
----------------------------------------------------------------
|
||||
|
||||
The node must be stopped before removing the plugin.
|
||||
|
||||
[[cloud-azure-discovery]]
|
||||
==== Azure Virtual Machine Discovery
|
||||
|
||||
Azure VM discovery uses the Azure APIs to perform automatic discovery (similar to multicast in non-hostile
|
||||
multicast environments). Here is a simple sample configuration:
|
||||
|
||||
[source,yaml]
|
||||
----
|
||||
cloud:
    azure:
        management:
            subscription.id: XXX-XXX-XXX-XXX
            cloud.service.name: es-demo-app
            keystore:
                path: /path/to/azurekeystore.pkcs12
                password: WHATEVER
                type: pkcs12

discovery:
    type: azure
|
||||
----
|
||||
|
||||
[[cloud-azure-discovery-short]]
|
||||
===== How to start (short story)
|
||||
|
||||
* Create Azure instances
|
||||
* Install Elasticsearch
|
||||
* Install Azure plugin
|
||||
* Modify `elasticsearch.yml` file
|
||||
* Start Elasticsearch
|
||||
|
||||
[[cloud-azure-discovery-settings]]
|
||||
===== Azure credential API settings
|
||||
|
||||
The following is a list of settings that can further control the credential API:
|
||||
|
||||
[horizontal]
|
||||
`cloud.azure.management.keystore.path`::
|
||||
|
||||
/path/to/keystore
|
||||
|
||||
`cloud.azure.management.keystore.type`::
|
||||
|
||||
`pkcs12`, `jceks` or `jks`. Defaults to `pkcs12`.
|
||||
|
||||
`cloud.azure.management.keystore.password`::
|
||||
|
||||
your_password for the keystore
|
||||
|
||||
`cloud.azure.management.subscription.id`::
|
||||
|
||||
your_azure_subscription_id
|
||||
|
||||
`cloud.azure.management.cloud.service.name`::
|
||||
|
||||
your_azure_cloud_service_name
|
||||
|
||||
|
||||
[[cloud-azure-discovery-settings-advanced]]
|
||||
===== Advanced settings
|
||||
|
||||
The following is a list of settings that can further control the discovery:
|
||||
|
||||
`discovery.azure.host.type`::
|
||||
|
||||
Either `public_ip` or `private_ip` (default). Azure discovery will use the
|
||||
one you set to ping other nodes.
|
||||
|
||||
`discovery.azure.endpoint.name`::
|
||||
|
||||
When using `public_ip` this setting is used to identify the endpoint name
|
||||
used to forward requests to elasticsearch (aka transport port name).
|
||||
Defaults to `elasticsearch`. In Azure management console, you could define
|
||||
an endpoint `elasticsearch` forwarding for example requests on public IP
|
||||
on port 8100 to the virtual machine on port 9300.
|
||||
|
||||
`discovery.azure.deployment.name`::
|
||||
|
||||
Deployment name if any. Defaults to the value set with
|
||||
`cloud.azure.management.cloud.service.name`.
|
||||
|
||||
`discovery.azure.deployment.slot`::
|
||||
|
||||
Either `staging` or `production` (default).
|
||||
|
||||
For example:
|
||||
|
||||
[source,yaml]
|
||||
----
|
||||
discovery:
    type: azure
    azure:
        host:
            type: private_ip
        endpoint:
            name: elasticsearch
        deployment:
            name: your_azure_cloud_service_name
            slot: production
|
||||
----
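
By contrast, a cluster whose nodes reach each other over public addresses might use a sketch like the following. It assumes an endpoint named `elasticsearch` has been defined in the Azure management console, as described for `discovery.azure.endpoint.name` above:

[source,yaml]
----
discovery:
    type: azure
    azure:
        host:
            type: public_ip
        endpoint:
            name: elasticsearch    # must match the endpoint name defined in Azure
----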
|
||||
|
||||
[[cloud-azure-discovery-long]]
|
||||
==== Setup process for Azure Discovery
|
||||
|
||||
Here we describe one strategy, which is to hide the Elasticsearch cluster from the outside.
|
||||
|
||||
With this strategy, only VMs behind the same virtual port can talk to each
|
||||
other. That means that with this mode, you can use elasticsearch unicast
|
||||
discovery to build a cluster, using the Azure API to retrieve information
|
||||
about your nodes.
|
||||
|
||||
[[cloud-azure-discovery-long-prerequisites]]
|
||||
===== Prerequisites
|
||||
|
||||
Before starting, you need to have:
|
||||
|
||||
* A http://www.windowsazure.com/[Windows Azure account]
|
||||
* OpenSSL that isn't from MacPorts. Specifically, `OpenSSL 1.0.1f 6 Jan
2014` doesn't seem to create a valid keypair for ssh, whereas
`OpenSSL 1.0.1c 10 May 2012` on Ubuntu 12.04 LTS is known to work.
|
||||
* SSH keys and certificate
|
||||
+
|
||||
--
|
||||
|
||||
You should follow http://azure.microsoft.com/en-us/documentation/articles/linux-use-ssh-key/[this guide] to learn
|
||||
how to create or use existing SSH keys. If you have already done so, you can skip the following.
|
||||
|
||||
Here is a description on how to generate SSH keys using `openssl`:
|
||||
|
||||
[source,sh]
|
||||
----
|
||||
# You may want to use another dir than /tmp
|
||||
cd /tmp
|
||||
openssl req -x509 -nodes -days 365 -newkey rsa:2048 -keyout azure-private.key -out azure-certificate.pem
|
||||
chmod 600 azure-private.key azure-certificate.pem
|
||||
openssl x509 -outform der -in azure-certificate.pem -out azure-certificate.cer
|
||||
----
|
||||
|
||||
Generate a keystore, which the plugin will use to authenticate all Azure API calls
with a certificate.
|
||||
|
||||
[source,sh]
|
||||
----
|
||||
# Generate a keystore (azurekeystore.pkcs12)
|
||||
# Transform private key to PEM format
|
||||
openssl pkcs8 -topk8 -nocrypt -in azure-private.key -inform PEM -out azure-pk.pem -outform PEM
|
||||
# Transform certificate to PEM format
|
||||
openssl x509 -inform der -in azure-certificate.cer -out azure-cert.pem
|
||||
cat azure-cert.pem azure-pk.pem > azure.pem.txt
|
||||
# You MUST enter a password!
|
||||
openssl pkcs12 -export -in azure.pem.txt -out azurekeystore.pkcs12 -name azure -noiter -nomaciter
|
||||
----
|
||||
|
||||
Upload the `azure-certificate.cer` file both to the elasticsearch Cloud Service (under `Manage Certificates`),
|
||||
and under `Settings -> Manage Certificates`.
|
||||
|
||||
IMPORTANT: When prompted for a password, you need to enter a non-empty one.
|
||||
|
||||
See this http://www.windowsazure.com/en-us/manage/linux/how-to-guides/ssh-into-linux/[guide] for
|
||||
more details about how to create keys for Azure.
|
||||
|
||||
Once done, you need to upload your certificate in Azure:
|
||||
|
||||
* Go to the https://account.windowsazure.com/[management console].
|
||||
* Sign in using your account.
|
||||
* Click on `Portal`.
|
||||
* Go to Settings (bottom of the left list)
|
||||
* On the bottom bar, click on `Upload` and upload your `azure-certificate.cer` file.
|
||||
|
||||
You may want to use
|
||||
http://www.windowsazure.com/en-us/develop/nodejs/how-to-guides/command-line-tools/[Windows Azure Command-Line Tool]:
|
||||
|
||||
--
|
||||
|
||||
* Install https://github.com/joyent/node/wiki/Installing-Node.js-via-package-manager[NodeJS], for example using
|
||||
homebrew on MacOS X:
|
||||
+
|
||||
[source,sh]
|
||||
----
|
||||
brew install node
|
||||
----
|
||||
|
||||
* Install Azure tools
|
||||
+
|
||||
[source,sh]
|
||||
----
|
||||
sudo npm install azure-cli -g
|
||||
----
|
||||
|
||||
* Download and import your azure settings:
|
||||
+
|
||||
[source,sh]
|
||||
----
|
||||
# This will open a browser and will download a .publishsettings file
|
||||
azure account download
|
||||
|
||||
# Import this file (we have downloaded it to /tmp)
|
||||
# Note, it will create needed files in ~/.azure. You can remove azure.publishsettings when done.
|
||||
azure account import /tmp/azure.publishsettings
|
||||
----
|
||||
|
||||
[[cloud-azure-discovery-long-instance]]
|
||||
===== Creating your first instance
|
||||
|
||||
You need to have a storage account available. Check http://www.windowsazure.com/en-us/develop/net/how-to-guides/blob-storage/#create-account[Azure Blob Storage documentation]
|
||||
for more information.
|
||||
|
||||
You will need to choose the operating system you want to run on. To get a list of official available images, run:
|
||||
|
||||
[source,sh]
|
||||
----
|
||||
azure vm image list
|
||||
----
|
||||
|
||||
Let's say we are going to deploy an Ubuntu image on an extra small instance in West Europe:
|
||||
|
||||
[horizontal]
|
||||
Azure cluster name::
|
||||
|
||||
`azure-elasticsearch-cluster`
|
||||
|
||||
Image::
|
||||
|
||||
`b39f27a8b8c64d52b05eac6a62ebad85__Ubuntu-13_10-amd64-server-20130808-alpha3-en-us-30GB`
|
||||
|
||||
VM Name::
|
||||
|
||||
`myesnode1`
|
||||
|
||||
VM Size::
|
||||
|
||||
`extrasmall`
|
||||
|
||||
Location::
|
||||
|
||||
`West Europe`
|
||||
|
||||
Login::
|
||||
|
||||
`elasticsearch`
|
||||
|
||||
Password::
|
||||
|
||||
`password1234!!`
|
||||
|
||||
|
||||
Using command line:
|
||||
|
||||
[source,sh]
|
||||
----
|
||||
azure vm create azure-elasticsearch-cluster \
    b39f27a8b8c64d52b05eac6a62ebad85__Ubuntu-13_10-amd64-server-20130808-alpha3-en-us-30GB \
    --vm-name myesnode1 \
    --location "West Europe" \
    --vm-size extrasmall \
    --ssh 22 \
    --ssh-cert /tmp/azure-certificate.pem \
    elasticsearch password1234\!\!
|
||||
----
|
||||
|
||||
You should see something like:
|
||||
|
||||
[source,text]
|
||||
----
|
||||
info: Executing command vm create
|
||||
+ Looking up image
|
||||
+ Looking up cloud service
|
||||
+ Creating cloud service
|
||||
+ Retrieving storage accounts
|
||||
+ Configuring certificate
|
||||
+ Creating VM
|
||||
info: vm create command OK
|
||||
----
|
||||
|
||||
Now, your first instance is started.
|
||||
|
||||
[TIP]
|
||||
.Working with SSH
|
||||
===============================================
|
||||
|
||||
You need to give the private key and username each time you log in to your instance:
|
||||
|
||||
[source,sh]
|
||||
----
|
||||
ssh -i ~/.ssh/azure-private.key elasticsearch@myescluster.cloudapp.net
|
||||
----
|
||||
|
||||
But you can also define it once in your `~/.ssh/config` file:
|
||||
|
||||
[source,text]
|
||||
----
|
||||
Host *.cloudapp.net
    User elasticsearch
    StrictHostKeyChecking no
    UserKnownHostsFile=/dev/null
    IdentityFile ~/.ssh/azure-private.key
|
||||
----
|
||||
===============================================
|
||||
|
||||
Next, you need to install Elasticsearch on your new instance. First, copy your
|
||||
keystore to the instance, then connect to the instance using SSH:
|
||||
|
||||
[source,sh]
|
||||
----
|
||||
scp /tmp/azurekeystore.pkcs12 azure-elasticsearch-cluster.cloudapp.net:/home/elasticsearch
|
||||
ssh azure-elasticsearch-cluster.cloudapp.net
|
||||
----
|
||||
|
||||
Once connected, install Elasticsearch:
|
||||
|
||||
[source,sh]
|
||||
----
|
||||
# Install Latest Java version
|
||||
# Read http://www.webupd8.org/2012/01/install-oracle-java-jdk-7-in-ubuntu-via.html for details
|
||||
sudo add-apt-repository ppa:webupd8team/java
|
||||
sudo apt-get update
|
||||
sudo apt-get install oracle-java7-installer
|
||||
|
||||
# If you want to install OpenJDK instead
|
||||
# sudo apt-get update
|
||||
# sudo apt-get install openjdk-7-jre-headless
|
||||
|
||||
# Download Elasticsearch
|
||||
curl -s https://download.elasticsearch.org/elasticsearch/elasticsearch/elasticsearch-2.0.0.deb -o elasticsearch-2.0.0.deb
|
||||
|
||||
# Prepare Elasticsearch installation
|
||||
sudo dpkg -i elasticsearch-2.0.0.deb
|
||||
----
|
||||
|
||||
Check that elasticsearch is running:
|
||||
|
||||
[source,sh]
|
||||
----
|
||||
curl http://localhost:9200/
|
||||
----
|
||||
|
||||
This command should give you a JSON result:
|
||||
|
||||
[source,javascript]
|
||||
----
|
||||
{
  "status" : 200,
  "name" : "Living Colossus",
  "version" : {
    "number" : "2.0.0",
    "build_hash" : "a46900e9c72c0a623d71b54016357d5f94c8ea32",
    "build_timestamp" : "2014-02-12T16:18:34Z",
    "build_snapshot" : false,
    "lucene_version" : "5.1"
  },
  "tagline" : "You Know, for Search"
}
|
||||
----
|
||||
|
||||
[[cloud-azure-discovery-long-plugin]]
|
||||
===== Install the Elasticsearch Azure Cloud plugin
|
||||
|
||||
[source,sh]
|
||||
----
|
||||
# Stop elasticsearch
|
||||
sudo service elasticsearch stop
|
||||
|
||||
# Install the plugin
|
||||
sudo /usr/share/elasticsearch/bin/plugin install cloud-azure
|
||||
|
||||
# Configure it
|
||||
sudo vi /etc/elasticsearch/elasticsearch.yml
|
||||
----
|
||||
|
||||
And add the following lines:
|
||||
|
||||
[source,yaml]
|
||||
----
|
||||
# If you don't remember your account id, you may get it with `azure account list`
cloud:
    azure:
        management:
            subscription.id: your_azure_subscription_id
            cloud.service.name: your_azure_cloud_service_name
            keystore:
                path: /home/elasticsearch/azurekeystore.pkcs12
                password: your_password_for_keystore

discovery:
    type: azure

# Recommended (warning: non durable disk)
# path.data: /mnt/resource/elasticsearch/data
|
||||
----
|
||||
|
||||
Restart elasticsearch:
|
||||
|
||||
[source,sh]
|
||||
----
|
||||
sudo service elasticsearch start
|
||||
----
|
||||
|
||||
If anything goes wrong, check your logs in `/var/log/elasticsearch`.
|
||||
|
||||
[[cloud-azure-discovery-scale]]
|
||||
==== Scaling Out!
|
||||
|
||||
First, create an image of your existing machine.
Disconnect from your machine and run the following commands locally:
|
||||
|
||||
[source,sh]
|
||||
----
|
||||
# Shutdown the instance
|
||||
azure vm shutdown myesnode1
|
||||
|
||||
# Create an image from this instance (it could take some minutes)
|
||||
azure vm capture myesnode1 esnode-image --delete
|
||||
|
||||
# Note that the previous instance has been deleted (this is mandatory),
# so you need to create it again before creating the other instances.
|
||||
|
||||
azure vm create azure-elasticsearch-cluster \
|
||||
esnode-image \
|
||||
--vm-name myesnode1 \
|
||||
--location "West Europe" \
|
||||
--vm-size extrasmall \
|
||||
--ssh 22 \
|
||||
--ssh-cert /tmp/azure-certificate.pem \
|
||||
elasticsearch password1234\!\!
|
||||
----
|
||||
|
||||
|
||||
[TIP]
|
||||
=========================================
|
||||
Azure may change the public IP address of the endpoint. DNS propagation can
take a few minutes before you can connect again using the host name. If needed,
you can get the IP address from Azure using:
|
||||
|
||||
[source,sh]
|
||||
----
|
||||
# Look at Network `Endpoints 0 Vip`
|
||||
azure vm show myesnode1
|
||||
----
|
||||
|
||||
=========================================
|
||||
|
||||
Let's start more instances!
|
||||
|
||||
[source,sh]
|
||||
----
|
||||
for x in $(seq 2 10)
|
||||
do
|
||||
echo "Launching azure instance #$x..."
|
||||
azure vm create azure-elasticsearch-cluster \
|
||||
esnode-image \
|
||||
--vm-name myesnode$x \
|
||||
--vm-size extrasmall \
|
||||
--ssh $((21 + $x)) \
|
||||
--ssh-cert /tmp/azure-certificate.pem \
|
||||
--connect \
|
||||
elasticsearch password1234\!\!
|
||||
done
|
||||
----
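The loop above gives each new VM its own public SSH port through the shell arithmetic `$((21 + $x))`, so `myesnode2` is reachable on port 23 and `myesnode10` on port 31. As a rough illustration only, the same mapping can be sketched in Python; the helper names `ssh_port_for_node` and `ssh_port_plan` are hypothetical and are not part of the Azure CLI or of the plugin:

```python
# Hypothetical helpers mirroring the shell arithmetic $((21 + $x)) used above.
def ssh_port_for_node(index: int, base_port: int = 21) -> int:
    """Return the public SSH port for node number `index` (1-based)."""
    if index < 1:
        raise ValueError("node index must be 1 or greater")
    return base_port + index


def ssh_port_plan(first: int = 2, last: int = 10) -> dict:
    """Map each VM name to its SSH port for nodes first..last (inclusive)."""
    return {f"myesnode{x}": ssh_port_for_node(x) for x in range(first, last + 1)}
```

With this numbering, node 1 keeps the standard port 22 that was used when the first instance was created.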
|
||||
|
||||
If you want to remove your running instances:
|
||||
|
||||
[source,sh]
|
||||
----
|
||||
azure vm delete myesnode1
|
||||
----
|
||||
|
||||
[[cloud-azure-repository]]
|
||||
==== Azure Repository
|
||||
|
||||
To enable Azure repositories, first set your Azure storage settings in the `elasticsearch.yml` file:
|
||||
|
||||
[source,yaml]
|
||||
----
|
||||
cloud:
    azure:
        storage:
            account: your_azure_storage_account
            key: your_azure_storage_key
|
||||
----
|
||||
|
||||
For reference, in previous versions of the Azure plugin, the settings were:
|
||||
|
||||
[source,yaml]
|
||||
----
|
||||
cloud:
    azure:
        storage_account: your_azure_storage_account
        storage_key: your_azure_storage_key
|
||||
----
|
||||
|
||||
The Azure repository supports the following settings:
|
||||
|
||||
`container`::
|
||||
|
||||
Container name. Defaults to `elasticsearch-snapshots`
|
||||
|
||||
`base_path`::
|
||||
|
||||
Specifies the path within container to repository data. Defaults to empty
|
||||
(root directory).
|
||||
|
||||
`chunk_size`::
|
||||
|
||||
Big files can be broken down into chunks during snapshotting if needed.
The chunk size can be specified in bytes or by using size value notation,
e.g. `1g`, `10m`, `5k`. Defaults to `64m`, which is also the maximum.
|
||||
|
||||
`compress`::
|
||||
|
||||
When set to `true` metadata files are stored in compressed format. This
|
||||
setting doesn't affect index files that are already compressed by default.
|
||||
Defaults to `false`.
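To make the `chunk_size` notation concrete, here is a minimal Python sketch that converts values such as `5k` or `32m` to bytes and checks them against the 64m ceiling mentioned above. It assumes binary units (1k = 1024 bytes) and only handles the suffixes shown in this section; `parse_size` and `is_allowed_chunk_size` are hypothetical names, not functions of the plugin:

```python
import re

# Assumption: units are binary multiples (1k = 1024 bytes).
_UNITS = {"": 1, "b": 1, "k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}
MAX_CHUNK_BYTES = 64 * 1024 ** 2  # the 64m ceiling stated for chunk_size


def parse_size(value: str) -> int:
    """Convert a size such as '10m', '5k' or '2048' to a number of bytes."""
    match = re.fullmatch(r"(\d+)([bkmg]?)", value.strip().lower())
    if match is None:
        raise ValueError(f"unrecognized size value: {value!r}")
    number, unit = match.groups()
    return int(number) * _UNITS[unit]


def is_allowed_chunk_size(value: str) -> bool:
    """Return True if the size is positive and does not exceed 64m."""
    return 0 < parse_size(value) <= MAX_CHUNK_BYTES
```

For example, `32m` passes the check, while `1g` exceeds the limit.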
|
||||
|
||||
Some examples, using scripts:
|
||||
|
||||
[source,json]
|
||||
----
|
||||
# The simplest one
|
||||
PUT _snapshot/my_backup1
|
||||
{
|
||||
"type": "azure"
|
||||
}
|
||||
|
||||
# With some settings
|
||||
PUT _snapshot/my_backup2
|
||||
{
|
||||
"type": "azure",
|
||||
"settings": {
|
||||
"container": "backup-container",
|
||||
"base_path": "backups",
|
||||
"chunk_size": "32m",
|
||||
"compress": true
|
||||
}
|
||||
}
|
||||
----
|
||||
// AUTOSENSE
|
||||
|
||||
Example using Java:
|
||||
|
||||
[source,java]
|
||||
----
|
||||
client.admin().cluster().preparePutRepository("my_backup3")
|
||||
.setType("azure").setSettings(Settings.settingsBuilder()
|
||||
.put(Storage.CONTAINER, "backup-container")
|
||||
.put(Storage.CHUNK_SIZE, new ByteSizeValue(32, ByteSizeUnit.MB))
|
||||
).get();
|
||||
----
|
||||
|
||||
[[cloud-azure-repository-validation]]
|
||||
===== Repository validation rules
|
||||
|
||||
According to the http://msdn.microsoft.com/en-us/library/dd135715.aspx[containers naming guide], a container name must
|
||||
be a valid DNS name, conforming to the following naming rules:
|
||||
|
||||
* Container names must start with a letter or number, and can contain only letters, numbers, and the dash (-) character.
|
||||
* Every dash (-) character must be immediately preceded and followed by a letter or number; consecutive dashes are not
|
||||
permitted in container names.
|
||||
* All letters in a container name must be lowercase.
|
||||
* Container names must be from 3 through 63 characters long.
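The naming rules above are easy to check before creating a repository. The following Python sketch encodes them with a regular expression plus a length check; `is_valid_container_name` is a hypothetical helper written for illustration and is not the validation code used by the plugin:

```python
import re

# One lowercase letter or digit, then any number of groups made of an
# optional single dash followed by a lowercase letter or digit. This rules
# out leading, trailing and consecutive dashes.
_CONTAINER_NAME = re.compile(r"[a-z0-9](?:-?[a-z0-9])*")


def is_valid_container_name(name: str) -> bool:
    """Check a container name against the four naming rules listed above."""
    if not 3 <= len(name) <= 63:
        return False
    return _CONTAINER_NAME.fullmatch(name) is not None
```

Under these rules the default `elasticsearch-snapshots` is valid, while a name containing an underscore or an uppercase letter is rejected.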
|
||||
|
||||
[[cloud-azure-testing]]
|
||||
==== Testing Azure
|
||||
|
||||
Integration tests in this plugin require a working Azure configuration and are therefore disabled by default.
To enable the tests, prepare a config file `elasticsearch.yml` with the following content:
|
||||
|
||||
[source,yaml]
|
||||
----
|
||||
cloud:
  azure:
    storage:
      account: "YOUR-AZURE-STORAGE-NAME"
      key: "YOUR-AZURE-STORAGE-KEY"
|
||||
----
|
||||
|
||||
Replace `account` and `key` with your settings. Please note that the tests will delete all snapshot/restore related
files in the specified container.
|
||||
|
||||
To run the tests:
|
||||
|
||||
[source,sh]
|
||||
----
|
||||
mvn -Dtests.azure=true -Dtests.config=/path/to/config/file/elasticsearch.yml clean test
|
||||
----
|
||||
|
||||
[[cloud-azure-smb-workaround]]
|
||||
==== Working around a bug in Windows SMB and Java on Windows
|
||||
|
||||
When using a shared file system based on the SMB protocol (like Azure File Service) to store indices, Lucene
opens index segment files with a write-only flag. This is the _correct_ way to open the files, as they will only be
used for writes, and it allows different FS implementations to optimize for it. Sadly, on Windows with SMB, this disables
the cache manager, causing writes to be slow. This has been described in
https://issues.apache.org/jira/browse/LUCENE-6176[LUCENE-6176], but it affects every Java program, not just Lucene.
This needs to be fixed outside of Elasticsearch and Lucene, either in Windows or in OpenJDK. For now, we provide
experimental support for opening the files with a read flag as a workaround.
|
||||
|
||||
The Azure Cloud plugin provides two storage types optimized for SMB:
|
||||
|
||||
`smb_mmap_fs`::
|
||||
|
||||
an SMB-specific implementation of the default
|
||||
{ref}/index-modules-store.html#mmapfs[mmap fs]
|
||||
|
||||
`smb_simple_fs`::
|
||||
|
||||
an SMB-specific implementation of the default
|
||||
{ref}/index-modules-store.html#simplefs[simple fs]
|
||||
|
||||
To use one of these specific storage types, you need to install the Azure Cloud plugin and restart the node.
|
||||
Then configure Elasticsearch to set the storage type you want.
|
||||
|
||||
This can be configured for all indices by adding this to the `elasticsearch.yml` file:
|
||||
|
||||
[source,yaml]
|
||||
----
|
||||
index.store.type: smb_simple_fs
|
||||
----
|
||||
|
||||
Note that this setting only applies to newly created indices.
|
||||
|
||||
It can also be set on a per-index basis at index creation time:
|
||||
|
||||
[source,json]
|
||||
----
|
||||
PUT my_index
|
||||
{
|
||||
"settings": {
|
||||
"index.store.type": "smb_mmap_fs"
|
||||
}
|
||||
}
|
||||
----
|
||||
// AUTOSENSE
|
|
@ -0,0 +1,479 @@
|
|||
[[cloud-gce]]
|
||||
=== GCE Cloud Plugin
|
||||
|
||||
The Google Compute Engine Cloud plugin uses the GCE API for unicast discovery.
|
||||
|
||||
[[cloud-gce-install]]
|
||||
[float]
|
||||
==== Installation
|
||||
|
||||
This plugin can be installed using the plugin manager:
|
||||
|
||||
[source,sh]
|
||||
----------------------------------------------------------------
|
||||
sudo bin/plugin install cloud-gce
|
||||
----------------------------------------------------------------
|
||||
|
||||
The plugin must be installed on every node in the cluster, and each node must
|
||||
be restarted after installation.
|
||||
|
||||
[[cloud-gce-remove]]
|
||||
[float]
|
||||
==== Removal
|
||||
|
||||
The plugin can be removed with the following command:
|
||||
|
||||
[source,sh]
|
||||
----------------------------------------------------------------
|
||||
sudo bin/plugin remove cloud-gce
|
||||
----------------------------------------------------------------
|
||||
|
||||
The node must be stopped before removing the plugin.
|
||||
|
||||
[[cloud-gce-usage-discovery]]
|
||||
==== GCE Virtual Machine Discovery
|
||||
|
||||
Google Compute Engine VM discovery uses the Google APIs to perform automatic discovery (similar to multicast
in multicast-friendly environments). Here is a simple sample configuration:
|
||||
|
||||
[source,yaml]
|
||||
--------------------------------------------------
|
||||
cloud:
  gce:
    project_id: <your-google-project-id>
    zone: <your-zone>
discovery:
  type: gce
|
||||
--------------------------------------------------
|
||||
|
||||
[[cloud-gce-usage-discovery-short]]
|
||||
===== How to start (short story)
|
||||
|
||||
* Create Google Compute Engine instance (with compute rw permissions)
|
||||
* Install Elasticsearch
|
||||
* Install Google Compute Engine Cloud plugin
|
||||
* Modify `elasticsearch.yml` file
|
||||
* Start Elasticsearch
|
||||
|
||||
[[cloud-gce-usage-discovery-long]]
|
||||
==== Setting up GCE Discovery
|
||||
|
||||
|
||||
[[cloud-gce-usage-discovery-long-prerequisites]]
|
||||
===== Prerequisites
|
||||
|
||||
Before starting, you need:
|
||||
|
||||
* Your project ID, e.g. `es-cloud`. Get it from https://code.google.com/apis/console/[Google API Console].
|
||||
* To install https://developers.google.com/cloud/sdk/[Google Cloud SDK]
|
||||
|
||||
If you have not set it yet, you can define the default project you will work on:
|
||||
|
||||
[source,sh]
|
||||
--------------------------------------------------
|
||||
gcloud config set project es-cloud
|
||||
--------------------------------------------------
|
||||
|
||||
[[cloud-gce-usage-discovery-long-first-instance]]
|
||||
===== Creating your first instance
|
||||
|
||||
|
||||
[source,sh]
|
||||
--------------------------------------------------
|
||||
gcutil addinstance myesnode1 \
|
||||
--service_account_scope=compute-rw,storage-full \
|
||||
--persistent_boot_disk
|
||||
--------------------------------------------------
|
||||
|
||||
Then follow these steps:
|
||||
|
||||
* You will be asked to open a link in your browser. Login and allow access to listed services.
|
||||
* You will get back a verification code. Copy and paste it in your terminal.
|
||||
* You should see an `Authentication successful.` message.
|
||||
* Choose your zone, e.g. `europe-west1-a`.
|
||||
* Choose your compute instance size, e.g. `f1-micro`.
|
||||
* Choose your OS, e.g. `projects/debian-cloud/global/images/debian-7-wheezy-v20140606`.
|
||||
* You may be asked to create a ssh key. Follow the instructions to create one.
|
||||
|
||||
When done, a report like this one should appear:
|
||||
|
||||
[source,text]
|
||||
--------------------------------------------------
|
||||
Table of resources:
|
||||
|
||||
+-----------+--------------+-------+---------+--------------+----------------+----------------+----------------+---------+----------------+
| name      | machine-type | image | network | network-ip   | external-ip    | disks          | zone           | status  | status-message |
+-----------+--------------+-------+---------+--------------+----------------+----------------+----------------+---------+----------------+
| myesnode1 | f1-micro     |       | default | 10.240.20.57 | 192.158.29.199 | boot-myesnode1 | europe-west1-a | RUNNING |                |
+-----------+--------------+-------+---------+--------------+----------------+----------------+----------------+---------+----------------+
|
||||
--------------------------------------------------
|
||||
|
||||
You can now connect to your instance:
|
||||
|
||||
[source,sh]
|
||||
--------------------------------------------------
|
||||
# Connect using google cloud SDK
|
||||
gcloud compute ssh myesnode1 --zone europe-west1-a
|
||||
|
||||
# Or using SSH with external IP address
|
||||
ssh -i ~/.ssh/google_compute_engine 192.158.29.199
|
||||
--------------------------------------------------
|
||||
|
||||
[IMPORTANT]
|
||||
.Service Account Permissions
|
||||
==============================================
|
||||
|
||||
It's important when creating an instance that the correct permissions are set. At a minimum, you must ensure you have:
|
||||
|
||||
[source,text]
|
||||
--------------------------------------------------
|
||||
service_account_scope=compute-rw
|
||||
--------------------------------------------------
|
||||
|
||||
Failing to set this will result in unauthorized messages when starting Elasticsearch.
|
||||
See <<cloud-gce-usage-discovery-tips-permissions>>.
|
||||
==============================================
|
||||
|
||||
|
||||
Once connected, install Elasticsearch:
|
||||
|
||||
[source,sh]
|
||||
--------------------------------------------------
|
||||
sudo apt-get update
|
||||
|
||||
# Download Elasticsearch
|
||||
wget https://download.elasticsearch.org/elasticsearch/elasticsearch/elasticsearch-2.0.0.deb
|
||||
|
||||
# Prepare Java installation
|
||||
sudo apt-get install java7-runtime-headless
|
||||
|
||||
# Prepare Elasticsearch installation
|
||||
sudo dpkg -i elasticsearch-2.0.0.deb
|
||||
--------------------------------------------------
|
||||
|
||||
[[cloud-gce-usage-discovery-long-install-plugin]]
|
||||
===== Install elasticsearch cloud gce plugin
|
||||
|
||||
Install the plugin:
|
||||
|
||||
[source,sh]
|
||||
--------------------------------------------------
|
||||
# Use Plugin Manager to install it
|
||||
sudo bin/plugin install cloud-gce
|
||||
--------------------------------------------------
|
||||
|
||||
Open the `elasticsearch.yml` file:
|
||||
|
||||
[source,sh]
|
||||
--------------------------------------------------
|
||||
sudo vi /etc/elasticsearch/elasticsearch.yml
|
||||
--------------------------------------------------
|
||||
|
||||
And add the following lines:
|
||||
|
||||
[source,yaml]
|
||||
--------------------------------------------------
|
||||
cloud:
  gce:
    project_id: es-cloud
    zone: europe-west1-a
discovery:
  type: gce
|
||||
--------------------------------------------------
|
||||
|
||||
|
||||
Start elasticsearch:
|
||||
|
||||
[source,sh]
|
||||
--------------------------------------------------
|
||||
sudo /etc/init.d/elasticsearch start
|
||||
--------------------------------------------------
|
||||
|
||||
If anything goes wrong, you should check logs:
|
||||
|
||||
[source,sh]
|
||||
--------------------------------------------------
|
||||
tail -f /var/log/elasticsearch/elasticsearch.log
|
||||
--------------------------------------------------
|
||||
|
||||
If needed, you can change log level to `TRACE` by opening `logging.yml`:
|
||||
|
||||
[source,sh]
|
||||
--------------------------------------------------
|
||||
sudo vi /etc/elasticsearch/logging.yml
|
||||
--------------------------------------------------
|
||||
|
||||
and adding the following line:
|
||||
|
||||
[source,yaml]
|
||||
--------------------------------------------------
|
||||
# discovery
|
||||
discovery.gce: TRACE
|
||||
--------------------------------------------------
|
||||
|
||||
|
||||
|
||||
[[cloud-gce-usage-discovery-cloning]]
|
||||
==== Cloning your existing machine
|
||||
|
||||
In order to build a cluster on many nodes, you can clone your configured instance to new nodes.
|
||||
You won't have to reinstall everything!
|
||||
|
||||
First create an image of your running instance and upload it to Google Cloud Storage:
|
||||
|
||||
[source,sh]
|
||||
--------------------------------------------------
|
||||
# Create an image of your current instance
|
||||
sudo /usr/bin/gcimagebundle -d /dev/sda -o /tmp/
|
||||
|
||||
# An image has been created in `/tmp` directory:
|
||||
ls /tmp
|
||||
e4686d7f5bf904a924ae0cfeb58d0827c6d5b966.image.tar.gz
|
||||
|
||||
# Upload your image to Google Cloud Storage:
|
||||
# Create a bucket to hold your image, let's say `esimage`:
|
||||
gsutil mb gs://esimage
|
||||
|
||||
# Copy your image to this bucket:
|
||||
gsutil cp /tmp/e4686d7f5bf904a924ae0cfeb58d0827c6d5b966.image.tar.gz gs://esimage
|
||||
|
||||
# Then add your image to images collection:
|
||||
gcutil addimage elasticsearch-1-2-1 gs://esimage/e4686d7f5bf904a924ae0cfeb58d0827c6d5b966.image.tar.gz
|
||||
|
||||
# If the previous command did not work for you, logout from your instance
|
||||
# and launch the same command from your local machine.
|
||||
--------------------------------------------------
|
||||
|
||||
[[cloud-gce-usage-discovery-start-new-instances]]
|
||||
===== Start new instances
|
||||
|
||||
Now that you have an image, you can create as many instances as you need:
|
||||
|
||||
[source,sh]
|
||||
--------------------------------------------------
|
||||
# Just change node name (here myesnode2)
|
||||
gcutil addinstance --image=elasticsearch-1-2-1 myesnode2
|
||||
|
||||
# If you want to provide all details directly, you can use:
|
||||
gcutil addinstance --image=elasticsearch-1-2-1 \
|
||||
--kernel=projects/google/global/kernels/gce-v20130603 myesnode2 \
|
||||
--zone europe-west1-a --machine_type f1-micro --service_account_scope=compute-rw \
|
||||
--persistent_boot_disk
|
||||
--------------------------------------------------
|
||||
|
||||
[[cloud-gce-usage-discovery-remove-instance]]
|
||||
===== Remove an instance (aka shut it down)
|
||||
|
||||
You can use https://cloud.google.com/console[Google Cloud Console] or CLI to manage your instances:
|
||||
|
||||
[source,sh]
|
||||
--------------------------------------------------
|
||||
# Stopping and removing instances
|
||||
gcutil deleteinstance myesnode1 myesnode2 \
|
||||
--zone=europe-west1-a
|
||||
|
||||
# Consider removing the disks as well if you don't need them anymore
|
||||
gcutil deletedisk boot-myesnode1 boot-myesnode2 \
|
||||
--zone=europe-west1-a
|
||||
--------------------------------------------------
|
||||
|
||||
[[cloud-gce-usage-discovery-zones]]
|
||||
==== Using GCE zones
|
||||
|
||||
`cloud.gce.zone` helps to retrieve instances running in a given zone. It should be one of the
|
||||
https://developers.google.com/compute/docs/zones#available[GCE supported zones].
|
||||
|
||||
GCE discovery supports multiple zones, although you need to be aware of network latency between zones.
To enable discovery across more than one zone, add your zone list to the `cloud.gce.zone` setting:
|
||||
|
||||
[source,yaml]
|
||||
--------------------------------------------------
|
||||
cloud:
  gce:
    project_id: <your-google-project-id>
    zone: ["<your-zone1>", "<your-zone2>"]
discovery:
  type: gce
|
||||
--------------------------------------------------
|
||||
|
||||
|
||||
|
||||
[[cloud-gce-usage-discovery-tags]]
|
||||
==== Filtering by tags
|
||||
|
||||
GCE discovery can also filter the machines to include in the cluster based on tags, using the `discovery.gce.tags` setting.
For example, setting `discovery.gce.tags` to `dev` will only include instances that have the `dev` tag. If several tags
are set, an instance must have all of them to be included.
|
||||
|
||||
One practical use for tag filtering is when a GCE cluster contains many nodes that are not running
Elasticsearch. In this case (particularly with high `ping_timeout` values) there is a risk that a new node's discovery
phase will end before it has found the cluster (which will result in it declaring itself master of a new cluster
with the same name, which is highly undesirable). Adding a tag to the Elasticsearch GCE nodes and then filtering by that
tag will resolve this issue.
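The "all tags must be present" rule described above is a subset test. The sketch below illustrates that rule in Python on plain dictionaries; it does not call the GCE API, and `instance_matches` and `filter_instances` are hypothetical names chosen for this example rather than anything provided by the plugin:

```python
def instance_matches(instance_tags, required_tags) -> bool:
    """An instance is kept only if it carries every required tag."""
    return set(required_tags) <= set(instance_tags)


def filter_instances(instances: dict, required_tags) -> list:
    """Return the names of instances whose tags include all required tags.

    `instances` maps an instance name to the list of tags set on it.
    """
    return [
        name
        for name, tags in instances.items()
        if instance_matches(tags, required_tags)
    ]
```

With `required_tags` set to `["elasticsearch", "dev"]`, an instance tagged only `dev` is left out, while one tagged `elasticsearch`, `dev` and `eu` is included.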
|
||||
|
||||
Add your tag when building the new instance:
|
||||
|
||||
[source,sh]
|
||||
--------------------------------------------------
|
||||
gcutil --project=es-cloud addinstance myesnode1 \
|
||||
--service_account_scope=compute-rw \
|
||||
--persistent_boot_disk \
|
||||
--tags=elasticsearch,dev
|
||||
--------------------------------------------------
|
||||
|
||||
Then, define it in `elasticsearch.yml`:
|
||||
|
||||
[source,yaml]
|
||||
--------------------------------------------------
|
||||
cloud:
  gce:
    project_id: es-cloud
    zone: europe-west1-a
discovery:
  type: gce
  gce:
    tags: elasticsearch, dev
|
||||
--------------------------------------------------
|
||||
|
||||
[[cloud-gce-usage-discovery-port]]
|
||||
==== Changing default transport port
|
||||
|
||||
By default, the Elasticsearch GCE plugin assumes that you run Elasticsearch on the default port, 9300.
You can specify a different port for Elasticsearch to use with the Google Compute Engine metadata key `es_port`:
|
||||
|
||||
[[cloud-gce-usage-discovery-port-create]]
|
||||
===== When creating instance
|
||||
|
||||
Add `--metadata=es_port:9301` option:
|
||||
|
||||
[source,sh]
|
||||
--------------------------------------------------
|
||||
# when creating first instance
|
||||
gcutil addinstance myesnode1 \
|
||||
--service_account_scope=compute-rw,storage-full \
|
||||
--persistent_boot_disk \
|
||||
--metadata=es_port:9301
|
||||
|
||||
# when creating an instance from an image
|
||||
gcutil addinstance --image=elasticsearch-1-0-0-RC1 \
|
||||
--kernel=projects/google/global/kernels/gce-v20130603 myesnode2 \
|
||||
--zone europe-west1-a --machine_type f1-micro --service_account_scope=compute-rw \
|
||||
--persistent_boot_disk --metadata=es_port:9301
|
||||
--------------------------------------------------
|
||||
|
||||
[[cloud-gce-usage-discovery-port-run]]
|
||||
===== On a running instance
|
||||
|
||||
[source,sh]
|
||||
--------------------------------------------------
|
||||
# Get metadata fingerprint
|
||||
gcutil getinstance myesnode1 --zone=europe-west1-a
|
||||
+------------------------+---------------------+
| property               | value               |
+------------------------+---------------------+
| metadata               |                     |
|   fingerprint          | 42WmSpB8rSM=        |
+------------------------+---------------------+
|
||||
|
||||
# Use that fingerprint
|
||||
gcutil setinstancemetadata myesnode1 \
|
||||
--zone=europe-west1-a \
|
||||
--metadata=es_port:9301 \
|
||||
--fingerprint=42WmSpB8rSM=
|
||||
--------------------------------------------------
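The port selection described in this section amounts to "use `es_port` from the instance metadata when it is set, otherwise fall back to 9300". As an illustration only, that decision can be written in a few lines of Python; `transport_port` is a hypothetical helper operating on a plain dictionary and is not the plugin's code:

```python
DEFAULT_TRANSPORT_PORT = 9300  # default transport port stated above


def transport_port(metadata: dict) -> int:
    """Return the transport port for an instance given its metadata.

    `metadata` is assumed to be a plain mapping of metadata keys to string
    values, e.g. {"es_port": "9301"}.
    """
    raw = metadata.get("es_port")
    if raw is None:
        return DEFAULT_TRANSPORT_PORT
    port = int(raw)
    if not 1 <= port <= 65535:
        raise ValueError(f"es_port out of range: {port}")
    return port
```

An instance created with `--metadata=es_port:9301` would therefore be contacted on port 9301, and one without that key on port 9300.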
|
||||
|
||||
|
||||
[[cloud-gce-usage-discovery-tips]]
|
||||
==== GCE Tips
|
||||
|
||||
[[cloud-gce-usage-discovery-tips-projectid]]
|
||||
===== Store project id locally
|
||||
|
||||
If you don't want to repeat the project id each time, you can save it in `~/.gcutil.flags` file using:
|
||||
|
||||
[source,sh]
|
||||
--------------------------------------------------
|
||||
gcutil getproject --project=es-cloud --cache_flag_values
|
||||
--------------------------------------------------
|
||||
|
||||
`~/.gcutil.flags` file now contains:
|
||||
|
||||
[source,text]
|
||||
--------------------------------------------------
|
||||
--project=es-cloud
|
||||
--------------------------------------------------
|
||||
|
||||
[[cloud-gce-usage-discovery-tips-permissions]]
|
||||
===== Machine Permissions
|
||||
|
||||
If you have created a machine without the correct permissions, you will see `403 unauthorized` error messages. The only
|
||||
way to alter these permissions is to delete the instance (NOT THE DISK). Then create another with the correct permissions.
|
||||
|
||||
Creating machines with gcutil::
|
||||
+
|
||||
--
|
||||
Ensure the following flags are set:
|
||||
|
||||
[source,text]
|
||||
--------------------------------------------------
|
||||
--service_account_scope=compute-rw
|
||||
--------------------------------------------------
|
||||
--
|
||||
|
||||
Creating with console (web)::
|
||||
+
|
||||
--
|
||||
When creating an instance using the web portal, click _Show advanced options_.
|
||||
|
||||
At the bottom of the page, under `PROJECT ACCESS`, choose `>> Compute >> Read Write`.
|
||||
--
|
||||
|
||||
Creating with knife google::
|
||||
+
|
||||
--
|
||||
Set the service account scopes when creating the machine:
|
||||
|
||||
[source,sh]
|
||||
--------------------------------------------------
|
||||
knife google server create www1 \
|
||||
-m n1-standard-1 \
|
||||
-I debian-7-wheezy-v20131120 \
|
||||
-Z us-central1-a \
|
||||
-i ~/.ssh/id_rsa \
|
||||
-x jdoe \
|
||||
--gce-service-account-scopes https://www.googleapis.com/auth/compute.full_control
|
||||
--------------------------------------------------
|
||||
|
||||
Or, you may use the alias:
|
||||
|
||||
[source,sh]
|
||||
--------------------------------------------------
|
||||
--gce-service-account-scopes compute-rw
|
||||
--------------------------------------------------
|
||||
--
|
||||
|
||||
[[cloud-gce-usage-discovery-testing]]
|
||||
==== Testing GCE
|
||||
|
||||
Integration tests in this plugin require a working GCE configuration and are
therefore disabled by default. To enable the tests, prepare a config file
`elasticsearch.yml` with the following content:
|
||||
|
||||
[source,yaml]
|
||||
--------------------------------------------------
|
||||
cloud:
  gce:
    project_id: es-cloud
    zone: europe-west1-a
discovery:
  type: gce
|
||||
--------------------------------------------------
|
||||
|
||||
Replace `project_id` and `zone` with your settings.
|
||||
|
||||
To run the tests:
|
||||
|
||||
[source,sh]
|
||||
--------------------------------------------------
|
||||
mvn -Dtests.gce=true -Dtests.config=/path/to/config/file/elasticsearch.yml clean test
|
||||
--------------------------------------------------
|
|
@ -1,89 +1,107 @@
|
|||
[[plugins-delete-by-query]]
|
||||
== Delete By Query Plugin
|
||||
=== Delete By Query Plugin
|
||||
|
||||
The delete by query plugin adds support for deleting all of the documents
|
||||
The delete-by-query plugin adds support for deleting all of the documents
|
||||
(from one or more indices) which match the specified query. It is a
|
||||
replacement for the problematic _delete-by-query_ functionality which has been
|
||||
removed from Elasticsearch core.
|
||||
|
||||
Internally, it uses the <<scroll-scan, Scan/Scroll>> and <<docs-bulk, Bulk>>
|
||||
APIs to delete documents in an efficient and safe manner. It is slower than
|
||||
the old _delete-by-query_ functionality, but fixes the problems with the
|
||||
previous implementation.
|
||||
Internally, it uses the {ref}/search-request-scroll.html#scroll-scan[Scan/Scroll]
|
||||
and {ref}/docs-bulk.html[Bulk] APIs to delete documents in an efficient and
|
||||
safe manner. It is slower than the old _delete-by-query_ functionality, but
|
||||
fixes the problems with the previous implementation.
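To give a feel for the second half of that process, the Python sketch below turns a page of scroll hits into the newline-delimited body expected by the Bulk API, with one `delete` action per document. This is not the plugin's implementation; `bulk_delete_body` is a hypothetical helper, and each hit is assumed to be a dictionary carrying `_index`, `_type` and `_id`:

```python
import json


def bulk_delete_body(hits) -> str:
    """Build a Bulk API request body that deletes every hit in `hits`.

    Each action is one JSON object on its own line, and the body ends
    with a newline, as the Bulk API requires.
    """
    lines = []
    for hit in hits:
        action = {
            "delete": {
                "_index": hit["_index"],
                "_type": hit["_type"],
                "_id": hit["_id"],
            }
        }
        lines.append(json.dumps(action))
    return "".join(line + "\n" for line in lines)
```

Because every matching document becomes its own delete action, the cost of a request grows with the number of matching documents.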
|
||||
|
||||
TIP: Queries which match large numbers of documents may run for a long time,
|
||||
To understand more about why we removed delete-by-query from core and about
|
||||
the semantics of the new implementation, see
|
||||
<<delete-by-query-plugin-reason>>.
|
||||
|
||||
[TIP]
|
||||
============================================
|
||||
Queries which match large numbers of documents may run for a long time,
|
||||
as every document has to be deleted individually. Don't use _delete-by-query_
|
||||
to clean out all or most documents in an index. Rather create a new index and
|
||||
perhaps reindex the documents you want to keep.
|
||||
============================================
|
||||
|
||||
=== Installation
|
||||
[float]
|
||||
==== Installation
|
||||
|
||||
This plugin can be installed using the plugin manager:
|
||||
|
||||
[source,sh]
|
||||
----------------------------------------------------------------
|
||||
bin/plugin install elasticsearch/elasticsearch-delete-by-query
|
||||
sudo bin/plugin install delete-by-query
|
||||
----------------------------------------------------------------
|
||||
|
||||
The plugin must be installed on every node in the cluster, and each node must
|
||||
be restarted after installation.
|
||||
|
||||
=== Removal
|
||||
[float]
|
||||
==== Removal
|
||||
|
||||
The plugin can be removed with the following command:
|
||||
|
||||
[source,sh]
|
||||
----------------------------------------------------------------
|
||||
bin/plugin remove elasticsearch/elasticsearch-delete-by-query
|
||||
sudo bin/plugin remove delete-by-query
|
||||
----------------------------------------------------------------
|
||||
|
||||
The node must be stopped before removing the plugin.
|
||||
|
||||
=== Usage
|
||||
[[delete-by-query-usage]]
|
||||
==== Using Delete-by-Query
|
||||
|
||||
The query can either be provided using a simple query string as
|
||||
a parameter:
|
||||
|
||||
[source,shell]
|
||||
--------------------------------------------------
|
||||
curl -XDELETE 'http://localhost:9200/twitter/tweet/_query?q=user:kimchy'
|
||||
DELETE /twitter/tweet/_query?q=user:kimchy
|
||||
--------------------------------------------------
|
||||
// AUTOSENSE
|
||||
|
||||
or using the <<query-dsl,Query DSL>> defined within the request body:
|
||||
or using the {ref}/query-dsl.html[Query DSL] defined within the request body:
|
||||
|
||||
[source,js]
|
||||
--------------------------------------------------
|
||||
curl -XDELETE 'http://localhost:9200/twitter/tweet/_query' -d '{
|
||||
"query" : { <1>
|
||||
"term" : { "user" : "kimchy" }
|
||||
DELETE /twitter/tweet/_query
|
||||
{
|
||||
"query": { <1>
|
||||
"term": {
|
||||
"user": "kimchy"
|
||||
}
|
||||
}
|
||||
}
|
||||
'
|
||||
--------------------------------------------------
|
||||
// AUTOSENSE
|
||||
|
||||
<1> The query must be passed as a value to the `query` key, in the same way as
|
||||
the <<search-search,search api>>.
|
||||
the {ref}/search-search.html[search api].
|
||||
|
||||
Both of the above examples end up doing the same thing, which is to delete all
|
||||
tweets from the twitter index for the user `kimchy`.
|
||||
|
||||
Delete-by-query supports deletion across <<search-multi-index-type,multiple indices and multiple types>>.
|
||||
Delete-by-query supports deletion across
|
||||
{ref}/search-search.html#search-multi-index-type[multiple indices and multiple types].
|
||||
|
||||
==== Query-string parameters
|
||||
[float]
|
||||
=== Query-string parameters
|
||||
|
||||
The following query string parameters are supported:
|
||||
|
||||
`q`::
|
||||
|
||||
Instead of using the <<query-dsl,Query DSL>> to pass a `query` in the request
|
||||
Instead of using the {ref}/query-dsl.html[Query DSL] to pass a `query` in the request
|
||||
body, you can use the `q` query string parameter to specify a query using
|
||||
<<query-string-syntax,`query_string` syntax>>. In this case, the following
|
||||
additional parameters are supported: `df`, `analyzer`, `default_operator`,
|
||||
`lowercase_expanded_terms`, `analyze_wildcard` and `lenient`.
|
||||
See <<search-uri-request>> for details.
|
||||
{ref}/query-dsl-query-string-query.html#query-string-syntax[`query_string` syntax].
|
||||
In this case, the following additional parameters are supported: `df`,
|
||||
`analyzer`, `default_operator`, `lowercase_expanded_terms`,
|
||||
`analyze_wildcard` and `lenient`.
|
||||
See {ref}/search-uri-request.html[URI search request] for details.
|
||||
|
||||
`size`::
|
||||
|
||||
The number of hits returned *per shard* by the <<scroll-scan,scroll/scan>>
|
||||
The number of hits returned *per shard* by the {ref}/search-request-scroll.html#scroll-scan[scan]
|
||||
request. Defaults to 10. May also be specified in the request body.
|
||||
|
||||
`timeout`::
|
||||
|
@ -97,11 +115,12 @@ A comma separated list of routing values to control which shards the delete by
|
|||
query request should be executed on.
|
||||
|
||||
When using the `q` parameter, the following additional parameters are
|
||||
supported (as explained in <<search-uri-request>>): `df`, `analyzer`,
|
||||
supported (as explained in {ref}/search-uri-request.html[URI search request]): `df`, `analyzer`,
|
||||
`default_operator`.
|
||||
|
||||
|
||||
==== Response body
|
||||
[float]
|
||||
=== Response body
|
||||
|
||||
The JSON response looks like this:
|
||||
|
||||
|
@ -129,8 +148,9 @@ The JSON response looks like this:
|
|||
--------------------------------------------------
|
||||
|
||||
Internally, the query is used to execute an initial
|
||||
<<scroll-scan,scroll/scan>> request. As hits are pulled from the scroll API,
|
||||
they are passed to the <<bulk,Bulk API>> for deletion.
|
||||
{ref}/search-request-scroll.html#scroll-scan[scroll/scan] request. As hits are
|
||||
pulled from the scroll API, they are passed to the {ref}/docs-bulk.html[Bulk
|
||||
API] for deletion.
|
||||
|
||||
IMPORTANT: Delete by query will only delete the version of the document that
|
||||
was visible to search at the time the request was executed. Any documents
|
||||
|
@ -161,3 +181,90 @@ The number of documents that failed to be deleted for the given index. A
|
|||
document may fail to be deleted if it has been updated to a new version by
|
||||
another process, or if the shard containing the document has gone missing due
|
||||
to hardware failure, for example.
|
||||
|
||||
[[delete-by-query-plugin-reason]]
|
||||
==== Why Delete-By-Query is a plugin
|
||||
|
||||
The old delete-by-query API in Elasticsearch 1.x was fast but problematic. We
|
||||
decided to remove the feature from Elasticsearch for these reasons:
|
||||
|
||||
Forward compatibility::
|
||||
|
||||
The old implementation wrote a delete-by-query request, including the
|
||||
query, to the transaction log. This meant that, when upgrading to a new
|
||||
version, old unsupported queries which cannot be executed might exist in
|
||||
the translog, thus causing data corruption.
|
||||
|
||||
Consistency and correctness::
|
||||
|
||||
The old implementation executed the query and deleted all matching docs on
|
||||
the primary first. It then repeated this procedure on each replica shard.
|
||||
There was no guarantee that the queries on the primary and the replicas
|
||||
matched the same documents, so it was quite possible to end up with
|
||||
different documents on each shard copy.
|
||||
|
||||
Resiliency::
|
||||
|
||||
The old implementation could cause out-of-memory exceptions, merge storms,
|
||||
and dramatic slowdowns if used incorrectly.
|
||||
|
||||
[float]
|
||||
=== New delete-by-query implementation
|
||||
|
||||
The new implementation, provided by this plugin, is built internally
|
||||
using {ref}/search-request-scroll.html#scroll-scan[scan and scroll] to return
|
||||
the document IDs and versions of all the documents that need to be deleted.
|
||||
It then uses the {ref}/docs-bulk.html[`bulk` API] to do the actual deletion.
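
To make the second step concrete, here is a rough Python sketch of how hits collected from a scroll could be turned into a bulk request body made of delete actions. It illustrates the idea only and is not the plugin's actual code: the function name `bulk_delete_body` is hypothetical, and carrying `_version` in the action metadata is an assumption based on the description above.

```python
import json


def bulk_delete_body(hits):
    """Turn search hits into a newline-delimited bulk body of deletes.

    Hypothetical helper: each hit is expected to carry `_index`,
    `_type` and `_id`, and optionally `_version`.
    """
    lines = []
    for hit in hits:
        meta = {
            "_index": hit["_index"],
            "_type": hit["_type"],
            "_id": hit["_id"],
        }
        if "_version" in hit:
            # Assumption: pass the version seen by the scan so that a
            # document updated in the meantime is not deleted.
            meta["_version"] = hit["_version"]
        lines.append(json.dumps({"delete": meta}))
    # The bulk format requires every line, including the last one,
    # to end with a newline.
    return "".join(line + "\n" for line in lines)


hits = [
    {"_index": "twitter", "_type": "tweet", "_id": "1", "_version": 3},
    {"_index": "twitter", "_type": "tweet", "_id": "2"},
]
print(bulk_delete_body(hits), end="")
```

Only document IDs (and versions) reach the shards in this scheme, which is what keeps queries out of the transaction log.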
|
||||
|
||||
This can have performance as well as visibility implications. Delete-by-query
|
||||
now has the following semantics:
|
||||
|
||||
non-atomic::
|
||||
|
||||
A delete-by-query may fail at any time while some documents matching the
|
||||
query have already been deleted.
|
||||
|
||||
try-once::
|
||||
|
||||
A delete-by-query may fail at any time and will not retry its execution.
|
||||
All retry logic is left to the user.
|
||||
|
||||
syntactic sugar::
|
||||
|
||||
A delete-by-query is equivalent to a scan/scroll search and corresponding
|
||||
bulk-deletes by ID.
|
||||
|
||||
point-in-time::
|
||||
|
||||
A delete-by-query will only delete the documents that are visible at the
|
||||
point in time the delete-by-query was started, equivalent to the
|
||||
scan/scroll API.
|
||||
|
||||
consistent::
|
||||
|
||||
A delete-by-query will yield consistent results across all replicas of a
|
||||
shard.
|
||||
|
||||
forward-compatible::
|
||||
|
||||
A delete-by-query will only send IDs to the shards as deletes such that no
|
||||
queries are stored in the transaction logs that might not be supported in
|
||||
the future.
|
||||
|
||||
visibility::
|
||||
|
||||
The effect of a delete-by-query request will not be visible to search
|
||||
until the user refreshes the index, or the index is refreshed
|
||||
automatically.
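
Because the request is try-once, any retrying has to happen on the client side. The sketch below shows one simple way to do that in Python. The names `run_with_retries` and `request` are hypothetical, and `request` stands for whatever callable actually sends the delete-by-query. Since the operation is non-atomic, a retry simply issues the same query again, and documents removed by an earlier attempt should no longer match once the index has been refreshed.

```python
import time


def run_with_retries(request, attempts=3, delay=1.0, sleep=time.sleep):
    """Call `request()` until it succeeds or `attempts` is exhausted.

    Hypothetical helper: `request` is any zero-argument callable that
    raises an exception on failure. The last error is re-raised.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(attempts):
        try:
            return request()
        except Exception as error:
            last_error = error
            if attempt + 1 < attempts:
                # Wait before the next attempt, but not after the last.
                sleep(delay)
    raise last_error


# Stand-in for a real request: fails once, then succeeds.
state = {"calls": 0}


def fake_request():
    state["calls"] += 1
    if state["calls"] < 2:
        raise RuntimeError("simulated failure")
    return "done"


print(run_with_retries(fake_request, attempts=3, delay=0))
```

A production client would probably retry only on specific errors and back off between attempts; this sketch retries on any exception to keep the control flow visible.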
|
||||
|
||||
The new implementation suffers from two issues, which is why we decided to
|
||||
move the functionality to a plugin instead of replacing the feature in core:
|
||||
|
||||
* It is not as fast as the previous implementation. For most use cases, this
|
||||
difference should not be noticeable but users running delete-by-query on
|
||||
many matching documents may be affected.
|
||||
|
||||
* There is currently no way to monitor or cancel a running delete-by-query
|
||||
request, except for the `timeout` parameter.
|
||||
|
||||
We have plans to solve both of these issues in a later version of Elasticsearch.
|
|
@ -0,0 +1,45 @@
|
|||
[[discovery]]
|
||||
== Discovery Plugins
|
||||
|
||||
Discovery plugins extend Elasticsearch by adding new discovery mechanisms that
|
||||
can be used instead of {ref}/modules-discovery-zen.html[Zen Discovery].
|
||||
|
||||
[float]
|
||||
==== Core discovery plugins
|
||||
|
||||
The core discovery plugins are:
|
||||
|
||||
<<cloud-aws,AWS Cloud>>::
|
||||
|
||||
The Amazon Web Services (AWS) Cloud plugin uses the
|
||||
https://github.com/aws/aws-sdk-java[AWS API] for unicast discovery, and adds
|
||||
support for using S3 as a repository for
|
||||
{ref}/modules-snapshots.html[Snapshot/Restore].
|
||||
|
||||
<<cloud-azure,Azure Cloud>>::
|
||||
|
||||
The Azure Cloud plugin uses the Azure API for unicast discovery, and adds
|
||||
support for using Azure as a repository for
|
||||
{ref}/modules-snapshots.html[Snapshot/Restore].
|
||||
|
||||
<<cloud-gce,GCE Cloud>>::
|
||||
|
||||
The Google Compute Engine Cloud plugin uses the GCE API for unicast discovery.
|
||||
|
||||
[float]
|
||||
==== Community contributed discovery plugins
|
||||
|
||||
A number of discovery plugins have been contributed by our community:
|
||||
|
||||
* https://github.com/grantr/elasticsearch-srv-discovery[DNS SRV Discovery Plugin] (by Grant Rodgers)
|
||||
* https://github.com/shikhar/eskka[eskka Discovery Plugin] (by Shikhar Bhushan)
|
||||
* https://github.com/grmblfrz/elasticsearch-zookeeper[ZooKeeper Discovery Plugin] (by Sonian Inc.)
|
||||
|
||||
include::cloud-aws.asciidoc[]
|
||||
|
||||
include::cloud-azure.asciidoc[]
|
||||
|
||||
include::cloud-gce.asciidoc[]
|
||||
|
||||
|
||||
|
|
@ -0,0 +1,65 @@
|
|||
= Elasticsearch Plugins and Integrations
|
||||
|
||||
:ref: https://www.elastic.co/guide/en/elasticsearch/reference/master
|
||||
:guide: https://www.elastic.co/guide
|
||||
|
||||
[[intro]]
|
||||
== Introduction to plugins
|
||||
|
||||
Plugins are a way to enhance the core Elasticsearch functionality in a custom
|
||||
manner. They can add custom mapping types, custom analyzers, native
|
||||
scripts, custom discovery and more.
|
||||
|
||||
There are three types of plugins:
|
||||
|
||||
Java plugins::
|
||||
|
||||
These plugins contain only JAR files, and must be installed on every node
|
||||
in the cluster. After installation, each node must be restarted before
|
||||
the plugin becomes visible.
|
||||
|
||||
Site plugins::
|
||||
+
|
||||
--
|
||||
|
||||
These plugins contain static web content like JavaScript, HTML, and CSS files,
|
||||
that can be served directly from Elasticsearch. Site plugins may only need to
|
||||
be installed on one node, and do not require a restart to become visible. The
|
||||
content of site plugins is accessible via a URL like:
|
||||
|
||||
http://yournode:9200/_plugin/[plugin name]
|
||||
|
||||
--
|
||||
|
||||
Mixed plugins::
|
||||
|
||||
Mixed plugins contain both JAR files and web content.
|
||||
|
||||
For advice on writing your own plugin, see <<plugin-authors>>.
|
||||
|
||||
include::plugin-script.asciidoc[]
|
||||
|
||||
include::api.asciidoc[]
|
||||
|
||||
include::alerting.asciidoc[]
|
||||
|
||||
include::analysis.asciidoc[]
|
||||
|
||||
include::discovery.asciidoc[]
|
||||
|
||||
include::management.asciidoc[]
|
||||
|
||||
include::mapper.asciidoc[]
|
||||
|
||||
include::scripting.asciidoc[]
|
||||
|
||||
include::security.asciidoc[]
|
||||
|
||||
include::repository.asciidoc[]
|
||||
|
||||
include::transport.asciidoc[]
|
||||
|
||||
include::integrations.asciidoc[]
|
||||
|
||||
include::authors.asciidoc[]
|
||||
|
|
@ -0,0 +1,220 @@
|
|||
[[integrations]]
|
||||
|
||||
== Integrations
|
||||
|
||||
Integrations are not plugins; instead, they are external tools or modules which
|
||||
make it easier to work with Elasticsearch.
|
||||
|
||||
[float]
|
||||
[[cms-integrations]]
|
||||
=== CMS integrations
|
||||
|
||||
[float]
|
||||
==== Supported by the community:
|
||||
|
||||
* http://drupal.org/project/search_api_elasticsearch[Drupal]:
|
||||
Drupal Elasticsearch integration via Search API.
|
||||
|
||||
* https://drupal.org/project/elasticsearch_connector[Drupal]:
|
||||
Drupal Elasticsearch integration.
|
||||
|
||||
* http://searchbox-io.github.com/wp-elasticsearch/[Wp-Elasticsearch]:
|
||||
Elasticsearch WordPress Plugin
|
||||
|
||||
* https://github.com/wallmanderco/elasticsearch-indexer[Elasticsearch Indexer]:
|
||||
Elasticsearch WordPress Plugin
|
||||
|
||||
* https://doc.tiki.org/Elasticsearch[Tiki Wiki CMS Groupware]:
|
||||
Tiki has native support for Elasticsearch. This provides faster & better
|
||||
search (facets, etc), along with some Natural Language Processing features
|
||||
(e.g. More Like This)
|
||||
|
||||
|
||||
[float]
|
||||
[[data-integrations]]
|
||||
=== Data import/export and validation
|
||||
|
||||
NOTE: Rivers were used to import data from external systems into
|
||||
Elasticsearch, but they are no longer supported in Elasticsearch 2.0.
|
||||
|
||||
[float]
|
||||
==== Supported by the community:
|
||||
|
||||
* https://github.com/jprante/elasticsearch-jdbc[JDBC importer]:
|
||||
The Java Database Connectivity (JDBC) importer fetches data from JDBC sources for indexing into Elasticsearch (by Jörg Prante)
|
||||
|
||||
* https://github.com/reachkrishnaraj/kafka-elasticsearch-standalone-consumer[Kafka Standalone Consumer]:
|
||||
An easily scalable and extensible Kafka standalone consumer that reads messages from Kafka, then processes and indexes them in Elasticsearch
|
||||
|
||||
* https://github.com/ozlerhakan/mongolastic[Mongolastic]:
|
||||
A tool that clones data from Elasticsearch to MongoDB and vice versa
|
||||
|
||||
* https://github.com/Aconex/scrutineer[Scrutineer]:
|
||||
A high performance consistency checker to compare what you've indexed
|
||||
with your source of truth content (e.g. DB)
|
||||
|
||||
|
||||
[float]
|
||||
[[deployment]]
|
||||
=== Deployment
|
||||
|
||||
[float]
|
||||
==== Supported by Elasticsearch:
|
||||
|
||||
* https://github.com/elasticsearch/puppet-elasticsearch[Puppet]:
|
||||
Elasticsearch Puppet module.
|
||||
|
||||
[float]
|
||||
==== Supported by the community:
|
||||
|
||||
* http://github.com/elasticsearch/cookbook-elasticsearch[Chef]:
|
||||
Chef cookbook for Elasticsearch
|
||||
|
||||
This project appears to have been abandoned:
|
||||
|
||||
* https://github.com/medcl/salt-elasticsearch[SaltStack]:
|
||||
SaltStack Module for Elasticsearch
|
||||
|
||||
[float]
|
||||
[[framework-integrations]]
|
||||
=== Framework integrations
|
||||
|
||||
[float]
|
||||
==== Supported by the community:
|
||||
|
||||
* http://www.searchtechnologies.com/aspire-for-elasticsearch[Aspire for Elasticsearch]:
|
||||
Aspire, from Search Technologies, is a powerful connector and processing
|
||||
framework designed for unstructured data. It has connectors to internal and
|
||||
external repositories including SharePoint, Documentum, Jive, RDB, file
|
||||
systems, websites and more, and can transform and normalize this data before
|
||||
indexing in Elasticsearch.
|
||||
|
||||
* https://camel.apache.org/elasticsearch.html[Apache Camel Integration]:
|
||||
An Apache Camel component for integrating with Elasticsearch
|
||||
|
||||
* https://metacpan.org/release/Catmandu-Store-ElasticSearch[Catmandu]:
|
||||
An Elasticsearch backend for the Catmandu framework.
|
||||
|
||||
* https://github.com/tlrx/elasticsearch-test[elasticsearch-test]:
|
||||
Elasticsearch Java annotations for unit testing with
|
||||
http://www.junit.org/[JUnit]
|
||||
|
||||
* https://github.com/FriendsOfSymfony/FOSElasticaBundle[FOSElasticaBundle]:
|
||||
Symfony2 Bundle wrapping Elastica.
|
||||
|
||||
* http://grails.org/plugin/elasticsearch[Grails]:
|
||||
Elasticsearch Grails plugin.
|
||||
|
||||
* http://haystacksearch.org/[Haystack]:
|
||||
Modular search for Django
|
||||
|
||||
* https://github.com/cleverage/play2-elasticsearch[play2-elasticsearch]:
|
||||
Elasticsearch module for Play Framework 2.x
|
||||
|
||||
* https://github.com/spring-projects/spring-data-elasticsearch[Spring Data Elasticsearch]:
|
||||
Spring Data implementation for Elasticsearch
|
||||
|
||||
* https://github.com/dadoonet/spring-elasticsearch[Spring Elasticsearch]:
|
||||
Spring Factory for Elasticsearch
|
||||
|
||||
* https://github.com/twitter/storehaus[Twitter Storehaus]:
|
||||
Thin asynchronous Scala client for Storehaus.
|
||||
|
||||
These projects appear to have been abandoned:
|
||||
|
||||
* https://metacpan.org/module/Catalyst::Model::Search::Elasticsearch[Catalyst]:
|
||||
Elasticsearch and Catalyst integration.
|
||||
|
||||
* http://github.com/aparo/django-elasticsearch[django-elasticsearch]:
|
||||
Django Elasticsearch Backend.
|
||||
|
||||
* https://github.com/kzwang/elasticsearch-osem[elasticsearch-osem]:
|
||||
A Java Object Search Engine Mapping (OSEM) for Elasticsearch
|
||||
|
||||
* http://geeks.aretotally.in/play-framework-module-elastic-search-distributed-searching-with-json-http-rest-or-java[Play!Framework]:
|
||||
Integrate with Play! Framework Application.
|
||||
|
||||
* http://code.google.com/p/terrastore/wiki/Search_Integration[Terrastore Search]:
|
||||
http://code.google.com/p/terrastore/[Terrastore] integration module with elasticsearch.
|
||||
|
||||
|
||||
[float]
|
||||
[[hadoop-integrations]]
|
||||
=== Hadoop integrations
|
||||
|
||||
[float]
|
||||
==== Supported by Elasticsearch:
|
||||
|
||||
* link:/guide/en/elasticsearch/hadoop/current/[es-hadoop]: Elasticsearch real-time
|
||||
search and analytics natively integrated with Hadoop. Supports Map/Reduce,
|
||||
Cascading, Apache Hive, Apache Pig, Apache Spark and Apache Storm.
|
||||
|
||||
[float]
|
||||
==== Supported by the community:
|
||||
|
||||
These projects appear to have been abandoned:
|
||||
|
||||
* http://github.com/Aconex/elasticflume[elasticflume]:
|
||||
http://github.com/cloudera/flume[Flume] sink implementation.
|
||||
|
||||
|
||||
* https://github.com/infochimps-labs/wonderdog[Wonderdog]:
|
||||
Hadoop bulk loader into elasticsearch.
|
||||
|
||||
|
||||
[float]
|
||||
[[monitoring-integrations]]
|
||||
=== Health and Performance Monitoring
|
||||
|
||||
[float]
|
||||
==== Supported by the community:
|
||||
|
||||
* https://github.com/anchor/nagios-plugin-elasticsearch[check_elasticsearch]:
|
||||
An Elasticsearch availability and performance monitoring plugin for
|
||||
Nagios.
|
||||
|
||||
* https://github.com/radu-gheorghe/check-es[check-es]:
|
||||
Nagios/Shinken plugins for checking on elasticsearch
|
||||
|
||||
* https://github.com/mattweber/es2graphite[es2graphite]:
|
||||
Send cluster and indices stats and status to Graphite for monitoring and graphing.
|
||||
|
||||
|
||||
* https://itunes.apple.com/us/app/elasticocean/id955278030?ls=1&mt=8[ElasticOcean]:
|
||||
An iOS real-time monitoring tool for Elasticsearch and DigitalOcean that lets you keep an eye on DigitalOcean Droplets, Elasticsearch instances, or both while on the go.
|
||||
|
||||
* https://github.com/rbramley/Opsview-elasticsearch[opsview-elasticsearch]:
|
||||
Opsview plugin written in Perl for monitoring Elasticsearch
|
||||
|
||||
* https://scoutapp.com[Scout]: Provides plugins for monitoring Elasticsearch https://scoutapp.com/plugin_urls/1331-elasticsearch-node-status[nodes], https://scoutapp.com/plugin_urls/1321-elasticsearch-cluster-status[clusters], and https://scoutapp.com/plugin_urls/1341-elasticsearch-index-status[indices].
|
||||
|
||||
* http://sematext.com/spm/index.html[SPM for Elasticsearch]:
|
||||
Performance monitoring with live charts showing cluster and node stats, integrated
|
||||
alerts, email reports, etc.
|
||||
|
||||
|
||||
[[other-integrations]]
|
||||
[float]
|
||||
=== Other integrations
|
||||
|
||||
[float]
|
||||
==== Supported by the community:
|
||||
|
||||
* https://github.com/kodcu/pes[Pes]:
|
||||
A pluggable elastic JavaScript query DSL builder for Elasticsearch
|
||||
|
||||
* https://www.wireshark.org/[Wireshark]:
|
||||
Protocol dissection for Zen discovery, HTTP and the binary protocol
|
||||
|
||||
|
||||
These projects appear to have been abandoned:
|
||||
|
||||
* http://www.github.com/neogenix/daikon[daikon]:
|
||||
Daikon Elasticsearch CLI
|
||||
|
||||
* https://github.com/fullscale/dangle[dangle]:
|
||||
A set of AngularJS directives that provide common visualizations for elasticsearch based on
|
||||
D3.
|
||||
* https://github.com/OlegKunitsyn/eslogd[eslogd]:
|
||||
Linux daemon that replicates events to a central Elasticsearch server in real-time
|
||||
|
|
@ -0,0 +1,193 @@
|
|||
[[lang-javascript]]
|
||||
=== JavaScript Language Plugin
|
||||
|
||||
The JavaScript language plugin enables the use of JavaScript in Elasticsearch
|
||||
scripts, via Mozilla's
|
||||
https://developer.mozilla.org/en-US/docs/Mozilla/Projects/Rhino[Rhino JavaScript] engine.
|
||||
|
||||
[[lang-javascript-install]]
|
||||
[float]
|
||||
==== Installation
|
||||
|
||||
This plugin can be installed using the plugin manager:
|
||||
|
||||
[source,sh]
|
||||
----------------------------------------------------------------
|
||||
sudo bin/plugin install lang-javascript
|
||||
----------------------------------------------------------------
|
||||
|
||||
The plugin must be installed on every node in the cluster, and each node must
|
||||
be restarted after installation.
|
||||
|
||||
[[lang-javascript-remove]]
|
||||
[float]
|
||||
==== Removal
|
||||
|
||||
The plugin can be removed with the following command:
|
||||
|
||||
[source,sh]
|
||||
----------------------------------------------------------------
|
||||
sudo bin/plugin remove lang-javascript
|
||||
----------------------------------------------------------------
|
||||
|
||||
The node must be stopped before removing the plugin.
|
||||
|
||||
[[lang-javascript-usage]]
|
||||
==== Using JavaScript in Elasticsearch
|
||||
|
||||
Once the plugin has been installed, JavaScript can be used as a scripting
|
||||
language by setting the `lang` parameter to `javascript` or `js`.
|
||||
|
||||
Scripting is available in many APIs, but we will use an example with the
|
||||
`function_score` query for demonstration purposes:
|
||||
|
||||
[[lang-javascript-inline]]
|
||||
[float]
|
||||
=== Inline scripts
|
||||
|
||||
WARNING: Enabling inline scripting on an unprotected Elasticsearch cluster is dangerous.
|
||||
See <<lang-javascript-file>> for a safer option.
|
||||
|
||||
If you have enabled {ref}/modules-scripting.html#enable-dynamic-scripting[inline scripts],
|
||||
you can use JavaScript as follows:
|
||||
|
||||
[source,json]
|
||||
----
|
||||
DELETE test
|
||||
|
||||
PUT test/doc/1
|
||||
{
|
||||
"num": 1.0
|
||||
}
|
||||
|
||||
PUT test/doc/2
|
||||
{
|
||||
"num": 2.0
|
||||
}
|
||||
|
||||
GET test/_search
|
||||
{
|
||||
"query": {
|
||||
"function_score": {
|
||||
"script_score": {
|
||||
"script": {
|
||||
"inline": "doc[\"num\"].value * factor",
|
||||
"lang": "javascript",
|
||||
"params": {
|
||||
"factor": 2
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
----
|
||||
// AUTOSENSE
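
The search body used in these examples differs only in how the script is referenced: `inline` for the script source, `id` for an indexed script, or `file` for a script file. As an illustration, the Python sketch below builds that body for any of the three. The helper name `script_score_search` is hypothetical, and the code only constructs JSON; it does not send anything to Elasticsearch.

```python
import json


def script_score_search(script_key, script_value, lang, params):
    """Build a `function_score` search body that scores via a script.

    Hypothetical helper: `script_key` is one of `inline`, `id` or
    `file`, matching the three ways of referencing a script.
    """
    if script_key not in ("inline", "id", "file"):
        raise ValueError("script_key must be inline, id or file")
    script = {script_key: script_value, "lang": lang, "params": dict(params)}
    return {"query": {"function_score": {"script_score": {"script": script}}}}


body = script_score_search(
    "inline", 'doc["num"].value * factor', "javascript", {"factor": 2}
)
print(json.dumps(body, indent=2))
```

Building the body as a dictionary and serialising it with `json.dumps` also takes care of escaping the double quotes inside the script source.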
|
||||
|
||||
[[lang-javascript-indexed]]
|
||||
[float]
|
||||
=== Indexed scripts
|
||||
|
||||
WARNING: Enabling indexed scripting on an unprotected Elasticsearch cluster is dangerous.
|
||||
See <<lang-javascript-file>> for a safer option.
|
||||
|
||||
If you have enabled {ref}/modules-scripting.html#enable-dynamic-scripting[indexed scripts],
|
||||
you can use JavaScript as follows:
|
||||
|
||||
[source,json]
|
||||
----
|
||||
DELETE test
|
||||
|
||||
PUT test/doc/1
|
||||
{
|
||||
"num": 1.0
|
||||
}
|
||||
|
||||
PUT test/doc/2
|
||||
{
|
||||
"num": 2.0
|
||||
}
|
||||
|
||||
POST _scripts/javascript/my_script <1>
|
||||
{
|
||||
"script": "doc[\"num\"].value * factor"
|
||||
}
|
||||
|
||||
GET test/_search
|
||||
{
|
||||
"query": {
|
||||
"function_score": {
|
||||
"script_score": {
|
||||
"script": {
|
||||
"id": "my_script", <2>
|
||||
"lang": "javascript",
|
||||
"params": {
|
||||
"factor": 2
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
----
|
||||
// AUTOSENSE
|
||||
|
||||
<1> We index the script under the id `my_script`.
|
||||
<2> The function score query retrieves the script with id `my_script`.
|
||||
|
||||
|
||||
[[lang-javascript-file]]
|
||||
[float]
|
||||
=== File scripts
|
||||
|
||||
You can save your scripts to a file in the `config/scripts/` directory on
|
||||
every node. The `.javascript` file suffix identifies the script as containing
|
||||
JavaScript:
|
||||
|
||||
First, save this file as `config/scripts/my_script.javascript` on every node
|
||||
in the cluster:
|
||||
|
||||
[source,js]
|
||||
----
|
||||
doc["num"].value * factor
|
||||
----
|
||||
|
||||
then use the script as follows:
|
||||
|
||||
[source,json]
|
||||
----
|
||||
DELETE test
|
||||
|
||||
PUT test/doc/1
|
||||
{
|
||||
"num": 1.0
|
||||
}
|
||||
|
||||
PUT test/doc/2
|
||||
{
|
||||
"num": 2.0
|
||||
}
|
||||
|
||||
GET test/_search
|
||||
{
|
||||
"query": {
|
||||
"function_score": {
|
||||
"script_score": {
|
||||
"script": {
|
||||
"file": "my_script", <1>
|
||||
"lang": "javascript",
|
||||
"params": {
|
||||
"factor": 2
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
----
|
||||
// AUTOSENSE
|
||||
|
||||
<1> The function score query retrieves the script with filename `my_script.javascript`.
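
Since the script file has to exist on every node, it can be convenient to script its creation. The following Python sketch writes a script into `config/scripts/` below a given Elasticsearch home directory. The function name `install_file_script` and its arguments are assumptions made for this example, and it uses a temporary directory in place of a real installation.

```python
import tempfile
from pathlib import Path


def install_file_script(es_home, name, source, suffix="javascript"):
    """Write `source` to `<es_home>/config/scripts/<name>.<suffix>`.

    Hypothetical helper: the suffix tells Elasticsearch which
    language the file contains.
    """
    scripts_dir = Path(es_home) / "config" / "scripts"
    scripts_dir.mkdir(parents=True, exist_ok=True)
    target = scripts_dir / "{}.{}".format(name, suffix)
    target.write_text(source + "\n", encoding="utf-8")
    return target


# A temporary directory stands in for the real Elasticsearch home.
with tempfile.TemporaryDirectory() as es_home:
    path = install_file_script(es_home, "my_script",
                               'doc["num"].value * factor')
    print(path.relative_to(es_home))
```

Running the same step on each node, for example through a configuration management tool, keeps the script identical across the cluster.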
|
||||
|
|
@ -0,0 +1,192 @@
|
|||
[[lang-python]]
|
||||
=== Python Language Plugin
|
||||
|
||||
The Python language plugin enables the use of Python in Elasticsearch
|
||||
scripts, via the http://www.jython.org/[Jython] Java implementation of Python.
|
||||
|
||||
[[lang-python-install]]
|
||||
[float]
|
||||
==== Installation
|
||||
|
||||
This plugin can be installed using the plugin manager:
|
||||
|
||||
[source,sh]
|
||||
----------------------------------------------------------------
|
||||
sudo bin/plugin install lang-python
|
||||
----------------------------------------------------------------
|
||||
|
||||
The plugin must be installed on every node in the cluster, and each node must
|
||||
be restarted after installation.
|
||||
|
||||
[[lang-python-remove]]
|
||||
[float]
|
||||
==== Removal
|
||||
|
||||
The plugin can be removed with the following command:
|
||||
|
||||
[source,sh]
|
||||
----------------------------------------------------------------
|
||||
sudo bin/plugin remove lang-python
|
||||
----------------------------------------------------------------
|
||||
|
||||
The node must be stopped before removing the plugin.
|
||||
|
||||
[[lang-python-usage]]
|
||||
==== Using Python in Elasticsearch
|
||||
|
||||
Once the plugin has been installed, Python can be used as a scripting
|
||||
language by setting the `lang` parameter to `python`.
|
||||
|
||||
Scripting is available in many APIs, but we will use an example with the
|
||||
`function_score` query for demonstration purposes:
|
||||
|
||||
[[lang-python-inline]]
|
||||
[float]
|
||||
=== Inline scripts
|
||||
|
||||
WARNING: Enabling inline scripting on an unprotected Elasticsearch cluster is dangerous.
|
||||
See <<lang-python-file>> for a safer option.
|
||||
|
||||
If you have enabled {ref}/modules-scripting.html#enable-dynamic-scripting[inline scripts],
|
||||
you can use Python as follows:
|
||||
|
||||
[source,json]
|
||||
----
|
||||
DELETE test
|
||||
|
||||
PUT test/doc/1
|
||||
{
|
||||
"num": 1.0
|
||||
}
|
||||
|
||||
PUT test/doc/2
|
||||
{
|
||||
"num": 2.0
|
||||
}
|
||||
|
||||
GET test/_search
|
||||
{
|
||||
"query": {
|
||||
"function_score": {
|
||||
"script_score": {
|
||||
"script": {
|
||||
"inline": "doc[\"num\"].value * factor",
|
||||
"lang": "python",
|
||||
"params": {
|
||||
"factor": 2
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
----
|
||||
// AUTOSENSE
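
To see what the script expression computes for the two test documents, it can be evaluated locally with stand-in objects. This Python sketch is only an approximation for illustration: the `FieldValues` class and `evaluate_script` function are hypothetical and do not reflect how the plugin runs scripts inside Elasticsearch.

```python
class FieldValues:
    """Minimal stand-in for the object returned by doc["field"]."""

    def __init__(self, value):
        self.value = value


def evaluate_script(script, source, params):
    """Evaluate a script expression against one document.

    Hypothetical helper: `source` maps field names to values, and
    `params` plays the role of the script's `params` object.
    """
    doc = {name: FieldValues(value) for name, value in source.items()}
    scope = dict(params)
    scope["doc"] = doc
    # Builtins are emptied because the expression needs none of them.
    return eval(script, {"__builtins__": {}}, scope)


script = 'doc["num"].value * factor'
for source in ({"num": 1.0}, {"num": 2.0}):
    print(source, evaluate_script(script, source, {"factor": 2}))
```

The expression multiplies the `num` field by the `factor` parameter, so the document with the larger `num` receives the larger script value.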
|
||||
|
||||
[[lang-python-indexed]]
|
||||
[float]
|
||||
=== Indexed scripts
|
||||
|
||||
WARNING: Enabling indexed scripting on an unprotected Elasticsearch cluster is dangerous.
|
||||
See <<lang-python-file>> for a safer option.
|
||||
|
||||
If you have enabled {ref}/modules-scripting.html#enable-dynamic-scripting[indexed scripts],
|
||||
you can use Python as follows:
|
||||
|
||||
[source,json]
|
||||
----
|
||||
DELETE test
|
||||
|
||||
PUT test/doc/1
|
||||
{
|
||||
"num": 1.0
|
||||
}
|
||||
|
||||
PUT test/doc/2
|
||||
{
|
||||
"num": 2.0
|
||||
}
|
||||
|
||||
POST _scripts/python/my_script <1>
|
||||
{
|
||||
"script": "doc[\"num\"].value * factor"
|
||||
}
|
||||
|
||||
GET test/_search
|
||||
{
|
||||
"query": {
|
||||
"function_score": {
|
||||
"script_score": {
|
||||
"script": {
|
||||
"id": "my_script", <2>
|
||||
"lang": "python",
|
||||
"params": {
|
||||
"factor": 2
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
----
|
||||
// AUTOSENSE
|
||||
|
||||
<1> We index the script under the id `my_script`.
|
||||
<2> The function score query retrieves the script with id `my_script`.
|
||||
|
||||
|
||||
[[lang-python-file]]
|
||||
[float]
|
||||
=== File scripts
|
||||
|
||||
You can save your scripts to a file in the `config/scripts/` directory on
|
||||
every node. The `.python` file suffix identifies the script as containing
|
||||
Python:
|
||||
|
||||
First, save this file as `config/scripts/my_script.python` on every node
|
||||
in the cluster:
|
||||
|
||||
[source,python]
|
||||
----
|
||||
doc["num"].value * factor
|
||||
----
|
||||
|
||||
then use the script as follows:
|
||||
|
||||
[source,json]
|
||||
----
|
||||
DELETE test
|
||||
|
||||
PUT test/doc/1
|
||||
{
|
||||
"num": 1.0
|
||||
}
|
||||
|
||||
PUT test/doc/2
|
||||
{
|
||||
"num": 2.0
|
||||
}
|
||||
|
||||
GET test/_search
|
||||
{
|
||||
"query": {
|
||||
"function_score": {
|
||||
"script_score": {
|
||||
"script": {
|
||||
"file": "my_script", <1>
|
||||
"lang": "python",
|
||||
"params": {
|
||||
"factor": 2
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
----
|
||||
// AUTOSENSE
|
||||
|
||||
<1> The function score query retrieves the script with filename `my_script.python`.
|
||||
|
|
@ -0,0 +1,46 @@
|
|||
[[management]]
|
||||
== Management and Site Plugins
|
||||
|
||||
Management and site plugins offer UIs for managing and interacting with
|
||||
Elasticsearch.
|
||||
|
||||
[float]
|
||||
=== Core management plugins
|
||||
|
||||
The core management plugins are:
|
||||
|
||||
link:/products/marvel[Marvel]::
|
||||
|
||||
Marvel is a management and monitoring product for Elasticsearch. Marvel
|
||||
aggregates cluster wide statistics and events and offers a single interface to
|
||||
view and analyze them. Marvel is free for development use but requires a
|
||||
license to run in production.
|
||||
|
||||
https://github.com/elastic/elasticsearch-migration[Migration]::
|
||||
|
||||
This plugin will help you to check whether you can upgrade directly to
|
||||
Elasticsearch version 2.x, or whether you need to make changes to your data
|
||||
before doing so. It will run on Elasticsearch versions 0.90.x to 1.x.
|
||||
|
||||
[float]
|
||||
=== Community contributed management and site plugins
|
||||
|
||||
A number of plugins have been contributed by our community:
|
||||
|
||||
* https://github.com/lukas-vlcek/bigdesk[BigDesk Plugin] (by Lukáš Vlček)
|
||||
|
||||
* https://github.com/spinscale/elasticsearch-graphite-plugin[Elasticsearch Graphite Plugin]:
|
||||
Regularly updates a graphite host with indices stats and nodes stats (by Alexander Reelsen)
|
||||
|
||||
* https://github.com/mobz/elasticsearch-head[Elasticsearch Head Plugin] (by Ben Birch)
|
||||
* https://github.com/royrusso/elasticsearch-HQ[Elasticsearch HQ] (by Roy Russo)
|
||||
* https://github.com/andrewvc/elastic-hammer[Hammer Plugin] (by Andrew Cholakian)
|
||||
* https://github.com/polyfractal/elasticsearch-inquisitor[Inquisitor Plugin] (by Zachary Tong)
|
||||
* https://github.com/lmenezes/elasticsearch-kopf[Kopf Plugin] (by lmenezes)
|
||||
|
||||
These community plugins appear to have been abandoned:
|
||||
|
||||
* https://github.com/karmi/elasticsearch-paramedic[Paramedic Plugin] (by Karel Minařík)
|
||||
* https://github.com/polyfractal/elasticsearch-segmentspy[SegmentSpy Plugin] (by Zachary Tong)
|
||||
* https://github.com/xyu/elasticsearch-whatson[Whatson Plugin] (by Xiao Yu)
|
||||
|
|
@ -1,9 +1,41 @@
|
|||
[[mapping-size-field]]
|
||||
=== `_size` field
|
||||
[[mapper-size]]
|
||||
=== Mapper Size Plugin
|
||||
|
||||
The `_size` field, when enabled, indexes the size in bytes of the original
|
||||
<<mapping-source-field,`_source`>>. In order to enable it, set
|
||||
the mapping as follows:
|
||||
The mapper-size plugin provides the `_size` meta field which, when enabled,
|
||||
indexes the size in bytes of the original
|
||||
{ref}/mapping-source-field.html[`_source`] field.
|
||||
|
||||
[[mapper-size-install]]
|
||||
[float]
|
||||
==== Installation
|
||||
|
||||
This plugin can be installed using the plugin manager:
|
||||
|
||||
[source,sh]
|
||||
----------------------------------------------------------------
|
||||
sudo bin/plugin install mapper-size
|
||||
----------------------------------------------------------------
|
||||
|
||||
The plugin must be installed on every node in the cluster, and each node must
|
||||
be restarted after installation.
|
||||
|
||||
[[mapper-size-remove]]
|
||||
[float]
|
||||
==== Removal
|
||||
|
||||
The plugin can be removed with the following command:
|
||||
|
||||
[source,sh]
|
||||
----------------------------------------------------------------
|
||||
sudo bin/plugin remove mapper-size
|
||||
----------------------------------------------------------------
|
||||
|
||||
The node must be stopped before removing the plugin.
|
||||
|
||||
[[mapper-size-usage]]
|
||||
==== Using the `_size` field
|
||||
|
||||
In order to enable the `_size` field, set the mapping as follows:
|
||||
|
||||
[source,js]
|
||||
--------------------------
|
|
@ -0,0 +1,18 @@
|
|||
[[mapper]]
|
||||
== Mapper Plugins
|
||||
|
||||
Mapper plugins allow new field datatypes to be added to Elasticsearch.
|
||||
|
||||
[float]
|
||||
=== Core mapper plugins
|
||||
|
||||
The core mapper plugins are:
|
||||
|
||||
<<mapper-size>>::
|
||||
|
||||
The mapper-size plugin provides the `_size` meta field which, when enabled,
|
||||
indexes the size in bytes of the original
|
||||
{ref}/mapping-source-field.html[`_source`] field.
|
||||
|
||||
include::mapper-size.asciidoc[]
|
||||
|
|
@ -0,0 +1,240 @@
|
|||
[[plugin-management]]
|
||||
== Plugin Management
|
||||
|
||||
The `plugin` script is used to install, list, and remove plugins. It is
|
||||
located in the `$ES_HOME/bin` directory by default but it may be in a
|
||||
{ref}/setup-dir-layout.html[different location] if you installed Elasticsearch
|
||||
with an RPM or deb package.
|
||||
|
||||
Run the following command to get usage instructions:
|
||||
|
||||
[source,shell]
|
||||
-----------------------------------
|
||||
sudo bin/plugin -h
|
||||
-----------------------------------
|
||||
|
||||
[[installation]]
|
||||
=== Installing Plugins
|
||||
|
||||
The documentation for each plugin usually includes specific installation
|
||||
instructions for that plugin, but below we document the various available
|
||||
options:
|
||||
|
||||
[float]
|
||||
=== Core Elasticsearch plugins
|
||||
|
||||
Core Elasticsearch plugins can be installed as follows:
|
||||
|
||||
[source,shell]
|
||||
-----------------------------------
|
||||
sudo bin/plugin install [plugin_name]
|
||||
-----------------------------------
|
||||
|
||||
For instance, to install the core <<analysis-icu,ICU plugin>>, just run the
|
||||
following command:
|
||||
|
||||
[source,shell]
|
||||
-----------------------------------
|
||||
sudo bin/plugin install analysis-icu
|
||||
-----------------------------------
|
||||
|
||||
This command will install the version of the plugin that matches your
|
||||
Elasticsearch version.
|
||||
|
||||
[float]
|
||||
=== Community and non-core plugins
|
||||
|
||||
Non-core plugins provided by Elasticsearch, or plugins provided by the
|
||||
community, can be installed from `download.elastic.co`, from Maven (Central
|
||||
and Sonatype), or from GitHub. In this case, the command is as follows:
|
||||
|
||||
[source,shell]
|
||||
-----------------------------------
|
||||
sudo bin/plugin install [org]/[user|component]/[version]
|
||||
-----------------------------------
|
||||
|
||||
For instance, to install the https://github.com/lmenezes/elasticsearch-kopf[Kopf]
|
||||
plugin from GitHub, run one of the following commands:
|
||||
|
||||
[source,shell]
|
||||
-----------------------------------
|
||||
sudo bin/plugin install lmenezes/elasticsearch-kopf <1>
|
||||
sudo bin/plugin install lmenezes/elasticsearch-kopf/1.x <2>
|
||||
-----------------------------------
|
||||
<1> Installs the latest version from GitHub.
|
||||
<2> Installs the 1.x version from GitHub.
|
||||
|
||||
When installing from Maven Central/Sonatype, `[org]` should be replaced by
|
||||
the artifact `groupId`, and `[user|component]` by the `artifactId`. For
|
||||
instance, to install the
|
||||
https://github.com/elastic/elasticsearch-mapper-attachments[mapper attachment]
|
||||
plugin from Sonatype, run:
|
||||
|
||||
[source,shell]
|
||||
-----------------------------------
|
||||
sudo bin/plugin install org.elasticsearch/elasticsearch-mapper-attachments/2.6.0 <1>
|
||||
-----------------------------------
|
||||
<1> When installing from `download.elastic.co` or from Maven Central/Sonatype, the
|
||||
version is required.
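
The argument forms shown above can be told apart by counting the slash-separated parts. The following is a minimal sketch only: the function name `classify_plugin_arg` and the labels it prints are hypothetical, and it mirrors the documented forms rather than the way the `plugin` script itself resolves names.

```shell
# Classify a `plugin install` argument by the forms documented above.
# ASSUMPTION: this is an illustration only; the real script does its own
# resolution against download.elastic.co, Maven and GitHub.
classify_plugin_arg() {
  case "$1" in
    "")      echo "unrecognised" ;;                      # nothing given
    */*/*/*) echo "unrecognised" ;;                      # more than three parts
    */*/*)   echo "org/component/version" ;;             # e.g. Maven or a GitHub branch
    */*)     echo "user/repository without version" ;;   # e.g. latest from GitHub
    *)       echo "core plugin name" ;;                  # e.g. analysis-icu
  esac
}
```

For example, `classify_plugin_arg lmenezes/elasticsearch-kopf/1.x` reports the three-part form, while `classify_plugin_arg analysis-icu` reports a core plugin name.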
|
||||
|
||||
[float]
|
||||
=== Custom URL or file system
|
||||
|
||||
A plugin can also be downloaded directly from a custom location by specifying the URL:
|
||||
|
||||
[source,shell]
|
||||
-----------------------------------
|
||||
sudo bin/plugin install [plugin-name] --url [url] <1>
|
||||
-----------------------------------
|
||||
<1> Both the URL and the plugin name must be specified.
|
||||
|
||||
For instance, to install a plugin from your local file system, you could run:
|
||||
|
||||
[source,shell]
|
||||
-----------------------------------
|
||||
sudo bin/plugin install my_plugin --url file:/path/to/plugin.zip
|
||||
-----------------------------------
|
||||
|
||||
[[listing-removing]]
|
||||
=== Listing and Removing Installed Plugins
|
||||
|
||||
[float]
|
||||
=== Listing plugins
|
||||
|
||||
A list of the currently loaded plugins can be retrieved with the `list` option:
|
||||
|
||||
[source,shell]
|
||||
-----------------------------------
|
||||
sudo bin/plugin list
|
||||
-----------------------------------
|
||||
|
||||
Alternatively, use the {ref}/cluster-nodes-info.html[node-info API] to find
|
||||
out which plugins are installed on each node in the cluster.
|
||||
|
||||
[float]
|
||||
=== Removing plugins
|
||||
|
||||
Plugins can be removed manually, by deleting the appropriate directory under
|
||||
`plugins/`, or using the `plugin` script:
|
||||
|
||||
[source,shell]
|
||||
-----------------------------------
|
||||
sudo bin/plugin remove [pluginname]
|
||||
-----------------------------------
|
||||
|
||||
=== Other command line parameters
|
||||
|
||||
The `plugin` script supports a number of other command line parameters:
|
||||
|
||||
[float]
|
||||
=== Silent/Verbose mode
|
||||
|
||||
The `--verbose` parameter outputs more debug information, while the `--silent`
|
||||
parameter turns off all output. The script may return the following exit
|
||||
codes:
|
||||
|
||||
[horizontal]
|
||||
`0`:: everything was OK
|
||||
`64`:: unknown command or incorrect option parameter
|
||||
`74`:: IO error
|
||||
`70`:: any other error
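
A wrapper script might translate these exit codes into messages. The sketch below assumes nothing beyond the four codes listed above; the function name `describe_plugin_exit` is hypothetical, and any other code is reported as undocumented rather than guessed at.

```shell
# Translate a `plugin` script exit code into the meaning documented above.
describe_plugin_exit() {
  case "$1" in
    0)  echo "everything was OK" ;;
    64) echo "unknown command or incorrect option parameter" ;;
    74) echo "IO error" ;;
    70) echo "any other error" ;;
    *)  echo "undocumented exit code: $1" ;;
  esac
}

# Example use (commented out so the sketch runs without Elasticsearch):
#   sudo bin/plugin install analysis-icu --silent
#   describe_plugin_exit "$?"
```

Because `--silent` turns off all output, checking the exit code this way is the only feedback such a wrapper would get.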
|
||||
|
||||
[float]
|
||||
=== Custom config directory
|
||||
|
||||
If your `elasticsearch.yml` config file is in a custom location, you will need
|
||||
to specify the path to the config file when using the `plugin` script. You
|
||||
can do this as follows:
|
||||
|
||||
[source,sh]
|
||||
---------------------
|
||||
sudo bin/plugin -Des.path.conf=/path/to/custom/config/dir install <plugin name>
|
||||
---------------------
|
||||
|
||||
You can also set the `CONF_DIR` environment variable to the custom config
|
||||
directory path.
|
||||
|
||||
[float]
|
||||
=== Timeout settings
|
||||
|
||||
By default, the `plugin` script will wait indefinitely when downloading before
|
||||
failing. The `--timeout` parameter can be used to explicitly specify how long it
|
||||
waits. Here are some examples of setting it to different values:
|
||||
|
||||
[source,shell]
|
||||
-----------------------------------
|
||||
# Wait for 30 seconds before failing
|
||||
sudo bin/plugin install mobz/elasticsearch-head --timeout 30s
|
||||
|
||||
# Wait for 1 minute before failing
|
||||
sudo bin/plugin install mobz/elasticsearch-head --timeout 1m
|
||||
|
||||
# Wait forever (default)
|
||||
sudo bin/plugin install mobz/elasticsearch-head --timeout 0
|
||||
-----------------------------------
|
||||
|
||||
[float]
|
||||
=== Proxy settings
|
||||
|
||||
To install a plugin via a proxy, you can pass the proxy details in with the
|
||||
Java system properties `proxyHost` and `proxyPort`. On Unix-based systems, these
|
||||
options can be set on the command line:
|
||||
|
||||
[source,shell]
|
||||
-----------------------------------
|
||||
sudo bin/plugin install mobz/elasticsearch-head -DproxyHost=host_name -DproxyPort=port_number
|
||||
-----------------------------------
|
||||
|
||||
On Windows, they need to be added to the `JAVA_OPTS` environment variable:
|
||||
|
||||
[source,shell]
|
||||
-----------------------------------
|
||||
set JAVA_OPTS="-DproxyHost=host_name -DproxyPort=port_number"
|
||||
bin/plugin install mobz/elasticsearch-head
|
||||
-----------------------------------
|
||||
|
||||
=== Settings related to plugins
|
||||
|
||||
[float]
|
||||
=== Custom plugins directory
|
||||
|
||||
The `plugins` directory can be changed from the default by adding the
|
||||
following to the `elasticsearch.yml` config file:
|
||||
|
||||
[source,yml]
|
||||
---------------------
|
||||
path.plugins: /path/to/custom/plugins/dir
|
||||
---------------------
|
||||
|
||||
The default location of the `plugins` directory depends on
|
||||
{ref}/setup-dir-layout.html[which package you install].
|
||||
|
||||
[float]
|
||||
=== Mandatory Plugins
|
||||
|
||||
If you rely on some plugins, you can define mandatory plugins by adding the
|
||||
`plugin.mandatory` setting to the `config/elasticsearch.yml` file, for
|
||||
example:
|
||||
|
||||
[source,yaml]
|
||||
--------------------------------------------------
|
||||
plugin.mandatory: mapper-attachments,lang-groovy
|
||||
--------------------------------------------------
|
||||
|
||||
For safety reasons, a node will not start if it is missing a mandatory plugin.
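
Before starting a node, it can be useful to check which mandatory plugins are missing. The following is a rough sketch, not part of Elasticsearch: the function name `missing_mandatory_plugins` is hypothetical, and it assumes that each plugin is installed in a directory named after the plugin under the `plugins` directory.

```shell
# Print every name from a comma-separated plugin.mandatory value that has no
# matching directory under the given plugins directory.
# ASSUMPTION: each plugin lives in a directory named exactly like the plugin.
missing_mandatory_plugins() {
  mandatory="$1"     # e.g. "mapper-attachments,lang-groovy"
  plugins_dir="$2"   # e.g. "$ES_HOME/plugins"
  old_ifs="$IFS"
  IFS=','            # split the setting value on commas
  for name in $mandatory; do
    [ -d "$plugins_dir/$name" ] || echo "$name"
  done
  IFS="$old_ifs"     # restore the caller's field separator
}
```

Calling `missing_mandatory_plugins "mapper-attachments,lang-groovy" /path/to/plugins` prints one missing plugin name per line, and prints nothing when all of them are present.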
|
||||
|
||||
[float]
|
||||
=== Lucene version dependent plugins
|
||||
|
||||
For some plugins, such as analysis plugins, a specific major Lucene version is
|
||||
required to run. In that case, the plugin provides in its
|
||||
`es-plugin.properties` file the Lucene version for which the plugin was built.
|
||||
|
||||
If this version is present, the node will check it at startup before loading
|
||||
the plugin. You can disable that check using:
|
||||
|
||||
[source,yaml]
|
||||
--------------------------------------------------
|
||||
plugins.check_lucene: false
|
||||
--------------------------------------------------
|
||||
|
|
@ -0,0 +1,37 @@
|
|||
[[repository]]
|
||||
== Snapshot/Restore Repository Plugins
|
||||
|
||||
Repository plugins extend the {ref}/modules-snapshots.html[Snapshot/Restore]
|
||||
functionality in Elasticsearch by adding repositories backed by the cloud or
|
||||
by distributed file systems:
|
||||
|
||||
[float]
|
||||
==== Core repository plugins
|
||||
|
||||
The core repository plugins are:
|
||||
|
||||
<<cloud-aws,AWS Cloud>>::
|
||||
|
||||
The Amazon Web Service (AWS) Cloud plugin adds support for using S3 as a
|
||||
repository.
|
||||
|
||||
<<cloud-azure,Azure Cloud>>::
|
||||
|
||||
The Azure Cloud plugin adds support for using Azure as a repository.
|
||||
|
||||
https://github.com/elastic/elasticsearch-hadoop/tree/master/repository-hdfs[Hadoop HDFS Repository]::
|
||||
|
||||
The Hadoop HDFS Repository plugin adds support for using an HDFS file system
|
||||
as a repository.
|
||||
|
||||
|
||||
[float]
|
||||
=== Community contributed repository plugins
|
||||
|
||||
The following plugin has been contributed by our community:
|
||||
|
||||
* https://github.com/wikimedia/search-repository-swift[Openstack Swift] (by Wikimedia Foundation)
|
||||
|
||||
This community plugin appears to have been abandoned:
|
||||
|
||||
* https://github.com/kzwang/elasticsearch-repository-gridfs[GridFS] Repository (by Kevin Wang)
|
|
@ -0,0 +1,32 @@
|
|||
[[scripting]]
|
||||
== Scripting Plugins
|
||||
|
||||
Scripting plugins extend the scripting functionality in Elasticsearch to allow
|
||||
the use of other scripting languages.
|
||||
|
||||
[float]
|
||||
=== Core scripting plugins
|
||||
|
||||
The core scripting plugins are:
|
||||
|
||||
<<lang-javascript,JavaScript Language>>::
|
||||
|
||||
The JavaScript language plugin enables the use of JavaScript in Elasticsearch
|
||||
scripts, via Mozilla's
|
||||
https://developer.mozilla.org/en-US/docs/Mozilla/Projects/Rhino[Rhino JavaScript] engine.
|
||||
|
||||
<<lang-python,Python Language>>::
|
||||
|
||||
The Python language plugin enables the use of Python in Elasticsearch
|
||||
scripts, via the http://www.jython.org/[Jython] Java implementation of Python.
|
||||
|
||||
[float]
|
||||
=== Abandoned community scripting plugins
|
||||
|
||||
This plugin has been contributed by our community, but appears to be abandoned:
|
||||
|
||||
* https://github.com/hiredman/elasticsearch-lang-clojure[Clojure Language Plugin] (by Kevin Downey)
|
||||
|
||||
include::lang-javascript.asciidoc[]
|
||||
|
||||
include::lang-python.asciidoc[]
|
|
@ -0,0 +1,29 @@
|
|||
[[security]]
|
||||
== Security Plugins
|
||||
|
||||
Security plugins add a security layer to Elasticsearch.
|
||||
|
||||
[float]
|
||||
=== Core security plugins
|
||||
|
||||
The core security plugins are:
|
||||
|
||||
link:/products/shield[Shield]::
|
||||
|
||||
Shield is the Elastic product that makes it easy for anyone to add
|
||||
enterprise-grade security to their ELK stack. Designed to address the growing security
|
||||
needs of thousands of enterprises using ELK today, Shield provides peace of
|
||||
mind when it comes to protecting your data.
|
||||
|
||||
[float]
|
||||
=== Community contributed security plugins
|
||||
|
||||
The following plugin has been contributed by our community:
|
||||
|
||||
* https://github.com/sscarduzio/elasticsearch-readonlyrest-plugin[Readonly REST]:
|
||||
High performance access control for Elasticsearch native REST API (by Simone Scarduzio)
|
||||
|
||||
This community plugin appears to have been abandoned:
|
||||
|
||||
* https://github.com/sonian/elasticsearch-jetty[Jetty HTTP transport plugin]:
|
||||
Uses Jetty to provide SSL connections, basic authentication, and request logging (by Sonian Inc.)
|
|
@ -0,0 +1,22 @@
|
|||
[[transport]]
|
||||
== Transport Plugins
|
||||
|
||||
Transport plugins offer alternatives to HTTP.
|
||||
|
||||
[float]
|
||||
=== Core transport plugins
|
||||
|
||||
The core transport plugins are:
|
||||
|
||||
https://github.com/elasticsearch/elasticsearch-transport-wares[Servlet transport]::
|
||||
|
||||
Use the REST interface over servlets.
|
||||
|
||||
[float]
|
||||
=== Community contributed transport plugins
|
||||
|
||||
The following community plugins appear to have been abandoned:
|
||||
|
||||
* https://github.com/kzwang/elasticsearch-transport-redis[Redis transport plugin] (by Kevin Wang)
|
||||
* https://github.com/tlrx/transport-zeromq[ØMQ transport plugin] (by Tanguy Leroux)
|
||||
|
|
@ -5,6 +5,7 @@
|
|||
:branch: 2.0
|
||||
:jdk: 1.8.0_25
|
||||
:defguide: https://www.elastic.co/guide/en/elasticsearch/guide/current
|
||||
:plugins: https://www.elastic.co/guide/en/elasticsearch/plugins/master
|
||||
|
||||
include::getting-started.asciidoc[]
|
||||
|
||||
|
|
|
@ -32,9 +32,10 @@ can be customised when a mapping type is created.
|
|||
|
||||
The original JSON representing the body of the document.
|
||||
|
||||
<<mapping-size-field,`_size`>>::
|
||||
{plugins}/mapper-size.html[`_size`]::
|
||||
|
||||
The size of the `_source` field in bytes.
|
||||
The size of the `_source` field in bytes, provided by the
|
||||
{plugins}/mapper-size.html[`mapper-size` plugin].
|
||||
|
||||
[float]
|
||||
=== Indexing meta-fields
|
||||
|
|
|
@ -9,288 +9,4 @@ custom manner. They range from adding custom mapping types, custom
|
|||
analyzers (in a more built in fashion), native scripts, custom discovery
|
||||
and more.
|
||||
|
||||
[float]
|
||||
[[installing]]
|
||||
==== Installing plugins
|
||||
|
||||
Installing plugins can either be done manually by placing them under the
|
||||
`plugins` directory, or using the `plugin` script.
|
||||
|
||||
Installing plugins typically takes the following form:
|
||||
|
||||
[source,sh]
|
||||
-----------------------------------
|
||||
bin/plugin install plugin_name
|
||||
-----------------------------------
|
||||
|
||||
The plugin will be automatically downloaded in this case from `download.elastic.co` download service using the
|
||||
same version as your elasticsearch version.
|
||||
|
||||
For older versions of Elasticsearch (prior to 2.0.0) or community plugins, you would use the following form:
|
||||
|
||||
[source,sh]
|
||||
-----------------------------------
|
||||
bin/plugin install <org>/<user/component>/<version>
|
||||
-----------------------------------
|
||||
|
||||
The plugins will be automatically downloaded in this case from `download.elastic.co` (for older plugins),
|
||||
and in case they don't exist there, from maven (central and sonatype).
|
||||
|
||||
Note that when the plugin is located in maven central or sonatype
|
||||
repository, `<org>` is the artifact `groupId` and `<user/component>` is
|
||||
the `artifactId`.
|
||||
|
||||
A plugin can also be installed directly by specifying the URL for it,
|
||||
for example:
|
||||
|
||||
[source,sh]
|
||||
-----------------------------------
|
||||
bin/plugin install plugin-name --url file:///path/to/plugin
|
||||
-----------------------------------
|
||||
|
||||
|
||||
You can run `bin/plugin -h`, or `bin/plugin install -h` for help on the install command
|
||||
as well as `bin/plugin remove -h` for help on the remove command.
|
||||
|
||||
[float]
|
||||
[[site-plugins]]
|
||||
==== Site Plugins
|
||||
|
||||
Plugins can have "sites" in them, any plugin that exists under the
|
||||
`plugins` directory with a `_site` directory, its content will be
|
||||
statically served when hitting `/_plugin/[plugin_name]/` url. Those can
|
||||
be added even after the process has started.
|
||||
|
||||
Installed plugins that do not contain any java related content, will
|
||||
automatically be detected as site plugins, and their content will be
|
||||
moved under `_site`.
|
||||
|
||||
The ability to install plugins from GitHub makes it easy to install site
|
||||
plugins hosted there by downloading the actual repo, for example,
|
||||
running:
|
||||
|
||||
[source,js]
|
||||
--------------------------------------------------
|
||||
bin/plugin install mobz/elasticsearch-head
|
||||
bin/plugin install lukas-vlcek/bigdesk
|
||||
--------------------------------------------------
|
||||
|
||||
Will install both of those site plugins, with `elasticsearch-head`
|
||||
available under `http://localhost:9200/_plugin/head/` and `bigdesk`
|
||||
available under `http://localhost:9200/_plugin/bigdesk/`.
|
||||
|
||||
[float]
|
||||
==== Mandatory Plugins
|
||||
|
||||
If you rely on some plugins, you can define mandatory plugins using the
|
||||
`plugin.mandatory` attribute, for example, here is a sample config:
|
||||
|
||||
[source,js]
|
||||
--------------------------------------------------
|
||||
plugin.mandatory: mapper-attachments,lang-groovy
|
||||
--------------------------------------------------
|
||||
|
||||
For safety reasons, if a mandatory plugin is not installed, the node
|
||||
will not start.
|
||||
|
||||
[float]
|
||||
==== Installed Plugins
|
||||
|
||||
A list of the currently loaded plugins can be retrieved using the
|
||||
<<cluster-nodes-info,Nodes Info API>>.
|
||||
|
||||
[float]
|
||||
==== Removing plugins
|
||||
|
||||
Removing plugins can either be done manually by removing them under the
|
||||
`plugins` directory, or using the `plugin` script.
|
||||
|
||||
Removing plugins typically takes the following form:
|
||||
|
||||
[source,sh]
|
||||
-----------------------------------
|
||||
plugin remove <pluginname>
|
||||
-----------------------------------
|
||||
|
||||
[float]
|
||||
==== Silent/Verbose mode
|
||||
|
||||
When running the `plugin` script, you can get more information (debug mode) using `--verbose`.
|
||||
Conversely, if you want the `plugin` script to be silent, use the `--silent` option.
|
||||
|
||||
Note that exit codes could be:
|
||||
|
||||
* `0`: everything was OK
|
||||
* `64`: unknown command or incorrect option parameter
|
||||
* `74`: IO error
|
||||
* `70`: other errors
|
||||
|
||||
[source,sh]
|
||||
-----------------------------------
|
||||
bin/plugin install mobz/elasticsearch-head --verbose
|
||||
plugin remove head --silent
|
||||
-----------------------------------
|
||||
|
||||
[float]
|
||||
==== Timeout settings
|
||||
|
||||
By default, the `plugin` script will wait indefinitely when downloading before failing.
|
||||
The timeout parameter can be used to explicitly specify how long it waits. Here are some examples of setting it to
|
||||
different values:
|
||||
|
||||
[source,sh]
|
||||
-----------------------------------
|
||||
# Wait for 30 seconds before failing
|
||||
bin/plugin install mobz/elasticsearch-head --timeout 30s
|
||||
|
||||
# Wait for 1 minute before failing
|
||||
bin/plugin install mobz/elasticsearch-head --timeout 1m
|
||||
|
||||
# Wait forever (default)
|
||||
bin/plugin install mobz/elasticsearch-head --timeout 0
|
||||
-----------------------------------
|
||||
|
||||
[float]
|
||||
==== Proxy settings
|
||||
|
||||
|
||||
To install a plugin via a proxy, you can pass the proxy details using the Java system properties `proxyHost` and `proxyPort`.
|
||||
|
||||
[source,sh]
|
||||
-----------------------------------
|
||||
set JAVA_OPTS="-DproxyHost=host_name -DproxyPort=port_number"
|
||||
bin/plugin install mobz/elasticsearch-head
|
||||
-----------------------------------
|
||||
|
||||
[float]
|
||||
==== Lucene version dependent plugins
|
||||
|
||||
For some plugins, such as analysis plugins, a specific major Lucene version is
|
||||
required to run. In that case, the plugin provides in its `es-plugin.properties`
|
||||
file the Lucene version for which the plugin was built.
|
||||
|
||||
If present at startup the node will check the Lucene version before loading the plugin.
|
||||
|
||||
You can disable that check using `plugins.check_lucene: false`.
|
||||
|
||||
[float]
|
||||
[[known-plugins]]
|
||||
=== Known Plugins
|
||||
|
||||
[float]
|
||||
[[analysis-plugins]]
|
||||
==== Analysis Plugins
|
||||
|
||||
.Supported by Elasticsearch
|
||||
* https://github.com/elasticsearch/elasticsearch-analysis-icu[ICU Analysis plugin]
|
||||
* https://github.com/elasticsearch/elasticsearch-analysis-kuromoji[Japanese (Kuromoji) Analysis plugin].
|
||||
* https://github.com/elasticsearch/elasticsearch-analysis-smartcn[Smart Chinese Analysis Plugin]
|
||||
* https://github.com/elasticsearch/elasticsearch-analysis-stempel[Stempel (Polish) Analysis plugin]
|
||||
|
||||
.Supported by the community
|
||||
* https://github.com/barminator/elasticsearch-analysis-annotation[Annotation Analysis Plugin] (by Michal Samek)
|
||||
* https://github.com/yakaz/elasticsearch-analysis-combo/[Combo Analysis Plugin] (by Olivier Favre, Yakaz)
|
||||
* https://github.com/jprante/elasticsearch-analysis-hunspell[Hunspell Analysis Plugin] (by Jörg Prante)
|
||||
* https://github.com/medcl/elasticsearch-analysis-ik[IK Analysis Plugin] (by Medcl)
|
||||
* https://github.com/suguru/elasticsearch-analysis-japanese[Japanese Analysis plugin] (by suguru).
|
||||
* https://github.com/medcl/elasticsearch-analysis-mmseg[Mmseg Analysis Plugin] (by Medcl)
|
||||
* https://github.com/chytreg/elasticsearch-analysis-morfologik[Morfologik (Polish) Analysis plugin] (by chytreg)
|
||||
* https://github.com/imotov/elasticsearch-analysis-morphology[Russian and English Morphological Analysis Plugin] (by Igor Motov)
|
||||
* https://github.com/synhershko/elasticsearch-analysis-hebrew[Hebrew Analysis Plugin] (by Itamar Syn-Hershko)
|
||||
* https://github.com/medcl/elasticsearch-analysis-pinyin[Pinyin Analysis Plugin] (by Medcl)
|
||||
* https://github.com/medcl/elasticsearch-analysis-string2int[String2Integer Analysis Plugin] (by Medcl)
|
||||
* https://github.com/duydo/elasticsearch-analysis-vietnamese[Vietnamese Analysis Plugin] (by Duy Do)
|
||||
|
||||
[float]
|
||||
[[discovery-plugins]]
|
||||
==== Discovery Plugins
|
||||
|
||||
.Supported by Elasticsearch
|
||||
* https://github.com/elasticsearch/elasticsearch-cloud-aws[AWS Cloud Plugin] - EC2 discovery and S3 Repository
|
||||
* https://github.com/elasticsearch/elasticsearch-cloud-azure[Azure Cloud Plugin] - Azure discovery
|
||||
* https://github.com/elasticsearch/elasticsearch-cloud-gce[Google Compute Engine Cloud Plugin] - GCE discovery
|
||||
|
||||
.Supported by the community
|
||||
* https://github.com/shikhar/eskka[eskka Discovery Plugin] (by Shikhar Bhushan)
|
||||
* https://github.com/grantr/elasticsearch-srv-discovery[DNS SRV Discovery Plugin] (by Grant Rodgers)
|
||||
|
||||
[float]
|
||||
[[transport]]
|
||||
==== Transport Plugins
|
||||
|
||||
.Supported by Elasticsearch
|
||||
* https://github.com/elasticsearch/elasticsearch-transport-wares[Servlet transport]
|
||||
|
||||
.Supported by the community
|
||||
* https://github.com/tlrx/transport-zeromq[ZeroMQ transport layer plugin] (by Tanguy Leroux)
|
||||
* https://github.com/sonian/elasticsearch-jetty[Jetty HTTP transport plugin] (by Sonian Inc.)
|
||||
* https://github.com/kzwang/elasticsearch-transport-redis[Redis transport plugin] (by Kevin Wang)
|
||||
|
||||
[float]
|
||||
[[scripting]]
|
||||
==== Scripting Plugins
|
||||
|
||||
.Supported by Elasticsearch
|
||||
* https://github.com/elasticsearch/elasticsearch-lang-groovy[Groovy lang Plugin]
|
||||
* https://github.com/elasticsearch/elasticsearch-lang-javascript[JavaScript language Plugin]
|
||||
* https://github.com/elasticsearch/elasticsearch-lang-python[Python language Plugin]
|
||||
|
||||
.Supported by the community
|
||||
* https://github.com/hiredman/elasticsearch-lang-clojure[Clojure Language Plugin] (by Kevin Downey)
|
||||
* https://github.com/NLPchina/elasticsearch-sql/[SQL language Plugin] (by nlpcn)
|
||||
|
||||
[float]
|
||||
[[site]]
|
||||
==== Site Plugins
|
||||
|
||||
.Supported by the community
|
||||
* https://github.com/lukas-vlcek/bigdesk[BigDesk Plugin] (by Lukáš Vlček)
|
||||
* https://github.com/mobz/elasticsearch-head[Elasticsearch Head Plugin] (by Ben Birch)
|
||||
* https://github.com/royrusso/elasticsearch-HQ[Elasticsearch HQ] (by Roy Russo)
|
||||
* https://github.com/andrewvc/elastic-hammer[Hammer Plugin] (by Andrew Cholakian)
|
||||
* https://github.com/polyfractal/elasticsearch-inquisitor[Inquisitor Plugin] (by Zachary Tong)
|
||||
* https://github.com/karmi/elasticsearch-paramedic[Paramedic Plugin] (by Karel Minařík)
|
||||
* https://github.com/polyfractal/elasticsearch-segmentspy[SegmentSpy Plugin] (by Zachary Tong)
|
||||
* https://github.com/xyu/elasticsearch-whatson[Whatson Plugin] (by Xiao Yu)
|
||||
* https://github.com/lmenezes/elasticsearch-kopf[Kopf Plugin] (by lmenezes)
|
||||
|
||||
[float]
|
||||
[[repository-plugins]]
|
||||
==== Snapshot/Restore Repository Plugins
|
||||
|
||||
.Supported by Elasticsearch
|
||||
|
||||
* https://github.com/elasticsearch/elasticsearch-hadoop/tree/master/repository-hdfs[Hadoop HDFS] Repository
|
||||
* https://github.com/elasticsearch/elasticsearch-cloud-aws#s3-repository[AWS S3] Repository
|
||||
|
||||
.Supported by the community
|
||||
|
||||
* https://github.com/kzwang/elasticsearch-repository-gridfs[GridFS] Repository (by Kevin Wang)
|
||||
* https://github.com/wikimedia/search-repository-swift[Openstack Swift]
|
||||
|
||||
[float]
|
||||
[[misc]]
|
||||
==== Misc Plugins
|
||||
|
||||
.Supported by Elasticsearch
|
||||
* https://github.com/elasticsearch/elasticsearch-mapper-attachments[Mapper Attachments Type plugin]
|
||||
|
||||
.Supported by the community
|
||||
* https://github.com/carrot2/elasticsearch-carrot2[carrot2 Plugin]: Results clustering with carrot2 (by Dawid Weiss)
|
||||
* https://github.com/derryx/elasticsearch-changes-plugin[Elasticsearch Changes Plugin] (by Thomas Peuss)
|
||||
* https://github.com/johtani/elasticsearch-extended-analyze[Extended Analyze Plugin] (by Jun Ohtani)
|
||||
* https://github.com/YannBrrd/elasticsearch-entity-resolution[Entity Resolution Plugin] using http://github.com/larsga/Duke[Duke] for duplication detection (by Yann Barraud)
|
||||
* https://github.com/spinscale/elasticsearch-graphite-plugin[Elasticsearch Graphite Plugin] (by Alexander Reelsen)
|
||||
* https://github.com/mattweber/elasticsearch-mocksolrplugin[Elasticsearch Mock Solr Plugin] (by Matt Weber)
|
||||
* https://github.com/viniciusccarvalho/elasticsearch-newrelic[Elasticsearch New Relic Plugin] (by Vinicius Carvalho)
|
||||
* https://github.com/swoop-inc/elasticsearch-statsd-plugin[Elasticsearch Statsd Plugin] (by Swoop Inc.)
|
||||
* https://github.com/endgameinc/elasticsearch-term-plugin[Terms Component Plugin] (by Endgame Inc.)
|
||||
* http://tlrx.github.com/elasticsearch-view-plugin[Elasticsearch View Plugin] (by Tanguy Leroux)
|
||||
* https://github.com/sonian/elasticsearch-zookeeper[ZooKeeper Discovery Plugin] (by Sonian Inc.)
|
||||
* https://github.com/kzwang/elasticsearch-image[Elasticsearch Image Plugin] (by Kevin Wang)
|
||||
* https://github.com/wikimedia/search-highlighter[Elasticsearch Experimental Highlighter] (by Wikimedia Foundation/Nik Everett)
|
||||
* https://github.com/wikimedia/search-extra[Elasticsearch Trigram Accelerated Regular Expression Filter] (by Wikimedia Foundation/Nik Everett)
|
||||
* https://github.com/salyh/elasticsearch-security-plugin[Elasticsearch Security Plugin] (by Hendrik Saly)
|
||||
* https://github.com/codelibs/elasticsearch-taste[Elasticsearch Taste Plugin] (by CodeLibs Project)
|
||||
* http://siren.solutions/siren/downloads/[Elasticsearch SIREn Plugin]: Nested data search (by SIREn Solutions)
|
||||
|
||||
See the {plugins}/index.html[Plugins documentation] for more.
|
||||
|
|
|
@ -1,290 +0,0 @@
|
|||
ICU Analysis for Elasticsearch
|
||||
==================================
|
||||
|
||||
The ICU Analysis plugin integrates the Lucene ICU module into Elasticsearch, adding ICU-related analysis components.
|
||||
|
||||
In order to install the plugin, simply run:
|
||||
|
||||
```sh
|
||||
bin/plugin install elasticsearch/elasticsearch-analysis-icu/2.5.0
|
||||
```
|
||||
|
||||
You need to install a version matching your Elasticsearch version:
|
||||
|
||||
| elasticsearch | ICU Analysis Plugin | Docs |
|
||||
|---------------|-----------------------|------------|
|
||||
| master | Build from source | See below |
|
||||
| es-1.x | Build from source | [2.6.0-SNAPSHOT](https://github.com/elastic/elasticsearch-analysis-icu/tree/es-1.x/#version-260-snapshot-for-elasticsearch-1x) |
|
||||
| es-1.5 | 2.5.0 | [2.5.0](https://github.com/elastic/elasticsearch-analysis-icu/tree/v2.5.0/#version-250-for-elasticsearch-15) |
|
||||
| es-1.4 | 2.4.3 | [2.4.3](https://github.com/elasticsearch/elasticsearch-analysis-icu/tree/v2.4.3/#version-243-for-elasticsearch-14) |
|
||||
| < 1.4.5 | 2.4.2 | [2.4.2](https://github.com/elastic/elasticsearch-analysis-icu/tree/v2.4.2/#version-242-for-elasticsearch-14) |
|
||||
| < 1.4.3 | 2.4.1 | [2.4.1](https://github.com/elastic/elasticsearch-analysis-icu/tree/v2.4.1/#version-241-for-elasticsearch-14) |
|
||||
| es-1.3 | 2.3.0 | [2.3.0](https://github.com/elastic/elasticsearch-analysis-icu/tree/v2.3.0/#icu-analysis-for-elasticsearch) |
|
||||
| es-1.2 | 2.2.0 | [2.2.0](https://github.com/elastic/elasticsearch-analysis-icu/tree/v2.2.0/#icu-analysis-for-elasticsearch) |
|
||||
| es-1.1 | 2.1.0 | [2.1.0](https://github.com/elastic/elasticsearch-analysis-icu/tree/v2.1.0/#icu-analysis-for-elasticsearch) |
|
||||
| es-1.0 | 2.0.0 | [2.0.0](https://github.com/elastic/elasticsearch-analysis-icu/tree/v2.0.0/#icu-analysis-for-elasticsearch) |
|
||||
| es-0.90 | 1.13.0 | [1.13.0](https://github.com/elastic/elasticsearch-analysis-icu/tree/v1.13.0/#icu-analysis-for-elasticsearch) |
|
||||
|
||||
To build a `SNAPSHOT` version, you need to build it with Maven:
|
||||
|
||||
```bash
|
||||
mvn clean install
|
||||
plugin install analysis-icu \
|
||||
--url file:target/releases/elasticsearch-analysis-icu-X.X.X-SNAPSHOT.zip
|
||||
```
|
||||
|
||||
|
||||
ICU Normalization
|
||||
-----------------
|
||||
|
||||
Normalizes characters as explained [here](http://userguide.icu-project.org/transforms/normalization). It registers itself by default under `icu_normalizer` or `icuNormalizer` using the default settings. It also accepts a `name` parameter, which can take the following values: `nfc`, `nfkc`, and `nfkc_cf`. Here are sample settings:
|
||||
|
||||
```js
|
||||
{
|
||||
"index" : {
|
||||
"analysis" : {
|
||||
"analyzer" : {
|
||||
"normalized" : {
|
||||
"tokenizer" : "keyword",
|
||||
"filter" : ["icu_normalizer"]
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
```
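To get a feel for what the `name` values mean, the normal forms can be tried outside Elasticsearch. The sketch below uses Python's standard `unicodedata` module rather than ICU. `nfkc_cf` has no direct standard-library equivalent, so NFKC combined with `str.casefold()` is used as a rough stand-in; that substitution is an assumption and not ICU's exact NFKC_Casefold mapping.

```python
import unicodedata


def normalize(text, name):
    """Approximate the `name` values of icu_normalizer with the stdlib."""
    if name == "nfc":
        return unicodedata.normalize("NFC", text)
    if name == "nfkc":
        return unicodedata.normalize("NFKC", text)
    if name == "nfkc_cf":
        # Stand-in only: ICU's NFKC_Casefold also drops default-ignorable
        # code points, which this sketch does not attempt.
        folded = unicodedata.normalize("NFKC", text).casefold()
        return unicodedata.normalize("NFKC", folded)
    raise ValueError("unsupported name: %r" % name)


# "five Cafe" written with an fi ligature (U+FB01) and a combining acute accent.
sample = "\ufb01ve Cafe\u0301"

nfc = normalize(sample, "nfc")          # ligature kept, accent composed
nfkc = normalize(sample, "nfkc")        # ligature expanded to "fi"
nfkc_cf = normalize(sample, "nfkc_cf")  # additionally case-folded
```

Only the compatibility forms expand the ligature, which is why they are usually the better fit for search.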
|
||||
|
||||
ICU Folding
|
||||
-----------
|
||||
|
||||
Folds Unicode characters based on `UTR#30`. It registers itself under the `icu_folding` and `icuFolding` names. Sample setting:
|
||||
|
||||
```js
|
||||
{
|
||||
"index" : {
|
||||
"analysis" : {
|
||||
"analyzer" : {
|
||||
"folded" : {
|
||||
"tokenizer" : "keyword",
|
||||
"filter" : ["icu_folding"]
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
ICU Filtering
|
||||
-------------
|
||||
|
||||
The folding can be filtered by a set of unicode characters with the parameter `unicodeSetFilter`. This is useful for a
|
||||
non-internationalized search engine where retaining a set of national characters which are primary letters in a specific
|
||||
language is wanted. See syntax for the UnicodeSet [here](http://icu-project.org/apiref/icu4j/com/ibm/icu/text/UnicodeSet.html).
|
||||
|
||||
The following example exempts Swedish characters from the folding. Note that the filtered characters are NOT lowercased, which is why the `lowercase` filter is added after it.
|
||||
|
||||
```js
|
||||
{
|
||||
"index" : {
|
||||
"analysis" : {
|
||||
"analyzer" : {
|
||||
"folding" : {
|
||||
"tokenizer" : "standard",
|
||||
"filter" : ["my_icu_folding", "lowercase"]
|
||||
}
|
||||
},
|
||||
"filter" : {
|
||||
"my_icu_folding" : {
|
||||
"type" : "icu_folding",
|
||||
"unicodeSetFilter" : "[^åäöÅÄÖ]"
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
```
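The effect of the `unicodeSetFilter` can be imitated in a few lines. This is a rough sketch that uses Python's standard `unicodedata` module and only strips combining accents; real UTR#30 folding in ICU covers far more than that. The function name `fold_accents` is made up for the illustration.

```python
import unicodedata

# Mirrors the [^åäöÅÄÖ] filter: these six characters are exempt from folding.
PROTECTED = set("\u00e5\u00e4\u00f6\u00c5\u00c4\u00d6")


def fold_accents(text, protected=PROTECTED):
    """Strip combining accents from every character not in `protected`."""
    out = []
    for ch in text:
        if ch in protected:
            out.append(ch)  # exempt characters pass through untouched
            continue
        decomposed = unicodedata.normalize("NFD", ch)
        out.append("".join(c for c in decomposed if not unicodedata.combining(c)))
    return "".join(out)


folded = fold_accents("\u00c5sa M\u00fcller caf\u00e9")  # "Åsa Müller café"
lowered = folded.lower()  # the separate lowercase step from the analyzer chain
```

The `Å` survives folding with its case intact, so the lowercase step is still needed afterwards.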
|
||||
|
||||
ICU Collation
|
||||
-------------
|
||||
|
||||
Uses the collation token filter. You can either specify the rules for collation
|
||||
(defined [here](http://www.icu-project.org/userguide/Collate_Customization.html)) using the `rules` parameter
|
||||
(can point to a location or expressed in the settings, location can be relative to config location), or using the
|
||||
`language` parameter (further specialized by country and variant). By default registers under `icu_collation` or
|
||||
`icuCollation` and uses the default locale.
|
||||
|
||||
Here is a sample setting:
|
||||
|
||||
```js
|
||||
{
|
||||
"index" : {
|
||||
"analysis" : {
|
||||
"analyzer" : {
|
||||
"collation" : {
|
||||
"tokenizer" : "keyword",
|
||||
"filter" : ["icu_collation"]
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
And here is a sample of custom collation:
|
||||
|
||||
```js
|
||||
{
|
||||
"index" : {
|
||||
"analysis" : {
|
||||
"analyzer" : {
|
||||
"collation" : {
|
||||
"tokenizer" : "keyword",
|
||||
"filter" : ["myCollator"]
|
||||
}
|
||||
},
|
||||
"filter" : {
|
||||
"myCollator" : {
|
||||
"type" : "icu_collation",
|
||||
"language" : "en"
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
Optional settings:
|
||||
* `strength` - The strength property determines the minimum level of difference considered significant during comparison.
|
||||
The default strength for the Collator is `tertiary`, unless specified otherwise by the locale used to create the Collator.
|
||||
Possible values: `primary`, `secondary`, `tertiary`, `quaternary` or `identical`.
|
||||
See [ICU Collation](http://icu-project.org/apiref/icu4j/com/ibm/icu/text/Collator.html) documentation for a more detailed
|
||||
explanation for the specific values.
|
||||
* `decomposition` - Possible values: `no` or `canonical`. Defaults to `no`. Setting this decomposition property with
|
||||
`canonical` allows the Collator to handle un-normalized text properly, producing the same results as if the text were
|
||||
normalized. If `no` is set, it is the user's responsibility to ensure that all text is already in the appropriate form
|
||||
before a comparison or before getting a CollationKey. Adjusting decomposition mode allows the user to select between
|
||||
faster and more complete collation behavior. Since a great many of the world's languages do not require text
|
||||
normalization, most locales set `no` as the default decomposition mode.
|
||||
|
||||
Expert options:
|
||||
* `alternate` - Possible values: `shifted` or `non-ignorable`. Sets the alternate handling for strength `quaternary`
|
||||
to be either shifted or non-ignorable, which boils down to ignoring punctuation and whitespace.
|
||||
* `caseLevel` - Possible values: `true` or `false`. Default is `false`. Whether case level sorting is required. When
|
||||
strength is set to `primary` this will ignore accent differences.
|
||||
* `caseFirst` - Possible values: `lower` or `upper`. Useful to control which case is sorted first when case is not ignored
|
||||
for strength `tertiary`.
|
||||
* `numeric` - Possible values: `true` or `false`. Whether digits are sorted according to numeric representation. For
|
||||
example the value `egg-9` is sorted before the value `egg-21`. Defaults to `false`.
|
||||
* `variableTop` - Single character or contraction. Controls what is variable for `alternate`.
|
||||
* `hiraganaQuaternaryMode` - Possible values: `true` or `false`. Defaults to `false`. Distinguishing between Katakana
|
||||
and Hiragana characters in `quaternary` strength.
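The `numeric` option is easiest to see with the `egg-9` / `egg-21` values mentioned above. The following is a plain-Python sketch of the idea (digit runs compared as numbers), not the ICU collator itself; `numeric_key` is a hypothetical helper name.

```python
import re


def numeric_key(value):
    """Split a string into text and digit runs so digit runs compare as integers."""
    parts = re.split(r"(\d+)", value)
    # With a capturing group, odd indexes always hold the digit runs.
    return [int(part) if index % 2 else part for index, part in enumerate(parts)]


values = ["egg-21", "egg-9", "egg-100"]

plain_order = sorted(values)                     # character-by-character comparison
numeric_order = sorted(values, key=numeric_key)  # digits compared by numeric value
```

With `numeric` left at `false`, `egg-100` would sort first because `1` precedes `2` and `9` as a character.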
|
||||
|
||||
ICU Tokenizer
|
||||
-------------
|
||||
|
||||
Breaks text into words according to [UAX #29: Unicode Text Segmentation](http://www.unicode.org/reports/tr29/).
|
||||
|
||||
```js
|
||||
{
|
||||
"index" : {
|
||||
"analysis" : {
|
||||
"analyzer" : {
|
||||
"tokenized" : {
|
||||
"tokenizer" : "icu_tokenizer"
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
|
||||
ICU Normalization CharFilter
|
||||
-----------------
|
||||
|
||||
Normalizes characters as explained [here](http://userguide.icu-project.org/transforms/normalization).
|
||||
It registers itself by default under `icu_normalizer` or `icuNormalizer` using the default settings.
|
||||
The `name` parameter accepts one of the following values: `nfc`, `nfkc`, and `nfkc_cf`.
|
||||
The `mode` parameter accepts one of the following values: `compose` and `decompose`.
|
||||
Use `decompose` with `nfc` or `nfkc`, to get `nfd` or `nfkd`, respectively.
|
||||
Here is a sample setting:
|
||||
|
||||
```js
|
||||
{
|
||||
"index" : {
|
||||
"analysis" : {
|
||||
"analyzer" : {
|
||||
"normalized" : {
|
||||
"tokenizer" : "keyword",
|
||||
"char_filter" : ["icu_normalizer"]
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
```
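How `name` and `mode` combine can be checked with Python's standard `unicodedata` module. This is only an illustration of the resulting normal forms under the mapping described above (`nfc` + `decompose` gives NFD, `nfkc` + `decompose` gives NFKD); it does not call ICU, and `normalize_chars` is a made-up name.

```python
import unicodedata

# (name, mode) pairs from the text mapped to the stdlib's normal-form names.
FORMS = {
    ("nfc", "compose"): "NFC",
    ("nfc", "decompose"): "NFD",
    ("nfkc", "compose"): "NFKC",
    ("nfkc", "decompose"): "NFKD",
}


def normalize_chars(text, name, mode):
    """Normalize `text` with the form selected by `name` and `mode`."""
    return unicodedata.normalize(FORMS[(name, mode)], text)


composed = normalize_chars("e\u0301", "nfc", "compose")      # single code point U+00E9
decomposed = normalize_chars("\u00e9", "nfc", "decompose")   # "e" followed by U+0301
expanded = normalize_chars("\ufb01", "nfkc", "decompose")    # fi ligature becomes "fi"
```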
|
||||
|
||||
ICU Transform
|
||||
-------------
|
||||
Transforms are used to process Unicode text in many different ways. Some include case mapping, normalization,
|
||||
transliteration and bidirectional text handling.
|
||||
|
||||
You can define transliterator identifiers by using the `id` property, and set the direction to `forward` or `reverse` by
|
||||
using the `dir` property. The default values of these properties are `Null` and `forward`, respectively.
|
||||
|
||||
For example:
|
||||
|
||||
```js
|
||||
{
|
||||
"index" : {
|
||||
"analysis" : {
|
||||
"analyzer" : {
|
||||
"latin" : {
|
||||
"tokenizer" : "keyword",
|
||||
"filter" : ["myLatinTransform"]
|
||||
}
|
||||
},
|
||||
"filter" : {
|
||||
"myLatinTransform" : {
|
||||
"type" : "icu_transform",
|
||||
"id" : "Any-Latin; NFD; [:Nonspacing Mark:] Remove; NFC"
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
This transform transliterates characters to Latin, separates accents from their base characters, removes the accents,
|
||||
and then puts the remaining text into an unaccented form.
|
||||
|
||||
The results are:
|
||||
|
||||
`你好` to `ni hao`
|
||||
|
||||
`здравствуйте` to `zdravstvujte`
|
||||
|
||||
`こんにちは` to `kon'nichiha`
|
||||
|
||||
Currently the filter only supports identifier and direction, custom rulesets are not yet supported.
|
||||
|
||||
For more documentation, please see the [user guide of ICU Transform](http://userguide.icu-project.org/transforms/general).
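The accent-removal part of the `id` chain used in the example (`NFD; [:Nonspacing Mark:] Remove; NFC`) can be sketched with Python's standard library. The `Any-Latin` step needs ICU's transliteration data and is not reproduced here, so this sketch only covers text that is already in Latin script; `strip_nonspacing_marks` is a hypothetical name.

```python
import unicodedata


def strip_nonspacing_marks(text):
    """Decompose, drop nonspacing marks (category Mn), then recompose."""
    decomposed = unicodedata.normalize("NFD", text)
    without_marks = "".join(
        ch for ch in decomposed if unicodedata.category(ch) != "Mn"
    )
    return unicodedata.normalize("NFC", without_marks)


plain = strip_nonspacing_marks("cr\u00e8me br\u00fbl\u00e9e")  # "crème brûlée"
```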
|
||||
|
||||
License
|
||||
-------
|
||||
|
||||
This software is licensed under the Apache 2 license, quoted below.
|
||||
|
||||
Copyright 2009-2014 Elasticsearch <http://www.elasticsearch.org>
|
||||
|
||||
Licensed under the Apache License, Version 2.0 (the "License"); you may not
|
||||
use this file except in compliance with the License. You may obtain a copy of
|
||||
the License at
|
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
Unless required by applicable law or agreed to in writing, software
|
||||
distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
|
||||
WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
|
||||
License for the specific language governing permissions and limitations under
|
||||
the License.
|
|
@ -1,552 +0,0 @@
|
|||
Japanese (kuromoji) Analysis for Elasticsearch
|
||||
==================================
|
||||
|
||||
The Japanese (kuromoji) Analysis plugin integrates the Lucene kuromoji analysis module into Elasticsearch.
|
||||
|
||||
In order to install the plugin, run:
|
||||
|
||||
```sh
|
||||
bin/plugin install elasticsearch/elasticsearch-analysis-kuromoji/2.5.0
|
||||
```
|
||||
|
||||
You need to install a version matching your Elasticsearch version:
|
||||
|
||||
| elasticsearch | Kuromoji Analysis Plugin | Docs |
|
||||
|---------------|-----------------------------|------------|
|
||||
| master | Build from source | See below |
|
||||
| es-1.x | Build from source | [2.6.0-SNAPSHOT](https://github.com/elasticsearch/elasticsearch-analysis-kuromoji/tree/es-1.x/#version-260-snapshot-for-elasticsearch-1x) |
|
||||
| es-1.5 | 2.5.0 | [2.5.0](https://github.com/elastic/elasticsearch-analysis-kuromoji/tree/v2.5.0/#version-250-for-elasticsearch-15) |
|
||||
| es-1.4 | 2.4.3 | [2.4.3](https://github.com/elasticsearch/elasticsearch-analysis-kuromoji/tree/v2.4.3/#version-243-for-elasticsearch-14) |
|
||||
| < 1.4.5 | 2.4.2 | [2.4.2](https://github.com/elasticsearch/elasticsearch-analysis-kuromoji/tree/v2.4.2/#version-242-for-elasticsearch-14) |
|
||||
| < 1.4.3 | 2.4.1 | [2.4.1](https://github.com/elasticsearch/elasticsearch-analysis-kuromoji/tree/v2.4.1/#version-241-for-elasticsearch-14) |
|
||||
| es-1.3 | 2.3.0 | [2.3.0](https://github.com/elasticsearch/elasticsearch-analysis-kuromoji/tree/v2.3.0/#japanese-kuromoji-analysis-for-elasticsearch) |
|
||||
| es-1.2 | 2.2.0 | [2.2.0](https://github.com/elasticsearch/elasticsearch-analysis-kuromoji/tree/v2.2.0/#japanese-kuromoji-analysis-for-elasticsearch) |
|
||||
| es-1.1 | 2.1.0 | [2.1.0](https://github.com/elasticsearch/elasticsearch-analysis-kuromoji/tree/v2.1.0/#japanese-kuromoji-analysis-for-elasticsearch) |
|
||||
| es-1.0 | 2.0.0 | [2.0.0](https://github.com/elasticsearch/elasticsearch-analysis-kuromoji/tree/v2.0.0/#japanese-kuromoji-analysis-for-elasticsearch) |
|
||||
| es-0.90 | 1.8.0 | [1.8.0](https://github.com/elasticsearch/elasticsearch-analysis-kuromoji/tree/v1.8.0/#japanese-kuromoji-analysis-for-elasticsearch) |
|
||||
|
||||
To build a `SNAPSHOT` version, you need to build it with Maven:
|
||||
|
||||
```bash
|
||||
mvn clean install
|
||||
plugin install analysis-kuromoji \
|
||||
--url file:target/releases/elasticsearch-analysis-kuromoji-X.X.X-SNAPSHOT.zip
|
||||
```
|
||||
|
||||
Includes Analyzer, Tokenizer, TokenFilter, CharFilter
|
||||
-----------------------------------------------
|
||||
|
||||
The plugin includes the following char filter, analyzer, tokenizer, and token filters.
|
||||
|
||||
| name | type |
|
||||
|-------------------------|-------------|
|
||||
| kuromoji_iteration_mark | charfilter |
|
||||
| kuromoji | analyzer |
|
||||
| kuromoji_tokenizer | tokenizer |
|
||||
| kuromoji_baseform | tokenfilter |
|
||||
| kuromoji_part_of_speech | tokenfilter |
|
||||
| kuromoji_readingform | tokenfilter |
|
||||
| kuromoji_stemmer | tokenfilter |
|
||||
| ja_stop | tokenfilter |
|
||||
|
||||
|
||||
Usage
|
||||
-----
|
||||
|
||||
## Analyzer : kuromoji
|
||||
|
||||
An analyzer of type `kuromoji`.
|
||||
This analyzer combines the following tokenizer and token filters:
|
||||
|
||||
* `kuromoji_tokenizer` : Kuromoji Tokenizer
|
||||
* `kuromoji_baseform` : Kuromoji BasicFormFilter (TokenFilter)
|
||||
* `kuromoji_part_of_speech` : Kuromoji Part of Speech Stop Filter (TokenFilter)
|
||||
* `cjk_width` : CJK Width Filter (TokenFilter)
|
||||
* `stop` : Stop Filter (TokenFilter)
|
||||
* `kuromoji_stemmer` : Kuromoji Katakana Stemmer Filter (TokenFilter)
|
||||
* `lowercase` : LowerCase Filter (TokenFilter)
|
||||
|
||||
## CharFilter : kuromoji_iteration_mark
|
||||
|
||||
A charfilter of type `kuromoji_iteration_mark`.
|
||||
This charfilter normalizes Japanese horizontal iteration marks (odoriji) to their expanded form.
|
||||
|
||||
The following are settings that can be set for a `kuromoji_iteration_mark` charfilter type:
|
||||
|
||||
| **Setting** | **Description** | **Default value** |
|
||||
|:----------------|:-------------------------------------------------------------|:------------------|
|
||||
| normalize_kanji | indicates whether kanji iteration marks should be normalized | `true` |
|
||||
| normalize_kana  | indicates whether kana iteration marks should be normalized  | `true`            |
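As an illustration of what "expanded form" means, here is a deliberately simplified sketch in Python. It repeats the preceding character for a single kanji mark (々) or an unvoiced kana mark (ゝ, ヽ); runs of consecutive marks and the voiced marks are not handled, so it does not match the real charfilter in those cases. The function name and its keyword arguments are assumptions modelled on the two settings above.

```python
KANJI_MARK = "\u3005"                # 々
KANA_MARKS = ("\u309d", "\u30fd")    # ゝ (hiragana), ヽ (katakana)


def expand_iteration_marks(text, normalize_kanji=True, normalize_kana=True):
    """Replace a single iteration mark with a copy of the preceding character."""
    out = []
    for ch in text:
        expand_kanji = ch == KANJI_MARK and normalize_kanji
        expand_kana = ch in KANA_MARKS and normalize_kana
        if out and (expand_kanji or expand_kana):
            out.append(out[-1])  # repeat whatever came just before the mark
        else:
            out.append(ch)
    return "".join(out)


kanji_expanded = expand_iteration_marks("\u6642\u3005")       # 時々 -> 時時
kana_expanded = expand_iteration_marks("\u3053\u309d\u308d")  # こゝろ -> こころ
kanji_kept = expand_iteration_marks("\u6642\u3005", normalize_kanji=False)
```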
|
||||
|
||||
## Tokenizer : kuromoji_tokenizer
|
||||
|
||||
A tokenizer of type `kuromoji_tokenizer`.
|
||||
|
||||
The following are settings that can be set for a `kuromoji_tokenizer` tokenizer type:
|
||||
|
||||
| **Setting** | **Description** | **Default value** |
|
||||
|:--------------------|:--------------------------------------------------------------------------------------------------------------------------|:------------------|
|
||||
| mode                | Tokenization mode: this determines how the tokenizer handles compound and unknown words. One of `normal`, `search`, or `extended`. | `search`          |
|
||||
| discard_punctuation | `true` if punctuation tokens should be dropped from the output. | `true` |
|
||||
| user_dictionary | set User Dictionary file | |
|
||||
|
||||
### Tokenization mode
|
||||
|
||||
There are three modes:
|
||||
|
||||
* `normal` : Ordinary segmentation: no decomposition for compounds
|
||||
|
||||
* `search` : Segmentation geared towards search: this includes a decompounding process for long nouns, also including the full compound token as a synonym.
|
||||
|
||||
* `extended` : Extended mode outputs unigrams for unknown words.
|
||||
|
||||
#### Differences in tokenization mode output
|
||||
|
||||
Input text is `関西国際空港` and `アブラカダブラ`.
|
||||
|
||||
| **mode** | `関西国際空港` | `アブラカダブラ` |
|
||||
|:-----------|:-------------|:-------|
|
||||
| `normal` | `関西国際空港` | `アブラカダブラ` |
|
||||
| `search` | `関西` `関西国際空港` `国際` `空港` | `アブラカダブラ` |
|
||||
| `extended` | `関西` `国際` `空港` | `ア` `ブ` `ラ` `カ` `ダ` `ブ` `ラ` |
|
||||
|
||||
### User Dictionary
|
||||
|
||||
The Kuromoji tokenizer uses the MeCab-IPADIC dictionary by default.
|
||||
Users can also add their own entries to the dictionary; this is the User Dictionary.
|
||||
User Dictionary entries are defined using the following CSV format:
|
||||
|
||||
```
|
||||
<text>,<token 1> ... <token n>,<reading 1> ... <reading n>,<part-of-speech tag>
|
||||
```
|
||||
|
||||
Dictionary Example
|
||||
|
||||
```
|
||||
東京スカイツリー,東京 スカイツリー,トウキョウ スカイツリー,カスタム名詞
|
||||
```
|
||||
|
||||
To use a User Dictionary, set its file path in the `user_dictionary` attribute.
|
||||
The User Dictionary file must be placed in the `ES_HOME/config` directory.
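Before deploying a dictionary file, it can help to check that each line has the four comma-separated fields described above. The following Python sketch parses entries in that format; `parse_user_dictionary` is a hypothetical helper for local checking and is not part of the plugin.

```python
import csv
import io


def parse_user_dictionary(data):
    """Parse `<text>,<tokens>,<readings>,<part-of-speech tag>` lines."""
    entries = []
    for row in csv.reader(io.StringIO(data)):
        if not row:
            continue  # skip blank lines
        text, tokens, readings, pos_tag = row  # raises if a field is missing
        entries.append({
            "text": text,
            "tokens": tokens.split(),      # space-separated segmentation
            "readings": readings.split(),  # one reading per token
            "pos": pos_tag,
        })
    return entries


entries = parse_user_dictionary(
    "東京スカイツリー,東京 スカイツリー,トウキョウ スカイツリー,カスタム名詞\n"
)
```

For the example entry this yields two tokens with one reading each, which is the pairing the format implies.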
|
||||
|
||||
### example
|
||||
|
||||
_Example Settings:_
|
||||
|
||||
```sh
|
||||
curl -XPUT 'http://localhost:9200/kuromoji_sample/' -d'
|
||||
{
|
||||
"settings": {
|
||||
"index":{
|
||||
"analysis":{
|
||||
"tokenizer" : {
|
||||
"kuromoji_user_dict" : {
|
||||
"type" : "kuromoji_tokenizer",
|
||||
"mode" : "extended",
|
||||
"discard_punctuation" : "false",
|
||||
"user_dictionary" : "userdict_ja.txt"
|
||||
}
|
||||
},
|
||||
"analyzer" : {
|
||||
"my_analyzer" : {
|
||||
"type" : "custom",
|
||||
"tokenizer" : "kuromoji_user_dict"
|
||||
}
|
||||
}
|
||||
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
'
|
||||
```
|
||||
|
||||
_Example Request using `_analyze` API :_
|
||||
|
||||
```sh
|
||||
curl -XPOST 'http://localhost:9200/kuromoji_sample/_analyze?analyzer=my_analyzer&pretty' -d '東京スカイツリー'
|
||||
```
|
||||
|
||||
_Response :_
|
||||
|
||||
```json
|
||||
{
|
||||
"tokens" : [ {
|
||||
"token" : "東京",
|
||||
"start_offset" : 0,
|
||||
"end_offset" : 2,
|
||||
"type" : "word",
|
||||
"position" : 1
|
||||
}, {
|
||||
"token" : "スカイツリー",
|
||||
"start_offset" : 2,
|
||||
"end_offset" : 8,
|
||||
"type" : "word",
|
||||
"position" : 2
|
||||
} ]
|
||||
}
|
||||
```
|
||||
|
||||
## TokenFilter : kuromoji_baseform
|
||||
|
||||
A token filter of type `kuromoji_baseform` that replaces term text with BaseFormAttribute.
|
||||
This acts as a lemmatizer for verbs and adjectives.
|
||||
|
||||
### example
|
||||
|
||||
_Example Settings:_
|
||||
|
||||
```sh
|
||||
curl -XPUT 'http://localhost:9200/kuromoji_sample/' -d'
|
||||
{
|
||||
"settings": {
|
||||
"index":{
|
||||
"analysis":{
|
||||
"analyzer" : {
|
||||
"my_analyzer" : {
|
||||
"tokenizer" : "kuromoji_tokenizer",
|
||||
"filter" : ["kuromoji_baseform"]
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
'
|
||||
```
|
||||
|
||||
_Example Request using `_analyze` API :_
|
||||
|
||||
```sh
|
||||
curl -XPOST 'http://localhost:9200/kuromoji_sample/_analyze?analyzer=my_analyzer&pretty' -d '飲み'
|
||||
```
|
||||
|
||||
_Response :_
|
||||
|
||||
```json
|
||||
{
|
||||
"tokens" : [ {
|
||||
"token" : "飲む",
|
||||
"start_offset" : 0,
|
||||
"end_offset" : 2,
|
||||
"type" : "word",
|
||||
"position" : 1
|
||||
} ]
|
||||
}
|
||||
```
|
||||
|
||||
## TokenFilter : kuromoji_part_of_speech
|
||||
|
||||
A token filter of type `kuromoji_part_of_speech` that removes tokens that match a set of part-of-speech tags.
|
||||
|
||||
The following are settings that can be set for a `kuromoji_part_of_speech` token filter type:
|
||||
|
||||
| **Setting** | **Description** |
|
||||
|:------------|:-----------------------------------------------------|
|
||||
| stoptags | A list of part-of-speech tags that should be removed |
|
||||
|
||||
Note that the default is the `stoptags.txt` file included in `lucene-analyzer-kuromoji.jar`.
|
||||
|
||||
### example
|
||||
|
||||
_Example Settings:_
|
||||
|
||||
```sh
|
||||
curl -XPUT 'http://localhost:9200/kuromoji_sample/' -d'
|
||||
{
|
||||
"settings": {
|
||||
"index":{
|
||||
"analysis":{
|
||||
"analyzer" : {
|
||||
"my_analyzer" : {
|
||||
"tokenizer" : "kuromoji_tokenizer",
|
||||
"filter" : ["my_posfilter"]
|
||||
}
|
||||
},
|
||||
"filter" : {
|
||||
"my_posfilter" : {
|
||||
"type" : "kuromoji_part_of_speech",
|
||||
"stoptags" : [
|
||||
"助詞-格助詞-一般",
|
||||
"助詞-終助詞"
|
||||
]
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
'
|
||||
```
|
||||
|
||||
_Example Request using `_analyze` API :_
|
||||
|
||||
```sh
|
||||
curl -XPOST 'http://localhost:9200/kuromoji_sample/_analyze?analyzer=my_analyzer&pretty' -d '寿司がおいしいね'
|
||||
```
|
||||
|
||||
_Response :_
|
||||
|
||||
```json
|
||||
{
|
||||
"tokens" : [ {
|
||||
"token" : "寿司",
|
||||
"start_offset" : 0,
|
||||
"end_offset" : 2,
|
||||
"type" : "word",
|
||||
"position" : 1
|
||||
}, {
|
||||
"token" : "おいしい",
|
||||
"start_offset" : 3,
|
||||
"end_offset" : 7,
|
||||
"type" : "word",
|
||||
"position" : 3
|
||||
} ]
|
||||
}
|
||||
```
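The response above keeps positions 1 and 3 because the removed particles still occupy their positions. A minimal Python sketch of that behaviour follows. The part-of-speech tags attached to 寿司 and おいしい are assumed for illustration (the tokenizer assigns the real ones), and `remove_stoptags` is a made-up name.

```python
def remove_stoptags(tagged_tokens, stoptags):
    """Drop tokens whose tag is in `stoptags`, keeping original positions."""
    stoptags = set(stoptags)
    return [
        (term, position)
        for term, tag, position in tagged_tokens
        if tag not in stoptags
    ]


# (term, part-of-speech tag, position); tags for the kept tokens are assumed.
tagged = [
    ("寿司", "名詞-一般", 1),
    ("が", "助詞-格助詞-一般", 2),
    ("おいしい", "形容詞-自立", 3),
    ("ね", "助詞-終助詞", 4),
]

kept = remove_stoptags(tagged, ["助詞-格助詞-一般", "助詞-終助詞"])
```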
|
||||
|
||||
## TokenFilter : kuromoji_readingform
|
||||
|
||||
A token filter of type `kuromoji_readingform` that replaces the term attribute with the reading of a token in either katakana or romaji form.
|
||||
The default reading form is katakana.
|
||||
|
||||
The following are settings that can be set for a `kuromoji_readingform` token filter type:
|
||||
|
||||
| **Setting** | **Description** | **Default value** |
|
||||
|:------------|:----------------------------------------------------------|:------------------|
|
||||
| use_romaji  | `true` to output the romaji reading form instead of katakana. | `false`           |
|
||||
|
||||
Note that the built-in `kuromoji_readingform` filter in elasticsearch-analysis-kuromoji sets the `use_romaji` attribute to `true` by default.
|
||||
|
||||
### example
|
||||
|
||||
_Example Settings:_
|
||||
|
||||
```sh
|
||||
curl -XPUT 'http://localhost:9200/kuromoji_sample/' -d'
|
||||
{
|
||||
"settings": {
|
||||
"index":{
|
||||
"analysis":{
|
||||
"analyzer" : {
|
||||
"romaji_analyzer" : {
|
||||
"tokenizer" : "kuromoji_tokenizer",
|
||||
"filter" : ["romaji_readingform"]
|
||||
},
|
||||
"katakana_analyzer" : {
|
||||
"tokenizer" : "kuromoji_tokenizer",
|
||||
"filter" : ["katakana_readingform"]
|
||||
}
|
||||
},
|
||||
"filter" : {
|
||||
"romaji_readingform" : {
|
||||
"type" : "kuromoji_readingform",
|
||||
"use_romaji" : true
|
||||
},
|
||||
"katakana_readingform" : {
|
||||
"type" : "kuromoji_readingform",
|
||||
"use_romaji" : false
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
'
|
||||
```
|
||||
|
||||
_Example Request using `_analyze` API :_
|
||||
|
||||
```sh
|
||||
curl -XPOST 'http://localhost:9200/kuromoji_sample/_analyze?analyzer=katakana_analyzer&pretty' -d '寿司'
|
||||
```
|
||||
|
||||
_Response :_
|
||||
|
||||
```json
|
||||
{
|
||||
"tokens" : [ {
|
||||
"token" : "スシ",
|
||||
"start_offset" : 0,
|
||||
"end_offset" : 2,
|
||||
"type" : "word",
|
||||
"position" : 1
|
||||
} ]
|
||||
}
|
||||
```
|
||||
|
||||
_Example Request using `_analyze` API :_
|
||||
|
||||
```sh
|
||||
curl -XPOST 'http://localhost:9200/kuromoji_sample/_analyze?analyzer=romaji_analyzer&pretty' -d '寿司'
|
||||
```
|
||||
|
||||
_Response :_
|
||||
|
||||
```json
|
||||
{
|
||||
"tokens" : [ {
|
||||
"token" : "sushi",
|
||||
"start_offset" : 0,
|
||||
"end_offset" : 2,
|
||||
"type" : "word",
|
||||
"position" : 1
|
||||
} ]
|
||||
}
|
||||
```
|
||||
|
||||
## TokenFilter : kuromoji_stemmer
|
||||
|
||||
A token filter of type `kuromoji_stemmer` that normalizes common katakana spelling variations ending in a long sound character by removing this character (U+30FC).
|
||||
Only katakana words of at least a minimum length are stemmed (default is four).
|
||||
|
||||
Note that only full-width katakana characters are supported.
|
||||
|
||||
The following are settings that can be set for a `kuromoji_stemmer` token filter type:
|
||||
|
||||
| **Setting** | **Description** | **Default value** |
|
||||
|:----------------|:---------------------------|:------------------|
|
||||
| minimum_length | The minimum length to stem | `4` |
|
||||
|
||||
### example
|
||||
|
||||
_Example Settings:_
|
||||
|
||||
```sh
|
||||
curl -XPUT 'http://localhost:9200/kuromoji_sample/' -d'
|
||||
{
|
||||
"settings": {
|
||||
"index":{
|
||||
"analysis":{
|
||||
"analyzer" : {
|
||||
"my_analyzer" : {
|
||||
"tokenizer" : "kuromoji_tokenizer",
|
||||
"filter" : ["my_katakana_stemmer"]
|
||||
}
|
||||
},
|
||||
"filter" : {
|
||||
"my_katakana_stemmer" : {
|
||||
"type" : "kuromoji_stemmer",
|
||||
"minimum_length" : 4
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
'
|
||||
```
|
||||
|
||||
_Example Request using `_analyze` API :_
|
||||
|
||||
```sh
|
||||
curl -XPOST 'http://localhost:9200/kuromoji_sample/_analyze?analyzer=my_analyzer&pretty' -d 'コピー'
|
||||
```
|
||||
|
||||
_Response :_
|
||||
|
||||
```json
|
||||
{
|
||||
"tokens" : [ {
|
||||
"token" : "コピー",
|
||||
"start_offset" : 0,
|
||||
"end_offset" : 3,
|
||||
"type" : "word",
|
||||
"position" : 1
|
||||
} ]
|
||||
}
|
||||
```
|
||||
|
||||
_Example Request using `_analyze` API :_
|
||||
|
||||
```sh
|
||||
curl -XPOST 'http://localhost:9200/kuromoji_sample/_analyze?analyzer=my_analyzer&pretty' -d 'サーバー'
|
||||
```
|
||||
|
||||
_Response :_
|
||||
|
||||
```json
|
||||
{
|
||||
"tokens" : [ {
|
||||
"token" : "サーバ",
|
||||
"start_offset" : 0,
|
||||
"end_offset" : 4,
|
||||
"type" : "word",
|
||||
"position" : 1
|
||||
} ]
|
||||
}
|
||||
```
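The two responses above (`コピー` unchanged, `サーバー` shortened) are consistent with a rule along the following lines. This Python sketch is an approximation written for illustration, assuming that words whose length is at least `minimum_length` are stemmed; `stem_katakana` and `is_katakana` are hypothetical names.

```python
PROLONGED_SOUND_MARK = "\u30fc"  # ー


def is_katakana(word):
    """True if every character is in the full-width Katakana block."""
    return bool(word) and all("\u30a0" <= ch <= "\u30ff" for ch in word)


def stem_katakana(word, minimum_length=4):
    """Drop a trailing long sound mark from sufficiently long katakana words."""
    if (
        len(word) >= minimum_length
        and is_katakana(word)
        and word.endswith(PROLONGED_SOUND_MARK)
    ):
        return word[:-1]
    return word


short_word = stem_katakana("コピー")    # 3 characters, below the minimum
long_word = stem_katakana("サーバー")   # 4 characters, trailing mark removed
```

Lowering `minimum_length` to `3` would also shorten `コピー`.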
|
||||
|
||||
|
||||
## TokenFilter : ja_stop
|
||||
|
||||
|
||||
A token filter of type `ja_stop` that provides the predefined "_japanese_" stop words.
|
||||
*Note: It only provides "_japanese_". If you want to use other predefined stop words, use the `stop` token filter.*
|
||||
|
||||
### example
|
||||
|
||||
_Example Settings:_
|
||||
|
||||
```sh
|
||||
curl -XPUT 'http://localhost:9200/kuromoji_sample/' -d'
|
||||
{
|
||||
"settings": {
|
||||
"index":{
|
||||
"analysis":{
|
||||
"analyzer" : {
|
||||
"analyzer_with_ja_stop" : {
|
||||
"tokenizer" : "kuromoji_tokenizer",
|
||||
"filter" : ["ja_stop"]
|
||||
}
|
||||
},
|
||||
"filter" : {
|
||||
"ja_stop" : {
|
||||
"type" : "ja_stop",
|
||||
"stopwords" : ["_japanese_", "ストップ"]
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}'
|
||||
```
|
||||
|
||||
_Example Request using `_analyze` API :_
|
||||
|
||||
```sh
|
||||
curl -XPOST 'http://localhost:9200/kuromoji_sample/_analyze?analyzer=analyzer_with_ja_stop&pretty' -d 'ストップは消える'
|
||||
```
|
||||
|
||||
_Response :_
|
||||
|
||||
```json
|
||||
{
|
||||
"tokens" : [ {
|
||||
"token" : "消える",
|
||||
"start_offset" : 5,
|
||||
"end_offset" : 8,
|
||||
"type" : "word",
|
||||
"position" : 3
|
||||
} ]
|
||||
}
|
||||
```
|
||||
|
||||
License
|
||||
-------
|
||||
|
||||
This software is licensed under the Apache 2 license, quoted below.
|
||||
|
||||
Copyright 2009-2014 Elasticsearch <http://www.elasticsearch.org>
|
||||
|
||||
Licensed under the Apache License, Version 2.0 (the "License"); you may not
|
||||
use this file except in compliance with the License. You may obtain a copy of
|
||||
the License at
|
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
Unless required by applicable law or agreed to in writing, software
|
||||
distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
|
||||
WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
|
||||
License for the specific language governing permissions and limitations under
|
||||
the License.
|
|
@ -1,93 +0,0 @@
|
|||
Phonetic Analysis for Elasticsearch
|
||||
===================================
|
||||
|
||||
The Phonetic Analysis plugin integrates phonetic token filter analysis with elasticsearch.
|
||||
|
||||
In order to install the plugin, simply run:
|
||||
|
||||
```sh
|
||||
bin/plugin install elasticsearch/elasticsearch-analysis-phonetic/2.5.0
|
||||
```
|
||||
|
||||
|
||||
| elasticsearch |Phonetic Analysis Plugin| Docs |
|
||||
|---------------|-----------------------|------------|
|
||||
| master | Build from source | See below |
|
||||
| es-1.x | Build from source | [2.6.0-SNAPSHOT](https://github.com/elastic/elasticsearch-analysis-phonetic/tree/es-1.x/#version-260-snapshot-for-elasticsearch-1x) |
|
||||
| es-1.5 | 2.5.0 | [2.5.0](https://github.com/elastic/elasticsearch-analysis-phonetic/tree/v2.5.0/#version-250-for-elasticsearch-15) |
|
||||
| es-1.4 | 2.4.3 | [2.4.3](https://github.com/elasticsearch/elasticsearch-analysis-phonetic/tree/v2.4.3/#version-243-for-elasticsearch-14) |
|
||||
| < 1.4.5 | 2.4.2 | [2.4.2](https://github.com/elastic/elasticsearch-analysis-phonetic/tree/v2.4.2/#version-242-for-elasticsearch-14) |
|
||||
| < 1.4.3 | 2.4.1 | [2.4.1](https://github.com/elastic/elasticsearch-analysis-phonetic/tree/v2.4.1/#version-241-for-elasticsearch-14) |
|
||||
| es-1.3 | 2.3.0 | [2.3.0](https://github.com/elastic/elasticsearch-analysis-phonetic/tree/v2.3.0/#phonetic-analysis-for-elasticsearch) |
|
||||
| es-1.2 | 2.2.0 | [2.2.0](https://github.com/elastic/elasticsearch-analysis-phonetic/tree/v2.2.0/#phonetic-analysis-for-elasticsearch) |
|
||||
| es-1.1 | 2.1.0 | [2.1.0](https://github.com/elastic/elasticsearch-analysis-phonetic/tree/v2.1.0/#phonetic-analysis-for-elasticsearch) |
|
||||
| es-1.0 | 2.0.0 | [2.0.0](https://github.com/elastic/elasticsearch-analysis-phonetic/tree/v2.0.0/#phonetic-analysis-for-elasticsearch) |
|
||||
| es-0.90 | 1.8.0 | [1.8.0](https://github.com/elastic/elasticsearch-analysis-phonetic/tree/v1.8.0/#phonetic-analysis-for-elasticsearch) |
|
||||
|
||||
To build a `SNAPSHOT` version, you need to build it with Maven:
|
||||
|
||||
```bash
|
||||
mvn clean install
|
||||
plugin install analysis-phonetic \
|
||||
--url file:target/releases/elasticsearch-analysis-phonetic-X.X.X-SNAPSHOT.zip
|
||||
```
|
||||
|
||||
## User guide
|
||||
|
||||
A `phonetic` token filter that can be configured with different `encoder` types:
|
||||
`metaphone`, `doublemetaphone`, `soundex`, `refinedsoundex`,
|
||||
`caverphone1`, `caverphone2`, `cologne`, `nysiis`,
|
||||
`koelnerphonetik`, `haasephonetik`, `beidermorse`
|
||||
|
||||
The `replace` parameter (defaults to `true`) controls if the token processed
|
||||
should be replaced with the encoded one (set it to `true`), or added (set it to `false`).
|
||||
|
||||
```js
|
||||
{
|
||||
"index" : {
|
||||
"analysis" : {
|
||||
"analyzer" : {
|
||||
"my_analyzer" : {
|
||||
"tokenizer" : "standard",
|
||||
"filter" : ["standard", "lowercase", "my_metaphone"]
|
||||
}
|
||||
},
|
||||
"filter" : {
|
||||
"my_metaphone" : {
|
||||
"type" : "phonetic",
|
||||
"encoder" : "metaphone",
|
||||
"replace" : false
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
Note that `beidermorse` does not support the `replace` parameter.
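The difference between `replace: true` and `replace: false` can be sketched in Python. The encoder below is a toy placeholder (it is NOT metaphone or any of the listed encoders), and both function names are made up; the point is only that with `replace` set to `false` the original token is kept alongside the encoded one at the same position.

```python
def toy_encoder(token):
    """Placeholder encoder: keeps the first letter, drops later vowels, uppercases."""
    head, tail = token[:1], token[1:]
    return (head + "".join(c for c in tail if c.lower() not in "aeiou")).upper()


def phonetic_filter(tokens, encoder, replace=True):
    """Return (position, term) pairs; the order within a position is not meaningful."""
    out = []
    for position, token in enumerate(tokens, start=1):
        if not replace:
            out.append((position, token))  # keep the original term as well
        out.append((position, encoder(token)))
    return out


replaced = phonetic_filter(["quick", "fox"], toy_encoder, replace=True)
added = phonetic_filter(["quick", "fox"], toy_encoder, replace=False)
```

Keeping both terms lets exact matches and sound-alike matches hit the same field, at the cost of a larger index.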
|
||||
|
||||
|
||||
Questions
|
||||
---------
|
||||
|
||||
If you have questions or comments please use the [mailing list](https://groups.google.com/group/elasticsearch) instead
|
||||
of the GitHub issue tracker.
|
||||
|
||||
License
|
||||
-------
|
||||
|
||||
This software is licensed under the Apache 2 license, quoted below.
|
||||
|
||||
Copyright 2009-2014 Elasticsearch <http://www.elasticsearch.org>
|
||||
|
||||
Licensed under the Apache License, Version 2.0 (the "License"); you may not
|
||||
use this file except in compliance with the License. You may obtain a copy of
|
||||
the License at
|
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
Unless required by applicable law or agreed to in writing, software
|
||||
distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
|
||||
WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
|
||||
License for the specific language governing permissions and limitations under
|
||||
the License.
|
|
@ -1,58 +0,0 @@
|
|||
Smart Chinese Analysis for Elasticsearch
|
||||
==================================
|
||||
|
||||
The Smart Chinese Analysis plugin integrates the Lucene Smart Chinese analysis module into Elasticsearch.
|
||||
|
||||
In order to install the plugin, simply run:
|
||||
|
||||
```sh
|
||||
bin/plugin install elasticsearch/elasticsearch-analysis-smartcn/2.5.0
|
||||
```
|
||||
|
||||
|
||||
| elasticsearch | Smart Chinese Analysis Plugin | Docs |
|
||||
|---------------|-----------------------|------------|
|
||||
| master | Build from source | See below |
|
||||
| es-1.x | Build from source | [2.6.0-SNAPSHOT](https://github.com/elastic/elasticsearch-analysis-smartcn/tree/es-1.x/#version-260-snapshot-for-elasticsearch-1x) |
|
||||
| es-1.5 | 2.5.0 | [2.5.0](https://github.com/elastic/elasticsearch-analysis-smartcn/tree/v2.5.0/#version-250-for-elasticsearch-15) |
|
||||
| es-1.4 | 2.4.4 | [2.4.4](https://github.com/elasticsearch/elasticsearch-analysis-smartcn/tree/v2.4.4/#version-244-for-elasticsearch-14) |
|
||||
| < 1.4.5 | 2.4.3 | [2.4.3](https://github.com/elastic/elasticsearch-analysis-smartcn/tree/v2.4.3/#version-243-for-elasticsearch-14) |
|
||||
| < 1.4.3 | 2.4.2 | [2.4.2](https://github.com/elastic/elasticsearch-analysis-smartcn/tree/v2.4.2/#version-242-for-elasticsearch-14) |
|
||||
| es-1.3 | 2.3.1 | [2.3.1](https://github.com/elastic/elasticsearch-analysis-smartcn/tree/v2.3.1/#version-231-for-elasticsearch-13) |
|
||||
| es-1.2 | 2.2.0 | [2.2.0](https://github.com/elastic/elasticsearch-analysis-smartcn/tree/v2.2.0/#smart-chinese-analysis-for-elasticsearch) |
|
||||
| es-1.1 | 2.1.0 | [2.1.0](https://github.com/elastic/elasticsearch-analysis-smartcn/tree/v2.1.0/#smart-chinese-analysis-for-elasticsearch) |
|
||||
| es-1.0 | 2.0.0 | [2.0.0](https://github.com/elastic/elasticsearch-analysis-smartcn/tree/v2.0.0/#smart-chinese-analysis-for-elasticsearch) |
|
||||
| es-0.90 | 1.8.0 | [1.8.0](https://github.com/elastic/elasticsearch-analysis-smartcn/tree/v1.8.0/#smart-chinese-analysis-for-elasticsearch) |
|
||||
|
||||
To build a `SNAPSHOT` version, you need to build it with Maven:
|
||||
|
||||
```bash
|
||||
mvn clean install
|
||||
plugin install analysis-smartcn \
|
||||
--url file:target/releases/elasticsearch-analysis-smartcn-X.X.X-SNAPSHOT.zip
|
||||
```
|
||||
|
||||
## User guide
|
||||
|
||||
The plugin includes the `smartcn` analyzer and `smartcn_tokenizer` tokenizer.
|
||||
|
||||
Note that the `smartcn_word` token filter and the `smartcn_sentence` tokenizer have been deprecated.
|
||||
|
||||
License
|
||||
-------
|
||||
|
||||
This software is licensed under the Apache 2 license, quoted below.
|
||||
|
||||
Copyright 2009-2014 Elasticsearch <http://www.elasticsearch.org>
|
||||
|
||||
Licensed under the Apache License, Version 2.0 (the "License"); you may not
|
||||
use this file except in compliance with the License. You may obtain a copy of
|
||||
the License at
|
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
Unless required by applicable law or agreed to in writing, software
|
||||
distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
|
||||
WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
|
||||
License for the specific language governing permissions and limitations under
|
||||
the License.
|
|
@ -1,56 +0,0 @@
|
|||
Stempel (Polish) Analysis for Elasticsearch
|
||||
==================================
|
||||
|
||||
The Stempel (Polish) Analysis plugin integrates the Lucene Stempel (Polish) analysis module into Elasticsearch.
|
||||
|
||||
In order to install the plugin, simply run:
|
||||
|
||||
```sh
|
||||
bin/plugin install elasticsearch/elasticsearch-analysis-stempel/2.4.3
|
||||
```
|
||||
|
||||
| elasticsearch | Stempel Analysis Plugin | Docs |
|
||||
|---------------|-----------------------|------------|
|
||||
| master | Build from source | See below |
|
||||
| es-1.x | Build from source | [2.6.0-SNAPSHOT](https://github.com/elastic/elasticsearch-analysis-stempel/tree/es-1.x/#version-260-snapshot-for-elasticsearch-1x) |
|
||||
| es-1.5 | 2.5.0 | [2.5.0](https://github.com/elastic/elasticsearch-analysis-stempel/tree/v2.5.0/#version-250-for-elasticsearch-15) |
|
||||
| es-1.4 | 2.4.3 | [2.4.3](https://github.com/elasticsearch/elasticsearch-analysis-stempel/tree/v2.4.3/#version-243-for-elasticsearch-14) |
|
||||
| < 1.4.5 | 2.4.2 | [2.4.2](https://github.com/elastic/elasticsearch-analysis-stempel/tree/v2.4.2/#version-242-for-elasticsearch-14) |
|
||||
| < 1.4.3 | 2.4.1 | [2.4.1](https://github.com/elastic/elasticsearch-analysis-stempel/tree/v2.4.1/#version-241-for-elasticsearch-14) |
|
||||
| es-1.3 | 2.3.0 | [2.3.0](https://github.com/elastic/elasticsearch-analysis-stempel/tree/v2.3.0/#stempel-polish-analysis-for-elasticsearch) |
|
||||
| es-1.2 | 2.2.0 | [2.2.0](https://github.com/elastic/elasticsearch-analysis-stempel/tree/v2.2.0/#stempel-polish-analysis-for-elasticsearch) |
|
||||
| es-1.1 | 2.1.0 | [2.1.0](https://github.com/elastic/elasticsearch-analysis-stempel/tree/v2.1.0/#stempel-polish-analysis-for-elasticsearch) |
|
||||
| es-1.0 | 2.0.0 | [2.0.0](https://github.com/elastic/elasticsearch-analysis-stempel/tree/v2.0.0/#stempel-polish-analysis-for-elasticsearch) |
|
||||
| es-0.90 | 1.13.0 | [1.13.0](https://github.com/elastic/elasticsearch-analysis-stempel/tree/v1.13.0/#stempel-polish-analysis-for-elasticsearch) |
|
||||
|
||||
To build a `SNAPSHOT` version, you need to build it with Maven:
|
||||
|
||||
```bash
|
||||
mvn clean install
|
||||
plugin install analysis-stempel \
|
||||
--url file:target/releases/elasticsearch-analysis-stempel-X.X.X-SNAPSHOT.zip
|
||||
```
|
||||
|
||||
Stempel Plugin
|
||||
-----------------
|
||||
|
||||
The plugin includes the `polish` analyzer and `polish_stem` token filter.
|
||||
|
||||
License
|
||||
-------
|
||||
|
||||
This software is licensed under the Apache 2 license, quoted below.
|
||||
|
||||
Copyright 2009-2014 Elasticsearch <http://www.elasticsearch.org>
|
||||
|
||||
Licensed under the Apache License, Version 2.0 (the "License"); you may not
|
||||
use this file except in compliance with the License. You may obtain a copy of
|
||||
the License at
|
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
Unless required by applicable law or agreed to in writing, software
|
||||
distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
|
||||
WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
|
||||
License for the specific language governing permissions and limitations under
|
||||
the License.
|
|
@ -1,361 +0,0 @@
|
|||
AWS Cloud Plugin for Elasticsearch
|
||||
==================================
|
||||
|
||||
The Amazon Web Services (AWS) Cloud plugin uses the [AWS API](https://github.com/aws/aws-sdk-java)
for the unicast discovery mechanism and adds support for S3 repositories.
|
||||
|
||||
In order to install the plugin, run:
|
||||
|
||||
```sh
|
||||
bin/plugin install elasticsearch/elasticsearch-cloud-aws/2.5.1
|
||||
```
|
||||
|
||||
You need to install a version matching your Elasticsearch version:
|
||||
|
||||
| Elasticsearch | AWS Cloud Plugin | Docs |
|
||||
|------------------------|-------------------|------------------------------------------------------------------------------------------------------------------------------------|
|
||||
| master | Build from source | See below |
|
||||
| es-1.x | Build from source | [2.6.0-SNAPSHOT](https://github.com/elasticsearch/elasticsearch-cloud-aws/tree/es-1.x/#version-260-snapshot-for-elasticsearch-1x) |
|
||||
| es-1.5 | 2.5.1 | [2.5.1](https://github.com/elastic/elasticsearch-cloud-aws/tree/v2.5.1/#version-251-for-elasticsearch-15) |
|
||||
| es-1.4 | 2.4.2 | [2.4.2](https://github.com/elasticsearch/elasticsearch-cloud-aws/tree/v2.4.2/#version-242-for-elasticsearch-14) |
|
||||
| es-1.3 | 2.3.0 | [2.3.0](https://github.com/elasticsearch/elasticsearch-cloud-aws/tree/v2.3.0/#version-230-for-elasticsearch-13) |
|
||||
| es-1.2 | 2.2.0 | [2.2.0](https://github.com/elasticsearch/elasticsearch-cloud-aws/tree/v2.2.0/#aws-cloud-plugin-for-elasticsearch) |
|
||||
| es-1.1 | 2.1.1 | [2.1.1](https://github.com/elasticsearch/elasticsearch-cloud-aws/tree/v2.1.1/#aws-cloud-plugin-for-elasticsearch) |
|
||||
| es-1.0 | 2.0.0 | [2.0.0](https://github.com/elasticsearch/elasticsearch-cloud-aws/tree/v2.0.0/#aws-cloud-plugin-for-elasticsearch) |
|
||||
| es-0.90 | 1.16.0 | [1.16.0](https://github.com/elasticsearch/elasticsearch-cloud-aws/tree/v1.16.0/#aws-cloud-plugin-for-elasticsearch) |
|
||||
|
||||
To build a `SNAPSHOT` version, you need to build it with Maven:
|
||||
|
||||
```bash
|
||||
mvn clean install
|
||||
plugin install cloud-aws \
|
||||
--url file:target/releases/elasticsearch-cloud-aws-X.X.X-SNAPSHOT.zip
|
||||
```
|
||||
|
||||
## Generic Configuration
|
||||
|
||||
The plugin will default to using [IAM Role](http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/iam-roles-for-amazon-ec2.html) credentials
|
||||
for authentication. These can be overridden by, in increasing order of precedence, system properties `aws.accessKeyId` and `aws.secretKey`,
|
||||
environment variables `AWS_ACCESS_KEY_ID` and `AWS_SECRET_KEY`, or the elasticsearch config using `cloud.aws.access_key` and `cloud.aws.secret_key`:
|
||||
|
||||
```yaml
cloud:
    aws:
        access_key: AKVAIQBF2RECL7FJWGJQ
        secret_key: vExyMThREXeRMm/b/LRzEB8jWwvzQeXgjqMX+6br
```
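
The precedence order described above can be sketched in Python as follows. This only illustrates the lookup order and is not the plugin's actual implementation; the function name `resolve_aws_credentials` and the plain-dictionary inputs are assumptions made for the example.

```python
def resolve_aws_credentials(es_config, environ, system_props):
    """Return (source, access_key, secret_key) using the documented precedence.

    Hypothetical helper: sources are checked from highest to lowest
    precedence, and the IAM role is the fallback when nothing is set.
    """
    candidates = [
        ("elasticsearch config",
         es_config.get("cloud.aws.access_key"),
         es_config.get("cloud.aws.secret_key")),
        ("environment variables",
         environ.get("AWS_ACCESS_KEY_ID"),
         environ.get("AWS_SECRET_KEY")),
        ("system properties",
         system_props.get("aws.accessKeyId"),
         system_props.get("aws.secretKey")),
    ]
    for source, access_key, secret_key in candidates:
        # A source only counts when both halves of the key pair are present.
        if access_key and secret_key:
            return source, access_key, secret_key
    # Nothing configured explicitly: rely on the instance's IAM role.
    return "IAM role", None, None
```

Requiring both halves of the key pair from the same source is a design choice of this sketch; the text above does not say how partially configured sources are handled.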
|
||||
|
||||
### Transport security
|
||||
|
||||
By default this plugin uses HTTPS for all API calls to AWS endpoints. If you wish to configure HTTP you can set
|
||||
`cloud.aws.protocol` in the elasticsearch config. You can optionally override this setting per individual service
|
||||
via: `cloud.aws.ec2.protocol` or `cloud.aws.s3.protocol`.
|
||||
|
||||
```yaml
cloud:
    aws:
        protocol: https
        s3:
            protocol: http
        ec2:
            protocol: https
```
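
As a rough illustration of how the per-service override interacts with the global setting, the lookup can be sketched like this. The function `effective_protocol` and the flat settings dictionary are hypothetical and only mirror the behaviour described above.

```python
def effective_protocol(settings, service):
    """Resolve the protocol used for `service` ("ec2" or "s3").

    Hypothetical helper: a per-service setting wins over the global
    `cloud.aws.protocol`, and HTTPS is the default when neither is set.
    """
    per_service = settings.get("cloud.aws.%s.protocol" % service)
    if per_service is not None:
        return per_service
    return settings.get("cloud.aws.protocol", "https")


# Flat form of the example configuration shown above.
example = {
    "cloud.aws.protocol": "https",
    "cloud.aws.s3.protocol": "http",
    "cloud.aws.ec2.protocol": "https",
}
print(effective_protocol(example, "s3"))
print(effective_protocol(example, "ec2"))
```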
|
||||
|
||||
In addition, a proxy can be configured with the `proxy_host` and `proxy_port` settings (note that protocol can be `http` or `https`):
|
||||
|
||||
```yaml
cloud:
    aws:
        protocol: https
        proxy_host: proxy1.company.com
        proxy_port: 8083
```
|
||||
|
||||
You can also set different proxies for `ec2` and `s3`:
|
||||
|
||||
```yaml
cloud:
    aws:
        s3:
            proxy_host: proxy1.company.com
            proxy_port: 8083
        ec2:
            proxy_host: proxy2.company.com
            proxy_port: 8083
```
|
||||
|
||||
### Region
|
||||
|
||||
The `cloud.aws.region` setting can be set to a region and will automatically apply the relevant settings for both `ec2` and `s3`. The available values are:
|
||||
|
||||
* `us-east` (`us-east-1`)
|
||||
* `us-west` (`us-west-1`)
|
||||
* `us-west-1`
|
||||
* `us-west-2`
|
||||
* `ap-southeast` (`ap-southeast-1`)
|
||||
* `ap-southeast-1`
|
||||
* `ap-southeast-2`
|
||||
* `ap-northeast` (`ap-northeast-1`)
|
||||
* `eu-west` (`eu-west-1`)
|
||||
* `eu-central` (`eu-central-1`)
|
||||
* `sa-east` (`sa-east-1`)
|
||||
* `cn-north` (`cn-north-1`)
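
The short names in this list can be read as aliases for the region shown in parentheses. A minimal sketch of that mapping is given below; the helper `normalize_region` is hypothetical, and accepting the parenthesised names as input values is an assumption of the example rather than something the list states.

```python
# Short names and the regions they stand for, taken from the list above.
REGION_ALIASES = {
    "us-east": "us-east-1",
    "us-west": "us-west-1",
    "ap-southeast": "ap-southeast-1",
    "ap-northeast": "ap-northeast-1",
    "eu-west": "eu-west-1",
    "eu-central": "eu-central-1",
    "sa-east": "sa-east-1",
    "cn-north": "cn-north-1",
}

# Values that the list mentions without an alias.
EXPLICIT_REGIONS = {"us-west-1", "us-west-2", "ap-southeast-1", "ap-southeast-2"}


def normalize_region(value):
    """Map a `cloud.aws.region` value to a full region name (hypothetical)."""
    name = value.strip().lower()
    if name in REGION_ALIASES:
        return REGION_ALIASES[name]
    # Assumption: full names that appear in parentheses are accepted as-is.
    if name in EXPLICIT_REGIONS or name in REGION_ALIASES.values():
        return name
    raise ValueError("unsupported region: %r" % value)
```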
|
||||
|
||||
|
||||
### EC2/S3 Signer API
|
||||
|
||||
If you are using an EC2- or S3-compatible service, it might use an older API to sign requests.
You can select the signer API to use with `cloud.aws.signer` (or `cloud.aws.ec2.signer` and `cloud.aws.s3.signer`).
Defaults to `AWS4SignerType`.
|
||||
|
||||
|
||||
## EC2 Discovery
|
||||
|
||||
EC2 discovery uses the EC2 APIs to perform automatic discovery (similar to multicast in non-hostile multicast environments). Here is a simple sample configuration:
|
||||
|
||||
```yaml
discovery:
    type: ec2
```
|
||||
|
||||
EC2 discovery uses the same credentials as the rest of the AWS services provided by this plugin (`repositories`).
|
||||
See [Generic Configuration](#generic-configuration) for details.
|
||||
|
||||
The following is a list of settings (prefixed with `discovery.ec2`) that can further control the discovery:
|
||||
|
||||
* `groups`: Either a comma separated list or array based list of (security) groups. Only instances with the provided security groups will be used in the cluster discovery. (NOTE: You could provide either group NAME or group ID.)
|
||||
* `host_type`: The type of host address to use to communicate with other instances. Can be one of `private_ip`, `public_ip`, `private_dns`, `public_dns`. Defaults to `private_ip`.
|
||||
* `availability_zones`: Either a comma separated list or array based list of availability zones. Only instances within the provided availability zones will be used in the cluster discovery.
|
||||
* `any_group`: If set to `false`, will require all security groups to be present for the instance to be used for the discovery. Defaults to `true`.
|
||||
* `ping_timeout`: How long to wait for existing EC2 nodes to reply during discovery. Defaults to `3s`. If no unit like `ms`, `s` or `m` is specified, milliseconds are used.
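
Two of these settings carry small rules that are easy to get wrong: `any_group` switches between "any" and "all" matching of security groups, and `ping_timeout` falls back to milliseconds when no unit is given. The sketch below illustrates both rules; `matches_groups` and `parse_timeout_ms` are hypothetical helpers, not functions from the plugin.

```python
def matches_groups(instance_groups, required_groups, any_group=True):
    """Decide whether an instance passes the `groups` filter (hypothetical)."""
    required = set(required_groups)
    if not required:
        return True  # no group filter configured
    present = set(instance_groups)
    if any_group:
        # Default: one matching security group is enough.
        return bool(required & present)
    # any_group=false: every configured group must be present.
    return required <= present


def parse_timeout_ms(value):
    """Convert a value such as "3s" or "500" to milliseconds (hypothetical)."""
    text = str(value).strip()
    # "ms" is checked first because it also ends in "s".
    for suffix, factor in (("ms", 1), ("s", 1000), ("m", 60 * 1000)):
        if text.endswith(suffix):
            return int(text[: -len(suffix)]) * factor
    # No unit: the value is taken as milliseconds.
    return int(text)
```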
|
||||
|
||||
### Recommended EC2 Permissions
|
||||
|
||||
EC2 discovery requires making a call to the EC2 service. You'll want to set up an IAM policy to allow this. You can create a custom policy via the IAM Management Console. It should look similar to this:
|
||||
|
||||
```js
|
||||
{
|
||||
"Statement": [
|
||||
{
|
||||
"Action": [
|
||||
"ec2:DescribeInstances"
|
||||
],
|
||||
"Effect": "Allow",
|
||||
"Resource": [
|
||||
"*"
|
||||
]
|
||||
}
|
||||
],
|
||||
"Version": "2012-10-17"
|
||||
}
|
||||
```
|
||||
|
||||
|
||||
### Filtering by Tags
|
||||
|
||||
EC2 discovery can also filter the machines to include in the cluster based on tags (and not just groups). The settings to use start with the `discovery.ec2.tag.` prefix. For example, setting `discovery.ec2.tag.stage` to `dev` will only include instances with a tag key set to `stage` and a value of `dev`. If several tags are set, an instance must carry all of them to be included.
|
||||
|
||||
One practical use for tag filtering is when an ec2 cluster contains many nodes that are not running elasticsearch. In this case (particularly with high `ping_timeout` values) there is a risk that a new node's discovery phase will end before it has found the cluster (which will result in it declaring itself master of a new cluster with the same name - highly undesirable). Tagging elasticsearch ec2 nodes and then filtering by that tag will resolve this issue.
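
The tag rule can be illustrated with a short sketch. The helper `matches_tags` and the flat settings dictionary are assumptions for the example; the real filtering is performed by the plugin against the EC2 API.

```python
TAG_PREFIX = "discovery.ec2.tag."


def matches_tags(instance_tags, settings):
    """Return True when the instance carries every configured tag (hypothetical)."""
    required = {
        key[len(TAG_PREFIX):]: value
        for key, value in settings.items()
        if key.startswith(TAG_PREFIX)
    }
    # All configured tags must match; an empty filter accepts every instance.
    return all(instance_tags.get(name) == value for name, value in required.items())


settings = {"discovery.ec2.tag.stage": "dev", "discovery.ec2.tag.role": "es"}
print(matches_tags({"stage": "dev", "role": "es", "owner": "ops"}, settings))
print(matches_tags({"stage": "dev"}, settings))
```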
|
||||
|
||||
### Automatic Node Attributes
|
||||
|
||||
Even when `ec2` is not used for discovery (the cloud AWS plugin must still be installed), the plugin can automatically add node attributes relating to EC2 (for example, the availability zone, which can be used with the awareness allocation feature). To enable this, set `cloud.node.auto_attributes` to `true` in the settings.
|
||||
|
||||
|
||||
### Using other EC2 endpoint
|
||||
|
||||
If you are using an EC2 API compatible service, you can set the endpoint you want to use by setting `cloud.aws.ec2.endpoint`
to your provider's URL.
|
||||
|
||||
## S3 Repository
|
||||
|
||||
The S3 repository uses S3 to store snapshots. It can be created using the following command:
|
||||
|
||||
```sh
|
||||
$ curl -XPUT 'http://localhost:9200/_snapshot/my_s3_repository' -d '{
|
||||
"type": "s3",
|
||||
"settings": {
|
||||
"bucket": "my_bucket_name",
|
||||
"region": "us-west"
|
||||
}
|
||||
}'
|
||||
```
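
For readers who prefer to script the registration, the same request can be prepared from Python with the standard library. This is a sketch under assumptions: `build_repository_request` is a hypothetical helper, the host defaults to the local node used in the `curl` example, and the `Content-Type` header is an addition that the original command does not send.

```python
import json
from urllib.request import Request


def build_repository_request(name, bucket, region=None, host="http://localhost:9200"):
    """Prepare (but do not send) the PUT request that registers an S3 repository."""
    settings = {"bucket": bucket}
    if region is not None:
        settings["region"] = region
    body = json.dumps({"type": "s3", "settings": settings}).encode("utf-8")
    return Request(
        "%s/_snapshot/%s" % (host, name),
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},  # assumption, not in the curl example
    )


request = build_repository_request("my_s3_repository", "my_bucket_name", "us-west")
print(request.get_method(), request.full_url)
```

Passing the prepared request to `urllib.request.urlopen` would send it to the node.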
|
||||
|
||||
The following settings are supported:
|
||||
|
||||
* `bucket`: The name of the bucket to be used for snapshots. (Mandatory)
|
||||
* `region`: The region where the bucket is located. Defaults to US Standard.
|
||||
* `endpoint`: The endpoint to the S3 API. Defaults to AWS's default S3 endpoint. Note that setting a region overrides the endpoint setting.
|
||||
* `protocol`: The protocol to use (`http` or `https`). Defaults to value of `cloud.aws.protocol` or `cloud.aws.s3.protocol`.
|
||||
* `base_path`: Specifies the path within bucket to repository data. Defaults to value of `repositories.s3.base_path` or to root directory if not set.
|
||||
* `access_key`: The access key to use for authentication. Defaults to value of `cloud.aws.access_key`.
|
||||
* `secret_key`: The secret key to use for authentication. Defaults to value of `cloud.aws.secret_key`.
|
||||
* `chunk_size`: Big files can be broken down into chunks during snapshotting if needed. The chunk size can be specified in bytes or by using size value notation, i.e. `1g`, `10m`, `5k`. Defaults to `100m`.
|
||||
* `compress`: When set to `true` metadata files are stored in compressed format. This setting doesn't affect index files that are already compressed by default. Defaults to `false`.
|
||||
* `server_side_encryption`: When set to `true` files are encrypted on server side using AES256 algorithm. Defaults to `false`.
|
||||
* `buffer_size`: Minimum threshold below which the chunk is uploaded using a single request. Beyond this threshold, the S3 repository will use the [AWS Multipart Upload API](http://docs.aws.amazon.com/AmazonS3/latest/dev/uploadobjusingmpu.html) to split the chunk into several parts, each of `buffer_size` length, and to upload each part in its own request. Note that setting a buffer size lower than `5mb` is not allowed since it prevents the use of the Multipart API and may result in upload errors. Defaults to `5mb`.
|
||||
* `max_retries`: Number of retries in case of S3 errors. Defaults to `3`.
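
To see how `chunk_size` and `buffer_size` interact, the sketch below estimates how a file of a given size would be split into chunks and how many upload requests each chunk needs. It is an illustration only: `parse_size` and `plan_upload` are hypothetical helpers, size units are assumed to be binary multiples (1k = 1024 bytes), and a chunk exactly equal to `buffer_size` is assumed to go out in a single request.

```python
import re

_UNITS = {"": 1, "k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}


def parse_size(value):
    """Parse size notation such as "100m", "5mb" or "1024" into bytes."""
    match = re.fullmatch(r"(\d+)([kmg]?)b?", value.strip().lower())
    if match is None:
        raise ValueError("cannot parse size: %r" % value)
    return int(match.group(1)) * _UNITS[match.group(2)]


def plan_upload(file_size, chunk_size="100m", buffer_size="5mb"):
    """Return a list of (chunk_length, request_count) tuples for one file."""
    chunk = parse_size(chunk_size)
    buffer_bytes = parse_size(buffer_size)
    if buffer_bytes < parse_size("5mb"):
        raise ValueError("buffer_size must not be lower than 5mb")
    plan = []
    offset = 0
    while offset < file_size:
        length = min(chunk, file_size - offset)
        if length <= buffer_bytes:
            requests = 1  # small chunk: single upload request
        else:
            # Multipart upload: parts of buffer_size, the last one may be shorter.
            requests = -(-length // buffer_bytes)
        plan.append((length, requests))
        offset += length
    return plan
```

With the default settings, a 250 MiB file would be split into chunks of 100 MiB, 100 MiB and 50 MiB, uploaded in 20, 20 and 10 parts respectively.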
|
||||
|
||||
The S3 repositories use the same credentials as the rest of the AWS services provided by this plugin (`discovery`).
|
||||
See [Generic Configuration](#generic-configuration) for details.
|
||||
|
||||
Multiple S3 repositories can be created. If the buckets require different credentials, then define them as part of the repository settings.
|
||||
|
||||
### Recommended S3 Permissions
|
||||
|
||||
In order to restrict the Elasticsearch snapshot process to the minimum required resources, we recommend using Amazon IAM in conjunction with pre-existing S3 buckets. Here is an example policy which will allow the snapshot access to an S3 bucket named "snaps.example.com". This may be configured through the AWS IAM console, by creating a Custom Policy, and using a Policy Document similar to this (changing snaps.example.com to your bucket name).
|
||||
|
||||
```js
|
||||
{
|
||||
"Statement": [
|
||||
{
|
||||
"Action": [
|
||||
"s3:ListBucket",
|
||||
"s3:GetBucketLocation",
|
||||
"s3:ListBucketMultipartUploads",
|
||||
"s3:ListBucketVersions"
|
||||
],
|
||||
"Effect": "Allow",
|
||||
"Resource": [
|
||||
"arn:aws:s3:::snaps.example.com"
|
||||
]
|
||||
},
|
||||
{
|
||||
"Action": [
|
||||
"s3:GetObject",
|
||||
"s3:PutObject",
|
||||
"s3:DeleteObject",
|
||||
"s3:AbortMultipartUpload",
|
||||
"s3:ListMultipartUploadParts"
|
||||
],
|
||||
"Effect": "Allow",
|
||||
"Resource": [
|
||||
"arn:aws:s3:::snaps.example.com/*"
|
||||
]
|
||||
}
|
||||
],
|
||||
"Version": "2012-10-17"
|
||||
}
|
||||
```
|
||||
|
||||
You may further restrict the permissions by specifying a prefix within the bucket, in this example, named "foo".
|
||||
|
||||
```js
|
||||
{
|
||||
"Statement": [
|
||||
{
|
||||
"Action": [
|
||||
"s3:ListBucket",
|
||||
"s3:GetBucketLocation",
|
||||
"s3:ListBucketMultipartUploads",
|
||||
"s3:ListBucketVersions"
|
||||
],
|
||||
"Condition": {
|
||||
"StringLike": {
|
||||
"s3:prefix": [
|
||||
"foo/*"
|
||||
]
|
||||
}
|
||||
},
|
||||
"Effect": "Allow",
|
||||
"Resource": [
|
||||
"arn:aws:s3:::snaps.example.com"
|
||||
]
|
||||
},
|
||||
{
|
||||
"Action": [
|
||||
"s3:GetObject",
|
||||
"s3:PutObject",
|
||||
"s3:DeleteObject",
|
||||
"s3:AbortMultipartUpload",
|
||||
"s3:ListMultipartUploadParts"
|
||||
],
|
||||
"Effect": "Allow",
|
||||
"Resource": [
|
||||
"arn:aws:s3:::snaps.example.com/foo/*"
|
||||
]
|
||||
}
|
||||
],
|
||||
"Version": "2012-10-17"
|
||||
}
|
||||
```
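
Since the two policy documents above differ only in the bucket name and the optional prefix, they can be generated rather than edited by hand. The following sketch does that; `snapshot_policy` is a hypothetical helper and simply reproduces the structure of the examples.

```python
import json


def snapshot_policy(bucket, prefix=None):
    """Build an IAM policy document like the examples above (hypothetical)."""
    bucket_arn = "arn:aws:s3:::%s" % bucket
    list_statement = {
        "Action": [
            "s3:ListBucket",
            "s3:GetBucketLocation",
            "s3:ListBucketMultipartUploads",
            "s3:ListBucketVersions",
        ],
        "Effect": "Allow",
        "Resource": [bucket_arn],
    }
    object_resource = "%s/*" % bucket_arn
    if prefix:
        # Restrict listing and object access to keys under the prefix.
        list_statement["Condition"] = {"StringLike": {"s3:prefix": ["%s/*" % prefix]}}
        object_resource = "%s/%s/*" % (bucket_arn, prefix)
    object_statement = {
        "Action": [
            "s3:GetObject",
            "s3:PutObject",
            "s3:DeleteObject",
            "s3:AbortMultipartUpload",
            "s3:ListMultipartUploadParts",
        ],
        "Effect": "Allow",
        "Resource": [object_resource],
    }
    return {"Statement": [list_statement, object_statement], "Version": "2012-10-17"}


print(json.dumps(snapshot_policy("snaps.example.com", "foo"), indent=2))
```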
|
||||
|
||||
The bucket needs to exist to register a repository for snapshots. If you did not create the bucket then the repository registration will fail. If you want elasticsearch to create the bucket instead, you can add the permission to create a specific bucket like this:
|
||||
|
||||
```js
|
||||
{
|
||||
"Action": [
|
||||
"s3:CreateBucket"
|
||||
],
|
||||
"Effect": "Allow",
|
||||
"Resource": [
|
||||
"arn:aws:s3:::snaps.example.com"
|
||||
]
|
||||
}
|
||||
```
|
||||
|
||||
### Using other S3 endpoint
|
||||
|
||||
If you are using an S3 API compatible service, you can set a global endpoint by setting `cloud.aws.s3.endpoint`
to your provider's URL. Note that this setting will be used for all S3 repositories.
|
||||
|
||||
Different `endpoint`, `region` and `protocol` settings can be set on a per-repository basis (see the [S3 Repository](#s3-repository) section for details).
|
||||
|
||||
|
||||
## Testing
|
||||
|
||||
Integration tests in this plugin require a working AWS configuration and are therefore disabled by default. Three buckets and two IAM users have to be created. The first IAM user needs access to two buckets in different regions, and the final bucket is exclusive to the other IAM user. To enable the tests, prepare a config file `elasticsearch.yml` with the following content:
|
||||
|
||||
```yaml
cloud:
    aws:
        access_key: AKVAIQBF2RECL7FJWGJQ
        secret_key: vExyMThREXeRMm/b/LRzEB8jWwvzQeXgjqMX+6br

repositories:
    s3:
        bucket: "bucket_name"
        region: "us-west-2"
        private-bucket:
            bucket: <bucket not accessible by default key>
            access_key: <access key>
            secret_key: <secret key>
        remote-bucket:
            bucket: <bucket in other region>
            region: <region>
        external-bucket:
            bucket: <bucket>
            access_key: <access key>
            secret_key: <secret key>
            endpoint: <endpoint>
            protocol: <protocol>

```
|
||||
|
||||
Replace all occurrences of `access_key`, `secret_key`, `endpoint`, `protocol`, `bucket` and `region` with your settings. Please note that the tests will delete all snapshot/restore related files in the specified buckets.
|
||||
|
||||
To run the tests:
|
||||
|
||||
```sh
|
||||
mvn -Dtests.aws=true -Dtests.config=/path/to/config/file/elasticsearch.yml clean test
|
||||
```
|
||||
|
||||
|
||||
License
|
||||
-------
|
||||
|
||||
This software is licensed under the Apache 2 license, quoted below.
|
||||
|
||||
Copyright 2009-2014 Elasticsearch <http://www.elasticsearch.org>
|
||||
|
||||
Licensed under the Apache License, Version 2.0 (the "License"); you may not
|
||||
use this file except in compliance with the License. You may obtain a copy of
|
||||
the License at
|
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
Unless required by applicable law or agreed to in writing, software
|
||||
distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
|
||||
WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
|
||||
License for the specific language governing permissions and limitations under
|
||||
the License.
|
|
@ -1,14 +0,0 @@
|
|||
Elasticsearch
|
||||
Copyright 2009-2015 Elasticsearch
|
||||
|
||||
This product includes software developed by The Apache Software
|
||||
Foundation (http://www.apache.org/).
|
||||
|
||||
activation-*.jar, javax.inject-*.jar, and jaxb-*.jar are under the CDDL license,
|
||||
the original source code for these can be found at http://www.oracle.com/.
|
||||
|
||||
jersey-*.jar are under the CDDL license, the original source code for these
|
||||
can be found at https://jersey.java.net/.
|
||||
|
||||
The LICENSE and NOTICE files for all dependencies may be found in the licenses/
|
||||
directory.
|
|
@ -1,568 +0,0 @@
|
|||
Azure Cloud Plugin for Elasticsearch
|
||||
====================================
|
||||
|
||||
The Azure Cloud plugin uses the Azure API for the unicast discovery mechanism.
|
||||
|
||||
In order to install the plugin, run:
|
||||
|
||||
```sh
|
||||
bin/plugin install elasticsearch/elasticsearch-cloud-azure/2.6.1
|
||||
```
|
||||
|
||||
You need to install a version matching your Elasticsearch version:
|
||||
|
||||
| Elasticsearch | Azure Cloud Plugin| Docs |
|
||||
|------------------------|-------------------|------------------------------------------------------------------------------------------------------------------------------------|
|
||||
| master | Build from source | See below |
|
||||
| es-1.x | Build from source | [2.7.0-SNAPSHOT](https://github.com/elasticsearch/elasticsearch-cloud-azure/tree/es-1.x/#version-270-snapshot-for-elasticsearch-1x)|
|
||||
| es-1.5 | 2.6.1 | [2.6.1](https://github.com/elastic/elasticsearch-cloud-azure/tree/v2.6.1/#version-261-for-elasticsearch-15) |
|
||||
| es-1.4 | 2.5.2 | [2.5.2](https://github.com/elastic/elasticsearch-cloud-azure/tree/v2.5.2/#version-252-for-elasticsearch-14) |
|
||||
| es-1.3 | 2.4.0 | [2.4.0](https://github.com/elasticsearch/elasticsearch-cloud-azure/tree/v2.4.0/#version-240-for-elasticsearch-13) |
|
||||
| es-1.2 | 2.3.0 | [2.3.0](https://github.com/elasticsearch/elasticsearch-cloud-azure/tree/v2.3.0/#azure-cloud-plugin-for-elasticsearch) |
|
||||
| es-1.1 | 2.2.0 | [2.2.0](https://github.com/elasticsearch/elasticsearch-cloud-azure/tree/v2.2.0/#azure-cloud-plugin-for-elasticsearch) |
|
||||
| es-1.0 | 2.1.0 | [2.1.0](https://github.com/elasticsearch/elasticsearch-cloud-azure/tree/v2.1.0/#azure-cloud-plugin-for-elasticsearch) |
|
||||
| es-0.90 | 1.0.0.alpha1 | [1.0.0.alpha1](https://github.com/elasticsearch/elasticsearch-cloud-azure/tree/v1.0.0.alpha1/#azure-cloud-plugin-for-elasticsearch)|
|
||||
|
||||
To build a `SNAPSHOT` version, you need to build it with Maven:
|
||||
|
||||
```bash
|
||||
mvn clean install
|
||||
plugin install cloud-azure \
|
||||
--url file:target/releases/elasticsearch-cloud-azure-X.X.X-SNAPSHOT.zip
|
||||
```
|
||||
|
||||
Azure Virtual Machine Discovery
|
||||
===============================
|
||||
|
||||
Azure VM discovery uses the Azure APIs to perform automatic discovery (similar to multicast in non-hostile
multicast environments). Here is a simple sample configuration:
|
||||
|
||||
```yaml
cloud:
    azure:
        management:
            subscription.id: XXX-XXX-XXX-XXX
            cloud.service.name: es-demo-app
            keystore:
                path: /path/to/azurekeystore.pkcs12
                password: WHATEVER
                type: pkcs12

discovery:
    type: azure
```
|
||||
|
||||
How to start (short story)
|
||||
--------------------------
|
||||
|
||||
* Create Azure instances
|
||||
* Install Elasticsearch
|
||||
* Install Azure plugin
|
||||
* Modify `elasticsearch.yml` file
|
||||
* Start Elasticsearch
|
||||
|
||||
Azure credential API settings
|
||||
-----------------------------
|
||||
|
||||
The following is a list of settings that can further control the credential API:
|
||||
|
||||
* `cloud.azure.management.keystore.path`: /path/to/keystore
|
||||
* `cloud.azure.management.keystore.type`: `pkcs12`, `jceks` or `jks`. Defaults to `pkcs12`.
|
||||
* `cloud.azure.management.keystore.password`: your_password for the keystore
|
||||
* `cloud.azure.management.subscription.id`: your_azure_subscription_id
|
||||
* `cloud.azure.management.cloud.service.name`: your_azure_cloud_service_name
|
||||
|
||||
Note that in previous versions, it was:
|
||||
|
||||
```yaml
cloud:
    azure:
        keystore: /path/to/keystore
        password: your_password_for_keystore
        subscription_id: your_azure_subscription_id
        service_name: your_azure_cloud_service_name
```
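
When upgrading a configuration written for a previous version, the old keys can be translated mechanically. The sketch below is one way to do it; the helper `migrate_azure_settings` is hypothetical, and the one-to-one mapping is inferred from comparing the old and new setting names in this document rather than taken from the plugin's code.

```python
# Old setting name -> current setting name (inferred from this document).
RENAMED_SETTINGS = {
    "cloud.azure.keystore": "cloud.azure.management.keystore.path",
    "cloud.azure.password": "cloud.azure.management.keystore.password",
    "cloud.azure.subscription_id": "cloud.azure.management.subscription.id",
    "cloud.azure.service_name": "cloud.azure.management.cloud.service.name",
    "cloud.azure.host_type": "discovery.azure.host.type",
    "cloud.azure.port_name": "discovery.azure.endpoint.name",
}


def migrate_azure_settings(settings):
    """Return a copy of `settings` with old Azure keys renamed (hypothetical)."""
    return {RENAMED_SETTINGS.get(key, key): value for key, value in settings.items()}


old = {
    "cloud.azure.keystore": "/path/to/keystore",
    "cloud.azure.subscription_id": "your_azure_subscription_id",
    "discovery.type": "azure",
}
print(migrate_azure_settings(old))
```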
|
||||
|
||||
Advanced settings
|
||||
-----------------
|
||||
|
||||
The following is a list of settings that can further control the discovery:
|
||||
|
||||
* `discovery.azure.host.type`: either `public_ip` or `private_ip` (default). Azure discovery will use the one you set to ping
other nodes. This feature was previously undocumented but existed under `cloud.azure.host_type`.
* `discovery.azure.endpoint.name`: when using `public_ip`, this setting identifies the endpoint name used to forward requests
to Elasticsearch (aka the transport port name). Defaults to `elasticsearch`. In the Azure management console, you could define
an endpoint `elasticsearch` that forwards, for example, requests on the public IP on port 8100 to the virtual machine on port 9300.
This feature was previously undocumented but existed under `cloud.azure.port_name`.
|
||||
* `discovery.azure.deployment.name`: deployment name if any. Defaults to the value set with `cloud.azure.management.cloud.service.name`.
|
||||
* `discovery.azure.deployment.slot`: either `staging` or `production` (default).
|
||||
|
||||
For example:
|
||||
|
||||
```yaml
discovery:
    type: azure
    azure:
        host:
            type: private_ip
        endpoint:
            name: elasticsearch
        deployment:
            name: your_azure_cloud_service_name
            slot: production
```
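
The defaults listed above can be summarised in a few lines of code. This is a sketch of the documented fallbacks only; `azure_discovery_settings` and its flat settings dictionary are hypothetical and do not reflect how the plugin reads its configuration internally.

```python
def azure_discovery_settings(settings):
    """Apply the documented defaults to the Azure discovery settings."""
    resolved = {
        "host.type": settings.get("discovery.azure.host.type", "private_ip"),
        "endpoint.name": settings.get("discovery.azure.endpoint.name", "elasticsearch"),
        # The deployment name falls back to the cloud service name.
        "deployment.name": settings.get(
            "discovery.azure.deployment.name",
            settings.get("cloud.azure.management.cloud.service.name"),
        ),
        "deployment.slot": settings.get("discovery.azure.deployment.slot", "production"),
    }
    if resolved["host.type"] not in ("public_ip", "private_ip"):
        raise ValueError("host.type must be public_ip or private_ip")
    if resolved["deployment.slot"] not in ("staging", "production"):
        raise ValueError("deployment.slot must be staging or production")
    return resolved


print(azure_discovery_settings({"cloud.azure.management.cloud.service.name": "es-demo-app"}))
```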
|
||||
|
||||
How to start (long story)
|
||||
--------------------------
|
||||
|
||||
We will describe here one strategy, which is to hide our Elasticsearch cluster from the outside.

With this strategy, only VMs behind the same virtual port can talk to each other.
That means that with this mode, you can use Elasticsearch unicast discovery to build a cluster.

Better still, you can use the `elasticsearch-cloud-azure` plugin to let it fetch information about your nodes using
the Azure API.
|
||||
|
||||
### Prerequisites
|
||||
|
||||
Before starting, you need to have:
|
||||
|
||||
* A [Windows Azure account](http://www.windowsazure.com/)
|
||||
* SSH keys and certificate
|
||||
* OpenSSL that isn't from MacPorts: specifically, `OpenSSL 1.0.1f 6 Jan
2014` doesn't seem to create a valid keypair for SSH. For what it's worth,
`OpenSSL 1.0.1c 10 May 2012` on Ubuntu 12.04 LTS is known to work.
|
||||
|
||||
You should follow [this guide](http://azure.microsoft.com/en-us/documentation/articles/linux-use-ssh-key/) to learn
how to create or use existing SSH keys. If you have already done so, you can skip the following.
|
||||
|
||||
Here is a description on how to generate SSH keys using `openssl`:
|
||||
|
||||
```sh
|
||||
# You may want to use another dir than /tmp
|
||||
cd /tmp
|
||||
openssl req -x509 -nodes -days 365 -newkey rsa:2048 -keyout azure-private.key -out azure-certificate.pem
|
||||
chmod 600 azure-private.key azure-certificate.pem
|
||||
openssl x509 -outform der -in azure-certificate.pem -out azure-certificate.cer
|
||||
```
|
||||
|
||||
Generate a keystore, which the plugin will use to authenticate
all Azure API calls with a certificate.
|
||||
|
||||
```sh
|
||||
# Generate a keystore (azurekeystore.pkcs12)
|
||||
# Transform private key to PEM format
|
||||
openssl pkcs8 -topk8 -nocrypt -in azure-private.key -inform PEM -out azure-pk.pem -outform PEM
|
||||
# Transform certificate to PEM format
|
||||
openssl x509 -inform der -in azure-certificate.cer -out azure-cert.pem
|
||||
cat azure-cert.pem azure-pk.pem > azure.pem.txt
|
||||
# You MUST enter a password!
|
||||
openssl pkcs12 -export -in azure.pem.txt -out azurekeystore.pkcs12 -name azure -noiter -nomaciter
|
||||
```
|
||||
|
||||
Upload the `azure-certificate.cer` file both in the elasticsearch Cloud Service (under `Manage Certificates`),
|
||||
and under `Settings -> Manage Certificates`.
|
||||
|
||||
**Important**: when prompted for a password, you need to enter a non-empty one.
|
||||
|
||||
See this [guide](http://www.windowsazure.com/en-us/manage/linux/how-to-guides/ssh-into-linux/) for
more details on how to create keys for Azure.
|
||||
|
||||
Once done, you need to upload your certificate in Azure:
|
||||
|
||||
* Go to the [management console](https://account.windowsazure.com/).
|
||||
* Sign in using your account.
|
||||
* Click on `Portal`.
|
||||
* Go to Settings (bottom of the left list)
|
||||
* On the bottom bar, click on `Upload` and upload your `azure-certificate.cer` file.
|
||||
|
||||
You may want to use [Windows Azure Command-Line Tool](http://www.windowsazure.com/en-us/develop/nodejs/how-to-guides/command-line-tools/):
|
||||
|
||||
* Install [NodeJS](https://github.com/joyent/node/wiki/Installing-Node.js-via-package-manager), for example using
|
||||
homebrew on MacOS X:
|
||||
|
||||
```sh
|
||||
brew install node
|
||||
```
|
||||
|
||||
* Install Azure tools:
|
||||
|
||||
```sh
|
||||
sudo npm install azure-cli -g
|
||||
```
|
||||
|
||||
* Download and import your azure settings:
|
||||
|
||||
```sh
|
||||
# This will open a browser and will download a .publishsettings file
|
||||
azure account download
|
||||
|
||||
# Import this file (we have downloaded it to /tmp)
|
||||
# Note, it will create needed files in ~/.azure. You can remove azure.publishsettings when done.
|
||||
azure account import /tmp/azure.publishsettings
|
||||
```
|
||||
|
||||
### Creating your first instance
|
||||
|
||||
You need to have a storage account available. Check [Azure Blob Storage documentation](http://www.windowsazure.com/en-us/develop/net/how-to-guides/blob-storage/#create-account)
|
||||
for more information.
|
||||
|
||||
You will need to choose the operating system you want to run on. To get a list of official available images, run:
|
||||
|
||||
```sh
|
||||
azure vm image list
|
||||
```
|
||||
|
||||
Let's say we are going to deploy an Ubuntu image on an extra small instance in West Europe:
|
||||
|
||||
* Azure cluster name: `azure-elasticsearch-cluster`
|
||||
* Image: `b39f27a8b8c64d52b05eac6a62ebad85__Ubuntu-13_10-amd64-server-20130808-alpha3-en-us-30GB`
|
||||
* VM Name: `myesnode1`
|
||||
* VM Size: `extrasmall`
|
||||
* Location: `West Europe`
|
||||
* Login: `elasticsearch`
|
||||
* Password: `password1234!!`
|
||||
|
||||
Using command line:
|
||||
|
||||
```sh
|
||||
azure vm create azure-elasticsearch-cluster \
|
||||
b39f27a8b8c64d52b05eac6a62ebad85__Ubuntu-13_10-amd64-server-20130808-alpha3-en-us-30GB \
|
||||
--vm-name myesnode1 \
|
||||
--location "West Europe" \
|
||||
--vm-size extrasmall \
|
||||
--ssh 22 \
|
||||
--ssh-cert /tmp/azure-certificate.pem \
|
||||
elasticsearch password1234\!\!
|
||||
```
|
||||
|
||||
You should see something like:
|
||||
|
||||
```
|
||||
info: Executing command vm create
|
||||
+ Looking up image
|
||||
+ Looking up cloud service
|
||||
+ Creating cloud service
|
||||
+ Retrieving storage accounts
|
||||
+ Configuring certificate
|
||||
+ Creating VM
|
||||
info: vm create command OK
|
||||
```
|
||||
|
||||
Now, your first instance is started. You need to install Elasticsearch on it.
|
||||
|
||||
> **Note on SSH**
|
||||
>
|
||||
> You need to give the private key and username each time you log on to your instance:
|
||||
>
|
||||
>```sh
|
||||
>ssh -i ~/.ssh/azure-private.key elasticsearch@myescluster.cloudapp.net
|
||||
>```
|
||||
>
|
||||
> But you can also define it once in `~/.ssh/config` file:
|
||||
>
|
||||
>```
|
||||
>Host *.cloudapp.net
|
||||
> User elasticsearch
|
||||
> StrictHostKeyChecking no
|
||||
> UserKnownHostsFile=/dev/null
|
||||
> IdentityFile ~/.ssh/azure-private.key
|
||||
>```
|
||||
|
||||
|
||||
```sh
|
||||
# First, copy your keystore to this machine
|
||||
scp /tmp/azurekeystore.pkcs12 azure-elasticsearch-cluster.cloudapp.net:/home/elasticsearch
|
||||
|
||||
# Then, connect to your instance using SSH
|
||||
ssh azure-elasticsearch-cluster.cloudapp.net
|
||||
```
|
||||
|
||||
Once connected, install Elasticsearch:
|
||||
|
||||
```sh
|
||||
# Install Latest Java version
|
||||
# Read http://www.webupd8.org/2012/01/install-oracle-java-jdk-7-in-ubuntu-via.html for details
|
||||
sudo add-apt-repository ppa:webupd8team/java
|
||||
sudo apt-get update
|
||||
sudo apt-get install oracle-java7-installer
|
||||
|
||||
# If you want to install OpenJDK instead
|
||||
# sudo apt-get update
|
||||
# sudo apt-get install openjdk-7-jre-headless
|
||||
|
||||
# Download Elasticsearch
|
||||
curl -s https://download.elasticsearch.org/elasticsearch/elasticsearch/elasticsearch-1.0.0.deb -o elasticsearch-1.0.0.deb
|
||||
|
||||
# Prepare Elasticsearch installation
|
||||
sudo dpkg -i elasticsearch-1.0.0.deb
|
||||
```
|
||||
|
||||
Check that elasticsearch is running:
|
||||
|
||||
```sh
|
||||
curl http://localhost:9200/
|
||||
```
|
||||
|
||||
This command should give you a JSON result:
|
||||
|
||||
```javascript
|
||||
{
|
||||
"status" : 200,
|
||||
"name" : "Living Colossus",
|
||||
"version" : {
|
||||
"number" : "1.0.0",
|
||||
"build_hash" : "a46900e9c72c0a623d71b54016357d5f94c8ea32",
|
||||
"build_timestamp" : "2014-02-12T16:18:34Z",
|
||||
"build_snapshot" : false,
|
||||
"lucene_version" : "4.6"
|
||||
},
|
||||
"tagline" : "You Know, for Search"
|
||||
}
|
||||
```
|
||||
|
||||
### Install elasticsearch cloud azure plugin
|
||||
|
||||
```sh
|
||||
# Stop elasticsearch
|
||||
sudo service elasticsearch stop
|
||||
|
||||
# Install the plugin
|
||||
sudo /usr/share/elasticsearch/bin/plugin install elasticsearch/elasticsearch-cloud-azure/2.6.1
|
||||
|
||||
# Configure it
|
||||
sudo vi /etc/elasticsearch/elasticsearch.yml
|
||||
```
|
||||
|
||||
And add the following lines:
|
||||
|
||||
```yaml
|
||||
# If you don't remember your account id, you may get it with `azure account list`
|
||||
cloud:
|
||||
azure:
|
||||
management:
|
||||
subscription.id: your_azure_subscription_id
|
||||
cloud.service.name: your_azure_cloud_service_name
|
||||
keystore:
|
||||
path: /home/elasticsearch/azurekeystore.pkcs12
|
||||
password: your_password_for_keystore
|
||||
|
||||
discovery:
|
||||
type: azure
|
||||
|
||||
# Recommended (warning: non durable disk)
|
||||
# path.data: /mnt/resource/elasticsearch/data
|
||||
```
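
Before restarting, it can be worth confirming that the keystore file named in the `path` setting above is actually present on the node. The following is only a sketch: `check_keystore` is a hypothetical helper, not part of the plugin, and it tests readability for the current user only (the user that runs the Elasticsearch service may differ).

```sh
# Hypothetical pre-flight check: succeed only if the keystore file exists
# and is readable by the user running this script.
check_keystore() {
    if [ ! -f "$1" ]; then
        echo "keystore not found: $1" >&2
        return 1
    fi
    if [ ! -r "$1" ]; then
        echo "keystore not readable: $1" >&2
        return 1
    fi
    echo "keystore looks usable: $1"
}

# Path taken from the configuration above.
check_keystore /home/elasticsearch/azurekeystore.pkcs12 \
    || echo "fix the keystore before restarting elasticsearch"
```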
|
||||
|
||||
Restart elasticsearch:
|
||||
|
||||
```sh
|
||||
sudo service elasticsearch start
|
||||
```
|
||||
|
||||
If anything goes wrong, check your logs in `/var/log/elasticsearch`.
|
||||
|
||||
|
||||
Scaling Out!
|
||||
------------
|
||||
|
||||
First, you need to create an image of your previous machine.
|
||||
Disconnect from your machine and run the following commands locally:
|
||||
|
||||
```sh
|
||||
# Shutdown the instance
|
||||
azure vm shutdown myesnode1
|
||||
|
||||
# Create an image from this instance (it could take some minutes)
|
||||
azure vm capture myesnode1 esnode-image --delete
|
||||
|
||||
# Note that the previous instance has been deleted (mandatory)
|
||||
# So you need to create it again and, while you are at it, create other instances.
|
||||
|
||||
azure vm create azure-elasticsearch-cluster \
|
||||
esnode-image \
|
||||
--vm-name myesnode1 \
|
||||
--location "West Europe" \
|
||||
--vm-size extrasmall \
|
||||
--ssh 22 \
|
||||
--ssh-cert /tmp/azure-certificate.pem \
|
||||
elasticsearch password1234\!\!
|
||||
```
|
||||
|
||||
> **Note:** Azure may change the endpoint's public IP address.
|
||||
> DNS propagation could take some minutes before you can connect again using
|
||||
> the name. If needed, you can get the IP address from Azure using:
|
||||
>
|
||||
> ```sh
|
||||
> # Look at Network `Endpoints 0 Vip`
|
||||
> azure vm show myesnode1
|
||||
> ```
|
||||
|
||||
Let's start more instances!
|
||||
|
||||
```sh
|
||||
for x in $(seq 2 10)
|
||||
do
|
||||
echo "Launching azure instance #$x..."
|
||||
azure vm create azure-elasticsearch-cluster \
|
||||
esnode-image \
|
||||
--vm-name myesnode$x \
|
||||
--vm-size extrasmall \
|
||||
--ssh $((21 + $x)) \
|
||||
--ssh-cert /tmp/azure-certificate.pem \
|
||||
--connect \
|
||||
elasticsearch password1234\!\!
|
||||
done
|
||||
```
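
The loop above gives each additional node its own public SSH port (`21 + x`), presumably because the nodes are attached to the same cloud service and share its public endpoint. The sketch below only prints that mapping and does not call the `azure` CLI; `ssh_port_for_node` is a hypothetical helper name introduced for illustration.

```sh
# Hypothetical helper: public SSH port for node number N, following the
# "--ssh $((21 + $x))" convention of the loop above (node 1 keeps port 22).
ssh_port_for_node() {
    echo $((21 + $1))
}

# Dry run: list node names and SSH ports for nodes 2..10.
for x in $(seq 2 10)
do
    echo "myesnode$x -> ssh port $(ssh_port_for_node "$x")"
done
```

With this convention, `myesnode2` would be reached on port 23 of the cloud service address, `myesnode3` on port 24, and so on.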
|
||||
|
||||
If you want to remove your running instances:
|
||||
|
||||
```
|
||||
azure vm delete myesnode1
|
||||
```
|
||||
|
||||
Azure Repository
|
||||
================
|
||||
|
||||
To enable Azure repositories, you first have to set your Azure storage settings in the `elasticsearch.yml` file:
|
||||
|
||||
```
|
||||
cloud:
|
||||
azure:
|
||||
storage:
|
||||
account: your_azure_storage_account
|
||||
key: your_azure_storage_key
|
||||
```
|
||||
|
||||
For reference, in previous versions of the Azure plugin, the settings were:
|
||||
|
||||
```
|
||||
cloud:
|
||||
azure:
|
||||
storage_account: your_azure_storage_account
|
||||
storage_key: your_azure_storage_key
|
||||
```
|
||||
|
||||
The Azure repository supports the following settings:
|
||||
|
||||
* `container`: Container name. Defaults to `elasticsearch-snapshots`
|
||||
* `base_path`: Specifies the path within container to repository data. Defaults to empty (root directory).
|
||||
* `chunk_size`: Big files can be broken down into chunks during snapshotting if needed. The chunk size can be specified
|
||||
in bytes or by using size value notation, e.g. `1g`, `10m`, `5k`. Defaults to `64m`, which is also the maximum.
|
||||
* `compress`: When set to `true` metadata files are stored in compressed format. This setting doesn't affect index
|
||||
files that are already compressed by default. Defaults to `false`.
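
To make the `chunk_size` notation concrete, here is a small sketch that converts a size value to bytes and checks it against the `64m` limit mentioned above. It assumes binary multiples (`1k` = 1024 bytes); `size_to_bytes` and `chunk_size_ok` are hypothetical helper names, not part of the plugin.

```sh
# Convert a size value such as 5k, 10m or 1g to bytes.
# Assumption: binary multiples (1k = 1024 bytes); plain digits are bytes.
size_to_bytes() {
    stb_number=${1%[kmg]}          # strip one trailing unit letter, if any
    stb_unit=${1#"$stb_number"}    # what was stripped: k, m, g or empty
    case $stb_number in
        ''|*[!0-9]*) echo "invalid size: $1" >&2; return 1 ;;
    esac
    case $stb_unit in
        k) echo $((stb_number * 1024)) ;;
        m) echo $((stb_number * 1024 * 1024)) ;;
        g) echo $((stb_number * 1024 * 1024 * 1024)) ;;
        *) echo "$stb_number" ;;
    esac
}

# Succeed when the given size does not exceed the 64m maximum.
chunk_size_ok() {
    cso_bytes=$(size_to_bytes "$1") || return 1
    [ "$cso_bytes" -le $((64 * 1024 * 1024)) ]
}

chunk_size_ok 32m && echo "32m is within the limit"
chunk_size_ok 1g || echo "1g exceeds the limit"
```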
|
||||
|
||||
Some examples, using scripts:
|
||||
|
||||
```sh
|
||||
# The simplest one
|
||||
$ curl -XPUT 'http://localhost:9200/_snapshot/my_backup1' -d '{
|
||||
"type": "azure"
|
||||
}'
|
||||
|
||||
# With some settings
|
||||
$ curl -XPUT 'http://localhost:9200/_snapshot/my_backup2' -d '{
|
||||
"type": "azure",
|
||||
"settings": {
|
||||
"container": "backup-container",
|
||||
"base_path": "backups",
|
||||
"chunk_size": "32m",
|
||||
"compress": true
|
||||
}
|
||||
}'
|
||||
```
|
||||
|
||||
Example using Java:
|
||||
|
||||
```java
|
||||
client.admin().cluster().preparePutRepository("my_backup3")
|
||||
.setType("azure").setSettings(Settings.settingsBuilder()
|
||||
.put(Storage.CONTAINER, "backup-container")
|
||||
.put(Storage.CHUNK_SIZE, new ByteSizeValue(32, ByteSizeUnit.MB))
|
||||
).get();
|
||||
```
|
||||
|
||||
Repository validation rules
|
||||
---------------------------
|
||||
|
||||
According to the [container naming guide](http://msdn.microsoft.com/en-us/library/dd135715.aspx), a container name must
|
||||
be a valid DNS name, conforming to the following naming rules:
|
||||
|
||||
* Container names must start with a letter or number, and can contain only letters, numbers, and the dash (-) character.
|
||||
* Every dash (-) character must be immediately preceded and followed by a letter or number; consecutive dashes are not
|
||||
permitted in container names.
|
||||
* All letters in a container name must be lowercase.
|
||||
* Container names must be from 3 through 63 characters long.
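
The rules above can be checked before registering a repository. This is a minimal sketch that encodes only the four rules listed here; `valid_container_name` is a hypothetical helper, not something the plugin provides.

```sh
# Succeed when the name follows the container naming rules listed above.
valid_container_name() {
    vcn_name=$1
    # Rule: from 3 through 63 characters long.
    [ "${#vcn_name}" -ge 3 ] && [ "${#vcn_name}" -le 63 ] || return 1
    # Rules: lowercase letters, digits and dashes only; starts with a
    # letter or digit; every dash sits between two letters or digits.
    printf '%s\n' "$vcn_name" | LC_ALL=C grep -Eq '^[a-z0-9](-?[a-z0-9])*$'
}

for candidate in elasticsearch-snapshots backup_container my--container ab
do
    if valid_container_name "$candidate"; then
        echo "$candidate: ok"
    else
        echo "$candidate: invalid"
    fi
done
```

Note that a name containing an underscore is rejected, since only letters, numbers, and dashes are allowed.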
|
||||
|
||||
|
||||
Testing
|
||||
=======
|
||||
|
||||
Integration tests in this plugin require a working Azure configuration and are therefore disabled by default.
|
||||
To enable tests, prepare a config file `elasticsearch.yml` with the following content:
|
||||
|
||||
```
|
||||
cloud:
|
||||
azure:
|
||||
storage:
|
||||
account: "YOUR-AZURE-STORAGE-NAME"
|
||||
key: "YOUR-AZURE-STORAGE-KEY"
|
||||
```
|
||||
|
||||
Replace `account` and `key` with your settings. Please note that the tests will delete all snapshot/restore related files in the specified container.
|
||||
|
||||
To run the tests:
|
||||
|
||||
```sh
|
||||
mvn -Dtests.azure=true -Dtests.config=/path/to/config/file/elasticsearch.yml clean test
|
||||
```
|
||||
|
||||
Working around a bug in Windows SMB and Java on windows
|
||||
=======================================================
|
||||
When using a shared file system based on the SMB protocol (like Azure File Service) to store indices, Lucene opens index segment files with a write-only flag. This is the *correct* way to open the files, as they will only be used for writes, and it allows different FS implementations to optimize for it. Sadly, on Windows with SMB, this disables the cache manager, causing writes to be slow. This has been described in [LUCENE-6176](https://issues.apache.org/jira/browse/LUCENE-6176), but it affects each and every Java program out there! This must be fixed outside of ES and/or Lucene, either in Windows or in OpenJDK. For now, we provide experimental support for opening the files with a read flag, but the correct fix belongs in OpenJDK or Windows.
|
||||
|
||||
The Azure Cloud plugin provides two storage types optimized for SMB:
|
||||
|
||||
- `smb_mmap_fs`: an SMB-specific implementation of the default [mmap fs](http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/index-modules-store.html#mmapfs)
|
||||
- `smb_simple_fs`: an SMB-specific implementation of the default [simple fs](http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/index-modules-store.html#simplefs)
|
||||
|
||||
To use one of these specific storage types, you need to install the Azure Cloud plugin and restart the node.
|
||||
Then configure Elasticsearch to set the storage type you want.
|
||||
|
||||
This can be configured for all indices by adding this to the `elasticsearch.yml` file:
|
||||
|
||||
```yaml
|
||||
index.store.type: smb_simple_fs
|
||||
```
|
||||
|
||||
Note that this setting is applied only to newly created indices.
|
||||
|
||||
It can also be set on a per-index basis at index creation time:
|
||||
|
||||
```sh
|
||||
curl -XPUT localhost:9200/my_index -d '{
|
||||
"settings": {
|
||||
"index.store.type": "smb_mmap_fs"
|
||||
}
|
||||
}'
|
||||
```
|
||||
|
||||
|
||||
License
|
||||
-------
|
||||
|
||||
This software is licensed under the Apache 2 license, quoted below.
|
||||
|
||||
Copyright 2009-2014 Elasticsearch <http://www.elasticsearch.org>
|
||||
|
||||
Licensed under the Apache License, Version 2.0 (the "License"); you may not
|
||||
use this file except in compliance with the License. You may obtain a copy of
|
||||
the License at
|
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
Unless required by applicable law or agreed to in writing, software
|
||||
distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
|
||||
WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
|
||||
License for the specific language governing permissions and limitations under
|
||||
the License.
|
|
@ -1,8 +0,0 @@
|
|||
Elasticsearch
|
||||
Copyright 2009-2015 Elasticsearch
|
||||
|
||||
This product includes software developed by The Apache Software
|
||||
Foundation (http://www.apache.org/).
|
||||
|
||||
The LICENSE and NOTICE files for all dependencies may be found in the licenses/
|
||||
directory.
|
|
@ -1,421 +0,0 @@
|
|||
Google Compute Engine Cloud Plugin for Elasticsearch
|
||||
====================================================
|
||||
|
||||
The GCE Cloud plugin allows you to use the GCE API for the unicast discovery mechanism.
|
||||
|
||||
In order to install the plugin, run:
|
||||
|
||||
```sh
|
||||
bin/plugin install elasticsearch/elasticsearch-cloud-gce/2.5.0
|
||||
```
|
||||
|
||||
You need to install a version matching your Elasticsearch version:
|
||||
|
||||
| Elasticsearch | GCE Cloud Plugin | Docs |
|
||||
|------------------------|-------------------|------------------------------------------------------------------------------------------------------------------------------------|
|
||||
| master | Build from source | See below |
|
||||
| es-1.x | Build from source | [2.6.0-SNAPSHOT](https://github.com/elasticsearch/elasticsearch-cloud-gce/tree/es-1.x/#google-compute-engine-cloud-plugin-for-elasticsearch)|
|
||||
| es-1.5 | 2.5.0 | [2.5.0](https://github.com/elastic/elasticsearch-cloud-gce/tree/v2.5.0/#version-250-for-elasticsearch-15) |
|
||||
| es-1.4 | 2.4.1 | [2.4.1](https://github.com/elasticsearch/elasticsearch-cloud-gce/tree/v2.4.1/#version-241-for-elasticsearch-14) |
|
||||
| es-1.3 | 2.3.0 | [2.3.0](https://github.com/elasticsearch/elasticsearch-cloud-gce/tree/v2.3.0/#version-230-for-elasticsearch-13) |
|
||||
| es-1.2 | 2.2.0 | [2.2.0](https://github.com/elasticsearch/elasticsearch-cloud-gce/tree/v2.2.0/#google-compute-engine-cloud-plugin-for-elasticsearch)|
|
||||
| es-1.1 | 2.1.2 | [2.1.2](https://github.com/elasticsearch/elasticsearch-cloud-gce/tree/v2.1.2/#google-compute-engine-cloud-plugin-for-elasticsearch)|
|
||||
| es-1.0 | 2.0.1 | [2.0.1](https://github.com/elasticsearch/elasticsearch-cloud-gce/tree/v2.0.1/#google-compute-engine-cloud-plugin-for-elasticsearch)|
|
||||
| es-0.90 | 1.3.0 | [1.3.0](https://github.com/elasticsearch/elasticsearch-cloud-gce/tree/v1.3.0/#google-compute-engine-cloud-plugin-for-elasticsearch)|
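
The released rows of the table above can be turned into a lookup, which is handy in provisioning scripts. This is only a sketch: `gce_plugin_version` is a hypothetical helper, and the mapping simply copies the table (the `master` and `es-1.x` rows have no released version and are reported as errors).

```sh
# Map an Elasticsearch version (e.g. 1.2 or 1.2.1) to the matching
# GCE Cloud plugin release, as listed in the table above.
gce_plugin_version() {
    case $1 in
        1.5|1.5.*)   echo 2.5.0 ;;
        1.4|1.4.*)   echo 2.4.1 ;;
        1.3|1.3.*)   echo 2.3.0 ;;
        1.2|1.2.*)   echo 2.2.0 ;;
        1.1|1.1.*)   echo 2.1.2 ;;
        1.0|1.0.*)   echo 2.0.1 ;;
        0.90|0.90.*) echo 1.3.0 ;;
        *) echo "no released plugin for Elasticsearch $1; build from source" >&2
           return 1 ;;
    esac
}

# Print the install command for a given Elasticsearch version.
echo "bin/plugin install elasticsearch/elasticsearch-cloud-gce/$(gce_plugin_version 1.2.1)"
```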
|
||||
|
||||
To build a `SNAPSHOT` version, you need to build it with Maven:
|
||||
|
||||
```bash
|
||||
mvn clean install
|
||||
plugin install cloud-gce \
|
||||
--url file:target/releases/elasticsearch-cloud-gce-X.X.X-SNAPSHOT.zip
|
||||
```
|
||||
|
||||
|
||||
Google Compute Engine Virtual Machine Discovery
|
||||
===============================
|
||||
|
||||
Google Compute Engine VM discovery allows you to use the Google APIs to perform automatic discovery (similar to multicast in non-hostile
|
||||
multicast environments). Here is a simple sample configuration:
|
||||
|
||||
```yaml
|
||||
cloud:
|
||||
gce:
|
||||
project_id: <your-google-project-id>
|
||||
zone: <your-zone>
|
||||
discovery:
|
||||
type: gce
|
||||
```
|
||||
|
||||
How to start (short story)
|
||||
--------------------------
|
||||
|
||||
* Create Google Compute Engine instance (with compute rw permissions)
|
||||
* Install Elasticsearch
|
||||
* Install Google Compute Engine Cloud plugin
|
||||
* Modify `elasticsearch.yml` file
|
||||
* Start Elasticsearch
|
||||
|
||||
How to start (long story)
|
||||
--------------------------
|
||||
|
||||
### Prerequisites
|
||||
|
||||
Before starting, you should have:
|
||||
|
||||
* Your project ID. Let's say here `es-cloud`. Get it from [Google APIs Console](https://code.google.com/apis/console/).
|
||||
* [Google Cloud SDK](https://developers.google.com/cloud/sdk/)
|
||||
|
||||
If you have not set it yet, you can define the default project you will work on:
|
||||
|
||||
```sh
|
||||
gcloud config set project es-cloud
|
||||
```
|
||||
|
||||
### Creating your first instance
|
||||
|
||||
|
||||
```sh
|
||||
gcutil addinstance myesnode1 \
|
||||
--service_account_scope=compute-rw,storage-full \
|
||||
--persistent_boot_disk
|
||||
```
|
||||
|
||||
You will be asked to open a link in your browser. Log in and allow access to the listed services.
|
||||
You will get back a verification code. Copy and paste it in your terminal.
|
||||
|
||||
You should get an `Authentication successful.` message.
|
||||
|
||||
Then, choose your zone. Let's say here that we choose `europe-west1-a`.
|
||||
|
||||
Choose your compute instance size. Let's say `f1-micro`.
|
||||
|
||||
Choose your OS. Let's say `projects/debian-cloud/global/images/debian-7-wheezy-v20140606`.
|
||||
|
||||
You may be asked to create an SSH key. Follow the instructions to create one.
|
||||
|
||||
When done, a report like this one should appear:
|
||||
|
||||
```sh
|
||||
Table of resources:
|
||||
|
||||
+-----------+--------------+-------+---------+--------------+----------------+----------------+----------------+---------+----------------+
|
||||
| name | machine-type | image | network | network-ip | external-ip | disks | zone | status | status-message |
|
||||
+-----------+--------------+-------+---------+--------------+----------------+----------------+----------------+---------+----------------+
|
||||
| myesnode1 | f1-micro | | default | 10.240.20.57 | 192.158.29.199 | boot-myesnode1 | europe-west1-a | RUNNING | |
|
||||
+-----------+--------------+-------+---------+--------------+----------------+----------------+----------------+---------+----------------+
|
||||
```
|
||||
|
||||
You can now connect to your instance:
|
||||
|
||||
```
|
||||
# Connect using google cloud SDK
|
||||
gcloud compute ssh myesnode1 --zone europe-west1-a
|
||||
|
||||
# Or using SSH with external IP address
|
||||
ssh -i ~/.ssh/google_compute_engine 192.158.29.199
|
||||
```
|
||||
|
||||
*Note Regarding Service Account Permissions*
|
||||
|
||||
It's important when creating an instance that the correct permissions are set. At a minimum, you must ensure you have:
|
||||
|
||||
```
|
||||
service_account_scope=compute-rw
|
||||
```
|
||||
|
||||
Failing to set this will result in unauthorized messages when starting Elasticsearch.
|
||||
See [Machine Permissions](#machine-permissions).
|
||||
|
||||
Once connected, install Elasticsearch:
|
||||
|
||||
```sh
|
||||
sudo apt-get update
|
||||
|
||||
# Download Elasticsearch
|
||||
wget https://download.elasticsearch.org/elasticsearch/elasticsearch/elasticsearch-1.2.1.deb
|
||||
|
||||
# Prepare Java installation
|
||||
sudo apt-get install java7-runtime-headless
|
||||
|
||||
# Prepare Elasticsearch installation
|
||||
sudo dpkg -i elasticsearch-1.2.1.deb
|
||||
```
|
||||
|
||||
### Install elasticsearch cloud gce plugin
|
||||
|
||||
Install the plugin:
|
||||
|
||||
```sh
|
||||
# Use Plugin Manager to install it
|
||||
sudo /usr/share/elasticsearch/bin/plugin install elasticsearch/elasticsearch-cloud-gce/2.2.0
|
||||
|
||||
# Configure it:
|
||||
sudo vi /etc/elasticsearch/elasticsearch.yml
|
||||
```
|
||||
|
||||
And add the following lines:
|
||||
|
||||
```yaml
|
||||
cloud:
|
||||
gce:
|
||||
project_id: es-cloud
|
||||
zone: europe-west1-a
|
||||
discovery:
|
||||
type: gce
|
||||
```
|
||||
|
||||
|
||||
Start elasticsearch:
|
||||
|
||||
```sh
|
||||
sudo /etc/init.d/elasticsearch start
|
||||
```
|
||||
|
||||
If anything goes wrong, you should check logs:
|
||||
|
||||
```sh
|
||||
tail -f /var/log/elasticsearch/elasticsearch.log
|
||||
```
|
||||
|
||||
If needed, you can change the log level to `TRACE` by editing `/etc/elasticsearch/logging.yml` (for example with `sudo vi`):
|
||||
|
||||
```yaml
|
||||
# discovery
|
||||
discovery.gce: TRACE
|
||||
```
|
||||
|
||||
|
||||
|
||||
### Cloning your existing machine
|
||||
|
||||
In order to build a cluster on many nodes, you can clone your configured instance to new nodes.
|
||||
You won't have to reinstall everything!
|
||||
|
||||
First create an image of your running instance and upload it to Google Cloud Storage:
|
||||
|
||||
```sh
|
||||
# Create an image of your current instance
|
||||
sudo /usr/bin/gcimagebundle -d /dev/sda -o /tmp/
|
||||
|
||||
# An image has been created in `/tmp` directory:
|
||||
ls /tmp
|
||||
e4686d7f5bf904a924ae0cfeb58d0827c6d5b966.image.tar.gz
|
||||
|
||||
# Upload your image to Google Cloud Storage:
|
||||
# Create a bucket to hold your image, let's say `esimage`:
|
||||
gsutil mb gs://esimage
|
||||
|
||||
# Copy your image to this bucket:
|
||||
gsutil cp /tmp/e4686d7f5bf904a924ae0cfeb58d0827c6d5b966.image.tar.gz gs://esimage
|
||||
|
||||
# Then add your image to images collection:
|
||||
gcutil addimage elasticsearch-1-2-1 gs://esimage/e4686d7f5bf904a924ae0cfeb58d0827c6d5b966.image.tar.gz
|
||||
|
||||
# If the previous command did not work for you, logout from your instance
|
||||
# and launch the same command from your local machine.
|
||||
```
|
||||
|
||||
### Start new instances
|
||||
|
||||
Now that you have an image, you can create as many instances as you need:
|
||||
|
||||
```sh
|
||||
# Just change node name (here myesnode2)
|
||||
gcutil addinstance --image=elasticsearch-1-2-1 myesnode2
|
||||
|
||||
# If you want to provide all details directly, you can use:
|
||||
gcutil addinstance --image=elasticsearch-1-2-1 \
|
||||
--kernel=projects/google/global/kernels/gce-v20130603 myesnode2 \
|
||||
--zone europe-west1-a --machine_type f1-micro --service_account_scope=compute-rw \
|
||||
--persistent_boot_disk
|
||||
```
|
||||
|
||||
### Remove an instance (aka shut it down)
|
||||
|
||||
You can use [Google Cloud Console](https://cloud.google.com/console) or CLI to manage your instances:
|
||||
|
||||
```sh
|
||||
# Stopping and removing instances
|
||||
gcutil deleteinstance myesnode1 myesnode2 \
|
||||
--zone=europe-west1-a
|
||||
|
||||
# Consider removing the disks as well if you don't need them anymore
|
||||
gcutil deletedisk boot-myesnode1 boot-myesnode2 \
|
||||
--zone=europe-west1-a
|
||||
```
|
||||
|
||||
Using zones
|
||||
-----------
|
||||
|
||||
`cloud.gce.zone` helps to retrieve instances running in a given zone. It should be one of the
|
||||
[GCE supported zones](https://developers.google.com/compute/docs/zones#available).
|
||||
|
||||
GCE discovery can support multiple zones, although you need to be aware of network latency between zones.
|
||||
To enable discovery across more than one zone, just add your zone list to the `cloud.gce.zone` setting:
|
||||
|
||||
```yaml
|
||||
cloud:
|
||||
gce:
|
||||
project_id: <your-google-project-id>
|
||||
zone: ["<your-zone1>", "<your-zone2>"]
|
||||
discovery:
|
||||
type: gce
|
||||
```
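
When the configuration is generated by a script, the single-zone and multi-zone forms shown above can be produced from one list of zones. A rough sketch; `zone_setting` is a hypothetical helper that only prints the `zone:` line.

```sh
# Print the "zone:" line for one zone (plain value) or several zones
# (quoted list), matching the two forms shown in this document.
zone_setting() {
    if [ "$#" -eq 0 ]; then
        echo "at least one zone is required" >&2
        return 1
    fi
    if [ "$#" -eq 1 ]; then
        echo "zone: $1"
        return 0
    fi
    zs_list=""
    for zs_zone in "$@"
    do
        # Append the quoted zone, separated by ", " after the first entry.
        zs_list="${zs_list:+$zs_list, }\"$zs_zone\""
    done
    echo "zone: [$zs_list]"
}

zone_setting europe-west1-a
zone_setting europe-west1-a europe-west1-b
```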
|
||||
|
||||
|
||||
|
||||
Filtering by tags
|
||||
-----------------
|
||||
|
||||
The GCE discovery can also filter machines to include in the cluster based on tags using `discovery.gce.tags` settings.
|
||||
For example, setting `discovery.gce.tags` to `dev` will only include instances that have the `dev` tag. Several tags
|
||||
set will require all of those tags to be set for the instance to be included.
|
||||
|
||||
One practical use for tag filtering is when a GCE cluster contains many nodes that are not running
|
||||
elasticsearch. In this case (particularly with high ping_timeout values) there is a risk that a new node's discovery
|
||||
phase will end before it has found the cluster (which will result in it declaring itself master of a new cluster
|
||||
with the same name - highly undesirable). Adding a tag to elasticsearch GCE nodes and then filtering by that
|
||||
tag will resolve this issue.
|
||||
|
||||
Add your tag when building the new instance:
|
||||
|
||||
```sh
|
||||
gcutil --project=es-cloud addinstance myesnode1 \
|
||||
--service_account_scope=compute-rw \
|
||||
--persistent_boot_disk \
|
||||
--tags=elasticsearch,dev
|
||||
```
|
||||
|
||||
Then, define it in `elasticsearch.yml`:
|
||||
|
||||
```yaml
|
||||
cloud:
|
||||
gce:
|
||||
project_id: es-cloud
|
||||
zone: europe-west1-a
|
||||
discovery:
|
||||
type: gce
|
||||
gce:
|
||||
tags: elasticsearch, dev
|
||||
```
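
The matching rule described in this section (an instance is included only if it carries every configured tag) can be sketched as follows. `has_all_tags` is a hypothetical helper written for illustration; it is not how the plugin itself is implemented.

```sh
# Succeed when every tag in the comma-separated filter ($1) also appears
# in the comma-separated list of instance tags ($2).
# The body runs in a subshell so that the IFS change stays local.
has_all_tags() (
    wanted=$(printf '%s' "$1" | tr -d ' ')
    present=",$(printf '%s' "$2" | tr -d ' '),"
    IFS=,
    for tag in $wanted
    do
        case $present in
            *",$tag,"*) ;;   # tag found, keep checking
            *) exit 1 ;;     # a required tag is missing
        esac
    done
    exit 0
)

# Filter from elasticsearch.yml above, checked against two instances.
has_all_tags "elasticsearch, dev" "elasticsearch,dev" && echo "myesnode1: included"
has_all_tags "elasticsearch, dev" "dev" || echo "otherhost: skipped"
```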
|
||||
|
||||
Changing default transport port
|
||||
-------------------------------
|
||||
|
||||
By default, the elasticsearch GCE plugin assumes that elasticsearch runs on the default port 9300.
|
||||
You can specify a different port for elasticsearch using the Google Compute Engine metadata key `es_port`:
|
||||
|
||||
### When creating instance
|
||||
|
||||
Add `--metadata=es_port:9301` option:
|
||||
|
||||
```sh
|
||||
# when creating first instance
|
||||
gcutil addinstance myesnode1 \
|
||||
--service_account_scope=compute-rw,storage-full \
|
||||
--persistent_boot_disk \
|
||||
--metadata=es_port:9301
|
||||
|
||||
# when creating an instance from an image
|
||||
gcutil addinstance --image=elasticsearch-1-0-0-RC1 \
|
||||
--kernel=projects/google/global/kernels/gce-v20130603 myesnode2 \
|
||||
--zone europe-west1-a --machine_type f1-micro --service_account_scope=compute-rw \
|
||||
--persistent_boot_disk --metadata=es_port:9301
|
||||
```
|
||||
|
||||
### On a running instance
|
||||
|
||||
```sh
|
||||
# Get metadata fingerprint
|
||||
gcutil getinstance myesnode1 --zone=europe-west1-a
|
||||
+------------------------+---------------------------------------------------------------------------------------------------------+
|
||||
| property | value |
|
||||
+------------------------+---------------------------------------------------------------------------------------------------------+
|
||||
| metadata | |
|
||||
| fingerprint | 42WmSpB8rSM= |
|
||||
+------------------------+---------------------------------------------------------------------------------------------------------+
|
||||
|
||||
# Use that fingerprint
|
||||
gcutil setinstancemetadata myesnode1 \
|
||||
--zone=europe-west1-a \
|
||||
--metadata=es_port:9301 \
|
||||
--fingerprint=42WmSpB8rSM=
|
||||
```
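
The fallback behaviour described in this section (use `es_port` when it is set, otherwise 9300) can be sketched like this. `transport_port` is a hypothetical helper that takes `key:value` metadata entries as arguments; it does not query the GCE metadata service.

```sh
# Print the transport port for an instance, given its metadata entries
# as "key:value" arguments. Falls back to 9300 when es_port is absent.
transport_port() {
    for tp_entry in "$@"
    do
        case $tp_entry in
            es_port:*) echo "${tp_entry#es_port:}"; return 0 ;;
        esac
    done
    echo 9300
}

transport_port es_port:9301
transport_port
```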
|
||||
|
||||
|
||||
Tips
|
||||
----
|
||||
|
||||
### Store project id locally
|
||||
|
||||
If you don't want to repeat the project id each time, you can save it in the `~/.gcutil.flags` file using:
|
||||
|
||||
```sh
|
||||
gcutil getproject --project=es-cloud --cache_flag_values
|
||||
```
|
||||
|
||||
`~/.gcutil.flags` file now contains:
|
||||
|
||||
```
|
||||
--project=es-cloud
|
||||
```
|
||||
|
||||
### Machine Permissions
|
||||
|
||||
**Creating machines with gcutil**
|
||||
|
||||
Ensure the following flags are set:
|
||||
|
||||
```
|
||||
--service_account_scope=compute-rw
|
||||
```
|
||||
|
||||
**Creating with console (web)**
|
||||
|
||||
When creating an instance using the web portal, click **Show advanced options**.
|
||||
|
||||
At the bottom of the page, under `PROJECT ACCESS`, choose `>> Compute >> Read Write`.
|
||||
|
||||
**Creating with knife google**
|
||||
|
||||
Set the service account scopes when creating the machine:
|
||||
|
||||
```
|
||||
$ knife google server create www1 \
|
||||
-m n1-standard-1 \
|
||||
-I debian-7-wheezy-v20131120 \
|
||||
-Z us-central1-a \
|
||||
-i ~/.ssh/id_rsa \
|
||||
-x jdoe \
|
||||
--gce-service-account-scopes https://www.googleapis.com/auth/compute.full_control
|
||||
```
|
||||
|
||||
Or, you may use the alias:
|
||||
|
||||
```
|
||||
--gce-service-account-scopes compute-rw
|
||||
```
|
||||
|
||||
If you have created a machine without the correct permissions, you will see `403 unauthorized` error messages. The only
|
||||
way to alter these permissions is to delete the instance (NOT THE DISK). Then create another with the correct permissions.
|
||||
|
||||
|
||||
License
|
||||
-------
|
||||
|
||||
This software is licensed under the Apache 2 license, quoted below.
|
||||
|
||||
Copyright 2009-2014 Elasticsearch <http://www.elasticsearch.org>
|
||||
|
||||
Licensed under the Apache License, Version 2.0 (the "License"); you may not
|
||||
use this file except in compliance with the License. You may obtain a copy of
|
||||
the License at
|
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
Unless required by applicable law or agreed to in writing, software
|
||||
distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
|
||||
WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
|
||||
License for the specific language governing permissions and limitations under
|
||||
the License.
|
|
@ -1,5 +0,0 @@
|
|||
Example JVM Plugin for Elasticsearch
|
||||
==================================
|
||||
Leniency is the root of all evil
|
||||
|
||||
|
|
@ -1,177 +0,0 @@
|
|||
JavaScript lang Plugin for Elasticsearch
|
||||
==================================
|
||||
|
||||
The JavaScript language plugin allows you to use `javascript` (or `js`) as the language of scripts to execute.
|
||||
|
||||
In order to install the plugin, simply run:
|
||||
|
||||
```sh
|
||||
bin/plugin install elasticsearch/elasticsearch-lang-javascript/2.5.0
|
||||
```
|
||||
|
||||
You need to install a version matching your Elasticsearch version:
|
||||
|
||||
| elasticsearch | JavaScript Plugin | Docs |
|
||||
|---------------|-----------------------|------------|
|
||||
| master | Build from source | See below |
|
||||
| es-1.x | Build from source | [2.6.0-SNAPSHOT](https://github.com/elasticsearch/elasticsearch-transport-thrift/tree/es-1.x/#version-260-snapshot-for-elasticsearch-1x) |
|
||||
| es-1.5 | 2.5.0 | [2.5.0](https://github.com/elastic/elasticsearch-lang-javascript/tree/v2.5.0/#version-250-for-elasticsearch-15) |
|
||||
| es-1.4 | 2.4.1 | [2.4.1](https://github.com/elasticsearch/elasticsearch-lang-javascript/tree/v2.4.1/#version-241-for-elasticsearch-14) |
|
||||
| es-1.3 | 2.3.1 | [2.3.1](https://github.com/elasticsearch/elasticsearch-lang-javascript/tree/v2.3.1/#version-231-for-elasticsearch-13) |
|
||||
| es-1.2 | 2.2.0 | [2.2.0](https://github.com/elasticsearch/elasticsearch-lang-javascript/tree/v2.2.0/#javascript-lang-plugin-for-elasticsearch) |
|
||||
| es-1.1 | 2.1.0 | [2.1.0](https://github.com/elasticsearch/elasticsearch-lang-javascript/tree/v2.1.0/#javascript-lang-plugin-for-elasticsearch) |
|
||||
| es-1.0 | 2.0.0 | [2.0.0](https://github.com/elasticsearch/elasticsearch-lang-javascript/tree/v2.0.0/#javascript-lang-plugin-for-elasticsearch) |
|
||||
| es-0.90 | 1.4.0 | [1.4.0](https://github.com/elasticsearch/elasticsearch-lang-javascript/tree/v1.4.0/#javascript-lang-plugin-for-elasticsearch) |
|
||||
|
||||
To build a `SNAPSHOT` version, you need to build it with Maven:
|
||||
|
||||
```bash
|
||||
mvn clean install
|
||||
plugin install lang-javascript \
|
||||
--url file:target/releases/elasticsearch-lang-javascript-X.X.X-SNAPSHOT.zip
|
||||
```
|
||||
|
||||
|
||||
Using javascript with function_score
|
||||
------------------------------------
|
||||
|
||||
Let's say you want to use the `function_score` API with `javascript`. Here is
|
||||
a way of doing it:
|
||||
|
||||
```sh
|
||||
curl -XDELETE "http://localhost:9200/test"
|
||||
|
||||
curl -XPUT "http://localhost:9200/test/doc/1" -d '{
|
||||
"num": 1.0
|
||||
}'
|
||||
|
||||
curl -XPUT "http://localhost:9200/test/doc/2?refresh" -d '{
|
||||
"num": 2.0
|
||||
}'
|
||||
|
||||
curl -XGET "http://localhost:9200/test/_search?pretty" -d '
|
||||
{
|
||||
"query": {
|
||||
"function_score": {
|
||||
"script_score": {
|
||||
"script": "doc[\"num\"].value",
|
||||
"lang": "javascript"
|
||||
}
|
||||
}
|
||||
}
|
||||
}'
|
||||
```
|
||||
|
||||
gives
|
||||
|
||||
```javascript
|
||||
{
|
||||
// ...
|
||||
"hits": {
|
||||
"total": 2,
|
||||
"max_score": 4,
|
||||
"hits": [
|
||||
{
|
||||
// ...
|
||||
"_score": 4
|
||||
},
|
||||
{
|
||||
// ...
|
||||
"_score": 1
|
||||
}
|
||||
]
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
Using javascript with script_fields
-----------------------------------

```sh
curl -XDELETE "http://localhost:9200/test"

curl -XPUT "http://localhost:9200/test/doc/1?refresh" -d'
{
  "obj1": {
    "test": "something"
  },
  "obj2": {
    "arr2": [ "arr_value1", "arr_value2" ]
  }
}'

curl -XGET "http://localhost:9200/test/_search" -d'
{
  "script_fields": {
    "s_obj1": {
      "script": "_source.obj1", "lang": "js"
    },
    "s_obj1_test": {
      "script": "_source.obj1.test", "lang": "js"
    },
    "s_obj2": {
      "script": "_source.obj2", "lang": "js"
    },
    "s_obj2_arr2": {
      "script": "_source.obj2.arr2", "lang": "js"
    }
  }
}'
```

This gives:

```javascript
{
  // ...
  "hits": [
    {
      // ...
      "fields": {
        "s_obj2_arr2": [
          [
            "arr_value1",
            "arr_value2"
          ]
        ],
        "s_obj1_test": [
          "something"
        ],
        "s_obj2": [
          {
            "arr2": [
              "arr_value1",
              "arr_value2"
            ]
          }
        ],
        "s_obj1": [
          {
            "test": "something"
          }
        ]
      }
    }
  ]
}
```


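In the response above, every script field value arrives wrapped in a one-element list. A client may want to unwrap those before use. The following is a minimal sketch under that assumption; `unwrap_fields` is a hypothetical helper, and it cannot tell a wrapped single value apart from a script that genuinely returned a one-element list, so treat it as a convenience only.

```python
def unwrap_fields(hit):
    """Return the hit's script fields with single-element lists unwrapped.

    Hypothetical helper, not part of any Elasticsearch client.
    """
    unwrapped = {}
    for name, values in hit.get("fields", {}).items():
        # Keep the list as-is unless it holds exactly one element.
        unwrapped[name] = values[0] if len(values) == 1 else values
    return unwrapped


# Sample hit copied from the response shown above.
sample_hit = {
    "fields": {
        "s_obj2_arr2": [["arr_value1", "arr_value2"]],
        "s_obj1_test": ["something"],
        "s_obj2": [{"arr2": ["arr_value1", "arr_value2"]}],
        "s_obj1": [{"test": "something"}],
    }
}

fields = unwrap_fields(sample_hit)
print(fields["s_obj1_test"])  # something
print(fields["s_obj2_arr2"])  # ['arr_value1', 'arr_value2']
```
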
License
-------

This software is licensed under the Apache 2 license, quoted below.

Copyright 2009-2014 Elasticsearch <http://www.elasticsearch.org>

Licensed under the Apache License, Version 2.0 (the "License"); you may not
use this file except in compliance with the License. You may obtain a copy of
the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
License for the specific language governing permissions and limitations under
the License.
@@ -1,178 +0,0 @@
Python lang Plugin for Elasticsearch
====================================

The Python (Jython) language plugin lets you use `python` as the language of scripts to execute.

To install the plugin, run:

```sh
bin/plugin install elasticsearch/elasticsearch-lang-python/2.5.0
```

You need to install a version matching your Elasticsearch version:

| elasticsearch | Python Lang Plugin | Docs |
|---------------|-----------------------|------------|
| master | Build from source | See below |
| es-1.x | Build from source | [2.6.0-SNAPSHOT](https://github.com/elasticsearch/elasticsearch-lang-python/tree/es-1.x/#version-260-snapshot-for-elasticsearch-1x) |
| es-1.5 | 2.5.0 | [2.5.0](https://github.com/elastic/elasticsearch-lang-python/tree/v2.5.0/#version-250-for-elasticsearch-15) |
| es-1.4 | 2.4.1 | [2.4.1](https://github.com/elasticsearch/elasticsearch-lang-python/tree/v2.4.1/#version-241-for-elasticsearch-14) |
| es-1.3 | 2.3.1 | [2.3.1](https://github.com/elasticsearch/elasticsearch-lang-python/tree/v2.3.1/#version-231-for-elasticsearch-13) |
| < 1.3.5 | 2.3.0 | [2.3.0](https://github.com/elasticsearch/elasticsearch-lang-python/tree/v2.3.0/#version-230-for-elasticsearch-13) |
| es-1.2 | 2.2.0 | [2.2.0](https://github.com/elasticsearch/elasticsearch-lang-python/tree/v2.2.0/#python-lang-plugin-for-elasticsearch) |
| es-1.0 | 2.0.0 | [2.0.0](https://github.com/elasticsearch/elasticsearch-lang-python/tree/v2.0.0/#python-lang-plugin-for-elasticsearch) |
| es-0.90 | 1.0.0 | [1.0.0](https://github.com/elasticsearch/elasticsearch-lang-python/tree/v1.0.0/#python-lang-plugin-for-elasticsearch) |

To use a `SNAPSHOT` version, build the plugin with Maven and install it from the resulting zip file:

```bash
mvn clean install
plugin install lang-python \
    --url file:target/releases/elasticsearch-lang-python-X.X.X-SNAPSHOT.zip
```

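The compatibility table above can also be expressed as a small lookup, which may be handy in provisioning scripts. This is only a sketch: `plugin_version_for` is a hypothetical helper, it assumes plain numeric version strings such as `1.4.2`, and it reads the `< 1.3.5` row as applying to earlier releases of the 1.3 line.

```python
# Plugin releases per Elasticsearch release line, copied from the table above.
PLUGIN_VERSIONS = {
    "1.5": "2.5.0",
    "1.4": "2.4.1",
    "1.3": "2.3.1",
    "1.2": "2.2.0",
    "1.0": "2.0.0",
    "0.90": "1.0.0",
}


def plugin_version_for(es_version):
    """Return the plugin release for an Elasticsearch version, or None.

    Hypothetical helper; expects numeric versions such as "1.4.2".
    """
    parts = es_version.split(".")
    release_line = ".".join(parts[:2])
    # Assumption: the "< 1.3.5" row covers 1.3.x releases before 1.3.5.
    if release_line == "1.3" and len(parts) > 2 and int(parts[2]) < 5:
        return "2.3.0"
    return PLUGIN_VERSIONS.get(release_line)


print(plugin_version_for("1.5.2"))  # 2.5.0
print(plugin_version_for("1.3.2"))  # 2.3.0
print(plugin_version_for("1.1.0"))  # None (no row for es-1.1 in the table)
```
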
User Guide
----------

Using python with function_score
--------------------------------

Suppose you want to use `python` scripts with the `function_score` query. Here is
one way of doing it:

```sh
curl -XDELETE "http://localhost:9200/test"

curl -XPUT "http://localhost:9200/test/doc/1" -d '{
  "num": 1.0
}'

curl -XPUT "http://localhost:9200/test/doc/2?refresh" -d '{
  "num": 2.0
}'

curl -XGET "http://localhost:9200/test/_search?pretty" -d'
{
  "query": {
    "function_score": {
      "script_score": {
        "script": "doc[\"num\"].value * _score",
        "lang": "python"
      }
    }
  }
}'
```

This gives:

```javascript
{
  // ...
  "hits": {
    "total": 2,
    "max_score": 2,
    "hits": [
      {
        // ...
        "_score": 2
      },
      {
        // ...
        "_score": 1
      }
    ]
  }
}
```

Using python with script_fields
-------------------------------

```sh
curl -XDELETE "http://localhost:9200/test"

curl -XPUT "http://localhost:9200/test/doc/1?refresh" -d'
{
  "obj1": {
    "test": "something"
  },
  "obj2": {
    "arr2": [ "arr_value1", "arr_value2" ]
  }
}'

curl -XGET "http://localhost:9200/test/_search" -d'
{
  "script_fields": {
    "s_obj1": {
      "script": "_source[\"obj1\"]", "lang": "python"
    },
    "s_obj1_test": {
      "script": "_source[\"obj1\"][\"test\"]", "lang": "python"
    },
    "s_obj2": {
      "script": "_source[\"obj2\"]", "lang": "python"
    },
    "s_obj2_arr2": {
      "script": "_source[\"obj2\"][\"arr2\"]", "lang": "python"
    }
  }
}'
```

This gives:

```javascript
{
  // ...
  "hits": [
    {
      // ...
      "fields": {
        "s_obj2_arr2": [
          [
            "arr_value1",
            "arr_value2"
          ]
        ],
        "s_obj1_test": [
          "something"
        ],
        "s_obj2": [
          {
            "arr2": [
              "arr_value1",
              "arr_value2"
            ]
          }
        ],
        "s_obj1": [
          {
            "test": "something"
          }
        ]
      }
    }
  ]
}
```

License
-------

This software is licensed under the Apache 2 license, quoted below.

Copyright 2009-2014 Elasticsearch <http://www.elasticsearch.org>

Licensed under the Apache License, Version 2.0 (the "License"); you may not
use this file except in compliance with the License. You may obtain a copy of
the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
License for the specific language governing permissions and limitations under
the License.