The reindex API is mature now, and we will work to maintain backwards
compatibility in accordance with our backwards compatibility
policy. This commit unmarks the reindex API as experimental.
Relates #23621
When using a bulk processor in test, you might write something like:
```java
BulkProcessor bulkProcessor = BulkProcessor.builder(client, new BulkProcessor.Listener() {
@Override public void beforeBulk(long executionId, BulkRequest request) {}
@Override public void afterBulk(long executionId, BulkRequest request, BulkResponse response) {}
@Override public void afterBulk(long executionId, BulkRequest request, Throwable failure) {}
})
.setBulkActions(10000)
.setFlushInterval(TimeValue.timeValueSeconds(10))
.build();
for (int i = 0; i < 10000; i++) {
bulkProcessor.add(new IndexRequest("foo", "bar", "doc_" + i)
.source(jsonBuilder().startObject().field("foo", "bar").endObject()
));
}
bulkProcessor.flush();
client.admin().indices().prepareRefresh("foo").get();
SearchResponse response = client.prepareSearch("foo").get();
// response does not contain any hit
```
The problem is that by default bulkProcessor defines the number of concurrent requests to 1 which is using behind the scene an Async BulkRequestHandler.
When you call `flush()` in a test, you expect it to flush all the content of the bulk so you can search for your docs.
But because of the async handling, there is a great chance that none of the documents has been indexed yet when you call the `refresh` method.
We should advice in our Java guide to explicitly set concurrent requests to `0` so users will use behind the scene the Sync BulkRequestHandler.
```java
BulkProcessor bulkProcessor = BulkProcessor.builder(client, new BulkProcessor.Listener() {
@Override public void beforeBulk(long executionId, BulkRequest request) {}
@Override public void afterBulk(long executionId, BulkRequest request, BulkResponse response) {}
@Override public void afterBulk(long executionId, BulkRequest request, Throwable failure) {}
})
.setBulkActions(5000)
.setFlushInterval(TimeValue.timeValueSeconds(10))
.setConcurrentRequests(0)
.build();
```
Closes#22158.
By default, it is recommended to start bulk with a size of 10-15MB, and increase it gradually to get the right size for the environment. The example shows originally 1GB, which can lead to some users to just copy-paste the code snippet and start with excessively big sizes.
Backport of #21664 in master branch.
With ES 5.0 we do not include Jackson
Databind anymore with ES core. This commit
updates our docs to state that users need
to add this artifact now in their projects.
The shaded version of elasticsearch was built at the very beginning to avoid dependency conflicts in a specific case where:
* People use elasticsearch from Java
* People needs to embed elasticsearch jar within their own application (as it's today the only way to get a `TransportClient`)
* People also embed in their application another (most of the time older) version of dependency we are using for elasticsearch, such as: Guava, Joda, Jackson...
This conflict issue can be solved within the projects themselves by either upgrade the dependency version and use the one provided by elasticsearch or by shading elasticsearch project and relocating some conflicting packages.
Example
-------
As an example, let's say you want to use within your project `Joda 2.1` but elasticsearch `2.0.0-beta1` provides `Joda 2.8`.
Let's say you also want to run all that with shield plugin.
Create a new maven project or module with:
```xml
<groupId>fr.pilato.elasticsearch.test</groupId>
<artifactId>es-shaded</artifactId>
<version>1.0-SNAPSHOT</version>
<properties>
<elasticsearch.version>2.0.0-beta1</elasticsearch.version>
</properties>
<dependencies>
<dependency>
<groupId>org.elasticsearch</groupId>
<artifactId>elasticsearch</artifactId>
<version>${elasticsearch.version}</version>
</dependency>
<dependency>
<groupId>org.elasticsearch.plugin</groupId>
<artifactId>shield</artifactId>
<version>${elasticsearch.version}</version>
</dependency>
</dependencies>
```
And now shade and relocate all packages which conflicts with your own application:
```xml
<build>
<plugins>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-shade-plugin</artifactId>
<version>2.4.1</version>
<executions>
<execution>
<phase>package</phase>
<goals>
<goal>shade</goal>
</goals>
<configuration>
<relocations>
<relocation>
<pattern>org.joda</pattern>
<shadedPattern>fr.pilato.thirdparty.joda</shadedPattern>
</relocation>
</relocations>
</configuration>
</execution>
</executions>
</plugin>
</plugins>
</build>
```
You can create now a shaded version of elasticsearch + shield by running `mvn clean install`.
In your project, you can now depend on:
```xml
<dependency>
<groupId>fr.pilato.elasticsearch.test</groupId>
<artifactId>es-shaded</artifactId>
<version>1.0-SNAPSHOT</version>
</dependency>
<dependency>
<groupId>joda-time</groupId>
<artifactId>joda-time</artifactId>
<version>2.1</version>
</dependency>
```
Build then your TransportClient as usual:
```java
TransportClient client = TransportClient.builder()
.settings(Settings.builder()
.put("path.home", ".")
.put("shield.user", "username:password")
.put("plugin.types", "org.elasticsearch.shield.ShieldPlugin")
)
.build();
client.addTransportAddress(new InetSocketTransportAddress(new InetSocketAddress("localhost", 9300)));
// Index some data
client.prepareIndex("test", "doc", "1").setSource("foo", "bar").setRefresh(true).get();
SearchResponse searchResponse = client.prepareSearch("test").get();
```
If you want to use your own version of Joda, then import for example `org.joda.time.DateTime`. If you want to access to the shaded version (not recommended though), import `fr.pilato.thirdparty.joda.time.DateTime`.
You can run a simple test to make sure that both classes can live together within the same JVM:
```java
CodeSource codeSource = new org.joda.time.DateTime().getClass().getProtectionDomain().getCodeSource();
System.out.println("unshaded = " + codeSource);
codeSource = new fr.pilato.thirdparty.joda.time.DateTime().getClass().getProtectionDomain().getCodeSource();
System.out.println("shaded = " + codeSource);
```
It will print:
```
unshaded = (file:/path/to/joda-time-2.1.jar <no signer certificates>)
shaded = (file:/path/to/es-shaded-1.0-SNAPSHOT.jar <no signer certificates>)
```
This PR also removes fully-loaded module.
By the way, the project can now build with Maven 3.3.3 so we can relax a bit our maven policy.
This commit reorganizes the docs to make Java API docs looking more like the REST docs.
Also, with 2.0.0, FilterBuilders don't exist anymore but only QueryBuilders.
Also, all docs api move now to docs/java-api/docs dir as for REST doc.
Remove removed queries/filters
-----
* Remove Constant Score Query with filter
* Remove Fuzzy Like This (Field) Query (flt and flt_field)
* Remove FilterBuilders
Move filters to queries
-----
* Move And Filter to And Query
* Move Bool Filter to Bool Query
* Move Exists Filter to Exists Query
* Move Geo Bounding Box Filter to Geo Bounding Box Query
* Move Geo Distance Filter to Geo Distance Query
* Move Geo Distance Range Filter to Geo Distance Range Query
* Move Geo Polygon Filter to Geo Polygon Query
* Move Geo Shape Filter to Geo Shape Query
* Move Has Child Filter by Has Child Query
* Move Has Parent Filter by Has Parent Query
* Move Ids Filter by Ids Query
* Move Limit Filter to Limit Query
* Move MatchAll Filter to MatchAll Query
* Move Missing Filter to Missing Query
* Move Nested Filter to Nested Query
* Move Not Filter to Not Query
* Move Or Filter to Or Query
* Move Range Filter to Range Query
* Move Ids Filter to Ids Query
* Move Term Filter to Term Query
* Move Terms Filter to Terms Query
* Move Type Filter to Type Query
Add missing queries
-----
* Add Common Terms Query
* Add Filtered Query
* Add Function Score Query
* Add Geohash Cell Query
* Add Regexp Query
* Add Script Query
* Add Simple Query String Query
* Add Span Containing Query
* Add Span Multi Term Query
* Add Span Within Query
Reorganize the documentation
-----
* Organize by full text queries
* Organize by term level queries
* Organize by compound queries
* Organize by joining queries
* Organize by geo queries
* Organize by specialized queries
* Organize by span queries
* Move Boosting Query
* Move DisMax Query
* Move Fuzzy Query
* Move Indices Query
* Move Match Query
* Move Mlt Query
* Move Multi Match Query
* Move Prefix Query
* Move Query String Query
* Move Span First Query
* Move Span Near Query
* Move Span Not Query
* Move Span Or Query
* Move Span Term Query
* Move Template Query
* Move Wildcard Query
Add some missing pages
----
* Add multi get API
* Add indexed-scripts link
Also closes#7826
Related to https://github.com/elastic/elasticsearch/pull/11477#issuecomment-114745934