Docs: Rewrote the migrating-to-2.0 section

2015-08-14 20:26:06 +02:00 · db1e83884f
parent 0240b581e7
commit db1e83884f
16 changed files with 1545 additions and 982 deletions
--- a/docs/reference/migration/migrate_2_0.asciidoc
+++ b/docs/reference/migration/migrate_2_0.asciidoc
--- a/docs/reference/migration/migrate_2_0/aggs.asciidoc
+++ b/docs/reference/migration/migrate_2_0/aggs.asciidoc
@ -0,0 +1,69 @@
 === Aggregation changes
 ==== Min doc count defaults to zero
 Both the `histogram` and `date_histogram` aggregations now have a default
 `min_doc_count` of `0` instead of `1`.
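 To keep the 1.x behaviour of omitting empty buckets, `min_doc_count` can be
 set explicitly.  The following is a minimal sketch; the index name
 `my_index`, the aggregation name `sales_per_month`, and the `date` field are
 hypothetical:
 [source,js]
 ---------------
 GET my_index/_search
 {
  "size": 0,
  "aggs": {
    "sales_per_month": {
      "date_histogram": {
        "field":         "date",
        "interval":      "month",
        "min_doc_count": 1 <1>
      }
    }
  }
 }
 ---------------
 <1> Only buckets containing at least one document are returned.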
 ==== Timezone for date field
 Specifying the `time_zone` parameter in queries or aggregations on fields of
 type `date` must now be either an ISO 8601 UTC offset, or a timezone id. For
 example, the value `+1:00` must now be written as `+01:00`.
 ==== Time zones and offsets
 The `histogram` and the `date_histogram` aggregation now support a simplified
 `offset` option that replaces the previous `pre_offset` and `post_offset`
 rounding options. Instead of having to specify two separate offset shifts of
 the underlying buckets, the `offset` option moves the bucket boundaries in
 positive or negative direction depending on its argument.
 The `date_histogram` options for `pre_zone` and `post_zone` are replaced by
 the `time_zone` option. The behavior of `time_zone` is equivalent to the
 former `pre_zone` option. Setting `time_zone` to a value like "+01:00" now
 will lead to the bucket calculations being applied in the specified time zone.
 The `key` is returned as the timestamp in UTC, but the `key_as_string` is
 returned in the time zone specified.
 In addition to this, the `pre_zone_adjust_large_interval` is removed because
 we now always return dates and bucket keys in UTC.
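 As an illustration, a request using the new options might look like the
 sketch below.  The index, aggregation, and field names are hypothetical, and
 the `+6h` offset is an arbitrary example value:
 [source,js]
 ---------------
 GET my_index/_search
 {
  "size": 0,
  "aggs": {
    "events_per_day": {
      "date_histogram": {
        "field":     "date",
        "interval":  "day",
        "time_zone": "+01:00", <1>
        "offset":    "+6h" <2>
      }
    }
  }
 }
 ---------------
 <1> Replaces `pre_zone`: buckets are calculated in the `+01:00` time zone.
 <2> Replaces `pre_offset` and `post_offset`: bucket boundaries are shifted
    forward by six hours.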
 ==== Including/excluding terms
 `include`/`exclude` filtering on the `terms` aggregation now uses the same
 syntax as <<regexp-syntax,regexp queries>> instead of the Java regular
 expression syntax. While simple regexps should still work, more complex ones
 might need some rewriting. Also, the `flags` parameter is no longer supported.
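 For instance, a `terms` aggregation that filters its terms might be written
 as follows.  This is only a sketch: the `tags` field and both patterns are
 hypothetical:
 [source,js]
 ---------------
 GET my_index/_search
 {
  "size": 0,
  "aggs": {
    "tags": {
      "terms": {
        "field":   "tags",
        "include": ".*sport.*", <1>
        "exclude": "water_.*"
      }
    }
  }
 }
 ---------------
 <1> Patterns are interpreted with the regexp query syntax, and no `flags`
    parameter is accepted.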
 ==== Boolean fields
 Aggregations on `boolean` fields will now return `0` and `1` as keys, and
 `"true"` and `"false"` as string keys.  See <<migration-bool-fields>> for more
 information.
 ==== Java aggregation classes
 The `date_histogram` aggregation now returns a `Histogram` object in the
 response, and the `DateHistogram` class has been removed.  Similarly the
 `date_range`, `ipv4_range`, and `geo_distance` aggregations all return a
 `Range` object in the response, and the `IPV4Range`, `DateRange`, and
 `GeoDistance` classes have been removed.
 The motivation for this is to have a single response API for the Range and
 Histogram aggregations regardless of the type of data being queried.  To
 support this some changes were made in the `MultiBucketAggregation` interface
 which applies to all bucket aggregations:
 * The `getKey()` method now returns `Object` instead of `String`. The actual
  object type returned depends on the type of aggregation requested (e.g. the
  `date_histogram` will return a `DateTime` object for this method whereas a
  `histogram` will return a `Number`).
 * A `getKeyAsString()` method has been added to return the String
  representation of the key.
 * All other `getKeyAsX()` methods have been removed.
 * The `getBucketAsKey(String)` methods have been removed on all aggregations
  except the `filters` and `terms` aggregations.
--- a/docs/reference/migration/migrate_2_0/crud.asciidoc
+++ b/docs/reference/migration/migrate_2_0/crud.asciidoc
@ -0,0 +1,129 @@
 === CRUD and routing changes
 ==== Explicit custom routing
 Custom `routing` values can no longer be extracted from the document body, but
 must be specified explicitly as part of the query string, or in the metadata
 line in the <<docs-bulk,`bulk`>> API.  See <<migration-meta-fields>> for an
 example.
 ==== Routing hash function
 The default hash function that is used for routing has been changed from
 `djb2` to `murmur3`. This change should be transparent unless you relied on
 very specific properties of `djb2`. This will help ensure a better balance of
 the document counts between shards.
 In addition, the following routing-related node settings have been deprecated:
 `cluster.routing.operation.hash.type`::
  This was an undocumented setting that controlled which hash function was
  used for routing. `murmur3` is now enforced on new indices.
 `cluster.routing.operation.use_type`::
  This was an undocumented setting that controlled whether the `_type` of the
  document was taken into account when computing its shard (default: `false`).
  `false` is now enforced on new indices.
 ==== Delete API with custom routing
 The delete API used to be broadcast to all shards in the index which meant
 that, when using custom routing, the `routing` parameter was optional. Now,
 the delete request is forwarded only to the shard holding the document. If you
 are using custom routing then you should specify the `routing` value when
 deleting a document, just as is already required for the `index`, `create`,
 and `update` APIs.
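 For example, assuming a document that was indexed with the (hypothetical)
 routing value `user1`, the delete request would pass the same value:
 [source,js]
 ---------------------------
 DELETE my_index/my_type/1?routing=user1
 ---------------------------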
 To make sure that you never forget a routing value, make routing required with
 the following mapping:
 [source,js]
 ---------------------------
 PUT my_index
 {
  "mappings": {
    "my_type": {
      "_routing": {
        "required": true
      }
    }
  }
 }
 ---------------------------
 ==== All stored meta-fields returned by default
 Previously, meta-fields like `_routing`, `_timestamp`, etc would only be
 included in a GET request if specifically requested with the `fields`
 parameter.  Now, all meta-fields which have stored values will be returned by
 default.  Additionally, they are now returned at the top level (along with
 `_index`, `_type`, and `_id`) instead of in the `fields` element.
 For instance, the following request:
 [source,sh]
 ---------------
 GET /my_index/my_type/1
 ---------------
 might return:
 [source,js]
 ---------------
 {
  "_index":     "my_index",
  "_type":      "my_type",
  "_id":        "1",
  "_timestamp": 10000000, <1>
  "_source": {
    "foo" : [ "bar" ]
  }
 }
 ---------------
 <1> The `_timestamp` is returned by default, and at the top level.
 ==== Async replication
 The `replication` parameter has been removed from all CRUD operations
 (`index`, `create`,  `update`, `delete`, `bulk`) as it interfered with the
 <<indices-synced-flush,synced flush>> feature.  These operations are now
 synchronous only and a request will only return once the changes have been
 replicated to all active shards in the shard group.
 Instead, use more client processes to send more requests in parallel.
 ==== Documents must be specified without a type wrapper
 Previously, the document body could be wrapped in another object with the name
 of the `type`:
 [source,js]
 --------------------------
 PUT my_index/my_type/1
 {
  "my_type": { <1>
    "text": "quick brown fox"
  }
 }
 --------------------------
 <1> This `my_type` wrapper is not part of the document itself, but represents the document type.
 This feature was deprecated before but could be reenabled with the
 `mapping.allow_type_wrapper` index setting.  This setting is no longer
 supported.  The above document should be indexed as follows:
 [source,js]
 --------------------------
 PUT my_index/my_type/1
 {
  "text": "quick brown fox"
 }
 --------------------------
 ==== Term Vectors API
 Usage of `/_termvector` is deprecated in favor of `/_termvectors`.
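 A request against the new endpoint might look like this, assuming a
 hypothetical document `1` of type `my_type` in `my_index`:
 [source,js]
 --------------------------
 GET my_index/my_type/1/_termvectors
 --------------------------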
--- a/docs/reference/migration/migrate_2_0/index_apis.asciidoc
+++ b/docs/reference/migration/migrate_2_0/index_apis.asciidoc
@ -0,0 +1,42 @@
 === Index API changes
 ==== Index aliases
 Fields used in alias filters no longer have to exist in the mapping at alias
 creation time. Previously, alias filters were parsed at alias creation time
 and the parsed form was cached in memory. Now, alias filters are parsed at
 request time and the fields in filters are resolved from the current mapping.
 This also means that index aliases now support `has_parent` and `has_child`
 queries.
 The <<alias-retrieving, GET alias api>> will now throw an exception if no
 matching aliases are found. This change brings the defaults for this API in
 line with the other Indices APIs. The <<multi-index>> options can be used on a
 request to change this behavior.
 ==== File based index templates
 Index templates can no longer be configured on disk. Use the
 <<indices-templates,`_template`>> API instead.
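 A template that was previously stored on disk could be registered through
 the API along these lines.  This is a sketch only: the template name, the
 `logs-*` pattern, and the settings and mappings shown are hypothetical:
 [source,js]
 ---------------
 PUT _template/my_template
 {
  "template": "logs-*", <1>
  "settings": {
    "number_of_shards": 1
  },
  "mappings": {
    "my_type": {
      "properties": {
        "message": { "type": "string" }
      }
    }
  }
 }
 ---------------
 <1> The template is applied to any new index whose name matches this pattern.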
 ==== Analyze API changes
 The Analyze API now returns the `position` of the first token as `0`
 instead of `1`.
 The `prefer_local` parameter has been removed. The `_analyze` API is a light
 operation and the caller shouldn't be concerned about whether it executes on
 the node that receives the request or another node.
 The `text()` method on `AnalyzeRequest` now returns `String[]` instead of
 `String`.
 ==== Removed `id_cache` from clear cache api
 The <<indices-clearcache,clear cache>> API no longer supports the `id_cache`
 option.  Instead, use the `fielddata` option to clear the cache for the
 `_parent` field.
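 A minimal sketch of the replacement request, assuming a hypothetical index
 named `my_index`.  Note that this clears fielddata for all fields in the
 index, which includes the `_parent` field:
 [source,js]
 ---------------
 POST my_index/_cache/clear?fielddata=true
 ---------------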
--- a/docs/reference/migration/migrate_2_0/java.asciidoc
+++ b/docs/reference/migration/migrate_2_0/java.asciidoc
@ -0,0 +1,76 @@
 === Java API changes
 ==== Transport API construction
 The `TransportClient` construction code has changed: it now uses the builder
 pattern. Instead of:
 [source,java]
 --------------------------------------------------
 Settings settings = Settings.settingsBuilder()
        .put("cluster.name", "myClusterName").build();
 Client client = new TransportClient(settings);
 --------------------------------------------------
 Use the following:
 [source,java]
 --------------------------------------------------
 Settings settings = Settings.settingsBuilder()
        .put("cluster.name", "myClusterName").build();
 Client client = TransportClient.builder().settings(settings).build();
 --------------------------------------------------
 ==== Automatically thread client listeners
 Previously, the user had to set request listener threads to `true` on the
 client side in order not to block IO threads on heavy operations. This proved
 to be error-prone, and ended up creating problems that were very hard to
 debug.
 In 2.0, Elasticsearch automatically threads listeners that are used from the
 client when the client is a node client or a transport client. Threading can
 no longer be manually set.
 ==== Query/filter refactoring
 `org.elasticsearch.index.query.FilterBuilders` has been removed as part of the merge of
 queries and filters. These filters are now available in `QueryBuilders` with the same name.
 All methods that used to accept a `FilterBuilder` now accept a `QueryBuilder` instead.
 In addition, some query builders have been removed or renamed:
 * `commonTerms(...)` renamed to `commonTermsQuery(...)`
 * `queryString(...)` renamed to `queryStringQuery(...)`
 * `simpleQueryString(...)` renamed to `simpleQueryStringQuery(...)`
 * `textPhrase(...)` removed
 * `textPhrasePrefix(...)` removed
 * `textPhrasePrefixQuery(...)` removed
 * `filtered(...)` removed. Use `filteredQuery(...)` instead.
 * `inQuery(...)` removed.
 ==== GetIndexRequest
 `GetIndexRequest.features()` now returns an array of Feature Enums instead of an array of String values.
 The following deprecated methods have been removed:
 * `GetIndexRequest.addFeatures(String[])` - Use
  `GetIndexRequest.addFeatures(Feature[])` instead
 * `GetIndexRequest.features(String[])` - Use
  `GetIndexRequest.features(Feature[])` instead.
 * `GetIndexRequestBuilder.addFeatures(String[])` - Use
  `GetIndexRequestBuilder.addFeatures(Feature[])` instead.
 * `GetIndexRequestBuilder.setFeatures(String[])` - Use
  `GetIndexRequestBuilder.setFeatures(Feature[])` instead.
 ==== BytesQueryBuilder removed
 The redundant BytesQueryBuilder has been removed in favour of the
 WrapperQueryBuilder internally.
--- a/docs/reference/migration/migrate_2_0/mapping.asciidoc
+++ b/docs/reference/migration/migrate_2_0/mapping.asciidoc
@ -0,0 +1,390 @@
 === Mapping changes
 A number of changes have been made to mappings to remove ambiguity and to
 ensure that conflicting mappings cannot be created.
 One major change is that dynamically added fields must have their mapping
 confirmed by the master node before indexing continues.  This is to avoid a
 problem where different shards in the same index dynamically add different
 mappings for the same field.  These conflicting mappings can silently return
 incorrect results and can lead to index corruption.
 This change can make indexing slower when frequently adding many new fields.
 We are looking at ways of optimising this process but we chose safety over
 performance for this extreme use case.
 ==== Conflicting field mappings
 Fields with the same name, in the same index, in different types, must have
 the same mapping, with the exception of the <<copy-to>>, <<dynamic>>,
 <<enabled>>, <<ignore-above>>, <<include-in-all>>, and <<properties>>
 parameters, which may have different settings per field.
 [source,js]
 ---------------
 PUT my_index
 {
  "mappings": {
    "type_one": {
      "properties": {
        "name": { <1>
          "type": "string"
        }
      }
    },
    "type_two": {
      "properties": {
        "name": { <1>
          "type":     "string",
          "analyzer": "english"
        }
      }
    }
  }
 }
 ---------------
 <1> The two `name` fields have conflicting mappings and will prevent Elasticsearch
    from starting.
 Elasticsearch will not start in the presence of conflicting field mappings.
 These indices must be deleted or reindexed using a new mapping.
 The `ignore_conflicts` option of the put mappings API has been removed.
 Conflicts can't be ignored anymore.
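 One way to resolve the conflict in the example above is to give both `name`
 fields the same mapping in a new index and reindex into it.  The sketch below
 assumes that the `english` analyzer is wanted for both types, and uses the
 hypothetical index name `my_new_index`:
 [source,js]
 ---------------
 PUT my_new_index
 {
  "mappings": {
    "type_one": {
      "properties": {
        "name": { <1>
          "type":     "string",
          "analyzer": "english"
        }
      }
    },
    "type_two": {
      "properties": {
        "name": { <1>
          "type":     "string",
          "analyzer": "english"
        }
      }
    }
  }
 }
 ---------------
 <1> Both `name` fields now share the same `type` and `analyzer`.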
 ==== Fields cannot be referenced by short name
 A field can no longer be referenced using its short name.  Instead, the full
 path to the field is required.  For instance:
 [source,js]
 ---------------
 PUT my_index
 {
  "mappings": {
    "my_type": {
      "properties": {
        "title":     { "type": "string" }, <1>
        "name": {
          "properties": {
            "title": { "type": "string" }, <2>
            "first": { "type": "string" },
            "last":  { "type": "string" }
          }
        }
      }
    }
  }
 }
 ---------------
 <1> This field is referred to as `title`.
 <2> This field is referred to as `name.title`.
 Previously, the two `title` fields in the example above could have been
 confused with each other when using the short name `title`.
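 With the mapping above, a query against the inner field would use the full
 path.  A minimal sketch, with a hypothetical query string:
 [source,js]
 ---------------
 GET my_index/_search
 {
  "query": {
    "match": {
      "name.title": "doctor" <1>
    }
  }
 }
 ---------------
 <1> The full path `name.title` targets the inner field only; `title` targets
    the top-level field only.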
 ==== Type name prefix removed
 Previously, two fields with the same name in two different types could
 sometimes be disambiguated by prepending the type name.  As a side effect, it
 would add a filter on the type name to the relevant query.  This feature was
 ambiguous -- a type name could be confused with a field name -- and didn't
 work everywhere, e.g. in aggregations.
 Instead, fields should be specified with the full path, but without a type
 name prefix.  If you wish to filter by the `_type` field, either specify the
 type in the URL or add an explicit filter.
 The following example query in 1.x:
 [source,js]
 ----------------------------
 GET my_index/_search
 {
  "query": {
    "match": {
      "my_type.some_field": "quick brown fox"
    }
  }
 }
 ----------------------------
 would be rewritten in 2.0 as:
 [source,js]
 ----------------------------
 GET my_index/my_type/_search <1>
 {
  "query": {
    "match": {
      "some_field": "quick brown fox" <2>
    }
  }
 }
 ----------------------------
 <1> The type name can be specified in the URL to act as a filter.
 <2> The field name should be specified without the type prefix.
 ==== Field names may not contain dots
 In 1.x, it was possible to create fields with dots in their name, for
 instance:
 [source,js]
 ----------------------------
 PUT my_index
 {
  "mappings": {
    "my_type": {
      "properties": {
        "foo.bar": { <1>
          "type": "string"
        },
        "foo": {
          "properties": {
            "bar": { <1>
              "type": "string"
            }
          }
        }
      }
    }
  }
 }
 ----------------------------
 <1> These two fields cannot be distinguished as both are referred to as `foo.bar`.
 You can no longer create fields with dots in the name.
 ==== Type names may not start with a dot
 In 1.x, Elasticsearch would issue a warning if a type name included a dot,
 e.g. `my.type`.  Now that type names are no longer used to distinguish between
 fields in different types, this warning has been relaxed: type names may now
 contain dots, but they may not *begin* with a dot.  The only exception to this
 is the special `.percolator` type.
 ==== Types may no longer be deleted
 In 1.x it was possible to delete a type mapping, along with all of the
 documents of that type, using the delete mapping API.  This is no longer
 supported, because remnants of the fields in the type could remain in the
 index, causing corruption later on.
 Instead, if you need to delete a type mapping, you should reindex to a new
 index which does not contain the mapping.  If you just need to delete the
 documents that belong to that type, then use the delete-by-query plugin
 instead.
 [[migration-meta-fields]]
 ==== Type meta-fields
 The <<mapping-fields,meta-fields>> associated with types have had configuration
 options removed, to make them more reliable:
 * `_id` configuration can no longer be changed.  If you need to sort, use the <<mapping-uid-field,`_uid`>> field instead.
 * `_type` configuration can no longer be changed.
 * `_index` configuration can no longer be changed.
 * `_routing` configuration is limited to marking routing as required.
 * `_field_names` configuration is limited to disabling the field.
 * `_size` configuration is limited to enabling the field.
 * `_timestamp` configuration is limited to enabling the field, setting format and default value.
 * `_boost` has been removed.
 * `_analyzer` has been removed.
 Importantly, *meta-fields can no longer be specified as part of the document
 body.*  Instead, they must be specified in the query string parameters.  For
 instance, in 1.x, the `routing` could be specified as follows:
 [source,json]
 -----------------------------
 PUT my_index
 {
  "mappings": {
    "my_type": {
      "_routing": {
        "path": "group" <1>
      },
      "properties": {
        "group": { <1>
          "type": "string"
        }
      }
    }
  }
 }
 PUT my_index/my_type/1 <2>
 {
  "group": "foo"
 }
 -----------------------------
 <1> This 1.x mapping tells Elasticsearch to extract the `routing` value from the `group` field in the document body.
 <2> This indexing request uses a `routing` value of `foo`.
 In 2.0, the routing must be specified explicitly:
 [source,json]
 -----------------------------
 PUT my_index
 {
  "mappings": {
    "my_type": {
      "_routing": {
        "required": true <1>
      },
      "properties": {
        "group": {
          "type": "string"
        }
      }
    }
  }
 }
 PUT my_index/my_type/1?routing=bar <2>
 {
  "group": "foo"
 }
 -----------------------------
 <1> Routing can be marked as required to ensure it is not forgotten during indexing.
 <2> This indexing request uses a `routing` value of `bar`.
 ==== Analyzer mappings
 Previously, `index_analyzer` and `search_analyzer` could be set separately,
 while the `analyzer` setting would set both.  The `index_analyzer` setting has
 been removed in favour of just using the `analyzer` setting.
 If just the `analyzer` is set, it will be used at index time and at search time.  To use a different analyzer at search time, specify both the `analyzer` and a `search_analyzer`.
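 A mapping that uses different analyzers at index time and search time might
 look like the sketch below.  The field name `text` and the choice of the
 `english` and `standard` analyzers are purely illustrative:
 [source,js]
 ---------------
 PUT my_index
 {
  "mappings": {
    "my_type": {
      "properties": {
        "text": {
          "type":            "string",
          "analyzer":        "english", <1>
          "search_analyzer": "standard" <2>
        }
      }
    }
  }
 }
 ---------------
 <1> Used at index time, and at search time if no `search_analyzer` is set.
 <2> Overrides `analyzer` at search time only.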
 The `index_analyzer`, `search_analyzer`,  and `analyzer` type-level settings
 have also been removed, as it is no longer possible to select fields based on
 the type name.
 The `_analyzer` meta-field, which allowed setting an analyzer per document has
 also been removed.  It will be ignored on older indices.
 ==== Date fields and Unix timestamps
 Previously, `date` fields would first try to parse values as a Unix timestamp
 -- milliseconds-since-the-epoch -- before trying to use their defined date
 `format`.  This meant that formats like `yyyyMMdd` could never work, as values
 would be interpreted as timestamps.
 In 2.0, we have added two formats: `epoch_millis` and `epoch_second`.  Only
 date fields that use these formats will be able to parse timestamps.
 These formats cannot be used in dynamic templates, because they are
 indistinguishable from long values.
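 As a sketch, a mapping could declare one field that accepts only `yyyyMMdd`
 values and another that still accepts millisecond timestamps.  The field
 names `day` and `created` are hypothetical:
 [source,js]
 ---------------
 PUT my_index
 {
  "mappings": {
    "my_type": {
      "properties": {
        "day": {
          "type":   "date",
          "format": "yyyyMMdd" <1>
        },
        "created": {
          "type":   "date",
          "format": "strict_date_optional_time||epoch_millis" <2>
        }
      }
    }
  }
 }
 ---------------
 <1> A value such as `20150101` is parsed as a date, not as a timestamp.
 <2> Timestamps are accepted only because `epoch_millis` is listed explicitly.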
 ==== Default date format
 The default date format has changed from `date_optional_time` to
 `strict_date_optional_time`, which expects a 4 digit year and a 2 digit month
 and day (and optionally a 2 digit hour, minute, and second).
 A dynamically added date field, by default, includes the `epoch_millis`
 format to support timestamp parsing.  For instance:
 [source,js]
 -------------------------
 PUT my_index/my_type/1
 {
  "date_one": "2015-01-01" <1>
 }
 -------------------------
 <1> Has `format`: `"strict_date_optional_time||epoch_millis"`.
 [[migration-bool-fields]]
 ==== Boolean fields
 Boolean fields used to have a string fielddata with `F` meaning `false` and `T`
 meaning `true`. They have been refactored to use numeric fielddata, with `0`
 for `false` and `1` for `true`. As a consequence, the format of the responses of
 the following APIs changed when applied to boolean fields: `0`/`1` is returned
 instead of `F`/`T`:
 * <<search-request-fielddata-fields,fielddata fields>>
 * <<search-request-sort,sort values>>
 * <<search-aggregations-bucket-terms-aggregation,terms aggregations>>
 In addition, terms aggregations use a custom formatter for boolean (like for
 dates and ip addresses, which are also backed by numbers) in order to return
 the user-friendly representation of boolean fields: `false`/`true`:
 [source,js]
 ---------------
 "buckets": [
  {
     "key": 0,
     "key_as_string": "false",
     "doc_count": 42
  },
  {
     "key": 1,
     "key_as_string": "true",
     "doc_count": 12
  }
 ]
 ---------------
 ==== `index_name` and `path` removed
 The `index_name` setting was used to change the name of the Lucene field,
 and the `path` setting was used on `object` fields to determine whether the
 Lucene field should use the full path (including parent object fields), or
 just the final `name`.
 These settings have been removed as their purpose is better served with the
 <<copy-to>> parameter.
 ==== Murmur3 Fields
 Fields of type `murmur3` can no longer change the `doc_values` or `index` settings.
 They are always mapped as follows:
 [source,js]
 ---------------------
 {
  "type":       "murmur3",
  "index":      "no",
  "doc_values": true
 }
 ---------------------
 ==== Mappings in config files not supported
 The ability to specify mappings in configuration files has been removed. To
 specify default mappings that apply to multiple indexes, use
 <<indices-templates,index templates>> instead.
 Along with this change, the following settings have been removed:
 * `index.mapper.default_mapping_location`
 * `index.mapper.default_percolator_mapping_location`
 ==== Posting and doc-values codecs
 It is no longer possible to specify per-field postings and doc values formats
 in the mappings. This setting will be ignored on indices created before 2.0
 and will cause mapping parsing to fail on indices created on or after 2.0. For
 old indices, this means that new segments will be written with the default
 postings and doc values formats of the current codec.
 It is still possible to change the whole codec by using the `index.codec`
 setting. Please however note that using a non-default codec is discouraged as
 it could prevent future versions of Elasticsearch from being able to read the
 index.
 ==== Compress and compress threshold
 The `compress` and `compress_threshold` options have been removed from the
 `_source` field and fields of type `binary`.  These fields are compressed by
 default.  If you would like to increase compression levels, use the new
 <<index-codec,`index.codec: best_compression`>> setting instead.
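 For example, the codec could be chosen when the index is created.  A minimal
 sketch, assuming a hypothetical index named `my_index`:
 [source,js]
 ---------------
 PUT my_index
 {
  "settings": {
    "index.codec": "best_compression"
  }
 }
 ---------------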
--- a/docs/reference/migration/migrate_2_0/packaging.asciidoc
+++ b/docs/reference/migration/migrate_2_0/packaging.asciidoc
@ -0,0 +1,58 @@
 === Plugin and packaging changes
 ==== Symbolic links and paths
 Elasticsearch 2.0 runs with the Java security manager enabled and is much more
 restrictive about which paths it is allowed to access.  Various paths can be
 configured, e.g. `path.data`, `path.scripts`, `path.repo`.  A configured path
 may itself be a symbolic link, but no symlinks under that path will be
 followed (with the exception of `path.scripts`, which does follow symlinks).
 ==== Running `/bin/elasticsearch`
 The command line parameter parsing has been rewritten to deal properly with
 spaces in parameters. All config settings can still be specified on the
 command line when starting Elasticsearch, but they must appear after the
 built-in "static parameters", such as `-d` (to daemonize) and `-p` (the PID path).
 For instance:
 [source,sh]
 -----------
 /bin/elasticsearch -d -p /tmp/foo.pid --http.cors.enabled=true --http.cors.allow-origin='*'
 -----------
 For a list of static parameters, run `/bin/elasticsearch -h`.
 ==== `-f` removed
 The `-f` parameter, which used to indicate that Elasticsearch should be run in
 the foreground, was deprecated in 1.0 and removed in 2.0.
 ==== `V` for version
 The `-v` parameter now means `--verbose` for both `bin/plugin` and
 `bin/elasticsearch` (although it has no effect on the latter).  To output the
 version, use `-V` or `--version` instead.
 ==== Plugin manager should run as root
 The permissions of the `config`, `bin`, and `plugins` directories in the RPM
 and deb packages have been made more restrictive.  The plugin manager should
 be run as root otherwise it will not be able to install plugins.
 ==== Support for official plugins
 Almost all of the official Elasticsearch plugins have been moved to the main
 `elasticsearch` repository. They will be released at the same time as
 Elasticsearch and have the same version number as Elasticsearch.
 Official plugins can be installed as follows:
 [source,sh]
 ---------------
 sudo bin/plugin install analysis-icu
 ---------------
 Community-provided plugins can be installed as before.
--- a/docs/reference/migration/migrate_2_0/parent_child.asciidoc
+++ b/docs/reference/migration/migrate_2_0/parent_child.asciidoc
@ -0,0 +1,43 @@
 === Parent/Child changes
 Parent/child has been rewritten completely to reduce memory usage and to
 execute `has_child` and `has_parent` queries faster and more efficiently. The
 `_parent` field uses doc values by default. The refactored and improved
 implementation is only active for indices created on or after version 2.0.
 In order to benefit from all the performance and memory improvements, we
 recommend reindexing all existing indices that use the `_parent` field.
 ==== Parent type cannot pre-exist
 A mapping type is declared as a child of another mapping type by specifying
 the `_parent` meta field:
 [source,js]
 --------------------------
 DELETE *
 PUT my_index
 {
  "mappings": {
    "my_parent": {},
    "my_child": {
      "_parent": {
        "type": "my_parent" <1>
      }
    }
  }
 }
 --------------------------
 <1> The `my_parent` type is the parent of the `my_child` type.
 The mapping for the parent type can be added at the same time as the mapping
 for the child type, but cannot be added before the child type.
 ==== `top_children` query removed
 The `top_children` query has been removed in favour of the `has_child` query.
 It wasn't always faster than the `has_child` query and the results were usually
 inaccurate. The total hits and any aggregations in the same search request
 would be incorrect if `top_children` was used.
--- a/docs/reference/migration/migrate_2_0/query_dsl.asciidoc
+++ b/docs/reference/migration/migrate_2_0/query_dsl.asciidoc
@ -0,0 +1,186 @@
 === Query DSL changes
 ==== Queries and filters merged
 Queries and filters have been merged -- all filter clauses are now query
 clauses. Query clauses can now be used in either _query context_ or in
 _filter context_:
 Query context::
 A query used in query context will calculate relevance scores and will not be
 cacheable.  Query context is used whenever filter context does not apply.
 Filter context::
 +
 --
 A query used in filter context will not calculate relevance scores, and will
 be cacheable. Filter context is introduced by:
 * the `constant_score` query
 * the `must_not` and (newly added) `filter` parameter in the `bool` query
 * the `filter` and `filters` parameters in the `function_score` query
 * any API called `filter`, such as the `post_filter` search parameter, or in
  aggregations or index aliases
 --
 As a result of this change, the `execution` option of the `terms` filter is now
 deprecated and ignored if provided.
 ==== `or` and `and` now implemented via `bool`
 The `or` and `and` filters previously had a different execution pattern to the
 `bool` filter. It used to be important to use `and`/`or` with certain filter
 clauses, and `bool` with others.
 This distinction has been removed: the `bool` query is now smart enough to
 handle both cases optimally.  As a result of this change, the `or` and `and`
 filters are now syntactic sugar, executed internally as a `bool` query.
 These filters may be removed in the future.
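 An `and` or `or` filter can be expressed directly with the `bool` query.  The
 sketch below is one possible rewrite, with hypothetical `status` and `tag`
 fields: the clauses of an `and` become entries in `filter`, and the clauses
 of an `or` become `should` clauses of a nested `bool` query:
 [source,js]
 -------------------------
 GET _search
 {
  "query": {
    "bool": {
      "filter": [ <1>
        { "term": { "status": "published" } },
        {
          "bool": {
            "should": [ <2>
              { "term": { "tag": "news" } },
              { "term": { "tag": "sport" } }
            ]
          }
        }
      ]
    }
  }
 }
 -------------------------
 <1> Every entry in `filter` must match, as with the `and` filter.
 <2> At least one `should` clause must match, as with the `or` filter.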
 ==== `filtered` query and `query` filter deprecated
 The `query` filter is deprecated as it is no longer needed -- all queries can
 be used in query or filter context.
 The `filtered` query is deprecated in favour of the `bool` query. Instead of
 the following:
 [source,js]
 -------------------------
 GET _search
 {
  "query": {
    "filtered": {
      "query": {
        "match": {
          "text": "quick brown fox"
        }
      },
      "filter": {
        "term": {
          "status": "published"
        }
      }
    }
  }
 }
 -------------------------
 move the query and filter to the `must` and `filter` parameters in the `bool`
 query:
 [source,js]
 -------------------------
 GET _search
 {
  "query": {
    "bool": {
      "must": {
        "match": {
          "text": "quick brown fox"
        }
      },
      "filter": {
        "term": {
          "status": "published"
        }
      }
    }
  }
 }
 -------------------------
 ==== Filter auto-caching
 It used to be possible to control which filters were cached with the `_cache`
 option and to provide a custom `_cache_key`.  These options are deprecated
 and, if present, will be ignored.
 Query clauses used in filter context are now auto-cached when it makes sense
 to do so.  The algorithm takes into account the frequency of use, the cost of
 query execution, and the cost of building the filter.
 The `terms` filter lookup mechanism no longer caches the values of the
 document containing the terms.  It relies on the filesystem cache instead. If
 the lookup index is not too large, it is recommended to replicate it to all
 nodes by setting `index.auto_expand_replicas: 0-all` in order to remove the
 network overhead as well.
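 The setting can be applied to an existing lookup index through the update
 settings API.  A minimal sketch, assuming the lookup index is named
 `lookup_index` (a hypothetical name):
 [source,js]
 -------------------------
 PUT lookup_index/_settings
 {
  "index": {
    "auto_expand_replicas": "0-all"
  }
 }
 -------------------------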
 ==== Numeric queries use IDF for scoring
 Previously, term queries on numeric fields were deliberately prevented from
 using the usual Lucene scoring logic and this behaviour was undocumented and,
 to some, unexpected.
 Single `term` queries on numeric fields now score in the same way as string
 fields, using IDF and norms (if enabled).
 To query numeric fields without scoring, the query clause should be used in
 filter context, e.g. in the `filter` parameter of the `bool` query, or wrapped
 in a `constant_score` query:
 [source,js]
 ----------------------------
 GET _search
 {
  "query": {
    "bool": {
      "must": [
        {
          "match": { <1>
            "numeric_tag": 5
          }
        }
      ],
      "filter": [
        {
          "match": { <2>
            "count": 5
          }
        }
      ]
    }
  }
 }
 ----------------------------
 <1> This clause would include IDF in the relevance score calculation.
 <2> This clause would have no effect on the relevance score.
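 The `constant_score` alternative mentioned above can be sketched in the same
 way, reusing the `count` field from the previous example. Every matching
 document receives the same score:
 [source,js]
 ----------------------------
 GET _search
 {
  "query": {
    "constant_score": {
      "filter": {
        "term": {
          "count": 5
        }
      }
    }
  }
 }
 ----------------------------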
 ==== Fuzziness and fuzzy-like-this
 Fuzzy matching used to calculate the score for each fuzzy alternative, meaning
 that rare misspellings would have a higher score than the more common correct
 spellings. Now, fuzzy matching blends the scores of all the fuzzy alternatives
 to use the IDF of the most frequently occurring alternative.
 Fuzziness can no longer be specified using a percentage, but should instead
 use the number of allowed edits:
 * `0`, `1`, `2`, or
 * `AUTO` (which chooses `0`, `1`, or `2` based on the length of the term)
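 For illustration, a `match` query that sets `fuzziness` to `AUTO` might look
 like this. The `text` field and the misspelled query string are placeholders:
 [source,js]
 ----------------------------
 GET _search
 {
  "query": {
    "match": {
      "text": {
        "query": "quick brwn fox",
        "fuzziness": "AUTO"
      }
    }
  }
 }
 ----------------------------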
 The `fuzzy_like_this` and `fuzzy_like_this_field` queries used a very
 expensive approach to fuzzy matching and have been removed.
 ==== More Like This
 The More Like This (`mlt`) API and the `more_like_this_field` (`mlt_field`)
 query have been removed in favor of the
 <<query-dsl-mlt-query, `more_like_this`>> query.
 The parameter `percent_terms_to_match` has been removed in favor of
 `minimum_should_match`.
 ==== `limit` filter deprecated
 The `limit` filter is deprecated and becomes a no-op. You can achieve similar
 behaviour using the <<search-request-body,terminate_after>> parameter.
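 As a rough sketch, `terminate_after` is set at the top level of the search
 request body. The value `1000` and the query are placeholders. Like the old
 `limit` filter, the limit applies to each shard separately:
 [source,js]
 ----------------------------
 GET _search
 {
  "terminate_after": 1000,
  "query": {
    "term": {
      "status": "published"
    }
  }
 }
 ----------------------------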
 ==== Java plugins registering custom queries
 Java plugins that register custom queries can do so by using the
 `IndicesQueriesModule#addQuery(Class<? extends QueryParser>)` method. Other
 ways to register custom queries are not supported anymore.
--- a/docs/reference/migration/migrate_2_0/removals.asciidoc
+++ b/docs/reference/migration/migrate_2_0/removals.asciidoc
@ -0,0 +1,68 @@
 === Removed features
 ==== Rivers have been removed
 Elasticsearch does not support rivers anymore. We had first planned to keep
 them around to ease migration, but supporting rivers proved challenging
 because it conflicted with other important changes that we wanted to bring to
 2.0, such as synchronous dynamic mapping updates. We eventually decided to
 remove them entirely. See
 link:/blog/deprecating_rivers[Deprecating Rivers] for more background about
 why we took this decision.
 ==== Facets have been removed
 Facets, deprecated since 1.0, have now been removed.  Instead, use the much
 more powerful and flexible <<search-aggregations,aggregations>> framework.
 This also means that Kibana 3 will not work with Elasticsearch 2.0.
 ==== Delete-by-query is now a plugin
 The old delete-by-query functionality was fast but unsafe.  It could lead to
 document differences between the primary and replica shards, and could even
 produce out of memory exceptions and cause the cluster to crash.
 This feature has been reimplemented using the <<scroll-scan,scroll/scan>> and
 the <<docs-bulk,`bulk`>> API, which may be slower for queries which match
 large numbers of documents, but is safe.
 Currently, a long-running delete-by-query job cannot be cancelled, which is
 one of the reasons that this functionality is only available as a plugin.  You
 can install the plugin with:
 [source,sh]
 ------------------
 ./bin/plugin install delete-by-query
 ------------------
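 Once the plugin is installed, a request might look like the following sketch.
 The index name and the query are placeholders, and the `_query` endpoint shown
 here is assumed to match the one used by the 1.x API. Check the plugin
 documentation for the exact request format:
 [source,js]
 ------------------
 DELETE /my_index/_query
 {
  "query": {
    "term": {
      "status": "archived"
    }
  }
 }
 ------------------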
 ==== `_shutdown` API
 The `_shutdown` API has been removed without a replacement. Nodes should be
 managed via the operating system and the provided start/stop scripts.
 ==== `_size` is now a plugin
 The `_size` meta-data field, which indexes the size in bytes of the original
 JSON document, has been moved out of core and is available as a plugin.  It
 can be installed as:
 [source,sh]
 ------------------
 ./bin/plugin install mapper-size
 ------------------
 ==== Thrift and memcached transport
 The thrift and memcached transport plugins are no longer supported.  Instead, use
 either the HTTP transport (enabled by default) or the node or transport Java client.
 ==== Bulk UDP
 The bulk UDP API has been removed.  Instead, use the standard
 <<docs-bulk,`bulk`>> API, or use UDP to send documents to Logstash first.
 ==== MergeScheduler pluggability
 The merge scheduler is no longer pluggable.
--- a/docs/reference/migration/migrate_2_0/scripting.asciidoc
+++ b/docs/reference/migration/migrate_2_0/scripting.asciidoc
@ -0,0 +1,102 @@
 === Scripting changes
 ==== Scripting syntax
 The syntax for scripts has been made consistent across all APIs. The accepted
 format is as follows:
 Inline/Dynamic scripts::
 +
 --
 [source,js]
 ---------------
 "script": {
  "inline": "doc['foo'].value + val", <1>
  "lang":   "groovy", <2>
  "params": { "val": 3 } <3>
 }
 ---------------
 <1> The inline script to execute.
 <2> The optional language of the script.
 <3> Any named parameters.
 --
 Indexed scripts::
 +
 --
 [source,js]
 ---------------
 "script": {
  "id":     "my_script_id", <1>
  "lang":   "groovy", <2>
  "params": { "val": 3 } <3>
 }
 ---------------
 <1> The ID of the indexed script.
 <2> The optional language of the script.
 <3> Any named parameters.
 --
 File scripts::
 +
 --
 [source,js]
 ---------------
 "script": {
  "file":   "my_file", <1>
  "lang":   "groovy", <2>
  "params": { "val": 3 } <3>
 }
 ---------------
 <1> The filename of the script, without the `.lang` suffix.
 <2> The optional language of the script.
 <3> Any named parameters.
 --
 For example, an update request might look like this:
 [source,js]
 ---------------
 POST my_index/my_type/1/_update
 {
  "script": {
    "inline": "ctx._source.count += val",
    "params": { "val": 3 }
  },
  "upsert": {
    "count": 0
  }
 }
 ---------------
 A short syntax exists for running inline scripts in the default scripting
 language without any parameters:
 [source,js]
 ----------------
 GET _search
 {
  "script_fields": {
    "concat_fields": {
      "script": "doc['one'].value + ' ' + doc['two'].value"
    }
  }
 }
 ----------------
 ==== Scripting settings
 The `script.disable_dynamic` node setting has been replaced by fine-grained
 script settings described in <<migration-script-settings>>.
 ==== Groovy scripts sandbox
 The Groovy sandbox and related settings have been removed. Groovy is now a
 non-sandboxed scripting language, without any option to turn the sandbox on.
 ==== Plugins making use of scripts
 Plugins that make use of scripts must register their own script context
 through `ScriptModule`. Script contexts can be used as part of fine-grained
 settings to enable/disable scripts selectively.
--- a/docs/reference/migration/migrate_2_0/search.asciidoc
+++ b/docs/reference/migration/migrate_2_0/search.asciidoc
@ -0,0 +1,121 @@
 === Search changes
 ==== Partial fields
 Partial fields have been removed in favor of <<search-request-source-filtering,source filtering>>.
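 A minimal sketch of source filtering, assuming documents with top-level
 objects named `obj1` and `obj2` (both names are placeholders):
 [source,js]
 ---------------
 GET _search
 {
  "_source": [ "obj1.*", "obj2.*" ],
  "query": {
    "match_all": {}
  }
 }
 ---------------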
 ==== `search_type=count` deprecated
 The `count` search type has been deprecated. All benefits from this search
 type can now be achieved by using the (default) `query_then_fetch` search type
 and setting `size` to `0`.
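 For example, an aggregation-only request could be written as follows. The
 index, aggregation, and field names are placeholders:
 [source,js]
 ---------------
 GET my_index/_search
 {
  "size": 0,
  "aggs": {
    "statuses": {
      "terms": {
        "field": "status"
      }
    }
  }
 }
 ---------------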
 ==== The count API internally uses the search API
 The count API is now a shortcut to the search API with `size` set to 0. As a
 result, a total failure will result in an exception being returned rather
 than a normal response with `count` set to `0` and shard failures.
 ==== All stored meta-fields returned by default
 Previously, meta-fields like `_routing`, `_timestamp`, etc. would only be
 included in the search results if specifically requested with the `fields`
 parameter.  Now, all meta-fields which have stored values will be returned by
 default.  Additionally, they are now returned at the top level (along with
 `_index`, `_type`, and `_id`) instead of in the `fields` element.
 For instance, the following request:
 [source,sh]
 ---------------
 GET /my_index/_search?fields=foo
 ---------------
 might return:
 [source,js]
 ---------------
 {
   [...]
   "hits": {
      "total": 1,
      "max_score": 1,
      "hits": [
         {
            "_index":     "my_index",
            "_type":      "my_type",
            "_id":        "1",
            "_score":     1,
            "_timestamp": 10000000, <1>
            "fields": {
              "foo" : [ "bar" ]
            }
         }
      ]
   }
 }
 ---------------
 <1> The `_timestamp` is returned by default, and at the top level.
 ==== Script fields
 Script fields in 1.x were only returned as a single value. Even if the return
 value of a script was a list, it would be returned as an array containing an
 array:
 [source,js]
 ---------------
 "fields": {
  "my_field": [
    [
      "v1",
      "v2"
    ]
  ]
 }
 ---------------
 In Elasticsearch 2.0, scripts that return a list of values are treated as
 multivalued fields. The same example would return the following response, with
 values in a single array:
 [source,js]
 ---------------
 "fields": {
  "my_field": [
    "v1",
    "v2"
  ]
 }
 ---------------
 ==== Timezone for date field
 Specifying the `time_zone` parameter in queries or aggregations on fields of
 type `date` must now be either an ISO 8601 UTC offset, or a timezone id. For
 example, the value `+1:00` must now be written as `+01:00`.
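 As an illustration, a `range` query using the new offset format might look
 like this. The `timestamp` field and the dates are placeholders:
 [source,js]
 ---------------
 GET _search
 {
  "query": {
    "range": {
      "timestamp": {
        "gte": "2015-01-01T00:00:00",
        "lte": "now",
        "time_zone": "+01:00"
      }
    }
  }
 }
 ---------------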
 ==== Only highlight queried fields
 The default value for the `require_field_match` option has changed from
 `false` to `true`, meaning that the highlighters will, by default, only take
 the fields that were queried into account.
 This means that, when querying the `_all` field, trying to highlight on any
 field other than `_all`  will produce no highlighted snippets. Querying the
 same fields that need to be highlighted is the cleaner solution to get
 highlighted snippets back. Otherwise, the `require_field_match` option can be
 set to `false` to ignore field names completely when highlighting.
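 A sketch of the second approach: the query targets `_all`, while highlighting
 is requested on a hypothetical `text` field with `require_field_match`
 disabled:
 [source,js]
 ---------------
 GET _search
 {
  "query": {
    "match": {
      "_all": "quick brown fox"
    }
  },
  "highlight": {
    "require_field_match": false,
    "fields": {
      "text": {}
    }
  }
 }
 ---------------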
 The postings highlighter doesn't support the `require_field_match` option
 anymore; it will only highlight fields that were queried.
 ==== Postings highlighter doesn't support `match_phrase_prefix`
 The `match` query with type set to `phrase_prefix` (or the
 `match_phrase_prefix` query) is not supported by the postings highlighter. No
 highlighted snippets will be returned.
--- a/docs/reference/migration/migrate_2_0/settings.asciidoc
+++ b/docs/reference/migration/migrate_2_0/settings.asciidoc
@ -0,0 +1,125 @@
 === Setting changes
 [[migration-script-settings]]
 ==== Scripting settings
 The `script.disable_dynamic` node setting has been replaced by fine-grained
 script settings described in the <<enable-dynamic-scripting,scripting docs>>.
 The following setting previously used to enable dynamic or inline scripts:
 [source,yaml]
 ---------------
 script.disable_dynamic: false
 ---------------
 It should be replaced with the following two settings in `elasticsearch.yml` that
 achieve the same result:
 [source,yaml]
 ---------------
 script.inline: on
 script.indexed: on
 ---------------
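 The fine-grained settings also allow narrower combinations. The following is
 only a sketch, assuming the `script.engine.{lang}.{source}.{operation}` form
 described in the scripting docs. The particular choices are illustrative:
 [source,yaml]
 ---------------
 # Disable inline and indexed scripts in general ...
 script.inline: off
 script.indexed: off
 # ... but allow inline Groovy scripts in aggregations
 script.engine.groovy.inline.aggs: on
 ---------------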
 ==== Units required for time and byte-sized settings
 Any settings which accept time or byte values must now be specified with
 units.  Without units, it is too easy to set the `refresh_interval` to 1
 *millisecond* instead of 1 second:
 [source,js]
 ---------------
 PUT _settings
 {
  "index.refresh_interval": 1
 }
 ---------------
 In 2.0, the above request will throw an exception. Instead the refresh
 interval should be set to `"1s"` for one second.
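 The corrected request, with an explicit unit, would be:
 [source,js]
 ---------------
 PUT _settings
 {
  "index.refresh_interval": "1s"
 }
 ---------------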
 ==== Shadow replica settings
 The `node.enable_custom_paths` setting has been removed and replaced by the
 `path.shared_data` setting to allow shadow replicas with custom paths to work
 with the security manager. For example, if your previous configuration had:
 [source,yaml]
 ------
 node.enable_custom_paths: true
 ------
 And you created an index using shadow replicas with `index.data_path` set to
 `/opt/data/my_index` with the following:
 [source,js]
 --------------------------------------------------
 PUT /my_index
 {
  "index": {
    "number_of_shards": 1,
    "number_of_replicas": 4,
    "data_path": "/opt/data/my_index",
    "shadow_replicas": true
  }
 }
 --------------------------------------------------
 For 2.0, you will need to set `path.shared_data` to a parent directory of the
 index's data_path, so:
 [source,yaml]
 -----------
 path.shared_data: /opt/data
 -----------
 ==== Resource watcher settings renamed
 The setting names for configuring the resource watcher have been renamed
 to prevent clashes with the watcher plugin:
 * `watcher.enabled` is now `resource.reload.enabled`
 * `watcher.interval` is now `resource.reload.interval`
 * `watcher.interval.low` is now `resource.reload.interval.low`
 * `watcher.interval.medium` is now `resource.reload.interval.medium`
 * `watcher.interval.high` is now `resource.reload.interval.high`
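 For example, an `elasticsearch.yml` entry would change as sketched below. The
 value `5s` is only a placeholder:
 [source,yaml]
 ---------------
 # 1.x setting name (no longer recognised):
 # watcher.interval.high: 5s
 # 2.0 setting name:
 resource.reload.interval.high: 5s
 ---------------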
 ==== Hunspell dictionary configuration
 The parameter `indices.analysis.hunspell.dictionary.location` has been
 removed, and `<path.conf>/hunspell` is always used.
 ==== CORS allowed origins
 The CORS allowed origins setting, `http.cors.allow-origin`, no longer has a default value. Previously, the default value
 was `*`, which would allow CORS requests from any origin and is considered insecure. The `http.cors.allow-origin` setting
 should be specified with only the origins that should be allowed, like so:
 [source,yaml]
 ---------------
 http.cors.allow-origin: /https?:\/\/localhost(:[0-9]+)?/
 ---------------
 ==== JSONP support
 JSONP callback support has now been removed. CORS should be used to access Elasticsearch
 over AJAX instead:
 [source,yaml]
 ---------------
 http.cors.enabled: true
 http.cors.allow-origin: /https?:\/\/localhost(:[0-9]+)?/
 ---------------
 ==== In memory indices
 The `memory` / `ram` store (`index.store.type`) option has been removed.
 In-memory indices are no longer supported.
 ==== Log messages truncated
 Log messages are now truncated at 10,000 characters. This can be changed in
 the `logging.yml` configuration file with the `file.layout.conversionPattern`
 setting.
 The `mapping.date.round_ceil` setting for date math parsing has been removed
 (#8889; issues: #8556, #8598).
--- a/docs/reference/migration/migrate_2_0/snapshot_restore.asciidoc
+++ b/docs/reference/migration/migrate_2_0/snapshot_restore.asciidoc
@ -0,0 +1,37 @@
 === Snapshot and Restore changes
 ==== File-system repositories must be whitelisted
 Locations of the shared file system repositories and the URL repositories with
 `file:` URLs now have to be registered before starting Elasticsearch using the
 `path.repo` setting. The `path.repo` setting can contain one or more
 repository locations:
 [source,yaml]
 ---------------
 path.repo: ["/mnt/daily", "/mnt/weekly"]
 ---------------
 If the repository location is specified as an absolute path it has to start
 with one of the locations specified in `path.repo`. If the location is
 specified as a relative path, it will be resolved against the first location
 specified in the `path.repo` setting.
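 With the `path.repo` setting shown above, a shared file system repository
 could then be registered as follows. The repository name `my_backup` is a
 placeholder. A relative `location` of `my_backup` would resolve to the same
 directory, because `/mnt/daily` is the first entry in `path.repo`:
 [source,js]
 ---------------
 PUT _snapshot/my_backup
 {
  "type": "fs",
  "settings": {
    "location": "/mnt/daily/my_backup"
  }
 }
 ---------------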
 ==== URL repositories must be whitelisted
 URL repositories with `http:`, `https:`, and `ftp:` URLs have to be
 whitelisted before starting Elasticsearch with the
 `repositories.url.allowed_urls` setting. This setting supports wildcards in
 the place of host, path, query, and fragment. For example:
 [source,yaml]
 -----------------------------------
 repositories.url.allowed_urls: ["http://www.example.org/root/*", "https://*.mydomain.com/*?*#*"]
 -----------------------------------
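 A URL repository whose address matches the first pattern above might be
 registered like this. The repository name and the path below `/root/` are
 placeholders:
 [source,js]
 -----------------------------------
 PUT _snapshot/my_url_repo
 {
  "type": "url",
  "settings": {
    "url": "http://www.example.org/root/backups"
  }
 }
 -----------------------------------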
 ==== Wildcard expansion
 The obsolete parameters `expand_wildcards_open` and `expand_wildcards_close`
 are no longer supported by the snapshot and restore operations. These
 parameters have been replaced by a single `expand_wildcards` parameter. See
 <<multi-index,the multi-index docs>> for more.
--- a/docs/reference/migration/migrate_2_0/stats.asciidoc
+++ b/docs/reference/migration/migrate_2_0/stats.asciidoc
@ -0,0 +1,57 @@
 === Stats, info, and `cat` changes
 ==== Sigar removed
 We no longer ship the Sigar library for operating system dependent statistics,
 as it no longer seems to be maintained.  Instead, we rely on the statistics
 provided by the JVM.  This has resulted in a number of changes to the nodes
 info and nodes stats responses:
 * `network.*` has been removed from nodes info and nodes stats.
 * `fs.*.dev` and `fs.*.disk*` have been removed from nodes stats.
 * `os.*` has been removed from nodes stats, except for `os.timestamp`,
  `os.load_average`, `os.mem.*`, and `os.swap.*`.
 * `os.mem.total` and `os.swap.total` have been removed from nodes info.
 * `process.mem.resident` and `process.mem.share` have been removed from nodes stats.
 ==== Removed `id_cache` from stats apis
 The `id_cache` metric has been removed from the nodes stats, indices stats and
 cluster stats APIs. This metric has also been removed from the shards cat,
 indices cat and nodes cat APIs. Parent/child memory is now reported under
 fielddata, because it has internally been using fielddata for a while now.
 To see how much memory parent/child related fielddata is using, the
 `fielddata_fields` option can be used on the stats APIs. Indices stats
 example:
 [source,js]
 --------------------------------------------------
 GET /_stats/fielddata?fielddata_fields=_parent
 --------------------------------------------------
 ==== Percolator stats
 The total time spent running percolator queries is now called `percolate.time`
 instead of `percolate.get_time`.
 ==== Cluster state REST API
 The cluster state API doesn't return the `routing_nodes` section anymore when
 `routing_table` is requested. The newly introduced `routing_nodes` flag can be
 used separately to control whether `routing_nodes` should be returned.
 ==== Index status API
 The deprecated index status API has been removed.
 ==== `cat` APIs verbose by default
 The `cat` APIs now default to being verbose, which means they output column
 headers by default. Verbosity can be turned off with the `v` parameter:
 [source,sh]
 -----------------
 GET _cat/shards?v=0
 -----------------
--- a/docs/reference/migration/migrate_2_0/striping.asciidoc
+++ b/docs/reference/migration/migrate_2_0/striping.asciidoc
@ -0,0 +1,20 @@
 === Multiple `path.data` striping
 Previously, if the `path.data` setting listed multiple data paths, then a
 shard would be ``striped'' across all paths by writing a whole file to each
 path in turn (in accordance with the `index.store.distributor` setting).  The
 result was that files from a single segment in a shard could be spread across
 multiple disks, and the failure of any one disk could corrupt multiple shards.
 This striping is no longer supported.  Instead, different shards may be
 allocated to different paths, but all of the files in a single shard will be
 written to the same path.
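 For example, with a `path.data` configuration like the following in
 `elasticsearch.yml` (the mount points are hypothetical), each shard is stored
 entirely under one of the two paths:
 [source,yaml]
 ---------------
 path.data: ["/mnt/disk1", "/mnt/disk2"]
 ---------------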
 If striping is detected while starting Elasticsearch 2.0.0 or later, *all of
 the files belonging to the same shard will be migrated to the same path*. If
 there is not enough disk space to complete this migration, the upgrade will be
 cancelled and can only be resumed once enough disk space is made available.
 The `index.store.distributor` setting has also been removed.