OpenSearch/modules
Adrien Grand 31158ab3d5
Add per-field metadata. (#50333)
This PR adds per-field metadata that can be set in the mappings and is later
returned by the field capabilities API. This metadata is completely opaque to
Elasticsearch but may be used by tools that index data into Elasticsearch to
communicate metadata about fields to tools that then search this data. A
typical example that has been requested in the past is the ability to attach
a unit to a numeric field.

In order to not bloat the cluster state, Elasticsearch requires that this
metadata be small:
 - keys can't be longer than 20 chars,
 - values can only be numbers or strings of no more than 50 chars - no inner
   arrays or objects,
 - the metadata can't have more than 5 keys in total.
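
The limits above can be sketched as a client-side check that a tool might run before sending mappings. This is a minimal illustration, not the validation code from this PR: the function name `validate_meta` and the constant names are hypothetical. It also assumes that the 50-character limit applies to string values only and that booleans do not count as numbers.

```python
# Hypothetical client-side check mirroring the limits listed above.
# Names are illustrative; this is not Elasticsearch's own validation code.
MAX_KEYS = 5
MAX_KEY_LENGTH = 20
MAX_VALUE_LENGTH = 50


def validate_meta(meta):
    """Return a list of violation messages; an empty list means `meta` passes."""
    if not isinstance(meta, dict):
        return ["meta must be an object"]

    errors = []
    if len(meta) > MAX_KEYS:
        errors.append(f"meta has {len(meta)} keys, at most {MAX_KEYS} allowed")

    for key, value in meta.items():
        if len(key) > MAX_KEY_LENGTH:
            errors.append(f"key '{key}' is longer than {MAX_KEY_LENGTH} chars")
        # Assumption: booleans are rejected even though Python treats
        # bool as a subclass of int.
        if isinstance(value, bool) or not isinstance(value, (str, int, float)):
            errors.append(f"value for '{key}' must be a number or a string")
        elif isinstance(value, str) and len(value) > MAX_VALUE_LENGTH:
            errors.append(f"value for '{key}' is longer than {MAX_VALUE_LENGTH} chars")
    return errors
```

For example, `validate_meta({"unit": "ms"})` returns an empty list, while a nested value such as `{"unit": ["ms"]}` yields one violation.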

Given that metadata is opaque to Elasticsearch, field capabilities don't try to
do anything smart when merging metadata about multiple indices: the union of
all field metadata is returned.

Here is what the meta might look like in mappings:

```json
{
  "properties": {
    "latency": {
      "type": "long",
      "meta": {
        "unit": "ms"
      }
    }
  }
}
```

And then in the field capabilities response:

```json
{
  "latency": {
    "long": {
      "searchable": true,
      "aggregatable": true,
      "meta": {
        "unit": [ "ms" ]
      }
    }
  }
}
```

When there are no conflicts, values are arrays of size 1, but when there are
conflicts, Elasticsearch includes all unique values in this array, with no
way to tell which index has which metadata value:

```json
{
  "latency": {
    "long": {
      "searchable": true,
      "aggregatable": true,
      "meta": {
        "unit": [ "ms", "ns" ]
      }
    }
  }
}
```
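
The merging behaviour described above can be sketched as a plain union of values per key. This is only an illustration of the semantics, not the code from this PR: the function name `merge_field_meta` is hypothetical, and the first-seen ordering of values is an assumption, since the description does not specify an order.

```python
# Hypothetical sketch of the union semantics; not the Elasticsearch code.
def merge_field_meta(per_index_meta):
    """Merge per-index `meta` dicts into {key: [unique values]}."""
    merged = {}
    for meta in per_index_meta:
        for key, value in meta.items():
            values = merged.setdefault(key, [])
            # Keep each distinct value once; first-seen order is an assumption.
            if value not in values:
                values.append(value)
    return merged


# Two indices agree on the unit: a single-element array.
agreeing = merge_field_meta([{"unit": "ms"}, {"unit": "ms"}])

# Two indices disagree: both values are kept, and the index that
# contributed each value is no longer recoverable.
conflicting = merge_field_meta([{"unit": "ms"}, {"unit": "ns"}])
```

A consumer can treat any array with more than one element as a conflict, but has to look at the mappings of the individual indices to find out which index uses which value.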

Closes #33267
2020-01-08 16:21:18 +01:00
aggs-matrix-stats Declare remaining parsers `final` (#50571) (#50615) 2020-01-03 11:48:11 -05:00
analysis-common Log deprecation for nGram and edgeNGram custom filters (#50376) (#50445) 2019-12-20 22:00:08 +01:00
ingest-common Sync grok patterns with logstash patterns (#50381) 2020-01-08 14:59:34 +01:00
ingest-geoip Allow list of IPs in geoip ingest processor (#49573) (#49947) 2019-12-07 00:19:09 +01:00
ingest-user-agent update ingest-user-agent regexes.yml (#47807) 2019-10-18 16:26:48 +02:00
lang-expression Scripting: ScriptFactory not required by compile (#50344) (#50392) 2019-12-19 12:50:25 -07:00
lang-mustache Scripting: ScriptFactory not required by compile (#50344) (#50392) 2019-12-19 12:50:25 -07:00
lang-painless [TEST] Unknown scripting annotations raise error (#50343) (#50346) 2019-12-19 16:22:22 -07:00
mapper-extras Add per-field metadata. (#50333) 2020-01-08 16:21:18 +01:00
parent-join Fix NPE bug inner_hits (#50709) 2020-01-07 14:21:54 -05:00
percolator Correctly handle MSM for nested disjunctions (#50669) 2020-01-07 09:32:30 +00:00
rank-eval Add a cluster setting to disallow loading fielddata on _id field (#49166) 2019-11-28 09:35:28 +01:00
reindex Use Void context on parsers where possible (#50573) (#50617) 2020-01-03 13:28:55 -05:00
repository-url Remove Unused Single Delete in BlobStoreRepository (#50024) (#50123) 2019-12-12 11:17:46 +01:00
systemd Extend systemd timeout during startup (#49784) 2019-12-03 14:25:45 -05:00
transport-netty4 Stop Allocating Buffers in CopyBytesSocketChannel (#49825) (#49832) 2019-12-04 19:36:52 +01:00
build.gradle Apply 2-space indent to all gradle scripts (#49071) 2019-11-14 11:01:23 +00:00