OpenSearch/plugins
David Pilato 87553bba16
Add ingest-attachment support for per document `indexed_chars` limit (#28977)
Today we support only a global `indexed_chars` processor parameter, but in some cases users would like to set this limit per document.
The mapper-attachments plugin used to support this by extracting the limit value from a meta field in the document sent for indexing.

This change adds a setting named `indexed_chars_field`, which names the
document field to read the limit value from.

This allows running:

```
PUT _ingest/pipeline/attachment
{
  "description" : "Extract attachment information. Used to parse pdf and office files",
  "processors" : [
    {
      "attachment" : {
        "field" : "data",
        "indexed_chars_field" : "size"
      }
    }
  ]
}
```

Then index either:

```
PUT index/doc/1?pipeline=attachment
{
  "data": "BASE64"
}
```

This will use the default limit (or the one defined by `indexed_chars`).

Or, to limit extraction for this document to the value of its `size` field (1000 characters):

```
PUT index/doc/2?pipeline=attachment
{
  "data": "BASE64",
  "size": 1000
}
```
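
The fallback between the per-document field and the processor-level limit can be sketched as follows. This is only an illustration of the behaviour described above, not the plugin's actual code: `resolve_indexed_chars` is a hypothetical helper, and the default of 100000 is assumed to match the processor's default `indexed_chars` value.

```python
from typing import Any, Mapping, Optional

# Assumption: 100000 is the processor's default `indexed_chars` value.
DEFAULT_INDEXED_CHARS = 100_000


def resolve_indexed_chars(
    doc: Mapping[str, Any],
    indexed_chars: int = DEFAULT_INDEXED_CHARS,
    indexed_chars_field: Optional[str] = None,
) -> int:
    """Return the character limit to apply to one document (hypothetical helper)."""
    if indexed_chars_field is not None:
        per_doc_limit = doc.get(indexed_chars_field)
        if per_doc_limit is not None:
            # The document carries its own limit, so it takes precedence.
            return int(per_doc_limit)
    # Field not configured, or absent from this document:
    # fall back to the processor-level limit.
    return indexed_chars


doc_1 = {"data": "BASE64"}                # no "size" field
doc_2 = {"data": "BASE64", "size": 1000}  # per-document limit

print(resolve_indexed_chars(doc_1, indexed_chars_field="size"))  # 100000
print(resolve_indexed_chars(doc_2, indexed_chars_field="size"))  # 1000
```

Document 1 has no `size` field, so the processor-level limit applies. Document 2 supplies its own limit, which overrides it.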

Closes #28942
2018-03-14 19:07:20 +01:00
analysis-icu Remove the `update_all_types` option. (#28288) 2018-01-22 12:03:07 +01:00
analysis-kuromoji upgrade to lucene 7.2.1 (#28218) 2018-01-15 16:47:46 +01:00
analysis-phonetic Fix daitch_mokotoff phonetic filter to use the dedicated Lucene filter (#28225) 2018-01-15 19:35:54 +01:00
analysis-smartcn upgrade to lucene 7.2.1 (#28218) 2018-01-15 16:47:46 +01:00
analysis-stempel upgrade to lucene 7.2.1 (#28218) 2018-01-15 16:47:46 +01:00
analysis-ukrainian upgrade to lucene 7.2.1 (#28218) 2018-01-15 16:47:46 +01:00
discovery-azure-classic Require JDK 9 for compilation (#28071) 2018-01-16 13:45:13 -05:00
discovery-ec2 Copy Lucene IOUtils (#29012) 2018-03-13 12:49:33 -04:00
discovery-file Remove some unused code (#27792) 2017-12-13 16:45:55 +01:00
discovery-gce Copy Lucene IOUtils (#29012) 2018-03-13 12:49:33 -04:00
examples Replace jvm-example by two plugin examples (#28339) 2018-01-26 17:34:24 +01:00
ingest-attachment Add ingest-attachment support for per document `indexed_chars` limit (#28977) 2018-03-14 19:07:20 +01:00
ingest-geoip Copy Lucene IOUtils (#29012) 2018-03-13 12:49:33 -04:00
ingest-user-agent Use non deprecated xcontenthelper (#28503) 2018-02-05 16:18:18 -07:00
mapper-murmur3 Add support for filtering mappings fields (#27603) 2017-12-05 20:31:29 +01:00
mapper-size Remove the `update_all_types` option. (#28288) 2018-01-22 12:03:07 +01:00
repository-azure Copy Lucene IOUtils (#29012) 2018-03-13 12:49:33 -04:00
repository-gcs [Test] GoogleCloudStorageFixture command line is too long on Windows (#28991) 2018-03-12 18:02:30 +01:00
repository-hdfs Fix third-party audit tasks on JDK 8 2018-01-16 22:59:29 -05:00
repository-s3 Remove redundant argument for buildConfiguration of s3 plugin (#28281) 2018-01-23 22:32:46 -08:00
store-smb Validate top-level keys for create index request (#23755) (#23869) 2017-09-26 09:49:20 -07:00
transport-nio Remove NioNotEnabledBootstrapCheck bootstrap check (#28901) 2018-03-08 11:06:36 -07:00
build.gradle Plugins: Include license and notice files in zip (#23191) 2017-02-15 11:23:12 -08:00