6 Commits

Author SHA1 Message Date
David Pilato 2b15d20f93 Remove support for Visio and POTM files
We never actually supported Visio files, but we fail hard (killing a node) when that kind of file is provided.
See https://github.com/elastic/elasticsearch/pull/22079#issuecomment-277035357

This commit excludes Visio parsing from Tika so that it no longer fails but returns empty content instead.

As a side effect, it also removes support for POTM files.

Closes .
2017-02-03 13:03:52 +01:00
David Pilato 8701f7a3ce Add missing mime4j library
In some cases (apparently with Outlook files), the mime4j library is needed.
We removed it in the past, which can cause Elasticsearch to crash when you use ingest-attachment (and probably mapper-attachments as well in the 2.x series) with a file that requires this library.

Similar problem to the one reported at .
2017-01-24 10:25:02 +01:00
David Pilato 7517c50698 Update to Tika 1.14
Closes .
2016-11-16 11:29:14 +01:00
Alexander Reelsen 3c2e51d831 Deps: Update ingest-attachment to latest libraries ()
Also added a test with a regular PDF,
instead of only an encrypted one with an expected exception.
2016-10-10 12:55:05 +02:00
Ryan Ernst 1d40c4bbc1 Make java9 work again
This change makes ES compile with java9 again, build 118.
* There are a handful of changes due to failure to determine types during compile.
* The attachment plugins which use tika needed to have tika upgraded in order to pick up fixes there for java 9.
* azure discovery and s3 repository indirectly depend on jaxb, which is no longer in the default modules. They now add a jaxb dependency externally, and make JarHell allow for this package.
2016-05-21 09:41:51 -07:00
Alexander Reelsen 0d4711c2fc Ingest: Add attachment processor
This is a simple port of the mapper attachment plugin to the ingest
functionality, no new features. The only option is to limit
the number of chars to prevent indexing of huge documents.

Fields can be selected in the processor as well.

Close 
2016-02-09 17:03:30 +01:00