[test] Add standalone runner

It can sometimes be useful to have a standalone runner to see exactly how Tika extracts content from a given file.

You can run the `StandaloneRunner` class with the following arguments:

*  `-u file://URL/TO/YOUR/DOC`
*  `--size`: sets the extracted size (defaults to the mapper attachment size)
*  `BASE64`: the Base64-encoded binary content of the document

Example:

```sh
StandaloneRunner BASE64Text
StandaloneRunner -u /tmp/mydoc.pdf
StandaloneRunner -u /tmp/mydoc.pdf --size 1000000
```

It produces something like:

```
## Extracted text
--------------------- BEGIN -----------------------
This is the extracted text
---------------------- END ------------------------
## Metadata
- author: null
- content_length: null
- content_type: application/pdf
- date: null
- keywords: null
- language: null
- name: null
- title: null
```
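
For the `BASE64` form, the document has to be encoded before it is passed as an argument. The following is a minimal sketch, not part of this commit: `encode_file_base64` is a hypothetical helper name, and it assumes a `base64` command-line tool is available on the machine. How `StandaloneRunner` itself is launched (IDE run configuration, classpath) is left as in the examples above.

```sh
# Hypothetical helper (not part of this commit): print the content of the
# file given as $1 as a single-line Base64 string.
encode_file_base64() {
  # Some base64 implementations wrap their output; tr removes the newlines
  # so the result can be passed as one argument.
  base64 < "$1" | tr -d '\n'
}

# Usage, mirroring the examples above:
# StandaloneRunner "$(encode_file_base64 /tmp/mydoc.pdf)"
```

For large documents the encoded string may exceed the shell's argument length limit, in which case the `-u` form is probably the more practical choice.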

Closes #99.
(cherry picked from commit 720b3bf)
(cherry picked from commit 990fa15)
David Pilato 2015-02-09 17:43:59 +01:00
parent c353936b58
commit 931be57da9
1 changed files with 36 additions and 0 deletions

@@ -311,6 +311,42 @@ It gives back:
} }
``` ```
Stand alone runner
------------------
If you want to run some tests within your IDE, you can use the `StandaloneRunner` class.
It accepts the following arguments:

* `-u file://URL/TO/YOUR/DOC`
* `--size`: sets the extracted size (defaults to the mapper attachment size)
* `BASE64`: the Base64-encoded binary content of the document

Example:
```sh
StandaloneRunner BASE64Text
StandaloneRunner -u /tmp/mydoc.pdf
StandaloneRunner -u /tmp/mydoc.pdf --size 1000000
```
It produces something like:
```
## Extracted text
--------------------- BEGIN -----------------------
This is the extracted text
---------------------- END ------------------------
## Metadata
- author: null
- content_length: null
- content_type: application/pdf
- date: null
- keywords: null
- language: null
- name: null
- title: null
```
License
-------