lucene/solr/contrib/extraction
Noble Paul 54e827e9b6 SOLR-8842: security rules made more foolproof by asking the requesthandler about the well known
permission name.
  The APIs are also modified to use 'index' as the unique identifier instead of name.
  Name is now an optional attribute, to be used only when specifying well-known permissions.
2016-03-17 23:36:18 +05:30
src SOLR-8842: security rules made more foolproof by asking the requesthandler about the well known 2016-03-17 23:36:18 +05:30
README.txt SOLR-3650: checkpoint, migrated CHANGES.txt for contrib/uima and contrib/extraction 2012-07-31 01:37:17 +00:00
build.xml SOLR-8180: jcl-over-slf4j is officially a solrj/solr dependency now; not marked optional in a POM. 2015-12-01 18:12:00 +00:00
ivy.xml SOLR-8180: jcl-over-slf4j is officially a solrj/solr dependency now; not marked optional in a POM. 2015-12-01 18:12:00 +00:00

README.txt

Apache Solr Content Extraction Library (Solr Cell)

Introduction
------------

Apache Solr Extraction provides a means of extracting and indexing content contained in "rich" documents, such
as Microsoft Word and Adobe PDF files.  (Each name is a trademark of its respective owner.)  This contrib module
uses Apache Tika to extract content and metadata from the files, which can then be indexed.  For more information,
see http://wiki.apache.org/solr/ExtractingRequestHandler
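
As a rough illustration of what a client-side extraction request can look like, the Python sketch below builds
(but does not send) an HTTP POST carrying the raw bytes of a rich document.  It assumes the handler is registered
at /update/extract, as in the wiki examples, and that the unique key is supplied through the literal.id parameter.
The function name, base URL and document id are hypothetical, and your solrconfig.xml may use a different path.

```python
import urllib.parse
import urllib.request


def build_extract_request(solr_base_url, file_bytes, doc_id,
                          content_type="application/octet-stream"):
    """Build a POST request that hands a rich document to the extraction handler.

    Assumption: the handler is mapped to <solr_base_url>/update/extract.
    """
    # literal.id supplies the unique key for the indexed document;
    # commit=true asks Solr to commit right after indexing.
    query = urllib.parse.urlencode({"literal.id": doc_id, "commit": "true"})
    url = solr_base_url.rstrip("/") + "/update/extract?" + query

    # The file content travels as the raw request body; Tika detects the format
    # on the server side, with the Content-Type header serving as a hint.
    return urllib.request.Request(
        url,
        data=file_bytes,
        headers={"Content-Type": content_type},
        method="POST",
    )
```

To send the request, pass the returned object to urllib.request.urlopen() once Solr is running with the handler
configured.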

Getting Started
---------------
You will need Solr up and running.  Then add the extraction JAR file, plus the Tika dependencies (in the ./lib folder),
to the lib directory of your Solr Home.  See http://wiki.apache.org/solr/ExtractingRequestHandler for more details on
hooking the handler in and configuring it.
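
The copy step described above could be scripted along the following lines.  This is only a sketch: the function
name is hypothetical, and the three paths (the extraction JAR, the folder holding the Tika dependencies, and the
Solr Home directory) are assumptions you must supply for your own installation.

```python
import shutil
from pathlib import Path


def install_extraction_jars(extraction_jar, tika_lib_dir, solr_home):
    """Copy the extraction JAR and every Tika dependency JAR into <solr_home>/lib.

    Returns the names of the copied files, extraction JAR first.
    """
    target = Path(solr_home) / "lib"
    # Create the lib directory if this Solr Home does not have one yet.
    target.mkdir(parents=True, exist_ok=True)

    # Only *.jar files are taken from the dependency folder; anything else is ignored.
    jars = [Path(extraction_jar)] + sorted(Path(tika_lib_dir).glob("*.jar"))
    for jar in jars:
        shutil.copy2(jar, target / jar.name)
    return [jar.name for jar in jars]
```

Solr typically needs a restart (or a core reload) before newly added JARs are picked up.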