SOLR-6378: Fixed example/example-DIH/ issues with "tika" and "solr" configurations, and tidied up README.txt

git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1618878 13f79535-47bb-0310-9956-ffa450edef68
Erik Hatcher 2014-08-19 14:56:26 +00:00
parent 26fcb11272
commit f1feec3579
7 changed files with 37 additions and 13 deletions

@@ -294,6 +294,9 @@ Bug Fixes
 * SOLR-6314: Multi-threaded facet counts differ when SolrCloud has >1 shard (Erick Erickson)
+* SOLR-6378: Fixed example/example-DIH/ issues with "tika" and "solr" configurations, and tidied up README.txt
+(Daniel Shchyokin via ehatcher)
 Optimizations
 ---------------------

@@ -21,26 +21,30 @@ Change to the parent (example) directory. Start solr by executing the following
 > cd ..
 > java -Dsolr.solr.home="./example-DIH/solr/" -jar start.jar
-in this directory, and when Solr is started connect to
+in this directory, and when Solr is started connect to:
 http://localhost:8983/solr/
-To import data from the hsqldb database, connect to
+* To import data from the hsqldb database, connect to:
 http://localhost:8983/solr/db/dataimport?command=full-import
-To import data from the slashdot feed, connect to
+* To import data from an RSS feed, connect to:
 http://localhost:8983/solr/rss/dataimport?command=full-import
-To import data from your imap server
+* To import data from your IMAP server:
-1. Edit the example-DIH/solr/mail/conf/mail-data-config.xml and add details about username, password, imap server
+1. Edit the example-DIH/solr/mail/conf/mail-data-config.xml and add details about username, password, IMAP server
 2. Connect to http://localhost:8983/solr/mail/dataimport?command=full-import
-To copy data from db Solr core, connect to
+* To copy data from db Solr core, connect to:
 http://localhost:8983/solr/solr/dataimport?command=full-import
+* To index a full text document using Tika integration:
+http://localhost:8983/solr/tika/dataimport?command=full-import
 See also README.txt in the solr subdirectory, and check
 http://wiki.apache.org/solr/DataImportHandler for detailed
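
The import URLs listed in the README hunk above all follow one pattern, which can be sketched in Python. This is a minimal sketch, assuming Solr is running locally on port 8983 as the README describes. The helper names `dataimport_url` and `trigger_import` are hypothetical and are not part of Solr or of this commit.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: Solr was started from the example directory with
# -Dsolr.solr.home="./example-DIH/solr/" and listens on localhost:8983.
BASE_URL = "http://localhost:8983/solr"

# Core names taken from the URLs listed in the README.
CORES = ("db", "rss", "mail", "solr", "tika")


def dataimport_url(core, command="full-import", base_url=BASE_URL):
    """Build the DataImportHandler URL for one example core."""
    return f"{base_url}/{core}/dataimport?{urlencode({'command': command})}"


def trigger_import(core, timeout=30):
    """Request a full import and return the raw response body (hypothetical helper)."""
    with urlopen(dataimport_url(core), timeout=timeout) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    # Only prints the URLs; call trigger_import(core) against a running Solr.
    for core in CORES:
        print(dataimport_url(core))
```

Per the README, the mail core needs its mail-data-config.xml edited before an import will work, and the solr core copies from the db core, so it is probably most useful to import db first.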

@@ -1,2 +1,16 @@
-/*C1*/SET SCHEMA PUBLIC
+/*C2*/SET SCHEMA PUBLIC
 CONNECT USER SA
+SET AUTOCOMMIT FALSE
+/*C3*/SET SCHEMA PUBLIC
+CONNECT USER SA
+SET AUTOCOMMIT FALSE
+/*C4*/SET SCHEMA PUBLIC
+CONNECT USER SA
+SET AUTOCOMMIT FALSE
+/*C5*/SET SCHEMA PUBLIC
+CONNECT USER SA
+SET AUTOCOMMIT FALSE
+/*C2*/DISCONNECT
+/*C3*/DISCONNECT
+/*C4*/DISCONNECT
+/*C5*/DISCONNECT

@@ -1,5 +1,5 @@
-#HSQL Database Engine 1.8.0.5
-#Fri Aug 29 10:24:33 IST 2008
+#HSQL Database Engine 1.8.0.10
+#Tue Aug 19 10:31:19 EDT 2014
 hsqldb.script_format=0
 runtime.gc_interval=0
 sql.enforce_strict_size=false

@@ -17,6 +17,9 @@
 <dataConfig>
 <document>
-<entity name="sep" processor="SolrEntityProcessor" url="http://127.0.0.1:8983/solr/db " query="*:*"/>
+<entity name="sep" processor="SolrEntityProcessor"
+url="http://127.0.0.1:8983/solr/db "
+query="*:*"
+fl="*,orig_version_l:_version_"/>
 </document>
 </dataConfig>
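
The `fl` attribute added in the hunk above uses Solr's `alias:field` syntax in the field list. Here it returns every field and exposes `_version_` under the name `orig_version_l`, presumably so that copied documents do not carry the source core's version value into the target core. That motivation is an inference from the change, not something the commit states. The sketch below shows roughly what such a request against the db core could look like. The helper names `select_url` and `parse_fl` are hypothetical, and the parser covers only the simple comma-separated form, not function queries or transformers.

```python
from urllib.parse import urlencode

# Source core URL from the data config (trailing space omitted here).
SOURCE_CORE_URL = "http://127.0.0.1:8983/solr/db"


def select_url(query="*:*", fl="*,orig_version_l:_version_", base_url=SOURCE_CORE_URL):
    """Build a /select URL using the same query and field list as the config."""
    params = {"q": query, "fl": fl, "wt": "json"}
    return f"{base_url}/select?{urlencode(params)}"


def parse_fl(fl):
    """Split a simple fl value into (alias, field) pairs; alias is None when absent."""
    pairs = []
    for item in fl.split(","):
        alias, sep, field = item.partition(":")
        pairs.append((alias, field) if sep else (None, item))
    return pairs


if __name__ == "__main__":
    print(select_url())
    print(parse_fl("*,orig_version_l:_version_"))
```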

@@ -2,7 +2,7 @@
 <dataSource type="BinFileDataSource" />
 <document>
 <entity name="tika-test" processor="TikaEntityProcessor"
-url="../contrib/extraction/src/test-files/extraction/solr-word.pdf" format="text">
+url="exampledocs/solr-word.pdf" format="text">
 <field column="Author" name="author" meta="true"/>
 <field column="title" name="title" meta="true"/>
 <field column="text" name="text"/>

Binary file not shown.