lucene/solr/example/exampledocs
Erik Hatcher a46a46720a Adjust films README using new bin/post script instead of curl
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1647930 13f79535-47bb-0310-9956-ffa450edef68
2014-12-26 02:45:21 +00:00
README.txt Adjust films README using new bin/post script instead of curl 2014-12-26 02:45:21 +00:00
books.csv exampledocs: use the author field that's in the example schema now 2010-05-15 13:50:35 +00:00
books.json SOLR-2598: exampledocs/books.json should use name instead of title 2011-06-21 14:27:40 +00:00
exampledocs_generator.py SOLR-6127: Improve example docs, using films data 2014-12-25 21:27:12 +00:00
films-LICENSE.txt SOLR-6127: Improve example docs, using films data 2014-12-25 21:27:12 +00:00
films.csv SOLR-6127: Improve example docs, using films data 2014-12-25 21:27:12 +00:00
films.json SOLR-6127: Improve example docs, using films data 2014-12-25 21:27:12 +00:00
films.xml SOLR-6127: Improve example docs, using films data 2014-12-25 21:27:12 +00:00
gb18030-example.xml SOLR-2350: SimplePostTool (aka: post.jar) has been improved to work with files of any mime-type or charset 2011-02-07 21:39:32 +00:00
hd.xml SOLR-2502: put in start of Join example 2011-05-09 18:17:46 +00:00
ipod_other.xml SOLR-2502: put in start of Join example 2011-05-09 18:17:46 +00:00
ipod_video.xml SOLR-2502: put in start of Join example 2011-05-09 18:17:46 +00:00
manufacturers.xml SOLR-2502: missed compName 2011-05-09 19:25:46 +00:00
mem.xml SOLR-2502: put in start of Join example 2011-05-09 18:17:46 +00:00
money.xml SOLR-2202: Money/Currency FieldType 2012-03-09 22:40:06 +00:00
monitor.xml SOLR-5378: A new SuggestComponent that fully utilizes the Lucene suggester module and adds pluggable dictionaries, payloads and better distributed support 2013-11-23 13:46:12 +00:00
monitor2.xml SOLR-5378: A new SuggestComponent that fully utilizes the Lucene suggester module and adds pluggable dictionaries, payloads and better distributed support 2013-11-23 13:46:12 +00:00
mp500.xml SOLR-2502: put in start of Join example 2011-05-09 18:17:46 +00:00
sample.html put a really simple html doc in exampledocs for using in ref guide examples 2014-12-17 00:17:37 +00:00
sd500.xml SOLR-2502: put in start of Join example 2011-05-09 18:17:46 +00:00
solr-word.pdf SOLR-6378: Fixed example/example-DIH/ issues with "tika" and "solr" configurations, and tidied up README.txt 2014-08-19 14:56:26 +00:00
solr.xml SVN-GIT conversion, path copy emulation. 2016-01-23 01:18:39 +01:00
test_utf8.sh SOLR-3889: improve utf-8 test script error message when curl call to Solr fails 2012-09-27 15:01:25 +00:00
utf8-example.xml example: uncomment char outside the BMP 2011-02-27 20:39:28 +00:00
vidcard.xml SOLR-2502: put in start of Join example 2011-05-09 18:17:46 +00:00

README.txt

We have a movie data set in JSON, Solr XML, and CSV formats.
All three formats contain the same data, so you can use any one of them to index documents into Solr.

The data was fetched from Freebase, and its license is in the films-LICENSE.txt file.

The data consists of the following fields:
 * "id" - unique identifier for the movie
 * "name" - Name of the movie
 * "directed_by" - The person(s) who directed the making of the film
 * "initial_release_date" - The earliest official initial film screening date in any country
 * "genre" - The genre(s) that the movie belongs to
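
As a rough illustration of the field list above, one film record could be represented as follows. Every value is a placeholder made up for this sketch and is not taken from the data set, the date format shown is an assumption, and EXPECTED_FIELDS and missing_fields are hypothetical helper names.

```python
import json

# Field names come from the list above; the helper names are hypothetical.
EXPECTED_FIELDS = {"id", "name", "directed_by", "initial_release_date", "genre"}

# Every value below is a made-up placeholder, not a record from the data set.
example_film = {
    "id": "/en/example_film",
    "name": "Example Film",
    "directed_by": ["First Director", "Second Director"],  # may hold several people
    "initial_release_date": "2000-01-01T00:00:00Z",        # date format is an assumption
    "genre": ["Drama", "Comedy"],                          # may hold several genres
}


def missing_fields(doc):
    """Return the expected field names that are absent from doc, sorted."""
    return sorted(EXPECTED_FIELDS - set(doc))


print(json.dumps(example_film, indent=2))
print(missing_fields(example_film))
```

A check like missing_fields can be handy for spotting incomplete records before indexing, since not every film necessarily has a value for every field.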

 Steps:
   * Start Solr:
       bin/solr start

   * Create a "films" core
       bin/solr create_core -n films -c data_driven_schema_configs

   * Update the schema (by default it will guess the field types based on the data as it is indexed):
curl http://localhost:8983/solr/films/schema/fields -X POST -H 'Content-type:application/json' --data-binary '
[
    {
        "name":"genre",
        "type":"string",
        "stored":true,
        "multiValued":true
    },
    {
        "name":"directed_by",
        "type":"string",
        "stored":true,
        "multiValued":true
    },
    {
        "name":"name",
        "type":"text_general",
        "stored":true
    },
    {
        "name":"initial_release_date",
        "type":"tdate",
        "stored":true
    }
]'
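
For readers who prefer Python to curl, the same schema update can be sketched with the standard library. This is a minimal sketch that only prepares the request, assuming Solr is listening at the URL used in the curl command; the helper names field and build_request are hypothetical.

```python
import json
import urllib.request

# URL taken from the curl command above.
SCHEMA_URL = "http://localhost:8983/solr/films/schema/fields"


def field(name, field_type, multi_valued=False):
    """Build one field definition; multiValued is only emitted when true."""
    definition = {"name": name, "type": field_type, "stored": True}
    if multi_valued:
        definition["multiValued"] = True
    return definition


# The same four field definitions as in the curl payload.
FIELDS = [
    field("genre", "string", multi_valued=True),
    field("directed_by", "string", multi_valued=True),
    field("name", "text_general"),
    field("initial_release_date", "tdate"),
]


def build_request(fields=FIELDS, url=SCHEMA_URL):
    """Prepare, but do not send, a POST request equivalent to the curl call."""
    return urllib.request.Request(
        url,
        data=json.dumps(fields).encode("utf-8"),
        headers={"Content-type": "application/json"},
        method="POST",
    )


request = build_request()
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
# With Solr running, the request could be sent with:
# urllib.request.urlopen(request)
```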

   * Now let's index the data, using one of these three commands:

     - JSON: bin/post films example/exampledocs/films.json
     - XML: bin/post films example/exampledocs/films.xml
     - CSV: bin/post films example/exampledocs/films.csv "params=f.genre.split=true&f.directed_by.split=true&f.genre.separator=|&f.directed_by.separator=|"
       (the quotes keep the shell from interpreting the & and | characters)
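
The split and separator parameters in the CSV command tell Solr to treat "genre" and "directed_by" as pipe-separated lists of values. The sketch below imitates that effect in plain Python to show what ends up in each field; it is not Solr's implementation, the sample row is made up, and split_row is a hypothetical helper.

```python
import csv
import io

# Field -> separator, mirroring the f.<field>.separator parameters above.
MULTI_VALUED = {"genre": "|", "directed_by": "|"}


def split_row(row, separators=MULTI_VALUED):
    """Split the configured columns of one CSV row into lists of values."""
    doc = {}
    for key, value in row.items():
        if key in separators:
            # Dropping empty parts is a choice made for this sketch,
            # not necessarily what Solr does with empty cells.
            doc[key] = [part for part in value.split(separators[key]) if part]
        else:
            doc[key] = value
    return doc


# Made-up sample: the column names match the field list, the values do not
# come from films.csv.
sample = io.StringIO(
    "name,directed_by,genre,id\n"
    "Example Film,First Director|Second Director,Drama|Comedy,/en/example_film\n"
)
docs = [split_row(row) for row in csv.DictReader(sample)]
print(docs[0]["genre"])        # ['Drama', 'Comedy']
print(docs[0]["directed_by"])  # ['First Director', 'Second Director']
```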

   * Let's get searching.
     - Search for 'Batman':
       http://localhost:8983/solr/films/query?q=name:batman

     - Show me all 'Superhero' movies:
       http://localhost:8983/solr/films/query?q=*:*&fq=genre:%22Superhero%20movie%22

     - Let's see the distribution of genres across all the movies. See the facet section for the counts:
       http://localhost:8983/solr/films/query?q=*:*&facet=true&facet.field=genre
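
The three query URLs above can also be assembled programmatically, which avoids hand-encoding quotes and spaces. This is a minimal sketch using only the standard library; query_url and pair_facets are hypothetical helpers. The encoded text differs slightly from the hand-written URLs (for example, a space becomes "+"), but it decodes to the same parameters. pair_facets assumes the facet counts arrive as a flat list alternating value and count, and the counts in the usage line are made up.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Handler URL taken from the examples above.
BASE_URL = "http://localhost:8983/solr/films/query"


def query_url(params):
    """Return BASE_URL with the given parameters percent-encoded."""
    return BASE_URL + "?" + urlencode(params)


batman = query_url({"q": "name:batman"})
superhero = query_url({"q": "*:*", "fq": 'genre:"Superhero movie"'})
genres = query_url({"q": "*:*", "facet": "true", "facet.field": "genre"})


def pair_facets(flat):
    """Turn [value, count, value, count, ...] into [(value, count), ...]."""
    return list(zip(flat[0::2], flat[1::2]))


for url in (batman, superhero, genres):
    print(url)

# Decoding the URL gives back the original parameter values.
print(parse_qs(urlsplit(superhero).query)["fq"])

# Made-up counts, only to show the pairing.
print(pair_facets(["Drama", 3, "Comedy", 1]))
```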

Exploring the data further:

  * Increase the MAX_ITERATIONS value, put in your Freebase API_KEY, and run the exampledocs_generator.py script using Python 3.
    Then re-index the new data into Solr.