lucene/solr/example/films/README.txt

We have a movie data set in JSON, Solr XML, and CSV formats.
All 3 formats contain the same data.  You can use any one format to index documents to Solr.

The data is fetched from Freebase and the data license is present in the films-LICENSE.txt file.

This data consists of the following fields -
 * "id" - unique identifier for the movie
 * "name" - Name of the movie
 * "directed_by" - The person(s) who directed the making of the film
 * "initial_release_date" - The earliest official initial film screening date in any country
 * "genre" - The genre(s) that the movie belongs to

 Steps:
   * Start Solr:
       bin/solr start

   * Create a "films" core
       bin/solr create_core -n films -c data_driven_schema_configs

   * Update the schema (by default it will guess the field types based on the date as it is indexed):
curl http://localhost:8983/solr/films/schema/fields -X POST -H 'Content-type:application/json' --data-binary '
[
    {
        "name":"genre",
        "type":"string",
        "stored":true,
        "multiValued":true
    },
    {
        "name":"directed_by",
        "type":"string",
        "stored":true,
        "multiValued":true
    },
    {
        "name":"name",
        "type":"text_general",
        "stored":true
    },
    {
        "name":"initial_release_date",
        "type":"tdate",
        "stored":true
    }
]'

   * Now let's index the data, using one of these three commands:

     - JSON: bin/post films example/films/films.json
     - XML: bin/post films example/films/films.xml
     - CSV: bin/post films example/films/films.csv params=f.genre.split=true&f.directed_by.split=true&f.genre.separator=|&f.directed_by.separator=|

   * Let's get searching.
     - Search for 'Batman':
       http://localhost:8983/solr/films/query?q=name:batman

     - Show me all 'Super hero' movies:
       http://localhost:8983/solr/films/query?q=*:*&fq=genre:%22Superhero%20movie%22

     - Let's see the distribution of genres across all the movies. See the facet section for the counts:
       http://localhost:8983/solr/films/query?q=*:*&facet=true&facet.field=genre

Exploring the data further - 

  * Increase the MAX_ITERATIONS value, put in your freebase API_KEY and run the film_data_generator.py script using Python 3.
    Now re-index Solr with the new data.
SOLR-6127: Improve example docs, using films data git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1647918 13f79535-47bb-0310-9956-ffa450edef68 2014-12-25 16:27:12 -05:00			`We have a movie data set in JSON, Solr XML, and CSV formats.`
			`All 3 formats contain the same data. You can use any one format to index documents to Solr.`

			`The data is fetched from Freebase and the data license is present in the films-LICENSE.txt file.`

			`This data consists of the following fields -`
			`* "id" - unique identifier for the movie`
			`* "name" - Name of the movie`
			`* "directed_by" - The person(s) who directed the making of the film`
			`* "initial_release_date" - The earliest official initial film screening date in any country`
			`* "genre" - The genre(s) that the movie belongs to`

			`Steps:`
			`* Start Solr:`
			`bin/solr start`

			`* Create a "films" core`
			`bin/solr create_core -n films -c data_driven_schema_configs`

			`* Update the schema (by default it will guess the field types based on the date as it is indexed):`
			`curl http://localhost:8983/solr/films/schema/fields -X POST -H 'Content-type:application/json' --data-binary '`
			`[`
			`{`
			`"name":"genre",`
			`"type":"string",`
			`"stored":true,`
			`"multiValued":true`
			`},`
			`{`
			`"name":"directed_by",`
			`"type":"string",`
			`"stored":true,`
			`"multiValued":true`
			`},`
			`{`
			`"name":"name",`
			`"type":"text_general",`
			`"stored":true`
			`},`
			`{`
			`"name":"initial_release_date",`
			`"type":"tdate",`
			`"stored":true`
			`}`
			`]'`

Adjust films README using new bin/post script instead of curl git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1647930 13f79535-47bb-0310-9956-ffa450edef68 2014-12-25 21:45:21 -05:00			`* Now let's index the data, using one of these three commands:`
SOLR-6127: Improve example docs, using films data git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1647918 13f79535-47bb-0310-9956-ffa450edef68 2014-12-25 16:27:12 -05:00
SOLR-6127: fix paths in README git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1649355 13f79535-47bb-0310-9956-ffa450edef68 2015-01-04 11:44:49 -05:00			`- JSON: bin/post films example/films/films.json`
			`- XML: bin/post films example/films/films.xml`
			`- CSV: bin/post films example/films/films.csv params=f.genre.split=true&f.directed_by.split=true&f.genre.separator=\|&f.directed_by.separator=\|`
SOLR-6127: Improve example docs, using films data git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1647918 13f79535-47bb-0310-9956-ffa450edef68 2014-12-25 16:27:12 -05:00
			`* Let's get searching.`
			`- Search for 'Batman':`
			`http://localhost:8983/solr/films/query?q=name:batman`

			`- Show me all 'Super hero' movies:`
			`http://localhost:8983/solr/films/query?q=:&fq=genre:%22Superhero%20movie%22`

			`- Let's see the distribution of genres across all the movies. See the facet section for the counts:`
			`http://localhost:8983/solr/films/query?q=:&facet=true&facet.field=genre`

			`Exploring the data further -`

SOLR-6127: Fix reference to previously renamed script git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1649376 13f79535-47bb-0310-9956-ffa450edef68 2015-01-04 12:48:03 -05:00			`* Increase the MAX_ITERATIONS value, put in your freebase API_KEY and run the film_data_generator.py script using Python 3.`
SOLR-6127: Improve example docs, using films data git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1647918 13f79535-47bb-0310-9956-ffa450edef68 2014-12-25 16:27:12 -05:00			`Now re-index Solr with the new data.`