SOLR-6870: overhaul/rename tutorial

git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1651560 13f79535-47bb-0310-9956-ffa450edef68
Erik Hatcher 2015-01-14 03:05:35 +00:00
parent 4aef9bdc8e
commit 83db0a52a1
9 changed files with 606 additions and 739 deletions

@ -15,7 +15,7 @@ Getting Started
You need a Java 1.8 VM or later installed.
In this release, there is an example Solr server including a bundled
servlet container in the directory named "example".
See the tutorial at http://lucene.apache.org/solr/tutorial.html
See the Quick Start guide at http://lucene.apache.org/solr/quickstart.html
================== 6.0.0 ==================

@ -89,15 +89,11 @@ For more information about Solr examples please read...
* example/solr/README.txt
For more information about the "Solr Home" and Solr specific configuration
* http://lucene.apache.org/solr/tutorial.html
For a Tutorial using this example configuration
* http://wiki.apache.org/solr/SolrResources
* http://lucene.apache.org/solr/quickstart.html
For a Quick Start guide
* http://lucene.apache.org/solr/resources.html
For a list of other tutorials and introductory articles.
A tutorial is available at:
http://lucene.apache.org/solr/tutorial.html
or linked from "docs/index.html" in a binary distribution.
Also, there are Solr clients for many programming languages, see

@ -151,7 +151,7 @@
so we pass ourself (${ant.file}) here. The list of module build.xmls is given
via string parameter, that must be splitted by the XSL at '|'.
-->
<xslt in="${ant.file}" out="${javadoc.dir}/index.html" style="site/xsl/index.xsl" force="true">
<xslt in="${ant.file}" out="${javadoc.dir}/index.html" style="site/index.xsl" force="true">
<outputproperty name="method" value="html"/>
<outputproperty name="version" value="4.0"/>
<outputproperty name="encoding" value="UTF-8"/>
@ -162,12 +162,12 @@
</xslt>
<pegdown todir="${javadoc.dir}">
<fileset dir="." includes="SYSTEM_REQUIREMENTS.txt"/>
<globmapper from="*.txt" to="*.html"/>
<fileset dir="site/"/>
<globmapper from="*.mdtext" to="*.html"/>
</pegdown>
<copy todir="${javadoc.dir}">
<fileset dir="site/html" includes="**/*"/>
<fileset dir="site/assets/" />
</copy>
</target>

@ -48,7 +48,7 @@ For more information about this example please read...
* example/solr/README.txt
For more information about the "Solr Home" and Solr specific configuration
* http://lucene.apache.org/solr/tutorial.html
* http://lucene.apache.org/solr/quickstart.html
For a Tutorial using this example configuration
* http://wiki.apache.org/solr/SolrResources
For a list of other tutorials and introductory articles.

@ -1,39 +0,0 @@
<?xml version="1.0" encoding="UTF-8"?>
<svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" width="203.005pt" height="102.654pt" viewBox="0 0 203.005 102.654" version="1.1">
<defs>
<clipPath id="clip1">
<path d="M 0 36 L 49 36 L 49 102.652344 L 0 102.652344 Z "/>
</clipPath>
<clipPath id="clip2">
<path d="M 53 53 L 100 53 L 100 102.652344 L 53 102.652344 Z "/>
</clipPath>
<clipPath id="clip3">
<path d="M 106 35 L 123 35 L 123 102.652344 L 106 102.652344 Z "/>
</clipPath>
<clipPath id="clip4">
<path d="M 163 29 L 203.003906 29 L 203.003906 52 L 163 52 Z "/>
</clipPath>
</defs>
<g id="surface1">
<g clip-path="url(#clip1)" clip-rule="nonzero">
<path style=" stroke:none;fill-rule:nonzero;fill:rgb(14.048767%,12.319946%,16.685486%);fill-opacity:1;" d="M 40.761719 70.890625 C 38.414062 69.644531 35.765625 68.765625 32.886719 68.277344 C 30.054688 67.800781 27.179688 67.5625 24.332031 67.5625 C 22.003906 67.5625 19.652344 67.359375 17.332031 66.964844 C 15.078125 66.578125 13.035156 65.871094 11.257812 64.859375 C 9.527344 63.875 8.097656 62.5 7.003906 60.765625 C 5.925781 59.058594 5.382812 56.796875 5.382812 54.066406 C 5.433594 51.65625 5.992188 49.597656 7.046875 47.9375 C 8.117188 46.253906 9.511719 44.886719 11.195312 43.871094 C 12.921875 42.832031 14.914062 42.066406 17.117188 41.597656 C 20.867188 40.800781 24.828125 40.652344 28.863281 41.214844 C 30.441406 41.4375 31.996094 41.832031 33.496094 42.382812 C 34.964844 42.929688 36.347656 43.675781 37.601562 44.601562 C 38.835938 45.515625 39.929688 46.664062 40.847656 48.011719 L 41.359375 48.765625 L 45.148438 47.171875 L 44.34375 46.039062 C 43.3125 44.582031 42.171875 43.269531 40.949219 42.140625 C 39.703125 40.988281 38.238281 40.007812 36.59375 39.234375 C 34.96875 38.46875 33.113281 37.878906 31.070312 37.480469 C 29.050781 37.089844 26.695312 36.890625 24.070312 36.890625 C 21.550781 36.890625 18.949219 37.179688 16.339844 37.75 C 13.691406 38.328125 11.234375 39.289062 9.039062 40.597656 C 6.796875 41.9375 4.949219 43.722656 3.542969 45.90625 C 2.121094 48.125 1.394531 50.863281 1.394531 54.042969 C 1.394531 57.382812 2.066406 60.226562 3.386719 62.492188 C 4.703125 64.746094 6.464844 66.570312 8.625 67.914062 C 10.742188 69.234375 13.210938 70.183594 15.964844 70.734375 C 18.652344 71.273438 21.46875 71.546875 24.332031 71.546875 C 26.609375 71.546875 29.019531 71.71875 31.496094 72.0625 C 33.90625 72.394531 36.152344 73.066406 38.164062 74.058594 C 40.125 75.027344 41.761719 76.375 43.023438 78.0625 C 44.242188 79.699219 44.859375 81.90625 44.859375 84.625 C 44.859375 87.066406 44.261719 89.148438 43.089844 90.820312 C 41.878906 92.542969 
40.304688 93.964844 38.410156 95.046875 C 36.476562 96.152344 34.289062 96.96875 31.902344 97.464844 C 29.476562 97.972656 27.078125 98.230469 24.769531 98.230469 C 20.898438 98.230469 17.070312 97.492188 13.394531 96.03125 C 9.738281 94.578125 6.492188 92.398438 3.753906 89.550781 L 3.019531 88.785156 L 0 91.402344 L 0.847656 92.25 C 3.550781 94.953125 6.90625 97.308594 10.824219 99.25 C 14.785156 101.21875 19.480469 102.214844 24.769531 102.214844 C 27.285156 102.214844 29.957031 101.929688 32.707031 101.359375 C 35.496094 100.785156 38.117188 99.800781 40.488281 98.429688 C 42.898438 97.039062 44.902344 95.222656 46.449219 93.023438 C 48.039062 90.765625 48.84375 87.941406 48.84375 84.625 C 48.84375 81.265625 48.089844 78.417969 46.597656 76.167969 C 45.128906 73.945312 43.164062 72.171875 40.761719 70.890625 "/>
</g>
<g clip-path="url(#clip2)" clip-rule="nonzero">
<path style=" stroke:none;fill-rule:nonzero;fill:rgb(14.048767%,12.319946%,16.685486%);fill-opacity:1;" d="M 76.390625 98.667969 C 73.65625 98.667969 71.160156 98.105469 68.964844 96.992188 C 66.738281 95.867188 64.773438 94.335938 63.125 92.4375 C 61.464844 90.53125 60.148438 88.300781 59.207031 85.808594 C 58.261719 83.304688 57.722656 80.679688 57.613281 78.054688 C 57.613281 75.625 58.066406 73.144531 58.949219 70.675781 C 59.835938 68.214844 61.125 65.972656 62.785156 64.007812 C 64.4375 62.050781 66.449219 60.433594 68.769531 59.191406 C 71.046875 57.96875 73.609375 57.351562 76.390625 57.351562 C 79.003906 57.351562 81.472656 57.929688 83.726562 59.070312 C 86.015625 60.226562 88.023438 61.777344 89.707031 63.675781 C 91.390625 65.585938 92.738281 67.816406 93.707031 70.308594 C 94.675781 72.796875 95.164062 75.40625 95.164062 78.054688 C 95.164062 80.480469 94.714844 82.960938 93.828125 85.429688 C 92.941406 87.894531 91.652344 90.136719 89.996094 92.097656 C 88.34375 94.050781 86.335938 95.65625 84.019531 96.867188 C 81.742188 98.0625 79.175781 98.667969 76.390625 98.667969 M 92.96875 61.292969 C 90.984375 58.976562 88.574219 57.058594 85.804688 55.597656 C 82.996094 54.117188 79.828125 53.367188 76.390625 53.367188 C 73.257812 53.367188 70.265625 54.035156 67.496094 55.359375 C 64.746094 56.671875 62.316406 58.484375 60.273438 60.742188 C 58.234375 62.992188 56.601562 65.640625 55.417969 68.617188 C 54.230469 71.59375 53.628906 74.769531 53.628906 78.054688 C 53.628906 81.15625 54.183594 84.195312 55.28125 87.082031 C 56.371094 89.964844 57.929688 92.582031 59.90625 94.863281 C 61.890625 97.15625 64.304688 99.027344 67.078125 100.429688 C 69.875 101.84375 73.003906 102.59375 76.371094 102.652344 L 76.410156 102.652344 C 79.597656 102.59375 82.613281 101.875 85.378906 100.527344 C 88.121094 99.1875 90.550781 97.359375 92.597656 95.101562 C 94.632812 92.851562 96.25 90.230469 97.40625 87.316406 C 98.5625 84.390625 99.148438 81.273438 99.148438 78.054688 C 
99.148438 75.136719 98.609375 72.164062 97.550781 69.21875 C 96.488281 66.273438 94.945312 63.605469 92.96875 61.292969 "/>
</g>
<g clip-path="url(#clip3)" clip-rule="nonzero">
<path style=" stroke:none;fill-rule:nonzero;fill:rgb(14.048767%,12.319946%,16.685486%);fill-opacity:1;" d="M 119.78125 97.5625 C 119.113281 97.730469 118.523438 97.855469 118.03125 97.9375 C 117.511719 98.027344 116.953125 98.113281 116.347656 98.199219 C 115.789062 98.277344 115.238281 98.320312 114.707031 98.320312 C 113.371094 98.320312 112.386719 97.875 111.699219 96.964844 C 110.9375 95.960938 110.566406 94.960938 110.566406 93.914062 L 110.566406 35.578125 L 106.582031 35.578125 L 106.582031 93.914062 C 106.582031 96.101562 107.273438 98.039062 108.636719 99.667969 C 110.050781 101.359375 112.0625 102.214844 114.621094 102.214844 C 115.421875 102.214844 116.21875 102.167969 116.988281 102.078125 C 117.730469 101.988281 118.394531 101.898438 118.992188 101.808594 C 119.601562 101.71875 120.308594 101.578125 121.09375 101.398438 L 122.445312 101.085938 L 120.671875 97.339844 Z "/>
</g>
<path style=" stroke:none;fill-rule:nonzero;fill:rgb(14.048767%,12.319946%,16.685486%);fill-opacity:1;" d="M 136.976562 57.378906 C 134.90625 58.835938 133.074219 60.6875 131.507812 62.902344 L 131.507812 54.066406 L 127.523438 54.066406 L 127.523438 101.777344 L 131.507812 101.777344 L 131.507812 72.074219 C 132.058594 70.128906 132.820312 68.300781 133.769531 66.648438 C 134.726562 64.980469 135.914062 63.511719 137.304688 62.285156 C 138.695312 61.058594 140.3125 60.066406 142.117188 59.328125 C 143.914062 58.589844 145.945312 58.160156 148.148438 58.050781 L 149.207031 57.996094 L 149.207031 54.066406 L 148.09375 54.066406 C 143.84375 54.066406 140.105469 55.179688 136.976562 57.378906 "/>
<path style=" stroke:none;fill-rule:nonzero;fill:rgb(85.639954%,20.640564%,13.365173%);fill-opacity:1;" d="M 193.910156 10.613281 L 160.910156 46.289062 L 202.382812 27.15625 C 201.15625 20.824219 198.152344 15.132812 193.910156 10.613281 "/>
<path style=" stroke:none;fill-rule:nonzero;fill:rgb(85.639954%,20.640564%,13.365173%);fill-opacity:1;" d="M 169.425781 0 C 164.855469 0 160.5 0.917969 156.527344 2.574219 L 152.097656 39.886719 L 174.226562 0.347656 C 172.65625 0.121094 171.058594 0 169.425781 0 "/>
<g clip-path="url(#clip4)" clip-rule="nonzero">
<path style=" stroke:none;fill-rule:nonzero;fill:rgb(85.639954%,20.640564%,13.365173%);fill-opacity:1;" d="M 202.714844 29.210938 L 163.414062 51.203125 L 200.285156 46.828125 C 202.035156 42.761719 203.003906 38.285156 203.003906 33.578125 C 203.003906 32.097656 202.898438 30.640625 202.714844 29.210938 "/>
</g>
<path style=" stroke:none;fill-rule:nonzero;fill:rgb(85.639954%,20.640564%,13.365173%);fill-opacity:1;" d="M 188.226562 61.40625 C 192.617188 58.433594 196.261719 54.449219 198.835938 49.789062 L 164.277344 56.648438 Z "/>
<path style=" stroke:none;fill-rule:nonzero;fill:rgb(85.639954%,20.640564%,13.365173%);fill-opacity:1;" d="M 176.25 0.695312 L 157.011719 42.390625 L 192.695312 9.382812 C 188.222656 5.078125 182.5625 2 176.25 0.695312 "/>
<path style=" stroke:none;fill-rule:nonzero;fill:rgb(85.639954%,20.640564%,13.365173%);fill-opacity:1;" d="M 172.234375 67.03125 C 175.953125 66.722656 179.496094 65.816406 182.773438 64.394531 L 163.414062 62.097656 Z "/>
<path style=" stroke:none;fill-rule:nonzero;fill:rgb(85.639954%,20.640564%,13.365173%);fill-opacity:1;" d="M 138.820312 19.773438 C 137.308594 23.117188 136.328125 26.75 135.988281 30.566406 L 141.203125 39.886719 Z "/>
<path style=" stroke:none;fill-rule:nonzero;fill:rgb(85.639954%,20.640564%,13.365173%);fill-opacity:1;" d="M 153.617188 3.953125 C 148.878906 6.488281 144.824219 10.121094 141.785156 14.519531 L 146.652344 39.023438 Z "/>
</g>
</svg>


@ -1,686 +0,0 @@
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<!--
Licensed to the Apache Software Foundation (ASF) under one or more
contributor license agreements. See the NOTICE file distributed with
this work for additional information regarding copyright ownership.
The ASF licenses this file to You under the Apache License, Version 2.0
(the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
<html>
<head>
<META http-equiv="Content-Type" content="text/html; charset=UTF-8" />
<title>Solr Tutorial</title>
<style>
pre.code {
background-color: #D3D3D3;
padding: 0.2em;
}
.codefrag {
font-family: monospace;
font-weight:bold;
}
</style>
</head>
<body>
<div id="content">
<h1>Solr Tutorial</h1>
<a name="N1000E"></a><a name="Overview"></a>
<h2 class="boxed">Overview</h2>
<div class="section">
<p>
This document covers the basics of running Solr using an example
schema, and some sample data.
</p>
</div>
<a name="N10018"></a><a name="Requirements"></a>
<h2 class="boxed">Requirements</h2>
<div class="section">
<p>
To follow along with this tutorial, you will need...
</p>
<ol>
<li>Java 1.8 or greater. Some places you can get it are from
<a href="http://www.oracle.com/technetwork/java/javase/downloads/index.html">Oracle</a>,
<a href="http://openjdk.java.net/">Open JDK</a>, or
<a href="http://www.ibm.com/developerworks/java/jdk/">IBM</a>.
<ul>
<li>Running <span class="codefrag">java -version</span> at the command
line should indicate a version number starting with 1.8.
</li>
<li>GNU's GCJ is not supported and does not work with Solr.</li>
</ul>
</li>
<li>A <a href="http://lucene.apache.org/solr/mirrors-solr-latest-redir.html">Solr release</a>.
</li>
</ol>
</div>
<a name="N10040"></a><a name="Getting+Started"></a>
<h2 class="boxed">Getting Started</h2>
<div class="section">
<p>
<strong>
Please run the browser showing this tutorial and the Solr server on the same machine so tutorial links will correctly point to your Solr server.
</strong>
</p>
<p>
Begin by unzipping the Solr release and changing your working directory
to be the "<span class="codefrag">example</span>" directory. (Note that the base directory name may vary with the version of Solr downloaded.) For example, with a shell in UNIX, Cygwin, or MacOS:
</p>
<pre class="code">
user:~solr$ <strong>ls</strong>
solr-nightly.zip
user:~solr$ <strong>unzip -q solr-nightly.zip</strong>
user:~solr$ <strong>cd solr-nightly/example/</strong>
</pre>
<p>
Solr can run in any Java Servlet Container of your choice, but to simplify
this tutorial, the example index includes a small installation of Jetty.
</p>
<p>
To launch Jetty with the Solr WAR, and the example configs, just run the <span class="codefrag">start.jar</span> ...
</p>
<pre class="code">
user:~/solr/example$ <strong>java -jar start.jar</strong>
2012-06-06 15:25:59.815:INFO:oejs.Server:jetty-8.1.2.v20120308
2012-06-06 15:25:59.834:INFO:oejdp.ScanningAppProvider:Deployment monitor .../solr/example/webapps at interval 0
2012-06-06 15:25:59.839:INFO:oejd.DeploymentManager:Deployable added: .../solr/example/webapps/solr.war
...
Jun 6, 2012 3:26:03 PM org.apache.solr.core.SolrCore registerSearcher
INFO: [collection1] Registered new searcher Searcher@7527e2ee main{StandardDirectoryReader(segments_1:1)}
</pre>
<p>
This will start up the Jetty application server on port 8983, and use your terminal to display the logging information from Solr.
</p>
<p>
You can see that Solr is running by loading <a href="http://localhost:8983/solr/">http://localhost:8983/solr/</a> in your web browser. This is the main starting point for administering Solr.
</p>
</div>
<a name="N10078"></a><a name="Indexing+Data"></a>
<h2 class="boxed">Indexing Data</h2>
<div class="section">
<p>
Your Solr server is up and running, but it doesn't contain any data. You can
modify a Solr index by POSTing commands to Solr to add (or
update) documents, delete documents, and commit pending adds and deletes.
These commands can be in a
<a href="http://wiki.apache.org/solr/UpdateRequestHandler">variety of formats</a>.
</p>
<p>
The <span class="codefrag">exampledocs</span> directory contains sample files
showing the types of commands Solr accepts, as well as a Java utility
for posting them from the command line. Run <span class="codefrag">java -jar post.jar -h</span> to see its various options.
</p>
<p> To try this, open a new terminal window, enter the exampledocs directory,
and run "<span class="codefrag">java -Dc=collection_name -jar post.jar</span>" on some of the XML
files in that directory.
</p>
<pre class="code">
user:~/solr/example/exampledocs$ <strong>java -Dc=techproducts -jar post.jar solr.xml monitor.xml</strong>
SimplePostTool: version 1.4
SimplePostTool: POSTing files to http://localhost:8983/solr/update..
SimplePostTool: POSTing file solr.xml
SimplePostTool: POSTing file monitor.xml
SimplePostTool: COMMITting Solr index changes..
</pre>
<p>
You have now indexed two documents in Solr, and committed these changes.
You can now search for "solr" by loading the <a href="http://localhost:8983/solr/#/collection1/query">"Query" tab</a> in the Admin interface, and entering "solr" in the "q" text box. Clicking the "Execute Query" button should display the following URL containing one result...
</p>
<p>
<a href="http://localhost:8983/solr/collection1/select?q=solr&amp;wt=xml">http://localhost:8983/solr/collection1/select?q=solr&amp;wt=xml</a>
</p>
<p>
You can index all of the sample data, using the following command
(assuming your command line shell supports the *.xml notation):
</p>
<pre class="code">
user:~/solr/example/exampledocs$ <strong>java -Dc=techproducts -jar post.jar *.xml</strong>
SimplePostTool: version 1.4
SimplePostTool: POSTing files to http://localhost:8983/solr/update..
SimplePostTool: POSTing file gb18030-example.xml
SimplePostTool: POSTing file hd.xml
SimplePostTool: POSTing file ipod_other.xml
SimplePostTool: POSTing file ipod_video.xml
...
SimplePostTool: POSTing file solr.xml
SimplePostTool: POSTing file utf8-example.xml
SimplePostTool: POSTing file vidcard.xml
SimplePostTool: COMMITting Solr index changes..
</pre>
<p>
...and now you can search for all sorts of things using the default <a href="http://wiki.apache.org/solr/SolrQuerySyntax">Solr Query Syntax</a> (a superset of the Lucene query syntax)...
</p>
<ul>
<li>
<a href="http://localhost:8983/solr/#/collection1/query?q=video">video</a>
</li>
<li>
<a href="http://localhost:8983/solr/#/collection1/query?q=name:video">name:video</a>
</li>
<li>
<a href="http://localhost:8983/solr/#/collection1/query?q=%2Bvideo%20%2Bprice%3A[*%20TO%20400]">+video +price:[* TO 400]</a>
</li>
</ul>
<p></p>
<p>
There are many other different ways to import your data into Solr... one can
</p>
<ul>
<li>Import records from a database using the
<a href="http://wiki.apache.org/solr/DataImportHandler">Data Import Handler (DIH)</a>.
</li>
<li>
<a href="http://wiki.apache.org/solr/UpdateCSV">Load a CSV file</a> (comma separated values),
including those exported by Excel or MySQL.
</li>
<li>
<a href="http://wiki.apache.org/solr/UpdateJSON">POST JSON documents</a>
</li>
<li>Index binary documents such as Word and PDF with
<a href="http://wiki.apache.org/solr/ExtractingRequestHandler">Solr Cell</a> (ExtractingRequestHandler).
</li>
<li>
Use <a href="http://wiki.apache.org/solr/Solrj">SolrJ</a> for Java or other Solr clients to
programmatically create documents to send to Solr.
</li>
</ul>
</div>
<a name="N100EE"></a><a name="Updating+Data"></a>
<h2 class="boxed">Updating Data</h2>
<div class="section">
<p>
You may have noticed that even though the file <span class="codefrag">solr.xml</span> has now
been POSTed to the server twice, you still only get 1 result when searching for
"solr". This is because the example <span class="codefrag">schema.xml</span> specifies a "<span class="codefrag">uniqueKey</span>" field
called "<span class="codefrag">id</span>". Whenever you POST commands to Solr to add a
document with the same value for the <span class="codefrag">uniqueKey</span> as an existing document, it
automatically replaces it for you. You can see that that has happened by
looking at the values for <span class="codefrag">numDocs</span> and <span class="codefrag">maxDoc</span> in the
"CORE"/searcher section of the statistics page... </p>
<p>
<a href="http://localhost:8983/solr/#/collection1/plugins/core?entry=searcher">http://localhost:8983/solr/#/collection1/plugins/core?entry=searcher</a>
</p>
<p>
<strong><span class="codefrag">numDocs</span></strong> represents the number of searchable documents in the
index (and will be larger than the number of XML files since some files
contained more than one <span class="codefrag">&lt;doc&gt;</span>). <strong><span class="codefrag">maxDoc</span></strong>
may be larger as the <span class="codefrag">maxDoc</span> count includes logically deleted documents that
have not yet been removed from the index. You can re-post the sample XML
files over and over again as much as you want and <span class="codefrag">numDocs</span> will never
increase, because the new documents will constantly be replacing the old.
</p>
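<p>
The relationship between the two counters can be sketched with a toy model. This is only an illustration under stated assumptions: the class name <span class="codefrag">ToyIndex</span>, its methods, and the document ids are hypothetical, and it does not reflect how Solr or Lucene actually store documents (for example, it never merges segments, so its maxDoc value never shrinks).
</p>

```python
class ToyIndex:
    """Toy model of uniqueKey replacement; NOT Solr's real storage."""

    def __init__(self):
        self.slots = []  # every version ever added, including replaced ones
        self.live = {}   # uniqueKey value -> slot of the current version

    def add(self, doc):
        # Adding a document whose "id" already exists replaces the old
        # version: the old slot stays behind as a logically deleted entry.
        self.live[doc["id"]] = len(self.slots)
        self.slots.append(doc)

    @property
    def num_docs(self):
        # Searchable documents only.
        return len(self.live)

    @property
    def max_doc(self):
        # Searchable documents plus logically deleted ones.
        return len(self.slots)


if __name__ == "__main__":
    index = ToyIndex()
    index.add({"id": "doc-1", "name": "first version"})   # hypothetical id
    index.add({"id": "doc-1", "name": "second version"})  # replaces doc-1
    print(index.num_docs, index.max_doc)
```

<p>
Re-adding the same id leaves the searchable count unchanged while the total slot count grows, which mirrors why re-posting the sample files does not increase numDocs.
</p>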
<p>
Go ahead and edit the existing XML files to change some of the data, and re-run
the <span class="codefrag">java -jar post.jar</span> command; you'll see your changes reflected
in subsequent searches.
</p>
<a name="N1012D"></a><a name="Deleting+Data"></a>
<h3 class="boxed">Deleting Data</h3>
<p>
You can delete data by POSTing a delete command to the update URL and
specifying the value of the document's unique key field, or a query that
matches multiple documents (be careful with that one!). Since these commands
are smaller, we will specify them right on the command line rather than
reference an XML file.
</p>
<p>Execute the following command to delete a specific document</p>
<pre class="code">java -Ddata=args -Dcommit=false -jar post.jar "&lt;delete&gt;&lt;id&gt;SP2514N&lt;/id&gt;&lt;/delete&gt;"</pre>
<p>
Because we have specified "commit=false", a search for <a href="http://localhost:8983/solr/#/collection1/query?q=id:SP2514N">id:SP2514N</a> will still find the document we have deleted. Since the example configuration uses Solr's "autoCommit" feature, Solr will still automatically persist this change to the index, but it will not affect search results until an "openSearcher" commit is explicitly executed.
</p>
<p>
Using the <a href="http://localhost:8983/solr/#/collection1/plugins/updatehandler?entry=updateHandler">statistics page</a>
for the <span class="codefrag">updateHandler</span> you can observe this delete
propagate to disk by watching the <span class="codefrag">deletesById</span>
value drop to 0 as the <span class="codefrag">cumulative_deletesById</span>
and <span class="codefrag">autocommit</span> values increase.
</p>
<p>
Here is an example of using delete-by-query to delete anything with
<a href="http://localhost:8983/solr/collection1/select?q=name:DDR&amp;fl=name">DDR</a> in the name:
</p>
<pre class="code">java -Dcommit=false -Ddata=args -jar post.jar "&lt;delete&gt;&lt;query&gt;name:DDR&lt;/query&gt;&lt;/delete&gt;"</pre>
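<p>
If the delete messages are generated by a program rather than typed by hand, the id or query text should be XML-escaped first. The following is a minimal sketch, assuming Python is available; the helper names <span class="codefrag">delete_by_id</span> and <span class="codefrag">delete_by_query</span> are hypothetical and only build the message strings shown above, they do not contact Solr.
</p>

```python
from xml.sax.saxutils import escape


def delete_by_id(doc_id):
    """Build a delete message for one uniqueKey value (hypothetical helper)."""
    return "<delete><id>" + escape(doc_id) + "</id></delete>"


def delete_by_query(query):
    """Build a delete message for every document matching a query."""
    return "<delete><query>" + escape(query) + "</query></delete>"


if __name__ == "__main__":
    # These strings can be passed to post.jar with -Ddata=args as shown above.
    print(delete_by_id("SP2514N"))
    print(delete_by_query("name:DDR"))
```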
<p>
You can force a new searcher to be opened to reflect these changes by sending an explicit commit command to Solr:
</p>
<pre class="code">java -jar post.jar -</pre>
<p>
Now re-execute <a href="http://localhost:8983/solr/#/collection1/query?q=id:SP2514N">the previous search</a>
and verify that no matching documents are found. You can also revisit the
statistics page and observe the changes to both the number of commits in the <a href="http://localhost:8983/solr/#/collection1/plugins/updatehandler?entry=updateHandler">updateHandler</a> and the numDocs in the <a href="http://localhost:8983/solr/#/collection1/plugins/core?entry=searcher">searcher</a>.
</p>
<p>
Commits that open a new searcher can be expensive operations so it's best to
make many changes to an index in a batch and then send the
<span class="codefrag">commit</span> command at the end.
There is also an <span class="codefrag">optimize</span> command that does the
same things as <span class="codefrag">commit</span>, but also forces all index
segments to be merged into a single segment -- this can be very resource
intensive, but may be worthwhile for improving search speed if your index
changes very infrequently.
</p>
<p>
All of the update commands can be specified using either <a href="http://wiki.apache.org/solr/UpdateXmlMessages">XML</a> or <a href="http://wiki.apache.org/solr/UpdateJSON">JSON</a>.
</p>
<p>To continue with the tutorial, re-add any documents you may have deleted by going to the <span class="codefrag">exampledocs</span> directory and executing</p>
<pre class="code">java -jar post.jar *.xml</pre>
</div>
<a name="N1017C"></a><a name="Querying+Data"></a>
<h2 class="boxed">Querying Data</h2>
<div class="section">
<p>
Searches are done via HTTP GET on the <span class="codefrag">select</span> URL with the query string in the <span class="codefrag">q</span> parameter.
You can pass a number of optional <a href="http://wiki.apache.org/solr/SearchHandler">request parameters</a>
to the request handler to control what information is returned. For example, you can use the "<span class="codefrag">fl</span>" parameter
to control what stored fields are returned, and if the relevancy score is returned:
</p>
<ul>
<li>
<a href="http://localhost:8983/solr/collection1/select/?indent=on&amp;q=video&amp;fl=name,id">q=video&amp;fl=name,id</a> (return only name and id fields) </li>
<li>
<a href="http://localhost:8983/solr/collection1/select/?indent=on&amp;q=video&amp;fl=name,id,score">q=video&amp;fl=name,id,score</a> (return relevancy score as well) </li>
<li>
<a href="http://localhost:8983/solr/collection1/select/?indent=on&amp;q=video&amp;fl=*,score">q=video&amp;fl=*,score</a> (return all stored fields, as well as relevancy score) </li>
<li>
<a href="http://localhost:8983/solr/collection1/select/?indent=on&amp;q=video&amp;sort=price desc&amp;fl=name,id,price">q=video&amp;sort=price desc&amp;fl=name,id,price</a> (add sort specification: sort by price descending) </li>
<li>
<a href="http://localhost:8983/solr/collection1/select/?indent=on&amp;q=video&amp;wt=json">q=video&amp;wt=json</a> (return response in JSON format) </li>
</ul>
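<p>
The links above can also be assembled in code. This is a minimal sketch, assuming Python and the tutorial's default host, port, and collection name; the constant <span class="codefrag">SOLR_SELECT</span> and the function <span class="codefrag">select_url</span> are hypothetical names, and the function only builds a URL without sending a request.
</p>

```python
from urllib.parse import urlencode

# Assumption: the tutorial's default host, port, and collection name.
SOLR_SELECT = "http://localhost:8983/solr/collection1/select"


def select_url(q, **params):
    """Return a select URL for query q plus optional request parameters."""
    query = {"q": q}
    query.update(params)
    # urlencode percent-encodes values, so spaces and commas are safe to pass.
    return SOLR_SELECT + "?" + urlencode(query)


if __name__ == "__main__":
    print(select_url("video", fl="name,id,score"))
    print(select_url("video", sort="price desc", fl="name,id,price", wt="json"))
```

<p>
Letting the library encode the values avoids hand-escaping characters such as spaces, commas, and brackets in the query string.
</p>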
<p>
The <a href="http://localhost:8983/solr/#/collection1/query">query form</a>
provided in the web admin interface allows setting various request parameters
and is useful when testing or debugging queries.
</p>
<a name="N101BA"></a><a name="Sorting"></a>
<h3 class="boxed">Sorting</h3>
<p>
Solr provides a simple method to sort on one or more indexed fields.
Use the "<span class="codefrag">sort</span>" parameter to specify "field direction" pairs, separated by commas if there's more than one sort field:
</p>
<ul>
<li>
<a href="http://localhost:8983/solr/collection1/select/?indent=on&amp;q=video&amp;sort=price+desc">q=video&amp;sort=price desc</a>
</li>
<li>
<a href="http://localhost:8983/solr/collection1/select/?indent=on&amp;q=video&amp;sort=price+asc">q=video&amp;sort=price asc</a>
</li>
<li>
<a href="http://localhost:8983/solr/collection1/select/?indent=on&amp;q=video&amp;sort=inStock+asc,price+desc">q=video&amp;sort=inStock asc, price desc</a>
</li>
</ul>
<p>
"<span class="codefrag">score</span>" can also be used as a field name when specifying a sort:
</p>
<ul>
<li>
<a href="http://localhost:8983/solr/collection1/select/?indent=on&amp;q=video&amp;sort=score+desc">q=video&amp;sort=score desc</a>
</li>
<li>
<a href="http://localhost:8983/solr/collection1/select/?indent=on&amp;q=video&amp;sort=inStock+asc,score+desc">q=video&amp;sort=inStock asc, score desc</a>
</li>
</ul>
<p>
Complex functions may also be used to sort results:
</p>
<ul>
<li>
<a href="http://localhost:8983/solr/collection1/select/?indent=on&amp;q=video&amp;sort=div(popularity,add(price,1))+desc">q=video&amp;sort=div(popularity,add(price,1)) desc</a>
</li>
</ul>
<p>
If no sort is specified, the default is <span class="codefrag">score desc</span> to return the matches having the highest relevancy.
</p>
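<p>
A sort value with several "field direction" pairs can be composed as follows. This is a sketch under assumptions: the function name <span class="codefrag">sort_param</span> is hypothetical, and the direction check is a convenience of this example rather than a description of how Solr validates the parameter.
</p>

```python
from urllib.parse import urlencode


def sort_param(*pairs):
    """Join (field_or_function, direction) pairs into one sort value."""
    for _, direction in pairs:
        if direction not in ("asc", "desc"):
            raise ValueError("direction must be 'asc' or 'desc'")
    return ",".join(field + " " + direction for field, direction in pairs)


if __name__ == "__main__":
    value = sort_param(("inStock", "asc"), ("price", "desc"))
    print(value)
    # URL-encoded form, ready to append to a select URL.
    print(urlencode({"q": "video", "sort": value}))
```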
</div>
<a name="N101FE"></a><a name="Highlighting"></a>
<h2 class="boxed">Highlighting</h2>
<div class="section">
<p>
Hit highlighting returns relevant snippets of each returned document, and highlights
terms from the query within those context snippets.
</p>
<p>
The following example searches for <span class="codefrag">video card</span> and requests
highlighting on the fields <span class="codefrag">name,features</span>. This causes a
<span class="codefrag">highlighting</span> section to be added to the response with the
words to highlight surrounded with <span class="codefrag">&lt;em&gt;</span> (for emphasis)
tags.
</p>
<p>
<a href="http://localhost:8983/solr/collection1/select/?wt=json&amp;indent=on&amp;q=video+card&amp;fl=name,id&amp;hl=true&amp;hl.fl=name,features">...&amp;q=video card&amp;fl=name,id&amp;hl=true&amp;hl.fl=name,features</a>
</p>
<p>
More request parameters related to controlling highlighting may be found
<a href="http://wiki.apache.org/solr/HighlightingParameters">here</a>.
</p>
</div> <!-- highlighting -->
<a name="N10227"></a><a name="Faceted+Search"></a>
<h2 class="boxed">Faceted Search</h2>
<div class="section">
<p>
Faceted search takes the documents matched by a query and generates counts for various
properties or categories. Links are usually provided that allow users to "drill down" or
refine their search results based on the returned categories.
</p>
<p>
The following example searches for all documents (<span class="codefrag">*:*</span>) and
requests counts by the category field <span class="codefrag">cat</span>.
</p>
<p>
<a href="http://localhost:8983/solr/collection1/select/?wt=json&amp;indent=on&amp;q=*:*&amp;fl=name&amp;facet=true&amp;facet.field=cat">...&amp;q=*:*&amp;facet=true&amp;facet.field=cat</a>
</p>
<p>
Notice that although only the first 10 documents are returned in the results list,
the facet counts generated are for the complete set of documents that match the query.
</p>
<p>
We can facet multiple ways at the same time. The following example adds a facet on the
boolean <span class="codefrag">inStock</span> field:
</p>
<p>
<a href="http://localhost:8983/solr/collection1/select/?wt=json&amp;indent=on&amp;q=*:*&amp;fl=name&amp;facet=true&amp;facet.field=cat&amp;facet.field=inStock">...&amp;q=*:*&amp;facet=true&amp;facet.field=cat&amp;facet.field=inStock</a>
</p>
<p>
Solr can also generate counts for arbitrary queries. The following example
queries for <span class="codefrag">ipod</span> and shows prices below and above 100 by using
range queries on the price field.
</p>
<p>
<a href="http://localhost:8983/solr/collection1/select/?wt=json&amp;indent=on&amp;q=ipod&amp;fl=name&amp;facet=true&amp;facet.query=price:[0+TO+100]&amp;facet.query=price:[100+TO+*]">...&amp;q=ipod&amp;facet=true&amp;facet.query=price:[0 TO 100]&amp;facet.query=price:[100 TO *]</a>
</p>
<p>
Solr can even facet by numeric ranges (including dates). This example requests counts for the manufacture date (<span class="codefrag">manufacturedate_dt</span> field) for each year between 2004 and 2010.
</p>
<p>
<a href="http://localhost:8983/solr/collection1/select/?wt=json&amp;indent=on&amp;q=*:*&amp;fl=name,manufacturedate_dt&amp;facet=true&amp;facet.range=manufacturedate_dt&amp;facet.range.start=2004-01-01T00:00:00Z&amp;facet.range.end=2010-01-01T00:00:00Z&amp;facet.range.gap=%2b1YEAR">...&amp;q=*:*&amp;facet=true&amp;facet.range=manufacturedate_dt&amp;facet.range.start=2004-01-01T00:00:00Z&amp;facet.range.end=2010-01-01T00:00:00Z&amp;facet.range.gap=+1YEAR</a>
</p>
<p>
More information on faceted search may be found on the
<a href="http://wiki.apache.org/solr/SolrFacetingOverview">faceting overview</a>
and
<a href="http://wiki.apache.org/solr/SimpleFacetParameters">faceting parameters</a>
pages.
</p>
</div> <!-- faceted search -->
<a name="N10278"></a><a name="Search+UI"></a>
<h2 class="boxed">Search UI</h2>
<div class="section">
<p>
Solr includes an example search interface built with <a href="https://wiki.apache.org/solr/VelocityResponseWriter">velocity templating</a>
that demonstrates many features, including searching, faceting, highlighting,
autocomplete, and geospatial searching.
</p>
<p>
Try it out at
<a href="http://localhost:8983/solr/collection1/browse">http://localhost:8983/solr/collection1/browse</a>
</p>
</div> <!-- Search UI -->
<a name="N1028B"></a><a name="Text+Analysis"></a>
<h2 class="boxed">Text Analysis</h2>
<div class="section">
<p>
Text fields are typically indexed by breaking the text into words and applying various transformations such as
lowercasing, removing plurals, or stemming to increase relevancy. The same text transformations are normally
applied to any queries in order to match what is indexed.
</p>
<p>
The <a href="http://wiki.apache.org/solr/SchemaXml">schema</a> defines
the fields in the index and what type of analysis is applied to them. The current schema your collection is using
may be viewed directly via the <a href="http://localhost:8983/solr/#/collection1/schema">Schema tab</a> in the Admin UI, or explored dynamically using the <a href="http://localhost:8983/solr/#/collection1/schema-browser">Schema Browser tab</a>.
</p>
<p>
The best analysis components (tokenization and filtering) for your textual
content depend heavily on language.
As you can see in the <a href="http://localhost:8983/solr/#/collection1/schema-browser?type=text_general">Schema Browser</a>,
many of the fields in the example schema are using a
<span class="codefrag">fieldType</span> named
<span class="codefrag">text_general</span>, which has defaults appropriate for
most languages.
</p>
<p>
If you know your textual content is English, as is the case for the example
documents in this tutorial, and you'd like to apply English-specific stemming
and stop word removal, as well as split compound words, you can use the
<a href="http://localhost:8983/solr/#/collection1/schema-browser?type=text_en_splitting"><span class="codefrag">text_en_splitting</span></a> fieldType instead.
Go ahead and edit the <span class="codefrag">schema.xml</span> in the
<span class="codefrag">solr/example/solr/collection1/conf</span> directory,
to use the <span class="codefrag">text_en_splitting</span> fieldType for
the <span class="codefrag">text</span> and
<span class="codefrag">features</span> fields like so:
</p>
<pre class="code">
&lt;field name="features" <b>type="text_en_splitting"</b> indexed="true" stored="true" multiValued="true"/&gt;
...
&lt;field name="text" <b>type="text_en_splitting"</b> indexed="true" stored="false" multiValued="true"/&gt;
</pre>
<p>
Stop and restart Solr after making these changes and then re-post all of
the example documents using
<span class="codefrag">java -jar post.jar *.xml</span>.
Now queries like the ones listed below will demonstrate English-specific
transformations:
</p>
<ul>
<li>A search for
<a href="http://localhost:8983/solr/collection1/select?q=power-shot&amp;fl=name">power-shot</a>
can match <span class="codefrag">PowerShot</span>, and
<a href="http://localhost:8983/solr/collection1/select?q=adata&amp;fl=name">adata</a>
can match <span class="codefrag">A-DATA</span> by using the
<span class="codefrag">WordDelimiterFilter</span> and <span class="codefrag">LowerCaseFilter</span>.
</li>
<li>A search for
<a href="http://localhost:8983/solr/collection1/select?q=features:recharging&amp;fl=name,features">features:recharging</a>
can match <span class="codefrag">Rechargeable</span> using the stemming
features of <span class="codefrag">PorterStemFilter</span>.
</li>
<li>A search for
<a href="http://localhost:8983/solr/collection1/select?q=%221 gigabyte%22&amp;fl=name">"1 gigabyte"</a>
can match <span class="codefrag">1GB</span>, and the commonly misspelled
<a href="http://localhost:8983/solr/collection1/select?q=pixima&amp;fl=name">pixima</a> can match <span class="codefrag">Pixma</span> using the
<span class="codefrag">SynonymFilter</span>.
</li>
</ul>
<p>A full description of the analysis components, Analyzers, Tokenizers, and TokenFilters
available for use is <a href="http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters">here</a>.
</p>
<a name="N1030B"></a><a name="Analysis+Debugging"></a>
<h3 class="boxed">Analysis Debugging</h3>
<p>
There is a handy <a href="http://localhost:8983/solr/#/collection1/analysis">Analysis tab</a>
where you can see how a text value is broken down into words by both Index time and Query time analysis chains for a field or field type. This page shows the resulting tokens after they pass through each filter in the chains.
</p>
<p>
<a href="http://localhost:8983/solr/#/collection1/analysis?analysis.fieldvalue=Canon+Power-Shot+SD500&amp;analysis.query=&amp;analysis.fieldtype=text_en_splitting&amp;verbose_output=0">This url</a>
shows the tokens created from
"<span class="codefrag">Canon Power-Shot SD500</span>"
using the
<span class="codefrag">text_en_splitting</span> type. Each section of
the table shows the resulting tokens after having passed through the next
<span class="codefrag">TokenFilter</span> in the (Index) analyzer.
Notice how both <span class="codefrag">powershot</span> and
<span class="codefrag">power</span>, <span class="codefrag">shot</span>
are indexed, using tokens that have the same "position".
(Compare the previous output with
<a href="http://localhost:8983/solr/#/collection1/analysis?analysis.fieldvalue=Canon+Power-Shot+SD500&amp;analysis.query=&amp;analysis.fieldtype=text_general&amp;verbose_output=0">The tokens produced using the text_general field type</a>.)
</p>
<p>
Mousing over the section label to the left of the section will display the full name of the analyzer component at that stage of the chain. Toggling the "Verbose Output" checkbox will <a href="http://localhost:8983/solr/#/collection1/analysis?analysis.fieldvalue=Canon+Power-Shot+SD500&amp;analysis.query=&amp;analysis.fieldtype=text_en_splitting&amp;verbose_output=1">show/hide the detailed token attributes</a>.
</p>
<p>
When both <a href="http://localhost:8983/solr/#/collection1/analysis?analysis.fieldvalue=Canon+Power-Shot+SD500&amp;analysis.query=power+shot+sd-500&amp;analysis.fieldtype=text_en_splitting&amp;verbose_output=0">Index and Query</a>
values are provided, two tables will be displayed side by side showing the
results of each chain. Terms in the Index chain results that are equivalent
to the final terms produced by the Query chain will be highlighted.
</p>
<p>
Other interesting examples:
</p>
<ul>
<li><a href="http://localhost:8983/solr/#/collection1/analysis?analysis.fieldvalue=A+new+nation%2C+conceived+in+liberty+and+dedicated+to+the+proposition+that+all+men+are+created+equal.%0A&amp;analysis.query=liberties+and+equality&amp;analysis.fieldtype=text_en&amp;verbose_output=0">English stemming and stop-words</a>
using the <span class="codefrag">text_en</span> field type
</li>
<li><a href="http://localhost:8983/solr/#/collection1/analysis?analysis.fieldtype=text_cjk&amp;analysis.fieldvalue=%EF%BD%B6%EF%BE%80%EF%BD%B6%EF%BE%85&amp;analysis.query=%E3%82%AB%E3%82%BF%E3%82%AB%E3%83%8A&amp;verbose_output=1">Half-width katakana normalization with bigramming</a>
using the <span class="codefrag">text_cjk</span> field type
</li>
<li><a href="http://localhost:8983/solr/#/collection1/analysis?analysis.fieldtype=text_ja&amp;analysis.fieldvalue=%E7%A7%81%E3%81%AF%E5%88%B6%E9%99%90%E3%82%B9%E3%83%94%E3%83%BC%E3%83%89%E3%82%92%E8%B6%85%E3%81%88%E3%82%8B%E3%80%82&amp;verbose_output=1">Japanese morphological decomposition with part-of-speech filtering</a>
using the <span class="codefrag">text_ja</span> field type
</li>
<li><a href="http://localhost:8983/solr/#/collection1/analysis?analysis.fieldtype=text_ar&amp;analysis.fieldvalue=%D9%84%D8%A7+%D8%A3%D8%AA%D9%83%D9%84%D9%85+%D8%A7%D9%84%D8%B9%D8%B1%D8%A8%D9%8A%D8%A9&amp;verbose_output=1">Arabic stop-words, normalization, and stemming</a>
using the <span class="codefrag">text_ar</span> field type
</li>
</ul>
</div>
<a name="N1034D"></a><a name="Conclusion"></a>
<h2 class="boxed">Conclusion</h2>
<div class="section">
<p>
Congratulations! You successfully ran a small Solr instance, added some
documents, and made changes to the index and schema. You learned about queries, text
analysis, and the Solr admin interface. You're ready to start using Solr on
your own project! Continue on with the following steps:
</p>
<ul>
<li>Subscribe to the Solr <a href="http://lucene.apache.org/solr/discussion.html">mailing lists</a>!</li>
<li>Make a copy of the Solr <span class="codefrag">example</span> directory as a template for your project.</li>
<li>Customize the schema and other config in <span class="codefrag">solr/collection1/conf/</span> to meet your needs.</li>
</ul>
<p>
Solr has a ton of other features that we haven't touched on here, including
<a href="http://wiki.apache.org/solr/DistributedSearch">distributed search</a>
to handle huge document collections,
<a href="http://wiki.apache.org/solr/FunctionQuery">function queries</a>,
<a href="http://wiki.apache.org/solr/StatsComponent">numeric field statistics</a>,
and
<a href="http://wiki.apache.org/solr/ClusteringComponent">search results clustering</a>.
Explore the <a href="http://wiki.apache.org/solr/FrontPage">Solr Wiki</a> to find
more details about Solr's many <a href="http://lucene.apache.org/solr/features.html">features</a>.
</p>
<p>
Have Fun, and we'll see you on the Solr mailing lists!
</p>
</div>
</div>
<div class="clearboth">&nbsp;</div>
<div id="footer">
<div class="copyright">
Copyright &copy;
2012 <a href="http://www.apache.org/licenses/">The Apache Software Foundation.</a>
</div>
</div>
</body>
</html>


@ -74,7 +74,7 @@
<li><a href="http://wiki.apache.org/solr">Wiki</a>: Additional documentation, especially focused on using Solr.</li>
<li><a href="changes/Changes.html">Changes</a>: List of changes in this release.</li>
<li><a href="SYSTEM_REQUIREMENTS.html">System Requirements</a>: Minimum and supported Java versions.</li>
<li><a href="tutorial.html">Solr Tutorial</a>: This document covers the basics of running Solr using an example schema, and some sample data.</li>
<li><a href="quickstart.html">Solr Quick Start</a>: This document covers the basics of running Solr using an example schema, and some sample data.</li>
<li><a href="{$luceneJavadocUrl}index.html">Lucene Documentation</a></li>
</ul>
<h2>API Javadocs</h2>

solr/site/quickstart.mdtext Normal file

@ -0,0 +1,596 @@
# Solr Quick Start
<!-- Should these breadcrumbs be here? If not, how to make them on the site but not here? -->
<ul class="breadcrumbs">
<li><a href="/solr">Home</a></li>
<li><a href="/solr/resources.html">Resources</a></li>
</ul>
## Overview
This document covers getting Solr up and running, ingesting a variety of data sources into multiple collections,
and getting a feel for the Solr administrative and search interfaces.
## Requirements
To follow along with this tutorial, you will need...
1. To meet the [system requirements](SYSTEM_REQUIREMENTS.html)
2. An Apache Solr release. This tutorial was written using Apache Solr 5.0.0.
## Getting Started
Please run the browser showing this tutorial and the Solr server on the same machine so tutorial links will correctly
point to your Solr server.
Begin by unzipping the Solr release and changing your working directory to the subdirectory where Solr was installed.
Note that the base directory name may vary with the version of Solr downloaded. For example, with a shell in UNIX,
Cygwin, or MacOS:
/:$ ls solr*
solr-5.0.0.zip
/:$ unzip -q solr-5.0.0.zip -d solr5
/:$ cd solr5/
To launch Solr, run: `bin/solr start -e cloud -noprompt`
/solr5:$ bin/solr start -e cloud -noprompt
Welcome to the SolrCloud example!
Starting up 2 Solr nodes for your example SolrCloud cluster.
...
Started Solr server on port 8983 (pid=8404). Happy searching!
...
Started Solr server on port 7574 (pid=8549). Happy searching!
...
SolrCloud example running, please visit http://localhost:8983/solr
/solr5:$
You can see that Solr is running by loading the Solr Admin UI in your web browser: <http://localhost:8983/solr/>.
This is the main starting point for administering Solr.
Solr will now be running two "nodes", one on port 7574 and one on port 8983. There is one collection created
automatically, `gettingstarted`, a two-shard collection in which each shard has two replicas.
The [Cloud tab](http://localhost:8983/solr/#/~cloud) in the Admin UI diagrams the collection nicely:
<img alt="Solr Quick Start: SolrCloud diagram" class="float-right" src="images/quickstart-solrcloud.png" />
## Indexing Data
Your Solr server is up and running, but it doesn't contain any data. The Solr install includes the `bin/post` tool in
order to facilitate getting various types of documents easily into Solr from the start. We'll be
using this tool for the indexing examples below.
You'll need a command shell to run these examples, rooted in the Solr install directory; the shell from where you
launched Solr works just fine.
### Indexing a directory of "rich" files
Let's first index local "rich" files including HTML, PDF, Microsoft Office formats (such as MS Word), plain text and
many other formats. `SimplePostTool` can crawl a directory of files, optionally recursing into subdirectories,
sending the raw content of each file into Solr for extraction and indexing. A Solr install includes a `docs/`
subdirectory, so that makes a convenient set of (mostly) HTML files built-in to start with.
bin/post gettingstarted docs/
Here's what it'll look like:
/solr5:$ bin/post gettingstarted docs/
SimplePostTool version 1.5
Posting files to base url http://localhost:8983/solr/update..
Entering auto mode. File endings considered are xml,json,csv,pdf,doc,docx,ppt,pptx,xls,xlsx,odt,odp,ods,ott,otp,ots,rtf,htm,html,txt,log
Entering recursive mode, max depth=999, delay=0s
Indexing directory docs (3 files, depth=0)
POSTing file index.html (text/html)
POSTing file SYSTEM_REQUIREMENTS.html (text/html)
POSTing file tutorial.html (text/html)
Indexing directory docs/changes (1 files, depth=1)
POSTing file Changes.html (text/html)
Indexing directory docs/solr-analysis-extras (8 files, depth=1)
...
2945 files indexed.
COMMITting Solr index changes to http://localhost:8983/solr/update..
Time spent: 0:00:37.537
The command-line breaks down as follows:
* `gettingstarted`: name of the collection to index into
* `docs/`: a relative path of the Solr install `docs/` directory
You have now indexed thousands of documents into the `gettingstarted` collection in Solr and committed these changes.
You can search for "solr" by loading the Admin UI [Query tab](http://localhost:8983/solr/#/gettingstarted_shard1_replica1/query),
and entering "solr" in the `q` param (replacing `*:*`, which matches all documents). See the [Searching](#searching)
section below for more information.
To index your own data, re-run the directory indexing command pointed to your own directory of documents. For example,
on a Mac instead of `docs/` try `~/Documents/` or `~/Desktop/` ! You may want to start from a clean, empty system
again, rather than have your content in addition to the Solr `docs/` directory; see the Cleanup section [below](#cleanup)
for how to get back to a clean starting point.
### Indexing Solr XML
Solr supports indexing structured content in a variety of incoming formats. The historically predominant format for
getting structured content into Solr has been [Solr XML](https://cwiki.apache.org/confluence/display/solr/Uploading+Data+with+Index+Handlers#UploadingDatawithIndexHandlers-XMLFormattedIndexUpdates).
Many Solr indexers have been coded to process domain content into Solr XML output, generally HTTP POSTed directly to
Solr's `/update` endpoint.
Solr's install includes a handful of Solr XML formatted files with example data (mostly mocked tech product data).
Using `bin/post`, index the example Solr XML files in `example/exampledocs/`:
bin/post -c gettingstarted example/exampledocs/*.xml (TODO: depends on SOLR-6900)
Here's what you'll see:
/solr5:$ bin/post -c gettingstarted example/exampledocs/*.xml
SimplePostTool version 1.5
Posting files to base url http://localhost:8983/solr/update using content-type application/xml..
POSTing file gb18030-example.xml
POSTing file hd.xml
POSTing file ipod_other.xml
POSTing file ipod_video.xml
POSTing file manufacturers.xml
POSTing file mem.xml
POSTing file money.xml
POSTing file monitor.xml
POSTing file monitor2.xml
POSTing file mp500.xml
POSTing file sd500.xml
POSTing file solr.xml
POSTing file utf8-example.xml
POSTing file vidcard.xml
14 files indexed.
COMMITting Solr index changes to http://localhost:8983/solr/update..
Time spent: 0:00:00.453
...and now you can search for all sorts of things using the default [Solr Query Syntax](https://cwiki.apache.org/confluence/display/solr/The+Standard+Query+Parser#TheStandardQueryParser-SpecifyingTermsfortheStandardQueryParser)
(a superset of the Lucene query syntax)...
NOTE:
You can browse the documents indexed at <http://localhost:8983/solr/gettingstarted/browse>. The `/browse` UI allows getting
a feel for how Solr's technical capabilities can be worked with in a familiar, though a bit rough and prototypical,
interactive HTML view. (The `/browse` view defaults to assuming the `gettingstarted` schema and data are a catch-all mix
of structured XML, JSON, CSV example data, and unstructured rich documents. Your own data may not look ideal at first,
though the `/browse` templates are customizable.)
### Indexing JSON
Solr supports indexing JSON, either arbitrary structured JSON or "Solr JSON" (which is similar to Solr XML).
Solr includes a small sample Solr JSON file to illustrate this capability. Again using `bin/post`, index the
sample JSON file:
bin/post -c gettingstarted example/exampledocs/books.json (TODO: depends on SOLR-6900)
You'll see:
/solr5:$ bin/post -c gettingstarted example/exampledocs/books.json (TODO: depends on SOLR-6900)
SimplePostTool version 1.5
Posting files to base url http://localhost:8983/solr/update..
Entering auto mode. File endings considered are xml,json,csv,...
POSTing file books.json (application/json)
1 files indexed.
COMMITting Solr index changes to http://localhost:8983/solr/update..
Time spent: 0:00:00.084
To flatten (and/or split) and index arbitrary structured JSON, a topic beyond this quick start guide, check out
[Transforming and Indexing Custom JSON data](https://cwiki.apache.org/confluence/display/solr/Uploading+Data+with+Index+Handlers#UploadingDatawithIndexHandlers-TransformingandIndexingcustomJSONdata).
### Indexing CSV (Comma/Column Separated Values)
CSV is a great conduit for getting data into Solr, especially when the documents are homogeneous, all having the
same set of fields. CSV can be conveniently exported from a spreadsheet such as Excel, or exported from databases such
as MySQL. When getting started with Solr, it can often be easiest to get your structured data into CSV format and then
index that into Solr rather than a more sophisticated single step operation.
Using `bin/post`, index the included example CSV data file:
bin/post -c gettingstarted example/exampledocs/books.csv (TODO: depends on SOLR-6900)
In your terminal you'll see:
/solr5:$ bin/post -c gettingstarted example/exampledocs/books.csv (TODO: depends on SOLR-6900)
SimplePostTool version 1.5
Posting files to base url http://localhost:8983/solr/update..
Entering auto mode. File endings considered are xml,json,csv,...
POSTing file books.csv (text/csv)
1 files indexed.
COMMITting Solr index changes to http://localhost:8983/solr/update..
Time spent: 0:00:00.084
### Other indexing techniques
* Import records from a database using the [Data Import Handler (DIH)](https://cwiki.apache.org/confluence/display/solr/Uploading+Structured+Data+Store+Data+with+the+Data+Import+Handler).
* Use [SolrJ](https://cwiki.apache.org/confluence/display/solr/Using+SolrJ) for Java or other Solr clients to
programmatically create documents to send to Solr.
* Use the Admin UI [Documents tab](http://localhost:8983/solr/#/gettingstarted_shard1_replica1/documents) to paste in a document to be
indexed, or select `Document Builder` from the `Document Type` dropdown to build a document one field at a time.
Click on the `Submit Document` button below the form to index your document.
***
## Updating Data
You may notice that even if you index content in this guide more than once, it does not duplicate the results found.
This is because the example `schema.xml` specifies a "`uniqueKey`" field called "`id`". Whenever you POST commands to
Solr to add a document with the same value for the `uniqueKey` as an existing document, it automatically replaces it
for you. You can see that that has happened by looking at the values for `numDocs` and `maxDoc` in the "CORE"/searcher
section of the statistics page...
<http://localhost:8983/solr/#/gettingstarted_shard1_replica1/plugins/core?entry=searcher>
`numDocs` represents the number of searchable documents in the index (and will be larger than the number of XML, JSON,
or CSV files since some files contained more than one document). The `maxDoc` value may be larger as the `maxDoc` count
includes logically deleted documents that have not yet been removed from the index. You can re-post the sample files
over and over again as much as you want and `numDocs` will never increase, because the new documents will constantly be
replacing the old.
Go ahead and edit any of the existing example data files, change some of the data, and re-run the SimplePostTool command.
You'll see your changes reflected in subsequent searches.
## Deleting Data
You can delete data by POSTing a delete command to the update URL and specifying the value of the document's unique key
field, or a query that matches multiple documents (be careful with that one!). Since these commands are smaller, we
specify them right on the command line rather than reference a JSON or XML file.
Execute the following command to delete a specific document:
TODO: depends on SOLR-6900 to implement within bin/post:
java -Ddata=args org.apache.solr.util.SimplePostTool "<delete><id>SP2514N</id></delete>"
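As a rough sketch of what these delete commands look like, the shell helpers below build the XML body for a delete by unique key and for a delete by query. The helper names (`xml_escape`, `delete_by_id`, `delete_by_query`) are hypothetical and not part of Solr, and the commented `curl` call assumes the `gettingstarted` collection exposes the usual `/update` endpoint; it is an illustration rather than a tested step of this guide.

```shell
# Escape the XML special characters &, < and > in a value.
xml_escape() {
  printf '%s\n' "$1" | sed -e 's/&/\&amp;/g' -e 's/</\&lt;/g' -e 's/>/\&gt;/g'
}

# Build a delete command for one document, identified by its unique key.
delete_by_id() {
  printf '<delete><id>%s</id></delete>\n' "$(xml_escape "$1")"
}

# Build a delete command for every document matching a query (be careful!).
delete_by_query() {
  printf '<delete><query>%s</query></delete>\n' "$(xml_escape "$1")"
}

# Possible usage against a running Solr (assumed endpoint, not executed here):
#   curl "http://localhost:8983/solr/gettingstarted/update?commit=true" \
#        -H "Content-Type: text/xml" --data-binary "$(delete_by_id SP2514N)"
```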
## Searching
Solr can be queried via REST clients such as cURL, wget, and Chrome POSTMAN, as well as via the native clients available for
many programming languages.
The Solr Admin UI includes a query builder interface - see the `gettingstarted` query tab at <http://localhost:8983/solr/#/gettingstarted_shard1_replica1/query>.
If you click the `Execute Query` button without changing anything in the form, you'll get 10 random documents in JSON
format (`*:*` in the `q` param matches all documents):
<img style="border:1px solid #ccc" src="images/quickstart-query-screen.png" alt="Solr Quick Start: gettingstarted Query tab" class="float-right"/>
The URL sent by the Admin UI to Solr is shown in light grey near the top right of the above screenshot - if you click on
it, your browser will show you the raw response. To use cURL, just give the same URL in quotes on the `curl` command line:
curl "http://localhost:8983/solr/gettingstarted/select?q=*%3A*&wt=json&indent=true"
In the above URL, the "`:`" in "`q=*:*`" has been URL-encoded as "`%3A`", but since "`:`" has no reserved purpose in the
query component of the URL (after the "`?`"), you don't need to URL encode it. So the following also works:
curl "http://localhost:8983/solr/gettingstarted/select?q=*:*&wt=json&indent=true"
### Basics
#### Search for a single term
To search for a term, give it as the `q` param value - in the Admin UI [Query tab](http://localhost:8983/solr/#/gettingstarted_shard1_replica1/query),
replace `*:*` with the term you want to find. To search for "foundation":
curl "http://localhost:8983/solr/gettingstarted/select?wt=json&indent=true&q=foundation"
You'll see:
/solr5:$ curl "http://localhost:8983/solr/gettingstarted/select?wt=json&indent=true&q=foundation"
{
"responseHeader":{
"status":0,
"QTime":0,
"params":{
"indent":"true",
"q":"foundation",
"wt":"json"}},
"response":{"numFound":2812,"start":0,"docs":[
{
"id":"0553293354",
"cat":["book"],
"name":"Foundation",
...
The response indicates that there are 2,812 hits (`"numFound":2812`), of which the first 10 were returned, since by
default `start`=`0` and `rows`=`10`. You can specify these params to page through results, where `start` is the position
of the first result to return, and `rows` is the page size.
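To make the paging arithmetic concrete, here is a minimal sketch that turns a 1-based page number into the matching `start` and `rows` params. The helper name `page_params` is made up for this illustration, and the default page size of 10 simply mirrors Solr's default `rows` value mentioned above.

```shell
# page_params PAGE [ROWS]: print the start/rows params for a 1-based page.
# ROWS defaults to 10, matching Solr's default page size.
page_params() {
  printf 'start=%d&rows=%d\n' "$(( ($1 - 1) * ${2:-10} ))" "${2:-10}"
}

# Possible usage (second page of the "foundation" results):
#   curl "http://localhost:8983/solr/gettingstarted/select?wt=json&indent=true&q=foundation&$(page_params 2)"
```

Page 1 maps to `start=0`, page 2 to `start=10`, and so on; only the offset changes between requests.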
To restrict fields returned in the response, use the `fl` param, which takes a comma-separated list of field names.
E.g. to only return the `id` field:
curl "http://localhost:8983/solr/gettingstarted/select?wt=json&indent=true&q=foundation&fl=id"
`q=foundation` matches nearly all of the docs we've indexed, since most of the files under `docs/` contain
"The Apache Software Foundation". To restrict search to a particular field, use the syntax "`q=field:value`",
e.g. to search for `foundation` only in the `name` field:
curl "http://localhost:8983/solr/gettingstarted/select?wt=json&indent=true&q=name:foundation"
The above request returns only one document (`"numFound":1`) - from the response:
...
"response":{"numFound":1,"start":0,"docs":[
{
"id":"0553293354",
"cat":["book"],
"name":"Foundation",
...
#### Phrase search
To search for a multi-term phrase, enclose it in double quotes: `q="multiple terms here"`. E.g. to search for
"CAS latency" - note that the space between terms must be converted to "`+`" in a URL (the Admin UI will handle URL
encoding for you automatically):
curl "http://localhost:8983/solr/gettingstarted/select?wt=json&indent=true&q=\"CAS+latency\""
You'll get back:
{
"responseHeader":{
"status":0,
"QTime":0,
"params":{
"indent":"true",
"q":"\"CAS latency\"",
"wt":"json"}},
"response":{"numFound":2,"start":0,"docs":[
{
"id":"VDBDB1A16",
"name":"A-DATA V-Series 1GB 184-Pin DDR SDRAM Unbuffered DDR 400 (PC 3200) System Memory - OEM",
"manu":"A-DATA Technology Inc.",
"manu_id_s":"corsair",
"cat":["electronics", "memory"],
"features":["CAS latency 3,\t 2.7v"],
...
#### Combining searches
By default, when you search for multiple terms and/or phrases in a single query, Solr will only require that one of them
is present in order for a document to match. Documents containing more terms will be sorted higher in the results list.
You can require that a term or phrase is present by prefixing it with a "`+`"; conversely, to disallow the presence of a
term or phrase, prefix it with a "`-`".
To find documents that contain both terms "`one`" and "`three`", enter `+one +three` in the `q` param in the Admin UI
[Query tab](http://localhost:8983/solr/#/gettingstarted_shard1_replica1/query). Because the "`+`" character has a reserved purpose in URLs
(encoding the space character), you must URL encode it for `curl` as "`%2B`":
curl "http://localhost:8983/solr/gettingstarted/select?wt=json&indent=true&q=%2Bone+%2Bthree"
To search for documents that contain the term "`two`" but **don't** contain the term "`one`", enter `+two -one` in the
`q` param in the Admin UI. Again, URL encode "`+`" as "`%2B`":
curl "http://localhost:8983/solr/gettingstarted/select?wt=json&indent=true&q=%2Btwo+-one"
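If you would rather not encode query strings by hand, a small helper can apply the substitutions described above. The sketch below is deliberately minimal: `encode_q` is a hypothetical name, and it only handles the characters that come up in this guide (`%`, `+`, `"`, `&`, `#`, and the space), so it is not a general-purpose URL encoder.

```shell
# encode_q QUERY: encode a Solr query string for use as the q param value.
# "%" is handled first so that the escapes added afterwards are left alone;
# spaces are converted to "+" last, after literal "+" signs became "%2B".
encode_q() {
  printf '%s\n' "$1" | sed -e 's/%/%25/g' -e 's/+/%2B/g' -e 's/"/%22/g' \
    -e 's/&/%26/g' -e 's/#/%23/g' -e 's/ /+/g'
}

# Possible usage:
#   curl "http://localhost:8983/solr/gettingstarted/select?wt=json&indent=true&q=$(encode_q '+one +three')"
```

For the two queries above, this produces the same `q` values as the hand-encoded URLs.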
#### In depth
For more Solr search options, see the Solr Reference Guide's [Searching](https://cwiki.apache.org/confluence/display/solr/Searching)
section.
### Faceting
One of Solr's most popular features is faceting. Faceting allows the search results to be arranged into subsets (or
buckets or categories), providing a count for each subset. There are several types of faceting: field values, numeric
and date ranges, pivots (decision tree), and arbitrary query faceting.
#### Field facets
In addition to providing search results, a Solr query can return the number of documents that contain each unique value
in the whole result set.
From the Admin UI [Query tab](http://localhost:8983/solr/#/gettingstarted_shard1_replica1/query), if you check the "`facet`"
checkbox, you'll see a few facet-related options appear:
<img style="border:1px solid #ccc" src="images/quickstart-admin-ui-facet-options.png" alt="Solr Quick Start: Query tab facet options"/>
To see facet counts from all documents (`q=*:*`): turn on faceting (`facet=true`), and specify the field to facet on via
the `facet.field` param. If you only want facets, and no document contents, specify `rows=0`. The `curl` command below
will return facet counts for the `manu_id_s` field:
curl "http://localhost:8983/solr/gettingstarted/select?wt=json&indent=true&q=*:*&rows=0&facet=true&facet.field=manu_id_s"
In your terminal, you'll see:
{
"responseHeader":{
"status":0,
"QTime":3,
"params":{
"facet":"true",
"indent":"true",
"q":"*:*",
"facet.field":"manu_id_s",
"wt":"json",
"rows":"0"}},
"response":{"numFound":2990,"start":0,"docs":[]
},
"facet_counts":{
"facet_queries":{},
"facet_fields":{
"manu_id_s":[
"corsair",3,
"belkin",2,
"canon",2,
"apple",1,
"asus",1,
"ati",1,
"boa",1,
"dell",1,
"eu",1,
"maxtor",1,
"nor",1,
"uk",1,
"viewsonic",1,
"samsung",0]},
"facet_dates":{},
"facet_ranges":{},
"facet_intervals":{}}}
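Note that the counts arrive as a flat list that alternates between a value and its count. As a quick sketch for reading them in a terminal, the filter below turns each `"value",count` line into a `value count` pair. The name `facet_pairs` is hypothetical, and the filter assumes the indented layout shown above, with one pair per line; a proper JSON parser would be more robust.

```shell
# facet_pairs: read an indented Solr JSON response on stdin and print one
# "value count" line for every line of the form  "value",count
facet_pairs() {
  sed -n 's/^[[:space:]]*"\([^"]*\)",\([0-9][0-9]*\).*$/\1 \2/p'
}

# Possible usage, keeping only values that actually occur:
#   curl -s "http://localhost:8983/solr/gettingstarted/select?wt=json&indent=true&q=*:*&rows=0&facet=true&facet.field=manu_id_s" \
#     | facet_pairs | awk '$2 > 0'
```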
#### Range facets
For numerics or dates, it's often desirable to partition the facet counts into ranges rather than discrete values.
A prime example of numeric range faceting, using the example product data, is `price`. In the `/browse` UI, it looks
like this:
<img style="border:1px solid #ccc" src="images/quickstart-range-facet.png" alt="Solr Quick Start: Range facets"/>
The data for these price range facets can be seen in JSON format with this command:
curl "http://localhost:8983/solr/gettingstarted/select?q=*:*&wt=json&indent=on&rows=0&facet=true&facet.range=price&f.price.facet.range.start=0&f.price.facet.range.end=600&f.price.facet.range.gap=50&facet.range.other=after"
In your terminal you will see:
{
"responseHeader":{
"status":0,
"QTime":1,
"params":{
"facet.range.other":"after",
"facet":"true",
"indent":"on",
"q":"*:*",
"f.price.facet.range.gap":"50",
"facet.range":"price",
"f.price.facet.range.end":"600",
"wt":"json",
"f.price.facet.range.start":"0",
"rows":"0"}},
"response":{"numFound":2990,"start":0,"docs":[]
},
"facet_counts":{
"facet_queries":{},
"facet_fields":{},
"facet_dates":{},
"facet_ranges":{
"price":{
"counts":[
"0.0",19,
"50.0",1,
"100.0",0,
"150.0",2,
"200.0",0,
"250.0",1,
"300.0",1,
"350.0",2,
"400.0",0,
"450.0",1,
"500.0",0,
"550.0",0],
"gap":50.0,
"start":0.0,
"end":600.0,
"after":2}},
"facet_intervals":{}}}
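Each key in `counts` is the lower bound of one bucket that is `gap` wide. The sketch below lists the buckets implied by a given start, end, and gap; `range_buckets` is a hypothetical helper, it handles integers only, and it assumes Solr's default of including the lower bound of each range and excluding the upper one.

```shell
# range_buckets START END GAP: print the half-open buckets [lo,hi) that a
# range facet with these settings is assumed to produce (integers only).
# The body runs in a subshell so that "lo" does not leak into the caller.
range_buckets() (
  lo=$1
  while [ "$lo" -lt "$2" ]; do
    printf '[%d,%d)\n' "$lo" "$(( lo + $3 ))"
    lo=$(( lo + $3 ))
  done
)

# Possible usage, matching the price facet above:
#   range_buckets 0 600 50
```

With the settings used above this yields twelve buckets, from `[0,50)` to `[550,600)`, one for each entry in the `counts` list; documents priced above 600 are reported separately under `after`.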
#### Pivot facets
Another faceting type is pivot facets, also known as "decision trees", allowing two or more fields to be nested for all
the various possible combinations. Using the example technical product data, pivot facets can be used to see how many
of the products in the "book" category (the `cat` field) are in stock or not in stock. Here's how to get at the raw
data for this scenario:
curl "http://localhost:8983/solr/gettingstarted/select?q=*:*&rows=0&wt=json&indent=on&facet=on&facet.pivot=cat,inStock"
This results in the following response (trimmed to just the book category output), which says out of 14 items in the
"book" category, 12 are in stock and 2 are not in stock:
...
"facet_pivot":{
"cat,inStock":[{
"field":"cat",
"value":"book",
"count":14,
"pivot":[{
"field":"inStock",
"value":true,
"count":12},
{
"field":"inStock",
"value":false,
"count":2}]},
...
#### More faceting options
For the full scoop on Solr faceting, visit the Solr Reference Guide's [Faceting](https://cwiki.apache.org/confluence/display/solr/Faceting)
section.
### Spatial
Solr has sophisticated geospatial support, including searching within a specified distance range of a given location
(or within a bounding box), sorting by distance, or even boosting results by the distance. Some of the example tech product
documents in `example/exampledocs/*.xml` have locations associated with them to illustrate the spatial capabilities.
Spatial queries can be combined with any other types of queries, such as in this example of querying for "ipod" within
10 kilometers of San Francisco:
<img style="border:1px solid #ccc" src="images/quickstart-spatial.png" alt="Solr Quick Start: spatial search" class="float-right"/>
The URL for this example is <http://localhost:8983/solr/gettingstarted/browse?q=ipod&pt=37.7752%2C-122.4232&d=10&sfield=store&fq=%7B%21bbox%7D&queryOpts=spatial>,
leveraging the `/browse` UI to show a map for each item and allow easy selection of the location to search near.
To learn more about Solr's spatial capabilities, see the Solr Reference Guide's [Spatial Search](https://cwiki.apache.org/confluence/display/solr/Spatial+Search)
section.
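The percent-encoded pieces of the example URL are easier to read once decoded: `pt` is a `latitude,longitude` pair, `d` is the distance in kilometers, `sfield` names the location field, and `fq` is the `{!bbox}` filter. The Python sketch below builds a URL with the same parameters, using only the standard library. The function name `spatial_browse_url` and its arguments are assumptions made for illustration.

```python
from urllib.parse import urlencode


def spatial_browse_url(query, lat, lon, distance_km,
                       base="http://localhost:8983/solr/gettingstarted/browse"):
    """Build a /browse URL that filters results to a bounding box around a point."""
    params = [
        ("q", query),
        ("pt", f"{lat},{lon}"),      # center point as "lat,lon"
        ("d", distance_km),          # distance in kilometers
        ("sfield", "store"),         # location field used by the example docs
        ("fq", "{!bbox}"),           # bounding-box filter, as in the example URL
        ("queryOpts", "spatial"),    # asks the /browse UI to show spatial options
    ]
    return base + "?" + urlencode(params)


print(spatial_browse_url("ipod", 37.7752, -122.4232, 10))
```

Letting `urlencode` do the escaping avoids hand-writing sequences such as `%7B%21bbox%7D`.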
## Wrapping up
If you've run the full set of commands in this quick start guide you have done the following:
* Launched Solr in SolrCloud mode with two nodes and two collections, including shards and replicas
* Indexed a directory of rich text files
* Indexed Solr XML files
* Indexed Solr JSON files
* Indexed CSV content
* Opened the admin console and used its query interface to get JSON-formatted results
* Opened the /browse interface to explore Solr's features in a friendlier, more familiar interface
Nice work! The script below, which runs all of these steps, took under two minutes! (Your run time may vary, depending
on your computer's power and available resources.)
Here's a Unix script you can copy and paste to run the key commands from this quick start guide (`open` is the macOS command for launching a URL in the default browser; on Linux, substitute `xdg-open`):
# TODO: depends on SOLR-6900
date ;
bin/solr start -e cloud -noprompt ;
open http://localhost:8983/solr ;
bin/post -c gettingstarted docs/ ;
open http://localhost:8983/solr/gettingstarted/browse ;
bin/post -c gettingstarted example/exampledocs/*.xml ;
bin/post -c gettingstarted example/exampledocs/books.json ;
bin/post -c gettingstarted example/exampledocs/books.csv ;
open "http://localhost:8983/solr/#/gettingstarted_shard1_replica1/plugins/core?entry=searcher" ;
java -Ddata=args org.apache.solr.util.SimplePostTool "<delete><id>SP2514N</id></delete>" ; # TODO: adjust this as SOLR-6900 implements
bin/solr healthcheck -c gettingstarted ;
date ;
## Cleanup
As you work through this guide, you may want to stop Solr and reset the environment back to the starting point.
The following command line will stop Solr and remove the directories for each of the two nodes that the start script
created:
bin/solr stop -all ; rm -Rf example/cloud/node1/ example/cloud/node2/
## Where to next?
For more information on Solr, check out the following resources:
* [Solr Reference Guide](https://cwiki.apache.org/confluence/display/solr/Apache+Solr+Reference+Guide) (ensure you
match the version of the reference guide with your version of Solr)
* See also additional [Resources](http://lucene.apache.org/solr/resources.html)