<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
|
|
<html>
|
|
<head>
|
|
<META http-equiv="Content-Type" content="text/html; charset=UTF-8">
|
|
<meta content="Apache Forrest" name="Generator">
|
|
<meta name="Forrest-version" content="0.7">
|
|
<meta name="Forrest-skin-name" content="pelt">
|
|
<title>Solr tutorial</title>
|
|
<link type="text/css" href="skin/basic.css" rel="stylesheet">
|
|
<link media="screen" type="text/css" href="skin/screen.css" rel="stylesheet">
|
|
<link media="print" type="text/css" href="skin/print.css" rel="stylesheet">
|
|
<link type="text/css" href="skin/profile.css" rel="stylesheet">
|
|
<script src="skin/getBlank.js" language="javascript" type="text/javascript"></script><script src="skin/getMenu.js" language="javascript" type="text/javascript"></script><script src="skin/fontsize.js" language="javascript" type="text/javascript"></script>
|
|
<link rel="shortcut icon" href="images/favicon.ico">
|
|
</head>
|
|
<body onload="init()">
|
|
<script type="text/javascript">ndeSetTextSize();</script>
|
|
<div id="top">
|
|
<div class="breadtrail">
|
|
<a href="http://www.apache.org/">Apache</a> > <a href="http://lucene.apache.org/">Lucene</a> > <a href="http://incubator.apache.org/solr/">Solr</a><script src="skin/breadcrumbs.js" language="JavaScript" type="text/javascript"></script>
|
|
</div>
|
|
<div class="header">
|
|
<div class="grouplogo">
|
|
<a href="http://incubator.apache.org/"><img class="logoImage" alt="Apache Incubator" src="http://incubator.apache.org/images/apache-incubator-logo.png" title="Apache Incubator"></a>
|
|
</div>
|
|
<div class="projectlogo">
|
|
<a href="http://incubator.apache.org/solr/"><img class="logoImage" alt="Solr" src="images/solr.png" title="Solr Description"></a>
|
|
</div>
|
|
<div class="searchbox">
|
|
<form action="http://www.google.com/search" method="get" class="roundtopsmall">
|
|
<input value="incubator.apache.org" name="sitesearch" type="hidden"><input onFocus="getBlank (this, 'Search the site with google');" size="25" name="q" id="query" type="text" value="Search the site with google">
|
|
<input attr="value" name="Search" value="Search" type="submit">
|
|
</form>
|
|
</div>
|
|
<ul id="tabs">
|
|
<li class="current">
|
|
<a class="base-selected" href="index.html">Main</a>
|
|
</li>
|
|
<li>
|
|
<a class="base-not-selected" href="http://wiki.apache.org/solr">Wiki</a>
|
|
</li>
|
|
</ul>
|
|
</div>
|
|
</div>
|
|
<div id="main">
|
|
<div id="publishedStrip">
|
|
<div id="level2tabs"></div>
|
|
<script type="text/javascript"><!--
|
|
document.write("<text>Last Published:</text> " + document.lastModified);
|
|
// --></script>
|
|
</div>
|
|
<div class="breadtrail">
|
|
|
|
|
|
</div>
|
|
<div id="menu">
|
|
<div onclick="SwitchMenu('menu_1.1', 'skin/')" id="menu_1.1Title" class="menutitle">About</div>
|
|
<div id="menu_1.1" class="menuitemgroup">
|
|
<div class="menuitem">
|
|
<a href="index.html" title="Welcome to Solr">Welcome</a>
|
|
</div>
|
|
<div class="menuitem">
|
|
<a href="who.html" title="Solr Committers">Who We Are</a>
|
|
</div>
|
|
</div>
|
|
<div onclick="SwitchMenu('menu_selected_1.2', 'skin/')" id="menu_selected_1.2Title" class="menutitle" style="background-image: url('skin/images/chapter_open.gif');">Documentation</div>
|
|
<div id="menu_selected_1.2" class="selectedmenuitemgroup" style="display: block;">
|
|
<div class="menuitem">
|
|
<a href="features.html">Features</a>
|
|
</div>
|
|
<div class="menuitem">
|
|
<a href="http://wiki.apache.org/solr/FAQ">FAQ</a>
|
|
</div>
|
|
<div class="menuitem">
|
|
<a href="http://wiki.apache.org/solr/">Wiki</a>
|
|
</div>
|
|
<div class="menupage">
|
|
<div class="menupagetitle">Tutorial</div>
|
|
</div>
|
|
<div class="menuitem">
|
|
<a href="docs/api/">API Docs</a>
|
|
</div>
|
|
</div>
|
|
<div onclick="SwitchMenu('menu_1.3', 'skin/')" id="menu_1.3Title" class="menutitle">Resources</div>
|
|
<div id="menu_1.3" class="menuitemgroup">
|
|
<div class="menuitem">
|
|
<a href="http://cvs.apache.org/dist/lucene/solr/nightly/">Download</a>
|
|
</div>
|
|
<div class="menuitem">
|
|
<a href="mailing_lists.html">Mailing Lists</a>
|
|
</div>
|
|
<div class="menuitem">
|
|
<a href="issue_tracking.html">Issue Tracking</a>
|
|
</div>
|
|
<div class="menuitem">
|
|
<a href="version_control.html">Version Control</a>
|
|
</div>
|
|
</div>
|
|
<div onclick="SwitchMenu('menu_1.4', 'skin/')" id="menu_1.4Title" class="menutitle">Related Projects</div>
|
|
<div id="menu_1.4" class="menuitemgroup">
|
|
<div class="menuitem">
|
|
<a href="http://lucene.apache.org/java/">Lucene Java</a>
|
|
</div>
|
|
<div class="menuitem">
|
|
<a href="http://lucene.apache.org/nutch/">Nutch</a>
|
|
</div>
|
|
</div>
|
|
<div id="credit"></div>
|
|
<div id="roundbottom">
|
|
<img style="display: none" class="corner" height="15" width="15" alt="" src="skin/images/rc-b-l-15-1body-2menu-3menu.png"></div>
|
|
<div id="credit2"></div>
|
|
</div>
|
|
<div id="content">
|
|
<div title="Portable Document Format" class="pdflink">
|
|
<a class="dida" href="tutorial.pdf"><img alt="PDF -icon" src="skin/images/pdfdoc.gif" class="skin"><br>
|
|
PDF</a>
|
|
</div>
|
|
<h1>Solr tutorial</h1>
|
|
<div id="minitoc-area">
|
|
<ul class="minitoc">
|
|
<li>
|
|
<a href="#Overview">Overview</a>
|
|
</li>
|
|
<li>
|
|
<a href="#Requirements">Requirements</a>
|
|
</li>
|
|
<li>
|
|
<a href="#Getting+Started">Getting Started</a>
|
|
</li>
|
|
<li>
|
|
<a href="#Indexing+Data">Indexing Data</a>
|
|
</li>
|
|
<li>
|
|
<a href="#Updating+Data">Updating Data</a>
|
|
<ul class="minitoc">
|
|
<li>
|
|
<a href="#Deleting+Data">Deleting Data</a>
|
|
</li>
|
|
</ul>
|
|
</li>
|
|
<li>
|
|
<a href="#Querying+Data">Querying Data</a>
|
|
<ul class="minitoc">
|
|
<li>
|
|
<a href="#Sorting">Sorting</a>
|
|
</li>
|
|
</ul>
|
|
</li>
|
|
<li>
|
|
<a href="#Text+Analysis">Text Analysis</a>
|
|
<ul class="minitoc">
|
|
<li>
|
|
<a href="#Analysis+Debugging">Analysis Debugging</a>
|
|
</li>
|
|
</ul>
|
|
</li>
|
|
</ul>
|
|
</div>
|
|
|
|
|
|
<a name="N1000C"></a><a name="Overview"></a>
|
|
<h2 class="boxed">Overview</h2>
|
|
<div class="section">
|
|
<p>
|
|
This document covers the basics of running Solr using an example
|
|
schema, and some sample data.
|
|
</p>
|
|
</div>
|
|
|
|
|
|
<a name="N10016"></a><a name="Requirements"></a>
|
|
<h2 class="boxed">Requirements</h2>
|
|
<div class="section">
|
|
<p>
|
|
To follow along with this tutorial, you will need...
|
|
</p>
|
|
<ol>
|
|
|
|
<li>Java 1.5 or greater. You can get it from
|
|
<a href="http://java.sun.com/j2se/downloads.html">Sun</a>,
|
|
<a href="http://www-106.ibm.com/developerworks/java/jdk/">IBM</a>, or
|
|
<a href="http://www.bea.com/jrockit/">BEA</a>.
|
|
</li>
|
|
|
|
<li>A <a href="http://cvs.apache.org/dist/lucene/solr/nightly/">Solr release</a>.
|
|
</li>
|
|
|
|
<li>On Win32, <a href="http://www.cygwin.com/">cygwin</a>, for
|
|
shell support. (If you plan to use Subversion on Win32, be
|
|
sure to select the subversion package when you install, in the
|
|
"Devel" category.) This tutorial will assume that "<span class="codefrag">sh</span>"
|
|
is in your PATH, and that you have "curl" installed from the "Web" category.
|
|
</li>
|
|
|
|
<li>Firefox or Mozilla is the preferred browser for viewing the admin pages;
the current stylesheet doesn't look good in IE.
|
|
</li>
|
|
|
|
</ol>
|
|
</div>
|
|
|
|
|
|
<a name="N10046"></a><a name="Getting+Started"></a>
|
|
<h2 class="boxed">Getting Started</h2>
|
|
<div class="section">
|
|
<p>
|
|
Begin by unzipping the Solr release and changing your working directory
|
|
to the "<span class="codefrag">example</span>" directory.
|
|
</p>
|
|
<pre class="code">
|
|
chrish@asimov:~/tmp/solr$ ls
|
|
solr-1.0.zip
|
|
chrish@asimov:~/tmp/solr$ unzip -q solr-1.0.zip
|
|
chrish@asimov:~/tmp/solr$ cd solr-1.0/example/
|
|
</pre>
|
|
<p>
|
|
Solr can run in any Java Servlet Container of your choice, but to simplify
|
|
this tutorial, the example index includes a small installation of Jetty.
|
|
</p>
|
|
<p>
|
|
To launch Jetty with the Solr WAR and the example configs, just run <span class="codefrag">start.jar</span> ...
|
|
</p>
|
|
<pre class="code">
|
|
chrish@asimov:~/tmp/solr/solr-1.0/example$ java -jar start.jar
|
|
1 [main] INFO org.mortbay.log - Logging to org.slf4j.impl.SimpleLogger@1f436f5 via org.mortbay.log.Slf4jLog
|
|
334 [main] INFO org.mortbay.log - Extract jar:file:/home/chrish/tmp/solr/solr-1.0/example/webapps/solr.war!/ to /tmp/Jetty__solr/webapp
|
|
Feb 24, 2006 5:54:52 PM org.apache.solr.servlet.SolrServlet init
|
|
INFO: user.dir=/home/chrish/tmp/solr/solr-1.0/example
|
|
Feb 24, 2006 5:54:52 PM org.apache.solr.core.SolrConfig <clinit>
|
|
INFO: Loaded Config solrconfig.xml
|
|
|
|
...
|
|
|
|
1656 [main] INFO org.mortbay.log - Started SelectChannelConnector @ 0.0.0.0:8983
|
|
</pre>
|
|
<p>
|
|
This will start up the Jetty application server on port 8983, and use your terminal to display the logging information from Solr.
|
|
</p>
|
|
<p>
|
|
You can see that Solr is running by loading <a href="http://localhost:8983/solr/admin/">http://localhost:8983/solr/admin/</a> in your web browser. This is the main starting point for administering Solr.
|
|
</p>
|
|
</div>
|
|
|
|
|
|
|
|
|
|
<a name="N1006E"></a><a name="Indexing+Data"></a>
|
|
<h2 class="boxed">Indexing Data</h2>
|
|
<div class="section">
|
|
<p>
|
|
Your Solr server is up and running, but it doesn't contain any data. You can modify a Solr index by POSTing XML documents containing instructions to add (or update) documents, delete documents, commit pending adds and deletes, and optimize your index. The <span class="codefrag">exampledocs</span> directory contains samples of the types of instructions Solr expects, as well as a shell script for posting them using the command line utility "<span class="codefrag">curl</span>".
|
|
</p>
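<p>
As a rough illustration of what an "add" instruction looks like, the following Python sketch builds one for a single document. It assumes the usual Solr XML update format (an add element wrapping doc and field elements). The helper name <span class="codefrag">build_add_message</span> and the field values are made up for this example; the files in <span class="codefrag">exampledocs</span> remain the authoritative samples.
</p>

```python
import xml.etree.ElementTree as ET


def build_add_message(fields):
    """Return an <add> message holding a single document.

    `fields` is a list of (name, value) pairs. The names must match
    fields defined in the schema; the values used below are hypothetical.
    """
    add = ET.Element("add")
    doc = ET.SubElement(add, "doc")
    for name, value in fields:
        field = ET.SubElement(doc, "field", {"name": name})
        field.text = value  # ElementTree escapes &, < and > on output
    return ET.tostring(add, encoding="unicode")


message = build_add_message([("id", "EXAMPLE-1"), ("name", "Example document")])
print(message)
```

<p>
The resulting string could be saved to a file and posted in the same way as the sample files.
</p>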
|
|
<p>
|
|
Open a new Terminal window, enter the exampledocs directory, and run the "<span class="codefrag">post.sh</span>" script on some of the XML files in that directory...
|
|
</p>
|
|
<pre class="code">
|
|
chrish@asimov:~/tmp/solr/solr-1.0/example/exampledocs$ sh post.sh solr.xml
|
|
Posting file solr.xml to http://localhost:8983/solr/update
|
|
<result status="0"></result>
|
|
<result status="0"></result>
|
|
</pre>
|
|
<p>
|
|
You have now indexed one document about Solr, and committed that change. You can now search for "solr" using the "Make a Query" interface on the Admin screen, and you should get one result. Clicking the "Search" button should take you to the following URL...
|
|
</p>
|
|
<p>
|
|
|
|
<a href="http://localhost:8983/solr/select/?stylesheet=&q=solr&version=2.1&start=0&rows=10&indent=on">http://localhost:8983/solr/select/?stylesheet=&q=solr&version=2.1&start=0&rows=10&indent=on</a>
|
|
|
|
</p>
|
|
<p>
|
|
You can index all of the sample data, using the following command...
|
|
</p>
|
|
<pre class="code">
|
|
chrish@asimov:~/tmp/solr/solr-1.0/example/exampledocs$ sh post.sh *.xml
|
|
Posting file hd.xml to http://localhost:8983/solr/update
|
|
<result status="0"></result><result status="0"></result>
|
|
Posting file ipod_other.xml to http://localhost:8983/solr/update
|
|
<result status="0"></result><result status="0"></result>
|
|
Posting file ipod_video.xml to http://localhost:8983/solr/update
|
|
<result status="0"></result>
|
|
Posting file mem.xml to http://localhost:8983/solr/update
|
|
<result status="0"></result><result status="0"></result><result status="0"></result>
|
|
Posting file monitor.xml to http://localhost:8983/solr/update
|
|
<result status="0"></result>
|
|
Posting file monitor2.xml to http://localhost:8983/solr/update
|
|
<result status="0"></result>
|
|
Posting file mp500.xml to http://localhost:8983/solr/update
|
|
<result status="0"></result>
|
|
Posting file sd500.xml to http://localhost:8983/solr/update
|
|
<result status="0"></result>
|
|
Posting file solr.xml to http://localhost:8983/solr/update
|
|
<result status="0"></result>
|
|
Posting file vidcard.xml to http://localhost:8983/solr/update
|
|
<result status="0"></result><result status="0"></result>
|
|
<result status="0"></result>
|
|
</pre>
|
|
<p>
|
|
...and now you can search for all sorts of things using the default <a href="http://lucene.apache.org/java/docs/queryparsersyntax.html">Lucene QueryParser syntax</a>...
|
|
</p>
|
|
<ul>
|
|
|
|
<li>
|
|
<a href="http://localhost:8983/solr/select/?version=2.1&indent=on&q=video">video</a>
|
|
</li>
|
|
|
|
<li>
|
|
<a href="http://localhost:8983/solr/select/?version=2.1&indent=on&q=name:video">name:video</a>
|
|
</li>
|
|
|
|
<li>
|
|
<a href="http://localhost:8983/solr/select/?version=2.1&indent=on&q=%2Bvideo+%2Bprice%3A[*+TO+400]">+video +price:[* TO 400]</a>
|
|
</li>
|
|
|
|
|
|
</ul>
|
|
</div>
|
|
|
|
|
|
|
|
|
|
<a name="N100B2"></a><a name="Updating+Data"></a>
|
|
<h2 class="boxed">Updating Data</h2>
|
|
<div class="section">
|
|
<p>
|
|
You may have noticed that even though the file <span class="codefrag">solr.xml</span> has now
|
|
been POSTed to the server twice, you still only get 1 result when searching for
|
|
"solr". This is because the example schema.xml specifies a "uniqueKey" field
|
|
called "<span class="codefrag">id</span>". Whenever you POST instructions to Solr to add a
|
|
document with the same value for the uniqueKey as an existing document, it
|
|
automatically replaces it for you. You can see that this has happened by
|
|
looking at the values for <span class="codefrag">numDocs</span> and <span class="codefrag">maxDoc</span> in the
|
|
"CORE" section of the statistics page... </p>
|
|
<p>
|
|
|
|
<a href="http://localhost:8983/solr/admin/stats.jsp">http://localhost:8983/solr/admin/stats.jsp</a>
|
|
|
|
</p>
|
|
<p>
|
|
numDocs should be 15, but maxDoc may be larger (the maxDoc count includes logically deleted documents that have not yet been removed from the index). You can re-post the sample XML
|
|
files over and over again as much as you want and numDocs will never increase,
|
|
because the new documents will constantly be replacing the old.
|
|
</p>
|
|
<p>
|
|
Go ahead and edit the existing XML files to change some of the data, then re-run the post.sh command; you'll see your changes reflected in subsequent searches.
|
|
</p>
|
|
<a name="N100D4"></a><a name="Deleting+Data"></a>
|
|
<h3 class="boxed">Deleting Data</h3>
|
|
<p>You can delete data by POSTing a delete command to the update URL and specifying the value
|
|
of the document's unique key field, or a query that matches multiple documents. Since these commands
|
|
are smaller, we will specify them right on the command line rather than reference an XML file.
|
|
</p>
|
|
<p>Execute the following command to delete a document</p>
|
|
<pre class="code">curl http://localhost:8983/solr/update --data-binary '<delete><id>SP2514N</id></delete>'</pre>
|
|
<p>Now go to the <a href="http://localhost:8983/solr/admin/stats.jsp">statistics</a> page, scroll down
to the UPDATE_HANDLERS section, and verify that "<span class="codefrag">deletesPending : 1</span>" is shown.</p>
|
|
<p>If you search for <a href="http://localhost:8983/solr/select?q=id:SP2514N">id:SP2514N</a> it will still be found,
|
|
because index changes are not visible until changes are flushed to disk, and a new searcher is opened. To cause
|
|
this to happen, send the following commit command to Solr:</p>
|
|
<pre class="code">curl http://localhost:8983/solr/update --data-binary '<commit/>'</pre>
|
|
<p>Now re-execute the previous search and verify that no matching documents are found. Also revisit the
|
|
statistics page and observe the changes in both the UPDATE_HANDLERS section and the CORE section.</p>
|
|
<p>Here is an example of using delete-by-query to delete anything with
|
|
<a href="http://localhost:8983/solr/select?q=name:DDR&fl=name">DDR</a> in the name:</p>
|
|
<pre class="code">curl http://localhost:8983/solr/update --data-binary '<delete><query>name:DDR</query></delete>'
|
|
curl http://localhost:8983/solr/update --data-binary '<commit/>'
|
|
</pre>
|
|
<p>Commit can be a very expensive operation, so it's best to make many changes to an index in a batch and
|
|
then send the commit command at the end. There is also an optimize command that does the same thing as commit,
|
|
in addition to merging all index segments into a single segment, making it faster to search and causing any
|
|
deleted documents to be removed. All of the update commands are documented <a href="http://wiki.apache.org/solr/UpdateXmlMessages">here</a>.
|
|
</p>
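<p>
The batching advice above can be sketched in Python as follows: build all of the delete messages first, then append a single commit at the end. The helper names are hypothetical, and the sketch only constructs the message bodies; sending each one to the update URL (for example with curl, as shown earlier) is left out.
</p>

```python
from xml.sax.saxutils import escape


def delete_by_id(doc_id):
    """Build a delete message for one unique key value."""
    return "<delete><id>%s</id></delete>" % escape(doc_id)


def delete_by_query(query):
    """Build a delete message for every document matching a query."""
    return "<delete><query>%s</query></delete>" % escape(query)


def batch_with_commit(messages):
    """Return the messages in order, followed by exactly one commit."""
    return list(messages) + ["<commit/>"]


batch = batch_with_commit([delete_by_id("SP2514N"), delete_by_query("name:DDR")])
for body in batch:
    print(body)  # each body would be POSTed separately to the update URL
```

<p>
Escaping the id and query text keeps the message well-formed even if a value contains characters such as an ampersand or an angle bracket.
</p>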
|
|
<p>To continue with the tutorial, re-add any documents you may have deleted by going to the <span class="codefrag">exampledocs</span> directory and executing</p>
|
|
<pre class="code">sh post.sh *.xml</pre>
|
|
</div>
|
|
|
|
|
|
<a name="N1011A"></a><a name="Querying+Data"></a>
|
|
<h2 class="boxed">Querying Data</h2>
|
|
<div class="section">
|
|
<p>
|
|
Searches are done via HTTP GET on the select URL with the query string in the q parameter.
|
|
You can pass a number of optional <a href="http://wiki.apache.org/solr/StandardRequestHandler">request parameters</a>
|
|
to the request handler to control what information is returned. For example, you can use the "fl" parameter
|
|
to control which stored fields are returned, and whether the relevancy score is returned...
|
|
</p>
|
|
<ul>
|
|
|
|
<li>
|
|
<a href="http://localhost:8983/solr/select/?indent=on&q=video&fl=name,id">q=video&fl=name,id</a> (return only name and id fields) </li>
|
|
|
|
<li>
|
|
<a href="http://localhost:8983/solr/select/?indent=on&q=video&fl=name,id,score">q=video&fl=name,id,score</a> (return relevancy score as well) </li>
|
|
|
|
<li>
|
|
<a href="http://localhost:8983/solr/select/?indent=on&q=video&fl=*,score">q=video&fl=*,score</a> (return all stored fields, as well as relevancy score) </li>
|
|
|
|
<li>
|
|
<a href="http://localhost:8983/solr/select/?indent=on&q=video;price+desc&fl=name,id">q=video;price desc&fl=name,id</a> (add sort specification: sort by price descending) </li>
|
|
|
|
</ul>
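<p>
For readers who prefer to build such request URLs from a program, here is a minimal Python sketch. It assumes the example server address used throughout this tutorial; the function name <span class="codefrag">build_select_url</span> is hypothetical and only the "q", "fl" and "indent" parameters shown above are handled.
</p>

```python
from urllib.parse import urlencode

# Select URL of the example server used in this tutorial.
SELECT_URL = "http://localhost:8983/solr/select/"


def build_select_url(query, fields=None, indent=True):
    """Build a select URL with the query in `q` and an optional `fl` list."""
    params = []
    if indent:
        params.append(("indent", "on"))
    params.append(("q", query))
    if fields:
        params.append(("fl", ",".join(fields)))
    # Keep "," and "*" readable, as in the links above.
    return SELECT_URL + "?" + urlencode(params, safe=",*")


url = build_select_url("video", fields=["name", "id", "score"])
print(url)
```

<p>
Passing <span class="codefrag">fields=["*", "score"]</span> would request all stored fields together with the relevancy score.
</p>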
|
|
<p>
|
|
Solr provides a <a href="http://localhost:8983/solr/admin/form.jsp">query form</a> within the web admin interface
|
|
that allows setting the various request parameters and is useful when trying out or debugging queries.
|
|
</p>
|
|
<a name="N10149"></a><a name="Sorting"></a>
|
|
<h3 class="boxed">Sorting</h3>
|
|
<p>
|
|
Solr provides a simple extension to the Lucene QueryParser syntax for specifying sort options. After your search, add a semicolon followed by a comma-separated list of "field direction" pairs...
|
|
</p>
|
|
<ul>
|
|
|
|
<li>
|
|
<a href="http://localhost:8983/solr/select/?indent=on&q=video;price+desc">video; price desc</a>
|
|
</li>
|
|
|
|
<li>
|
|
<a href="http://localhost:8983/solr/select/?indent=on&q=video;price+asc">video; price asc</a>
|
|
</li>
|
|
|
|
<li>
|
|
<a href="http://localhost:8983/solr/select/?indent=on&q=video;inStock+asc,price+desc">video; inStock asc, price desc</a>
|
|
</li>
|
|
|
|
</ul>
|
|
<p>
|
|
"score" can also be used as a field name when specifying a sort...
|
|
</p>
|
|
<ul>
|
|
|
|
<li>
|
|
<a href="http://localhost:8983/solr/select/?indent=on&q=video;score+desc">video; score desc</a>
|
|
</li>
|
|
|
|
<li>
|
|
<a href="http://localhost:8983/solr/select/?indent=on&q=video;inStock+asc,score+desc">video; inStock asc, score desc</a>
|
|
</li>
|
|
|
|
</ul>
|
|
<p>
|
|
If no sort is specified, the default is <span class="codefrag">score desc</span>, the same as in the Lucene search APIs.
|
|
</p>
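<p>
The sort syntax described in this section can be assembled programmatically. The Python sketch below is an illustration only: <span class="codefrag">with_sort</span> is a hypothetical helper, and the encoded form percent-encodes the semicolon and comma, which is assumed to be equivalent once the server decodes the query string.
</p>

```python
from urllib.parse import parse_qs, urlencode


def with_sort(query, sort_pairs):
    """Append a ";field direction,field direction" sort spec to a query."""
    parts = []
    for field, direction in sort_pairs:
        if direction not in ("asc", "desc"):
            raise ValueError("direction must be 'asc' or 'desc'")
        parts.append("%s %s" % (field, direction))
    if not parts:
        return query  # no sort given: the default score ordering applies
    return query + ";" + ",".join(parts)


q = with_sort("video", [("inStock", "asc"), ("price", "desc")])
encoded = urlencode({"indent": "on", "q": q})
print(q)
print(encoded)
print(parse_qs(encoded)["q"][0] == q)  # decoding recovers the original q
```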
|
|
</div>
|
|
|
|
|
|
<a name="N1017C"></a><a name="Text+Analysis"></a>
|
|
<h2 class="boxed">Text Analysis</h2>
|
|
<div class="section">
|
|
<p>
|
|
Text fields are typically indexed by breaking the field into words and applying various transformations such as
|
|
lowercasing, removing plurals, or stemming to increase relevancy. The same text transformations are normally
|
|
applied to any queries in order to match what is indexed.
|
|
</p>
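<p>
The idea of applying the same transformations at index time and at query time can be shown with a deliberately simplified Python sketch. This is a toy analyzer written for this tutorial, not the filter chain Solr actually uses: it splits on non-alphanumeric characters, lowercases, and strips a trailing "s" as a crude stand-in for stemming.
</p>

```python
import re


def toy_analyze(text):
    """Split on non-alphanumerics, lowercase, and strip a trailing "s"."""
    tokens = []
    for word in re.split(r"[^A-Za-z0-9]+", text):
        if not word:
            continue
        token = word.lower()
        if len(token) > 3 and token.endswith("s"):
            token = token[:-1]  # naive plural removal, not real stemming
        tokens.append(token)
    return tokens


# The same function is applied to the indexed value and to the query.
indexed = toy_analyze("Color Printers")
query = toy_analyze("printer")
print(indexed, query)
print(any(token in indexed for token in query))
```

<p>
Because both sides pass through the same function, a query for the singular form can match a value that was indexed in the plural.
</p>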
|
|
<p>Example queries demonstrating relevancy improving transformations:</p>
|
|
<ul>
|
|
|
|
<li>A search for
|
|
<a href="http://localhost:8983/solr/select/?indent=on&q=power-shot&fl=name">power-shot</a>
|
|
matches <span class="codefrag">PowerShot</span>, and
|
|
<a href="http://localhost:8983/solr/select/?indent=on&q=adata&fl=name">adata</a>
|
|
matches <span class="codefrag">A-DATA</span> due to the use of WordDelimiterFilter and LowerCaseFilter.
|
|
</li>
|
|
|
|
|
|
<li>A search for
|
|
<a href="http://localhost:8983/solr/select/?indent=on&q=name:printers&fl=name">name:printers</a>
|
|
matches <span class="codefrag">Printer</span>, and
|
|
<a href="http://localhost:8983/solr/select/?indent=on&q=features:recharging&fl=name,features">features:recharging</a>
|
|
matches <span class="codefrag">Rechargeable</span> due to stemming with the EnglishPorterFilter.
|
|
</li>
|
|
|
|
|
|
<li>A search for
|
|
<a href="http://localhost:8983/solr/select/?indent=on&q=%221+gigabyte%22&fl=name">"1 gigabyte"</a>
|
|
matches things with <span class="codefrag">GB</span>, and
|
|
<a href="http://localhost:8983/solr/select/?indent=on&q=pixima&fl=name">pixima</a>
|
|
matches <span class="codefrag">Pixma</span> due to use of a SynonymFilter.
|
|
</li>
|
|
|
|
|
|
</ul>
|
|
<p>
|
|
The <a href="http://wiki.apache.org/solr/SchemaXml">schema</a> defines
|
|
the fields in the index and what type of analysis is applied to them. The current schema your server is using
|
|
may be accessed via the <span class="codefrag">[SCHEMA]</span> link on the <a href="http://localhost:8983/solr/admin/">admin</a> page.
|
|
</p>
|
|
<p>A full description of the analysis components, Analyzers, Tokenizers, and TokenFilters
|
|
available for use is <a href="http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters">here</a>.
|
|
</p>
|
|
<a name="N101D3"></a><a name="Analysis+Debugging"></a>
|
|
<h3 class="boxed">Analysis Debugging</h3>
|
|
<p>There is a handy <a href="http://localhost:8983/solr/admin/analysis.jsp">analysis</a>
|
|
debugging page where you can see how a text value is broken down into words,
|
|
and see the resulting tokens after they pass through each filter in the chain.
|
|
</p>
|
|
<p>
|
|
|
|
<a href="http://localhost:8983/solr/admin/analysis.jsp?name=name&val=Canon+PowerShot+SD500">This</a>
|
|
shows how "<span class="codefrag">Canon PowerShot SD500</span>" would be indexed as a value in the name field. Each row of
|
|
the table shows the resulting tokens after having passed through the next TokenFilter in the Analyzer for the <span class="codefrag">name</span> field.
|
|
Notice how both <span class="codefrag">powershot</span> and <span class="codefrag">power</span>, <span class="codefrag">shot</span> are indexed. Tokens generated at the same position
|
|
are shown in the same column, in this case <span class="codefrag">shot</span> and <span class="codefrag">powershot</span>.
|
|
</p>
|
|
<p>Selecting <a href="http://localhost:8983/solr/admin/analysis.jsp?name=name&verbose=on&val=Canon+PowerShot+SD500">verbose output</a>
|
|
will show more details, such as the name of each analyzer component in the chain, token positions, and the start and end positions
|
|
of the token in the original text.
|
|
</p>
|
|
<p>Selecting <a href="http://localhost:8983/solr/admin/analysis.jsp?name=name&highlight=on&val=Canon+PowerShot+SD500&qval=power-shot">highlight matches</a>
|
|
when both index and query values are provided will take the resulting terms from the query value and highlight
|
|
all matches in the index value analysis.
|
|
</p>
|
|
<p>
|
|
<a href="http://localhost:8983/solr/admin/analysis.jsp?name=text&highlight=on&val=Four+score+and+seven+years+ago+our+fathers+brought+forth+on+this+continent+a+new+nation%2C+conceived+in+liberty+and+dedicated+to+the+proposition+that+all+men+are+created+equal.+&qval=liberties+and+equality">Here</a>
|
|
is an example of stemming and stop-words at work.
|
|
</p>
|
|
</div>
|
|
|
|
|
|
</div>
|
|
<div class="clearboth"> </div>
|
|
</div>
|
|
<div id="footer">
|
|
<div class="lastmodified">
|
|
<script type="text/javascript"><!--
|
|
document.write("<text>Last Published:</text> " + document.lastModified);
|
|
// --></script>
|
|
</div>
|
|
<div class="copyright">
|
|
Copyright ©
|
|
2006 <a href="http://www.apache.org/licenses/">The Apache Software Foundation.</a>
|
|
</div>
|
|
</div>
|
|
</body>
|
|
</html>
|