LUCENE-4594: PrefixTreeStrategy should not index center points

git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1419630 13f79535-47bb-0310-9956-ffa450edef68
This commit is contained in:
David Wayne Smiley 2012-12-10 18:25:42 +00:00
parent ba9560fd74
commit 739c9ef807
2 changed files with 18 additions and 13 deletions

View File

@ -80,6 +80,11 @@ Changes in backwards compatibility policy
can use OrdinalPolicy.NO_PARENTS to never write any parent category ordinal
to the fulltree posting payload (but note that you need a special
FacetsAccumulator - see javadocs). (Shai Erera)
* LUCENE-4594: Spatial PrefixTreeStrategy no longer indexes center points of
non-point shapes. If you want to call makeDistanceValueSource() based on
shape centers, you need to do this yourself in another spatial field.
(David Smiley)
New Features
@ -246,6 +251,13 @@ Bug Fixes
* LUCENE-4596: fix a concurrency bug in DirectoryTaxonomyWriter.
(Shai Erera)
* LUCENE-4594: Spatial PrefixTreeStrategy would index center-points in addition
to the shape to index if it was non-point, in the same field. But sometimes
the center-point isn't actually in the shape (consider a LineString), and for
highly precise shapes it could cause makeDistanceValueSource's cache to load
parts of the shape's boundary erroneously too. So center points aren't
indexed any more; you should use another spatial field. (David Smiley)
Changes in Runtime Behavior
* LUCENE-4586: Change default ResultMode of FacetRequest to PER_NODE_IN_TREE.

View File

@ -45,7 +45,7 @@ import java.util.concurrent.ConcurrentHashMap;
* <h4>Characteristics:</h4>
* <ul>
* <li>Can index any shape; however only {@link RecursivePrefixTreeStrategy}
* can effectively search non-point shapes. <em>Not tested.</em></li>
* can effectively search non-point shapes.</li>
* <li>Can index a variable number of shapes per field value. This strategy
* can do it via multiple calls to {@link #createIndexableFields(com.spatial4j.core.shape.Shape)}
* for a document or by giving it some sort of Shape aggregate (e.g. JTS
@ -57,8 +57,9 @@ import java.util.concurrent.ConcurrentHashMap;
* is supported. If only points are indexed then this is effectively equivalent
* to IsWithin.</li>
* <li>The strategy supports {@link #makeDistanceValueSource(com.spatial4j.core.shape.Point)}
* even for multi-valued data. However, <em>it will likely be removed in the
* future</em> in lieu of using another strategy with a more scalable
* even for multi-valued data, so long as the indexed data is all points; the
* behavior is undefined otherwise. However, <em>it will likely be removed in
* the future</em> in lieu of using another strategy with a more scalable
* implementation. Use of this call is the only
* circumstance in which a cache is used. The cache is simple but as such
* it doesn't scale to large numbers of points nor is it real-time-search
@ -123,20 +124,12 @@ public abstract class PrefixTreeStrategy extends SpatialStrategy {
public Field[] createIndexableFields(Shape shape, double distErr) {
int detailLevel = grid.getLevelForDistance(distErr);
List<Node> cells = grid.getNodes(shape, detailLevel, true);//true=intermediates cells
//If shape isn't a point, add a full-resolution center-point so that
// PointPrefixTreeFieldCacheProvider has the center-points.
//TODO index each point of a multi-point or other aggregate.
//TODO remove this once support for a distance ValueSource is removed.
if (!(shape instanceof Point)) {
Point ctr = shape.getCenter();
//TODO should be smarter; don't index 2 tokens for this in CellTokenStream. Harmless though.
cells.add(grid.getNodes(ctr,grid.getMaxLevels(),false).get(0));
}
//TODO is CellTokenStream supposed to be re-used somehow? see Uwe's comments:
// http://code.google.com/p/lucene-spatial-playground/issues/detail?id=4
Field field = new Field(getFieldName(), new CellTokenStream(cells.iterator()), FIELD_TYPE);
Field field = new Field(getFieldName(),
new CellTokenStream(cells.iterator()), FIELD_TYPE);
return new Field[]{field};
}