Apache Solr - DataImportHandler Version 1.3-dev
Release Notes

Introduction
------------
DataImportHandler is a data import tool for Solr which makes importing data from databases, XML files and
HTTP data sources quick and easy.


$Id$

================== Release 1.4-dev ==================
Upgrading from Solr 1.3
-----------------------

The Evaluator API has been changed in a way that is not backward compatible. Users who have developed custom
Evaluators will need to update their code to the new API for it to work. See SOLR-996 for details.

Detailed Change List
----------------------

New Features
----------------------
1. SOLR-768: Set last_index_time variable in full-import command.
   (Wojtek Piaseczny, Noble Paul via shalin)

2. SOLR-811: Allow a "deltaImportQuery" attribute in SqlEntityProcessor which is used for delta imports
   instead of DataImportHandler manipulating the SQL itself.
   (Noble Paul via shalin)

3. SOLR-842: Better error handling in DataImportHandler with options to abort, skip and continue imports.
   (Noble Paul, shalin)

4. SOLR-833: A DataSource to read data from a field as a reader. This can be used, for example, to read XML
   documents stored as CLOBs or BLOBs in databases.
   (Noble Paul via shalin)

5. SOLR-887: A Transformer to strip HTML tags.
   (Ahmed Hammad via shalin)

6. SOLR-886: DataImportHandler should roll back when an import fails or is aborted.
   (shalin)

7. SOLR-891: A Transformer to read strings from Clob type.
   (Noble Paul via shalin)

8. SOLR-812: Configurable JDBC settings in JdbcDataSource including optimized defaults for read-only mode.
   (David Smiley, Glen Newton, shalin)

9. SOLR-910: Add a few utility commands to the DIH admin page such as full import, delta import, status, reload config.
   (Ahmed Hammad via shalin)

10. SOLR-938: Add event listener API for import start and end.
    (Kay Kay, Noble Paul via shalin)

11. SOLR-801: Add support for configurable pre-import and post-import delete query per root-entity.
    (Noble Paul via shalin)

12. SOLR-988: Add a new scope for session data stored in Context to store objects across imports.
    (Noble Paul via shalin)

13. SOLR-980: A PlainTextEntityProcessor which can read from any DataSource<Reader> and output a String.
    (Nathan Adams, Noble Paul via shalin)

14. SOLR-1003: XPathEntityProcessor must allow slurping all text from a given XML node and its children.
    (Noble Paul via shalin)

15. SOLR-1001: Allow variables in various attributes of RegexTransformer, HTMLStripTransformer
    and NumberFormatTransformer.
    (Fergus McMenemie, Noble Paul, shalin)

16. SOLR-989: Expose running statistics from the Context API.
    (Noble Paul, shalin)

17. SOLR-996: Expose Context to Evaluators.
    (Noble Paul, shalin)

18. SOLR-783: Enhance delta-imports by maintaining separate last_index_time for each entity.
    (Jon Baer, Noble Paul via shalin)

Optimizations
----------------------
1. SOLR-846: Reduce memory consumption during delta import by removing keys when used.
   (Ricky Leung, Noble Paul via shalin)

2. SOLR-974: DataImportHandler skips commit if no data has been updated.
   (Wojtek Piaseczny, shalin)

3. SOLR-1004: Check for abort more frequently during delta-imports.
   (Marc Sturlese, shalin)

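As a rough illustration of the idea behind item 1 above (SOLR-846), the Java sketch below removes each key from the
pending collection as soon as it has been used, so the collection shrinks while the import runs instead of staying
fully populated until the end. This is not the DataImportHandler code: the class name, the method, the callback and
the use of plain String keys are all assumptions made for the example.

```java
import java.util.Iterator;
import java.util.Set;
import java.util.function.Consumer;

// Hypothetical helper, not part of DataImportHandler.
final class DeltaKeyDrain {
    // Hands every pending key to the importer callback and removes the key
    // from the set right after it has been used. Returns the number of keys handled.
    static int drain(Set<String> pendingKeys, Consumer<String> importer) {
        int handled = 0;
        Iterator<String> it = pendingKeys.iterator();
        while (it.hasNext()) {
            String key = it.next();
            importer.accept(key); // stand-in for importing the row behind this key
            it.remove();          // Iterator.remove() is the safe way to delete while iterating
            handled++;
        }
        return handled;
    }
}
```

Removing through the iterator rather than through the set itself avoids a ConcurrentModificationException while the
keys are being walked.
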
Bug Fixes
----------------------
1. SOLR-800: Deep copy collections to avoid ConcurrentModificationException in XPathEntityProcessor while streaming.
   (Kyle Morrison, Noble Paul via shalin)

2. SOLR-823: Request parameter variables ${dataimporter.request.xxx} are not resolved.
   (Mck SembWever, Noble Paul, shalin)

3. SOLR-728: Add synchronization to avoid race condition of multiple imports working concurrently.
   (Walter Ferrara, shalin)

4. SOLR-742: Add ability to create dynamic fields with custom DataImportHandler transformers.
   (Wojtek Piaseczny, Noble Paul, shalin)

5. SOLR-832: Rows parameter is not honored in non-debug mode and can abort a running import in debug mode.
   (Akshay Ukey, shalin)

6. SOLR-838: The VariableResolver obtained from a DataSource's context does not have current data.
   (Noble Paul via shalin)

7. SOLR-864: DataImportHandler does not catch and log Errors. (shalin)

8. SOLR-873: Fix case-sensitive field names and columns. (Jon Baer, shalin)

9. SOLR-893: Unable to delete documents via SQL and deletedPkQuery with delta-import.
   (Dan Rosher via shalin)

10. SOLR-888: DateFormatTransformer cannot convert non-string type.
    (Amit Nithian via shalin)

11. SOLR-841: DataImportHandler should throw exception if a field does not have column attribute.
    (Michael Henson, shalin)

12. SOLR-884: CachedSqlEntityProcessor should check if the cache key is present in the query results.
    (Noble Paul via shalin)

13. SOLR-985: Fix thread-safety issue with TemplateString for concurrent imports with multiple cores.
    (Ryuuichi Kumai via shalin)

14. SOLR-999: XPathRecordReader fails on XMLs with nodes mixed with CDATA content.
    (Fergus McMenemie, Noble Paul via shalin)

15. SOLR-1000: FileListEntityProcessor should not apply fileName filter to directory names.
    (Fergus McMenemie via shalin)

16. SOLR-1009: Repeated column names result in duplicate values.
    (Fergus McMenemie, Noble Paul via shalin)

17. SOLR-1017: Fix thread-safety issue with last_index_time for concurrent imports in multiple cores due to unsafe usage
    of SimpleDateFormat by multiple threads.
    (Ryuuichi Kumai via shalin)

18. SOLR-1024: Calling abort on DataImportHandler import commits data instead of calling rollback.
    (shalin)

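To illustrate the class of bug described in item 17 above (SOLR-1017): java.text.SimpleDateFormat keeps mutable
state while formatting and parsing, so a single instance must not be shared between threads without synchronization.
The sketch below shows one common remedy, a per-thread formatter held in a ThreadLocal. It is not the
DataImportHandler fix itself; the class name, the method names and the date pattern are assumptions made for the
example.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;

// Hypothetical helper, not part of DataImportHandler.
final class LastIndexTimeFormat {
    // Pattern chosen for this example only.
    private static final String PATTERN = "yyyy-MM-dd HH:mm:ss";

    // Each thread lazily gets its own SimpleDateFormat, so no instance is ever shared.
    private static final ThreadLocal<SimpleDateFormat> FORMAT =
        ThreadLocal.withInitial(() -> new SimpleDateFormat(PATTERN));

    static String format(Date date) {
        return FORMAT.get().format(date);
    }

    static Date parse(String text) {
        try {
            return FORMAT.get().parse(text);
        } catch (ParseException e) {
            // Rethrown unchecked to keep the example's call sites short.
            throw new IllegalArgumentException("Not a valid timestamp: " + text, e);
        }
    }
}
```

Synchronizing on a single shared formatter would also be correct; the per-thread variant simply avoids lock
contention when several imports run at once.
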
Documentation
----------------------

Other
----------------------
1. SOLR-782: Refactored SolrWriter to make it a concrete class and removed wrappers over SolrInputDocument.
   Refactored to load Evaluators lazily. Removed multiple document nodes in the configuration XML.
   Removed support for 'default' variables; they are automatically available as request parameters.
   (Noble Paul via shalin)

2. SOLR-964: XPathEntityProcessor now ignores DTD validations.
   (Fergus McMenemie, Noble Paul via shalin)

================== Release 1.3.0 20080915 ==================

Status
------
This is the first release since DataImportHandler was added to the Solr contrib distribution.
The changes listed below are relative to when the code was introduced, not to
the first official release.


Detailed Change List
--------------------

New Features
1. SOLR-700: Allow configurable locales through a locale attribute in fields for NumberFormatTransformer.
   (Stefan Oestreicher, shalin)

Changes in runtime behavior

Bug Fixes
1. SOLR-704: NumberFormatTransformer can silently ignore part of the string while parsing. Now it tries to
   use the complete string for parsing. Failure to do so will result in an exception.
   (Stefan Oestreicher via shalin)

2. SOLR-729: Context.getDataSource(String) gives current entity's DataSource instance regardless of argument.
   (Noble Paul, shalin)

3. SOLR-726: JDBC drivers and DataSources fail to load if placed in multicore sharedLib or core's lib directory.
   (Walter Ferrara, Noble Paul, shalin)

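The behavior described in item 1 above (SOLR-704) follows from how java.text.NumberFormat works: parse(String)
stops at the first character it cannot use and returns whatever it has read so far, so "12abc" quietly becomes 12.
The sketch below shows one way to require that the complete string is consumed, using a ParsePosition. It is an
illustration only, not the NumberFormatTransformer code; the class and method names are assumptions made for the
example.

```java
import java.text.NumberFormat;
import java.text.ParsePosition;
import java.util.Locale;

// Hypothetical helper, not part of DataImportHandler.
final class StrictNumberParser {
    // Parses value with the number format of the given locale and fails
    // unless every character of the input was consumed.
    static Number parseStrict(String value, Locale locale) {
        NumberFormat format = NumberFormat.getNumberInstance(locale);
        ParsePosition position = new ParsePosition(0);
        Number parsed = format.parse(value, position);
        // parsed is null when nothing could be read; the index falls short
        // of the length when trailing characters were left over.
        if (parsed == null || position.getIndex() != value.length()) {
            throw new IllegalArgumentException("Could not parse the complete string: " + value);
        }
        return parsed;
    }
}
```

For example, parseStrict("1,234.5", Locale.US) and parseStrict("1.234,5", Locale.GERMANY) both yield 1234.5, while
parseStrict("12abc", Locale.US) throws instead of returning 12.
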
Other Changes