Apache Solr Release Notes Introduction ------------ Apache Solr is an open source enterprise search server based on the Apache Lucene Java search library, with XML/HTTP and JSON APIs, hit highlighting, faceted search, caching, replication, and a web administration interface. See http://lucene.apache.org/solr for more information. Getting Started --------------- You need a Java 1.8 VM or later installed. In this release, there is an example Solr server including a bundled servlet container in the directory named "example". See the Quick Start guide at http://lucene.apache.org/solr/quickstart.html ================== 7.0.0 ================== (No Changes) ================== 6.0.0 ================== Consult the LUCENE_CHANGES.txt file for additional, low level, changes in this release Versions of Major Components --------------------- Apache Tika 1.7 Carrot2 3.12.0 Velocity 1.7 and Velocity Tools 2.0 Apache UIMA 2.3.1 Apache ZooKeeper 3.4.6 Jetty 9.3.6.v20151106 System Requirements ---------------------- * LUCENE-5950: Move to Java 8 as minimum Java version. (Ryan Ernst, Uwe Schindler) Upgrading from Solr 5.x ---------------------- * The deprecated SolrServer and subclasses have been removed, use SolrClient instead. * The deprecated configuration in solrconfig.xml has been removed. Please remove it from solrconfig.xml. * SolrClient.shutdown() has been removed, use SolrClient.close() instead. * The deprecated zkCredientialsProvider element in solrcloud section of solr.xml is now removed. Use the correct spelling (zkCredentialsProvider) instead. * SOLR-7957: internal/expert - ResultContext was significantly changed and expanded to allow for multiple full query results (DocLists) per Solr request. TransformContext was rendered redundant and was removed. (yonik) * Several changes have been made regarding the "Similarity" used in Solr, in order to provide better default behavior for new users. 
There are 3 key impacts of these changes on existing users who upgrade:

    * DefaultSimilarityFactory has been removed. If you currently have DefaultSimilarityFactory explicitly referenced in your schema.xml, edit your config to use the functionally identical ClassicSimilarityFactory. See SOLR-8239 for more details.

    * The implicit default Similarity used when no Similarity is configured in schema.xml has been changed to SchemaSimilarityFactory. Users who wish to preserve back-compatible behavior should either explicitly configure ClassicSimilarityFactory, or ensure that the luceneMatchVersion for the collection is less than 6.0. See SOLR-8270 + SOLR-8271 for details.

    * SchemaSimilarityFactory has been modified to use BM25Similarity as the default for fieldTypes that do not explicitly declare a Similarity. The legacy behavior of using ClassicSimilarity as the default will occur if the luceneMatchVersion for the collection is less than 6.0, or the 'defaultSimFromFieldType' configuration option may be used to specify any default of your choosing. See SOLR-8261 + SOLR-8329 for more details.

* If your solrconfig.xml file doesn't explicitly mention the schemaFactory to use then Solr will choose the ManagedIndexSchemaFactory by default. Previously it would have chosen ClassicIndexSchemaFactory. This means that the Schema APIs ( //schema ) are enabled and the schema is mutable. When Solr starts your schema.xml file will be renamed to managed-schema. If you want to retain the old behaviour then please ensure that the solrconfig.xml explicitly uses the ClassicIndexSchemaFactory, or that the luceneMatchVersion in the solrconfig.xml is less than 6.0.

* SolrIndexSearcher.QueryCommand and QueryResult were moved to their own classes. If you reference them in your code, you should import them under o.a.s.search (or use your IDE's "Organize Imports").
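
As an illustration of the Similarity and schemaFactory notes above, the following fragments sketch how an upgraded 5.x configuration might opt back into the older defaults. This is a minimal sketch, not taken from the release notes themselves: the fieldType name "text_classic" is hypothetical, and which of these settings you need depends on your existing configuration.

```xml
<!-- solrconfig.xml: keep using a hand-edited schema.xml rather than the
     managed schema that 6.0 selects by default -->
<schemaFactory class="ClassicIndexSchemaFactory"/>

<!-- schema.xml, option 1: declare the pre-6.0 scoring explicitly as the
     global similarity -->
<similarity class="solr.ClassicSimilarityFactory"/>

<!-- schema.xml, option 2 (use instead of option 1, not together with it):
     keep SchemaSimilarityFactory but take the default from a fieldType.
     "text_classic" is a hypothetical fieldType that declares its own
     <similarity> element. -->
<similarity class="solr.SchemaSimilarityFactory">
  <str name="defaultSimFromFieldType">text_classic</str>
</similarity>
```

Setting luceneMatchVersion below 6.0 is the alternative the notes mention; the explicit declarations above have the advantage of staying in effect if luceneMatchVersion is raised later.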
* SOLR-8698: 'useParams' attribute specified in request handler cannot be overridden from request params * When requesting stats in date fields, "sum" is now a double value instead of a date. See SOLR-8671 Detailed Change List ---------------------- New Features ---------------------- * SOLR-3085: New edismax param mm.autoRelax which helps in certain cases of the stopwords/zero-hits issue (janhoy) * SOLR-7560: Parallel SQL Support (Joel Bernstein) * SOLR-7707: Add StreamExpression Support to RollupStream (Dennis Gove, Joel Bernstein) * SOLR-7903: Add the FacetStream to the Streaming API and wire it into the SQLHandler (Joel Bernstein) * SOLR-7986: JDBC Driver for SQL Interface (Uwe Schindler, Joel Bernstein) * SOLR-8038: Add the StatsStream to the Streaming API and wire it into the SQLHandler (Joel Bernstein) * SOLR-8086: Add support for SELECT DISTINCT queries to the SQL interface (Joel Bernstein) * SOLR-7543: Basic graph traversal query Example: {!graph from="node_id" to="edge_id"}id:doc_1 (Kevin Watters, yonik) * SOLR-6273: Cross Data Center Replication. Active/passive replication for separate SolrClouds hosted on separate data centers. 
(Renaud Delbru, Yonik Seeley via Erick Erickson) * SOLR-7938: MergeStream now supports merging more than 2 streams together (Dennis Gove) * SOLR-8198: Change ReducerStream to use StreamEqualitor instead of StreamComparator (Dennis Gove) * SOLR-8268: StatsStream now implements the Expressible interface (Dennis Gove) * SOLR-7584: Adds Inner and LeftOuter Joins to the Streaming API and Streaming Expressions (Dennis Gove, Corey Wu) * SOLR-8188: Adds Hash and OuterHash Joins to the Streaming API and Streaming Expressions (Dennis Gove) * SOLR-7669: Add SelectStream and Tuple Operations to the Streaming API and Streaming Expressions (Dennis Gove) * SOLR-8337: Add ReduceOperation and wire it into the ReducerStream (Joel Bernstein) * SOLR-7904: Add StreamExpression Support to FacetStream (Dennis Gove) * SOLR-6398: Add IterativeMergeStrategy to support running Parallel Iterative Algorithms inside of Solr (Joel Bernstein) * SOLR-8436: Real-time get now supports filters. (yonik) * SOLR-7535: Add UpdateStream to Streaming API and Streaming Expression (Jason Gerlowski, Joel Bernstein) * SOLR-8479: Add JDBCStream to Streaming API and Streaming Expressions for integration with external data sources (Dennis Gove) * SOLR-8002: Add column alias support to the Parallel SQL Interface (Joel Bernstein) * SOLR-7525: Add ComplementStream and IntersectStream to the Streaming API and Streaming Expressions (Dennis Gove, Jason Gerlowski, Joel Bernstein) * SOLR-8415: Provide command to switch between non/secure mode in ZK (Mike Drob, Gregory Chanan) * SOLR-8556: Add ConcatOperation to be used with the SelectStream (Joel Bernstein, Dennis Gove) * SOLR-8550: Add asynchronous DaemonStreams to the Streaming API (Joel Bernstein) * SOLR-8285: Ensure the /export handler works with NULL field values (Joel Bernstein) * SOLR-8502: Improve Solr JDBC Driver to support SQL Clients like DBVisualizer (Kevin Risden, Joel Bernstein) * SOLR-8588: Add TopicStream to the streaming API to support publish/subscribe 
messaging (Joel Bernstein, Kevin Risden) * SOLR-8666: Adds header 'zkConnected' to response of SearchHandler and PingRequestHandler to notify the client when a connection to zookeeper has been lost and there is a possibility of stale data on the node the request is coming from. (Keith Laban, Dennis Gove) * SOLR-8522: Make it possible to use ip fragments in replica placement rules , such as ip_1, ip_2 etc (Arcadius Ahouansou, noble) * SOLR-8698: params.json can now specify 'appends' and 'invariants' (noble) Bug Fixes ---------------------- * SOLR-8386: Add field option in the new admin UI schema page loads up even when no schemaFactory has been explicitly specified since the default is ManagedIndexSchemaFactory. (Erick Erickson, Upayavira, Varun Thacker) * SOLR-8191: Guard against CloudSolrStream close method NullPointerException (Kevin Risden, Joel Bernstein) * SOLR-8485: SelectStream now properly handles non-lowercase and/or quoted select field names (Dennis Gove) * SOLR-8525: Fix a few places that were failing to pass dimensional values settings when copying a FieldInfo (Ishan Chattopadhyaya via Mike McCandless) * SOLR-8409: Ensures that quotes in solr params (eg. q param) are properly handled (Dennis Gove) * SOLR-8640: CloudSolrClient does not send credentials for update request (noble, hoss) * SOLR-8461: CloudSolrStream and ParallelStream can choose replicas that are not active (Cao Manh Dat, Varun Thacker, Joel Bernstein) * SOLR-8527: Improve JdbcTest to cleanup properly on failures (Kevin Risden, Joel Bernstein) * SOLR-8578: Successful or not, requests are not always fully consumed by Solrj clients and we count on HttpClient or the JVM. (Mark Miller) * SOLR-8683: Always consume the full request on the server, not just in the case of an error. (Mark Miller) * SOLR-8416: The collections create API should return after all replicas are active. 
(Michael Sun, Mark Miller, Alexey Serba)

* SOLR-8701: CloudSolrClient decides that there are no healthy nodes to handle a request too early. (Mark Miller)

* SOLR-8694: DistributedMap/Queue can create too many Watchers and some code simplification. (Scott Blum via Mark Miller)

* SOLR-8695: Ensure ZK watchers are not triggering our watch logic on connection events and make this handling more consistent. (Scott Blum via Mark Miller)

* SOLR-8633: DistributedUpdateProcess processCommit/deleteByQuery call finish on DUP and SolrCmdDistributor, which violates the lifecycle and can cause bugs. (hossman via Mark Miller)

* SOLR-8656: PeerSync should use same nUpdates everywhere. (Ramsey Haddad via Mark Miller)

* SOLR-8697: Scope ZK election nodes by session to prevent elections from interfering with each other and other small LeaderElector improvements. (Scott Blum via Mark Miller)

* SOLR-8599: After a failed connection during construction of SolrZkClient attempt to retry until a connection can be made. (Keith Laban, Dennis Gove)

* SOLR-8497: Merge index does not mark the Directory objects it creates as 'done' and they are retained in the Directory cache. (Sivlio Sanchez, Mark Miller)

* SOLR-8696: Start the Overseer before actions that need the overseer on init and when reconnecting after zk expiration and improve init logic. (Scott Blum, Mark Miller)

* SOLR-8420: Fix long overflow in sumOfSquares for Date statistics. (Tom Hill, Christine Poerschke, Tomás Fernández Löbbe)

* SOLR-8748: OverseerTaskProcessor limits number of concurrent tasks to just 10 even though the thread pool size is 100. The limit has now been increased to 100. (Scott Blum, shalin)

* SOLR-8375: ReplicaAssigner rejects valid nodes (Kelvin Tan, noble)

* SOLR-8738: Fixed false success response when invalid deleteByQuery requests initially hit non-leader cloud nodes (hossman)

* SOLR-8771: Multi-threaded core shutdown creates executor per core.
(Mike Drob via Mark Miller)

* SOLR-8145: Fix position of OOM killer script when starting Solr in the background (Jurian Broertjes via Timothy Potter)

Optimizations
----------------------

* SOLR-7876: Speed up queries and operations that use many terms when timeAllowed has not been specified. Speedups of up to 8% were observed. (yonik)

* SOLR-8037: Speed up creation of filters from term range queries (i.e. non-numeric range queries) and use the filter cache for term range queries that are part of larger queries. Some observed speedups were up to 2.5x for production of filters, and up to 10x for query evaluation with embedded term range queries that resulted in filter cache hits. (yonik)

* SOLR-8559: FCS facet performance optimization which significantly speeds up processing when terms are high cardinality and the matching docset is small. When facet minCount > 0 and the number of matching documents is small (or 0) this enhancement prevents considering terms which have a 0 count. Also includes change to move to the next non-zero term value when selecting a segment position. (Keith Laban, Steve Bower, Dennis Gove)

* SOLR-8532: Optimize GraphQuery when maxDepth is set by not collecting edges at the maxDepth level. (Kevin Watters via yonik)

* SOLR-8669: Non binary responses use chunked encoding because we flush the outputstream early. (Mark Miller)

* SOLR-8720: ZkController#publishAndWaitForDownStates should use #publishNodeAsDown. (Mark Miller)

Other Changes
----------------------

* SOLR-6127: Improve example docs, using films data (Varun Thacker via ehatcher)

* SOLR-6895: Deprecated SolrServer classes have been removed (Alan Woodward, Erik Hatcher)

* SOLR-6954: Deprecated SolrClient.shutdown() method removed (Alan Woodward)

* SOLR-7355: Switch from Google's ConcurrentLinkedHashMap to Caffeine. Only affects HDFS support. (Ben Manes via Shawn Heisey)

* SOLR-7624: Remove deprecated zkCredientialsProvider element in solrcloud section of solr.xml.
(Xu Zhang, Per Steffensen, Ramkumar Aiyengar, Mark Miller) * SOLR-7513: Add Equalitors to Streaming Expressions (Dennis Gove, Joel Bernstein) * SOLR-7528: Simplify Interfaces used in Streaming Expressions (Dennis Gove, Joel Bernstein) * SOLR-7554: Add checks in Streams for incoming stream order (Dennis Gove, Joel Bernstein) * SOLR-7441: Improve overall robustness of the Streaming stack: Streaming API, Streaming Expressions, Parallel SQL (Joel Bernstein) * SOLR-8153: Support upper case and mixed case column identifiers in the SQL interface (Joel Bernstein) * SOLR-8132: HDFSDirectoryFactory now defaults to using the global block cache. (Mark Miller) * SOLR-8261: Change SchemaSimilarityFactory default to BM25Similarity (hossman) * SOLR-8259: Remove deprecated JettySolrRunner.getDispatchFilter() * SOLR-8258: Change default hdfs tlog replication factor from 1 to 3. (Mark Miller) * SOLR-8270: Change implicit default Similarity to use BM25 when luceneMatchVersion >= 6 (hossman) * SOLR-8271: Change implicit default Similarity to use SchemaSimilarityFactory when luceneMatchVersion >= 6 (hossman) * SOLR-8179: SQL JDBC - DriverImpl loadParams doesn't support keys with no values in the connection string (Kevin Risden, Joel Bernstein) * SOLR-8131: Make ManagedIndexSchemaFactory the default schemaFactory when luceneMatchVersion >= 6 (Uwe Schindler, shalin, Varun Thacker) * SOLR-8266: Remove Java Serialization from the Streaming API. The /stream handler now only accepts Streaming Expressions. (Jason Gerlowski, Joel Bernstein) * SOLR-8426: Enable /export, /stream and /sql handlers by default and remove them from example configs. (shalin) * SOLR-8443: Change /stream handler http param from "stream" to "expr" (Joel Bernstein, Dennis Gove) * SOLR-5209: Unloading or deleting the last replica of a shard now no longer cascades to remove the shard from the clusterstate. 
(Christine Poerschke) * SOLR-8190: Implement Closeable on TupleStream (Kevin Risden, Joel Bernstein) * SOLR-8529: Improve JdbcTest to not use plain assert statements (Kevin Risden, Joel Bernstein) * SOLR-7339: Upgrade Jetty to v9.3.6.v20151106. (Gregg Donovan, shalin, Mark Miller) * SOLR-5730: Make Lucene's SortingMergePolicy and EarlyTerminatingSortingCollector configurable in Solr. (Christine Poerschke, hossmann, Tomás Fernández Löbbe, Shai Erera) * SOLR-8677: Prevent shards containing invalid characters from being created. Checks added server-side and in SolrJ. (Shai Erera, Jason Gerlowski, Anshum Gupta) * SOLR-8693: Improve ZkStateReader logging. (Scott Blum via Mark Miller) * SOLR-8710: Upgrade morfologik-stemming to version 2.1.0. (Dawid Weiss) * SOLR-8711: Upgrade Carrot2 clustering dependency to 3.12.0. (Dawid Weiss) * SOLR-8690: Make peersync fingerprinting optional with solr.disableFingerprint system property. (yonik) * SOLR-8691: Cache index fingerprints per searcher. (yonik) * SOLR-8746: Renamed Overseer.getInQueue to getStateUpdateQueue, getInternalQueue to getInternalWorkQueue and added javadocs. (Scott Blum, shalin) * SOLR-8752: Add a test for SizeLimitedDistributedMap and improve javadocs. (shalin) * SOLR-8671: Date statistics: make "sum" a double instead of a long/date (Tom Hill, Christine Poerschke, Tomás Fernández Löbbe) * SOLR-8713: new UI and example solrconfig files point to Reference Guide for Solr Query Syntax instead of the wiki. (Marius Grama via Tomás Fernández Löbbe) * SOLR-8758: Add a new SolrCloudTestCase class, using MiniSolrCloudCluster (Alan Woodward) * SOLR-8764: Remove all deprecated methods and classes from master prior to the 6.0 release. 
(Steve Rowe) ================== 5.5.1 ================== Bug Fixes ---------------------- * SOLR-8737: Managed synonym lists do not include the original term in the expand (janhoy) * SOLR-8734: fix (maxMergeDocs|mergeFactor) deprecation warnings: in solrconfig.xml may not be combined with and on their own or combined with is a warning. (Christine Poerschke, Shai Erera) * SOLR-8712: Variable solr.core.instanceDir was not being resolved (Kristine Jetzke, Shawn Heisey, Alan Woodward) ======================= 5.5.0 ======================= Consult the LUCENE_CHANGES.txt file for additional, low level, changes in this release Versions of Major Components --------------------- Apache Tika 1.7 Carrot2 3.10.4 Velocity 1.7 and Velocity Tools 2.0 Apache UIMA 2.3.1 Apache ZooKeeper 3.4.6 Jetty 9.2.13.v20150730 Upgrading from Solr 5.4 ----------------------- * The Solr schema version has been increased to 1.6. Since schema version 1.6, all non-stored docValues fields will be returned along with other stored fields when all fields (or pattern matching globs) are specified to be returned (e.g. fl=*) for search queries. This behavior can be turned on and off by setting 'useDocValuesAsStored' parameter for a field or a field type to true (default since schema version 1.6) or false (default till schema version 1.5). Note that enabling this property has performance implications because DocValues are column-oriented and may therefore incur additional cost to retrieve for each returned document. All example schema are upgraded to version 1.6 but any older schemas will default to useDocValuesAsStored=false and continue to work as in older versions of Solr. If this new behavior is desirable, then you should set version attribute in your schema file to '1.6'. Re-indexing is not necessary to upgrade the schema version. 
Also note that while returning non-stored fields from docValues (default in schema versions 1.6+, unless useDocValuesAsStored is false), the values of a multi-valued field are returned in sorted order. If you require the multi-valued fields to be returned in the original insertion order, then make your multi-valued field stored. This requires re-indexing. See SOLR-8220 for more details.

* All protected methods of CoreAdminHandler other than handleCustomAction() were removed by SOLR-8476 and can no longer be overridden. If you still need to customize that behavior, override handleRequestBody() instead.

* The PERSIST CoreAdmin action which was a NOOP and returned a deprecated message has been removed. See SOLR-8476 for more details. The corresponding SolrJ action has also been removed.

* bin/post now defaults application/json files to the /update/json/docs end-point. Use `-format solr` to force files to the /update end-point. See SOLR-7042 for more details.

* In solrconfig.xml the mergePolicy element is deprecated in favor of a similar mergePolicyFactory element, and the mergeFactor and maxMergeDocs elements are also deprecated; please see SOLR-8621 for full details. To migrate your existing solrconfig.xml, replace each mergePolicy element with the equivalent mergePolicyFactory element and move any mergeFactor or maxMergeDocs settings into it.

* Clearing up stored async collection api responses via REQUESTSTATUS call is now deprecated and will be removed in 6.0. See SOLR-8648 for more details.
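
For the merge policy deprecation described above (SOLR-8621), a migration might look roughly like the following. This is a hedged sketch rather than the example from the original notes: it assumes a TieredMergePolicy setup, the numeric values are placeholders, and the statement that mergeFactor maps onto both maxMergeAtOnce and segmentsPerTier is an assumption to confirm against SOLR-8621.

```xml
<!-- solrconfig.xml, inside <indexConfig>. Values are illustrative only. -->

<!-- Before (deprecated in 5.5) -->
<mergeFactor>10</mergeFactor>
<mergePolicy class="org.apache.lucene.index.TieredMergePolicy">
  <double name="noCFSRatio">0.1</double>
</mergePolicy>

<!-- After: the factory element replaces <mergePolicy>, and the settings
     that <mergeFactor> used to imply are stated explicitly (assumed here
     to be maxMergeAtOnce and segmentsPerTier for TieredMergePolicy) -->
<mergePolicyFactory class="org.apache.solr.index.TieredMergePolicyFactory">
  <int name="maxMergeAtOnce">10</int>
  <int name="segmentsPerTier">10</int>
  <double name="noCFSRatio">0.1</double>
</mergePolicyFactory>
```

Only one of the two forms should be present in a given solrconfig.xml; the "before" block is shown purely for comparison.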
* SOLR-6594: Deprecated the old schema API which will be removed in a later major release Detailed Change List ---------------------- New Features ---------------------- * SOLR-7928: Improve CheckIndex to work against HdfsDirectory (Mike Drob, Gregory Chanan) * SOLR-8378: Add upconfig and downconfig commands to the bin/solr script (Erick Erickson) * SOLR-8434: Add wildcard support to role, to match any role in RuleBasedAuthorizationPlugin (noble) * SOLR-4280: Allow specifying "spellcheck.maxResultsForSuggest" as a percentage of filter query results (Markus Jelsma via James Dyer) * SOLR-8429: Add a flag 'blockUnknown' to BasicAuthPlugin to block unauthenticated requests (noble) * SOLR-8230: JSON Facet API: add "facet-info" into debug section of response when debugQuery=true (Michael Sun, yonik) * SOLR-8428: RuleBasedAuthorizationPlugin adds an 'all' permission (noble) * SOLR-5743: BlockJoinFacetComponent and BlockJoinDocSetFacetComponent for calculating facets by child.facet.field parameter with {!parent ..}.. query. They count facets on children documents aggregating (deduplicating) counts by parent documents (Dr. Oleg Savrasov via Mikhail Khludnev) * SOLR-8220: Read field from DocValues for non stored fields. (Keith Laban, yonik, Erick Erickson, Ishan Chattopadhyaya, shalin) * SOLR-8470: Make TTL of PKIAuthenticationPlugin's tokens configurable through a system property (pkiauth.ttl) (noble) * SOLR-8477: Let users choose compression mode in SchemaCodecFactory (Tomás Fernández Löbbe) * SOLR-839: XML QueryParser support (defType=xmlparser) Lucene includes a queryparser that supports the creation of Lucene queries from XML. The queries supported by lucene.queryparser.xml.CoreParser are now supported by the newly created solr.search.SolrCoreParser and in future SolrCoreParser could support additional queries also. 
Example: shirt plain cotton S M L (Erik Hatcher, Karl Wettin, Daniel Collins, Nathan Visagan, Ahmet Arslan, Christine Poerschke) * SOLR-8312: Add domain size and numBuckets to facet telemetry info (facet debug info for the new Facet Module). (Michael Sun, yonik) * SOLR-8534: Add generic support for collection APIs to be async. Thus more actions benefit from having async support. The commands that additionally get async support are: delete/reload collection, create/delete alias, create/delete shard, delete replica, add/delete replica property, add/remove role, overseer status, balance shard unique, rebalance leaders, modify collection, migrate state format (Varun Thacker) * SOLR-4619: Improve PreAnalyzedField query analysis. (Andrzej Bialecki, Steve Rowe) * SOLR-8560: Added RequestStatusState enum which can be used when comparing states of asynchronous requests. (Shai Erera) * SOLR-8586: added index fingerprint, a hash over all versions currently in the index. PeerSync now uses this to check if replicas are in sync. (yonik) * SOLR-8500: Allow the number of threads ConcurrentUpdateSolrClient StreamingSolrClients configurable by a system property. NOTE: this is an expert option and can result in more often needing to do full index replication for recovery, the sweet spot for using this is very high volume, leader-only indexing. (Tim Potter, Erick Erickson) * SOLR-8642: SOLR allows creation of collections with invalid names (Jason Gerlowski via Erick Erickson) * SOLR-8621: Deprecate in favor of . It allows to configure both the "simple" merge policies, but also more advanced ones, e.g. UpgradeIndexMergePolicy. (Christine Poerschke, Shai Erera) * SOLR-8648: DELETESTATUS API for selective deletion and flushing of stored async collection API responses. (Anshum Gupta) * SOLR-8466: adding facet.method=uif to bring back UnInvertedField faceting which is used to work on facet.method=fc. It's more performant for rarely changing indexes. 
Note: it ignores prefix and contains yet. (Jamie Johnson via Mikhail Khludnev) Bug Fixes ---------------------- * SOLR-8175: Word Break Spellchecker would throw AIOOBE with certain queries containing "should" clauses. (Ryan Josal via James Dyer) * SOLR-2556: The default spellcheck query converter was ignoring terms consisting entirely of digits. (James Dyer) * SOLR-8366: ConcurrentUpdateSolrClient attempts to use response's content type as charset encoding for parsing exception. (shalin) * SOLR-6271: Fix ConjunctionSolrSpellChecker to not compare StringDistance by instance. (Igor Kostromin via James Dyer) * SOLR-7304: Fix Spellcheck Collate to not invalidate range queries. (James Dyer) * SOLR-8373: KerberosPlugin: Using multiple nodes on same machine leads clients to fetch TGT for every request (Ishan Chattopadhyaya via noble) * SOLR-8367: Fix the LeaderInitiatedRecovery 'all replicas participate' fail-safe. (Mark Miller, Mike Drob) * SOLR-8401: Windows start script fails when executed from a different drive. (Nicolas Gavalda via Erick Erickson) * SOLR-6992: Fix "Files" UI to show the managed-schema file as well. (Shawn Heisey, Varun Thacker) * SOLR-2649: MM ignored in edismax queries with operators. (Greg Pendlebury, Jan Høydahl et. al. via Erick Erickson) * SOLR-8372: Canceled recovery can rarely lead to inconsistent shards: If a replica is recovering via index replication, and that recovery fails (for example if the leader goes down), and then some more updates are received (there could be a few left to be processed from the leader that just went down), and then that replica is brought down, it will think it is up-to-date when restarted. (shalin, Mark Miller, yonik) * SOLR-8419: TermVectorComponent for distributed search when distrib.singlePass could include term vectors for documents that matched the query yet weren't in the returned documents. 
(David Smiley)

* SOLR-8015: HdfsLock may fail to close a FileSystem instance if it cannot immediately obtain an index lock. (Mark Miller)

* SOLR-8422: When authentication is enabled, requests fail if sent to a node that doesn't host the collection (noble)

* SOLR-8059: &debug=results for distributed search when distrib.singlePass (sometimes activated automatically) could result in an NPE. (David Smiley, Markus Jelsma)

* SOLR-8460: /analysis/field could throw exceptions for custom attributes. (David Smiley, Uwe Schindler)

* SOLR-8276: Atomic updates and realtime-get do not work with non-stored docvalues. (Ishan Chattopadhyaya, yonik via shalin)

* SOLR-7462: AIOOBE in RecordingJSONParser (Scott Dawson, noble)

* SOLR-8494: SimplePostTool and therefore the bin/post script cannot upload files larger than 2.1GB. (shalin)

* SOLR-8451: We should not call method.abort in HttpSolrClient or HttpSolrCall#remoteQuery and HttpSolrCall#remoteQuery should not close streams. (Mark Miller)

* SOLR-8450: Our HttpClient retry policy is too permissive. (Mark Miller, shalin)

* SOLR-8533: Raise default maxUpdateConnections and maxUpdateConnectionsPerHost to 100k each. (Mark Miller)

* SOLR-8453: Solr should attempt to consume the request inputstream on errors as we cannot count on the container to do it. (Mark Miller, Greg Wilkins, yonik, Joakim Erdfelt)

* SOLR-6279: cores?action=UNLOAD now waits for the core to close before unregistering it from ZK. (Christine Poerschke)

* SOLR-2798: Fixed local params to work correctly with multivalued params (Demian Katz via hossman)

* SOLR-8541: Highlighting a geo RPT field would throw an NPE instead of doing nothing. (Pawel Rog via David Smiley)

* SOLR-8548: Core discovery was not following symlinks (Aaron LaBella via Alan Woodward)

* SOLR-8564: Fix Embedded ZooKeeper to use /zoo_data for its data directory

* SOLR-8371: Try and prevent too many recovery requests from stacking up and clean up some faulty cancel recovery logic.
(Mark Miller) * SOLR-8582 : memory leak in JsonRecordReader affecting /update/json/docs. Large payloads cause OOM (noble, shalin) * SOLR-8605: Regular expression queries starting with escaped forward slash caused an exception. (Scott Blum, yonik) * SOLR-8607: The Schema API refuses to add new fields that match existing dynamic fields. (Jan Høydahl, Steve Rowe) * SOLR-8575: Fix HDFSLogReader replay status numbers, a performance bug where we can reopen FSDataInputStream much too often, and an hdfs tlog data integrity bug. (Mark Miller, Patrick Dvorack, yonik) * SOLR-8651: The commitWithin parameter is not passed on for deleteById in UpdateRequest in distributed queries (Jessica Cheng Mallet via Erick Erickson) * SOLR-8551: Make collection deletion more robust. (Mark Miller) Optimizations ---------------------- * SOLR-8501: Specify the entity request size when known in HttpSolrClient. (Mark Miller) * SOLR-8615: Just like creating cores, we should use multiple threads when closing cores. (Mark Miller) * SOLR-7281: Add an overseer action to publish an entire node as 'down'. (Mark Miller, shalin) * SOLR-8669: Non binary responses use chunked encoding because we flush the outputstream early. (Mark Miller) Other Changes ---------------------- * LUCENE-6900: Added test for score ordered grouping, and refactored TopGroupsResultTransformer. 
(David Smiley)

* SOLR-8336: CoreDescriptor now takes a Path for its instance directory, rather than a String (Alan Woodward)

* SOLR-8351: Improve HdfsDirectory toString representation (Mike Drob via Gregory Chanan)

* SOLR-8321: add a (SolrQueryRequest free) SortSpecParsing.parseSortSpec variant (Christine Poerschke)

* SOLR-8338: in OverseerTest replace strings such as "collection1" and "state" with variable or enum equivalent (Christine Poerschke)

* SOLR-8333: Several API tweaks so that public APIs were no longer referring to private classes (ehatcher, Shawn Heisey, hossman)

* SOLR-8357: UpdateLog.RecentUpdates now implements Closeable (Alan Woodward)

* SOLR-8339: Refactor SolrDocument and SolrInputDocument to have a common base abstract class called SolrDocumentBase. Deprecated methods toSolrInputDocument and toSolrDocument in ClientUtils. (Ishan Chattopadhyaya via shalin)

* SOLR-8353: Support regex for skipping license checksums (Gregory Chanan)

* SOLR-8313: SimpleQueryParser doesn't use MultiTermAnalysis for Fuzzy Queries (Tom Hill via Erick Erickson)

* SOLR-8359: Restrict child classes from using parent logger's state (Jason Gerlowski, Mike Drob, Anshum Gupta)

* SOLR-8131: All example config sets now explicitly use the ManagedIndexSchemaFactory instead of ClassicIndexSchemaFactory. This means that the Schema APIs ( //schema ) are enabled by default and the schema is mutable. The schema file will be called managed-schema (Uwe Schindler, shalin, Varun Thacker)

* SOLR-8381: Cleanup data_driven managed-schema and solrconfig.xml files. Commented out copyFields are removed and solrconfig.xml doesn't refer to fields which are not defined. (Varun Thacker)

* SOLR-7774: revise BasicDistributedZkTest.test logic w.r.t.
'commitWithin did not work on some nodes' (Christine Poerschke)

* SOLR-8360: simplify ExternalFileField.getValueSource implementation (Christine Poerschke)

* SOLR-8387: All example configs shipped with Solr explicitly use ManagedIndexSchemaFactory, so the schema file will be called managed-schema instead of schema.xml. It is not advised to hand edit the managed-schema. You should use the schema APIs instead ( //schema ). If you do not want this behaviour in the example configs, before you start Solr rename managed-schema to schema.xml and change the schemaFactory in the solrconfig.xml file to explicitly use ClassicIndexSchemaFactory instead. (Varun Thacker)

* SOLR-8305: replace LatLonType.getValueSource's QParser use (Christine Poerschke)

* SOLR-8388: factor out response/TestSolrQueryResponse.java from servlet/ResponseHeaderTest.java more TestSolrQueryResponse.java tests; add SolrReturnFields.toString method, ReturnFieldsTest.testToString test; (Christine Poerschke)

* SOLR-8383: SolrCore.java + QParserPlugin.java container initialCapacity tweaks (Christine Poerschke, Mike Drob)

* LUCENE-6925: add RandomForceMergePolicy class in test-framework (Christine Poerschke)

* SOLR-8404: tweak SolrQueryResponse.getToLogAsString, add TestSolrQueryResponse.testToLog (Christine Poerschke)

* SOLR-8352: randomise unload order in UnloadDistributedZkTest.testUnloadShardAndCollection (Christine Poerschke)

* SOLR-8414: AbstractDistribZkTestBase.verifyReplicaStatus could throw NPE (Christine Poerschke)

* SOLR-8410: Add all read paths to 'read' permission in RuleBasedAuthorizationPlugin (noble)

* SOLR-8279: Add a new test fault injection approach and a new SolrCloud test that stops and starts the cluster while indexing data and with random faults. (Mark Miller)

* SOLR-8419: TermVectorComponent for distributed search now requires a uniqueKey in the schema. Also, it no longer returns "uniqueKeyField" in the response.
(David Smiley) * SOLR-8317: add & use responseHeader and response accessors to SolrQueryResponse. (Christine Poerschke) * SOLR-8452: replace "partialResults" occurrences with SolrQueryResponse.RESPONSE_HEADER_PARTIAL_RESULTS_KEY (Christine Poerschke) * SOLR-8454: ZkStateReader logging improvements and cleanup of dead code (Shai Erera, Anshum Gupta) * SOLR-8455: RecoveryStrategy logging improvements and sleep-between-recovery-attempts bug fix. (Shai Erera) * SOLR-8481: TestSearchPerf no longer needs to duplicate SolrIndexSearcher.(NO_CHECK_QCACHE|NO_CHECK_FILTERCACHE) (Christine Poerschke) * SOLR-8486: No longer require jar/unzip for bin/solr (Steven E. Harris, janhoy) * SOLR-8483: relocate 'IMPORTANT NOTE' in open-exchange-rates.json test-file to avoid OpenExchangeRatesOrgProvider.java warnings (Christine Poerschke) * SOLR-8489: TestMiniSolrCloudCluster.createCollection to support extra & alternative collectionProperties (Christine Poerschke) * SOLR-8482: add & use QueryCommand.[gs]etTerminateEarly accessors. (Christine Poerschke) * SOLR-8498: Improve error message when a large value is stored in an indexed string field. (shalin) * SOLR-8484: refactor update/SolrIndexConfig.LOCK_TYPE_* into core/DirectoryFactory.LOCK_TYPE_* (Christine Poerschke) * SOLR-8504: (IndexSchema|SolrIndexConfig)Test: private static finals for solrconfig.xml and schema.xml String literals. (Christine Poerschke) * SOLR-8505: core/DirectoryFactory.LOCK_TYPE_HDFS - add & use it instead of String literals (Christine Poerschke) * SOLR-7042: bin/post now uses /update/json/docs for application/json content types, including support for .jsonl (JSON Lines) files.
(Erik Hatcher and shalin) * SOLR-8476: Refactor and cleanup CoreAdminHandler (noble, Varun Thacker) * SOLR-8535: Support forcing define-lucene-javadoc-url to be local (Gregory Chanan) * SOLR-8549: Solr start script checks for cores which have failed to load as well before attempting to create a core with the same name (Varun Thacker) * SOLR-8555: SearchGroupShardResponseProcessor (initialCapacity) tweaks (Christine Poerschke) * LUCENE-6978: Refactor several code places that lookup locales by string name to use BCP47 locale tag instead. LuceneTestCase now also prints locales on failing tests this way. In addition, several places in Solr now additionally support BCP47 in config files. (Uwe Schindler, Robert Muir) * SOLR-7907: Remove CLUSTERSTATUS related exclusivity checks while running commands in the Overseer because the CLUSTERSTATUS request is served by the individual nodes themselves and not via the Overseer node (Varun Thacker) * SOLR-8566: various initialCapacity tweaks (Fix Versions: trunk 5.5) (Christine Poerschke) * SOLR-8565: add & use CommonParams.(ROWS|START)_DEFAULT constants (Christine Poerschke) * SOLR-8595: Use BinaryRequestWriter by default in HttpSolrClient and ConcurrentUpdateSolrClient. (shalin) * SOLR-8597: add default, no-op QParserPlugin.init(NamedList) method (Christine Poerschke) * SOLR-7968: Make QueryComponent more extensible. (Markus Jelsma via David Smiley) * SOLR-8600: add & use ReRankQParserPlugin parameter [default] constants, changed ReRankQuery.toString to use StringBuilder. (Christine Poerschke) * SOLR-8308: Core gets inaccessible after RENAME operation with special characters (Erik Hatcher, Erick Erickson) * SOLR-3141: Warn in logs when expensive optimize calls are made (yonik, janhoy) ================== 5.4.1 ================== Bug Fixes ---------------------- * SOLR-8460: /analysis/field could throw exceptions for custom attributes.
(David Smiley, Uwe Schindler) * SOLR-8373: KerberosPlugin: Using multiple nodes on same machine leads clients to fetch TGT for every request (Ishan Chattopadhyaya via noble) * SOLR-8059: &debug=results for distributed search when distrib.singlePass (sometimes activated automatically) could result in an NPE. (David Smiley, Markus Jelsma) * SOLR-8422: When authentication enabled, requests fail if sent to a node that doesn't host the collection (noble) * SOLR-7462: AIOOBE in RecordingJSONParser (Scott Dawson, noble) * SOLR-8496: SolrIndexSearcher.getDocSet(List) incorrectly included deleted documents when all of the queries were uncached (or there was no filter cache). This caused multi-select faceting (including the JSON Facet API) to include deleted doc counts when the remaining non-excluded filters were all uncached. This bug was first introduced in 5.3.0 (Andreas Müller, Vasiliy Bout, Erick Erickson, Shawn Heisey, Hossman, yonik) * SOLR-8418: Adapt to changes in LUCENE-6590 for use of boosts with MLTHandler and Simple/CloudMLTQParser (Jens Wille, Ramkumar Aiyengar) New Features ---------------------- * SOLR-8470: Make TTL of PKIAuthenticationPlugin's tokens configurable through a system property (pkiauth.ttl) (noble) ================== 5.4.0 ================== Consult the LUCENE_CHANGES.txt file for additional, low level, changes in this release Versions of Major Components --------------------- Apache Tika 1.7 Carrot2 3.10.4 Velocity 1.7 and Velocity Tools 2.0 Apache UIMA 2.3.1 Apache ZooKeeper 3.4.6 Jetty 9.2.13.v20150730 Upgrading from Solr 5.3 ----------------------- * DefaultSimilarityFactory has been renamed to ClassicSimilarityFactory to match the underlying rename of DefaultSimilarity to ClassicSimilarity and the (eventual) move away from using it as a default. 
If you currently have DefaultSimilarityFactory explicitly referenced in your schema.xml, you will now get a warning urging you to edit your config to use the functionally identical ClassicSimilarityFactory. DefaultSimilarityFactory will be removed completely in Solr 6. See SOLR-8239 for more details. * SOLR-7859: The following APIs are now deprecated: - SolrCore.getStartTime: Use SolrCore.getStartTimeStamp instead. - SolrIndexSearcher.getOpenTime: Use SolrIndexSearcher.getOpenTimeStamp instead. * SOLR-8307: EmptyEntityResolver was moved from core to solrj, and moved from the org.apache.solr.util package to org.apache.solr.common. If you are using this class, you will need to adjust the import package. * Logger declarations in most source files have changed to code that no longer needs to explicitly state the class name. This fixes situations where a logger for a different class was incorrectly used. See SOLR-8324 and its sub-issues for details. Detailed Change List ---------------------- New Features ---------------------- * SOLR-5756: A utility Collection API to move a collection from shared clusterstate.json (stateFormat=1, default until 4.x) to the per-collection state.json stored in ZooKeeper (stateFormat=2, default since 5.0) seamlessly without any application down-time. Example: http://localhost:8983/solr/admin/collections?action=MIGRATESTATEFORMAT&collection= (Noble Paul, Scott Blum, shalin) * SOLR-7219: filterCache access added to the solr query syntax. Example: description:HDTV OR filter(+promotion:tv +promotion_date:[NOW/DAY TO NOW/DAY+7DAY]) (yonik) * SOLR-7775: Allow fromIndex parameter to ScoreJoinQParserPlugin {!join score=.. fromIndex=..}.. 
to refer to a single-sharded collection that has a replica on all nodes where there is a replica in the to index (Andrei Beliakov via Mikhail Khludnev) * SOLR-7961: Print Solr's version with command bin/solr version (janhoy) * SOLR-7789: Introduce a ConfigSet management API (Gregory Chanan) * SOLR-4316: Add a collections dropdown to angular admin UI (Upayavira, Shalin Shekhar Mangar) * SOLR-7915: Provide pluggable context tool support for VelocityResponseWriter (Erik Hatcher) * LUCENE-6795: SystemInfoHandler was improved to also show detailed operating system statistics on IBM J9 virtual machines. It also no longer fails on Java 9 with Jigsaw module system. (Uwe Schindler) * SOLR-8053: Basic auth support in SolrJ (noble) * SOLR-7995: Add a LIST command to ConfigSets API (Gregory Chanan) * SOLR-4388: In Angular UI, add a Collections UI when in cloud mode (Upayavira) * SOLR-7858, SOLR-8199: Add links between original and new Admin UIs (Upayavira) * SOLR-7888: Analyzing suggesters can now filter suggestions by a context field (Arcadius Ahouansou, janhoy) * SOLR-8217: JSON Facet API: add "method" param to terms/field facets to give an execution hint for what method should be used to facet. (yonik) * SOLR-8113: CloneFieldUpdateProcessorFactory now supports choosing a "dest" field name based on a regex pattern and replacement init options. (Gus Heck, hossman) * SOLR-8139: Create/delete fields/dynamic fields/copy fields via schema tab on Angular UI * SOLR-8166: Introduce possibility to configure ParseContext in ExtractingRequestHandler/ExtractingDocumentLoader (Andriy Binetsky via Uwe Schindler) * SOLR-7569: A collection API to force elect a leader, called FORCELEADER, when all replicas in a shard are down (Ishan Chattopadhyaya, Mark Miller, shalin, noble) * SOLR-6168: Add a 'sort' local param to the collapse QParser to support using complex sort options to select the representative doc for each collapsed group.
(Umesh Prasad, hossman) * SOLR-8329: SchemaSimilarityFactory now supports a 'defaultSimFromFieldType' init option for using a fieldType name to identify which Similarity to use as a default. (hossman) * SOLR-7912: Add boost support, and also exclude the queried document in MoreLikeThis QParser (Jens Wille via Anshum Gupta) Bug Fixes ---------------------- * SOLR-7859: Fix usage of currentTimeMillis instead of nanoTime in multiple places, whitelist valid uses of currentTimeMillis (Ramkumar Aiyengar) * SOLR-7836: Possible deadlock when closing refcounted index writers. (Jessica Cheng Mallet, Erick Erickson, Mark Miller, yonik) * SOLR-7869: Overseer does not handle BadVersionException correctly and, in some cases, can go into an infinite loop if cluster state in ZooKeeper is modified externally. (Scott Blum, shalin) * SOLR-7920: Resolve XSS issue in Admin UI Schema Browser (David Chiu via Upayavira) * SOLR-7935: Fix very rare race condition that can cause an update to fail via NullPointerException during a core reload. (yonik) * SOLR-7941: multivalued params are concatenated when using config API (noble) * SOLR-7956: There are interrupts on shutdown in places that can cause ChannelAlreadyClosed exceptions which prevents proper closing of transaction logs, interfere with the IndexWriter, the hdfs client and other things. (Mark Miller, Scott Blum) * SOLR-7954: Fixed an integer overflow bug in the HyperLogLog code used by the 'cardinality' option of stats.field to prevent ArrayIndexOutOfBoundsException in a distributed search when a large precision is selected and a large number of values exist in each shard (hossman) * SOLR-7844: Zookeeper session expiry during shard leader election can cause multiple leaders. 
(Mike Roberts, Mark Miller, Jessica Cheng) * SOLR-7984: wrong and misleading error message 'no default request handler is registered' (noble, hossman) * SOLR-8001: Fixed bugs in field(foo,min) and field(foo,max) when some docs have no values (David Smiley, hossman) * SOLR-7819: ZK connection loss or session timeout do not stall indexing threads anymore. All activity related to leader initiated recovery is performed by a dedicated LIR thread in the background. (Ramkumar Aiyengar, shalin) * SOLR-7746: Ping requests stopped working with distrib=true in Solr 5.2.1. (Alexey Serba, Michael Sun via Gregory Chanan) * SOLR-6547: ClassCastException in SolrResponseBase.getQTime on update response from CloudSolrClient when parallelUpdates is enabled (default) and multiple docs are sent as a single update. (kevin, hossman, shalin) * SOLR-8058: Fix the exclusion filter so that collections that start with js, css, img, tpl can be accessed. (Upayavira, Steve Rowe, Anshum Gupta) * SOLR-8069: Ensure that only the valid ZooKeeper registered leader can put a replica into Leader Initiated Recovery. (Mark Miller, Jessica Cheng, Anshum Gupta) * SOLR-8077: Replication can still cause index corruption. (Mark Miller) * SOLR-8104: Config API does not work for spellchecker (noble) * SOLR-8095: Allow disabling HDFS Locality Metrics and disable by default as it may have performance implications on rapidly changing indexes. (Mike Drob via Mark Miller) * SOLR-8085: Fix a variety of issues that can result in replicas getting out of sync. (yonik, Mark Miller) * SOLR-8094: HdfsUpdateLog should not replay buffered documents as a replacement to dropping them. (Mark Miller) * SOLR-8075: Leader Initiated Recovery should not stop a leader that participated in an election with all of its replicas from becoming a valid leader. (Mark Miller) * SOLR-8072: Rebalance leaders feature does not set CloudDescriptor#isLeader to false when bumping leaders.
(Mark Miller) * SOLR-7666: Many small fixes to Angular UI (Upayavira, Alexandre Rafalovitch) * SOLR-7967: AddSchemaFieldsUpdateProcessorFactory does not check if the ConfigSet is immutable (Gregory Chanan) * SOLR-6188: Skip the automatic loading of resources in the "lib" subdirectory by SolrResourceLoader, but only if we are loading resources from the solr home directory. Fixes the inability to use ICU analysis components with a "solr." prefix on the classname. (Shawn Heisey) * SOLR-8130: Solr's hdfs safe mode detection does not catch all cases of being in safe mode. (Mark Miller, Mike Drob) * SOLR-8128: Set v.locale specified locale for all LocaleConfig extending VelocityResponseWriter tools. (Erik Hatcher) * SOLR-8152: Overseer Task Processor/Queue can miss responses, leading to timeouts. (Gregory Chanan) * SOLR-8107: bin/solr -f should use exec to start the JVM (Martijn Koster via Timothy Potter) * SOLR-8050: Partial update on document with multivalued date field fails to parse date and can also fail to remove dates in some cases. (Burkhard Buelte, Luc Vanlerberghe, shalin) * SOLR-8167: Authorization framework does not work with POST params (noble) * SOLR-8162: JmxMonitoredMap#clear triggers a query on all the MBeans thus generating lots of warnings. (Marius Dumitru Florea, shalin) * SOLR-7843: DataImportHandler's delta imports leak memory because the delta keys are kept in memory and not cleared after the process is finished. (Pablo Lozano via shalin) * SOLR-8189: eTag calculation during HTTP Cache Validation uses unsynchronized WeakHashMap causing threads to be stuck in runnable state. (shalin) * SOLR-7993: Raw json output for fields stopped working in 5.3.0 when requested fields do not include the unique key field name. (Bill Bell, Ryan McKinley via shalin) * SOLR-8192: JSON Facet API allBuckets:true did not work correctly when faceting on a multi-valued field with sub-facets / facet functions. 
(yonik) * SOLR-8206: JSON Facet API limit:0 did not always work correctly. (yonik) * SOLR-8126: update- does not work if the component is only present in solrconfig.xml (noble) * SOLR-8203: Stop processing updates more quickly on node shutdown. When a node is shut down, streaming updates would continue, but new update requests would be aborted. This can cause big update reorders that can cause replicas to get out of sync. (Mark Miller, yonik) * SOLR-6406: ConcurrentUpdateSolrClient hang in blockUntilFinished. If updates are still flowing and shutdown is called on the executor service used by ConcurrentUpdateSolrClient, a race condition can cause that client to hang in blockUntilFinished. (Mark Miller, yonik) * SOLR-8215: Only active replicas should handle incoming requests against a collection (Varun Thacker) * SOLR-8223: Avoid accidentally swallowing OutOfMemoryError (in LeaderInitiatedRecoveryThread.java or CoreContainer.java) (Mike Drob via Christine Poerschke) * SOLR-8255: MiniSolrCloudCluster needs to use a thread-safe list to keep track of its child nodes (Alan Woodward) * SOLR-8254: HttpSolrCore.getCoreByCollection() can throw NPE (Alan Woodward, Mark Miller) * SOLR-8262: Comment out /stream handler from sample solrconfig.xml's for security reasons (Joel Bernstein) * SOLR-7989: After a new leader is elected it should change its state to ACTIVE even if the last published state is something else if it has already registered with ZK. (Ishan Chattopadhyaya, Mark Miller via noble) * SOLR-8287: TrieDoubleField and TrieLongField now override toNativeType (Ishan Chattopadhyaya via Christine Poerschke) * SOLR-8284: JSON Facet API - fix NPEs when short form "sort:index" or "sort:count" are used.
(Michael Sun via yonik) * SOLR-8295: Fix NPE in collapse QParser when collapse field is missing from all docs in a segment (hossman) * SOLR-8280: Fixed bug in SimilarityFactory initialization that prevented SolrCoreAware factories -- such as SchemaSimilarityFactory -- from functioning properly with managed schema features. (hossman) * SOLR-5971: Fix error 'Illegal character in query' when proxying request. (Uwe Schindler, Ishan Chattopadhyaya, Eric Bus) * SOLR-8307: Fix XXE vulnerability in MBeansHandler "diff" feature (Erik Hatcher) * SOLR-8073: Solr fails to start on Windows with obscure errors when using relative path. (Alexandre Rafalovitch, Ishan Chattopadhyaya via shalin) * SOLR-7169: bin/solr status should return exit code 3, not 0 if Solr is not running (Dominik Siebel via Timothy Potter) * SOLR-8341: Fix JSON Facet API excludeTags when specified in the form of domain:{excludeTags:mytag} (yonik) * SOLR-8326: If BasicAuth enabled, inter node requests fail after node restart (noble, Anshum Gupta) * SOLR-8340: Fixed NullPointerException in HighlightComponent. (zengjie via Christine Poerschke) * SOLR-8355: update permissions were failing node recovery (noble , Anshum Gupta) Optimizations ---------------------- * SOLR-7918: Filter (DocSet) production from term queries has been optimized and is anywhere from 20% to over 100% faster and produces less garbage on average. (yonik) * SOLR-6760: New optimized DistributedQueue implementation for overseer increases message processing performance by ~470%. (Noble Paul, Scott Blum, shalin) * SOLR-6629: Watch /collections zk node on all nodes so that cluster state updates are more efficient especially when cluster has a mix of collections in stateFormat=1 and stateFormat=2. (Scott Blum, shalin) * SOLR-7971: Reduce memory allocated by JavaBinCodec to encode small strings by an amount equal to the string.length(). 
JavaBinCodec now uses a double pass approach to write strings larger than 64KB to avoid allocating buffer memory equal to string's UTF8 size. (yonik, Steve Rowe, Mikhail Khludnev, Noble Paul, shalin) * SOLR-7983: Utils.toUTF8 uses larger buffer than necessary for holding UTF8 data. (shalin) * SOLR-8222: JSON Facet API optimization to faceting by count on docvalue fields (or indexed fields with method=dv) when there are multiple hits expected for enough buckets. For example, this more than doubled the performance of faceting 5M documents over a field with 1M unique values. (yonik) * SOLR-8288: DistributedUpdateProcessor#doFinish should explicitly check and ensure it does not try to put itself into LIR. (Mark Miller) Other Changes ---------------------- * SOLR-8294: Cleanup solrconfig.xmls under solr/example/example-DIH/solr (removed obsolete clustering handler sections). (Dawid Weiss) * SOLR-7969: Unavailable clustering engines should not fail the core. (Dawid Weiss) * SOLR-7790, SOLR-7791: Update Carrot2 clustering component to version 3.10.4. Upgrade HPPC library to version 0.7.1. (Dawid Weiss) * SOLR-7831: Start Scripts: Allow a configurable stack size [-Xss] (Steve Davids via Mark Miller) * SOLR-7870: Write a test which asserts that requests to stateFormat=2 collection succeed on a node even after all local replicas of that collection have been removed. (Scott Blum via shalin) * SOLR-7902: Split out use of child timers from RTimer to a sub-class (Ramkumar Aiyengar) * SOLR-7943: Upgrade Jetty to 9.2.13.v20150730. (Bill Bell, shalin) * SOLR-7007: DistributedUpdateProcessor now logs replay flag as boolean instead of int (Mike Drob via Christine Poerschke) * SOLR-7960: Start scripts now give generic help for bin/solr -h and bin/solr --help (janhoy) * SOLR-7970: Factor out a SearchGroupsFieldCommandResult class. (Christine Poerschke) * SOLR-7942: Previously removed unlockOnStartup option (LUCENE-6508) now logs warning if configured, will be an error in 6.0.
Also improved error msg if an index is locked on startup (hossman) * SOLR-7979: Fix two typos (in a CoreAdminHandler log message and a TestCloudPivotFacet comment). (Mike Drob via Christine Poerschke) * SOLR-7966: Solr Admin UI now sets the HTTP header X-Frame-Options to DENY to avoid clickjacking. (yonik) * SOLR-7999: SolrRequestParser tests no longer depend on external URLs that may fail to work. (Uwe Schindler) * SOLR-8034: Leader no longer puts replicas in recovery in case of a failed update, when minRF isn't achieved. (Jessica Cheng, Timothy Potter, Anshum Gupta) * SOLR-8066: SolrCore.checkStale method doesn't restore interrupt status. (shalin) * SOLR-8068: Throw a SolrException if the core container has initialization errors or is shutting down (Ishan Chattopadhyaya, Noble Paul, Anshum Gupta) * SOLR-8083: Convert the ZookeeperInfoServlet to a handler at /admin/zookeeper (noble) * SOLR-8025: remove unnecessary ResponseBuilder.getQueryCommand() calls (Christine Poerschke) * SOLR-8150: Fix build failure due to too much output from QueryResponseTest (janhoy) * SOLR-8151: OverseerCollectionMessageHandler was logging info data as WARN (Alan Woodward) * SOLR-8116: SearchGroupsResultTransformer tweaks (String literals, list/map initialCapacity) (Christine Poerschke) * SOLR-8114: in Grouping.java rename groupSort and sort to withinGroupSort and groupSort (Christine Poerschke) * SOLR-8074: LoadAdminUIServlet directly references admin.html (Mark Miller, Upayavira) * SOLR-8195: IndexFetcher download trace now includes bytes-downloaded[-per-second] (Christine Poerschke) * SOLR-4854: Add a test to assert that [elevated] DocTransformer works correctly with javabin response format.
(Ray, shalin) * SOLR-8196: TestMiniSolrCloudCluster.testStopAllStartAll case plus necessary MiniSolrCloudCluster tweak (Christine Poerschke) * SOLR-8221: MiniSolrCloudCluster should create subdirectories for its nodes (Alan Woodward) * SOLR-8218: DistributedUpdateProcessor (initialCapacity) tweaks (Christine Poerschke) * SOLR-8147: contrib/analytics FieldFacetAccumulator now throws IOException instead of SolrException (Scott Stults via Christine Poerschke) * SOLR-8239: Added ClassicSimilarityFactory, marked DefaultSimilarityFactory as deprecated. (hossman) * SOLR-8253: AbstractDistribZkTestBase can sometimes fail to shut down its ZKServer (Alan Woodward) * SOLR-8260: Use NIO2 APIs in core discovery (Alan Woodward) * SOLR-8259: Deprecate JettySolrRunner.getDispatchFilter(), add .getSolrDispatchFilter() and .getCoreContainer() (Alan Woodward) * SOLR-8278: Use NIO2 APIs in ConfigSetService (Alan Woodward) * SOLR-8286: Remove instances of solr.hdfs.blockcache.write.enabled from tests and docs (Gregory Chanan) * SOLR-8269: Upgrade commons-collections to 3.2.2. This fixes a known serialization vulnerability (janhoy) * SOLR-8246: Fix SolrCLI to clean the config directory in case creating a core failed. (Jason Gerlowski via Shai Erera) * SOLR-8290: remove SchemaField.checkFieldCacheSource's unused QParser argument (Christine Poerschke) * SOLR-8300: Use constants for the /overseer_elect znode (Varun Thacker) * SOLR-8283: factor out StrParser from QueryParsing.StrParser and SortSpecParsing[Test] from QueryParsing[Test] (Christine Poerschke) * SOLR-8298: small preferLocalShards implementation refactor (Christine Poerschke) * SOLR-8315: Removed default core checks in the dispatch filter since we don't have a default core anymore (Varun Thacker) * SOLR-8302: SolrResourceLoader now takes a Path as its instance directory (Alan Woodward, Shawn Heisey) * SOLR-8303: CustomBufferedIndexInput now includes resource description when throwing EOFException. 
(Mike Drob via Uwe Schindler) * SOLR-8194: Improve error reporting for null documents in UpdateRequest (Markus Jelsma, Alan Woodward) * SOLR-8277: (Search|Top)GroupsFieldCommand tweaks (Christine Poerschke) * SOLR-8299: ConfigSet DELETE operation no longer allows deletion of config sets that are currently in use by other collections (Anshum Gupta) * SOLR-8101: Improve Linux service installation script (Sergey Urushkin via Timothy Potter) * SOLR-8180: jcl-over-slf4j should have officially been a SolrJ dependency; it now is. (David Smiley, Kevin Risden) * SOLR-8330: Standardize and fix logger creation and usage so that they aren't shared across source files.(Jason Gerlowski, Uwe Schindler, Anshum Gupta) * SOLR-8363: Fix check-example-lucene-match-version Ant task and addVersion.py script to check and update luceneMatchVersion under solr/example/ configs as well logic. (Varun Thacker) ================== 5.3.2 ================== Bug Fixes ---------------------- * SOLR-8460: /analysis/field could throw exceptions for custom attributes. (David Smiley, Uwe Schindler) * SOLR-8373: KerberosPlugin: Using multiple nodes on same machine leads clients to fetch TGT for every request (Ishan Chattopadhyaya via noble) * SOLR-8340: Fixed NullPointerException in HighlightComponent. (zengjie via Christine Poerschke) * SOLR-8059: &debug=results for distributed search when distrib.singlePass (sometimes activated automatically) could result in an NPE. (David Smiley, Markus Jelsma) * SOLR-8167: Authorization framework does not work with POST params (noble) * SOLR-8355: update permissions were failing node recovery (noble , Anshum Gupta) * SOLR-8326: If BasicAuth enabled, inter node requests fail after node restart (noble, Anshum Gupta) * SOLR-8269: Upgrade commons-collections to 3.2.2. 
This fixes a known serialization vulnerability (janhoy) * SOLR-8422: When authentication enabled, requests fail if sent to a node that doesn't host the collection (noble) * SOLR-8496: SolrIndexSearcher.getDocSet(List) incorrectly included deleted documents when all of the queries were uncached (or there was no filter cache). This caused multi-select faceting (including the JSON Facet API) to include deleted doc counts when the remaining non-excluded filters were all uncached. This bug was first introduced in 5.3.0 (Andreas Müller, Vasiliy Bout, Erick Erickson, Shawn Heisey, Hossman, yonik) ================== 5.3.1 ================== Bug Fixes ---------------------- * SOLR-7949: Resolve XSS issue in Admin UI stats page (David Chiu via janhoy) * SOLR-8000: security.json is not loaded on server start (noble) * SOLR-8004: RuleBasedAuthorization plugin does not work for the collection-admin-edit permission (noble) * SOLR-7972: Fix VelocityResponseWriter template encoding issue. Templates must be UTF-8 encoded. (Erik Hatcher) * SOLR-7929: SimplePostTool (also bin/post) -filetypes "*" now works properly in 'web' mode (Erik Hatcher) * SOLR-7978: Fixed example/files update-script.js to be Java 7 and 8 compatible. (Erik Hatcher) * SOLR-7988: SolrJ could not make requests to handlers with '/admin/' prefix (noble , ludovic Boutros) * SOLR-7990: Use of timeAllowed can cause incomplete filters to be cached and incorrect results to be returned on subsequent requests. (Erick Erickson, yonik) * SOLR-8041: Fix VelocityResponseWriter's $resource.get(key,baseName,locale) to use specified locale. 
(Erik Hatcher) ================== 5.3.0 ================== Consult the LUCENE_CHANGES.txt file for additional, low level, changes in this release Versions of Major Components --------------------- Apache Tika 1.7 Carrot2 3.9.0 Velocity 1.7 and Velocity Tools 2.0 Apache UIMA 2.3.1 Apache ZooKeeper 3.4.6 Jetty 9.2.11.v20150529 Upgrading from Solr 5.2 ----------------------- * SolrJ's CollectionAdminRequest class is now marked as abstract. Use one of its concrete sub-classes instead. * Solr no longer supports forcefully unlocking an index. This is no longer supported by the underlying Lucene locking framework. The setting in solrconfig.xml has no effect anymore. Background: If you use native lock factory, unlocking should not be needed, because the locks are cleared after process shutdown automatically by the operating system. If you are using simple lock factory (not recommended) or hdfs lock factory, you may need to manually unlock by deleting the lock file from filesystem / HDFS. * The zkCredientialsProvider element in solrcloud section of solr.xml is now deprecated. Use the correct spelling (zkCredentialsProvider) instead. * class TransformerWithContext is deprecated. Use DocTransformer directly. * The "name" parameter in ADDREPLICA Collections API call has been deprecated. One cannot specify the core name for a replica. See SOLR-7499 for more info. * The ShardHandler interface has changed. The interface used to provide a `checkDistributed` function which doubled up in purpose to determine if the request is distributed, and to prepare for distributed requests. This unfortunately meant that the object had to be instantiated even when the request is not distributed. The task of initially determining if the request is distributed is now done by SearchHandler using the distrib/shards parameters, and a ShardHandler object is created only if the request is distributed.
The interface now has a `prepDistributed` function instead of the `checkDistributed` function, which can then be used to prepare for the distributed request. Users with custom ShardHandler implementations would need to modify their code to this effect. * The system property "solr.solrxml.location" is not supported any more. Now, solr.xml is first looked up in ZooKeeper, and if it is not found there, Solr falls back to SOLR_HOME. See SOLR-7735 for more info. Detailed Change List ---------------------- New Features ---------------------- * SOLR-7724: SolrJ now supports parsing the output of the clustering component. (Alessandro Benedetti via Dawid Weiss) * SOLR-7389: Expose znodeVersion property for each of the collections returned for the clusterstatus operation in the collections API (Marius Grama via shalin) * SOLR-7622: A DocTransformer can now request fields from the SolrIndexSearcher that are not necessarily returned in the final SolrDocument by returning a list of fields from DocTransformer#getExtraRequestFields (ryan) * SOLR-7458: Expose HDFS Block Locality Metrics via JMX (Mike Drob via Mark Miller) * SOLR-7676: Faceting on nested objects / Block-join faceting with the new JSON Facet API. Example: Assuming books with nested pages and an input domain of pages, the following will switch the domain to books before faceting on the author field: authors:{ type:terms, field:author, domain:{toParent:"type:book"} } (yonik) * SOLR-7668: Add 'port' tag support in replica placement rules (Adam McElwee, Noble Paul) * SOLR-5886: Response for an async call is now stored in zk so that it can be returned by the REQUESTSTATUS API. Also, the number of stored (failed and successful) responses are now restricted to 10,000 each as a safety net. (Anshum Gupta) * SOLR-7639: MoreLikeThis QParser now supports all options provided by the MLT Handler i.e. mintf, mindf, minwl, maxwl, maxqt, and maxntp. * SOLR-7182: Make the Schema-API a first class citizen of SolrJ.
The new SchemaRequest and its inner classes can be used to make requests to the Schema API. (Sven Windisch, Marius Grama via shalin) * SOLR-7651: New response format added wt=smile (noble) * SOLR-4212: SOLR-6353: Let facet queries and facet ranges hang off of pivots. Example: facet.range={!tag=r1}price&facet.query={!tag=q1}somequery&facet.pivot={!range=r1 query=q1}category,manufacturer (Steve Molloy, hossman, shalin) * SOLR-7742: Support for Immutable ConfigSets (Gregory Chanan) * SOLR-2522: new two argument option for the existing field() function; picks the min/max value of a docValues field to use as a ValueSource: "field(field_name,min)" and "field(field_name,max)" (hossman) * SOLR-6234: Scoring for query time join (Mikhail Khludnev) * SOLR-5882: score local parameter for block join query parser {!parent} (Andrey Kudryavtsev, Mikhail Khludnev) * SOLR-7799: Added includeIndexFieldFlags (backwards compatible default is true) to /admin/luke. When there are many fields in the index, setting this flag to false can dramatically speed up requests. (ehatcher) * SOLR-7769: Add bin/post -p alias for -port parameter. (ehatcher) * SOLR-7766: support creation of a coreless collection via createNodeSet=EMPTY (Christine Poerschke) * SOLR-7849: Solr-managed inter-node authentication when authentication enabled (Noble Paul) * SOLR-7220: Nested C-style comments in queries. (yonik) * SOLR-7757: Improved security framework where security components can be edited/reloaded, Solr now watches /security.json. 
Components can choose to make their config editable (Noble Paul, Anshum Gupta, Ishan Chattopadhyaya) * SOLR-7838: An authorizationPlugin interface where the access control rules are stored/managed in ZooKeeper (Noble Paul, Anshum Gupta, Ishan Chattopadhyaya) * SOLR-7837: An AuthenticationPlugin which implements the HTTP BasicAuth protocol and stores credentials securely in ZooKeeper (Noble Paul, Anshum Gupta, Ishan Chattopadhyaya) Bug Fixes ---------------------- * SOLR-7361: Slow loading SolrCores should not hold up all other SolrCores that have finished loading from serving requests. (Mark Miller, Timothy Potter, Ramkumar Aiyengar) * SOLR-4506: Clean-up old (unused) index directories in the background after initializing a new index; previously, Solr would leave old index.yyyyMMddHHmmssSSS directories behind in the data directory after failed recoveries, which unnecessarily consumed disk space. (Mark Miller, Timothy Potter) * SOLR-7108: Change default query used by /admin/ping to not rely on other parameters such as query parser or default field. (ehatcher) * SOLR-6835: ReRankQueryParserPlugin now checks whether the reRankQuery parameter is present and not empty. (帅广应, Marius Grama via shalin) * SOLR-7566: Search requests should return the shard name that is down. (Marius Grama, shalin) * SOLR-7675: Add missing _root_ field to managed-schema template so that the default data driven config set can index nested documents by default. (yonik) * SOLR-7635: Limit lsof port check in bin/solr to just listening ports (Upayavira, Ramkumar Aiyengar) * SOLR-7091: Nested documents with unknown fields don't work in schemaless mode. (Steve Rowe) * SOLR-7682: Schema API: add-copy-field should accept the maxChars parameter. (Steve Rowe) * SOLR-7693: Fix the bin/solr -e cloud example to work if lsof is not installed on the local machine by waiting for 10 seconds before starting the second node.
(hossman, Timothy Potter) * SOLR-7689: ReRankQuery rewrite method can change the QueryResultKey causing cache misses. (Emad Nashed, Yonik Seeley, Joel Bernstein) * SOLR-7697: Schema API doesn't take class or luceneMatchVersion attributes into account for the analyzer when adding a new field type. (Marius Grama, Steve Rowe) * SOLR-7679: Schema API doesn't take similarity attribute into account when adding field types. (Marius Grama, Steve Rowe) * SOLR-7664: Throw correct exception (RemoteSolrException) on receiving an HTTP 413. (Ramkumar Aiyengar, Eirik Lygre) * SOLR-6686: facet.threads can return wrong results when using facet.prefix multiple times on same field. (Michael Ryan, Tim Underwood via shalin) * SOLR-7673: Race condition in shard splitting can cause operation to hang indefinitely or sub-shards to never become active. (shalin) * SOLR-7741: Add missing fields to SolrIndexerConfig.toMap (Mike Drob, Christine Poerschke via Ramkumar Aiyengar) * SOLR-7748: Fix bin/solr to start on IBM J9. (Shai Erera) * SOLR-7143: MoreLikeThis Query parser should handle multiple field names (Jens Wille, Anshum Gupta) * SOLR-7132: The Collections API ADDREPLICA command property.name is not reflected in the clusterstate until after Solr restarts (Erick Erickson) * SOLR-7172: addreplica API fails with incorrect error msg "cannot create collection" (Erick Erickson) * SOLR-7705: CoreAdminHandler Unload no longer handles null core name and throws NPE instead of a bad request error. (John Call, Edward Ribeiro via shalin) * SOLR-7529: CoreAdminHandler Reload throws NPE on null core name instead of a bad request error. (Jellyfrog, Edward Ribeiro via shalin) * SOLR-7781: JSON Facet API: Terms facet on string/text fields with sub-facets caused a bug that resulted in filter cache lookup misses as well as the filter cache exceeding its configured size. (yonik) * SOLR-7810: map-reduce contrib script to set classpath for convenience refers to example rather than server.
(Mark Miller) * SOLR-7765: Hardened the behavior of TokenizerChain when null arguments are used in constructor. This prevents NPEs in some code paths. (Konstantin Gribov, hossman) * SOLR-7829: Fixed a bug in distributed pivot faceting that could result in a facet.missing=true count which was lower than the correct count if facet.sort=index and facet.pivot.mincount > 1 (hossman) * SOLR-7842: ZK connection loss or session expiry events should not fire config directory listeners. (noble, shalin) * SOLR-6357: Allow deleting documents by doing a score join query. (Mikhail Khludnev, Timothy Potter) * SOLR-7756: Fixed ExactStatsCache and LRUStatsCache to not throw an NPE when a term is not present on a shard. (Varun Thacker, Anshum Gupta) * SOLR-7818: Fixed distributed stats to be calculated for all the query terms. Earlier the stats were calculated with the terms that are present in the last shard of a distributed request. (Varun Thacker, Anshum Gupta) * SOLR-7866: VersionInfo caused an unhandled NPE when trying to determine the max value for the version field. (Timothy Potter) * SOLR-7666 (and linked tickets): Many fixes to AngularJS Admin UI bringing it close to feature parity with existing UI. (Upayavira) * SOLR-7908: SegmentsInfoRequestHandler gets a ref counted IndexWriter and does not properly release it. (Mark Miller, shalin) * SOLR-7921: The techproducts example fails when running in a directory that contains spaces. (Ishan Chattopadhyaya via Timothy Potter) * SOLR-7934: SolrCLI masks underlying cause of create collection failure. (Timothy Potter) Optimizations ---------------------- * SOLR-7660: Avoid redundant 'exists' calls made to ZK while fetching cluster state updates. (shalin) * SOLR-7714: Reduce SearchHandler's use of ShardHandler objects across shards in a search, from one for each shard and the federator, to just one for the federator.
(Christine Poerschke via Ramkumar Aiyengar) * SOLR-7751: Minor optimizations to QueryComponent.process (reduce eager instantiations, cache method calls) (Christine Poerschke via Ramkumar Aiyengar) * SOLR-7455: Terms facets with the JSON Facet API now defer calculating non-sorting stats until a second phase, after the top N facets are found. This improves performance proportional to the number of non-sorting statistics being calculated in addition to the number of buckets and domain documents. For example: The facet request {type:terms, field:field1, facet:{x:"unique(field2)"}} saw a 7x improvement when field1 had 1M unique terms and field2 had 1000 unique terms. (yonik) * SOLR-7840: ZkStateReader.updateClusterState fetches watched collections twice from ZK. (shalin) * SOLR-7875: Speed up SolrQueryTimeoutImpl. Avoid setting a timeout time when timeAllowed parameter is not set. (Tomás Fernández Löbbe) Other Changes ---------------------- * SOLR-7787: Removed fastutil and java-hll dependency, integrated HyperLogLog from java-hll into Solr core. (Dawid Weiss) * SOLR-7595: Allow method chaining for all CollectionAdminRequests in Solrj. (shalin) * SOLR-7146: MiniSolrCloudCluster based tests can fail with ZooKeeperException NoNode for /live_nodes. (Vamsee Yarlagadda via shalin) * SOLR-7590: Finish and improve MDC context logging support. (Mark Miller) * SOLR-7599: Remove cruft from SolrCloud tests. (shalin) * SOLR-7636: CLUSTERSTATUS API is executed at CollectionsHandler (noble) * LUCENE-6508: Remove ability to forcefully unlock an index. This is no longer supported by the underlying Lucene locking framework. (Uwe Schindler, Mike McCandless, Robert Muir) * SOLR-3719: Add as-you-type "instant search" to example/files /browse. (Esther Quansah, ehatcher) * SOLR-7645: Remove explicitly defined request handlers from example and test solrconfig's that are already defined implicitly, such as /admin/ping, /admin/system, and several others.
(ehatcher) * SOLR-7603: Fix test only bug in UpdateRequestProcessorFactoryTest (hossman) * SOLR-7634: Upgrade Jetty to 9.2.11.v20150529 (Bill Bell, shalin) * SOLR-7659: Rename releaseCommitPointAndExtendReserve in DirectoryFileStream to extendReserveAndReleaseCommitPoint, and reverse the code to match. (shalin, Shawn Heisey) * SOLR-7624: Add correct spelling (zkCredentialsProvider) as an alternative to zkCredientialsProvider element in solrcloud section of solr.xml. (Xu Zhang, Per Steffensen, Ramkumar Aiyengar, Mark Miller) * SOLR-7619: Fix SegmentsInfoRequestHandlerTest when more than one segment is created. (Ramkumar Aiyengar, Steve Rowe) * SOLR-7678: Switch RTimer to use nanoTime (improves accuracy of QTime, and other times returned by Solr handlers) (Ramkumar Aiyengar) * SOLR-7680: Use POST instead of GET when finding versions for mismatches with CloudInspectUtil for tests (Ramkumar Aiyengar) * SOLR-7665: deprecate the class TransformerWithContext (noble) * SOLR-7629: Have RulesTest consider disk space limitations of where the test is being run (Christine Poerschke via Ramkumar Aiyengar) * SOLR-7499: The "name" parameter in ADDREPLICA Collections API call has been deprecated.
One cannot specify the core name for a replica (Varun Thacker, noble, Erick Erickson) * SOLR-7711: Correct initial capacity for the list that holds the default components for the SearchHandler (Christine Poerschke via Varun Thacker) * SOLR-7485: Replace shards.info occurrences with ShardParams.SHARDS_INFO (Christine Poerschke via Ramkumar Aiyengar) * SOLR-7710: Replace async occurrences with CommonAdminParams.ASYNC (Christine Poerschke, Ramkumar Aiyengar) * SOLR-7712: fixed test to account for aggregate floating point precision loss (hossman) * SOLR-7740: Fix typo bug with TestConfigOverlay (Christine Poerschke via Ramkumar Aiyengar) * SOLR-7750: Change TestConfig.testDefaults to cover all SolrIndexConfig fields (Christine Poerschke via Ramkumar Aiyengar) * SOLR-7703: Authentication plugin is now loaded using the ResourceLoader. (Avi Digmi via Anshum Gupta) * SOLR-7800: JSON Facet API: the avg() facet function now skips missing values rather than treating them as a 0 value. The def() function can be used to treat missing values as 0 if that is desired. Example: facet:{ mean:"avg(def(myfield,0))" } * SOLR-7805: Update Kite Morphlines to 1.1.0 (Mark Miller) * SOLR-7803: Prevent class loading deadlock in TrieDateField; refactor date formatting and parsing out of TrieDateField and move to static utility class DateFormatUtil. (Markus Heiden, Uwe Schindler) * SOLR-7825: Forbid all usages of log4j and java.util.logging classes in Solr except classes which are specific to logging implementations. Remove accidental usage of log4j logger from a few places. The default log level for org.apache.zookeeper is changed from ERROR to WARN for zkcli.{sh,cmd} only. (Oliver Schrenk, Tim Potter, Uwe Schindler, shalin) * SOLR-7735: Look for solr.xml in Zookeeper by default in SolrCloud mode. If not found, it will be loaded from $SOLR_HOME/solr.xml as before. Sysprop solr.solrxml.location is now gone. 
(janhoy) * SOLR-7227: Ship Solr with the Web application directory exploded into server/solr-webapp, solr.war is no longer included in the distribution bundles. (Timothy Potter, Uwe Schindler) * SOLR-6625: Enable registering interceptors for the calls made using HttpClient and make the request object available at the interceptor context (Ishan Chattopadhyaya, Gregory Chanan, noble, Anshum Gupta) * SOLR-5022: On Java 7 raise permgen for running tests. (Uwe Schindler) * SOLR-7823: TestMiniSolrCloudCluster.testCollectionCreateSearchDelete async collection-creation (sometimes) (Christine Poerschke) * SOLR-7854: Remove unused ZkStateReader.updateClusterState(false) method. (Scott Blum via shalin) * SOLR-7863: Lowercase the CLUSTERPROP command in ZkCLI for consistency, print error for unknown cmd (janhoy) * SOLR-7832: bin/post now allows either -url or -c, rather than requiring both. (ehatcher) * SOLR-7847: Implement run example logic in Java instead of OS-specific scripts in bin/solr and bin\solr.cmd (Timothy Potter) * SOLR-7877: TestAuthenticationFramework.testBasics to preserve/restore the original request(Username|Password) (Christine Poerschke) * SOLR-7900: example/files improvements - added language detection and faceting, added title field, relocated .js files. (Esther Quansah and Erik Hatcher) ================== 5.2.1 ================== Consult the LUCENE_CHANGES.txt file for additional, low level, changes in this release Versions of Major Components --------------------- Apache Tika 1.7 Carrot2 3.9.0 Velocity 1.7 and Velocity Tools 2.0 Apache UIMA 2.3.1 Apache ZooKeeper 3.4.6 Jetty 9.2.10.v20150310 Detailed Change List ---------------------- Bug Fixes ---------------------- * SOLR-7588: Fix javascript bug introduced by SOLR-7409 that breaks the dataimport screen in the admin UI. (Bill Bell via Shawn Heisey) * SOLR-7616: Faceting on a numeric field with a unique() subfacet function on another numeric field can result in incorrect results or an exception.
(yonik) * SOLR-7518: New Facet Module should respect shards.tolerant and process all non-failing shards instead of throwing an exception. (yonik) * SOLR-7574: A request with a json content type but no body caused a null pointer exception (yonik) * SOLR-7512: SolrOutputFormat creates an invalid solr.xml in the solr home zip for MapReduceIndexerTool. (Mark Miller, Adam McElwee) * SOLR-7652: Fix example/files update-script.js to work with Java 7 (ehatcher) * SOLR-7638: Fix new (Angular-based) admin UI Cloud pane (Upayavira via ehatcher) * SOLR-7655: The DefaultSolrHighlighter since 5.0 was determining if payloads were present in a way that was slow, especially when lots of fields were highlighted. It's now fast. (David Smiley) * SOLR-7493: Requests aren't distributed evenly if the collection isn't present locally. (Jeff Wartes, shalin) Other Changes ---------------------- * SOLR-7623: Fix regression from SOLR-7484 that made it impossible to override SolrDispatchFilter#execute() and SolrDispatchFilter#sendError(). You can now override these functions in HttpSolrCall. (ryan) * SOLR-7648: Expose remote IP and Host via the AuthorizationContext to be used by the authorization plugin. (Ishan Chattopadhyaya via Anshum Gupta) ================== 5.2.0 ================== Consult the LUCENE_CHANGES.txt file for additional, low level, changes in this release Versions of Major Components --------------------- Apache Tika 1.7 Carrot2 3.9.0 Velocity 1.7 and Velocity Tools 2.0 Apache UIMA 2.3.1 Apache ZooKeeper 3.4.6 Jetty 9.2.10.v20150310 Upgrading from Solr 5.1 ----------------------- * A bug was introduced in Solr 4.10 that caused index time document boosts to trigger excessive field boosts in multivalued fields -- the result being that some field norms might be excessively large. This bug has now been fixed, but users of document boosts are strongly encouraged to re-index. See SOLR-7335 for more details. 
* The Slice and Replica classes have been changed to use State enums instead of string constants to track the respective states. Advanced users with client code manipulating these objects will need to update their code accordingly. See SOLR-7325 and SOLR-7336 for more info. * Solr has internally been upgraded to use Jetty 9. See SOLR-4839 for full details, but there are a few key details all Solr users should know when upgrading: - It is no longer possible to run "java -jar start.jar" from inside the server directory. The bin/solr script is the only supported way to run Solr. This is necessary to support HTTP and HTTPS modules in Jetty which can be selectively enabled by the bin/solr scripts. In case you have a pressing need to run solr the old way, you can run "java -jar start.jar --module=http" to get the same behavior as before. - The way SSL support is configured has been changed. Before this release, the SOLR_SSL_OPTS property configured in solr.in.sh (linux/mac) or solr.in.cmd (windows) was used to enable/disable SSL but starting in 5.2.0, new properties named SOLR_SSL_KEY_STORE, SOLR_SSL_KEY_STORE_PASSWORD, SOLR_SSL_TRUST_STORE, SOLR_SSL_TRUST_STORE_PASSWORD, SOLR_SSL_NEED_CLIENT_AUTH and SOLR_SSL_WANT_CLIENT_AUTH have been introduced. The bin/solr scripts configure the SOLR_SSL_OPTS property automatically based on the above new properties. You should *not* configure the SOLR_SSL_OPTS property directly inside solr.in.{sh,cmd}. - Support for SOLR_SSL_PORT property has been removed. Instead use the regular SOLR_PORT property or specify the port while invoking the bin/solr script using the "-p" switch. - Furthermore, it is now possible to configure the HTTP client with different SSL properties than the ones used for Jetty using the same files. - Please refer to the "Enabling SSL" section in the Solr Reference Guide for complete details. * Support for pathPrefix has been completely removed from Solr.
Since 5.0, Solr no longer officially supports being run as a webapp but allowed users to play around with the web.xml to have a path prefix. That is no longer possible. See SOLR-7500 for more info. * The package structure under org.apache.solr.client.solrj.io has been changed to support the Streaming Expression Language (SOLR-7377). Any code written with the 5.1 Streaming API will have to be updated to reflect these changes. * Merge Policy's "noCFSRatio" is no longer set based on the <useCompoundFile> element in the indexConfig section of solrconfig.xml. This means that Solr will start using Lucene's default for MP "noCFSRatio"; with this new default Solr will decide if a segment should use cfs or not based on the size of the segment in relation to the size of the complete index. For TieredMergePolicy for example (current default), segments will use cfs if they are less than 10% of the index, otherwise cfs is disabled. Old values for this setting (1.0 for useCompoundFile=true and 0.0 for useCompoundFile=false) as well as any other value can be set inside the <mergePolicy> element in solrconfig.xml. <useCompoundFile> will only apply to newly created segments. See SOLR-7463. Detailed Change List ---------------------- New Features ---------------------- * SOLR-6637: Restore a Solr core from a backed up index. Restore API Example - http://localhost:8983/solr/techproducts/replication?command=restore&name=backup_name Restore Status API Example - http://localhost:8983/solr/techproducts/replication?command=restorestatus (Varun Thacker, noble, shalin) * SOLR-7241, SOLR-7263, SOLR-7279, SOLR-7300, SOLR-7396, SOLR-7397, SOLR-7492: Admin UI - Refactoring using AngularJS. More functionality moving the Admin UI to Angular JS (Upayavira via Erick) * SOLR-7372: Limit memory consumed by LRUCache with a new 'maxRamMB' config parameter. (yonik, shalin) * SOLR-7376: Return raw XML or JSON (in the appropriate writer) using DocumentTransformers.
?fl=id,name,json_s:[json],xml_s:[xml] (ryan) * SOLR-7422: Optional flatter form for the JSON Facet API via a "type" parameter: top_authors : { type:terms, field:author, limit:5 } is equivalent to top_authors : { terms : { field:author, limit:5 } } (yonik) * SOLR-7176: zkcli script can perform the CLUSTERPROP command without a running Solr cluster (Hrishikesh Gadre, Per Steffensen, Noble Paul) * SOLR-7417: JSON Facet API - unique() is now implemented for numeric and date fields. (yonik) * SOLR-7406: Add a new "facet.range.method" parameter to let users choose how to do range faceting between an implementation based on filters (previous algorithm, using "facet.range.method=filter") or DocValues ("facet.range.method=dv"). Input parameters and output of both methods are the same. (Tomás Fernández Löbbe) * SOLR-7473: Facet Module (Json Facet API) range faceting now supports the "mincount" parameter in range facets to suppress buckets with a count below that value. The default for "mincount" remains 0 for range faceting. Example: prices:{ type:range, field:price, mincount:1, start:0, end:100, gap:10 } (yonik) * SOLR-7437: Make HDFS transaction log replication factor configurable. (Mark Miller) * SOLR-7477: Multi-select faceting support for the Facet Module via the "excludeTags" parameter which disregards any matching tagged filters for that facet. Example: & q=shoes & fq={!tag=COLOR}color:blue & json.facet={ colors:{type:terms, field:color, excludeTags:COLOR} } (yonik) * SOLR-7231: DIH-TikaEntityprocessor, create lat-lon field from Metadata (Tim Allison via Noble Paul) * SOLR-6220: Rule Based Replica Assignment during collection, shard creation and replica creation (Noble Paul) * SOLR-6968: New 'cardinality' option for stats.field, uses HyperLogLog to efficiently estimate the cardinality of a field w/bounded RAM.
(hossman) * SOLR-4392: Make it possible to specify AES encrypted password in dataconfig.xml (Noble Paul) * SOLR-7461: stats.field now supports individual local params for 'countDistinct' and 'distinctValues'. 'calcdistinct' is still supported as an alias for both options (hossman) * SOLR-7522: Facet Module - Implement field/terms faceting over single-valued numeric fields. (yonik) * SOLR-7275: Authorization framework for Solr. It defines an interface and a mechanism to create, load, and use an Authorization plugin. (Noble Paul, Ishan Chattopadhyaya, Anshum Gupta) * SOLR-7377: Solr Streaming Expressions (Dennis Gove, Joel Bernstein, Steven Bower) * SOLR-7553: Facet Analytics Module: new "hll" function that uses HyperLogLog to calculate distributed cardinality. The original "unique" function is still available. Example: json.facet={ numProducts : "hll(product_id)" } (yonik) * SOLR-7546: bin/post (and SimplePostTool in -Dauto=yes mode) now sends rather than skips files without a known content type, as "application/octet-stream", provided it still is in the allowed filetypes setting. (ehatcher) * SOLR-7274: Pluggable authentication module in Solr. This defines an interface and a mechanism to create, load, and use an Authentication plugin. (Noble Paul, Ishan Chattopadhyaya, Gregory Chanan, Anshum Gupta) * SOLR-7379: (experimental) New spatial RptWithGeometrySpatialField, based on CompositeSpatialStrategy, which blends RPT indexes for speed with serialized geometry for accuracy. Includes a Lucene segment based in-memory shape cache. (David Smiley) * SOLR-7465, SOLR-7610: New file indexing example, under example/files. (Esther Quansah, Erik Hatcher) * SOLR-7468: Kerberos authentication plugin for Solr. This allows running a Kerberized Solr.
(Noble Paul, Ishan Chattopadhyaya, Gregory Chanan, Anshum Gupta) Bug Fixes ---------------------- * SOLR-6709: Fix QueryResponse to deal with the "expanded" section when using the XMLResponseParser (Varun Thacker, Joel Bernstein) * SOLR-7066: autoAddReplicas feature has a bug when selecting replacement nodes. (Mark Miller) * SOLR-7370: FSHDFSUtils#recoverFileLease tries to recover the lease every one second after the first four second wait. (Mark Miller) * SOLR-7369: AngularJS UI insufficient URLDecoding in cloud/tree view (janhoy) * SOLR-7380: SearchHandler should not try to load runtime components in inform() (Noble Paul) * SOLR-7385: The clusterstatus API now returns the config set used to create a collection inside a 'configName' key. (Shai Erera, shalin) * SOLR-7401: Fixed a NullPointerException when concurrently creating and deleting collections, while accessing other collections. (Shai Erera) * SOLR-7412: Fixed facet.range.other parameter for distributed requests. (Will Miller, Tomás Fernández Löbbe) * SOLR-6087: SolrIndexSearcher makes no DelegatingCollector.finish() call when IndexSearcher throws an expected exception. (Christine Poerschke via shalin) * SOLR-7420: Overseer stats are not reset on loss of ZK connection. (Jessica Cheng, shalin) * SOLR-7392: Fix SOLR_JAVA_MEM and SOLR_OPTS customizations in solr.in.sh being ignored (Ramkumar Aiyengar, Ere Maijala) * SOLR-7426: SolrConfig#getConfigOverlay does not clean up its resources. (Mark Miller) * SOLR-6665: ZkController.publishAndWaitForDownStates can return before all local cores are marked as 'down' if multiple replicas with the same core name exist in the cluster. (shalin) * SOLR-7418: Check and raise a SolrException instead of an NPE when an invalid doc id is sent to the MLTQParser. (Anshum Gupta) * SOLR-7443: Implemented range faceting over date fields in the new facet module (JSON Facet API). (yonik) * SOLR-7440: DebugComponent does not return the right requestPurpose for pivot facet refinements.
(shalin) * SOLR-7408: Listeners set by SolrCores on config directories in ZK could be removed if collections are created/deleted in parallel against the same config set. (Shai Erera, Anshum Gupta) * SOLR-7450: Fix edge case which could cause `bin/solr stop` to hang forever (Ramkumar Aiyengar) * SOLR-7157: initParams must support tags other than appends, defaults, and invariants (Noble Paul) * SOLR-7387: Facet Module - distributed search didn't work when sorting terms facet by min, max, avg, or unique functions. (yonik) * SOLR-7469: Fix check-licenses to correctly detect if start.jar.sha1 is incorrect (hossman) * SOLR-7449: solr/server/etc/jetty-https-ssl.xml hard codes the key store file and password rather than pulling them from the sysprops defined in solr/bin/solr.in.{sh,cmd} * SOLR-7470: Fix sample data to eliminate file order dependency for successful indexing, also fixed SolrCloudExampleTest to help catch this in the future. (hossman) * SOLR-7478: UpdateLog#close shuts down its executor with interrupts before running its close logic, possibly preventing a clean close. (Mark Miller) * SOLR-7494: Facet Module - unique() facet function was wildly inaccurate for high cardinality fields. (Andy Crossen, yonik) * SOLR-7502: start script should not try to create configset for .system collection (Noble Paul) * SOLR-7514: SolrClient.getByIds fails with ClassCastException (Tom Farnworth, Ramkumar Aiyengar) * SOLR-7531: config API shows a few keys merged together (Noble Paul) * SOLR-7542: Schema API: Can't remove single dynamic copy field directive (Steve Rowe) * SOLR-7472: SortingResponseWriter does not log fl parameters that don't exist. (Joel Bernstein) * SOLR-7545: Honour SOLR_HOST parameter with bin/solr{,.cmd} (Ishan Chattopadhyaya via Ramkumar Aiyengar) * SOLR-7503: Recovery after ZK session expiration should happen in parallel for all cores using the thread-pool managed by ZkContainer instead of a single thread.
(Jessica Cheng Mallet, Timothy Potter, shalin, Mark Miller) * SOLR-7335: Fix doc boosts to no longer be multiplied in each field value in multivalued fields that are not used in copyFields (Shingo Sasaki via hossman) * SOLR-7585: Fix NoSuchElementException in LFUCache resulting from heavy writes making concurrent put() calls. (Maciej Zasada via Shawn Heisey) * SOLR-7587: Seeding bucket versions from index when the firstSearcher event fires has a race condition that leads to an infinite wait on VersionInfo's ReentrantReadWriteLock because the read-lock acquired during a commit cannot be upgraded to a write-lock needed to block updates; solution is to move the call out of the firstSearcher event path and into the SolrCore constructor. (Timothy Potter) * SOLR-7625: Ensure that the max value for seeding version buckets is updated after recovery even if the UpdateLog is not replayed. (Timothy Potter) * SOLR-7610: Fix VelocityResponseWriter's $resource.locale to accurately report locale in use. (ehatcher) * SOLR-7614: Distributed pivot facet refinement was broken due to a single correlation counter used across multiple requests as if it was private to each request. (yonik) Optimizations ---------------------- * SOLR-7324: IndexFetcher does not need to call isIndexStale if full copy is already needed (Stephan Lagraulet via Varun Thacker) * SOLR-7547: Short circuit SolrDispatchFilter for static content requests. Right now it creates a new HttpSolrCall object and tries to process it. (Anshum Gupta) * SOLR-7333: Make the poll queue time a leader uses when distributing updates to replicas configurable and use knowledge that a batch is being processed to poll efficiently. (Timothy Potter) * SOLR-7332: Initialize the highest value for all version buckets with the max value from the index or recent updates to avoid unnecessary lookups to the index to check for reordered updates when processing new documents.
(Timothy Potter, yonik) * SOLR-5855: DefaultSolrHighlighter now re-uses the document's term vectors instance when highlighting more than one field. Applies to the standard and FVH highlighters. (David Smiley, Daniel Debray) Other Changes ---------------------- * SOLR-6865: Upgrade HttpClient to 4.4.1 (Shawn Heisey) * SOLR-7358: TestRestoreCore fails in Windows (Ishan Chattopadhyaya via Varun Thacker) * SOLR-7371: Make DocSet implement Accountable to estimate memory usage. (yonik, shalin) * SOLR-7381: Improve logging by adding node name in MDC in SolrCloud mode and adding MDC to all thread pools. A new MDCAwareThreadPoolExecutor is introduced and usages of Executors#newFixedThreadPool, #newSingleThreadExecutor, #newCachedThreadPool as well as ThreadPoolExecutor directly are now forbidden in Solr. MDC keys are now exposed in thread names automatically so that a thread dump can give hints on what the thread was doing. Uncaught exceptions thrown by tasks in the pool are logged along with submitter's stack trace. (shalin) * SOLR-7384: Fix spurious failures in FullSolrCloudDistribCmdsTest. (shalin) * SOLR-6692: Default highlighter changes: - hl.maxAnalyzedChars now applies cumulatively on a multi-valued field. - fragment ranking on a multi-valued field should be more relevant. - hl.usePhraseHighlighter is now toggleable on a per-field basis. - Much more extensible (get values from another source; return snippet scores and offsets). - When using hl.maxMultiValuedToMatch with hl.preserveMulti, only count matched snippets. (David Smiley) * SOLR-6886: Removed redundant size check and added missing calls to DelegatingCollector.finish inside Grouping code. (Christine Poerschke via shalin) * SOLR-7421: RecoveryAfterSoftCommitTest fails frequently on Jenkins due to full index replication taking longer than 30 seconds. (Timothy Potter, shalin) * SOLR-7081: Add new test case to test if create/delete/re-create collections work.
(Christine Poerschke via Ramkumar Aiyengar) * SOLR-7467: Upgrade t-digest to 3.1 (hossman) * SOLR-7471: Stop requiring docValues for interval faceting (Tomás Fernández Löbbe) * SOLR-7391: Use a time based expiration cache for one off HDFS FileSystem instances. (Mark Miller) * SOLR-5213: Log when shard splitting unexpectedly leads to documents going to no or multiple shards (Christine Poerschke, Ramkumar Aiyengar) * SOLR-7425: Improve MDC based logging format. (Mark Miller) * SOLR-4839: Upgrade Jetty to 9.2.10.v20150310 and restlet-jee to 2.3.0 (Bill Bell, Timothy Potter, Uwe Schindler, Mark Miller, Steve Rowe, Steve Davids, shalin) * SOLR-7457: Make DirectoryFactory publishing MBeanInfo extensible. (Mike Drob via Mark Miller) * SOLR-7325: Slice.getState() now returns a State enum instead of a String. This helps clarify the states a Slice can be in, as well as comparing the state of a Slice. (Shai Erera) * SOLR-7336: Added Replica.getState() and removed ZkStateReader state-related constants. You should use Replica.State to compare a replica's state. (Shai Erera) * SOLR-7487: Fix check-example-lucene-match-version Ant task to check luceneMatchVersion in solr/server/solr/configsets instead of example and harden error checking / validation logic. (hossman, Timothy Potter) * SOLR-7409: When there are multiple dataimport handlers defined, the admin UI was listing them in a random order. Now they are sorted in a natural order that handles numbers properly. (Jellyfrog via Shawn Heisey) * SOLR-7484: Refactor SolrDispatchFilter to extract all Solr specific implementation detail to HttpSolrCall and also extract methods from within the current SDF.doFilter(..) logic making things easier to manage. HttpSolrCall converts the processing to a 3-step process i.e. Construct, Init, and Call so the context of the request would be available after Init and before the actual call operation.
(Anshum Gupta, Noble Paul) * SOLR-6878: Allow symmetric lists of synonyms to be added using the managed synonym REST API to support legacy expand=true type mappings; previously the API only allowed adding explicit mappings, with this feature you can now add a list and have the mappings expanded when the update is applied (Timothy Potter, Vitaliy Zhovtyuk, hossman) * SOLR-7102: bin/solr should activate cloud mode if ZK_HOST is set (Timothy Potter) * SOLR-7500: Remove pathPrefix from SolrDispatchFilter as Solr no longer runs as a part of a bigger webapp. (Anshum Gupta) * SOLR-7243: CloudSolrClient was always returning SERVER_ERROR for exceptions, even when a more relevant ErrorCode was available, via SolrException. Now the actual ErrorCode is used when available. (Hrishikesh Gadre via Shawn Heisey) * SOLR-7544: CollectionsHandler refactored to be more modular (Noble Paul) * SOLR-7532: Removed occurrences of the unused 'commitIntervalLowerBound' property for updateHandler elements from Solr configuration. (Marius Grama via shalin) * SOLR-7541: Removed CollectionsHandler#createNodeIfNotExists. All calls made to this method now call ZkCmdExecutor#ensureExists as they were doing the same thing. Also ZkCmdExecutor#ensureExists now respects the CreateMode passed to it. (Varun Thacker) * SOLR-6820: Make the number of version buckets used by the UpdateLog configurable as increasing beyond the default 256 has been shown to help with high volume indexing performance in SolrCloud; helps overcome a limitation where Lucene uses the request thread to perform expensive index housekeeping work. (Mark Miller, yonik, Timothy Potter) * SOLR-7463: Stop forcing MergePolicy's "NoCFSRatio" based on the IWC "useCompoundFile" configuration (Tomás Fernández Löbbe) * SOLR-7582: Allow auto-commit to be set with system properties in data_driven_schema_configs and enable auto soft-commits for the bin/solr -e cloud example using the Config API. 
(Timothy Potter) * SOLR-7183: Fix Locale blacklisting for Minikdc based tests. (Ishan Chattopadhyaya, hossman via Anshum Gupta) * SOLR-7662: Refactored response writing to consolidate the logic in one place (Noble Paul) * SOLR-7110: Added option to optimize JavaBinCodec to minimize string Object creation (Noble Paul) ================== 5.1.0 ================== Consult the LUCENE_CHANGES.txt file for additional, low level, changes in this release Versions of Major Components --------------------- Apache Tika 1.7 Carrot2 3.9.0 Velocity 1.7 and Velocity Tools 2.0 Apache UIMA 2.3.1 Apache ZooKeeper 3.4.6 Jetty 8.1.10.v20130312 Upgrading from Solr 5.0 ----------------------- * SolrClient query functions now declare themselves as throwing IOException in addition to SolrServerException, to bring them in line with the update functions. * SolrRequest.process() is now final. Subclasses should instead be parameterized by their corresponding SolrResponse type, and implement createResponse() * The signature of SolrDispatchFilter.createCoreContainer() has changed to take (String,Properties) arguments * Deprecated the 'lib' option added to create-requesthandler as part of SOLR-6801 in 5.0 release. Please use the add-runtimelib command * Tika's runtime dependency of 'jhighlight' was removed as the latter was found to contain some LGPL-only code. Until that's resolved by Tika, you can download the .jar yourself and place it under contrib/extraction/lib. * The _text catch-all field in data_driven_schema_configs has been renamed to _text_. Detailed Change List ---------------------- New Features ---------------------- * SOLR-6909: Extract atomic update handling logic into AtomicUpdateDocumentMerger class and enable subclassing. (Steve Davids, yonik) * SOLR-6845: Add a “buildOnStartup” option for suggesters. (Tomás Fernández Löbbe) * SOLR-6449: Add first class support for Real Time Get in Solrj. 
(Anurag Sharma, Steve Davids via shalin) * SOLR-6954: SolrClient now implements Closeable, and shutdown() has been deprecated in favour of close(). (Mark Miller, Tomás Fernández Löbbe, Alan Woodward) * SOLR-4905: Allow fromIndex parameter to JoinQParserPlugin to refer to a single-sharded collection that has a replica on all nodes where there is a replica in the to index (Jack Lo, Timothy Potter) * SOLR-6648: Add support in AnalyzingInfixLookupFactory and BlendedInfixLookupFactory for setting 'highlight' and 'allTermsRequired' in the suggester configuration. (Boon Low, Varun Thacker via Tomás Fernández Löbbe) * SOLR-7083: Support managing all named components in solrconfig such as requestHandler, queryParser, queryResponseWriter, valueSourceParser, transformer, queryConverter (Noble Paul) * SOLR-7005: Spatial 2D heatmap faceting on RPT fields via new facet.heatmap with PNG and 2D int array formats. (David Smiley) * SOLR-7019: Support changing field key when using interval faceting. (Tomás Fernández Löbbe) * SOLR-6832: Queries can be served locally rather than being forwarded to another replica. (Sachin Goyal, Timothy Potter) * SOLR-1945: Add support for child docs in DocumentObjectBinder (Noble Paul, Mark Miller) * SOLR-7125, SOLR-7158: You can upload and download configurations via CloudSolrClient (Alan Woodward, Ishan Chattopadhyaya) * SOLR-5507: Admin UI - Refactoring using AngularJS, first part (Upayavira via Erick Erickson) * SOLR-7164: BBoxField defaults sub fields to not-stored (ryan) * SOLR-7155,SOLR-7201: All SolrClient methods now take an optional 'collection' argument (Alan Woodward, Shawn Heisey) * SOLR-6359: Allow number of logs and records kept by UpdateLog to be configured (Ramkumar Aiyengar) * SOLR-7189: Allow DIH to extract content from embedded documents via Tika. (Tim Allison via shalin) * SOLR-6841: Visualize lucene segment information in Admin UI.
(Alexey Kozhemiakin, Michal Bienkowski, hossman, Shawn Heisey, Varun Thacker via shalin) * SOLR-5846: EnumField supports DocValues functionality. (Elran Dvir, shalin) * SOLR-4044: CloudSolrClient.connect() throws a more useful exception if the cluster is not ready, and can now take an optional timeout argument to wait for the cluster. (Alan Woodward, shalin, yonik, Mark Miller, Vitaliy Zhovtyuk) * SOLR-7073: Support adding a jar to a collections classpath (Noble Paul) * SOLR-7126: Secure loading of runtime external jars (Noble Paul) * SOLR-6349: Added support for stats.field localparams to enable/disable individual stats to limit the amount of computation done and the amount of data returned. eg: stats.field={!min=true max=true}field_name (Tomas Fernandez-Lobbe, Xu Zhang, hossman) * SOLR-7218: lucene/solr query syntax to give any query clause a constant score. General Form: ^= Example: (color:blue color:green)^=2.0 text:shoes (yonik) * SOLR-7214: New Facet module with a JSON API, facet functions, aggregations, and analytics. Any facet type can have sub facets, and facets can be sorted by arbitrary aggregation functions. Examples: json.facet={x:'avg(price)', y:'unique(color)'} json.facet={count1:{query:"price:[10 TO 20]"}, count2:{query:"color:blue AND popularity:[0 TO 50]"} } json.facet={categories:{terms:{field:cat, sort:"x desc", facet:{x:"avg(price)", y:"sum(price)"}}}} (yonik) * SOLR-6141: Schema API: Remove fields, dynamic fields, field types and copy fields; and replace fields, dynamic fields and field types. (Steve Rowe) * SOLR-7217: HTTP POST body is auto-detected when the client is curl and the content type is form data (curl's default), allowing users to use curl to send JSON or XML without having to specify the content type. 
(yonik) * SOLR-6892: Update processors can now be top-level components and they can be specified in request to create a new custom update chain (Noble Paul) * SOLR-7216: Solr JSON Request API: - HTTP search requests can have a JSON body. - JSON request can also be passed via the "json" parameter. - Smart merging of multiple JSON parameters: query parameters starting with "json." will be merged into the JSON request. - Legacy query parameters can also be passed in the "params" block of the JSON request. (yonik) * SOLR-7245: Temporary ZK election or connection loss should not stall indexing due to leader initiated recovery (Ramkumar Aiyengar) * SOLR-6350: StatsComponent now supports Percentiles (Xu Zhang, hossman) * SOLR-7306: Percentiles support for the new facet module. Percentiles can be calculated for all facet buckets and field faceting can sort by percentile values. Examples: json.facet={ median_age : "percentile(age,50)" } json.facet={ salary_percentiles : "percentile(salary,25,50,75)" } (yonik) * SOLR-7307: EmbeddedSolrServer can now be started up by passing a path to a solr home directory, or a NodeConfig object (Alan Woodward, Mike Drob) * SOLR-1387: Add facet.contains and facet.contains.ignoreCase options (Tom Winch via Alan Woodward) * SOLR-7082: Streaming Aggregation for SolrCloud (Joel Bernstein, Yonik Seeley) * SOLR-7212: Parameter substitution / macro expansion across entire request. Substitution can contain further expansions and default values are supported. Example: q=price:[ ${low:0} TO ${high} ]&low=100&high=200 (yonik) * SOLR-7226: Make /query/*, jmx/*, and requestDispatcher/* properties in solrconfig.xml editable (Noble Paul) * SOLR-7240: '/' redirects to '/solr/' for convenience (Martijn Koster, hossman) * SOLR-5911: Added payload support for term vectors. New "termPayloads" option for fields / types in the schema, and "tv.payloads" param for the term vector component.
(Mike McCandless, David Smiley) * SOLR-5132: Added a new collection action MODIFYCOLLECTION (Noble Paul) Bug Fixes ---------------------- * SOLR-7046: NullPointerException when group.function uses query() function. (Jim Musil via Erick Erickson) * SOLR-7072: Multiple mlt.fl does not work. (Constantin Mitocaru, shalin) * SOLR-6775: Creating backup snapshot results in null pointer exception. (Ryan Hesson, Varun Thacker via shalin) * SOLR-5890: Delete silently fails if not sent to shard where document was added (Ishan Chattopadhyaya, Noble Paul) * SOLR-7101: JmxMonitoredMap can throw an exception in clear when queryNames fails. (Mark Miller, Wolfgang Hoschek) * SOLR-6214: Snapshots numberToKeep param only keeps n-1 backups. (Mathias H., Ramana, Varun Thacker via shalin) * SOLR-7084: FreeTextSuggester: Better error message when doing a lookup during dictionary build. Used to be nullpointer (janhoy) * SOLR-6956: OverseerCollectionProcessor and replicas on the overseer node can sometimes operate on stale cluster state due to overseer holding the state update lock for a long time. (Mark Miller, shalin) * SOLR-7104: Propagate property prefix parameters for ADDREPLICA Collections API call. (Varun Thacker via Anshum Gupta) * SOLR-7113: Multiple calls to UpdateLog#init is not thread safe with respect to the HDFS FileSystem client object usage. (Mark Miller, Vamsee Yarlagadda) * SOLR-7128: Two phase distributed search is fetching extra fields in GET_TOP_IDS phase. (Pablo Queixalos, shalin) * SOLR-7139: Fix SolrContentHandler for TIKA to ignore multiple startDocument events. (Chris A. Mattmann, Uwe Schindler) * SOLR-7178: OverseerAutoReplicaFailoverThread compares Integer objects using == (shalin) * SOLR-7171: BaseDistributedSearchTestCase now clones getSolrHome() for each subclass, and consistently uses getSolrXml(). 
(hossman) * SOLR-6657: DocumentDictionaryFactory requires weightField to be mandatory, but it shouldn't (Erick Erickson) * SOLR-7206: MiniSolrCloudCluster wasn't dealing with SSL mode correctly (Alan Woodward) * SOLR-4464: DIH Processed documents counter resets to zero after first entity is processed. (Dave Cook, Shawn Heisey, Aaron Greenspan, Thomas Champagne via shalin) * SOLR-7209: /update/json/docs carry forward fields from previous records (Noble Paul) * SOLR-7195: Fixed a bug where the bin/solr shell script would incorrectly detect another Solr process listening on the same port number. If the requested listen port was 8983, it would match on another Solr using port 18983 for any purpose. Also escapes the dot character in all grep commands looking for start.jar. (Xu Zhang via Shawn Heisey) * SOLR-6682: Fix response when using EnumField with StatsComponent (Xu Zhang via hossman) * SOLR-7109: Indexing threads stuck during network partition can put leader into down state. (Mark Miller, Anshum Gupta, Ramkumar Aiyengar, yonik, shalin) * SOLR-7092: Stop the HDFS lease recovery retries in HdfsTransactionLog on close and try to avoid lease recovery on closed files. (Mark Miller) * SOLR-7285: ActionThrottle will not pause if getNanoTime first returns 0. (Mark Miller, Gregory Chanan) * SOLR-7141: RecoveryStrategy: Raise time that we wait for any updates from the leader before they saw the recovery state to have finished. (Mark Miller) * SOLR-7248: In legacyCloud=false mode we should check if the core was hosted on the same node before registering it (Varun Thacker, Noble Paul, Mark Miller) * SOLR-7294: Migrate API fails with 'Invalid status request: notfoundretried 6times' message. (Jessica Cheng Mallet, shalin) * SOLR-7254: Make an invalid negative start/rows throw a HTTP 400 error (Bad Request) instead of causing a 500 error. (Ramkumar Aiyengar, Hrishikesh Gadre, yonik) * SOLR-7305: BlendedInfixLookupFactory swallows root IOException when it occurs. 
(Stephan Lagraulet via shalin) * SOLR-7293: Fix bug that Solr server does not listen on IPv6 interfaces by default. (Uwe Schindler, Sebastian Pesman) * SOLR-7298: Fix Collections API calls (SolrJ) to not add name parameter when not needed. (Shai Erera, Anshum Gupta) * SOLR-7134: Replication can still cause index corruption. (Mark Miller, shalin, Mike Drob) * SOLR-7309: Make bin/solr, bin/post work when Solr installation directory contains spaces (Ramkumar Aiyengar, Martijn Koster) * SOLR-6924: The config API forcefully refreshes all replicas in the collection to ensure all are updated (Noble Paul) * SOLR-7266: The IgnoreCommitOptimizeUpdateProcessor blocks commit requests from replicas needing to recover. (Jessica Cheng Mallet, Timothy Potter) * SOLR-7299: bin\solr.cmd doesn't use jetty SSL configuration. (Steve Rowe) * SOLR-7334: Admin UI does not show "Num Docs" and "Deleted Docs". (Erick Erickson, Timothy Potter) * SOLR-7338, SOLR-6583: A reloaded core will never register itself as active after a ZK session expiration (Mark Miller, Timothy Potter) * SOLR-7366: Can't index example XML docs into the cloud example using bin/post due to regression in ManagedIndexSchema's handling of ResourceLoaderAware objects used by field types (Steve Rowe, Timothy Potter) * SOLR-7284: HdfsUpdateLog is using hdfs FileSystem.get without turning off the cache. (Mark Miller) * SOLR-7286: Using HDFS's FileSystem.newInstance does not guarantee a new instance. (Mark Miller) * SOLR-7508: SolrParams.toMultiMap() does not handle arrays (Thomas Scheffler , Noble Paul) Optimizations ---------------------- * SOLR-7049: Move work done by the LIST Collections API call to the Collections Handler (Varun Thacker via Anshum Gupta). * SOLR-7116: Distributed facet refinement requests would needlessly compute other types of faceting that have already been computed. 
(David Smiley, Hossman) * SOLR-7239: improved performance of min & max in StatsComponent, as well as situations where local params disable all stats (hossman) * SOLR-7050: realtime get should internally load only fields specified in fl. (yonik, Noble Paul) Other Changes ---------------------- * SOLR-7014: Collapse identical catch branches in try-catch statements. (shalin) * SOLR-6500: Refactor FileFetcher in SnapPuller, add debug logging. (Ramkumar Aiyengar via Mark Miller) * SOLR-7076: In DIH, TikaEntityProcessor should have support for onError=skip (Noble Paul) * SOLR-7094: Better error reporting of JSON parse issues when indexing docs (Ishan Chattopadhyaya via Timothy Potter) * SOLR-7103: Remove unused method params in faceting code. (shalin) * SOLR-6311: When performing distributed queries, SearchHandler should use path when no qt or shard.qt parameter is specified; fix also resolves SOLR-4479. (Steve Molloy, Timothy Potter) * SOLR-7112: Fix DeleteInactiveReplicaTest.deleteLiveReplicaTest test failures. (shalin) * SOLR-6902: Use JUnit rules instead of inheritance with distributed Solr tests to allow for multiple tests without the same class. (Ramkumar Aiyengar, Erick Erickson, Mike McCandless) * SOLR-7032: Clean up test remnants of old-style solr.xml (Erick Erickson) * SOLR-7145: SolrRequest is now parametrized by its response type (Alan Woodward) * SOLR-7142: Fix TestFaceting.testFacets. (Michal Kroliczek via shalin) * SOLR-7156: Fix test failures due to resource leaks on windows. (Ishan Chattopadhyaya via shalin) * SOLR-7147: Introduce new TrackingShardHandlerFactory for monitoring what requests are sent to shards during tests. 
(hossman, shalin) * SOLR-7160: Rename ConfigSolr to NodeConfig, and decouple it from xml representation (Alan Woodward) * SOLR-7166: Encapsulate JettySolrRunner configuration (Alan Woodward) * SOLR-7130: Make stale state notification work without failing the requests (Noble Paul, shalin) * SOLR-7151: SolrClient query methods throw IOException (Alan Woodward) * SOLR-7179: JettySolrRunner no longer passes configuration to SolrDispatchFilter via system properties, but instead uses a Properties object in the servlet context (Alan Woodward) * SOLR-6275: Improve accuracy of QTime reporting (Ramkumar Aiyengar) * SOLR-7174: DIH should reset TikaEntityProcessor so that it is capable of re-use (Alexandre Rafalovitch , Gary Taylor via Noble Paul) * SOLR-6804: Untangle SnapPuller and ReplicationHandler (Ramkumar Aiyengar) * SOLR-7180: MiniSolrCloudCluster will startup and shutdown its jetties in parallel (Alan Woodward, Tomás Fernández Löbbe, Vamsee Yarlagadda) * SOLR-7173: Fix ReplicationFactorTest on Windows by adding better retry support after seeing no response exceptions. (Ishan Chattopadhyaya via Timothy Potter) * SOLR-7246: Speed up BasicZkTest, TestManagedResourceStorage (Ramkumar Aiyengar) * SOLR-7258: Forbid MessageFormat.format and MessageFormat single-arg constructor. (shalin) * SOLR-7162: Remove unused SolrSortField interface. (yonik, Connor Warrington via shalin) * SOLR-6414: Update to Hadoop 2.6.0. (Mark Miller) * SOLR-6673: MDC based logging of collection, shard, replica, core (Ishan Chattopadhyaya , Noble Paul) * SOLR-7291: Test indexing on ZK disconnect with ChaosMonkey tests (Ramkumar Aiyengar) * SOLR-7203: Remove buggy no-op retry code in HttpSolrClient (Alan Woodward, Mark Miller, Greg Solovyev) * SOLR-7202: Remove deprecated string action types in Overseer and OverseerCollectionProcessor - "deletecollection", "createcollection", "reloadcollection", "removecollection", "removeshard". 
(Varun Thacker, shalin) * SOLR-7290: Rename catchall _text field in data_driven_schema_configs to _text_ (Steve Rowe) * SOLR-7346: Stored XSS in Admin UI Schema-Browser page and Analysis page (Mei Wang via Timothy Potter) ================== 5.0.0 ================== Consult the LUCENE_CHANGES.txt file for additional, low level, changes in this release. NOTE: Solr 5.0 only supports creating and removing SolrCloud collections through the collections API, unlike previous versions. While not using the collections API may still work in 5.0, it is unsupported, not recommended, and the behavior will change in a 5.x release. Versions of Major Components --------------------- Apache Tika 1.7 Carrot2 3.9.0 Velocity 1.7 and Velocity Tools 2.0 Apache UIMA 2.3.1 Apache ZooKeeper 3.4.6 Jetty 8.1.10.v20130312 Upgrading from Solr 4.x ---------------------- * Apache Solr has no support for Lucene/Solr 3.x and earlier indexes anymore. Be sure to run Lucene's IndexUpgrader on the previous 4.10 version if you might still have old segments in your index. Alternatively fully optimize your index with Solr 4.10 to make sure it consists only of one up-to-date index segment. * The "file" attribute of infoStream in solrconfig.xml is removed. Control this via your logging configuration (org.apache.solr.update.LoggingInfoStream) instead. * UniqFieldsUpdateProcessorFactory no longer supports the init param style that was deprecated in Solr 4.5. If you are still using this syntax, update your configs to use instead. See SOLR-4249 for more details. * The following legacy numeric and date field types, deprecated in Solr 4.8, are no longer supported: BCDIntField, BCDLongField, BCDStrField, IntField, LongField, FloatField, DoubleField, SortableIntField, SortableLongField, SortableFloatField, SortableDoubleField, and DateField. Convert these types in your schema to the corresponding Trie-based field type and then re-index. See SOLR-5936 for more information. 
* getAnalyzer() in IndexSchema and FieldType that was deprecated in Solr 4.9 has been removed. Use getIndexAnalyzer() instead. See SOLR-6022 for more information. * The spellcheck response format has changed, affecting xml and json clients. In particular, the "correctlySpelled" and "collations" subsections have been moved outside the "suggestions" subsection, and now are directly under "spellcheck". See SOLR-3029 for more information. * The CollectionsAPI SolrJ calls createCollection(), reloadCollection(), deleteCollection(), requestStatus(), createShard(), splitShard(), deleteShard(), createAlias() and deleteAlias() which were deprecated in 4.11 have been removed. The new usage involves a builder style construction of the call. * The OVERSEERSTATUS API returns new key names for operations such as "create" for "createcollection", "delete" for "removecollection" and "deleteshard" for "removeshard". * If you have been using /update/json/docs to index documents, SOLR-6617 introduces a backward-incompatible change: the key names created are now fully qualified paths of keys. If you need the old functionality back, add the extra parameter f=/** (example: /update/json/docs?f=/**). * Bugs fixed in several ValueSource functions may result in different behavior in situations where some documents do not have values for fields wrapped in other value sources. Users who want to preserve the previous behavior may need to wrap fields in the "def()" function. Example: changing "fl=sum(fieldA,fieldB)" to "fl=sum(def(fieldA,0.0),def(fieldB,0.0))". See LUCENE-5961 for more details. * AdminHandlers is deprecated, /admin/* are implicitly defined, /get, /replication and handlers are also implicitly registered (refer to SOLR-6792) * SolrCore.reload(ConfigSet coreConfig, SolrCore prev) was deprecated in 4.10.3 and removed in 5.0. Use SolrCore.reload(ConfigSet coreConfig) instead. See SOLR-5864.
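As a rough illustration of the "def()" note above, the following Python sketch rewrites a function call so that every field argument is wrapped in def() with an explicit default. The helper name wrap_fields_with_def, the field names, and the query q=*:* are hypothetical and chosen only for this example; the sketch merely builds the parameter string and does not contact a Solr server.

```python
from urllib.parse import urlencode


def wrap_fields_with_def(func_name, fields, default="0.0"):
    """Return e.g. 'sum(def(fieldA,0.0),def(fieldB,0.0))'.

    Hypothetical helper: wraps each field name in def(field,default)
    so documents missing a value fall back to the given default.
    """
    wrapped = ",".join("def({},{})".format(name, default) for name in fields)
    return "{}({})".format(func_name, wrapped)


# Build the fl parameter from the release-note example.
fl_value = wrap_fields_with_def("sum", ["fieldA", "fieldB"])

# URL-encode it together with an assumed match-all query.
query_string = urlencode({"q": "*:*", "fl": fl_value})
print(fl_value)
print(query_string)
```

The default of 0.0 mirrors the example in the note; a different default may be more appropriate for functions such as product(), where a missing value should probably not zero out the result.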
* The "termIndexInterval" option in solrconfig.xml has been a No-Op in the default codec since Solr 4.0, and has been removed completely in 5.0. If you get an "Illegal parameter 'termIndexInterval'" error when upgrading, you can safely remove this option from your configs. If you have a strong need to configure this, you must explicitly configure your schema with a custom codec. See SOLR-6560 and for more details. * The "checkIntegrityAtMerge" option in solrconfig.xml is now a No-Op and should be removed from any solrconfig.xml files -- these integrity checks are now done automatically at a very low level during the segment merging process. See SOLR-6834 for more details. * SimplePostTool (post.jar) no longer defaults to collection1, making either of core/collection name or update URL mandatory. An existing call without an explicit update URL needs to now have the core/collection name passed as "-Dc=" e.g.: java -jar -Dc= post.jar *.xml (new call with collection name) See SOLR-6852 for more details. * Relative paths specified in the solr.xml coreRootDirectory parameter for core discovery are now resolved relative to SOLR_HOME, rather than cwd. See SOLR-6718. * SolrServer and associated classes have been deprecated. Applications using SolrJ should use the equivalent SolrClient classes instead. * Spatial fields originating from Solr 4 (e.g. SpatialRecursivePrefixTreeFieldType, BBoxField) have the 'units' attribute deprecated, now replaced with 'distanceUnits'. If you change it to a unit other than 'degrees' (or if you don't specify it, which will default to kilometers if geo=true), then be sure to update maxDistErr as it's in those units. If you keep units=degrees then it should be backwards compatible but you'll get a deprecation warning on startup. See SOLR-6797. * The configuration in solrconfig.xml has been discontinued and should be removed from solrconfig.xml. 
Solr defaults to using NRT searchers regardless of the value in configuration and a warning is logged on startup if the solrconfig.xml has specified. * There was an old spatial syntax to specify a circle using Circle(x,y d=...) which should be replaced with simply using {!geofilt} (if you can) or BUFFER(POINT(x y),d). Likewise a rect syntax comprised of minX minY maxX maxY that should now be replaced with ENVELOPE(minX, maxX, maxY, minY). * Due to changes in the underlying commons-codec package, users of the BeiderMorseFilterFactory will need to rebuild their indexes after upgrading. See LUCENE-6058 for more details. * CachedSqlEntityProcessor has been removed, use SqlEntityProcessor with the cacheImpl parameter. * HttpDataSource has been removed, use URLDataSource instead. * LegacyHTMLStripCharFilter has been removed * CoreAdminRequest.persist() call has been removed. All changes made via CoreAdmin are persistent. * SpellCheckResponse.getSuggestions() and getSuggestionFrequencies() have been removed, use getAlternatives() and getAlternativeFrequencies() instead. * SolrQuery deprecated methods have been removed: - setMissing() is now setFacetMissing() - getFacetSort() is now getFacetSortString() - setFacetSort(boolean) should instead use setFacetSort(String) with FacetParams.FACET_SORT_COUNT or FacetParams.FACET_SORT_INDEX - setSortField(String, ORDER) should use setSort(SortClause) - addSortField(String, ORDER) should use addSort(SortClause) - removeSortField(String, ORDER) should use removeSort(SortClause) - getSortFields() should use getSorts() - set/getQueryType() should use set/getRequestHandler() * ClientUtil deprecated date methods have been removed, use DateUtil instead * FacetParams.FacetDateOther has been removed, use FacetRangeOther * ShardParams.SHARD_KEYS has been removed, use ShardParams._ROUTE_ * The 'old-style' solr.xml format is no longer supported, and cores must be defined using core.properties files. 
See https://cwiki.apache.org/confluence/display/solr/Format+of+solr.xml Detailed Change List ---------------------- New Features ---------------------- * SOLR-6103: Added DateRangeField for indexing date ranges, especially multi-valued ones. Supports facet.range, DateMath, and is mostly interoperable with TrieDateField. Based on LUCENE-5648. (David Smiley) * SOLR-6403: TransactionLog replay status logging. (Mark Miller) * SOLR-4580: Support for protecting content in ZooKeeper. (Per Steffensen, Mark Miller) * SOLR-6365: specify appends, defaults, invariants outside of the request handler. (Noble Paul, Erik Hatcher, shalin) * SOLR-5097: Schema API: Add REST support for adding dynamic fields to the schema. (Steve Rowe) * SOLR-5098: Schema API: Add REST support for adding field types to the schema. (Timothy Potter) * SOLR-5473 : Split clusterstate.json per collection and watch states selectively (Noble Paul, Mark Miller, shalin, Jessica Cheng Mallet, Timothy Potter, Anshum Gupta) * SOLR-5474 : Support for SOLR-5473 in SolrJ (Timothy Potter, Noble Paul, Mark Miller) * SOLR-5810 : Support for SOLR-5473 in solr admin UI (Timothy Potter, Noble Paul) * SOLR-6482: Add an onlyIfDown flag for DELETEREPLICA collections API command (Erick Erickson) * SOLR-6354: stats.field can now be used to generate stats over the numeric results of arbitrary functions, ie: stats.field={!func}product(price,popularity) (hossman) * SOLR-6485: ReplicationHandler should have an option to throttle the speed of replication (Varun Thacker, Noble Paul) * SOLR-6543: Give HttpSolrClient the ability to send PUT requests (Gregory Chanan) * SOLR-5986: Don't allow runaway queries from harming Solr cluster health or search performance (Anshum Gupta, Steve Rowe, Robert Muir) * SOLR-6565: SolrRequest support for query params (Gregory Chanan) * SOLR-6476: Create a bulk mode for schema API (Noble Paul, Steve Rowe) * SOLR-6512: Add a collections API call to add/delete arbitrary properties to a specific replica. 
Optionally adding sliceUnique=true will remove this property from all other replicas within a particular slice. (Erick Erickson) * SOLR-6513: Add a collectionsAPI call BALANCESLICEUNIQUE. Allows the even distribution of custom replica properties across nodes making up a collection, at most one node per slice will have the property. * SOLR-6605: Make ShardHandlerFactory maxConnections configurable. (Christine Poerschke via shalin) * SOLR-6585: RequestHandlers can optionally handle sub paths as well (Noble Paul) * SOLR-6617: /update/json/docs path will use fully qualified node names by default (Noble Paul) * SOLR-4715: Add CloudSolrClient constructors which accept a HttpClient instance. (Hardik Upadhyay, Shawn Heisey, shalin) * SOLR-5992: add "removeregex" as an atomic update operation (Vitaliy Zhovtyuk via Erick Erickson) * SOLR-6633: /update/json/docs path can now save the underlying json doc as a string field and better support added to the default example (Noble Paul) * SOLR-6650: Add optional slow request logging at WARN level (Jessica Cheng Mallet via Timothy Potter) * SOLR-6655: SimplePostTool now features -Dhost, -Dport, and -Dc (for core/collection) properties to allow easier overriding of just the right piece of the Solr URL. (ehatcher) * SOLR-6248: MoreLikeThis QParser that accepts a document id and returns documents that have similar content. It works in standalone/cloud mode and shares logic with the Lucene MoreLikeThis class (Anshum Gupta). * SOLR-6670: change BALANCESLICEUNIQUE to BALANCESHARDUNIQUE. Also, the parameter for ADDREPLICAPROP that used to be sliceUnique is now shardUnique. (Erick Erickson) * SOLR-6351: Stats can now be nested under pivot values by adding a 'stats' local param to facet.pivot which refers to a 'tag' local param in one or more stats.field params.
(hossman, Vitaliy Zhovtyuk, Steve Molloy) * SOLR-6533: Support editing common solrconfig.xml values (Noble Paul) * SOLR-6607: Managing requesthandlers through API (Noble Paul) * SOLR-4799: faster join using join="zipper" aka merge join for nested DIH EntityProcessors (Mikhail Khludnev via Noble Paul) * SOLR-6787: API to manage blobs in Solr (Noble Paul) * SOLR-6801: Load RequestHandler from blob store (Noble Paul) * SOLR-1632: Support Distributed IDF (Andrzej Bialecki, Mark Miller, Yonik Seeley, Robert Muir, Markus Jelsma, Vitaliy Zhovtyuk, Anshum Gupta) * SOLR-6729: createNodeSet.shuffle=(true|false) support for /admin/collections?action=CREATE. (Christine Poerschke, Ramkumar Aiyengar via Mark Miller) * SOLR-6851: Scripts to support installing and running Solr as a service on Linux (Timothy Potter, Hossman, Steve Rowe) * SOLR-6770: Add/edit param sets and use them in Requests (Noble Paul) * SOLR-6879: Have an option to disable autoAddReplicas temporarily for all collections. (Varun Thacker via Steve Rowe) * SOLR-6435: Add bin/post script to simplify posting content to Solr (Erik Hatcher) * SOLR-6761: Ability to ignore commit and/or optimize requests from clients when running in SolrCloud mode using the IgnoreCommitOptimizeUpdateProcessorFactory. (Timothy Potter) * SOLR-6797: Spatial fields that used to require units=degrees like SpatialRecursivePrefixTreeFieldType (RPT) now take distanceUnits=degrees|kilometers|miles instead. It is applied to nearly all distance measurements involving the field: maxDistErr, distErr, d, geodist, score=distance|area|area2d. score now accepts these units as well. It does NOT affect distances embedded in WKT strings like BUFFER(POINT(200 10),0.2)). (Ishan Chattopadhyaya, David Smiley) * SOLR-6766: Expose HdfsDirectoryFactory Block Cache statistics via JMX. (Mike Drob, Mark Miller) * SOLR-2035: Add a VelocityResponseWriter $resource tool for locale-specific string lookups. 
(Erik Hatcher) * SOLR-6916: Toggle payload support for the default highlighter via hl.payloads. It's auto enabled when the index has payloads. (David Smiley) * SOLR-6581: Efficient DocValues support and numeric collapse field implementations for Collapse and Expand (Joel Bernstein) * SOLR-6937: In schemaless mode, replace spaces and special characters with underscore (Noble Paul) * SOLR-5147: Support child documents in DIH (Vadim Kirilchuk, Shawn Heisey, Thomas Champagne, Mikhail Khludnev via Noble Paul) Bug Fixes ---------------------- * SOLR-4895: An error should be returned when a rollback is attempted in SolrCloud mode. (Vamsee Yarlagadda via Mark Miller) * SOLR-6424: The hdfs block cache BLOCKCACHE_WRITE_ENABLED is not defaulting to false like it should. (Mark Miller) * SOLR-6426: SolrZkClient clean can fail due to a race with children nodes. (Mark Miller) * SOLR-5966: Admin UI Menu is fixed and doesn't respect smaller viewports. (Aman Tandon, steffkes via shalin) * SOLR-4406: Fix RawResponseWriter to respect 'base' writer (Steve Davids, hossman) * SOLR-6297: Fix WordBreakSolrSpellChecker to not lose suggestions in shard/cloud environments (James Dyer) * SOLR-6467: bin/solr script should direct stdout/stderr when starting in the background to the solr-PORT-console.log in the logs directory instead of bin. (Timothy Potter) * SOLR-6187: SOLR-6154: facet.mincount ignored in range faceting using distributed search NOTE: This is NOT fixed for the (deprecated) facet.date idiom, use facet.range instead.
(Erick Erickson, Zaccheo Bagnati, Ronald Matamoros, Vamsee Yarlagadda) * SOLR-6457: LBHttpSolrClient: ArrayIndexOutOfBoundsException risk if counter overflows (longkey via Noble Paul) * SOLR-6499: Log warning about multiple update request handlers (Noble Paul, Andreas Hubold, hossman) * SOLR-6507: Fixed several bugs involving stats.field used with local params (hossman) * SOLR-6481: CLUSTERSTATUS should check if the node hosting a replica is live when reporting replica status (Timothy Potter) * SOLR-6484: SolrCLI's healthcheck action needs to check live nodes as part of reporting the status of a replica (Timothy Potter) * SOLR-6540: Fix NPE from strdist() func when doc value source does not exist in a doc (hossman) * SOLR-6624: Spelling mistakes in the Java source (Hrishikesh Gadre) * SOLR-6307: Atomic update remove does not work for int array or date array (Anurag Sharma, noble) * SOLR-6224: Post soft-commit callbacks are called before soft commit actually happens. (shalin) * SOLR-6591: Overseer can use stale cluster state and lose updates for collections with stateFormat > 1. (shalin) * SOLR-6631: DistributedQueue spinning on calling zookeeper getChildren() (Jessica Cheng Mallet, Mark Miller, Timothy Potter) * SOLR-6579: SnapPuller Replication blocks clean shutdown of tomcat (Philip Black-Knight via Noble Paul) * SOLR-6721: ZkController.ensureReplicaInLeaderInitiatedRecovery puts replica in local map before writing to ZK. (shalin) * SOLR-6679: Disabled suggester component from techproduct solrconfig.xml since it caused long startup times on large indexes even when it wasn't used. (yonik, hossman) * SOLR-6738: Admin UI - Escape Data on Plugins-View (steffkes) * SOLR-3774: Solr adds RequestHandler SolrInfoMBeans twice to the JMX server.
  (Tomás Fernández Löbbe, hossman, Mark Miller)

* SOLR-6763: Shard leader elections should not persist across session expiry (Alan Woodward, Mark Miller)

* SOLR-3881: Avoid OOMs in LanguageIdentifierUpdateProcessor:
  - Added langid.maxFieldValueChars and langid.maxTotalChars params to limit input, by default 10k and 20k chars, respectively.
  - Moved input concatenation to Tika implementation; the langdetect implementation instead appends each input piece via the langdetect API.
  (Vitaliy Zhovtyuk, Tomás Fernández Löbbe, Rob Tulloh, Steve Rowe)

* SOLR-6626: NPE in FieldMutatingUpdateProcessor when indexing a doc with null field value (Noble Paul)

* SOLR-6604: SOLR-6812: Fix NPE with distrib.singlePass=true and expand component. Increased test coverage of expand component with docValues. (Christine Poerschke, Per Steffensen, shalin)

* SOLR-6718: Core discovery was walking paths relative to the Jetty working directory, rather than SOLR_HOME. (Andreas Hubold, Alan Woodward)

* SOLR-6864: Support registering searcher listeners in SolrCoreAware.inform(SolrCore) method. Existing components rely on this. (Tomás Fernández Löbbe)

* SOLR-6850: AutoAddReplicas makes a call to wait to see live replicas that times out after 30 milliseconds instead of 30 seconds. (Varun Thacker via Mark Miller)

* SOLR-6397: zkcli script put/putfile should allow overwriting an existing znode's data (Timothy Potter)

* SOLR-6873: Lib relative path is incorrect for techproduct configset (Alexandre Rafalovitch via Erick Erickson)

* SOLR-6899: Change public setter for CollectionAdminRequest.action to protected. (Anshum Gupta)

* SOLR-6779: fix /browse for schemaless example (ehatcher)

* SOLR-6874: There is a race around SocketProxy binding to its port the way we set up JettySolrRunner and SocketProxy. (Mark Miller, Timothy Potter)

* SOLR-6735: Make CloneFieldUpdateProcessorFactory null safe (Steve Davids via ehatcher)

* SOLR-6907: URLEncode documents directory in MorphlineMapperTest to handle spaces etc.
  in file name. (Ramkumar Aiyengar via Erick Erickson)

* SOLR-6880: Harden ZkStateReader to expect that getCollectionLive may return null as its contract states. (Mark Miller, shalin)

* SOLR-6643: Fix error reporting & logging of low level JVM Errors that occur when loading/reloading a SolrCore (hossman)

* SOLR-6839: Direct routing with CloudSolrServer will ignore the Overwrite document option. (Mark Miller)

* SOLR-6793: ReplicationHandler does not destroy all of its created SnapPullers. (Mark Miller)

* SOLR-6946: Document -p port option for the create_core and create_collection actions in bin/solr (Timothy Potter)

* SOLR-6923: AutoAddReplicas also consults live_nodes to see if a state change has happened. (Varun Thacker via Anshum Gupta)

* SOLR-6941: DistributedQueue#containsTaskWithRequestId can fail with NPE. (Mark Miller)

* SOLR-6764: Field types need to be re-informed after reloading a managed schema from ZK (Timothy Potter)

* SOLR-6931: We should do a limited retry when using HttpClient. (Mark Miller, Hrishikesh Gadre, Gregory Chanan)

* SOLR-7004: Add a missing constructor for CollectionAdminRequest.BalanceShardUnique that sets the collection action. (Anshum Gupta)

* SOLR-6993: install_solr_service.sh won't install on RHEL / CentOS (David Anderson via Timothy Potter)

* SOLR-6928: solr.cmd stop works only in English (john.work, Jan Høydahl, Timothy Potter)

* SOLR-7011: Delete collection returns before collection is actually removed. (Christine Poerschke via shalin)

* SOLR-6640: Close searchers before rollback and recovery to avoid index corruption. (Robert Muir, Varun Thacker, shalin)

* SOLR-6847: LeaderInitiatedRecoveryThread compares wrong replica's state with lirState. (shalin)

* SOLR-6856: Restore ExtractingRequestHandler's ability to capture all HTML tags when parsing (X)HTML.
  (hossman, Uwe Schindler, ehatcher, Steve Rowe)

* SOLR-7024: Improved error messages when java is not found by the bin/solr shell script, particularly when JAVA_HOME has an invalid location. (Shawn Heisey)

* SOLR-7038: Validate the presence of configset before trying to create a collection. (Anshum Gupta, Mark Miller)

* SOLR-7037: bin/solr start -e techproducts -c fails to start Solr in cloud mode (Timothy Potter)

* SOLR-7016: Fix bin\solr.cmd to work in a directory with spaces in the name. (Timothy Potter, Uwe Schindler)

* SOLR-6969: When opening an HDFSTransactionLog for append we must first attempt to recover its lease to prevent data loss. (Mark Miller, Praneeth Varma, Colin McCabe)

* SOLR-7067: bin/solr won't run under bash 4.2+. (Steve Rowe)

* SOLR-7068: Collapse on numeric field breaks when min/max values are negative. (Joel Bernstein)

* SOLR-6780: Fixed a bug in how default/appends/invariants params were affecting the set of all "keys" found in the request parameters, resulting in some key=value param pairs being duplicated. This was noticeably affecting some areas of the code where iteration was done over the set of all params:
  - literal.* in ExtractingRequestHandler
  - facet.* in FacetComponent
  - spellcheck.[dictionary name].* and spellcheck.collateParam.* in SpellCheckComponent
  - olap.* in AnalyticsComponent
  (Alexandre Rafalovitch & hossman)

* SOLR-6920: A replicated index can end up corrupted when small files end up with the same file name and size. (Varun Thacker, Mark Miller)

* SOLR-7033, SOLR-5961: RecoveryStrategy should not publish any state when closed / cancelled and there should always be a pause between recoveries even when recoveries are rapidly stopped and started as well as when a node attempts to become the leader for a shard. (Mark Miller, Maxim Novikov)

* SOLR-6693: bin\solr.cmd doesn't support 32-bit JRE/JDK running on Windows due to parentheses in JAVA_HOME.
  (Timothy Potter, Christopher Hewitt, Jan Høydahl)

Optimizations
----------------------

* SOLR-6603: LBHttpSolrClient - lazily allocate skipped-zombie-servers list. (Christine Poerschke via shalin)

* SOLR-6554: Speed up overseer operations avoiding cluster state reads from zookeeper at the start of each loop and instead relying on local state and compare-and-set writes. This change also adds batching for consecutive messages belonging to the same collection with stateFormat=2. (shalin)

* SOLR-6680: DefaultSolrHighlighter can sometimes avoid CachingTokenFilter with hl.usePhraseHighlighter, and can be more efficient handling data from term vectors. (David Smiley)

* SOLR-6666: Dynamic copy fields are considering all dynamic fields, causing a significant performance impact on indexing documents. (Liram Vardi via Erick Erickson, Steve Rowe)

Other Changes
----------------------

* SOLR-4622: Hardcoded SolrCloud defaults for hostContext and hostPort that were deprecated in 4.3 have been removed completely. (hossman)

* SOLR-5936: Removed deprecated non-Trie-based numeric & date field types. (Steve Rowe)

* SOLR-6169: Finish removal of CoreAdminHandler handleAlias action begun in 4.9 (Alan Woodward)

* SOLR-6215: TrieDateField should directly extend TrieField instead of forwarding to a wrapped TrieField. (Steve Rowe)

* SOLR-3029: Changes to spellcheck response format (Nalini Kartha via James Dyer)

* SOLR-3957: Removed RequestHandlerUtils#addExperimentalFormatWarning(), which removes "experimental" warning from two places: replication handler details command and DataImportHandler responses. (ehatcher)

* SOLR-6073: Remove helper methods from CollectionsRequest (SolrJ) for CollectionsAPI calls and move to a builder design for the same. (Varun Thacker, Anshum Gupta)

* SOLR-6519: Make DirectoryFactory#create() take LockFactory. (Uwe Schindler)

* SOLR-6400: SolrCloud tests are not properly testing session expiration. (Mark Miller)

* LUCENE-5650: Tests can no longer write to CWD.
  Update log dir is now made relative to the instance dir if it is not an absolute path. (Ryan Ernst, Dawid Weiss)

* SOLR-6390: Remove unnecessary checked exception for CloudSolrClient constructors, improve javadocs for CloudSolrClient constructors. (Steve Davids via Shawn Heisey)

* LUCENE-5901: Replaced all occurrences of LUCENE_CURRENT with LATEST for luceneMatchVersion. (Ryan Ernst)

* SOLR-6445: Upgrade Noggit to version 0.6 to support more flexible JSON input (Noble Paul, Yonik Seeley)

* SOLR-5322: core discovery can fail w/NPE and no explanation if a non-readable directory exists (Said Chavkin, Erick Erickson)

* SOLR-6488, SOLR-6991: Update to Apache Tika 1.7. This adds support for parsing Outlook PST and Matlab MAT files. Parsing for NetCDF files was removed because of license issues; if you need support for this format, download the parser JAR yourself and add it to contrib/extraction/lib folder: http://www.unidata.ucar.edu/software/thredds/current/netcdf-java/ (Uwe Schindler)

* SOLR-6115: Cleanup enum/string action types in Overseer, OverseerCollectionProcessor and CollectionHandler. (Erick Erickson, shalin)

* SOLR-6453: Stop throwing an error message from Overseer when node exits (Ramkumar Aiyengar, Noble Paul)

* SOLR-6550: Provide simple mechanism for passing additional metadata / context about a server-side SolrException back to the client-side (Timothy Potter)

* SOLR-6249: Schema API changes return success before all cores are updated; client application can provide the optional updateTimeoutSecs parameter to cause the server handling the managed schema update to block until all replicas of the same collection have processed the update or until the specified timeout is reached (Timothy Potter)

* SOLR-6597: SolrIndexConfig parameter in one of the SolrIndexSearcher constructors has been removed.
  It was just passed and never used via that constructor. (Anshum Gupta)

* SOLR-5852: Add CloudSolrClient helper method to connect to a ZK ensemble. (Varun Thacker, Furkan KAMACI, Shawn Heisey, Mark Miller, Erick Erickson via shalin)

* SOLR-6592: Avoid waiting for the leader to see the down state if that leader is not live. (Timothy Potter)

* SOLR-6641: SystemInfoHandler should include the zkHost the node is using (when running in solrcloud mode) (Timothy Potter)

* SOLR-6295: Fix child filter query creation to never match parent docs in SolrExampleTests. (Varun Thacker, Mikhail Khludnev via shalin)

* SOLR-6578: Update commons-io dependency to the latest 2.4 version (Steve Rowe, Shawn Heisey)

* SOLR-6651: Fix wrong timeout logged in waitForReplicasToComeUp. (shalin)

* SOLR-6698: Solr is not consistent wrt ZkCredentialsProvider / ZkCredentialProvider. References to zkCredentialProvider in System properties or configurations should be changed to zkCredentialsProvider. (Gregory Chanan)

* SOLR-6715: ZkSolrResourceLoader constructors accept a parameter called 'collection' but it should be 'configName'. (shalin)

* SOLR-6697: bin/solr start scripts allow setting SOLR_OPTS in solr.in.* (janhoy)

* SOLR-6739: Admin UI - Sort list of command line args (steffkes)

* SOLR-6740: Admin UI - improve Files View (steffkes)

* SOLR-6570: Run SolrZkClient session watch asynchronously. (Ramkumar Aiyengar via Mark Miller)

* SOLR-6747: Add an optional caching option as a workaround for SOLR-6586. (Mark Miller, Gregory Chanan)

* SOLR-6459: Normalize logging of operations in Overseer and log current queue size. (Ramkumar Aiyengar, shalin via Mark Miller)

* SOLR-6754: ZkController.publish doesn't use the updateLastState parameter.
  (shalin)

* SOLR-6751: Exceptions thrown in the analysis chain in DirectUpdateHandler2 should return a BAD_REQUEST status (Alan Woodward)

* SOLR-6792: deprecate AdminHandlers, Clean up solrconfig.xml of unnecessary plugin definitions, implicit registration of /replication, /get and /admin/* handlers (Noble Paul)

* SOLR-5864: Remove previous SolrCore as parameter on reload. (Tomás Fernández Löbbe)

* SOLR-4792: Stop shipping a .war. (Robert Muir, Ramkumar Aiyengar, Mark Miller)

* SOLR-6799: Update Saxon-HE to 9.6.0-2. (Mark Miller)

* SOLR-6454: Suppress EOFExceptions in SolrDispatchFilter. (Ramkumar Aiyengar via Mark Miller)

* SOLR-6370: Allow tests to report/fail on many ZK watches being parallelly requested on the same data (Ramkumar Aiyengar via Timothy Potter)

* SOLR-6752: Buffer Cache allocate/lost metrics should be exposed. (Mike Drob via Mark Miller)

* SOLR-6560: Purge termIndexInterval from example/test configs (Tom Burton-West, hossman)

* SOLR-6773: Remove the multicore example as the DIH and cloud examples illustrate multicore behavior (hossman, Timothy Potter)

* SOLR-6834: Warn if checkIntegrityAtMerge is configured. This option is no longer meaningful since the checks are done automatically at a very low level in the segment merging. This warning will become an error in Solr 6.0. (hossman)

* SOLR-6833: Examples started with bin/solr -e should use a solr.solr.home directory under the example directory instead of server/solr. (Alexandre Rafalovitch, Anshum Gupta, hossman, Timothy Potter)

* SOLR-6826: fieldType capitalization is not consistent with the rest of case-sensitive field names. (Alexandre Rafalovitch via Erick Erickson)

* SOLR-6849: HttpSolrClient.RemoteSolrException reports the URL of the remote host where the exception occurred. (Alan Woodward)

* SOLR-6852: SimplePostTool no longer defaults to collection1 making core/collection/update URL mandatory.
  (Anshum Gupta)

* SOLR-6861: post.sh from exampledocs directory has been removed as there no longer is a default update URL. (Anshum Gupta)

* SOLR-5922: Add support for adding core properties to SolrJ Collection Admin Request calls. (Varun Thacker via Anshum Gupta)

* SOLR-6523: Provide SolrJ support for specifying stateFormat while creating Collections. (Anshum Gupta)

* SOLR-6881: Add split.key support for SPLITSHARD via SolrJ (Anshum Gupta)

* SOLR-6883: CLUSTERPROP API switch case does not call break. (Varun Thacker via shalin)

* SOLR-6882: Misspelled collection API actions in ReplicaMutator exception messages. (Steve Rowe via shalin)

* SOLR-6867: SolrCLI should check for existence before creating a new core/collection, more user-friendly error reporting (no stack trace), and the ability to pass a directory when using bin/solr to create a core or collection (Timothy Potter)

* SOLR-6885: Add core name to RecoveryThread name. (Christine Poerschke via shalin)

* SOLR-6855: bin/solr -e dih launches, but has some path cruft issues that prevent some of the imports from working (Hossman, Timothy Potter)

* SOLR-3711: Truncate long strings in /browse field facets (ehatcher)

* SOLR-6876: Remove unused legacy scripts.conf (Alexandre Rafalovitch via Erick Erickson)

* SOLR-6896: Speed up tests by dropping SolrJettyRunner thread max idle time (Alan Woodward)

* SOLR-6448: Add SolrJ support for all current Collection API calls. (Anshum Gupta)

* Fixed a typo in various solrconfig.xml files. (sdumitriu - pull request #120)

* SOLR-6895: SolrServer classes are renamed to *SolrClient. The existing classes still exist, but are deprecated. (Alan Woodward, Erik Hatcher)

* SOLR-6483: Refactor some methods in MiniSolrCloudCluster tests (Steve Davids via Erick Erickson)

* SOLR-6906: Fix typo bug in DistributedDebugComponentTest.testCompareWithNonDistributedRequest (Ramkumar Aiyengar via Erick Erickson)

* SOLR-6905: Test pseudo-field retrieval in distributed search.
  (Ramkumar Aiyengar via shalin)

* SOLR-6897: Nuke non-NRT mode from code and configuration. (Hossman, shalin)

* SOLR-6830: Update Woodstox to 4.4.1 and StAX to 3.1.4. (ab)

* SOLR-6918: No need to log exceptions (as warn) generated when creating MBean stats if the core is shutting down (Timothy Potter)

* SOLR-6932: All HttpClient ConnectionManagers and SolrJ clients should always be shutdown in tests and regular code. (Mark Miller)

* SOLR-1723: VelocityResponseWriter improvements (Erik Hatcher)

* SOLR-6324: Set finite default timeouts for select and update. (Ramkumar Aiyengar via Mark Miller)

* SOLR-6952: bin/solr create action should copy configset directory instead of reusing an existing configset in ZooKeeper by default (Timothy Potter)

* SOLR-6933: bin/solr should provide a single "create" action that creates a core or collection depending on whether Solr is running in standalone or cloud mode (Timothy Potter)

* SOLR-6496: LBHttpSolrClient stops server retries after the timeAllowed threshold is met. (Steve Davids, Anshum Gupta)

* SOLR-6904: Removed deprecated Circle & rect syntax. See upgrading notes. (David Smiley)

* SOLR-6943: HdfsDirectoryFactory should fall back to system props for most of its config if it is not found in solrconfig.xml.
  (Mark Miller, Mike Drob)

* SOLR-6926: "ant example" makes no sense anymore - should be "ant server" (Ramkumar Aiyengar, Timothy Potter)

* SOLR-6982: bin/solr and SolrCLI should support SSL-related Java System Properties (Timothy Potter)

* SOLR-6981: Add a delete action to the bin/solr script to allow deleting of cores / collections (with delete collection config directory from ZK) (Timothy Potter)

* SOLR-6840: Remove support for old-style solr.xml (Erick Erickson, Alan Woodward)

* SOLR-6976: Remove classes and methods deprecated in 4.x (Alan Woodward, Noble Paul, Chris Hostetter)

* SOLR-6521: CloudSolrClient should synchronize cache cluster state loading (Noble Paul, Jessica Cheng Mallet)

* SOLR-7018: bin/solr stop should stop if there is only one node running or generate an error message prompting the user to be explicit about which of multiple nodes to stop using the -p or -all options (Timothy Potter)

* SOLR-5918: ant clean does not remove ZooKeeper data (Varun Thacker, Steve Rowe)

* SOLR-7020: 'bin/solr start' should automatically use an SSL-enabled alternate jetty configuration file when in SSL mode, eliminating the need for manual jetty.xml edits. (Steve Rowe)

* SOLR-6227: Avoid spurious failures of ChaosMonkeySafeLeaderTest by ensuring there's at least one jetty to kill. (shalin)

================== 4.10.4 ==================

Bug Fixes
----------------------

* SOLR-6931: We should do a limited retry when using HttpClient. (Mark Miller, Hrishikesh Gadre, Gregory Chanan)

* SOLR-6780: Fixed a bug in how default/appends/invariants params were affecting the set of all "keys" found in the request parameters, resulting in some key=value param pairs being duplicated.
  This was noticeably affecting some areas of the code where iteration was done over the set of all params:
  - literal.* in ExtractingRequestHandler
  - facet.* in FacetComponent
  - spellcheck.[dictionary name].* and spellcheck.collateParam.* in SpellCheckComponent
  - olap.* in AnalyticsComponent
  (Alexandre Rafalovitch & hossman)

* SOLR-6426: SolrZkClient clean can fail due to a race with children nodes. (Mark Miller)

* SOLR-6457: LBHttpSolrClient: ArrayIndexOutOfBoundsException risk if counter overflows (longkey via Noble Paul)

* SOLR-6481: CLUSTERSTATUS should check if the node hosting a replica is live when reporting replica status (Timothy Potter)

* SOLR-6631: DistributedQueue spinning on calling zookeeper getChildren() (Jessica Cheng Mallet, Mark Miller, Timothy Potter)

* SOLR-6579: SnapPuller Replication blocks clean shutdown of tomcat (Philip Black-Knight via Noble Paul)

* SOLR-6763: Shard leader elections should not persist across session expiry (Alan Woodward, Mark Miller)

* SOLR-3881: Avoid OOMs in LanguageIdentifierUpdateProcessor:
  - Added langid.maxFieldValueChars and langid.maxTotalChars params to limit input, by default 10k and 20k chars, respectively.
  - Moved input concatenation to Tika implementation; the langdetect implementation instead appends each input piece via the langdetect API.
  (Vitaliy Zhovtyuk, Tomás Fernández Löbbe, Rob Tulloh, Steve Rowe)

* SOLR-6850: AutoAddReplicas makes a call to wait to see live replicas that times out after 30 milliseconds instead of 30 seconds. (Varun Thacker via Mark Miller)

* SOLR-6839: Direct routing with CloudSolrServer will ignore the Overwrite document option. (Mark Miller)

* SOLR-7139: Fix SolrContentHandler for TIKA to ignore multiple startDocument events. (Chris A. Mattmann, Uwe Schindler)

* SOLR-6941: DistributedQueue#containsTaskWithRequestId can fail with NPE. (Mark Miller)

* SOLR-7011: Delete collection returns before collection is actually removed.
  (Christine Poerschke via shalin)

* SOLR-6856: Restore ExtractingRequestHandler's ability to capture all HTML tags when parsing (X)HTML. (hossman, Uwe Schindler, ehatcher, Steve Rowe)

* SOLR-6928: solr.cmd stop works only in English (john.work, Jan Høydahl, Timothy Potter)

* SOLR-7038: Validate the presence of configset before trying to create a collection. (Anshum Gupta, Mark Miller)

* SOLR-7016: Fix bin\solr.cmd to work in a directory with spaces in the name. (Timothy Potter, Uwe Schindler)

* SOLR-6693: bin\solr.cmd doesn't support 32-bit JRE/JDK running on Windows due to parentheses in JAVA_HOME. (Timothy Potter, Christopher Hewitt, Jan Høydahl)

* SOLR-7067: bin/solr won't run under bash 4.2+. (Steve Rowe)

* SOLR-7033, SOLR-5961: RecoveryStrategy should not publish any state when closed / cancelled and there should always be a pause between recoveries even when recoveries are rapidly stopped and started as well as when a node attempts to become the leader for a shard. (Mark Miller, Maxim Novikov)

* SOLR-6847: LeaderInitiatedRecoveryThread compares wrong replica's state with lirState. (shalin)

* SOLR-7128: Two phase distributed search is fetching extra fields in GET_TOP_IDS phase. (Pablo Queixalos, shalin)

Other Changes
----------------------

* SOLR-7147: Introduce new TrackingShardHandlerFactory for monitoring what requests are sent to shards during tests. (hossman, shalin)

================== 4.10.3 ==================

Bug Fixes
----------------------

* SOLR-6696: bin/solr start script should not enable autoSoftCommit by default (janhoy)

* SOLR-6704: TrieDateField type drops schema properties in branch 4.10 (Tomás Fernández Löbbe)

* SOLR-6085: Suggester crashes when prefixToken is longer than surface form (janhoy)

* SOLR-6323: ReRankingQParserPlugin cleaner paging and fix bug with fuzzy, range and other queries that need to be re-written. (Adair Kovac, Joel Bernstein)

* SOLR-6684: Fix-up /export JSON.
  (Joel Bernstein)

* SOLR-6781: BBoxField didn't support dynamic fields. (David Smiley)

* SOLR-6784: BBoxField's 'score' mode should have been optional. (David Smiley)

* SOLR-6510: The collapse QParser would throw an NPE when used on a DocValues field on an empty segment/index. (Christine Poerschke, David Smiley)

* SOLR-2927: Solr does not unregister all mbeans upon exception in constructor causing memory leaks. (tom liu, Sharath Babu, Cyrille Roy, shalin)

* SOLR-6685: ConcurrentModificationException in Overseer Status API. (shalin)

* SOLR-6706: /update/json/docs throws RuntimeException if a nested structure contains a non-leaf float field (Noble Paul, shalin)

* SOLR-6610: Slow startup of new clusters because ZkController.publishAndWaitForDownStates always times out. (Jessica Cheng Mallet, shalin, Noble Paul)

* SOLR-6662: better validation when parsing command-line options that expect a value (Timothy Potter)

* SOLR-6732: Fix handling of leader-initiated recovery state, which was a String in older versions and is now a JSON map; this caused backwards compatibility issues when doing rolling upgrades of a live cluster while indexing (Timothy Potter)

* SOLR-6705: Better strategy for dealing with JVM specific options in the start scripts; remove -XX:+AggressiveOpts and only set -XX:-UseSuperWord for Java 1.7u40 to u51. (Uwe Schindler, janhoy, hossman, Timothy Potter)

* SOLR-6726: better strategy for selecting the JMX RMI port based on SOLR_PORT in bin/solr script (Timothy Potter)

* SOLR-6795: distrib.singlePass returns score even though not asked for. (Per Steffensen via shalin)

* SOLR-6796: distrib.singlePass does not return correct set of fields for multi-fl-parameter requests. (Per Steffensen via shalin)

* SOLR-6776: Transaction log was not flushed at the end of update requests with softCommit specified, which could lead to data loss if the server were killed immediately after the update finished.
  (Jeffery Yuan via yonik)

Other Changes
----------------------

* SOLR-6661: Adjust all example configurations to allow overriding error-prone relative paths for solrconfig.xml references with solr.install.dir system property; bin/solr scripts will set it appropriately. (ehatcher)

* SOLR-6694: Auto-detect JAVA_HOME using the Windows registry if it is not set (janhoy, Timothy Potter)

* SOLR-6653: bin/solr script should return error code >0 when something fails (janhoy, Timothy Potter)

* SOLR-6829: Added getter/setter for lastException in DIH's ContextImpl (ehatcher)

================== 4.10.2 ==================

Bug Fixes
----------------------

* SOLR-6509: Solr start scripts interactive mode doesn't honor -z argument (Timothy Potter)

* SOLR-6511: Fencepost error in LeaderInitiatedRecoveryThread (Timothy Potter)

* SOLR-6530: Commits under network partitions can put any node in down state. (Ramkumar Aiyengar, Alan Woodward, Mark Miller, shalin)

* SOLR-6573: QueryElevationComponent now works with localParams in the query (janhoy)

* SOLR-6524: Collections left in recovery state after node restart because recovery sleep time increases exponentially between retries. (Mark Miller, shalin)

* SOLR-6587: Misleading exception when creating collections in SolrCloud with bad configuration. (Tomás Fernández Löbbe)

* SOLR-6452: StatsComponent's stat 'missing' will work on fields with docValues=true and indexed=false (Xu Zhang via Tomás Fernández Löbbe)

* SOLR-6646: bin/solr start script fails to detect solr on non-default port and then after 30s tails wrong log file (janhoy)

* SOLR-6647: Bad error message when missing resource from ZK when parsing Schema (janhoy)

* SOLR-6545: Query field list with wild card on dynamic field fails.
  (Burke Webster, Xu Zhang, shalin)

Other Changes
----------------------

* SOLR-6550: Provide simple mechanism for passing additional metadata / context about a server-side SolrException back to the client-side (Timothy Potter)

* SOLR-6486: solr start script can have a debug flag option; use -a to set arbitrary options (Noble Paul, Timothy Potter)

* SOLR-6549: bin/solr script should support a -s option to set the -Dsolr.solr.home property. (Timothy Potter)

* SOLR-6529: Stop command in the start scripts should only stop the instance that it had started. (Varun Thacker, Timothy Potter)

================== 4.10.1 ==================

Bug Fixes
----------------------

* SOLR-6425: If using the new global hdfs block cache option, you can end up reading corrupt files on file name reuse. (Mark Miller, Gregory Chanan)

* SOLR-5814: CoreContainer reports incorrect & misleading path for solrconfig.xml when there are loading problems (Pradeep via hossman)

* SOLR-6024: Fix StatsComponent when using docValues="true" multiValued="true" (Vitaliy Zhovtyuk & Tomas Fernandez-Lobbe via hossman)

* SOLR-6493: Fix fq exclusion via "ex" local param in multivalued stats.field (hossman)

* SOLR-6447: bin/solr script needs to pass -DnumShards=1 for bootstrapping collection1 when starting Solr in cloud mode. (Timothy Potter)

* SOLR-6501: Binary Response Writer does not return wildcard fields. (Mike Hugo, Constantin Mitocaru, sarowe, shalin)

Other Changes
---------------------

* SOLR-6503: Removed support for parsing netcdf files in Solr Cell because of license issues.
  If you need support for this format, download the parser JAR yourself (version 4.2) and add it to contrib/extraction/lib folder: http://www.unidata.ucar.edu/software/thredds/current/netcdf-java/ (Uwe Schindler)

================== 4.10.0 =================

Consult the LUCENE_CHANGES.txt file for additional, low level, changes in this release

Versions of Major Components
---------------------
Apache Tika 1.5 (with upgraded Apache POI 3.10.1)
Carrot2 3.9.0
Velocity 1.7 and Velocity Tools 2.0
Apache UIMA 2.3.1
Apache ZooKeeper 3.4.6

Upgrading from Solr 4.9
----------------------

* In Solr 3.6, all primitive field types were changed to omit norms by default when the schema version is 1.5 or greater (SOLR-3140), but TrieDateField's default was mistakenly not changed. As of Solr 4.10, TrieDateField omits norms by default (see SOLR-6211).

* Creating a SolrCore via CoreContainer.create() no longer requires an additional call to CoreContainer.register() to make it available to clients (see SOLR-6170).

* CoreContainer.remove() has been removed. You should now use CoreContainer.unload() to delete a SolrCore (see SOLR-6232).

* solr.xml parsing has been improved to better account for the expected data types of various options. As part of this fix, additional error checking has also been added to provide errors in the event of duplicated options, or unknown option names that may indicate a typo. Users who have modified their solr.xml in the past and now upgrade may get errors on startup if they have typos or unexpected options specified in their solr.xml file. (See SOLR-5746 for more information.)

Detailed Change List
----------------------

New Features
----------------------

* SOLR-6196: The overseerstatus collection API instruments amILeader and ZK state update calls. (shalin)

* SOLR-6069: The 'clusterstatus' API should return 'roles' information. (shalin)

* SOLR-6044: The 'clusterstatus' API should return live_nodes as well.
  (shalin)

* SOLR-5768: Add a distrib.singlePass parameter to make EXECUTE_QUERY phase fetch all fields and skip GET_FIELDS. (Gregg Donovan, shalin)

* SOLR-6183: New spatial BBoxField for indexing rectangles with search support for most predicates. It includes extra score relevancy modes in addition to distance: score=overlapRatio|area|area2D. (David Smiley, Ryan McKinley)

* SOLR-6232: You can now unload/delete cores that have failed to initialize (Alan Woodward)

* SOLR-2245: Improvements to the MailEntityProcessor:
  - Support for server-side date filtering if using GMail; requires new dependency on the Sun Gmail Java mail extensions
  - Support for using the last_index_time from the previous run as the value for the fetchMailsSince filter.
  (Peter Sturge, Timothy Potter)

* SOLR-6258: Added onRollback event handler hook to Data Import Handler (DIH). (ehatcher)

* SOLR-6263: Add DIH handler name to variable resolver as ${dih.handlerName}. (ehatcher)

* SOLR-6216: Better faceting for multiple intervals on DV fields (Tomas Fernandez-Lobbe via Erick Erickson)

* SOLR-6267: Let user override Interval Faceting key with LocalParams (Tomas Fernandez-Lobbe via Erick Erickson)

* SOLR-6020: Auto-generate a unique key in schema-less example if data does not have an id field. The UUIDUpdateProcessor was improved to not require a field name in configuration and generate a UUID into the unique Key field. (Vitaliy Zhovtyuk, hossman, Steve Rowe, Erik Hatcher, shalin)

* SOLR-6294: SOLR-6437: Remove the restriction of adding json by only wrapping it in an array in a new path /update/json/docs (Noble Paul, hossman, Yonik Seeley, Steve Rowe)

* SOLR-6302: UpdateRequestHandlers are registered implicitly /update, /update/json, /update/csv, /update/json/docs (Noble Paul)

* SOLR-6318: New "terms" QParser for efficiently filtering documents by a list of values. For many values, it's more appropriate than a boolean query. (David Smiley)

* SOLR-6283: Add support for Interval Faceting in SolrJ.
  (Tomás Fernández Löbbe)

* SOLR-6304: JsonLoader should be able to flatten an input JSON to multiple docs (Noble Paul)

* SOLR-2894: Distributed query support for facet.pivot (Dan Cooper, Erik Hatcher, Chris Russell, Andrew Muldowney, Brett Lucey, Mark Miller, hossman)

* SOLR-5656: Add autoAddReplicas feature for shared file systems. (Mark Miller, Gregory Chanan)

* SOLR-5244: Exporting Full Sorted Result Sets (Erik Hatcher, Joel Bernstein)

* SOLR-3617: bin/solr and bin/solr.cmd scripts for starting, stopping, and running Solr examples (Timothy Potter)

* SOLR-6233: Provide basic command line tools for checking Solr status and health. (Timothy Potter)

Bug Fixes
----------------------

* SOLR-6095: SolrCloud cluster can end up without an overseer with overseer roles (Noble Paul, Shalin Mangar)

* SOLR-6165: DataImportHandler should write BigInteger and BigDecimal values as strings. (Anand Sengamalai via shalin)

* SOLR-6189: Avoid publishing the state as down if the node is not live when determining if a replica should be in leader-initiated recovery. (Timothy Potter)

* SOLR-6197: The MIGRATE collection API doesn't work when legacyCloud=false is set in cluster properties. (shalin)

* SOLR-6206: The migrate collection API fails on retry if temp collection already exists. (shalin)

* SOLR-6072: The 'deletereplica' API should remove the data and instance directory by default. (shalin)

* SOLR-6211: TrieDateField doesn't default to omitNorms=true. (Michael Ryan, Steve Rowe)

* SOLR-6159: A ZooKeeper session expiry during setup can keep LeaderElector from joining elections. (Steven Bower, shalin)

* SOLR-6223: SearchComponents may throw NPE when using shards.tolerant and there is a failure in the 'GET_FIELDS/GET_HIGHLIGHTS/GET_DEBUG' phase. (Tomás Fernández Löbbe via shalin)

* SOLR-6180: Callers of ManagedIndexSchema mutators should hold the schemaUpdateLock.
(Gregory Chanan via Steve Rowe) * SOLR-6229: Make SuggestComponent return 400 instead of 500 for bad dictionary selected in request. (Tomás Fernández Löbbe via shalin) * SOLR-6235: Leader initiated recovery should use coreNodeName instead of coreName to avoid marking all replicas having common core name as down. (shalin) * SOLR-6208: JettySolrRunner QueuedThreadPool's configuration code is never executed. (dweiss via shalin) * SOLR-6245: Socket and Connection configuration are ignored in HttpSolrServer when passing in HttpClient. (Patanachai Tangchaisin, shalin) * SOLR-6137: Schemaless concurrency improvements: - Fixed an NPE when reloading a managed schema with no dynamic copy fields - Moved parsing and schema fields addition to after the distributed phase - AddSchemaFieldsUpdateProcessor now uses a fixed schema rather than always retrieving the latest, and holds the schema update lock through the entire schema swap-out process (Gregory Chanan via Steve Rowe) * SOLR-6136: ConcurrentUpdateSolrServer includes a Spin Lock (Brandon Chapman, Timothy Potter) * SOLR-6257: More than two "!"-s in a doc ID throws an ArrayIndexOutOfBoundsException when using the composite id router. (Steve Rowe) * SOLR-5746: Bugs in solr.xml parsing have been fixed to more correctly deal with the various datatypes of options people can specify, additional error handling of duplicated/unidentified options has also been added. (Maciej Zasada, hossman) * SOLR-5847: Fixed data import abort button in admin UI. (ehatcher) * SOLR-6264: Distributed commit and optimize are executed serially across all replicas. (Mark Miller, Timothy Potter) * SOLR-6163: Correctly decode special characters in managed stopwords and synonym endpoints. (Vitaliy Zhovtyuk, Timo Schmidt via Timothy Potter) * SOLR-6336: DistributedQueue can easily create too many ZooKeeper Watches. (Ramkumar Aiyengar via Mark Miller) * SOLR-6347: DELETEREPLICA throws a NPE while removing the last Replica in a Custom sharded collection. 
(Anshum Gupta) * SOLR-6062: Fix undesirable edismax query parser effect (introduced in SOLR-2058) in how phrase queries generated from pf, pf2, and pf3 are merged into the main query. (Michael Dodsworth via ehatcher) * SOLR-6372: HdfsDirectoryFactory should use supplied Configuration for communicating with secure kerberos. (Gregory Chanan via Mark Miller) * SOLR-6284: Fix NPE in OCP when non-existent sliceId is used for a deleteShard request (Ramkumar Aiyengar via Anshum Gupta) * SOLR-6380: Added missing context info to log message if IOException occurs in processing tlog (Steven Bower via hossman) * SOLR-6383: RegexTransformer returns no results after replaceAll if regex does not match a value. (Alexander Kingson, shalin) * SOLR-6387: Add better error messages throughout Solr and supply a work around for Java bug #8047340 to SystemInfoHandler: On Turkish default locale, some JVMs fail to fork on MacOSX, BSD, AIX, and Solaris platforms. (hossman, Uwe Schindler) * SOLR-6338: coreRootDirectory requires trailing slash, or SolrCloud cores are created in wrong location. (Primož Skale via Erick Erickson) * SOLR-6314: Facet counts duplicated in the response if specified more than once on the request. (Vamsee Yarlagadda, Erick Erickson) * SOLR-6378: Fixed example/example-DIH/ issues with "tika" and "solr" configurations, and tidied up README.txt (Daniel Shchyokin via ehatcher) * SOLR-6393: TransactionLog replay performance on HDFS is very poor. (Mark Miller) * SOLR-6268: HdfsUpdateLog has a race condition that can expose a closed HDFS FileSystem instance and should close its FileSystem instance if either inherited close method is called. (Mark Miller) * SOLR-6089: When using the HDFS block cache, when a file is deleted, its underlying data entries in the block cache are not removed, which is a problem with the global block cache option. (Mark Miller, Patrick Hunt) * SOLR-6402: OverseerCollectionProcessor should not exit for ZooKeeper ConnectionLoss. 
(Jessica Cheng via Mark Miller) * SOLR-6405: ZooKeeper calls can easily not be retried enough on ConnectionLoss. (Jessica Cheng, Mark Miller) * SOLR-6410: Ensure all Lookup instances are closed via CloseHook (hossman, Areek Zillur, Ryan Ernst, Dawid Weiss) Optimizations --------------------- * LUCENE-5803: Solr's schema now uses DelegatingAnalyzerWrapper. This uses less heap for cached TokenStreamComponents because it caches per FieldType not per Field, so indexes with many fields of same type just use one TokenStream per thread. (Shay Banon, Uwe Schindler, Robert Muir) * SOLR-6259: Reduce CPU usage by avoiding repeated costly calls to Document.getField inside DocumentBuilder.toDocument for use-cases with large number of fields and copyFields. (Steven Bower via shalin) * SOLR-5968: BinaryResponseWriter fetches unnecessary stored fields when only pseudo-fields are requested. (Gregg Donovan via shalin) * SOLR-6261: Run ZooKeeper watch event callbacks in parallel to the ZooKeeper event thread. (Ramkumar Aiyengar via Mark Miller) Other Changes --------------------- * SOLR-6173: Fixed wrong failure message in TestDistributedSearch. (shalin) * SOLR-5902: Corecontainer level mbeans are not exposed (noble) * SOLR-6194: Allow access to DataImporter and DIHConfiguration from DataImportHandler. (Aaron LaBella via shalin) * SOLR-6170: CoreContainer.preRegisterInZk() and CoreContainer.register() commands are merged into CoreContainer.create(). (Alan Woodward) * SOLR-6171: Remove unused SolrCores coreNameToOrig map (Alan Woodward) * SOLR-5596: Set system property zookeeper.forceSync=no for Solr test cases. (shalin) * SOLR-2853: Add a unit test for the case when "spellcheck.maxCollationTries=0" (James Dyer) * SOLR-6240: Removed unused coreName parameter in ZkStateReader.getReplicaProps. (shalin) * SOLR-6241: Harden the HttpPartitionTest. (shalin) * SOLR-6228: Fixed bug in TestReplicationHandler.doTestIndexAndConfigReplication. 
(shalin) * SOLR-6120: On Windows, when the war is not extracted, the zkcli.bat script will print a helpful message indicating that the war must be unzipped instead of a java error about a missing class. (shalin, Shawn Heisey) * SOLR-6179: Better strategy for handling empty managed data to avoid spurious warning messages in the logs. (Timothy Potter) * SOLR-6232: CoreContainer.remove() replaced with CoreContainer.unload(). A call to unload will also close the core. * SOLR-3893: DIH should not depend on mail.jar,activation.jar (Timothy Potter, Steve Rowe) * SOLR-6252: A couple of small improvements to UnInvertedField class. (Vamsee Yarlagadda, Gregory Chanan, Mark Miller) * SOLR-3345: BaseDistributedSearchTestCase should always ignore QTime. (Vamsee Yarlagadda, Benson Margulies via Mark Miller) * SOLR-6270: Increased timeouts for MultiThreadedOCPTest. (shalin) * SOLR-6274: UpdateShardHandler should log the params used to configure its HttpClient. (Ramkumar Aiyengar via Mark Miller) * SOLR-6194: Opened up "public" access to DataSource, DocBuilder, and EntityProcessorWrapper in DIH. (Aaron LaBella via ehatcher) * SOLR-6269: Renamed "rollback" to "error" in DIH internals, including renaming onRollback to onError introduced in SOLR-6258. (ehatcher) * SOLR-3622: When using DIH in SolrCloud-mode, rollback will no longer be called when an error occurs. (ehatcher) * SOLR-6231: Increased timeouts and hardened the RollingRestartTest. (Noble Paul, shalin) * SOLR-6290: Harden and speed up CollectionsAPIAsyncDistributedZkTest. (Mark Miller, shalin) * SOLR-6281: Made PostingsSolrHighlighter more configurable via subclass extension. (David Smiley) * SOLR-6309: Increase timeouts for AsyncMigrateRouteKeyTest. (shalin) * SOLR-2168: Added support for facet.missing in /browse field and pivot faceting. (ehatcher) * SOLR-4702: Added support for multiple spellcheck collations to /browse UI. (ehatcher) * SOLR-5664: Added support for multi-valued field highlighting in /browse UI. 
(ehatcher) * SOLR-6313: Improve SolrCloud cloud-dev scripts. (Mark Miller, Vamsee Yarlagadda) * SOLR-6360: Remove bogus "Content-Charset" header in HttpSolrServer. (Michael Ryan, Uwe Schindler) * SOLR-6362: Fix bug in TestSqlEntityProcessorDelta. (James Dyer) * SOLR-6388: Force upgrade of Apache POI dependency in Solr Cell to version 3.10.1 to fix CVE-2014-3529 and CVE-2014-3574. (Uwe Schindler) * SOLR-6391: Improve message for CREATECOLLECTION failure due to missing numShards (Anshum Gupta) ================== 4.9.0 ================== Versions of Major Components --------------------- Apache Tika 1.5 Carrot2 3.9.0 Velocity 1.7 and Velocity Tools 2.0 Apache UIMA 2.3.1 Apache ZooKeeper 3.4.6 Upgrading from Solr 4.8 ---------------------- * Support for DiskDocValuesFormat (ie: fieldTypes configured with docValuesFormat="Disk") has been removed due to poor performance. If you have existing fieldTypes using DiskDocValuesFormat, please modify your schema.xml to remove the 'docValuesFormat' attribute, and optimize your index to rewrite it into the default codec, prior to upgrading to 4.9. See LUCENE-5761 for more details. Detailed Change List ---------------------- New Features ---------------------- * SOLR-5999: Add checkIntegrityAtMerge support to solrconfig.xml. (Varun Thacker via Ryan Ernst) * SOLR-6043: Add ability to set http headers in solr response (Tomás Fernández Löbbe via Ryan Ernst) * SOLR-5973: Pluggable Ranking Collectors and Merge Strategies (Joel Bernstein) * SOLR-6108: Add support for 'addreplica' Collection API in SolrJ. (shalin) * SOLR-5468: Allow a client application to request the minimum achieved replication factor for an update request (single or batch) by sending an optional parameter "min_rf". (Timothy Potter) * SOLR-6088: Add query re-ranking with the ReRankingQParserPlugin (Joel Bernstein) * SOLR-5285: Added a new [child ...] DocTransformer for optionally including Block-Join descendant documents inline in the results of a search.
This works independent of whether the search itself is a block-join related query and is supported by the xml, json, and javabin response formats. (Varun Thacker via hossman) * SOLR-6150: Add new AnalyticsQuery to support pluggable analytics (Joel Bernstein) * SOLR-6125: Allow SolrIndexWriter to close without waiting for merges (Christine Poerschke via Alan Woodward) * SOLR-6064: DebugComponent track output should be returned as a JSON object rather than a list (Christine Poerschke, Alan Woodward) Bug Fixes ---------------------- * SOLR-5956: Use coreDescriptor.getInstanceDir() instead of getRawInstanceDir() in the SnapShooter to avoid problems when solr.solr.home is a symbolic link. (Timothy Potter) * SOLR-6002: Fix a couple of ugly issues around SolrIndexWriter close and rollback as well as how SolrIndexWriter manages its ref counted directory instance. (Mark Miller, Gregory Chanan) * SOLR-6015: Better way to handle managed synonyms when ignoreCase=true (Timothy Potter) * SOLR-6104: The 'addreplica' Collection API does not support 'async' parameter. (shalin) * SOLR-6101: Shard splitting doesn't work when legacyCloud=false is set in cluster properties. (shalin) * SOLR-6111: The 'deleteshard' collection API should be able to delete a shard in 'construction' state. (shalin) * SOLR-6118: 'expand.sort' didn't support function queries. (David Smiley) * SOLR-6120: zkcli.sh should expand solr.war automatically instead of throwing ClassNotFoundException. (sebastian badea, shalin) * SOLR-6149: Specifying the query value without any index value does not work in Analysis browser. (Aman Tandon, shalin) * SOLR-6145: Fix Schema API optimistic concurrency by moving it out of ManagedIndexSchema.add(Copy)Fields() into the consumers of those methods: CopyFieldCollectionResource, FieldCollectionResource, FieldResource, and AddSchemaFieldsUpdateProcessorFactory.
(Gregory Chanan, Alexey Serba, Steve Rowe) * SOLR-6146: Incorrect configuration such as wrong chroot in zk server address can cause CloudSolrServer to leak resources. (Jessica Cheng, Varun Thacker, shalin) * SOLR-6158: Relative configSetBase directories were resolved relative to the container CWD, rather than solr.home. (Simon Endele, Alan Woodward) * SOLR-5426: Fixed a bug in ReverseWildCardFilter that could cause InvalidTokenOffsetsException when highlighting. (Uwe Schindler, Arun Kumar, via hossman) * SOLR-6175: DebugComponent throws NPE on shard exceptions when using shards.tolerant. (Tomás Fernández Löbbe via shalin) * SOLR-6129: DateFormatTransformer doesn't resolve dateTimeFormat. (Aaron LaBella via shalin) * SOLR-6164: Copy Fields Schema additions are not distributed to other nodes. (Gregory Chanan via Steve Rowe) * SOLR-6160: An error was sometimes possible if a distributed search included grouping with group.facet, faceting on facet.field and either facet.range or facet.query. (David Smiley) * SOLR-6182: Data stored by the RestManager could not be reloaded after core restart, causing the core to fail to load; cast the data loaded from storage to the correct data type. (Timothy Potter) Other Changes --------------------- * SOLR-5980: AbstractFullDistribZkTestBase#compareResults always returns false for shouldFail. (Mark Miller, Gregory Chanan) * SOLR-5987: Add "collection" to UpdateParams. (Mark Miller, Greg Solovyev) * SOLR-3862: Add "remove" as update option for atomically removing a value from a multivalued field (Jim Musli, Steven Bower, Alaknantha via Erick Erickson) * SOLR-5974: Remove ShardDoc.score and use parent's ScoreDoc.score. (Tomás Fernández Löbbe via Ryan Ernst) * SOLR-6025: Replace mentions of CommonsHttpSolrServer with HttpSolrServer and StreamingUpdateSolrServer with ConcurrentUpdateSolrServer. (Ahmet Arslan via shalin) * SOLR-6013: Fix method visibility of Evaluator, refactor DateFormatEvaluator for extensibility.
(Aaron LaBella via shalin) * SOLR-6022: Deprecate getAnalyzer() in IndexField and FieldType, and add getIndexAnalyzer(). (Ryan Ernst) * SOLR-3671: Fix DIHWriter interface usage so users may implement writers that output documents to a location external to Solr (ex. a NoSql db). (Roman Chyla via James Dyer) * SOLR-5340: Add support for named snapshots (Varun Thacker via Noble Paul) * SOLR-5495: Recovery strategy for leader partitioned from replica case. Hardening recovery scenarios after the leader receives an error trying to forward an update request to a replica. (Timothy Potter) * SOLR-6116: Refactor DocRouter.getDocRouter to accept routerName as a String. (shalin) * SOLR-6026: REQUESTSTATUS Collection API now also checks for submitted tasks which are yet to begin execution. * SOLR-6067: Refactor duplicate Collector code in SolrIndexSearcher (Christine Poerschke via hossman) * SOLR-5940: post.jar reports back detailed error in case of error responses. (Sameer Maggon, shalin, Uwe Schindler) * SOLR-6161: SolrDispatchFilter should throw java.lang.Error back even if wrapped in another exception. (Miklos Christine via shalin) * SOLR-6153: ReplicationHandler backup response format should contain backup name. (Varun Thacker via shalin) * SOLR-6169: Remove broken handleAlias action in CoreAdminHandler (Alan Woodward) * SOLR-6128: Removed deprecated analysis factories and fieldTypes from the example schema.xml (hossman) * SOLR-5868: HttpClient should be configured to use ALLOW_ALL_HOSTNAME hostname verifier to simplify SSL setup. (Steve Davids via Mark Miller) Optimizations ---------------------- * SOLR-5681: Make the processing of Collection API calls multi-threaded. (Anshum Gupta, shalin, Noble Paul) Build --------------------- * SOLR-6006: Separate test and compile scope dependencies in the Solrj and Solr contrib ivy.xml files, so that the derived Maven dependencies get filled out properly in the corresponding POMs. 
(Steven Scott, Steve Rowe) * SOLR-6130: Added com.uwyn:jhighlight dependency to, and removed asm:asm dependency from the extraction contrib - dependencies weren't fully upgraded with the Tika 1.4->1.5 upgrade (SOLR-5763). (Steve Rowe) ================== 4.8.1 ================== Bug Fixes ---------------------- * SOLR-5904: ElectionContext can cancel an election when it should not if there was an exception while trying to register as the leader. (Mark Miller, Alan Woodward) * SOLR-5993: ZkController can warn about shard leader conflict even after the conflict is resolved. (Gregory Chanan via shalin) * SOLR-6017: Fix SimpleQParser to use query analyzer (Ryan Ernst) * SOLR-6029: CollapsingQParserPlugin throws ArrayIndexOutOfBoundsException if elevated doc has been deleted from a segment. (Greg Harris, Joel Bernstein) * SOLR-6030: Use System.nanoTime() instead of System.currentTimeMillis() in LRUCache.warm. (Tomás Fernández Löbbe via shalin) * SOLR-6037: Fixed incorrect max/sum/stddev for Date fields in StatsComponent (Brett Lucey, hossman) * SOLR-6023: FieldAnalysisRequestHandler throws NPE if no parameters are supplied. (shalin) * SOLR-5090: SpellCheckComponent sometimes throws NPE if "spellcheck.alternativeTermCount" is set to zero (James Dyer). * SOLR-6039: fixed debug output when no results in response (Tomás Fernández Löbbe, hossman) * SOLR-6035: CloudSolrServer directUpdate routing should use getCoreUrl. (Marvin Justice, Joel Bernstein) ================== 4.8.0 ================== Versions of Major Components --------------------- Apache Tika 1.5 Carrot2 3.9.0 Velocity 1.7 and Velocity Tools 2.0 Apache UIMA 2.3.1 Apache ZooKeeper 3.4.6 Upgrading from Solr 4.7 ---------------------- * In previous versions of Solr, Terms that exceeded Lucene's MAX_TERM_LENGTH were silently ignored when indexing documents. Beginning with Solr 4.8, an error will be generated when attempting to index a document with a term that is too large.
If you wish to continue to have large terms ignored, use "solr.LengthFilterFactory" in all of your Analyzers. See LUCENE-5472 for more details. * Solr 4.8 requires Java 7 or greater; Java 8 is verified to be compatible and may bring some performance improvements. When using Oracle Java 7 or OpenJDK 7, be sure to not use the GA build 147 or update versions u40, u45 and u51! We recommend using u55 or later. An overview of known JVM bugs can be found on http://wiki.apache.org/lucene-java/JavaBugs * ZooKeeper is upgraded from 3.4.5 to 3.4.6. * and tags have been deprecated. There is no longer any reason to keep them in the schema file, they may be safely removed. This allows intermixing of , and definitions if desired. Currently, these tags are supported so either style may be implemented. TBD is whether they'll be deprecated formally for 5.0 Detailed Change List ---------------------- System Requirements ---------------------- * LUCENE-4747, LUCENE-5514: Move to Java 7 as minimum Java version. (Robert Muir, Uwe Schindler) New Features ---------------------- * SOLR-5130: Implement addReplica Collections API (Noble Paul) * SOLR-5183: JSON updates now support nested child documents using a "_childDocument_" object key. (Varun Thacker, hossman) * SOLR-5714: You can now use one pool of memory for the HDFS block cache that all collections share. (Mark Miller, Gregory Chanan) * SOLR-5720: Add ExpandComponent to expand results collapsed by the CollapsingQParserPlugin. (Joel Bernstein) * SOLR-3177: Enable tagging and excluding filters in StatsComponent via the localParams syntax. (Mathias H., Nikolai Luthman, Vitaliy Zhovtyuk, shalin) * SOLR-1604: Wildcards, ORs etc inside Phrase Queries. (Ahmet Arslan via Erick Erickson) * SOLR-5477: Async execution of OverseerCollectionProcessor(CollectionsAPI) tasks. (Anshum Gupta) * SOLR-5865: Provide a MiniSolrCloudCluster to enable easier testing.
(Greg Chanan via Mark Miller) * SOLR-5860: Use leaderConflictResolveWait in WaitForState during recovery/startup, improve logging and force refresh cluster state every 15 seconds. (Timothy Potter via shalin) * SOLR-5749: A new Overseer status collection API exposes overseer queue sizes, timing statistics, success and error counts and last N failures per operation. (shalin) * SOLR-5858: Add a hl.qparser parameter to allow you to define a queryparser for hl.q highlight queries. If no queryparser is defined, Solr will use the overall query's defType. (Alan Woodward) * SOLR-4478: Allow cores to use configuration from a configsets directory outside their instance directory. (Alan Woodward, Erick Erickson) * SOLR-5466: A new List collections and cluster status API which clients can use to read collection and shard information instead of reading data directly from ZooKeeper. (Dave Seltzer, Varun Thacker, Vitaliy Zhovtyuk, Erick Erickson, shalin) * SOLR-5795: New DocExpirationUpdateProcessorFactory supports computing an expiration date for documents from the "TTL" expression, as well as automatically deleting expired documents on a periodic basis. (hossman) * SOLR-5829: Allow ExpandComponent to accept query and filter query parameters (Joel Bernstein) * SOLR-5653: Create a RestManager to provide REST API endpoints for reconfigurable plugins. (Tim Potter, Steve Rowe) * SOLR-5655: Create a stopword filter factory that is (re)configurable, and capable of reporting its configuration, via REST API. (Tim Potter via Steve Rowe) * SOLR-5654: Create a synonym filter factory that is (re)configurable, and capable of reporting its configuration, via REST API. 
(Tim Potter via Steve Rowe) * SOLR-5960: Add support for basic authentication in post.jar tool, e.g.: java -Durl="http://username:password@hostname:8983/solr/update" -jar post.jar sample.xml (Sameer Maggon via Uwe Schindler) * SOLR-4864: RegexReplaceProcessorFactory should support pattern capture group substitution in replacement string. (Sunil Srinivasan, Jack Krupansky via Steve Rowe) Bug Fixes ---------------------- * SOLR-5858, SOLR-4812: edismax and dismax query parsers can be used for parsing highlight queries. (Alan Woodward, Tien Nguyen Manh) * SOLR-5893: On restarting overseer designate , move itself to front of the queue (Noble Paul) * SOLR-5915: Attempts to specify the parserImpl for solr.PreAnalyzedField fieldtype failed. (Mike McCandless) * SOLR-5943: SolrCmdDistributor does not distribute the openSearcher parameter. (ludovic Boutros via shalin) * SOLR-5954: Slower DataImportHandler process caused by not reusing jdbc connections. (Mark Miller, Paco Garcia, Raja Nagendra Kumar) * SOLR-5897: Upgraded to jQuery 1.7.2, Solr was previously using 1.4.3, the file was mistakenly named 1.7.2 (steffkes) Optimizations ---------------------- * SOLR-1880: Distributed Search skips GET_FIELDS stage if EXECUTE_QUERY stage gets all fields. Requests with fl=id or fl=id,score are now single-pass. (Shawn Smith, Vitaliy Zhovtyuk, shalin) * SOLR-5783: Requests to open a new searcher will now reuse the current registered searcher (w/o additional warming) if possible in situations where the underlying index has not changed. This reduces overhead in situations such as deletes that do not modify the index, and/or redundant commits. (hossman) * SOLR-5884: When recovery is cancelled, any call to the leader to wait to see the replica in the right state for recovery should be aborted. (Mark Miller) Other Changes --------------------- * SOLR-5909: Upgrade Carrot2 clustering dependency to 3.9.0. 
(Dawid Weiss) * SOLR-5764: Fix recently added tests to not use absolute paths to load test-files, use SolrTestCaseJ4.getFile() and getResource() instead; fix morphlines/map-reduce to not duplicate test resources and fix dependencies among them. (Uwe Schindler) * SOLR-5765: Update to SLF4J 1.7.6. (Mark Miller) * SOLR-5609: If legacy mode is disabled don't let cores create slices/replicas/collections. All operations should be performed through the collection API (Noble Paul) * SOLR-5613: Upgrade to commons-codec 1.9 for better BeiderMorseFilter performance. (Thomas Champagne, Shawn Heisey via shalin) * SOLR-5771: Add SolrTestCaseJ4.SuppressSSL annotation to disable SSL (instead of static boolean). (Robert Muir) * SOLR-5799: When registering as the leader, if an existing ephemeral registration exists, wait a short time to see if it goes away. (Mark Miller) * LUCENE-5472: IndexWriter.addDocument will now throw an IllegalArgumentException if a Term to be indexed exceeds IndexWriter.MAX_TERM_LENGTH. To recreate previous behavior of silently ignoring these terms, use LengthFilter in your Analyzer. (hossman, Mike McCandless, Varun Thacker) * SOLR-5825: Separate http request creating and execution in SolrJ (Steven Bower via Erick Erickson) * SOLR-5837: Add hashCode/equals to SolrDocument, SolrInputDocument and SolrInputField for testing purposes. (Varun Thacker, Noble Paul, Mark Miller) * SOLR-5853: The createCollection methods in the test framework now report the result of the operation in the returned CollectionAdminResponse (janhoy) * SOLR-5838: Relative SolrHome Path Bug At AbstractFullDistribZkTestBase. (Furkan KAMACI via shalin) * SOLR-5763: Upgrade to Tika 1.5 (Vitaliy Zhovtyuk via Steve Rowe) * SOLR-5881: Upgrade ZooKeeper to 3.4.6 (Shawn Heisey) * SOLR-5883: Many tests do not shutdown SolrServer. (Tomás Fernández Löbbe via Mark Miller) * SOLR-5898: Update to latest Kite Morphlines release: Version 0.12.1.
(Mark Miller) * SOLR-5228: Don't require or be inside of -- or that be inside of . (Erick Erickson) * SOLR-5903: SolrCore implements Closeable, cut over to using try-with-resources where possible. (Alan Woodward) * SOLR-5914: Cleanup and fix Solr's test cleanup code. (Mark Miller, Uwe Schindler) * SOLR-5936: Deprecate non-Trie-based numeric & date field types. (Steve Rowe) * SOLR-5934: LBHttpSolrServer exception handling improvement and small test improvements. (Gregory Chanan via Mark Miller) * SOLR-5773: CollapsingQParserPlugin should make elevated documents the group head. (David Boychuck, Joel Bernstein) * SOLR-5937: Modernize the DIH example config sets. (Steve Rowe) ================== 4.7.2 ================== Versions of Major Components --------------------- Apache Tika 1.4 Carrot2 3.8.0 Velocity 1.7 and Velocity Tools 2.0 Apache UIMA 2.3.1 Apache ZooKeeper 3.4.5 Detailed Change List ---------------------- Bug Fixes ---------------------- * SOLR-5951: Fixed SolrDispatchFilter to throw useful exception on startup if SLF4j logging jars are missing. (Uwe Schindler, Hossman, Shawn Heisey) * SOLR-5950: Maven config: make the org.slf4j:slf4j-api dependency transitive (i.e., not optional) in all modules in which it's a dependency, including solrj, except for the WAR, where it will remain optional. (Uwe Schindler, Steve Rowe) ================== 4.7.1 ================== Versions of Major Components --------------------- Apache Tika 1.4 Carrot2 3.8.0 Velocity 1.7 and Velocity Tools 2.0 Apache UIMA 2.3.1 Apache ZooKeeper 3.4.5 Detailed Change List ---------------------- Bug Fixes ---------------------- * SOLR-5647: The lib paths in example-schemaless will now load correctly. (Paul Westin via Shawn Heisey) * SOLR-5770: All attempts to match a SolrCore with its state in clusterstate.json should be done with the CoreNodeName. (Steve Davids via Mark Miller) * SOLR-5875: QueryComponent.mergeIds() unmarshals all docs' sort field values once per doc instead of once per shard. 
(Alexey Serba, hoss, Martin de Vries via Steve Rowe) * SOLR-5800: Admin UI - Analysis form doesn't render results correctly when a CharFilter is used. (steffkes) * SOLR-5870: Admin UI - Reload on Core Admin doesn't show errors (steffkes) * SOLR-5867: OverseerCollectionProcessor isn't properly generating https urls in some cases. (Steve Davids via shalin) * SOLR-5866: UpdateShardHandler needs to use the system default scheme registry to properly handle https via javax.net.ssl.* properties. (Steve Davids via shalin) * SOLR-5782: The full MapReduceIndexer help text does not display when using --help. (Mark Miller, Wolfgang Hoschek) * SOLR-5824: Merge up Solr MapReduce contrib code to latest external changes. Includes a few minor bug fixes. (Mark Miller) * SOLR-5818: distrib search with custom comparator does not quite work correctly (Ryan Ernst) * SOLR-5895: JavaBinLoader hides IOExceptions. (Mike Sokolov via shalin) * SOLR-5861: Recovery should not set onlyIfLeaderActive=true for slice in 'recovery' state. (shalin) * SOLR-5423: CSV output doesn't include function field (Arun Kumar, hossman, Steve Rowe) * SOLR-5550: shards.info is not returned by a short circuited distributed query. (Timothy Potter, shalin) * SOLR-5777: Fix ordering of field values in JSON updates where field name key is repeated (hossman) * SOLR-5734: We should use System.nanoTime rather than System.currentTimeMillis when calculating elapsed time. (Mark Miller, Ramkumar Aiyengar) * SOLR-5760: ConcurrentUpdateSolrServer has a blockUntilFinished call when streamDeletes is true that should be tucked into the if statement below it. (Mark Miller, Gregory Chanan) * SOLR-5761: HttpSolrServer has a few fields that can be set via setters but are not volatile. (Mark Miller, Gregory Chanan) * SOLR-5907: The hdfs write cache can cause a reader to see a corrupted state. It now defaults to off, and if you were using solr.hdfs.blockcache.write.enabled explicitly, you should set it to false. 
(Mark Miller) * SOLR-5811: The Overseer will retry work items until success, which is a serious problem if you hit a bad work item. (Mark Miller) * SOLR-5796: Increase how long we are willing to wait for a core to see the ZK advertised leader in its local state. (Timothy Potter, Mark Miller) * SOLR-5834: Overseer threads are only being interrupted and not closed. (hossman, Mark Miller) * SOLR-5839: ZookeeperInfoServlet does not trim path properly. (Furkan KAMACI via Mark Miller) * SOLR-5874: Unsafe cast in CloudSolrServer's RouteException. Change RouteException to handle Throwable rather than Exception. (Mark Miller, David Arthur) * SOLR-5899: CloudSolrServer's RouteResponse and RouteException should be publicly accessible. (Mark Miller, shalin) * SOLR-5905: CollapsingQParserPlugin throws a NPE if required 'field' param is missing. (Spyros Kapnissis via shalin) * SOLR-5906: Collection create API ignores property.instanceDir parameter. (Varun Thacker, shalin) * SOLR-5920: Distributed sort on DateField, BoolField and BCD{Int,Long,Str}Field returns string cast exception (Eric Bus, AJ Lemke, hossman, Steve Rowe) Other Changes --------------------- * SOLR-5796: Make how long we are willing to wait for a core to see the ZK advertised leader in its local state configurable. (Timothy Potter via Mark Miller) ================== 4.7.0 ================== Versions of Major Components --------------------- Apache Tika 1.4 Carrot2 3.8.0 Velocity 1.7 and Velocity Tools 2.0 Apache UIMA 2.3.1 Apache ZooKeeper 3.4.5 Upgrading from Solr 4.6.0 ---------------------- * CloudSolrServer and LBHttpSolrServer no longer declare MalformedURLException as thrown from their constructors. * Due to a bug in previous versions the default value of the 'discountOverlap' property of DefaultSimilarity was not being set appropriately if you were using the implicit DefaultSimilarityFactory instead of explicitly configuring it. 
To preserve consistent behavior for people who upgrade, the implicit behavior is now contingent on the -- discountOverlap=false for 4.6 and below, discountOverlap=true for 4.7 and above. See SOLR-5561 for more information. Detailed Change List ---------------------- New Features ---------------------- * SOLR-5308: SOLR-5601: SOLR-5710: A new 'migrate' collection API to split all documents with a route key into another collection (shalin) * SOLR-5441: Expose number of transaction log files and their size via JMX. (Rafał Kuć via shalin) * SOLR-5320: Added support for tri-level compositeId routing. (Anshum Gupta via shalin) * SOLR-5458: Admin UI - Added a new "Files" conf directory browser/file viewer. (steffkes) * SOLR-5447, SOLR-5490: Add a QParserPlugin for Lucene's SimpleQueryParser. (Jack Conradson via shalin) * SOLR-5208: Support for the setting of core.properties key/values at create-time on Collections API (Erick Erickson) * SOLR-5428: SOLR-5690: New 'stats.calcdistinct' parameter in StatsComponent returns set of distinct values and their count. This can also be specified per field e.g. 'f.field.stats.calcdistinct'. (Elran Dvir via shalin) * SOLR-5378, SOLR-5528: A new SuggestComponent that fully utilizes the Lucene suggester module and adds pluggable dictionaries, payloads and better distributed support. This is intended to eventually replace the Suggester support through the SpellCheckComponent. (Areek Zillur, Varun Thacker via shalin) * SOLR-5492: Return the replica that actually served the query in shards.info response. (shalin) * SOLR-5506: Support docValues in CollationField and ICUCollationField. (Robert Muir) * SOLR-5023: Add support for deleteInstanceDir to be passed from SolrJ for Core Unload action. (Lyubov Romanchuk, shalin) * SOLR-1871: The 'map' function query accepts a ValueSource as target and default value. (Chris Harris, shalin) * SOLR-5556: Allow class of CollectionsHandler and InfoHandler to be specified in solr.xml. 
(Gregory Chanan, Alan Woodward) * SOLR-5581: Give ZkCLI the ability to get files. (Gregory Chanan via Mark Miller) * SOLR-5536: Add ValueSource collapse criteria to CollapsingQParsingPlugin (Joel Bernstein) * SOLR-5541: Allow QueryElevationComponent to accept elevateIds and excludeIds as http parameters (Joel Bernstein) * SOLR-5463: new 'cursorMark' request param for deep paging of sorted result sets (sarowe, hossman) * SOLR-5529: Add support for queries to use multiple suggesters. (Areek Zillur, Erick Erickson, via Robert Muir) * SOLR-1301: Add a Solr contrib that allows for building Solr indexes via Hadoop's MapReduce. (Matt Revelle, Alexander Kanarsky, Steve Rowe, Mark Miller, Greg Bowyer, Jason Rutherglen, Kris Jirapinyo, Jason Venner , Andrzej Bialecki, Patrick Hunt, Wolfgang Hoschek, Roman Shaposhnik, Eric Wong) * SOLR-5631: Add support for Lucene's FreeTextSuggester. (Areek Zillur via Robert Muir) * SOLR-5695: Add support for Lucene's BlendedInfixSuggester. (Areek Zillur) * SOLR-5476: Overseer Role for nodes (Noble Paul) * SOLR-5594: Allow FieldTypes to specify custom PrefixQuery behavior (Anshum Gupta via hossman) * LUCENE-5395: Upgrade to Spatial4j 0.4. Various new options are now exposed automatically for an RPT field type. See Spatial4j CHANGES & javadocs. https://github.com/spatial4j/spatial4j/blob/master/CHANGES.md (David Smiley) * SOLR-5670: allow _version_ to use DocValues. (Per Steffensen via yonik) * SOLR-5535: Set "partialResults" header for shards that error out if shards.tolerant is specified. (Steve Davids via shalin) * SOLR-5610: Support cluster-wide properties with an API called CLUSTERPROP (Noble Paul) * SOLR-5623: Better diagnosis of RuntimeExceptions in analysis (Benson Margulies) * SOLR-5530: Added a NoOpResponseParser for SolrJ which puts the entire raw response into an entry in the NamedList. (Upayavira, Vitaliy Zhovtyuk via shalin) * SOLR-5682: Make the admin InfoHandler more pluggable / derivable. 
(Greg Chanan via Mark Miller) * SOLR-5672: Add logParamsList parameter to support reduced logging. (Christine Poerschke via Mark Miller) * SOLR-3854: SSL support for SolrCloud. (Sami Siren, hossman, Steve Davids, Alexey Serba, Mark Miller) Bug Fixes ---------------------- * SOLR-5438: DebugComponent throws NPE when used with grouping. (Tomás Fernández Löbbe via shalin) * SOLR-4612: Admin UI - Analysis Screen contains empty table-columns (steffkes) * SOLR-5451: SyncStrategy closes its http connection manager before the executor that uses it in its close method. (Mark Miller) * SOLR-5460: SolrDispatchFilter#sendError can get a SolrCore that it does not close. (Mark Miller) * SOLR-5461: Request proxying should only set con.setDoOutput(true) if the request is a post. (Mark Miller) * SOLR-5481: SolrCmdDistributor should not let the http client do its own retries. (Mark Miller) * LUCENE-5347: Fixed Solr's Zookeeper Client to copy files to Zookeeper using binary transfer. Previously data was read with default encoding and stored in zookeeper as UTF-8. This bug was found after upgrading to forbidden-apis 1.4. (Uwe Schindler) * SOLR-4376: DataImportHandler uses wrong date format for last_index_time if a delta-import is run first before any full-imports. (Sebastien Lorber, Arcadius Ahouansou via shalin) * SOLR-5494: CoreContainer#remove throws NPE rather than returning null when a SolrCore does not exist in core discovery mode. (Mark Miller) * SOLR-5354: Distributed sort is broken with CUSTOM FieldType. (Steve Rowe, hossman, Robert Muir, Jessica Cheng) * SOLR-5515: NPE when getting stats on date field with empty result on SolrCloud. (Alexander Sagen, shalin) * SOLR-5204: StatsComponent and SpellCheckComponent do not support the shards.tolerant=true parameter. (Anca Kopetz, shalin) * SOLR-5527: DIH logs spurious warning for special commands. (shalin) * SOLR-5524: Exception when using Query Function inside Scale Function. 
(Trey Grainger, yonik) * SOLR-5562: ConcurrentUpdateSolrServer constructor ignores supplied httpclient. (Kyle Halliday via Mark Miller) * SOLR-5567: ZkController getHostAddress duplicates url prefix. (Kyle Halliday, Alexey Serba, shalin) * SOLR-4992: Solr eats OutOfMemoryError exceptions in many cases. (Mark Miller, Daniel Collins) * LUCENE-5399, SOLR-5354 sort wouldn't work correctly with distributed searching for some field types such as legacy numeric types (Rob Muir, Mike McCandless) * SOLR-5643: ConcurrentUpdateSolrServer will sometimes not spawn a new Runner thread even though there are updates in the queue. (Mark Miller) * SOLR-5650: When a replica becomes a leader, only peer sync with other replicas that last published an ACTIVE state. (Mark Miller) * SOLR-5657: When a SolrCore starts on HDFS, it should gracefully handle HDFS being in safe mode. (Mark Miller) * SOLR-5663: example-DIH uses non-existing column for mapping (case-sensitive) (steffkes) * SOLR-5666: Using the hdfs write cache can result in appearance of corrupted index. (Mark Miller) * SOLR-5230: Call DelegatingCollector.finish() during grouping. (Joel Bernstein, ehatcher) * SOLR-5679: Shard splitting fails with ClassCastException on collections upgraded from 4.5 and earlier versions. (Brett Hoerner, shalin) * SOLR-5673: HTTPSolrServer doesn't set own property correctly in setFollowRedirects. (Frank Wesemann via shalin) * SOLR-5676: SolrCloud updates rejected if talking to secure ZooKeeper. (Greg Chanan via Mark Miller) * SOLR-5634: SolrJ GroupCommand.getNGroups returns null if group.format=simple and group.ngroups=true. (Artem Lukanin via shalin) * SOLR-5667: Performance problem when not using hdfs block cache. (Mark Miller) * SOLR-5526: Fixed NPE that could arise when explicitly configuring some built in QParserPlugins (Nikolay Khitrin, Vitaliy Zhovtyuk, hossman) * SOLR-5598: LanguageIdentifierUpdateProcessor ignores all but the first value of multiValued string fields. 
(Andreas Hubold, Vitaliy Zhovtyuk via shalin) * SOLR-5593: Replicas should accept the last updates from a leader that has just lost its connection to ZooKeeper. (Christine Poerschke via Mark Miller) * SOLR-5678: SolrZkClient should throw a SolrException when connect times out rather than a RuntimeException. (Karl Wright, Anshum Gupta, Mark Miller) * SOLR-4072: Error message is incorrect for linkconfig in ZkCLI. (Vamsee Yarlagadda, Adam Hahn, via Mark Miller) * SOLR-5691: Sharing a non thread safe WeakHashMap across threads can cause problems. (Bojan Smid, Mark Miller) * SOLR-5693: Running on HDFS does not work correctly with NRT search. (Mark Miller) * SOLR-5644: SplitShard does not handle not finding a shard leader well. (Mark Miller, Anshum Gupta via shalin) * SOLR-5704: coreRootDirectory was not respected when creating new cores via CoreAdminHandler (Jesse Sipprell, Alan Woodward) * SOLR-5709: Highlighting grouped duplicate docs from different shards with group.limit > 1 throws ArrayIndexOutOfBoundsException. (Steve Rowe) * SOLR-5561: Fix implicit DefaultSimilarityFactory initialization in IndexSchema to properly specify discountOverlap option. (Isaac Hebsh, Ahmet Arslan, Vitaliy Zhovtyuk, hossman) * SOLR-5689: On reconnect, ZkController cancels election on first context rather than latest. (Gregory Chanan, Mark Miller via shalin) * SOLR-5649: Clean up some minor ConnectionManager issues. (Mark Miller, Gregory Chanan) * SOLR-5365: Fix bug with compressed files in ExtractingRequestHandler by upgrading commons-compress to 1.7 (Jan Høydahl, hossman) * SOLR-5675: cloud-scripts/zkcli.bat: quote option log4j (Günther Ruck via steffkes) * SOLR-5721: ConnectionManager can become stuck in likelyExpired. (Gregory Chanan via Mark Miller) * SOLR-5731: In ConnectionManager, we should catch and only log exceptions from BeforeReconnect. (Mark Miller) * SOLR-5718: Make LBHttpSolrServer zombie checks non-distrib and non-scoring.
(Christine Poerschke via Mark Miller) * SOLR-5727: LBHttpSolrServer should only retry on Connection exceptions when sending updates. Affects CloudSolrServer. (Mark Miller) * SOLR-5739: Sub-shards created by shard splitting have their update log set to buffering mode on restarts. (Günther Ruck, shalin) * SOLR-5741: UpdateShardHandler was not correctly setting max total connections on the HttpClient. (Shawn Heisey) * SOLR-5620: ZKStateReader.aliases should be volatile to ensure all threads see the latest aliases. (Ramkumar Aiyengar via Mark Miller) * SOLR-5448: ShowFileRequestHandler treats everything as Directory, when in Cloud-Mode. (Erick Erickson, steffkes) Optimizations ---------------------- * SOLR-5436: Eliminate the 1500ms wait in overseer loop as well as polling the ZK distributed queue. (Noble Paul, Mark Miller) * SOLR-5189: Solr 4.x Web UI Log Viewer does not display 'date' column from logs (steffkes) * SOLR-5512: Optimize DocValuesFacets. (Robert Muir) * SOLR-2960: fix DIH XPathEntityProcessor to add the correct number of "null" placeholders for multi-valued fields (Michael Watts via James Dyer) * SOLR-5214: Reduce memory usage for shard splitting by merging segments one at a time. (Christine Poerschke via shalin) * SOLR-4227: Wrap XML RequestWriter's OutputStreamWriter in a BufferedWriter to avoid frequent converter invocations. (Conrad Herrmann, shalin) * SOLR-5624: Enable QueryResultCache for CollapsingQParserPlugin. (David Boychuck, Joel Bernstein) * LUCENE-5440: DocSet decoupled from OpenBitSet. DocSetBase moved to use FixedBitSet instead of OpenBitSet. As a result BitDocSet now only works with FixedBitSet. (Shai Erera) Other Changes --------------------- * SOLR-5399: Add distributed request tracking information to DebugComponent (Tomás Fernández Löbbe via Ryan Ernst) * SOLR-5421: Remove double set of distrib.from param in processAdd method of DistributedUpdateProcessor. 
(Anshum Gupta via shalin) * SOLR-5404: The example config references deprecated classes. (Uwe Schindler, Rafał Kuć via Mark Miller) * SOLR-5487: Replication factor error message doesn't match constraint. (Patrick Hunt via shalin) * SOLR-5499: Log a warning if /get is not registered when using SolrCloud. (Daniel Collins via shalin) * SOLR-5517: Return HTTP error on POST requests with no Content-Type. (Ryan Ernst, Uwe Schindler) * SOLR-5502: Added a test for tri-level compositeId routing with documents having a "/" in a document id. (Anshum Gupta via Mark Miller) * SOLR-5533: Improve out of the box support for running Solr on hdfs with SolrCloud. (Mark Miller) * SOLR-5548: Give DistributedSearchTestCase / JettySolrRunner the ability to specify extra filters. (Greg Chanan via Mark Miller) * SOLR-5555: LBHttpSolrServer and CloudSolrServer constructors don't need to declare MalformedURLExceptions (Sushil Bajracharya, Alan Woodward) * SOLR-5565: Raise default ZooKeeper session timeout to 30 seconds from 15 seconds. (Mark Miller) * SOLR-5574: CoreContainer shutdown publishes all nodes as down and waits to see that and then again publishes all nodes as down. (Mark Miller) * SOLR-5590: Upgrade HttpClient/HttpComponents to 4.3.x. (Karl Wright via Shawn Heisey) * SOLR-2794: change the default of hl.phraseLimit to 5000. (Michael Della Bitta via Robert Muir, Koji, zarni - pull request #11) * SOLR-5632: Improve response message for reloading a non-existent core. (Anshum Gupta via Mark Miller) * SOLR-5633: HttpShardHandlerFactory should make its http client available to subclasses. (Ryan Ernst) * SOLR-5684: Shutdown SolrServer clients created in BasicDistributedZk2Test and BasicDistributedZkTest. (Tomás Fernández Löbbe via shalin) * SOLR-5629: SolrIndexSearcher.name should include core name. (Shikhar Bhushan via shalin) * SOLR-5702: Log config name found for collection at info level. (Christine Poerschke via Mark Miller) * SOLR-5659: Add test for compositeId ending with an '!'. 
(Markus Jelsma, Anshum Gupta via shalin) * SOLR-5700: Improve error handling of remote queries (proxied requests). (Greg Chanan, Steve Davids via Mark Miller) * SOLR-5585: Raise Collections API timeout to 3 minutes from one minute. (Mark Miller) * SOLR-5257: Improved error/warn messages when Update XML contains unexpected XML nodes (Vitaliy Zhovtyuk, hossman) ================== 4.6.1 ================== Versions of Major Components --------------------- Apache Tika 1.4 Carrot2 3.8.0 Velocity 1.7 and Velocity Tools 2.0 Apache UIMA 2.3.1 Apache ZooKeeper 3.4.5 Detailed Change List ---------------------- Bug Fixes ---------------------- * SOLR-5408: CollapsingQParserPlugin scores incorrectly when multiple sort criteria are used (Brandon Chapman, Joel Bernstein) * SOLR-5416: CollapsingQParserPlugin breaks Tag/Exclude Faceting (David Boychuck, Joel Bernstein) * SOLR-5442: Python client cannot parse proxied response when served by Tomcat. (Patrick Hunt, Gregory Chanan, Vamsee Yarlagadda, Romain Rigaux, Mark Miller) * SOLR-5445: Proxied responses should propagate all headers rather than the first one for each key. (Patrick Hunt, Mark Miller) * SOLR-5479: SolrCmdDistributor retry logic stops if a leader for the request cannot be found in 1 second. (Mark Miller) * SOLR-5532: SolrJ Content-Type validation is too strict for some webcontainers / proxies. (Jakob Furrer, hossman, Shawn Heisey, Uwe Schindler, Mark Miller) * SOLR-5547: Creating a collection alias using SolrJ's CollectionAdminRequest sets the alias name and the collections to alias to the same value. (Aaron Schram, Mark Miller) * SOLR-5577: Likely ZooKeeper expiration should not slow down updates a given amount, but instead cut off updates after a given time. (Mark Miller, Christine Poerschke, Ramkumar Aiyengar) * SOLR-5580: NPE when creating a core with both explicit shard and coreNodeName. 
(YouPeng Yang, Mark Miller) * SOLR-5552: Leader recovery process can select the wrong leader if all replicas for a shard are down and trying to recover, and can lose updates that should have been recovered. (Timothy Potter, Mark Miller) * SOLR-5569: A replica should not try and recover from a leader until it has published that it is ACTIVE. (Mark Miller) * SOLR-5568: A SolrCore cannot decide to be the leader just because the cluster state says no other SolrCores are active. (Mark Miller) * SOLR-5496: We should share an http connection manager across non search HttpClients and ensure all http connection managers get shutdown. (Mark Miller) * SOLR-5583: ConcurrentUpdateSolrServer#blockUntilFinished may wait forever if the executor service is shutdown. (Mark Miller) * SOLR-5586: All ZkCmdExecutors should be initialized with the zk client timeout. (Mark Miller) * SOLR-5587: ElectionContext implementations should use ZkCmdExecutor#ensureExists to ensure their election paths are properly created. (Mark Miller) * SOLR-5540: HdfsLockFactory should explicitly create the lock parent directory if necessary. (Mark Miller) * SOLR-4709: The core reload after replication if config files have changed can fail due to a race condition. (Mark Miller, Hossman) * SOLR-5503: Retry 'forward to leader' requests less aggressively - rather than on IOException and status 500, ConnectException. (Mark Miller) * SOLR-5588: PeerSync doesn't count all connect failures as success. (Mark Miller) * SOLR-5564: hl.maxAlternateFieldLength should apply to original field when fallback is attempted (janhoy) * SOLR-5608: Don't allow a closed SolrCore to publish state to ZooKeeper. (Mark Miller, Shawn Heisey) * SOLR-5615: Deadlock while trying to recover after a ZK session expiration. (Ramkumar Aiyengar, Mark Miller) * SOLR-5543: Core swaps resulted in duplicate core entries in solr.xml when using solr.xml persistence.
(Bill Bell, Alan Woodward) * SOLR-5618: Fix false cache hits in queryResultCache when hashCodes are equal and duplicate filter queries exist in one of the requests (hossman) * SOLR-4260: ConcurrentUpdateSolrServer#blockUntilFinished can return before all previously added updates have finished. This could cause distributed updates meant for replicas to be lost. (Markus Jelsma, Timothy Potter, Joel Bernstein, Mark Miller) * SOLR-5645: A SolrCore reload via the CoreContainer will try and register in zk again with the new SolrCore. (Mark Miller) * SOLR-5636: SolrRequestParsers does some xpath lookups on every request, which can cause concurrency issues. (Mark Miller) * SOLR-5658: commitWithin and overwrite are not being distributed to replicas now that SolrCloud uses javabin to distribute updates. (Mark Miller, Varun Thacker, Elodie Sannier, shalin) Optimizations ---------------------- * SOLR-5576: Improve concurrency when registering and waiting for all SolrCore's to register a DOWN state. (Christine Poerschke via Mark Miller) ================== 4.6.0 ================== Versions of Major Components --------------------- Apache Tika 1.4 Carrot2 3.8.0 Velocity 1.7 and Velocity Tools 2.0 Apache UIMA 2.3.1 Apache ZooKeeper 3.4.5 Upgrading from Solr 4.5.0 ---------------------- * If you are using methods from FieldMutatingUpdateProcessorFactory for getting configuration information (oneOrMany or getBooleanArg), those methods have been moved to NamedList and renamed to removeConfigArgs and removeBooleanArg, respectively. The original methods are deprecated, to be removed in 5.0. See SOLR-5264. Detailed Change List ---------------------- New Features ---------------------- * SOLR-5167: Add support for AnalyzingInfixSuggester (AnalyzingInfixLookupFactory). (Areek Zillur, Varun Thacker via Robert Muir) * SOLR-5246: Shard splitting now supports collections configured with router.field. (shalin) * SOLR-5274: Allow JettySolrRunner SSL config to be specified via a constructor. 
(Mark Miller) * SOLR-5300: Shards can be split by specifying arbitrary number of hash ranges within the shard's hash range. (shalin) * SOLR-5226: Add Lucene index heap usage to the Solr admin UI. (Areek Zillur via Robert Muir) * SOLR-5324: Make sub shard replica recovery and shard state switch asynchronous. (Yago Riveiro, shalin) * SOLR-5338: Split shards by a route key using split.key parameter. (shalin) * SOLR-5353: Enhance CoreAdmin api to split a route key's documents from an index and leave behind all other documents. (shalin) * SOLR-5027: CollapsingQParserPlugin for high performance field collapsing on high cardinality fields. (Joel Bernstein) * SOLR-5395: Added a RunAlways marker interface for UpdateRequestProcessorFactory implementations indicating that they should not be removed in later stages of distributed updates (usually signalled by the update.distrib parameter) (yonik) * SOLR-5310: Add a collection admin command to remove a replica (noble) * SOLR-5311: Avoid registering replicas which are removed (noble) * SOLR-5406: CloudSolrServer failed to propagate request parameters along with delete updates. (yonik) * SOLR-5374: Support user configured doc-centric versioning rules via the optional DocBasedVersionConstraintsProcessorFactory update processor (Hossman, yonik) * SOLR-5392: Extend solrj apis to cover collection management. (Roman Shaposhnik via Mark Miller) * SOLR-5084: new field type EnumField. (Elran Dvir via Erick Erickson) * SOLR-5464: Add option to ConcurrentSolrServer to stream pure delete requests. (Mark Miller) Bug Fixes ---------------------- * SOLR-5216: Document updates to SolrCloud can cause a distributed deadlock. (Mark Miller) * SOLR-5367: Unmarshalling delete by id commands with JavaBin can lead to class cast exception. (Mark Miller) * SOLR-5359: ZooKeeper client is not closed when it fails to connect to an ensemble. 
(Mark Miller, Klaus Herrmann) * SOLR-5042: MoreLikeThisComponent was using the rows/count value in place of flags, which caused a number of very strange issues, including NPEs and ignoring requests for the results to include the score. (Anshum Gupta, Mark Miller, Shawn Heisey) * SOLR-5371: Solr should consistently call SolrServer#shutdown (Mark Miller) * SOLR-5363: Solr doesn't start up properly with Log4J2 (Petar Tahchiev via Alan Woodward) * SOLR-5380: Using cloudSolrServer.setDefaultCollection(collectionId) does not work as intended for an alias spanning more than 1 collection. (Thomas Egense, Shawn Heisey, Mark Miller) * SOLR-5418: Background merge after field removed from solr.xml causes error. (Reported on user's list, Robert M's patch via Erick Erickson) * SOLR-5318: Creating a core via the admin API doesn't respect transient property (Olivier Soyez via Erick Erickson) * SOLR-5388: Creating a new core via the HTTP API that results in a transient being unloaded results in a " Too many close [count:-1]" error. (Olivier Soyez via Erick Erickson) * SOLR-5453: Raise recovery socket read timeouts. (Mark Miller) * SOLR-5397: Replication can fail silently in some cases. (Mark Miller) * SOLR-5465: SolrCmdDistributor retry logic has a concurrency race bug. (Mark Miller) * SOLR-5452: Do not attempt to proxy internal update requests. (Mark Miller) Optimizations ---------------------- * SOLR-5232: SolrCloud should distribute updates via streaming rather than buffering. (Mark Miller) * SOLR-5223: SolrCloud should use the JavaBin binary format for communication by default. (Mark Miller) * SOLR-5370: Requests to recover when an update fails should be done in background threads. (Mark Miller) * LUCENE-5300,LUCENE-5304: Specialized faceting for fields which are declared as multi-valued in the schema but are actually single-valued. 
(Adrien Grand) Security ---------------------- * SOLR-4882: SolrResourceLoader was restricted to only allow access to resource files below the instance dir. The reason for this is security related: some Solr components allow resource paths to be passed in via REST parameters (e.g. XSL stylesheets, velocity templates,...) and load them via the resource loader. For backwards compatibility, this security feature can be disabled by a new system property: solr.allow.unsafe.resourceloading=true (Uwe Schindler) Other Changes ---------------------- * SOLR-5237: Add indexHeapUsageBytes to LukeRequestHandler, indicating how much heap memory is being used by the underlying Lucene index structures. (Areek Zillur via Robert Muir) * SOLR-5241: Fix SimplePostToolTest performance problem - implicit DNS lookups (hossman) * SOLR-5273: Update HttpComponents to 4.2.5 and 4.2.6. (Mark Miller) * SOLR-5264: Move methods for getting config information from FieldMutatingUpdateProcessorFactory to NamedList. (Shawn Heisey) * SOLR-5319: Remove unused and incorrect router name from Collection ZK nodes. (Jessica Cheng via shalin) * SOLR-5321: Remove unnecessary code in Overseer.updateState method which tries to use router name from message where none is ever sent. (shalin) * SOLR-5401: SolrResourceLoader logs a warning if a deprecated (factory) class is used in schema or config. (Uwe Schindler) * SOLR-3397: Warn if master or slave replication is enabled in SolrCloud mode. (Erick Erickson) ================== 4.5.1 ================== Versions of Major Components --------------------- Apache Tika 1.4 Carrot2 3.8.0 Velocity 1.7 and Velocity Tools 2.0 Apache UIMA 2.3.1 Apache ZooKeeper 3.4.5 Detailed Change List ---------------------- Bug Fixes ---------------------- * SOLR-4590: Collections API should return a nice error when not in SolrCloud mode. (Anshum Gupta, Mark Miller) * SOLR-5295: The CREATESHARD collection API creates maxShardsPerNode number of replicas if replicationFactor is not specified.
(Brett Hoerner, shalin) * SOLR-5296: Creating a collection with implicit router adds shard ranges to each shard. (shalin) * SOLR-5263: Fix CloudSolrServer URL cache update race. (Jessica Cheng, Mark Miller) * SOLR-5297: Admin UI - Threads Screen missing Icon (steffkes) * SOLR-5301: DELETEALIAS command prints CREATEALIAS in logs (janhoy) * SOLR-5255: Remove unnecessary call to fetch and watch live nodes in ZkStateReader cluster watcher. (Jessica Cheng via shalin) * SOLR-5305: Admin UI - Reloading System-Information on Dashboard does not work anymore (steffkes) * SOLR-5314: Shard split action should use soft commits instead of hard commits to make sub shard data visible. (Kalle Aaltonen, shalin) * SOLR-5327: SOLR-4915, "The root cause should be returned to the user when a SolrCore create call fails", was reverted. (Mark Miller) * SOLR-5317: SolrCore persistence bugs if defining SolrCores in solr.xml. (Mark Miller, Yago Riveiro) * SOLR-5306: Extra collection creation parameters like collection.configName are not being respected. (Mark Miller, Liang Tianyu, Nathan Neulinger) * SOLR-5325: ZooKeeper connection loss can cause the Overseer to stop processing commands. (Christine Poerschke, Mark Miller, Jessica Cheng) * SOLR-4327: HttpSolrServer can leak connections on errors. (Karl Wright, Mark Miller) * SOLR-5349: CloudSolrServer - ZK timeout arguments passed to ZkStateReader are flipped. (Ricardo Merizalde via shalin) * SOLR-5330: facet.method=fcs on single values fields could sometimes result in incorrect facet labels. (Michael Froh, yonik) Other Changes ---------------------- * SOLR-5323: Disable ClusteringComponent by default in collection1 example. The solr.clustering.enabled system property needs to be set to 'true' to enable the clustering contrib (reverts SOLR-4708). 
(Dawid Weiss) ================== 4.5.0 ================== Versions of Major Components --------------------- Apache Tika 1.4 Carrot2 3.8.0 Velocity 1.7 and Velocity Tools 2.0 Apache UIMA 2.3.1 Apache ZooKeeper 3.4.5 Upgrading from Solr 4.4.0 ---------------------- * XML configuration parsing is now more strict about situations where a single setting is allowed but multiple values are found. In the past, one value would be chosen arbitrarily and silently. Starting with 4.5, configuration parsing will fail with an error in situations like this. If you see error messages such as "solrconfig.xml contains more than one value for config path: XXXXX" or "Found Z configuration sections when at most 1 is allowed matching expression: XXXXX" check your solrconfig.xml file for multiple occurrences of XXXXX and delete the ones that you do not wish to use. See SOLR-4953 & SOLR-5108 for more details. * In the past, schema.xml parsing would silently ignore "default" or "required" options specified on dynamicField declarations. Beginning with 4.5, attempting to configure these on a dynamic field will cause an init error. If you encounter one of these errors when upgrading an existing schema.xml, you can safely remove these attributes, regardless of their value, from your config and Solr will continue to behave exactly as it did in previous versions. See SOLR-5227 for more details. * The UniqFieldsUpdateProcessorFactory has been improved to support all of the FieldMutatingUpdateProcessorFactory selector options. The "fields" init param option is now deprecated and should be replaced with the more standard "fieldName". See SOLR-4249 for more details. * UpdateRequestExt has been removed as part of SOLR-4816. You should use UpdateRequest instead. * CloudSolrServer can now use multiple threads to add documents by default.
This is a small change in runtime semantics when using the bulk add method - you will still end up with the same exception on a failure, but some documents beyond the one that failed may have made it in. To get the old, single threaded behavior, set parallel updates to false on the CloudSolrServer instance. Detailed Change List ---------------------- New Features ---------------------- * SOLR-5219: Rewritten selection of the default search and document clustering algorithms. (Dawid Weiss) * SOLR-5202: Support easier overrides of Carrot2 clustering attributes via XML data sets exported from the Workbench. (Dawid Weiss) * SOLR-5126: Update Carrot2 clustering to version 3.8.0, update Morfologik to version 1.7.1 (Dawid Weiss) * SOLR-2345: Enhanced geodist() to work with an RPT field, provided that the field is referenced via 'sfield' and the query point is constant. (David Smiley) * SOLR-5082: The encoding of URL-encoded query parameters can be changed with the "ie" (input encoding) parameter, e.g. "select?q=m%FCller&ie=ISO-8859-1". The default is UTF-8. To change the encoding of POSTed content, use the "Content-Type" HTTP header. (Uwe Schindler, Shawn Heisey) * SOLR-4221: Custom sharding (Noble Paul) * SOLR-4808: Persist and use router,replicationFactor and maxShardsPerNode at Collection and Shard level (Noble Paul, Shalin Mangar) * SOLR-5006: CREATESHARD command for 'implicit' shards (Noble Paul) * SOLR-5017: Allow sharding based on the value of a field (Noble Paul) * SOLR-4222: create custom sharded collection via collections API (Noble Paul) * SOLR-4718: Allow solr.xml to be stored in ZooKeeper. (Mark Miller, Erick Erickson) * SOLR-5156: Enhance ZkCLI to allow uploading of arbitrary files to ZK. (Erick Erickson) * SOLR-5165: Single-valued docValues fields no longer require a default value. Additionally they work with sortMissingFirst, sortMissingLast, facet.missing, exists() in function queries, etc. 
(Robert Muir) * SOLR-5182: Add NoOpRegenerator, a regenerator for custom per-segment caches where items are preserved across commits. (Robert Muir) * SOLR-4249: UniqFieldsUpdateProcessorFactory now extends FieldMutatingUpdateProcessorFactory and supports all of its selector options. Use of the "fields" init param is now deprecated in favor of "fieldName" (hossman) * SOLR-2548: Allow multiple threads to be specified for faceting. When threading, one can specify facet.threads to parallelize loading the uninverted fields. In at least one extreme case this reduced warmup time from 20 seconds to 3 seconds. (Janne Majaranta, Gun Akkor via Erick Erickson, David Smiley) * SOLR-4816: CloudSolrServer can now route updates locally and no longer relies on inter-node update forwarding. (Joel Bernstein, Shikhar Bhushan, Stephen Riesenberg, Mark Miller) * SOLR-3249: Allow CloudSolrServer and SolrCmdDistributor to use JavaBin. (Mark Miller) Bug Fixes ---------------------- * SOLR-3633: web UI reports an error if CoreAdminHandler says there are no SolrCores (steffkes) * SOLR-4489: SpellCheckComponent can throw StringIndexOutOfBoundsException when generating collations involving multiple word-break corrections. (James Dyer) * SOLR-5087 - CoreAdminHandler.handleMergeAction generating NullPointerException (Patrick Hunt via Erick Erickson) * SOLR-5107: Fixed NPE when using numTerms=0 in LukeRequestHandler (Ahmet Arslan, hossman) * SOLR-4679, SOLR-4908, SOLR-5124: Text extracted from HTML or PDF files using Solr Cell was missing ignorable whitespace, which is inserted by TIKA for convenience to support plain text extraction without using the HTML elements. This bug resulted in glued words. (hossman, Uwe Schindler) * SOLR-5121: zkcli usage help for makepath doesn't match actual command. (Daniel Collins via Mark Miller) * SOLR-5119: Managed schema problems after adding fields via Schema Rest API. 
(Nils Kübler, Steve Rowe) * SOLR-5133: HdfsUpdateLog can fail to close a FileSystem instance if init is called more than once. (Mark Miller) * SOLR-5135: Harden Collection API deletion of /collections/$collection ZooKeeper node. (Mark Miller) * SOLR-4764: When using NRT, just init the first reader from IndexWriter. (Robert Muir, Mark Miller) * SOLR-5122: Fixed bug in spellcheck.collateMaxCollectDocs. Eliminates risk of divide by zero, and makes estimated hit counts meaningful in non-optimized indexes. (hossman) * SOLR-3936: Fixed QueryElevationComponent sorting when used with Grouping (Michael Garski via hossman) * SOLR-5171: SOLR Admin gui works in IE9, breaks in IE10. (Joseph L Howard via steffkes) * SOLR-5174: Admin UI - Query View doesn't highlight (json) Result if it contains HTML Tags (steffkes) * SOLR-4817 Solr should not fall back to the back compat built in solr.xml in SolrCloud mode (Erick Erickson) * SOLR-5112: Show full message in Admin UI Logging View (Matthew Keeney via steffkes) * SOLR-5190: SolrEntityProcessor substitutes variables only once in child entities (Harsh Chawla, shalin) * SOLR-3852: Fixed ZookeeperInfoServlet so that the SolrCloud Admin UI pages will work even if ZK contains nodes with data which are not utf8 text. (hossman) * SOLR-5206: Fixed OpenExchangeRatesOrgProvider to use refreshInterval correctly (Catalin, hossman) * SOLR-5215: Fix possibility of deadlock in ZooKeeper ConnectionManager. (Mark Miller, Ricardo Merizalde) * SOLR-4909: Use DirectoryReader.openIfChanged in non-NRT mode. (Michael Garski via Robert Muir) * SOLR-5227: Correctly fail schema initialization if a dynamicField is configured to be required, or have a default value. (hossman) * SOLR-5231: Fixed a bug with the behavior of BoolField that caused documents w/o a value for the field to act as if the value were true in functions if no other documents in the same index segment had a value of true. 
(Robert Muir, hossman, yonik) * SOLR-5233: The "deleteshard" collections API doesn't wait for cluster state to update, can fail if some nodes of the deleted shard were down and had incorrect logging. (Christine Poerschke, shalin) * SOLR-5150: HdfsIndexInput may not fully read requested bytes. (Mark Miller, Patrick Hunt) * SOLR-5240: All solr cores will now be loaded in parallel (as opposed to a fixed number) in ZooKeeper mode to avoid deadlocks due to replicas waiting for other replicas to come up. (yonik) * SOLR-5243: Killing a shard in one collection can result in leader election in a different collection if they share the same coreNodeName. (yonik, Mark Miller) * SOLR-5281: IndexSchema log message was printing '[null]' instead of '[]' (Jun Ohtani via Steve Rowe) * SOLR-5279: Implicit properties don't seem to exist on core RELOAD (elyograg, hossman, Steve Rowe) * SOLR-5291: Solrj does not propagate the root cause to the user for many errors. (Mark Miller) Optimizations ---------------------- * SOLR-5044: Admin UI - Note on Core-Admin about directories while creating core (steffkes) * SOLR-5134: Have HdfsIndexOutput extend BufferedIndexOutput. (Mark Miller, Uwe Schindler) * SOLR-5057: QueryResultCache should not depend on the order of the fq list (Feihong Huang via Erick Erickson) * SOLR-4816: CloudSolrServer now uses multiple threads to send updates by default. (Joel Bernstein via Mark Miller) * SOLR-3530: Better error messages / Content-Type validation in SolrJ. (Mark Miller, hossman) Other Changes ---------------------- * SOLR-4708: Enable ClusteringComponent by default in collection1 example. The solr.clustering.enabled system property is set to 'true' by default. (ehatcher, Dawid Weiss) * SOLR-4914, SOLR-5162: Factor out core list persistence and discovery into a new CoresLocator interface. (Alan Woodward, Shawn Heisey) * SOLR-5056: Improve type safety of ConfigSolr class.
(Alan Woodward) * SOLR-4951: Better randomization of MergePolicy in Solr tests (hossman) * SOLR-4953, SOLR-5108: Make XML Configuration parsing fail if an xpath matches multiple nodes when only a single value or plugin instance is expected. (hossman) * The routing parameter "shard.keys" is deprecated as part of SOLR-5017. The new parameter name is '_route_'. The old parameter should continue to work for another release (Noble Paul) * SOLR-5173: Solr-core's Maven configuration includes test-only Hadoop dependencies as indirect compile-time dependencies. (Chris Collins, Steve Rowe) ================== 4.4.0 ================== Versions of Major Components --------------------- Apache Tika 1.4 Carrot2 3.6.2 Velocity 1.7 and Velocity Tools 2.0 Apache UIMA 2.3.1 Apache ZooKeeper 3.4.5 Upgrading from Solr 4.3.0 ---------------------- * TieredMergePolicy and the various subtypes of LogMergePolicy no longer have an explicit "setUseCompoundFile" method. Instead the behavior of new segments is determined by the IndexWriter configuration, and the MergePolicy is only consulted to determine if merged segments should use the compound file format (based on the value of "setNoCFSRatio"). If you have explicitly configured one of these classes using <mergePolicy> and include an init arg like this... <bool name="useCompoundFile">true</bool> ...this will now be treated as if you specified... <useCompoundFile>true</useCompoundFile> ...directly on the <indexConfig> (overriding any value already set using that syntax) and a warning will be logged to update your configuration. Users with an explicitly declared <mergePolicy> are encouraged to review the current javadocs for their MergePolicy subclass and review their configured options carefully. See SOLR-4941, SOLR-4934 and LUCENE-5038 for more information. * SOLR-4778: The signature of LogWatcher.registerListener has changed, from (ListenerConfig, CoreContainer) to (ListenerConfig). Users implementing their own LogWatcher classes will need to change their code accordingly.
* LUCENE-5063: ByteField and ShortField have been deprecated and will be removed in 5.0. If you are still using these field types, you should migrate your fields to TrieIntField. Detailed Change List ---------------------- New Features ---------------------- * SOLR-3251: Dynamically add fields to schema. (Steve Rowe, Robert Muir, yonik) * SOLR-4761, SOLR-4976: Add option to plug a merged segment warmer into solrconfig.xml. Info about segments warmed in the background is available via infostream. (Mark Miller, Ryan Ernst, Mike McCandless, Robert Muir) * SOLR-3240: Add "spellcheck.collateMaxCollectDocs" option so that when testing potential Collations against the index, SpellCheckComponent will only collect n documents, thereby estimating the hit-count. This is a performance optimization in cases where exact hit-counts are unnecessary. Also, when "collateExtendedResults" is false, this optimization is always made (James Dyer). * SOLR-4785: New MaxScoreQParserPlugin returning max() instead of sum() of terms (janhoy) * SOLR-4234: Add support for binary files in ZooKeeper. (Eric Pugh via Mark Miller) * SOLR-4048: Add findRecursive method to NamedList. (Shawn Heisey) * SOLR-4228: SolrJ's SolrPing object has new methods for ping, enable, and disable. (Shawn Heisey, hossman, Steve Rowe) * SOLR-4893: Extend FieldMutatingUpdateProcessor.ConfigurableFieldNameSelector to enable checking whether a field matches any schema field. To select field names that don't match any fields or dynamic fields in the schema, add <bool name="fieldNameMatchesSchemaField">false</bool> to an update processor's configuration in solrconfig.xml. (Steve Rowe, hossman) * SOLR-4921: Admin UI now supports adding documents to Solr (gsingers, steffkes) * SOLR-4916: Add support to write and read Solr index files and transaction log files to and from HDFS. (phunt, Mark Miller, Gregory Chanan) * SOLR-4892: Add FieldMutatingUpdateProcessorFactory subclasses Parse{Date,Integer,Long,Float,Double,Boolean}UpdateProcessorFactory.
These factories have a default selector that matches all fields that either don’t match any schema field, or are in the schema with the corresponding typeClass. If they see a value that is not a CharSequence, or can't parse the value, they leave it as is. For multi-valued fields, these processors will not convert any values unless all are first successfully parsed, or already are instances of the target class. Ordering the processors, e.g. [Boolean, Long, Double, Date] will allow e.g. values ["2", "5", "8.6"] to be left alone by the Boolean and Long processors, but then converted by the Double processor. (Steve Rowe, hossman) * SOLR-4972: Add PUT command to ZkCli tool. (Roman Shaposhnik via Mark Miller) * SOLR-4973: Adding getter method for defaultCollection on CloudSolrServer. (Furkan KAMACI via Mark Miller) * SOLR-4897: Add solr/example/example-schemaless/, an example config set for schemaless mode. (Steve Rowe) * SOLR-4655: Add option to have Overseer assign generic node names so that new addresses can host shards without naming confusion. (Mark Miller, Anshum Gupta) * SOLR-4977: Add option to send IndexWriter's infostream to the logging system. (Ryan Ernst via Robert Muir) * SOLR-4693: A "deleteshard" collections API that unloads all replicas of a given shard and then removes it from the cluster state. It will remove only those shards which are INACTIVE or have no range (created for custom sharding). (Anshum Gupta, shalin) * SOLR-5003: CSV Update Handler supports optionally adding the line number/row id to a document (gsingers) * SOLR-5010: Add support for creating copy fields to the Schema REST API (gsingers) * SOLR-4991: Register QParserPlugins as SolrInfoMBeans (ehatcher) * SOLR-4943: Add a new system wide info admin handler that exposes the system info that could previously only be retrieved using a SolrCore. (Mark Miller) * SOLR-3076: Block joins. Documents and their sub-documents must be indexed as a block. 
{!parent which=<allParents>} takes in a query that matches child documents and results in matches on their parents. {!child of=<allParents>} takes in a query that matches some parent documents and results in matches on their children. (Mikhail Khludnev, Vadim Kirilchuk, Alan Woodward, Tom Burton-West, Mike McCandless, hossman, yonik) Bug Fixes ---------------------- * SOLR-4333: edismax parser to not double-escape colons if already escaped by the client application (James Dyer, Robert J. van der Boon) * SOLR-4776: Solrj doesn't return "between" count in range facets (Philip K. Warren via shalin) * SOLR-4616: HitRatio on caches is now exposed over JMX MBeans as a float. (Greg Bowyer) * SOLR-4803: Fixed core discovery mode (ie: new style solr.xml) to treat 'collection1' as the default core name. (hossman) * SOLR-4790: Throw an error if a core has the same name as another core, both old and new style solr.xml * SOLR-4842: Fix facet.field local params from affecting other facet.field's. (ehatcher, hossman) * SOLR-4814: If a SolrCore cannot be created it should remove any information it published about itself from ZooKeeper. (Mark Miller) * SOLR-4863: Removed non-existent attribute sourceId from dynamic JMX stats to fix AttributeNotFoundException (suganuma, hossman via shalin) * SOLR-4891: JsonLoader should preserve field value types from the JSON content stream. (Steve Rowe) * SOLR-4805: SolrCore#reload should not call preRegister and publish a DOWN state to ZooKeeper. (Mark Miller, Jared Rodriguez) * SOLR-4899: When reconnecting after ZooKeeper expiration, we need to be willing to wait forever, not just for 30 seconds. (Mark Miller) * SOLR-4920: JdbcDataSource incorrectly suppresses exceptions when retrieving a connection from a JNDI context and falls back to trying to use DriverManager to obtain a connection. Additionally, if a SQLException is thrown while initializing a connection, such as in setAutoCommit(), the connection will not be closed.
(Chris Eldredge via shalin) * SOLR-4915: The root cause should be returned to the user when a SolrCore create call fails. (Mark Miller) * SOLR-4925: Collection create throws NPE when 'numShards' param is missing (Noble Paul) * SOLR-4910: persisting solr.xml is broken. More stringent testing of persistence fixed up a number of issues and several bugs with persistence. Among them are - don't persist implicit properties - should persist zkHost in the <solr> tag (user's list) - reloading a core that has transient="true" returned an error. reload should load a transient core if it's not yet loaded. - No longer persisting loadOnStartup or transient core properties if they were not specified in the original solr.xml - Testing flushed out the fact that you couldn't swap a core marked transient=true loadOnStartup=false because it hadn't been loaded yet. - SOLR-4862, CREATE fails to persist schema, config, and dataDir - SOLR-4363, not persisting coreLoadThreads in <solr> tag - SOLR-3900, logWatcher properties not persisted - SOLR-4850, cores defined as loadOnStartup=true, transient=false can't be searched (Erick Erickson) * SOLR-4923: Commits to non leaders as part of a request that also contain updates can execute out of order. (hossman, Ricardo Merizalde, Mark Miller) * SOLR-4932: persisting solr.xml saves some parameters it shouldn't when they weren't defined in the original. Benign since the default values are saved, but still incorrect. (Erick Erickson, thanks Shawn Heisey for helping test!) * SOLR-4934, SOLR-4941: Fix handling of init arg "useCompoundFile" needed after changes in LUCENE-5038 (hossman) * SOLR-4456: Admin UI: Displays dashboard even if Solr is down (steffkes) * SOLR-4949: UI Analysis page dropping characters from input box (steffkes) * SOLR-4960: Fix race conditions in shutdown of CoreContainer and getCore that could cause a request to attempt to use a core that has shut down.
(yonik) * SOLR-4926: Fixed rare replication bug that normally only manifested when using compound file format. (yonik, Mark Miller) * SOLR-4974: Outgrowth of SOLR-4960 that includes transient cores and pending cores (Erick Erickson) * SOLR-3369: shards.tolerant=true is broken for group queries (Russell Black, Martijn van Groningen, Jabouille jean Charles, Ryan McKinley via shalin) * SOLR-4452: Hunspell stemmer should not merge duplicate dictionary entries (janhoy) * SOLR-5000: ManagedIndexSchema doesn't persist uniqueKey tag after calling addFields method. (Jun Ohtani, Steve Rowe) * SOLR-4982: Creating a core while referencing system properties looks like it loses files. Actually, instanceDir, config, dataDir and schema are not dereferenced properly when creating cores that reference sys vars (e.g. &dataDir=${dir}). In the dataDir case in particular this leads to the index being put in a directory literally named ${dir} but on restart the sysvar will be properly dereferenced. * SOLR-4788: Multiple Entities DIH delta import: dataimporter.[entityName].last_index_time is empty. (chakming wong, James Dyer via shalin) * SOLR-4978: Time is stripped from datetime column when imported into Solr date field if convertType=true. (Bill Au, shalin) * SOLR-5019: spurious ConcurrentModificationException when spell check component was in use with filters. (yonik) * SOLR-5018: The Overseer should avoid publishing the state for collections that do not exist under the /collections zk node. (Mark Miller) * SOLR-5028, SOLR-5029: ShardHandlerFactory was not being created properly when using new-style solr.xml, and was not being persisted properly when using old-style. (Tomás Fernández Löbbe, Ryan Ernst, Alan Woodward) * SOLR-4997: The splitshard api doesn't call commit on new sub shards before switching shard states. Multiple bugs related to sub shard recovery and replication are also fixed.
(shalin) * SOLR-5034: A facet.query that parses or analyzes down to a null Query would throw a NPE. Fixed. (David Smiley) * SOLR-5039: Admin/Schema Browser displays -1 for term counts for multiValued fields. * SOLR-5037: The CSV loader now accepts field names that are not in the schema. (gsingers, ehatcher, Steve Rowe) * SOLR-4791: solr.xml sharedLib does not work in 4.3.0 (Ryan Ernst, Jan Høydahl via Erick Erickson) Optimizations ---------------------- * SOLR-4923: Commit to all nodes in a collection in parallel rather than locally and then to all other nodes. (hossman, Ricardo Merizalde, Mark Miller) * SOLR-3838: Admin UI - Multiple filter queries are not supported in Query UI (steffkes) * SOLR-4719 : Admin UI - Default to wt=json on Query-Screen (steffkes) * SOLR-4611: Admin UI - Analysis-Urls with empty parameters create empty result table (steffkes) * SOLR-4955: Admin UI - Show address bar on top for Schema + Config (steffkes) * SOLR-4412: New parameter langid.lcmap to map detected language code to be placed in "language" field (janhoy) * SOLR-4815: Admin-UI - DIH: Let "commit" be checked by default (steffkes) * SOLR-5002: optimize numDocs(Query,DocSet) when filterCache is null (Robert Muir) * SOLR-5012: optimize search with filter when filterCache is null (Robert Muir) Other Changes ---------------------- * SOLR-4737: Update Guava to 14.0.1 (Mark Miller) * SOLR-2079: Add option to pass HttpServletRequest in the SolrQueryRequest context map. (Tomás Fernández Löbbe via Robert Muir) * SOLR-4738: Update Jetty to 8.1.10.v20130312 (Mark Miller, Robert Muir) * SOLR-4749: Clean up and refactor CoreContainer code around solr.xml and SolrCore management. (Mark Miller) * SOLR-4547: Move logging of filenames on commit from INFO to DEBUG. (Shawn Heisey, hossman) * SOLR-4757: Change the example to use the new solr.xml format and core discovery by directory structure. (Mark Miller) * SOLR-4759: Velocity (/browse) template cosmetic cleanup. 
(Mark Bennett, ehatcher) * SOLR-4778: LogWatcher init code moved out of CoreContainer (Alan Woodward) * SOLR-4784: Make class LuceneQParser public (janhoy) * SOLR-4448: Allow the solr internal load balancer to be more easily pluggable. (Philip Hoy via Robert Muir) * SOLR-4224: Refactor JavaBinCodec input stream definition to enhance reuse. (phunt via Mark Miller) * SOLR-4931: SolrDeletionPolicy onInit and onCommit methods changed to override exact signatures (with generics) from IndexDeletionPolicy (shalin) * SOLR-4942: test improvements to randomize use of compound files (hossman) * SOLR-4966: CSS, JS and other files in webapp without license (uschindler, steffkes) * SOLR-4986: Upgrade to Tika 1.4 (Markus Jelsma via janhoy) * SOLR-4948, SOLR-5009: Tidied up CoreContainer construction logic. (Alan Woodward, Uwe Schindler, Steve Rowe) * LUCENE-5107: Properties files by Solr are now written in UTF-8 encoding, Unicode is no longer escaped. Reading of legacy properties files with \u escapes is still possible. (Uwe Schindler, Robert Muir) ================== 4.3.1 ================== Versions of Major Components --------------------- Apache Tika 1.3 Carrot2 3.6.2 Velocity 1.7 and Velocity Tools 2.0 Apache UIMA 2.3.1 Apache ZooKeeper 3.4.5 Detailed Change List ---------------------- Bug Fixes ---------------------- * SOLR-4795: Sub shard leader should not accept any updates from parent after it goes active (shalin) * SOLR-4798: shard splitting does not respect the router for the collection when executing the index split. One effect of this is that documents may be placed in the wrong shard when the default compositeId router is used in conjunction with IDs containing "!". (yonik) * SOLR-4797: Shard splitting creates sub shards which have the wrong hash range in cluster state. This happens when numShards is not a power of two and router is compositeId. 
(shalin) * SOLR-4806: Shard splitting does not abort if WaitForState times out (shalin) * SOLR-4807: The zkcli script now works with log4j. The zkcli.bat script was broken on Windows in 4.3.0, now it works. (Shawn Heisey) * SOLR-4813: Fix SynonymFilterFactory to allow init parameters for tokenizer factory used when parsing synonyms file. (Shingo Sasaki, hossman) * SOLR-4829: Fix transaction log leaks (a failure to clean up some old logs) on a shard leader, or when unexpected exceptions are thrown during log recovery. (Steven Bower, Mark Miller, yonik) * SOLR-4751: Fix replication problem of files in sub directory of conf directory. (Minoru Osuka via Koji) * SOLR-4741: Deleting a collection should set DELETE_DATA_DIR to true. (Mark Miller) * SOLR-4752: There are some minor bugs in the Collections API parameter validation. (Mark Miller) * SOLR-4563: RSS DIH-example not working (janhoy) * SOLR-4796: zkcli.sh should honor JAVA_HOME (Roman Shaposhnik via Mark Miller) * SOLR-4734: Leader election fails with an NPE if there is no UpdateLog. (Mark Miller, Alexander Eibner) * SOLR-4868: Setting the log level for the log4j root category results in adding a new category, the empty string. (Shawn Heisey) * SOLR-4855: DistributedUpdateProcessor doesn't check for peer sync requests (shalin) * SOLR-4867: Admin UI - setting loglevel on root throws RangeError (steffkes) * SOLR-4870: RecentUpdates.update() does not increment numUpdates loop counter (Alexey Kudinov via shalin) * SOLR-4877, LUCENE-5023: Removed SolrIndexSearcher#getDocSetNC()'s special case for handling TermQuery to prevent NullPointerException if reader does not have fields. (Bao Yang Yang, Uwe Schindler) * SOLR-4881: Fix DocumentAnalysisRequestHandler to correctly use EmptyEntityResolver to prevent loading of external entities like UpdateRequestHandler does. (Hossman, Uwe Schindler) * SOLR-4858: SolrCore reloading was broken when the UpdateLog was enabled. 
(Hossman, Anshum Gupta, Alexey Serba, Mark Miller, yonik) * SOLR-4853: Fixed SolrJettyTestBase so it may be reused by end users (hossman) * SOLR-4744: Update failure on sub shard is not propagated to clients by parent shard (Anshum Gupta, yonik, shalin) Other Changes ---------------------- * SOLR-4760: Include core name in logs when loading schema. (Shawn Heisey) ================== 4.3.0 ================== Versions of Major Components --------------------- Apache Tika 1.3 Carrot2 3.6.2 Velocity 1.7 and Velocity Tools 2.0 Apache UIMA 2.3.1 Apache ZooKeeper 3.4.5 Upgrading from Solr 4.2.0 ---------------------- * In the schema REST API, the output path for copyFields and dynamicFields has been changed from all lowercase "copyfields" and "dynamicfields" to camelCase "copyFields" and "dynamicFields", respectively, to align with all other schema REST API outputs, which use camelCase. The URL format remains the same: all resource names are lowercase. See SOLR-4623 for details. * Slf4j/logging jars are no longer included in the Solr webapp. All logging jars are now in example/lib/ext. Changing logging impls is now as easy as updating the jars in this folder with those necessary for the logging impl you would like. If you are using another webapp container, these jars will need to go in the corresponding location for that container. In conjunction, the dist-excl-slf4j and dist-war-excl-slf4j build targets have been removed since they are redundant. See the Slf4j documentation, SOLR-3706, and SOLR-4651 for more details. * The hardcoded SolrCloud defaults for 'hostContext="solr"' and 'hostPort="8983"' have been deprecated and will be removed in Solr 5.0. Existing solr.xml files that do not have these options explicitly specified should be updated accordingly. See SOLR-4622 for more details.
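As a rough illustration of the schema REST API note above (camelCase "copyFields" and "dynamicFields" in 4.3.0 output, all lowercase in 4.2), a client that has to read responses from both versions could look a section up under either spelling. The sketch below is hypothetical client code, not part of Solr or SolrJ: the class name SchemaResponseKeys, the lookup method, and the assumption that the response has already been parsed into a java.util.Map are all illustrative.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/**
 * Hypothetical client-side helper (not a Solr or SolrJ class).
 * Assumes the schema REST API response was already parsed into a Map.
 */
class SchemaResponseKeys {

    /**
     * Returns the value stored under the camelCase key (4.3.0 output),
     * falling back to the all-lowercase key (4.2 output).
     * Returns null if neither key is present.
     */
    static Object lookup(Map<String, ?> response, String camelCaseKey) {
        if (response.containsKey(camelCaseKey)) {
            return response.get(camelCaseKey);
        }
        // Locale.ROOT keeps the lowercasing independent of the JVM's default locale.
        return response.get(camelCaseKey.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        // Stand-ins for parsed responses; the values are placeholders.
        Map<String, Object> fromSolr42 = new HashMap<>();
        fromSolr42.put("copyfields", "copy field list from a 4.2 server");

        Map<String, Object> fromSolr43 = new HashMap<>();
        fromSolr43.put("copyFields", "copy field list from a 4.3 server");

        // The caller always asks for the camelCase name.
        System.out.println(lookup(fromSolr42, "copyFields"));
        System.out.println(lookup(fromSolr43, "copyFields"));
    }
}
```

Only the output keys need this treatment. The request URLs are unchanged and stay all lowercase, so the fallback can be dropped once no 4.2 servers remain.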
Detailed Change List ---------------------- New Features ---------------------- * SOLR-4648: PreAnalyzedUpdateProcessorFactory allows using the functionality of PreAnalyzedField with other field types. See javadoc for details and examples. (Andrzej Bialecki) * SOLR-4623: Provide REST API read access to all elements of the live schema. Add a REST API request to return the entire live schema, in JSON, XML, and schema.xml formats. Move REST API methods from package org.apache.solr.rest to org.apache.solr.rest.schema, and rename base functionality REST API classes to remove the current schema focus, to prepare for other non-schema REST APIs. Change output path for copyFields and dynamicFields from "copyfields" and "dynamicfields" (all lowercase) to "copyFields" and "dynamicFields", respectively, to align with all other REST API outputs, which use camelCase. (Steve Rowe) * SOLR-4658: In preparation for REST API requests that can modify the schema, a "managed schema" is introduced. Add '<schemaFactory class="ManagedIndexSchemaFactory" mutable="true"/>' to solrconfig.xml in order to use it, and to enable schema modifications via REST API requests. (Steve Rowe, Robert Muir) * SOLR-4656: Added two new highlight parameters, hl.maxMultiValuedToMatch and hl.maxMultiValuedToExamine. maxMultiValuedToMatch stops looking for snippets after finding the specified number of matches, no matter how far into the multivalued field you've gone. maxMultiValuedToExamine stops looking for matches after the specified number of multiValued entries have been examined. If both are specified, the limit hit first stops the loop. Also this patch cuts down on the copying of the document entries during highlighting. These optimizations are probably unnoticeable unless there are a large number of entries in the multiValued field. Note that this will prevent the "best" match from being found if it appears later in the MV list than the cutoff specified by either of these params.
(Erick Erickson) * SOLR-4675: Improve PostingsSolrHighlighter to support per-field/query-time overrides and add additional configuration parameters. See the javadocs for more details and examples. (Robert Muir) * SOLR-3755: A new collections api to add additional shards dynamically by splitting existing shards. (yonik, Anshum Gupta, shalin) * SOLR-4530: DIH: Provide configuration to use Tika's IdentityHtmlMapper (Alexandre Rafalovitch via shalin) * SOLR-4662: Discover SolrCores by directory structure rather than defining them in solr.xml. Also, change the format of solr.xml to be closer to that of solrconfig.xml. This version of Solr will ship the example in the old style, but you can manually try the new style. Solr 4.4 will ship with the new style, and Solr 5.0 will remove support for the old style. (Erick Erickson, Mark Miller) Additional Work: - SOLR-4347: Ensure that newly-created cores via Admin handler are persisted in solr.xml (Erick Erickson) - SOLR-1905: Cores created by the admin request handler should be persisted to solr.xml. Also fixed a problem whereby properties like solr.solr.datadir would be persisted to solr.xml. Also, cores that didn't happen to be loaded were not persisted. (Erick Erickson) * SOLR-4717/SOLR-1351: SimpleFacets now work with localParams allowing faceting on the same field multiple ways (ryan, Uri Boness) * SOLR-4671: CSVResponseWriter now supports pseudo fields. (ryan, nihed mbarek) * SOLR-4358: HttpSolrServer sends the stream name and exposes 'useMultiPartPost' (Karl Wright via ryan) Bug Fixes ---------------------- * SOLR-4543: setting shardHandlerFactory in solr.xml/solr.properties does not work. (Ryan Ernst, Robert Muir via Erick Erickson) * SOLR-4634: Fix scripting engine tests to work with Java 8's "Nashorn" Javascript implementation. (Uwe Schindler) * SOLR-4636: If opening a reader fails for some reason when opening a SolrIndexSearcher, a Directory can be left unreleased. 
(Mark Miller) * SOLR-4405: Admin UI - admin-extra files are not rendered into the core-menu (steffkes) * SOLR-3956: Fixed group.facet=true to work with negative facet.limit (Chris van der Merwe, hossman) * SOLR-4650: copyField doesn't work with source globs that don't match any explicit or dynamic fields. This regression was introduced in Solr 4.2. (Daniel Collins, Steve Rowe) * SOLR-4641: Schema now throws exception on illegal field parameters. (Robert Muir) * SOLR-3758: Fixed SpellCheckComponent to work consistently with distributed grouping (James Dyer) * SOLR-4652: Fix broken behavior with shared libraries in resource loader for solr.xml plugins. (Ryan Ernst, Robert Muir, Uwe Schindler) * SOLR-4664: ZkStateReader should update aliases on construction. (Mark Miller, Elodie Sannier) * SOLR-4682: CoreAdminRequest.mergeIndexes can not merge multiple cores or indexDirs. (Jason.D.Cao via shalin) * SOLR-4581: When faceting on numeric fields in Solr 4.2, negative values (constraints) were sorted incorrectly. (Alexander Buhr, shalin, yonik) * SOLR-4699: The System admin handler should not assume a file system based data directory location. (Mark Miller) * SOLR-4695: Fix core admin SPLIT action to be useful with non-cloud setups (shalin) * SOLR-4680: Correct example spellcheck configuration's queryAnalyzerFieldType and use "text" field instead of narrower "name" field (ehatcher, Mark Bennett) * SOLR-4702: Fix example /browse "Did you mean?" suggestion feature. (ehatcher, Mark Bennett) * SOLR-4710: You cannot delete a collection fully from ZooKeeper unless all nodes are up and functioning correctly. 
(Mark Miller) * SOLR-4487: SolrExceptions thrown by HttpSolrServer will now contain the proper HTTP status code returned by the remote server, even if that status code is not something Solr itself returned -- eg: from the Servlet Container, or an intermediate HTTP Proxy (hossman) * SOLR-4661: Admin UI Replication details now correctly displays the current replicable generation/version of the master. (hossman) * SOLR-4716,SOLR-4584: SolrCloud request proxying does not work on Tomcat and perhaps other non Jetty containers. (Po Rui, Yago Riveiro via Mark Miller) * SOLR-4746: Distributed grouping used a NamedList instead of a SimpleOrderedMap for the top level group commands, causing output formatting differences compared to non-distributed grouping. (yonik) * SOLR-4705: Fixed bug causing NPE when querying a single replica in SolrCloud using the shards param (Raintung Li, hossman) * SOLR-4729: LukeRequestHandler: Using a dynamic copyField source that is not also a dynamic field triggers error message 'undefined field: "(glob)"'. (Adam Hahn, hossman, Steve Rowe) Optimizations ---------------------- Other Changes ---------------------- * SOLR-4653: Solr configuration should log inaccessible/ non-existent relative paths in lib dir=... (Dawid Weiss) * SOLR-4317: SolrTestCaseJ4: Can't avoid "collection1" convention (Tricia Jenkins, via Erick Erickson) * SOLR-4571: SolrZkClient#setData should return Stat object. (Mark Miller) * SOLR-4603: CachingDirectoryFactory should use an IdentityHashMap for byDirectoryCache. (Mark Miller) * SOLR-4544: Refactor HttpShardHandlerFactory so load-balancing logic can be customized. (Ryan Ernst via Robert Muir) * SOLR-4607: Use noggit 0.5 release jar rather than a forked copy. (Yonik Seeley, Robert Muir) * SOLR-3706: Ship setup to log with log4j. (ryan, Mark Miller) * SOLR-4651: Remove dist-excl-slf4j build target. 
(Shawn Heisey) * SOLR-4622: The hardcoded SolrCloud defaults for 'hostContext="solr"' and 'hostPort="8983"' have been deprecated and will be removed in Solr 5.0. Existing solr.xml files that do not have these options explicitly specified should be updated accordingly. (hossman) * SOLR-4672: Requests attempting to use SolrCores which had init failures (that would be reported by CoreAdmin STATUS requests) now result in 500 error responses with the details about the init failure, instead of 404 error responses. (hossman) * SOLR-4730: Make the wiki link more prominent in the release documentation. (Uri Laserson via Robert Muir) ================== 4.2.1 ================== Versions of Major Components --------------------- Apache Tika 1.3 Carrot2 3.6.2 Velocity 1.7 and Velocity Tools 2.0 Apache UIMA 2.3.1 Apache ZooKeeper 3.4.5 Detailed Change List ---------------------- Bug Fixes ---------------------- * SOLR-4567: copyField source glob matching explicit field(s) stopped working in Solr 4.2. (Alexandre Rafalovitch, Steve Rowe) * SOLR-4475: Fix various places that still assume File based paths even when not using a file based DirectoryFactory. (Mark Miller) * SOLR-4551: CachingDirectoryFactory needs to create CacheEntry's with the fullpath not path. (Mark Miller) * SOLR-4555: When forceNew is used with CachingDirectoryFactory#get, the old CacheValue should give up its path as it will be used by a new Directory instance. (Mark Miller) * SOLR-4578: CoreAdminHandler#handleCreateAction gets a SolrCore and does not close it in SolrCloud mode when a core with the same name already exists. (Mark Miller) * SOLR-4574: The Collections API will silently return success on an unknown ACTION parameter. (Mark Miller) * SOLR-4576: Collections API validation errors should cause an exception on clients and otherwise act as validation errors with the Core Admin API. 
(Mark Miller) * SOLR-4577: The collections API should return responses (success or failure) for each node it attempts to work with. (Mark Miller) * SOLR-4568: The lastPublished state check before becoming a leader is not working correctly. (Mark Miller) * SOLR-4570: Even if an explicit shard id is used, ZkController#preRegister should still wait to see the shard id in its current ClusterState. (Mark Miller) * SOLR-4585: The Collections API validates numShards with < 0 but should use <= 0. (Mark Miller) * SOLR-4592: DefaultSolrCoreState#doRecovery needs to check the CoreContainer shutdown flag inside the recoveryLock sync block. (Mark Miller) * SOLR-4595: CachingDirectoryFactory#close can throw a concurrent modification exception. (Mark Miller) * SOLR-4573: Accessing Admin UI files in SolrCloud mode logs warnings. (Mark Miller, Phil John) * SOLR-4594: StandardDirectoryFactory#remove accesses byDirectoryCache without a lock. (Mark Miller) * SOLR-4597: CachingDirectoryFactory#remove should not attempt to empty/remove the index right away but flag for removal after close. (Mark Miller) * SOLR-4598: The Core Admin unload command's option 'deleteDataDir', should use the DirectoryFactory API to remove the data dir. (Mark Miller) * SOLR-4599: CachingDirectoryFactory calls close(Directory) on forceNew if the Directory has a refCnt of 0, but it should call closeDirectory(CacheValue). (Mark Miller) * SOLR-4602: ZkController#unregister should cancel its election participation before asking the Overseer to delete the SolrCore information. (Mark Miller) * SOLR-4601: A Collection that is only partially created and then deleted will leave pre allocated shard information in ZooKeeper. (Mark Miller) * SOLR-4604: UpdateLog#init is over called on SolrCore#reload. (Mark Miller) * SOLR-4605: Rollback does not work correctly. (Mark S, Mark Miller) * SOLR-4609: The Collections API should only send the reload command to ACTIVE cores. 
(Mark Miller) * SOLR-4297: Atomic update request containing null=true sets all subsequent fields to null (Ben Pennell, Rob, shalin) * SOLR-4371: Admin UI - Analysis Screen shows empty result (steffkes) * SOLR-4318: NPE encountered with querying with wildcards on a field that uses the DefaultAnalyzer (i.e. no analysis chain defined). (Erick Erickson) * SOLR-4361: DataImportHandler would throw UnsupportedOperationException if handler-level parameters were specified containing periods in the name (James Dyer) * SOLR-4538: Date Math expressions were being truncated to 32 characters when used in field:value queries in the lucene QParser. (hossman, yonik) * SOLR-4617: SolrCore#reload needs to pass the deletion policy to the next SolrCore through its constructor rather than setting a field after. (Mark Miller) * SOLR-4589: Fixed CPU spikes and poor performance in lazy field loading of multivalued fields. (hossman) * SOLR-4608: Update Log replay and PeerSync replay should use the default processor chain to update the index. (Ludovic Boutros, yonik) * SOLR-4625: The solr (lucene syntax) query parser lost top-level boost values and top-level phrase slops on queries produced by nested sub-parsers. (yonik) * SOLR-4624: CachingDirectoryFactory does not need to support forceNew any longer and it appears to be causing a missing close directory bug. forceNew is no longer respected and will be removed in 4.3. (Mark Miller) * SOLR-3819: Grouped faceting (group.facet=true) did not respect filter exclusions. (Petter Remen, yonik) * SOLR-4637: Replication can sometimes wait until shutdown or core unload until removing some tmp directories. (Mark Miller) * SOLR-4638: DefaultSolrCoreState#getIndexWriter(null) is a way to avoid creating the IndexWriter earlier than necessary, but it's not implemented quite right. (Mark Miller) * SOLR-4640: CachingDirectoryFactory can fail to close directories in some race conditions. 
(Mark Miller) * SOLR-4642: QueryResultKey is not calculating the correct hashCode for filters. (Joel Bernstein via Mark Miller) Optimizations ---------------------- * SOLR-4569: waitForReplicasToComeUp should bail right away if it doesn't see the expected slice in the clusterstate rather than waiting. (Mark Miller) * SOLR-4311: Admin UI - Optimize Caching Behaviour (steffkes) Other Changes ---------------------- * SOLR-4537: Clean up schema information REST API. (Steve Rowe) * SOLR-4596: DistributedQueue should ensure its full path exists in the constructor. (Mark Miller) ================== 4.2.0 ================== Versions of Major Components --------------------- Apache Tika 1.3 Carrot2 3.6.2 Velocity 1.7 and Velocity Tools 2.0 Apache UIMA 2.3.1 Apache ZooKeeper 3.4.5 Upgrading from Solr 4.1.0 ---------------------- (No upgrade instructions yet) Detailed Change List ---------------------- New Features ---------------------- * SOLR-4043: Add ability to get success/failure responses from Collections API. (Raintung Li, Mark Miller) * SOLR-2827: RegexpBoost Update Processor (janhoy) * SOLR-4370: Allow configuring commitWithin to do hard commits. (Mark Miller, Senthuran Sivananthan) * SOLR-4451: SolrJ, and SolrCloud internals, now use SystemDefaultHttpClient under the covers -- allowing many HTTP connection related properties to be controlled via 'standard' java system properties. (hossman) * SOLR-3855, SOLR-4490: Doc values support. (Adrien Grand, Robert Muir) * SOLR-4417: Reopen the IndexWriter on SolrCore reload. (Mark Miller) * SOLR-4477: Add support for queries (match-only) against docvalues fields. (Robert Muir) * SOLR-4488: Return slave replication details for a master if the master has also acted like a slave. (Mark Miller) * SOLR-4498: Add list command to ZkCLI that prints out the contents of ZooKeeper. 
(Roman Shaposhnik via Mark Miller) * SOLR-4481: SwitchQParserPlugin registered by default as 'switch' using syntax: {!switch case=XXX case.foo=YYY case.bar=ZZZ default=QQQ}foo (hossman) * SOLR-4078: Allow custom naming of SolrCloud nodes so that a new host:port combination can take over for a previous shard. (Mark Miller) * SOLR-4210: Requests to a Collection that does not exist on the receiving node should be proxied to a suitable node. (Mark Miller, Po Rui, yonik) * SOLR-1365: New SweetSpotSimilarityFactory allows customizable TF/IDF based Similarity when you know the optimal "Sweet Spot" of values for the field length and TF scoring factors. (hossman) * SOLR-4138: CurrencyField fields can now be used in ValueSources to get the "raw" value (using the default number of fractional digits) in the default currency of the field type. There is also a new currency(field,[CODE]) function for generating a ValueSource of the "natural" value, converted to an optionally specified currency to override the default for the field type. (hossman) * SOLR-4503: Add REST API methods, via Restlet integration, for reading schema elements, at /schema/fields/, /schema/dynamicfields/, /schema/fieldtypes/, and /schema/copyfields/. (Steve Rowe) Bug Fixes ---------------------- * SOLR-2850: Do not refine facets when minCount == 1 (Matt Smith, lundgren via Adrien Grand) * SOLR-4309: /browse: Improve JQuery autosuggest behavior (janhoy) * SOLR-4330: group.sort is ignored when using group.truncate and ex/tag local params together (koji) * SOLR-4321: Collections API will sometimes use a node more than once, even when more unused nodes are available. (Eric Falcao, Brett Hoerner, Mark Miller) * SOLR-4345: Solr Admin UI doesn't work in IE 10 (steffkes) * SOLR-4349: Admin UI - Query Interface does not work in IE (steffkes) * SOLR-4359: The RecentUpdates#update method should treat a problem reading the next record the same as a problem parsing the record - log the exception and break.
(Mark Miller) * SOLR-4225: Term info page under schema browser shows incorrect count of terms (steffkes) * SOLR-3926: Solr should support better way of finding active sorts (Eirik Lygre via Erick Erickson) * SOLR-4342: Fix DataImportHandler stats to be a proper Map (hossman) * SOLR-3967: langid.enforceSchema option checks source field instead of target field (janhoy) * SOLR-4380: Replicate after startup option would not replicate until the IndexWriter was lazily opened. (Mark Miller, Gregg Donovan) * SOLR-4400: Deadlock can occur in a rare race between committing and closing a SolrIndexWriter. (Erick Erickson, Mark Miller) * SOLR-3655: A restarted node can briefly appear live and active before it really is in some cases. (Mark Miller) * SOLR-4426: NRTCachingDirectoryFactory does not initialize maxCachedMB and maxMergeSizeMB if <directoryFactory> is not present in solrconfig.xml (Jack Krupansky via shalin) * SOLR-4463: Fix SolrCoreState reference counting. (Mark Miller) * SOLR-4459: The Replication 'index move' rather than copy optimization doesn't kick in when using NRTCachingDirectory or the rate limiting feature. (Mark Miller) * SOLR-4421,SOLR-4165: On CoreContainer shutdown, all SolrCores should publish their state as DOWN. (Mark Miller, Markus Jelsma) * SOLR-4467: Ephemeral directory implementations may not recover correctly because the code to clear the tlog files on startup is off. (Mark Miller) * SOLR-4413: Fix SolrCore#getIndexDir() to return the current index directory. (Gregg Donovan, Mark Miller) * SOLR-4469: A new IndexWriter must be opened on SolrCore reload when the index directory has changed and the previous SolrCore's state should not be propagated. (Mark Miller, Gregg Donovan) * SOLR-4471: Replication occurs even when a slave is already up to date. (Mark Miller, Andre Charton) * SOLR-4484: ReplicationHandler#loadReplicationProperties still uses Files rather than the Directory to try and read the replication properties files.
(Mark Miller) * SOLR-4352: /browse pagination now supports and preserves sort context (Eric Spiegelberg, Erik Hatcher) * LUCENE-4796, SOLR-4373: Fix concurrency issue in NamedSPILoader and AnalysisSPILoader when doing concurrent core loads in multicore Solr configs. (Uwe Schindler, Hossman) * SOLR-4504: Fixed CurrencyField range queries to correctly exclude documents w/o values (hossman) * SOLR-4480: A trailing + or - caused the edismax parser to throw an exception. (Fiona Tay, Jan Høydahl, yonik) * SOLR-4507: The Cloud tab does not show up in the Admin UI if you set zkHost in solr.xml. (Alfonso Presa, Mark Miller) * SOLR-4505: Possible deadlock around SolrCoreState update lock. (Erick Erickson, Mark Miller) * SOLR-4511: When a new index is replicated into place, we need to update the most recent replicatable index point without doing a commit. This is important for repeater use cases, as well as when nodes may switch master/slave roles. (Mark Miller, Raúl Grande) * SOLR-4515: CurrencyField's OpenExchangeRatesOrgProvider now requires a ratesFileLocation init param, since the previous global default no longer works (hossman) * SOLR-4518: Improved CurrencyField error messages when attempting to use a Currency that is not supported by the current JVM. (hossman) * SOLR-3798: Fix copyField implementation in IndexSchema to handle dynamic field references that aren't string-equal to the name of the referenced dynamic field. (Steve Rowe) * SOLR-4497: Collection Aliasing. 
(Mark Miller) Optimizations ---------------------- * SOLR-4339: Admin UI - Display Field-Flags on Schema-Browser (steffkes) * SOLR-4340: Admin UI - Analysis's Button Spinner goes wild (steffkes) * SOLR-4341: Admin UI - Plugins/Stats Page contains loooong Values which result in horizontal Scrollbar (steffkes) * SOLR-3915: Color Legend for Cloud UI (steffkes) * SOLR-4306: Utilize indexInfo=false when gathering core names in UI (steffkes) * SOLR-4284: Admin UI - make core list scrollable separate from the rest of the UI (steffkes) * SOLR-4364: Admin UI - Locale based number formatting (steffkes) * SOLR-4521: Stop using the 'force' option for recovery replication. This will keep some less common unnecessary replications from happening. (Mark Miller, Simon Scofield) * SOLR-4529: Improve Admin UI Dashboard legibility (Felix Buenemann via steffkes) * SOLR-4526: Admin UI depends on optional system info (Felix Buenemann via steffkes) Other Changes ---------------------- * SOLR-4259: Carrot2 dependency should be declared on the mini version, not the core. (Dawid Weiss). * SOLR-4348: Make the lock type configurable by system property by default. (Mark Miller) * SOLR-4353: Renamed example jetty context file to reduce confusion (hossman) * SOLR-4384: Make post.jar report timing information (Upayavira via janhoy) * SOLR-4415: Add 'state' to shards (default to 'active') and read/write them to ZooKeeper (Anshum Gupta via shalin) * SOLR-4394: Tests and example configs demonstrating SSL with both server and client certs (hossman) * SOLR-3060: SurroundQParserPlugin highlighting tests (Ahmet Arslan via hossman) * SOLR-2470: Added more tests for VelocityResponseWriter * SOLR-4471: Improve and clean up TestReplicationHandler. (Amit Nithian via Mark Miller) * SOLR-3843: Include lucene codecs jar and enable per-field postings and docvalues support in the schema.xml (Robert Muir, Steve Rowe) * SOLR-4511: Add new test for 'repeater' replication node. 
(Mark Miller) * SOLR-4458: Sort directions (asc, desc) are now case insensitive (Shawn Heisey via hossman) * SOLR-2996: A bare * without a field specification is treated as *:* by the lucene and edismax query parsers. (hossman, Jan Høydahl, Alan Woodward, yonik) * SOLR-4416: Upgrade to Tika 1.3. (Markus Jelsma via Mark Miller) * SOLR-4200: Reduce INFO level logging from CachingDirectoryFactory (Shawn Heisey via hossman) ================== 4.1.0 ================== Versions of Major Components --------------------- Apache Tika 1.2 Carrot2 3.6.2 Velocity 1.7 and Velocity Tools 2.0 Apache UIMA 2.3.1 Apache ZooKeeper 3.4.5 Upgrading from Solr 4.0.0 ---------------------- Custom java parsing plugins need to migrate from throwing the internal ParseException to throwing SyntaxError. BaseDistributedSearchTestCase now randomizes the servlet context it uses when creating Jetty instances. Subclasses that assume a hard coded context of "/solr" should either be fixed to use the "String context" variable, or should take advantage of the new BaseDistributedSearchTestCase(String) constructor to explicitly specify a fixed servlet context path. See SOLR-4136 for details. Detailed Change List ---------------------- New Features ---------------------- * SOLR-2255: Enhanced pivot faceting to use local-params in the same way that regular field value faceting can. This means support for excluding a filter query, using a different output key, and specifying 'threads' to do facet.method=fcs concurrently. PivotFacetHelper now extends SimpleFacet and the getFacetImplementation() extension hook was removed. (dsmiley) * SOLR-3897: A highlighter parameter "hl.preserveMulti" to return all of the values of a multiValued field in their original order when highlighting. (Joel Bernstein via yonik) * SOLR-3929: Support configuring IndexWriter max thread count in solrconfig. 
(phunt via Mark Miller) * SOLR-3906: Add support for AnalyzingSuggester (LUCENE-3842), where the underlying analyzed form used for suggestions is separate from the returned text. (Robert Muir) * SOLR-3985: ExternalFileField caches can be reloaded on firstSearcher/newSearcher events using the ExternalFileFieldReloader (Alan Woodward) * SOLR-3911: Make Directory and DirectoryFactory first class so that the majority of Solr's features work with any custom implementations. (Mark Miller) Additional Work: - SOLR-4032: Files larger than an internal buffer size fail to replicate. (Mark Miller, Markus Jelsma) - SOLR-4033: Consistently use the solrconfig.xml lockType everywhere. (Mark Miller, Markus Jelsma) - SOLR-4144: Replication using too much RAM. (yonik, Markus Jelsma) - SOLR-4187: NPE on Directory release (Mark Miller, Markus Jelsma) * SOLR-4051: Add <propertyWriter /> element to DIH's data-config.xml file, allowing the user to specify the location, filename and Locale for the "data-config.properties" file. Alternatively, users can specify their own property writer implementation for greater control. This new configuration element is optional, and defaults mimic prior behavior. The one exception is that the "root" locale is the default. Previously it was the machine's default locale. (James Dyer) * SOLR-4084: Add FuzzyLookupFactory, which is like AnalyzingSuggester except that it can tolerate typos in the input. (Areek Zillur via Robert Muir) * SOLR-4088: New and improved auto host detection strategy for SolrCloud. (Raintung Li via Mark Miller) * SOLR-3970: SystemInfoHandler now exposes more details about the JRE/VM/Java version in use. (hossman) * SOLR-4101: Add support for storing term offsets in the index via a 'storeOffsetsWithPositions' flag on field definitions in the schema. (Tom Winch, Alan Woodward) * SOLR-4093: Solr QParsers may now be directly invoked in the lucene query syntax without the _query_ magic field hack.
Example: foo AND {!term f=myfield v=$qq} (yonik) * SOLR-4087: Add MAX_DOC_FREQ option to MoreLikeThis. (Andrew Janowczyk via Mark Miller) * SOLR-4114: Allow creating more than one shard per instance with the Collection API. (Per Steffensen, Mark Miller) * SOLR-3531: Allowing configuring maxMergeSizeMB and maxCachedMB when using NRTCachingDirectoryFactory. (Andy Laird via Mark Miller) * SOLR-4118: Fix replicationFactor to align with industry usage. replicationFactor now means the total number of copies of a document stored in the collection (or the total number of physical indexes for a single logical slice of the collection). For example if replicationFactor=3 then for a given shard there will be a total of 3 replicas (one of which will normally be designated as the leader.) (yonik) * SOLR-4124: You should be able to set the update log directory with the CoreAdmin API the same way as the data directory. (Mark Miller) * SOLR-4028: When using ZK chroot, it would be nice if Solr would create the initial path when it doesn't exist. (Tomás Fernández Löbbe via Mark Miller) * SOLR-3948: Calculate/display deleted documents in admin interface. (Shawn Heisey via Mark Miller) * SOLR-4030: Allow rate limiting Directory IO based on the IO context. (Mark Miller, Radim Kolar) * SOLR-4166: LBHttpSolrServer ignores ResponseParser passed in constructor. (Steve Molloy via Mark Miller) * SOLR-4140: Allow access to the collections API through CloudSolrServer without referencing an existing collection. (Per Steffensen via Mark Miller) * SOLR-788: Distributed search support for MLT. (Matthew Woytowitz, Mike Anderson, Jamie Johnson, Mark Miller) * SOLR-4120: Collection API: Support for specifying a list of Solr addresses to spread a new collection across. (Per Steffensen via Mark Miller) * SOLR-4110: Configurable Content-Type headers for PHPResponseWriters and PHPSerializedResponseWriter. 
(Dominik Siebel via Mark Miller) * SOLR-1028: The ability to specify "transient" and "loadOnStartup" as new properties of <core> tags in solr.xml. Can specify "transientCacheSize" in the <cores> tag. Together these allow cores to be loaded only when needed and only transientCacheSize transient cores will be loaded at a time, the rest aged out on an LRU basis. * SOLR-4246: When update.distrib is set to skip update processors before the distributed update processor, always include the log update processor so forwarded updates will still be logged. (yonik) * SOLR-4230: The new Solr 4 spatial fields now work with the {!geofilt} and {!bbox} query parsers. The score local-param works too. (David Smiley) * SOLR-1972: Add extra statistics to RequestHandlers - 5 & 15-minute reqs/sec rolling averages; median, 75th, 95th, 99th, 99.9th percentile request times (Alan Woodward, Shawn Heisey, Adrien Grand, Uwe Schindler) * SOLR-4271: Add support for PostingsHighlighter. (Robert Muir) * SOLR-4255: The new Solr 4 spatial fields now have a 'filter' boolean local-param that can be set to false to not filter. It's useful when there is already a spatial filter query but you also need to sort or boost by distance. (David Smiley) * SOLR-4265, SOLR-4283: Solr now parses request parameters (in URL or sent with POST using content-type application/x-www-form-urlencoded) in its dispatcher code. It no longer relies on special configuration settings in Tomcat or other web containers to enable UTF-8 encoding, which is mandatory for correct Solr behaviour. Query strings passed in via the URL need to be properly-%-escaped, UTF-8 encoded bytes, otherwise Solr refuses to handle the request. The maximum length of x-www-form-urlencoded POST parameters can now be configured through the requestDispatcher/requestParsers/@formdataUploadLimitInKB setting in solrconfig.xml (defaults to 2 MiB). Solr now works out of the box with e.g. Tomcat, JBoss,...
(Uwe Schindler, Dawid Weiss, Alex Rocher) * SOLR-2201: DIH's "formatDate" function now supports a timezone as an optional fourth parameter (James Dyer, Mark Waddle) * SOLR-4302: New parameter 'indexInfo' (defaults to true) in CoreAdmin STATUS command can be used to omit index specific information (Shahar Davidson via shalin) * SOLR-2592: Collection specific document routing. The "compositeId" router is the default for collections with hash based routing (i.e. when numShards=N is specified on collection creation). Documents with ids sharing the same domain (prefix) will be routed to the same shard, allowing for efficient querying. Example: The following two documents will be indexed to the same shard since they share the same domain "customerB!". {"id" : "customerB!doc1" [...] } {"id" : "customerB!doc2" [...] } At query time, one can specify a "shard.keys" parameter that lists what shards the query should cover. http://.../query?q=my_query&shard.keys=customerB! Collections that do not specify numShards at collection creation time use custom sharding and default to the "implicit" router. Document updates received by a shard will be indexed to that shard, unless a "_shard_" parameter or document field names a different shard. (Michael Garski, Dan Rosher, yonik) Optimizations ---------------------- * SOLR-3788: Admin Cores UI should redirect to newly created core details (steffkes) * SOLR-3895: XML and XSLT UpdateRequestHandler should not try to resolve external entities. This improves speed of loading e.g. XSL-transformed XHTML documents. (Martin Herfurt, uschindler, hossman) * SOLR-3614: Fix XML parsing in XPathEntityProcessor to correctly expand named entities, but ignore external entities. (uschindler, hossman) * SOLR-3734: Improve Schema-Browser Handling for CopyField using dynamicField's (steffkes) * SOLR-3941: The "commitOnLeader" part of distributed recovery can use openSearcher=false. 
(Tomás Fernández Löbbe via Mark Miller) * SOLR-4063: Allow CoreContainer to load multiple SolrCores in parallel rather than just serially. (Mark Miller) * SOLR-4199: When doing zk retries due to connection loss, rather than just retrying for 2 minutes, retry in proportion to the session timeout. (Mark Miller) * SOLR-4262: Replication Icon on Dashboard does not reflect Master-/Slave- State (steffkes) * SOLR-4264: Missing Error-Screen on UI's Cloud-Page (steffkes) * SOLR-4261: Percentage Infos on Dashboard have a fixed width (steffkes) * SOLR-3851: create a new core/delete an existing core should also update the main/left list of cores on the admin UI (steffkes) * SOLR-3840: XML query response display is unreadable in Solr Admin Query UI (steffkes) * SOLR-3982: Admin UI: Various Dataimport Improvements (steffkes) * SOLR-4296: Admin UI: Improve Dataimport Auto-Refresh (steffkes) * SOLR-3458: Allow multiple Items to stay open on Plugins-Page (steffkes) Bug Fixes ---------------------- * SOLR-4288: Improve logging for FileDataSource (basePath, relative resources). (Dawid Weiss) * SOLR-4007: Morfologik dictionaries not available in Solr field type due to class loader lookup problems. (Lance Norskog, Dawid Weiss) * SOLR-3560: Handle different types of Exception Messages for Logging UI (steffkes) * SOLR-3637: Commit Status at Core-Admin UI is always false (steffkes) * SOLR-3917: Partial State on Schema-Browser UI is not defined for Dynamic Fields & Types (steffkes) * SOLR-3939: Consider a sync attempt from leader to replica that fails due to 404 a success. (Mark Miller, Joel Bernstein) * SOLR-3940: Rejoining the leader election incorrectly triggers the code path for a fresh cluster start rather than fail over. (Mark Miller) * SOLR-3961: Fixed error using LimitTokenCountFilterFactory (Jack Krupansky, hossman) * SOLR-3933: Distributed commits are not guaranteed to be ordered within a request. 
(Mark Miller) * SOLR-3939: An empty or just replicated index cannot become the leader of a shard after a leader goes down. (Joel Bernstein, yonik, Mark Miller) * SOLR-3971: A collection that is created with numShards=1 turns into a numShards=2 collection after starting up a second core and not specifying numShards. (Mark Miller) * SOLR-3988: Fixed SolrTestCaseJ4.adoc(SolrInputDocument) to respect field and document boosts (hossman) * SOLR-3981: Fixed bug that resulted in document boosts being compounded in destination fields. (hossman) * SOLR-3920: Fix server list caching in CloudSolrServer when using more than one collection list with the same instance. (Grzegorz Sobczyk, Mark Miller) * SOLR-3938: prepareCommit command omits commitData causing a failure to trigger replication to slaves. (yonik) * SOLR-3992: QuerySenderListener doesn't populate document cache. (Shotaro Kamio, yonik) * SOLR-3995: Recovery may never finish on SolrCore shutdown if the last reference to a SolrCore is closed by the recovery process. (Mark Miller) * SOLR-3998: Atomic update on uniqueKey field itself causes duplicate document. (Eric Spencer, yonik) * SOLR-4001: In CachingDirectoryFactory#close, if there are still refs for a Directory outstanding, we need to wait for them to be released before closing. (Mark Miller) * SOLR-4005: If CoreContainer fails to register a created core, it should close it. (Mark Miller) * SOLR-4009: OverseerCollectionProcessor is not resilient to many error conditions and can stop running on errors. (Raintung Li, milesli, Mark Miller) * SOLR-4019: Log stack traces for 503/Service Unavailable SolrException if not thrown by PingRequestHandler. Do not log exceptions if a user tries to view a hidden file using ShowFileRequestHandler. (Tomás Fernández Löbbe via James Dyer) * SOLR-3589: Edismax parser does not honor mm parameter if analyzer splits a token. 
(Tom Burton-West, Robert Muir) * SOLR-4031: Upgrade to Jetty 8.1.7 to fix a bug where on very rare occasions the content of two concurrent requests gets mixed up. (Per Steffensen, yonik) * SOLR-4060: ReplicationHandler can try and do a snappull and open a new IndexWriter after shutdown has already occurred, leaving an IndexWriter that is not closed. (Mark Miller) * SOLR-4055: Fix a thread safety issue with the Collections API that could cause actions to be targeted at the wrong SolrCores. (Raintung Li, Per Steffensen via Mark Miller) * SOLR-3993: If multiple SolrCores for a shard coexist on a node, on cluster restart, leader election would stall until timeout, waiting to see all of the replicas come up. (Mark Miller, Alexey Kudinov) * SOLR-2045: Databases that require a commit to be issued before closing the connection on a non-read-only database leak connections. Also expanded the SqlEntityProcessor test to sometimes use Derby as well as HSQLDB (Derby is one db affected by this bug). (Fenlor Sebastia, James Dyer) * SOLR-4064: When there is an unexpected exception while trying to run the new leader process, the SolrCore will not correctly rejoin the election. (Po Rui via Mark Miller) * SOLR-3989: SolrZkClient constructor dropped exception cause when throwing a new RuntimeException. (Colin Bartolome, yonik) * SOLR-4036: field aliases in fl should not cause properties of target field to be used. (Martin Koch, yonik) * SOLR-4003: The SolrZKClient clean method should not try and clear zk paths that start with /zookeeper, as this can fail and stop the removal of further nodes. (Mark Miller) * SOLR-4076: SolrQueryParser should run fuzzy terms through MultiTermAwareComponents to ensure that (for example) a fuzzy query of foobar~2 is equivalent to FooBar~2 on a field that includes lowercasing.
(yonik) * SOLR-4081: QueryParsing.toString, used during debugQuery=true, did not correctly handle ExtendedQueries such as WrappedQuery (used when cache=false), spatial queries, and frange queries. (Eirik Lygre, yonik) * SOLR-3959: Ensure the internal comma separator of poly fields is escaped for CSVResponseWriter. (Areek Zillur via Robert Muir) * SOLR-4075: A logical shard that has had all of its SolrCores unloaded should be removed from the cluster state. (Mark Miller, Gilles Comeau) * SOLR-4034: Check if a collection already exists before trying to create a new one. (Po Rui, Mark Miller) * SOLR-4097: Race can cause NPE in logging line on first cluster state update. (Mark Miller) * SOLR-4099: Allow the collection api work queue to make forward progress even when its watcher is not fired for some reason. (Raintung Li via Mark Miller) * SOLR-3960: Fixed a bug where Distributed Grouping ignored PostFilters (Nathan Visagan, hossman) * SOLR-3842: DIH would not populate multivalued fields if the column name derives from a resolved variable (James Dyer) * SOLR-4117: Retrieving the size of the index may use the wrong index dir if you are replicating. (Mark Miller, Markus Jelsma) * SOLR-2890: Fixed a bug that prevented omitNorms and omitTermFreqAndPositions options from being respected in some declarations (hossman) * SOLR-4159: When we are starting a shard from rest, a potential leader should not consider its last published state when deciding if it can be the new leader. (Mark Miller) * SOLR-4158: When a core is registering in ZooKeeper it may not wait long enough to find the leader due to how long the potential leader waits to see replicas. (Mark Miller, Alain Rogister) * SOLR-4162: ZkCli usage examples are not correct because the zkhost parameter is not present and it is mandatory for all commands. 
(Tomás Fernández Löbbe via Mark Miller) * SOLR-4071: Validate that a name is passed to Collections API create, and behave the same way as on startup when collection.configName is not explicitly passed. (Po Rui, Mark Miller) * SOLR-4127: Added explicit error message if users attempt Atomic document updates with either updateLog or DistribUpdateProcessor. (hossman) * SOLR-4136: Fix SolrCloud behavior when using "hostContext" containing "_" or "/" characters. This fix also makes SolrCloud more accepting of hostContext values with leading/trailing slashes. (hossman) * SOLR-4168: Ensure we are using the absolute latest index dir when getting list of files for replication. (Mark Miller) * SOLR-4171: CachingDirectoryFactory should not return any directories after it has been closed. (Mark Miller) * SOLR-4102: Fix UI javascript error if canonical hostname can not be resolved (steffkes via hossman) * SOLR-4178: ReplicationHandler should abort any current pulls and wait for its executor to stop during core close. (Mark Miller) * SOLR-3918: Fixed the 'dist-war-excl-slf4j' ant target to exclude all slf4j jars, so that the resulting war is usable as is provided the servlet container includes the correct slf4j api and impl jars. (Shawn Heisey, hossman) * SOLR-4198: OverseerCollectionProcessor should implement ClosableThread. (Mark Miller) * SOLR-4213: Directories that are not shutdown until DirectoryFactory#close do not have close listeners called on them. (Mark Miller) * SOLR-4134: Standard (XML) request writer cannot "set" multiple values into multivalued field with partial updates. (Luis Cappa Banda, Will Butler, shalin) * SOLR-3972: Fix ShowFileRequestHandler to not log a warning in the (expected) situation of a file not found. (hossman) * SOLR-4133: Cannot "set" field to null with partial updates when using the standard RequestWriter. (Will Butler, shalin) * SOLR-4223: "maxFormContentSize" in jetty.xml is not picked up by jetty 8 so set it via solr webapp context file.
(shalin) * SOLR-4175: SearchComponent chain can't contain two components of the same class and use debugQuery. (Tomás Fernández Löbbe via ehatcher) * SOLR-4244: When coming back from session expiration we should not wait for the leader to see us in the down state if we are the node that must become the leader. (Mark Miller) * SOLR-4245: When a core is registering with ZooKeeper, the timeout to find the leader in the cluster state is 30 seconds rather than leaderVoteWait + extra time. (Mark Miller) * SOLR-4238: Fix jetty example requestLog config (jm via hossman) * SOLR-4251: Fix SynonymFilterFactory when an optional tokenizerFactory is supplied. (Chris Bleakley via rmuir) * SOLR-4253: Misleading resource loading warning from Carrot2 clustering component fixed (Stanisław Osiński) * SOLR-4257: PeerSync updates and Log Replay updates should not wait for a ZooKeeper connection in order to proceed. (yonik) * SOLR-4045: SOLR admin page returns HTTP 404 on core names containing a '.' (dot) (steffkes) * SOLR-4176: analysis ui: javascript not properly handling URL decoding of input (steffkes) * SOLR-4079: Long core names break web gui appearance and functionality (steffkes) * SOLR-4263: Incorrect Link from Schema-Browser to Query Form for Top-Terms (steffkes) * SOLR-3829: Admin UI Logging events broken if schema.xml defines a catch-all dynamicField with type ignored (steffkes) * SOLR-4275: Fix TrieTokenizer to no longer throw StringIndexOutOfBoundsException in admin UI / AnalysisRequestHandler when you enter no number to tokenize. (Uwe Schindler) * SOLR-4279: Wrong exception message if _version_ field is multivalued (shalin) * SOLR-4170: The 'backup' ReplicationHandler command can sometimes use a stale index directory rather than the current one.
(Mark Miller, Marcin Rzewucki) * SOLR-3876: Solr Admin UI is completely dysfunctional on IE 9 (steffkes) * SOLR-4112: Fixed DataImportHandler ZKAwarePropertiesWriter implementation so import works fine with SolrCloud clusters (Deniz Durmus, James Dyer, Erick Erickson, shalin) * SOLR-4291: Harden the Overseer work queue thread loop. (Mark Miller) * SOLR-3820: Solr Admin Query form is missing some edismax request parameters (steffkes) * SOLR-4217: post.jar no longer ignores -Dparams when -Durl is used. (Alexandre Rafalovitch, ehatcher) * SOLR-4303: On replication, if the generation of the master is lower than the slave we need to force a full copy of the index. (Mark Miller, Gregg Donovan) * SOLR-4266: HttpSolrServer does not release connection properly on exception when no response parser is used. (Steve Molloy via Mark Miller) * SOLR-2298: Updated JavaDoc for SolrDocument.addField and SolrInputDocument.addField to have more information on name and value parameters. (Siva Natarajan) Other Changes ---------------------- * SOLR-4106: Javac/ ivy path warnings with morfologik fixed by upgrading to Morfologik 1.5.5 (Robert Muir, Dawid Weiss) * SOLR-3899: SolrCore should not log at warning level when the index directory changes - it's an info event. (Tobias Bergman, Mark Miller) * SOLR-3861: Refactor SolrCoreState so that it's managed by SolrCore. (Mark Miller, hossman) * SOLR-3966: Eliminate superfluous warning from LanguageIdentifierUpdateProcessor (Markus Jelsma via hossman) * SOLR-3932: SolrCmdDistributorTest either takes 3 seconds or 3 minutes. (yonik, Mark Miller) * SOLR-3856: New tests for SqlEntityProcessor/CachedSqlEntityProcessor (James Dyer) * SOLR-4067: ZkStateReader#getLeaderProps should not return props for a leader that it does not think is live. (Mark Miller) * SOLR-4086: DIH refactor of VariableResolver and Evaluator. VariableResolver and each built-in Evaluator are separate concrete classes. DateFormatEvaluator now defaults with the ROOT Locale. 
However, users may specify a different Locale using an optional new third parameter. (James Dyer) * SOLR-3602: Update ZooKeeper to 3.4.5 (Mark Miller) * SOLR-4095: DIH NumberFormatTransformer & DateFormatTransformer default to the ROOT Locale if none is specified. These previously used the machine's default. (James Dyer) * SOLR-4096: DIH FileDataSource & FieldReaderDataSource default to UTF-8 encoding if none is specified. These previously used the machine's default. (James Dyer) * SOLR-1916: DIH to not use Lucene-forbidden Java APIs (default encoding, locale, etc.) (James Dyer, Robert Muir) * SOLR-4111: SpellCheckCollatorTest#testContextSensitiveCollate to test against both DirectSolrSpellChecker & IndexBasedSpellChecker (Tomás Fernández Löbbe via James Dyer) * SOLR-2141: Better test coverage for Evaluators (James Dyer) * SOLR-4119: Update Guava to 13.0.1 (Mark Miller) * SOLR-4074: Raise default ramBufferSizeMB to 100 from 32. (yonik, Mark Miller) * SOLR-4062: The update log location in solrconfig.xml should default to ${solr.ulog.dir} rather than ${solr.data.dir:} (Mark Miller) * SOLR-4155: Upgrade Jetty to 8.1.8. (Robert Muir) * SOLR-2986: Add MoreLikeThis to warning about features that require uniqueKey. Also, change the warning to warn log level. (Shawn Heisey via Mark Miller) * SOLR-4163: README improvements (Shawn Heisey via hossman) * SOLR-4248: "ant eclipse" should declare .svn directories as derived. (Shawn Heisey via Mark Miller) * SOLR-3279: Upgrade Carrot2 to 3.6.2 (Stanisław Osiński) * SOLR-4254: Harden the 'leader requests replica to recover' code path. (Mark Miller, yonik) * SOLR-4226: Extract fl parsing code out of ReturnFields constructor. (Ryan Ernst via Robert Muir) * SOLR-4208: ExtendedDismaxQParserPlugin has been refactored to make subclassing easier. 
(Tomás Fernández Löbbe, hossman) * SOLR-3735: Relocate the example mime-to-extension mapping, and upgrade Velocity Engine to 1.7 (ehatcher) * SOLR-4287: Removed "apache-" prefix from Solr distribution and artifact filenames. (Ryan Ernst, Robert Muir, Steve Rowe) * SOLR-4016: Deduplication does not work with atomic/partial updates so disallow atomic update requests which change signature generating fields. (Joel Nothman, yonik, shalin) * SOLR-4308: Remove the problematic and now unnecessary log4j-over-slf4j. (Mark Miller) ================== 4.0.0 ================== Versions of Major Components --------------------- Apache Tika 1.2 Carrot2 3.5.0 Velocity 1.6.4 and Velocity Tools 2.0 Apache UIMA 2.3.1 Apache ZooKeeper 3.3.6 Upgrading from Solr 4.0.0-BETA ---------------------- In order to better support distributed search mode, the TermVectorComponent's response format has been changed so that if the schema defines a uniqueKeyField, then that field value is used as the "key" for each document in its response section, instead of the internal lucene doc id. Users w/o a uniqueKeyField will continue to see the same response format. See SOLR-3229 for more details. If you are using SolrCloud's distributed update request capabilities and a non string type id field, you must re-index. Upgrading from Solr 4.0.0-ALPHA ---------------------- Solr is now much more strict about requiring that the uniqueKeyField feature (if used) must refer to a field which is not multiValued. If you upgrade from an earlier version of Solr and see an error that your uniqueKeyField "can not be configured to be multivalued" please add 'multiValued="false"' to the declaration for your uniqueKeyField. See SOLR-3682 for more details. In addition, please review the notes above about upgrading from 4.0.0-BETA Upgrading from Solr 3.6 ---------------------- * The Lucene index format has changed and as a result, once you upgrade, previous versions of Solr will no longer be able to read your indices. 
In a master/slave configuration, all searchers/slaves should be upgraded before the master. If the master were to be updated first, the older searchers would not be able to read the new index format. * Setting abortOnConfigurationError=false is no longer supported (since it has never worked properly). Solr will now warn you if you attempt to set this configuration option at all. (see SOLR-1846) * The default logic for the 'mm' param of the 'dismax' QParser has been changed. If no 'mm' param is specified (either in the query, or as a default in solrconfig.xml) then the effective value of the 'q.op' param (either in the query or as a default in solrconfig.xml or from the 'defaultOperator' option in schema.xml) is used to influence the behavior. If q.op is effectively "AND" then mm=100%. If q.op is effectively "OR" then mm=0%. Users who wish to force the legacy behavior should set a default value for the 'mm' param in their solrconfig.xml file. * The VelocityResponseWriter is no longer built into the core. Its JAR and dependencies now need to be added (via <lib> or solr/home lib inclusion), and it needs to be registered in solrconfig.xml like this: <queryResponseWriter name="velocity" class="solr.VelocityResponseWriter"/> * The update request parameter to choose Update Request Processor Chain is renamed from "update.processor" to "update.chain". The old parameter had been deprecated (but still worked) since Solr 3.2, and is now removed entirely. * The <indexDefaults> and <mainIndex> sections of solrconfig.xml are discontinued and replaced with the <indexConfig> section. There are also better defaults. When migrating, if you don't know what your old settings mean, simply delete both the <indexDefaults> and <mainIndex> sections. If you have customizations, put them in the <indexConfig> section, with the same syntax as before. * Two of the SolrServer subclasses in SolrJ were renamed/replaced. CommonsHttpSolrServer is now HttpSolrServer, and StreamingUpdateSolrServer is now ConcurrentUpdateSolrServer. * The PingRequestHandler no longer looks for a <healthcheck/> option in the (legacy) <admin> section of solrconfig.xml. 
Users who wish to take advantage of this feature should configure a "healthcheckFile" init param directly on the PingRequestHandler. As part of this change, relative file paths have been fixed to be resolved against the data dir. See the example solrconfig.xml and SOLR-1258 for more details. * Due to low level changes to support SolrCloud, the uniqueKey field can no longer be populated via <copyField/> or <field default=...> in the schema.xml. Users wishing to have Solr automatically generate a uniqueKey value when adding documents should instead use an instance of solr.UUIDUpdateProcessorFactory in their update processor chain. See SOLR-2796 for more details. In addition, please review the notes above about upgrading from 4.0.0-BETA, and 4.0.0-ALPHA Detailed Change List ---------------------- New Features ---------------------- * SOLR-3670: New CountFieldValuesUpdateProcessorFactory makes it easy to index the number of values in another field for later use at query time. (hossman) * SOLR-2768: new "mod(x,y)" function for computing the modulus of two value sources. (hossman) * SOLR-3238: Numerous small improvements to the Admin UI (steffkes) * SOLR-3597: seems like a lot of wasted whitespace at the top of the admin screens (steffkes) * SOLR-3304: Added Solr adapters for Lucene 4's new spatial module. With SpatialRecursivePrefixTreeFieldType ("location_rpt" in example schema), it is possible to index a variable number of points per document (and sort on them), index not just points but any Spatial4j supported shape such as polygons, and to query on these shapes too. Polygon support requires adding JTS to the classpath. 
(David Smiley) * SOLR-3825: Added optional capability to log what ids are in a response (Scott Stults via gsingers) * SOLR-3821: Added 'df' to the UI Query form (steffkes) * SOLR-3822: Added hover titles to the edismax params on the UI Query form (steffkes) Optimizations ---------------------- * SOLR-3715: improve concurrency of the transaction log by removing synchronization around log record serialization. (yonik) * SOLR-3807: Currently during recovery we pause for a number of seconds after waiting for the leader to see a recovering state so that any previous updates will have finished before our commit on the leader - we don't need this wait for peersync. (Mark Miller) * SOLR-3837: When a leader is elected and asks replicas to sync back to it and that fails, we should ask those nodes to recover asynchronously rather than synchronously. (Mark Miller) * SOLR-3709: Cache the url list created from the ClusterState in CloudSolrServer on each request. (Mark Miller) Bug Fixes ---------------------- * SOLR-3685: Solr Cloud sometimes skipped peersync attempt and replicated instead due to tlog flags not being cleared when no updates were buffered during a previous replication. (Markus Jelsma, Mark Miller, yonik) * SOLR-3229: Fixed TermVectorComponent to work with distributed search (Hang Xie, hossman) * SOLR-3725: Fixed package-local-src-tgz target to not bring in unnecessary jars and binary contents. (Michael Dodsworth via rmuir) * SOLR-3649: Fixed bug in JavabinLoader that caused deleteById(List ids) to not work in SolrJ (siren) * SOLR-3730: Rollback is not implemented quite right and can cause corner case fails in SolrCloud tests. (rmuir, Mark Miller) * SOLR-2981: Fixed StatsComponent to no longer return duplicated information when requesting multiple stats.facet fields. 
(Roman Kliewer via hossman) * SOLR-3743: Fixed issues with atomic updates and optimistic concurrency in conjunction with stored copyField targets by making real-time get never return copyField targets. (yonik) * SOLR-3746: Proper error reporting if updateLog is configured w/o necessary "_version_" field in schema.xml (hossman) * SOLR-3745: Proper error reporting if SolrCloud mode is used w/o necessary "_version_" field in schema.xml (hossman) * SOLR-3770: Overseer may lose updates to cluster state (siren) * SOLR-3721: Fix bug that could theoretically allow multiple recoveries to run briefly at the same time if the recovery thread join call was interrupted. (Per Steffensen, Mark Miller) * SOLR-3782: A leader going down while updates are coming in can cause shard inconsistency. (Mark Miller) * SOLR-3611: We do not show ZooKeeper data in the UI for a node that has children. (Mark Miller) * SOLR-3789: Fix bug in SnapPuller that caused "internal" compression to fail. (siren) * SOLR-3790: ConcurrentModificationException could be thrown when using hl.fl=*. Fixed in r1231606. (yonik, koji) * SOLR-3668: DataImport : Specifying Custom Parameters (steffkes) * SOLR-3793: UnInvertedField faceting cached big terms in the filter cache that ignored deletions, leading to duplicate documents in search later when a filter of the same term was specified. (Günter Hipler, hossman, yonik) * SOLR-3679: Core Admin UI gives no feedback if "Add Core" fails (steffkes, hossman) * SOLR-3795: Fixed LukeRequestHandler response to correctly return field name strings in copyDests and copySources arrays (hossman) * SOLR-3699: Fixed some Directory leaks when there were errors during SolrCore or SolrIndexWriter initialization. 
(hossman) * SOLR-3518: Include final 'hits' in log information when aggregating a distributed request (Markus Jelsma via hossman) * SOLR-3628: SolrInputField and SolrInputDocument are now consistently backed by Collections passed in to setValue/setField, and defensively copy values from Collections passed to addValue/addField (Tom Switzer via hossman) * SOLR-3595: CurrencyField now generates an appropriate error on schema init if it is configured as multiValued - this has never been properly supported, but previously failed silently in odd ways. (hossman) * SOLR-3823: Fix 'bq' parsing in edismax. Please note that this required reverting the negative boost support added by SOLR-3278 (hossman) * SOLR-3827: Fix shareSchema=true in solr.xml (Tomás Fernández Löbbe via hossman) * SOLR-3809: Fixed config file replication when subdirectories are used (Emmanuel Espina via hossman) * SOLR-3828: Fixed QueryElevationComponent so that using 'markExcludes' does not modify the result set or ranking of 'excluded' documents relative to not using elevation at all. (Alexey Serba via hossman) * SOLR-3569: Fixed debug output on distributed requests when there are no results found. (David Bowen via hossman) * SOLR-3811: Query Form using wrong values for dismax, edismax (steffkes) * SOLR-3779: DataImportHandler's LineEntityProcessor when used in conjunction with FileListEntityProcessor would only process the first file. (Ahmet Arslan via James Dyer) * SOLR-3791: CachedSqlEntityProcessor would throw a NullPointerException when a query returns a row with a NULL key. (Steffen Moelter via James Dyer) * SOLR-3833: When an election is started because a leader went down, the new leader candidate should decline if the last state they published was not active. (yonik, Mark Miller) * SOLR-3836: When doing peer sync, we should only count sync attempts that cannot reach the given host as success when the candidate leader is syncing with the replicas - not when replicas are syncing to the leader. 
(Mark Miller) * SOLR-3835: In our leader election algorithm, if on connection loss we found we did not create our election node, we should retry, not throw an exception. (Mark Miller) * SOLR-3834: A new leader on cluster startup should also run the leader sync process in case there was a bad cluster shutdown. (Mark Miller) * SOLR-3772: On cluster startup, we should wait until we see all registered replicas before running the leader process - or if they all do not come up, N amount of time. (Mark Miller) * SOLR-3756: If we are elected the leader of a shard, but we fail to publish this for any reason, we should clean up and re trigger a leader election. (Mark Miller) * SOLR-3812: ConnectionLoss during recovery can cause lost updates, leading to shard inconsistency. (Mark Miller) * SOLR-3813: When a new leader syncs, we need to ask all shards to sync back, not just those that are active. (Mark Miller) * SOLR-3641: CoreContainer is not persisting roles core attribute. (hossman, Mark Miller) * SOLR-3527: SolrCmdDistributor drops some of the important commit attributes (maxOptimizeSegments, softCommit, expungeDeletes) when sending a commit to replicas. (Andy Laird, Tomás Fernández Löbbe, Mark Miller) * SOLR-3844: SolrCore reload can fail because it tries to remove the index write lock while already holding it. (Mark Miller) * SOLR-3831: Atomic updates do not distribute correctly to other nodes. (Jim Musil, Mark Miller) * SOLR-3465: Replication causes two searcher warmups. (Michael Garski, Mark Miller) * SOLR-3645: /terms should default to distrib=false. 
(Nick Cotton, Mark Miller) * SOLR-3759: Various fixes to the example-DIH configs (Ahmet Arslan, hossman) * SOLR-3777: Dataimport-UI does not send unchecked checkboxes (Glenn MacStravic via steffkes) * SOLR-3850: DataImportHandler "cacheKey" parameter was incorrectly renamed "cachePk" (James Dyer) * SOLR-3087: Fixed DOMUtil so that code doing attribute validation will automatically ignore nodes in the reserved "xml" prefix - in particular this fixes some bugs related to xinclude and fieldTypes. (Amit Nithian, hossman) * SOLR-3783: Fixed Pivot Faceting to work with facet.missing=true (hossman) * SOLR-3869: A PeerSync attempt to its replicas by a candidate leader should not fail on o.a.http.conn.ConnectTimeoutException. (Mark Miller) * SOLR-3875: Fixed index boosts on multi-valued fields when docBoost is used (hossman) * SOLR-3878: Exception when using open-ended range query with CurrencyField (janhoy) * SOLR-3891: CacheValue in CachingDirectoryFactory cannot be used outside of solr.core package. (phunt via Mark Miller) * SOLR-3892: Inconsistent locking when accessing cache in CachingDirectoryFactory from RAMDirectoryFactory and MockDirectoryFactory. (phunt via Mark Miller) * SOLR-3883: Distributed indexing forwards non-applicable request params. (Dan Sutton, Per Steffensen, yonik, Mark Miller) * SOLR-3903: Fixed MissingFormatArgumentException in ConcurrentUpdateSolrServer (hossman) * SOLR-3916: Fixed whitespace bug in parsing the fl param (hossman) Other Changes ---------------------- * SOLR-3690: Fixed binary release packages to include dependencies needed for the solr-test-framework (hossman) * SOLR-2857: The /update/json and /update/csv URLs were restored to aid in the migration of existing clients. 
(yonik) * SOLR-3691: SimplePostTool: Mode for crawling/posting web pages See http://wiki.apache.org/solr/ExtractingRequestHandler for examples (janhoy) * SOLR-3707: Upgrade Solr to Tika 1.2 (janhoy) * SOLR-2747: Updated changes2html.pl to handle Solr's CHANGES.txt; added target 'changes-to-html' to solr/build.xml. (Steve Rowe, Robert Muir) * SOLR-3752: When a leader goes down, have the Overseer clear the leader state in cluster.json (Mark Miller) * SOLR-3751: Add defensive checks for SolrCloud updates and requests that ensure the local state matches what we can tell the request expected. (Mark Miller) * SOLR-3773: Hash based on the external String id rather than the indexed representation for distributed updates. (Michael Garski, yonik, Mark Miller) * SOLR-3780: Maven build: Make solrj tests run separately from solr-core. (Steve Rowe) * SOLR-3772: Optionally, on cluster startup, we can wait until we see all registered replicas before running the leader process - or if they all do not come up, N amount of time. (Jan Høydahl, Per Steffensen, Mark Miller) * SOLR-3750: Optionally, on session expiration, we can explicitly wait some time before running the leader sync process so that we are sure every node participates. (Per Steffensen, Mark Miller) * SOLR-3824: Velocity: Error messages from search not displayed (janhoy) * SOLR-3826: Test framework improvements for specifying coreName on initCore (Amit Nithian, hossman) * SOLR-3749: Allow default UpdateLog syncLevel to be configured by solrconfig.xml (Raintung Li, Mark Miller) * SOLR-3845: Rename numReplicas to replicationFactor in Collections API. (yonik, Mark Miller) * SOLR-3815: SolrCloud - Add properties such as "range" to shards, which changes the clusterstate.json and puts the shard replicas under "replicas". (yonik) * SOLR-3871: SyncStrategy should use an executor for the threads it creates to request recoveries. (Mark Miller) * SOLR-3870: SyncStrategy should have a close so it can abort earlier on shutdown. 
(Mark Miller) ================== 4.0.0-BETA =================== Versions of Major Components --------------------- Apache Tika 1.1 Carrot2 3.5.0 Velocity 1.6.4 and Velocity Tools 2.0 Apache UIMA 2.3.1 Apache ZooKeeper 3.3.6 Upgrading from Solr 4.0.0-ALPHA ---------------------- Solr is now much more strict about requiring that the uniqueKeyField feature (if used) must refer to a field which is not multiValued. If you upgrade from an earlier version of Solr and see an error that your uniqueKeyField "can not be configured to be multivalued" please add 'multiValued="false"' to the declaration for your uniqueKeyField. See SOLR-3682 for more details. Detailed Change List ---------------------- New Features ---------------------- * LUCENE-4201: Added JapaneseIterationMarkCharFilterFactory to normalize Japanese iteration marks. (Robert Muir, Christian Moen) * SOLR-1856: In Solr Cell, literals should override Tika-parsed values. Patch adds a param "literalsOverride" which defaults to true, but can be set to "false" to let Tika-parsed values be appended to literal values (Chris Harris, janhoy) * SOLR-3488: Added a Collection management API for SolrCloud. (Tommaso Teofili, Sami Siren, yonik, Mark Miller) * SOLR-3559: Full deleteByQuery support with SolrCloud distributed indexing. All replicas of a shard will be consistent, even if updates arrive in a different order on different replicas. (yonik) * SOLR-1929: Index encrypted documents with ExtractingUpdateRequestHandler. By supplying resource.password= or specifying an external file with regular expressions matching file names, Solr will decrypt and index PDFs and DOCX formats. (janhoy, Yiannis Pericleous) * SOLR-3562: Add options to remove instance dir or data dir on core unload. (Mark Miller, Per Steffensen) * SOLR-2702: The default directory factory was changed to NRTCachingDirectoryFactory which wraps the StandardDirectoryFactory and caches small files for improved Near Real-time (NRT) performance. 
(Mark Miller, yonik) * SOLR-2616: Include a sample java util logging configuration file. (David Smiley, Mark Miller) * SOLR-3460: Add cloud-scripts directory and a zkcli.sh|bat tool for easy scripting and interaction with ZooKeeper. (Mark Miller) * SOLR-1725: StatelessScriptUpdateProcessorFactory allows users to implement the full ScriptUpdateProcessor API using any scripting language with a javax.script.ScriptEngineFactory (Uri Boness, ehatcher, Simon Rosenthal, hossman) * SOLR-139: Change to updateable documents to create the document if it doesn't already exist. To assert that the document must exist, use the optimistic concurrency feature by specifying a _version_ of 1. (yonik) * LUCENE-2510, LUCENE-4044: Migrated Solr's Tokenizer-, TokenFilter-, and CharFilterFactories to the lucene-analysis module. To add new analysis modules to Solr (like ICU, SmartChinese, Morfologik,...), just drop in the JAR files from Lucene's binary distribution into your Solr instance's lib folder. The factories are automatically made available with SPI. (Chris Male, Robert Muir, Uwe Schindler) * SOLR-3634, SOLR-3635: CoreContainer and CoreAdminHandler will now remember and report back information about failures to initialize SolrCores. These failures will be accessible from the web UI and CoreAdminHandler STATUS command until they are "reset" by creating/renaming a SolrCore with the same name. (hossman, steffkes) * SOLR-1280: Added commented-out example of the new script update processor to the example configuration. See http://wiki.apache.org/solr/ScriptUpdateProcessor (ehatcher) * SOLR-3672: SimplePostTool: Improvements for posting files Support for auto mode, recursive and wildcards (janhoy) Optimizations ---------------------- * SOLR-3708: Add hashCode to ClusterState so that structures built based on the ClusterState can be easily cached. (Mark Miller) * SOLR-3709: Cache the url list created from the ClusterState in CloudSolrServer on each request. 
(Mark Miller, yonik) * SOLR-3710: Change CloudSolrServer so that update requests are only sent to leaders by default. (Mark Miller) Bug Fixes ---------------------- * SOLR-3582: Our ZooKeeper watchers respond to session events as if they are change events, creating undesirable side effects. (Trym R. Møller, Mark Miller) * SOLR-3467: ExtendedDismax escaping is missing several reserved characters (Michael Dodsworth via janhoy) * SOLR-3587: After reloading a SolrCore, the original Analyzer is still used rather than a new one. (Alexey Serba, yonik, rmuir, Mark Miller) * LUCENE-4185: Fix a bug where CharFilters were wrongly being applied twice. (Michael Froh, rmuir) * SOLR-3610: After reloading a core, indexing would fail on any newly added fields to the schema. (Brent Mills, rmuir) * SOLR-3377: edismax fails to correctly parse a fielded query wrapped by parens. This regression was introduced in 3.6. (Bernd Fehling, Jan Høydahl, yonik) * SOLR-3621: Fix rare concurrency issue when opening a new IndexWriter for replication or rollback. (Mark Miller) * SOLR-1781: Replication index directories not always cleaned up. (Markus Jelsma, Terje Sten Bjerkseth, Mark Miller) * SOLR-3639: Update ZooKeeper to 3.3.6 for a variety of bug fixes. (Mark Miller) * SOLR-3629: Typo in solr.xml persistence when overriding the solrconfig.xml file name using the "config" attribute prevented the override file from being used. (Ryan Zezeski, hossman) * SOLR-3642: Correct broken check for multivalued fields in stats.facet (Yandong Yao, hossman) * SOLR-3660: Velocity: Link to admin page broken (janhoy) * SOLR-3658: Adding thousands of docs with one UpdateProcessorChain instance can briefly create spikes of threads in the thousands. (yonik, Mark Miller) * SOLR-3656: A core reload now always uses the same dataDir. (Mark Miller, yonik) * SOLR-3662: Core reload bugs: a reload always obtained a non-NRT searcher, which could go back in time with respect to the previous core's NRT searcher. 
Versioning did not work correctly across a core reload, and update handler synchronization was changed to synchronize on core state since more than one update handler can coexist for a single index during a reload. (yonik) * SOLR-3663: There are a couple of bugs in the sync process when a leader goes down and a new leader is elected. (Mark Miller) * SOLR-3623: Fixed inconsistent treatment of third-party dependencies for solr contribs analysis-extras & uima (hossman) * SOLR-3652: Fixed range faceting to error instead of looping infinitely when 'gap' is zero -- or effectively zero due to floating point arithmetic underflow. (hossman) * SOLR-3648: Fixed VelocityResponseWriter template loading in SolrCloud mode. For the example configuration, this means /browse now works with SolrCloud. (janhoy, ehatcher) * SOLR-3677: Fixed misleading error message in web ui to distinguish between no SolrCores loaded vs. no /admin/ handler available. (hossman, steffkes) * SOLR-3428: SolrCmdDistributor flushAdds/flushDeletes can cause repeated adds/deletes to be sent (Mark Miller, Per Steffensen) * SOLR-3647: DistributedQueue should use our Solr zk client rather than the std zk client. ZooKeeper expiration can be permanent otherwise. (Mark Miller) Other Changes ---------------------- * SOLR-3524: Make discarding punctuation configurable in JapaneseTokenizerFactory. The default is to discard punctuation, but this is overridable as an expert option. (Kazuaki Hiraga, Jun Ohtani via Christian Moen) * SOLR-1770: Move the default core instance directory into a collection1 folder. (Mark Miller) * SOLR-3355: Add shard and collection to SolrCore statistics. (Michael Garski, Mark Miller) * SOLR-3575: solr.xml should default to persist=true (Mark Miller) * SOLR-3563: Unloading all cores in a SolrCloud collection will now cause the removal of that collection's meta data from ZooKeeper. 
(Mark Miller, Per Steffensen) * SOLR-3599: Add zkClientTimeout to solr.xml so that it's obvious how to change it and so that you can change it with a system property. (Mark Miller) * SOLR-3609: Change Solr's expanded webapp directory to be at a consistent path called solr-webapp rather than a temporary directory. (Mark Miller) * SOLR-3600: Raise the default zkClientTimeout from 10 seconds to 15 seconds. (Mark Miller) * SOLR-3215: Clone SolrInputDocument when distrib indexing so that update processors after the distrib update process do not process the document twice. (Mark Miller) * SOLR-3683: Improved error handling if an <analyzer> contains both an explicit class attribute, as well as nested factories. (hossman) * SOLR-3682: Fail to parse schema.xml if uniqueKeyField is multivalued (hossman) * SOLR-2115: DIH no longer requires the "config" parameter to be specified in solrconfig.xml. Instead, the configuration is loaded and parsed with every import. This allows the use of a different configuration with each import, and makes correcting configuration errors simpler. Also, the configuration itself can be passed using the "dataConfig" parameter rather than using a file (this previously worked in debug mode only). When configuration errors are encountered, the error message is returned in XML format. (James Dyer) * SOLR-3439: Make SolrCell easier to use out of the box. Also improves "/browse" to display rich-text documents correctly, along with facets for author and content_type. With the new "content" field, highlighting of body is supported. See also SOLR-3672 for easier posting of a whole directory structure. (Jack Krupansky, janhoy) * SOLR-3579: SolrCloud view should default to the graph view rather than tree view. 
(steffkes, Mark Miller) ================== 4.0.0-ALPHA ================== More information about this release, including any errata related to the release notes, upgrade instructions, or other changes may be found online at: https://wiki.apache.org/solr/Solr4.0 Versions of Major Components --------------------- Apache Tika 1.1 Carrot2 3.5.0 Velocity 1.6.4 and Velocity Tools 2.0 Apache UIMA 2.3.1 Apache ZooKeeper 3.3.4 Upgrading from Solr 3.6-dev ---------------------- * The Lucene index format has changed and as a result, once you upgrade, previous versions of Solr will no longer be able to read your indices. In a master/slave configuration, all searchers/slaves should be upgraded before the master. If the master were to be updated first, the older searchers would not be able to read the new index format. * Setting abortOnConfigurationError=false is no longer supported (since it has never worked properly). Solr will now warn you if you attempt to set this configuration option at all. (see SOLR-1846) * The default logic for the 'mm' param of the 'dismax' QParser has been changed. If no 'mm' param is specified (either in the query, or as a default in solrconfig.xml) then the effective value of the 'q.op' param (either in the query or as a default in solrconfig.xml or from the 'defaultOperator' option in schema.xml) is used to influence the behavior. If q.op is effectively "AND" then mm=100%. If q.op is effectively "OR" then mm=0%. Users who wish to force the legacy behavior should set a default value for the 'mm' param in their solrconfig.xml file. * The VelocityResponseWriter is no longer built into the core. Its JAR and dependencies now need to be added (via <lib> or solr/home lib inclusion), and it needs to be registered in solrconfig.xml like this: <queryResponseWriter name="velocity" class="solr.VelocityResponseWriter"/> * The update request parameter to choose Update Request Processor Chain is renamed from "update.processor" to "update.chain". The old parameter had been deprecated (but still worked) since Solr 3.2, and is now removed entirely. 
* The <indexDefaults> and <mainIndex> sections of solrconfig.xml are discontinued and replaced with the <indexConfig> section. There are also better defaults. When migrating, if you don't know what your old settings mean, simply delete both the <indexDefaults> and <mainIndex> sections. If you have customizations, put them in the <indexConfig> section, with the same syntax as before. * Two of the SolrServer subclasses in SolrJ were renamed/replaced. CommonsHttpSolrServer is now HttpSolrServer, and StreamingUpdateSolrServer is now ConcurrentUpdateSolrServer. * The PingRequestHandler no longer looks for a <healthcheck/> option in the (legacy) <admin> section of solrconfig.xml. Users who wish to take advantage of this feature should configure a "healthcheckFile" init param directly on the PingRequestHandler. As part of this change, relative file paths have been fixed to be resolved against the data dir. See the example solrconfig.xml and SOLR-1258 for more details. * Due to low level changes to support SolrCloud, the uniqueKey field can no longer be populated via <copyField/> or <field default=...> in the schema.xml. Users wishing to have Solr automatically generate a uniqueKey value when adding documents should instead use an instance of solr.UUIDUpdateProcessorFactory in their update processor chain. See SOLR-2796 for more details. Detailed Change List ---------------------- New Features ---------------------- * SOLR-3272: Solr filter factory for MorfologikFilter (Polish lemmatisation). (Rafał Kuć via Dawid Weiss, Steven Rowe, Uwe Schindler). * SOLR-571: The autowarmCount for LRUCaches (LRUCache and FastLRUCache) now supports "percentages" which get evaluated relative to the current size of the cache when warming happens. (Tomás Fernández Löbbe and hossman) * SOLR-1932: New relevancy function queries: termfreq, tf, docfreq, idf, norm, maxdoc, numdocs. (yonik) * SOLR-1665: Add debug component options for timings, results and query info only (gsingers, hossman, yonik) * SOLR-2112: Solrj API now supports streaming results. 
(ryan) * SOLR-792: Adding PivotFacetComponent for Hierarchical faceting (ehatcher, Jeremy Hinegardner, Thibaut Lassalle, ryan) * LUCENE-2507, SOLR-2571, SOLR-2576: Added DirectSolrSpellChecker, which uses Lucene's DirectSpellChecker to retrieve correction candidates directly from the term dictionary using Levenshtein automata. (James Dyer, rmuir) * SOLR-1873, SOLR-2358: SolrCloud - added shared/central config and core/shard management via ZooKeeper, built-in load balancing, and distributed indexing. (Jamie Johnson, Sami Siren, Ted Dunning, yonik, Mark Miller) Additional Work: - SOLR-2324: SolrCloud solr.xml parameters are not persisted by CoreContainer. (Massimo Schiavon, Mark Miller) - SOLR-2287: Allow users to query by multiple, compatible collections with SolrCloud. (Soheb Mahmood, Alex Cowell, Mark Miller) - SOLR-2622: ShowFileRequestHandler does not work in SolrCloud mode. (Stefan Matheis, Mark Miller) - SOLR-3108: Error in SolrCloud's replica lookup code when replicas are hosted in the same Solr instance. (Bruno Dumon, Sami Siren, Mark Miller) - SOLR-3080: Remove shard info from ZooKeeper when SolrCore is explicitly unloaded. (yonik, Mark Miller, siren) - SOLR-3437: Recovery issues a spurious commit to the cluster. (Trym R. Møller via Mark Miller) - SOLR-2822: Skip update processors already run on other nodes (hossman) * SOLR-1566: Transforming documents in the ResponseWriters. This will allow for more complex results in responses and open the door for function queries as results. (ryan with patches from grant, noble, cmale, yonik, Jan Høydahl, Arul Kalaipandian, Luca Cavanna, hossman) - SOLR-2037: Thanks to SOLR-1566, documents boosted by the QueryElevationComponent can be marked as boosted. (gsingers, ryan, yonik) * SOLR-2396: Add CollationField, which is much more efficient than the Solr 3.x CollationKeyFilterFactory, and also supports Locale-sensitive range queries. 
  (rmuir)

* SOLR-2338: Add support for using <similarity/> in a schema's fieldType, for customizing scoring on a per-field basis. (hossman, yonik, rmuir)

* SOLR-2335: New 'field("...")' function syntax for referring to complex field names (containing whitespace or special characters) in functions.

* SOLR-2383: /browse improvements: generalize range and date facet display (Jan Høydahl via yonik)

* SOLR-2272: Pseudo-join queries / filters. Examples:
  - To restrict to the set of parents with at least one blue-eyed child:
    fq={!join from=parent to=name}eyes:blue
  - To restrict to the set of children with at least one blue-eyed parent:
    fq={!join from=name to=parent}eyes:blue
  (yonik)

* SOLR-1942: Added the ability to select postings format per fieldType in schema.xml as well as support custom Codecs in solrconfig.xml. (simonw via rmuir)

* SOLR-2136: Boolean type added to function queries, along with new functions exists(), if(), and(), or(), xor(), not(), def(), and true and false constants. (yonik)

* SOLR-2491: Add support for using spellcheck collation in conjunction with grouping. Note that the number of hits returned for collations is the number of ungrouped hits. (James Dyer via rmuir)

* SOLR-1298: Return FunctionQuery as pseudo field. The solr 'fl' param now supports functions. For example: fl=id,sum(x,y) -- NOTE: only functions with fast random access are recommended. (yonik, ryan)

* SOLR-705: Optionally return shard info with each document in distributed search. Use fl=id,[shard] to return the shard url. (ryan)

* SOLR-2417: Add explain info directly to return documents using ?fl=id,[explain] (ryan)

* SOLR-2533: Converted ValueSource.ValueSourceSortField over to new rewriteable Lucene SortFields. ValueSourceSortField instances must be rewritten before they can be used. This is done by SolrIndexSearcher when necessary. (Chris Male).

* SOLR-2193, SOLR-2565: You may now specify a 'soft' commit when committing.
  This will use Lucene's NRT feature to avoid guaranteeing documents are on stable storage in exchange for faster reopen times. There is also a new 'soft' autocommit tracker that can be configured. (Mark Miller, Robert Muir)

* SOLR-2399: Updated Solr Admin interface. New look and feel with per core administration and many new options. (Stefan Matheis via ryan)

* SOLR-1032: CSV handler now supports "literal.field_name=value" parameters. (Simon Rosenthal, ehatcher)

* SOLR-2656: realtime-get, efficiently retrieves the latest stored fields for specified documents, even if they are not yet searchable (i.e. without reopening a searcher) (yonik)

* SOLR-2703: Added support for Lucene's "surround" query parser. (Simon Rosenthal, ehatcher)

* SOLR-2754: Added factories for several ranking algorithms:
  - BM25SimilarityFactory: Okapi BM25
  - DFRSimilarityFactory: Divergence from Randomness models
  - IBSimilarityFactory: Information-based models
  - LMDirichletSimilarity: LM with Dirichlet smoothing
  - LMJelinekMercerSimilarity: LM with Jelinek-Mercer smoothing
  (David Mark Nemeskey, Robert Muir)

* SOLR-2134: Trie* fields should support sortMissingLast=true, and deprecate Sortable* Field Types (Ryan McKinley, Mike McCandless, Uwe Schindler, Erick Erickson)

* SOLR-2438: Added MultiTermAwareComponent to the various classes to allow automatic lowercasing for multiterm queries (wildcards, regex, prefix, range, etc). You can now optionally specify a "multiterm" analyzer in your schema.xml, but Solr should "do the right thing" if you don't specify one (Pete Sturge, Erick Erickson, mentoring from Seeley and Muir)

* SOLR-2481: Add support for commitWithin in DataImportHandler (Sami Siren via yonik)

* SOLR-2992: Add support for IndexWriter.prepareCommit() via prepareCommit=true on update URLs. (yonik)

* SOLR-2906: Added LFU cache options to Solr. (Shawn Heisey via Erick Erickson)

* SOLR-3069: Ability to add openSearcher=false to not open a searcher when doing a hard commit.
  commitWithin now only invokes a softCommit. (yonik)

* SOLR-2802: New FieldMutatingUpdateProcessor and Factory to simplify the development of UpdateProcessors that modify field values of documents as they are indexed. Also includes several useful new implementations:
  - RemoveBlankFieldUpdateProcessorFactory
  - TrimFieldUpdateProcessorFactory
  - HTMLStripFieldUpdateProcessorFactory
  - RegexReplaceProcessorFactory
  - FieldLengthUpdateProcessorFactory
  - ConcatFieldUpdateProcessorFactory
  - FirstFieldValueUpdateProcessorFactory
  - LastFieldValueUpdateProcessorFactory
  - MinFieldValueUpdateProcessorFactory
  - MaxFieldValueUpdateProcessorFactory
  - TruncateFieldUpdateProcessorFactory
  - IgnoreFieldUpdateProcessorFactory
  (hossman, janhoy)

* SOLR-3120: Optional post filtering for spatial queries bbox and geofilt for LatLonType. (yonik)

* SOLR-2459: Expose LogLevel selection with a RequestHandler rather than a servlet (Stefan Matheis, Upayavira, ryan)

* SOLR-3134: Include shard info in distributed response when shards.info=true (Russell Black, ryan)

* SOLR-2898: Support grouped faceting. (Martijn van Groningen)
  Additional Work:
  - SOLR-3406: Extended grouped faceting support to facet.query and facet.range parameters. (David Boychuck, Martijn van Groningen)

* SOLR-2949: QueryElevationComponent is now supported with distributed search. (Mark Miller, yonik)

* SOLR-3221: Added the ability to directly configure aspects of the concurrency and thread-pooling used within distributed search in solr. This allows for finer-grained control and can be tuned by end users to target their own specific requirements. This builds on the work of the HttpCommComponent and uses the same configuration block to configure the thread pool. The default configuration has the same behaviour as solr 3.5, favouring throughput over latency.
  More information can be found on the wiki (http://wiki.apache.org/solr/SolrConfigXml) (Greg Bowyer)

* SOLR-3278: Negative boost support to the Extended Dismax Query Parser Boost Query (bq). (James Dyer)

* SOLR-3255: OpenExchangeRates.Org Exchange Rate Provider for CurrencyField (janhoy)

* SOLR-3358: Logging events are captured and available from the /admin/logging request handler. (ryan)

* SOLR-1535: PreAnalyzedField type provides functionality to index (and optionally store) field content that was already processed and split into tokens using some external processing chain. Serialization format is pluggable, and defaults to JSON. (ab)

* SOLR-3363: Consolidated Exceptions in Analysis Factories so they only throw InitializationExceptions (Chris Male)

* SOLR-2690: New support for a "TZ" request param which overrides the TimeZone used when rounding Dates in DateMath expressions for the entire request (all date range queries and date faceting are affected). The default TZ is still UTC. (David Schlotfeldt, hossman)

* SOLR-3402: Analysis Factories are now configured with their Lucene Version through setLuceneMatchVersion, rather than through the Map passed to init. Parsing and simple error checking for the Version is now done inside the code that creates the Analysis Factories. (Chris Male)

* SOLR-3178: Optimistic locking. If a _version_ is provided with an update that does not match the version in the index, an HTTP 409 error (Conflict) will result. (Per Steffensen, yonik)

* SOLR-139: Updateable documents. JSON Example: {"id":"mydoc", "f1":{"set":10}, "f2":{"add":20}} will result in field "f1" being set to 10, "f2" having an additional value of 20 added, and all other existing fields unchanged. All source fields must be stored for this feature to work correctly. (Ryan McKinley, Erik Hatcher, yonik)

* SOLR-2857: Support XML,CSV,JSON, and javabin in a single RequestHandler and choose the correct ContentStreamLoader based on Content-Type header.
  This also deprecates the existing [Xml,JSON,CSV,Binary,Xslt]UpdateRequestHandler. (ryan)

* SOLR-2585: Context-Sensitive Spelling Suggestions & Collations. This adds support for the "spellcheck.alternativeTermCount" & "spellcheck.maxResultsForSuggest" parameters, letting users receive suggestions even when all the queried terms exist in the dictionary. This differs from "spellcheck.onlyMorePopular" in that the suggestions need not consist entirely of terms with a greater document frequency than the queried terms. (James Dyer)

* SOLR-2058: Edismax query parser to allow "phrase slop" to be specified per-field on the pf/pf2/pf3 parameters using optional "FieldName~slop^boost" syntax. The prior "FieldName^boost" syntax is still accepted. In such cases the value on the "ps" parameter serves as the default slop. (Ron Mayer via James Dyer)

* SOLR-3495: New UpdateProcessors have been added to create default values for configured fields. These work similarly to the <field default="..."/> option in schema.xml, but are applied in the UpdateProcessorChain, so they may be used prior to other UpdateProcessors, or to generate a uniqueKey field value when using the DistributedUpdateProcessor (ie: SolrCloud)
  - TimestampUpdateProcessorFactory
  - UUIDUpdateProcessorFactory
  - DefaultValueUpdateProcessorFactory
  (hossman)

* SOLR-2993: Add WordBreakSolrSpellChecker to offer suggestions by combining adjacent query terms and/or breaking terms into multiple words. This spellchecker can be configured with a traditional checker (ie: DirectSolrSpellChecker). The results are combined and collations can contain a mix of corrections from both spellcheckers. (James Dyer)

* SOLR-3508: Simplify JSON update format for deletes as well as allow version specification for optimistic locking. Examples:
  - {"delete":"myid"}
  - {"delete":["id1","id2","id3"]}
  - {"delete":{"id":"myid", "_version_":123456789}}
  (yonik)

* SOLR-3211: Allow parameter overrides in conjunction with "spellcheck.maxCollationTries".
  To do so, use parameters starting with "spellcheck.collateParam." For instance, to override the "mm" parameter, specify "spellcheck.collateParam.mm". This is helpful in cases where testing spellcheck collations for result counts should use different parameters from the main query (James Dyer)

* SOLR-2599: CloneFieldUpdateProcessorFactory provides similar functionality to schema.xml's <copyField/> declaration but as an update processor that can be combined with other processors in any order. (Jan Høydahl & hossman)

* SOLR-3351: eDismax: ps2 and ps3 params (janhoy)

* SOLR-3542: Add WeightedFragListBuilder for FVH and set it to default fragListBuilder in example solrconfig.xml. (Sebastian Lutze, koji)

* SOLR-2396: Add ICUCollationField to contrib/analysis-extras, which is much more efficient than the Solr 3.x ICUCollationKeyFilterFactory, and also supports Locale-sensitive range queries. (rmuir)

Optimizations
----------------------

* SOLR-1875: Per-segment field faceting for single valued string fields. Enable with facet.method=fcs, control the number of threads used with the "threads" local param on the facet.field param. This algorithm will only be faster in the presence of rapid index changes. (yonik)

* SOLR-1904: When facet.enum.cache.minDf > 0 and the base doc set is a SortedIntSet, convert to HashDocSet for better performance. (yonik)

* SOLR-2092: Speed up single-valued and multi-valued "fc" faceting. Typical improvement is 5%, but can be much greater (up to 10x faster) when facet.offset is very large (deep paging). (yonik)

* SOLR-2193, SOLR-2565: The default Solr update handler has been improved so that it uses fewer locks, keeps the IndexWriter open rather than closing it on each commit (ie commits no longer wait for background merges to complete), works with SolrCore to provide faster 'soft' commits, and has an improved API that requires less instanceof special casing.
  (Mark Miller, Robert Muir)
  Additional Work:
  - SOLR-2697: commit and autocommit operations don't reset DirectUpdateHandler2.numDocsPending stats attribute. (Alexey Serba, Mark Miller)

* SOLR-2950: The QueryElevationComponent now avoids using the FieldCache and looking up every document id (gsingers, yonik)

Bug Fixes
----------------------

* SOLR-3139: Make ConcurrentUpdateSolrServer send UpdateRequest.getParams() as HTTP request params (siren)

* SOLR-3165: Cannot use DIH in Solrcloud + Zookeeper (Alexey Serba, Mark Miller, siren)

* SOLR-3068: Occasional NPE in ThreadDumpHandler (siren)

* SOLR-2762: FSTLookup could return duplicate results or one result less than requested. (David Smiley, Dawid Weiss)

* SOLR-2741: Bugs in facet range display in trunk (janhoy)

* SOLR-1908: Fixed SignatureUpdateProcessor to fail to initialize on invalid config. Specifically: a signatureField that does not exist, or overwriteDupes=true with a signatureField that is not indexed. (hossman)

* SOLR-1824: IndexSchema will now fail to initialize if there is a problem initializing one of the fields or field types. (hossman)

* SOLR-1928: TermsComponent didn't correctly break ties for non-text fields sorted by count. (yonik)

* SOLR-2107: MoreLikeThisHandler doesn't work with alternate qparsers. (yonik)

* SOLR-2108: Fixed false positives when using wildcard queries on fields with reversed wildcard support. For example, a query of *zemog* would match documents that contain 'gomez'. (Landon Kuhn via Robert Muir)

* SOLR-1962: SolrCore#initIndex should not use a mix of indexPath and newIndexPath (Mark Miller)

* SOLR-2275: fix DisMax 'mm' parsing to be tolerant of whitespace (Erick Erickson via hossman)

* SOLR-2193, SOLR-2565, SOLR-2651: SolrCores now properly share IndexWriters across SolrCore reloads. (Mark Miller, Robert Muir)
  Additional Work:
  - SOLR-2705: On reload, IndexWriterProvider holds onto the initial SolrCore it was created with.
  (Yury Kats, Mark Miller)

* SOLR-2682: Remove addException() in SimpleFacet. FacetComponent no longer catches and embeds exceptions that occur during facet processing; it throws HTTP 400 or 500 exceptions instead. (koji)

* SOLR-2654: Directories used by a SolrCore are now closed when they are no longer used. (Mark Miller)

* SOLR-2854: Now load URL content stream data (via stream.url) when called for during request handling, rather than loading URL content streams automatically regardless of use. (David Smiley and Ryan McKinley via ehatcher)

* SOLR-2829: Fix problem with false-positives due to incorrect equals methods. (Yonik Seeley, Hossman, Erick Erickson. Marc Tinnemeyer caught the bug)

* SOLR-2848: Removed 'instanceof AbstractLuceneSpellChecker' hacks from distributed spellchecking code, and added a merge() method to SolrSpellChecker instead. Previously if you extended SolrSpellChecker your spellchecker would not work in distributed fashion. (James Dyer via rmuir)

* SOLR-2509: StringIndexOutOfBoundsException in the spellchecker collate when the term contains a hyphen. (Thomas Gambier caught the bug, Steffen Godskesen did the patch, via Erick Erickson)

* SOLR-1730: Made it clearer when a core failed to load as well as better logging when the QueryElevationComponent fails to properly initialize (gsingers)

* SOLR-1520: QueryElevationComponent now supports non-string ids (gsingers)

* SOLR-3037: When using binary format in solrj the codec screws up parameters (Sami Siren, Jörg Maier via yonik)

* SOLR-3062: A join in the main query was not respecting any filters pushed down to it via acceptDocs since LUCENE-1536. (Mike Hugo, yonik)

* SOLR-3214: If you use multiple fl entries rather than a comma separated list, all but the first entry can be ignored if you are using distributed search.
  (Tomás Fernández Löbbe via Mark Miller)

* SOLR-3352: eDismax: pf2 should kick in for a query with 2 terms (janhoy)

* SOLR-3361: ReplicationHandler "maxNumberOfBackups" doesn't work if backups are triggered on commit (James Dyer, Tomás Fernández Löbbe)

* SOLR-2605: fixed tracking of the 'defaultCoreName' in CoreContainer so that CoreAdminHandler could return consistent information regardless of whether there is a default core name or not. (steffkes, hossman)

* SOLR-3370: fixed CSVResponseWriter to respect globs in the 'fl' param (Keith Fligg via hossman)

* SOLR-3436: Group count incorrect when not all shards are queried in the second pass. (Francois Perron, Martijn van Groningen)

* SOLR-3454: Exception when using result grouping with main=true and using wt=javabin. (Ludovic Boutros, Martijn van Groningen)

* SOLR-3446: Better errors when PatternTokenizerFactory is configured with an invalid pattern, and include the 'name' whenever possible in plugin init error messages. (hossman)

* LUCENE-4075: Cleaner path usage in TestXPathEntityProcessor (Greg Bowyer via hossman)

* SOLR-2923: IllegalArgumentException when using useFilterForSortedQuery on an empty index. (Adrien Grand via Mark Miller)

* SOLR-2352: Fixed TermVectorComponent so that it will not fail if the fl param contains globs or pseudo-fields (hossman)

* SOLR-3541: add missing solrj dependencies to binary packages. (Thijs Vonk via siren)

* SOLR-3522: fixed parsing of the 'literal()' function (hossman)

* SOLR-3548: Fixed a bug in the cacheability of queries using the {!join} parser or the strdist() function, as well as some minor improvements to the hashCode implementation of {!bbox} and {!geofilt} queries. (hossman)

* SOLR-3470: contrib/clustering: custom Carrot2 tokenizer and stemmer factories are respected now (Stanislaw Osinski, Dawid Weiss)

* SOLR-3430: Added a new DIH test against a real SQL database.
  Fixed problems revealed by this new test related to the expanded cache support added to 3.6/SOLR-2382 (James Dyer)

* SOLR-1958: When using the MailEntityProcessor, import would fail if fetchMailsSince was not specified. (Max Lynch via James Dyer)

* SOLR-4289: Admin UI - JVM memory bar - dark grey "used" width is too small (steffkes, elyograg)

Other Changes
----------------------

* SOLR-1846: Eliminate support for the abortOnConfigurationError option. It has never worked very well, and in recent versions of Solr hasn't worked at all. (hossman)

* SOLR-1889: The default logic for the 'mm' param of DismaxQParser and ExtendedDismaxQParser has been changed to be determined based on the effective value of the 'q.op' param (hossman)

* SOLR-1946: Misc improvements to the SystemInfoHandler: /admin/system (hossman)

* SOLR-2289: Tweak spatial coords for example docs so they are a bit more spread out (Erick Erickson via hossman)

* SOLR-2288: Small tweaks to eliminate compiler warnings, primarily using Generics where applicable in method/object declarations, and adding @SuppressWarnings("unchecked") when appropriate (hossman)

* SOLR-2375: Suggester Lookup implementations now store trie data and load it back on init. This means that large tries don't have to be rebuilt on every commit or core reload. (ab)

* SOLR-2413: Support for returning multi-valued fields w/o <arr> tag in the XMLResponseWriter was removed. XMLResponseWriter no longer works with version values less than 2.2 (ryan)

* SOLR-2423: FieldType argument changed from String to Object. Conversion from SolrInputDocument > Object > Fieldable is now managed by FieldType rather than DocumentBuilder. (ryan)

* SOLR-2461: QuerySenderListener and AbstractSolrEventListener are now public (hossman)

* LUCENE-2995: Moved some spellchecker and suggest APIs to modules/suggest: HighFrequencyDictionary, SortedIterator, TermFreqIterator, and the suggester APIs and implementations.
  (rmuir)

* SOLR-2576: Remove deprecated SpellingResult.add(Token, int). (James Dyer via rmuir)

* LUCENE-3232: Moved MutableValue classes to new 'common' module. (Chris Male)

* LUCENE-2883: FunctionQuery, DocValues (and its impls), ValueSource (and its impls) and BoostedQuery have been consolidated into the queries module. They can now be found at o.a.l.queries.function.

* SOLR-2027: FacetField.getValues() now returns an empty list if there are no values, instead of null (Chris Male)

* SOLR-1825: SolrQuery.addFacetQuery now enables facets automatically, like addFacetField (Chris Male)

* SOLR-2663: FieldTypePluginLoader has been refactored out of IndexSchema and made public. (hossman)

* SOLR-2331,SOLR-2691: Refactor CoreContainer's SolrXML serialization code and improve testing (Yury Kats, hossman, Mark Miller)

* SOLR-2698: Enhance CoreAdmin STATUS command to return index size. (Yury Kats, hossman, Mark Miller)

* SOLR-2654: The same Directory instance is now always used across a SolrCore so that it's easier to add other DirectoryFactory implementations without static caching hacks. (Mark Miller)

* LUCENE-3286: 'luke' ant target has been disabled due to incompatibilities with XML queryparser location (Chris Male)

* SOLR-1897: The data dir from the core descriptor should override the data dir from the solrconfig.xml rather than the other way round. (Mark Miller)

* SOLR-2756: Maven configuration: Excluded transitive stax:stax-api dependency from org.codehaus.woodstox:wstx-asl dependency. (David Smiley via Steve Rowe)

* SOLR-2588: Moved VelocityResponseWriter back to contrib module in order to remove it as a mandatory core dependency. (ehatcher)

* SOLR-2862: More explicit lexical resources location logged if Carrot2 clustering extension is used. Fixed solr. impl. of IResource and IResourceLookup. (Dawid Weiss)

* SOLR-1123: Changed JSONResponseWriter to now use application/json as its Content-Type by default.
  However, the Content-Type can be overridden and is set to text/plain in the example configuration. (Uri Boness, Chris Male)

* SOLR-2607: Removed deprecated client/ruby directory, which included solr-ruby and flare. (ehatcher)

* SOLR-3032: logOnce from SolrException and all the supporting structure is gone. abortOnConfigurationError is also gone as it is no longer referenced. Errors should be caught and logged at the top-most level or logged and NOT propagated up the chain. (Erick Erickson)

* SOLR-2105: Remove support for deprecated "update.processor" (since 3.2), in favor of "update.chain" (janhoy)

* SOLR-3005: Default QueryResponseWriters are now initialized via init() with an empty NamedList. (Gasol Wu, Chris Male)

* SOLR-2607: Removed obsolete client/ folder (ehatcher, Eric Pugh, janhoy)

* SOLR-3202, SOLR-3244: Dropping Support for JSP. New Admin UI is all client side (ryan, Aliaksandr Zhuhrou, Uwe Schindler)

* SOLR-3159: Upgrade example and tests to run with Jetty 8 (ryan)

* SOLR-3254: Upgrade Solr to Tika 1.1 (janhoy)

* SOLR-3329: Dropped getSourceID() from SolrInfoMBean and using getClass().getPackage().getSpecificationVersion() for Version. (ryan)

* SOLR-3302: Upgraded SLF4j to version 1.6.4 (hossman)

* SOLR-3322: Add more context to IndexReaderFactory.newReader (ab)

* SOLR-3343: Moved FastWriter, FileUtils, RegexFileFilter, RTimer and SystemIdResolver from org.apache.solr.common to org.apache.solr.util (Chris Male)

* SOLR-3357: ResourceLoader.newInstance now accepts a Class representation of the expected instance type (Chris Male)

* SOLR-3388: HTTP caching is now disabled by default for RequestUpdateHandlers. (ryan)

* SOLR-3309: web.xml now specifies metadata-complete=true (which requires Servlet 2.5) to prevent servlet containers from scanning class annotations on startup. This allows for faster startup times on some servlet containers.
  (Bill Bell, hossman)

* SOLR-1893: Refactored some common code from LRUCache and FastLRUCache into SolrCacheBase (Tomás Fernández Löbbe via hossman)

* SOLR-3403: Deprecated Analysis Factories now log their own deprecation messages. No logging support is provided by Factory parent classes. (Chris Male)

* SOLR-1258: PingRequestHandler is now directly configured with a "healthcheckFile" instead of looking for the legacy <admin><healthcheck/></admin> syntax. Filenames specified as relative paths have been fixed so that they are resolved against the data dir instead of the CWD of the java process. (hossman)

* SOLR-3083: JMX beans now report Numbers as numeric values rather than Strings (Tagged Siteops, Greg Bowyer via ryan)

* SOLR-2796: Due to low level changes to support SolrCloud, the uniqueKey field can no longer be populated via <copyField/> or <field default=...> in the schema.xml.

* SOLR-3534: The Dismax and eDismax query parsers will fall back on the 'df' parameter when 'qf' is absent. If neither is present and the schema has no default search field, an exception is now thrown. (dsmiley)

* SOLR-3262: The "threads" feature of DIH is removed (deprecated in Solr 3.6) (James Dyer)

* SOLR-3422: Refactored DIH internal data classes. All entities in data-config.xml must have a name (James Dyer)

Documentation
----------------------

* SOLR-2232: Improved README info on solr.solr.home in examples (Eric Pugh and hossman)

================== 3.6.2 ==================

Bug Fixes
----------------------

* SOLR-3790: ConcurrentModificationException could be thrown when using hl.fl=*. (yonik, koji)

* SOLR-3589: Edismax parser does not honor mm parameter if analyzer splits a token.
  (Tom Burton-West, Robert Muir)

================== 3.6.1 ==================

More information about this release, including any errata related to the release notes, upgrade instructions, or other changes may be found online at: https://wiki.apache.org/solr/Solr3.6.1

Bug Fixes
----------------------

* LUCENE-3969: Throw IAE on bad arguments that could cause confusing errors in PatternTokenizer. CommonGrams populates PositionLengthAttribute correctly. (Uwe Schindler, Mike McCandless, Robert Muir)

* SOLR-3361: ReplicationHandler "maxNumberOfBackups" doesn't work if backups are triggered on commit (James Dyer, Tomás Fernández Löbbe)

* SOLR-3375: Fix charset problems with HttpSolrServer (Roger Håkansson, yonik, siren)

* SOLR-3436: Group count incorrect when not all shards are queried in the second pass. (Francois Perron, Martijn van Groningen)

* SOLR-3454: Exception when using result grouping with main=true and using wt=javabin. (Ludovic Boutros, Martijn van Groningen)

* SOLR-3489: Config file replication less error prone (Jochen Just via janhoy)

* SOLR-3477: SOLR does not start up when no cores are defined (Tomás Fernández Löbbe via tommaso)

* SOLR-3470: contrib/clustering: custom Carrot2 tokenizer and stemmer factories are respected now (Stanislaw Osinski, Dawid Weiss)

* SOLR-3360: More DIH bug fixes for the deprecated "threads" parameter. (Mikhail Khludnev, Claudio R, via James Dyer)

* SOLR-3430: Added a new DIH test against a real SQL database. Fixed problems revealed by this new test related to the expanded cache support added to 3.6/SOLR-2382 (James Dyer)

* SOLR-3336: SolrEntityProcessor substitutes most variables at query time.
  (Michael Kroh, Lance Norskog, via Martijn van Groningen)

================== 3.6.0 ==================

More information about this release, including any errata related to the release notes, upgrade instructions, or other changes may be found online at: https://wiki.apache.org/solr/Solr3.6

Upgrading from Solr 3.5
----------------------

* SOLR-2983: As a consequence of moving the code which sets a MergePolicy from SolrIndexWriter to SolrIndexConfig, (custom) MergePolicies should now have an empty constructor; thus an IndexWriter should not be passed as constructor parameter but instead set using the setIndexWriter() method.

* As the doGet() methods in SimplePostTool were changed to static, client applications of this class need to be recompiled.

* In Solr version 3.5 and earlier, HTMLStripCharFilter had known bugs in the character offsets it provided, triggering e.g. exceptions in highlighting. HTMLStripCharFilter has been re-implemented, addressing this and other issues. See the entry for LUCENE-3690 in the Bug Fixes section below for a detailed list of changes. For people who depend on the behavior of HTMLStripCharFilter in Solr version 3.5 and earlier: the old implementation (bugs and all) is preserved as LegacyHTMLStripCharFilter.

* As of Solr 3.6, the <indexDefaults> and <mainIndex> sections of solrconfig.xml are deprecated and replaced with a new <indexConfig> section. Read more in SOLR-1052 below.

* SOLR-3040: The DIH's admin UI (dataimport.jsp) now requires DIH request handlers to start with a '/'. (dsmiley)

* SOLR-3161: <requestDispatcher handleSelect="false"> is now the default. An existing config will probably work as-is because handleSelect was explicitly enabled in default configs. HandleSelect makes /select work as well as enables the 'qt' parameter. Instead, consider explicitly configuring /select as is done in the example solrconfig.xml, and register your other search handlers with a leading '/' which is a recommended practice. (David Smiley, Erik Hatcher)

* SOLR-3161: Don't use the 'qt' parameter with a leading '/'.
  It probably won't work in 4.0 and it's now limited in 3.6 to SearchHandler subclasses that aren't lazy-loaded.

* SOLR-2724: Specifying <defaultSearchField> and <solrQueryParser defaultOperator="..."/> in schema.xml is now considered deprecated. Instead you are encouraged to specify these via the "df" and "q.op" parameters in your request handler definition. (David Smiley)

* Bugs were found and fixed in the SignatureUpdateProcessor that previously caused some documents to produce the same signature even when the configured fields contained distinct (non-String) values. Users of SignatureUpdateProcessor are strongly advised to re-index, as document signatures may have changed. (see SOLR-3200 & SOLR-3226 for details)

New Features
----------------------

* SOLR-2020: Add Java client that uses Apache Http Components http client (4.x). (Chantal Ackermann, Ryan McKinley, Yonik Seeley, siren)

* SOLR-2854: Now load URL content stream data (via stream.url) when called for during request handling, rather than loading URL content streams automatically regardless of use. (David Smiley and Ryan McKinley via ehatcher)

* SOLR-2904: BinaryUpdateRequestHandler should be able to accept multiple update requests from a stream (shalin)

* SOLR-1565: StreamingUpdateSolrServer supports RequestWriter API and therefore, javabin update format (shalin)

* SOLR-2438: Added MultiTermAwareComponent to the various classes to allow automatic lowercasing for multiterm queries (wildcards, regex, prefix, range, etc). You can now optionally specify a "multiterm" analyzer in your schema.xml, but Solr should "do the right thing" if you don't specify one (Pete Sturge, Erick Erickson, mentoring from Seeley and Muir)

* SOLR-2919: Added support for localized range queries when the analysis chain uses CollationKeyFilter or ICUCollationKeyFilter. (Michael Sokolov, rmuir)

* SOLR-2982: Added BeiderMorseFilterFactory for Beider-Morse (BMPM) phonetic encoder.
  Upgrades commons-codec to version 1.6 (Brooke Schreier Ganz, rmuir)

* SOLR-1843: A new "rootName" attribute is now available when configuring <jmx/> in solrconfig.xml. If this attribute is set, Solr will use it as the root name for all MBeans Solr exposes via JMX. The default root name is "solr" followed by the core name. (Constantijn Visinescu, hossman)

* SOLR-2906: Added LFU cache options to Solr. (Shawn Heisey via Erick Erickson)

* SOLR-3036: Ability to specify overwrite=false on the URL for XML updates. (Sami Siren via yonik)

* SOLR-2603: Add the encoding function for alternate fields in highlighting. (Massimo Schiavon, koji)

* SOLR-1729: Evaluation of NOW for date math is done only once per request for consistency, and is also propagated to shards in distributed search. Adding a parameter NOW=<time_in_ms> to the request will override the current time. (Peter Sturge, yonik, Simon Willnauer)

* SOLR-1709: Distributed support for Date and Numeric Range Faceting (Peter Sturge, David Smiley, hossman, Simon Willnauer)

* SOLR-3054, LUCENE-3671: Add TypeTokenFilterFactory that creates TypeTokenFilter that filters tokens based on their TypeAttribute. (Tommaso Teofili via Uwe Schindler)

* LUCENE-3305, SOLR-3056: Added Kuromoji morphological analyzer for Japanese. See the 'text_ja' fieldtype in the example to get started. (Christian Moen, Masaru Hasegawa via Robert Muir)

* SOLR-1860: StopFilterFactory, CommonGramsFilterFactory, and CommonGramsQueryFilterFactory can optionally read stopwords in Snowball format (specify format="snowball"). (Robert Muir)

* SOLR-3105: ElisionFilterFactory optionally allows the parameter ignoreCase (default=false). (Robert Muir)

* LUCENE-3714: Add WFSTLookupFactory, a suggester that uses a weighted FST for more fine-grained suggestions. (Mike McCandless, Dawid Weiss, Robert Muir)

* SOLR-3143: Add SuggestQueryConverter, a QueryConverter intended for auto-suggesters.
  (Robert Muir)

* SOLR-3033: ReplicationHandler's backup command now supports a 'maxNumberOfBackups' init param that can be used to delete all but the most recent N backups. (Torsten Krah, James Dyer)

* SOLR-2202: Currency FieldType, with support for currencies and exchange rates (Greg Fodor & Andrew Morrison via janhoy, rmuir, Uwe Schindler)

* SOLR-3026: eDismax: Locking down which fields can be explicitly queried (user fields aka uf) (janhoy, hossmann, Tomás Fernández Löbbe)

* SOLR-2826: URLClassify Update Processor (janhoy)

* SOLR-2764: Create a NorwegianLightStemmer and NorwegianMinimalStemmer (janhoy)

* SOLR-3221: Added the ability to directly configure aspects of the concurrency and thread-pooling used within distributed search in solr. This allows for finer-grained control and can be tuned by end users to target their own specific requirements. This builds on the work of the HttpCommComponent and uses the same configuration block to configure the thread pool. The default configuration has the same behaviour as solr 3.5, favouring throughput over latency. More information can be found on the wiki (http://wiki.apache.org/solr/SolrConfigXml) (Greg Bowyer)

* SOLR-2001: The query component will substitute an empty query that matches no documents if the query parser returns null. This also prevents an exception from being thrown by the default parser if "q" is missing. (yonik)
  - SOLR-435: if q is "" then it's also acceptable. (dsmiley, hoss)

* SOLR-2919: Added parametric tailoring options to ICUCollationKeyFilterFactory. These can be used to customize range query/sort behavior, for example to support numeric collation, ignore punctuation/whitespace, ignore accents but not case, control whether upper/lowercase values are sorted first, etc. (rmuir)

* SOLR-2346: Add a chance to set content encoding explicitly via content type of stream for extracting request handler.
  This is convenient when Tika's auto detector cannot detect the encoding,
  especially when the text file is too short for detection. (koji)

* SOLR-1499: Added SolrEntityProcessor that imports data from another Solr
  core or instance based on a specified query.
  (Lance Norskog, Erik Hatcher, Pulkit Singhal, Ahmet Arslan, Luca Cavanna,
  Martijn van Groningen)

* SOLR-3190: Minor improvements to SolrEntityProcessor. Add more consistency
  between solr parameters and parameters used in SolrEntityProcessor and
  ability to specify a custom HttpClient instance.
  (Luca Cavanna via Martijn van Groningen)

* SOLR-2382: Added pluggable cache support to DIH so that any Entity can be
  made cache-able by adding the "cacheImpl" parameter. Include
  "SortedMapBackedCache" to provide in-memory caching (as previously this was
  the only option when using CachedSqlEntityProcessor). Users can provide
  their own implementations of DIHCache for other caching strategies.
  Deprecate CachedSqlEntityProcessor in favor of specifying "cacheImpl" with
  SqlEntityProcessor. Make SolrWriter implement DIHWriter and allow the
  possibility of pluggable Writers (DIH writing to something other than Solr).
  (James Dyer, Noble Paul)

Optimizations
----------------------

* SOLR-1931: Speedup for LukeRequestHandler and admin/schema browser. New
  parameter reportDocCount defaults to 'false'. Old behavior still possible
  by specifying this as 'true' (Erick Erickson)

* SOLR-3012: Move System.getProperty("type") in postData() to main() and add
  type argument so that the client applications of SimplePostTool can set
  content type via method argument. (koji)

* SOLR-2888: FSTSuggester refactoring: internal storage is now UTF-8,
  external sorting (on disk) prevents OOMs even with large data sets
  (the bottleneck is now FST construction), code cleanups and API cleanups.
  (Dawid Weiss, Robert Muir)

Bug Fixes
----------------------

* SOLR-3187: SystemInfoHandler leaks filehandles (siren)

* LUCENE-3820: Fixed invalid position indexes by reimplementing
  PatternReplaceCharFilter. This change also drops real support for boundary
  characters -- all input is prebuffered for pattern matching. (Dawid Weiss)

* SOLR-3068: Fixed NPE in ThreadDumpHandler (siren)

* SOLR-2912: Fixed file descriptor leak in ShowFileRequestHandler
  (Michael Ryan, shalin)

* SOLR-2819: Improved speed of parsing hex entities in HTMLStripCharFilter
  (Bernhard Berger, hossman)

* SOLR-2509: StringIndexOutOfBoundsException in the spellchecker collate when
  the term contains a hyphen. (Thomas Gambier caught the bug, Steffen
  Godskesen did the patch, via Erick Erickson)

* SOLR-2955: Fixed IllegalStateException when querying with
  group.sort=score desc in sharded environment.
  (Steffen Elberg Godskesen, Martijn van Groningen)

* SOLR-2956: Fixed inconsistencies in the flags (and flag key) reported by
  the LukeRequestHandler (hossman)

* SOLR-1730: Made it clearer when a core failed to load as well as better
  logging when the QueryElevationComponent fails to properly initialize
  (gsingers)

* SOLR-1520: QueryElevationComponent now supports non-string ids (gsingers)

* SOLR-3024: Fixed JSONTestUtil.matchObj; in previous releases it was not
  respecting the 'delta' arg (David Smiley via hossman)

* SOLR-2542: Fixed DIH Context variables which were broken for all scopes
  other than SCOPE_ENTITY (Linbin Chen & Frank Wesemann via hossman)

* SOLR-3042: Fixed Maven Jetty plugin configuration.
  (David Smiley via Steve Rowe)

* SOLR-2970: CSV ResponseWriter returns fields defined as stored=false in
  schema (janhoy)

* LUCENE-3690, LUCENE-2208, SOLR-882, SOLR-42: Re-implemented
  HTMLStripCharFilter as a JFlex-generated scanner and moved it to
  lucene/contrib/analyzers/common/. See below for a list of bug fixes and
  other changes.
  To get the same behavior as HTMLStripCharFilter in Solr version 3.5 and
  earlier (including the bugs), use LegacyHTMLStripCharFilter, which is the
  previous implementation.

  Behavior changes from the previous version:

  - Known offset bugs are fixed.
  - The "Mark invalid" exceptions reported in SOLR-1283 are no longer
    triggered (the bug is still present in LegacyHTMLStripCharFilter).
  - The character entity "&apos;" is now always properly decoded.
  - More cases of