This replaces the index of stored fields and term vectors with two
`DirectMonotonic` arrays. `DirectMonotonicWriter` requires knowing the number
of values to write up-front, so incoming doc IDs and file pointers are buffered
on disk using temporary files. These files never get fsynced, but they do have
index headers and footers to make sure that any corruption in them cannot
propagate to the index.
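For reference, here is a minimal sketch of how `DirectMonotonicWriter` is typically driven once the value count is known. The file names, sample values and block shift of 10 are made up for illustration, and the exact `getInstance` signature is an assumption rather than something taken from this change:

```java
import java.io.IOException;
import org.apache.lucene.store.ByteBuffersDirectory;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.IOContext;
import org.apache.lucene.store.IndexOutput;
import org.apache.lucene.util.packed.DirectMonotonicWriter;

// Sketch only: once the number of values is known (after the temporary
// buffering described above), monotonically increasing file pointers can be
// written as a DirectMonotonic array.
public final class DirectMonotonicWriteSketch {
  public static void main(String[] args) throws IOException {
    long[] filePointers = {0, 128, 300, 512}; // count is known up-front
    try (Directory dir = new ByteBuffersDirectory();
        IndexOutput meta = dir.createOutput("pointers.meta", IOContext.DEFAULT);
        IndexOutput data = dir.createOutput("pointers.data", IOContext.DEFAULT)) {
      DirectMonotonicWriter writer =
          DirectMonotonicWriter.getInstance(meta, data, filePointers.length, 10);
      for (long fp : filePointers) {
        writer.add(fp); // values are added in order
      }
      writer.finish();
    }
  }
}
```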
`DirectMonotonicReader` gets a specialized `binarySearch` implementation that
leverages the metadata to go to the `IndexInput` as rarely as possible. In the
common case it only needs to consult a single sub `DirectReader`, which,
combined with the block size of 1k values, helps bound the number of page
faults to 2.
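A standalone illustration of why the common case stays within a single block, as a hypothetical sketch rather than the actual `DirectMonotonicReader` or sub `DirectReader` code:

```java
// Hypothetical sketch: per-block minimum values kept in metadata are searched
// first (in memory), so the data itself is only consulted for one block.
final class MonotonicSearchSketch {
  /** Returns the index of {@code key}, or -1 if it is absent. */
  static long search(long[] blockMins, long[][] blocks, long key) {
    // Step 1: pick the candidate block from the in-memory metadata (no I/O).
    int block = 0;
    int lo = 0, hi = blockMins.length - 1;
    while (lo <= hi) {
      int mid = (lo + hi) >>> 1;
      if (blockMins[mid] <= key) {
        block = mid;
        lo = mid + 1;
      } else {
        hi = mid - 1;
      }
    }
    // Step 2: binary-search within the single selected block.
    long[] values = blocks[block];
    int from = 0, to = values.length - 1;
    while (from <= to) {
      int mid = (from + to) >>> 1;
      if (values[mid] == key) {
        return (long) block * 1024 + mid; // assuming 1k values per block
      } else if (values[mid] < key) {
        from = mid + 1;
      } else {
        to = mid - 1;
      }
    }
    return -1; // not found
  }
}
```

Since step 1 runs entirely on metadata, only one or two pages of the underlying data need to be faulted in.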
SOLR-14095 introduced an issue for rolling restarts (incompatible Java serialization). This change fixes the compatibility issue while keeping the functionality of SOLR-14095.
The entire precommit task will still fail with an unsupported Java version
(subsequent checks do not support the newer javadoc format), but this allows
the ECJ linter to run, which checks for things such as unused imports.
This exercises various places in the Streaming Expressions code that use
background threads, to confirm that the expected credentials (or lack thereof)
are propagated along.
The test currently has comments + workarounds for 2 known client issues:
- SOLR-14226: SolrStream reports AuthN/AuthZ failures (401|403) as IOException w/o details
- SOLR-14222: CloudSolrClient converts (update) 403 error to 500 error
(cherry picked from commit 517438e356)
Fuzzy queries with an edit distance of 1 or 2 must visit all blocks whose prefix
length is 1 or 2. By not compressing those, we can trade very little space (a
couple MBs in the case of the wikibigall index) for better query efficiency.
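Expressed as a condition, this is roughly the rule (a hypothetical sketch, not the actual terms-dictionary writer code):

```java
// Hypothetical sketch: shallow blocks (prefix length 1 or 2), which every
// fuzzy query with an edit distance of 1 or 2 has to visit, are left
// uncompressed so they stay cheap to read.
final class ShallowBlockRule {
  static boolean shouldCompressSuffixes(int prefixLength) {
    return prefixLength > 2; // simplified reading of the rule above
  }
}
```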
Adds some build parameters to tune how tests run. An example is shown by
"gradle helpLocalSettings".
C2 is off by default in tests, as it is wasteful locally and slows down test
runs. You can override this by setting tests.jvmargs for gradle, or args for
ant.
Some crazy Lucene stress tests may need to be toned down after the change, as
they may have been doing too many iterations by default... but this is not a
new problem.
The issue is that MockDirectoryWrapper's disk full check is horribly
inefficient. On every writeByte/etc, it totally recomputes disk space
across all files. This means it calls listAll() on the underlying
Directory (which sorts all the underlying files), then sums up fileLength()
for each of those files.
This leads to many pathological cases in the disk full tests... but the
number of tests impacted by this is minimal, and the logic is scary.
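For illustration, the accounting described above boils down to something like this simplified sketch (not the actual MockDirectoryWrapper code):

```java
import java.io.IOException;
import org.apache.lucene.store.Directory;

// Simplified sketch of the per-write accounting described above: every byte
// written triggers a full directory listing plus one fileLength() call per
// file, which is what makes the disk-full checks so expensive.
final class DiskFullCheckSketch {
  static long usedBytes(Directory dir) throws IOException {
    long used = 0;
    for (String file : dir.listAll()) { // listAll() also sorts the file names
      used += dir.fileLength(file);
    }
    return used;
  }

  static void checkBeforeWrite(Directory dir, long maxUsableBytes, int bytesToWrite)
      throws IOException {
    if (usedBytes(dir) + bytesToWrite > maxUsableBytes) {
      throw new IOException("fake disk full"); // what the tests rely on
    }
  }
}
```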
Calming down individual test methods that showed double-digit execution times
after running the tests many times.
There are a few more issues remaining, but this solves the majority of them.
This is consistently the slowest test for me in all of Lucene core by far: it
takes around an entire minute. Mark it nightly: it should catch any issues
with RAM estimation but keep local builds fast.
Changes include:
- Removed LZ4 compression of suffix lengths, which didn't save much space
anyway.
- For stats, LZ4 was only really used for run-length compression of terms whose
docFreq is 1. This has been replaced by explicit run-length compression.
- Since we only use LZ4 for suffix bytes if the compression ratio is < 75%, we
now only try LZ4 out if the average suffix length is greater than 6, in order
to reduce index-time overhead (see the sketch after this list).
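A sketch of the resulting heuristic; the constants (6 and 75%) come from the text above, while the class and method names are made up:

```java
// Sketch of the index-time heuristic described above; names are hypothetical.
final class SuffixCompressionHeuristic {
  /** Only attempt LZ4 when the average suffix length is greater than 6. */
  static boolean worthTryingLz4(long totalSuffixBytes, int numTerms) {
    return totalSuffixBytes > 6L * numTerms;
  }

  /** Only keep the LZ4 output when it shrinks the block below 75% of its size. */
  static boolean keepLz4(int originalLength, int compressedLength) {
    return compressedLength < 0.75 * originalLength;
  }
}
```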
Compress blocks of suffixes in order to make the terms dictionary more
space-efficient. Two compression algorithms are used depending on which one is
more space-efficient (the sketch after the list below illustrates the choice):
- LowercaseAsciiCompression, which applies when all bytes are in the
`[0x1F,0x3F)` or `[0x5F,0x7F)` ranges, which notably include all digits,
lowercase ASCII characters, '.', '-' and '_', and encodes 4 chars in 3 bytes.
It is very often applicable to analyzed content and decompresses very quickly
thanks to auto-vectorization support in the JVM.
- LZ4, when the compression ratio is less than 0.75.
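A rough sketch of how a scheme could be picked for a block of suffix bytes, following the rules above; the names are hypothetical and raw storage is assumed as the fallback when neither scheme helps:

```java
// Hypothetical sketch of choosing a compression scheme for a block of suffix
// bytes, following the rules described above.
enum SuffixCompression { LOWERCASE_ASCII, LZ4, RAW }

final class CompressionChoiceSketch {
  /**
   * True if every byte falls in [0x1F,0x3F) or [0x5F,0x7F), i.e. digits,
   * lowercase ASCII letters, '.', '-', '_', etc.
   */
  static boolean lowercaseAsciiApplies(byte[] block, int len) {
    for (int i = 0; i < len; i++) {
      int b = block[i] & 0xFF;
      boolean inLow = b >= 0x1F && b < 0x3F;
      boolean inHigh = b >= 0x5F && b < 0x7F;
      if (!inLow && !inHigh) {
        return false;
      }
    }
    return true;
  }

  static SuffixCompression choose(byte[] block, int len, int lz4CompressedLen) {
    if (lowercaseAsciiApplies(block, len)) {
      return SuffixCompression.LOWERCASE_ASCII; // packs 4 chars into 3 bytes
    }
    if (lz4CompressedLen < 0.75 * len) {
      return SuffixCompression.LZ4;
    }
    return SuffixCompression.RAW;
  }
}
```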
I was a bit unhappy with the complexity of the high-compression LZ4 option, so
I simplified it to only keep the logic that detects duplicate strings. The
logic about what to do when overlapping matches are found, which was
responsible for most of the complexity while only yielding tiny benefits, has
been removed.
WhitespaceTokenizer defaults to a maximum token length of 255, and WhitespaceAnalyzer
does not allow this to be changed. This commit adds an optional maxTokenLen parameter
to WhitespaceAnalyzer as well, and documents the existing token length restriction.
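A short usage sketch, assuming the int-argument constructor this change describes; the field name, sample text, and limit of 4096 are made up:

```java
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.core.WhitespaceAnalyzer;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;

// Raise the maximum token length from the default of 255 to 4096 characters.
public final class WhitespaceMaxLenDemo {
  public static void main(String[] args) throws Exception {
    try (WhitespaceAnalyzer analyzer = new WhitespaceAnalyzer(4096);
        TokenStream ts = analyzer.tokenStream("field", "tokens split on whitespace only")) {
      CharTermAttribute term = ts.addAttribute(CharTermAttribute.class);
      ts.reset();
      while (ts.incrementToken()) {
        System.out.println(term.toString());
      }
      ts.end();
    }
  }
}
```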