Mirror of https://github.com/apache/lucene.git (synced 2025-03-06 08:19:23 +00:00)
I analyzed a heap dump of Elasticsearch in which FixedBitSet instances use more than 1GB of memory. Most of these FixedBitSets are held by soft-deletes reader wrappers, even though the wrapped segments have no deletes at all. I believe these segments previously had soft-deletes that were later pruned by merges. We wrap a reader whenever the soft-deletes field exists in its field infos, and because the source segments had soft-deletes, those field infos were carried over into the merged segment. Ideally, we should either have a way to check whether the returned docValues iterator is empty, so that we can avoid allocating the FixedBitSet entirely, or prune fields that have no values after merges.
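
The first option (skip the allocation when the iterator is empty) can be illustrated with a minimal, self-contained sketch. It does not use Lucene's classes: `DocIdIterator`, `of`, and `liveDocsOrNull` are hypothetical names invented for this example, and `java.util.BitSet` stands in for FixedBitSet. The only assumption borrowed from Lucene is the convention that a null live-docs value means "no deletions".

```java
import java.util.BitSet;

public class LazyLiveDocs {

    /** Hypothetical stand-in for a doc-values iterator; not Lucene's API. */
    public interface DocIdIterator {
        int NO_MORE_DOCS = Integer.MAX_VALUE;

        /** Returns the next doc id in increasing order, or NO_MORE_DOCS when exhausted. */
        int nextDoc();
    }

    /** Builds an iterator over the given sorted doc ids (illustration only). */
    public static DocIdIterator of(int... sortedDocs) {
        return new DocIdIterator() {
            private int i = 0;

            @Override
            public int nextDoc() {
                return i < sortedDocs.length ? sortedDocs[i++] : NO_MORE_DOCS;
            }
        };
    }

    /**
     * Returns null when the iterator is empty (every doc is live), so no bitset
     * is allocated. Otherwise returns a bitset with one bit set per live doc.
     */
    public static BitSet liveDocsOrNull(DocIdIterator softDeleted, int maxDoc) {
        int doc = softDeleted.nextDoc();
        if (doc == DocIdIterator.NO_MORE_DOCS) {
            return null; // the field exists but has no values: skip the allocation
        }
        BitSet live = new BitSet(maxDoc);
        live.set(0, maxDoc); // start with every doc live
        for (; doc != DocIdIterator.NO_MORE_DOCS; doc = softDeleted.nextDoc()) {
            live.clear(doc); // docs that have a soft-deletes value are not live
        }
        return live;
    }

    public static void main(String[] args) {
        // Segment whose soft-deletes were pruned by a merge: nothing is allocated.
        System.out.println(liveDocsOrNull(of(), 1_000_000) == null);
        // Segment with docs 1 and 3 soft-deleted out of 5.
        System.out.println(liveDocsOrNull(of(1, 3), 5));
    }
}
```

The sketch peeks at the first document before allocating anything, so a segment that merely carries the field info costs no bitset memory. The second option, pruning value-less fields at merge time, would avoid the wrapper altogether and is not shown here.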