Updated the BloomFilter javadoc.

Remove trailing periods on parameters and arguments.
Remove reference to LongBuffer.
Clarify what the long[] represents in 'long[] getBits()'.
Clarify cardinality using (number of enabled bits).

Rearrange BloomFilter interface methods to functional order.
The order is:
- Query operations
- Modification operations
- Counting operations

Improve javadoc for BloomFilter contains with additional information
for what 'contains' means.

Update exception message for contains/merge/add/subtract to be
consistent.
2020-03-15 21:45:05 +00:00 · 9de28a7b62
parent 86bac5e602
commit 9de28a7b62
2 changed files with 179 additions and 133 deletions
--- a/src/main/java/org/apache/commons/collections4/bloomfilter/BloomFilter.java
+++ b/src/main/java/org/apache/commons/collections4/bloomfilter/BloomFilter.java
@@ -26,48 +26,24 @@ import org.apache.commons.collections4.bloomfilter.hasher.StaticHasher;
 */
 public interface BloomFilter {

-    /**
-     * Performs a logical "AND" with the other Bloom filter and returns the cardinality of
-     * the result.
-     *
-     * @param other the other Bloom filter.
-     * @return the cardinality of the result of {@code ( this AND other )}.
-     */
-    int andCardinality(BloomFilter other);
+    // Query Operations

    /**
-     * Gets the cardinality of this Bloom filter.
-     * <p>This is also known as the Hamming value.</p>
+     * Gets the shape of this filter.
     *
-     * @return the cardinality (number of enabled bits) in this filter.
+     * @return the shape of this filter
     */
-    int cardinality();
+    Shape getShape();

    /**
-     * Performs a contains check. Effectively this AND other == other.
+     * Gets an array of little-endian long values representing the bits of this filter.
     *
-     * @param other the Other Bloom filter.
-     * @return true if this filter matches the other.
-     */
-    boolean contains(BloomFilter other);
-
-    /**
-     * Performs a contains check against a decomposed Bloom filter. The shape must match
-     * the shape of this filter. The hasher provides bit indexes to check for. Effectively
-     * decomposed AND this == decomposed.
+     * <p>The returned array will have length {@code ceil(m / 64)} where {@code m} is the
+     * number of bits in the filter and {@code ceil} is the ceiling function.
+     * Bits 0-63 are in the first long. A value of 1 at a bit position indicates the bit
+     * index is enabled.
     *
-     * @param hasher The hasher containing the bits to check.
-     * @return true if this filter contains the other.
-     * @throws IllegalArgumentException if the shape argument does not match the shape of
-     * this filter, or if the hasher is not the specified one
-     */
-    boolean contains(Hasher hasher);
-
-    /**
-     * Gets an array of little-endian long values representing the on bits of this filter.
-     * bits 0-63 are in the first long.
-     *
-     * @return the LongBuffer representation of this filter.
+     * @return the {@code long[]} representation of this filter
     */
    long[] getBits();

@@ -75,51 +51,103 @@ public interface BloomFilter {
     * Creates a StaticHasher that contains the indexes of the bits that are on in this
     * filter.
     *
-     * @return a StaticHasher for that produces this Bloom filter.
+     * @return a StaticHasher that produces this Bloom filter
     */
    StaticHasher getHasher();

    /**
-     * Gets the shape of this filter.
+     * Returns {@code true} if this filter contains the specified filter. Specifically this
+     * returns {@code true} if this filter is enabled for all bits that are enabled in the
+     * {@code other} filter. Using the bit representations this is
+     * effectively {@code (this AND other) == other}.
     *
-     * @return The shape of this filter.
+     * @param other the other Bloom filter
+     * @return true if this filter is enabled for all enabled bits in the other filter
+     * @throws IllegalArgumentException if the shape of the other filter does not match
+     * the shape of this filter
     */
-    Shape getShape();
+    boolean contains(BloomFilter other);

    /**
-     * Merges the other Bloom filter into this one.
+     * Returns {@code true} if this filter contains the specified decomposed Bloom filter.
+     * Specifically this returns {@code true} if this filter is enabled for all bit indexes
+     * identified by the {@code hasher}. Using the bit representations this is
+     * effectively {@code (this AND hasher) == hasher}.
     *
-     * @param other the other Bloom filter.
+     * @param hasher the hasher to provide the indexes
+     * @return true if this filter is enabled for all bits specified by the hasher
+     * @throws IllegalArgumentException if the hasher cannot generate indices for the shape of
+     * this filter
+     */
+    boolean contains(Hasher hasher);
+
+    // Modification Operations
+
+    /**
+     * Merges the specified Bloom filter into this Bloom filter. Specifically all bit indexes
+     * that are enabled in the {@code other} filter will be enabled in this filter.
+     *
+     * <p>Note: This method should return {@code true} even if no additional bit indexes were
+     * enabled. A {@code false} result indicates that this filter is not ensured to contain
+     * the {@code other} Bloom filter.
+     *
+     * @param other the other Bloom filter
     * @return true if the merge was successful
+     * @throws IllegalArgumentException if the shape of the other filter does not match
+     * the shape of this filter
     */
    boolean merge(BloomFilter other);

    /**
-     * Merges the decomposed Bloom filter defined by the hasher into this Bloom
-     * filter. The hasher provides an iterator of bit indexes to enable.
+     * Merges the specified decomposed Bloom filter into this Bloom filter. Specifically all
+     * bit indexes that are identified by the {@code hasher} will be enabled in this filter.
     *
-     * @param hasher the hasher to provide the indexes.
+     * <p>Note: This method should return {@code true} even if no additional bit indexes were
+     * enabled. A {@code false} result indicates that this filter is not ensured to contain
+     * the specified decomposed Bloom filter.
+     *
+     * @param hasher the hasher to provide the indexes
     * @return true if the merge was successful
-     * @throws IllegalArgumentException if the shape argument does not match the shape of
-     * this filter, or if the hasher is not the specified one
+     * @throws IllegalArgumentException if the hasher cannot generate indices for the shape of
+     * this filter
     */
    boolean merge(Hasher hasher);

+    // Counting Operations
+
    /**
-     * Performs a logical "OR" with the other Bloom filter and returns the cardinality of
-     * the result.
+     * Gets the cardinality (number of enabled bits) of this Bloom filter.
     *
-     * @param other the other Bloom filter.
-     * @return the cardinality of the result of {@code ( this OR other )}.
+     * <p>This is also known as the Hamming value.</p>
+     *
+     * @return the cardinality of this filter
+     */
+    int cardinality();
+
+    /**
+     * Performs a logical "AND" with the other Bloom filter and returns the cardinality
+     * (number of enabled bits) of the result.
+     *
+     * @param other the other Bloom filter
+     * @return the cardinality of the result of {@code (this AND other)}
+     */
+    int andCardinality(BloomFilter other);
+
+    /**
+     * Performs a logical "OR" with the other Bloom filter and returns the cardinality
+     * (number of enabled bits) of the result.
+     *
+     * @param other the other Bloom filter
+     * @return the cardinality of the result of {@code (this OR other)}
     */
    int orCardinality(BloomFilter other);

    /**
-     * Performs a logical "XOR" with the other Bloom filter and returns the cardinality of
-     * the result.
+     * Performs a logical "XOR" with the other Bloom filter and returns the cardinality
+     * (number of enabled bits) of the result.
     *
-     * @param other the other Bloom filter.
-     * @return the cardinality of the result of {@code ( this XOR other )}
+     * @param other the other Bloom filter
+     * @return the cardinality of the result of {@code (this XOR other)}
     */
    int xorCardinality(BloomFilter other);
 }
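
Note on the BloomFilter javadoc above: the bit layout described for getBits() and the "(this AND other) == other" wording for contains can be checked with plain long[] arrays. The following is a minimal sketch under stated assumptions, not the library implementation. The class BitSemanticsSketch and all of its methods are hypothetical names made up for illustration; only the semantics (ceil(m / 64) longs, bits 0-63 in the first long, cardinality as the number of enabled bits) come from the javadoc.

```java
// Hypothetical helper class for illustration; not part of commons-collections.
final class BitSemanticsSketch {

    /** Number of longs needed to hold m bits, i.e. ceil(m / 64). */
    static int numberOfLongs(int m) {
        return (m + 63) / 64;
    }

    /** Builds the long[] form: bit index i lives in long i / 64 at position i % 64. */
    static long[] toBits(int m, int... indexes) {
        long[] bits = new long[numberOfLongs(m)];
        for (int i : indexes) {
            bits[i >>> 6] |= 1L << (i & 63);
        }
        return bits;
    }

    /** True if every bit enabled in other is also enabled in self: (self AND other) == other. */
    static boolean contains(long[] self, long[] other) {
        for (int i = 0; i < self.length; i++) {
            if ((self[i] & other[i]) != other[i]) {
                return false;
            }
        }
        return true;
    }

    /** Cardinality: the number of enabled bits. */
    static int cardinality(long[] bits) {
        int sum = 0;
        for (long b : bits) {
            sum += Long.bitCount(b);
        }
        return sum;
    }

    /** Cardinality of (a AND b); both arrays are assumed to have the same length. */
    static int andCardinality(long[] a, long[] b) {
        int sum = 0;
        for (int i = 0; i < a.length; i++) {
            sum += Long.bitCount(a[i] & b[i]);
        }
        return sum;
    }

    /** Cardinality of (a OR b). */
    static int orCardinality(long[] a, long[] b) {
        int sum = 0;
        for (int i = 0; i < a.length; i++) {
            sum += Long.bitCount(a[i] | b[i]);
        }
        return sum;
    }

    /** Cardinality of (a XOR b). */
    static int xorCardinality(long[] a, long[] b) {
        int sum = 0;
        for (int i = 0; i < a.length; i++) {
            sum += Long.bitCount(a[i] ^ b[i]);
        }
        return sum;
    }

    public static void main(String[] args) {
        long[] self = toBits(72, 1, 64, 70);   // 72 bits need 2 longs
        long[] other = toBits(72, 1, 70);
        System.out.println(self.length);              // 2
        System.out.println(contains(self, other));    // true
        System.out.println(contains(other, self));    // false
        System.out.println(cardinality(self));        // 3
        System.out.println(xorCardinality(self, other)); // 1
    }
}
```

The sketch assumes both arrays were built for the same number of bits, which mirrors the shape check that the javadoc documents as an IllegalArgumentException.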
--- a/src/main/java/org/apache/commons/collections4/bloomfilter/CountingBloomFilter.java
+++ b/src/main/java/org/apache/commons/collections4/bloomfilter/CountingBloomFilter.java
@@ -72,86 +72,7 @@ public interface CountingBloomFilter extends BloomFilter {
        void accept(int index, int count);
    }

-    /**
-     * {@inheritDoc}
-     *
-     * <p>Note: If the other filter is a counting Bloom filter the index counts are ignored.
-     * All counts for the indexes identified by the other filter will be incremented by 1.
-     *
-     * <p>This method will return true if the filter is valid after the operation.
-     */
-    @Override
-    boolean merge(BloomFilter other);
-
-    /**
-     * {@inheritDoc}
-     *
-     * <p>Note: If the hasher contains duplicate bit indexes these are ignored.
-     * All counts for the indexes identified by the other filter will be incremented by 1.
-     *
-     * <p>This method will return true if the filter is valid after the operation.
-     */
-    @Override
-    boolean merge(Hasher other);
-
-    /**
-     * Removes the other Bloom filter from this one.
-     * All counts for the indexes identified by the other filter will be decremented by 1.
-     *
-     * <p>This method will return true if the filter is valid after the operation.
-     *
-     * @param other the other Bloom filter
-     * @return true if the removal was successful and the state is valid
-     * @throws IllegalArgumentException if the shape of the other filter does not match
-     * the shape of this filter
-     * @see #isValid()
-     */
-    boolean remove(BloomFilter other);
-
-    /**
-     * Removes the decomposed Bloom filter defined by the hasher from this Bloom filter.
-     * All counts for the indexes identified by the hasher will be decremented by 1.
-     * Duplicate indexes should be ignored.
-     *
-     * <p>This method will return true if the filter is valid after the operation.
-     *
-     * @param hasher the hasher to provide the indexes
-     * @return true if the removal was successful and the state is valid
-     * @throws IllegalArgumentException if the hasher cannot generate indices for the shape of
-     * this filter
-     * @see #isValid()
-     */
-    boolean remove(Hasher hasher);
-
-    /**
-     * Adds the other counting Bloom filter to this one.
-     * All counts for the indexes identified by the other filter will be incremented by their
-     * corresponding counts in the other filter.
-     *
-     * <p>This method will return true if the filter is valid after the operation.
-     *
-     * @param other the other counting Bloom filter
-     * @return true if the addition was successful and the state is valid
-     * @throws IllegalArgumentException if the shape of the other filter does not match
-     * the shape of this filter
-     * @see #isValid()
-     */
-    boolean add(CountingBloomFilter other);
-
-    /**
-     * Subtracts the other counting Bloom filter from this one.
-     * All counts for the indexes identified by the other filter will be decremented by their
-     * corresponding counts in the other filter.
-     *
-     * <p>This method will return true if the filter is valid after the operation.
-     *
-     * @param other the other counting Bloom filter
-     * @return true if the subtraction was successful and the state is valid
-     * @throws IllegalArgumentException if the shape of the other filter does not match
-     * the shape of this filter
-     * @see #isValid()
-     */
-    boolean subtract(CountingBloomFilter other);
+    // Query Operations

    /**
     * Returns true if the internal state is valid. This flag is a warning that an addition or
@@ -180,4 +101,101 @@ public interface CountingBloomFilter extends BloomFilter {
     * @throws NullPointerException if the specified action is null
     */
    void forEachCount(BitCountConsumer action);
+
+    // Modification Operations
+
+    /**
+     * Merges the specified Bloom filter into this Bloom filter. Specifically all counts for
+     * indexes that are enabled in the {@code other} filter will be incremented by 1.
+     *
+     * <p>Note: If the other filter is a counting Bloom filter the index counts are ignored; only
+     * the enabled indexes are used.
+     *
+     * <p>This method will return true if the filter is valid after the operation.
+     *
+     * @param other {@inheritDoc}
+     * @return true if the merge was successful and the state is valid
+     * @throws IllegalArgumentException {@inheritDoc}
+     * @see #isValid()
+     */
+    @Override
+    boolean merge(BloomFilter other);
+
+    /**
+     * Merges the specified decomposed Bloom filter into this Bloom filter. Specifically all
+     * counts for the <em>distinct</em> indexes that are identified by the {@code hasher} will
+     * be incremented by 1. If the {@code hasher} contains duplicate bit indexes these are ignored.
+     *
+     * <p>This method will return true if the filter is valid after the operation.
+     *
+     * @param hasher {@inheritDoc}
+     * @return true if the merge was successful and the state is valid
+     * @throws IllegalArgumentException {@inheritDoc}
+     * @see #isValid()
+     */
+    @Override
+    boolean merge(Hasher hasher);
+
+    /**
+     * Removes the specified Bloom filter from this Bloom filter. Specifically
+     * all counts for the indexes identified by the {@code other} filter will be decremented by 1.
+     *
+     * <p>Note: If the other filter is a counting Bloom filter the index counts are ignored; only
+     * the enabled indexes are used.
+     *
+     * <p>This method will return true if the filter is valid after the operation.
+     *
+     * @param other the other Bloom filter
+     * @return true if the removal was successful and the state is valid
+     * @throws IllegalArgumentException if the shape of the other filter does not match
+     * the shape of this filter
+     * @see #isValid()
+     * @see #subtract(CountingBloomFilter)
+     */
+    boolean remove(BloomFilter other);
+
+    /**
+     * Removes the specified decomposed Bloom filter from this Bloom filter. Specifically
+     * all counts for the <em>distinct</em> indexes identified by the {@code hasher} will be
+     * decremented by 1. If the {@code hasher} contains duplicate bit indexes these are ignored.
+     *
+     * <p>This method will return true if the filter is valid after the operation.
+     *
+     * @param hasher the hasher to provide the indexes
+     * @return true if the removal was successful and the state is valid
+     * @throws IllegalArgumentException if the hasher cannot generate indices for the shape of
+     * this filter
+     * @see #isValid()
+     */
+    boolean remove(Hasher hasher);
+
+    /**
+     * Adds the specified counting Bloom filter to this Bloom filter. Specifically
+     * all counts for the indexes identified by the {@code other} filter will be incremented
+     * by their corresponding counts in the {@code other} filter.
+     *
+     * <p>This method will return true if the filter is valid after the operation.
+     *
+     * @param other the other counting Bloom filter
+     * @return true if the addition was successful and the state is valid
+     * @throws IllegalArgumentException if the shape of the other filter does not match
+     * the shape of this filter
+     * @see #isValid()
+     */
+    boolean add(CountingBloomFilter other);
+
+    /**
+     * Subtracts the specified counting Bloom filter from this Bloom filter. Specifically
+     * all counts for the indexes identified by the {@code other} filter will be decremented
+     * by their corresponding counts in the {@code other} filter.
+     *
+     * <p>This method will return true if the filter is valid after the operation.
+     *
+     * @param other the other counting Bloom filter
+     * @return true if the subtraction was successful and the state is valid
+     * @throws IllegalArgumentException if the shape of the other filter does not match
+     * the shape of this filter
+     * @see #isValid()
+     */
+    boolean subtract(CountingBloomFilter other);
 }
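
Note on the CountingBloomFilter javadoc above: the difference between merge/remove (each distinct index changes by 1) and add/subtract (each index changes by the other filter's count) may be easier to see with a plain int[] of counts. This is a rough sketch, not the library implementation. The class CountingSketch and its methods are hypothetical, index arrays stand in for a hasher, and the validity rule (the state becomes invalid once any count goes negative, through underflow or int overflow) is an assumption, since the isValid() javadoc is only partly visible in this diff.

```java
import java.util.stream.IntStream;

// Hypothetical class for illustration; not part of commons-collections.
final class CountingSketch {
    private final int[] counts;
    // Assumption: ORs in every updated count so the sign bit records any negative value.
    private int state;

    CountingSketch(int numberOfBits) {
        counts = new int[numberOfBits];
    }

    int count(int index) {
        return counts[index];
    }

    boolean isValid() {
        return state >= 0;
    }

    private void update(int index, int delta) {
        counts[index] += delta;
        state |= counts[index];
    }

    /** Increments the count of each distinct index by 1; duplicates are ignored. */
    boolean merge(int... indexes) {
        IntStream.of(indexes).distinct().forEach(i -> update(i, 1));
        return isValid();
    }

    /** Decrements the count of each distinct index by 1; duplicates are ignored. */
    boolean remove(int... indexes) {
        IntStream.of(indexes).distinct().forEach(i -> update(i, -1));
        return isValid();
    }

    /** Increments each count by the corresponding count in other. */
    boolean add(CountingSketch other) {
        for (int i = 0; i < counts.length; i++) {
            update(i, other.counts[i]);
        }
        return isValid();
    }

    /** Decrements each count by the corresponding count in other. */
    boolean subtract(CountingSketch other) {
        for (int i = 0; i < counts.length; i++) {
            update(i, -other.counts[i]);
        }
        return isValid();
    }

    public static void main(String[] args) {
        CountingSketch a = new CountingSketch(8);
        a.merge(1, 3, 3);            // index 3 is counted once
        CountingSketch b = new CountingSketch(8);
        b.merge(3);
        b.merge(3);                  // b now holds count 2 at index 3
        a.add(b);                    // a holds 1 + 2 = 3 at index 3
        System.out.println(a.count(3));
        a.subtract(b);               // back to 1
        System.out.println(a.count(3));
        a.remove(3);                 // 0, still valid
        System.out.println(a.remove(3)); // count drops to -1, so the state is invalid
    }
}
```

In this sketch the invalid state is sticky: once a count has gone negative, isValid() stays false even if later operations bring the count back up. Whether the real interface behaves this way is not stated in the visible javadoc.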