Updated the BloomFilter javadoc.

Remove trailing periods on parameters and arguments.

Remove reference to LongBuffer. Clarify what the long[] represents in
'long[] getBits()'.

Clarify cardinality using (number of enabled bits).

Rearrange BloomFilter interface methods to functional order. The order
is:

- Query operations
- Modification operations
- Counting operations

Improve the javadoc for BloomFilter contains with additional detail on
what 'contains' means.

Update exception message for contains/merge/add/subtract to be
consistent.
Alex Herbert 2020-03-15 21:45:05 +00:00
parent 86bac5e602
commit 9de28a7b62
2 changed files with 179 additions and 133 deletions


@@ -26,48 +26,24 @@ import org.apache.commons.collections4.bloomfilter.hasher.StaticHasher;
*/
public interface BloomFilter {
/**
* Performs a logical "AND" with the other Bloom filter and returns the cardinality of
* the result.
*
* @param other the other Bloom filter.
* @return the cardinality of the result of {@code ( this AND other )}.
*/
int andCardinality(BloomFilter other);
// Query Operations
/**
* Gets the cardinality of this Bloom filter.
* <p>This is also known as the Hamming value.</p>
* Gets the shape of this filter.
*
* @return the cardinality (number of enabled bits) in this filter.
* @return the shape of this filter
*/
int cardinality();
Shape getShape();
/**
* Performs a contains check. Effectively this AND other == other.
* Gets an array of little-endian long values representing the bits of this filter.
*
* @param other the Other Bloom filter.
* @return true if this filter matches the other.
*/
boolean contains(BloomFilter other);
/**
* Performs a contains check against a decomposed Bloom filter. The shape must match
* the shape of this filter. The hasher provides bit indexes to check for. Effectively
* decomposed AND this == decomposed.
* <p>The returned array will have length {@code ceil(m / 64)} where {@code m} is the
* number of bits in the filter and {@code ceil} is the ceiling function.
* Bits 0-63 are in the first long. A value of 1 at a bit position indicates the bit
* index is enabled.
*
* @param hasher The hasher containing the bits to check.
* @return true if this filter contains the other.
* @throws IllegalArgumentException if the shape argument does not match the shape of
* this filter, or if the hasher is not the specified one
*/
boolean contains(Hasher hasher);
/**
* Gets an array of little-endian long values representing the on bits of this filter.
* bits 0-63 are in the first long.
*
* @return the LongBuffer representation of this filter.
* @return the {@code long[]} representation of this filter
*/
long[] getBits();
@@ -75,51 +51,103 @@ public interface BloomFilter {
* Creates a StaticHasher that contains the indexes of the bits that are on in this
* filter.
*
* @return a StaticHasher for that produces this Bloom filter.
* @return a StaticHasher that produces this Bloom filter
*/
StaticHasher getHasher();
/**
* Gets the shape of this filter.
* Returns {@code true} if this filter contains the specified filter. Specifically this
* returns {@code true} if this filter is enabled for all bits that are enabled in the
* {@code other} filter. Using the bit representations this is
* effectively {@code (this AND other) == other}.
*
* @return The shape of this filter.
* @param other the other Bloom filter
* @return true if this filter is enabled for all enabled bits in the other filter
* @throws IllegalArgumentException if the shape of the other filter does not match
* the shape of this filter
*/
Shape getShape();
boolean contains(BloomFilter other);
/**
* Merges the other Bloom filter into this one.
* Returns {@code true} if this filter contains the specified decomposed Bloom filter.
* Specifically this returns {@code true} if this filter is enabled for all bit indexes
* identified by the {@code hasher}. Using the bit representations this is
* effectively {@code (this AND hasher) == hasher}.
*
* @param other the other Bloom filter.
* @param hasher the hasher to provide the indexes
* @return true if this filter is enabled for all bits specified by the hasher
* @throws IllegalArgumentException if the hasher cannot generate indices for the shape of
* this filter
*/
boolean contains(Hasher hasher);
// Modification Operations
/**
* Merges the specified Bloom filter into this Bloom filter. Specifically all bit indexes
* that are enabled in the {@code other} filter will be enabled in this filter.
*
* <p>Note: This method should return {@code true} even if no additional bit indexes were
* enabled. A {@code false} result indicates that this filter is not guaranteed to contain
* the {@code other} Bloom filter.
*
* @param other the other Bloom filter
* @return true if the merge was successful
* @throws IllegalArgumentException if the shape of the other filter does not match
* the shape of this filter
*/
boolean merge(BloomFilter other);
/**
* Merges the decomposed Bloom filter defined by the hasher into this Bloom
* filter. The hasher provides an iterator of bit indexes to enable.
* Merges the specified decomposed Bloom filter into this Bloom filter. Specifically all
* bit indexes that are identified by the {@code hasher} will be enabled in this filter.
*
* @param hasher the hasher to provide the indexes.
* <p>Note: This method should return {@code true} even if no additional bit indexes were
* enabled. A {@code false} result indicates that this filter is not guaranteed to contain
* the specified decomposed Bloom filter.
*
* @param hasher the hasher to provide the indexes
* @return true if the merge was successful
* @throws IllegalArgumentException if the shape argument does not match the shape of
* this filter, or if the hasher is not the specified one
* @throws IllegalArgumentException if the hasher cannot generate indices for the shape of
* this filter
*/
boolean merge(Hasher hasher);
// Counting Operations
/**
* Performs a logical "OR" with the other Bloom filter and returns the cardinality of
* the result.
* Gets the cardinality (number of enabled bits) of this Bloom filter.
*
* @param other the other Bloom filter.
* @return the cardinality of the result of {@code ( this OR other )}.
* <p>This is also known as the Hamming value.</p>
*
* @return the cardinality of this filter
*/
int cardinality();
/**
* Performs a logical "AND" with the other Bloom filter and returns the cardinality
* (number of enabled bits) of the result.
*
* @param other the other Bloom filter
* @return the cardinality of the result of {@code (this AND other)}
*/
int andCardinality(BloomFilter other);
/**
* Performs a logical "OR" with the other Bloom filter and returns the cardinality
* (number of enabled bits) of the result.
*
* @param other the other Bloom filter
* @return the cardinality of the result of {@code (this OR other)}
*/
int orCardinality(BloomFilter other);
/**
* Performs a logical "XOR" with the other Bloom filter and returns the cardinality of
* the result.
* Performs a logical "XOR" with the other Bloom filter and returns the cardinality
* (number of enabled bits) of the result.
*
* @param other the other Bloom filter.
* @return the cardinality of the result of {@code ( this XOR other )}
* @param other the other Bloom filter
* @return the cardinality of the result of {@code (this XOR other)}
*/
int xorCardinality(BloomFilter other);
}
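
Note on the javadoc above: the bit layout described for getBits() and the
"(this AND other) == other" wording for contains can be illustrated with plain
long[] arrays. The following is a minimal sketch under stated assumptions, not
the library implementation. The class BitLayoutSketch and all of its methods
are hypothetical names introduced only for this illustration, and an array
length mismatch stands in for a shape mismatch.

```java
/** Hypothetical helper for illustration only; not part of Commons Collections. */
final class BitLayoutSketch {

    private BitLayoutSketch() {}

    /** Packs bit indexes into little-endian longs: index i is bit (i % 64) of word (i / 64). */
    static long[] toBits(int numberOfBits, int... indexes) {
        // ceil(m / 64) words, matching the length described for getBits()
        long[] bits = new long[(numberOfBits + 63) / 64];
        for (int index : indexes) {
            if (index < 0 || index >= numberOfBits) {
                throw new IllegalArgumentException("Index out of range: " + index);
            }
            bits[index >>> 6] |= 1L << (index & 63);
        }
        return bits;
    }

    /** Returns true if every bit enabled in other is also enabled in self. */
    static boolean contains(long[] self, long[] other) {
        checkSameLength(self, other);
        for (int i = 0; i < self.length; i++) {
            if ((self[i] & other[i]) != other[i]) {
                return false;
            }
        }
        return true;
    }

    /** Number of enabled bits (the Hamming value). */
    static int cardinality(long[] bits) {
        int count = 0;
        for (long word : bits) {
            count += Long.bitCount(word);
        }
        return count;
    }

    /** Number of enabled bits in (a AND b). */
    static int andCardinality(long[] a, long[] b) {
        checkSameLength(a, b);
        int count = 0;
        for (int i = 0; i < a.length; i++) {
            count += Long.bitCount(a[i] & b[i]);
        }
        return count;
    }

    /** Number of enabled bits in (a OR b). */
    static int orCardinality(long[] a, long[] b) {
        checkSameLength(a, b);
        int count = 0;
        for (int i = 0; i < a.length; i++) {
            count += Long.bitCount(a[i] | b[i]);
        }
        return count;
    }

    /** Number of enabled bits in (a XOR b). */
    static int xorCardinality(long[] a, long[] b) {
        checkSameLength(a, b);
        int count = 0;
        for (int i = 0; i < a.length; i++) {
            count += Long.bitCount(a[i] ^ b[i]);
        }
        return count;
    }

    // Assumption: two arrays of equal length stand in for two filters of equal shape.
    private static void checkSameLength(long[] a, long[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("Array lengths differ");
        }
    }
}
```

For a 72-bit filter, toBits(72, 0, 3, 64) yields two longs: the first holds
bits 0 and 3, the second holds bit 64 in its lowest position. A filter built
from indexes 3 and 64 is contained in it, but not the other way round.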


@@ -72,86 +72,7 @@ public interface CountingBloomFilter extends BloomFilter {
void accept(int index, int count);
}
/**
* {@inheritDoc}
*
* <p>Note: If the other filter is a counting Bloom filter the index counts are ignored.
* All counts for the indexes identified by the other filter will be incremented by 1.
*
* <p>This method will return true if the filter is valid after the operation.
*/
@Override
boolean merge(BloomFilter other);
/**
* {@inheritDoc}
*
* <p>Note: If the hasher contains duplicate bit indexes these are ignored.
* All counts for the indexes identified by the other filter will be incremented by 1.
*
* <p>This method will return true if the filter is valid after the operation.
*/
@Override
boolean merge(Hasher other);
/**
* Removes the other Bloom filter from this one.
* All counts for the indexes identified by the other filter will be decremented by 1.
*
* <p>This method will return true if the filter is valid after the operation.
*
* @param other the other Bloom filter
* @return true if the removal was successful and the state is valid
* @throws IllegalArgumentException if the shape of the other filter does not match
* the shape of this filter
* @see #isValid()
*/
boolean remove(BloomFilter other);
/**
* Removes the decomposed Bloom filter defined by the hasher from this Bloom filter.
* All counts for the indexes identified by the hasher will be decremented by 1.
* Duplicate indexes should be ignored.
*
* <p>This method will return true if the filter is valid after the operation.
*
* @param hasher the hasher to provide the indexes
* @return true if the removal was successful and the state is valid
* @throws IllegalArgumentException if the hasher cannot generate indices for the shape of
* this filter
* @see #isValid()
*/
boolean remove(Hasher hasher);
/**
* Adds the other counting Bloom filter to this one.
* All counts for the indexes identified by the other filter will be incremented by their
* corresponding counts in the other filter.
*
* <p>This method will return true if the filter is valid after the operation.
*
* @param other the other counting Bloom filter
* @return true if the addition was successful and the state is valid
* @throws IllegalArgumentException if the shape of the other filter does not match
* the shape of this filter
* @see #isValid()
*/
boolean add(CountingBloomFilter other);
/**
* Subtracts the other counting Bloom filter from this one.
* All counts for the indexes identified by the other filter will be decremented by their
* corresponding counts in the other filter.
*
* <p>This method will return true if the filter is valid after the operation.
*
* @param other the other counting Bloom filter
* @return true if the subtraction was successful and the state is valid
* @throws IllegalArgumentException if the shape of the other filter does not match
* the shape of this filter
* @see #isValid()
*/
boolean subtract(CountingBloomFilter other);
// Query Operations
/**
* Returns true if the internal state is valid. This flag is a warning that an addition or
@@ -180,4 +101,101 @@ public interface CountingBloomFilter extends BloomFilter {
* @throws NullPointerException if the specified action is null
*/
void forEachCount(BitCountConsumer action);
// Modification Operations
/**
* Merges the specified Bloom filter into this Bloom filter. Specifically all counts for
* indexes that are enabled in the {@code other} filter will be incremented by 1.
*
* <p>Note: If the other filter is a counting Bloom filter the index counts are ignored; only
* the enabled indexes are used.
*
* <p>This method will return true if the filter is valid after the operation.
*
* @param other {@inheritDoc}
* @return true if the merge was successful and the state is valid
* @throws IllegalArgumentException {@inheritDoc}
* @see #isValid()
*/
@Override
boolean merge(BloomFilter other);
/**
* Merges the specified decomposed Bloom filter into this Bloom filter. Specifically all
* counts for the <em>distinct</em> indexes that are identified by the {@code hasher} will
* be incremented by 1. If the {@code hasher} contains duplicate bit indexes these are ignored.
*
* <p>This method will return true if the filter is valid after the operation.
*
* @param hasher {@inheritDoc}
* @return true if the merge was successful and the state is valid
* @throws IllegalArgumentException {@inheritDoc}
* @see #isValid()
*/
@Override
boolean merge(Hasher hasher);
/**
* Removes the specified Bloom filter from this Bloom filter. Specifically
* all counts for the indexes identified by the {@code other} filter will be decremented by 1.
*
* <p>Note: If the other filter is a counting Bloom filter the index counts are ignored; only
* the enabled indexes are used.
*
* <p>This method will return true if the filter is valid after the operation.
*
* @param other the other Bloom filter
* @return true if the removal was successful and the state is valid
* @throws IllegalArgumentException if the shape of the other filter does not match
* the shape of this filter
* @see #isValid()
* @see #subtract(CountingBloomFilter)
*/
boolean remove(BloomFilter other);
/**
* Removes the specified decomposed Bloom filter from this Bloom filter. Specifically
* all counts for the <em>distinct</em> indexes identified by the {@code hasher} will be
* decremented by 1. If the {@code hasher} contains duplicate bit indexes these are ignored.
*
* <p>This method will return true if the filter is valid after the operation.
*
* @param hasher the hasher to provide the indexes
* @return true if the removal was successful and the state is valid
* @throws IllegalArgumentException if the hasher cannot generate indices for the shape of
* this filter
* @see #isValid()
*/
boolean remove(Hasher hasher);
/**
* Adds the specified counting Bloom filter to this Bloom filter. Specifically
* all counts for the indexes identified by the {@code other} filter will be incremented
* by their corresponding counts in the {@code other} filter.
*
* <p>This method will return true if the filter is valid after the operation.
*
* @param other the other counting Bloom filter
* @return true if the addition was successful and the state is valid
* @throws IllegalArgumentException if the shape of the other filter does not match
* the shape of this filter
* @see #isValid()
*/
boolean add(CountingBloomFilter other);
/**
* Subtracts the specified counting Bloom filter from this Bloom filter. Specifically
* all counts for the indexes identified by the {@code other} filter will be decremented
* by their corresponding counts in the {@code other} filter.
*
* <p>This method will return true if the filter is valid after the operation.
*
* @param other the other counting Bloom filter
* @return true if the subtraction was successful and the state is valid
* @throws IllegalArgumentException if the shape of the other filter does not match
* the shape of this filter
* @see #isValid()
*/
boolean subtract(CountingBloomFilter other);
}
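
Note on the javadoc above: the difference between merge/remove (each distinct
index changes by 1) and add/subtract (each index changes by the other filter's
count) can be sketched with an int[] of counts. This is a minimal illustration
under stated assumptions, not the library implementation. The class
CountingSketch is a hypothetical name, plain int indexes stand in for a Hasher,
and validity is assumed to mean "no count has ever gone negative", which is
tracked here by OR-ing every written count into a single state field.

```java
import java.util.BitSet;

/** Hypothetical counting structure for illustration only; not part of Commons Collections. */
final class CountingSketch {
    private final int[] counts;
    /** Bitwise OR of every count written; a negative value records an overflow or underflow. */
    private int state;

    CountingSketch(int numberOfBits) {
        counts = new int[numberOfBits];
    }

    /** Increments the count of each distinct index by 1; duplicates are ignored. */
    boolean merge(int... indexes) {
        return applyDistinct(indexes, 1);
    }

    /** Decrements the count of each distinct index by 1; duplicates are ignored. */
    boolean remove(int... indexes) {
        return applyDistinct(indexes, -1);
    }

    /** Increments each count by the corresponding count in other. */
    boolean add(CountingSketch other) {
        checkShape(other);
        for (int i = 0; i < counts.length; i++) {
            update(i, other.counts[i]);
        }
        return isValid();
    }

    /** Decrements each count by the corresponding count in other. */
    boolean subtract(CountingSketch other) {
        checkShape(other);
        for (int i = 0; i < counts.length; i++) {
            update(i, -other.counts[i]);
        }
        return isValid();
    }

    /** Returns false once any count has become negative (assumed meaning of "valid"). */
    boolean isValid() {
        return state >= 0;
    }

    int count(int index) {
        return counts[index];
    }

    private boolean applyDistinct(int[] indexes, int delta) {
        // Validate everything first so a bad index does not leave a partial update.
        for (int index : indexes) {
            if (index < 0 || index >= counts.length) {
                throw new IllegalArgumentException("Index out of range: " + index);
            }
        }
        BitSet seen = new BitSet(counts.length);
        for (int index : indexes) {
            if (!seen.get(index)) {
                seen.set(index);
                update(index, delta);
            }
        }
        return isValid();
    }

    private void update(int index, int delta) {
        counts[index] += delta;
        state |= counts[index];
    }

    // Assumption: equal array length stands in for equal shape.
    private void checkShape(CountingSketch other) {
        if (other.counts.length != counts.length) {
            throw new IllegalArgumentException("Shapes differ");
        }
    }
}
```

In this sketch merge(1, 1, 5) raises the counts at indexes 1 and 5 by one each,
whereas add(other) raises index 1 by whatever count other holds there. Removing
an index whose count is already zero makes the method return false, and
isValid() stays false afterwards.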