Add default methods to add a CharSequence.
Make it clear each object added to the Builder should represent an
entire item.
Document that build() should reset the builder for future use.
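
A minimal sketch of the builder contract described above. The names
ItemBuilder and ListItemBuilder are hypothetical and are not the library
API. The sketch assumes the byte[] method is the primitive operation,
that the CharSequence overload is a default method that delegates to it,
and that each call adds one entire item:

```java
import java.nio.charset.Charset;
import java.util.ArrayList;
import java.util.List;

// Hypothetical builder; illustrative names only, not the library API.
interface ItemBuilder {
    /** Adds one entire item as raw bytes. */
    ItemBuilder with(byte[] item);

    /** Default method: encodes the characters and adds them as one item. */
    default ItemBuilder with(CharSequence item, Charset charset) {
        return with(item.toString().getBytes(charset));
    }

    /** Returns the collected items and resets the builder for future use. */
    List<byte[]> build();
}

final class ListItemBuilder implements ItemBuilder {
    private final List<byte[]> items = new ArrayList<>();

    @Override
    public ItemBuilder with(byte[] item) {
        // Defensive copy so later changes by the caller do not alter the item.
        items.add(item.clone());
        return this;
    }

    @Override
    public List<byte[]> build() {
        List<byte[]> result = new ArrayList<>(items);
        items.clear(); // reset so the builder can be used again
        return result;
    }
}
```

Because the charset is a parameter of the default method, the interface
does not have to name a particular encoding.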
Avoid using Arrays.deepHashCode. The array passed to deepHashCode is
always length 2. So we can unroll the same computation for the fixed 2
iterations.
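
For illustration, the unrolling can be sketched as below. PairHash is a
hypothetical class, and the sketch assumes neither element is itself an
array (deepHashCode treats nested arrays differently):

```java
import java.util.Arrays;
import java.util.Objects;

final class PairHash {
    private PairHash() {}

    /** Reference version: allocates a 2-element array on every call. */
    static int viaArray(Object a, Object b) {
        return Arrays.deepHashCode(new Object[] {a, b});
    }

    /** Unrolled version: the same arithmetic for exactly two elements. */
    static int unrolled(Object a, Object b) {
        // deepHashCode starts at 1 and applies 31 * result + hash per element;
        // a null element contributes 0, as does Objects.hashCode(null).
        return 31 * (31 + Objects.hashCode(a)) + Objects.hashCode(b);
    }
}
```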
Remove trailing periods from params and returns.
Remove the specification in the Hasher.Builder to convert the String to
bytes using the UTF-8 charset. This is an implementation detail. It has
been moved to the DynamicHasher implementation.
Update exception message for getBits to be less specific. The reference
to getName() is now obsolete.
Remove trailing periods on parameters and arguments.
Remove reference to LongBuffer. Clarify what the long[] represents in
'long[] getBits()'.
Clarify cardinality using (number of enabled bits).
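
As a sketch of that definition, assuming the long[] is a packed bit set
with 64 bit indexes per long (an assumption made here for illustration;
BitCardinality is a hypothetical helper):

```java
final class BitCardinality {
    private BitCardinality() {}

    /** Counts the enabled bits across all words of a packed bit set. */
    static int cardinality(long[] bits) {
        int count = 0;
        for (long word : bits) {
            count += Long.bitCount(word);
        }
        return count;
    }
}
```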
Rearrange BloomFilter interface methods to functional order. The order
is:
- Query operations
- Modification operations
- Counting operations
Improve javadoc for BloomFilter contains with additional information for
what 'contains' means.
Update exception message for contains/merge/add/subtract to be
consistent.
This is to support the extension to a counting Bloom filter which can
return true/false if the state is valid.
Drop redundant abstract methods from the AbstractBloomFilter that are
overrides of the BloomFilter interface.
Update documented exception conditions.
Update javadoc for the shape properties to drop AKA abbreviation.
Change 'probability of collision' to 'probability of false positives'.
Update the getProbability method to document it applies to a filter full
to the intended capacity.
An exact copy of the javadoc is redundant. It also means updates to the
parent javadoc are not picked up by the inheriting methods. It is better
to use {@inheritDoc} and add extra information.
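
A small illustration of the pattern, using hypothetical Filter and
EmptyFilter types rather than the real classes:

```java
interface Filter {
    /**
     * Gets the number of enabled bits.
     *
     * @return the cardinality
     */
    int cardinality();
}

final class EmptyFilter implements Filter {
    /**
     * {@inheritDoc}
     *
     * <p>This implementation always returns zero.</p>
     */
    @Override
    public int cardinality() {
        return 0;
    }
}
```

The inherited text is pulled in by the javadoc tool, so a later edit to
the interface comment also reaches the implementation.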
Since HashFunctionIdentity is an interface there is no control over what
is hashed. Add a hash function to the HashFunctionValidator to ensure
the hash code is the same if two hash functions are equal according to
the HashFunctionIdentity.
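
The idea can be sketched as follows. FunctionIdentity and
IdentityValidator are hypothetical stand-ins with only two properties;
the real identity has its own set of properties and the real validator
may combine them differently. The point is that the hash is computed
from exactly the values the equality test compares:

```java
import java.util.Locale;
import java.util.Objects;

// Hypothetical identity with two properties, for illustration only.
interface FunctionIdentity {
    String getName();
    boolean isSigned();
}

final class IdentityValidator {
    private IdentityValidator() {}

    /** Normalises the name so that both methods below treat it the same way. */
    private static String key(FunctionIdentity id) {
        return id.getName().toLowerCase(Locale.ROOT);
    }

    /** Equality defined by the validator, not by the implementing class. */
    static boolean areEqual(FunctionIdentity a, FunctionIdentity b) {
        return a.isSigned() == b.isSigned() && key(a).equals(key(b));
    }

    /** Hash built from the same values used by areEqual. */
    static int hash(FunctionIdentity id) {
        return Objects.hash(id.isSigned(), key(id));
    }
}
```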
Note: Since Shape is final, we use the properties directly and not
through the get methods.
Standardise the constructor assertions to functions.
Ensure Shape catches NaN probability in the constructor.
Previously NaN would result in a NaN computation for the number of bits.
When cast to int it would be zero. This change improves the error
message in the exception.
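
A sketch of such a check. ProbabilityCheck is a hypothetical helper, and
the open interval (0, 1) and the message text are assumptions made for
illustration. The condition is written so that NaN fails it, because any
comparison involving NaN is false:

```java
final class ProbabilityCheck {
    private ProbabilityCheck() {}

    /** Returns the probability if it lies in (0, 1); NaN is rejected. */
    static double check(double probability) {
        // Negating the "valid" test means NaN falls into the error branch.
        if (!(probability > 0.0 && probability < 1.0)) {
            throw new IllegalArgumentException(
                "Probability must be in the range (0, 1): " + probability);
        }
        return probability;
    }

    /** Shows the silent failure mode: NaN cast to int is zero. */
    static int nanAsInt() {
        return (int) Double.NaN;
    }
}
```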
Clean-up javadocs.
Ensure Shape is final. If not final then the rest of the Bloom filter
API cannot assume that a Shape is valid as it may be extended and the
computations changed.
This is done by duplicating the and/or/xor cardinality tests and merge
tests in the AbstractBloomFilterTest using the current filter type
(provided via abstract methods) and a generic BloomFilter
implementation.
Coverage cannot reach 100% because assert statements have been included
that test assumptions. These asserts are unreachable if the StaticHasher
functions as expected and returns an iterator with at least 1 value when
it reports a non-zero size.
Anything that fails in after_success is ignored in Travis reporting. The
checks must be done in the main script.
The japicmp was failing due to the lack of a jar to compare and so
coveralls was then not submitting. No coverage reports have been logged
by coveralls since June 2017.
Partially removes changes made in commit:
6ad69bedd3
The class requires a revision to handle add/subtract of another
CountingBloomFilter. Restore the tests to check that counting filters
are merged in the same way as any non-counting filter type.
Remove the javadoc from the CountingBloomFilter methods that states they
use the counts when merge/remove are called with a CountingBloomFilter,
as this is not what the implementation currently does.
Fix the merge with a hasher to only increment the count by 1 even if the
hasher contains duplicates. Add test to verify this works as documented.
This matches the remove functionality which removes duplicates before
subtraction.
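
The intended behaviour can be sketched as below. CountMerge is a
hypothetical helper that takes the hasher output as a plain int[] of bit
indexes. It is not the CountingBloomFilter code, and it assumes every
index is within the bounds of the counts array:

```java
import java.util.HashSet;
import java.util.Set;

final class CountMerge {
    private CountMerge() {}

    /** Increments each distinct index once, ignoring duplicate indexes. */
    static void merge(int[] counts, int[] indexes) {
        Set<Integer> seen = new HashSet<>();
        for (int index : indexes) {
            // Set.add returns false for an index that was already processed.
            if (seen.add(index)) {
                counts[index]++;
            }
        }
    }
}
```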
Fixed all code. Case statements have an indent of zero as recommended by
Oracle. Previously the code was using either 0 or 4 as the indent. Using
zero for the check has fewer violations that require fixing.