Add default methods to add a CharSequence.
Make it clear each object added to the Builder should represent an
entire item.
Document that build() should reset the builder for future use.
Avoid using Arrays.deepHashCode. The array passed to deepHashCode is
always length 2. So we can unroll the same computation for the fixed 2
iterations.
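A minimal sketch of the unrolling, assuming both elements are ordinary objects and not nested arrays. The class and method names are illustrative and not taken from the codebase. Arrays.deepHashCode starts at 1 and applies 31 * result + elementHash per element, so two iterations reduce to a single expression:

```java
import java.util.Arrays;

final class UnrolledHash {
    /** Reference form: allocates a two-element array for every call. */
    static int viaDeepHashCode(Object a, Object b) {
        return Arrays.deepHashCode(new Object[] {a, b});
    }

    /** Unrolled form: same arithmetic for non-array elements, no allocation. */
    static int unrolled(Object a, Object b) {
        // First iteration: 31 * 1 + hash(a)
        final int first = 31 + hash(a);
        // Second iteration: 31 * previous + hash(b)
        return 31 * first + hash(b);
    }

    /** Null elements hash to zero, as in Arrays.deepHashCode. */
    private static int hash(Object o) {
        return o == null ? 0 : o.hashCode();
    }
}
```

The two forms only diverge when an element is itself an array, which deepHashCode hashes by content.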
Remove trailing periods from params and returns.
Remove the specification in the Hasher.Builder to convert the String to
bytes using the UTF-8 charset. This is an implementation detail. It has
been moved to the DynamicHasher implementation.
Update exception message for getBits to be less specific. The reference
to getName() is now obsolete.
Remove trailing periods on parameters and arguments.
Remove reference to LongBuffer. Clarify what the long[] represents in
'long[] getBits()'.
Clarify cardinality using (number of enabled bits).
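For illustration, cardinality as the number of enabled bits can be computed from a long[] as sketched below. The layout, with bit index i stored in word i / 64 at position i % 64, is an assumption made for this example, and the class and method names are hypothetical:

```java
final class BitCardinality {
    /** Counts the enabled bits across all words. */
    static int cardinality(long[] bits) {
        int count = 0;
        for (final long word : bits) {
            count += Long.bitCount(word);
        }
        return count;
    }

    /** Tests a single bit index under the assumed layout. */
    static boolean isSet(long[] bits, int index) {
        // A long shift uses only the low 6 bits of the distance, i.e. index % 64
        return (bits[index >>> 6] & (1L << index)) != 0;
    }
}
```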
Rearrange BloomFilter interface methods to functional order. The order
is:
- Query operations
- Modification operations
- Counting operations
Improve javadoc for BloomFilter contains with additional information for
what 'contains' means.
Update exception message for contains/merge/add/subtract to be
consistent.
This is to support the extension to a counting Bloom filter which can
return true/false if the state is valid.
Drop redundant abstract methods from the AbstractBloomFilter that are
overrides of the BloomFilter interface.
Update documented exception conditions.
Update javadoc for the shape properties to drop AKA abbreviation.
Change Probability of collision to Probability of False positives.
Update the getProbability method to document it applies to a filter full
to the intended capacity.
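As a worked example, the standard approximation for the false positive probability of a filter holding n items in m bits with k hash functions is (1 - e^(-kn/m))^k. The sketch below uses that textbook formula; it is not confirmed here to be the exact computation in Shape, and the names are illustrative:

```java
final class FalsePositiveEstimate {
    /**
     * Estimates the false positive probability for a filter
     * that holds its intended number of items.
     */
    static double probability(int numberOfItems, int numberOfBits, int numberOfHashFunctions) {
        final double exponent = -1.0 * numberOfHashFunctions * numberOfItems / numberOfBits;
        return Math.pow(1.0 - Math.exp(exponent), numberOfHashFunctions);
    }
}
```

For 1000 items, 10000 bits and 7 hash functions the estimate is roughly 0.008. The value only describes a filter filled to that capacity; a filter with fewer items has a lower false positive rate.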
An exact copy of the javadoc is redundant. It also means updates to the
parent get lost by those inheriting. It is better to use {@inheritDoc}
and add extra information.
Since HashFunctionIdentity is an interface there is no control over what
is hashed. Add a hash function to the HashFunctionValidator to ensure
the hash code is the same if two hash functions are equal according to
the hashFunctionIdentity.
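A sketch of the idea, using a hypothetical two-property Identity interface in place of HashFunctionIdentity. The case-insensitive name comparison and the String process type are assumptions made for the example. The point is that the hash is derived from exactly the properties the equality test compares:

```java
import java.util.Locale;

final class IdentityEquality {
    /** Hypothetical stand-in for the real identity interface. */
    interface Identity {
        String getName();
        String getProcessType();
    }

    /** Two identities are equal if the normalised name and process type match. */
    static boolean areEqual(Identity a, Identity b) {
        return key(a).equals(key(b)) && a.getProcessType().equals(b.getProcessType());
    }

    /** Hash built from the same properties used by areEqual. */
    static int hash(Identity a) {
        return 31 * (31 + key(a).hashCode()) + a.getProcessType().hashCode();
    }

    private static String key(Identity a) {
        return a.getName().toLowerCase(Locale.ROOT);
    }

    /** Convenience factory for the example. */
    static Identity of(String name, String processType) {
        return new Identity() {
            @Override public String getName() { return name; }
            @Override public String getProcessType() { return processType; }
        };
    }
}
```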
Note: Since Shape is final we use the properties directly and not through the get methods.
Standardise the constructor assertions to functions.
Ensure Shape catches NaN probability in the constructor.
Previously NaN would result in a NaN computation for the number of bits.
When cast to int it would be zero. This change improves the error
message in the exception.
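To illustrate the NaN case: Java converts NaN to 0 when casting to int, and NaN fails every ordered comparison, so a range check written as a negated condition rejects it. The sketch below assumes the valid range is (0, 1) exclusive; the method name and message are illustrative:

```java
final class ProbabilityCheck {
    /** Returns the probability if valid, otherwise throws. */
    static double checkProbability(double probability) {
        // NaN makes both comparisons false, so the negation catches it
        if (!(probability > 0.0 && probability < 1.0)) {
            throw new IllegalArgumentException(
                "Probability must be greater than 0 and less than 1: " + probability);
        }
        return probability;
    }
}
```

Writing the check as `probability <= 0 || probability >= 1` would let NaN through, because both of those comparisons are also false for NaN.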
Clean-up javadocs.
Ensure Shape is final. If not final then the rest of the Bloom filter
API cannot assume that a Shape is valid as it may be extended and the
computations changed.
This is done by duplicating the and/or/xor cardinality tests and merge
tests in the AbstractBloomFilterTest using the current filter type
(provided via abstract methods) and a generic BloomFilter
implementation.
Coverage cannot reach 100% because assert statements have been included
that test assumptions. These asserts are unreachable if the StaticHasher
functions as expected and returns an iterator with at least 1 value when
it reports a non-zero size.
Partially removes changes made in commit:
6ad69bedd3
The class requires a revision to handle add/subtract of another
CountingBloomFilter. Restore the tests to check that Counting filters
are merged as if another non-counting filter type.
Remove the javadoc from the CountingBloomFilter methods that state it
uses the counts when merge/remove are called with a CountingBloomFilter
as this is not what the functionality currently performs.
Fix the merge with a hasher to only increment the count by 1 even if the
hasher contains duplicates. Add test to verify this works as documented.
This matches the remove functionality which removes duplicates before
subtraction.
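The intended behaviour can be sketched with a plain int[] of counts. This is a simplified stand-in, not the CountingBloomFilter implementation, and all names are illustrative. Indexes are collected into a BitSet first so that each distinct index changes its count by exactly 1:

```java
import java.util.BitSet;

final class CountingSketch {
    private final int[] counts;

    CountingSketch(int numberOfBits) {
        counts = new int[numberOfBits];
    }

    /** Increments each distinct index once, ignoring duplicates. */
    void merge(int[] indexes) {
        apply(indexes, 1);
    }

    /** Decrements each distinct index once, ignoring duplicates. */
    void remove(int[] indexes) {
        apply(indexes, -1);
    }

    int count(int index) {
        return counts[index];
    }

    private void apply(int[] indexes, int delta) {
        // De-duplicate before touching the counts
        final BitSet distinct = new BitSet(counts.length);
        for (final int index : indexes) {
            distinct.set(index);
        }
        for (int i = distinct.nextSetBit(0); i >= 0; i = distinct.nextSetBit(i + 1)) {
            counts[i] += delta;
        }
    }
}
```

Because merge and remove both de-duplicate, removing the same hasher output that was merged returns every count to its previous value.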
Fixed all code. Case statements have an indent of zero as recommended by
Oracle. Previously the code was using either 0 or 4 as the indent. Using
zero for the check has fewer violations that require fixing.
The counting functionality appears to be broken. Annotations have been
added to the code at locations that are incorrect.
Tests have been updated that currently fail and disabled to allow the
build to pass.
The comparators are never used to perform ordering of functions. The
only current use is to determine that two hash functions are
functionally equivalent. A replacement utility class has been added to
test for equality.
* Added initial bloom filter code. Changed lang3 dependency from
test to compile in pom.xml
* added tests + made recommended changes.
* Updated documentation
* refactored ProtoBloomFilter added tests.
* Cleaned up code and added tests
* Added CountingBloomFilter
* Fixed CountingBloomFilter issues
Fixed checkstyle and bug report issues
* Initial bloom filter collections checkin
* Added unit tests
* fixed test cases
* Extract BloomFilter as an interface
* added missing license info
* fixed Jacoco errors
* fixed names so build picks up tests
* cleaned up Jacoco report for BloomNestedCollection
* removed unused code
* cleaned up and reformatted
* added javadoc
fixed issue with BloomNestedCollection detecting duplicates in an edge
case.
* fixed candidate testing bug
* Cleaned up niggling report issues.
* fixed javadoc errors
* fixed javadoc for java 13 issue
* Second set of fixes.
* "package private for testing" for methods and properties.
* In "Builder":
** Field "hashes" made "final"
* removes some "Serializable" implementations.
* "StandardBloomFilter" made non-"final" fields final and changed
"final protected" to "final private".
* removed transient fields
* made Package name singular
* added javadocs for private and protected fields and methods.
* Occurrences of "bloom" replaced with "Bloom"
* removed checkstyle and findbugs exclusions
* Fixed method and class names
* Documentation updates
* Fixed checkstyle issues
Added BloomFilterConfiguration functions for estimation.
* added .checkstyle to eclipse ignore section.
* renamed test classes to match main class names
* Updated the documentation.
* Implemented requested changes. Part of COLLECTIONS-728
Changed remaining "get" comments to "gets" etc.
Added final where possible and reasonable.
renamed enum Change to CHANGE
fixed missing javadoc links and missed name changes.
fixed ProtoBloomFilter hashCode
renamed CollectionStatistics to BloomCollectionStatistics
renamed CollectionConfiguration to BloomCollectionConfiguration
renamed BloomCollectionStatistics.getTxnCount() to getTransactionCount()
* Added final set of constructors and tests for them.
Cleaned up issues from Gilles Sadowski review
* fixes for Gilles Sadowski issues in BloomCollectionStatistics
* Update javadoc
* renamed match() -> matches() and inverseMatch() -> inverseMatches()
This follows the pattern set with the Object.equals() method name.
* added isFull() method to check if a bloom filter is full.
* Changed gate from StandardBloomFilter to BloomFilter
* renamed BloomCollectionX -> BloomFilterGatedX
specifically:
BloomCollectionConfiguration -> BloomFilterGatedConfiguration
BloomCollectionStatistics -> BloomFilterGatedStatistics
* Made the StandardBloomFilter(BitSet) constructor public
* removed extraneous build() methods from ProtoBloomFilter.Factory
* Added Use cases
* Initial cut
* changes for interface
* Changed to Hasher implementation
* Added missing files and removed Shape from some BloomFilter calls
* Added @since 4.5 tags
* fixed javadoc
* fixed PMD errors
* Added tests and fixed sign extension issues
* changed to Byte constant
* made BloomFilter.verify*() non final
* Added remove(Hasher) for completeness
* Replaced private implementation of MurmurHash3 with commons-codec
* fixed typo
* Removed Hasher.Factory added HashFunction interface
* removed Usage.md
* made commons-codec dependency optional
* Improved performance of Iterator.
* renamed instance variable "md" as messageDigest.
* updated javadoc
* renamed Iter to Iterator and removed unused imports
* removed unused imports
* Made instance variables final.
Also fixed MD5 constructor to throw IllegalStateException if the MD5
algorithm cannot be found.
* removed unused imports
* Updated javadoc.
* Added HashFunctionIdentity to replace HashFunctionName
Added test cases, updated javadoc.
Renamed function implementations to reflect actual function.
Added comparators for HashFunctionIdentity
* fixed naming issues
* Updated javadoc
* fixed checkstyle issue
* Removed link that was causing problems in java 11+ javadoc
* changed HashFunctionIdentity.getProcess() to getProcessType()
* Added package documentation
* Added BloomFilter interface and removed unnecessary methods
* updated tests and fixed issues
* Moved set operations to separate class and updated tests
* fixed FindBugs, PMD and Checkstyle errors
* fixed javadocs
* Added SetOperations and tests
* Added javadocs indicating optional commons-codec required
* Added another cosine test
* Updated to commons-codec 1.14
* fixed typos
* moved Hasher to o.a.c.c.b.hasher package
* extracted Shape.java and moved to o.a.c.c.b.hasher package
* Added javadoc and removed unused imports in testing code
* Added isEmpty() method to Hasher
* initial documentation
* updated to latest mathjax
* Fixed typographical issues