Commit Graph

3499 Commits

Author SHA1 Message Date
Pengyu Nie 3d8d96c7d2 [COLLECTIONS-737]: Return 0 immediately if the given iterable is null in IterableUtils#size. Update tests. 2020-03-28 18:14:22 +13:00
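The null handling described in this commit can be sketched as follows. This is a minimal illustration under stated assumptions, not the Commons Collections source: the class name IterableSizeSketch is hypothetical, and the fallback loop merely stands in for whatever the library does for non-Collection iterables.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Iterator;

/** Hypothetical stand-in for IterableUtils#size; not the library source. */
final class IterableSizeSketch {

    /** Returns the number of elements, or 0 when the iterable is null. */
    static int size(final Iterable<?> iterable) {
        if (iterable == null) {
            return 0; // return immediately; no iterator is created
        }
        if (iterable instanceof Collection<?>) {
            return ((Collection<?>) iterable).size();
        }
        // Fallback: walk the iterator and count the elements.
        int count = 0;
        for (final Iterator<?> it = iterable.iterator(); it.hasNext(); it.next()) {
            count++;
        }
        return count;
    }

    public static void main(final String[] args) {
        System.out.println(size(null));
        System.out.println(size(Arrays.asList("a", "b", "c")));
    }
}
```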
Bruno P. Kinoshita e20e373fe0 Merge branch 'pr-114'
This closes #114
2020-03-28 18:00:01 +13:00
dota17 4a662289d3 Modified the error in javadoc of BulkTest 2020-03-28 17:57:57 +13:00
Gary Gregory 0a9dccb7bc Format. 2020-03-27 14:08:14 -04:00
Gary Gregory f9fb07955d Update tests from Apache Commons Lang 3.9 to 3.10. 2020-03-27 14:01:45 -04:00
Alex Herbert a02a0e6993 Merge pull request #143 from dota17/fixtypoForBloomfilter
Fix typos for the Bloom filter components
2020-03-25 12:34:56 +00:00
dota17 092959ddcb Fix typo for the Bloom filter 2020-03-25 19:26:58 +08:00
Alex Herbert fbc3c06d78 Merge pull request #142 from dota17/FixedMurmur128x64Cyclic
Bloom filter updates.

Fixed Murmur128x64Cyclic.

Added test for DynamicHasher NoValuesIterator.
2020-03-19 10:31:00 +00:00
dota17 e70a21d7cd Import the fail 2020-03-19 14:35:35 +08:00
dota17 514c2eddfc Add a test case for DynamicHasher.NoValuesIterator.nextInt() 2020-03-19 11:33:44 +08:00
dota17 a7973b8d30 Fixed Murmur128x64Cyclic 2020-03-19 11:04:48 +08:00
Alex Herbert 39aef59785 Optimise DynamicHasher iterator. 2020-03-18 11:18:40 +00:00
Alex Herbert bbee9fbd9b Update Hasher.Builder.
Add default methods to add a CharSequence.

Make it clear each object added to the Builder should represent an
entire item.

Document that build() should reset the builder for future use.
2020-03-18 10:49:15 +00:00
Alex Herbert a34da7bcf5 DefaultBloomFilterMethodsTest: Correct javadoc for internal test class 2020-03-18 09:24:55 +00:00
Alex Herbert f157196e00 Merge branch 'dota17-fixtypoForBloomfilterTest' 2020-03-18 09:22:49 +00:00
dota17 00408690a2 Fix typo for BloomfilterTest 2020-03-18 14:49:08 +08:00
Alex Herbert 2cbac58f7e Remove empty line. 2020-03-17 12:57:57 +00:00
Alex Herbert 70947b1767 Add link to Hasher in the HashFunction javadoc header 2020-03-17 12:43:20 +00:00
Alex Herbert ac2c7f2206 Improve documentation of Hasher. 2020-03-17 12:41:38 +00:00
Alex Herbert 0feeab0820 Change Hasher.getBits() to iterator() 2020-03-17 12:27:43 +00:00
Alex Herbert 976d645835 Remove Hasher isEmpty() 2020-03-17 12:16:09 +00:00
Alex Herbert f00daff8c8 Fix typo in Shape.checkNumberOfBits 2020-03-17 07:39:20 +00:00
Alex Herbert d6eeceb018 Optimise ObjectsHashIterative hash function.
Avoid using Arrays.deepHashCode. The array passed to deepHashCode is
always length 2. So we can unroll the same computation for the fixed 2
iterations.
2020-03-17 00:59:00 +00:00
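The unrolling described in this commit can be illustrated with a short sketch. It assumes the two array elements are a long and a byte[]; that pairing, the class name and the method names are assumptions made for illustration and are not confirmed by the commit message. Arrays.deepHashCode starts from 1 and applies result = 31 * result + elementHash per element, so a fixed two-element array reduces to a single expression.

```java
import java.util.Arrays;

/** Hypothetical illustration; not the ObjectsHashIterative source. */
final class UnrolledDeepHashSketch {

    /** Reference computation: allocates a length-2 Object[] on every call. */
    static int viaDeepHashCode(final long last, final byte[] buffer) {
        return Arrays.deepHashCode(new Object[] {last, buffer});
    }

    /** Same result with the two loop iterations written out by hand. */
    static int unrolled(final long last, final byte[] buffer) {
        // iteration 1: 31 * 1 + hash(last); iteration 2: 31 * previous + hash(buffer)
        return 31 * (31 + Long.hashCode(last)) + Arrays.hashCode(buffer);
    }

    public static void main(final String[] args) {
        final byte[] buffer = {1, 2, 3};
        System.out.println(viaDeepHashCode(42L, buffer) == unrolled(42L, buffer));
    }
}
```

The unrolled form avoids boxing the long and allocating the temporary array, which is presumably the motivation for the change.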
aherbert a699c8b9ba Update Hasher javadoc.
Remove trailing periods from params and returns.

Remove the specification in the Hasher.Builder to convert the String to
bytes using the UTF-8 charset. This is an implementation detail. It has
been moved to the DynamicHasher implementation.

Update exception message for getBits to be less specific. The reference
to getName() is now obsolete.
2020-03-16 17:14:28 +00:00
Alex Herbert 142d53a6a5 Remove trailing whitespace 2020-03-15 23:43:12 +00:00
Alex Herbert 7b15598da0 Update javadoc for ArrayCountingBloomFilter.
Document that no exception is raised when the filter state transitions
to invalid.
2020-03-15 23:26:40 +00:00
Alex Herbert 9de28a7b62 Updated the BloomFilter javadoc.
Remove trailing periods on parameters and arguments.

Remove reference to LongBuffer. Clarify what the long[] represents in
'long[] getBits()'.

Clarify cardinality using (number of enabled bits).

Rearrange BloomFilter interface methods to functional order. The order
is:

- Query operations
- Modification operations
- Counting operations

Improve javadoc for BloomFilter contains with additional information for
what 'contains' means.

Update exception message for contains/merge/add/subtract to be
consistent.
2020-03-15 23:17:43 +00:00
Alex Herbert 86bac5e602 Change BloomFilter merge return type from void to boolean.
This is to support the extension to a counting Bloom filter which can
return true/false if the state is valid.

Drops redundant abstract methods from the AbstractBloomFilter that are
overrides of the BloomFilter interface.
2020-03-15 21:36:21 +00:00
Alex Herbert 22d161a25b Delete MapCountingBloomFilter.
This is obsolete given the ArrayCountingBloomFilter.
2020-03-14 14:25:28 +00:00
Alex Herbert fe88827643 Move the unique filtering of the Hasher indexes to a separate class. 2020-03-14 14:22:09 +00:00
Alex Herbert fb358a5c80 Added CountingBloomFilter interface and ArrayCountingBloomFilter. 2020-03-14 14:22:09 +00:00
Alex Herbert 9f4953f4cb Rename CountingBloomFilter to MapCountingBloomFilter 2020-03-14 14:22:09 +00:00
Alex Herbert 90f705e732 Change log to ln in Shape javadoc 2020-03-14 08:12:48 +00:00
Alex Herbert e3484deb51 Fix ShapeTest typos 2020-03-14 07:59:12 +00:00
Alex Herbert a1dd122342 Consolidate @throws clauses for Shape 2020-03-14 07:35:39 +00:00
Alex Herbert 34a5a6f0c5 Change minimum number of bits from 8 to 1 2020-03-14 07:27:30 +00:00
aherbert 32a730d964 Remove Shape getNumberOfBytes
This method only applies to a Bloom filter using an uncompressed byte
representation. It is trivially derived from the number of bits.
2020-03-13 15:13:00 +00:00
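For callers that still need a byte count after this removal, the derivation from the number of bits is a ceiling division by 8. The helper below is a hypothetical sketch and is not part of the Shape API; it assumes the number of bits is at least 1.

```java
/** Hypothetical helper; not part of the Shape API. */
final class BitsToBytesSketch {

    /**
     * Number of bytes needed to store the given number of bits uncompressed.
     * Assumes numberOfBits >= 1. Written as (n - 1) / 8 + 1 so that values
     * close to Integer.MAX_VALUE do not overflow.
     */
    static int numberOfBytes(final int numberOfBits) {
        return (numberOfBits - 1) / Byte.SIZE + 1;
    }

    public static void main(final String[] args) {
        System.out.println(numberOfBytes(9));
    }
}
```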
aherbert 7b22b4ddc6 Update javadoc for Shape.
Update documented exception conditions.

Update javadoc for the shape properties to drop AKA abbreviation.

Change Probability of collision to Probability of False positives.

Update the getProbability method to document it applies to a filter full
to the intended capacity.
2020-03-13 15:10:16 +00:00
Alex Herbert 3a981a01b7 Update BloomFilterIndex comments and added tests for negative index. 2020-03-12 23:39:51 +00:00
aherbert 391d91e353 Improved documentation of Murmur3 hash functions.
Added references to Commons Codec and SMHasher.
2020-03-12 17:03:02 +00:00
aherbert 8fb518e6a1 Standardise computation of signatures. 2020-03-12 16:42:32 +00:00
aherbert 33d6ddc7f9 Correct javadoc of the hash function signature. 2020-03-12 15:17:39 +00:00
aherbert 9f2271334d Update the hash function tests to use a base class.
The base class performs the standard signature test that all hash
functions should pass.
2020-03-12 15:13:54 +00:00
aherbert a51c96520a Remove javadocs in overridden methods that are duplicates.
An exact copy of the javadoc is redundant. It also means updates to the
parent get lost by those inheriting. It is better to use {@inheritDoc}
and add extra information.
2020-03-12 14:22:18 +00:00
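The javadoc pattern recommended in this commit might look like the sketch below. Both types are hypothetical and exist only to show {@inheritDoc} pulling in the parent description while the override adds its own detail.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical parent type used only for this illustration. */
interface MembershipSketch {

    /**
     * Tests whether the item has been added.
     *
     * @param item the item to test
     * @return true if the item is present
     */
    boolean contains(String item);
}

/** Hypothetical implementation backed by a set. */
final class SetMembershipSketch implements MembershipSketch {

    private final Set<String> items = new HashSet<>();

    /** Adds the item to the backing set. */
    void add(final String item) {
        items.add(item);
    }

    /**
     * {@inheritDoc}
     *
     * <p>This implementation performs an exact lookup, so it never reports
     * a false positive.</p>
     */
    @Override
    public boolean contains(final String item) {
        return items.contains(item);
    }
}
```

Only the extra paragraph is maintained in the override, so later edits to the parent javadoc propagate automatically.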
aherbert 2a9bdc0098 Improve comment in BloomFilterIndexer. 2020-03-12 13:59:19 +00:00
aherbert eda601dd04 Update package info for Bloom filter sub-packages. 2020-03-12 13:55:28 +00:00
Alex Herbert cb967680c3 Standardise the Bloom filter shape equations.
Equations match those in:

https://hur.st/bloomfilter/

Fixed documentation of the approximate value of the denominator. Compute
using a re-arrangement.
2020-03-10 06:46:27 +00:00
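A sketch of the shape equations referenced here, assuming the formulas published on that page are m = ceil(n ln p / ln(1 / 2^ln 2)), k = round((m / n) ln 2) and p = (1 - e^(-k / (m / n)))^k, with n items, m bits, k hash functions and false-positive probability p. The class and method names are hypothetical and the code is not the Shape implementation.

```java
/** Hypothetical illustration of the shape equations; not the Shape source. */
final class ShapeEquationsSketch {

    /** ln(1 / 2^ln(2)) rearranged as -(ln 2)^2, approximately -0.4805. */
    private static final double DENOMINATOR = -(Math.log(2) * Math.log(2));

    /** m = ceil(n * ln(p) / ln(1 / 2^ln(2))). */
    static int numberOfBits(final int numberOfItems, final double probability) {
        return (int) Math.ceil(numberOfItems * Math.log(probability) / DENOMINATOR);
    }

    /** k = round((m / n) * ln(2)). */
    static int numberOfHashFunctions(final int numberOfItems, final int numberOfBits) {
        return (int) Math.round((double) numberOfBits / numberOfItems * Math.log(2));
    }

    /** p = (1 - exp(-k / (m / n)))^k, for a filter holding n items. */
    static double probability(final int numberOfItems, final int numberOfBits,
            final int numberOfHashFunctions) {
        final double bitsPerItem = (double) numberOfBits / numberOfItems;
        return Math.pow(1 - Math.exp(-numberOfHashFunctions / bitsPerItem),
                numberOfHashFunctions);
    }

    public static void main(final String[] args) {
        final int m = numberOfBits(1000, 0.01);
        final int k = numberOfHashFunctions(1000, m);
        System.out.println(m + " bits, " + k + " hash functions, p = " + probability(1000, m, k));
    }
}
```

Because k is rounded to an integer, the probability computed back from m and k can differ slightly from the requested value.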
Alex Herbert 03543e5f9b Ensure hashCode hashes the same properties as the equality.
Since HashFunctionIdentity is an interface there is no control over what
is hashed. Add a hash function to the HashFunctionValidator to ensure
the hash code is the same if two hash functions are equal according to
the hashFunctionIdentity.

Note: Since Shape is final we use the properties directly and not through the get methods.
2020-03-10 01:11:52 +00:00
Alex Herbert 0964d5bf19 Standardise Shape constructor validations.
Standardise the constructor assertions to functions.

Ensure Shape catches NaN probability in the constructor.

Previously NaN would result in a NaN computation for the number of bits.
When cast to int it would be zero. This change improves the error
message in the exception.

Clean-up javadocs.

Ensure Shape is final. If not final then the rest of the Bloom filter
API cannot assume that a Shape is valid as it may be extended and the
computations changed.
2020-03-10 00:29:04 +00:00
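The NaN behaviour described in this commit follows from Java's narrowing conversion, where casting NaN to int yields 0. The check below is a hypothetical sketch of a constructor validation that rejects NaN explicitly; the class name, method name and message text are assumptions, not the Shape source.

```java
/** Hypothetical validation sketch; not the Shape source. */
final class ProbabilityCheckSketch {

    /**
     * Rejects probabilities outside (0, 1). The condition is negated so that
     * NaN also fails, because every comparison involving NaN is false.
     */
    static void checkProbability(final double probability) {
        if (!(probability > 0.0 && probability < 1.0)) {
            throw new IllegalArgumentException(
                "Probability must be greater than 0 and less than 1: " + probability);
        }
    }

    public static void main(final String[] args) {
        // Without a check, NaN propagates and silently becomes 0 when cast.
        System.out.println((int) Double.NaN); // prints 0
        checkProbability(0.01); // passes
        try {
            checkProbability(Double.NaN);
        } catch (final IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```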
Alex Herbert cb88c4ed01 Achieve 100% test coverage for BitSetBloomFilter.
This is done by duplicating the and/or/xor cardinality tests and merge
tests in the AbstractBloomFilterTest using the current filter type
(provided via abstract methods) and a generic BloomFilter
implementation.
2020-03-09 22:49:47 +00:00