Commit Graph

121 Commits

Author SHA1 Message Date
Jihoon Son 50a4ec2b0b Add support for headers and skipping thereof for CSV and TSV (#4254)
* initial commit

* small fixes

* fix bug

* fix bug

* address code review

* more cr

* more cr

* more cr

* fix

* Skip head rows for CSV and TSV

* Move checking skipHeadRows to FileIteratingFirehose

* Remove checking null iterators

* Remove unused imports

* Address comments

* Fix compilation error

* Address comments

* Add more tests

* Add a comment to ReplayableFirehose

* Addressing comments

* Add docs and fix typos
2017-05-15 22:57:31 -07:00
Roman Leventov 0bc18e7906 Make UpdateCounter proof to update count overflow (#4138)
* Make UpdateCounter proof to update count overflow.

* Fix
2017-05-01 09:59:49 -07:00
Gian Merlino 2ca7b00346 Update versions to 0.10.1-SNAPSHOT. (#4191) 2017-04-20 18:12:28 -07:00
Roman Leventov 15f3a94474 Copy closer into Druid codebase (fixes #3652) (#4153) 2017-04-10 09:38:45 +09:00
Gian Merlino 12317fd001 Bump version to 0.10.0-SNAPSHOT. (#3913) 2017-02-06 17:54:35 -08:00
Jihoon Son d80bec83cc Enable auto license checking (#3836)
* Enable license checking

* Clean duplicated license headers
2017-01-10 18:13:47 -08:00
Roman Leventov 49d71e9b38 Fix the build after #3697 (#3807) 2016-12-26 17:06:48 -06:00
Roman Leventov 76cb06a8d8 Lookup cache refactoring (the main part of #3667) (#3697)
* Lookup cache refactoring (the main part of druid-io/druid#3667)

* Use PowerMock's static methods in NamespaceLookupExtractorFactoryTest

* Fix KafkaLookupExtractorFactoryTest

* Use VisibleForTesting annotation instead of Javadoc comment

* Create a NamespaceExtractionCacheManager separately for each test in NamespaceExtractionCacheManagersTest

* Rename CacheScheduler.NoCache.ENTRY_DISPOSED to ENTRY_CLOSED

* Reduce visibility of NamespaceExtractionCacheManager.cacheCount() and monitor() implementations, and don't run NamespaceExtractionCacheManagerExecutorsTest with off-heap cache (it didn't before)

* In NamespaceLookupExtractorFactory, use safer idiom to check if CacheState is NoCache or VersionedCache

* More logging in CacheHandler constructor and close(), VersionedCache.close()

* PR comments addressed

* Make CacheScheduler.EntryImpl AutoCloseable, avoid 'dispose' verb in comments, logging and naming in CacheScheduler in favor of 'close'

* More Javadoc comments to CacheScheduler

* Fix NPE

* Remove logging in OnHeapNamespaceExtractionCacheManager.expungeCollectedCaches()

* Make NamespaceExtractionCacheManagersTest.testRacyCreation() to have similar load to what it be before the refactoring

* Unwrap NamespaceExtractionCacheManager.scheduledExecutorService from unneeded MoreExecutors.listeningDecorator() and specify that this is ScheduledThreadPoolExecutor, which ensures happens-before between periodic runs of the tasks

* More comments on MapDbCacheDisposer.disposed

* Replace concat with Long.toString()

* Comment on why NamespaceExtractionCacheManager.scheduledExecutorService() returns ScheduledThreadPoolExecutor

* Place logging statements in VersionedCache.close() and CacheHandler.close() after actual closing logic, because logging may fail

* Make JDBCExtractionNamespaceCacheFactory and StaticMapExtractionNamespaceCacheFactory to try to close newly created VersionedCache if population has failed, as it is done already in URIExtractionNamespaceCacheFactory

* Don't close the whole CacheScheduler.Entry, if the cache update task failed

* Replace AtomicLong updateCounter and firstRunLatch with Phaser-based UpdateCounter in CacheScheduler.EntryImpl
2016-12-23 18:04:27 -08:00
Roman Leventov 988d97b09c Unwrap exceptions from RuntimeException in URIExtractionNamespaceCacheFactory.populateCache() (part of #3667) (#3668)
* Unwrap exceptions from RuntimeException in URIExtractionNamespaceCacheFactory.populateCache()

* Fix tests
2016-11-11 17:25:41 -08:00
Roman Leventov 22b57ddd60 Make ExtractionNamespaceCacheFactory to populate cache directly instead of returning callable (#3651)
* Rename ExtractionNamespaceCacheFactory.getCachePopulator() to populateCache() and make it to populate cache itself instead of returning a Callable which populates cache, because this "callback style" is not actually needed.

ExtractionNamespaceCacheFactory isn't a "factory" so it should be renamed, but renaming right in this commit would tear the git history for files, because ExtractionNamespaceCacheFactory implementations have too many changed lines. Going to rename ExtractionNamespaceCacheFactory to something like "CachePopulator" in one of subsequent PRs.

This commit is a part of a bigger refactoring of the lookup cache subsystem.

* Remove unused line and imports
2016-11-04 13:33:16 -07:00
Gian Merlino 4203580290 URIExtractionNamespace: Treat null values in lookup maps as missing entries. (#3512)
* URIExtractionNamespace: Treat null values in lookup maps as missing entries.

This is useful when many logical lookups are derived from the same base JSON file,
and some lookups' values may be unknown sometimes.

* Add test, logging message, and address other comments.

* Update docs.
2016-11-03 13:53:04 -07:00
Roman Leventov 36a1543222 Lookup cache bug fixes (#3609)
* Return better lastVersion from JDBCExtractionNamespaceCacheFactory's cache populator callable

* Return the lastVersion if URI lookup last modified date is not later than the last cached, from URIExtractionNamespaceCacheFactory's cache populator callable

* Fix a race condition in NamespaceExtractionCacheManager.cancelFuture()

* Don't delete cache from NamespaceExtractionCacheManager if the ExtractionNamespaceCacheFactory returned the same version as the last; Better exception treatment in the scheduled cache updater runnable in NamespaceExtractionCacheManager (in particular, don't consume Errors); throw AssertionError in StaticMapExtractionNamespaceCacheFactory if the lastVersion != null)

* In NamespaceExtractionCacheManager, put NamespaceImplData.latestVersion update in the same synchronized() block with swapAndClearCache(id, cacheId); Turn getPostRunnable which returns a callback into a simple updateNamespace() method

* In StaticMapExtractionNamespaceCacheFactory.getCachePopulator(), check the input directly, not inside a callback

* In URIExtractionNamespaceCacheFactory, allow URI last modified time to go backwards

* Better logging in NamespaceExtractionCacheManager

* Add comment on lastVersion nullability in URIExtractionNamespaceCacheFactory
2016-11-02 09:40:19 -07:00
Charles Allen 78159d7ca4 Move off-heap QTL global cache delete lock outside of subclass lock (#3597)
* Move off-heap QTL global cache delete lock outside of subclass lock

* Make `delete` thread safe
2016-10-27 22:23:53 +05:30
Akash Dwivedi 4b3bd8bd63 Migrating java-util from Metamarkets. (#3585)
* Migrating java-util from Metamarkets.

* checkstyle and updated license on java-util files.

* Removed unused imports from whole project.

* cherry pick metamx/java-util@826021f.

* Copyright changes on java-util pom, address review comments.
2016-10-21 14:57:07 -07:00
Gian Merlino 40f2fe7893 Bump versions to 0.9.3-SNAPSHOT (#3524) 2016-09-29 13:53:32 -07:00
Slim 3175e17a3b Cached lookup module. first cut implementing JDBC cache (#2819) 2016-09-16 13:45:54 -07:00
Charles Allen 95e08b38ea [QTL] Reduced Locking Lookups (#3071)
* Lockless lookups

* Fix compile problem

* Make stack trace throw instead

* Remove non-germane change

* * Add better naming to cache keys. Makes logging nicer
* Fix #3459

* Move start/stop lock to non-interruptable for readability purposes
2016-09-16 11:54:23 -07:00
Charles Allen bfa5c05aaa Make global lookup cache introspector class public (#3199)
* Make global lookup cache introspector class public
* Fixes #3187

* Make KafkaLookupExtractorIntrospectionHandler a public static class
2016-07-01 15:50:57 -07:00
Gian Merlino ebf890fe79 Update master version to 0.9.2-SNAPSHOT. (#3133) 2016-06-13 13:10:38 -07:00
Charles Allen 8cac710546 Async lookups-cached-global by default (#3074)
* Async lookups-cached-global by default
* Also better lookup docs

* Fix test timeouts

* Fix timing of deserialized test

* Fix problem with 0 wait failing immediately
2016-06-03 15:58:10 -05:00
Charles Allen 8024b915e2 [QTL] Implement LookupExtractorFactory of namespaced lookup (#2926)
* support LookupReferencesManager registration of namespaced lookup and eliminate static configurations for lookup from namespecd lookup extensions

- druid-namespace-lookup and druid-kafka-extraction-namespace are modified
- However, druid-namespace-lookup still has configuration about ON/OFF
  HEAP cache manager selection, which is not namespace wide
  configuration but node wide configuration as multiple namespace shares
  the same cache manager

* update KafkaExtractionNamespaceTest to reflect argument signature changes

* Add more synchronization functionality to NamespaceLookupExtractorFactory

* Remove old way of using extraction namespaces

* resolve compile error by supporting LookupIntrospectHandler

* Remove kafka lookups

* Remove unused stuff

* Fix start and stop behavior to be consistent with new javadocs

* Remove unused strings

* Add timeout option

* Address comments on configurations and improve docs

* Add more options and update hash key and replaces

* Move monitoring to the overriding classes

* Add better start/stop logging

* Remove old docs about namespace names

* Fix bad comma

* Add `@JsonIgnore` to lookup factory

* Address code review comments

* Remove ExtractionNamespace from module json registration

* Fix problems with naming and initialization. Add tests

* Optimize imports / reformat

* Fix future not being properly cancelled on failed initial scheduling

* Fix delete returns

* Add more docs about whole introspection

* Add `/version` introspection point for lookups

* Add more tests and address comments

* Add StaticMap extraction namespace for testing. Also add a bunch of tests

* Move cache system property to `druid.lookup.namespace.cache.type`

* Make VERSION lower case

* Change poll period to 0ms  for StaticMap

* Move cache key to bytebuffer

* Change hashCode and equals on static map extraction fn

* Add more comments on StaticMap

* Address comments

* Make scheduleAndWait use a latch

* Sanity renames and fix imports

* Remove extra info in docs

* Fix review comments

* Strengthen failure on start from warn to error

* Address comments

* Rename namespace-lookup to lookups-cached-global

* Fix injective mis-naming
* Also add serde test
2016-05-24 10:56:40 -07:00