70 Commits

Author SHA1 Message Date
Roman Leventov 693e3575f9
Remove unused code and exception declarations (#5461)
* Remove unused code and exception declarations

* Address comments

* Remove redundant Exception declarations

* Make FirehoseFactoryV2.connect() throw IOException again
2018-03-16 22:11:12 +01:00
Gian Merlino 0ce406bdf1
Introduce "transformSpec" at ingest-time. (#4890)
* Introduce "transformSpec" at ingest-time.

It accepts a "filter" (standard query filter object) and "transforms" (a
list of objects with "name" and "expression"). These can be used to do
filtering and single-row transforms without need for a separate data
processing job.

The "expression" fields use the same expression language as other
expression-based features.

* Remove forbidden api.

* Fix compile error.

* Fix tests.

* Some more changes.

- Add nullable annotation to Firehose.nextRow.
- Add tests for index task, realtime task, kafka task, hadoop mapper,
  and ingestSegment firehose.

* Fix bad merge.

* Adjust imports.

* Adjust whitespace.

* Make Transform into an interface.

* Add missing annotation.

* Switch logger.

* Switch logger.

* Adjust test.

* Adjustment to handling for DatasourceIngestionSpec.

* Fix test.

* CR comments.

* Remove unused method.

* Add javadocs.

* More javadocs, and always decorate.

* Fix bug in TransformingStringInputRowParser.

* Fix bad merge.

* Fix ISFF tests.

* Fix DORC test.
2017-10-30 17:38:52 -07:00
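
Note on the "transformSpec" commit above: the message describes a row filter plus a list of named expression transforms applied at ingest time. A minimal Java sketch of that idea follows. Everything in it is an assumption made for illustration: the class `TransformSpecSketch`, its `apply` and `demo` methods, and the use of plain lambdas in place of Druid's filter objects and expression language. It also assumes transforms read only the original input row and that the filter runs on the transformed row.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.function.Function;
import java.util.function.Predicate;

// Hypothetical illustration only; these are not Druid's classes.
public class TransformSpecSketch
{
  private final Predicate<Map<String, Object>> filter;
  private final Map<String, Function<Map<String, Object>, Object>> transforms;

  public TransformSpecSketch(
      Predicate<Map<String, Object>> filter,
      Map<String, Function<Map<String, Object>, Object>> transforms
  )
  {
    this.filter = filter;
    this.transforms = transforms;
  }

  // Returns the transformed row, or null when the filter rejects it.
  public Map<String, Object> apply(Map<String, Object> row)
  {
    Map<String, Object> result = new HashMap<>(row);
    // Assumption: each transform sees only the original input row.
    for (Map.Entry<String, Function<Map<String, Object>, Object>> e : transforms.entrySet()) {
      result.put(e.getKey(), e.getValue().apply(row));
    }
    // Assumption: the filter is evaluated after the transforms.
    return filter.test(result) ? result : null;
  }

  // Small demo: keep rows with country "US" and add a lower-cased "page" column.
  public static Map<String, Object> demo(String country, String page)
  {
    Map<String, Function<Map<String, Object>, Object>> transforms = new LinkedHashMap<>();
    transforms.put("pageLower", row -> String.valueOf(row.get("page")).toLowerCase(Locale.ROOT));
    TransformSpecSketch spec = new TransformSpecSketch(row -> "US".equals(row.get("country")), transforms);

    Map<String, Object> input = new HashMap<>();
    input.put("country", country);
    input.put("page", page);
    return spec.apply(input);
  }
}
```

Returning null for a rejected row is a choice made for this sketch; callers would need to skip null results.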
Roman Leventov aa7e4ae5e4 Enforce correct spacing with Checkstyle (#4651) 2017-08-05 10:18:25 -07:00
Chris Gavin 960cb07ea6 Fix some unnecessary use of boxed types and incorrect format strings spotted by lgtm. (#4474)
* Remove some unnecessary use of boxed types.

* Fix some incorrect format strings.

* Enable IDEA's MalformedFormatString inspection.

* Add a Checkstyle check for finding uses of incorrect logging packages.

* Fix some incorrect usages of the metamx logger.

* Bypass incorrect logger Checkstyle check where using the correct logger is not simple.

* Fix some more places where the wrong number of arguments is provided to format strings.

* Suppress `MalformedFormatString` inspection on legacy logging test.

* Use @SuppressWarnings rather than a noinspection suppression comment.

* Fix some more incorrect format strings.

* Suppress some more incorrect format string warnings where the incorrect string is intentional.

* Log the aggregator when closing it fails.

* Remove some unneeded log lines.
2017-07-13 12:15:32 -07:00
Roman Leventov 9ae457f7ad Avoid using the default system Locale and printing to System.out in production code (#4409)
* Avoid usages of Default system Locale and printing to System.out or System.err in production code

* Fix Charset in DruidKerberosUtil

* Remove redundant string format in GenericIndexed

* Rename StringUtils.safeFormat() to unimportantSafeFormat(); add StringUtils.format(), which fails on a bad format string just as String.format() does

* Fix testSafeFormat()

* More fixes of redundant StringUtils.format() inside ISE

* Rename unimportantSafeFormat() to nonStrictFormat()
2017-06-29 14:06:19 -07:00
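
Note on the Locale commit above: the message distinguishes a strict `format()` that fails like `String.format()` from a `nonStrictFormat()` that tolerates bad format strings, both avoiding the default system Locale. A minimal sketch of that distinction follows. The class name `FormatSketch`, the choice of `Locale.ENGLISH`, and the fallback output shape are assumptions for illustration, not a copy of Druid's `StringUtils`.

```java
import java.util.Arrays;
import java.util.IllegalFormatException;
import java.util.Locale;

// Hypothetical illustration only; not Druid's StringUtils.
public final class FormatSketch
{
  private FormatSketch() {}

  // Strict: uses a fixed Locale and propagates IllegalFormatException,
  // exactly as String.format() would.
  public static String format(String message, Object... args)
  {
    return String.format(Locale.ENGLISH, message, args);
  }

  // Non-strict: never throws on a malformed format string or mismatched
  // arguments; falls back to the raw message plus the arguments.
  public static String nonStrictFormat(String message, Object... args)
  {
    if (args == null || args.length == 0) {
      return message;
    }
    try {
      return String.format(Locale.ENGLISH, message, args);
    }
    catch (IllegalFormatException e) {
      return message + "; " + Arrays.toString(args);
    }
  }
}
```

The non-strict variant suits places such as log or exception messages, where a formatting mistake should not mask the original problem; the strict variant surfaces the mistake immediately.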
Jihoon Son 733dfc9b30 Add PrefetchableTextFilesFirehoseFactory for cloud storage types (#4193)
* Add PrefetchableTextFilesFirehoseFactory

* fix comment

* exception handling

* Fix wrong json property

* Remove ReplayableFirehoseFactory and fix misspelling

* Defer object initialization

* Add a temporaryDirectory parameter to FirehoseFactory.connect()

* fix when cache and fetch are disabled

* Address comments

* Add more tests

* Increase timeout for test

* Add wrapObjectStream

* Move methods to Firehose from PrefetchableFirehoseFactory

* Cleanup comment

* add directory listing to s3 firehose

* Rename a variable

* Addressing comments

* Update document

* Support disabling prefetch

* Fix race condition

* Add fetchLock

* Remove ReplayableFirehoseFactoryTest

* Fix compilation error

* Fix test failure

* Address comments

* Add default implementation for new method
2017-05-18 15:37:18 +09:00
Roman Leventov b7a52286e8 Make @Override annotation obligatory (#4274)
* Make MissingOverride an error

* Make travis script fail fast

* Add missing Override annotations

* Comment
2017-05-16 13:30:30 -05:00
Gian Merlino 657e4512d2 Checkstyle checks for AvoidStaticImport, UnusedImports. (#3660)
Excludes tests from AvoidStaticImport, since those are used often there and
I didn't want to make this changeset too large. Production code use was minimal
and I switched those to non-static imports.
2016-11-05 11:34:36 -07:00
Himanshu Gupta 62ba9ade37 unifying license header in all java files 2015-12-05 22:16:23 -06:00
David Lim f42f6247ee Modified the Twitter firehose to process more properties
Add dimensions such as screen name, retweet and verified booleans,
source, location, and originator information to support additional
analytics.
2015-09-25 00:21:15 -06:00
fjy d05032b98a towards a community led druid 2015-01-31 20:57:36 -08:00
Charles Allen 687c82daa8 Added more Twitter fields to TwitterSpritzerFirehoseFactory
* Now with GEOGRAPHY support!
2014-12-12 15:27:00 -08:00
Charles Allen 92ea82da6d Fix the twitter firehose
* It was missing some json annotations
2014-12-11 16:19:47 -08:00
fjy 8ee4d12562 Refactor structure for examples and extensions 2014-11-21 14:45:24 -08:00
nishantmonu51 454acd3f5a remove backwards compatible code
1) remove backwards compatible and deprecated code
2) make hashed partitions spec default
2014-10-13 19:30:44 +05:30
Xavier Léauté d4795ce927 fix missing charsets 2014-09-15 12:53:40 -07:00
Xavier Léauté ac05836833 make Java 8 javadoc happy 2014-08-29 13:58:50 -07:00
fjy 76e0a48527 Merge branch 'master' into new-schema
Conflicts:
	indexing-hadoop/src/main/java/io/druid/indexer/DbUpdaterJob.java
	indexing-hadoop/src/test/java/io/druid/indexer/HadoopDruidIndexerConfigTest.java
	indexing-service/src/main/java/io/druid/indexing/common/task/HadoopIndexTask.java
	server/src/main/java/io/druid/segment/realtime/plumber/RealtimePlumber.java
	server/src/main/java/io/druid/segment/realtime/plumber/RealtimePlumberSchool.java
2014-04-25 14:03:28 -07:00
fjy 7c90a0fb96 fix web supplier test 2014-04-08 15:01:31 -07:00
fjy 4b7c76762d unit tests passing at this point, finished rt port maybe 2014-02-18 15:14:38 -08:00
Stefán Freyr Stefánsson 71598ee60e Moving RabbitMQ stuff to a module. 2013-11-21 18:41:06 +00:00
fjy aeb411a3a3 fix according to code review and fix broken examples 2013-11-07 15:42:48 -08:00
cheddar c47fe202c7 Fix HadoopDruidIndexer to work with the new way of things
There are multiple and sundry changes in here.

First, "HadoopDruidIndexer" has been split into two pieces, (1) CliHadoop which pulls the hadoop version and builds up the right classpath with the proper hadoop version to run the indexer and (2) CliInternalHadoopIndexer which actually runs the indexer.

In order to work around a bunch of jets3t version conflicts with Hadoop and Druid, I needed to extract the S3 deep storage stuff into its own module.  I then also moved the HDFS stuff into its own module so that I could eliminate the dependency on Hadoop for druid-server.

In doing these changes, I wanted to make the extensions buildable with only the druid-api jar, so a few other things had to move out of Druid and into druid-api.  They are all API-level things, however, so they really belong in druid-api instead.

Lastly, I removed the druid-realtime module and put it all in druid-server.
2013-10-09 15:15:44 -05:00
cheddar 5712b29c8c Fix issues with bindings and handling extensions
The way the Guice bindings were set up previously, each process only had bindings
for the things it cared about.  This became problematic when adding extension modules
that bound everything that they could possibly need, expecting that the processes would
only instantiate what they actually do need.  Guice tries to fail fast and verifies that all
bindings exist before it does anything, which is a problem because the extensions bind
some objects that don't necessarily have all of their dependencies bound in all processes.

The fix for this is to build a single Injector with all bindings in it and let each of the
processes only load the things that they care about.  This also requires the use of
Module overrides and other such interesting things, which are now done.

In doing the fix, I also swapped out the way that the DataSegmentPusher/Puller stuff is bound, as well as made the Cassandra stuff fail if its settings are not provided.  This all of a sudden made all of the things require Cassandra's settings, so I migrated the Cassandra deep storage stuff into its own module.

In doing these changes, I also discovered that some properties weren't properly converting for the ConvertProperties command (specifically, the properties related to data segment loading and pushing), so I fixed that.
2013-09-20 17:45:01 -05:00
fjy cabae7993d port over multi threaded realtime and also fix broken realtime nodes that can't start up 2013-09-16 16:03:47 -07:00
cheddar 3c39f90c89 1) Move Firehose interface and dependencies to druid-api
2) Move DataSegment* interfaces and dependencies to druid-api
2013-08-31 16:43:28 -05:00
cheddar 5ab671050e No more com.metamx.druid, it is now all io.druid! 2013-08-30 19:42:12 -05:00
cheddar bd0756e360 More stuff moved, things still compiling and tests still passing. Yay! 2013-08-30 18:58:35 -05:00
cheddar 56e2b956d0 OMG!!! A lot of stuff has been moved. Modules have been created and destroyed, but everything is compiling and unit tests are passing, OMFG this is awesome.! 2013-08-30 18:21:04 -05:00
cheddar 9c30ced5ea 1) Move various "api" classes to io.druid packages and make sure things compile and stuff 2013-08-28 15:51:02 -05:00
fjy 261ef7ce56 add some fixes 2013-08-22 10:56:50 -07:00
fjy 6a8c160740 update code according to code review 2013-08-22 10:46:05 -07:00
fjy 85ee8bb267 port realtime to guice 2013-08-13 17:08:45 -07:00
cheddar 2361e0112a Make it all compile again... 2013-08-02 10:14:46 -07:00
Stefán Freyr Stefánsson ae4132adba Adding a test producer application. 2013-07-18 19:42:07 +00:00
cheddar 3778425250 1) Fix new WebStreamFirehose stuff to have tests in the tests directory
2) Change the webStream package to the web package, because capital letters in packages suck
2013-07-10 14:12:13 -07:00
cheddar b83bc14784 Merge pull request #173 from metamx/dhruv
Add new demo firehose that is lower friction than twitter
2013-07-09 17:39:51 -07:00
Dhruv Parthasarathy bd6dcd3973 interruption! 2013-07-09 17:39:19 -07:00
Dhruv Parthasarathy ba484fca5c catch error 2013-07-09 17:17:21 -07:00
Dhruv Parthasarathy 9f7284c801 fixed thread stuff 2013-07-09 16:59:49 -07:00
Dhruv Parthasarathy b3157c2752 made thread final 2013-07-09 16:42:35 -07:00
Dhruv Parthasarathy 3250c698bb fixed thread stuff and made tests cleaner 2013-07-09 16:34:55 -07:00
Dhruv Parthasarathy 6d000fc4c2 interfaces added and tests simplified 2013-07-09 15:31:03 -07:00
Dhruv Parthasarathy 41cb115d60 few more changes to RenamingKeysUpdateStreamFactory and test 2013-07-09 10:46:43 -07:00
Dhruv Parthasarathy 72fbc516bc added a RenameKeysUpdateStream wrapper 2013-07-08 18:53:08 -07:00
Dhruv Parthasarathy 142271aad2 better encapsulation 2013-07-08 16:46:47 -07:00
Dhruv Parthasarathy 01b4728c40 removed shared queue structure. Queue now encapsulated within updateStream 2013-07-08 14:27:35 -07:00
Dhruv Parthasarathy e7da31e42d seems to be working 2013-07-08 13:41:19 -07:00
Dhruv Parthasarathy 439e8ca4ad now take a map for renaming 2013-07-08 13:35:22 -07:00
Dhruv Parthasarathy c8c686c738 added druid license comments 2013-07-08 12:19:38 -07:00