druid

Commit Graph

Author	SHA1	Message	Date
Gian Merlino	6c725a7e06	Fix havingSpec on complex aggregators. (#5024 ) * Fix havingSpec on complex aggregators. - Uses the technique from #4883 on DimFilterHavingSpec too. - Also uses Transformers from #4890, necessitating a move of that and other related classes from druid-server to druid-processing. They probably make more sense there anyway. - Adds a SQL query test. Fixes #4957. * Remove unused import.	2017-11-01 12:58:08 -04:00
Gian Merlino	0ce406bdf1	Introduce "transformSpec" at ingest-time. (#4890 ) * Introduce "transformSpec" at ingest-time. It accepts a "filter" (standard query filter object) and "transforms" (a list of objects with "name" and "expression"). These can be used to do filtering and single-row transforms without need for a separate data processing job. The "expression" fields use the same expression language as other expression-based features. * Remove forbidden api. * Fix compile error. * Fix tests. * Some more changes. - Add nullable annotation to Firehose.nextRow. - Add tests for index task, realtime task, kafka task, hadoop mapper, and ingestSegment firehose. * Fix bad merge. * Adjust imports. * Adjust whitespace. * Make Transform into an interface. * Add missing annotation. * Switch logger. * Switch logger. * Adjust test. * Adjustment to handling for DatasourceIngestionSpec. * Fix test. * CR comments. * Remove unused method. * Add javadocs. * More javadocs, and always decorate. * Fix bug in TransformingStringInputRowParser. * Fix bad merge. * Fix ISFF tests. * Fix DORC test.	2017-10-30 17:38:52 -07:00
Roman Leventov	dc7cb117a1	Refactor ColumnSelectorFactory; Rely on ColumnValueSelector's polymorphism (#4886 ) * Refactor ColumnSelectorFactory; Rely on ColumnValueSelector's polymorphism * Fix MapVirtualColumn.makeColumnValueSelector() * Minor fixes * Fix IndexGeneratorCombinerTest * DimensionSelector to return zeros when treated as numeric ColumnValueSelector * Fix IncrementalIndexTest * Fix IncrementalIndex.makeColumnSelectorFactory() * Optimize MapBasedRow.getMetric() * Fix VarianceAggregatorTest * Simplify IncrementalIndex.makeColumnSelectorFactory() * Address comments * More comments * Test	2017-10-13 21:44:17 -05:00
Jihoon Son	8d9902831e	Refactoring PrefetchableTextFilesFirehoseFactory (#4836 ) * Refactoring prefetchable firehose * Fix to read cache when prefetch is disabled * More tests * Cleanup codes * Add Fetcher * Fix test failure * Count file size * Fix test * rename generic parameter * address comments * address comments * reuse buffer * move Execs to java-util * use execs * Fix build	2017-10-13 21:39:28 -05:00
Jihoon Son	675c6c00dd	Add checkstyle and intellij rule to prohibit unnecessary qualifiers in interfaces (#4958 ) * add checkstyle and intellij rule * fix tc fail	2017-10-13 07:56:19 -07:00
praveev	4ff12e4394	Hadoop indexing: Fix NPE when intervals not provided (#4686 ) * Fix #4647 * NPE protect bucketInterval as well * Add test to verify timezone as well * Also handle case when intervals are already present * Fix checkstyle error * Use factory method instead for Datetime * Use Intervals factory method	2017-10-05 22:46:07 -07:00
Gian Merlino	1f2074c247	Bump versions in master to 0.11.1-SNAPSHOT. (#4878 ) * Bump versions in master to 0.11.1-SNAPSHOT. * Missed a few.	2017-09-28 17:09:51 -05:00
Himanshu	f69c9280c4	remove ServerConfig from DruidNode as all information needs to be present in DruidNode serialized form (#4858 ) * remove ServerConfig from DruidNode as all information needs to be present in DruidNode serialized form * sanitize output of /druid/coordinator/v1/cluster endpoint	2017-09-28 10:40:59 -05:00
Goh Wei Xiang	2c30d5ba55	Add org.joda.time.DateTime.parse() to forbidden APIs (#4857 ) * Added org.joda.time.DateTime#(java.lang.String) to forbidden API. * Added org.joda.time.DateTime#(java.lang.String, org.joda.time.format.DateTimeFormatter) to forbidden API. * Add additional APIs that may create DateTime with default time zone * Add helper function that accepts formatter to parse String. * Add additional forbidden APIs * Replace existing usage of forbidden APIs * Use wrapper class to enforce Chronology on DateTimeFormatter. * Creates constant UtcFormatter for constant ISODateTimeFormat.	2017-09-27 17:46:44 -05:00
Roman Leventov	e267f3901b	Enforce Indentation with Checkstyle (#4799 )	2017-09-21 13:06:48 -07:00
Roman Leventov	c7b8116b3a	Remove HadoopIOPeon (#4742 )	2017-09-03 13:36:38 -07:00
Charles Allen	bdfc6fe25e	Move common TypeReference into JacksonUtils (#4738 )	2017-08-31 13:40:16 -07:00
T R Kyaw	d6179126ed	Allow index job to utilize hadoop cluster information from job config. (#4626 ) * Allow index job to utilize hadoop cluster information from job config. * Add new method that injects system configuration and then job configuration. * Make changes to use HadoopDruidIndexerConfig.addJobProperties method. * refactor code for overloaded addJobProperties.	2017-08-30 16:44:33 -05:00
Gian Merlino	daf3c5f927	Add "round" option to cardinality and hyperUnique aggregators. (#4720 ) * Add "round" option to cardinality and hyperUnique aggregators. Also turn it on by default in SQL, to make math on distinct counts work more as expected. * Fix some compile errors. * Fix test. * Formatting.	2017-08-28 14:52:11 -07:00
Roman Leventov	cbd1902db8	Add forbidden-apis plugin; prohibit using system time zone (#4611 ) * Forbidden APIs WIP * Remove some tests * Restore io.druid.math.expr.Function * Integration tests fix * Add comments * Fix in SimpleWorkerProvisioningStrategy * Formatting * Replace String.format() with StringUtils.format() in RemoteTaskRunnerTest * Address comments * Fix GroupByMultiSegmentTest	2017-08-21 13:02:42 -07:00
Jihoon Son	d5606bc558	Passing lockTimeout as a parameter for TaskLockbox.lock() (#4549 ) * Passing lockTimeout as a parameter for TaskLockbox.lock() * Remove TIME_UNIT * Fix tc fail * Add taskLockTimeout to TaskContext * Add caution	2017-08-08 18:21:07 -07:00
Roman Leventov	f5d4171459	Prohibit for loops which could be foreach with IntelliJ (#4653 ) * Replace for with foreach * Replace for with for-each in GroupByQueryEngineV2 * Remove io.druid.collections.IntList	2017-08-08 18:05:33 -07:00
Roman Leventov	aa7e4ae5e4	Enforce correct spacing with Checkstyle (#4651 )	2017-08-05 10:18:25 -07:00
Yuewen Wang	b154ff0d4a	[Bugfix] null string in map/reduce jvm opts (#4588 ) fix a bug when the cluster doesn't configure mapreduce.map.java.opts or mapreduce.map.java.opts	2017-07-22 06:22:30 -07:00
Roman Leventov	c0beb78ffd	Enforce brace formatting with Checkstyle (#4564 )	2017-07-21 10:26:59 -05:00
Slim	71e7a4c054	Adding double columns support (#4491 ) * add double columns support * Fix numbers and expected results in UTs * adding float aggregators * fix IT expected test results * fix comments * more fixes * fix comp * fix test * refactor double and float aggregator factories * fix * fix UTs * fix comments * clean unused code * fix more comments * undo unnecessary changes * fix null issue * refactor TopNColumnSelectorStrategyFactory * fix docs * refactor NumericTopNColumnSelectorStrategy * fix return * fix comments * handle the null case in DimensionIndexer * more null fixing * cosmetic changes	2017-07-20 10:14:14 +03:00
Gian Merlino	441ee56ba9	DataSegmentPusher: Add allowed hadoop property prefixes. (#4562 ) * DataSegmentPusher: Add allowed hadoop property prefixes. * Fix dots.	2017-07-18 10:16:12 -07:00
Roman Leventov	60cdf94677	Add PMD and prohibit unnecessary fully qualified class names in code (#4350 ) * Add PMD and prohibit unnecessary fully qualified class names in code * Extra fixes * Remove extra unnecessary fully-qualified names * Remove qualifiers * Remove qualifier	2017-07-17 22:22:29 +09:00
Slim	c5c17bb803	Fix issue 4536 suggested by @erikdubbelboer (#4541 )	2017-07-13 14:14:31 -07:00
Parag Jain	6e2f78f552	TLS support (#4270 )	2017-07-06 17:40:12 -07:00
Roman Leventov	9ae457f7ad	Avoid using the default system Locale and printing to System.out in production code (#4409 ) * Avoid usages of Default system Locale and printing to System.out or System.err in production code * Fix Charset in DruidKerberosUtil * Remove redundant string format in GenericIndexed * Rename StringUtils.safeFormat() to unimportantSafeFormat(); add StringUtils.format() which fails as well as String.format() * Fix testSafeFormat() * More fixes of redundant StringUtils.format() inside ISE * Rename unimportantSafeFormat() to nonStrictFormat()	2017-06-29 14:06:19 -07:00
Roman Leventov	ae900a4934	Update versions to 0.11.0-SNAPSHOT (#4483 )	2017-06-28 17:05:58 -07:00
Roman Leventov	05d58689ad	Remove the ability to create segments in v8 format (#4420 ) * Remove ability to create segments in v8 format * Fix IndexGeneratorJobTest * Fix parameterized test name in IndexMergerTest * Remove extra legacy merging stuff * Remove legacy serializer builders * Remove ConciseBitmapIndexMergerTest and RoaringBitmapIndexMergerTest	2017-06-26 13:21:39 -07:00
Goh Wei Xiang	f68a0693f3	Allow use of non-threadsafe ObjectCachingColumnSelectorFactory (#4397 ) * Adding a flag to indicate when ObjectCachingColumnSelectorFactory need not be threadsafe. * - Use of computeIfAbsent over putIfAbsent - Replace Maps.newXXXMap() with normal instantiation - Documentations on when is thread-safe required. - Use Builders for On/OffheapIncrementalIndex * - Optimization on computeIfAbsent - Constant EMPTY DimensionsSpec - Improvement on IncrementalIndexSchema.Builder - Remove setting of default values - Use var args for metrics - Correction on On/OffheapIncrementalIndex Builders - Combine On/OffheapIncrementalIndex Builders * - Removing unused imports. * - Helper method for testing with IncrementalIndex.Builder * - Correction on javadoc. * Style fix	2017-06-16 16:04:19 -05:00
Gian Merlino	1f2afccdf8	Expressions: Add ExprMacros. (#4365 ) * Expressions: Add ExprMacros, which have the same syntax as functions, but can convert themselves to any kind of Expr at parse-time. ExprMacroTable is an extension point for adding new ExprMacros. Anything that might need to parse expressions needs an ExprMacroTable, which can be injected through Guice. * Address code review comments.	2017-06-08 09:32:10 -04:00
Roman Leventov	63a897c278	Enable most IntelliJ 'Probable bugs' inspections (#4353 ) * Enable most IntelliJ 'Probable bugs' inspections * Fix in RemoteTestNG * Fix IndexSpec's equals() and hashCode() to include longEncoding * Fix inspection errors * Extract global instance of natural().nullsFirst(); address comments * Fix * Use noinspection comments instead of SuppressWarnings on method for IntelliJ-specific inspections * Prohibit Ordering.natural().nullsFirst() using Checkstyle	2017-06-07 09:54:25 -07:00
Himanshu	4ace65a2af	fix NPE in IndexGeneratorJob (#4371 ) * fix NPE in IndexGeneratorJob * address review comment * review comments	2017-06-07 05:54:03 -07:00
Roman Leventov	31d33b333e	Make using implicit system Charset an error (#4326 ) * Make using implicit system charset an error * Use StringUtils.toUtf8() and fromUtf8() instead of String.getBytes() and new String() * Use English locale in StringUtils.safeFormat() * Restore comment	2017-06-05 23:57:25 -07:00
Slim	a2584d214a	Delegate creation of segmentPath/LoadSpec to DataSegmentPushers and add S3a support (#4116 ) * Adding s3a schema and s3a implem to hdfs storage module. * use 2.7.3 * use segment pusher to make loadspec * move getStorageDir and makeLoad spec under DataSegmentPusher * fix uts * fix comment part1 * move to hadoop 2.8 * inject deep storage properties * set version to 2.7.3 * fix build issue about static class * fix comments * fix default hadoop default coordinate * fix create filesystem * downgrade aws sdk * bump the version	2017-06-04 00:55:09 -06:00
Goh Wei Xiang	b77fab8a30	Replace usages of CountingMap with Object2LongMap (#4320 ) * Replaces use of CountingMap with Object2LongMap from fastutil. * Remove CountingMap classes and minor fixes * Added additional test cases for DatasourceInputFormat. * Added additional test cases for CoordinatorStats. * Not materializing segment list. * Put in this fix because it is failing the test on its expected behavior. * Added missing header.	2017-05-24 17:40:32 -07:00
Roman Leventov	b7a52286e8	Make @Override annotation obligatory (#4274 ) * Make MissingOverride an error * Make travis script to fail fast * Add missing Override annotations * Comment	2017-05-16 13:30:30 -05:00
Benedict Jin	e823085866	Improve `collection`-related code by reusing an immutable object instead of creating a new object (#4135 )	2017-05-17 01:38:51 +09:00
Jihoon Son	50a4ec2b0b	Add support for headers and skipping thereof for CSV and TSV (#4254 ) * initial commit * small fixes * fix bug * fix bug * address code review * more cr * more cr * more cr * fix * Skip head rows for CSV and TSV * Move checking skipHeadRows to FileIteratingFirehose * Remove checking null iterators * Remove unused imports * Address comments * Fix compilation error * Address comments * Add more tests * Add a comment to ReplayableFirehose * Addressing comments * Add docs and fix typos	2017-05-15 22:57:31 -07:00
Roman Leventov	1ebfa22955	Update Error prone configuration; Fix bugs (#4252 ) * Make Errorprone the default compiler * Address comments * Make Error Prone's ClassCanBeStatic rule a error * Preconditions allow only %s pattern * Fix DruidCoordinatorBalancerTester * Try to give the compiler more memory * Remove distribution module activation on jdk 1.8 because only jdk 1.8 is used now * Don't show compiler warnings * Try different travis script * Fix travis.yml * Make Error Prone optional again * For error-prone compiler * Increase compiler's maxmem * Don't run Error Prone for benchmarks because of OOM * Skip install step in Travis * Remove MetricHolder.writeToChannel() * In travis.yml, check compilation before tests, because it may fail faster	2017-05-12 15:55:17 +09:00
Pierre	bba31e0c8b	close aggregators in indexing-hadoop mappers (#4251 )	2017-05-05 08:29:13 -07:00
Pierre	e9872f0695	do not flush on closed stream (#4250 )	2017-05-05 09:19:20 +09:00
Roman Leventov	8277284d67	Add Checkstyle rule to force comments to classes and methods to be Javadoc comments (#4239 )	2017-05-04 11:14:41 -07:00
Gian Merlino	97ddb38d75	DatasourceInputSplit: Serialize with write instead of writeUTF. (#4195 ) writeUTF has a limit of 64KB, making it difficult to write out splits that read a large number of descriptors for small segments.	2017-04-25 10:26:44 -07:00
Gian Merlino	2ca7b00346	Update versions to 0.10.1-SNAPSHOT. (#4191 )	2017-04-20 18:12:28 -07:00
Gian Merlino	b4289c0004	Remove "granularity" from IngestSegmentFirehose. (#4110 ) It wasn't doing anything useful (the sequences were being concatted, and cursor.getTime() wasn't being called) and it defaulted to Granularities.NONE. Changing it to Granularities.ALL gave me a 700x+ performance boost on a small dataset I was reindexing (2m27s to 365ms). Most of that was from avoiding making a lot of unnecessary column selectors.	2017-03-24 10:28:54 -07:00
Roman Leventov	81a5f9851f	TmpFileIOPeons to create files under the merging output directory, instead of java.io.tmpdir (#3990 ) * In IndexMerger and IndexMergerV9, create temporary files under the output directory/tmpPeonFiles, instead of java.io.tmpdir * Use FileUtils.forceMkdir() across the codebase and remove some unused code * Fix test * Fix PullDependencies.run() * Unused import	2017-03-02 14:05:12 -08:00
Akash Dwivedi	94da5e80f9	Namespace optimization for hdfs data segments. (#3877 ) * NN optimization for hdfs data segments. * HdfsDataSegmentKiller, HdfsDataSegment finder changes to use new storage format.Docs update. * Common utility function in DataSegmentPusherUtil. * new static method `makeSegmentOutputPathUptoVersionForHdfs` in JobHelper * reuse getHdfsStorageDirUptoVersion in DataSegmentPusherUtil.getHdfsStorageDir() * Addressed comments. * Review comments. * HdfsDataSegmentKiller requested changes. * extra newline * Add maprfs.	2017-03-01 09:51:20 -08:00
praveev	5ccfdcc48b	Fix testDeadlock timeout delay (#3979 ) * No more singleton. Reduce iterations * Granularities * Fix the delay in the test * Add license header * Remove unused imports * Lot more unused imports from all the rearranging * CR feedback * Move javadoc to constructor	2017-02-28 12:51:41 -06:00
praveev	c3bf40108d	One granularity (#3850 ) * Refactor Segment Granularity * Beginning of one granularity * Copy the fix for custom periods in segment-granularity over here. * Remove the custom serialization for now. * Compilation cleanup * Reformat code * Fixing unit tests * Unify to use a single iterable * Backward compatibility for rolling upgrade * Minor check style. Cosmetic changes. * Rename length and millis to duration * CR feedback * Minor changes.	2017-02-25 01:02:29 -06:00
Akash Dwivedi	797488a677	Removing Integer.MAX column size limit. (#3743 ) * Removing Integer.MAX column size limit. * On demand creation of headerLong, use v2 instead of v3 * Avoid reusing the same object from a previous test. * Avoid reusing the same object from a previous test part#2 * code formatting. * GenericIndexed/Writer code review changes. * GenericIndexed/writer code review requested changes. * checkIndex() to static * native endianness for genericIndexedV2, code review requested changes. * Formatting * Hll fix. * use native endianness during bag size calculation. * Code review requested changes. * IOPeon close() changes. * use different tmp directory path for testing. * Code review requested changes.	2017-02-16 20:09:43 -06:00

876 Commits