druid

Commit Graph

Author	SHA1	Message	Date
Gian Merlino	ac84a3e011	SQL: Add resolution parameter, fix filtering bug with APPROX_QUANTILE (#3868 ) * SQL: Add resolution parameter to quantile agg, rename to APPROX_QUANTILE. * Fix bug with re-use of filtered approximate histogram aggregators. Also add APPROX_QUANTILE tests for filtering and running on complex columns. Includes some slight refactoring to allow tests to make DruidTables that include complex columns. * Remove unused import	2017-01-25 18:39:26 -08:00
Niketh Sabbineni	2b8d3c102b	Remove throttling on drop segments (#3736 ) * Remove throttling on drop * Throttle loadqueuepeon segment change requests to ZK * Make initial delay configurable, add docs, shutdown gracefully * Make loadqueuepeon repeat delay configurable	2017-01-20 10:02:19 -08:00
Gian Merlino	d51f5e058d	SQL: Ditch CalciteConnection layer and add DruidMeta, extension aggregators. (#3852 ) * SQL: Ditch CalciteConnection layer and add DruidMeta, extension aggregators. Switched from CalciteConnection to Planner, bringing benefits: - CalciteConnection's JDBC interface no longer sits between the SQL server (HTTP/Avatica) and Druid's query layer. Instead, the SQL servers can use Druid Sequence objects directly, reducing overhead in the query return path. - Implemented our own Planner-based Avatica Meta, letting us control connection timeouts and connection / statement limits. The previous CalciteConnection-based implementation didn't have any limits or timeouts. - The Planner interface lets us override the operator table, opening up SQL language extensions. This patch includes two: APPROX_COUNT_DISTINCT in core, and a QUANTILE aggregator in the druid-histogram extension. Also: - Added INFORMATION_SCHEMA metadata schema. - Added tests for Unicode literals and escapes. * Verify statement is actually open before closing it. * More detailed INFORMATION_SCHEMA docs.	2017-01-19 16:32:20 -08:00
kaijianding	33ae9dd485	streaming version of select query (#3307 ) * streaming version of select query * use columns instead of dimensions and metrics;prepare for valueVector;remove granularity * respect query limit within historical * use constant * fix thread name corrupted bug when using jetty qtp thread rather than processing thread while working with SpecificSegmentQueryRunner * add some test for scan query * add scan query document * fix merge conflicts * add compactedList resultFormat, this format is better for json ser/der * respect query timeout * respect query limit on broker * use static consts and remove unused code	2017-01-19 16:09:53 -06:00
David Lim	ff52581bd3	IndexTask improvements (#3611 ) * index task improvements * code review changes * add null check	2017-01-18 14:24:37 -08:00
Fokko Driesprong	31bea380eb	Updated Apache Zookeeper to the latest stable version (#3841 )	2017-01-12 13:39:29 -08:00
Gian Merlino	e86859b228	SQL support for nested groupBys. (#3806 ) * SQL support for nested groupBys. Allows, for example, doing exact count distinct by writing: SELECT COUNT(*) FROM (SELECT DISTINCT col FROM druid.foo) Contrast with approximate count distinct, which is: SELECT COUNT(DISTINCT col) FROM druid.foo Add deeply-nested groupBy docs, tests, and maxQueryCount config. * Extract magic constants into statics. * Rework rules to put preconditions in the "matches" method.	2017-01-11 18:32:53 -08:00
Gian Merlino	76620615a1	Properly respect the enableAvatica and enableJsonOverHttp options. (#3834 )	2017-01-11 14:43:34 -06:00
Jihoon Son	c099977a5b	Add an option to SearchQuery to choose a search query execution strategy (#3792 ) * Add an option to SearchQuery to choose a search query execution strategy. Supported strategies are 1) Index-only query execution 2) Cursor-based scan 3) Auto: choose an efficient strategy for a given query * Add SearchStrategy and SearchQueryExecutor * Address comments * Rename strategies and set UseIndexesStrategy as the default strategy * Add a cost-based planner for auto strategy * Add document * Fix code style * apply code style * apply comments	2017-01-10 18:04:20 -08:00
Vinh Tran	dddeae813a	Update caching.md typo (#3824 ) * Update caching.md Typo of Command vs Comma * Update index.md Fixing `Command` typo	2017-01-06 12:14:07 -08:00
Yuusaku Taniguchi	02519d5b64	Exhibitor Support (#3664 ) * allow JsonConfigTesterBase to treat the fields of collections * [Feature] Exhibitor Support (#3664) This patch provides the integration of Druid & Netflix Exhibitor. Druid currently uses Apache Curator as its ZooKeeper client. Curator can be integrated with Exhibitor to achieve a live/updating list of the ZooKeeper ensemble. This patch enables Druid to use this feature.	2017-01-02 09:15:36 -08:00
Himanshu	4ca3b7f1e4	overlord helpers framework and tasklog auto cleanup (#3677 ) * overlord helpers framework and tasklog auto cleanup * review comment changes * further review comments addressed	2016-12-21 15:18:55 -08:00
Nishant	f576a0ff14	Contrib Extension for Ambari Metrics Emitter (#3767 ) * Contrib Extension for Ambari Metrics Emitter extension to enable druid to send metrics to ambari metrics server (https://cwiki.apache.org/confluence/display/AMBARI/Metrics) review comments switch to public repo * review comments * add docs * fix pom version * Add link for doc page in extensions.md * remove unused imports * review comments review comments remove unused dependency review comment	2016-12-19 11:12:47 -08:00
Nishant	35160e5595	Add metrics for Query Count statistics (#3470 ) * Add metrics for Query Count statistics This PR adds a new metrics monitor “QueryCountStatsMonitor” which emits three new metrics - 1) query/success/count - number of successful queries 2) query/failed/count - number of failed queries 3) query/interrupted/count - number of interrupted/timedout queries fix bindings * make fields final * fix imports * AsyncQueryForwardingServlet implement QueryStatsProvider * remove unused import	2016-12-19 09:47:58 -08:00
David Lim	8eee259629	add documentation on segments generated (#3785 )	2016-12-19 09:41:47 -08:00
Dongkyu Hwangbo	da007ca3c2	Replace caravel with superset (#3780 )	2016-12-16 20:47:52 -08:00
Gian Merlino	dd63f54325	Built-in SQL. (#3682 )	2016-12-16 17:15:59 -08:00
Jonathan Wei	2bfcc8a592	First and Last Aggregator (#3566 ) * add first and last aggregator * add test and fix * moving around * separate aggregator valueType * address PR comment * add finalize inner query and adjust v1 inner indexing * better test and fixes * java-util import fixes * PR comments * Add first/last aggs to ITWikipediaQueryTest	2016-12-16 15:26:40 -08:00
Nishant	8cfcb95fbc	Add Filtered and Composing request loggers (#3469 ) * Add Filtered and Composing request loggers Add Filtered and Composite Request loggers - enables users to filter request logs for slow queries. fix test * review comments * review comment * remove unused import	2016-12-16 11:18:32 -08:00
Himanshu	ed322a4beb	remove size from default analysisTypes list for segmentMetadata query (#3773 )	2016-12-13 18:01:21 -08:00
Ninglin Du	469ab21091	[Feature] Thrift support for realtime and batch ingestion (#3418 ) * Thrift ingestion plugin 1. thrift binary is platform dependent, use scrooge to generate java files to avoid style check failure 2. stream and hadoop ingestion are both supported, input format can be sequence file and lzo thrift block file. 3. base64 and protocol aware change header * fix conflicts in pom	2016-12-13 10:05:15 -08:00
kaijianding	4be3eb0ce7	report message gap, source gap and sink count in RealtimePlumber (#3744 ) * report message gap, source gap and sink count in RealtimePlumber * report message gap, sink count in Appenderator * add ingest/events/sourceGap in metrics.md * remove source gap	2016-12-13 11:23:02 -06:00
Erik Dubbelboer	bb9e35e1af	Add Greatest and Least post aggregations (#3567 )	2016-12-07 17:58:23 -08:00
Gian Merlino	943982b7b0	Configurable HTTP compression. (#3759 ) * Configurable HTTP compression. * Call real-time nodes real-time processes in docs.	2016-12-07 17:40:39 -08:00
Himanshu	06d0ef9c6c	allow and load extensions with absolute paths in druid.extensions.loadList (#3747 )	2016-12-06 17:40:23 -08:00
Himanshu	45da7e48f1	groupBy sort results by (dimensions,timestamp) instead of (timestamp,dimension) (#3672 ) * sortByDimsFirst flag for groupBy query * Remove need for KeyType in Grouper<KeyType> to be Comparable<KeyType> * fix review comments * fix review comments regarding removing code duplication of dim/time comparison * move comparator for KeyType object to KeySerdeFactory so that creation of comparator does not need KeySerde * remove unnecessary system.out.println * make access static var NATURAL_NULLS_FIRST directly * further review comments addressing	2016-12-06 09:48:56 -08:00
Niketh Sabbineni	d904c79081	Normalized Cost Balancer (#3632 ) * Normalized Cost Balancer * Adding documentation and renaming to use diskNormalizedCostBalancer * Remove balancer from the strings * Update docs and include random cost balancer * Fix checkstyle issues	2016-12-05 17:18:20 -08:00
Gian Merlino	353fee79dd	Add "asMillis" option to "timeFormat" extractionFn. (#3733 ) This is useful for chaining extractionFns that all want to treat time as millis, such as having a javascript extractionFn after a timeFormat.	2016-12-02 13:45:16 -08:00
Gian Merlino	102375d9bb	Add "strlen" extractionFn. (#3731 )	2016-12-02 12:08:51 -08:00
Gian Merlino	4c5d10f8a3	Add DimFilterHavingSpec. (#3727 ) * Add DimFilterHavingSpec. * Add test for DimFilterHavingSpec with extractionFns.	2016-12-02 10:04:30 -08:00
Gian Merlino	e4465e63bd	Fix ordering of sections on dimensionspecs.md. (#3722 ) The Filtered and List DimensionSpecs were mixed in with the extraction functions.	2016-11-29 16:28:36 -08:00
Niketh Sabbineni	2640d170c3	Blacklist workers if they fail for too many times (#3643 ) * Blacklist workers if they fail for too many times * Adding documentation * Changing to timeout to period and updating docs * 1. Add configurable maxPercentageBlacklistWorkers 2. Rename variable * Change maxPercentageBlacklistWorkers to double * Remove thread.sleep	2016-11-29 12:38:56 +05:30
Erik Dubbelboer	9f7050e221	Fix some grammar and spelling mistakes (#3717 )	2016-11-28 11:49:30 -08:00
Himanshu	7d37f675ba	fix the documented property name for specifying avro reader schema (#3708 )	2016-11-22 15:02:41 -08:00
Parag Jain	7ee6bb7410	option to reset offset automatically in case of OffsetOutOfRangeException (#3678 ) * option to reset offset automatically in case of OffsetOutOfRangeException if the next offset is less than the earliest available offset for that partition * review comments * refactoring * refactor * review comments	2016-11-21 16:29:46 -06:00
Jonathan Wei	7c63bee7f5	Add mapreduce.job.classloader.system.classes property to 'Other Hadoop Versions' docs (#3706 )	2016-11-18 16:16:50 -08:00
Erik Dubbelboer	7d36f540e8	WIP: Add Google Storage support (#2458 ) Also excludes the correct artifacts from #2741	2016-11-16 14:06:45 +05:30
Keuntae Park	094f5b851b	Support Min/Max for Timestamp (#3299 ) * Min/Max aggregator for Timestamp * remove unused imports and method * rebase and zip the test data * add docs	2016-11-14 23:00:21 -08:00
Joan Viladrosa	2df98bcaa6	Fixed Missing commas in json example of Lookup (#3680 )	2016-11-15 14:56:18 +09:00
Gian Merlino	bcd20441be	Make buildV9Directly the default. (#3688 )	2016-11-14 09:29:32 -08:00
praveev	52a74cf84f	Use timestamp in millis as Map key instead of DateTime object (#3674 ) * Use Long timestamp as key instead of DateTime. DateTime representation is screwed up when you store with an obj and read with a different DateTime obj. For example: The code below fails when you use DateTime as key ``` DateTime odt = DateTime.now(DateTimeUtils.getZone(DateTimeZone.forID("America/Los_Angeles"))); HashMap<DateTime, String> map = new HashMap<>(); map.put(odt, "abc"); DateTime dt = new DateTime(odt.getMillis()); System.out.println(map.get(dt)); ``` * Respect timezone when creating the file. * Update docs with timezone caveat in granularity spec * Remove unused imports	2016-11-11 10:20:20 -08:00
Himanshu	b76b3f8d85	reset-cluster command to clean up druid state stored on metadata and deep storage (#3670 )	2016-11-09 11:07:01 -06:00
Mark	575aeb843a	Metadata Storage extension for Microsoft SqlServer (sqlserver-metadata-storage) (#3421 )	2016-11-08 14:56:52 -08:00
Nicolas Colomer	37ecffb648	Add support for Confluent Schema Registry in the avro extension (#3529 )	2016-11-08 16:10:45 -06:00
cheddar	c49a9d5693	Call out semver expectations for modules (#3659 ) * Call out semver expectations for modules * Update modules.md * Link to versioning	2016-11-04 12:52:05 -07:00
Gian Merlino	2c504b6258	Add "like" filter. (#3642 ) * Add "like" filter. * Addressed some PR comments. * Slight simplifications to LikeFilter. * Additional simplifications. * Fix comment in LikeFilter. * Clarify comment in LikeFilter. * Simplify LikeMatcher a bit. * No use going through the optimized path if prefix is empty. * Add more tests.	2016-11-04 23:25:03 +05:30
Gian Merlino	4203580290	URIExtractionNamespace: Treat null values in lookup maps as missing entries. (#3512 ) * URIExtractionNamespace: Treat null values in lookup maps as missing entries. This is useful when many logical lookups are derived from the same base JSON file, and some lookups' values may be unknown sometimes. * Add test, logging message, and address other comments. * Update docs.	2016-11-03 13:53:04 -07:00
Navis Ryu	e10def32f2	Support string type in math expression (#2836 ) * Support string type in math expression addressed comments addressed comments Addressed comments * Updated math function document * Addressed comments	2016-11-02 21:10:48 -06:00
Himanshu	641469fc38	manage overshadowing efficiently at coordinator (#3584 ) * manage overshadowing efficiently at coordinator * take readlock in VersionedIntervalTimeline.isOvershadowed()	2016-10-24 22:49:08 +05:30
jaehong choi	6f21778364	Support finding segments in AWS S3. (#3399 ) * support finding segments from an AWS S3 storage. * add more Uts * address comments and add a document for the feature. * update docs indentation * update docs indentation * address comments. 1. add a Ut for json ser/deser for the config object. 2. more informative error message in a Ut. * address comments. 1. use @Min to validate the configuration object 2. change updateDescriptor to a string as it does not take an argument otherwise * fix a Ut failure - delete a Ut for testing default max length.	2016-10-10 17:27:09 -07:00


1280 Commits