mirror of https://github.com/apache/druid.git
more edits
parent aec12ee3cc
commit e863afd375
Binary file not shown.
@@ -332,12 +332,9 @@ we also include results from synthetic workloads on TPC-H data.
 
 \subsection{Query Performance}
 Query latencies are shown in Figure~\ref{fig:query_latency} for a cluster
-holding 10TB of data across several hundred nodes. The average queries per
-minute during this time was approximately 1000. The number of dimensions the
-various data sources vary from 25 to 78 dimensions, and 8 to 35 metrics. Across
-all the various data sources, average query latency is approximately 550
-milliseconds, with 90\% of queries returning in less than 1 second, 95\% in
-under 2 seconds, and 99\% of queries returning in less than 10 seconds.
+hosting approximately 10.5TB of data using 1302 processing threads and 672
+total cores (hyperthreaded). There are approximately 50 billion rows of data in
+this cluster.
 
 \begin{figure}
 \centering
@@ -346,6 +343,20 @@ under 2 seconds, and 99\% of queries returning in less than 10 seconds.
 \label{fig:query_latency}
 \end{figure}
 
+\begin{figure}
+\centering
+\includegraphics[width = 2.3in]{tpch_100gb}
+\caption{Druid \& MySQL benchmarks -- 100GB TPC-H data.}
+\label{fig:tpch_100gb}
+\end{figure}
+
+The average number of queries per minute during this time was approximately
+1000. Across the various data sources, the number of dimensions varies from 25
+to 78 and the number of metrics from 8 to 35. Average query latency across all
+data sources is approximately 550 milliseconds, with 90\% of queries returning
+in less than 1 second, 95\% in under 2 seconds, and 99\% in less than 10
+seconds.
+
 Approximately 30\% of the queries are standard
 aggregates involving different types of metrics and filters, 60\% of queries
 are ordered group bys over one or more dimensions with aggregates, and 10\% of
@@ -354,13 +365,6 @@ columns scanned in aggregate queries roughly follows an exponential
 distribution. Queries involving a single column are very frequent, and queries
 involving all columns are very rare.
 
-\begin{figure}
-\centering
-\includegraphics[width = 2.3in]{tpch_100gb}
-\caption{Druid \& MySQL benchmarks -- 100GB TPC-H data.}
-\label{fig:tpch_100gb}
-\end{figure}
-
 We also present Druid benchmarks on TPC-H data. Most TPC-H queries do
 not directly apply to Druid, so we selected queries more typical of Druid's
 workload to demonstrate query performance. As a comparison, we also provide the
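
As a rough sanity check on the cluster figures quoted in this revision (10.5TB, 50 billion rows, 672 cores, about 1000 queries per minute), the per-row, per-core, and per-second quantities can be worked out as below. This is a back-of-the-envelope sketch only: it assumes decimal terabytes ($1\,\mathrm{TB} = 10^{12}$ bytes) and an even spread of rows across cores, neither of which the text states, and the symbols $b_{\mathrm{row}}$, $r_{\mathrm{core}}$, and $q_{\mathrm{sec}}$ are introduced here purely for illustration.

```latex
% Derived quantities, assuming 1 TB = 10^{12} bytes and uniform row placement.
\begin{align*}
  b_{\mathrm{row}}  &\approx \frac{10.5 \times 10^{12}\ \text{bytes}}{50 \times 10^{9}\ \text{rows}}
                     = 210\ \text{bytes per row}, \\
  r_{\mathrm{core}} &\approx \frac{50 \times 10^{9}\ \text{rows}}{672\ \text{cores}}
                     \approx 7.4 \times 10^{7}\ \text{rows per core}, \\
  q_{\mathrm{sec}}  &\approx \frac{1000\ \text{queries}}{60\ \text{s}}
                     \approx 16.7\ \text{queries per second}.
\end{align*}
```

These are averages over the whole cluster; since the text reports that most queries touch few columns, the data actually scanned per query is likely far smaller than the per-core row count alone would suggest.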