druid/sql-jdbc.md at b77eacf69a104420e2e99c0b4ca41e4770c49639

id	title	sidebar_label
sql-jdbc	SQL JDBC driver API	SQL JDBC driver

:::info Apache Druid supports two query languages: Druid SQL and native queries. This document describes the SQL language. :::

You can make Druid SQL queries using the Avatica JDBC driver. We recommend using Avatica JDBC driver version 1.23.0 or later. Note that starting with Avatica 1.21.0, you may need to set the transparent_reconnection property to true if you notice intermittent query failures.

Once you've downloaded the Avatica client jar, add it to your classpath.

Example connection string:

jdbc:avatica:remote:url=http://localhost:8888/druid/v2/sql/avatica/;transparent_reconnection=true

Or, to use the protobuf protocol instead of JSON:

jdbc:avatica:remote:url=http://localhost:8888/druid/v2/sql/avatica-protobuf/;transparent_reconnection=true;serialization=protobuf

The url is the /druid/v2/sql/avatica/ endpoint on the Router, which routes JDBC connections to a consistent Broker. For more information, see Connection stickiness.

Set transparent_reconnection to true so your connection is not interrupted if the pool of Brokers changes membership, or if a Broker is restarted.

Set serialization to protobuf if using the protobuf endpoint.
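The rules above for assembling the connection string can be sketched as a small helper. This is an illustrative sketch only: the class name `AvaticaUrlSketch` and the method `connectionUrl` are hypothetical and not part of Druid or Avatica. The helper assumes you pass the Router's base URL (for example `http://localhost:8888`) and always enables `transparent_reconnection`.

```java
final class AvaticaUrlSketch {
    private AvaticaUrlSketch() {
    }

    // Hypothetical helper: builds an Avatica connection string for a Druid Router.
    // routerBaseUrl is something like "http://localhost:8888".
    static String connectionUrl(String routerBaseUrl, boolean useProtobuf) {
        // Drop a trailing slash so the endpoint path is not doubled.
        String base = routerBaseUrl.endsWith("/")
                ? routerBaseUrl.substring(0, routerBaseUrl.length() - 1)
                : routerBaseUrl;

        StringBuilder url = new StringBuilder("jdbc:avatica:remote:url=")
                .append(base)
                .append(useProtobuf ? "/druid/v2/sql/avatica-protobuf/" : "/druid/v2/sql/avatica/")
                .append(";transparent_reconnection=true");

        // The protobuf endpoint also needs the matching serialization property.
        if (useProtobuf) {
            url.append(";serialization=protobuf");
        }
        return url.toString();
    }
}
```

Calling `AvaticaUrlSketch.connectionUrl("http://localhost:8888", true)` produces the protobuf connection string shown earlier, and passing `false` produces the JSON one.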

As of Avatica 1.23.0, the latest version at the time of writing, the driver does not support passing connection context parameters from the JDBC connection string to Druid. Pass these context parameters using a Properties object instead. Refer to the Java code below for an example.

Example Java code:

// Connect to /druid/v2/sql/avatica/ on your Router.
String url = "jdbc:avatica:remote:url=http://localhost:8888/druid/v2/sql/avatica/;transparent_reconnection=true";

// Placeholder query; replace it with the SQL you want to run.
String query = "SELECT 1";

// Set any connection context parameters you need here.
// Any property from https://druid.apache.org/docs/latest/querying/sql-query-context.html can go here.
Properties connectionProperties = new Properties();
connectionProperties.setProperty("sqlTimeZone", "Etc/UTC");

// To connect to a Druid deployment protected by basic authentication,
// incorporate the authentication details described at https://druid.apache.org/docs/latest/operations/security-overview
connectionProperties.setProperty("user", "admin");
connectionProperties.setProperty("password", "password1");

try (Connection connection = DriverManager.getConnection(url, connectionProperties)) {
  try (
      final Statement statement = connection.createStatement();
      final ResultSet resultSet = statement.executeQuery(query)
  ) {
    while (resultSet.next()) {
      // process result set
    }
  }
}

For a runnable example that includes a query that you might run, see Examples.

You can also use a protocol buffers (protobuf) JDBC connection with Druid, which offers reduced bloat and potential performance improvements for larger result sets. To use it, apply the following connection URL instead; everything else remains the same:

String url = "jdbc:avatica:remote:url=http://localhost:8888/druid/v2/sql/avatica-protobuf/;transparent_reconnection=true;serialization=protobuf";

:::info The protobuf endpoint is also known to work with the official Golang Avatica driver :::

Table metadata is available over JDBC using connection.getMetaData() or by querying the INFORMATION_SCHEMA tables. For an example of this, see Get the metadata for a datasource.

Connection stickiness

Druid's JDBC server does not share connection state between Brokers. This means that if you're using JDBC and have multiple Druid Brokers, you should either connect to a specific Broker or use a load balancer with sticky sessions enabled. The Druid Router process provides connection stickiness when balancing JDBC requests, and can be used to achieve the necessary stickiness even with a normal non-sticky load balancer. Please see the Router documentation for more details.

Note that the non-JDBC JSON over HTTP API is stateless and does not require stickiness.
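For contrast with the stateful JDBC connection, the following sketch builds a single self-contained request for the JSON over HTTP API. It rests on assumptions not stated in this document: the Druid SQL HTTP endpoint `/druid/v2/sql`, a JSON body with a `query` field, and Java 11 or later for `java.net.http`. The class and method names are hypothetical, and the JSON escaping is deliberately minimal.

```java
import java.net.URI;
import java.net.http.HttpRequest;

final class StatelessSqlRequestSketch {
    private StatelessSqlRequestSketch() {
    }

    // Minimal JSON body builder. It escapes only backslashes and double quotes;
    // use a JSON library for anything beyond simple single-line queries.
    static String jsonBody(String sql) {
        String escaped = sql.replace("\\", "\\\\").replace("\"", "\\\"");
        return "{\"query\":\"" + escaped + "\"}";
    }

    // Builds (but does not send) a POST request. Each request carries the whole
    // query, so no connection state has to live on a particular Broker.
    static HttpRequest sqlRequest(String baseUrl, String sql) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/druid/v2/sql"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(jsonBody(sql)))
                .build();
    }
}
```

You could send the request with `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())`. Because nothing is remembered between requests, any Broker can serve each one.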

Dynamic parameters

You can use parameterized queries in JDBC code, as in this example:

PreparedStatement statement = connection.prepareStatement("SELECT COUNT(*) AS cnt FROM druid.foo WHERE dim1 = ? OR dim1 = ?");
statement.setString(1, "abc");
statement.setString(2, "def");
final ResultSet resultSet = statement.executeQuery();
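When the number of values is not known in advance, the placeholders can be generated before the statement is prepared. The following is a minimal sketch of that idea using only the standard library. The class `DynamicParameterSketch` and its methods are hypothetical, and the table and column names are taken from the example above.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.Collections;
import java.util.List;

final class DynamicParameterSketch {
    private DynamicParameterSketch() {
    }

    // Builds the example query with one "dim1 = ?" placeholder per value.
    static String countQuery(int valueCount) {
        if (valueCount < 1) {
            throw new IllegalArgumentException("At least one value is required");
        }
        return "SELECT COUNT(*) AS cnt FROM druid.foo WHERE "
                + String.join(" OR ", Collections.nCopies(valueCount, "dim1 = ?"));
    }

    // Prepares the statement and binds each value. The caller closes the statement.
    static PreparedStatement prepare(Connection connection, List<String> values) throws SQLException {
        PreparedStatement statement = connection.prepareStatement(countQuery(values.size()));
        try {
            for (int i = 0; i < values.size(); i++) {
                // JDBC parameter indexes start at 1.
                statement.setString(i + 1, values.get(i));
            }
            return statement;
        } catch (SQLException e) {
            statement.close();
            throw e;
        }
    }
}
```

Only the placeholders are concatenated into the SQL text. The values themselves are always bound as parameters.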

The following sample passes a list of values as a single comma-separated string parameter, which the query converts to an array using STRING_TO_ARRAY:

PreparedStatement statement = connection.prepareStatement("select l1 from numfoo where SCALAR_IN_ARRAY(l1, STRING_TO_ARRAY(CAST(? as varchar),','))");
List<Integer> li = ImmutableList.of(0, 7);
String sqlArg = Joiner.on(",").join(li);
statement.setString(1, sqlArg);
statement.executeQuery();
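The sample above relies on Guava's `ImmutableList` and `Joiner`. If Guava is not on your classpath, the comma-separated argument can be built with the standard library instead. This is a sketch only; `CsvArgumentSketch` and `toCsv` are hypothetical names.

```java
import java.util.List;
import java.util.stream.Collectors;

final class CsvArgumentSketch {
    private CsvArgumentSketch() {
    }

    // Joins the values with commas, matching the ',' delimiter
    // passed to STRING_TO_ARRAY in the query.
    static String toCsv(List<Integer> values) {
        return values.stream()
                .map(String::valueOf)
                .collect(Collectors.joining(","));
    }
}
```

You would then bind the result as before, for example `statement.setString(1, CsvArgumentSketch.toCsv(Arrays.asList(0, 7)))`.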

Sample code using native array:

PreparedStatement statement = connection.prepareStatement("select l1 from numfoo where SCALAR_IN_ARRAY(l1, ?)");
Iterable<Object> list = ImmutableList.of(0, 7);
ArrayFactoryImpl arrayFactoryImpl = new ArrayFactoryImpl(TimeZone.getDefault());
AvaticaType type = ColumnMetaData.scalar(Types.INTEGER, SqlType.INTEGER.name(), Rep.INTEGER);
Array array = arrayFactoryImpl.createArray(type, list);
statement.setArray(1, array);
statement.executeQuery();

Examples

The following section contains two complete samples that use the JDBC connector:

Get the metadata for a datasource shows you how to query the INFORMATION_SCHEMA to get metadata like column names.
Query data runs a select query against the datasource.

You can try out these examples after verifying that you meet the prerequisites.

For more information about the connection options, see Client Reference.

Prerequisites

Make sure you meet the following requirements before trying these examples:

A supported Java version
Avatica JDBC driver. You can add the JAR to your CLASSPATH directly or manage it externally, such as through Maven and a pom.xml file.
An available Druid instance. You can use the micro-quickstart configuration described in Quickstart (local). The examples assume that you are using the quickstart, so no authentication or authorization is expected unless explicitly mentioned.
The example wikipedia datasource from the quickstart is loaded on your Druid instance. If you have a different datasource loaded, you can still try these examples. You'll have to update the table name and column names to match your datasource.

Get the metadata for a datasource

Metadata, such as column names, is available either through the INFORMATION_SCHEMA table or through connection.getMetaData(). The following example uses the INFORMATION_SCHEMA table to retrieve and print the list of column names for the wikipedia datasource that you loaded during a previous tutorial.

import java.sql.*;
import java.util.Properties;

public class JdbcListColumns {

    public static void main(String[] args)
    {
        // Connect to /druid/v2/sql/avatica/ on your Router. 
        // You can connect to a Broker but must configure connection stickiness if you do. 
        String url = "jdbc:avatica:remote:url=http://localhost:8888/druid/v2/sql/avatica/;transparent_reconnection=true";

        String query = "SELECT COLUMN_NAME,* FROM INFORMATION_SCHEMA.COLUMNS WHERE TABLE_NAME = 'wikipedia' and TABLE_SCHEMA='druid'";

        // Set any connection context parameters you need here.
        // Any property from https://druid.apache.org/docs/latest/querying/sql-query-context.html can go here.
        Properties connectionProperties = new Properties();

        try (Connection connection = DriverManager.getConnection(url, connectionProperties)) {
            try (
                    final Statement statement = connection.createStatement();
                    final ResultSet rs = statement.executeQuery(query)
            ) {
                while (rs.next()) {
                    String columnName = rs.getString("COLUMN_NAME");
                    System.out.println(columnName);
                }
            }
        } catch (SQLException e) {
            throw new RuntimeException(e);
        }

    }
}
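The other route mentioned above, `connection.getMetaData()`, can be sketched with the standard JDBC `DatabaseMetaData.getColumns` call. The class `MetadataColumnsSketch` and the method `columnNames` are hypothetical names. The sketch assumes the connection was opened as in the previous example, and that the schema and table arguments are `druid` and `wikipedia`.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

final class MetadataColumnsSketch {
    private MetadataColumnsSketch() {
    }

    // Returns the column names reported by the JDBC metadata API for one table.
    static List<String> columnNames(Connection connection, String schema, String table) throws SQLException {
        List<String> names = new ArrayList<>();
        // A null catalog does not narrow the search; "%" matches every column name.
        try (ResultSet columns = connection.getMetaData().getColumns(null, schema, table, "%")) {
            while (columns.next()) {
                names.add(columns.getString("COLUMN_NAME"));
            }
        }
        return names;
    }
}
```

Note that JDBC treats the schema and table arguments as patterns, so a `%` or `_` in a name acts as a wildcard.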

Query data

Now that you know what columns are available, you can start querying the data. The following example queries the datasource named wikipedia for the timestamps and comments from Japan. It also sets the query context parameter sqlTimeZone. Optionally, you can also parameterize queries by using dynamic parameters.

import java.sql.*;
import java.util.Properties;

public class JdbcCountryAndTime {

    public static void main(String[] args)
    {
        // Connect to /druid/v2/sql/avatica/ on your Router. 
        // You can connect to a Broker but must configure connection stickiness if you do. 
        String url = "jdbc:avatica:remote:url=http://localhost:8888/druid/v2/sql/avatica/;transparent_reconnection=true";

        // The query you want to run.
        String query = "SELECT __time, isRobot, countryName, comment FROM wikipedia WHERE countryName='Japan'";

        // Set any connection context parameters you need here.
        // Any property from https://druid.apache.org/docs/latest/querying/sql-query-context.html can go here.
        Properties connectionProperties = new Properties();
        connectionProperties.setProperty("sqlTimeZone", "America/Los_Angeles");

        try (Connection connection = DriverManager.getConnection(url, connectionProperties)) {
            try (
                    final Statement statement = connection.createStatement();
                    final ResultSet rs = statement.executeQuery(query)
            ) {
                while (rs.next()) {
                    Timestamp timeStamp = rs.getTimestamp("__time");
                    String comment = rs.getString("comment");
                    System.out.println(timeStamp);
                    System.out.println(comment);
                }
            }
        } catch (SQLException e) {
            throw new RuntimeException(e);
        }

    }
}
