mirror of https://github.com/apache/druid.git
pip install for Python Druid API (#13938)
Broken test appears unrelated to this PR

* make druidapi pip installable
* include druidapi in prerequisites
* add license to setup.py
* updates from Paul's review
* note about editable install
* Apply suggestions from code review

Co-authored-by: 317brian <53799971+317brian@users.noreply.github.com>

* update install instructions
* found unrelated typos
* standardize install cmd with pip

---------

Co-authored-by: 317brian <53799971+317brian@users.noreply.github.com>
parent 1c7a03a47b
commit ede9903ff4
|
@ -30,3 +30,6 @@ integration-tests/gen-scripts/
|
|||
*.hprof
|
||||
**/.ipynb_checkpoints/
|
||||
*.pyc
|
||||
**/.ipython/
|
||||
**/.jupyter/
|
||||
**/.local/
|
||||
|
|
|
@ -46,7 +46,7 @@
|
|||
"- The `requests` package for Python. For example, you can install it with the following command:\n",
|
||||
"\n",
|
||||
" ```bash\n",
|
||||
" pip3 install requests\n",
|
||||
" pip install requests\n",
|
||||
" ````\n",
|
||||
"\n",
|
||||
"- JupyterLab (recommended) or Jupyter Notebook running on a non-default port. By default, Druid\n",
|
|
@ -564,7 +564,7 @@
|
|||
"id": "2654e72c",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"Use the REST client if you need to make calls that are not yet wrapped by the Python API, or if you want to do something special. To illustrate the client, you can make some of the same calls as in the [Druid REST API notebook](api_tutorial.ipynb).\n",
|
||||
"Use the REST client if you need to make calls that are not yet wrapped by the Python API, or if you want to do something special. To illustrate the client, you can make some of the same calls as in the [Druid REST API notebook](api-tutorial.ipynb).\n",
|
||||
"\n",
|
||||
"The REST API maintains the Druid host: you just provide the specific URL tail. There are methods to get or post JSON results. For example, to get status information:"
|
||||
]
|
||||
|
|
|
@ -1,9 +1,9 @@
|
|||
# Jupyter Notebook tutorials for Druid
|
||||
|
||||
If you are reading this in Jupyter, switch over to the [- START HERE -](- START HERE -.ipynb]
|
||||
If you are reading this in Jupyter, switch over to the [0-START-HERE](0-START-HERE.ipynb)
|
||||
notebook instead.
|
||||
|
||||
<!-- This README, the "- START HERE -" notebook, and the tutorial-jupyter-index.md file in
|
||||
<!-- This README, the "0-START-HERE" notebook, and the tutorial-jupyter-index.md file in
|
||||
docs/tutorials share a lot of the same content. If you make a change in one place, update
|
||||
the other too. -->
|
||||
|
||||
|
@ -39,7 +39,7 @@ Make sure you meet the following requirements before starting the Jupyter-based
|
|||
- The `requests` package for Python. For example, you can install it with the following command:
|
||||
|
||||
```bash
|
||||
pip3 install requests
|
||||
pip install requests
|
||||
```
|
||||
|
||||
- JupyterLab (recommended) or Jupyter Notebook running on a non-default port. By default, Druid
|
||||
|
@ -49,9 +49,9 @@ Make sure you meet the following requirements before starting the Jupyter-based
|
|||
|
||||
```bash
|
||||
# Install JupyterLab
|
||||
pip3 install jupyterlab
|
||||
pip install jupyterlab
|
||||
# Install Jupyter Notebook
|
||||
pip3 install notebook
|
||||
pip install notebook
|
||||
```
|
||||
- Start Jupyter using either JupyterLab
|
||||
```bash
|
||||
|
@ -65,8 +65,15 @@ Make sure you meet the following requirements before starting the Jupyter-based
|
|||
jupyter notebook --port 3001
|
||||
```
|
||||
|
||||
- An available Druid instance. You can use the `micro-quickstart` configuration
|
||||
described in [Quickstart](https://druid.apache.org/docs/latest/tutorials/index.html).
|
||||
- The Python API client for Druid. Clone the Druid repo if you haven't already.
|
||||
Go to your Druid source repo and install `druidapi` with the following commands:
|
||||
|
||||
```bash
|
||||
cd examples/quickstart/jupyter-notebooks/druidapi
|
||||
pip install .
|
||||
```
|
||||
|
||||
- An available Druid instance. You can use the [quickstart deployment](https://druid.apache.org/docs/latest/tutorials/index.html).
|
||||
The tutorials assume that you are using the quickstart, so no authentication or authorization
|
||||
is expected unless explicitly mentioned.
|
||||
|
||||
|
@ -85,4 +92,4 @@ Make sure you meet the following requirements before starting the Jupyter-based
|
|||
|
||||
## Continue in Jupyter
|
||||
|
||||
Start Jupyter (see above) and navigate to the "- START HERE -" page for more information.
|
||||
Start Jupyter (see above) and navigate to the "0-START-HERE" notebook for more information.
|
||||
|
|
|
@ -63,7 +63,7 @@
|
|||
"Install the [Requests](https://requests.readthedocs.io/en/latest/) library for Python before you start. For example:\n",
|
||||
"\n",
|
||||
"```bash\n",
|
||||
"pip3 install requests\n",
|
||||
"pip install requests\n",
|
||||
"```\n",
|
||||
"\n",
|
||||
"Please read the [Requests Quickstart](https://requests.readthedocs.io/en/latest/user/quickstart/) to gain a basic understanding of how Requests works.\n",
|
||||
|
|
|
@ -28,6 +28,9 @@ in any Python environment, but is optimized for use in Jupyter, providing a comp
|
|||
environment which complements the UI-based Druid console. The primary use of `druidapi` at present
|
||||
is to support the set of tutorial notebooks provided in the parent directory.
|
||||
|
||||
`druidapi` works against any version of Druid. Operations that make use of newer features obviously work
|
||||
only against versions of Druid that support those features.
|
||||
|
||||
## Install
|
||||
|
||||
At present, the best way to use `druidapi` is to clone the Druid repo itself:
|
||||
|
@ -36,21 +39,29 @@ At present, the best way to use `druidapi` is to clone the Druid repo itself:
|
|||
git clone git@github.com:apache/druid.git
|
||||
```
|
||||
|
||||
`druidapi` is located in `examples/quickstart/jupyter-notebooks/druidapi/`
|
||||
`druidapi` is located in `examples/quickstart/jupyter-notebooks/druidapi/`.
|
||||
From this directory, install the package and its dependencies with pip using the following command:
|
||||
|
||||
Eventually we would like to create a Python package that can be installed with `pip`. Contributions
|
||||
in that area are welcome.
|
||||
```
|
||||
pip install .
|
||||
```
|
||||
|
||||
Dependencies are listed in `requirements.txt`.
|
||||
Note that there is a second level `druidapi` directory that contains the modules. Do not run
|
||||
the install command in the subdirectory.
|
||||
|
||||
`druidapi` works against any version of Druid. Operations that exploit newer features obviously work
|
||||
only against versions of Druid that support those features.
|
||||
Verify your installation by checking that the following command runs in Python:
|
||||
|
||||
## Getting Started
|
||||
```python
|
||||
import druidapi
|
||||
```
|
||||
|
||||
The import statement should not return anything if it runs successfully.
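
As a minimal sketch of the same check done programmatically, the snippet below asks Python whether it can locate a module without actually importing it. The helper name `is_installed` is hypothetical and not part of `druidapi`; it relies only on the standard library.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if Python can locate the named top-level module."""
    # find_spec returns None when a top-level module cannot be found.
    return importlib.util.find_spec(module_name) is not None


# "druidapi" is the package name used in this README.
print("druidapi installed:", is_installed("druidapi"))
```

If this prints `False`, a likely cause is that `pip install .` ran in a different Python environment than the one running the check.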
|
||||
|
||||
## Getting started
|
||||
|
||||
To use `druidapi`, first import the library, then connect to your cluster by providing the URL to your Router instance. The way that is done differs a bit between consumers.
|
||||
|
||||
### From a Tutorial Jupyter Notebook
|
||||
### From a tutorial Jupyter notebook
|
||||
|
||||
The tutorial Jupyter notebooks in `examples/quickstart/jupyter-notebooks` reside in the same directory tree
|
||||
as this library. We start the library using the Jupyter-oriented API which is able to render tables in
|
||||
|
@ -70,40 +81,17 @@ druid = druidapi.jupyter_client(router_endpoint)
|
|||
The `jupyter_client` call defines a number of CSS styles to aid in displaying tabular results. It also
|
||||
provides a "display" client that renders information as HTML tables.
|
||||
|
||||
### From Any Other Jupyter Notebook
|
||||
|
||||
If you create a Jupyter notebook outside of the `jupyter-notebooks` directory then you must tell Python where
|
||||
to find the `druidapi` library. (This step is temporary until `druidapi` is properly packaged.)
|
||||
|
||||
First, set a variable to point to the location where you cloned the Druid git repo:
|
||||
|
||||
```python
|
||||
druid_dev = '/path/to/Druid-repo'
|
||||
```
|
||||
|
||||
Then, add the notebooks directory to Python's module search path:
|
||||
|
||||
```python
|
||||
import sys
|
||||
sys.path.append(druid_dev + '/examples/quickstart/jupyter-notebooks/')
|
||||
```
|
||||
|
||||
Now you can import `druidapi` and create a client as shown in the previous section.
|
||||
|
||||
### From a Python Script
|
||||
### From a Python script
|
||||
|
||||
`druidapi` works in any Python script. When run outside of a Jupyter notebook, the various "display"
|
||||
commands revert to displaying a text (not HTML) format. The steps are similar to those above:
|
||||
|
||||
```python
|
||||
druid_dev = '/path/to/Druid-repo'
|
||||
import sys
|
||||
sys.path.append(druid_dev + '/examples/quickstart/jupyter-notebooks/')
|
||||
import druidapi
|
||||
druid = druidapi.client(router_endpoint)
|
||||
```
|
||||
|
||||
## Library Organization
|
||||
## Library organization
|
||||
|
||||
`druidapi` organizes Druid REST operations into various "clients," each of which provides operations
|
||||
for one of Druid's functional areas. Obtain a client from the `druid` client created above. For
|
||||
|
@ -127,7 +115,7 @@ available as properties on the `druid` object created above.
|
|||
* `display` - A set of convenience operations to display results as lightly formatted tables
|
||||
in either HTML (for Jupyter notebooks) or text (for other Python scripts).
|
||||
|
||||
## Assumed Cluster Architecture
|
||||
## Assumed cluster architecture
|
||||
|
||||
`druidapi` assumes that you run a standard Druid cluster with a Router in front of the other nodes.
|
||||
This design works well for most Druid clusters:
|
||||
|
@ -148,7 +136,7 @@ The one exception to this rule is if you want to perform a health check (i.e. th
|
|||
on a service other than the Router. These checks are _not_ proxied by the Router: you must connect to
|
||||
the target service directly.
|
||||
|
||||
## Status Operations
|
||||
## Status operations
|
||||
|
||||
When working with tutorials, a local Druid cluster, or a Druid integration test cluster, it is common
|
||||
to start your cluster then immediately start performing `druidapi` operations. However, because Druid
|
||||
|
@ -183,7 +171,7 @@ extension is loaded:
|
|||
status_client.properties['druid.extensions.loadList']
|
||||
```
|
||||
|
||||
## Display Client
|
||||
## Display client
|
||||
|
||||
When run in a Jupyter notebook, it is often handy to format results for display. A special display
|
||||
client performs operations _and_ formats them for display as HTML tables within the notebook.
|
||||
|
@ -204,7 +192,7 @@ The most common methods are:
|
|||
The display client also has other methods to format data as a table, to display various kinds
|
||||
of messages and so on.
|
||||
|
||||
## Interactive Queries
|
||||
## Interactive queries
|
||||
|
||||
The original [`pydruid`](https://pythonhosted.org/pydruid/) library revolves around Druid
|
||||
"native" queries. Most new applications now use SQL. `druidapi` provides two ways to run
|
||||
|
@ -264,7 +252,7 @@ channel count
|
|||
|
||||
Within Jupyter, the results are formatted as an HTML table.
|
||||
|
||||
### Advanced Queries
|
||||
### Advanced queries
|
||||
|
||||
In addition to the SQL text, Druid also lets you specify:
|
||||
|
||||
|
@ -350,7 +338,7 @@ resp.show()
|
|||
In fact, the display client `sql()` method uses the `resp.show()` method internally, which in turn uses the
|
||||
`rows` and `schema` properties.
|
||||
|
||||
### Run a Query and Return Results
|
||||
### Run a query and return results
|
||||
|
||||
The above forms are handy for interactive use in a notebook. If you just need to run a query to use the results
|
||||
in code, just do the following:
|
||||
|
@ -366,7 +354,7 @@ sql = 'SELECT * FROM {}'
|
|||
rows = sql_client.sql(sql, ['myTable'])
|
||||
```
|
||||
|
||||
## MSQ Queries
|
||||
## MSQ queries
|
||||
|
||||
The SQL client can also run an MSQ query. See the `sql-tutorial.ipynb` notebook for examples. First define the
|
||||
query:
|
||||
|
@ -408,7 +396,7 @@ while for Druid to load the resulting segments, so you must wait for the table t
|
|||
sql_client.wait_until_ready('myTable')
|
||||
```
|
||||
|
||||
## Datasource Operations
|
||||
## Datasource operations
|
||||
|
||||
To get information about a datasource, prefer to query the `INFORMATION_SCHEMA` tables, or use the methods
|
||||
in the display client. Use the datasource client for other operations.
|
||||
|
@ -425,7 +413,7 @@ datasources.drop('myWiki', True)
|
|||
|
||||
The `True` argument asks for "if exists" semantics, so you don't get an error if the datasource does not exist.
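
To illustrate what "if exists" semantics mean in general, here is a small in-memory sketch. The `InMemoryDatasources` class is a hypothetical stand-in written for this example; it does not call Druid and is not how the `druidapi` datasource client is implemented.

```python
class InMemoryDatasources:
    """Hypothetical stand-in that tracks datasource names in a set."""

    def __init__(self, names):
        self._names = set(names)

    def drop(self, name: str, if_exists: bool = False) -> bool:
        """Remove a name. Return True if something was removed."""
        if name not in self._names:
            if if_exists:
                # "If exists" semantics: a missing name is not an error.
                return False
            raise KeyError(f"datasource not found: {name}")
        self._names.remove(name)
        return True


datasources = InMemoryDatasources(["myWiki"])
datasources.drop("myWiki", True)  # removes the entry
datasources.drop("myWiki", True)  # already gone, silently ignored
```

Without the flag, the second call would raise an error, which is the behavior the `True` argument avoids.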
|
||||
|
||||
## REST Client
|
||||
## REST client
|
||||
|
||||
The `druidapi` is based on a simple REST client which is itself based on the Requests library. If you
|
||||
need to use Druid REST APIs not yet wrapped by this library, you can use the REST client directly.
|
||||
|
@ -495,3 +483,28 @@ Druid has a large number of special constants: type names, options, etc. The con
|
|||
from druidapi import consts
|
||||
help(consts)
|
||||
```
|
||||
|
||||
## Contributing
|
||||
|
||||
We encourage you to contribute to the `druidapi` package.
|
||||
Set up an editable installation for development by running the following command
|
||||
in a local clone of your `apache/druid` repo in
|
||||
`examples/quickstart/jupyter-notebooks/druidapi/`:
|
||||
|
||||
```
|
||||
pip install -e .
|
||||
```
|
||||
|
||||
An editable installation allows you to implement and test changes iteratively
|
||||
without having to reinstall the package with every change.
|
||||
|
||||
When you update the package, also increment the version field in `setup.py` following the
|
||||
[PEP 440 semantic versioning scheme](https://peps.python.org/pep-0440/#semantic-versioning).
|
||||
|
||||
Use the following guidelines for incrementing the version number:
|
||||
* Increment the third position for a patch or bug fix.
|
||||
* Increment the second position for new features, such as adding new method wrappers.
|
||||
* Increment the first position for major changes and changes that are not backwards compatible.
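
The guidelines above can be sketched as a small helper. The function `bump_version` is hypothetical and not part of the repo. It assumes a plain three-part `X.Y.Z` version string, and it resets the lower positions to zero, which is a common convention rather than something these guidelines state.

```python
def bump_version(version: str, part: str) -> str:
    """Increment one position of an X.Y.Z version string."""
    major, minor, patch = (int(piece) for piece in version.split("."))
    if part == "major":
        # Major or backwards-incompatible changes: first position.
        return f"{major + 1}.0.0"
    if part == "minor":
        # New features, such as new method wrappers: second position.
        return f"{major}.{minor + 1}.0"
    if part == "patch":
        # Patches and bug fixes: third position.
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown part: {part!r}")


print(bump_version("0.1.0", "patch"))
```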
|
||||
|
||||
Submit your contribution by opening a pull request to the `apache/druid` GitHub repository.
|
||||
|
||||
|
|
|
@ -13,14 +13,14 @@
|
|||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
|
||||
from .druid import DruidClient
|
||||
from druidapi.druid import DruidClient
|
||||
|
||||
def jupyter_client(endpoint) -> DruidClient:
|
||||
'''
|
||||
Create a Druid client configured to display results as HTML within a Jupyter notebook.
|
||||
Waits for the cluster to become ready to avoid intermittent problems when using Druid.
|
||||
'''
|
||||
from .html import HtmlDisplayClient
|
||||
from druidapi.html_display import HtmlDisplayClient
|
||||
druid = DruidClient(endpoint, HtmlDisplayClient())
|
||||
druid.status.wait_until_ready()
|
||||
return druid
|
||||
|
@ -33,3 +33,4 @@ def client(endpoint) -> DruidClient:
|
|||
that the cluster has not yet fully started.
|
||||
'''
|
||||
return DruidClient(endpoint)
|
||||
|
|
@ -14,8 +14,8 @@
|
|||
# limitations under the License.
|
||||
|
||||
import requests
|
||||
from .consts import COORD_BASE
|
||||
from .rest import check_error
|
||||
from druidapi.consts import COORD_BASE
|
||||
from druidapi.rest import check_error
|
||||
|
||||
# Catalog (new feature in Druid 26)
|
||||
CATALOG_BASE = COORD_BASE + '/catalog'
|
|
@ -14,9 +14,9 @@
|
|||
# limitations under the License.
|
||||
|
||||
import requests, time
|
||||
from .consts import COORD_BASE
|
||||
from .rest import check_error
|
||||
from .util import dict_get
|
||||
from druidapi.consts import COORD_BASE
|
||||
from druidapi.rest import check_error
|
||||
from druidapi.util import dict_get
|
||||
|
||||
REQ_DATASOURCES = COORD_BASE + '/datasources'
|
||||
REQ_DATASOURCE = REQ_DATASOURCES + '/{}'
|
|
@ -13,7 +13,7 @@
|
|||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
|
||||
from . import consts
|
||||
from druidapi import consts
|
||||
|
||||
class DisplayClient:
|
||||
'''
|
|
@ -13,12 +13,12 @@
|
|||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
|
||||
from .rest import DruidRestClient
|
||||
from .status import StatusClient
|
||||
from .catalog import CatalogClient
|
||||
from .sql import QueryClient
|
||||
from .tasks import TaskClient
|
||||
from .datasource import DatasourceClient
|
||||
from druidapi.rest import DruidRestClient
|
||||
from druidapi.status import StatusClient
|
||||
from druidapi.catalog import CatalogClient
|
||||
from druidapi.sql import QueryClient
|
||||
from druidapi.tasks import TaskClient
|
||||
from druidapi.datasource import DatasourceClient
|
||||
|
||||
class DruidClient:
|
||||
'''
|
||||
|
@ -36,7 +36,7 @@ class DruidClient:
|
|||
if display_client:
|
||||
self.display_client = display_client
|
||||
else:
|
||||
from .text import TextDisplayClient
|
||||
from druidapi.text_display import TextDisplayClient
|
||||
self.display_client = TextDisplayClient()
|
||||
self.display_client._druid = self
|
||||
|
|
@ -15,8 +15,8 @@
|
|||
|
||||
from IPython.display import display, HTML
|
||||
from html import escape
|
||||
from .display import DisplayClient
|
||||
from .base_table import BaseTable
|
||||
from druidapi.display import DisplayClient
|
||||
from druidapi.base_table import BaseTable
|
||||
|
||||
STYLES = '''
|
||||
<style>
|
|
@ -14,9 +14,9 @@
|
|||
# limitations under the License.
|
||||
|
||||
import requests
|
||||
from .util import dict_get
|
||||
from druidapi.util import dict_get
|
||||
from urllib.parse import quote
|
||||
from .error import ClientError
|
||||
from druidapi.error import ClientError
|
||||
|
||||
def check_error(response):
|
||||
'''
|
||||
|
@ -52,7 +52,7 @@ def check_error(response):
|
|||
# We have an explanation from Druid. Raise a Client exception
|
||||
raise ClientError(msg)
|
||||
|
||||
# Don't know what the Druid JSON is. Raise a Requetss exception, but
|
||||
# Don't know what the Druid JSON is. Raise a Requests exception, but
|
||||
# add on the JSON in the hopes that the caller can make use of it.
|
||||
try:
|
||||
response.raise_for_status()
|
|
@ -14,9 +14,9 @@
|
|||
# limitations under the License.
|
||||
|
||||
import time, requests
|
||||
from . import consts
|
||||
from .util import dict_get, split_table_name
|
||||
from .error import DruidError, ClientError
|
||||
from druidapi import consts
|
||||
from druidapi.util import dict_get, split_table_name
|
||||
from druidapi.error import DruidError, ClientError
|
||||
|
||||
REQ_SQL = consts.ROUTER_BASE + '/sql'
|
||||
REQ_SQL_TASK = REQ_SQL + '/task'
|
|
@ -13,7 +13,7 @@
|
|||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
|
||||
from .consts import OVERLORD_BASE
|
||||
from druidapi.consts import OVERLORD_BASE
|
||||
|
||||
REQ_TASKS = OVERLORD_BASE + '/tasks'
|
||||
REQ_POST_TASK = OVERLORD_BASE + '/task'
|
|
@ -13,8 +13,8 @@
|
|||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
|
||||
from .display import DisplayClient
|
||||
from .base_table import pad, BaseTable
|
||||
from druidapi.display import DisplayClient
|
||||
from druidapi.base_table import pad, BaseTable
|
||||
|
||||
alignments = ['', '^', '>']
|
||||
|
|
@ -13,7 +13,7 @@
|
|||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
|
||||
from .error import ClientError
|
||||
from druidapi.error import ClientError
|
||||
|
||||
def dict_get(dict, key, default=None):
|
||||
'''
|
|
@ -0,0 +1,37 @@
|
|||
# Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
# contributor license agreements. See the NOTICE file distributed with
|
||||
# this work for additional information regarding copyright ownership.
|
||||
# The ASF licenses this file to You under the Apache License, Version 2.0
|
||||
# (the "License"); you may not use this file except in compliance with
|
||||
# the License. You may obtain a copy of the License at
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
|
||||
from setuptools import setup, find_packages
|
||||
|
||||
setup(
|
||||
name='druidapi',
|
||||
version='0.1.0',
|
||||
description='Python API client for Apache Druid',
|
||||
url='https://github.com/apache/druid/tree/master/examples/quickstart/jupyter-notebooks/druidapi',
|
||||
author='Apache Druid project',
|
||||
author_email='dev@druid.apache.org',
|
||||
license='Apache License 2.0',
|
||||
packages=find_packages(),
|
||||
install_requires=['requests'],
|
||||
|
||||
classifiers=[
|
||||
'Development Status :: 3 - Alpha',
|
||||
'Intended Audience :: Developers',
|
||||
'Intended Audience :: End Users/Desktop',
|
||||
'License :: OSI Approved :: Apache Software License',
|
||||
'Operating System :: OS Independent',
|
||||
'Programming Language :: Python :: 3',
|
||||
],
|
||||
)
|