id: tutorial-jupyter-index
title: Jupyter Notebook tutorials

You can try out the Druid APIs using the Jupyter Notebook-based tutorials. Each tutorial provides snippets of Python code that you can run to make calls against the Druid API.

Prerequisites

Make sure you meet the following requirements before starting the Jupyter-based tutorials:

  • Python 3.7 or later

  • The requests package for Python. For example, you can install it with the following command:

    pip3 install requests
    
  • JupyterLab (recommended) or Jupyter Notebook running on a non-default port. By default, Druid and Jupyter both try to use port 8888, so start Jupyter on a different port.

    • Install JupyterLab or Notebook:

      # Install JupyterLab
      pip3 install jupyterlab
      # Install Jupyter Notebook
      pip3 install notebook
      
    • Start Jupyter using either JupyterLab:

      # Start JupyterLab on port 3001
      jupyter lab --port 3001
      

      Or using Jupyter Notebook:

      # Start Jupyter Notebook on port 3001
      jupyter notebook --port 3001
      
  • An available Druid instance. You can use the Quickstart (local) instance. The tutorials assume that you are using the quickstart, so no authentication or authorization is expected unless explicitly mentioned.

    If you contribute to Druid and work with the Druid integration tests, you can use a test cluster. The following commands assume that you have an environment variable, DRUID_DEV, that points to your Druid source repo.

    cd $DRUID_DEV
    ./it.sh build
    ./it.sh image
    ./it.sh up <category>
    

    Replace <category> with one of the available integration test categories. See the integration test README.md for details.
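
Before opening a notebook, it can help to confirm the prerequisites above from Python. The following is a minimal sketch, not part of the tutorials: the function names are illustrative, and the defaults assume the quickstart Router on port 8888 and Jupyter on port 3001, as in the examples above. The port check only tests whether something accepts TCP connections on the port. It does not confirm that the listener is Druid.

```python
import importlib.util
import socket
import sys


def python_ok(minimum=(3, 7)):
    """Return True if the running interpreter is at least `minimum`."""
    return sys.version_info >= minimum


def has_package(name):
    """Return True if a top-level package can be found without importing it."""
    return importlib.util.find_spec(name) is not None


def port_in_use(port, host="127.0.0.1", timeout=0.5):
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an error code otherwise.
        return sock.connect_ex((host, port)) == 0


def preflight(druid_port=8888, jupyter_port=3001):
    """Collect the checks in one dict. Ports are assumptions; adjust as needed."""
    return {
        "python >= 3.7": python_ok(),
        "requests installed": has_package("requests"),
        "Druid port answering": port_in_use(druid_port),
        "Jupyter port free": not port_in_use(jupyter_port),
    }

# Example: run before starting Jupyter, with Druid already up.
# for check, passed in preflight().items():
#     print(f"{check}: {'ok' if passed else 'FAILED'}")
```

Run the check before you start Jupyter. Once Jupyter is running on its port, the "Jupyter port free" entry reports False by design.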

Simple Druid API

One of the notebooks shows how to call the Druid REST API directly. The others focus on other topics and use a simple set of Python wrappers around the underlying REST API. The wrappers reside in the druidapi package within the notebooks directory. Although you can use the package in any Python program, its main purpose, at present, is to support these notebooks. See [Introduction to the Druid Python API](https://github.com/apache/druid/tree/master/examples/quickstart/jupyter-notebooks/python-api-tutorial.ipynb) for an overview of the Python API.
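
As a rough illustration of what a direct REST call looks like, the sketch below posts a SQL query to the Druid SQL endpoint, /druid/v2/sql. It assumes the quickstart Router at http://localhost:8888 with no authentication. It uses only the Python standard library so that it has no dependencies. The names build_sql_request and run_sql are illustrative and are not part of the druidapi package.

```python
import json
import urllib.request

# Assumption: the quickstart Router address. Change it for other deployments.
DRUID_URL = "http://localhost:8888"


def build_sql_request(query, base_url=DRUID_URL):
    """Build a POST request for the Druid SQL endpoint without sending it."""
    payload = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/druid/v2/sql",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_sql(query, base_url=DRUID_URL, timeout=10):
    """Send the query and return the decoded JSON response."""
    request = build_sql_request(query, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))

# Example (needs a running Druid instance):
# for row in run_sql("SELECT TABLE_NAME FROM INFORMATION_SCHEMA.TABLES"):
#     print(row)
```

With the requests package installed, the same call can be written as requests.post(url, json={"query": query}).json(), where url is the endpoint built above.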

Tutorials

The notebooks are located in the apache/druid repo. You can either clone the repo or download the notebooks you want individually.

The links that follow are the raw GitHub URLs, so you can use them to download a notebook directly, such as with wget, or manually through your web browser. If you save the file from your web browser and the browser appends a .txt extension, remove the extension so that the file name ends in .ipynb.
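
If you prefer to script the download, the following minimal sketch fetches a notebook from a raw URL and repairs a file that a browser saved with a trailing .txt. The function names are illustrative, and raw_url stands for whichever notebook link you choose.

```python
import shutil
import urllib.request
from pathlib import Path
from urllib.parse import urlparse


def fix_browser_extension(path):
    """Rename 'name.ipynb.txt' to 'name.ipynb'; return other paths unchanged."""
    path = Path(path)
    if path.name.endswith(".ipynb.txt"):
        target = path.with_suffix("")  # drops only the trailing '.txt'
        path.rename(target)
        return target
    return path


def download_notebook(raw_url, dest_dir="."):
    """Download raw_url into dest_dir, keeping the file name from the URL."""
    name = Path(urlparse(raw_url).path).name
    dest = Path(dest_dir) / name
    with urllib.request.urlopen(raw_url) as response, open(dest, "wb") as out:
        shutil.copyfileobj(response, out)
    return dest

# Example:
# download_notebook(raw_url, dest_dir="notebooks")
# fix_browser_extension("Downloads/some-notebook.ipynb.txt")
```

Downloading from Python keeps the file name from the URL, so the .txt problem only arises for files saved through a browser.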