diff --git a/docs/tutorials/tutorial-jupyter-docker.md b/docs/tutorials/tutorial-jupyter-docker.md
index 6886e3d6e7b..0cb055d2fe2 100644
--- a/docs/tutorials/tutorial-jupyter-docker.md
+++ b/docs/tutorials/tutorial-jupyter-docker.md
@@ -33,12 +33,15 @@ You can run the following combination of applications:
 * [Jupyter only](#start-only-the-jupyter-container)
 * [Jupyter and Druid](#start-jupyter-and-druid)
 * [Jupyter, Druid, and Kafka](#start-jupyter-druid-and-kafka)
+* [Kafka and Jupyter](#start-kafka-and-jupyter)
 
 ## Prerequisites
 
 Jupyter in Docker requires that you have **Docker** and **Docker Compose**.
 We recommend installing these through [Docker Desktop](https://docs.docker.com/desktop/).
 
+For ARM-based devices, see [Tutorial setup for ARM-based devices](#tutorial-setup-for-arm-based-devices).
+
 ## Launch the Docker containers
 
 You run Docker Compose to launch Jupyter and optionally Druid or Kafka.
@@ -53,7 +56,7 @@ access the files in `druid/examples/quickstart/jupyter-notebooks/docker-jupyter
 
 ### Start only the Jupyter container
 
-If you already have Druid running locally, you can run only the Jupyter container to complete the tutorials.
+If you already have Druid running locally or on another machine, you can run the Docker containers for Jupyter only.
 In the same directory as `docker-compose.yaml`, start the application:
 
 ```bash
@@ -63,6 +66,11 @@ docker compose --profile jupyter up -d
 The Docker Compose file assigns `8889` for the Jupyter port.
 You can override the port number by setting the `JUPYTER_PORT` environment variable before starting the Docker application.
 
+If Druid is running locally on the same machine as Jupyter, open the tutorial and set the `host` variable to `host.docker.internal` before starting. For example:
+```python
+host = "host.docker.internal"
+```
+
 ### Start Jupyter and Druid
 
 Running Druid in Docker requires the `environment` file as well as an environment variable named `DRUID_VERSION`,
@@ -85,6 +93,26 @@ In the same directory as `docker-compose.yaml` and `environment`, start the appl
 DRUID_VERSION={{DRUIDVERSION}} docker compose --profile all-services up -d
 ```
 
+### Start Kafka and Jupyter
+
+If you already have Druid running externally, such as an existing cluster or dedicated Druid infrastructure, you can run the Docker containers for Kafka and Jupyter only.
+
+In the same directory as `docker-compose.yaml` and `environment`, start the application:
+
+```bash
+DRUID_VERSION={{DRUIDVERSION}} docker compose --profile kafka-jupyter up -d
+```
+
+If your external Druid instance runs on a different machine than the one hosting the Docker Compose environment, change the `host` variable in the notebook tutorial to the hostname or address of the machine where Druid is running.
+
+If Druid is running locally on the same machine as Jupyter, open the tutorial and set the `host` variable to `host.docker.internal` before starting. For example:
+
+```python
+host = "host.docker.internal"
+```
+
+To enable an external Druid instance to ingest data from the Kafka broker running in the Docker Compose environment, set the `bootstrap.servers` property in the Kafka ingestion spec to `localhost:9094` before ingesting. For reference, see [more on consumer properties](../development/extensions-core/kafka-supervisor-reference.md#more-on-consumerproperties).
+
 ### Update image from Docker Hub
 
 If you already have a local cache of the Jupyter image, you can update the image before running the application using the following command:
@@ -193,9 +221,30 @@ as well as the [Python client for Druid](tutorial-jupyter-index.md#python-api-fo
 
 You should now be able to access and complete the tutorials.
 
+## Tutorial setup for ARM-based devices
+
+For ARM-based devices, follow this setup to start Druid externally while keeping Kafka and Jupyter within the Docker Compose environment:
+
+1. Start Druid using the `start-druid` script. You can follow the [Quickstart (local)](./index.md) instructions. The tutorials
+   assume that you are using the quickstart, so no authentication or authorization is expected unless explicitly mentioned.
+2. Start either Jupyter only or Kafka and Jupyter using the following commands in the same directory as `docker-compose.yaml` and `environment`:
+
+   ```bash
+   # Start only Jupyter
+   docker compose --profile jupyter up -d
+
+   # Start Kafka and Jupyter
+   DRUID_VERSION={{DRUIDVERSION}} docker compose --profile kafka-jupyter up -d
+   ```
+
+3. If Druid is running locally on the same machine as Jupyter, open the tutorial and set the `host` variable to `host.docker.internal` before starting. For example:
+   ```python
+   host = "host.docker.internal"
+   ```
+4. If you are using Kafka to stream data into Druid and Druid is running locally on the same machine, update the consumer property `bootstrap.servers` to `localhost:9094`.
+
 ## Learn more
 
 See the following topics for more information:
 * [Jupyter Notebook tutorials](tutorial-jupyter-index.md) for the available Jupyter Notebook-based tutorials for Druid
-* [Tutorial: Run with Docker](docker.md) for running Druid from a Docker container
-
+* [Tutorial: Run with Docker](docker.md) for running Druid from a Docker container
\ No newline at end of file
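The `bootstrap.servers` setting described in the documentation above lives in the `consumerProperties` object of the Kafka ingestion (supervisor) spec. The following is a minimal, hypothetical sketch of that fragment as it might be built in a notebook; the topic name is a placeholder, and the `dataSchema` and `tuningConfig` sections are omitted.

```python
# Hypothetical fragment of a Kafka supervisor spec built in a notebook.
# Only the ioConfig portion relevant to the change above is shown.
kafka_ingestion_spec = {
    "type": "kafka",
    "spec": {
        "ioConfig": {
            "type": "kafka",
            "topic": "example-topic",  # placeholder topic name
            "consumerProperties": {
                # Points an external Druid running on the host at the Kafka
                # broker's EXTERNAL listener published by Docker Compose.
                "bootstrap.servers": "localhost:9094"
            },
        },
        # "dataSchema": {...} and "tuningConfig": {...} omitted for brevity
    },
}
```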
diff --git a/examples/quickstart/jupyter-notebooks/docker-jupyter/docker-compose-local.yaml b/examples/quickstart/jupyter-notebooks/docker-jupyter/docker-compose-local.yaml
index 3d7baef9052..197b8b3722e 100644
--- a/examples/quickstart/jupyter-notebooks/docker-jupyter/docker-compose-local.yaml
+++ b/examples/quickstart/jupyter-notebooks/docker-jupyter/docker-compose-local.yaml
@@ -58,12 +58,14 @@ services:
       # To learn about configuring Kafka for access across networks see
       # https://www.confluent.io/blog/kafka-client-cannot-connect-to-broker-on-aws-on-docker-etc/
       - "9092:9092"
+      - '9094:9094'
     depends_on:
       - zookeeper
     environment:
       - KAFKA_BROKER_ID=1
-      - KAFKA_CFG_LISTENERS=PLAINTEXT://:9092
-      - KAFKA_CFG_ADVERTISED_LISTENERS=PLAINTEXT://kafka:9092
+      - KAFKA_CFG_LISTENERS=PLAINTEXT://:9092,CONTROLLER://:9093,EXTERNAL://:9094
+      - KAFKA_CFG_ADVERTISED_LISTENERS=PLAINTEXT://kafka:9092,EXTERNAL://localhost:9094
+      - KAFKA_CFG_LISTENER_SECURITY_PROTOCOL_MAP=CONTROLLER:PLAINTEXT,EXTERNAL:PLAINTEXT,PLAINTEXT:PLAINTEXT
       - KAFKA_CFG_ZOOKEEPER_CONNECT=zookeeper:2181
       - ALLOW_PLAINTEXT_LISTENER=yes
       - KAFKA_ENABLE_KRAFT=false
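The listener changes above publish a new `EXTERNAL` listener on port 9094 that is advertised as `localhost:9094`, so processes on the Docker host (such as an external Druid instance) can reach the broker, while containers on the Compose network continue to use `kafka:9092`. As a rough sanity check, assuming the `kafka-python` package is installed on the host, something like the following sketch should reach the new listener; the topic name is a placeholder.

```python
# Rough connectivity check run from the Docker host, not from inside a container.
# Assumes the kafka-python package is installed on the host.
from kafka import KafkaProducer

producer = KafkaProducer(bootstrap_servers="localhost:9094")
producer.send("example-topic", b"hello from the host")  # placeholder topic
producer.flush()
producer.close()
```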
diff --git a/examples/quickstart/jupyter-notebooks/docker-jupyter/docker-compose.yaml b/examples/quickstart/jupyter-notebooks/docker-jupyter/docker-compose.yaml
index e6f2cd95ae9..932df6a22b4 100644
--- a/examples/quickstart/jupyter-notebooks/docker-jupyter/docker-compose.yaml
+++ b/examples/quickstart/jupyter-notebooks/docker-jupyter/docker-compose.yaml
@@ -58,12 +58,14 @@ services:
       # To learn about configuring Kafka for access across networks see
       # https://www.confluent.io/blog/kafka-client-cannot-connect-to-broker-on-aws-on-docker-etc/
       - "9092:9092"
+      - '9094:9094'
     depends_on:
       - zookeeper
     environment:
       - KAFKA_BROKER_ID=1
-      - KAFKA_CFG_LISTENERS=PLAINTEXT://:9092
-      - KAFKA_CFG_ADVERTISED_LISTENERS=PLAINTEXT://kafka:9092
+      - KAFKA_CFG_LISTENERS=PLAINTEXT://:9092,CONTROLLER://:9093,EXTERNAL://:9094
+      - KAFKA_CFG_ADVERTISED_LISTENERS=PLAINTEXT://kafka:9092,EXTERNAL://localhost:9094
+      - KAFKA_CFG_LISTENER_SECURITY_PROTOCOL_MAP=CONTROLLER:PLAINTEXT,EXTERNAL:PLAINTEXT,PLAINTEXT:PLAINTEXT
       - KAFKA_CFG_ZOOKEEPER_CONNECT=zookeeper:2181
       - ALLOW_PLAINTEXT_LISTENER=yes
       - KAFKA_ENABLE_KRAFT=false
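When Druid runs on the same machine that hosts the Compose environment, the `host = "host.docker.internal"` setting described in the documentation lets a notebook inside the Jupyter container reach Druid on the host. The following is a minimal connectivity sketch, assuming the quickstart's default router port 8888 and the `requests` package being available in the notebook environment.

```python
import requests

# Reach a Druid instance running on the Docker host from inside the Jupyter
# container. Assumes the quickstart default router port 8888.
host = "host.docker.internal"

response = requests.get(f"http://{host}:8888/status", timeout=10)
response.raise_for_status()
print(response.json()["version"])
```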