---
layout: doc_page
title: "Cassandra Deep Storage"
---
<!--
~ Licensed to the Apache Software Foundation (ASF) under one
~ or more contributor license agreements. See the NOTICE file
~ distributed with this work for additional information
~ regarding copyright ownership. The ASF licenses this file
~ to you under the Apache License, Version 2.0 (the
~ "License"); you may not use this file except in compliance
~ with the License. You may obtain a copy of the License at
~
~ http://www.apache.org/licenses/LICENSE-2.0
~
~ Unless required by applicable law or agreed to in writing,
~ software distributed under the License is distributed on an
~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
~ KIND, either express or implied. See the License for the
~ specific language governing permissions and limitations
~ under the License.
-->
# Cassandra Deep Storage
## Introduction
Druid can use Cassandra as a deep storage mechanism. Segments and their metadata are stored in Cassandra in two tables:
`index_storage` and `descriptor_storage`. Underneath the hood, the Cassandra integration leverages Astyanax. The
index storage table is a [Chunked Object](https://github.com/Netflix/astyanax/wiki/Chunked-Object-Store) repository. It contains
compressed segments for distribution to historical nodes. Since segments can be large, the Chunked Object storage allows the integration to multi-thread
the write to Cassandra, and spreads the data across all the nodes in a cluster. The descriptor storage table is a normal C* table that
stores the segment metadata.
## Schema
Below are the create statements for each:
```sql
CREATE TABLE index_storage (
  key text,
  chunk text,
  value blob,
  PRIMARY KEY (key, chunk)
) WITH COMPACT STORAGE;

CREATE TABLE descriptor_storage (
  key varchar,
  lastModified timestamp,
  descriptor varchar,
  PRIMARY KEY (key)
) WITH COMPACT STORAGE;
```
## Getting Started
First create the schema above. I use a new keyspace called `druid` for this purpose, which can be created using the
[Cassandra CQL `CREATE KEYSPACE`](http://www.datastax.com/documentation/cql/3.1/cql/cql_reference/create_keyspace_r.html) command.
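For example, a minimal keyspace might look like the following. The replication settings here are illustrative only, sized for a single-node development setup; tune the strategy and replication factor for your own cluster.

```sql
-- Illustrative only: a simple keyspace for Druid deep storage.
-- SimpleStrategy with replication_factor 1 suits a single-node dev
-- cluster; use NetworkTopologyStrategy with a higher replication
-- factor in production.
CREATE KEYSPACE druid
  WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};
```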
Then, add the following to your historical and realtime runtime properties files to enable a Cassandra backend.
```properties
druid.extensions.loadList=["druid-cassandra-storage"]
druid.storage.type=c*
druid.storage.host=localhost:9160
druid.storage.keyspace=druid
```
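Once segments have been handed off, you can sanity-check that they are landing in deep storage by querying the descriptor table directly in `cqlsh`. This is just a sketch using the table and columns from the schema above; each row key corresponds to a segment identifier.

```sql
-- Run in cqlsh against the druid keyspace. If the integration is
-- working, each stored segment appears as a row in descriptor_storage.
SELECT key, lastModified FROM descriptor_storage LIMIT 10;
```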