In Druid, you specify the data you want to update by time range, rather than by the primary key or dimension values typically used in transactional databases. Data outside the specified replacement time range remains unaffected.
You can use this Druid functionality to perform data updates, inserts, and deletes, similar to UPSERT functionality for transactional databases.
This tutorial shows you how to use the Druid SQL [REPLACE](../multi-stage-query/reference.md#replace) function with the OVERWRITE clause to update existing data.
Before you follow the steps in this tutorial, download Druid as described in [Quickstart (local)](index.md) and have it running on your local machine. You don't need to load any data into the Druid cluster.
You should be familiar with data querying in Druid. If you haven't already, go through the [Query data](../tutorials/tutorial-query.md) tutorial first.
Load a sample dataset using the [REPLACE](../multi-stage-query/reference.md#replace) and [EXTERN](../multi-stage-query/reference.md#extern-function) functions.
In Druid SQL, the REPLACE function can create a new [datasource](../design/storage.md) or update an existing datasource.
Note that the values in the `__time` column have changed to one day later.
## Overwrite records for a specific time range
You can use the REPLACE function to overwrite a specific time range of a datasource. When you overwrite a specific time range, that time range must align with the granularity specified in the PARTITIONED BY clause.
In the web console, open a new tab and run the following query to insert a new row and update specific rows. Note that the OVERWRITE WHERE clause tells the query to only update records for the date 2024-01-03.
```sql
REPLACE INTO "update_tutorial"
OVERWRITE WHERE "__time" >= TIMESTAMP'2024-01-03 00:00:00' AND "__time" < TIMESTAMP'2024-01-04 00:00:00'
```
* The `iguana` and `seahorse` rows have different numbers.
## Update a row using partial segment overshadowing
In Druid, you can overlay older data with newer data for the entire segment or portions of the segment within a particular partition.
This capability is called [overshadowing](../ingestion/tasks.md#overshadowing-between-segments).
You can use partial overshadowing to update a single row by adding a smaller time granularity segment on top of the existing data.
This is a less common variation of the usual approach, in which you replace the entire time chunk.
The following example demonstrates how to update data using partial overshadowing with mixed segment granularity.
Note the following important points about the example:
* The query updates the `number` value of a single record.
* The original datasource uses DAY segment granularity.
* The new data segment is at HOUR granularity and represents a time range that's smaller than the existing data.
* The OVERWRITE WHERE and WHERE TIME_IN_INTERVAL clauses specify the destination where the update occurs and the source of the update, respectively.
* The query replaces everything within the specified interval. To update only a subset of data in that interval, you have to carry forward all records, changing only what you want to change. You can accomplish that by using the [CASE](../querying/sql-functions.md#case) function in the SELECT list.
```sql
REPLACE INTO "update_tutorial"
OVERWRITE
WHERE "__time" >= TIMESTAMP'2024-01-03 05:00:00' AND "__time" < TIMESTAMP'2024-01-03 06:00:00'
SELECT
"__time",
"animal",
CAST(486 AS BIGINT) AS "number"
FROM "update_tutorial"
WHERE TIME_IN_INTERVAL("__time", '2024-01-03T05:01:35Z/PT1S')
PARTITIONED BY HOUR
```
When you perform partial segment overshadowing multiple times, you can create segment fragmentation that could affect query performance. Use [compaction](../data-management/compaction.md) to correct any fragmentation.
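As noted earlier, a REPLACE query replaces everything within the OVERWRITE WHERE interval, so updating only some records means carrying forward all of the others. The following is a minimal sketch of that pattern using CASE in the SELECT list. It assumes the `update_tutorial` datasource from this tutorial, DAY segment granularity, and that the record to change is the one at `2024-01-03 05:01:35`. The match condition and the replacement value are illustrative and are taken from the example above, so adjust them to your data.

```sql
-- Sketch: rewrite the whole 2024-01-03 day chunk, changing one record.
-- Assumption: the row to update is identified by its exact timestamp.
REPLACE INTO "update_tutorial"
OVERWRITE
WHERE "__time" >= TIMESTAMP'2024-01-03 00:00:00' AND "__time" < TIMESTAMP'2024-01-04 00:00:00'
SELECT
"__time",
"animal",
CASE
  -- Change only the matching record.
  WHEN "__time" = TIMESTAMP'2024-01-03 05:01:35' THEN CAST(486 AS BIGINT)
  -- Carry every other record in the interval forward unchanged.
  ELSE "number"
END AS "number"
FROM "update_tutorial"
WHERE TIME_IN_INTERVAL("__time", '2024-01-03T00:00:00Z/P1D')
PARTITIONED BY DAY
```

Because this variant replaces the full day at the datasource's existing granularity, it does not add a finer-grained segment on top of the original data. The trade-off is that the query reads and rewrites every record in the day rather than a single hour.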
* [Data updates](../data-management/update.md) for an overview of updating data in Druid.
* [Load files with SQL-based ingestion](../tutorials/tutorial-msq-extern.md) for generating a query that references externally hosted data.
* [Overwrite data with REPLACE](../multi-stage-query/concepts.md#overwrite-data-with-replace) for details on how the MSQ task engine executes SQL REPLACE queries.