Introduction
As data grows over time, storing every row forever becomes increasingly expensive and often unnecessary. Application logs, IoT events, monitoring metrics, audit records, and clickstream data typically have different retention requirements depending on their age.
ClickHouse® provides TTL (Time To Live), a powerful built-in feature that automates data lifecycle management. Instead of relying on scheduled cleanup scripts or manual DELETE operations, TTL rules can be defined directly within a table schema.
With TTL, ClickHouse can automatically:
- Delete expired data
- Move older data to lower-cost storage
- Recompress aging data using stronger codecs
- Aggregate historical records into summarized data
These capabilities make TTL one of the most effective storage optimization features available in ClickHouse.
What is TTL?
TTL (Time To Live) defines what should happen to data after a specified period of time.
A TTL expression is usually based on a Date or DateTime column.
TTL event_time + INTERVAL 30 DAY
After the specified interval expires, ClickHouse performs the configured action during background merge operations.
Unlike scheduled jobs, TTL is completely integrated into the MergeTree engine, making lifecycle management automatic and maintenance-free.
How TTL Works Internally
When new data is inserted into a MergeTree table, the following process occurs:
- Data is written into immutable parts.
- Background merge operations combine smaller parts into larger ones.
- During each merge, ClickHouse evaluates TTL expressions.
- Expired rows are deleted, moved, recompressed, or aggregated depending on the configured rule.
Because TTL works during merges:
- Expired data may remain visible for a short period.
- Cleanup is asynchronous.
- Execution depends on the merge schedule.
For testing purposes, you can force a merge using:
OPTIMIZE TABLE table_name FINAL;
1. Automatically Deleting Expired Data
The most common TTL use case is automatic data deletion.
CREATE TABLE logs
(
timestamp DateTime,
message String
)
ENGINE = MergeTree
ORDER BY timestamp
TTL timestamp + INTERVAL 30 DAY;
What happens?
- Newly inserted data remains available.
- After 30 days, rows become eligible for removal.
- During future background merges, ClickHouse automatically deletes expired rows.
This eliminates the need for cron jobs or periodic cleanup scripts while reducing storage usage automatically.
2. Column-Level TTL
TTL isn't limited to entire rows. It can also be applied to individual columns.
CREATE TABLE user_events
(
event_time DateTime,
user_id UInt64,
session_data String TTL event_time + INTERVAL 7 DAY
)
ENGINE = MergeTree
ORDER BY event_time;
Result
After seven days:
-
session_datais removed. - The remaining columns stay intact.
This is particularly useful when temporary or sensitive information only needs to be retained for a limited time.
3. Moving Data to Cold Storage
Many organizations use multiple storage tiers.
| Storage Tier | Purpose |
|---|---|
| SSD | Frequently accessed recent data |
| HDD / Object Storage | Historical or infrequently accessed data |
TTL can automatically move older data between storage volumes.
CREATE TABLE logs
(
timestamp DateTime,
message String
)
ENGINE = MergeTree
ORDER BY timestamp
TTL timestamp + INTERVAL 90 DAY
TO VOLUME 'cold_storage';
Result
- Recent data remains on fast storage.
- Older data is moved to lower-cost storage.
This helps reduce infrastructure costs while keeping historical data available when needed.
4. Recompressing Older Data
Recent data is queried frequently and benefits from fast compression codecs such as LZ4.
Historical data is accessed less often, making stronger compression more efficient.
TTL supports automatic recompression.
CREATE TABLE events
(
event_time DateTime,
user_id UInt64,
value Float64
)
ENGINE = MergeTree
ORDER BY event_time
TTL event_time + INTERVAL 30 DAY
RECOMPRESS CODEC(ZSTD(17));
Compression lifecycle
| Data Age | Compression |
|---|---|
| 0–30 Days | Default (LZ4) |
| After 30 Days | ZSTD(17) |
Benefits
- Lower storage consumption
- Better disk utilization
- Fully automated optimization
- No manual recompression required
5. Aggregating Historical Data with GROUP BY TTL
One of the most powerful TTL capabilities is automatic rollup aggregation.
Instead of deleting old data, ClickHouse can summarize historical records into aggregated values.
This is particularly useful for:
- Monitoring systems
- IoT platforms
- Clickstream analytics
- Time-series databases
Example
CREATE TABLE hits
(
timestamp DateTime,
id String,
hits Int32,
max_hits Int32 DEFAULT hits,
sum_hits Int64 DEFAULT hits
)
ENGINE = MergeTree
PRIMARY KEY (id, toStartOfDay(timestamp), timestamp)
TTL timestamp + INTERVAL 1 DAY
GROUP BY
id,
toStartOfDay(timestamp)
SET
max_hits = max(max_hits),
sum_hits = sum(sum_hits);
Before Rollup
| Timestamp | ID | Hits |
|---|---|---|
| 10:00 | A | 5 |
| 11:00 | A | 8 |
| 12:00 | A | 12 |
After one day, the TTL rule executes:
TTL timestamp + INTERVAL 1 DAY
GROUP BY
id,
toStartOfDay(timestamp)
SET
max_hits = max(max_hits),
sum_hits = sum(sum_hits);
After Rollup
| Day | ID | Max Hits | Sum Hits |
|---|---|---|---|
| 2026-06-20 | A | 12 | 25 |
Instead of storing three separate rows, ClickHouse stores one summarized row.
This significantly reduces storage while preserving meaningful historical insights.
Best Practices for Using TTL
- Use TTL for predictable retention policies instead of scheduled DELETE statements.
- Remember that TTL actions occur during background merges, not immediately after expiration.
- Apply column-level TTL when only specific fields need limited retention.
- Use storage tiering to reduce infrastructure costs without losing historical data.
- Combine TTL with recompression to maximize disk efficiency.
- Consider
GROUP BY TTLfor time-series workloads where summarized historical data is sufficient. - Use
OPTIMIZE TABLE ... FINALonly for testing, not as a regular maintenance task.
Conclusion
TTL is one of ClickHouse's most valuable automation features for managing data throughout its lifecycle. With a single declarative rule, you can automatically delete expired data, move historical records to lower-cost storage, apply stronger compression to older data, or aggregate detailed records into summarized insights.
By leveraging TTL effectively, organizations can reduce storage costs, simplify maintenance, improve long-term performance, and build scalable analytical systems without relying on external cleanup jobs or manual intervention.
Whether you're managing log data, monitoring metrics, IoT telemetry, or large-scale analytical workloads, TTL provides a clean and efficient approach to implementing automated data retention policies in ClickHouse.
United States
NORTH AMERICA
Related News
Corporativismo fascista e Taleb
7h ago
How I Built a Full Stack Laundry Management System Using Angular & Node.js
7h ago
SvGrid: a Svelte 5 native data grid (MIT core, headless + render component, MCP-ready)
10h ago
Turing's Last Cipher: The Lost Archive
10h ago
Why Most Custom AI Skills Never Run (And the One Fix)
11h ago