7.8 KiB

Raw Blame History

name	description
timescaledb	Manage time-series data in PostgreSQL with TimescaleDB. Use this skill to install, configure, optimize, and interact with TimescaleDB for high-performance time-series data storage and analysis. This includes creating hypertables, continuous aggregates, handling data retention, and querying time-series data efficiently.

TimescaleDB Skill

Manage time-series data in PostgreSQL using TimescaleDB, extending PostgreSQL for high-performance time-series workloads.

When to Use This Skill

Use this skill when you need to:

Work with time-series data in PostgreSQL
Install and configure TimescaleDB
Create and manage hypertables
Optimize performance for time-series data
Implement continuous aggregates for rollup data
Manage data retention and compression
Query and analyze time-series data
Migrate existing PostgreSQL tables to hypertables
Integrate with other PostgreSQL tools and extensions

Not For / Boundaries

This skill is NOT for:

General PostgreSQL administration (use a specific PostgreSQL skill for that)
Deep database tuning unrelated to time-series performance
Replacing dedicated time-series databases if TimescaleDB's PostgreSQL foundation is not a requirement
Providing data visualization beyond basic SQL queries (use a BI tool or separate visualization library)

Quick Reference

Installation & Configuration

Install TimescaleDB Extension (Debian/Ubuntu):

sudo apt install -y postgresql-{{pg_version}}-timescaledb
sudo pg_createcluster {{pg_version}} main --start
sudo pg_ctlcluster {{pg_version}} main start
sudo -u postgres psql -c "CREATE EXTENSION IF NOT EXISTS timescaledb CASCADE;"

(Replace {{pg_version}} with your PostgreSQL version, e.g., 16)

Configuration (postgresql.conf):

# Add to postgresql.conf
shared_preload_libraries = 'timescaledb'
timescaledb.max_background_workers = 8 # Adjust based on CPU cores
max_connections = 100 # Adjust based on workload

(After changes, restart PostgreSQL: sudo systemctl restart postgresql)

Hypertables

Create Hypertables:

CREATE TABLE sensor_data (
  time TIMESTAMPTZ NOT NULL,
  device_id INT,
  temperature DOUBLE PRECISION,
  humidity DOUBLE PRECISION
);

SELECT create_hypertable('sensor_data', 'time');

Convert Existing Table to Hypertable:

SELECT create_hypertable('your_existing_table', 'time_column', migrate_data => true);

Show Hypertables:

\d+
SELECT * FROM timescaledb_information.hypertables;

Continuous Aggregates

Create Continuous Aggregate:

CREATE MATERIALIZED VIEW device_hourly_summary
WITH (timescaledb.continuous) AS
SELECT
  time_bucket('1 hour', time) AS bucket,
  device_id,
  AVG(temperature) AS avg_temp,
  MAX(temperature) AS max_temp
FROM sensor_data
GROUP BY time_bucket('1 hour', time), device_id
WITH NO DATA; -- Initially create without data

-- Refresh the continuous aggregate
CALL refresh_continuous_aggregate('device_hourly_summary', NULL, NULL);

Get Continuous Aggregates Info:

SELECT * FROM timescaledb_information.continuous_aggregates;

Data Retention & Compression

Set Data Retention Policy (Drop data older than 3 months):

SELECT add_retention_policy('sensor_data', INTERVAL '3 months');

Enable Compression (Compress data older than 7 days):

ALTER TABLE sensor_data SET (timescaledb.compress = TRUE);
SELECT add_compression_policy('sensor_data', INTERVAL '7 days');

Show Compression Status:

SELECT * FROM timescaledb_information.compression_settings;

Querying Time-Series Data

Basic Time-Range Query:

SELECT * FROM sensor_data
WHERE time >= NOW() - INTERVAL '1 day'
  AND time < NOW()
ORDER BY time DESC;

Gapfilling and Interpolation:

SELECT
  time_bucket('1 hour', time) AS bucket,
  AVG(temperature) AS avg_temp,
  locf(AVG(temperature)) OVER (ORDER BY time_bucket('1 hour', time)) AS avg_temp_locf
FROM sensor_data
GROUP BY bucket
ORDER BY bucket;

High-Performance Queries

Approximate Count:

SELECT COUNT(*) FROM sensor_data TABLESAMPLE BERNOULLI (1);

Top-N Queries:

SELECT time, device_id, temperature
FROM sensor_data
WHERE time >= NOW() - INTERVAL '1 day'
ORDER BY temperature DESC
LIMIT 10;

Examples

Example 1: IoT Sensor Data Pipeline

Input: Stream of sensor readings (time, device_id, value)
Steps:
1. Create a hypertable for iot_readings.
2. Ingest data into the hypertable.
3. Create a continuous aggregate to compute hourly average readings.
4. Query the continuous aggregate for a specific device's hourly trend.
5. Set a retention policy to keep only 1 year of raw data.
Expected output / acceptance: Efficient storage, automatic hourly rollups, and proper data pruning.

Example 2: Financial Tick Data Analysis

Input: High-frequency financial tick data (timestamp, symbol, price, volume)
Steps:
1. Create a hypertable tick_data with proper chunk sizing for high ingest rate.
2. Enable compression for older tick_data.
3. Query tick_data to calculate 5-minute VWAP (Volume Weighted Average Price) for a specific symbol.
4. Visualize the VWAP over the last trading day.
Expected output / acceptance: Ability to ingest and analyze millions of rows/second, with optimized storage and fast analytical queries.

Example 3: Monitoring System Metrics

Input: Server metrics (timestamp, host_id, cpu_usage, memory_usage, network_io)
Steps:
1. Create a hypertable system_metrics partitioned by time and host_id.
2. Use a time_bucket_gapfill query to find CPU usage for all hosts over the last 24 hours, filling in missing data points.
3. Create an alert based on MAX(cpu_usage) exceeding a threshold using a continuous aggregate.
Expected output / acceptance: Comprehensive monitoring with gap-filled data for visualization and real-time alerting.

References

references/installation.md: Detailed installation and setup
references/hypertables.md: Deep dive into hypertable management
references/continuous_aggregates.md: Advanced continuous aggregate techniques
references/compression.md: Comprehensive guide to data compression
references/api.md: TimescaleDB SQL functions and commands reference
references/performance.md: Performance tuning and best practices
references/getting_started.md: Official TimescaleDB Getting Started Guide
references/llms.md: Using TimescaleDB with LLMs (e.g., storing embeddings, RAG)
references/llms-full.md: Full LLM integration scenarios
references/tutorials.md: Official TimescaleDB Tutorials and Use Cases
references/time_buckets.md: Guide to time_bucket and gapfilling functions
references/hyperfunctions.md: Advanced analytical functions for time-series

Maintenance

Sources: Official TimescaleDB Documentation, GitHub repository, blog posts.
Last updated: 2025-12-17
Known limits: This skill focuses on core TimescaleDB features. Advanced PostgreSQL features (e.g., PostGIS, JSONB) are covered by other specialized skills.

Troubleshooting

Slow Queries

Ensure indexes are on time and other frequently queried columns.
Verify chunk sizing is appropriate for your data ingestion rate.
Use EXPLAIN ANALYZE to identify bottlenecks.
Consider creating continuous aggregates for frequently accessed aggregated data.

High Disk Usage

Implement data retention policies for older, less critical data.
Enable compression for older chunks.
Regularly run VACUUM ANALYZE on your tables.

Failed to Create Hypertable

Ensure the time column is TIMESTAMPTZ or a supported integer type.
The table must be empty or you must use migrate_data => true.
Check for existing triggers or foreign keys that might conflict.

This skill provides a robust foundation for managing time-series data with TimescaleDB!

7.8 KiB Raw Blame History