timedb is an opinionated schema and API built on top of PostgreSQL, designed to handle overlapping time series revisions and auditable human-in-the-loop updates.
Most time series systems assume a single immutable value per timestamp. timedb is built for domains where data is revised, forecasted, reviewed, and corrected over time.
timedb lets you:
- ⏱️ Retain "time-of-knowledge" history through a three-dimensional time series data model;
- ✍️ Make versioned ad-hoc updates to the time series data with annotations and tags; and
- 🔀 Represent both timestamp and time-interval time series simultaneously.
Most time series systems assume:
- one value per timestamp;
- immutable historical data; and
- no distinction between when something was true vs when it was known.
These assumptions become a major limitation in situations such as:
- forecasting, where multiple forecast revisions predict the same timestamp;
- backtesting, where algorithms require "time-of-knowledge" history;
- data communication, where an auditable history of updates is required;
- human review and correction, where values are manually adjusted, annotated, or validated over time; and
- late-arriving data and backfills, where new information must be incorporated without rewriting history.
In practice, teams work around these limitations by overwriting data, duplicating tables and columns, or encoding semantics in column names — making systems fragile, opaque, and hard to reason about.
timedb addresses this by making revisions, provenance, and temporal semantics explicit in the data model, rather than treating them as edge cases.
```bash
pip install timedb
```

```python
import timedb as td
import pandas as pd
from datetime import datetime, timezone, timedelta

# Create database schema (requires TIMEDB_DSN or DATABASE_URL env var)
td.create()

# Create time series data
base_time = datetime(2025, 1, 1, 0, 0, tzinfo=timezone.utc)
df = pd.DataFrame({
    'valid_time': [base_time + timedelta(hours=i) for i in range(24)],
    'value': [20.0 + i * 0.3 for i in range(24)]
})

# Insert time series
result = td.insert_batch(df=df)
print(f"Inserted series: {result.series_id}")

# Read data back
df_read = td.read()
print(df_read)
```

See the examples/ folder for interactive Jupyter notebooks demonstrating more advanced usage.
| Field | Type | Purpose |
|---|---|---|
| batch_id (primary key) | UUID | Unique identifier for the batch (generated by the API) |
| tenant_id | UUID | Tenant identifier for multi-tenant support |
| workflow_id (optional) | text | Identifier for the workflow/pipeline (NULL for manual insertions) |
| batch_start_time (optional) | timestamptz | When the batch started |
| batch_finish_time (optional) | timestamptz | When the batch finished |
| known_time | timestamptz | When the data was known/available (default: now()) |
| batch_params (optional) | jsonb | Parameters/config used for this batch (e.g., model version, API args) |
| inserted_at | timestamptz | When the row was inserted (default: now()) |
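As a concrete illustration, a batch row recorded for a nightly forecast run might look like the sketch below. All values are hypothetical, and the batch_id is shown only to indicate that the API generates it:

```python
# Hypothetical batch row for a nightly forecast run (illustrative values only).
example_batch = {
    "batch_id": "1d4f0c9a-7e2b-4c1d-9f3a-2b8e6d5c4a10",  # generated by the API
    "tenant_id": "9c2e4b7d-1a3f-4e5b-8c6d-0f1a2b3c4d5e",
    "workflow_id": "wind_forecast_nightly",   # NULL for manual insertions
    "batch_start_time": "2025-01-01T03:00:00Z",
    "batch_finish_time": "2025-01-01T03:05:00Z",
    "known_time": "2025-01-01T03:00:00Z",     # when the data became known
    "batch_params": {"model_version": "2.1", "horizon_hours": 48},
}
```

Note that known_time defaults to now() but can be set explicitly, for example when backfilling historical forecasts.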
| Field | Type | Purpose |
|---|---|---|
| series_id (primary key) | UUID | Unique identifier for the series |
| name | text | Parameter name (e.g., 'wind_power', 'temperature') |
| unit | text | Canonical unit for the series (e.g., 'MW', 'degC') |
| labels | jsonb | JSON object differentiating this series (e.g., {"site": "Gotland", "turbine": "T01"}) |
| description (optional) | text | Human-readable description |
| inserted_at | timestamptz | When the series was created (default: now()) |
Uniqueness constraint: (name, labels) - A series is uniquely identified by its name and labels. The unit is NOT part of the uniqueness constraint, so you cannot create two series with the same name+labels but different units.
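To make the identity rule concrete, here is a sketch using plain dictionaries (illustrative only, not actual API calls):

```python
# Two distinct series: same name, different labels.
series_a = {"name": "wind_power", "unit": "MW",
            "labels": {"site": "Gotland", "turbine": "T01"}}
series_b = {"name": "wind_power", "unit": "MW",
            "labels": {"site": "Gotland", "turbine": "T02"}}

# This one would violate the (name, labels) uniqueness constraint,
# because it differs from series_a only in unit:
series_c = {"name": "wind_power", "unit": "kW",
            "labels": {"site": "Gotland", "turbine": "T01"}}
```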
| Field | Type | Purpose |
|---|---|---|
| value_id (primary key) | attribute | Unique identifier for each version of a value |
| batch_id (foreign key) | attribute | References the batch that produced this value (batches_table.batch_id) |
| tenant_id | attribute | Tenant identifier for multi-tenant support |
| series_id (foreign key) | attribute | References the series (series_table.series_id) |
| valid_time | time dimension | Timestamp the value is valid for |
| valid_time_end (optional) | time dimension | Optional interval end time; NULL means point-in-time at valid_time |
| value (optional) | measure | The numeric value (nullable; NULL can be a valid stored value) |
| annotation (optional) | attribute | Optional human annotation (whitespace-only disallowed) |
| metadata (optional) | attribute | Optional JSONB metadata (e.g., {"quality": "good", "source": "manual"}) |
| tags (optional) | attribute | Optional semantic labels / quality flags (empty arrays disallowed; use NULL) |
| changed_by (optional) | attribute | User or service responsible for the change |
| change_time | time dimension | When this version row was created (default now()) |
| is_current | attribute | Whether this row is the active version for its key (default true) |
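To illustrate the versioning fields, here is a sketch of two version rows for the same (series_id, valid_time) key after a manual correction (all values are hypothetical):

```python
# Illustrative version history for one (series_id, valid_time) key.
original = {
    "valid_time": "2025-01-01T12:00:00Z",
    "value": 21.5,
    "is_current": False,          # superseded by the correction below
    "change_time": "2025-01-01T03:00:00Z",
}
correction = {
    "valid_time": "2025-01-01T12:00:00Z",
    "value": 19.8,
    "annotation": "Sensor drift corrected after manual review",
    "tags": ["reviewed", "corrected"],
    "changed_by": "analyst@example.com",
    "is_current": True,           # the active version for this key
    "change_time": "2025-01-02T09:30:00Z",
}
```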
| Field | Type | Purpose |
|---|---|---|
| metadata_id (primary key) | attribute | Surrogate primary key for metadata rows |
| batch_id (foreign key) | attribute | References batch context (batches_table.batch_id) |
| tenant_id | attribute | Tenant identifier for multi-tenant support |
| valid_time | time dimension | Time context for the metadata (joins onto values via (batch_id, valid_time)) |
| metadata_key | attribute | Name of the metadata field (e.g. contractId, deliveryStart) |
| value_number (optional) | attribute | Numeric metadata value (exactly one typed value must be set per row) |
| value_string (optional) | attribute | String metadata value (mutually exclusive with other typed values) |
| value_bool (optional) | attribute | Boolean metadata value (mutually exclusive with other typed values) |
| value_time (optional) | attribute | Timestamp metadata value (mutually exclusive with other typed values) |
| value_json (optional) | attribute | JSON metadata value (mutually exclusive with other typed values) |
| inserted_at | time dimension | When the metadata row was inserted (default now()) |
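Each metadata row carries exactly one typed value. For example (illustrative rows; only one value_* field is set per row):

```python
# Illustrative metadata rows: exactly one value_* field is set per row.
rows = [
    {"metadata_key": "contractId",    "value_string": "C-2025-0042"},
    {"metadata_key": "deliveryStart", "value_time": "2025-01-01T00:00:00Z"},
    {"metadata_key": "isSettled",     "value_bool": True},
]
```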
Most time series databases are two-dimensional: they map a timestamp to a value. timedb adds a third dimension: time-of-knowledge (known_time).
This allows you to store overlapping forecasts and revisions while preserving the full history of how values evolved. For example:
- A forecast made on Monday for Tuesday can coexist with a forecast made on Tuesday for the same time
- You can query "what did we know about Tuesday on Monday?" vs "what do we know now?"
The key dimensions are:
- valid_time: When the value is valid (the timestamp being forecasted/measured)
- known_time: When this value became known (when the forecast was made or data arrived)
- batch_id: Groups values that were inserted together in one batch
This design naturally supports:
- Forecast revisions: Multiple predictions for the same valid_time from different known_times
- Data corrections: Updates that preserve the original value with full audit trail
- Backtesting: Reconstruct what was known at any point in the past
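For example, a backtest can reconstruct Monday's view of Tuesday by filtering on known_time. The sketch below reuses td.read() from the quickstart but adds a hypothetical as_of parameter to illustrate the idea; the actual query interface may differ, so check the SDK documentation:

```python
import timedb as td
from datetime import datetime, timezone

monday = datetime(2025, 1, 6, 12, 0, tzinfo=timezone.utc)

# Hypothetical as_of parameter: return only values whose known_time is at
# or before Monday noon, i.e. "what did we know on Monday?"
df_monday_view = td.read(as_of=monday)

# Without the filter: the latest known values, i.e. "what do we know now?"
df_now = td.read()
```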
Beyond the core time dimensions, timedb includes attributes for human-in-the-loop corrections:
- tags: Array of strings for quality flags (e.g., ["reviewed", "corrected"])
- annotation: Text descriptions of changes
- metadata: JSONB field for flexible metadata storage
- changed_by: Who made the change
- version: Incremented on each update to track history
- is_current: Flag indicating if this is the active version
Updates create new rows rather than overwriting, so you maintain a complete audit trail of all changes.
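A minimal sketch of what that audit trail enables, assuming a read result that includes superseded rows with the version columns described above (the exact read options may differ):

```python
import pandas as pd

# Hypothetical read result including superseded versions; column names
# follow the values table above.
df_history = pd.DataFrame({
    "valid_time": ["2025-01-01T12:00:00Z"] * 2,
    "value": [21.5, 19.8],
    "is_current": [False, True],
    "change_time": ["2025-01-01T03:00:00Z", "2025-01-02T09:30:00Z"],
})

# Active values only:
current = df_history[df_history["is_current"]]

# Full audit trail for the timestamp, oldest version first:
audit = df_history.sort_values("change_time")
```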
- Decouple the knowledge time from the batch_start_time
- Python SDK that allows time series data manipulations, reads and writes
- RESTful API layer that serves data to users
- Built-in authentication and multi-tenancy support
- Unit handling (e.g. MW, kW)
- Handle different time zones in the API layer while always storing in UTC in the database
- Support for PostgreSQL time intervals (tsrange/tstzrange)
- Built-in data retention, TTL, and archiving
- Support for subscribing to database updates through the API
- Xarray integration for multidimensional time series
- Polars integration for lazy computations
- Parquet file integration
- Real-time subscriptions through WebSockets
- Store time series with geographic coordinates. Query by spatial region (e.g., "all temperature sensors in this polygon")
- Automatic alignment and interpolation of different time series resolutions
- Symbolic time series + serialization