LangWatch stores data across three systems. Each requires its own backup strategy:

| Data Store | What It Stores | Backup Priority |
|---|---|---|
| PostgreSQL | Users, teams, projects, configurations, prompt versions | Critical |
| ClickHouse | Traces, spans, evaluations, experiments, analytics | High |
| S3 | Datasets, ClickHouse cold data | Medium |
PostgreSQL Backups
PostgreSQL holds your control plane data. Losing it means losing user accounts, project configurations, and monitor definitions.
Chart-Managed PostgreSQL
If you’re using the chart-managed PostgreSQL (development/small deployments), use pg_dump.
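As a minimal sketch of what a pg_dump backup could look like, the Python below streams a custom-format dump out of the PostgreSQL pod through kubectl exec. The namespace, pod name, user, and database name are placeholders, not values taken from the chart; check your own release with kubectl get pods before using them. It also assumes pg_dump can authenticate inside the pod without prompting for a password.

```python
import subprocess
from datetime import datetime, timezone


def build_pg_dump_command(namespace: str, pod: str, user: str, database: str) -> list[str]:
    """Build a kubectl command that runs pg_dump inside the PostgreSQL pod."""
    return [
        "kubectl", "exec", "-n", namespace, pod, "--",
        # --format=custom produces a compressed archive readable by pg_restore
        "pg_dump", "-U", user, "-d", database, "--format=custom",
    ]


def backup_filename(database: str, now: datetime) -> str:
    """Timestamped file name so successive dumps do not overwrite each other."""
    return f"{database}-{now.strftime('%Y%m%dT%H%M%SZ')}.dump"


def run_backup(namespace: str, pod: str, user: str, database: str) -> str:
    """Stream the dump from the pod into a local file and return its path."""
    path = backup_filename(database, datetime.now(timezone.utc))
    with open(path, "wb") as out:
        subprocess.run(
            build_pg_dump_command(namespace, pod, user, database),
            stdout=out,
            check=True,  # raise if kubectl or pg_dump exits non-zero
        )
    return path
```

Calling run_backup("langwatch", "langwatch-postgresql-0", "postgres", "langwatch") would write a file such as langwatch-20240102T030405Z.dump in the current directory (all four arguments are hypothetical). A custom-format dump is restored with pg_restore rather than psql.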
External PostgreSQL (RDS, Cloud SQL, etc.)
For production, use your cloud provider’s built-in backup features:
- AWS RDS: Enable automated snapshots (recommended: 30-day retention) and point-in-time recovery
- GCP Cloud SQL: Enable automated backups with point-in-time recovery
- Azure Database: Enable geo-redundant backups
ClickHouse Backups
ClickHouse holds all your trace and evaluation data. The clickhouse-serverless subchart supports native ClickHouse BACKUP/RESTORE to S3-compatible storage.
Enable Backups
Backups require an S3-compatible bucket. Configure it in your Helm values, or use the cold-storage-s3.yaml overlay, which enables both cold storage and backups.
S3 Authentication
IRSA / Workload Identity (recommended):
Backup Schedule
| Backup Type | Default Schedule | Description |
|---|---|---|
| Full | 0 */12 * * * (every 12h) | Complete database backup |
| Incremental | 0 * * * * (every 1h) | Only changes since last full backup |
Backups run as clickhouse-client commands inside the ClickHouse pod.
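To make the default schedules in the table above concrete, here is a small sketch that expands each cron expression into the times it fires during one day. The matcher is deliberately minimal (it only understands `*`, `*/n`, and plain numbers in the minute and hour fields) and is not how the chart or Kubernetes evaluates schedules.

```python
from datetime import datetime, timedelta

FULL_SCHEDULE = "0 */12 * * *"        # default full backup schedule
INCREMENTAL_SCHEDULE = "0 * * * *"    # default incremental backup schedule


def field_matches(field: str, value: int) -> bool:
    """Match one cron field; supports '*', '*/n', and a plain number only."""
    if field == "*":
        return True
    if field.startswith("*/"):
        return value % int(field[2:]) == 0
    return value == int(field)


def fire_times(schedule: str, day: datetime) -> list[datetime]:
    """Return every minute of `day` at which `schedule` would fire.

    Only the minute and hour fields are evaluated; the day, month, and
    weekday fields are assumed to be '*', as in the default schedules.
    """
    minute_field, hour_field, *_ = schedule.split()
    start = day.replace(hour=0, minute=0, second=0, microsecond=0)
    candidates = (start + timedelta(minutes=m) for m in range(24 * 60))
    return [
        t for t in candidates
        if field_matches(minute_field, t.minute) and field_matches(hour_field, t.hour)
    ]
```

With the defaults, a full backup fires twice a day (00:00 and 12:00) and an incremental backup fires every hour, so roughly the last hour of writes is what you risk losing if every backup succeeds.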
Restore from Backup
To restore, identify the backup name, then run the restore command.
ClickHouse Cold Storage
Cold storage is separate from backups. It is a tiered storage strategy that automatically moves older data from local SSD to S3 to reduce storage costs.
How It Works
- New data is written to hot storage (local SSD on the ClickHouse pod)
- After the TTL period, data is moved to cold storage (S3)
- Queries transparently read from both hot and cold storage
- Cold data is cached locally for repeated reads
Enable Cold Storage
We recommend setting the TTL to a multiple of 7 (e.g., 7, 14, 28, 49) to align with ClickHouse’s weekly partition boundaries for more efficient data management. The default of 49 days means data stays on fast local storage for ~7 weeks before moving to S3.
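The TTL rule can be illustrated with a short sketch: one helper checks that a TTL is a whole number of weeks, and another computes the cutoff date before which data becomes eligible to move to S3. The function names are illustrative only and are not part of the chart. ClickHouse performs the actual move in the background, so data does not necessarily move on the exact cutoff day.

```python
from datetime import date, timedelta

DEFAULT_TTL_DAYS = 49  # default stated above: about 7 weeks on hot storage


def is_week_aligned(ttl_days: int) -> bool:
    """True when the TTL is a positive multiple of 7 days."""
    return ttl_days > 0 and ttl_days % 7 == 0


def cold_cutoff(today: date, ttl_days: int = DEFAULT_TTL_DAYS) -> date:
    """Data written before this date is eligible to move to cold storage."""
    return today - timedelta(days=ttl_days)


def storage_tier(written_on: date, today: date, ttl_days: int = DEFAULT_TTL_DAYS) -> str:
    """Classify data as 'hot' (local SSD) or 'cold' (S3-eligible) by age."""
    return "cold" if written_on < cold_cutoff(today, ttl_days) else "hot"
```

For example, with the default TTL and today set to 2024-03-01, the cutoff is 2024-01-12: data written on 2024-01-01 is classified as cold and data written on 2024-02-01 as hot.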
Cost Savings
Cold storage can reduce storage costs significantly. At the approximate prices below, S3 Standard costs roughly 70% less per GB than gp3 SSD:

| Storage Type | Approximate Cost | Speed |
|---|---|---|
| gp3 SSD (hot) | ~$0.08/GB/month | Fast |
| S3 Standard (cold) | ~$0.023/GB/month | Slower (cached) |
| S3 Infrequent Access | ~$0.0125/GB/month | Slower |
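A rough way to estimate the effect for your own data volume is to split the total between hot and cold tiers and price each part with the figures from the table above. This sketch covers storage cost only; it ignores S3 request, retrieval, and data transfer charges, and the prices are the approximate ones listed here, which vary by region.

```python
# Approximate prices in USD per GB per month, copied from the table above.
PRICE_PER_GB_MONTH = {
    "gp3": 0.08,
    "s3_standard": 0.023,
    "s3_infrequent_access": 0.0125,
}


def monthly_cost(total_gb: float, hot_fraction: float, cold_tier: str = "s3_standard") -> float:
    """Storage cost when `hot_fraction` of the data stays on gp3 and the rest is cold."""
    if not 0.0 <= hot_fraction <= 1.0:
        raise ValueError("hot_fraction must be between 0 and 1")
    hot_gb = total_gb * hot_fraction
    cold_gb = total_gb - hot_gb
    return hot_gb * PRICE_PER_GB_MONTH["gp3"] + cold_gb * PRICE_PER_GB_MONTH[cold_tier]


def savings_percent(total_gb: float, hot_fraction: float, cold_tier: str = "s3_standard") -> float:
    """Percentage saved compared with keeping everything on gp3."""
    all_hot = monthly_cost(total_gb, 1.0)
    tiered = monthly_cost(total_gb, hot_fraction, cold_tier)
    return 100.0 * (all_hot - tiered) / all_hot
```

For 1,000 GB with 20% kept hot, this gives about $34.40 per month instead of $80, a saving of about 57%.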
S3 Dataset Backups
If you’re using S3 for dataset storage (app.datasetObjectStorage.enabled: true), protect this data with:
- S3 Versioning: Enable versioning on the bucket to recover from accidental deletes
- Cross-region replication: For disaster recovery, replicate to another region
- Lifecycle policies: Move old versions to Glacier after 30 days
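The lifecycle policy from the last bullet can be expressed as an S3 lifecycle configuration document. The sketch below only builds that document; the rule ID is a made-up name, and the empty prefix assumes the rule should cover the whole bucket. The rule only has an effect once versioning is enabled, since it acts on noncurrent versions.

```python
import json


def noncurrent_to_glacier_rule(days: int = 30) -> dict:
    """Lifecycle configuration that moves noncurrent versions to Glacier after `days`."""
    return {
        "Rules": [
            {
                "ID": "archive-old-versions",  # hypothetical rule name
                "Status": "Enabled",
                "Filter": {"Prefix": ""},  # empty prefix: applies to every object
                "NoncurrentVersionTransitions": [
                    {"NoncurrentDays": days, "StorageClass": "GLACIER"}
                ],
            }
        ]
    }


if __name__ == "__main__":
    # Print the document so it can be saved to a file or reviewed.
    print(json.dumps(noncurrent_to_glacier_rule(), indent=2))
```

The resulting dictionary has the shape expected by the LifecycleConfiguration argument of boto3's put_bucket_lifecycle_configuration, and the printed JSON can be saved to a file for aws s3api put-bucket-lifecycle-configuration.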
Disaster Recovery Checklist
- PostgreSQL automated backups enabled (30-day retention)
- ClickHouse backup CronJobs running (check kubectl get cronjobs)
- S3 bucket versioning enabled
- Backup S3 bucket is in a different region or account from primary
- Restore procedure documented and tested
- Quarterly restore drills scheduled
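Part of this checklist can be automated. As one possible sketch, the function below takes the JSON printed by kubectl get cronjobs -o json and reports CronJobs that are suspended or have never recorded a successful run. It relies on the standard batch/v1 CronJob fields spec.suspend and status.lastSuccessfulTime; the CronJob names in the example are invented, not the names the chart creates.

```python
import json


def unhealthy_cronjobs(cronjobs_json: str) -> list[str]:
    """Return names of CronJobs that are suspended or have no recorded success."""
    problems = []
    for item in json.loads(cronjobs_json).get("items", []):
        name = item["metadata"]["name"]
        suspended = item.get("spec", {}).get("suspend", False)
        last_success = item.get("status", {}).get("lastSuccessfulTime")
        if suspended or last_success is None:
            problems.append(name)
    return problems


# Hypothetical sample of `kubectl get cronjobs -o json` output, trimmed to the
# fields the check reads.
SAMPLE = json.dumps({
    "items": [
        {"metadata": {"name": "backup-full"},
         "spec": {"schedule": "0 */12 * * *", "suspend": False},
         "status": {"lastSuccessfulTime": "2024-01-01T12:00:05Z"}},
        {"metadata": {"name": "backup-incremental"},
         "spec": {"schedule": "0 * * * *", "suspend": True},
         "status": {}},
    ]
})
```

Running unhealthy_cronjobs(SAMPLE) flags only backup-incremental, because it is suspended. A check like this confirms the jobs are scheduled, not that a backup can actually be restored, so it complements the restore drills rather than replacing them.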