Replication
LiteJoin integrates with Litestream for continuous backup of SQLite shards to cloud object storage. Litestream ships WAL frames to S3, GCS, or Azure Blob in near-real-time, providing disaster recovery with minimal data loss.
Why Replicate?
- Data loss on crash: LiteJoin targets containerized deployments where local disk is ephemeral. Without a replica, a pod restart wipes every shard.
- Restore-on-startup: Containers can boot, restore shards from S3, and resume processing automatically.
- Point-in-time recovery: Roll back to any point within the retention window after a bad deployment.
Sidecar Setup (Recommended)
The simplest approach: run Litestream alongside LiteJoin as a sidecar process.
1. Create litestream.yml
dbs:
  - path: ./data/shard_0.db
    replicas:
      - type: s3
        bucket: my-litejoin-backups
        path: shard_0
        region: us-east-1
        sync-interval: 1s
  - path: ./data/shard_1.db
    replicas:
      - type: s3
        bucket: my-litejoin-backups
        path: shard_1
        region: us-east-1
        sync-interval: 1s
  # ... one entry per shard (match your shard_count)
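Writing one entry per shard by hand gets tedious as shard_count grows. The following is a minimal sketch of generating the file instead; the function name gen_litestream_config and its arguments are hypothetical, and it assumes every shard replicates to the same S3 bucket and region with the layout shown above.

```shell
# gen_litestream_config SHARD_COUNT BUCKET REGION
# Prints a litestream.yml with one dbs entry per shard (shard_0 .. shard_N-1).
# Hypothetical helper: not part of LiteJoin or Litestream.
gen_litestream_config() {
  shard_count=$1
  bucket=$2
  region=$3
  echo "dbs:"
  i=0
  while [ "$i" -lt "$shard_count" ]; do
    # The heredoc body keeps the YAML indentation Litestream expects.
    cat <<EOF
  - path: ./data/shard_$i.db
    replicas:
      - type: s3
        bucket: $bucket
        path: shard_$i
        region: $region
        sync-interval: 1s
EOF
    i=$((i + 1))
  done
}
```

For example, `gen_litestream_config 8 my-litejoin-backups us-east-1 > litestream.yml` would produce entries for shard_0 through shard_7.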
2. Startup Script
#!/bin/bash
# run.sh: restore shards, then start Litestream with LiteJoin

# Restore all shards from replica (skips if local file exists).
# The range 0..7 assumes 8 shards; adjust it to match your shard_count.
for i in $(seq 0 7); do
  litestream restore -if-db-not-exists \
    -o "./data/shard_$i.db" \
    "s3://my-litejoin-backups/shard_$i"
done

# Start Litestream replication and run LiteJoin as a child process
exec litestream replicate -exec "litejoin run -c litejoin.yaml"
The -exec flag makes Litestream the parent process: it starts replication, launches LiteJoin as a subprocess, and gracefully flushes WAL frames on shutdown.
3. Dockerfile
FROM golang:1.24-alpine AS build
RUN apk add --no-cache gcc musl-dev
WORKDIR /app
COPY . .
RUN CGO_ENABLED=1 go build -o litejoin ./cmd/litejoin
FROM alpine:3.21
RUN apk add --no-cache litestream ca-certificates
COPY --from=build /app/litejoin /usr/local/bin/
COPY litestream.yml /etc/litestream.yml
COPY run.sh /run.sh
RUN chmod +x /run.sh
ENTRYPOINT ["/run.sh"]
Configuration
replication:
  enabled: true
  mode: "sidecar"
  replica_url: "s3://my-bucket/litejoin"
  sync_interval: 1s
  restore_on_boot: true
  snapshot_interval: 1h
  retention: 72h
| Field | Type | Default | Description |
|---|---|---|---|
| enabled | bool | false | Enable replication. |
| mode | string | sidecar | "sidecar" or "embedded". |
| replica_url | string | — | S3/GCS/Azure URL for replicas. |
| sync_interval | duration | 1s | How often WAL frames are shipped. |
| restore_on_boot | bool | true | Restore shards from replica on startup. |
| snapshot_interval | duration | 1h | How often full snapshots are created. |
| retention | duration | 72h | How long to keep replica history. |
Replica Layout
s3://my-bucket/litejoin/
├── shard_0/
│   └── generations/
│       └── a1b2c3d4/
│           ├── snapshots/
│           └── wal/
├── shard_1/
│   └── ...
├── ...
└── dlq/
    └── ...   # DLQ database also replicated
Restore Flow
On startup with restore_on_boot: true:
- For each shard, check if the local file exists.
- If missing, restore from the latest snapshot + replay WAL frames from the replica.
- If present, skip restore and start replication.
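The steps above can be sketched as a small shell function. This is an illustration only, not LiteJoin's actual restore_on_boot implementation: the function name restore_missing_shards and the RESTORE_CMD override are hypothetical, and it assumes each shard's replica lives at replica_url/shard_N as in the Replica Layout section.

```shell
# restore_missing_shards DATA_DIR REPLICA_URL SHARD_COUNT
# Restores only the shards whose local file is absent.
# RESTORE_CMD (hypothetical override) defaults to the Litestream CLI;
# set RESTORE_CMD=true for a dry run that only prints the decisions.
restore_missing_shards() {
  data_dir=$1
  replica_url=$2
  shard_count=$3
  i=0
  while [ "$i" -lt "$shard_count" ]; do
    db="$data_dir/shard_$i.db"
    if [ -f "$db" ]; then
      echo "shard_$i: local file present, skipping restore"
    else
      echo "shard_$i: missing, restoring from $replica_url/shard_$i"
      # Left unquoted on purpose so "litestream restore" splits into two words.
      ${RESTORE_CMD:-litestream restore} -o "$db" "$replica_url/shard_$i" || return 1
    fi
    i=$((i + 1))
  done
}
```

A dry run such as `RESTORE_CMD=true restore_missing_shards ./data s3://my-bucket/litejoin 8` lists which shards would be restored without contacting the replica.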
Litestream reads WAL frames asynchronously from disk — it does not sit in the write path. Benchmarks show < 1% write latency impact in sidecar mode.
Supported Backends
| Backend | URL Format |
|---|---|
| Amazon S3 | s3://bucket-name/prefix |
| Google Cloud Storage | gcs://bucket-name/prefix |
| Azure Blob Storage | abs://container-name/prefix |
| SFTP | sftp://user@host/path |
| Local file | file:///path/to/backup |
Litestream uses standard cloud SDK credentials (AWS_ACCESS_KEY_ID, GOOGLE_APPLICATION_CREDENTIALS, etc.). See the Litestream documentation for details.
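A typo in the URL scheme is easy to miss until replication fails at startup. As a rough sketch, a pre-flight check could map a replica URL to one of the backends in the table above; backend_for_url is a hypothetical helper, and it only inspects the scheme prefix rather than validating the rest of the URL.

```shell
# backend_for_url URL
# Prints the backend name for a replica URL, or returns 1 for an unknown scheme.
# Hypothetical helper: checks the scheme prefix only.
backend_for_url() {
  case $1 in
    s3://*)   echo "Amazon S3" ;;
    gcs://*)  echo "Google Cloud Storage" ;;
    abs://*)  echo "Azure Blob Storage" ;;
    sftp://*) echo "SFTP" ;;
    file://*) echo "Local file" ;;
    *)        echo "unsupported replica URL: $1" >&2; return 1 ;;
  esac
}
```

Calling `backend_for_url "s3://my-bucket/litejoin"` prints `Amazon S3`, while an unrecognized scheme writes a message to stderr and returns a non-zero status.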
Important Considerations
Litestream is a backup/recovery tool, not a high-availability solution. There is no automatic failover. If LiteJoin crashes, it must restart and restore from the replica, and writes that had not yet been shipped (up to roughly one sync_interval) can be lost.
- Shard count changes require re-sharding. The FNV hash distribution changes when you modify shard_count, invalidating existing replicas.
- DLQ replication: the dlq.db database should also be replicated for complete disaster recovery.
- Encryption: Litestream does not encrypt WAL frames. Use server-side encryption (SSE-S3, SSE-KMS) for encryption at rest.
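To see why changing shard_count invalidates existing replicas, it helps to look at how a hash-modulo scheme assigns keys. The sketch below assumes 32-bit FNV-1a and a plain `hash % shard_count` mapping; the source only says "FNV hash", so the exact variant and mapping LiteJoin uses may differ, and both function names are hypothetical. It handles ASCII keys only.

```shell
# fnv1a32 STRING: prints the 32-bit FNV-1a hash of an ASCII string as a decimal.
fnv1a32() {
  s=$1
  h=2166136261                      # FNV-1a 32-bit offset basis
  while [ -n "$s" ]; do
    rest=${s#?}                     # s without its first character
    c=${s%"$rest"}                  # the first character of s
    s=$rest
    byte=$(printf '%d' "'$c")       # ASCII code of that character
    # XOR in the byte, multiply by the FNV prime, keep the low 32 bits.
    h=$(( ((h ^ byte) * 16777619) & 0xFFFFFFFF ))
  done
  echo "$h"
}

# shard_for KEY SHARD_COUNT: prints the shard index for a key (assumed mapping).
shard_for() {
  echo $(( $(fnv1a32 "$1") % $2 ))
}
```

Comparing `shard_for some-key 8` with `shard_for some-key 12` for a batch of keys illustrates the problem: under this mapping a key keeps its index only when its hash modulo 24 is below 8, so with evenly distributed hashes about two-thirds of keys land on a different shard, and the data in the old replicas no longer matches where lookups go.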