
Deployment Guide

LiteJoin is a single binary with no external dependencies. This makes deployment simple, but there are operational considerations for production use.

Resource Planning

CPU

LiteJoin is lightweight. A 2-vCPU machine handles most workloads. CPU usage scales with:
  • Number of sources (each poll cycle parses JSON, diffs, and writes)
  • Join complexity (each write triggers join re-evaluation)
  • Window evaluation frequency

Memory

Default memory usage is ~50–100 MB. SQLite’s page cache is the main consumer. If tiered storage is enabled with DuckDB, allocate additional memory (default DuckDB limit: 256 MB).

Disk

Disk usage depends on:
  • Shard count × retention duration × ingest rate
  • Each shard is a separate SQLite database file
  • WAL files can temporarily double the on-disk size during heavy writes
Rule of thumb: allocate 2–3× your expected data size to account for WAL growth and the temporary space used by compaction during archival.
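The rule of thumb above can be turned into a rough sizing helper. This is a minimal sketch, not part of LiteJoin: the function name and parameters are hypothetical, and it assumes the ingest rate is measured per shard in bytes per second.

```python
def estimate_disk_bytes(ingest_bytes_per_sec: float,
                        retention_seconds: float,
                        shard_count: int = 1,
                        headroom: float = 3.0) -> float:
    """Estimate how much disk to provision.

    Assumption: ingest_bytes_per_sec is the average rate PER SHARD.
    headroom is the 2-3x multiplier covering WAL files and compaction.
    """
    expected = ingest_bytes_per_sec * retention_seconds * shard_count
    return expected * headroom


if __name__ == "__main__":
    # Hypothetical workload: 4 shards, 2 KiB/s each, 7 days of retention.
    total = estimate_disk_bytes(2048, 7 * 24 * 3600, shard_count=4)
    print(f"provision about {total / 2**30:.1f} GiB")
```

Measure the real ingest rate on a test run before relying on the estimate; the multiplier only covers transient overhead, not growth in traffic.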

Network

API source polling generates outbound HTTP requests. Calculate your request budget:
requests_per_minute = sources × pages_per_poll × (60 / interval_seconds)
For example, one source polling every 10 s and fetching 5 pages per poll generates 1 × 5 × (60 / 10) = 30 requests/minute.
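The same budget calculation as a small function, useful for comparing against an upstream API's rate limit. The function name is illustrative only.

```python
def requests_per_minute(sources: int, pages_per_poll: int,
                        interval_seconds: float) -> float:
    """Outbound HTTP requests per minute generated by API source polling."""
    polls_per_minute = 60 / interval_seconds
    return sources * pages_per_poll * polls_per_minute


if __name__ == "__main__":
    # One source, 5 pages per poll, polling every 10 seconds.
    print(requests_per_minute(1, 5, 10))  # 30.0
```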

Monitoring

Health Check

curl http://localhost:9100/health
# {"status": "ok"}
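For a scripted probe, the same check could look like the sketch below. It uses only the Python standard library. It assumes a healthy instance answers with HTTP 200 and the JSON body shown above; the status code and the helper names are assumptions, not documented LiteJoin behavior.

```python
import json
import urllib.error
import urllib.request


def body_is_healthy(body: str) -> bool:
    """True if the body is a JSON object with "status": "ok"."""
    try:
        return json.loads(body).get("status") == "ok"
    except (json.JSONDecodeError, AttributeError):
        # Not JSON, or JSON that is not an object.
        return False


def check_health(url: str = "http://localhost:9100/health",
                 timeout: float = 2.0) -> bool:
    """Fetch the health endpoint; any network error counts as unhealthy."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            body = resp.read().decode("utf-8")
            return resp.status == 200 and body_is_healthy(body)
    except (urllib.error.URLError, OSError, ValueError):
        return False


if __name__ == "__main__":
    print("healthy" if check_health() else "unhealthy")
```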

Log Levels

LiteJoin logs to stdout. Key patterns to monitor:
Pattern | Level | Meaning
api-source X: poll failed (N/max) | WARN | Transient source failure
api-source X: ERROR N consecutive failures | ERROR | Source degraded
dlq: enqueued result for sink | INFO | Delivery failure captured
dlq: entry exceeded max retries | ERROR | Permanent delivery failure
joiner: FAILED sink send | WARN/ERROR | Sink unavailable
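A log shipper or alerting script could match these patterns with regular expressions. The sketch below is illustrative: it matches only the message fragments listed above, makes no assumption about timestamps or other prefixes on the line, and assumes source names contain no spaces.

```python
import re

# (compiled pattern, level, meaning), taken from the patterns listed above.
PATTERNS = [
    (re.compile(r"api-source (\S+): poll failed \((\d+)/(\d+)\)"),
     "WARN", "transient source failure"),
    (re.compile(r"api-source (\S+): ERROR (\d+) consecutive failures"),
     "ERROR", "source degraded"),
    (re.compile(r"dlq: enqueued result for sink"),
     "INFO", "delivery failure captured"),
    (re.compile(r"dlq: entry exceeded max retries"),
     "ERROR", "permanent delivery failure"),
    (re.compile(r"joiner: FAILED sink send"),
     "WARN/ERROR", "sink unavailable"),
]


def classify(line: str):
    """Return (level, meaning) for a known pattern, or None."""
    for pattern, level, meaning in PATTERNS:
        if pattern.search(line):
            return level, meaning
    return None


if __name__ == "__main__":
    # Hypothetical log line; "orders" is a made-up source name.
    print(classify("api-source orders: poll failed (2/5)"))
```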

Pipeline Monitoring

LiteJoin logs pipeline queue depths periodically to help diagnose bottlenecks:
pipeline-monitor: msgCh=12/1000 (1.2%) | writer-batch=45 | joiner-pending=3
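To alert on a filling queue, the monitor line can be parsed into numbers. This sketch assumes the line format is exactly as in the sample above; the field names in the returned dictionary and the 80% threshold are hypothetical choices, not LiteJoin settings.

```python
import re

MONITOR_RE = re.compile(
    r"pipeline-monitor: msgCh=(\d+)/(\d+) \(([\d.]+)%\)"
    r" \| writer-batch=(\d+) \| joiner-pending=(\d+)"
)


def parse_monitor_line(line: str):
    """Parse a pipeline-monitor log line into a dict, or return None."""
    m = MONITOR_RE.search(line)
    if m is None:
        return None
    used, capacity = int(m.group(1)), int(m.group(2))
    return {
        "msgch_used": used,
        "msgch_capacity": capacity,
        # Recomputed from the counts rather than trusting the printed percent.
        "msgch_fill": used / capacity if capacity else 0.0,
        "writer_batch": int(m.group(4)),
        "joiner_pending": int(m.group(5)),
    }


if __name__ == "__main__":
    sample = ("pipeline-monitor: msgCh=12/1000 (1.2%)"
              " | writer-batch=45 | joiner-pending=3")
    stats = parse_monitor_line(sample)
    if stats and stats["msgch_fill"] > 0.8:  # hypothetical alert threshold
        print("message channel is nearly full")
```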

Graceful Shutdown

LiteJoin handles SIGTERM and SIGINT gracefully:
  1. Sources stop polling / consuming
  2. In-flight messages are flushed to storage
  3. Pending DLQ retries complete
  4. Sink connections are closed
  5. SQLite databases are checkpointed and closed
Set your orchestrator’s termination grace period to at least 10–30 seconds so that these steps can complete before the process is killed.
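If LiteJoin is supervised by your own script rather than an orchestrator, the same grace period can be applied by hand. The sketch below is a generic POSIX pattern, not LiteJoin code: it sends SIGTERM, waits up to the grace period, and force-kills the process only if that expires. The function name and default are illustrative.

```python
import subprocess


def stop_gracefully(proc: subprocess.Popen,
                    grace_seconds: float = 30.0) -> bool:
    """Ask a child process to stop; force-kill it if the grace period expires.

    Returns True if the process exited within the grace period.
    """
    proc.terminate()  # sends SIGTERM on POSIX
    try:
        proc.wait(timeout=grace_seconds)
        return True
    except subprocess.TimeoutExpired:
        proc.kill()  # SIGKILL: in-flight work may be lost
        proc.wait()
        return False
```

A False return means the shutdown steps did not finish in time, which is a signal to lengthen the grace period.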

Multiple Pipelines

Run multiple LiteJoin instances with separate configs and data directories:
# Pipeline A
litejoin run -c pipeline-a.yaml

# Pipeline B (its config must set a different data_dir to avoid conflicts)
litejoin run -c pipeline-b.yaml
Never point two LiteJoin instances at the same data_dir: each instance needs exclusive write access to its SQLite database files.
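A pre-deployment check can catch accidental sharing. The sketch below is a hypothetical helper: it takes the data_dir value you have already read from each config file and does not parse LiteJoin configs itself. It reports any directory used by more than one config after resolving symlinks and relative segments.

```python
import os
from collections import defaultdict


def find_shared_data_dirs(data_dirs: dict) -> dict:
    """Map each resolved directory used by 2+ configs to those config names.

    data_dirs maps a config name to the data_dir value it sets.
    """
    by_path = defaultdict(list)
    for config_name, path in data_dirs.items():
        by_path[os.path.realpath(path)].append(config_name)
    return {p: names for p, names in by_path.items() if len(names) > 1}


if __name__ == "__main__":
    # Hypothetical paths for the two pipelines above.
    conflicts = find_shared_data_dirs({
        "pipeline-a.yaml": "/var/lib/litejoin/a",
        "pipeline-b.yaml": "/var/lib/litejoin/b",
    })
    print(conflicts or "no shared data directories")
```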