Core Concepts
Every LiteJoin pipeline follows the same data flow: sources ingest data into topics, joins and windows process it with SQL, and sinks deliver the results.
Sources
A source is anything that produces data. LiteJoin supports three source types:

| Source Type | How It Works |
|---|---|
| API | Polls any REST API on an interval, diffs the response, and emits only changes. Supports pagination, watermarks, and rate limiting. |
| HTTP | Opens an HTTP endpoint that accepts POST requests. External systems push events directly to LiteJoin. |
| Kafka | Consumes messages from one or more Kafka topics via a consumer group. |
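The diffing behavior of the API source can be sketched as a pure function: compare the latest poll against a snapshot of the previous one, keyed by record identifier, and emit only created or updated records. The function name, the `id` key field, and the record shapes below are illustrative assumptions, not LiteJoin's actual internals.

```python
# Hypothetical sketch of the diff step an API source performs between polls.
def diff_poll(previous: dict[str, dict], latest: list[dict], key_field: str = "id") -> list[dict]:
    """Return only the records that are new or changed since the last poll."""
    changes = []
    for record in latest:
        key = str(record[key_field])
        if previous.get(key) != record:
            changes.append(record)
        previous[key] = record  # update the snapshot in place
    return changes

prev: dict[str, dict] = {}
first = diff_poll(prev, [{"id": 1, "status": "new"}])    # first poll: everything is a change
second = diff_poll(prev, [{"id": 1, "status": "new"}])   # unchanged record: nothing emitted
third = diff_poll(prev, [{"id": 1, "status": "paid"}])   # updated record: emitted again
```

Keying the diff by a stable identifier is what lets a polling source behave like an event stream: unchanged records produce no downstream traffic.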
Topics
A topic is a named stream of messages stored in SQLite. Every message has three fields:

| Field | Description |
|---|---|
| key | A unique identifier for the record (e.g., order_123) |
| payload | The full JSON data |
| timestamp | When the message was ingested (Unix epoch) |
When a join query says `FROM orders`, you're querying the orders topic's SQLite table.
LiteJoin stores topics in sharded SQLite databases using WAL mode. Writes go through a single writer connection per shard; reads use a configurable pool of reader connections. This keeps throughput high: in WAL mode, readers never block the single writer, and the writer never blocks readers.
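The storage pattern above can be sketched with Python's standard `sqlite3` module. The file name, table name, and upsert shape here are assumptions based on the message fields described in this section; LiteJoin's actual shard layout and pool management may differ.

```python
import sqlite3

# One writer connection in WAL mode, plus a separate read-only connection
# standing in for a member of the reader pool.
db_path = "topic_orders.db"

writer = sqlite3.connect(db_path)
writer.execute("PRAGMA journal_mode=WAL")  # readers and the writer no longer block each other
writer.execute(
    "CREATE TABLE IF NOT EXISTS messages ("
    "key TEXT PRIMARY KEY, payload TEXT, timestamp INTEGER)"
)

# Upsert keeps one row per key, matching the writer step in the pipeline lifecycle.
writer.execute(
    "INSERT INTO messages VALUES (?, ?, ?) "
    "ON CONFLICT(key) DO UPDATE SET payload=excluded.payload, "
    "timestamp=excluded.timestamp",
    ("order_123", '{"total": 42}', 1700000000),
)
writer.commit()

# A reader opens the same file read-only; it cannot accidentally write.
reader = sqlite3.connect(f"file:{db_path}?mode=ro", uri=True)
rows = reader.execute("SELECT key, timestamp FROM messages").fetchall()
```

Opening readers with `mode=ro` enforces the single-writer discipline at the connection level rather than by convention.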
Joins
A join is a SQL query that combines data from multiple topics. LiteJoin evaluates joins reactively: every time new data is written to a topic referenced by a join, the join query re-executes and emits results to a sink.

- Reactive execution: Joins fire on every write, not on a schedule.
- Standard SQL: Use `SELECT`, `JOIN`, `WHERE`, `GROUP BY`, `json_extract()`, and other SQLite functions.
- Time-bounded: Use `timestamp` filters in `WHERE` to limit joins to recent data and keep queries fast.
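The kind of query described above can be run directly against the topic tables. This sketch uses an in-memory database and the message schema from the Topics section (`key`, `payload`, `timestamp`); the topic names and JSON payload fields are illustrative.

```python
import json
import sqlite3

db = sqlite3.connect(":memory:")
for topic in ("orders", "customers"):
    db.execute(f"CREATE TABLE {topic} (key TEXT PRIMARY KEY, payload TEXT, timestamp INTEGER)")

now = 1700000000
db.execute("INSERT INTO customers VALUES (?, ?, ?)",
           ("cust_1", json.dumps({"id": "cust_1", "name": "Ada"}), now))
db.execute("INSERT INTO orders VALUES (?, ?, ?)",
           ("order_123", json.dumps({"customer_id": "cust_1", "total": 42}), now))

# A time-bounded join: only orders ingested in the last hour are considered.
result = db.execute(
    """
    SELECT json_extract(c.payload, '$.name')  AS customer,
           json_extract(o.payload, '$.total') AS total
    FROM orders o
    JOIN customers c
      ON json_extract(o.payload, '$.customer_id') = json_extract(c.payload, '$.id')
    WHERE o.timestamp > ? - 3600
    """,
    (now,),
).fetchall()
# result == [("Ada", 42)]
```

The `timestamp` predicate is what keeps a reactive join cheap: each re-execution scans a bounded window of recent rows rather than the full topic history.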
Windows
Windows group messages by time intervals for aggregation. LiteJoin supports three window types:

- Tumbling
- Sliding
- Session

Tumbling windows use fixed, non-overlapping intervals; each message belongs to exactly one window. Example: "Count orders every 5 minutes."
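The "count orders every 5 minutes" example can be expressed as a single aggregation over a topic table: integer-dividing the Unix timestamp by the window size assigns each message to exactly one bucket. Table and column names follow the Topics section; the sample timestamps are illustrative.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE orders (key TEXT PRIMARY KEY, payload TEXT, timestamp INTEGER)")
db.executemany(
    "INSERT INTO orders VALUES (?, '{}', ?)",
    [("order_1", 1000), ("order_2", 1100), ("order_3", 1400)],
)

# 300-second tumbling windows: (timestamp / 300) truncates, so each row
# falls into exactly one non-overlapping bucket.
windows = db.execute(
    """
    SELECT (timestamp / 300) * 300 AS window_start, COUNT(*) AS orders
    FROM orders
    GROUP BY window_start
    ORDER BY window_start
    """
).fetchall()
# windows == [(900, 2), (1200, 1)]
```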
Sinks
A sink is a destination for join and window results. LiteJoin supports four sink types:

| Sink Type | Description |
|---|---|
| HTTP | Sends results as JSON POST requests to a webhook URL. |
| Kafka | Produces results to a Kafka topic. |
| SSE | Streams results to browser clients via Server-Sent Events. Includes a `/snapshot` endpoint for initial state hydration. |
| SQLite | Writes results to a local SQLite database for querying by external applications. |
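The SQLite sink can be pictured as an append to a local results table. The table name, columns, and `deliver` helper below are assumptions for illustration, not LiteJoin's actual output schema.

```python
import json
import sqlite3

sink = sqlite3.connect(":memory:")
sink.execute(
    "CREATE TABLE IF NOT EXISTS join_results ("
    "join_name TEXT, result TEXT, emitted_at INTEGER)"
)

def deliver(join_name: str, rows: list[dict], emitted_at: int) -> None:
    """Append one batch of join results for external applications to query."""
    sink.execute(
        "INSERT INTO join_results VALUES (?, ?, ?)",
        (join_name, json.dumps(rows), emitted_at),
    )
    sink.commit()

deliver("orders_with_customers", [{"customer": "Ada", "total": 42}], 1700000000)
stored = sink.execute("SELECT join_name FROM join_results").fetchall()
```

Because the sink is just a SQLite file, any external application can read results with ordinary SQL, with no LiteJoin client required.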
Messages
Every piece of data flowing through LiteJoin is a `Message`. Messages emitted by diffing API sources additionally carry `_change` metadata describing what changed.
Pipeline Lifecycle
- Sources poll APIs, consume Kafka, or accept HTTP pushes → messages enter the pipeline.
- Writer batches messages and writes them to the appropriate topic (SQLite table) via upsert.
- Joiner re-evaluates all joins whose referenced topics received new data → emits `JoinResult` to sinks.
- Windower evaluates window queries on their schedule → emits aggregated results to sinks.
- Sinks deliver results to external systems (webhooks, Kafka, SSE, SQLite).
- Retention cleaner periodically deletes data older than the configured TTL.
If at-least-once delivery is enabled, failed sink deliveries are captured in a dead-letter queue (DLQ) and retried automatically with exponential backoff.
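The retry schedule of such a DLQ can be sketched as a capped exponential backoff with jitter. The base delay, cap, and jitter strategy below are assumptions, not LiteJoin's documented defaults.

```python
import random

def backoff_delays(attempts: int, base: float = 1.0, cap: float = 60.0) -> list[float]:
    """Delay before each retry: base * 2^n, capped, with up to 10% jitter."""
    random.seed(0)  # deterministic for the example; a real DLQ would not seed
    return [min(cap, base * 2 ** n) * (1 + random.random() * 0.1) for n in range(attempts)]

delays = backoff_delays(5)
# roughly 1s, 2s, 4s, 8s, 16s before each successive retry
```

The cap bounds the worst-case gap between retries, while the jitter keeps many failed deliveries from retrying in lockstep against a recovering sink.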