Architecture

LiteJoin packs an entire stream processing engine into a single Go binary. There are no external dependencies — no message broker, no database server, no coordinator. Everything runs in-process, backed by SQLite.

Engine Overview

┌──────────────────────────────────────────────────────────────┐
│                        LiteJoin Engine                        │
│                                                              │
│  ┌─────────┐ ┌──────────┐ ┌──────────┐                      │
│  │   API   │ │   HTTP   │ │  Kafka   │    Sources            │
│  │ Source  │ │  Source   │ │  Source  │                       │
│  └────┬────┘ └────┬─────┘ └────┬─────┘                      │
│       │           │            │                             │
│       └───────────┴──────┬─────┘                             │
│                          ▼                                   │
│                     ┌─────────┐                              │
│                     │  msgCh  │    Message Channel            │
│                     └────┬────┘                              │
│                          ▼                                   │
│                     ┌─────────┐                              │
│                     │ Writer  │    Batched Writes             │
│                     └────┬────┘                              │
│                          ▼                                   │
│              ┌───────────────────────┐                       │
│              │  Sharded SQLite (WAL) │    Storage             │
│              │  shard_0 ... shard_N  │                       │
│              └───────────┬───────────┘                       │
│                          │                                   │
│              ┌───────────┼───────────┐                       │
│              ▼           ▼           ▼                       │
│         ┌─────────┐ ┌─────────┐ ┌──────────┐                │
│         │ Joiner  │ │Windower │ │Retention │                 │
│         └────┬────┘ └────┬────┘ │ Cleaner  │                 │
│              │           │      └──────────┘                 │
│              └─────┬─────┘                                   │
│                    ▼                                         │
│         ┌──────────────────────┐                             │
│         │       Sinks          │                             │
│         │  HTTP │ Kafka │ SSE  │                             │
│         └──────────────────────┘                             │
└──────────────────────────────────────────────────────────────┘

SQLite Storage

LiteJoin uses SQLite in WAL (Write-Ahead Logging) mode as its primary data store. Every topic maps to a SQLite table with key, payload, and timestamp columns.

Sharding

Data is distributed across multiple SQLite databases using consistent hashing (FNV) on the message key:

shard_index = fnv32(message.key) % shard_count

The default shard count is 8. Each shard is a separate .db file:

data/
├── shard_0.db
├── shard_1.db
├── ...
└── shard_7.db

Write Path

A single writer connection per shard ensures zero write contention. Messages are batched by the Writer component and flushed every flush_interval (default: 10ms) or when batch_size (default: 1000) is reached. Writes use SQLite’s INSERT OR REPLACE (upsert), meaning the latest value for a key always wins.

Read Path

A configurable pool of reader connections (default: 4 per shard) handles queries from the joiner, windower, and snapshot API. SQLite’s WAL mode provides snapshot isolation — readers see a consistent state even while writes are in progress.

Retention

A background goroutine periodically deletes data older than the configured retention duration:

retention:
  duration: 24h
  clean_interval: 1m

This keeps SQLite databases small and queries fast. For longer-term data, see Tiered Storage.

Tiered Storage (Optional)

When enabled, LiteJoin compacts expired data into Parquet files before deleting from SQLite. These files are queryable via an embedded DuckDB instance and can optionally be uploaded to cloud object storage (S3, GCS, Azure Blob):

Tier	Storage	Latency	Retention
Hot	SQLite (sharded)	Sub-millisecond	Configured TTL (e.g., 1h–24h)
Warm	Local Parquet files	Milliseconds	Configurable (default: 7 days)
Cold	Cloud object storage	Seconds	Indefinite

The hot path has zero overhead from tiered storage — DuckDB is only initialized when the feature is enabled and only used for historical queries.

Replication (Optional)

LiteJoin integrates with Litestream for continuous backup of SQLite shards to cloud object storage. This enables:

Disaster recovery — restore shards from S3/GCS on startup
Ephemeral deployments — containers can boot, restore state, and resume processing
Point-in-time recovery — roll back to any point within the retention window

See Replication for setup details.

At-Least-Once Delivery

When enabled, failed sink deliveries are captured in a dead-letter queue (DLQ) backed by a dedicated SQLite database. A background retry worker redelivers messages with exponential backoff:

Sink fails → Enqueue to DLQ → Retry with backoff → Ack on success

See Delivery Guarantees for configuration.

LiteJoin Studio

LiteJoin Studio is a Wails desktop application (Go backend + React/Vite frontend) that embeds the LiteJoin engine directly. It provides:

Visual pipeline builder — add sources, configure joins and windows via UI
Live SQL editor — write queries with autocomplete and see results stream in real-time
One-click deploy — export a production-ready litejoin.yaml configuration

Studio communicates with the embedded engine via Wails bindings — no network overhead, no separate server process.

Introduction

Getting Started

Sources

Sinks

Building Pipelines

Operations

Architecture

Architecture

Engine Overview

SQLite Storage

Sharding

Write Path

Read Path

Retention

Tiered Storage (Optional)

Replication (Optional)

At-Least-Once Delivery

LiteJoin Studio

Introduction

Getting Started

Sources

Sinks

Building Pipelines

Operations

​Architecture

​Engine Overview

​SQLite Storage

​Sharding

​Write Path

​Read Path

​Retention

​Tiered Storage (Optional)

​Replication (Optional)

​At-Least-Once Delivery

​LiteJoin Studio

Architecture

Engine Overview

SQLite Storage

Sharding

Write Path

Read Path

Retention

Tiered Storage (Optional)

Replication (Optional)

At-Least-Once Delivery

LiteJoin Studio