Audience: Architects, implementers. Use: Understand boundary handoff, cursor advancement, and replay-based recovery.
Asynchronous Replay Principle
BSFG zones never communicate via synchronous RPC. Instead, they exchange facts through asynchronous replay: facts are replayed from one zone's store buffer to another zone's forward buffer, and the receiving zone drives fetch progress.
This decouples producer from consumer. A producer completes on local durability; it does not wait for remote delivery or processing. A consumer drives its own fetch pace, independent of producer availability.
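The consumer-driven side of this decoupling can be sketched as a pull loop. This is a minimal illustration, not the BSFG API: `fetch_batch`, the tuple layout, and the batch size are all assumptions made for the example.

```python
def fetch_batch(store_log, after_offset, limit):
    """Return up to `limit` entries whose offset is greater than `after_offset`."""
    return [entry for entry in store_log if entry[0] > after_offset][:limit]


# The producer has already been acknowledged for these facts; it is not
# involved in anything that follows.
store_log = [(n, f"fact-{n}") for n in range(5)]

received = []
last_seen = -1            # hypothetical "nothing fetched yet" marker
while True:
    batch = fetch_batch(store_log, last_seen, limit=2)   # consumer sets the pace
    if not batch:
        break
    received.extend(batch)
    last_seen = batch[-1][0]
```

The loop runs entirely at the receiver's pace: a slow or paused consumer delays only its own progress, never the producer's acknowledgment.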
The Four-Step Handoff Protocol
The boundary handoff occurs in four steps. Each step is durably recorded.
Step 1: Proposal
A fact arrives at the store buffer (ISB or ESB). The system assigns it an offset n and records:
- offset = position in the log
- payload = the fact body
- idempotency_key = derived from message_id or payload hash
The producer is acknowledged as soon as the fact is durable here. No consumer is involved yet.
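A minimal sketch of the proposal step, assuming an in-memory list stands in for the durable log. The `StoreBuffer` and `Entry` names and the choice of SHA-256 for the payload hash are illustrative assumptions, not part of BSFG.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class Entry:
    offset: int            # position in the log
    payload: bytes         # the fact body
    idempotency_key: str   # derived from message_id or payload hash


@dataclass
class StoreBuffer:
    """Hypothetical in-memory stand-in for an ISB/ESB log."""
    log: list = field(default_factory=list)

    def append(self, payload: bytes, message_id: Optional[str] = None) -> Entry:
        # Prefer the producer-supplied message_id; otherwise hash the payload.
        key = message_id or hashlib.sha256(payload).hexdigest()
        entry = Entry(offset=len(self.log), payload=payload, idempotency_key=key)
        self.log.append(entry)   # a real buffer would persist before acknowledging
        return entry             # returning stands in for the producer's ack


isb = StoreBuffer()
first = isb.append(b'{"temp": 21.5}', message_id="msg-001")
second = isb.append(b'{"temp": 21.7}')   # no message_id: key is the payload hash
```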
Step 2: Insertion
The forward buffer (IFB or EFB) receives a proposal from the store buffer. It executes:
putIfAbsent(idempotency_key, payload) → AlreadyExists | Confirmed(offset=n)
If the key already exists, insertion is rejected. If new, the payload is stored and confirmed. This atomic operation prevents duplicates without background workers.
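The insertion step might look like the following sketch. It assumes a single-process forward buffer guarded by a lock; the class name and the explicit `offset` parameter are additions for illustration, since the signature above passes only the key and payload.

```python
import threading
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Confirmed:
    offset: int


@dataclass(frozen=True)
class AlreadyExists:
    pass


class ForwardBuffer:
    """Hypothetical in-memory stand-in for an IFB/EFB."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._by_key: dict = {}

    def put_if_absent(self, key: str, payload: bytes,
                      offset: int) -> Union[Confirmed, AlreadyExists]:
        # The check and the write happen under one lock, so two concurrent
        # proposals for the same key cannot both be stored.
        with self._lock:
            if key in self._by_key:
                return AlreadyExists()
            self._by_key[key] = (offset, payload)
            return Confirmed(offset)


ifb = ForwardBuffer()
r1 = ifb.put_if_absent("msg-001", b"fact", offset=0)   # stored and confirmed
r2 = ifb.put_if_absent("msg-001", b"fact", offset=0)   # rejected as a duplicate
```

A durable implementation would rely on the storage engine's own conditional write rather than a process-local lock; the lock here only demonstrates the atomic check-then-store behavior.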
Step 3: Cursor Advancement
Once a fact is confirmed at the forward buffer, the cursor advances. The cursor (also called the frontier) tracks highest_contiguous_committed_offset — the highest offset at which all preceding offsets have been confirmed.
Contiguous Prefix Constraint: The cursor can only advance over contiguous offsets. Gaps prevent advancement until filled.
Example:
Store Buffer contents: [0, 1, 2, 3, 4, 5]
Confirmed at Forward: [0, 1, 2, ×, 4, 5] (offset 3 missing)
Cursor position: 2 (stops at first gap)
Once offset 3 is confirmed, the cursor advances to 5 (covering the contiguous prefix 0-5).
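The contiguous prefix rule can be sketched as follows, reproducing the example above. The `Cursor` class and the use of -1 for "nothing confirmed yet" are assumptions of this sketch.

```python
class Cursor:
    """Tracks highest_contiguous_committed_offset; -1 means nothing confirmed."""

    def __init__(self) -> None:
        self.position = -1
        self._pending: set = set()   # confirmed offsets beyond the frontier

    def confirm(self, offset: int) -> int:
        if offset > self.position:
            self._pending.add(offset)
        # Advance only while the next offset is confirmed (contiguous prefix).
        while self.position + 1 in self._pending:
            self.position += 1
            self._pending.remove(self.position)
        return self.position


cursor = Cursor()
for n in (0, 1, 2, 4, 5):        # offset 3 has not been confirmed yet
    cursor.confirm(n)
stalled_at = cursor.position     # the cursor stops at the first gap
after_fill = cursor.confirm(3)   # filling the gap releases offsets 4 and 5
```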
Step 4: Recovery
When the boundary reconnects after a partition:
- The store buffer reads its durable checkpoint: last_cursor_position = 5
- It proposes all facts from offset 6 onward to the forward buffer
- The forward buffer applies putIfAbsent; duplicates are silently rejected
- The cursor advances as new confirmations arrive
Replay is idempotent by construction. A fact replayed twice produces the same result both times.
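A sketch of replay after reconnection, assuming a plain dict stands in for the forward buffer and tuples for log entries. The `replay` function and its return shape are hypothetical. It runs the same replay twice to show that the second pass changes nothing.

```python
def replay(store_log, forward, checkpoint):
    """Propose every entry after the checkpoint; report what the forward buffer did."""
    stored, rejected = [], []
    for offset, key, payload in store_log:
        if offset <= checkpoint:
            continue                  # already covered by the durable cursor
        if key in forward:            # putIfAbsent: duplicate, silently rejected
            rejected.append(offset)
        else:
            forward[key] = payload
            stored.append(offset)
    return stored, rejected


# Offsets 0-8 exist locally; the durable checkpoint says 0-5 were transferred.
store_log = [(n, f"msg-{n}", f"fact-{n}".encode()) for n in range(9)]
forward = {f"msg-{n}": f"fact-{n}".encode() for n in range(6)}
forward["msg-6"] = b"fact-6"   # reached the forward buffer just before the partition

first_pass = replay(store_log, forward, checkpoint=5)
second_pass = replay(store_log, forward, checkpoint=5)
```

Offset 6 illustrates why deduplication matters: it was stored remotely but the checkpoint had not yet recorded it, so replay proposes it again and the forward buffer rejects it.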
Frontier Semantics
The frontier is the boundary's notion of "what has been safely transferred." It is managed by the Cursor Tracker — an external service or persisted state machine.
Cursor Tracker state:
highest_contiguous_committed_offset = n
Truncation Safety: The store buffer can truncate entries before the frontier. Entries at or after the frontier must be retained until confirmation.
Example: If the frontier is at offset 100, the store buffer can safely delete offsets 0-99. Offsets 100+ are retained for replay or recovery.
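The truncation rule, as stated above, can be expressed in a few lines. The `truncate` helper and the dict-based log are assumptions made for this sketch.

```python
def truncate(store_log: dict, frontier: int) -> dict:
    """Keep only offsets at or after the frontier, as the rule above states."""
    return {offset: fact for offset, fact in store_log.items() if offset >= frontier}


store_log = {n: f"fact-{n}" for n in range(150)}
retained = truncate(store_log, frontier=100)   # offsets 0-99 are dropped
```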
Operational Modes
BSFG operates in three distinct modes, driven by boundary connectivity:
Normal Mode (Gate Open)
Both zones are connected and responsive. Replay keeps pace with appends, so handoff latency is low (~10ms).
- Producers append to ISB/ESB and get durable acknowledgment (~1-5ms)
- Consumers fetch from IFB/EFB with low latency (~10-50ms end-to-end)
- Cursor advances smoothly, truncation proceeds normally
Autonomous Mode (Gate Closed)
The boundary is partitioned and the remote zone is unreachable. Producers and consumers continue locally.
- Producers append to local ISB/ESB and receive durable ack — no blocking
- Consumers fetch from local IFB/EFB using last known cursor position
- Optional zone-local intra-zone buffers support emitter/consumer continuity
- Cursor freezes at the last known frontier
- Store buffers accumulate facts; truncation is disabled
Reconciliation Mode (Gate Reopening)
Connectivity is restored. The zones reconcile.
- Store buffer detects cursor position and begins replay from frontier + 1
- Forward buffer deduplicates incoming facts via putIfAbsent
- Cursor advances gradually as facts are confirmed
- Store buffer catches up and returns to truncation
- System returns to Normal Mode once backlog clears
State Diagram
  ┌─────────────┐
  │   NORMAL    │
  │ (Gate Open) │
  │  P99 <50ms  │
  └──────┬──────┘
         │ network partition
         ↓
  ┌──────────────┐
  │  AUTONOMOUS  │
  │(Gate Closed) │
  │ cursor frozen│
  └──────┬───────┘
         │ connectivity restored
         ↓
┌────────────────────┐
│   RECONCILIATION   │
│  (Gate Reopening)  │
│  cursor advancing  │
└────────┬───────────┘
         │ backlog cleared
         ↓
  ┌─────────────┐
  │   NORMAL    │
  └─────────────┘
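The three transitions in the diagram can be captured as a small lookup table. The `Mode` enum, the event names, and the choice to ignore events that do not apply are illustrative assumptions. The diagram does not show what happens if a partition recurs during reconciliation, so this sketch leaves that case out too.

```python
from enum import Enum


class Mode(Enum):
    NORMAL = "gate open"
    AUTONOMOUS = "gate closed"
    RECONCILIATION = "gate reopening"


# (current mode, event) -> next mode; event names are illustrative.
TRANSITIONS = {
    (Mode.NORMAL, "network_partition"): Mode.AUTONOMOUS,
    (Mode.AUTONOMOUS, "connectivity_restored"): Mode.RECONCILIATION,
    (Mode.RECONCILIATION, "backlog_cleared"): Mode.NORMAL,
}


def next_mode(mode: Mode, event: str) -> Mode:
    # Events that do not apply in the current mode leave it unchanged.
    return TRANSITIONS.get((mode, event), mode)


mode = Mode.NORMAL
history = [mode]
for event in ("network_partition", "connectivity_restored", "backlog_cleared"):
    mode = next_mode(mode, event)
    history.append(mode)
```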
Delivery Semantics
BSFG guarantees the following delivery properties:
At-Least-Once Transport
Every fact appended successfully is delivered at least once to the receiving zone's forward buffer. Facts are never lost due to network partition or timeout, as long as the store buffer retains them (default 7-day TTL).
Idempotent Append
The forward buffer's putIfAbsent operation is atomic per idempotency key. If the same message_id is submitted multiple times (due to producer retry), it is stored once; each later submission returns AlreadyExists rather than creating a second copy. Deduplication is guaranteed at the storage layer.
Replay-Based Recovery
On boundary reconnection, facts are replayed from the store buffer using the cursor checkpoint. No facts are skipped; no facts are duplicated beyond what putIfAbsent handles.
No Exactly-Once
BSFG does not guarantee exactly-once delivery. Consumers must expect and handle at-least-once delivery by implementing idempotent processing.
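One common way to make consumer processing idempotent is to remember which idempotency keys have already been applied. The sketch below assumes an in-memory set and a simple running total as the side effect. A real consumer would persist the processed keys together with the effect; the class and method names are hypothetical.

```python
class IdempotentConsumer:
    """Applies each fact's effect once, however many times it is delivered."""

    def __init__(self) -> None:
        self.processed: set = set()   # would be durable in a real consumer
        self.total = 0

    def handle(self, idempotency_key: str, amount: int) -> bool:
        if idempotency_key in self.processed:
            return False              # redelivery: skip the side effect
        self.total += amount
        self.processed.add(idempotency_key)
        return True


consumer = IdempotentConsumer()
deliveries = [("msg-1", 10), ("msg-2", 5), ("msg-1", 10)]   # msg-1 arrives twice
results = [consumer.handle(key, amount) for key, amount in deliveries]
```

The redelivered `msg-1` is recognized and skipped, so the total reflects each fact exactly once even though transport delivered one of them twice.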