Architecture

Overview

The authoritative definition of BSFG: principles, guarantees, and exclusions

Audience: System architects, implementers, standards auditors.

Use: This is the authoritative definition of what BSFG is. All other documentation defers to this specification. For implementation details, see the reference implementation guides and deployment documentation.

Executive Summary

The Bidirectional Store-and-Forward Gateway (BSFG) is a hexagonal boundary primitive enabling partition-tolerant integration between two independent trust domains (zones). It guarantees that either zone may become unreachable without blocking producers, losing data, or forcing shared infrastructure. Both zones continue operating autonomously during partition. Upon reconnection, facts are replayed from cursors with idempotent boundary-level materialization (duplicate suppression via atomic insertion at forward buffers).


Problem Statement

Modern industrial plants require asynchronous integration between Enterprise IT (External Zone) and Plant OT (Internal Zone). Traditional synchronous gateways force producers to block or fail during network partitions, GC pauses, DNS failures, and extended enterprise outages. Simple message buses lose in-flight data or require shared infrastructure that violates zone autonomy.

BSFG solves this via durable asynchronous buffering with configurable idempotency: neither zone blocks on the other's availability, data survives partition, and duplicates are suppressed on insertion at the forward buffers without background reconciliation workers.


Architectural Objectives

| ID | Objective | Verification Criteria |
|----|-----------|-----------------------|
| O1 | Producer Non-Blocking | Producers complete writes via local durable acknowledgment without network dependency on remote zone |
| O2 | Effectively-Once Boundary | No data loss within configured envelope; no duplicate materialization via atomic idempotent insertion |
| O3 | Mechanism Agnosticism | Supports arbitrary byte sequences (MQTT, OPC UA, Parquet, Protobuf) with content-type metadata; no schema enforcement |
| O4 | Fast Swappability | Any storage backend or hashing algorithm replaceable within 5 working days via hexagonal interface ports |

Explicit Non-Objectives (Scope Exclusions)

| ID | Non-Objective | Rationale |
|----|---------------|-----------|
| N1 | Not Shared Database | Zones remain transactionally autonomous; no 2PC or synchronous replication |
| N2 | Not Synchronous RPC | No request/response patterns holding connections open for remote acknowledgment |
| N3 | Not Distributed Transactions | No Saga orchestration or atomic cross-boundary transactions |
| N4 | Not Global Total Ordering | Gateway does not enforce causality; vector clocks transported as opaque metadata only |
| N5 | Not Semantic Transformation | No schema normalization or business logic at boundary; pure transport layer |
| N6 | Not Infinite Durability | Retention bounded by TTL or capacity; explicit overflow policies required |
| N7 | Not End-to-End Exactly-Once | Gateway guarantees exactly-once materialization at boundary buffers only; downstream application idempotency remains application concern |

Topology: Four-Buffer Decomposition

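BSFG represents the minimal factorization of durability × availability × directionality under partition tolerance constraints. Each direction (ingress, egress) gets one store buffer for durable local writes and one forward buffer for idempotent materialization, which yields four buffers.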

The Four Logical Roles

| Role | Direction | Responsibility | Interface |
|------|-----------|----------------|-----------|
| ISB (Ingress Store) | External → Internal | Durable write-ahead log at external perimeter | append(payload, metadata) → Offset |
| IFB (Ingress Forward) | External → Internal | Idempotent deduplication via putIfAbsent | putIfAbsent(key, payload) → Status |
| ESB (Egress Store) | Internal → External | Durable write-ahead log at internal perimeter | append(payload, metadata) → Offset |
| EFB (Egress Forward) | Internal → External | Idempotent deduplication via putIfAbsent | putIfAbsent(key, payload) → Status |

Dataflow Diagram

flowchart LR
  subgraph EZ["EXTERNAL ZONE (Enterprise IT)"]
    EP["Producers/Consumers"]
    ISB["ISB (store)"]
    EFB["EFB (forward)"]
    EP --> ISB
    EFB --> EP
  end

  subgraph IZ["INTERNAL ZONE (Plant OT)"]
    IP["Producers/Consumers"]
    IFB["IFB (forward)"]
    ESB["ESB (store)"]
    IFB --> IP
    IP --> ESB
  end

  ISB -- "GATE open/closed (async handoff only)" --> IFB
  ESB -- "GATE open/closed (async handoff only)" --> EFB

Buffer Flow: ISB → IFB (ingress) | ESB → EFB (egress)


Cursor Management and Frontier Semantics

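The Cursor Tracker maintains per-direction state: the next offset to send (nextToSend) and the highest contiguous committed offset (committed), as returned by getCheckpoint().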

Contiguous Prefix Constraint: Acknowledgment frontier must be strictly contiguous. Truncation at store buffer is safe only for entries ≤ highest_contiguous_committed_offset, ensuring no gaps in durability history.

Recovery Protocol: On node restart, replay from checkpointed highest_contiguous_committed_offset. Duplicates are suppressed by IFB/EFB idempotency.
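The contiguous prefix constraint can be illustrated with a small tracker. This is a minimal sketch, not the reference implementation: the class name `ContiguousFrontier`, numeric offsets starting at 0, and the use of -1 for "nothing committed yet" are all assumptions made for illustration.

```typescript
// Hypothetical helper: tracks the highest contiguous committed offset.
// Assumes offsets are integers starting at 0 (not stated by the specification).
class ContiguousFrontier {
  // Offsets confirmed out of order, waiting for the gap below them to close.
  private pending = new Set<number>();
  // Every offset <= committed has been confirmed; -1 means none yet.
  private committed = -1;

  // Record a confirmation and advance the frontier across any closed gaps.
  commit(offset: number): number {
    if (offset > this.committed) {
      this.pending.add(offset);
    }
    while (this.pending.has(this.committed + 1)) {
      this.committed += 1;
      this.pending.delete(this.committed);
    }
    return this.committed;
  }

  highestContiguousCommittedOffset(): number {
    return this.committed;
  }
}

const frontier = new ContiguousFrontier();
frontier.commit(0); // frontier is 0
frontier.commit(2); // frontier stays 0 because offset 1 is still missing
frontier.commit(1); // gap closed, frontier advances to 2
```

Because the frontier never skips a gap, truncating the store buffer up to the frontier cannot discard an entry that is still unconfirmed.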


Formal Interface Contracts

All implementations must satisfy these interfaces:

Store Buffer Interface (ISB, ESB)

interface StoreBuffer {
  // Append payload durably; resolves to a monotonically increasing offset
  append(payload: Bytes, metadata: Headers): Promise<Offset>;

  // Truncate all entries before offset (safe only for contiguous prefix)
  truncateBefore(offset: Offset): Promise<void>;

  // Replay entries starting from offset
  replay(from: Offset): AsyncIterator<Entry>;

  // Return highest_contiguous_committed_offset from Cursor Tracker
  getFrontier(): Promise<Offset>;
}
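To make the store buffer contract concrete, here is a minimal in-memory sketch. It is synchronous and not durable, so it only illustrates the offset, truncation, and replay semantics. The names `InMemoryStoreBuffer` and `StoredEntry`, and the numeric offsets, are assumptions rather than part of the specification.

```typescript
// Hypothetical entry shape; the specification leaves Entry undefined.
type StoredEntry = {
  offset: number;
  payload: Uint8Array;
  metadata: Record<string, string>;
};

// In-memory stand-in for a StoreBuffer. A real backend must persist
// each entry before acknowledging the append.
class InMemoryStoreBuffer {
  private entries: StoredEntry[] = [];
  private nextOffset = 0;

  // Append and return a monotonically increasing offset.
  append(payload: Uint8Array, metadata: Record<string, string>): number {
    const offset = this.nextOffset++;
    this.entries.push({ offset, payload, metadata });
    return offset;
  }

  // Drop every entry whose offset is strictly below the given offset.
  truncateBefore(offset: number): void {
    this.entries = this.entries.filter((e) => e.offset >= offset);
  }

  // Yield entries with offset >= from, in offset order.
  *replay(from: number): IterableIterator<StoredEntry> {
    for (const e of this.entries) {
      if (e.offset >= from) yield e;
    }
  }
}
```

Offsets are never reused after truncation, so a cursor checkpointed before a truncation still identifies the same entries afterwards.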

Forward Buffer Interface (IFB, EFB)

type Status = "Inserted" | "AlreadyExists";

interface ForwardBuffer {
  // Atomic idempotent insert: reject if key exists, insert if not
  putIfAbsent(idempotencyKey: Key, payload: Bytes, metadata: Headers): Promise<Status>;

  // Retrieve by key
  get(key: Key): Promise<Option<Bytes>>;

  // Query by time range
  queryByTimeRange(start: Timestamp, end: Timestamp): Iterator<Entry>;
}

Cursor Tracker Interface

interface CursorTracker {
  // Record that entry at offset with idempotencyKey is confirmed
  commit(offset: Offset, idempotencyKey: Key): Promise<void>;

  // Return current checkpoint
  getCheckpoint(): Promise<{
    nextToSend: Offset;
    committed: Offset; // highest_contiguous_committed_offset
  }>;
}
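The three interfaces cooperate during handoff and recovery: entries after the committed offset are replayed, and the forward buffer suppresses any that were already materialized. The sketch below shows that interaction in simplified, synchronous form. `MapForwardBuffer`, `LogEntry`, and `handoff` are hypothetical names, and a single-threaded Map check stands in for the atomic putIfAbsent of a real backend.

```typescript
type InsertStatus = "Inserted" | "AlreadyExists";

// Hypothetical log entry shape used only for this sketch.
interface LogEntry {
  offset: number;
  key: string;
  payload: string;
}

class MapForwardBuffer {
  private items = new Map<string, string>();

  putIfAbsent(key: string, payload: string): InsertStatus {
    if (this.items.has(key)) return "AlreadyExists";
    this.items.set(key, payload);
    return "Inserted";
  }

  count(): number {
    return this.items.size;
  }
}

// Replay every entry after the committed offset; return the new committed offset.
// Assumes `log` is sorted by offset with no gaps.
function handoff(log: LogEntry[], committed: number, forward: MapForwardBuffer): number {
  let frontier = committed;
  for (const entry of log) {
    if (entry.offset <= committed) continue; // confirmed before the restart
    // Either status means the entry is materialized exactly once.
    forward.putIfAbsent(entry.key, entry.payload);
    frontier = entry.offset;
  }
  return frontier;
}
```

If a crash happened after an entry reached the forward buffer but before its commit was checkpointed, the replay hits `AlreadyExists` for that entry and the frontier still advances.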

Operational Modes

Normal Mode (Gate Open)

Autonomous Mode (Gate Closed)

Reconciliation Mode (Gate Reopening)


Idempotency Model

Idempotency Key Strategies

Implementations support configurable idempotency key derivation:

  1. Default (SHA-256): Hash the payload bytes exactly — byte-exact deduplication
  2. Semantic: Hash canonicalized payload + metadata subset — ignores non-essential fields
  3. Explicit: Application-provided message_id (e.g., UUID, business key)
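The three strategies can be sketched as plain functions. This is an illustrative sketch only: the function names are hypothetical, the semantic variant assumes flat records whose selected fields serialize deterministically with JSON, and the hashing uses Node.js `crypto.createHash`.

```typescript
import { createHash } from "node:crypto";

// Default strategy: SHA-256 over the exact payload bytes.
function defaultKey(payload: Uint8Array): string {
  return createHash("sha256").update(payload).digest("hex");
}

// Semantic strategy: hash only the selected fields, in sorted field order,
// so field order and non-essential fields do not affect the key.
// Assumes flat records; nested objects would need their own canonical form.
function semanticKey(record: Record<string, unknown>, fields: string[]): string {
  const canonical = [...fields].sort().map((f) => [f, record[f]]);
  return createHash("sha256").update(JSON.stringify(canonical)).digest("hex");
}

// Explicit strategy: the producer supplies the key (e.g. a UUID).
function explicitKey(messageId: string): string {
  return messageId;
}
```

With the semantic strategy, two readings that differ only in an excluded field such as a receive timestamp map to the same key, while the default strategy treats them as distinct facts.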

Idempotent Append Semantics

if message_id already exists and payload is identical:
    return existing entry offset
if message_id already exists and payload differs:
    reject as conflicting_duplicate

This contract is enforced at the Forward Buffer (IFB/EFB) via atomic putIfAbsent semantics. Producers may retry indefinitely with the same message_id and payload; every repeated append is a no-op that returns the existing entry offset.
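The two rules above, plus the first-insert case, can be written out as follows. This is a sketch under assumptions: `IdempotentLog`, `AppendResult`, and `sameBytes` are hypothetical names, and an in-memory Map replaces the atomic backend operation a real forward buffer would need.

```typescript
type AppendResult =
  | { kind: "inserted"; offset: number }
  | { kind: "duplicate"; offset: number }
  | { kind: "conflicting_duplicate" };

class IdempotentLog {
  private byId = new Map<string, { offset: number; payload: Uint8Array }>();
  private nextOffset = 0;

  append(messageId: string, payload: Uint8Array): AppendResult {
    const existing = this.byId.get(messageId);
    if (existing === undefined) {
      const offset = this.nextOffset++;
      this.byId.set(messageId, { offset, payload });
      return { kind: "inserted", offset };
    }
    if (sameBytes(existing.payload, payload)) {
      // Same message_id, identical payload: return the existing offset.
      return { kind: "duplicate", offset: existing.offset };
    }
    // Same message_id, different payload: reject.
    return { kind: "conflicting_duplicate" };
  }
}

// Byte-exact comparison of two payloads.
function sameBytes(a: Uint8Array, b: Uint8Array): boolean {
  return a.length === b.length && a.every((byte, i) => byte === b[i]);
}
```

Returning the existing offset for an identical retry lets a producer treat a retried append exactly like the original success.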


Data Integrity and Authenticity

Integrity via Content Addressing

Facts are immutable once appended. Content hash (SHA-256) verifies payload has not been modified in transit.

Authenticity via mTLS

All zone-to-zone communication uses mutual TLS (mTLS) with certificate-based zone identity. Each zone verifies the Common Name (CN) of its peer's certificate to confirm the authenticated source zone.

Artifacts

Producers upload artifacts and append facts referencing them by digest. The fact includes artifact_digest: SHA256(content). Consumers verify the artifact matches the digest; mismatch indicates tampering or loss.
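Consumer-side verification might look like the sketch below. Only the `artifact_digest` field name comes from the text; the `ArtifactFact` shape, the function names, and the lowercase hex encoding of the digest are assumptions.

```typescript
import { createHash } from "node:crypto";

// Hypothetical fact shape; only artifact_digest is named by the specification.
interface ArtifactFact {
  artifact_digest: string; // assumed to be hex-encoded SHA-256
}

function sha256Hex(content: Uint8Array): string {
  return createHash("sha256").update(content).digest("hex");
}

// Returns true only if the artifact bytes hash to the digest stored in the fact.
function verifyArtifact(fact: ArtifactFact, content: Uint8Array): boolean {
  return sha256Hex(content) === fact.artifact_digest.toLowerCase();
}
```

A `false` result should be treated as tampering or loss, and the artifact should not be used.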


Failure Mode Analysis

| Scenario | Behavior | Recovery |
|----------|----------|----------|
| Store Buffer Crash | Replay from checkpointed cursor | Cursor Tracker retrieves frontier; store replays from frontier; duplicates rejected by forward buffer atomicity |
| Forward Buffer Crash Mid-Write | Atomic CAS ensures consistency | Replay from store; duplicate key rejected by putIfAbsent atomicity |
| Network Partition (Gate Close) | ISB/ESB retain unacknowledged; zones operate autonomously | Resume handoff from cursor frontier on reconnect; contiguous prefix ensures no gaps |
| Hash Collision | Detected by secondary hash comparison | Manual intervention alert; quarantine affected entry |
| Buffer Exhaustion | Configurable policy activation | Standard: evict oldest unacknowledged; Safety-critical: reject new + operator alert |
| Clock Skew | Not interpreted by gateway | Vector clocks transported as opaque metadata; applications handle temporal ambiguity |
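The two buffer exhaustion policies in the table can be sketched as a bounded buffer with a configurable overflow policy. This is a hedged illustration: `BoundedBuffer`, `OverflowPolicy`, and the policy names are hypothetical, and the operator alert is left to the caller.

```typescript
type OverflowPolicy = "evict_oldest" | "reject_new";

class BoundedBuffer<T> {
  private items: T[] = [];
  private capacity: number;
  private policy: OverflowPolicy;

  constructor(capacity: number, policy: OverflowPolicy) {
    this.capacity = capacity;
    this.policy = policy;
  }

  // Returns true if the item was accepted, false if it was rejected.
  offer(item: T): boolean {
    if (this.items.length >= this.capacity) {
      if (this.policy === "reject_new") {
        return false; // safety-critical: caller should raise an operator alert
      }
      this.items.shift(); // standard: evict the oldest entry
    }
    this.items.push(item);
    return true;
  }

  snapshot(): T[] {
    return [...this.items];
  }
}
```

With `reject_new`, producers see the failure immediately, which trades the non-blocking property for the guarantee that no buffered entry is silently dropped.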

Threat Model

| Threat | Vector | Mitigation |
|--------|--------|------------|
| Replay Attack | Resubmission of old payload with valid key | TTL validation + timestamp metadata checks; applications implement nonce/version checks |
| Hash Flooding | Collision generation to degrade performance | Cryptographic hash (SHA-256) with collision-resistant properties; fallback to bytewise compare on collision |
| Tampering | Payload modification in transit | Hash verification detects modification; authenticity via mTLS certificates |
| Unauthorized Injection | Spurious data from compromised client | mTLS client certificates + zone-level authorization per buffer namespace |
| Fact Tampering | Attacker modifies fact in durable store | Facts are append-only; immutable once written; audit trail is tamper-evident |
| Artifact Substitution | Artifact replacement breaks digest | Digest mismatch signals tampering; immutable once referenced; consumers verify |
| Unauthorized Cross-Zone Replication | Rogue data replication from one zone to another | Facts are pulled by receiving zone, not pushed by sending zone. Network-level firewall rules enforce zone-to-zone connectivity boundaries |

Safety Certification Context

BSFG operates as an "Other System" per IEC 61508 (not itself a Safety Instrumented Function). It does not execute safety shutdowns. It provides data availability for safety systems during network partition.


Standards Alignment

| Standard | Reference | BSFG Mapping |
|----------|-----------|--------------|
| ISA-95 | Level 3/4 Boundary | Enterprise/Control layer integration boundary specification |
| IEC 62264 | Gateway Object Class | Functional definition of gateway with store-and-forward capability |
| IEC 62541-14 | OPC UA PubSub | Store-and-Forward Gateway functional category |
| RFC 1129 | Internet Store-and-Forward | Durability and retry semantics |
| EIP #101 | Guaranteed Delivery | Durable retry until acknowledgment |
| EIP #128 | Messaging Gateway | Abstraction over transport mechanisms |
| EIP #201 | Idempotent Receiver | Content-addressed duplicate suppression |

Proof by Exclusion

BSFG is the minimal compositional primitive satisfying all objectives (O1–O4) and exclusions (N1–N7). Systematic elimination:

  1. Request-Reply (#154): Eliminated — violates O1 via synchronous coupling
  2. Message Bus (non-durable): Eliminated — violates O2 via volatile in-flight storage
  3. Guaranteed Delivery (#101, unidirectional): Eliminated — insufficient for bidirectional autonomy
  4. Shared Database Integration: Eliminated — violates N1 via 2PC requirement
  5. Messaging Bridge (#133): Eliminated — assumes simultaneous availability, lacks intermediate staging

Conclusion: BSFG is the minimal composition of three patterns: Guaranteed Delivery applied in both directions, Idempotent Receiver with configurable keys, and Messaging Gateway as the transport abstraction.


Verification Matrix

| Objective | BSFG Mechanism | Verification |
|-----------|----------------|--------------|
| O1 Producer Non-Blocking | ISB/ESB local durable append | Producer ack after local durability; no remote network dependency |
| O2 Effectively-Once | Atomic putIfAbsent at IFB/EFB with contiguous frontier | Single materialization per idempotency key; truncation only after contiguous confirmation |
| O3 Mechanism Agnostic | Opaque byte storage with metadata headers | Protocol-agnostic payload handling; no transformation |
| O4 Fast Swappability | Hexagonal ports (StoreBuffer, ForwardBuffer, CursorTracker) | Backend replacement without protocol changes |

Glossary

