Context
BSFG facts carry structured payloads in object_json. Because the boundary model relies on idempotent append keyed by message_id, the system must be able to determine whether two messages with the same identifier are truly identical.
Standard JSON does not guarantee deterministic byte representation: object key order, whitespace, and number formatting may vary across languages and runtimes. The JSON Canonicalization Scheme (JCS) addresses this by defining a deterministic serialization of JSON data that can be hashed or compared consistently across systems. ([rfc-editor.org](https://www.rfc-editor.org/rfc/rfc8785))
The architecture therefore needs to decide whether payload equality is evaluated at the semantic object level or the byte-representation level.
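The non-determinism described above can be illustrated with a minimal sketch. The payload fields (`sensor`, `value`, `tags`) are hypothetical and only stand in for an arbitrary `object_json` value; the two `json.dumps` calls mimic two producers that use different, equally valid serializer settings.

```python
import json

# Hypothetical payload; the field names are illustrative only.
payload = {"sensor": "t-1", "value": 21, "tags": ["a", "b"]}

# Producer A: compact separators, insertion-order keys.
compact = json.dumps(payload, separators=(",", ":")).encode("utf-8")

# Producer B: pretty-printed with sorted keys.
pretty = json.dumps(payload, indent=2, sort_keys=True).encode("utf-8")

# Both encodings parse to the same object, yet their bytes differ.
same_object = json.loads(compact) == json.loads(pretty)
same_bytes = compact == pretty
print(same_object, same_bytes)  # True False
```

A raw byte comparison would treat these two messages as different even though they describe the same object.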
Options Considered
| Option | Description | Benefits | Drawbacks |
|---|---|---|---|
| Opaque byte comparison | Accept whatever JSON encoding the producer sends and compare raw bytes. | simple implementation; no serialization rules required | logically identical payloads may differ bytewise; idempotency checks become unreliable; cross-language producers behave inconsistently |
| Loose semantic comparison | Parse JSON into objects and compare structures ignoring order and formatting. | logically identical objects match; flexible producer implementations | comparison rules become complex; language runtimes differ in number handling; hashing and deduplication become harder |
| Binary serialization format | Replace JSON entirely with a binary protocol format. | compact encoding; deterministic representation | loses JSON’s universal readability; introduces new schema tooling requirements; reduces diagnostic transparency |
| Canonical JSON serialization (Selected) | Require canonical JSON encoding so that equivalent objects produce identical byte representation. | deterministic hashing and equality; stable idempotency checks; maintains JSON’s human readability; cross-language consistency | producers must implement canonical serialization rules; slightly stricter payload preparation discipline |
Decision
BSFG requires canonical JSON serialization for object_json.
object_json = canonicalized JSON (RFC 8785)
Canonicalization ensures:
- deterministic key ordering
- consistent numeric representation
- no insignificant whitespace
- stable byte representation for hashing
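As a rough sketch of these properties, the helper below approximates canonicalization with Python's standard `json` module. It is an assumption-laden simplification, not a full RFC 8785 implementation: the name `canonical_json` is hypothetical, floats are rejected because RFC 8785 requires ECMAScript-style number formatting that `json.dumps` does not produce, and keys are sorted by Unicode code point rather than by UTF-16 code unit, which can differ for keys containing characters outside the Basic Multilingual Plane. A production system would more likely rely on a dedicated JCS library.

```python
import json


def _reject_unsupported(value):
    """Reject values this simplified sketch cannot serialize the RFC 8785 way."""
    if isinstance(value, float):
        # RFC 8785 uses ECMAScript number formatting; not handled in this sketch.
        raise TypeError("floats are not supported by this simplified sketch")
    if isinstance(value, dict):
        for key, item in value.items():
            if not isinstance(key, str):
                raise TypeError("JSON object keys must be strings")
            _reject_unsupported(item)
    elif isinstance(value, (list, tuple)):
        for item in value:
            _reject_unsupported(item)


def canonical_json(obj) -> bytes:
    """Approximate canonical encoding: sorted keys, no whitespace, UTF-8 bytes."""
    _reject_unsupported(obj)
    text = json.dumps(
        obj,
        sort_keys=True,         # deterministic key ordering
        separators=(",", ":"),  # no insignificant whitespace
        ensure_ascii=False,     # emit non-ASCII characters directly, as UTF-8
        allow_nan=False,        # NaN and Infinity are not valid JSON
    )
    return text.encode("utf-8")


# Same logical object, different key insertion order.
first = canonical_json({"b": [True, None, "x"], "a": 1})
second = canonical_json({"a": 1, "b": [True, None, "x"]})
print(first == second)  # True
```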
Equality checks during idempotent append are therefore performed on the canonical byte representation of the JSON payload.
hash = SHA256(canonical_json(object))
This hash may be used internally to verify that repeated attempts with the same message_id contain identical content.
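One way the hash check might fit into idempotent append is sketched below. Everything here is illustrative: `FactLog`, `PayloadConflict`, and `content_hash` are hypothetical names, the in-memory dictionary stands in for whatever storage BSFG actually uses, and raising on a mismatch is an assumed policy that this decision does not itself specify. The canonical encoding is again only an approximation of RFC 8785 built on `json.dumps`.

```python
import hashlib
import json


class PayloadConflict(Exception):
    """Raised when a message_id is reused with different content (assumed policy)."""


def content_hash(obj) -> str:
    # Approximate canonical encoding; a real system would use a full RFC 8785 library.
    canonical = json.dumps(
        obj, sort_keys=True, separators=(",", ":"),
        ensure_ascii=False, allow_nan=False,
    ).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()


class FactLog:
    """Hypothetical in-memory log keyed by message_id."""

    def __init__(self):
        self._hashes = {}  # message_id -> SHA-256 hex digest of canonical payload
        self.facts = []    # appended (message_id, payload) pairs

    def append(self, message_id, obj) -> bool:
        """Return True if stored, False if an identical duplicate was ignored."""
        digest = content_hash(obj)
        known = self._hashes.get(message_id)
        if known is None:
            self._hashes[message_id] = digest
            self.facts.append((message_id, obj))
            return True
        if known != digest:
            raise PayloadConflict(f"message_id {message_id!r} reused with different content")
        return False


log = FactLog()
print(log.append("m1", {"a": 1, "b": 2}))  # True: first write is stored
print(log.append("m1", {"b": 2, "a": 1}))  # False: same content, different key order
```

A retry that carries the same content is absorbed silently, while a retry that carries different content under the same `message_id` surfaces as an explicit error instead of being deduplicated by mistake.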
Consequences
Benefits:
- idempotency checks are deterministic across languages
- payload equality is simple and reliable
- diagnostics remain human-readable JSON
- content hashes remain stable for auditing or integrity verification
Tradeoffs:
- producer libraries must implement canonicalization
- payload generation pipelines must avoid non-deterministic JSON formatting
- teams must be aware of canonicalization when generating or comparing payloads