Context
BSFG needs a durable per-zone storage topology that remains understandable, keeps operational overhead bounded, and still allows differentiated retention and filtering. Facts are heterogeneous: operational process facts, audit/protocol facts, and document-reference facts do not have identical lifecycle requirements. Large artifacts also require bucket organization that aligns with subject taxonomy without exploding operational complexity.
The topology must answer two related questions:
- how many streams exist per zone and how messages are partitioned within them
- how large-artifact buckets are organized in the zone-local object store
Options Considered
| Option | Description | Benefits | Drawbacks |
|---|---|---|---|
| Single stream and single bucket | Place every fact in one stream and every artifact in one object bucket. | minimal topology; easy to explain initially | weak separation of retention classes; noisy filtering surface; mixed artifact lifecycle policy |
| One stream and bucket per domain kind | Create separate streams and buckets for each subject kind such as batch, asset, alarm, lot, and recipe. | strong domain isolation; fine-grained retention control | stream sprawl; larger operational surface; harder cross-domain administration |
| Topology per producer or zone-pair | Partition streams and buckets by sender, receiver, or replication lane. | explicit channel ownership; local troubleshooting may feel direct | storage model leaks network topology; topology churn forces storage churn; poor long-term maintainability |
| Small fixed stream set + subject-prefix partitioning + bucket-per-subject-kind (Selected) | Use a small fixed stream set per zone, partition facts by subject prefix, and organize object buckets by subject kind. | bounded ops surface; clear retention classes; native filtering by subject prefix; artifact lifecycle aligned to subject taxonomy | requires naming discipline; subject vocabulary becomes structurally important |
Decision
Each zone uses a small fixed set of streams:
facts.operational
facts.audit
facts.documents
Facts are partitioned within those streams by subject prefix, for example:
facts.operational.batch
facts.operational.asset
facts.operational.alarm
facts.audit.protocol
facts.audit.security
facts.documents.file
facts.documents.record
This keeps stream count small while still allowing filtering, retention separation, and future growth through subject naming rather than stream multiplication.
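As a minimal sketch of how subject-prefix partitioning could be resolved in code, the snippet below maps a subject to its stream and filters subjects by prefix. The stream and subject names are taken from the examples above; the helper names `stream_for_subject` and `matches_prefix` are hypothetical and not part of BSFG, and matching on whole dot-delimited tokens is an assumption.

```python
# Stream names come from the decision above; the helper names are hypothetical.
STREAMS = ("facts.operational", "facts.audit", "facts.documents")


def matches_prefix(subject: str, prefix: str) -> bool:
    """True when `prefix` matches `subject` on whole dot-delimited tokens."""
    return subject == prefix or subject.startswith(prefix + ".")


def stream_for_subject(subject: str) -> str:
    """Return the fixed stream whose name is a token-wise prefix of the subject."""
    for stream in STREAMS:
        if matches_prefix(subject, stream):
            return stream
    raise ValueError(f"subject {subject!r} does not belong to any known stream")


subjects = [
    "facts.operational.batch",
    "facts.operational.alarm",
    "facts.audit.security",
    "facts.documents.file",
]

# Route every subject to one of the three fixed streams.
by_stream = {s: stream_for_subject(s) for s in subjects}

# A consumer interested only in alarms selects by subject prefix.
alarms_only = [s for s in subjects if matches_prefix(s, "facts.operational.alarm")]
```

Adding a new subject such as `facts.operational.lot` would need no new stream in this sketch; only the subject vocabulary grows.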
Large artifacts are stored in the zone-local object store, with one bucket per subject kind, for example:
batch-files
asset-files
alarm-files
document-files
lot-files
recipe-files
Artifact buckets are therefore aligned to subject taxonomy, while fact streams remain aligned to retention and operational class.
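The bucket naming in the examples above follows a `<kind>-files` pattern, which can be sketched as a small lookup. The set of kinds is copied from the example bucket names; the function name `bucket_for_kind` and the choice to reject unknown kinds are assumptions for illustration only.

```python
# Subject kinds inferred from the example bucket names above.
SUBJECT_KINDS = frozenset({"batch", "asset", "alarm", "document", "lot", "recipe"})


def bucket_for_kind(kind: str) -> str:
    """Return the artifact bucket for a subject kind (hypothetical helper)."""
    if kind not in SUBJECT_KINDS:
        # Rejecting unknown kinds keeps the bucket taxonomy from growing by accident.
        raise ValueError(f"unknown subject kind: {kind!r}")
    return f"{kind}-files"


buckets = sorted(bucket_for_kind(k) for k in SUBJECT_KINDS)
```

Rejecting unknown kinds, rather than creating buckets on demand, is one way to make taxonomy changes an explicit step.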
Consequences
Benefits:
- small and stable stream topology per zone
- retention policy is driven by fact class, not by uncontrolled stream proliferation
- filtering and consumer selection use subject prefixes naturally
- artifact lifecycle policy aligns to subject kind without forcing stream sprawl
Tradeoffs:
- subject naming conventions become part of architecture, not just convenience
- misclassified subjects can pollute stream boundaries
- bucket taxonomy must evolve carefully as new subject kinds appear
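Because subject naming carries architectural weight, one possible mitigation for misclassified subjects is to check them against a registry before publishing. The sketch below is an assumption, not a described BSFG mechanism: the registry holds only the example prefixes listed in the Decision section, and the token pattern (lowercase letters, digits, `_`, `-`) is a hypothetical convention.

```python
import re

# Registered subject prefixes, copied from the examples in the Decision section.
REGISTERED_PREFIXES = frozenset({
    "facts.operational.batch",
    "facts.operational.asset",
    "facts.operational.alarm",
    "facts.audit.protocol",
    "facts.audit.security",
    "facts.documents.file",
    "facts.documents.record",
})

# Hypothetical token convention: lowercase start, then lowercase, digits, _ or -.
TOKEN = re.compile(r"[a-z][a-z0-9_-]*")


def check_subject(subject: str) -> list[str]:
    """Return a list of naming problems; an empty list means the subject passes."""
    problems = []
    tokens = subject.split(".")
    for token in tokens:
        if not TOKEN.fullmatch(token):
            problems.append(f"invalid token {token!r}")
    # The first three tokens identify the fact class and subject kind.
    prefix = ".".join(tokens[:3])
    if prefix not in REGISTERED_PREFIXES:
        problems.append(f"unregistered prefix {prefix!r}")
    return problems


ok = check_subject("facts.operational.batch.line7")
misfiled = check_subject("facts.operational.security")
```

Here `facts.operational.security` is flagged because `security` is registered only under `facts.audit` in the examples.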