HLESS Deep Dive¶
High-Level Event Semantics Specification
This document explains why HLESS exists, the problems it solves, and its philosophical foundations.
The Problem¶
Event-driven architectures fail in three predictable ways:
- Semantic drift - Over time, "events" accumulate meanings that weren't intended
- Vocabulary lock-in - Teams inherit Kafka's terminology and conflate abstraction layers
- Human imprecision - Natural language descriptions become ambiguous record definitions
HLESS addresses all three by introducing a semantic layer between human intent and implementation.
Why "Event" Is a Forbidden Word¶
The word "event" is banned in HLESS strict mode. This is deliberate.
The Ambiguity Problem¶
When someone says "the OrderCreated event", they might mean:
- A request to create an order (which might be rejected)
- A fact that an order now exists (irreversible)
- An observation that something happened in an external system
- A derivation computed from other data
These are fundamentally different things with different semantics, ordering guarantees, and replay behaviours. Calling them all "events" creates bugs that manifest years later when someone replays the log and discovers their assumptions were wrong.
Kafka's Vocabulary Is Not Neutral¶
Kafka documentation treats "event", "record", and "message" as synonyms, and this vocabulary, together with terms like "topic", has leaked into how teams think about event-driven systems.
But Kafka is an implementation. Its vocabulary describes:
- How data is stored (partitions, offsets)
- How data is transmitted (producers, consumers)
- How data is organised (topics, consumer groups)
None of this tells you what the data means.
HLESS exists precisely because implementation vocabulary ("Kafka topic", "event stream") tells you nothing about:
- Whether records represent completed facts or pending requests
- Whether replay should trigger side effects
- Whether ordering matters and at what scope
The Four Record Kinds¶
HLESS requires every record to be classified as exactly one of four kinds. This is not optional.
INTENT¶
A fact that someone requested or attempted something.
Key property: Does NOT imply success. May lead to acceptance or rejection.
FACT¶
A fact about the domain that is permanently true.
Key property: Cannot be retracted. Must remain true forever.
OBSERVATION¶
A fact that something was observed, measured, or reported.
TemperatureMeasured (valid - observation was recorded)
TemperatureIs25Degrees (invalid - asserts correctness)
Key property: Truth is "this was observed", not "this is correct". May be duplicated, late, or out of order.
DERIVATION¶
A fact that a value was computed from other records.
Key property: Always rebuildable from sources. Must not introduce new domain truth.
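The four kinds can be modelled as a closed enumeration so that an unclassified record cannot be represented at all. The following is a minimal sketch, not the HLESS implementation: the names `RecordKind`, `KEY_PROPERTY`, and `classify` are hypothetical, and the property strings paraphrase the definitions above.

```python
from enum import Enum


class RecordKind(Enum):
    """The four HLESS record kinds (class name and layout are illustrative)."""
    INTENT = "INTENT"
    FACT = "FACT"
    OBSERVATION = "OBSERVATION"
    DERIVATION = "DERIVATION"


# Key property of each kind, paraphrased from the definitions above.
KEY_PROPERTY = {
    RecordKind.INTENT: "does not imply success",
    RecordKind.FACT: "cannot be retracted",
    RecordKind.OBSERVATION: "true only in the sense that it was observed",
    RecordKind.DERIVATION: "always rebuildable from sources",
}


def classify(label: str) -> RecordKind:
    """Parse a record_kind label, refusing anything outside the four kinds."""
    try:
        return RecordKind(label)
    except ValueError:
        allowed = [kind.value for kind in RecordKind]
        raise ValueError(
            f"unknown record kind {label!r}; expected one of {allowed}"
        ) from None
```

Because the enumeration is closed, a label such as "EVENT" fails at parse time rather than flowing into later processing.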
Why This Classification Matters¶
Replay Safety¶
When you replay a log, what happens?
- INTENT records: Should trigger processing again (they're requests)
- FACT records: Should NOT trigger side effects (the fact already happened)
- OBSERVATION records: May be duplicated, so they require idempotency handling
- DERIVATION records: Can be deleted and rebuilt from sources
Without explicit classification, replay is dangerous.
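The replay behaviour listed above can be written as a lookup keyed by record kind. This is a sketch under assumptions: the policy labels and the function name `replay_action` are hypothetical and only restate the four bullets.

```python
# Replay behaviour per record kind, restating the list above.
# The policy labels on the right are illustrative names, not HLESS terms.
REPLAY_POLICY = {
    "INTENT": "reprocess",                   # requests: process again
    "FACT": "apply_without_side_effects",    # the fact already happened
    "OBSERVATION": "deduplicate_then_apply", # may arrive more than once
    "DERIVATION": "rebuild_from_sources",    # can be deleted and recomputed
}


def replay_action(record_kind: str) -> str:
    """Return the replay policy for a kind, failing on unclassified records."""
    try:
        return REPLAY_POLICY[record_kind]
    except KeyError:
        raise ValueError(
            f"unclassified record kind {record_kind!r}: replay is unsafe"
        ) from None
```

A consumer that asks `replay_action` before handling each record cannot silently re-trigger side effects for a FACT, and it fails loudly on anything unclassified.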
Ordering Guarantees¶
HLESS requires explicit ordering scope:
partition_key: order_id
ordering_scope: per_order
This declares that ordering is guaranteed only within a single order. Cross-order invariants must not assume ordering.
If you write an invariant that assumes global ordering, HLESS will reject it.
Idempotency¶
Each record kind has default idempotency strategies:
| Kind | Default Strategy |
|---|---|
| INTENT | Deterministic ID from content |
| FACT | Hash of stream + key + payload |
| OBSERVATION | Time-windowed deduplication |
| DERIVATION | Hash of source records + function |
These can be overridden, but they must be explicit.
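As an illustration of the FACT default (hash of stream + key + payload), the sketch below computes a deterministic ID. The choice of SHA-256, the JSON canonicalisation, and the function name `fact_record_id` are assumptions made for the example. HLESS as described here does not prescribe them.

```python
import hashlib
import json


def fact_record_id(stream: str, key: str, payload: dict) -> str:
    """Derive a deterministic ID from stream + key + payload.

    Assumption: the payload is JSON-serialisable. Keys are sorted so that
    two payloads with the same content always hash to the same ID.
    """
    canonical = json.dumps(
        [stream, key, payload], sort_keys=True, separators=(",", ":")
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Same content in a different key order yields the same ID.
id_a = fact_record_id("orders.fact.v1", "order-1", {"total": 10, "currency": "EUR"})
id_b = fact_record_id("orders.fact.v1", "order-1", {"currency": "EUR", "total": 10})
```

Sorting the keys matters: without a canonical form, two producers serialising the same payload differently would generate different IDs and defeat deduplication.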
The Human-LLM Translation Problem¶
HLESS was designed for a specific workflow:
- A human says something imprecise: "when an order is placed"
- An LLM translates this into a formal specification
- The specification is validated against semantic rules
- Only then is code or infrastructure generated
The risk is that an LLM will guess what the human meant. HLESS prevents this by:
- Banning ambiguous terminology - Forces explicit RecordKind
- Requiring structural declarations - StreamSpec must be complete
- Enforcing semantic rules - Violations block generation
Example: Imprecise Human Input¶
Human says: "I need an event for when orders are created"
Without HLESS, an LLM might generate:
This tells us nothing about semantics. Is it a request? A fact? Can it be replayed?
With HLESS, the LLM must first clarify:
- Is this describing a request to create an order? → INTENT
- Is this describing that an order was created? → FACT
- Is this an observation from an external system? → OBSERVATION
Then it generates:
name: orders.fact.v1
record_kind: FACT
schemas:
  - name: OrderPlaced
    version: v1
partition_key: order_id
ordering_scope: per_order
time_semantics:
  t_event_field: placed_at
idempotency:
  strategy_type: deterministic_id
  field: record_id
invariants:
  - "OrderPlaced is irreversible"
  - "order_id is unique within this stream"
The specification is complete before any code is written.
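A completeness check of this kind might look like the sketch below. The list of required fields is an assumption taken from the top-level keys of the example above, and `missing_fields` is a hypothetical helper, not part of HLESS.

```python
# Fields assumed to be required, taken from the example StreamSpec above.
REQUIRED_FIELDS = (
    "name",
    "record_kind",
    "schemas",
    "partition_key",
    "ordering_scope",
    "time_semantics",
    "idempotency",
)


def missing_fields(spec: dict) -> list:
    """Return the required fields that are absent or empty, in declared order."""
    empty_values = (None, "", [], {})
    return [
        field
        for field in REQUIRED_FIELDS
        if field not in spec or spec[field] in empty_values
    ]


complete_spec = {
    "name": "orders.fact.v1",
    "record_kind": "FACT",
    "schemas": [{"name": "OrderPlaced", "version": "v1"}],
    "partition_key": "order_id",
    "ordering_scope": "per_order",
    "time_semantics": {"t_event_field": "placed_at"},
    "idempotency": {"strategy_type": "deterministic_id", "field": "record_id"},
}
```

Generation would proceed only when `missing_fields(spec)` returns an empty list, which keeps an LLM from filling gaps with guesses.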
Technology Independence¶
HLESS deliberately avoids infrastructure-specific terminology.
What HLESS Does NOT Define¶
- How records are stored (Kafka, Pulsar, Redpanda, files)
- How records are partitioned (implementation detail)
- How consumers are grouped (implementation detail)
- Wire format (Avro, Protobuf, JSON)
What HLESS Does Define¶
- What kind of truth each record represents
- What ordering guarantees exist and at what scope
- How duplicates should be detected
- What the contract is between intent and outcome
This separation means you can:
1. Start with an in-memory log for development
2. Move to Kafka for production
3. Switch to Pulsar without changing semantics
The StreamSpec remains the same. Only the mapping changes.
The Semantic Validation Rules¶
HLESS enforces six core rules. These are not advisory: violations block processing.
Rule 1: FACT Streams Must Not Contain Imperatives¶
OrderPlaced ✓ (describes completed truth)
CreateOrder ✗ (imperative command)
PlaceOrder ✗ (imperative command)
Rule 2: INTENT Streams Must Not Imply Success¶
OrderPlacementRequested ✓ (describes attempt)
OrderPlaced ✗ (implies completion)
OrderCreated ✗ (implies success)
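Rules 1 and 2 are naming rules, so a rough check can be sketched as string heuristics. Everything here is an assumption for illustration: the verb list, the accepted suffixes, and the function names are hypothetical, and a real validator would need a much richer vocabulary.

```python
import re

# Hypothetical, deliberately small vocabularies for the sketch.
IMPERATIVE_VERBS = {"Create", "Place", "Update", "Delete", "Cancel", "Send"}
INTENT_SUFFIXES = ("Requested", "Attempted")


def first_word(name: str) -> str:
    """Return the first CamelCase word of a record name."""
    words = re.findall(r"[A-Z][a-z0-9]*", name)
    return words[0] if words else name


def fact_name_ok(name: str) -> bool:
    """Rule 1: a FACT name must not start with an imperative verb."""
    return first_word(name) not in IMPERATIVE_VERBS


def intent_name_ok(name: str) -> bool:
    """Rule 2: an INTENT name must describe an attempt, not a completion."""
    return name.endswith(INTENT_SUFFIXES)
```

With these heuristics, `fact_name_ok` accepts OrderPlaced and rejects CreateOrder and PlaceOrder, while `intent_name_ok` accepts OrderPlacementRequested and rejects OrderPlaced and OrderCreated, matching the examples above.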
Rule 3: DERIVATION Streams Must Reference Sources¶
Every DERIVATION must declare what it derives from:
lineage:
  source_streams: [order_facts, payment_facts]
  derivation_type: aggregate
  rebuild_strategy: full_replay
Without this, the derivation cannot be rebuilt.
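A check for Rule 3 could be as small as the sketch below. It assumes the spec has been loaded into a dictionary with the field names shown above. The function name `lineage_errors` is hypothetical, and only the presence of source streams is verified.

```python
def lineage_errors(spec: dict) -> list:
    """Return Rule 3 violations for a stream spec (empty list if none)."""
    if spec.get("record_kind") != "DERIVATION":
        return []  # Rule 3 only applies to DERIVATION streams.
    lineage = spec.get("lineage") or {}
    if not lineage.get("source_streams"):
        return ["DERIVATION must declare lineage.source_streams"]
    return []


derived = {
    "record_kind": "DERIVATION",
    "lineage": {
        "source_streams": ["order_facts", "payment_facts"],
        "derivation_type": "aggregate",
        "rebuild_strategy": "full_replay",
    },
}
```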
Rule 4: OBSERVATION Streams Must Not Assert Correctness¶
"TemperatureWasRecorded" ✓ (observation)
"TemperatureIsAccurate" ✗ (asserts truth)
"ValueIsGuaranteedCorrect" ✗ (asserts truth)
Observations may be late, duplicated, or wrong. The only truth is that they were observed.
Rule 5: Ordering Invariants Must Match Partition Key¶
If your invariant says "OrderShipped always follows OrderPlaced", this only holds within a single partition.
HLESS requires:
partition_key: order_id
ordering_scope: per_order
invariants:
  - "Within each order: OrderShipped follows OrderPlaced"
Cross-partition ordering invariants are rejected unless you explicitly declare cross_partition: true.
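Rule 5 can be sketched as a small check. The invariants in this document are free-text strings, so the structured form below (an invariant tagged with the scope it assumes) is purely an assumption for illustration, as are the function name and the "global" scope label.

```python
def ordering_invariant_allowed(
    invariant_scope: str,
    ordering_scope: str,
    cross_partition: bool = False,
) -> bool:
    """Return True if an ordering invariant may be accepted.

    invariant_scope: scope the invariant assumes, e.g. "per_order" or "global"
    ordering_scope:  scope the stream actually declares
    cross_partition: explicit opt-in for invariants that span partitions
    """
    if invariant_scope == ordering_scope:
        return True
    # Anything outside the declared scope needs the explicit opt-in.
    return cross_partition
```

An invariant scoped to `per_order` passes against a `per_order` stream, while a `global` one is rejected unless `cross_partition=True` is passed.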
Rule 6: FACT Records Must Be True Forever¶
If you write a FACT, it cannot be retracted. This is fundamental to event-first architecture.
The Three Time Axes¶
HLESS requires distinguishing between three timestamps:
| Timestamp | Meaning |
|---|---|
| t_event | When the thing happened in the real world |
| t_log | When the record was appended to the log |
| t_process | When the record was processed (derivations only) |
Why this matters:
- A sensor reading (t_event: 14:00) might be logged later (t_log: 14:05) due to network delay
- Replaying the log doesn't change t_event but does change t_process
- Windowed aggregations must know which time axis to use
HLESS bans the unqualified word "time" to prevent confusion.
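The difference between the three axes is easiest to see on a single record. The sketch below uses the sensor example above. The `Timestamps` class and `mark_processed` helper are hypothetical names, and the dates are made up for the illustration.

```python
from dataclasses import dataclass, replace
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class Timestamps:
    t_event: datetime                     # when it happened in the real world
    t_log: datetime                       # when it was appended to the log
    t_process: Optional[datetime] = None  # when it was processed


def mark_processed(ts: Timestamps, now: datetime) -> Timestamps:
    """Return a copy with t_process set; t_event and t_log never change."""
    return replace(ts, t_process=now)


# Sensor reading taken at 14:00 but logged at 14:05 (network delay).
reading = Timestamps(
    t_event=datetime(2024, 1, 1, 14, 0, tzinfo=timezone.utc),
    t_log=datetime(2024, 1, 1, 14, 5, tzinfo=timezone.utc),
)
first_run = mark_processed(reading, datetime(2024, 1, 1, 14, 6, tzinfo=timezone.utc))
replayed = mark_processed(reading, datetime(2024, 1, 2, 9, 0, tzinfo=timezone.utc))
```

Here `first_run` and `replayed` share the same `t_event` and `t_log` but differ in `t_process`, which is why a windowed aggregation keyed on `t_process` would give different results on replay.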
The Intent-Outcome Contract¶
INTENT streams must declare their expected outcomes:
record_kind: INTENT
expected_outcomes:
  success:
    emits: [OrderPlaced]
    target_stream: orders.fact.v1
  failure:
    emits: [OrderPlacementRejected]
    target_stream: orders.fact.v1
This establishes the contract:
- "If you send OrderPlacementRequested, you will eventually get OrderPlaced or OrderPlacementRejected"
- The outcomes are FACT records (permanent truth)
- They go to a specific stream with known semantics
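One way to use such a contract is to check each emitted record against it. The following sketch assumes the expected_outcomes block has been loaded into a dictionary. The helper name `outcome_branch` is hypothetical.

```python
# The expected_outcomes declaration above, as a plain dictionary.
EXPECTED_OUTCOMES = {
    "success": {"emits": ["OrderPlaced"], "target_stream": "orders.fact.v1"},
    "failure": {"emits": ["OrderPlacementRejected"], "target_stream": "orders.fact.v1"},
}


def outcome_branch(contract: dict, record_name: str, stream: str):
    """Return "success" or "failure" if the record satisfies the contract.

    Returns None when the record name or its stream is not a declared outcome.
    """
    for branch, outcome in contract.items():
        if record_name in outcome["emits"] and stream == outcome["target_stream"]:
            return branch
    return None
```

A record that matches neither branch, or that lands on an undeclared stream, returns `None` and can be flagged as a contract violation.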
Why Idempotency Is Mandatory¶
Every stream must declare how duplicates are detected:
idempotency:
  strategy_type: deterministic_id
  field: record_id
  derivation: "hash(stream, natural_key, t_event, payload)"
This is not optional because:
1. Networks lose messages and retry
2. Producers crash and restart
3. Consumers fail and replay from checkpoints
Without explicit idempotency, you get duplicate processing or data loss.
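On the consuming side, a declared record_id makes duplicate detection a simple set lookup. This is a minimal in-memory sketch. The `deduplicate` function is hypothetical, and a production consumer would need a durable store for the seen IDs.

```python
from typing import Iterable, Iterator, Optional, Set


def deduplicate(
    records: Iterable[dict], seen: Optional[Set[str]] = None
) -> Iterator[dict]:
    """Yield each record once, keyed on its record_id field.

    Passing in a shared `seen` set lets the filter span several batches,
    for example across a retry or a replay from a checkpoint.
    """
    seen = set() if seen is None else seen
    for record in records:
        record_id = record["record_id"]
        if record_id in seen:
            continue  # already processed: skip the duplicate
        seen.add(record_id)
        yield record


# A retry delivers record "a1" twice.
delivered = [
    {"record_id": "a1", "name": "OrderPlaced"},
    {"record_id": "a1", "name": "OrderPlaced"},
    {"record_id": "b2", "name": "OrderPlaced"},
]
unique = list(deduplicate(delivered))
```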
HLESS Modes¶
Three enforcement modes exist:
| Mode | Behaviour |
|---|---|
| strict | Violations are errors. Default for new projects. |
| warn | Violations are warnings. For migration. |
| off | No enforcement. Strongly discouraged. |
Strict mode is recommended. It prevents problems before they manifest.
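The three modes can be pictured as one function that decides what to do with a list of violations. This is a sketch under assumptions: `SemanticViolation` and `enforce` are hypothetical names, and Python's standard `warnings` module stands in for whatever reporting HLESS actually uses.

```python
import warnings
from typing import List


class SemanticViolation(Exception):
    """Raised in strict mode when a spec breaks a semantic rule."""


def enforce(violations: List[str], mode: str = "strict") -> None:
    """Apply the enforcement mode to a list of violation messages."""
    if mode not in ("strict", "warn", "off"):
        raise ValueError(f"unknown mode {mode!r}")
    if mode == "off" or not violations:
        return  # nothing to enforce
    if mode == "strict":
        raise SemanticViolation("; ".join(violations))
    # warn mode: report every violation but let processing continue
    for violation in violations:
        warnings.warn(violation)
```

Defaulting `mode` to "strict" mirrors the table above: a project has to opt out of errors explicitly.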
Summary¶
HLESS exists because:
- Human language is imprecise - "event" means different things to different people
- Kafka's vocabulary is not semantic - It describes implementation, not meaning
- Semantic errors compound over time - Wrong assumptions at design time become production incidents years later
By requiring explicit classification (INTENT, FACT, OBSERVATION, DERIVATION), explicit time semantics, and explicit idempotency, HLESS ensures that event-first systems remain understandable, replayable, and safe.
See Also¶
- Event Semantics - Quick reference for record kinds
- Messaging Reference - DSL syntax for channels
- Architecture Overview - System architecture