HLESS Deep Dive¶

High-Level Event Semantics Specification

This document explains why HLESS exists, the problems it solves, and its philosophical foundations.

The Problem¶

Event-driven architectures fail in three predictable ways:

Semantic drift - Over time, "events" accumulate meanings that weren't intended
Vocabulary lock-in - Teams inherit Kafka's terminology and conflate abstraction layers
Human imprecision - Natural language descriptions become ambiguous record definitions

HLESS addresses all three by introducing a semantic layer between human intent and implementation.

Why "Event" Is a Forbidden Word¶

The word "event" is banned in HLESS strict mode. This is deliberate.

The Ambiguity Problem¶

When someone says "the OrderCreated event", they might mean:

A request to create an order (which might be rejected)
A fact that an order now exists (irreversible)
An observation that something happened in an external system
A derivation computed from other data

These are fundamentally different things with different semantics, ordering guarantees, and replay behaviours. Calling them all "events" creates bugs that manifest years later when someone replays the log and discovers their assumptions were wrong.

Kafka's Vocabulary Is Not Neutral¶

Kafka documentation uses terms like "event", "message", "record", and "topic" interchangeably. This vocabulary has leaked into how teams think about event-driven systems.

But Kafka is an implementation. Its vocabulary describes: - How data is stored (partitions, offsets) - How data is transmitted (producers, consumers) - How data is organised (topics, consumer groups)

None of this tells you what the data means.

HLESS exists precisely because implementation vocabulary ("Kafka topic", "event stream") tells you nothing about: - Whether records represent completed facts or pending requests - Whether replay should trigger side effects - Whether ordering matters and at what scope

The Four Record Kinds¶

HLESS requires every record to be classified as exactly one of four kinds. This is not optional.

INTENT¶

A fact that someone requested or attempted something.

OrderPlacementRequested    (valid - describes attempt)
PlaceOrder                 (invalid - imperative, not intent)

Key property: Does NOT imply success. May lead to acceptance or rejection.

FACT¶

A fact about the domain that is permanently true.

OrderPlaced                (valid - irreversible truth)
OrderPending               (invalid - temporary state, not fact)

Key property: Cannot be retracted. Must remain true forever.

OBSERVATION¶

A fact that something was observed, measured, or reported.

TemperatureMeasured        (valid - observation was recorded)
TemperatureIs25Degrees     (invalid - asserts correctness)

Key property: Truth is "this was observed", not "this is correct". May be duplicated, late, or out of order.

DERIVATION¶

A fact that a value was computed from other records.

DailyRevenueCalculated     (valid - computed from source records)

Key property: Always rebuildable from sources. Must not introduce new domain truth.

Why This Classification Matters¶

Replay Safety¶

When you replay a log, what happens?

INTENT records: Should trigger processing again (they're requests)
FACT records: Should NOT trigger side effects (the fact already happened)
OBSERVATION records: May duplicate, requires idempotency handling
DERIVATION records: Can be deleted and rebuilt from sources

Without explicit classification, replay is dangerous.

Ordering Guarantees¶

HLESS requires explicit ordering scope:

stream orders:
  record_kind: FACT
  partition_key: order_id
  ordering_scope: per_order

This declares that ordering is guaranteed only within a single order. Cross-order invariants must not assume ordering.

If you write an invariant that assumes global ordering, HLESS will reject it.

Idempotency¶

Each record kind has default idempotency strategies:

Kind	Default Strategy
INTENT	Deterministic ID from content
FACT	Hash of stream + key + payload
OBSERVATION	Time-windowed deduplication
DERIVATION	Hash of source records + function

These can be overridden, but they must be explicit.

The Human-LLM Translation Problem¶

HLESS was designed for a specific workflow:

A human says something imprecise: "when an order is placed"
An LLM translates this into a formal specification
The specification is validated against semantic rules
Only then is code or infrastructure generated

The risk is that an LLM will guess what the human meant. HLESS prevents this by:

Banning ambiguous terminology - Forces explicit RecordKind
Requiring structural declarations - StreamSpec must be complete
Enforcing semantic rules - Violations block generation

Example: Imprecise Human Input¶

Human says: "I need an event for when orders are created"

Without HLESS, an LLM might generate:

topic: order-events
schema: OrderCreated

This tells us nothing about semantics. Is it a request? A fact? Can it be replayed?

With HLESS, the LLM must first clarify:

Is this describing a request to create an order? → INTENT
Is this describing that an order was created? → FACT
Is this an observation from an external system? → OBSERVATION

Then it generates:

name: orders.fact.v1
record_kind: FACT
schemas:
  - name: OrderPlaced
    version: v1
partition_key: order_id
ordering_scope: per_order
time_semantics:
  t_event_field: placed_at
idempotency:
  strategy_type: deterministic_id
  field: record_id
invariants:
  - "OrderPlaced is irreversible"
  - "order_id is unique within this stream"

The specification is complete before any code is written.

Technology Independence¶

HLESS deliberately avoids infrastructure-specific terminology.

What HLESS Does NOT Define¶

How records are stored (Kafka, Pulsar, Redpanda, files)
How records are partitioned (implementation detail)
How consumers are grouped (implementation detail)
Wire format (Avro, Protobuf, JSON)

What HLESS Does Define¶

What kind of truth each record represents
What ordering guarantees exist and at what scope
How duplicates should be detected
What the contract is between intent and outcome

This separation means you can: 1. Start with an in-memory log for development 2. Move to Kafka for production 3. Switch to Pulsar without changing semantics

The StreamSpec remains the same. Only the mapping changes.

The Semantic Validation Rules¶

HLESS enforces six core rules. These are not advisory—violations block processing.

Rule 1: FACT Streams Must Not Contain Imperatives¶

OrderPlaced         ✓ (describes completed truth)
CreateOrder         ✗ (imperative command)
PlaceOrder          ✗ (imperative command)

Rule 2: INTENT Streams Must Not Imply Success¶

OrderPlacementRequested    ✓ (describes attempt)
OrderPlaced                ✗ (implies completion)
OrderCreated               ✗ (implies success)

Rule 3: DERIVATION Streams Must Reference Sources¶

Every DERIVATION must declare what it derives from:

lineage:
  source_streams: [order_facts, payment_facts]
  derivation_type: aggregate
  rebuild_strategy: full_replay

Without this, the derivation cannot be rebuilt.

Rule 4: OBSERVATION Streams Must Not Assert Correctness¶

"TemperatureWasRecorded"              ✓ (observation)
"TemperatureIsAccurate"               ✗ (asserts truth)
"ValueIsGuaranteedCorrect"            ✗ (asserts truth)

Observations may be late, duplicated, or wrong. The only truth is that they were observed.

Rule 5: Ordering Invariants Must Match Partition Key¶

If your invariant says "OrderShipped always follows OrderPlaced", this only holds within a single partition.

HLESS requires:

partition_key: order_id
ordering_scope: per_order
invariants:
  - "Within each order: OrderShipped follows OrderPlaced"

Cross-partition ordering invariants are rejected unless you explicitly declare cross_partition: true.

Rule 6: FACT Records Must Be True Forever¶

If you write a FACT, it cannot be retracted. This is fundamental to event-first architecture.

OrderPlaced          ✓ (permanent truth)
OrderPending         ✗ (temporary state)
ProvisionalInvoice   ✗ (may change)

The Three Time Axes¶

HLESS requires distinguishing between three timestamps:

Timestamp	Meaning
`t_event`	When the thing happened in the real world
`t_log`	When the record was appended to the log
`t_process`	When the record was processed (derivations only)

Why this matters:

A sensor reading (t_event: 14:00) might be logged later (t_log: 14:05) due to network delay
Replaying the log doesn't change t_event but does change t_process
Windowed aggregations must know which time axis to use

HLESS bans the unqualified word "time" to prevent confusion.

The Intent-Outcome Contract¶

INTENT streams must declare their expected outcomes:

record_kind: INTENT
expected_outcomes:
  success:
    emits: [OrderPlaced]
    target_stream: orders.fact.v1
  failure:
    emits: [OrderPlacementRejected]
    target_stream: orders.fact.v1

This establishes the contract: - "If you send OrderPlacementRequested, you will eventually get OrderPlaced or OrderPlacementRejected" - The outcomes are FACT records (permanent truth) - They go to a specific stream with known semantics

Why Idempotency Is Mandatory¶

Every stream must declare how duplicates are detected:

idempotency:
  strategy_type: deterministic_id
  field: record_id
  derivation: "hash(stream, natural_key, t_event, payload)"

This is not optional because: 1. Networks lose messages and retry 2. Producers crash and restart 3. Consumers fail and replay from checkpoints

Without explicit idempotency, you get duplicate processing or data loss.

HLESS Modes¶

Three enforcement modes exist:

Mode	Behaviour
`strict`	Violations are errors. Default for new projects.
`warn`	Violations are warnings. For migration.
`off`	No enforcement. Strongly discouraged.

Strict mode is recommended. It prevents problems before they manifest.

Summary¶

HLESS exists because:

Human language is imprecise - "event" means different things to different people
Kafka's vocabulary is not semantic - It describes implementation, not meaning
Semantic errors compound over time - Wrong assumptions at design time become production incidents years later

By requiring explicit classification (INTENT, FACT, OBSERVATION, DERIVATION), explicit time semantics, and explicit idempotency, HLESS ensures that event-first systems remain understandable, replayable, and safe.