The day after the pivot, the work was to make Path B real: a baseline runtime in which a single language model is the organism, and a substrate around it that holds a body, a world, mortality, and memory.
The architectural line we will not move
One principle governs the whole build, and it is the thing that keeps the work honest:
The substrate classifies. The language model interprets.
The substrate’s job is to detect events, accumulate state, and emit structured, classified labels: energy level, body integrity, what lies in each direction, what it remembers about a place. The model’s job is to read those labels and author the first-person interpretation, the voice, the choice of action. The substrate is never allowed to write the organism’s inner monologue for it. Where earlier code blurred that line, we treat it as a defect and refactor it out.
This matters because it is the difference between a system that appears to narrate an inner life and one where every piece of first-person language is traceably the model’s own response to a labelled world-state. Only the second kind can survive a hostile reading.
How the world reaches the organism
The open design question on 2026-05-21 was how those labels should be rendered into the model’s view. Two honest options:
- Pure labels: the model reads something like
energy:well_fed integrity:unhurtand does all the interpretation itself. - Phrase templates: the substrate selects fixed, classification-indexed phrasings (“you feel well-fed, and you are unhurt”) and the model interprets those.
We built the sensory-framing layer to emit the second form for now, while being explicit that this is the borderline case of our own principle. A phrase template is still substrate-side classification, not substrate-authored prose, but it is not the cleanest expression of “the substrate only classifies.”
We have a working sensory-framing layer and a baseline runtime in which the organism perceives its world only through classified channels.
Not: we are not yet claiming the phrase-template rendering is the right one. Whether feel-shaped phrasing beats pure-label emission is a question we have written down as a controlled A/B experiment, with verbatim per-arm outputs and a refactor decision gated on the result. Until that experiment runs, we ship the version we can test, not the version we prefer.
Why we did not just pick one
It would have been faster to decide by taste. We deliberately did not. The choice of sensory rendering plausibly shapes everything downstream (how rich the voice is, how tightly behaviour couples to body-state), so it is exactly the kind of decision that should be settled by a matched comparison rather than an aesthetic preference. Writing the test down first, before refactoring in either direction, is the discipline; the answer comes later in this notebook.
What exists at the end of this day is modest and real: an organism that has a body it can be hurt in, a world it can move through, and a sensory channel that is the only way the world gets in. That channel is now the seam we will keep stress-testing.