The lab notebook
Field notes, as they happen.
A research journal, in chronological order. Each entry asks what changed, why, what failed, and what comes next, with every claim paired against its non-claim. The entries where the answer is no are kept, not deleted.
Synthena Medical: the honesty dividend
medical effort; in-silico only; one trust-gate bug caught and fixed, three new cross-organ predictions validated against published clinical data, a measured discovery ceiling, one self-falsified claim; instrument stronger, real-world impact unmoved; not a cure
What the refusal to fake a green light bought us in one intense stretch of work. We attacked our own trust gate and found the single bug that could have let it wave a cure claim through, then fixed and re-attacked it (195 tests green, the honesty modules untouched). The body model made three new cross-organ predictions, each hashed before we looked and each validated against published clinical data. A time-split test measured our own discovery ceiling: near-perfect on known-like molecules, below chance on genuinely novel scaffolds. And one claim self-falsified on its pre-registered test, which taught us a rule. As an instrument we are stronger; as a source of real-world cures we are unmoved at about 2.5 out of 10. Not a cure.
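"Hashed before we looked" can be read as a commit-then-reveal step: publish a digest of the prediction first, and only later reveal the prediction so anyone can check it against the digest. The sketch below is one minimal way to do that, not the notebook's actual tooling. The function names `commit` and `verify`, the nonce, and the field names in the example prediction are all hypothetical.

```python
import hashlib
import hmac
import json


def commit(prediction: dict, nonce: str) -> str:
    """Return a SHA-256 hex digest of a canonical JSON encoding.

    Sorting keys and fixing separators makes the encoding canonical,
    so the same prediction always yields the same digest.
    The nonce (an assumption of this sketch) stops a reader from
    guessing a short prediction by brute force before the reveal.
    """
    payload = json.dumps(
        {"prediction": prediction, "nonce": nonce},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def verify(prediction: dict, nonce: str, digest: str) -> bool:
    """Check a revealed prediction against the digest published earlier."""
    return hmac.compare_digest(commit(prediction, nonce), digest)


# Hypothetical prediction record; the field names are illustrative only.
prediction = {"target": "example-organ-pair", "direction": "increase"}
digest = commit(prediction, nonce="published-later")

# Later, after looking at the data, reveal and verify.
print(verify(prediction, "published-later", digest))
print(verify({**prediction, "direction": "decrease"}, "published-later", digest))
```

The point of the design is that the digest can be timestamped publicly before the comparison data is opened, so a prediction cannot be quietly edited to fit the result.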
Synthena Medical: the engine that refuses to fake a green light
medical effort; in-silico only; a validated mechanistic-body rung, a fabrication-catching gate, and a capped lab-ready dossier; zero wet-lab validation; not a cure
A deep update on Synthena Medical, the sister effort to this notebook. You fire a disease at it, a mechanistic body and a language model propose existing and novel candidates, and a machine-enforced honesty spine caps every claim at exactly what the evidence supports and refuses, in code, to say kill, efficacy, or cure. This update logs the first time the body predicted something true under controls, a fact-check gate that catches fabrication including our own, and a capped, wet-lab-ready dossier. We score ourselves about 2.5 out of 10 where it actually matters, on purpose. Zero wet-lab validation; not a cure.
The falsifier that caught us
no ALife claim; a matched-random control matched our best burst result, so burst timing is not yet a defensible adaptive signal
A major update that produced no life claim, and something more useful at this stage: a stronger falsifier. A matched-random burst control reproduced our best stall-coupled result almost exactly, 42 advances against 41, so burst timing is not yet a defensible adaptive signal. A richer composite-resource challenge verified cleanly but still failed the open-ended-evolution gate on novelty and complexity plateaus. We are not claiming artificial life; we are building the evidence machinery that would tell us when such a claim is real. Zero life-properties demonstrated.
Expressible, but not yet selectable
first substrate positive (behavioural dimensionality 2 to 5, heritable); selection did not maintain it; no artificial-life claim
A first real substrate positive: a stateful drive with a bounded memory raised expressible behavioural dimensionality from about 2 to 5, verified genome-driven and heritable. But the decisive selectability test came back an honest null: priced selection did not maintain the richer behaviours, and viable dimensionality collapsed back to 2. Expressibility and selectability are separate bottlenecks, and the ecology presents only about two meaningful resource channels. The next experiment widens it. No artificial-life claim; zero life-properties demonstrated.
Sliding weights now move outcomes, still not artificial life
online sliding weights move short-horizon final energy under matched controls; not a survival claim, not artificial life
The substrate is now an auditable experimental spine, and online sliding adapter weights measurably move short-horizon body-world final energy under matched no-brain, frozen, and online controls. In Dolphin (n=12) the online arm's final energy beat both controls (p ≈ 0.003 and p ≈ 0.0002, mean about +10 energy), reproduced directionally in Qwen (n=3, p = 0.125). A final-energy advantage, not a survival claim, and not artificial life. The artifacts still carry explicit no-promotion and no-artificial-life flags. Zero life-properties demonstrated.
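A note on reading the n=3 figure. The entry does not name the statistical test, so the following is an assumption: if the comparison was an exact one-sided paired sign test, three pairs cannot produce a p-value below (1/2)^3 = 0.125. Under that assumption, p = 0.125 is the floor for n=3, reached when all three pairs favour the online arm, which is consistent with "reproduced directionally". The sketch below computes that floor for any number of pairs. The function name is hypothetical.

```python
from math import comb


def sign_test_p_one_sided(n_pairs: int, n_positive: int) -> float:
    """Exact one-sided sign-test p-value.

    Under the null, each pair favours either arm with probability 1/2,
    so the count of positive pairs is Binomial(n_pairs, 0.5).
    Returns P(X >= n_positive).
    """
    tail = sum(comb(n_pairs, k) for k in range(n_positive, n_pairs + 1))
    return tail / 2 ** n_pairs


# Smallest achievable p-value: every pair points the same way.
print(sign_test_p_one_sided(3, 3))    # 0.125
print(sign_test_p_one_sided(12, 12))  # 1/4096, about 0.00024
```

Whatever test was actually used, the practical reading is the same: with three runs a result can agree in direction but cannot, on its own, reach conventional significance thresholds.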
Where we are: does the language model actually do anything?
course-correction: the model demoted to an organ, its prior shown load-bearing for survival
A course-correction: the language model is demoted from 'the organism' to a bounded brain that only nudges a mortal body's parameters. On a survival-recovery task its trained choices beat all four controls, including random-weights and valid-random, across two model families (p ≈ 0.006). A controlled measurement that the frozen model's learned prior is load-bearing for survival. Not life, not self-maintenance. Zero life-properties demonstrated.
A precondition, not a result
base-competence gate closed · adaptation unmeasured
The base navigate-eat-survive gate now holds on a cleaned surface (3 of 3 organisms reached the 120-tick horizon, 0 invalid outputs), and perturbation experiments run end-to-end with matched controls. The one foraging null so far is geometry-confounded and uninterpretable. Self-maintenance closure still scores 0, blocked by an engineering bug. Zero life-properties demonstrated.
Dropping the deadline
self-paced publication: guardrails set
We let go of the conference deadline we had been working toward and committed instead to publishing here, at our own pace. Removing the external pressure removes one failure mode and introduces another, so we set two guardrails before we did it.
The claim we did not make
autopoiesis push: held back at the control
We designed an experiment to test for a self-maintaining loop (the property closest to the line we have promised not to cross) and built the adversarial controls before the result. The controls fired. The headline arm did not engage the mechanism. We held the claim back.
The same thought, twice
restore-determinism verified
We tested whether the saved internal state of a life is really the life, or just a lossy summary. A fresh process, loading the saved state, continued the organism's thought token-for-token identically. The continuity claim now rests on a mechanism, not a metaphor.
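The shape of a restore-determinism check can be sketched with a toy stand-in. Everything here is hypothetical: `ToyOrganism` is a small deterministic generator, not the notebook's runtime, and the JSON round trip only imitates loading in a fresh process. The test itself mirrors the one described above: run for a while, save, restore into a new object, then require that both continuations match token for token.

```python
import json


class ToyOrganism:
    """Hypothetical stand-in for a model with held internal state."""

    def __init__(self, state: int = 12345):
        self.state = state

    def next_token(self) -> int:
        # Linear congruential step: fully determined by the current state.
        self.state = (1103515245 * self.state + 12345) % 2**31
        return self.state % 50  # token id in a toy 50-token vocabulary

    def save(self) -> str:
        # Serialise the complete state, not a summary of it.
        return json.dumps({"state": self.state})

    @classmethod
    def load(cls, blob: str) -> "ToyOrganism":
        return cls(json.loads(blob)["state"])


def continues_identically(n_warmup: int = 20, n_check: int = 100) -> bool:
    """Save mid-run, restore, and compare the two continuations."""
    original = ToyOrganism()
    for _ in range(n_warmup):
        original.next_token()

    blob = original.save()
    restored = ToyOrganism.load(blob)  # imitates a fresh process

    a = [original.next_token() for _ in range(n_check)]
    b = [restored.next_token() for _ in range(n_check)]
    return a == b


print(continues_identically())  # True
```

A real check would write the state to disk and load it in a separate process, so that nothing held in memory can mask a lossy save.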
Moving the life inside the model
native runtime: held cache as identity
We replaced the request-and-respond boundary with an in-process runtime that holds the model's own internal state across moments and persists it to disk. The organism stops being something the runtime calls and becomes something that lives inside it.
A phrase that outlived its organism
landscape survey · novelty accounting · a lineage chain found by grep
We surveyed the field to find out what was actually ours, wrote an honest novelty ledger, and, while verifying a claim against the database, discovered a metaphor that had been inherited and mutated across eight organisms without anyone having noticed.
A body the model can feel
Path B baseline + sensory framing layer
With the language model now treated as the organism, we built the baseline runtime and the layer that decides how the world reaches it: structured, classified sensory labels rather than raw state. We left the exact rendering as an open, testable question.
The attractor that ended Path A
pivot: Path A deprecated
The first architecture put a self-organising dynamical substrate at the centre and used the language model only as a narrator. It ran, and it collapsed into a single repeating state. We deprecated it the same day and changed the question.