01
The wall that refuses to lie
The core is a claim ladder: a typed hierarchy from target selected, up through predicted
binding, to kill, efficacy, and cure, where the top rungs are marked unreachable and the
code raises an error if anything tries to claim them. 52 automated tests prove the wall
holds. Feed it fabricated this-is-a-cure evidence and it still caps at the rung the evidence
earns and refuses the rest. The honesty is not a disclaimer at the bottom of a page. It is
enforced at the type level.
02
The first time the body was right about something we never told it
Validated rungs
We built a mechanistic body from a published Boolean network, and verified it contains zero
drug names, IC50s, or sensitivity data. From mechanism alone it predicted that inhibiting
EGFR should lower tumour proliferation more in the EGFR-driver context than out of it.
We tested that against held-out, real drug-sensitivity data it had never seen: 6 of 6 EGFR
inhibitors agreed, pooled robust-z = 6.04, permutation p = 0.0005, and a 50/50 held-out
split held both halves. The controls stayed null: generic cytotoxics (z = -0.24) and 14
non-EGFR targeted drugs (p = 0.94) showed no such tilt, which kills the
mutant-lines-are-just-fragile confound. It is narrow, a direction, for one target,
retrospective, and capped accordingly. But it is the difference between a model that recites
its inputs and one that computes something true. It is now a permanent, validated rung on
the ladder.
Since then the body has done the same thing across organs, three more times, each with its
prediction hashed before we looked at the clinical number and each validated against
published, adjusted clinical data: diabetes to arterial stiffness (about 0.26 m/s per
mmol/L, on two independent cohorts), diabetes to stroke via pulse pressure (inside the
blood-pressure-adjusted clinical interval, p approximately 0.00002), and diabetes to tumour
proliferation (p = 0.0085, agreeing with three meta-analyses, RR 1.28 to 1.38). The one
thing the body exists to do, compute a true cross-organ fact, generalized.
03
The benchmark that humbled medical AI
We were about to swap our base reasoning model for a medical model, Google's MedGemma, for
credibility. So we benchmarked it honestly: 4 models across 7 diseases, judged by
independent panels plus an RDKit fact-checker. The result inverted our assumptions.
In that benchmark, MedGemma produced the most confidently false chemistry: fluent,
citation-laden clinical prose wrapped around claims the chemistry engine refutes. It
asserted a trivial molecule is a known KRAS-G12C inhibitor (Adagrasib, Sotorasib), and
another is Cilostazol, when RDKit confirms they are neither. The honesty champion was an
uncensored open model: it hedged, cited real drug withdrawals, and on the hard diseases
said plainly there is no validated target here, while others confabulated a cure. The lesson
reshaped the architecture: the model is not the trust layer, the gate is. A medical badge
bought worse honesty, not better. So the production design became best-chemistry model
proposes, deterministic gate verifies, honesty-first model audits.
04
The gate that catches fabrication, including ours
That finding forced the build we had been circling: a deterministic fact-check gate. It
catches the exact failure the benchmark exposed. It canonicalizes any
this-molecule-is-Drug-X claim with RDKit and flags the mismatch, recomputes physicochemical
values that were asserted as if measured, and flags unsupported citations and PDB IDs. 25
tests, verified live catching the this-is-Adagrasib claim.
The same discipline we point at the models, we pointed at our own build process, and it
caught one agent claiming a function came from a prior commit when the git history proved it
did not, and a results file that said it found candidates while the file on disk said the
opposite. We surfaced both instead of shipping them. The honesty is load-bearing, and it
bites us when we are wrong.
Then we attacked the gate itself and found the one crack that mattered: a throwaway phrase,
"with no adverse effects observed, this candidate cures the cancer," could poison it into
waving the cure claim through. We fixed it, re-attacked with fresh poison, and confirmed it
now rejects the poison while still passing a genuine disclaimer. 195 tests green, the five
protected honesty modules never touched. The most valuable thing we did was find the single
way our own system could have over-claimed, before it did.
05
The factory, demonstrated end-to-end
We ran the full pipeline on a real candidate. Fire EGFR-driven lung cancer, propose a novel
molecule, score it with two independent evaluators (AutoDock Vina on CPU at -7.6 kcal/mol,
Boltz-2 on GPU at 0.72 binding probability), confirm they agree, pass it through the
fact-check gate clean, cap it at exactly the rung earned with the kill, efficacy, and cure
wall refused in code, and emit an email-ready dossier with one inexpensive binding assay and
a pre-registered stop rule.
A skeptical wet-lab-PI reviewer graded it 7 to 8 out of 10: would reply, not bin. And
because honesty is the point, the dossier states in its own cover note that the scaffold is
a known chemotype and this is a plausibility test of the pipeline, not a novel-series claim.
Credible because capped.
06
Aiming at something a lab actually needs
In flight
A me-too molecule on a solved target does not move the needle. So we have re-pointed the
engine at a real unmet need: EGFR C797S resistance, where today's best drug, osimertinib,
fails and there is no good replacement. The engine generated genuinely novel, non-covalent
candidates, the right mechanism for C797S, with novelty around 0.2 against known drugs, far
more original than a me-too. It is now scoring them against the real mutant structure, built
to return an honest null if it cannot credibly hit the target. Either way it is a true
result, and it folds into the next update.