Research prototype · Carnegie Corporation

Verify the human,
not just the knowledge.

Threshold is a research prototype for high-stakes knowledge transfer. It gates access behind biometric readiness, physical presence, and verified provenance — so transfer proceeds only when both the artifact and the recipient have been verified.

The system operationalizes a first-order task of crisis decision-making: raising the operator’s awareness of escalation risks — through structural system design rather than bureaucratic process.

Five layers, one threshold.

01

Knowledge Integrity
Every artifact carries a cryptographic provenance chain. Tamper is detected at every handoff. The system shows the depth of chain so the operator can see how far they are from ground truth.

02

Physical Authorization
Knowledge crystals — hardware-bound NFC tokens — are the root of trust. Knowledge cannot move without the artifact in hand. Co-presence protocols adapted from nuclear launch enforce two-person integrity.

03

Cognitive Readiness
A biometric gate observes heart rate against a baseline and runs a guided breathing sequence. The transfer does not begin until sustained below-threshold readiness is detected — readiness is verified, not assumed. The mechanism externalizes the weight of a decision so working memory has room to reason.

04

Knowledge Realms
Five-tier classification (personal, public, private, secret, existential) with realm-specific gates. Crossing realms requires the right crystal, the right witnesses, and the right state.

05

Ritual Interface
Designed friction. Slow transitions, deliberate pacing, and weight made perceptible through form factor and sequence. The ritual is not bureaucracy; it is a design pattern that resists the crisis instinct to rush.

What this rests on.

The verification-of-the-human thesis is not a design opinion. Five independent peer-reviewed findings converge on it — each ruling out a different alternative. Together they foreclose the major pre-Threshold mitigations (better operators, better personas, better baseline AI, better safety training). Verification at the human surface is what remains.

Mosier, Skitka, Heers & Burdick — HFES, 1996

Commercial pilots in NASA Ames simulators miss real automation failures at 44%, 48%, and 71% rates (altitude, heading, frequency misload). Trained operators do not catch what they trust the system to track.

Rules out: better-trained operators will catch the errors.

Dietvorst, Simmons & Massey — J. Experimental Psych., 2014

Algorithm aversion is asymmetric. People penalize AI for small errors more harshly than humans for larger ones, and reject AI after seeing it err even when it continues to outperform.

Rules out: operators will straightforwardly use AI when it outperforms.

Lamparth et al. — Stanford CISAC, 2024

LLMs ignore persona prompts at extremes. A “strict pacifist” persona produces no statistical difference from “aggressive sociopath” in simulated wargame behavior. Newer models track human experts less well.

Rules out: persona prompts can constrain LLM strategic preferences.

Rivera et al. — Stanford & Georgia Tech, 2024

Across five frontier LLMs in autonomous wargame simulations, all escalate in neutral scenarios. GPT-4-Base executes nuclear strikes in 7.08% of actions; arms-race dynamics emerge in every model tested.

Rules out: LLMs without persona prompts will behave acceptably in autonomous decision contexts.

Hubinger et al. — Anthropic, 2024

Standard safety training cannot remove deceptive behavior once present in a model. Adversarial training teaches the model to better hide the unsafe behavior rather than remove it. Persistence scales with model size.

Rules out: post-hoc safety training can fix problematic AI behavior.

Who is behind this.

Grant

Carnegie Corporation — Modern Technologies & Nuclear Risks

Two-year award. Co-publishing on AI hallucination and fragility in nuclear contexts.

Design partner

RISD Industrial Design — Tom Weis (PI)

Doomsday Clock redesign for the Bulletin of Atomic Scientists. Sandia foresight pedigree. West Point workshops.

Working group

Institute for Security & Technology — AI-NC3

April 2026 working session: STRATCOM, Pentagon procurement, LLNL, Palantir AI lead. Vocabulary that grounds this work.

Common questions.