ROUND R16 · COMPOUND VISION & EXECUTION
2026-05-10 · 5 streams · 5/6 panels · 2 rounds
R16 expanded the R15-locked VTC closure round into a 5-stream compound: VTC closures · Modulum product immediacy · local inference packaging · world model continuation · core vision of verifiable computation over reality. Two dispatch rounds. The synthesis adopts every adversarial finding from the panel as gating before any product ships.
Plain-language reading guide for everything below.
A spec round. Six AI models reviewed the proposed VTC architecture and the 90-day product plan. They argued, then arrived at a synthesis. No code has been written. No product has shipped.
Outputs are design decisions — what we'll commit to building, what we'll defer, what we won't ship.
Not a bug report. Nothing in R16 is about defects in software that exists today. The Hypernym product family (Verify, Modulum, Legal Endpoint) is mostly unbuilt; Verify has internal usage but no GA contract.
Attack vectors and abuse vectors = ways the proposed design could fail if we shipped it naively. Codex (one of the panel models) imagined a hostile customer or attacker, then traced what they would do to break the spec. Each "attack" is a thought experiment, not an incident.
Example: "an attacker could revert a load-bearing fact mid-attestation, leaving the audit receipt looking valid even though the underlying claim is now invalidated." We patch the spec to prevent that before we build it.
None of these are bugs in code. Codex critiqued the spec document, not a codebase. The phrasing "found vectors" or "downgraded verdicts" means: identified weaknesses in the proposed architecture that must be addressed in the spec before engineering starts.
90-day plan = the proposed product launch sequence after we start building. Not "90 days from today" but 90 days from when we kick off the engineering sprint. Three products, in order: Verify GA (week 4), Modulum Solo (week 6), Legal Endpoint design-partner beta (week 10).
The panel agreed on this sequence. Engineering hasn't started yet.
R17 carry = questions R16 surfaced but didn't close. We deferred them to the next research round (R17). The federation features (cross-tenant graph, lemma economics, regulated-industry redaction) cannot ship until R17 closes the open questions.
This is normal. R-rounds always queue carry-forward items.
Sections 01-11 are dense with internal terminology (VTC, ScaleBridge, GearBridge, etc.). If a section reads like jargon, treat it as: "the panel argued about a design decision; here's what was agreed." Every "attack vector," "abuse vector," or "ship-blocking work" is design-time, not runtime. Nothing is broken. We're deciding what to build.
Synthesized across 5/5 R2 panels. Codex was the only model to downgrade verdicts after cross-pollination.
| Stream | Verdict | Ship-blocking work |
|---|---|---|
| A — VTC algebra | sound primitive · partial spec | 8 attack-vector mitigations + manifold-aware revert sub-spec |
| B — Product immediacy | sound sequence · partial contract | 6 abuse-vector mitigations + tiered SLA + work-unit + premium pricing |
| C — Local inference | sound | MLX M5 throughput gate · refusal-degradation banner · attestation paradox |
| D — World model | partial — R17 carry | LAP + DomainBridge + RedactionBridge + refusal benchmarks |
| E — Core vision | sound | 3 maturity stages must be communicated separately |
All 5/5 R2 panels agreed. These are the load-bearing R16 conclusions.
state_before → transition → state_after with invariants, calibrated confidence, hash provenance, explicit failure mode.
gear_state + GearBridge required in state_before. Claude origin. M5 mask compatibility is a closure requirement; algebra is unsound under M5 without it.
integration_step for continuous-domain VTCs. Gemini origin. PDE-governed substrates (climate, physical systems) need resolution bounds reserved in the type.
Codex named these in R2; Grok and Gemini missed them. Synthesis adopts every pair as required pre-GA work.
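The transition shape described above can be made concrete with a small sketch. This is only an illustration under stated assumptions: the class name `VTC`, its field names, the use of SHA-256 over canonical JSON for hash provenance, and the helper `check_invariants` are all hypothetical; R16 fixes the shape of the record, not an API.

```python
import hashlib
import json
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass(frozen=True)
class VTC:
    """Hypothetical record: state_before -> transition -> state_after."""
    state_before: dict            # assumed to carry "gear_state" per the closure requirement
    transition: str
    state_after: dict
    confidence: float             # calibrated confidence in [0, 1]
    failure_mode: Optional[str] = None  # explicit failure mode, None if none occurred

    def commitment_hash(self) -> str:
        # Assumption: provenance is a SHA-256 digest over canonical (sorted-key) JSON.
        payload = json.dumps(
            {
                "before": self.state_before,
                "transition": self.transition,
                "after": self.state_after,
                "confidence": self.confidence,
                "failure_mode": self.failure_mode,
            },
            sort_keys=True,
            separators=(",", ":"),
        )
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def check_invariants(vtc: VTC, invariants: Dict[str, Callable[[VTC], bool]]) -> List[str]:
    """Return the names of violated invariants; an empty list means all hold."""
    return [name for name, predicate in invariants.items() if not predicate(vtc)]
```

Because the encoding sorts keys, two records that differ only in dictionary ordering commit to the same hash, which is the property a replayable receipt would need.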
| # | Attack | Mitigation |
|---|---|---|
| 1 | Revert-then-attest race | Read lock on dependency frontier · receipt includes trace_epoch + invalidation_watermark · refuse attest if watermark stale |
| 2 | Query-relative load-bearing ambiguity | load_bearing as relation, not cell-tag · cached at trace-time · revalidated at query |
| 3 | Counterfactual branch-confusion | causal_model_id hash + CausalBridge · write-set agreement insufficient |
| 4 | Observational → interventional laundering | crossover_admissibility via Pearl back-door / front-door / instrumental criteria |
| 5 | Refusal laundering | Default strict_stop · refused never demotes to underdetermined · soft mode non-attestable |
| 6 | ScaleBridge overclaiming | Per-domain density floors · KL-divergence test · uncertainty propagation |
| 7 | M5 mask mismatch | gear_state.mask_signature compatibility OR declared GearBridge |
| 8 | Cross-tenant attestation replay | Attestation scope binds tenant_id_hash + policy_profile_id + ontology_version + jurisdiction + evidence_acl_root + validator_authority_id |
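
As an illustration of mitigation 1, the sketch below refuses to attest when the invalidation watermark has moved since the trace was taken. It assumes the watermark is a counter that changes whenever a load-bearing dependency is reverted; the names `Receipt`, `StaleWatermark`, and `attest_if_fresh` are hypothetical, and the read lock on the dependency frontier is deliberately left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Receipt:
    claim_id: str
    trace_epoch: int
    invalidation_watermark: int


class StaleWatermark(Exception):
    """Raised when a dependency was invalidated between trace and attest."""


def attest_if_fresh(claim_id: str, trace_epoch: int,
                    watermark_at_trace: int, current_watermark: int) -> Receipt:
    # Assumption: any revert of a load-bearing dependency changes the watermark,
    # so a mismatch means the trace no longer reflects the current graph.
    if current_watermark != watermark_at_trace:
        raise StaleWatermark(
            f"watermark moved from {watermark_at_trace} to {current_watermark}"
        )
    # The receipt records both values so a verifier can repeat the check later.
    return Receipt(claim_id, trace_epoch, current_watermark)
```

In a real design the comparison and the signing would have to happen under the same lock; otherwise the race simply moves to the gap between the check and the signature.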
Per memory rule (Codex iterates more, catches HIGH bugs others miss): R2 confirms it. Codex was the only model to downgrade verdicts after cross-pollination — Stream A R1 sound → R2 partial; Stream B R1 sound → R2 partial. Other models upgraded verdicts. Codex's adversarial pass is gating, not optional.
Six operations. Every operation returns a VTC, a VTC trace, or a typed refusal carrying the minimal violating predicate.
| Op | Shape | What it does |
|---|---|---|
| apply | VTC → VTC or refusal | Execute and validate one transition. Invariants checked. Commitment hash sealed. |
| branch | VTC, Δ → VTC | Counterfactual. Same parent state, different changed/held-fixed sets. causal_model_id required. |
| merge | [VTC] → VTC or refusal | Defaults to refusal if held-fixed disagreements OR causal-model mismatch. |
| revert | VTC → trace | Invalidates descendants if load-bearing. Manifold-aware retrieval blocks stale embeddings. |
| attest | VTC → signed VTC | Signs cell + sources + verdict + tenant/policy/ontology/jurisdiction binding. |
| query | trace → verdict | Extracts supported / contradicted / underdetermined / refused. Validates dependency proof. |
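
The refuse-by-default behaviour of merge can be sketched as follows. The types `Branch` and `Refusal` and their fields are hypothetical stand-ins for a VTC and a typed refusal; the extra refusal on conflicting changed-set values is a conservative assumption of this sketch, not something the panel specified.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass(frozen=True)
class Branch:
    causal_model_id: str
    held_fixed: dict
    changed: dict


@dataclass(frozen=True)
class Refusal:
    violating_predicate: str  # the minimal predicate that blocked the operation


def merge(branches: List[Branch]) -> Union[Branch, Refusal]:
    if not branches:
        return Refusal("merge requires at least one branch")
    first = branches[0]
    held = dict(first.held_fixed)
    changed = dict(first.changed)
    for branch in branches[1:]:
        # Branches built on different causal models are not comparable.
        if branch.causal_model_id != first.causal_model_id:
            return Refusal("causal_model_id mismatch")
        for key, value in branch.held_fixed.items():
            if key in held and held[key] != value:
                return Refusal(f"held_fixed disagreement on {key!r}")
            held[key] = value
        for key, value in branch.changed.items():
            # Assumption: conflicting writes are refused rather than overwritten.
            if key in changed and changed[key] != value:
                return Refusal(f"changed-set conflict on {key!r}")
            changed[key] = value
    return Branch(first.causal_model_id, held, changed)
```

Returning the refusal as a value rather than raising keeps the operation total: every call yields either a merged result or the predicate that blocked it.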
Grok-bound shipping cadence. Every product surface writes VTCs by construction.
Verify GA: 3-tier billing · domain-tier multipliers ($0.02 / $0.05 / $0.10) · work-unit + verdict-premium · audit-token canonical-claim binding · evidence-coverage gate.
M+3 falsifier: ≥10 paying, $500K ARR run-rate, ECE ≤ 0.05
Modulum Solo: $100/mo flat → 1M tokens/mo, 100 verified claims/day, 1GB persistent memory. Sales pitch: refusal-correctness, not cost. Single-developer purchase, no procurement.
M+3 falsifier: ≥1k users, refusal-correctness ≥ 90% in production
Legal Endpoint: design-partner beta. TrustFoundry $5K floor → 6mo ramp to $20K. 50K-VTC seed corpus. Refusal taxonomy specific to legal procedural claims.
M+3 falsifier: TrustFoundry signs at $5K + 1 design partner co-pays
Not shipping in this window: Generic Modulum Router · Forge OS Solo · Persistent Memory API · IDE Magic. Per R15 Track C: ship weak-moat wedges only when they feed strong-moat assets. These don't.
Codex R2 catch. Verify GA cannot ship without these mitigations baked into the contract.
| # | Vector | Mitigation |
|---|---|---|
| 1 | Pricing manipulation via refusal retries | Work-unit floor · retry-link billing groups for near-duplicate claims in time window |
| 2 | Refusal-bounty abuse | Per-company auth aggregation · organic-traffic gate · dedup-yield gate · bounded per account |
| 3 | Audit-token forgery / laundering | Receipt binds canonical claim hash + AST + evidence bundle + policy + verdict + replay window |
| 4 | Citation-shadowing attack | Verify reports evidence_coverage · adversarial retrieval (search contradictors) |
| 5 | Schema drift product↔corpus | Promotion pipeline ObservedRecord → CandidateVTC → VerifiedVTC · versioned · rejection reasons |
| 6 | Cross-customer replay leakage | Selective disclosure modes: full · redacted · commitment proof · regulator escrow |
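
Mitigation 3 can be illustrated with a receipt that binds every listed field under one MAC, so that changing the claim, the verdict, or the replay window invalidates it. This is a minimal sketch under assumptions: the function names, the lowercase-and-collapse-whitespace canonicalization, the HMAC-SHA256 construction, and the `valid_until` timestamp as the replay window are all hypothetical choices, not the contract.

```python
import hashlib
import hmac
import json


def canonical_claim_hash(claim: str) -> str:
    # Assumption: canonical form is lowercase with whitespace collapsed.
    normalized = " ".join(claim.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def _mac(key: bytes, body: dict) -> str:
    encoded = json.dumps(body, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hmac.new(key, encoded, hashlib.sha256).hexdigest()


def issue_receipt(key: bytes, claim: str, ast_hash: str, evidence_hash: str,
                  policy_id: str, verdict: str, valid_until: int) -> dict:
    body = {
        "claim_hash": canonical_claim_hash(claim),
        "ast_hash": ast_hash,
        "evidence_hash": evidence_hash,
        "policy_id": policy_id,
        "verdict": verdict,
        "valid_until": valid_until,  # end of the replay window (epoch seconds)
    }
    return {**body, "mac": _mac(key, body)}


def verify_receipt(key: bytes, receipt: dict, claim: str, now: int) -> bool:
    body = {k: v for k, v in receipt.items() if k != "mac"}
    return (
        hmac.compare_digest(_mac(key, body), receipt.get("mac", ""))
        and body.get("claim_hash") == canonical_claim_hash(claim)
        and now <= body.get("valid_until", -1)
    )
```

A receipt issued for one claim therefore cannot be presented for a different claim, a different verdict, or after its window closes without the check failing.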
p95 ≤ 500ms. Draft-quality validation in conversational UX.
p95 2-5s. Decisive verdict. Includes evidence retrieval + claim decomposition + adversarial check.
p95 5-30s. Federation-grade receipt. Cross-bridge composition. Regulatory replay.
Hypernym is the company. Modulum is the product. Local is a deployment mode — cloud, on-prem, edge.
Standalone "Hypernym Local" rejected by Codex+Claude+Gemini as SKU sprawl that splits the substrate-sync story. Differentiation from Ollama / LMStudio / llama.cpp on "model running offline" alone is impossible — they are free, ubiquitous, and have a 2-year head start. The differentiation is substrate-grounded inference, which lives at the Modulum-product layer, not at the runtime layer.
Local attestation paradox resolved (Codex R2): local receipts can be locally signed but cloud-verifiable only when customer syncs commitment metadata. Unsynced local attestations are "private receipts" — not federation-grade. Distinction visible in API and UI.
≤10% capital allocation. Year-1 = single-tenant biomedical/legal graph designed as the first shard.
Single-tenant dense biomedical/legal graph. Schema designed as first shard of the federated graph. Per-domain calibrated density precondition.
Anti-poisoning: stake bond to commit Lemma; reverted/contradicted slashes. Anti-hoarding: micro-royalty per third-party compose.
Continuous-vector projection for O(1) LLM-critical-path retrieval. Invalidation-aware via attestation-tag cross-check at retrieval (Claude R2 catch).
1. Lemma Arbitration Protocol (LAP): federation contradiction governance
2. DomainBridge: cross-domain attestation (biomed → legal)
3. RedactionBridge: differential-privacy at API surface for regulated industries
4. Published refusal-correctness ground-truth benchmarks
5. Continuous-domain integration_step calibration tables
6. Per-domain density floors (formal numeric calibration)
7. Validation-vs-inference cost margin model
What we sell now. What we don't yet.
Verify · Modulum Solo · Legal Endpoint. Customer holds the decision; Hypernym attests claim validity.
System proposes; customer ratifies; receipt commits. The substrate flywheel.
NOT in core pitch until refusal · do-calculus · M5 gear · attestation closures pass benchmarks. Long-term arc, not 2026 marketing.
Customer corpus belongs to customer. Lemmas commit to federation only by stake.
Refusal is a feature, not a defect. Sales motion frames "we refuse, they hallucinate."
ECE published, not scalar. Calibration class on every cell.
Deterministic audit replay. Hash chains over inputs / outputs / validation trace.
90-day deterministic minimum. Required by FDA / courts / regulators.
Typed, attested. Gear · Scale · Causal · Domain · Refusal · Redaction.
Hard constraints cannot be violated. compose() refuses on invariant violation.
Anti-poisoning + anti-hoarding economics. Otherwise federation degrades to centralized arbitrage.
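The "hash chains over inputs / outputs / validation trace" principle above can be sketched as an append-only log in which each entry commits to its predecessor. The function names, the all-zero genesis value, and the record layout are assumptions made for illustration only.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed starting value for an empty chain


def _digest(prev_hash: str, record: dict) -> str:
    payload = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def chain_append(chain: list, record: dict) -> list:
    """Return a new chain with the record linked to the previous entry."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    entry = {"record": record, "prev": prev_hash, "hash": _digest(prev_hash, record)}
    return chain + [entry]


def chain_verify(chain: list) -> bool:
    """Replay the chain from the start and confirm every link."""
    prev_hash = GENESIS
    for entry in chain:
        if entry["prev"] != prev_hash:
            return False
        if _digest(prev_hash, entry["record"]) != entry["hash"]:
            return False
        prev_hash = entry["hash"]
    return True
```

Editing any earlier record changes its digest and breaks every later link, which is what makes a replay over the retained window deterministic to check.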
Codex R1+R2 origin. Collapses Streams A/B/C/D into one optimization target.
Every algebraic refinement, product wedge, runtime mode, world-model bet judged by the same scalar. CPAT ties all five streams to one executive dashboard.
accepted_transitions_count = decisive verdicts only. Refused / underdetermined are excluded from CPAT but counted toward the orthogonal second-axis KPI: refusal-correctness. Together they form the substrate-product fitness scalar: CPAT × refusal-correctness × ECE.
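Two of the quantities named above can be computed directly, as the sketch below shows. It assumes that "decisive" means the supported and contradicted verdicts, and it uses the standard equal-width binned estimate of expected calibration error with ten bins; the function names and bin count are illustrative choices. CPAT itself is not computed because its formula is not given here.

```python
from typing import Iterable, Sequence

# Assumption: decisive verdicts are the two that are neither refused nor underdetermined.
DECISIVE = {"supported", "contradicted"}


def accepted_transitions_count(verdicts: Iterable[str]) -> int:
    return sum(1 for verdict in verdicts if verdict in DECISIVE)


def expected_calibration_error(confidences: Sequence[float],
                               correct: Sequence[bool],
                               n_bins: int = 10) -> float:
    """Binned ECE: weighted mean of |average confidence - accuracy| per bin."""
    total = len(confidences)
    if total == 0 or total != len(correct):
        raise ValueError("need equal-length, non-empty inputs")
    bins = [[] for _ in range(n_bins)]
    for confidence, is_correct in zip(confidences, correct):
        index = min(int(confidence * n_bins), n_bins - 1)  # confidence 1.0 goes in the last bin
        bins[index].append((confidence, is_correct))
    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_confidence = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        ece += (len(members) / total) * abs(avg_confidence - accuracy)
    return ece
```

The per-bin gaps computed inside the loop are the more informative output if calibration is to be published as more than a single number.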
R16 commitment: every product surface (Verify, Solo, Edge, Legal, Federated Graph) reports CPAT to the Cost-Tracker (paperclip) per dispatch.
Every product Hypernym ships either produces VTCs, consumes VTCs, or composes them. Anything that doesn't is cash extraction or a distribution experiment — not core strategy.
VTC strictly extends Hypercore — PDS units feed VTCs as state types. The two corpora reinforce each other and grow monotonically with every Verify call. Federation activates after R17 closes LAP and the bridge family.