Confidential · Hypernym Research Arc · NDA · Do not redistribute or summarize externally

ROUND R16 · COMPOUND VISION & EXECUTION

The primitive holds. The spec is partially closed.

2026-05-10 · 5 streams · 5/6 panels · 2 rounds

R16 expanded the R15-locked VTC closure round into a 5-stream compound: VTC closures · Modulum product immediacy · local inference packaging · world model continuation · core vision of verifiable computation over reality. Two dispatch rounds. The synthesis adopts every adversarial finding from the panel as gating before any product ships.

00 · What this is

A research round, not a status report

Plain-language reading guide for everything below.

What R16 is

A spec round. Six AI models reviewed the proposed VTC architecture and the 90-day product plan. They argued, then arrived at a synthesis. No code has been written. No product has shipped.

Outputs are design decisions — what we'll commit to building, what we'll defer, what we won't ship.

What R16 is not

Not a bug report. Nothing in R16 is about defects in software that exists today. The Hypernym product family (Verify, Modulum, Legal Endpoint) is mostly unbuilt; Verify has internal usage but no GA contract.

"Attack vectors"

= ways the proposed design could fail if we shipped it naively. Codex (one of the panel models) imagined a hostile customer or attacker, then traced what they'd do to break the spec. Each "attack" is a thought experiment, not an incident.

Example: "an attacker could revert a load-bearing fact mid-attestation, leaving the audit receipt looking valid even though the underlying claim is now invalidated." We patch the spec to prevent that before we build it.

"Bugs found in code"

None of these are bugs in code. Codex critiqued the spec document, not a codebase. The phrasing "found vectors" or "downgraded verdicts" means: identified weaknesses in the proposed architecture that must be addressed in the spec before engineering starts.

"Ninety days"

= the proposed product launch sequence after we start building. Not "90 days from today" — 90 days from when we kick off the engineering sprint. Three products, in order: Verify GA (week 4), Modulum Solo (week 6), Legal Endpoint design-partner beta (week 10).

The panel agreed on this sequence. Engineering hasn't started yet.

"R17 carry-forward"

= questions R16 surfaced but didn't close. We deferred them to the next research round (R17). The federation features (cross-tenant graph, lemma economics, regulated-industry redaction) cannot ship until R17 closes the open questions.

This is normal. R-rounds always queue carry-forward items.

Reading the rest

Sections 01-11 are dense with internal terminology (VTC, ScaleBridge, GearBridge, etc.). If a section reads like jargon, treat it as: "the panel argued about a design decision; here's what was agreed." Every "attack vector," "abuse vector," or "ship-blocking work" is design-time, not runtime. Nothing is broken. We're deciding what to build.

5/6 panel models · 2 dispatch rounds · 14 Codex-caught vectors · 8 unanimous commits · 7 R17 carry-forwards

01 · Verdicts

Where each stream actually lands

Synthesized across 5/5 R2 panels. Codex was the only model to downgrade verdicts after cross-pollination.

Stream                  Verdict                             Ship-blocking work
A · VTC algebra         sound primitive · partial spec      8 attack-vector mitigations + manifold-aware revert sub-spec
B · Product immediacy   sound sequence · partial contract   6 abuse-vector mitigations + tiered SLA + work-unit + premium pricing
C · Local inference     sound                               MLX M5 throughput gate · refusal-degradation banner · attestation paradox
D · World model         partial (R17 carry)                 LAP + DomainBridge + RedactionBridge + refusal benchmarks
E · Core vision         sound                               3 maturity stages must be communicated separately

"Hypernym builds verifiable computation over reality."
Public framing locked · Stream E · Claude origin / Codex held with caveats / Gemini flagged "targeted intelligence" military-ad-tech connotation
02 · Convergence

Eight unanimous panel commits

All 5/5 R2 panels agreed. These are the load-bearing R16 conclusions.

03 · Stream A — adversarial closures

Eight attack vectors. Eight typed mitigations.

Codex named these in R2; Grok and Gemini missed them. Synthesis adopts every pair as required pre-GA work.

#   Attack                                      Mitigation
1   Revert-then-attest race                     Read lock on dependency frontier · receipt includes trace_epoch + invalidation_watermark · refuse attest if watermark stale
2   Query-relative load-bearing ambiguity       load_bearing as relation, not cell-tag · cached at trace-time · revalidated at query
3   Counterfactual branch-confusion             causal_model_id hash + CausalBridge · write-set agreement insufficient
4   Observational → interventional laundering   crossover_admissibility via Pearl back-door / front-door / instrumental criteria
5   Refusal laundering                          Default strict_stop · refused never demotes to underdetermined · soft mode non-attestable
6   ScaleBridge overclaiming                    Per-domain density floors · KL-divergence test · uncertainty propagation
7   M5 mask mismatch                            gear_state.mask_signature compatibility OR declared GearBridge
8   Cross-tenant attestation replay             Attestation scope binds tenant_id_hash + policy_profile_id + ontology_version + jurisdiction + evidence_acl_root + validator_authority_id
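
Mitigation 1 can be read as a staleness check at attest time. The following is a minimal sketch, assuming the watermark is a simple counter of invalidations on the dependency frontier. Only `trace_epoch` and `invalidation_watermark` come from the table; `Receipt`, `DependencyFrontier`, `trace`, and `try_attest` are hypothetical names, and the read lock is only indicated by a comment.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Receipt:
    claim_id: str
    trace_epoch: int             # frontier epoch observed when the trace was read
    invalidation_watermark: int  # invalidation count observed at trace time


@dataclass
class DependencyFrontier:
    # Hypothetical stand-in for the set of facts a claim depends on.
    epoch: int = 0
    invalidations: int = 0

    def revert(self) -> None:
        """Model a revert of a load-bearing fact."""
        self.epoch += 1
        self.invalidations += 1


def trace(claim_id: str, frontier: DependencyFrontier) -> Receipt:
    """Record the frontier state the claim was traced against."""
    return Receipt(claim_id, frontier.epoch, frontier.invalidations)


def try_attest(receipt: Receipt, frontier: DependencyFrontier) -> Optional[Receipt]:
    """Return the receipt if its watermark is current, else None (refuse)."""
    # A real implementation would hold a read lock on the frontier here,
    # so that no revert can land between this check and the signature.
    if receipt.invalidation_watermark != frontier.invalidations:
        return None
    return receipt
```

In this sketch a revert between trace and attest makes the watermark stale, so the attest call refuses instead of returning a receipt that only looks valid.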

The Codex pattern, again

R2 confirms the standing memory rule (Codex iterates more and catches HIGH-severity issues others miss). Codex was the only model to downgrade verdicts after cross-pollination: Stream A R1 sound → R2 partial; Stream B R1 sound → R2 partial. Other models upgraded verdicts. Codex's adversarial pass is gating, not optional.

04 · Stream A — algebra

Closed under refusal

Six operations. Every operation returns a VTC, a VTC trace, or a typed refusal carrying the minimal violating predicate.

Op       Shape                   What it does
apply    VTC → VTC | refusal     Execute and validate one transition. Invariants checked. Commitment hash sealed.
branch   VTC, Δ → VTC            Counterfactual. Same parent state, different changed/held-fixed sets. causal_model_id required.
merge    [VTC] → VTC | refusal   Defaults to refusal if held-fixed disagreements OR causal-model mismatch.
revert   VTC → trace             Invalidates descendants if load-bearing. Manifold-aware retrieval blocks stale embeddings.
attest   VTC → signed VTC        Signs cell + sources + verdict + tenant/policy/ontology/jurisdiction binding.
query    trace → verdict         Extracts supported / contradicted / underdetermined / refused. Validates dependency proof.
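
The merge row illustrates what "closed under refusal" means in practice: the operation returns either a VTC or a typed refusal naming the violated predicate. The following is a minimal sketch, assuming a VTC can be reduced to a causal model id plus held-fixed and changed sets of (variable, value) string pairs. The `VTC` and `Refusal` field layouts are hypothetical and are not taken from the spec.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Tuple, Union

Pair = Tuple[str, str]  # (variable, value); string values are an assumption


@dataclass(frozen=True)
class VTC:
    causal_model_id: str
    held_fixed: FrozenSet[Pair]
    changed: FrozenSet[Pair]


@dataclass(frozen=True)
class Refusal:
    violating_predicate: str  # the first predicate found to block the merge


def merge(vtcs: List[VTC]) -> Union[VTC, Refusal]:
    """Merge VTCs, refusing on causal-model mismatch or held-fixed conflict."""
    if not vtcs:
        return Refusal("merge requires at least one VTC")
    first = vtcs[0]
    held = dict(first.held_fixed)
    changed = set(first.changed)
    for other in vtcs[1:]:
        if other.causal_model_id != first.causal_model_id:
            return Refusal("causal_model_id mismatch")
        # Sorted so the reported predicate is deterministic.
        for var, val in sorted(other.held_fixed):
            if var in held and held[var] != val:
                return Refusal(f"held-fixed disagreement on {var}")
            held[var] = val
        changed |= other.changed
    return VTC(first.causal_model_id, frozenset(held.items()), frozenset(changed))
```

Callers branch on the return type, so a refusal can never be mistaken for a merged state.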
05 · Stream B — product immediacy

Three products. One corpus. Ninety days.

Grok-bound shipping cadence. Every product surface writes VTCs by construction.

Omnifact Verify GA

wk 4

3-tier billing · domain-tier multipliers ($0.02 / $0.05 / $0.10) · work-unit + verdict-premium · audit-token canonical-claim binding · evidence-coverage gate.

M+3 falsifier: ≥10 paying, $500K ARR run-rate, ECE ≤ 0.05

Modulum Solo

wk 6

$100/mo flat → 1M tokens/mo, 100 verified claims/day, 1GB persistent memory. Sales pitch: refusal-correctness, not cost. Single-developer purchase, no procurement.

M+3 falsifier: ≥1k users, refusal-correctness ≥ 90% in production

Legal Endpoint

wk 10

Design-partner beta. TrustFoundry $5K floor → 6mo ramp to $20K. 50K-VTC seed corpus. Refusal taxonomy specific to legal procedural claims.

M+3 falsifier: TrustFoundry signs at $5K + 1 design partner co-pays

Killed this quarter

Generic Modulum Router · Forge OS Solo · Persistent Memory API · IDE Magic. Per R15 Track C: ship weak-moat wedges only when they feed strong-moat assets. These don't.

06 · Stream B — GA contract abuse

Six product-side blockers

Codex R2 catch. Verify GA cannot ship without these mitigations baked into the contract.

#   Vector                                     Mitigation
1   Pricing manipulation via refusal retries   Work-unit floor · retry-link billing groups for near-duplicate claims in time window
2   Refusal-bounty abuse                       Per-company auth aggregation · organic-traffic gate · dedup-yield gate · bounded per account
3   Audit-token forgery / laundering           Receipt binds canonical claim hash + AST + evidence bundle + policy + verdict + replay window
4   Citation-shadowing attack                  Verify reports evidence_coverage · adversarial retrieval (search contradictors)
5   Schema drift product↔corpus                Promotion pipeline ObservedRecord → CandidateVTC → VerifiedVTC · versioned · rejection reasons
6   Cross-customer replay leakage              Selective disclosure modes: full · redacted · commitment proof · regulator escrow
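
Mitigation 3 binds the audit token to everything it vouches for, so that changing any bound field invalidates the token. The following is a minimal sketch, assuming an HMAC-SHA256 tag over a canonical JSON body. The source does not specify a signature scheme, a canonicalization rule, or field names, so `issue_token`, `verify_token`, and the lowercase/whitespace canonicalization are hypothetical; the AST binding is omitted for brevity.

```python
import hashlib
import hmac
import json
from typing import List


def canonical_claim_hash(claim: str) -> str:
    # Hypothetical canonicalization: lowercase and collapse whitespace.
    normalized = " ".join(claim.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def _mac(key: bytes, body: dict) -> str:
    # Sorted keys and fixed separators give a stable byte encoding.
    payload = json.dumps(body, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def issue_token(key: bytes, claim: str, evidence_ids: List[str],
                policy: str, verdict: str, replay_window_s: int) -> dict:
    """Build a token whose MAC covers every bound field."""
    body = {
        "claim_hash": canonical_claim_hash(claim),
        "evidence": sorted(evidence_ids),
        "policy": policy,
        "verdict": verdict,
        "replay_window_s": replay_window_s,
    }
    return {**body, "mac": _mac(key, body)}


def verify_token(key: bytes, token: dict) -> bool:
    """Recompute the MAC over all fields except the MAC itself."""
    body = {k: v for k, v in token.items() if k != "mac"}
    return hmac.compare_digest(_mac(key, body), token.get("mac", ""))
```

Because the verdict and the claim hash sit inside the MAC'd body, a token issued for one claim or verdict cannot be relabeled for another without failing verification.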

fast_check

non-attestable

p95 ≤ 500ms. Draft-quality validation in conversational UX.

verify

attestable

p95 2-5s. Decisive verdict. Includes evidence retrieval + claim decomposition + adversarial check.

attest

federation

p95 5-30s. Federation-grade receipt. Cross-bridge composition. Regulatory replay.
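
The three tiers above can be written down as data and checked against measured latencies. The following is a minimal sketch, assuming the upper end of each stated p95 range is the budget and using a nearest-rank p95. `Tier`, `TIERS`, `p95`, and `meets_sla` are hypothetical names; only the tier names, grades, and latency figures come from the text.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Tier:
    name: str
    grade: str        # non-attestable, attestable, or federation
    p95_max_s: float  # assumed budget: upper end of the stated p95 range


TIERS: Dict[str, Tier] = {
    "fast_check": Tier("fast_check", "non-attestable", 0.5),
    "verify": Tier("verify", "attestable", 5.0),
    "attest": Tier("attest", "federation", 30.0),
}


def p95(samples: List[float]) -> float:
    """Nearest-rank 95th percentile."""
    if not samples:
        raise ValueError("no latency samples")
    ordered = sorted(samples)
    rank = -(-95 * len(ordered) // 100)  # integer ceil(0.95 * n)
    return ordered[rank - 1]


def meets_sla(tier_name: str, latencies_s: List[float]) -> bool:
    """True if the measured p95 is within the tier's assumed budget."""
    return p95(latencies_s) <= TIERS[tier_name].p95_max_s
```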

07 · Stream C — local inference

Modulum Edge — runtime mode, not a SKU

Hypernym is the company. Modulum is the product. Local is a deployment mode — cloud, on-prem, edge.

Standalone "Hypernym Local" rejected by Codex+Claude+Gemini as SKU sprawl that splits the substrate-sync story. Differentiation from Ollama / LMStudio / llama.cpp on "model running offline" alone is impossible — they are free, ubiquitous, and have a 2-year head start. The differentiation is substrate-grounded inference, which lives at the Modulum-product layer, not at the runtime layer.

Local attestation paradox resolved (Codex R2): local receipts can be locally signed but cloud-verifiable only when customer syncs commitment metadata. Unsynced local attestations are "private receipts" — not federation-grade. Distinction visible in API and UI.

  • MLX M5 throughput gate. ≥1.5× Ollama baseline OR Edge ships CUDA-only first.
  • Refusal-correctness gate. Within 5pp of cloud Modulum on the same workload.
  • Refusal-degradation banner. Customers running biomed/legal on Edge see "calibration −X% vs cloud."
  • Local attestation paradox. Private receipts vs federation-grade receipts surfaced in API and UI.
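
The first two gates above reduce to simple comparisons. The following is a minimal sketch, assuming throughput is measured in tokens per second and refusal-correctness in percent, with "within 5pp" read as an absolute difference. The function names and the returned plan labels are hypothetical.

```python
def throughput_gate(mlx_tokens_per_s: float, ollama_tokens_per_s: float) -> bool:
    """MLX M5 must reach at least 1.5x the Ollama baseline."""
    return mlx_tokens_per_s >= 1.5 * ollama_tokens_per_s


def refusal_gate(edge_pct: float, cloud_pct: float) -> bool:
    """Edge refusal-correctness must be within 5 percentage points of cloud."""
    return abs(cloud_pct - edge_pct) <= 5.0


def edge_release_plan(mlx_tps: float, ollama_tps: float,
                      edge_pct: float, cloud_pct: float) -> str:
    """Combine both gates into a single (hypothetical) release decision."""
    if not refusal_gate(edge_pct, cloud_pct):
        return "blocked"    # refusal-correctness gate failed
    if throughput_gate(mlx_tps, ollama_tps):
        return "mlx+cuda"   # MLX cleared the throughput gate
    return "cuda-only"      # fall back to CUDA-only first
```

For example, 45 tokens/s on MLX against a 30 tokens/s Ollama baseline sits exactly on the 1.5× threshold and passes, while 40 tokens/s falls back to CUDA-only.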
08 · Stream D — world model

Federated VTC graph. Year-1 dense single-domain. Federation deferred to R17.

≤10% capital allocation. Year-1 = single-tenant biomedical/legal graph designed as the first shard.

Federated graph (year-1 shard)

i

Single-tenant dense biomedical/legal graph. Schema designed as first shard of the federated graph. Per-domain calibrated density precondition.

PoVT staking + royalties

ii

Anti-poisoning: stake bond to commit Lemma; reverted/contradicted slashes. Anti-hoarding: micro-royalty per third-party compose.

Substrate Manifold Index

iii

Continuous-vector projection for O(1) LLM-critical-path retrieval. Invalidation-aware via attestation-tag cross-check at retrieval (Claude R2 catch).
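
The invalidation-aware part of the index can be sketched as a tag filter applied at retrieval time. This is only an illustration under stated assumptions: the linear scan below does not achieve the O(1) retrieval the card describes, similarity is a plain dot product, and `IndexEntry`, `retrieve`, and the `live_tags` set are hypothetical names.

```python
from dataclasses import dataclass
from typing import List, Sequence, Set


@dataclass(frozen=True)
class IndexEntry:
    cell_id: str
    vector: Sequence[float]
    attestation_tag: str  # attestation the embedding was built from


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def retrieve(query: Sequence[float], index: List[IndexEntry],
             live_tags: Set[str], k: int = 1) -> List[IndexEntry]:
    """Return the top-k entries whose attestation is still valid."""
    # Cross-check every candidate's tag; embeddings of reverted cells are skipped
    # even if they would otherwise be the nearest match.
    live = [e for e in index if e.attestation_tag in live_tags]
    live.sort(key=lambda e: dot(query, e.vector), reverse=True)
    return live[:k]
```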

R17 carry-forward — cannot federate without these

1. Lemma Arbitration Protocol (LAP): federation contradiction governance
2. DomainBridge: cross-domain attestation (biomed → legal)
3. RedactionBridge: differential-privacy at API surface for regulated industries
4. Published refusal-correctness ground-truth benchmarks
5. Continuous-domain integration_step calibration tables
6. Per-domain density floors (formal numeric calibration)
7. Validation-vs-inference cost margin model

09 · Stream E — vision

Three maturity stages. Eight non-negotiables.

What we sell now. What we don't yet.

Decision support

production now

Verify · Modulum Solo · Legal Endpoint. Customer holds the decision; Hypernym attests claim validity.

Attestable recommendation

production post-GA

System proposes; customer ratifies; receipt commits. The substrate flywheel.

Autonomous execution

long-horizon

NOT in core pitch until refusal · do-calculus · M5 gear · attestation closures pass benchmarks. Long-term arc, not 2026 marketing.

1. Substrate ownership

Customer corpus belongs to customer. Lemmas commit to federation only by stake.

2. Refusal as first-class output

Refusal is a feature, not a defect. Sales motion frames "we refuse, they hallucinate."

3. Calibrated confidence

ECE published, not scalar. Calibration class on every cell.

4. Provenance hashing

Deterministic audit replay. Hash chains over inputs / outputs / validation trace.

5. Audit replay

90-day deterministic minimum. Required by FDA / courts / regulators.

6. Cross-domain bridges

Typed, attested. Gear · Scale · Causal · Domain · Refusal · Redaction.

7. Invariant preservation

Hard constraints cannot be violated. compose() refuses on invariant violation.

8. Economic closure

Anti-poisoning + anti-hoarding economics. Otherwise federation degrades to centralized arbitrage.
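
Non-negotiables 4 and 5 (provenance hashing and deterministic audit replay) can be illustrated with a small hash chain. The following is a minimal sketch, assuming SHA-256 over canonical JSON records; the record layout, the all-zero genesis value, and the function names are hypothetical.

```python
import hashlib
import json
from typing import List

GENESIS = "0" * 64  # assumed starting value for the chain


def link(prev_hash: str, record: dict) -> str:
    """Hash one record together with the previous link."""
    payload = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def build_chain(records: List[dict]) -> List[str]:
    """Return the running hash after each record."""
    hashes = []
    prev = GENESIS
    for record in records:
        prev = link(prev, record)
        hashes.append(prev)
    return hashes


def replay_matches(records: List[dict], hashes: List[str]) -> bool:
    """Deterministic replay: recomputing the chain must reproduce it exactly."""
    return build_chain(records) == hashes
```

Because each link covers the previous one, altering any input, output, or validation step changes every later hash, which is what makes a replay check meaningful.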

10 · Unifying KPI

Cost per accepted transition under audit

Codex R1+R2 origin. Collapses Streams A/B/C/D into one optimization target.

Every algebraic refinement, product wedge, runtime mode, and world-model bet is judged by the same scalar. CPAT ties all five streams to one executive dashboard.

accepted_transitions_count = decisive verdicts only. Refused / underdetermined are excluded from CPAT but counted toward the orthogonal second-axis KPI: refusal-correctness. Together they form the substrate-product fitness scalar: CPAT × refusal-correctness × ECE.

R16 commitment: every product surface (Verify, Solo, Edge, Legal, Federated Graph) reports CPAT to the Cost-Tracker (paperclip) per dispatch.

CPAT = (
  substrate_retrieval_cost +
  decomposition_cost +
  validation_cost +
  bridge_check_cost +
  federation_hop_cost +
  attestation_cost
) / accepted_transitions_count

// per dispatch · per product surface
// reported to paperclip cost-tracker
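
The formula above translates directly into a small function. The following is a minimal sketch, assuming "decisive" means the supported and contradicted verdicts (the text only states that refused and underdetermined are excluded) and that a dispatch with no accepted transitions has no defined CPAT. The function name and dictionary layout are hypothetical.

```python
from typing import Dict, List, Optional

COST_FIELDS = (
    "substrate_retrieval_cost",
    "decomposition_cost",
    "validation_cost",
    "bridge_check_cost",
    "federation_hop_cost",
    "attestation_cost",
)

# Assumption: decisive verdicts are the two non-refusing, non-underdetermined ones.
DECISIVE = {"supported", "contradicted"}


def cpat(costs: Dict[str, float], verdicts: List[str]) -> Optional[float]:
    """Cost per accepted transition for one dispatch on one product surface."""
    accepted = sum(1 for v in verdicts if v in DECISIVE)
    if accepted == 0:
        return None  # undefined when nothing was accepted
    total = sum(costs[field] for field in COST_FIELDS)
    return total / accepted
```

With a total cost of 2.0 across the six components and two decisive verdicts out of four, the function returns 1.0; the refused and underdetermined verdicts add cost to the numerator but do not count in the denominator.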
11 · Closing

The corpus is the moat

Every product Hypernym ships either produces VTCs, consumes VTCs, or composes them. Anything that doesn't is cash extraction or a distribution experiment — not core strategy.

VTC strictly extends Hypercore — PDS units feed VTCs as state types. The two corpora reinforce each other and grow monotonically with every Verify call. Federation activates after R17 closes LAP and the bridge family.