Science has many partial maps and no shared state layer. This document is the architecture for what would change if it did.
Sky, Astronomy, Observatory
applied to science itself
At the civilizational scale, the registers are the infrastructure of science itself. They map cleanly onto the three-layer taxonomy the field is converging on: Runtime, State, Network.
State
the system that remembers science
Compiled scientific knowledge as a first-class substrate. Findings with evidence, provenance, confidence, and typed relations. Versioned. Content-addressed. Queryable. Correctable at the finding level.
Papers are human-readable renderings, not a substrate. No Git for findings. LLMs compile into the void. This is the 100% compilation debt that every downstream layer inherits.
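A minimal sketch of what a content-addressed, correctable finding could look like, under stated assumptions. The field names, the `Finding` class, and the `correct` helper are hypothetical illustrations, not the published Vela schema. The identifier is simply a SHA-256 hash of the canonical JSON form, so identical content always yields the same address, and a correction produces a new version that points back at the one it replaces.

```python
import hashlib
import json
from dataclasses import asdict, dataclass, replace
from typing import Optional, Tuple


@dataclass(frozen=True)
class Finding:
    # All field names are illustrative assumptions, not the Vela schema.
    claim: str
    evidence: Tuple[str, ...]          # identifiers of supporting sources
    provenance: str                    # how the finding was compiled
    confidence: float
    relations: Tuple[Tuple[str, str], ...] = ()  # (relation type, target id)
    parent: Optional[str] = None       # id of the version this one corrects

    def content_id(self) -> str:
        # Canonical JSON (sorted keys, fixed separators) so equal content
        # always hashes to the same address.
        canonical = json.dumps(asdict(self), sort_keys=True, separators=(",", ":"))
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def correct(finding: Finding, **changes) -> Finding:
    # A correction never mutates: it creates a new version linked to the old one.
    return replace(finding, parent=finding.content_id(), **changes)


original = Finding("X modulates Y", ("source-1",), "manual curation", 0.4)
revised = correct(original, confidence=0.7)
```

Because the address is derived from the content, the revised finding gets a different identifier while its `parent` field preserves the version history.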
Runtime
the system that does science
Experimental workflows, protocol execution, instruments and robots, compute, simulation, agents in the loop. The active work of extending the frontier. Where FutureHouse, Phylo, Biomni, Emerald Cloud Lab, and self-driving labs operate.
Without a State layer underneath, every runtime is a silo. Every agent re-extracts findings from the same papers. Every self-driving lab stores results in its own schema. Compilation is redone every time because no substrate holds its output.
Network
the system that lets science compound
Registries, standards, federation, trust, attribution, governance. The coordination protocol that lets findings, runs, and workflows cross institutional boundaries without losing meaning or provenance.
DOI, ORCID, ROR, schema.org-for-science attempts. Piecemeal, disconnected, human-readable-only. There is no version-controlled, machine-queryable fabric that spans the scientific community. GitHub for science has not been built.
Vela is the open protocol that restores the State layer. The other two layers will be built by many teams. The thing missing underneath all of them is the compiled substrate. That is the wedge.
Why every AI-for-science tool
compiles into the void
Software infrastructure is layered. Git sits beneath GitHub. Hosted, versioned repositories sit beneath much of today's package distribution and CI/CD. The public corpus of versioned code those layers produced sits beneath Copilot. Each layer required the substrate beneath it to exist before it could compound. Remove any lower layer and everything above it collapses into individual heroics.
Science has inverted the stack. We have AI co-scientists, agentic pipelines, autonomous labs, and foundation models for biology, running on top of a substrate that does not yet exist. The compiled, queryable, versioned layer that Git provided for code has no equivalent for findings. Papers are not it. Databases are not it. Citation graphs are not it. They are human-readable renderings or single-purpose silos.
We have built the frontier layer of the stack before the substrate layer underneath it. This is why AI compilation of science does not compound.
Vela is the substrate: an open protocol for content-addressed, provenance-bearing, version-controlled finding bundles. The JSON Schema is published. The compiler runs from a Rust CLI. The first corridor, the Alzheimer blood-brain-barrier corridor, has 700 papers, 2,299 findings, 53,632 typed links, and 8 novel cross-domain hypotheses for which PubMed searches return zero results. The flywheel is real and it is already spinning on a small region of the sky.
Every other serious AI-for-science team eventually reaches the same realization: there is nowhere for compiled findings to land. PhysMaster's LANDAU, Bohrium's traceable execution, and Allen Institute's S2AG are all converging on the same architectural requirement. Vela is the open, interoperable, first-class version of the thing they are all trying to build privately.
Time in a scientific universe
Time runs through all three registers as three clocks, each keeping time for a different kind of event. The discipline of keeping the clocks separate is the difference between a substrate that compounds and one that decays into story.
Confidence drift is what happens when the clocks are not separated. A tentative correlation from 1990 becomes an established fact by 2015 through successive citation, because no one tracks when the confidence was earned and when it was inherited. Vela separates the clocks by construction. Every finding has a world time and a system time. Simulation outputs are always tagged, always ephemeral, and never promoted to substrate without explicit verification. The clocks must stay separate or the substrate decays into the same confidence-laundering machine that produced the current crisis.
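The separation of clocks can be sketched as a bitemporal record plus a promotion gate. This is an illustration under assumptions: `Record`, `Substrate`, `verify`, and the reading of world time as "when the result was established" and system time as "when the record entered the store" are hypothetical, not Vela's actual data model.

```python
from dataclasses import dataclass, replace
from datetime import date


@dataclass(frozen=True)
class Record:
    claim: str
    world_time: date    # assumed meaning: when the result was established
    system_time: date   # assumed meaning: when the record entered the store
    simulated: bool = False
    verified: bool = False


def verify(record: Record) -> Record:
    # Explicit verification is a separate, deliberate step.
    return replace(record, verified=True)


class Substrate:
    def __init__(self):
        self.records = []

    def promote(self, record: Record) -> None:
        # Simulation outputs stay ephemeral unless explicitly verified.
        if record.simulated and not record.verified:
            raise ValueError("simulation output requires explicit verification")
        self.records.append(record)

    def as_of(self, system_time: date):
        # What the substrate knew at a given system time.
        return [r for r in self.records if r.system_time <= system_time]
```

Keeping both timestamps on every record is what lets a query distinguish when a claim was earned from when it was merely recorded or repeated.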
Four kinds of "what if,"
at civilizational stakes
Simulation is not a new surface. It is a capability that lives inside the Runtime register, reading the State substrate and writing ephemeral branches. At science-scale the consequence is larger by orders of magnitude: not "should we hire this advisor" but "what collapses if this claim is retracted."
Horizon re-projection
near-term vs long-horizon lens on the same corridor
The same compiled corridor read through different evaluation criteria. Does this region of the substrate look different under clinical-translation weights than under mechanism-discovery weights? The findings don't move. The interpretation does. This is often the most illuminating of the four.
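A small sketch of re-projection as a weighted re-ranking. The feature names, weights, and scores below are invented for illustration and are not drawn from any compiled corridor. The point is that the findings are held fixed while only the lens changes.

```python
# Hypothetical findings with hypothetical per-finding features.
FINDINGS = {
    "f1": {"clinical_relevance": 0.9, "mechanistic_depth": 0.2},
    "f2": {"clinical_relevance": 0.3, "mechanistic_depth": 0.8},
}

# Two lenses over the same data; the weights are assumptions.
LENSES = {
    "clinical_translation": {"clinical_relevance": 0.8, "mechanistic_depth": 0.2},
    "mechanism_discovery": {"clinical_relevance": 0.2, "mechanistic_depth": 0.8},
}


def project(findings, weights):
    # Score each finding as a weighted sum, then rank from highest to lowest.
    scores = {
        fid: sum(weights[k] * features[k] for k in weights)
        for fid, features in findings.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


clinical_view = project(FINDINGS, LENSES["clinical_translation"])
mechanism_view = project(FINDINGS, LENSES["mechanism_discovery"])
```

With these illustrative numbers the two lenses rank the same two findings in opposite order, which is the whole effect: no finding was edited, only the reading changed.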
Trajectory extrapolation
where is this corridor going if current momentum holds
Velocity and acceleration of a scientific region. Which corridors are compounding, which are stalling, which are about to enter a regime change. Useful for prioritization, but it fails exactly at the moments that matter most: supernova events like AlphaFold or CRISPR, which reorganize the landscape in a year.
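One simple way to read velocity and acceleration is as first and second finite differences over a per-period count of findings. This is an assumed formulation for illustration; the counts are made up, and the one-step extrapolation is exactly the kind of momentum estimate that a regime change would invalidate.

```python
def velocity(counts):
    # First difference: change in findings from one period to the next.
    return [later - earlier for earlier, later in zip(counts, counts[1:])]


def acceleration(counts):
    # Second difference: change in velocity between periods.
    return velocity(velocity(counts))


def extrapolate_next(counts):
    # Naive momentum estimate: last value plus last observed velocity.
    return counts[-1] + velocity(counts)[-1]


# Illustrative findings-per-year counts for a hypothetical corridor.
yearly_findings = [10, 14, 20, 28]
print(velocity(yearly_findings))          # [4, 6, 8]
print(acceleration(yearly_findings))      # [2, 2]
print(extrapolate_next(yearly_findings))  # 36
```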
Counterfactual intervention
needs a causal model over the finding graph
"If this finding is retracted, what else collapses?" "If this mechanism is confirmed, which experiments become obvious next?" The dependency graph Vela makes explicit becomes a simulation surface. This is where the compiled substrate starts to act as a causal inference engine for science itself.
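The retraction question can be approximated, as a first pass, by reachability over dependency links. The sketch below assumes a hypothetical `depends_on` mapping from each finding to the findings it rests on. Reachability only gives an upper bound on what is exposed to a retraction; it is not a causal model, which would need to weigh how strongly each link carries support.

```python
from collections import deque


def affected_by_retraction(depends_on, retracted):
    # Invert the mapping: for each finding, who builds on it?
    dependents = {}
    for finding, supports in depends_on.items():
        for support in supports:
            dependents.setdefault(support, set()).add(finding)

    # Breadth-first walk over everything downstream of the retracted finding.
    affected = set()
    queue = deque([retracted])
    while queue:
        current = queue.popleft()
        for dependent in dependents.get(current, ()):
            if dependent not in affected:
                affected.add(dependent)
                queue.append(dependent)
    affected.discard(retracted)  # guard against cycles back to the source
    return affected


# Hypothetical graph: B rests on A, C rests on B, D rests on E.
graph = {"B": ["A"], "C": ["B"], "D": ["E"]}
print(sorted(affected_by_retraction(graph, "A")))  # ['B', 'C']
```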
Corridor composition
which regions of science should we compile first
Given finite attention and capital, which corridors should the Weave compile next? Which interlocking pairs unlock each other? Which dark regions should be deliberately long-exposed? This is Markowitz-style portfolio selection at civilizational scale, and it is the Horizons operating model.
The four are layered. Horizon re-projection is cheapest and ships first. Trajectory extrapolation builds on it. Counterfactual intervention requires Vela's typed links to be dense enough to support causal reasoning. Corridor composition is the portfolio-level capability that the Gigafactories need to operate against.
The honest epistemic limit. Prospective simulation of science is harder than retrospective reading, and the honesty is in admitting it. Short-horizon modest interventions are tractable. Long-horizon field-scale simulation is not forecastable with calibrated confidence. The model's job is not to predict the future but to surface which compilation decisions are robust across many plausible scientific futures. Not "what will happen" but "what compilation work performs well enough across the widest set of conditions."
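The "well enough across the widest set of conditions" criterion can be sketched as a satisficing rule: count the scenarios in which each option clears a threshold and prefer the option with the widest coverage. The option names, scenario names, payoffs, and threshold below are all hypothetical, and this is one possible reading of the criterion rather than a stated Vela method.

```python
def coverage(payoffs_by_scenario, threshold):
    # Number of scenarios in which the option performs well enough.
    return sum(value >= threshold for value in payoffs_by_scenario.values())


def most_robust(payoffs, threshold):
    # Prefer the option that clears the threshold in the most scenarios,
    # not the one with the single best outcome.
    return max(payoffs, key=lambda option: coverage(payoffs[option], threshold))


# Illustrative payoffs for two hypothetical corridors under three futures.
payoffs = {
    "corridor_a": {"future_1": 0.9, "future_2": 0.1, "future_3": 0.2},
    "corridor_b": {"future_1": 0.5, "future_2": 0.6, "future_3": 0.5},
}
choice = most_robust(payoffs, threshold=0.4)
```

Here `corridor_a` has the best single outcome but clears the threshold in only one future, while `corridor_b` clears it in all three, so the rule selects `corridor_b`.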
Vela, Horizons, Gigafactories, the Weave
Vela is the protocol. Horizons is the organization. The Gigafactories are the campaigns. The Weave is what the three together produce when the substrate is open and the campaigns compound.
What this work is, and is not
Most AI-for-science funding is flowing to the Runtime layer: the foundation models, the autonomous labs, the agent pipelines. These are real and necessary. But every Runtime team that runs for long enough arrives at the same architectural realization: there is no substrate to land on. Vela is the open protocol that fills that gap.
The bet is not that Vela alone reshapes science. The bet is that Vela plus a community of Runtime teams building on top of it, plus a Network layer coordinating across them, produces the compounding regime that every serious observer knows is missing and that no one has yet constructed. The State layer is the wedge. Everything else follows.