In 2026, the dominant LLM changed three times. By 2030, it will change again. By 2050, the
vendor on your invoice today probably won't exist. The Constellation Protocol is a vendor-neutral
standard for AI agents — a tiny declarative file (agent.yaml) that runs unchanged
across GPT-5, Claude 4.5, Gemini 2.5, Grok 4, DeepSeek V3.5, Llama 4, Qwen 3 and any
compliant runtime that comes next.
Every AI agent shipped in 2024–2026 has the same fatal flaw: it's a love letter to one vendor. The prompt is tuned for one model's quirks. The tools use one vendor's function-calling schema. The retries use one vendor's error codes. Move to a cheaper or better model and you rewrite the entire agent.
We've seen this movie. SOAP, then REST. AWS, then multi-cloud. Oracle, then anything else. The pattern is always: vendors deny lock-in, customers feel pain, an open standard emerges, the market consolidates around it, and the vendors who refuse to comply lose enterprise contracts.
The Constellation Protocol is the open standard, shipped early — before procurement starts writing it into contracts in 2027.
A complete production agent, defined declaratively. Drop into any compliant runtime and it executes — bit-for-bit identically — on any of these models.
# Constellation Protocol — agent.yaml # Spec: https://nexoraaero.com/constellation.html · v0.4 apiVersion: constellation/v0.4 kind: Agent id: support-triage version: 1.4.2 license: MIT # ─── MODEL POLICY ─────────────────────────────────────────── # Primary first. If primary times out, errors, or violates SLA, # runtime fails over to the next entry. Pinning ensures reproducibility. models: primary: { id: anthropic/claude-sonnet-4.5, pin: 2026-04-15 } fallback: - { id: openai/gpt-5-mini, pin: 2026-04-15 } - { id: google/gemini-2.5-flash, pin: 2026-04-15 } - { id: deepseek/v3.5, pin: 2026-04-15 } # budget rescue sla: first_token_ms: 800 cost_per_call_usd: 0.02 on_breach: failover # or: degrade | error # ─── BEHAVIOUR ────────────────────────────────────────────── system: | You are the first line of support triage. Read a customer ticket, classify by severity (P0/P1/P2/P3), route to the correct team, and draft a one-paragraph holding reply. Never fabricate facts. # ─── TOOLS · provider-neutral schema ──────────────────────── tools: - name: lookup_account description: Resolve a customer by email or account ID. parameters: type: object required: [identifier] properties: identifier: { type: string } handler: https://api.acme.com/v2/accounts/lookup - name: create_ticket description: Open a ticket in the routing queue. parameters: type: object required: [severity, team, summary] properties: severity: { enum: [P0, P1, P2, P3] } team: { enum: [billing, platform, success] } summary: { type: string, maxLength: 240 } handler: https://api.acme.com/v2/tickets # ─── COMPLIANCE ───────────────────────────────────────────── audit: log: s3://acme-audit/constellation/${agent.id}/${run.id}.jsonl seal: ed25519 # cryptographic run-receipt retention_days: 2557 # 7 years (SOX) redact: [email, phone, card_pan] # ─── EVALS · the contract that survives model swaps ───────── evals: - name: P0_outage_routed_to_platform input: "Production down, 500s on every request" assert: tools.create_ticket.calls[0].severity == "P0" - name: billing_to_billing_team input: "Why was I charged twice?" assert: tools.create_ticket.calls[0].team == "billing"
One JSON-Schema function definition translates at runtime to OpenAI tool-use, Anthropic tool-use, Gemini function-calling, Grok function-calling — whichever model is active.
You declare your SLA (first-token ms, cost-per-call USD). Breach the SLA twice in a row → runtime promotes the next model in your fallback list. Zero downtime.
Pin a model + date and the runtime refuses to load an updated version unless you re-pin. Reproducible runs survive regulator subpoenas and the next quiet vendor model swap.
Every run produces a signed receipt: agent hash + model version + system prompt hash + I/O fingerprint. Tamper-evident. SOC 2 and EU AI Act friendly.
Declare what to redact in audit.redact; runtime scrubs before logs hit storage. Patterns ship for email, phone, card PAN, SSN, IBAN — extend as needed.
Your evals block is the deal: if a model swap breaks any assertion, the runtime refuses to promote the new model to primary. Behaviour stays correct across vendor changes.
No vendor co-operation needed. The runtime adapter is open-source. New models are usually a 50-line PR.
Legend · ✓ shipping · – model doesn't expose this capability · vN.M adapter maturity. New vendors take ~3 days to add — open an issue on GitHub with the model's API docs and we'll write the adapter.
"Vendor's deliverable shall be expressed as a Constellation Protocol v0.4+ compliant agent. Vendor lock-in modifications shall be rejected at acceptance."
"All model invocations shall be pinned to a published version. Vendor shall provide signed run-receipts on request, retained for the contract term plus seven years."
"At contract end, Vendor shall provide the agent definition file under a license no more restrictive than MIT. No transition assistance shall be required to switch model providers."
"Service interruption shall be measured against the agent's declared SLA. Runtime failover to fallback models is contractually equivalent to continued operation."
"Run-receipts shall be available within 24 hours of regulator request. Redaction rules shall be defined declaratively and stored alongside the agent definition."
A standard procurement rider implementing all five clauses is available in the /legal directory of the repository. Reviewed by counsel. Reusable. Public domain.
This page. 10 vendor adapters. Reference runtime in Python (380 LOC). Working evals harness.
Node-native runtime. Edge-deployable (Cloudflare Workers, Vercel Edge, Deno Deploy). Same wire format.
Constellations of agents. Message-passing between agents with cryptographic delivery receipts.
Submit to a vendor-neutral standards body (IETF or W3C) once we have ≥50 production deployments and ≥5 independent runtime implementations.
If we've done this right, the question by 2030 is not "are you Constellation-compliant" — it's "why are you still writing vendor-specific code."
Spec discussions happen in GitHub issues. Procurement and licensing questions: protocol@nexoraaero.com.
agent.yaml
is wider than the audience for code, and YAML is the format that survives that audience without
training. We considered TOML; it doesn't handle nested tools cleanly. JSON is unreadable past
three levels.
models.primary.prompt_override.
But the bigger answer: your evals block enforces behaviour. If GPT can't pass them,
GPT doesn't get promoted to primary. The contract is the evals, not the prompt.
agent.yaml defines the contract; the wrapper
translates to your framework's primitives. The wrappers are also MIT and live in the same repo.
spec-review.
We list all named contributors in the CONTRIBUTORS.md file. We do not publish a
list of customer deployments; that's the customer's call.
Three things you can do today.