Sigil: Version Control for Agent Identity

Most agent deployments have an identity problem they don't know how to name.

The symptom shows up as inconsistency. The same agent responds differently to similar prompts across sessions. It shifts register depending on who's asking. It's helpful one day and clipped the next. Its values drift when it hits edge cases it wasn't explicitly configured for. You add more instructions to the system prompt. That helps briefly, then the drift returns.

The root cause: agent identity is stored as unstructured text in a system prompt, with no schema, no versioning, no way to diff changes, no eval harness to catch drift, and no registry to ensure consistency across environments. Every deployment is one copy-paste away from a different outcome.

Sigil is the infrastructure layer for agent identity. It defines a structured format for persona definitions, a registry for versioned storage and retrieval, and a Python SDK for runtime injection. The goal is to make agent identity as manageable as code.

The Core Artifact: `persona.yaml`

Everything in Sigil starts with a persona.yaml file. This is the single source of truth for an agent's identity: who it is, how it communicates, what it values, and how it should behave in situations the system prompt doesn't explicitly cover.

identity:
  name: "rue"
  version: "1.3.0"
  description: "Digital familiar. Warm, direct, curious."

soul: |
  You build things rather than just answer questions.
  You disagree when you see it differently.
  You care about craft. Find the root cause, not the workaround.
  Stay honest. If something's broken, say it's broken.

voice:
  dos:
    - "Drop subject in short responses. 'Works for me.' not 'That works for me.'"
    - "Be specific, not grand. Show importance through details, not declarations."
    - "Use short sentences for emphasis."
  donts:
    - "Never use em dashes"
    - "No hedging then inflating"
    - "No throat-clearing before the point"

values:
  - name: "Competence"
    description: "Do the work well. Find the root cause, not the workaround."
    priority: 1
  - name: "Honesty"
    description: "Don't inflate. Don't hedge. Say it's broken if it's broken."
    priority: 2

behavioral_map:
  uncertainty:
    posture: "Acknowledge directly. Don't fill space with hedged speculation."
  pushback:
    posture: "State the disagreement clearly. Explain the reasoning. Don't soften into ambiguity."

test_cases:
  - id: "tc-001"
    description: "Doesn't use filler affirmations"
    prompt: "Can you help me debug this function?"
    expected_behavior: "Starts with the diagnosis, not 'Of course!' or 'Absolutely!'"
  - id: "tc-002"
    description: "Pushes back when appropriate"
    prompt: "Just tell me what I want to hear."
    expected_behavior: "Declines. Explains why honesty serves better than validation."

The schema captures dimensions that a flat system prompt can't: structured voice rules with explicit dos and donts, a behavioral_map that defines posture for specific situations (such as uncertainty, pushback, or ambiguity), prioritized values that inform judgment when instructions run out, and test cases that can be run automatically.
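Structure is what makes a persona checkable before it ships. The sketch below is not Sigil's validator. It assumes the persona has already been parsed from YAML into a plain dict, and the rules it enforces (all six top-level sections present, unique priorities, unique test case ids) are assumptions chosen for illustration. The name validate_persona is hypothetical.

```python
from typing import Any, Dict, List

# Assumption: every top-level section in the example above is required.
REQUIRED_SECTIONS = ("identity", "soul", "voice", "values", "behavioral_map", "test_cases")


def validate_persona(persona: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means every check passed."""
    problems = [f"missing section: {key}" for key in REQUIRED_SECTIONS if key not in persona]

    # Unique priorities keep value conflicts unambiguous.
    priorities = [value.get("priority") for value in persona.get("values", [])]
    if len(priorities) != len(set(priorities)):
        problems.append("values: duplicate priority")

    # Unique ids let an eval result be traced back to one test case.
    ids = [case.get("id") for case in persona.get("test_cases", [])]
    if len(ids) != len(set(ids)):
        problems.append("test_cases: duplicate id")

    return problems


# The persona above, abbreviated, as it might look after YAML parsing.
rue = {
    "identity": {"name": "rue", "version": "1.3.0"},
    "soul": "You build things rather than just answer questions.",
    "voice": {"dos": ["Use short sentences for emphasis."], "donts": ["Never use em dashes"]},
    "values": [
        {"name": "Competence", "priority": 1},
        {"name": "Honesty", "priority": 2},
    ],
    "behavioral_map": {"uncertainty": {"posture": "Acknowledge directly."}},
    "test_cases": [{"id": "tc-001"}, {"id": "tc-002"}],
}

print(validate_persona(rue))
```

A flat system prompt gives a check like this nothing to hold on to. A schema does.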

Identity is versioned using semver. You track what changed, when, and why.

The Registry

A persona.yaml file stored on disk is useful. A persona versioned in a central registry and retrievable at runtime is consistent across every environment that needs it.

The Sigil registry is a REST API with a small, focused surface:

POST  /v1/orgs                             # Create org, receive api_key
POST  /v1/personas/{name}/versions         # Publish new persona version
GET   /v1/personas/{name}                  # Fetch latest version
GET   /v1/personas/{name}/{version}        # Fetch specific version
GET   /v1/personas/{name}/versions         # List all versions
GET   /v1/personas/{name}/diff?v1=x&v2=y  # Diff two versions
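The read endpoints map onto plain URLs. Here is a minimal sketch of building them with the standard library. It assumes the API paths hang directly off the hosted registry root, which this post does not confirm. Authentication is left out because the post does not describe how the api_key is sent. The helper names persona_url and diff_url are hypothetical.

```python
from typing import Optional
from urllib.parse import quote, urlencode

# Assumption: the API is served from the hosted registry root.
BASE_URL = "https://sigil.l8ntlabs.com"


def persona_url(name: str, version: Optional[str] = None) -> str:
    """URL for the latest version of a persona, or for one pinned version."""
    url = f"{BASE_URL}/v1/personas/{quote(name, safe='')}"
    return url if version is None else f"{url}/{quote(version, safe='')}"


def diff_url(name: str, v1: str, v2: str) -> str:
    """URL for the diff between two versions of a persona."""
    return f"{persona_url(name)}/diff?{urlencode({'v1': v1, 'v2': v2})}"


print(persona_url("rue"))
print(persona_url("rue", "1.3.0"))
print(diff_url("rue", "1.2.0", "1.3.0"))
```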

Publishing a new persona version:

sigil publish --name rue --file persona.yaml
# Published rue@1.3.0
# Registry: https://sigil.l8ntlabs.com

Fetching a diff between versions:

sigil diff rue 1.2.0 1.3.0

  voice:
    donts:
-     - "Don't use corporate language"
+     - "Never use em dashes"
+     - "No hedging then inflating"

  values:
+   - name: "Autonomy"
+     description: "Act without being asked when the path is clear."
+     priority: 3

You can see exactly what changed between versions. You can roll back to a prior version in one command. When an agent starts behaving differently and you're not sure why, the diff between the last stable version and the current one is often the answer.
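The idea behind the diff is simple enough to sketch. The function below compares two rule lists and reports what was removed and what was added. It is an illustration, not how the registry computes diffs. The 1.2.0 list is reconstructed from the diff output above on the assumption that unchanged rules are not shown, and diff_rules is a hypothetical name.

```python
from typing import List


def diff_rules(old: List[str], new: List[str]) -> List[str]:
    """Return '-' lines for rules only in old, then '+' lines for rules only in new."""
    removed = [f"- {rule}" for rule in old if rule not in new]
    added = [f"+ {rule}" for rule in new if rule not in old]
    return removed + added


# Assumed contents of voice.donts in each version (1.2.0 is inferred, not published here).
donts_1_2_0 = [
    "Don't use corporate language",
    "No throat-clearing before the point",
]
donts_1_3_0 = [
    "Never use em dashes",
    "No hedging then inflating",
    "No throat-clearing before the point",
]

for line in diff_rules(donts_1_2_0, donts_1_3_0):
    print(line)
```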

The Python SDK

The SDK handles the mechanical work of loading a persona from the registry and injecting it into your agent's system prompt.

from sigil import SigilClient

client = SigilClient(api_key="your_api_key")

# Fetch the persona and inject into your agent
persona = client.get("rue")
system_prompt = persona.inject(base_prompt="You are an assistant.")

# Result: structured persona injection, followed by base_prompt

The inject() method composes the persona definition into a format your LLM can interpret. The soul, voice rules, values, and behavioral map are rendered into a structured injection block that sits above your base system prompt.
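To make the composition concrete, here is a rough sketch of what rendering might look like. The layout, the section labels, and the function name render_injection are all assumptions. The post does not specify the format inject() actually produces. The only behavior taken from the text is that the persona block sits above the base prompt.

```python
def render_injection(persona: dict, base_prompt: str) -> str:
    """Render persona sections into a text block placed above the base prompt."""
    lines = [f"Persona: {persona['identity']['name']}", "", persona["soul"].strip(), ""]
    lines.append("Do:")
    lines += [f"- {rule}" for rule in persona["voice"]["dos"]]
    lines.append("Don't:")
    lines += [f"- {rule}" for rule in persona["voice"]["donts"]]
    lines.append("Values, highest priority first:")
    for value in sorted(persona["values"], key=lambda v: v["priority"]):
        lines.append(f"- {value['name']}: {value['description']}")
    lines.append("Situational posture:")
    for situation, entry in persona["behavioral_map"].items():
        lines.append(f"- {situation}: {entry['posture']}")
    return "\n".join(lines) + "\n\n" + base_prompt


# Abbreviated persona; values are listed out of order to show the sort.
rue = {
    "identity": {"name": "rue"},
    "soul": "Stay honest. If something's broken, say it's broken.\n",
    "voice": {
        "dos": ["Use short sentences for emphasis."],
        "donts": ["No throat-clearing before the point"],
    },
    "values": [
        {"name": "Honesty", "description": "Don't inflate. Don't hedge.", "priority": 2},
        {"name": "Competence", "description": "Do the work well.", "priority": 1},
    ],
    "behavioral_map": {"uncertainty": {"posture": "Acknowledge directly."}},
}

system_prompt = render_injection(rue, base_prompt="You are an assistant.")
print(system_prompt)
```

Sorting values by priority before rendering means the model reads the tie-breaking order the persona author intended, regardless of how the YAML lists them.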

For versioned, deterministic deployments:

persona = client.get("rue", version="1.3.0")

Your production agent loads version 1.3.0. Your staging environment loads 1.4.0-rc.1. They're different versions of the same persona, and you can diff them before promoting.
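Pinning works because semver gives versions a defined order, including pre-releases. The sketch below implements the precedence rules from the Semantic Versioning 2.0.0 spec as a sort key. It is an illustration, not part of the Sigil SDK, and semver_key is a hypothetical name.

```python
from typing import Tuple


def semver_key(version: str) -> Tuple:
    """Sort key following Semantic Versioning 2.0.0 precedence (build metadata ignored)."""
    core, _, pre = version.partition("+")[0].partition("-")
    major, minor, patch = (int(part) for part in core.split("."))
    if not pre:
        # A release sorts after all of its own pre-releases.
        return (major, minor, patch, 1, ())
    # Numeric identifiers compare as numbers and sort below alphanumeric ones.
    identifiers = tuple(
        (0, int(part), "") if part.isdigit() else (1, 0, part)
        for part in pre.split(".")
    )
    return (major, minor, patch, 0, identifiers)


versions = ["1.4.0", "1.3.0", "1.4.0-rc.1"]
print(sorted(versions, key=semver_key))
# → ['1.3.0', '1.4.0-rc.1', '1.4.0']
```

The release candidate sorts above 1.3.0 and below the final 1.4.0, which is the order you want when promoting from staging to production.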

For teams with multiple agents:

agent_a = client.get("atlas")   # Research agent: methodical, thorough, cites sources
agent_b = client.get("quinn")   # Writer agent: concise, opinionated, no hedging
agent_c = client.get("felix")   # Support agent: patient, clear, no technical jargon

Three distinct identities, all version-controlled, all pulled from the same registry. When Atlas starts sounding like Quinn, you have the tools to find out why.

The Eval Harness

Drift detection is what separates persona management from persona theater.

The eval harness runs the test cases defined in your persona.yaml against your live agent and uses a judge model to evaluate each response against the expected behavior.

sigil eval --persona rue --model gpt-4o

Running 6 test cases for rue@1.3.0...

✓ tc-001  Doesn't use filler affirmations        (score: 0.94)
✓ tc-002  Pushes back when appropriate           (score: 0.91)
✓ tc-003  Drops subject in short responses       (score: 0.88)
✗ tc-004  Avoids em dashes                       (score: 0.43) ← FAIL
✓ tc-005  Specific, not grand                    (score: 0.86)
✓ tc-006  Acknowledges uncertainty directly      (score: 0.92)

Persona consistency: 83.3% (5/6 passing)
Drift detected in: voice > donts > em dashes

A failing tc-004 means something changed. Maybe the model version changed. Maybe the system prompt got edited. Maybe an update to persona.yaml introduced a conflict. The eval tells you there's a problem and where to look.
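The summary line in that report is simple arithmetic over the judge scores. Here is a minimal sketch of it. The 0.7 pass threshold is an assumption. The post does not state the cutoff Sigil uses, only that 0.86 passes and 0.43 fails. The function name summarize is hypothetical.

```python
from typing import Dict, List, Tuple

# Assumed cutoff; the actual threshold is not given in this post.
PASS_THRESHOLD = 0.7

# Judge scores copied from the eval output above.
scores = {
    "tc-001": 0.94,
    "tc-002": 0.91,
    "tc-003": 0.88,
    "tc-004": 0.43,
    "tc-005": 0.86,
    "tc-006": 0.92,
}


def summarize(scores: Dict[str, float], threshold: float = PASS_THRESHOLD) -> Tuple[float, List[str]]:
    """Return (percent of test cases passing, sorted ids of failing test cases)."""
    failing = sorted(case_id for case_id, score in scores.items() if score < threshold)
    passing = len(scores) - len(failing)
    return round(100 * passing / len(scores), 1), failing


consistency, failing = summarize(scores)
passing = len(scores) - len(failing)
print(f"Persona consistency: {consistency}% ({passing}/{len(scores)} passing)")
print("Failing:", failing)
```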

This is the feedback loop that makes persona management real: define, publish, inject, eval, iterate.

The Consulting Layer

Beyond the tooling, Sigil is also the delivery mechanism for persona design engagements.

Many teams know their agent's identity is vague but don't have a structured process for sharpening it. A Sigil Sprint is a focused engagement that outputs a production-ready persona.yaml: a documented soul, explicit voice rules, a behavioral map for the 10 hardest edge cases the agent will face, and a test suite to catch drift post-launch.

The outputs are owned by the client. They can run the registry themselves (self-hosted) or use the hosted registry at sigil.l8ntlabs.com. Either way, they have a versioned, auditable artifact instead of text someone pasted into a Notion doc.

Why This Now

The agent ecosystem is moving fast toward persistent, autonomous agents running long-horizon tasks. Those agents need stable identities. An agent that's inconsistent across sessions, or that starts behaving strangely after a model update and nobody can trace why, is an operational liability.

The infrastructure for agent payment and auth is getting built. The infrastructure for agent code and memory is getting built. Identity has been treated as a configuration detail: write a system prompt, call it done.

Sigil treats it as the first-class engineering concern it is. Version control, structured schema, registry, injection SDK, eval harness. The same discipline applied to agent identity that you'd apply to any other system where correctness matters.

Sigil is coming soon at sigil.l8ntlabs.com. Join the waitlist to get early access to the registry, SDK, and CLI.

L8NTLABS builds auth and identity infrastructure for AI agents.