
Getting Started

Install Overmind, register your agent, and run your first optimization round in about 10 minutes.

This guide takes you from zero to an optimized agent. The whole flow is local: you run a CLI against your Python project and artifacts land under .overmind/.

Requirements

  • Python 3.10 or higher
  • uv (a modern Python package manager)
  • API keys for at least one LLM provider (e.g., OpenAI or Anthropic)

1. Install

Terminal window
pip install overmind

For local development:

Terminal window
git clone https://github.com/overmind-core/overmind
cd overmind
uv tool install -e .

Verify:

Terminal window
overmind --help

Prefer uv run? All commands below also work as uv run overmind <command> after uv sync.


2. Initialize the project

From your agent’s project root:

Terminal window
cd your-agent-project/
overmind init

This creates .overmind/ and walks you through configuring API keys and default models. Settings are written to .overmind/.env. Safe to re-run.


3. Register your agent

Point Overmind at the Python function it should call for each test case.

Terminal window
overmind agent register my-agent agents.my_agent:run

The module path (agents.my_agent) is resolved relative to your project root; run is the function name.
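
For example, with a layout like the following (file names are illustrative), agents.my_agent:run resolves to the run function in agents/my_agent.py:

your-agent-project/
├── agents/
│   ├── __init__.py
│   └── my_agent.py    # defines run(input_data)
└── .overmind/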

Your function receives an input dict and must return a dict:

def run(input_data: dict) -> dict:
    # your agent logic
    result = ...
    return {"response": result}

Framework-based agents — Google ADK, LangChain, CrewAI, etc. — often don’t expose a plain callable. Overmind detects this during registration and offers to auto-generate an entrypoint wrapper for you. It also collects any additional API keys your agent needs at that point.
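
For intuition, a generated wrapper might look roughly like the sketch below. This is illustrative, not the exact code Overmind emits; build_agent stands in for whatever constructor your framework uses, and the dict keys are examples.

# agents/my_agent_entrypoint.py (illustrative sketch, not generated code)
from agents.my_agent import build_agent  # hypothetical factory for your framework agent

_agent = build_agent()

def run(input_data: dict) -> dict:
    # Translate Overmind's input dict into the framework's call signature,
    # then normalize the framework result back into a plain dict.
    result = _agent.invoke(input_data["inquiry"])  # key name is an example
    return {"response": str(result)}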

Other registry commands:

Command | Description
overmind agent list | List all registered agents
overmind agent show <name> | Show registration details and pipeline status
overmind agent update <name> <mod:fn> | Update the entrypoint after renaming
overmind agent remove <name> | Remove from registry
overmind agent validate <name> --data <path> | Run the first test case end-to-end

4. Validate the entrypoint (optional but recommended)

Terminal window
overmind agent validate my-agent --data tests/sample.json

Runs the first case from your dataset through the agent so you catch import/wrapper/API-key issues before investing time in full setup.


5. Run setup

Terminal window
overmind setup my-agent

This is an interactive flow that prepares everything the optimizer needs:

Phase | What happens
Agent analysis | An LLM reads your code to detect input/output schema, tools, and decision logic.
Policy generation | Without --policy, a policy is inferred from the code. With --policy <path>, your document is analyzed against the code and refinements are suggested. Either way you can refine conversationally before approving.
Dataset | Overmind uses your existing test data if found, or generates diverse synthetic cases based on the policy and agent description.
Evaluation criteria | Scoring rules are proposed for each output field, with policy-aware stricter checks where relevant.

Variants:

Terminal window
# Bring an existing policy document
overmind setup my-agent --policy docs/my_policy.md
# Non-interactive (for CI / scripts) — requires ANALYZER_MODEL and
# SYNTHETIC_DATAGEN_MODEL in .overmind/.env
overmind setup my-agent --fast

Setup produces two artifacts in .overmind/agents/<name>/setup_spec/:

  • eval_spec.json — machine-readable evaluation spec (used at runtime)
  • policies.md — human-readable policy document you maintain

Both are editable after generation.
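
Since both artifacts are plain files, you can also inspect them programmatically. A minimal sketch (the spec's schema varies by agent, so this just lists the top-level keys):

import json
from pathlib import Path

spec = json.loads(Path(".overmind/agents/my-agent/setup_spec/eval_spec.json").read_text())

# Schema depends on your agent; start by seeing what setup generated.
print(sorted(spec.keys()))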


6. Optimize

Terminal window
overmind optimize my-agent

This kicks off the iterative optimization loop. Each iteration:

  1. Runs your agent on every training case and collects traces + outputs
  2. Scores outputs against the eval spec (0–100 across multiple dimensions)
  3. Diagnoses failure patterns using the analyzer model
  4. Generates N candidate fixes (best-of-N), each biased toward a different area — tool descriptions, core logic, input handling, system prompt
  5. Validates candidates (syntax, interface, smoke test on a subset)
  6. Evaluates surviving candidates on the full training set
  7. Accepts or reverts — the best candidate is kept only if it improves the global best without regressing too many individual cases
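
In pseudocode, the final accept-or-revert rule behaves roughly like this (a sketch of the behavior described above, not Overmind's implementation; the threshold name and default are illustrative):

def should_accept(candidate: list[float], best: list[float], max_regressions: int = 2) -> bool:
    # Candidate must win on average...
    improves_mean = sum(candidate) / len(candidate) > sum(best) / len(best)
    # ...without making too many individual cases worse.
    regressed = sum(c < b for c, b in zip(candidate, best))
    return improves_mean and regressed <= max_regressions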

Interactive config lets you tune analyzer model, LLM-as-Judge, iterations, candidates per iteration, parallel workers, train/holdout split, regression thresholds, and early stopping. For CI/scripted use:

Terminal window
overmind optimize my-agent --fast

See the full reference in the Overmind guide.


Results

Artifacts land in .overmind/agents/<name>/:

Path | Description
setup_spec/policies.md | Agent policy (human-editable)
setup_spec/eval_spec.json | Evaluation spec with embedded policy
setup_spec/dataset.json | Test dataset used for optimization
experiments/best_agent.py | Highest-scoring single-file agent
experiments/best_agent/ | All optimized files (multi-file agents)
experiments/results.tsv | Score history per iteration
experiments/traces/ | Per-run JSON traces
experiments/report.md | Summary with scores, improvements, and diffs

You can edit policies.md or eval_spec.json and re-run overmind optimize to continue improving from where you left off.
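
Because experiments/results.tsv is plain TSV, the score history is easy to pull into your own tooling. A minimal sketch (the column layout depends on your run, so this just prints each row):

import csv

with open(".overmind/agents/my-agent/experiments/results.tsv") as f:
    for row in csv.reader(f, delimiter="\t"):
        print(row)  # one row per iteration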


Test data format

Data files are JSON arrays where each element has an input and expected_output:

[
  {
    "input": { "company_name": "Acme Corp", "inquiry": "Need enterprise pricing" },
    "expected_output": { "category": "hot", "lead_score": 85 }
  }
]

Place them under data/ in your agent directory and Overmind will detect them during setup. A test set of 10–50 diverse cases is usually enough. Without data, Overmind generates realistic synthetic cases from the policy and agent description.
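
If you already have labeled examples somewhere else, converting them into this shape takes a few lines (the output path and field values below are illustrative):

import json

cases = [
    {
        "input": {"company_name": "Acme Corp", "inquiry": "Need enterprise pricing"},
        "expected_output": {"category": "hot", "lead_score": 85},
    },
]

with open("data/cases.json", "w") as f:
    json.dump(cases, f, indent=2)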


Tracing (optional)

If you also want Overmind traces from a deployed application — independent of the optimizer — install the Python or JS/TS tracing SDK and call init() once at startup:

Terminal window
pip install overmind

Then, in your application code:

import overmind

overmind.init(service_name="my-service", environment="production")

See the Python SDK reference.

Tracing and the optimizer are independent — you can use either or both.