Getting Started
Install Overmind, register your agent, and run your first optimization round in about 10 minutes.
This guide takes you from zero to an optimized agent. The whole flow is local: you run a CLI against your Python project and artifacts land under .overmind/.
Requirements
- Python 3.10 or higher
- uv (modern Python package manager)
- API keys for at least one LLM provider (OpenAI, Anthropic)
1. Install Overmind
```sh
pip install overmind
```
For local development:
```sh
git clone https://github.com/overmind-core/overmind
cd overmind
uv tool install -e .
```
Verify:
```sh
overmind --help
```
Prefer `uv run`? All commands below also work as `uv run overmind <command>` after `uv sync`.
2. Initialize the project
From your agent's project root:

```sh
cd your-agent-project/
overmind init
```
This creates `.overmind/` and walks you through configuring API keys and default models. Settings are written to `.overmind/.env`. Safe to re-run.
3. Register your agent
Point Overmind at the Python function it should call for each test case.

```sh
overmind agent register my-agent agents.my_agent:run
```
The module path (`agents.my_agent`) is resolved relative to your project root; `run` is the function name.
Your function receives an input dict and must return a dict:
```python
def run(input_data: dict) -> dict:
    # your agent logic
    return {"response": result}
```
Framework-based agents (Google ADK, LangChain, CrewAI, etc.) often don't expose a plain callable. Overmind detects this during registration and offers to auto-generate an entrypoint wrapper for you. It also collects any additional API keys your agent needs at that point.
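If you'd rather write the wrapper by hand, it simply adapts your framework's invocation API to the dict-in/dict-out contract above. A minimal sketch for a hypothetical LangChain-style agent; `build_executor` and the `inquiry`/`response` field names are illustrative assumptions, not Overmind API:

```python
# Hypothetical hand-written entrypoint wrapper for a framework-based agent.
# `build_executor` and the field names are placeholders -- adapt them to
# your framework and your agent's input/output schema.
from my_project.agent import build_executor  # your framework setup code

executor = build_executor()  # e.g., a LangChain AgentExecutor

def run(input_data: dict) -> dict:
    # Adapt the dict-in/dict-out contract to the framework's invoke API.
    result = executor.invoke({"input": input_data["inquiry"]})
    return {"response": result["output"]}
```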
Other registry commands:
| Command | Description |
|---|---|
| `overmind agent list` | List all registered agents |
| `overmind agent show <name>` | Show registration details and pipeline status |
| `overmind agent update <name> <mod:fn>` | Update the entrypoint after renaming |
| `overmind agent remove <name>` | Remove from registry |
| `overmind agent validate <name> --data <path>` | Run the first test case end-to-end |
4. Validate the entrypoint (optional but recommended)
```sh
overmind agent validate my-agent --data tests/sample.json
```
This runs the first case from your dataset through the agent, so you catch import, wrapper, and API-key issues before investing time in full setup.
5. Set up evaluation
```sh
overmind setup my-agent
```
This is an interactive flow that prepares everything the optimizer needs:
| Phase | What happens |
|---|---|
| Agent analysis | An LLM reads your code to detect input/output schema, tools, and decision logic. |
| Policy generation | Without --policy, a policy is inferred from the code. With --policy <path>, your document is analyzed against the code and refinements are suggested. Either way you can refine conversationally before approving. |
| Dataset | Overmind uses your existing test data if found, or generates diverse synthetic cases based on the policy and agent description. |
| Evaluation criteria | Scoring rules are proposed for each output field, with policy-aware stricter checks where relevant. |
Variants:
```sh
# Bring an existing policy document
overmind setup my-agent --policy docs/my_policy.md

# Non-interactive (for CI / scripts); requires ANALYZER_MODEL and
# SYNTHETIC_DATAGEN_MODEL in .overmind/.env
overmind setup my-agent --fast
```
Setup produces two artifacts in `.overmind/agents/<name>/setup_spec/`:
- `eval_spec.json`: machine-readable evaluation spec (used at runtime)
- `policies.md`: human-readable policy document you maintain
Both are editable after generation.
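Both files are plain text, so a quick script can confirm they landed and look sane after setup. The snippet below assumes only that `eval_spec.json` parses as a JSON object and that your agent is named `my-agent`:

```python
# Peek at the generated setup artifacts.
# Assumes eval_spec.json parses as a JSON object; makes no other
# assumptions about its schema.
import json
from pathlib import Path

spec_dir = Path(".overmind/agents/my-agent/setup_spec")
spec = json.loads((spec_dir / "eval_spec.json").read_text())
print("eval_spec top-level keys:", sorted(spec))
print((spec_dir / "policies.md").read_text()[:500])  # start of the policy doc
```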
6. Optimize
```sh
overmind optimize my-agent
```
This kicks off the iterative optimization loop. Each iteration (see the sketch after this list):
- Runs your agent on every training case and collects traces + outputs
- Scores outputs against the eval spec (0–100 across multiple dimensions)
- Diagnoses failure patterns using the analyzer model
- Generates N candidate fixes (best-of-N), each biased toward a different area — tool descriptions, core logic, input handling, system prompt
- Validates candidates (syntax, interface, smoke test on a subset)
- Evaluates surviving candidates on the full training set
- Accepts or reverts — the best candidate is kept only if it improves the global best without regressing too many individual cases
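To make the accept-or-revert step concrete, here is a purely illustrative sketch of the decision rule; the function, its threshold default, and the scores are hypothetical stand-ins, not Overmind internals:

```python
# Illustrative accept-or-revert rule: keep a candidate only if it beats the
# global best on average without regressing too many individual cases.
# The threshold default and all values are hypothetical.
def accept_candidate(
    best_scores: list[float],       # per-case scores of the current global best
    candidate_scores: list[float],  # per-case scores of the best new candidate
    max_regressions: int = 2,       # tolerated count of individually worse cases
) -> bool:
    improved = sum(candidate_scores) / len(candidate_scores) > sum(best_scores) / len(best_scores)
    regressions = sum(c < b for c, b in zip(candidate_scores, best_scores))
    return improved and regressions <= max_regressions

# A candidate that lifts the average but regresses one case is still accepted:
print(accept_candidate([70, 80, 90], [85, 75, 95]))  # True
```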
Interactive config lets you tune analyzer model, LLM-as-Judge, iterations, candidates per iteration, parallel workers, train/holdout split, regression thresholds, and early stopping. For CI/scripted use:
```sh
overmind optimize my-agent --fast
```
See the full reference in the Overmind guide.
7. Inspect the results
Artifacts land in `.overmind/agents/<name>/`:
| Path | Description |
|---|---|
| `setup_spec/policies.md` | Agent policy (human-editable) |
| `setup_spec/eval_spec.json` | Evaluation spec with embedded policy |
| `setup_spec/dataset.json` | Test dataset used for optimization |
| `experiments/best_agent.py` | Highest-scoring single-file agent |
| `experiments/best_agent/` | All optimized files (multi-file agents) |
| `experiments/results.tsv` | Score history per iteration |
| `experiments/traces/` | Per-run JSON traces |
| `experiments/report.md` | Summary with scores, improvements, and diffs |
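`results.tsv` is plain tab-separated text, so charting progress is straightforward. A minimal reader, assuming only a header row plus one row per iteration (the agent name is a placeholder):

```python
# Print the per-iteration score history from results.tsv.
# Assumes only that the file is tab-separated with a header row;
# the exact column names depend on your run.
import csv
from pathlib import Path

results = Path(".overmind/agents/my-agent/experiments/results.tsv")
with results.open(newline="") as f:
    for row in csv.DictReader(f, delimiter="\t"):
        print(row)
```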
You can edit policies.md or eval_spec.json and re-run overmind optimize to continue improving from where you left off.
Using your own data
Data files are JSON arrays where each element has an `input` and an `expected_output`:
[ { "input": { "company_name": "Acme Corp", "inquiry": "Need enterprise pricing" }, "expected_output": { "category": "hot", "lead_score": 85 } }]Place them under data/ in your agent directory and Overmind will detect them during setup. A test set of 10–50 diverse cases is usually enough. Without data, Overmind generates realistic synthetic cases from the policy and agent description.
Instrumenting a running app (optional)
If you also want Overmind traces from a deployed application (independent of the optimizer), install the Python or JS/TS tracing SDK and call `init()` once at startup:

```sh
pip install overmind
```
```python
import overmind

overmind.init(service_name="my-service", environment="production")
```
See the Python SDK reference.
```sh
npm install @overmind-lab/trace-sdk openai
```
```ts
import { OpenAI } from "openai";
import { OvermindClient } from "@overmind-lab/trace-sdk";

const overmindClient = new OvermindClient({
  apiKey: process.env.OVERMIND_API_KEY!,
  appName: "my app",
});
overmindClient.initTracing({ enabledProviders: { openai: OpenAI } });
```
See the JS/TS SDK reference.
Tracing and the optimizer are independent — you can use either or both.
Next Steps
- Overmind guide — deep dive on policies, multi-file agents, bundle scope, and optimization safeguards
- Integrations — supported providers and frameworks for the tracing SDKs
- Python SDK reference · JS/TS SDK reference