Introduction

Automatically optimize your agents — better prompts, better tools, better models.

Overmind improves production agents by analyzing their code, capturing traces from runs, and driving autonomous optimization that keeps only changes that measurably help.

For a product-level overview, see the Platform Overview.

Today, Overmind includes an agent optimizer and tracing SDKs for Python and JavaScript, with fine-tuning available in beta.

You register your agent, define (or let Overmind infer) a policy that describes correct behavior, and start optimization. Overmind then:

  1. Runs your agent on a test dataset and records detailed traces of every LLM call and tool invocation.
  2. Scores outputs against an evaluation spec derived from your policy.
  3. Diagnoses failures with a strong reasoning model that sees your code, policy, traces, and scores.
  4. Produces candidate code fixes (best-of-N) across prompts, tool descriptions, model selection, and agent logic.
  5. Accepts or reverts each candidate using regression-aware criteria, keeping only changes that genuinely improve the agent.

After several iterations you get a measurably better agent, plus a readable report and diff.
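
The loop described above can be sketched in plain Python. This is an illustration only, not Overmind's implementation: `Case`, `evaluate`, `accepts`, `optimize`, and the `propose` callback (which stands in for diagnosis plus candidate generation) are hypothetical names, and the acceptance rule shown (mean score must rise, no single case may get worse) is one assumed reading of "regression-aware criteria".

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# All names below are hypothetical; this is not Overmind's internal code.
Agent = Callable[[str], str]            # maps a test input to an output
Scorer = Callable[[str, str], float]    # (output, expected) -> score in [0, 1]
Scores = Dict[str, float]               # per-case scores keyed by input


@dataclass
class Case:
    input: str
    expected: str


def evaluate(agent: Agent, dataset: List[Case], scorer: Scorer) -> Scores:
    """Run the agent on every case and return a per-case score."""
    return {c.input: scorer(agent(c.input), c.expected) for c in dataset}


def accepts(baseline: Scores, candidate: Scores, max_regression: float = 0.0) -> bool:
    """Assumed regression-aware rule: the mean must improve and no single
    case may drop by more than max_regression."""
    mean_gain = (sum(candidate.values()) - sum(baseline.values())) / len(baseline)
    worst_drop = max(baseline[k] - candidate[k] for k in baseline)
    return mean_gain > 0 and worst_drop <= max_regression


def optimize(agent: Agent, dataset: List[Case], scorer: Scorer,
             propose: Callable[[Agent, Scores, int], List[Agent]],
             iterations: int = 3, n: int = 4) -> Tuple[Agent, Scores]:
    """Best-of-N loop: score N candidates, keep the best one that passes."""
    scores = evaluate(agent, dataset, scorer)
    for _ in range(iterations):
        candidates = propose(agent, scores, n)
        scored = [(c, evaluate(c, dataset, scorer)) for c in candidates]
        passing = [(c, s) for c, s in scored if accepts(scores, s)]
        if not passing:
            continue  # revert: nothing helped, keep the current agent
        agent, scores = max(passing, key=lambda cs: sum(cs[1].values()))
    return agent, scores


# Toy run: the "agent" should uppercase its input.
dataset = [Case("a", "A"), Case("b", "B")]
exact = lambda out, exp: 1.0 if out == exp else 0.0
propose = lambda agent, scores, n: [str.upper, lambda x: "A"][:n]
best, final = optimize(lambda x: x, dataset, exact, propose)
print(final)  # {'a': 1.0, 'b': 1.0}
```

In the toy run both candidates beat the baseline on the first iteration and the higher-scoring one is kept; on later iterations no candidate improves the mean, so the current agent is retained.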

     Your Python agent (registered entrypoint)
             overmind optimize <name>
  ┌────────────────────┴────────────────────┐
  │                                         │
  ▼                                         │
  Run agent on dataset ──▶ Traces + outputs │
  │                                         │
  ▼                                         │
  Score vs. eval                            │
  spec (+ policy)                           │
  │                                         │
  ▼                                         │
  Diagnose failures                         │
  │                                         │
  ▼                                         │
  Generate N candidate fixes                │
  │                                         │
  ▼                                         │
  Validate + re-score                       │
  │                                         │
  ▼                                         │
  Accept best / revert rest ────────────────┘
     optimized agent + report (console)

Requirements: Python 3.10+, Cursor or Claude Code, and an API key for at least one LLM provider (OpenAI or Anthropic).

  1. Install:
       pip install overmind
  2. Initialize (configures API keys and installs skills into your IDE):
       cd your-agent-project/
       overmind init
  3. Open Cursor or Claude Code and type these in the chat panel, in order:
       /overmind-register-agent path/to/your/agent.py
       /overmind-generate-spec-and-dataset my-agent
       /overmind-optimize-agent my-agent

Each skill reads your codebase, asks what it can’t infer, and handles the rest. Results are pushed to console.overmindlab.ai/agents.

See the Getting Started guide for requirements, what each step does, and next steps.

If you want to trace LLM calls from a running application — independently of the optimizer — Overmind ships Python and JavaScript SDKs. Call init() once and every LLM call is captured with model, inputs/outputs, latency, token counts, and cost. See How To Use Tracing, the Python SDK reference, or the JS/TS SDK reference.
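
To make the captured fields concrete, here is a minimal sketch of per-call tracing written as a plain Python decorator. It does not use or reproduce the Overmind SDK: `Span`, `traced`, `TRACES`, `fake_llm`, the `example-model` name, and the prices are all hypothetical, and whitespace word counts stand in for a real tokenizer.

```python
import time
from dataclasses import dataclass
from functools import wraps
from typing import Callable, List

# Hypothetical (input, output) USD prices per 1K tokens, for illustration only.
PRICE_PER_1K = {"example-model": (0.001, 0.002)}


@dataclass
class Span:
    """One captured LLM call, with the fields listed in the text above."""
    model: str
    prompt: str
    output: str
    latency_s: float
    input_tokens: int
    output_tokens: int
    cost_usd: float


TRACES: List[Span] = []  # in-memory sink; a real SDK would export these


def traced(model: str) -> Callable:
    """Wrap a prompt -> text function and record a Span for every call."""
    def decorator(call: Callable[[str], str]) -> Callable[[str], str]:
        @wraps(call)
        def wrapper(prompt: str) -> str:
            start = time.perf_counter()
            output = call(prompt)
            latency = time.perf_counter() - start
            # Word counts approximate token counts in this sketch.
            n_in, n_out = len(prompt.split()), len(output.split())
            p_in, p_out = PRICE_PER_1K[model]
            cost = (n_in * p_in + n_out * p_out) / 1000
            TRACES.append(Span(model, prompt, output, latency, n_in, n_out, cost))
            return output
        return wrapper
    return decorator


@traced("example-model")
def fake_llm(prompt: str) -> str:
    # Stand-in for a provider call so the example runs offline.
    return "echo: " + prompt


fake_llm("hello world")
print(TRACES[0].input_tokens, TRACES[0].output_tokens)  # 2 3
```

An SDK that needs only a single init() call would typically patch the provider client instead of asking you to decorate each function, but the record it produces per call has the same shape.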

New to Overmind? Start with Getting Started, then How to Use Overmind.

Tracing only (no optimization)? Go to How To Use Tracing, then the Python SDK or JS/TS SDK.

Exploring fine-tuning? See Fine-tuning (Beta) — enable tracing first, then contact us.

For supported providers, frameworks, and the OTLP endpoint, see Integrations.