# zerodiff-agent

English | [简体中文](https://unpkg.com/zerodiff-agent/README.zh-CN.md)

`zerodiff-agent` is a TypeScript library for building LLM agents. You create an `Agent`, give it an LLM runtime and tools, then call `agent.run(...)`.

## Capabilities

- LLM backends: OpenAI-compatible Chat Completions, OpenAI Responses API, and Anthropic Messages API
- Agent runtime: stateful sessions, streaming output, reasoning output, detailed run results, cancellation, and context compression
- Tools: Zod schemas, custom tool registries, default local coding tools, and hook control
- Models: model profiles, runtime switching, per-model endpoints, thinking controls, reasoning-effort controls, and billing metadata
- Configuration: file-based config with `loadAppConfig()` and conversion to `AgentConfig`
- Observability: token usage, cache usage, context-window usage, cost tracking, and callback hooks
- Skills: discovery and loading from `data/skills`
- Optional CLI: interactive local agent UI built with Ink

## Install

```bash
npm install zerodiff-agent openai
```

For Anthropic:

```bash
npm install zerodiff-agent @anthropic-ai/sdk
```

Requires Node.js 20+.

## Create An Agent

```ts
import OpenAI from "openai";
import { Agent } from "zerodiff-agent";

const agent = new Agent({
  llm: {
    provider: "openai",
    client: new OpenAI({ apiKey: process.env.OPENAI_API_KEY }),
  },
  model: "gpt-4o-mini",
});
```

## Ask A Question

```ts
const answer = await agent.run("Read package.json and summarize this project.");
console.log(answer);
```

Stream output:

```ts
await agent.run("Explain the source tree.", {
  onChunk: (text) => process.stdout.write(text),
});
```
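
If you also need the complete streamed text after the run, a small collector can wrap the `onChunk` callback. This is only a sketch: `createChunkCollector` is a hypothetical helper, not part of `zerodiff-agent`, and it assumes `onChunk` receives plain string fragments as in the example above.

```ts
// Hypothetical helper (not exported by zerodiff-agent): buffers streamed text
// while still forwarding each chunk to a sink such as process.stdout.
function createChunkCollector(sink: (text: string) => void = () => {}) {
  const chunks: string[] = [];
  return {
    // Pass this function as the `onChunk` option.
    onChunk(text: string): void {
      chunks.push(text);
      sink(text);
    },
    // Everything received so far, joined in arrival order.
    text(): string {
      return chunks.join("");
    },
  };
}
```

Pass `collector.onChunk` as the `onChunk` option and read `collector.text()` once the run resolves.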

Observe reasoning, tool, and usage events:

```ts
await agent.run("Inspect the workspace.", {
  onThinkingChunk: (text) => console.error(text),
  onToolCall: (name, args) => console.log("tool", name, args),
  onToolResult: (name, result) => console.log("result", name, result),
  onTurnCost: (cost) => console.log(cost),
  onContextUsage: (usage) => console.log(usage),
});
```

Keep conversation state:

```ts
const session = agent.createSession();

await agent.run("Find the main entry point.", { session });
await agent.run("Now explain why it is the entry point.", { session });
```

Manually compact a session:

```ts
const result = await agent.compressSession({
  session,
  keepRecent: 4,
  onContextCompression: (phase) => console.log(phase),
});

console.log(result.compressed, result.summaryTokens);
```
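
To make the `keepRecent` option more concrete, the sketch below shows one common compaction strategy: summarize older messages and keep the newest ones verbatim. It is not the library's implementation. The `Message` type, `compactMessages`, and the `summarize` callback are hypothetical, and the sketch assumes `keepRecent` counts messages, which this README does not specify.

```ts
// Hypothetical message shape, for illustration only.
type Message = { role: "system" | "user" | "assistant"; content: string };

// Replaces everything except the last `keepRecent` messages with a single
// summary message. `summarize` stands in for an LLM call.
function compactMessages(
  messages: Message[],
  keepRecent: number,
  summarize: (older: Message[]) => string,
): { compressed: boolean; messages: Message[] } {
  // Nothing to do when the history already fits within `keepRecent`.
  if (keepRecent < 0 || messages.length <= keepRecent) {
    return { compressed: false, messages };
  }
  const cut = messages.length - keepRecent;
  const older = messages.slice(0, cut);
  const recent = messages.slice(cut);
  const summary: Message = { role: "user", content: summarize(older) };
  return { compressed: true, messages: [summary, ...recent] };
}
```

With six messages and `keepRecent: 4`, this sketch returns five messages: one summary of the two oldest, followed by the four most recent.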

Inspect run results:

```ts
const result = await agent.run("Prepare a release checklist.", {
  onTurnCost: (cost) => console.log(cost),
  onContextUsage: (usage) => console.log(usage),
});

// Log the returned value to see what this run produced.
console.log(result);
```

Cancel a run:

```ts
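const controller = new AbortController();

// Abort after 10 seconds; any other trigger (such as a SIGINT handler) works too.
const timer = setTimeout(() => controller.abort(), 10_000);

try {
  await agent.run("Analyze the repository.", {
    signal: controller.signal,
  });
} finally {
  clearTimeout(timer);
}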
```
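
For intuition about what `signal` does, the sketch below shows cooperative cancellation with a standard `AbortSignal`: work proceeds step by step and stops once the signal has been aborted. `runSteps` is a hypothetical stand-in and does not describe how `agent.run` is implemented.

```ts
// Hypothetical stand-in for a multi-step run: executes steps in order and
// stops before the next step once the signal has been aborted.
function runSteps(steps: Array<() => void>, signal: AbortSignal): number {
  let completed = 0;
  for (const step of steps) {
    if (signal.aborted) break; // checked between steps, never mid-step
    step();
    completed += 1;
  }
  return completed;
}

const controller = new AbortController();
const completed = runSteps(
  [
    () => {},
    () => controller.abort(), // e.g. the user pressed Ctrl+C during step 2
    () => {},
  ],
  controller.signal,
);
console.log(completed, controller.signal.aborted); // 2 true
```

The third step never runs because the abort is observed at the next check, which is why long-running steps should also receive the signal where possible.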

## Configure Models

For simple use, `model` can be just a model id:

```ts
const agent = new Agent({ llm, model: "gpt-4o-mini" });
```

For multiple models, put full profiles in `models[]` and use `model` only to select the active id:

```ts
const agent = new Agent({
  llm,
  model: "gpt-5.5",
  models: [
    {
      id: "gpt-5.5",
      provider: "openai",
      baseUrl: "https://api.openai.com/v1/responses",
      controls: { reasoningEffort: "high" },
      capabilities: {
        thinking: true,
        reasoningEfforts: ["high", "xhigh"],
      },
    },
    {
      id: "claude-sonnet-4-5",
      provider: "anthropic",
    },
  ],
});

const session = agent.createSession();
agent.switchModel("claude-sonnet-4-5", session);
```

If an OpenAI-compatible model's `baseUrl` ends with `/responses`, the agent uses the OpenAI Responses API. Otherwise it uses Chat Completions.
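
The routing rule can be pictured as a simple suffix check. The sketch below uses a hypothetical `pickOpenAiApi` helper to mirror the rule as stated; the library's actual check may differ in details such as trailing slashes.

```ts
type OpenAiApi = "responses" | "chat-completions";

// Hypothetical helper mirroring the documented rule: a base URL ending in
// "/responses" selects the Responses API, anything else Chat Completions.
function pickOpenAiApi(baseUrl: string): OpenAiApi {
  return baseUrl.endsWith("/responses") ? "responses" : "chat-completions";
}

console.log(pickOpenAiApi("https://api.openai.com/v1/responses")); // responses
console.log(pickOpenAiApi("https://api.openai.com/v1")); // chat-completions
```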

Model profiles can include `provider`, `apiKey`, `baseUrl`, `stream`, `contextWindowTokens`, `controls`, `capabilities`, and `billing`.

## Use File Configuration

If you prefer a config file, use `loadAppConfig()`:

```ts
import { Agent, createAgentConfig, createLlmRuntime, loadAppConfig } from "zerodiff-agent";

const appConfig = loadAppConfig(process.cwd());
const llm = createLlmRuntime(appConfig);
const agent = new Agent(createAgentConfig(appConfig, llm));
```

Minimal `data/config/config.json`:

```json
{
  "llm": {
    "provider": "openai",
    "apiKey": "REPLACE_WITH_YOUR_LLM_API_SECRET",
    "baseUrl": "https://api.openai.com/v1",
    "model": "gpt-4o-mini",
    "models": [
      {
        "id": "gpt-4o-mini",
        "provider": "openai",
        "baseUrl": "https://api.openai.com/v1"
      }
    ],
    "maxOutputTokens": 8192,
    "stream": true,
    "controls": {}
  },
  "agent": {
    "maxSteps": 25
  }
}
```

Legacy top-level config fields are rejected. Use nested `llm` and `agent` fields.

`loadAppConfig()` also writes `data/config/config.example.json` with the current schema.

## Tools

The default agent includes local coding tools: shell, glob, grep, file read/write/edit/diff, todo, and skill loading.

Replace the tool set when you need a controlled surface:

```ts
import { Agent, defineTool } from "zerodiff-agent";
import { z } from "zod";

const getIssue = defineTool({
  name: "get_issue",
  description: "Fetch an issue by id.",
  schema: z.strictObject({
    id: z.number().int().positive(),
  }),
  async execute({ id }) {
    // `issueClient` is your own API client; it is not provided by zerodiff-agent.
    return { success: true, issue: await issueClient.get(id) };
  },
});

const agent = new Agent({
  llm,
  model: "gpt-4o-mini",
  toolsOverride: [getIssue],
});
```

Tools return JSON-serializable objects. Use `success: false` and `error` for expected failures.
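
As a sketch of that convention, the body of an `execute` function might delegate to a lookup like the one below. The `IssueResult` type, the in-memory `issues` map, and `lookupIssue` are all hypothetical; only the `success` and `error` field names come from this README.

```ts
// Assumed result shape: `success` plus either data or an `error` string.
type IssueResult =
  | { success: true; issue: { id: number; title: string } }
  | { success: false; error: string };

// In-memory stand-in for a real issue tracker client.
const issues = new Map<number, string>([[1, "Fix flaky test"]]);

// A missing issue is an expected failure, so it is returned as data rather
// than thrown. Both branches contain only JSON-serializable values.
function lookupIssue(id: number): IssueResult {
  const title = issues.get(id);
  if (title === undefined) {
    return { success: false, error: `Issue ${id} not found` };
  }
  return { success: true, issue: { id, title } };
}
```

Returning the failure as data lets the model read the error and decide what to do next, while genuinely unexpected problems can still surface as thrown exceptions.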

Run hooks:

```ts
const session = agent.createSession();

await agent.run("Inspect the repository.", {
  session,
  onPreToolUse: ({ toolName, args }) => {
    if (toolName === "shell" && String((args as { command?: unknown }).command).includes("rm -rf")) {
      return { action: "block", reason: "Refuse dangerous command" };
    }
  },
  onPreCompact: ({ messages }) => {
    // `saveFullContext` is an application-defined helper, e.g. one that archives the transcript.
    saveFullContext(messages);
  },
  onPostCompact: ({ messagesAfter }) => ({
    messages: [
      { role: "user", content: "Keep following the project constraints." },
      ...messagesAfter,
    ],
  }),
});

session.enqueueUserGuidance("After the previous response finishes, explain the risks first.");
```
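
The inline check in `onPreToolUse` can be pulled into a reusable guard, which makes it easier to test in isolation. In this sketch the input and return shapes follow the hook example above, while the type names, `deniedFragments`, and `guardShell` are hypothetical.

```ts
// Shapes follow the hook example above; the type names are hypothetical.
type PreToolUseInput = { toolName: string; args: unknown };
type BlockDecision = { action: "block"; reason: string };

// Illustrative denylist only. Substring matching is easy to bypass and is
// not a security boundary.
const deniedFragments = ["rm -rf", "mkfs", "shutdown"];

// Returns a block decision for dangerous shell commands, otherwise undefined
// so the tool call proceeds.
function guardShell({ toolName, args }: PreToolUseInput): BlockDecision | undefined {
  if (toolName !== "shell") return undefined;
  const command = (args as { command?: unknown } | null)?.command;
  if (typeof command !== "string") return undefined;
  const hit = deniedFragments.find((fragment) => command.includes(fragment));
  return hit ? { action: "block", reason: `Refuse dangerous command: ${hit}` } : undefined;
}
```

Pass it directly as `onPreToolUse: guardShell`. Checking `typeof command` avoids the `String(undefined)` coercion that the inline version relies on.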

## Skills

Skills are instruction packages under `data/skills`. A skill directory needs `SKILL.md` or `skill.md` with YAML frontmatter containing `description`.
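
To illustrate the frontmatter requirement, the sketch below checks a skill file's text for a `description` key. `readSkillDescription` is a hypothetical helper that uses a simplified line-based check rather than a YAML parser, so it only approximates what the library's loader accepts.

```ts
// Simplified frontmatter check for illustration; a real loader would use a
// YAML parser. Returns the `description` value, or undefined when the file
// has no frontmatter block or no non-empty description.
function readSkillDescription(markdown: string): string | undefined {
  const match = /^---\r?\n([\s\S]*?)\r?\n---(?:\r?\n|$)/.exec(markdown);
  if (!match) return undefined;
  for (const line of (match[1] ?? "").split(/\r?\n/)) {
    const value = /^description:\s*(.+)$/.exec(line)?.[1]?.trim();
    // Strip one pair of surrounding quotes, if present.
    if (value) return value.replace(/^["']|["']$/g, "");
  }
  return undefined;
}

// Example contents of a hypothetical data/skills/review/SKILL.md file.
const skill = [
  "---",
  "description: Review pull requests for risky changes",
  "---",
  "",
  "# Instructions",
].join("\n");

console.log(readSkillDescription(skill)); // Review pull requests for risky changes
```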

The default skill provider injects the skill catalog into the system prompt. The built-in `list_skills` and `load_skill` tools let the agent inspect and load skill instructions.

To leave the catalog out of the system prompt, set `listSkillsInPrompt` to `false`:

```ts
const agent = new Agent({
  llm,
  model: "gpt-4o-mini",
  listSkillsInPrompt: false,
});
```

## Optional CLI

The package also exports an Ink-based CLI:

```ts
import { createAgentConfig, createLlmRuntime, loadAppConfig } from "zerodiff-agent";
import { startCLI } from "zerodiff-agent/cli";

const appConfig = loadAppConfig(process.cwd());
startCLI(createAgentConfig(appConfig, createLlmRuntime(appConfig)));
```

CLI commands include `/models`, `/model <id>`, `/tokens`, `/messages`, `/prompt`, `/thinking`, and `/effort`.

## Public API

- `Agent`, `AgentSession`, `runLoop`
- `defineTool`, `createToolRegistry`, `createDefaultToolRegistry`, `coreTools`
- `loadAppConfig`, `createAgentConfig`, `createLlmRuntime`
- `discoverSkills`, `loadSkill`, `loadSkillByName`
- `AgentAbortError`, `isAbortError`, `throwIfAborted`
- `startCLI` from `zerodiff-agent/cli`
