---
name: integrate-llm-commands
description: >-
  Integrate LLM-powered commands into an existing CLI tool using
  agent-contracts DSL, agent-contracts-runtime SDK adapters, and
  cli-contracts x-agent metadata. Produces agent definitions, typed
  handoff schemas, context builders, orchestrator, formatter, and
  CLI command handlers. Use when adding LLM audit, proposal, or
  explanation commands to a CLI, integrating LLM features or agents,
  or converting a tool into an Agent-Native Toolchain.
---

# Integrate LLM Commands into a CLI Tool

Add LLM-powered commands (audit, propose, explain, etc.) to an existing
CLI tool using the agent-contracts / agent-contracts-runtime / cli-contracts
stack.

## Prerequisites

The target project must already have:

- A working CLI (Commander, yargs, or similar)
- `cli-contract.yaml` (or a plan to create one with cli-contracts)
- A Node.js / TypeScript codebase

## Workflow

### Phase 1: Design LLM commands

Identify which LLM commands to add. The canonical pattern has three types:

| Type | Purpose | Example |
|------|---------|---------|
| **audit** | Semantic review of project artifacts | `audit` — safety review of migrations |
| **propose** | Generate structured proposals | `propose-expand-contract` — decompose unsafe DDL |
| **explain** | Human-readable explanation of machine output | `explain` — translate lint/check JSON to prose |

For each command, decide:
- What domain-specific context to provide (files, lint results, schema, config)
- What structured output schema to return
- Whether it reads stdin, file arguments, or both

### Phase 2: Install dependencies

```bash
npm install --save-dev agent-contracts
npm install --save-dev agent-contracts-runtime zod
npm install --save-dev @openai/agents @cursor/sdk  # adapter SDKs
```

### Phase 3: Create DSL definitions

Create a `dsl/` directory with the files below. See [reference.md](reference.md)
for complete templates.

```
dsl/
├── {project}-dsl.yaml          # Main entry: system, agents, tasks, workflows, guardrails
├── agent-runtime.config.yaml   # Points to DSL + generated output dir
├── agents/{agent-name}.yaml    # Agent definition
├── tasks.yaml                  # Task definitions
└── handoff-types.yaml          # Request/result schemas
```

Key rules:
- Use `version: 1` and `system:` block in the main DSL
- **Handoff schemas**: prefer a `$ref` that points at `cli-contract.yaml#/components/schemas/AgentAuditResult`, which is the single source of truth (SSoT). If agent-contracts cannot resolve the `$ref` at generate time, inline the schema with a comment noting the SSoT location. Domain-specific result types (e.g., `ExpandContractProposal`) extend the base `AgentAuditResult` shape with additional fields
- Keep `can_read_artifacts: []` and `can_write_artifacts: []` empty unless an artifact registry is defined; references to non-existent artifact IDs fail validation
- One agent can serve multiple tasks/workflows if the domain is the same
- Add guardrails: `output-schema-conformance`, `no-{dangerous-action}`, `confidence-threshold`
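
Expressed as TypeScript types, the handoff-schema rule might look like the sketch below. The base field names (`summary`, `riskLevel`, `findings`, `recommendedActions`, `metadata`) are the ones this guide lists for `AgentAuditResult`. The field types, the members of `AgentFinding`, the `phases` field, and the `missingBaseFields` helper are assumptions made for illustration, not the actual cli-contracts schema.

```typescript
// Assumed value sets; check cli-contract.yaml for the real enums.
type RiskLevel = "low" | "medium" | "high";
type Severity = "info" | "warning" | "error" | "critical";

// Hypothetical finding shape (message plus optional location and recommendation).
interface AgentFinding {
  severity: Severity;
  message: string;
  location?: string;
  recommendation?: string;
}

// Base shape shared by all audit-style results.
interface AgentAuditResult {
  summary: string;
  riskLevel: RiskLevel;
  findings: AgentFinding[];
  recommendedActions: string[];
  metadata: { [key: string]: unknown };
}

// Domain type: adds fields alongside the base ones instead of replacing them.
interface ExpandContractProposal extends AgentAuditResult {
  phases: { name: string; statements: string[] }[]; // hypothetical domain field
}

const BASE_FIELDS = ["summary", "riskLevel", "findings", "recommendedActions", "metadata"];

// Lists the base fields that are absent from a result object.
function missingBaseFields(value: object): string[] {
  return BASE_FIELDS.filter((field) => !(field in value));
}
```

A check like `missingBaseFields` can run in a unit test against sample results to catch a domain type that dropped a base field.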

### Phase 4: Generate TypeScript from DSL

Add a script to `package.json`:

```json
{
  "scripts": {
    "dsl:generate": "agent-runtime generate --config dsl/agent-runtime.config.yaml"
  }
}
```

```bash
npm run dsl:generate
```

This produces `src/generated/dsl/` with typed registries and Zod schemas.

### Phase 5: Implement `src/agents/` module

Create these files:

| File | Responsibility |
|------|----------------|
| `types.ts` | `TaskId` union, `AgentConfig`, `AgentOptions`, `AgentRunResult` |
| `orchestrator.ts` | `createAdapter()`, `runAgentWorkflow()` — dynamic imports of runtime + adapters |
| `context-builder.ts` | `build{Command}Context()` — transforms CLI input into LLM prompt |
| `formatter.ts` | `formatResultText()`, `formatResultJson()`, `computeExitCode()` |
| `index.ts` | Re-exports |

Critical implementation details:

**orchestrator.ts**:
- Dynamic-import `agent-contracts-runtime` so the tool works without it installed (graceful degradation)
- Dynamic-import adapter packages (`/adapters/cursor-sdk`, `/adapters/openai-agents-sdk`, etc.)
- Import generated DSL registries (including `workflowRegistry`) and pass to `runWorkflow()` — NEVER use `runTask()` (low-level internal API)
- Support `--show-prompt` by returning the prompt without calling the LLM
- Use exit codes: 0 = success, 1 = error, 10 = findings, 11 = runtime missing, 12 = adapter error
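
The graceful-degradation and exit-code rules above can be sketched as follows. This is a minimal illustration, not the runtime's API: `loadOptional`, `runWithOptionalRuntime`, and the `invoke` callback are hypothetical names, and `invoke` stands in for whatever code creates the adapter and calls `runWorkflow()`. Mapping every `invoke` failure to exit code 12 is also an assumption.

```typescript
// Exit codes as listed in this guide.
const EXIT = {
  success: 0,
  error: 1,
  findings: 10,
  runtimeMissing: 11,
  adapterError: 12,
} as const;

interface RunOutcome {
  exitCode: number;
  output: string;
}

// Resolves to the module namespace, or null when the import fails
// (typically because the optional package is not installed).
async function loadOptional(specifier: string): Promise<object | null> {
  try {
    return await import(specifier);
  } catch {
    return null;
  }
}

// `invoke` is a hypothetical stand-in for the code that builds an adapter and
// calls runWorkflow(); its real arguments depend on agent-contracts-runtime.
async function runWithOptionalRuntime(
  prompt: string,
  showPrompt: boolean,
  invoke: (runtime: object, prompt: string) => Promise<string>,
  runtimeSpecifier: string = "agent-contracts-runtime",
): Promise<RunOutcome> {
  // --show-prompt: return the prompt and skip the LLM call entirely.
  if (showPrompt) return { exitCode: EXIT.success, output: prompt };

  const runtime = await loadOptional(runtimeSpecifier);
  if (runtime === null) {
    return {
      exitCode: EXIT.runtimeMissing,
      output: `${runtimeSpecifier} is not installed; LLM commands are unavailable.`,
    };
  }

  try {
    return { exitCode: EXIT.success, output: await invoke(runtime, prompt) };
  } catch (err) {
    return { exitCode: EXIT.adapterError, output: String(err) };
  }
}
```

Because the import happens inside the function, the deterministic commands of the CLI keep working when the runtime package is absent; only the LLM commands report exit code 11.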

**context-builder.ts**:
- Build markdown-structured prompts with domain DATA only: `# Request Title`, `## Configuration`, `## Target Data`
- NEVER include agent instructions (evaluation criteria, rules, anti-patterns, output format). These belong in the DSL agent definition and are injected by `buildTaskPrompt` / `runWorkflow`
- Include project-specific context (lint results, config, related artifacts)
- Filter large context to relevant portions (e.g., only referenced tables from schema dump)
- Cap context size (e.g., 16KB) to avoid truncation with smaller models
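
As a rough illustration of these rules, a context builder could look like the sketch below. The `AuditInput` shape, the `## Related Schema` heading, the substring-based table filter, and the truncation marker are all assumptions; the cap counts characters as an approximation of 16KB, which is exact only for ASCII text.

```typescript
// Hypothetical input shape; a real command passes its own domain data.
interface AuditInput {
  title: string;
  config: { [key: string]: string };
  target: string; // e.g. the migration SQL under review
  schemaDump: { [table: string]: string }; // table name -> DDL
}

// Character count used as a rough proxy for 16KB (exact for ASCII-only text).
const MAX_CONTEXT_CHARS = 16 * 1024;
const TRUNCATION_MARKER = "\n[context truncated]";

function buildAuditContext(input: AuditInput): string {
  const configLines = Object.keys(input.config)
    .sort()
    .map((key) => `- ${key}: ${input.config[key]}`);

  // Keep only tables the target mentions. Substring matching is naive and
  // can over-match (e.g. "users" inside "superusers").
  const relatedSchema = Object.keys(input.schemaDump)
    .filter((table) => input.target.indexOf(table) !== -1)
    .map((table) => input.schemaDump[table]);

  // Data only: no evaluation criteria, rules, or output-format instructions.
  const prompt = [
    `# ${input.title}`,
    "",
    "## Configuration",
    ...configLines,
    "",
    "## Target Data",
    input.target,
    "",
    "## Related Schema",
    ...relatedSchema,
  ].join("\n");

  if (prompt.length <= MAX_CONTEXT_CHARS) return prompt;
  return prompt.slice(0, MAX_CONTEXT_CHARS - TRUNCATION_MARKER.length) + TRUNCATION_MARKER;
}
```

Truncating from the end is the simplest policy; a real builder may prefer to drop the least relevant sections first so the target data always survives.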

**formatter.ts**:
- Map severity to icons/prefixes
- Format findings as structured text with location + recommendation
- JSON mode outputs raw structured data
- Derive the exit code by comparing finding severity against the `--fail-on` threshold
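
A minimal formatter along these lines might look as follows. The `Finding` shape, the `info` severity level, the text prefixes, and the function signatures are assumptions; only the function names and exit code 10 for findings come from this guide.

```typescript
// Levels assumed from the --fail-on values, plus a non-failing "info" level.
type Severity = "info" | "warning" | "error" | "critical";

interface Finding {
  severity: Severity;
  message: string;
  location?: string; // e.g. "migrations/0042_drop_column.sql:12"
  recommendation?: string;
}

const SEVERITY_RANK: Record<Severity, number> = { info: 0, warning: 1, error: 2, critical: 3 };
const SEVERITY_PREFIX: Record<Severity, string> = {
  info: "[info]",
  warning: "[warn]",
  error: "[error]",
  critical: "[CRIT]",
};

function formatFinding(finding: Finding): string {
  const lines = [`${SEVERITY_PREFIX[finding.severity]} ${finding.message}`];
  if (finding.location) lines.push(`  at: ${finding.location}`);
  if (finding.recommendation) lines.push(`  fix: ${finding.recommendation}`);
  return lines.join("\n");
}

function formatResultText(findings: Finding[]): string {
  if (findings.length === 0) return "No findings.";
  return findings.map(formatFinding).join("\n\n");
}

// JSON mode: emit the structured data unchanged.
function formatResultJson(findings: Finding[]): string {
  return JSON.stringify({ findings }, null, 2);
}

// Returns 10 ("findings") when any finding is at or above the threshold, else 0.
function computeExitCode(findings: Finding[], failOn: Severity): number {
  const threshold = SEVERITY_RANK[failOn];
  return findings.some((f) => SEVERITY_RANK[f.severity] >= threshold) ? 10 : 0;
}
```

With `--fail-on error`, a run that reports only warnings still prints them but exits 0, so CI can surface advisory findings without blocking.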

### Phase 6: Add CLI command handlers

For each LLM command, create `src/commands/{command}.ts`:

```typescript
export async function command{Name}(config, target, opts) {
  const context = await build{Name}Context(target, config);
  const result = await runAgentWorkflow(context, "{workflow-id}", agentConfig, agentOpts);
  // format + output + exit code
}
```

Register in CLI with the **cli-contracts standard LLM command options**:
- `--adapter <name>` — LLM adapter (cursor, openai, gemini, claude, mock)
- `--model <name>` — Model name to pass to the adapter
- `--show-prompt` — Output the constructed prompt without calling the LLM API
- `--fail-on <level>` — Minimum severity that causes a non-zero exit (warning, error, critical)
- `--output <file>` / `-o` — Write result to a file instead of stdout
- `--report-format <fmt>` — Output format for the report (json, text, yaml). Note: LLM commands use `--report-format`, NOT `--format`. `--format` is reserved for core deterministic commands
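
Whatever parser registers these flags, the parsed values can be validated in one place before the workflow runs. The sketch below assumes the parser yields camelCase keys (as Commander does for `--show-prompt`, `--fail-on`, and `--report-format`). `normalizeLlmOptions` and its defaults (`mock`, `error`, `text`) are hypothetical and not part of cli-contracts.

```typescript
type Adapter = "cursor" | "openai" | "gemini" | "claude" | "mock";
type FailOn = "warning" | "error" | "critical";
type ReportFormat = "json" | "text" | "yaml";

interface LlmCommandOptions {
  adapter: Adapter;
  model?: string;
  showPrompt: boolean;
  failOn: FailOn;
  output?: string;
  reportFormat: ReportFormat;
}

const ADAPTERS: readonly Adapter[] = ["cursor", "openai", "gemini", "claude", "mock"];
const FAIL_ON: readonly FailOn[] = ["warning", "error", "critical"];
const REPORT_FORMATS: readonly ReportFormat[] = ["json", "text", "yaml"];

// Returns `value` if it is one of `allowed`, `fallback` if it is absent, else throws.
function oneOf<T extends string>(flag: string, value: unknown, allowed: readonly T[], fallback: T): T {
  if (value === undefined) return fallback;
  if (allowed.indexOf(value as T) === -1) {
    throw new Error(`${flag} must be one of: ${allowed.join(", ")}`);
  }
  return value as T;
}

function asString(value: unknown): string | undefined {
  return typeof value === "string" ? value : undefined;
}

// Defaults ("mock", "error", "text") are assumptions chosen for this sketch.
function normalizeLlmOptions(raw: { [key: string]: unknown }): LlmCommandOptions {
  if (raw["format"] !== undefined) {
    throw new Error("LLM commands use --report-format; --format is reserved for deterministic commands");
  }
  return {
    adapter: oneOf("--adapter", raw["adapter"], ADAPTERS, "mock"),
    model: asString(raw["model"]),
    showPrompt: raw["showPrompt"] === true,
    failOn: oneOf("--fail-on", raw["failOn"], FAIL_ON, "error"),
    output: asString(raw["output"]),
    reportFormat: oneOf("--report-format", raw["reportFormat"], REPORT_FORMATS, "text"),
  };
}
```

Rejecting `--format` explicitly turns a silent mix-up between the two flags into an immediate, explainable error.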

### Phase 7: Update cli-contract.yaml

Add `x-agent` metadata to each command:

```yaml
x-agent:
  riskLevel: low|medium|high
  requiresConfirmation: true|false
  idempotent: true|false
  sideEffects: [network]
  sideEffectNote: >-
    Network calls to LLM provider when adapter is not mock.
    Filesystem write only when --output is specified.
  safeDryRunOption: show-prompt
  expectedDurationMs: 120000
  retryableExitCodes: [1, 12]
```

For destructive (non-LLM) commands, also add:
- `reversible: true|false`
- `rollbackGuidance: "..."` (if not reversible)
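
To show why these fields matter, here is one way a calling agent might consume them. The `XAgentMetadata` type mirrors the keys in the YAML above, but which keys are optional is an assumption. `shouldRetry` and `dryRunArgs` are hypothetical helpers, and both the idempotency requirement and the mapping from `safeDryRunOption` to a `--` flag are interpretations rather than documented cli-contracts semantics.

```typescript
// Mirrors the x-agent keys shown in this guide; optional markers are assumptions.
interface XAgentMetadata {
  riskLevel: "low" | "medium" | "high";
  requiresConfirmation: boolean;
  idempotent: boolean;
  sideEffects: string[];
  safeDryRunOption?: string;
  expectedDurationMs?: number;
  retryableExitCodes?: number[];
}

// Retry only when the exit code is declared retryable, the command is
// idempotent, and attempts remain. Requiring idempotency is a conservative
// assumption made for this sketch.
function shouldRetry(meta: XAgentMetadata, exitCode: number, attempt: number, maxAttempts: number): boolean {
  const retryable = (meta.retryableExitCodes || []).indexOf(exitCode) !== -1;
  return retryable && meta.idempotent && attempt < maxAttempts;
}

// Builds the argument list for a dry run, assuming safeDryRunOption names a
// long flag without its leading dashes (show-prompt -> --show-prompt).
function dryRunArgs(meta: XAgentMetadata, args: string[]): string[] | null {
  if (!meta.safeDryRunOption) return null;
  return args.concat(`--${meta.safeDryRunOption}`);
}
```

Under this reading, exit code 10 (findings) is never retried because it is a valid result rather than a transient failure.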

### Phase 8: Update README

Follow this structure — concise mention high up, detailed section after mechanical features:

**1. Design Philosophy** — Add an "Agent-native" bullet:

> **Agent-native**: Domain-specific semantic reasoning is encapsulated
> inside the toolchain itself. Higher-level agents do not need to know
> every {domain-specific rule} — they invoke {tool} and consume
> structured findings.

**2. Commands** — Split into "Deterministic Commands" and "LLM-Powered Commands" subsections. Add safety note under LLM commands:

> LLM-powered commands are read-only by default. `audit` and `explain`
> do not modify files or state. `propose-*` produces a proposal;
> generated output should be reviewed before use. LLM commands do not
> replace deterministic gates — they are an additional semantic review
> layer.

**3. CI Integration** — Add commented-out optional audit step in workflow YAML.

**4. Agent-Native Toolchain** (after Comparison / mechanical features) — Include these subsections:

- **Intro**: "{Tool} is designed for development workflows where AI agents are first-class participants. It encapsulates domain-specific semantic reasoning inside the toolchain itself, returning structured results that humans, CI systems, and AI agents consume in the same way."
- **Deterministic checks first**
- **Semantic audit inside the toolchain** — list each LLM command and what domain reasoning it performs
- **Structured findings** — "Results conform to typed schemas such as `{ProjectResult}`. Audit-style results are compatible with the common `AgentAuditResult` / `AgentFinding` shape so that higher-level workflow agents can aggregate findings across toolchains."
- **Tool-owned domain knowledge**
- **Agent-readable interface** — reference cli-contract.yaml
- **LLM Adapter Configuration** — table with "runtime default" (not pinned model names), `--model` for pinning

**5. Technology Stack** — Add agent-contracts-runtime, agent-contracts, cli-contracts rows.

### Phase 9: Validate

```bash
npm run build
npm test
npx {tool} {llm-command} --show-prompt       # verify prompt construction
npx {tool} {llm-command} --adapter mock       # verify with mock adapter
npx {tool} {llm-command} --adapter openai     # real LLM test
```

Run `agent-contracts audit --config {project}-audit.config.yaml` to validate DSL completeness.

## Common Pitfalls

- **$ref resolution in handoff-types.yaml**: agent-contracts may not resolve `$ref` to external files like `cli-contract.yaml` at generate time. Prefer `$ref` (as cli-contracts does), but fall back to inlining with a comment noting the SSoT location
- **`--format` vs `--report-format`**: LLM commands use `--report-format` (json/text/yaml). `--format` is for core deterministic commands only. Do not mix them
- **Large context → truncation**: Filter context to relevant portions; cap at 16KB
- **Missing `version: 1` + `system:` block**: without them, the DSL generator produces numbered files instead of named ones
- **`can_read_artifacts` referencing non-existent IDs**: Keep the list empty (`[]`) if no artifact registry is defined
- **Missing workflows for tasks**: Every task needs a `workflow:` field; define all workflows in the main DSL
- **Handoff schema drift**: Domain-specific result types must extend the base `AgentAuditResult` shape (summary, riskLevel, findings, recommendedActions, metadata). Add domain fields alongside, not replacing base fields