# Production-Grade AIWG Guide

This guide documents the production-grade improvements to AIWG based on academic research analysis.

## Research Foundation

AIWG production-grade features are grounded in peer-reviewed research:

- **REF-001**: Bandara et al. (2024) ["Production-Grade Agentic AI Workflows"](#research-production-workflows) (arxiv 2512.08769)
- **REF-002**: Roig (2025) ["How Do LLMs Fail In Agentic Scenarios?"](#research-failure-modes) (arxiv 2512.07497v2)

## Agent Design Bible

The [Agent Design Bible](#ref-agent-design) defines 10 Golden Rules for building production-grade agents.

### Quick Reference

| Rule | Name | Key Principle |
|------|------|---------------|
| 1 | Single Responsibility | One agent, one job |
| 2 | Minimal Tools | 0-3 tools per agent |
| 3 | Explicit I/O | Clear inputs and outputs |
| 4 | Grounding First | Verify before acting |
| 5 | Escalate Uncertainty | Never guess silently |
| 6 | Scoped Context | Filter irrelevant data |
| 7 | Recovery-First | Design for failure |
| 8 | Model Tier | Match complexity to task |
| 9 | Parallel-Ready | Enable concurrent execution |
| 10 | Observable | Emit structured logs |

### Agent Linting

Validate agents against the 10 rules:

```bash
# Lint all framework agents
aiwg lint agents

# Lint specific directory with verbose output
aiwg lint agents .claude/agents/ --verbose

# CI-friendly JSON output
aiwg lint agents --json --strict
```

### Agent Scaffolding

Create new agents using validated templates:

```bash
# Simple agent (haiku, minimal tools)
aiwg add-agent code-analyzer --to my-addon --template simple

# Complex agent (sonnet, multiple tools)
aiwg add-agent architecture-reviewer --to my-addon --template complex

# Orchestrator (opus, coordinates other agents)
aiwg add-agent workflow-coordinator --to my-addon --template orchestrator
```

## LLM Failure Mode Mitigations

REF-002 identifies four failure archetypes. AIWG provides specific mitigations:

### Archetype 1: Premature Action Without Grounding

**Problem**: Agent guesses database schema instead of inspecting it.

**Mitigation**: Rule 4 (Grounding First)

```markdown
## Grounding Protocol

Before external operations:
1. List available resources (files, tables, APIs)
2. Verify target exists
3. Inspect structure before manipulation
4. NEVER assume schema from similar names
```

### Archetype 2: Over-Helpfulness Under Uncertainty

**Problem**: Agent substitutes similar entity when target not found.

**Mitigation**: Rule 5 (Escalate Uncertainty)

```markdown
## Uncertainty Protocol

When input cannot be resolved:
1. List candidates with confidence scores
2. Explain why no exact match
3. Ask user to select or provide more context
4. NEVER silently substitute
```

### Archetype 3: Distractor-Induced Context Pollution

**Problem**: Irrelevant data in context causes errors ("Chekhov's gun" effect).

**Mitigation**: Context Curator Addon

```bash
# Deploy context curator
aiwg use sdlc  # Includes context-curator addon
```

The Context Curator agent scores context as:
- **RELEVANT**: Directly needed for current task
- **PERIPHERAL**: May be useful, keep accessible
- **DISTRACTOR**: Remove from active context

### Archetype 4: Fragile Execution Under Load

**Problem**: Agent enters loops, loses coherence, or makes malformed tool calls.

**Mitigation**: Resilience Protocol

```markdown
## PAUSE→DIAGNOSE→ADAPT→RETRY→ESCALATE

1. **PAUSE**: Stop, preserve state
2. **DIAGNOSE**: Classify error type
3. **ADAPT**: Choose different approach
4. **RETRY**: Max 3 adapted attempts
5. **ESCALATE**: Structured report to user
```

Loop detection triggers when:
- Same tool called 3+ times consecutively
- Same error occurs 2+ times
- Identical outputs from different attempts

## @-Mention Traceability

Claude Code 2.0.43 fixed @-mention loading for nested files. AIWG leverages this for live traceability.

### Conventions

```
# Requirements
@.aiwg/requirements/UC-{NNN}-{slug}.md        # Use cases
@.aiwg/requirements/NFR-{CAT}-{NNN}.md        # Non-functional

# Architecture
@.aiwg/architecture/adrs/ADR-{NNN}-{slug}.md  # Decisions

# Security
@.aiwg/security/TM-{NNN}.md                   # Threats
@.aiwg/security/controls/{id}.md              # Controls

# Code
@src/{path}                                    # Source
@test/{path}                                   # Tests
```

### Code Integration

Add @-mentions to file headers:

```typescript
/**
 * @file Authentication Service
 * @implements @.aiwg/requirements/UC-003-user-auth.md
 * @architecture @.aiwg/architecture/adrs/ADR-005-jwt-strategy.md
 * @security @.aiwg/security/controls/authn-001.md
 * @tests @test/integration/auth.test.ts
 */
```

### Wiring Utilities

```bash
# Analyze codebase and suggest @-mentions
aiwg wire-mentions --dry-run

# Apply high-confidence suggestions
aiwg wire-mentions --auto

# Validate all @-mentions resolve
aiwg validate-mentions --strict

# Lint for style consistency
aiwg mention-lint --fix
```

## Hooks

AIWG hooks integrate with Claude Code 2.0.43+ lifecycle events.

### Trace Hook

Captures agent execution history for debugging and recovery.

```javascript
// .claude/hooks/aiwg-trace.js
// Automatically logs:
// - SubagentStart: agent_id, type, timestamp
// - SubagentStop: agent_id, transcript_path, duration
```

View traces:

```bash
# Tree view
node ~/.local/share/ai-writing-guide/agentic/code/addons/aiwg-hooks/scripts/trace-viewer.mjs tree

# Timeline view
node ~/.local/share/ai-writing-guide/agentic/code/addons/aiwg-hooks/scripts/trace-viewer.mjs timeline
```

### Permission Hook

Auto-approve trusted operations to reduce prompts:

```javascript
// .claude/hooks/aiwg-permissions.js
// Auto-approves:
// - Write to .aiwg/**
// - Read from ai-writing-guide/**
// - Git operations on AIWG branches
```

### Session Hook

Manage named sessions for workflow persistence:

```bash
# Generate session name
node aiwg-session.js suggest inception-to-elaboration
# Output: aiwg-inception-to-elaboration-2025-01-15-1030

# Record session
node aiwg-session.js record aiwg-my-session --workflow security-review

# List recent sessions
node aiwg-session.js list
```

## Prompt Registry

Import prompts directly into CLAUDE.md using @-imports.

### Available Prompts

| Path | Purpose |
|------|---------|
| `prompts/core/orchestrator.md` | Claude orchestrator guidance |
| `prompts/core/multi-agent-pattern.md` | Primary→Reviewers→Synthesizer |
| `prompts/core/consortium-pattern.md` | Multi-expert coordination |
| `prompts/agents/design-rules.md` | Condensed 10 Golden Rules |
| `prompts/reliability/decomposition.md` | Task decomposition templates |
| `prompts/reliability/parallel-hints.md` | Parallel execution patterns |
| `prompts/reliability/resilience.md` | Recovery protocol |

### Usage

In your project's CLAUDE.md:

```markdown
## Orchestration Guidelines

@~/.local/share/ai-writing-guide/agentic/code/addons/aiwg-utils/prompts/core/orchestrator.md

## Multi-Agent Pattern

@~/.local/share/ai-writing-guide/agentic/code/addons/aiwg-utils/prompts/core/multi-agent-pattern.md
```

## Deploy Generators

Generate production-ready deployment configurations.

```bash
# Docker (multi-stage, non-root, health checks)
/deploy-gen docker --app-name myapp --port 3000

# Kubernetes (SecurityContext, probes, resources)
/deploy-gen k8s --app-name myapp --port 3000

# Docker Compose
/deploy-gen compose --app-name myapp --port 3000
```

Generated configurations include:
- Multi-stage builds (deps → builder → runner)
- Non-root user execution
- Health check endpoints
- Resource limits
- Security contexts
- Anti-affinity rules (K8s)

## Evals Framework

Automated testing for agent behavior against failure archetypes.

### Running Evals

```bash
# Test agent against grounding scenario
/eval-agent my-agent --category archetype --scenario grounding-test

# Test agent against distractor scenario
/eval-agent my-agent --category archetype --scenario distractor-test

# Test recovery behavior
/eval-agent my-agent --category archetype --scenario recovery-test
```

### Scenario Types

| Scenario | Tests | Pass Criteria |
|----------|-------|---------------|
| grounding-test | Archetype 1 | Agent inspects before acting |
| distractor-test | Archetype 3 | Agent ignores irrelevant data |
| recovery-test | Archetype 4 | Agent recovers from errors |

### Creating Custom Scenarios

```yaml
# .aiwg/evals/my-scenario.yaml
name: custom-validation
category: archetype
description: Test custom validation behavior

setup:
  files:
    - path: target.md
      content: |
        # Target Document
        This is the correct target.

validation:
  type: output_contains
  expected: "target.md"
  pass_threshold: 1.0
```

## Agent Personas

Pre-configured agents for common workflows.

### Available Personas

| Persona | Focus | Model | Permissions |
|---------|-------|-------|-------------|
| `aiwg-orchestrator` | Full SDLC orchestration | opus | full |
| `aiwg-reviewer` | Code review | sonnet | write-artifacts |
| `aiwg-security` | Security audit | sonnet | read-only |
| `aiwg-writer` | Documentation | sonnet | write-artifacts |

### Usage

```bash
# Launch with persona
claude --agent aiwg-orchestrator

# Or via AIWG CLI
aiwg --persona orchestrator
```

## Multi-Agent Patterns

### Primary → Reviewers → Synthesizer

Standard pattern for document generation with review:

```
Primary Author (opus) → Creates draft
        ↓
Parallel Reviewers (sonnet) → Independent reviews
        ↓
Synthesizer (sonnet) → Merges feedback
        ↓
Archive → .aiwg/[category]/
```

**Key**: Launch reviewers in SINGLE message for true parallelism.

### Consortium Pattern

Multi-expert coordination with trade-off documentation:

```
Coordinator (opus)
    ↓
Parallel Experts (sonnet) → Independent analysis
    ↓
Trade-off Matrix → Document disagreements
    ↓
Synthesis → Majority + dissent record
```

## Metrics & Targets

Based on REF-002 benchmarks:

| Metric | Target | Source |
|--------|--------|--------|
| Grounding compliance | >90% | Archetype 1 |
| Entity substitution rate | <5% | Archetype 2 |
| Distractor error reduction | ≥50% | Archetype 3 |
| Recovery success rate | ≥80% | Archetype 4 |
| Parallel utilization | >60% | REF-001 BP-9 |

## Quick Start

### New Project

```bash
# Create project with SDLC framework
aiwg -new my-project
cd my-project

# Verify agent linting passes
aiwg lint agents

# Wire @-mentions
aiwg wire-mentions --dry-run
```

### Existing Project

```bash
# Deploy SDLC framework
aiwg use sdlc

# Setup hooks
cp ~/.local/share/ai-writing-guide/agentic/code/addons/aiwg-hooks/hooks/*.js .claude/hooks/

# Generate intake from codebase
/intake-from-codebase .

# Wire @-mentions
aiwg wire-mentions --interactive
```

## References

- [Agent Design Bible](#ref-agent-design) - 10 Golden Rules
- [REF-001](#research-production-workflows) - Production-Grade Research
- [REF-002](#research-failure-modes) - LLM Failure Modes
- [SDLC Framework](#quickstart-sdlc) - Complete lifecycle