# LLM JSON Fix

[![npm version](https://img.shields.io/npm/v/llm-json-fix.svg)](https://www.npmjs.com/package/llm-json-fix)
[![License: MIT](https://img.shields.io/badge/License-MIT-blue.svg)](https://opensource.org/licenses/MIT)
[![npm downloads](https://img.shields.io/npm/dm/llm-json-fix.svg)](https://www.npmjs.com/package/llm-json-fix)

A comprehensive library for repairing malformed JSON outputs from Large Language Models (LLMs).

## Why This Library?

JSON output from LLMs is powerful but notoriously inconsistent. Even a 1% rate of malformed responses can cause system failures that are difficult to debug, because the error often surfaces far from the call that produced the bad JSON. This library automatically identifies and repairs common issues in LLM-generated JSON, making your AI integrations more robust.

## Features

- **LLM-Specific Repairs**: Handles unique issues in AI-generated content
- **Markdown Cleanup**: Removes code blocks, explanatory text, and other non-JSON content
- **Streaming Support**: Processes very large documents in buffered chunks, keeping memory usage low
- **Schema Flexibility**: Works with any JSON structure
- **Model-Specific Optimizations**: Can be configured for OpenAI, Anthropic, or other LLMs

## Installation

```bash
# Using npm
npm install llm-json-fix

# Using yarn
yarn add llm-json-fix

# Using pnpm
pnpm add llm-json-fix
```

### Requirements

- Node.js 14.0.0 or higher
- Works in both CommonJS and ESM environments

## Basic Usage

```javascript
import { fixLLMJson } from 'llm-json-fix';

// Fix malformed JSON from an LLM
const response = `Here's the JSON you requested: \`\`\`json
{
  name: "John",
  items: ['apple', 'banana', ...],
  active: True
}
\`\`\``;

// Repair the JSON
const fixedJson = fixLLMJson(response);

// Use the fixed JSON
const data = JSON.parse(fixedJson);
console.log(data);
```

## Issues Fixed

### Incomplete JSON Structures
- Truncated outputs where closing brackets are missing
- Unfinished arrays or objects due to token limits
- Partial final elements
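
To make the truncation case concrete, here is a minimal sketch of one way to close structures that were cut off mid-output. It is an illustration only, not the library's implementation: `closeTruncatedJson` is a hypothetical name, and the sketch tracks double-quoted strings only and does not handle a dangling key or trailing comma.

```javascript
// Hypothetical helper, not part of llm-json-fix: appends whatever closing
// quotes and brackets a truncated JSON text is still missing.
function closeTruncatedJson(text) {
  const closers = [];    // closers still owed, innermost last
  let inString = false;  // currently inside a double-quoted string?
  let escaped = false;   // previous character was a backslash?

  for (const ch of text) {
    if (inString) {
      if (escaped) escaped = false;
      else if (ch === '\\') escaped = true;
      else if (ch === '"') inString = false;
      continue; // brackets inside strings are ignored
    }
    if (ch === '"') inString = true;
    else if (ch === '{') closers.push('}');
    else if (ch === '[') closers.push(']');
    else if (ch === '}' || ch === ']') closers.pop();
  }

  let result = text;
  if (inString) {
    // Drop a dangling backslash so the added quote is not escaped.
    if (escaped) result = result.slice(0, -1);
    result += '"';
  }
  // Unwind the stack so the innermost structure is closed first.
  return result + closers.reverse().join('');
}

console.log(closeTruncatedJson('{"a": [1, 2')); // {"a": [1, 2]}
```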

### Quote Inconsistencies
- Mixing of single and double quotes
- Unclosed quotes
- Incorrectly escaped quotes within strings
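
As a rough illustration of the quote problems above, the following sketch rewrites single-quoted strings as double-quoted ones. `normalizeQuotes` is a hypothetical helper, not the library's API, and it assumes the input is already JSON-like (an apostrophe in surrounding prose would confuse it).

```javascript
// Hypothetical helper, not part of llm-json-fix: converts single-quoted
// strings to double-quoted strings so JSON.parse can accept them.
function normalizeQuotes(text) {
  let out = '';
  let quote = null; // the quote character that opened the current string

  for (let i = 0; i < text.length; i++) {
    const ch = text[i];
    if (quote === null) {
      if (ch === '"' || ch === "'") {
        quote = ch;
        out += '"'; // always open with a double quote
      } else {
        out += ch;
      }
      continue;
    }
    if (ch === '\\' && i + 1 < text.length) {
      const next = text[++i];
      // \' is not a valid JSON escape, so emit the bare apostrophe instead.
      out += next === "'" ? "'" : '\\' + next;
    } else if (ch === quote) {
      quote = null;
      out += '"'; // always close with a double quote
    } else if (ch === '"') {
      out += '\\"'; // bare double quote inside a single-quoted string
    } else {
      out += ch;
    }
  }
  return out;
}

console.log(normalizeQuotes("{'name': 'John', 'active': true}"));
// {"name": "John", "active": true}
```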

### Syntax Violations
- Property names without quotes
- Extra or missing commas
- Trailing commas (valid in JavaScript but invalid in JSON)
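
Trailing commas are a good example of why naive find-and-replace is risky: a comma followed by `}` can also appear inside a string value. The sketch below, which is an assumption about one reasonable approach and not the library's code, skips string contents while removing trailing commas. `removeTrailingCommas` is a hypothetical name, and only double-quoted strings are tracked.

```javascript
// Hypothetical helper, not part of llm-json-fix: removes commas that are
// followed (after optional whitespace) by a closing brace or bracket.
function removeTrailingCommas(text) {
  let out = '';
  let inString = false;
  let escaped = false;

  for (let i = 0; i < text.length; i++) {
    const ch = text[i];
    if (inString) {
      out += ch; // string contents are copied through untouched
      if (escaped) escaped = false;
      else if (ch === '\\') escaped = true;
      else if (ch === '"') inString = false;
      continue;
    }
    if (ch === '"') {
      inString = true;
      out += ch;
      continue;
    }
    if (ch === ',') {
      // Look ahead past whitespace; drop the comma if a closer follows.
      let j = i + 1;
      while (j < text.length && /\s/.test(text[j])) j++;
      if (text[j] === '}' || text[j] === ']') continue;
    }
    out += ch;
  }
  return out;
}

const cleaned = removeTrailingCommas('{"a": [1, 2, ], "b": "x,}", }');
console.log(JSON.parse(cleaned)); // parses; the comma inside "x,}" is kept
```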

### Markdown Artifacts
- Code block markers (```) included in the JSON
- Explanation text mixed with JSON output
- Markdown formatting within JSON strings
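
For illustration, here is a small sketch of how a JSON candidate might be pulled out of a Markdown-wrapped response. It is not how the library itself works; `extractJsonCandidate` is a hypothetical helper that prefers the first fenced block and otherwise trims text around the outermost brackets.

```javascript
// Built at runtime so this example nests cleanly inside a Markdown document.
const FENCE = '`'.repeat(3);
const FENCED_BLOCK = new RegExp(FENCE + '[a-zA-Z]*[ \\t]*\\n?([\\s\\S]*?)' + FENCE);

// Hypothetical helper, not part of llm-json-fix: returns the substring that
// most plausibly holds the JSON payload, or null if no bracket is found.
function extractJsonCandidate(text) {
  // Prefer the contents of the first fenced block, with or without a language tag.
  const fenced = text.match(FENCED_BLOCK);
  const body = fenced ? fenced[1] : text;

  // Trim explanation text before the first opener and after the last closer.
  const start = body.search(/[{\[]/);
  if (start === -1) return null;
  const end = Math.max(body.lastIndexOf('}'), body.lastIndexOf(']'));

  // If no closer follows the opener, the output was probably truncated.
  return end > start ? body.slice(start, end + 1) : body.slice(start);
}

console.log(extractJsonCandidate('Sure! {"a": [1, 2]} Let me know.'));
// {"a": [1, 2]}
```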

### LLM Hallucinations
- Explanatory comments included in the JSON
- "..." or "[more items]" placeholders
- Natural language interruptions mid-JSON

### Nested JSON Formatting Issues
- Inconsistent indentation
- Improperly escaped nested JSON strings
- Confusion between string representations of objects and actual objects
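
The last item, a string that contains an object instead of the object itself, can be sketched as a post-parse step. The helper below is hypothetical (`unwrapNestedJson` is not a library function) and rests on an assumption: any string value that parses as a JSON object or array was meant to be one. That may not hold for every schema, so treat it as a judgment call.

```javascript
// Hypothetical helper, not part of llm-json-fix: recursively replaces string
// values that contain a JSON object or array with the parsed structure.
function unwrapNestedJson(value) {
  if (typeof value === 'string') {
    const trimmed = value.trim();
    const looksStructured =
      (trimmed.startsWith('{') && trimmed.endsWith('}')) ||
      (trimmed.startsWith('[') && trimmed.endsWith(']'));
    if (!looksStructured) return value;
    try {
      // Recurse in case the nested value is itself double-encoded.
      return unwrapNestedJson(JSON.parse(trimmed));
    } catch {
      return value; // not valid JSON after all, so keep the original string
    }
  }
  if (Array.isArray(value)) {
    return value.map((item) => unwrapNestedJson(item));
  }
  if (value !== null && typeof value === 'object') {
    return Object.fromEntries(
      Object.entries(value).map(([key, inner]) => [key, unwrapNestedJson(inner)])
    );
  }
  return value; // numbers, booleans, and null pass through unchanged
}

const parsed = JSON.parse('{"user": "{\\"name\\": \\"John\\"}"}');
console.log(unwrapNestedJson(parsed).user.name); // John
```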

## API Reference

### Regular API

```typescript
fixLLMJson(text: string, options?: FixLLMJsonOptions): string
```

#### Options

```typescript
interface FixLLMJsonOptions {
  // Whether to apply model-specific fixes (default: true)
  applyModelSpecificFixes?: boolean;
  
  // The specific LLM model being used, for optimized repairs
  // Supported values: 'openai', 'anthropic', 'general'
  model?: 'openai' | 'anthropic' | 'general';
  
  // Whether to preserve comments in the JSON (default: false)
  preserveComments?: boolean;
  
  // Whether to be verbose about changes being made
  verbose?: boolean;
}
```

### Streaming API

For processing large files or streams:

```typescript
import { createLLMJsonFixStream } from 'llm-json-fix/stream';
import { createReadStream, createWriteStream } from 'fs';
import { pipeline } from 'stream';

const inputStream = createReadStream('broken.json');
const outputStream = createWriteStream('fixed.json');
const fixStream = createLLMJsonFixStream({ 
  bufferSize: 64 * 1024, // 64KB
  model: 'openai'
});

pipeline(inputStream, fixStream, outputStream, (err) => {
  if (err) {
    console.error('Error:', err);
  } else {
    console.log('JSON successfully repaired!');
  }
});
```

## Command Line Interface

This package provides a command-line tool for repairing JSON files:

```bash
# Install globally
npm install -g llm-json-fix

# Repair a file
llm-json-fix broken.json > fixed.json

# Or with options
llm-json-fix broken.json --output fixed.json --model openai --verbose
```

### CLI Options

```
--version, -v       Show application version
--help,    -h       Display help for command
--output,  -o       Output file
--overwrite         Overwrite the input file
--buffer            Buffer size in bytes, for example 64K (default) or 1M
--model             Specify the LLM model (openai, anthropic, general)
--verbose           Show detailed repair information
--preserve-comments Preserve comments in the output
```

## Examples

See the [examples](./examples) directory for more usage examples:

- [Basic Usage](./examples/basic-usage.js)
- [Streaming API](./examples/streaming-api.js)
- [OpenAI Integration](./examples/openai-integration.js)

## Common Patterns & Integration Tips

### With OpenAI

```javascript
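import OpenAI from 'openai';
import { fixLLMJson } from 'llm-json-fix';

// The client reads OPENAI_API_KEY from the environment by default.
const openai = new OpenAI();

// `prompt` is your own request text, defined elsewhere.
try {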
  const response = await openai.chat.completions.create({
    model: "gpt-4",
    messages: [
      { role: "system", content: "Respond with valid JSON only." },
      { role: "user", content: prompt }
    ]
  });
  
  const content = response.choices[0].message.content;
  const fixedJson = fixLLMJson(content, { model: 'openai' });
  const data = JSON.parse(fixedJson);
  
  // Use the data...
} catch (error) {
  console.error('Error:', error);
}
```

### With Anthropic Claude

```javascript
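import Anthropic from '@anthropic-ai/sdk';
import { fixLLMJson } from 'llm-json-fix';

// The client reads ANTHROPIC_API_KEY from the environment by default.
const anthropic = new Anthropic();

// `prompt` is your own request text, defined elsewhere.
try {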
  const response = await anthropic.messages.create({
    model: "claude-3-opus-20240229",
    max_tokens: 4000,
    messages: [
      { role: "user", content: "Return this data as JSON: " + prompt }
    ],
    system: "Return only valid JSON data with no additional text."
  });
  
  const content = response.content[0].text;
  const fixedJson = fixLLMJson(content, { model: 'anthropic' });
  const data = JSON.parse(fixedJson);
  
  // Use the data...
} catch (error) {
  console.error('Error:', error);
}
```
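
Both integrations above repair every response before parsing it. If most responses are already valid, one possible variation is to try `JSON.parse` first and repair only on failure. The sketch below assumes a hypothetical wrapper, `parseWithRepair`, which takes the repair function as an argument so it stays independent of any particular options.

```javascript
// Hypothetical wrapper, not part of llm-json-fix. `repair` is any function
// that takes the raw text and returns a repaired JSON string.
function parseWithRepair(text, repair) {
  try {
    // Fast path: the model returned valid JSON, so no repair is needed.
    return { ok: true, repaired: false, data: JSON.parse(text) };
  } catch {
    // Fall through to the repair path.
  }
  try {
    return { ok: true, repaired: true, data: JSON.parse(repair(text)) };
  } catch (error) {
    // Report the failure to the caller instead of throwing.
    return { ok: false, repaired: true, error };
  }
}
```

With the library, the call might look like `parseWithRepair(content, (text) => fixLLMJson(text, { model: 'openai' }))`. The `repaired` flag makes it easy to log how often a given model needs repair.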

## License

[MIT License](LICENSE)

## Package Contents

The npm package includes:

- CommonJS build for Node.js environments
- ESM build for modern JavaScript environments
- UMD build for browser usage
- TypeScript type definitions
- CLI executable
- Full documentation

For more information, see the [changelog](CHANGELOG.md).
