# ResponseCache

`ResponseCache` is an input processor that caches LLM responses on the request/response boundary inside the agentic loop. It hooks into `processLLMRequest` (cache lookup; short-circuits on hit) and `processLLMResponse` (cache write on completion).

The cache key is derived from the resolved `LanguageModelV2Prompt` Mastra is about to send to the model — i.e. _after_ memory has loaded and earlier input processors have transformed the prompt — so two users with different memory contexts produce different cache keys. Each step in an agentic tool loop is independently cached.

There is no agent-level option for response caching; register `ResponseCache` explicitly on `inputProcessors`. Per-call overrides flow through `RequestContext` via [`ResponseCache.context()`](#static-helpers) and [`ResponseCache.applyContext()`](#static-helpers).

## Usage example

```typescript
import { Agent } from '@mastra/core/agent'
import { InMemoryServerCache } from '@mastra/core/cache'
import { ResponseCache } from '@mastra/core/processors'

const cache = new InMemoryServerCache()

const agent = new Agent({
  name: 'Search Agent',
  instructions: 'You answer questions concisely.',
  model: 'openai/gpt-5',
  inputProcessors: [new ResponseCache({ cache, ttl: 600 })],
})

// First call hits the LLM and writes to the cache.
await agent.generate('What is the capital of France?')

// Second identical call replays the cached response.
await agent.generate('What is the capital of France?')

// Force a fresh call but still update the cache.
await agent.generate('What is the capital of France?', {
  requestContext: ResponseCache.context({ bust: true }),
})
```

See [Response caching](https://mastra.ai/docs/agents/response-caching) for the conceptual overview, scoping rules, and recommended deployment patterns.

## Constructor parameters

**cache** (`MastraServerCache`): The cache backend. Required. Pass any \`MastraServerCache\` implementation — \`InMemoryServerCache\` for local development, \`RedisCache\` from \`@mastra/redis\` for production, or your own subclass for a custom backend.

**ttl** (`number`): Time-to-live (seconds) for entries written by this processor. Defaults to 300 seconds (5 minutes), matching OpenRouter's reference implementation. (Default: `300`)

**scope** (`string | null`): Tenant scope appended to the cache key. \`null\` opts out of scoping. When omitted, the processor falls back to the resource id resolved from the request context (\`MASTRA\_RESOURCE\_ID\_KEY\`) for automatic per-user isolation.

**key** (`string | (inputs: ResponseCacheKeyInputs) => string | Promise<string>`): Override the auto-derived cache key. Pass a string to pin a key, or a function that receives \`{ agentId, scope, model, prompt, stepNumber }\` and returns a key. If the function throws, the processor falls back to the deterministic hash so the call still benefits from caching.

**bust** (`boolean`): Force a cache miss on every call: skip the read but still write on completion. Useful for explicit refresh paths. (Default: `false`)

**agentId** (`string`): Logical id used in the cache key namespace. Defaults to \`'mastra-response-cache'\`. Set this to the owning agent's id when you want cache entries scoped per-agent. (Default: `'mastra-response-cache'`)

## Static helpers

`ResponseCache` exposes two static helpers for setting per-call overrides on a `RequestContext`. The helpers keep the underlying context key a private implementation detail — prefer them over reading/writing the raw key.

### `ResponseCache.context(options)`

Build a fresh `RequestContext` preloaded with per-call response cache overrides.

```typescript
await agent.stream('hello', {
  requestContext: ResponseCache.context({ key: 'custom', bust: true }),
})
```

### `ResponseCache.applyContext(requestContext, options)`

Merge per-call response cache overrides into an existing `RequestContext`. Returns the same context for chaining.

```typescript
const ctx = new RequestContext()
ctx.set('caller-meta', { userId: 'u-123' })
ResponseCache.applyContext(ctx, { bust: true })
await agent.stream('hello', { requestContext: ctx })
```

## ResponseCacheContextOptions

The shape passed to `ResponseCache.context()` / `ResponseCache.applyContext()`.

**key** (`string | (inputs: ResponseCacheKeyInputs) => string | Promise<string>`): Overrides the auto-derived cache key for this request only.

**scope** (`string | null`): Overrides the tenant scope for this request only. \`null\` opts out of scoping.

**bust** (`boolean`): Skip the cache read but still write on completion.

`cache`, `ttl`, and `agentId` are intentionally not overridable per call — they are instance-level concerns that should not vary per request.

## ResponseCacheKeyInputs

The argument passed to a `key` function (constructor or per-call). All fields contribute to the deterministic hash by default.

**agentId** (`string`): Logical processor id used to namespace the cache key.

**scope** (`string | null | undefined`): Resolved scope for this request, or \`null\` when scoping is disabled.

**model** (`{ provider?: string; modelId?: string; specVersion?: string }`): Provider/model identity. Different models produce different responses.

**prompt** (`LanguageModelV2Prompt`): The exact prompt the provider would receive, post memory load and post any prompt-modifying input processors.

**stepNumber** (`number`): 0-indexed step number within the agentic loop. Greater than zero for tool steps.

## Helper exports

- `buildResponseCacheKey(inputs)` — the deterministic hash used by default. Re-export it to override individual fields while preserving the rest of the standard key shape.
- `DEFAULT_RESPONSE_CACHE_TTL_SECONDS` — the default `ttl` (`300`).
- `RESPONSE_CACHE_CONTEXT_KEY` — the `RequestContext` key the static helpers write to. Exposed for advanced cases (e.g. clearing the override mid-pipeline); prefer the helpers.

## Related

- [Response caching](https://mastra.ai/docs/agents/response-caching)
- [Processors](https://mastra.ai/docs/agents/processors)
- [Processor interface](https://mastra.ai/reference/processors/processor-interface)