<div align="center">
  <h1>@cyanheads/pubmed-mcp-server</h1>
  <p><b>Search PubMed/Europe PMC, fetch articles and full text (PMC/EPMC/Unpaywall), citations, MeSH terms via MCP. STDIO or Streamable HTTP.</b>
  <div>10 Tools • 1 Resource • 1 Prompt</div>
  </p>
</div>

<div align="center">

[![npm](https://img.shields.io/npm/v/@cyanheads/pubmed-mcp-server?style=flat-square&logo=npm&logoColor=white)](https://www.npmjs.com/package/@cyanheads/pubmed-mcp-server) [![Version](https://img.shields.io/badge/Version-2.7.1-blue.svg?style=flat-square)](./CHANGELOG.md) [![Framework](https://img.shields.io/badge/Built%20on-@cyanheads/mcp--ts--core-259?style=flat-square)](https://www.npmjs.com/package/@cyanheads/mcp-ts-core) [![MCP SDK](https://img.shields.io/badge/MCP%20SDK-^1.29.0-green.svg?style=flat-square)](https://modelcontextprotocol.io/) 

[![License](https://img.shields.io/badge/License-Apache%202.0-orange.svg?style=flat-square)](./LICENSE) [![TypeScript](https://img.shields.io/badge/TypeScript-^6.0.3-3178C6.svg?style=flat-square)](https://www.typescriptlang.org/) [![Bun](https://img.shields.io/badge/Bun-v1.3.2-blueviolet.svg?style=flat-square)](https://bun.sh/)

</div>

<div align="center">

**Public Hosted Server:** [https://pubmed.caseyjhand.com/mcp](https://pubmed.caseyjhand.com/mcp)

</div>

---

## Tools

10 tools for working with PubMed, PubMed Central, and Europe PMC data:

| Tool | Description |
|:---|:---|
| `pubmed_search_articles` | Search PubMed with full query syntax, field-specific filters, date ranges, pagination, and optional brief summaries |
| `pubmed_europepmc_search` | Search Europe PMC for preprints, patents, Agricola, and EPMC-only OA records that don't surface in PubMed. Cursor-based pagination. |
| `pubmed_fetch_articles` | Fetch full article metadata by PMIDs — abstract, authors, journal, MeSH terms, grants |
| `pubmed_fetch_fulltext` | Fetch full-text articles via a chain: NCBI PMC EFetch → Europe PMC `fullTextXML` → Unpaywall. Accepts PMIDs, PMCIDs, or DOIs. |
| `pubmed_format_citations` | Generate formatted citations in APA 7th, MLA 9th, BibTeX, or RIS |
| `pubmed_find_related` | Find similar articles, citing articles, or references for a given PMID |
| `pubmed_spell_check` | Spell-check biomedical queries using NCBI's ESpell service |
| `pubmed_lookup_mesh` | Search and explore MeSH vocabulary — tree numbers, scope notes, entry terms |
| `pubmed_lookup_citation` | Resolve partial bibliographic references to PubMed IDs via ECitMatch |
| `pubmed_convert_ids` | Convert between DOI, PMID, and PMCID using the PMC ID Converter API |

### `pubmed_search_articles`

Search PubMed with full NCBI query syntax and filters.

- Free-text queries with PubMed's full boolean and field-tag syntax
- Field-specific filters: author, journal, MeSH terms, language, species
- Common filters: has abstract, free full text
- Date range filtering by publication, modification, or Entrez date
- Publication type filtering (Review, Clinical Trial, Meta-Analysis, etc.)
- Sort by relevance, publication date, author, or journal
- Pagination via offset for paging through large result sets
- Optional brief summaries for top N results via ESummary
- Returns the original query plus the fully applied PubMed query and normalized filter metadata

---

### `pubmed_fetch_articles`

Fetch full article metadata by PubMed IDs.

- Batch fetch up to 200 articles at once (auto-switches to POST for batches >= 100)
- Returns structured data: title, abstract, authors with deduplicated affiliations, journal info, DOI
- Direct links to PubMed and PubMed Central (when available)
- Optional MeSH terms, grant information, and publication types
- Handles PubMed's inconsistent XML (structured abstracts, missing fields, varying date formats)

---

### `pubmed_fetch_fulltext`

Fetch full-text articles via a three-stage chain: NCBI PMC EFetch → Europe PMC `fullTextXML` → Unpaywall.

- Accepts exactly one of `pmcids` (direct PMC IDs), `pmids` (PubMed IDs, auto-resolved), or `dois` (covers preprints and EPMC-only OA records that lack PMID/PMCID)
- NCBI PMC and Europe PMC both return structured JATS; output records origin via `viaSource: "pmc" | "europepmc" | "unpaywall"`
- Europe PMC layer (enabled by default; disable with `EUROPEPMC_ENABLED=false`) recovers PMC-counterpart records that NCBI PMC EFetch missed, and resolves DOI input to PMC counterparts when one exists. EPMC's `fullTextXML` is PMC-keyed, so preprints (PPR), patents (PAT), and Agricola (AGR) are reachable via `pubmed_europepmc_search` for metadata but have no full text via this chain.
- Unpaywall layer (enabled by setting `UNPAYWALL_EMAIL`) resolves DOIs to legal OA copies; extracts HTML landing pages to Markdown via Defuddle or PDFs to text via unpdf
- Discriminated output contract — `source: "pmc"` (structured sections, regardless of whether it came from PMC or EPMC) or `source: "unpaywall"` (best-effort body + `contentFormat`: `html-markdown` or `pdf-text`)
- Structured unavailable reasons (`not-found`, `no-pmc-fallback-disabled`, `no-epmc-fulltext`, `no-doi`, `no-oa`, `fetch-failed`, `parse-failed`, `service-error`) so callers can retry or explain to users without parsing text
- Each `unavailable` entry carries `idType` (`pmid` / `pmcid` / `doi`) and `triedTiers` — per-tier outcomes (`not-attempted`, `miss`, `no-fulltext`, `service-error`, …) in execution order, so callers can see which stage failed and why
- Section filtering by title (case-insensitive match, e.g. `["methods", "results"]`) and configurable max sections apply to PMC output
- Up to 10 articles per request

---

### `pubmed_europepmc_search`

Search Europe PMC (EBI/EMBL-EBI), a broader open-access biomedical corpus than PubMed alone.

- Surfaces records PubMed search can't reach: preprints (`source: PPR`), patents (`source: PAT`), Agricola (`source: AGR`), plus everything in PubMed (`MED`) and PMC (`PMC`)
- Default sources `["MED", "PMC", "PPR"]`; pass `sources` to include `PAT` / `AGR`
- Cursor-based pagination via `cursorMark` (unlike `pubmed_search_articles`, which uses offset) — `*` for the first page, return `nextCursorMark` for the next
- Output discriminator on `source` plus optional `pmid` / `pmcId` / `doi` cross-walking
- Disabled when `EUROPEPMC_ENABLED=false`; tool is not registered in that case

---

### `pubmed_format_citations`

Generate formatted citations for articles.

- Four citation styles: APA 7th, MLA 9th, BibTeX, RIS
- Request multiple styles per article in a single call
- Hand-rolled formatters — zero external dependencies, fully Workers-compatible
- Up to 50 articles per request
- Reports formatted counts and unavailable PMIDs for partial-result handling

---

### `pubmed_find_related`

Find articles related to a source article via ELink.

- Three relationship types: `similar` (content similarity), `cited_by`, `references`
- Results enriched with title, authors, publication date, and source via ESummary
- Results returned in NCBI's relevance order

---

### `pubmed_spell_check`

Spell-check a biomedical query using NCBI's ESpell.

- Returns the original query, corrected query, and whether a suggestion was found
- Useful for query refinement before searching

---

### `pubmed_lookup_mesh`

Search and explore the MeSH (Medical Subject Headings) vocabulary.

- Search MeSH terms by name with exact-heading matching
- Detailed records with tree numbers, scope notes, and entry terms by default
- Useful for building precise PubMed queries with controlled vocabulary

---

### `pubmed_lookup_citation`

Resolve partial bibliographic references to PubMed IDs via NCBI ECitMatch.

- Match citations by journal, year, volume, first page, and/or author name
- More fields = better match accuracy; at least one field required
- Batch up to 25 citations per request
- Deterministic matching — more reliable than free-text search for known references
- Returns explicit `matched`, `not_found`, and `ambiguous` statuses with recovery detail

---

### `pubmed_convert_ids`

Convert between article identifiers (DOI, PMID, PMCID) using the PMC ID Converter API.

- Batch up to 50 IDs per request
- Accepts DOIs, PMIDs, or PMCIDs (all IDs must be the same type)
- Only resolves articles indexed in PubMed Central
- Returns all available identifier mappings for each input ID

## Resource and prompt

| Type | Name | Description |
|:---|:---|:---|
| Resource | `pubmed://database/info` | PubMed database metadata via EInfo (field list, record count, last update) |
| Prompt | `research_plan` | Generate a structured 4-phase biomedical research plan outline |

## Features

Built on [`@cyanheads/mcp-ts-core`](https://github.com/cyanheads/mcp-ts-core):

- Declarative tool definitions — single file per tool, framework handles registration and validation
- Unified error handling across all tools
- Pluggable auth (`none`, `jwt`, `oauth`)
- Swappable storage backends: `in-memory`, `filesystem`, `Supabase`, `Cloudflare KV/R2/D1`
- Structured logging with optional OpenTelemetry tracing
- Runs locally (stdio/HTTP) or on Cloudflare Workers from the same codebase

PubMed-specific:

- Complete NCBI E-utilities integration (ESearch, EFetch, ESummary, ELink, ESpell, EInfo, ECitMatch) plus PMC ID Converter
- Sequential request queue with configurable delay for NCBI rate limit compliance
- NCBI-specific XML parser with `isArray` hints for PubMed's inconsistent XML structure
- Hand-rolled citation formatters (APA, MLA, BibTeX, RIS) — zero deps, Workers-compatible

## Getting started

### Public Hosted Instance

A public instance is available at `https://pubmed.caseyjhand.com/mcp` — no installation required. Point any MCP client at it via Streamable HTTP:

```json
{
  "mcpServers": {
    "pubmed": {
      "type": "streamable-http",
      "url": "https://pubmed.caseyjhand.com/mcp"
    }
  }
}
```

### Self-Hosted / Local

Add the following to your MCP client configuration file.

```json
{
  "mcpServers": {
    "pubmed": {
      "type": "stdio",
      "command": "bunx",
      "args": ["@cyanheads/pubmed-mcp-server@latest"],
      "env": {
        "MCP_TRANSPORT_TYPE": "stdio",
        "MCP_LOG_LEVEL": "info",
        "NCBI_API_KEY": "your-key-here"
      }
    }
  }
}
```

Or with npx (no Bun required):

```json
{
  "mcpServers": {
    "pubmed": {
      "type": "stdio",
      "command": "npx",
      "args": ["-y", "@cyanheads/pubmed-mcp-server@latest"],
      "env": {
        "MCP_TRANSPORT_TYPE": "stdio",
        "MCP_LOG_LEVEL": "info",
        "NCBI_API_KEY": "your-key-here"
      }
    }
  }
}
```

Or with Docker:

```json
{
  "mcpServers": {
    "pubmed": {
      "type": "stdio",
      "command": "docker",
      "args": ["run", "-i", "--rm", "-e", "MCP_TRANSPORT_TYPE=stdio", "ghcr.io/cyanheads/pubmed-mcp-server:latest"]
    }
  }
}
```

For Streamable HTTP, set the transport and start the server:

```sh
MCP_TRANSPORT_TYPE=http MCP_HTTP_PORT=3010 bun run start:http
# Server listens at http://localhost:3010/mcp
```

### Prerequisites

- [Bun v1.3.2](https://bun.sh/) or higher.
- Optional: [NCBI API key](https://www.ncbi.nlm.nih.gov/account/settings/) for higher rate limits (10 req/s vs 3 req/s).

### Installation

1. **Clone the repository:**

```sh
git clone https://github.com/cyanheads/pubmed-mcp-server.git
```

2. **Navigate into the directory:**

```sh
cd pubmed-mcp-server
```

3. **Install dependencies:**

```sh
bun install
```

## Configuration

All configuration is validated at startup via Zod schemas in `src/config/server-config.ts`. Key environment variables:

| Variable | Description | Default |
|:---|:---|:---|
| `MCP_TRANSPORT_TYPE` | Transport: `stdio` or `http` | `stdio` |
| `MCP_HTTP_PORT` | HTTP server port | `3010` |
| `MCP_HTTP_ENDPOINT_PATH` | HTTP endpoint path where the MCP server is mounted | `/mcp` |
| `MCP_PUBLIC_URL` | Public origin override for TLS-terminating reverse-proxy deployments (landing page, Server Card, RFC 9728 metadata). | none |
| `MCP_AUTH_MODE` | Authentication: `none`, `jwt`, or `oauth` | `none` |
| `MCP_LOG_LEVEL` | Log level (`debug`, `info`, `warning`, `error`, etc.) | `info` |
| `LOGS_DIR` | Directory for log files (Node.js only). | `<project-root>/logs` |
| `STORAGE_PROVIDER_TYPE` | Storage backend: `in-memory`, `filesystem`, `supabase`, `cloudflare-kv/r2/d1` | `in-memory` |
| `NCBI_API_KEY` | NCBI API key for higher rate limits (10 req/s vs 3 req/s) | none |
| `NCBI_ADMIN_EMAIL` | Contact email sent with NCBI requests (recommended by NCBI) | none |
| `NCBI_REQUEST_DELAY_MS` | Minimum gap between NCBI request starts in ms | 334 (100 with key) |
| `NCBI_MAX_CONCURRENT` | Max concurrent in-flight NCBI requests | `8` |
| `NCBI_MAX_RETRIES` | Retry attempts for failed NCBI requests | 6 |
| `NCBI_TIMEOUT_MS` | Per-request HTTP timeout in ms | `30000` |
| `NCBI_TOTAL_DEADLINE_MS` | Total deadline across all retry attempts for one NCBI call, in ms | `60000` |
| `UNPAYWALL_EMAIL` | Contact email for Unpaywall. When set, `pubmed_fetch_fulltext` falls back to Unpaywall open-access copies for non-PMC DOIs | none |
| `UNPAYWALL_TIMEOUT_MS` | Per-request HTTP timeout for Unpaywall lookups and content fetches, in ms | `20000` |
| `EUROPEPMC_ENABLED` | Enable Europe PMC search tool and the `pubmed_fetch_fulltext` JATS fallback chain. Set `false` to disable all EPMC calls and skip tool registration. | `true` |
| `EUROPEPMC_EMAIL` | Optional contact email sent with Europe PMC requests (EBI courtesy). | none |
| `EUROPEPMC_REQUEST_DELAY_MS` | Minimum gap between Europe PMC request starts in ms | `200` |
| `EUROPEPMC_MAX_RETRIES` | Retry attempts for failed Europe PMC requests | `3` |
| `EUROPEPMC_TIMEOUT_MS` | Per-request HTTP timeout for Europe PMC calls, in ms | `20000` |
| `OTEL_ENABLED` | Enable OpenTelemetry | `false` |

## Running the server

### Local development

- **Build and run the production version**:

  ```sh
  # One-time build
  bun run rebuild

  # Run the built server
  bun run start:http
  # or
  bun run start:stdio
  ```

- **Run checks and tests**:
  ```sh
  bun run devcheck  # Lints, formats, type-checks, and more
  bun run test      # Runs the test suite
  ```

## Project structure

| Directory | Purpose |
|:---|:---|
| `src/mcp-server/tools` | Tool definitions (`*.tool.ts`). Ten tools across PubMed, PMC, and Europe PMC. |
| `src/mcp-server/resources` | Resource definitions. Database info resource. |
| `src/mcp-server/prompts` | Prompt definitions. Research plan prompt. |
| `src/services/ncbi` | NCBI E-utilities service layer — API client, queue, parser, formatter. |
| `src/services/europe-pmc` | Europe PMC service — search + `fullTextXML` JATS retrieval. Reuses the NCBI JATS parser. |
| `src/services/unpaywall` | Unpaywall service — DOI → OA location resolution and content fetch (HTML/PDF). |
| `src/config` | Server-specific environment variable parsing and validation with Zod. |
| `tests/` | Unit and integration tests, mirroring the `src/` structure. |

## Development guide

See [`CLAUDE.md`](./CLAUDE.md) for development guidelines and architectural rules. The short version:

- Handlers throw, framework catches — no `try/catch` in tool logic
- Use `ctx.log` for logging, `ctx.state` for storage
- Register new tools and resources in the `createApp()` arrays

## Contributing

Issues and pull requests are welcome. Run checks and tests before submitting:

```sh
bun run devcheck
bun run test
```

## License

This project is licensed under the Apache 2.0 License. See the [LICENSE](./LICENSE) file for details.
