# MCP Catalog API — Full Reference

> Search and discovery engine for 23,000+ MCP servers indexed from multiple registries (GitHub, Smithery, mcpservers.org, MCP Registry, npm, PyPI). Provides hybrid semantic search with BM25 + dense vector + RRF fusion + tool embedding boost, trust scoring, and vulnerability tracking via NVD.

## Overview

- **What it is:** A REST API catalog of MCP (Model Context Protocol) servers and AI agents.
- **What it indexes:** 23,000+ MCP servers collected from GitHub, Smithery, mcpservers.org, MCP Registry, npm, and PyPI via scheduled scrapers.
- **Database:** TimescaleDB (PostgreSQL 16) with pgvector extension for vector similarity search.
- **Auth:** None required. All endpoints are public, read-only GETs.
- **Response format:** JSON (application/json). All list endpoints return paginated responses.
- **Rate limits:** 15 req/min (search), 10 req/min (list), 30 req/min (detail) — per IP.
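
A client can stay under these limits by spacing its own calls. The sketch below is one conservative approach, not part of the API: it keeps calls at least `60 / per_minute` seconds apart. The `Throttle` class and `RATE_LIMITS` mapping are hypothetical helper names, and how the server counts its window (fixed or sliding) is not documented here.

```python
import time
from typing import Callable

# Documented per-IP limits, in requests per minute.
RATE_LIMITS = {"search": 15, "list": 10, "detail": 30}


class Throttle:
    """Keeps consecutive calls at least 60 / per_minute seconds apart."""

    def __init__(self, per_minute: int,
                 clock: Callable[[], float] = time.monotonic,
                 sleep: Callable[[float], None] = time.sleep) -> None:
        self.interval = 60.0 / per_minute
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = 0.0  # earliest time the next call may start

    def wait(self) -> float:
        """Block until the next call is allowed; return the seconds slept."""
        now = self._clock()
        delay = max(0.0, self._next_allowed - now)
        if delay > 0:
            self._sleep(delay)
        # Schedule the following call one interval after this one starts.
        self._next_allowed = max(now, self._next_allowed) + self.interval
        return delay
```

Calling `Throttle(RATE_LIMITS["search"]).wait()` before each search request would space requests four seconds apart.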

---

## API Endpoints

Base path: `/api/v1` (except `/health`, which is served at the root). The examples below use the host `https://knyazevai.work`.

---

### GET /health

Health check with database connectivity probe.

**Response:**
```json
{"status": "ok", "db": "connected"}
```
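
A minimal Python sketch of a health probe, using only the standard library. The function names are hypothetical, and the shape of an unhealthy response is not documented here, so anything other than the payload shown above is treated as unhealthy.

```python
import json
from urllib.request import urlopen

BASE_URL = "https://knyazevai.work"  # host taken from the examples in this reference


def is_healthy(payload: dict) -> bool:
    """True only when both fields match the documented healthy response."""
    return payload.get("status") == "ok" and payload.get("db") == "connected"


def check_health(base_url: str = BASE_URL, timeout: float = 5.0) -> bool:
    """Fetch /health (served at the root, not under /api/v1) and evaluate it."""
    with urlopen(f"{base_url}/health", timeout=timeout) as resp:
        return is_healthy(json.load(resp))
```

Note that `urlopen` raises `urllib.error.HTTPError` on non-2xx responses, so a caller may want to treat that exception as unhealthy as well.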

---

### GET /api/v1/servers

List MCP servers with pagination, filtering, and sorting.

**Query Parameters:**

| Param | Type | Default | Constraints | Description |
|-------|------|---------|-------------|-------------|
| limit | int | 20 | 1-20 | Results per page |
| offset | int | 0 | 0-40 | Pagination offset |
| sort_by | string | popularity | popularity, stars, downloads, name, newest | Sort order |
| source_registry | string | null | | Filter by registry |
| status | string | null | active, archived, deprecated | Filter by status |
| transport_type | string | null | stdio, sse, streamable-http | Filter by transport |
| license | string | null | MIT, Apache-2.0, etc. | Filter by license |
| protocol | string | null | mcp, a2a, rest, openapi | Filter by protocol |

**Example:**
```bash
curl "https://knyazevai.work/api/v1/servers?limit=10&sort_by=stars"
```
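
The same request can be assembled in Python. This sketch checks the constraints from the table before building the URL; `list_servers_url` is a hypothetical helper, and the optional filter values are passed through without validation.

```python
from urllib.parse import urlencode

BASE_URL = "https://knyazevai.work/api/v1"
SORT_OPTIONS = {"popularity", "stars", "downloads", "name", "newest"}


def list_servers_url(limit: int = 20, offset: int = 0,
                     sort_by: str = "popularity", **filters: str) -> str:
    """Build a /servers URL, enforcing the documented parameter constraints."""
    if not 1 <= limit <= 20:
        raise ValueError("limit must be between 1 and 20")
    if not 0 <= offset <= 40:
        raise ValueError("offset must be between 0 and 40")
    if sort_by not in SORT_OPTIONS:
        raise ValueError(f"sort_by must be one of {sorted(SORT_OPTIONS)}")
    params = {"limit": limit, "offset": offset, "sort_by": sort_by}
    # Optional filters (source_registry, status, transport_type, license,
    # protocol) are appended as given; unset ones are omitted.
    params.update({k: v for k, v in filters.items() if v is not None})
    return f"{BASE_URL}/servers?{urlencode(params)}"
```

For example, `list_servers_url(limit=10, sort_by="stars", status="active")` adds the status filter to the request shown above.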

---

### GET /api/v1/servers/search

Hybrid semantic search across all indexed MCP servers.

**Query Parameters:**

| Param | Type | Default | Constraints | Description |
|-------|------|---------|-------------|-------------|
| q | string | **required** | 1-200 chars | Search query in natural language |
| limit | int | 20 | 1-20 | Results per page |
| offset | int | 0 | 0-40 | Pagination offset |
| min_similarity | float | 0.01 | 0-1 | Minimum cosine similarity threshold |
| transport_type | string | null | | Filter by transport |
| protocol | string | null | | Filter by protocol |
| requires_api_key | bool | null | | Filter by API key requirement |
| has_docker | bool | null | | Filter by Docker availability |
| min_stars | int | null | | Minimum GitHub stars |

**Response fields (per server):**
- `cosine_similarity` (float, 0.0-1.0) — cosine similarity between query embedding and server description embedding. **Use this to evaluate match quality.** 0.85+ = excellent, 0.70-0.85 = good, <0.70 = weak.
- `similarity_score` (float) — combined RRF ranking score with boosts (internal, used for ordering)
- `trust_score` (int, 0-100) — composite trust score
- `readme_summary` (string) — AI-generated 2-3 sentence summary
- `transport_type` (list[string]) — supported transports
- `protocol` (string) — mcp, a2a, rest, openapi
- `pricing_model` (string) — free, paid, freemium, per_call
- `requires_api_key` (bool) — whether API key is needed
- `install_command` (string) — ready-to-run install command

**Example:**
```bash
curl "https://knyazevai.work/api/v1/servers/search?q=database%20postgresql&limit=5"
```
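
A Python sketch of building the search URL, including percent-encoding of the query. `search_url` is a hypothetical helper. Sending booleans as lowercase `true`/`false` is an assumption; the accepted boolean spellings are not stated in this reference.

```python
from urllib.parse import quote, urlencode

BASE_URL = "https://knyazevai.work/api/v1"


def search_url(q: str, limit: int = 20, offset: int = 0, **filters) -> str:
    """Build a /servers/search URL from a query and optional filters."""
    if not 1 <= len(q) <= 200:
        raise ValueError("q must be 1-200 characters")
    if not 1 <= limit <= 20:
        raise ValueError("limit must be between 1 and 20")
    if not 0 <= offset <= 40:
        raise ValueError("offset must be between 0 and 40")
    params = {"q": q, "limit": limit, "offset": offset}
    for key, value in filters.items():
        if value is None:
            continue  # unset filters are left out of the query string
        # Assumption: booleans are serialized as lowercase "true"/"false".
        params[key] = str(value).lower() if isinstance(value, bool) else value
    # quote_via=quote encodes spaces as %20, matching the curl example.
    return f"{BASE_URL}/servers/search?{urlencode(params, quote_via=quote)}"
```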

**Example response:**
```json
{
  "items": [
    {
      "id": "582ac966-...",
      "title": "MCP-PostgreSQL-Ops",
      "description": "Professional MCP server for PostgreSQL database operations...",
      "cosine_similarity": 0.852,
      "similarity_score": 0.029,
      "trust_score": 68,
      "protocol": "mcp",
      "transport_type": ["stdio", "streamable-http"],
      "pricing_model": "free",
      "requires_api_key": false,
      "github_stars": 142,
      "popularity_score": 1346,
      "vulnerability_count": 0,
      "readme_summary": "MCP-PostgreSQL-Ops is a server for PostgreSQL database operations..."
    }
  ],
  "total": 34,
  "limit": 5,
  "offset": 0,
  "relevant_count": 12
}
```
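
To act on the quality bands described for `cosine_similarity`, a client might label each result as in the sketch below. The function names are hypothetical, and the boundary handling (0.85 counts as excellent, 0.70 as good) is one reading of the documented ranges.

```python
from typing import Optional


def match_quality(cosine_similarity: float) -> str:
    """Map a cosine similarity to the bands given in the response field notes."""
    if cosine_similarity >= 0.85:
        return "excellent"
    if cosine_similarity >= 0.70:
        return "good"
    return "weak"


def summarize(page: dict) -> list[tuple[Optional[str], str]]:
    """Return (title, quality band) for every item of a search response."""
    return [
        (item.get("title"), match_quality(item["cosine_similarity"]))
        for item in page["items"]
    ]
```

Ranking order still follows `similarity_score`; the bands only indicate how close each hit is to the query.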

---

### GET /api/v1/servers/{server_id}

Full detail for a single server including tools, packages, categories, metrics, and trust breakdown.

**Response:** `ServerDetailResponse` — includes all fields from search plus:
- `tools` — list of MCP tools with names, descriptions, and input schemas
- `packages` — npm/PyPI package distributions
- `categories` — server categories/tags
- `latest_metrics` — recent usage metrics (stars, forks, downloads)
- `trust_breakdown` — per-dimension trust scores (source, popularity, completeness, freshness, security, user_signals)
- `config_example` — example config for claude_desktop_config.json
- `client_compatibility` — compatible clients (claude_desktop, cursor, vs_code_copilot, etc.)

**Example:**
```bash
curl "https://knyazevai.work/api/v1/servers/582ac966-15df-403f-ae40-4c4c54dccb38"
```

---

### GET /api/v1/servers/{server_id}/vulnerabilities

Known vulnerabilities (CVE/GHSA) for a server, newest first.

**Example:**
```bash
curl "https://knyazevai.work/api/v1/servers/582ac966-15df-403f-ae40-4c4c54dccb38/vulnerabilities"
```

---

### GET /api/v1/categories

List all server categories with counts.

---

## Search Algorithm

The `/servers/search` endpoint uses a multi-stage hybrid retrieval pipeline:

### Stage 1: Candidate Retrieval (Hybrid RRF)
Two retrieval paths run in parallel, each returning its top 50 candidates:

1. **BM25 full-text search** — PostgreSQL `ts_rank_cd` over weighted tsvectors (`title` weight A, `description` weight B, `registry_name` weight C). GIN index.
2. **Dense vector search** — pgvector HNSW index with cosine distance over 1024-dim embeddings. Model: `intfloat/multilingual-e5-large`. `hnsw.ef_search=40`.

The two ranked lists are fused with **Reciprocal Rank Fusion (RRF)**, using k=60:
```
rrf_score = 1/(60 + semantic_rank) + 1/(60 + fulltext_rank)
```
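
The fusion formula can be sketched in Python as follows. Two details are assumptions, since this reference does not state them: ranks are taken as 1-based, and a candidate missing from one list contributes nothing for that list. The function names are illustrative.

```python
from typing import Optional

RRF_K = 60


def rrf_score(semantic_rank: Optional[int], fulltext_rank: Optional[int],
              k: int = RRF_K) -> float:
    """Sum 1 / (k + rank) over the lists in which the candidate appears."""
    score = 0.0
    for rank in (semantic_rank, fulltext_rank):
        if rank is not None:  # assumption: absent from a list adds 0
            score += 1.0 / (k + rank)
    return score


def fuse(semantic: list[str], fulltext: list[str]) -> list[tuple[str, float]]:
    """Fuse two ranked ID lists; highest RRF score first, ties broken by ID."""
    sem_rank = {doc: i for i, doc in enumerate(semantic, start=1)}
    ft_rank = {doc: i for i, doc in enumerate(fulltext, start=1)}
    scored = {
        doc: rrf_score(sem_rank.get(doc), ft_rank.get(doc))
        for doc in sem_rank.keys() | ft_rank.keys()
    }
    return sorted(scored.items(), key=lambda kv: (-kv[1], kv[0]))
```

With k=60 the largest possible fused score is 2/61 (about 0.033), reached by a candidate ranked first in both lists, which is why RRF scores look small next to cosine similarities.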

### Stage 2: Tool Score Boost
For each candidate, the similarity of its best-matching tool embedding is added to the RRF score with a weight of 0.3:
```
boosted = rrf_score + (best_tool_similarity * 0.3)
```

### Stage 3: Popularity & Trust Adjustments
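Three further terms are added to the boosted score to produce the final ranking score:
```
similarity_score = boosted
    + ln(1 + popularity_score) * 0.002
    + (trust_score / 100) * 0.05
    + rating_avg * 0.05 * min(rating_count, 20) / 20
```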
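
Stages 2 and 3 can be combined into a single function, sketched below with the weights given above. Treating missing (`None`) inputs as zero is an assumption; how the service handles nulls is not stated in this reference. `final_score` is an illustrative name.

```python
import math
from typing import Optional


def final_score(rrf_score: float,
                best_tool_similarity: Optional[float],
                popularity_score: Optional[int],
                trust_score: Optional[int],
                rating_avg: Optional[float],
                rating_count: Optional[int]) -> float:
    """Apply the tool boost (Stage 2) and the Stage 3 adjustments."""
    # Assumption: missing values contribute nothing.
    tool = best_tool_similarity or 0.0
    popularity = popularity_score or 0
    trust = trust_score or 0
    avg = rating_avg or 0.0
    count = rating_count or 0

    boosted = rrf_score + tool * 0.3
    popularity_term = math.log1p(popularity) * 0.002  # ln(1 + popularity)
    trust_term = (trust / 100) * 0.05
    # The rating term reaches full weight once 20 ratings are recorded.
    rating_term = avg * 0.05 * min(count, 20) / 20
    return boosted + popularity_term + trust_term + rating_term
```

The logarithm dampens popularity so that very popular servers gain only a small edge, and capping `rating_count` at 20 keeps a handful of ratings from carrying full weight.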

### Output
- `similarity_score` = final boosted RRF score (used for ranking)
- `cosine_similarity` = raw cosine similarity between query and description embeddings (0-1, use for evaluating match quality)

---

## Pydantic Schemas

### ServerResponse
```
id: UUID
registry_name: str
title: str | None
description: str | None
current_version: str | None
repository_url: str | None
homepage_url: str | None
license: str | None
transport_type: list[str] | None
status: str | None
source_registry: str
first_seen: datetime | None
last_updated: datetime | None
similarity_score: float | None         # RRF ranking score (search only)
cosine_similarity: float | None        # Cosine similarity 0-1 (search only)
vulnerability_count: int | None
data_age_seconds: int | None
provider: str | None
popularity_score: int | None
github_stars: int | None
npm_downloads_weekly: int | None
trust_score: int | None                # 0-100 composite trust
install_command: str | None
readme_summary: str | None
requires_api_key: bool
protocol: str                          # mcp, a2a, rest, openapi
pricing_model: str | None              # free, paid, freemium, per_call
rating_avg: float | None
rating_count: int | None
```

### ServerDetailResponse (extends ServerResponse)
```
tools: list[ToolResponse]
packages: list[PackageResponse]
categories: list[str]
latest_metrics: list[MetricResponse]
trust_breakdown: dict | None
config_example: dict | str | None
client_compatibility: list[str]
```

### PaginatedResponse[T]
```
items: list[T]
total: int
limit: int
offset: int
relevant_count: int | None
```
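
On the client side, the pagination envelope can be mirrored with a small standard-library dataclass. This is a sketch, not the service's own model: `Page` and `next_offset` are hypothetical names, and the offset cap of 40 comes from the query parameter tables in this reference.

```python
from dataclasses import dataclass
from typing import Optional

MAX_OFFSET = 40  # documented upper bound for the offset parameter


@dataclass(frozen=True)
class Page:
    items: list
    total: int
    limit: int
    offset: int
    relevant_count: Optional[int] = None

    @classmethod
    def from_dict(cls, data: dict) -> "Page":
        """Build a Page from a decoded JSON response body."""
        return cls(
            items=data["items"],
            total=data["total"],
            limit=data["limit"],
            offset=data["offset"],
            relevant_count=data.get("relevant_count"),
        )

    def next_offset(self) -> Optional[int]:
        """Offset of the next page, or None when no further page can be requested."""
        nxt = self.offset + self.limit
        if nxt >= self.total or nxt > MAX_OFFSET:
            return None
        return nxt
```

Because `offset` is capped at 40 and `limit` at 20, a client can page through at most the first 60 results of any query.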

---

## MCP Server Interface

The catalog is also available as an MCP server for programmatic use.

**Remote SSE (for any MCP client):**
```
https://knyazevai.work/sse
```

**MCP tools available:**

| Tool | Description |
|------|-------------|
| search_servers | Semantic search across 23,000+ servers by natural language query |
| list_servers | Browse catalog with filters (registry, status, sort) |
| get_server | Full detail for a server (tools, packages, categories, metrics) |
| discover | Top recommendation(s) for a task — opinionated, agent-friendly |
| get_server_vulnerabilities | CVE/GHSA security data for a server |
| get_catalog_stats | Aggregate statistics (totals, registry breakdown) |

---

## Discovery Endpoints

| Method | Path | Description |
|--------|------|-------------|
| GET | `/llms.txt` | Concise API documentation for LLMs |
| GET | `/llms-full.txt` | Complete API reference (this file) |
| GET | `/.well-known/agent.json` | A2A agent card |
| GET | `/docs` | Interactive Swagger UI |
| GET | `/openapi.json` | OpenAPI 3.1 specification |