Files
ArchiveBox/archivebox/mcp/README.md
2025-12-25 01:51:42 -08:00

3.6 KiB

ArchiveBox MCP Server

Model Context Protocol (MCP) server for ArchiveBox that exposes all CLI commands as tools for AI agents.

Overview

This is a lightweight, stateless MCP server that dynamically introspects ArchiveBox's Click CLI commands and exposes them as MCP tools. It requires zero manual schema definitions - everything is auto-generated from the existing CLI metadata.

Features

  • Auto-discovery: Dynamically discovers all 19+ ArchiveBox CLI commands
  • Zero duplication: Reuses existing Click command definitions, types, and help text
  • Auto-sync: Changes to CLI commands automatically reflected in MCP tools
  • Stateless: No database models or state management required
  • Lightweight: ~200 lines of code

Usage

Start the MCP Server

archivebox mcp

The server runs in stdio mode, reading JSON-RPC 2.0 requests from stdin and writing responses to stdout.

Example Client

import subprocess
import json

# Start MCP server
proc = subprocess.Popen(
    ['archivebox', 'mcp'],
    stdin=subprocess.PIPE,
    stdout=subprocess.PIPE,
    text=True
)

# Send initialize request
request = {"jsonrpc": "2.0", "id": 1, "method": "initialize", "params": {}}
proc.stdin.write(json.dumps(request) + '\n')
proc.stdin.flush()

# Read response
response = json.loads(proc.stdout.readline())
print(response)

Example Requests

Initialize:

{"jsonrpc":"2.0","id":1,"method":"initialize","params":{}}

List all available tools:

{"jsonrpc":"2.0","id":2,"method":"tools/list","params":{}}

Call a tool:

{
  "jsonrpc":"2.0",
  "id":3,
  "method":"tools/call",
  "params":{
    "name":"version",
    "arguments":{"quiet":true}
  }
}

Supported MCP Methods

  • initialize - Handshake and capability negotiation
  • tools/list - List all available CLI commands as MCP tools
  • tools/call - Execute a CLI command with arguments

Available Tools

The server exposes all ArchiveBox CLI commands:

Meta: help, version, mcp Setup: init, install Archive: add, remove, update, search, status, config Workers: orchestrator, worker Tasks: crawl, snapshot, extract Server: server, schedule Utilities: shell, manage

Architecture

Dynamic Introspection

Instead of manually defining schemas, the server uses Click's introspection API to automatically generate MCP tool definitions:

# Auto-discover commands
from archivebox.cli import ArchiveBoxGroup
cli_group = ArchiveBoxGroup()
all_commands = cli_group.all_subcommands

# Auto-generate schemas from Click metadata
for cmd_name in all_commands:
    click_cmd = cli_group.get_command(None, cmd_name)
    # Extract params, types, help text, etc.
    tool_schema = click_command_to_mcp_tool(cmd_name, click_cmd)

Tool Execution

Commands are executed using Click's CliRunner:

from click.testing import CliRunner

runner = CliRunner()
result = runner.invoke(click_command, args)

Files

  • server.py (~350 lines) - Core MCP server with Click introspection
  • archivebox/cli/archivebox_mcp.py (~50 lines) - CLI entry point
  • apps.py, __init__.py - Django app boilerplate

MCP Specification

Implements the MCP 2025-11-25 specification.

Sources