MCP Server Integration

gpumod includes an MCP (Model Context Protocol) server that lets AI assistants manage GPU services directly. The server exposes tools for querying status, simulating VRAM, and switching modes.

All IDE configurations below assume you cloned gpumod and installed it with uv sync. Adjust command paths if you used pip instead.

OpenTelemetry stdout pollution

gpumod depends on opentelemetry. Unless OTEL_SDK_DISABLED=true is set in the server's env block, the SDK may print a startup message to stdout on some systems. That message corrupts the JSON-RPC stream, and MCP clients fail with "Failed to parse JSONRPC message from server". Always include "OTEL_SDK_DISABLED": "true" in your MCP server config.
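
To illustrate why a single stray line is enough to break a client: the MCP stdio transport carries newline-delimited JSON-RPC messages, so a client tries to parse every line of the server's stdout as JSON. The sketch below imitates that loop. The banner text is invented for illustration and is not the actual OpenTelemetry output.

```python
import json

def parse_stdio_stream(lines):
    """Parse each stdout line as a JSON-RPC message, as a stdio client would."""
    messages, errors = [], []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            messages.append(json.loads(line))
        except json.JSONDecodeError:
            # A real client would report a parse failure at this point.
            errors.append(line)
    return messages, errors

# Hypothetical banner text; the real SDK message may differ.
polluted = [
    "telemetry SDK starting up",
    '{"jsonrpc": "2.0", "id": 1, "result": {}}',
]
messages, errors = parse_stdio_stream(polluted)
print(len(messages), "valid message(s);", len(errors), "unparseable line(s)")
```

This sketch skips the bad line and continues; a strict client may instead abort the session on the first line it cannot parse.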

Claude Code

Claude Code discovers MCP servers from .mcp.json in the project root. Create this file in your project (or home directory for global access):

{
  "mcpServers": {
    "gpumod": {
      "command": "uv",
      "args": ["--directory", "/path/to/gpumod", "run", "python", "-m", "gpumod.mcp_main"],
      "env": {
        "GPUMOD_DB_PATH": "~/.config/gpumod/gpumod.db",
        "OTEL_SDK_DISABLED": "true"
      }
    }
  }
}

Or add it via the CLI:

claude mcp add gpumod \
  --env OTEL_SDK_DISABLED=true \
  -- uv --directory /path/to/gpumod run python -m gpumod.mcp_main
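
If the config file already lists other servers, it is safer to merge the gpumod entry in than to overwrite the file. The following is a minimal sketch of such a merge, not a tool shipped with gpumod; the function names are made up for this example, and the entry mirrors the JSON shown above.

```python
import json
from pathlib import Path

def gpumod_entry(gpumod_dir):
    """Build the server entry from the JSON above for a given checkout path."""
    return {
        "command": "uv",
        "args": ["--directory", str(gpumod_dir), "run",
                 "python", "-m", "gpumod.mcp_main"],
        "env": {
            "GPUMOD_DB_PATH": "~/.config/gpumod/gpumod.db",
            "OTEL_SDK_DISABLED": "true",
        },
    }

def merge_gpumod(config, gpumod_dir):
    """Return a copy of config with the gpumod entry added; other servers are kept."""
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers["gpumod"] = gpumod_entry(gpumod_dir)
    merged["mcpServers"] = servers
    return merged

def update_config_file(config_path, gpumod_dir):
    """Read config_path if it exists, merge the entry, and write the file back."""
    path = Path(config_path)
    config = json.loads(path.read_text()) if path.exists() else {}
    path.write_text(json.dumps(merge_gpumod(config, gpumod_dir), indent=2) + "\n")
```

Calling `update_config_file(".mcp.json", "/path/to/gpumod")` from the project root would create or update the file. Because the other IDE sections use the same `mcpServers` layout, the same helper should apply to those files too.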

Cursor

Cursor reads MCP configuration from .cursor/mcp.json in the project root. Create the file:

{
  "mcpServers": {
    "gpumod": {
      "command": "uv",
      "args": ["--directory", "/path/to/gpumod", "run", "python", "-m", "gpumod.mcp_main"],
      "env": {
        "GPUMOD_DB_PATH": "~/.config/gpumod/gpumod.db",
        "OTEL_SDK_DISABLED": "true"
      }
    }
  }
}

After saving, restart the Cursor agent or open Settings > MCP to verify the server is connected.

Claude Desktop

Add the gpumod entry to ~/.config/claude/claude_desktop_config.json (Linux) or ~/Library/Application Support/Claude/claude_desktop_config.json (macOS):

{
  "mcpServers": {
    "gpumod": {
      "command": "uv",
      "args": ["--directory", "/path/to/gpumod", "run", "python", "-m", "gpumod.mcp_main"],
      "env": {
        "GPUMOD_DB_PATH": "~/.config/gpumod/gpumod.db",
        "OTEL_SDK_DISABLED": "true"
      }
    }
  }
}

Restart Claude Desktop after editing the config.

Antigravity (Google)

Antigravity stores MCP config in mcp_config.json. To edit it:

1. Open the ... dropdown at the top of the agent panel.
2. Click Manage MCP Servers.
3. Click View raw config.
4. Add the gpumod entry:

{
  "mcpServers": {
    "gpumod": {
      "command": "uv",
      "args": ["--directory", "/path/to/gpumod", "run", "python", "-m", "gpumod.mcp_main"],
      "env": {
        "GPUMOD_DB_PATH": "~/.config/gpumod/gpumod.db",
        "OTEL_SDK_DISABLED": "true"
      }
    }
  }
}

Save the file; the server connects automatically.

Running the MCP server manually

For testing or debugging, run the server directly:

cd /path/to/gpumod
uv run python -m gpumod.mcp_main

The server starts in stdio mode, which is the standard transport for MCP clients. Set GPUMOD_LOG_LEVEL=DEBUG for verbose output.
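
A server in stdio mode waits silently for input, so a quick way to confirm it works is to send the `initialize` request that every MCP client sends first. The sketch below does that with `subprocess`. The protocol version string and the `clientInfo` values are assumptions made for this example; the server may negotiate a different version.

```python
import json
import os
import subprocess

def initialize_request(request_id=1):
    """Build the JSON-RPC initialize request an MCP client sends first."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "initialize",
        "params": {
            "protocolVersion": "2024-11-05",  # assumed; adjust to your client
            "capabilities": {},
            "clientInfo": {"name": "smoke-test", "version": "0.0.1"},
        },
    }

def probe_server(gpumod_dir):
    """Start the server, send initialize, and return the first stdout line as JSON."""
    env = {**os.environ, "OTEL_SDK_DISABLED": "true"}
    proc = subprocess.Popen(
        ["uv", "--directory", gpumod_dir, "run", "python", "-m", "gpumod.mcp_main"],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        text=True,
        env=env,
    )
    try:
        # Messages are newline-delimited, one JSON object per line.
        proc.stdin.write(json.dumps(initialize_request()) + "\n")
        proc.stdin.flush()
        # Raises json.JSONDecodeError if anything other than JSON reaches stdout.
        return json.loads(proc.stdout.readline())
    finally:
        proc.kill()
        proc.wait()
```

Running `probe_server("/path/to/gpumod")` should return a response whose `id` matches the request. A `JSONDecodeError` here suggests something other than JSON-RPC is being written to stdout.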

Available MCP Tools

The MCP server exposes 16 tools of four types (read-only, mutating, discovery, and consulting):

Tool	Description	Type
`gpu_status`	Get current GPU status, VRAM usage, running services	Read-only
`list_services`	List all registered services with driver type and VRAM	Read-only
`list_modes`	List all available GPU modes	Read-only
`service_info`	Get detailed info for a specific service	Read-only
`model_info`	Get model metadata and VRAM estimates	Read-only
`simulate_mode`	Simulate VRAM for a mode with optional changes	Read-only
`switch_mode`	Switch to a different GPU mode (starts/stops services)	Mutating
`start_service`	Start a specific service	Mutating
`stop_service`	Stop a specific service	Mutating
`search_hf_models`	Search HuggingFace for models by author/keyword/task/driver	Discovery
`list_gguf_files`	List GGUF files in a repo with size and VRAM estimates	Discovery
`list_model_files`	List model files (GGUF or Safetensors) with format detection	Discovery
`fetch_model_config`	Fetch config.json from a HuggingFace repo	Discovery
`generate_preset`	Generate preset YAML configuration for a GGUF model	Discovery
`fetch_driver_docs`	Fetch driver documentation (llama.cpp or vLLM)	Discovery
`consult`	Multi-step reasoning for complex GPU/model questions	Consulting

Mutating tools are clearly marked in their descriptions and should trigger confirmation prompts in MCP clients.
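
For a custom client or wrapper script, the same distinction can be enforced locally. The sketch below is one hypothetical way to gate mutating tools behind a confirmation step; the tool names and types are copied from the table above, while the helper itself is not part of gpumod.

```python
# Tool names and types copied from the table above.
TOOL_TYPES = {
    "gpu_status": "read-only",
    "list_services": "read-only",
    "list_modes": "read-only",
    "service_info": "read-only",
    "model_info": "read-only",
    "simulate_mode": "read-only",
    "switch_mode": "mutating",
    "start_service": "mutating",
    "stop_service": "mutating",
    "search_hf_models": "discovery",
    "list_gguf_files": "discovery",
    "list_model_files": "discovery",
    "fetch_model_config": "discovery",
    "generate_preset": "discovery",
    "fetch_driver_docs": "discovery",
    "consult": "consulting",
}

def needs_confirmation(tool_name):
    """Return True when a tool changes GPU state and should be confirmed first."""
    if tool_name not in TOOL_TYPES:
        raise KeyError(f"unknown tool: {tool_name}")
    return TOOL_TYPES[tool_name] == "mutating"

mutating = sorted(name for name in TOOL_TYPES if needs_confirmation(name))
print(mutating)
```

A wrapper could call `needs_confirmation` before forwarding a request and prompt the user only for the three mutating tools.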

Discovery Tools

The discovery tools help AI assistants find and configure new models:

search_hf_models

Parameters:
  author: str | None     # HuggingFace org (default: all)
  search: str | None     # Keyword search in model names
  task: str | None       # Filter: code, chat, embed, reasoning
  driver: str | None     # Filter: llamacpp (GGUF), vllm (Safetensors), any
  limit: int = 20        # Max results (1-100)
  no_cache: bool = False # Bypass cache

Returns: { models: [...], count: int }
  # When the driver param is used, models include model_format and driver_hint
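
On the wire, an MCP client invokes a tool with a JSON-RPC `tools/call` request. The sketch below builds such a request for search_hf_models from the parameters listed above and checks the documented 1-100 range for `limit` on the client side. The builder function and the example argument values are illustrative, not part of gpumod.

```python
import json

def search_hf_models_call(request_id, *, author=None, search=None, task=None,
                          driver=None, limit=20, no_cache=False):
    """Build a tools/call request for search_hf_models."""
    if not 1 <= limit <= 100:
        raise ValueError("limit must be between 1 and 100")
    arguments = {"limit": limit, "no_cache": no_cache}
    # Optional filters are sent only when a value was given.
    optional = {"author": author, "search": search, "task": task, "driver": driver}
    arguments.update({k: v for k, v in optional.items() if v is not None})
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "search_hf_models", "arguments": arguments},
    }

# Example values are illustrative only.
request = search_hf_models_call(2, author="unsloth", task="code",
                                driver="llamacpp", limit=5)
print(json.dumps(request, indent=2))
```

The other tools follow the same request shape, with a different `name` and `arguments`.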

list_gguf_files

Parameters:
  repo_id: str           # e.g., "unsloth/Qwen3-Coder-Next-GGUF"
  vram_budget_mb: int | None  # Filter files that fit in VRAM

Returns: { repo_id, files: [...], count: int }

list_model_files (unified format support)

Parameters:
  repo_id: str           # e.g., "unsloth/Qwen3-Coder-Next-GGUF"
  vram_budget_mb: int | None  # Filter files that fit in VRAM

Returns: { repo_id, files: [...], count, model_format, driver_hint }
  # model_format: "gguf" | "safetensors" | "unknown"
  # driver_hint: "llamacpp" | "vllm" | null
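
The two extra fields appear to pair up the same way as the driver filter on search_hf_models: GGUF with llamacpp and Safetensors with vllm. The sketch below writes that pairing out so a client can sanity-check a response. How gpumod derives driver_hint internally is an assumption here, and the sample response is invented.

```python
# Pairings taken from the parameter notes above; the server's own logic may differ.
FORMAT_TO_DRIVER = {"gguf": "llamacpp", "safetensors": "vllm"}

def driver_for_format(model_format):
    """Return the driver paired with a model_format, or None when there is none."""
    return FORMAT_TO_DRIVER.get(model_format)

def hint_matches_format(response):
    """Check that a response's driver_hint agrees with its model_format."""
    return response.get("driver_hint") == driver_for_format(response.get("model_format"))

# Hypothetical response, trimmed to the two fields of interest.
sample = {"model_format": "gguf", "driver_hint": "llamacpp"}
print(hint_matches_format(sample))
```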

generate_preset

Parameters:
  repo_id: str           # HuggingFace repo ID
  gguf_file: str         # GGUF filename to use
  context_size: int = 8192  # Context window size
  service_id: str | None # Custom service ID

Returns: { preset: str, service_id: str }

Available MCP Resources

The MCP server provides 8 browsable resources:

URI	Description
`gpumod://help`	Overview of gpumod capabilities
`gpumod://config`	Current configuration and settings
`gpumod://modes`	List all defined modes
`gpumod://modes/{mode_id}`	Detail view of a specific mode
`gpumod://services`	List all registered services
`gpumod://services/{service_id}`	Detail view of a specific service
`gpumod://models`	List all registered models
`gpumod://models/{model_id}`	Detail view of a specific model
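
Entries containing `{mode_id}`, `{service_id}`, or `{model_id}` are URI templates: the client fills in the placeholder and then reads the resulting URI with a JSON-RPC `resources/read` request. The sketch below shows both steps. The helper names and the mode ID "coding" are made up for this example.

```python
def expand_uri(template, **params):
    """Fill {placeholders} in a resource URI template; raises KeyError if one is missing."""
    return template.format(**params)

def read_resource_request(request_id, uri):
    """Build a resources/read request for a concrete resource URI."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "resources/read",
        "params": {"uri": uri},
    }

# "coding" is a hypothetical mode ID.
uri = expand_uri("gpumod://modes/{mode_id}", mode_id="coding")
request = read_resource_request(3, uri)
print(request)
```

URIs without placeholders, such as gpumod://help, can be passed to `read_resource_request` directly.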