gpumod¶

Name: gpumod
Author: Jaigouk Kim


GPU Service Manager for ML workloads on Linux/NVIDIA systems.

gpumod manages vLLM, llama.cpp, FastAPI, and Docker-based inference services on NVIDIA GPUs. It tracks VRAM allocation, supports mode-based service switching, provides VRAM simulation before deployment, and exposes an MCP server for AI assistant integration.

Features¶

Service Management -- Register, start, stop, and monitor GPU services with support for vLLM, llama.cpp, FastAPI, and Docker drivers
Mode Switching -- Define named modes (e.g., "chat", "coding") that bundle services together and switch between them
VRAM Simulation -- Simulate VRAM usage for any configuration before deployment, with alternative suggestions when capacity is exceeded
Model Registry -- Track ML models with metadata from HuggingFace Hub or GGUF files, with automatic VRAM estimation
MCP Server -- Expose GPU management as an MCP server for Claude Code, Cursor, Claude Desktop, and other MCP-compatible AI assistants
Template Engine -- Generate and install systemd unit files from Jinja2 templates, customized per driver type
AI Planning -- LLM-assisted VRAM allocation suggestions (advisory only)
Interactive TUI -- Terminal dashboard with live GPU status
Rich CLI -- Formatted terminal output with tables and VRAM bar charts, plus a JSON mode
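
To make the VRAM Simulation idea concrete, here is a minimal sketch of a capacity check. It assumes each service declares an estimated VRAM footprint in MiB and that a mode fits when the sum stays within GPU capacity. The `Service` class, the `simulate_mode` function, the `embeddings` service, and all numbers are hypothetical illustrations, not gpumod's actual API or data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Service:
    """Hypothetical service record; not gpumod's real data model."""
    name: str
    vram_mib: int  # estimated VRAM footprint in MiB


def simulate_mode(services: list[Service], capacity_mib: int) -> dict:
    """Sum the estimated VRAM of all services and compare it to capacity."""
    required = sum(s.vram_mib for s in services)
    return {
        "required_mib": required,
        "capacity_mib": capacity_mib,
        "fits": required <= capacity_mib,
        "headroom_mib": capacity_mib - required,  # negative when over capacity
    }


# Illustrative numbers only: two services on a 24 GiB card.
coding_mode = [Service("vllm-chat", 14_000), Service("embeddings", 3_000)]
result = simulate_mode(coding_mode, capacity_mib=24_576)
print(result)
```

A real simulation would likely also account for KV cache, context length, and driver overhead; this sketch only shows the basic sum-and-compare step.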

Quick Start¶

# Clone and install
git clone https://github.com/jaigouk/gpumod.git
cd gpumod
uv sync
uv tool install -e .  # makes `gpumod` available globally

# Initialize database and load presets
gpumod init

# Check GPU status
gpumod status

# Deploy a service (auto-generates systemd unit file)
gpumod template generate vllm-chat
gpumod template install vllm-chat --yes
gpumod service start vllm-chat

# Simulate VRAM usage before switching modes
gpumod simulate mode coding-mode

# Switch modes (starts/stops services automatically)
gpumod mode switch coding-mode

# Launch interactive TUI
gpumod tui

See the Getting Started guide for sudoers configuration and full deployment instructions.
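
The simulate-then-switch steps above can also be scripted. The following is a rough sketch, not part of gpumod: `run_cli` and `safe_switch` are hypothetical helper names, and the code assumes that `gpumod simulate mode` returns a non-zero exit code when the mode does not fit. Verify that assumption against your installed version before relying on it.

```python
import subprocess
from typing import Callable, Sequence

# A runner takes a command line and returns its exit code.
Runner = Callable[[Sequence[str]], int]


def run_cli(args: Sequence[str]) -> int:
    """Run a command and return its exit code without raising on failure."""
    return subprocess.run(list(args), check=False).returncode


def safe_switch(mode: str, run: Runner = run_cli) -> bool:
    """Simulate first; switch only if the simulation exits with 0.

    Assumption: `gpumod simulate mode` signals a failed fit through a
    non-zero exit code.
    """
    if run(["gpumod", "simulate", "mode", mode]) != 0:
        return False
    return run(["gpumod", "mode", "switch", mode]) == 0
```

Calling `safe_switch("coding-mode")` would run both commands from the Quick Start in order. The `run` parameter exists so the logic can be exercised with a stub runner on machines without a GPU.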

MCP Integration¶

gpumod exposes 16 tools and 8 resources via the Model Context Protocol. Add the following entry to your MCP client's configuration to let AI assistants query GPU status, simulate VRAM usage, and switch modes.

{
  "mcpServers": {
    "gpumod": {
      "command": "uv",
      "args": ["--directory", "/path/to/gpumod", "run", "python", "-m", "gpumod.mcp_main"]
    }
  }
}

See MCP Integration for setup instructions for Claude Code, Cursor, Claude Desktop, and Antigravity.
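
If you prefer to generate the entry rather than edit the path by hand, a small helper can build the same structure for a given checkout directory. This is only a convenience sketch: `mcp_config` is a hypothetical function name, and where the resulting JSON must be saved depends on the client, so the snippet prints it instead of writing a file.

```python
import json


def mcp_config(repo_dir: str) -> dict:
    """Build the `mcpServers` entry shown above for a given checkout path."""
    return {
        "mcpServers": {
            "gpumod": {
                "command": "uv",
                "args": [
                    "--directory", repo_dir,
                    "run", "python", "-m", "gpumod.mcp_main",
                ],
            }
        }
    }


# Replace the placeholder with the directory where you cloned gpumod.
print(json.dumps(mcp_config("/path/to/gpumod"), indent=2))
```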

Requirements¶

uv >= 0.4
Python >= 3.12
Linux with NVIDIA GPU
nvidia-smi in PATH
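
A quick preflight check for these requirements might look like the sketch below. `check_requirements` is a hypothetical helper, not something gpumod ships. It only confirms that `uv` and `nvidia-smi` are on PATH; it does not verify the uv version or that a GPU is actually usable.

```python
import platform
import shutil
import sys


def check_requirements() -> dict[str, bool]:
    """Report which of the listed requirements appear to be satisfied."""
    return {
        "linux": platform.system() == "Linux",
        "python>=3.12": sys.version_info >= (3, 12),
        "uv on PATH": shutil.which("uv") is not None,
        "nvidia-smi on PATH": shutil.which("nvidia-smi") is not None,
    }


for name, ok in check_requirements().items():
    print(f"{'ok' if ok else 'MISSING':8} {name}")
```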
