About civ6-mcp
Can a language model play a full game of Civilization VI? Not tic-tac-toe, not chess — a 300-turn 4X strategy game with fog of war, seven concurrent domains, six victory paths, and an action space that grows to 10^166 possible moves.
civ6-mcp is the environment that makes this question testable. It connects any MCP-compatible model to a live Civ VI game through the same tool-calling protocol agents already use for databases, APIs, and file operations — then lets them play.
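Concretely, "playing" means issuing the same JSON-RPC `tools/call` requests an MCP agent would send to any other server. Below is a minimal sketch of what one such request could look like on the wire; the tool name `move_unit` and its arguments are hypothetical illustrations, not entries from the actual 76-tool list:

```python
import json

def tool_call_request(request_id: int, tool: str, arguments: dict) -> str:
    """Build an MCP tools/call request as a JSON-RPC 2.0 message."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    })

# Hypothetical example: ask the server to move a unit to a hex.
msg = tool_call_request(1, "move_unit", {"unit_id": 7, "x": 14, "y": 9})
print(msg)
```

The point is that nothing game-specific leaks into the protocol: from the model's side, moving a warrior is shaped exactly like querying a database.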
How It Works
The agent never touches the game directly. Every command flows through four layers, each translating between the world above and below it.
LLM Agent: Claude, GPT, Gemini — any model that speaks MCP
MCP Server: 76 tools + a narration layer translating visuals to text
FireTuner: binary TCP protocol injecting Lua into the live game
Civilization VI: the full game engine enforcing all rules — no cheats, no omniscience
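The FireTuner hop is the one exotic link in the chain: Lua source travels over a TCP socket in binary frames and executes inside the live game. The sketch below shows the general shape of such a bridge; the framing (a 4-byte little-endian length prefix plus a null-terminated UTF-8 payload) is an illustrative assumption, not FireTuner's actual wire format:

```python
import struct

def frame_lua(lua_source: str) -> bytes:
    """Wrap a Lua snippet in a hypothetical length-prefixed binary frame.

    ASSUMPTION: 4-byte little-endian length, then the null-terminated
    UTF-8 payload. The real FireTuner framing differs in its details.
    """
    payload = lua_source.encode("utf-8") + b"\x00"
    return struct.pack("<I", len(payload)) + payload

# Example: a command the MCP server might inject to end the turn.
frame = frame_lua("Game.EndTurn()")
```

Whatever the exact format, the key property holds: the game engine only ever sees Lua it would accept from any in-game script, so the agent is bound by the same rules as a human player.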
Seven Concurrent Domains
Every turn requires reasoning across all of these simultaneously. There is no phase structure — the agent decides what to attend to, what to defer, and what to ignore. Ignore the wrong thing and a religious victory you never saw coming ends the game.
The Narration Layer
A human player absorbs the minimap, score ticker, religion lens, unit health bars, and army positions at a glance. An LLM gets none of that passively. The narration layer translates Civ VI's visual state into structured text — markdown hex maps with terrain and threat markers, per-unit status readouts, city yield summaries, diplomatic relationship graphs — preserving the decision-relevant information a human extracts from the screen.
Each narration function flags urgency (bold threat markers, !! warnings), provides context for action (valid attack targets, buildable improvements), and compresses intelligently (fog tiles marked rather than omitted, resources classified by type).
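To give a flavour of that compression, here is a toy per-tile narrator. The tile schema (`terrain`, `fogged`, `enemy`) and the threat rule are invented for illustration; the real narration functions carry far more detail:

```python
def narrate_tile(tile: dict) -> str:
    """Render one hex as a compact text cell, flagging threats loudly.

    Hypothetical schema for illustration: 'terrain', 'fogged', and an
    optional 'enemy' field; the real narration layer is richer.
    """
    if tile.get("fogged"):
        return "??"          # fog is marked, never silently omitted
    cell = tile["terrain"][:2].upper()
    if tile.get("enemy"):
        cell += "!!"         # urgency marker for a hostile unit
    return cell

row = [
    {"terrain": "grassland"},
    {"terrain": "hills", "enemy": "barbarian warrior"},
    {"fogged": True},
]
line = " | ".join(narrate_tile(t) for t in row)
print(line)  # GR | HI!! | ??
```

Even this toy version shows the design constraint: every cell must be short enough to fit hundreds of tiles in context, yet loud enough that a threat is impossible to skim past.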
The Sensorium Effect
The narration layer can tell the agent everything a human sees — but it can't replicate how a human sees it. A player glances at the minimap and notices a religion spreading. The agent only knows what it explicitly queries. Information it doesn't ask for doesn't enter its world model.
This produces a consistent pattern: game systems that need proactive monitoring go unmonitored until crisis forces attention. A rival converts every city to their religion over 100 turns. A barbarian camp spawns armies six tiles away. Gold piles up past 2,000 while the diary says “should spend.” The agent articulates the right strategy and then doesn't execute it — not because it can't, but because it doesn't look.
This is an architectural property of any agent that perceives a rich environment through text queries. It generalises well beyond Civilization, and it's what makes this environment interesting as a benchmark.
Open Source
The environment, all 76 tools, the narration layer, and the agent playbook are MIT-licensed. Any model that supports MCP can play. Game archives with full turn-by-turn diaries, agent reflections, and tool call logs are browsable on this site.
Built by Liam Wilkinson, Jamie Heagherty, Harry Coppock, and Austin Andrews.