Hammerstein.
A strategic-reasoning AI for tabletop wargames. Trained on the Hammerstein-Equord doctrine. Hosted at hammerstein.ai.
What it does
Wargamer mode. You upload three things:
- A photo of your current board state
- A short status report — turn number, your situation, what is giving you trouble
- (Optional) The rulebook PDF, digested into an AI Commander Reference for that specific game
You get back kriegspiel-style Auftragstaktik orders for the side you are playing. The specific moves your subordinates would receive from a competent commander, not generic strategy tips. Persistent campaign context, so it remembers what happened in turn 3 when you ask about turn 7.
It works for any tabletop wargame. The framework is doctrine-driven; the model carries the framework forward into the specific game you're running.
Wargamer mode generating orders for an in-progress campaign. The screenshot Ty Bomba reacted to in the r/LocalLLM cross-post (Board Wargamer FB group, 2026-05-10).
Why this exists
I design tabletop wargames at Conflict Simulations Limited. The framework I use to think through scenarios — Hammerstein-Equord's clever-lazy / clever-industrious / stupid-industrious / stupid-lazy diagnostic — is the same one that drives this AI.
The framework is open source: github.com/lerugray/hammerstein. The distilled local model is open source: huggingface.co/lerugray/hammerstein-7b-lora. The hosted Wargamer mode — the part that is actually useful for an operator running a campaign — is what this site sells.
Proof
The framework wins blind LLM-judge head-to-heads against raw frontier models. We ran a benchmark on 2026-05-10:
- v0: 6 strategic-reasoning questions × 3 frontier families (Opus 4.7, Sonnet 4.6, GPT-5) × Hammerstein-vs-raw, judged blind by 4 LLM judges across 3 vendors (Anthropic Opus + Sonnet, OpenAI GPT-5, DeepSeek). 53 of 54 ratings preferred Hammerstein-on-frontier.
- v0.1 generic: 4 out-of-domain strategic-reasoning questions, same setup. 48 of 48 ratings preferred Hammerstein. Unanimous across judges and families.
- v0.1 + v0.2 ablation: the framework wins by different mechanisms per model. On Sonnet the full stack beats both ablations; on Opus the components are interchangeable; on GPT-5 corpus-only outperforms full. Headline holds; the optimal product shape is model-specific.
Methodology, per-rating verdicts, and reproducibility instructions: eval/RESULTS-v0.1.md. The benchmark is open source. If you replicate and get materially different results, open an issue.
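The blind head-to-head protocol can be sketched in a few lines. This is an illustrative reconstruction, not the repo's eval code: `judge` here is a length-based stub standing in for a real LLM judge, and `run_headtohead` is a hypothetical name.

```python
import random

# Sketch of a blind pairwise eval: each judge sees two anonymous
# answers ("A" and "B") in randomized order, so it cannot favor a
# system by name. Stubbed; the real judges are LLM calls.

def judge(question, answer_a, answer_b):
    # Stub judge: prefers the longer answer. A real judge returns
    # "A" or "B" from an LLM given only the two anonymous answers.
    return "A" if len(answer_a) >= len(answer_b) else "B"

def run_headtohead(questions, hammerstein, raw, judges, rng):
    wins, total = 0, 0
    for q in questions:
        h_ans, r_ans = hammerstein(q), raw(q)
        for j in judges:
            # Blinding: coin-flip which side is shown as "A".
            if rng.random() < 0.5:
                hammerstein_won = j(q, h_ans, r_ans) == "A"
            else:
                hammerstein_won = j(q, r_ans, h_ans) == "B"
            wins += hammerstein_won
            total += 1
    return wins, total
```

With 4 questions, 3 model families, and 4 judges, this loop yields the 48 ratings reported for v0.1 generic.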
Pricing
$15 / month
Recurring. Wargamer mode, all your campaigns, no query cap during early access. Early-access pricing for the first 50 subscribers; it rises after that.
Subscribe — early access
FAQ
- What does "Auftragstaktik" mean?
- Mission-type tactics. Orders that specify the intent, not the script — what the higher echelon wants accomplished, and the latitude subordinates are given to figure out how. The model produces orders in this register because that's the doctrine it's tuned to.
- Does it work for [insert specific tabletop wargame here]?
- If you can photograph the board and describe the rules, yes. The optional rulebook PDF gets digested into an AI Commander Reference for that specific game. Tested on hex-and-counter, area-movement, card-driven, and block-and-counter games so far.
- Is this multiplayer?
- No. Single-player at MVP — you against the model, or you using the model as your subordinate-orders generator while you play either side solo. Multiplayer is a future tier.
- What model is behind it?
- Anthropic Sonnet for vision + reasoning at MVP. The framework is what does the work; the underlying model can change without changing the user experience. The open-source local version (Hammerstein-7B QLoRA on HuggingFace) is the proof that the framework survives the model.
- What happens when Anthropic changes the model behavior? (Opus 4.7 reactions, etc.)
- Model behavior drifts release over release — Reddit threads complaining about Opus 4.7 vs 4.6 are recurring. The Hammerstein layer is a system prompt that sits above whichever frontier model is current. When the model changes, the framework still applies the same reasoning shape: the clever-lazy / clever-industrious / stupid-industrious / stupid-lazy diagnostic; verification over enthusiasm; legible failure. Model is variable; framework is constant. The v0.1 benchmark in the open-source hammerstein repo measures this directly across Opus 4.7, Sonnet 4.6, and GPT-5 — see § Proof above and eval/RESULTS-v0.1.md for full methodology and verdicts.
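That layering can be made concrete. A minimal sketch, assuming a stubbed chat-completion call — `call_model` is hypothetical, and the doctrine text is paraphrased, not the actual system prompt:

```python
# Sketch of the framework as a model-agnostic system-prompt layer.
# The prompt text is a paraphrase of the doctrine, not the real canon.

HAMMERSTEIN_SYSTEM_PROMPT = (
    "Diagnose actors as clever-lazy, clever-industrious, "
    "stupid-industrious, or stupid-lazy. Prefer verification over "
    "enthusiasm. Fail legibly."
)

def call_model(model_name, messages):
    # Stub for a real chat-completion API call; echoes the system
    # prompt it received so the layering is visible.
    return f"[{model_name}] system={messages[0]['content'][:20]}..."

def hammerstein(model_name, user_prompt):
    """Model is variable; the framework (system prompt) is constant."""
    messages = [
        {"role": "system", "content": HAMMERSTEIN_SYSTEM_PROMPT},
        {"role": "user", "content": user_prompt},
    ]
    return call_model(model_name, messages)
```

Swapping `model_name` from one frontier release to the next leaves the system prompt, and therefore the reasoning shape, untouched.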
- What about my campaign data?
- Stored per-account, used only to power the persistent-context feature for your campaigns. Not shared, not sold, not used to train any model. You can delete a campaign and its data is gone.
- Refund policy?
- Cancel anytime; no refund on the current month, but no future charges. If something is genuinely broken on my end, email and we'll work it out.
- Who's behind this?
- Ray Weiss, designer at Conflict Simulations Limited. The framework is named after Kurt von Hammerstein-Equord, the German general whose doctrine the project is tuned to. Posted to r/LocalLLM and the Board Wargamer FB group on 2026-05-10; Ty Bomba liked the screenshot.
The free pieces
The framework, the local model, and the CLI / TUI are open source and stay open source. If you want to run Hammerstein locally with no subscription, those tools are free:
- hammerstein — framework canon (system prompt + RAG corpus)
- hammerstein-7b-lora — distilled QLoRA model on Qwen2.5-7B, runs on any 8GB+ Mac via Ollama
- hammerstein-tui — Rust TUI for daily-driver use
The hosted Wargamer is the paid piece because it's the piece that needs hosted vision, persistent campaign storage, and paid Sonnet API calls. Everything else stays free.