Agent-Coordinated Operations

A positioning note. The architectural pattern in symphony-pattern-agent-control-plane generalizes well past lean tech teams. This note captures the broader claim and what it means for Guild + Emporium positioning.

The pattern

Three ingredients:

Durable shared context. A graph of state, history, decisions, and intent that humans and agents can both read and write.
Capable agents. Agents that can act meaningfully on that context, not just answer questions about it.
A human willing to direct and review rather than execute. Operator role shifts from doer to conductor.

Any domain where all three exist gets eaten by this shape. Software is first because the context is already digital and the actions are already automatable. Other domains follow as their context graphs digitize.

Where Dungeon actually is (2026-05-02)

Earlier draft of this note overstated the operating model. Fact-checked against the repos and vault:

Two coordination surfaces, not one. GitHub Issues for guild. The vault itself for emporium. These are deliberately different — the choice tracks the work shape, not the team’s preference for a single tool.
Guild on GitHub Issues: dungeonbooks/guild has 43 issues total (27 open / 16 closed), ~40 created in April 2026. ~30 days of real tracker usage. Discrete engineering work-units fit Issues well.
Emporium in the vault: dungeonbooks/emporium has 0 GitHub issues. Coordination happens here in dungeonbooks/docs/ — plans, platform-eval audits, medusa-v2-platform-guide, journal entries. Emporium work is more architectural and exploratory; the vault’s prose-and-links shape fits it better than ticket form.
Other repos: webshop, cataloger, invoicer, store, buyer have 0 issues each. marty has 4. Most of these are dormant or exploratory.
Operations work (inventory, reconciliation, bookkeeping, email, social) is touched by agents during active Claude Code sessions but is not formally coordinated through either Issues or the vault today. Slack and ad-hoc sessions carry that load.
Agents are not running 24/7. Cost-conscious posture. Pattern is: Panat opens Claude Code while working on Guild or Emporium, has a few sessions running concurrently, drives them, closes them. Not autonomous background workers.

What is true:

The Symphony-shape loop runs for Guild engineering: GitHub Issues + Claude Code + Graphite PRs + Copilot review + vault context.
An adjacent loop runs for Emporium architectural work, with the vault as the durable coordination surface instead of a tracker.
Light agent assistance happens across operations session-by-session, not as a tracked workflow.

What is not true (yet):

“Dungeon runs its entire business through agents coordinated via GitHub Issues.”
Quantifiable leverage ratio across operations.
A documented conventions playbook for either surface.

Implication: two surfaces is a feature, not a bug

The vault-vs-Issues split is itself useful intelligence. Issues fit discrete, closeable work-units. The vault fits durable, evolving context. A future partner-shop offering probably needs both — agent-coordinated work-units (campaigns, member edits, reconciliation runs) plus a shared vault-shape for plans, decisions, and accumulated context. Treating “the issue tracker is the agent control plane” as the whole answer misses half the substrate we’re already using.

The pitch can’t lead with the unified-operating-model claim. It can lead with “Guild engineering runs the Symphony loop; we are extending it outward category by category as cost and confidence allow.” That’s defensible. The previous framing wasn’t.

Architectural patterns

Two reusable patterns sit underneath the broader thesis. Worth naming them explicitly so they generalize.

Pattern 1: Escalation-as-liberal-paternalism

Agents read everything. Mutating actions of consequence escalate to a human-reviewed work-unit. Path of least resistance for any consequential action is to file a tracked issue a human reviews.

This is the architectural form of liberal paternalism applied to agent operations:

Agency preserved: operator can override, redirect, approve faster paths over time as trust builds.
Paved path low-friction: agent does prep work, structures the issue with full context, attaches conversation/state.
Audit trail by construction: every consequential action is a tracked work-unit with provenance.

Without escalation, agents either mutate live store data with no audit, or refuse to do anything useful. Escalation collapses both failure modes.

First implementation: srctoolsescalation in Marty/Shopkeeper. open_issue + notify_human, per-persona-configured escalation targets, per-shop issue queues (not centralized — each shop owns their queue).
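A minimal sketch of the escalation gate described above, assuming an in-memory queue per shop. Only the names open_issue and notify_human come from this note; their real signatures are not given here, so every class, field, and parameter below is hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, List, Union


@dataclass
class Issue:
    shop: str
    title: str
    body: str


@dataclass
class Escalator:
    """Hypothetical per-shop escalation target: each shop owns its own queue."""
    shop: str
    reviewer: str
    queue: List[Issue] = field(default_factory=list)
    notifications: List[str] = field(default_factory=list)

    def open_issue(self, title: str, body: str) -> Issue:
        # File a tracked work-unit in this shop's queue.
        issue = Issue(self.shop, title, body)
        self.queue.append(issue)
        return issue

    def notify_human(self, issue: Issue) -> None:
        # Stand-in for a real notification channel.
        self.notifications.append(f"{self.reviewer}: please review '{issue.title}'")


def run_action(
    name: str,
    consequential: bool,
    action: Callable[[], Any],
    context: str,
    escalator: Escalator,
) -> Union[Any, Issue]:
    """Run reads directly; turn consequential mutations into a reviewed issue."""
    if not consequential:
        return action()
    # The mutation is NOT executed here. The agent's prep work travels in the body.
    issue = escalator.open_issue(title=f"Approve: {name}", body=context)
    escalator.notify_human(issue)
    return issue
```

In this sketch a read returns its result immediately, while a consequential call returns the filed Issue and leaves the mutation unexecuted until a human acts on it. That is one way to get the audit trail by construction: there is no code path that mutates without a queue entry.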

Pattern 2: Scheduled-research-job

Deterministic external input → agent reasoning grounded in store data → tracked work output for human review. A scheduled agent acts as research analyst, not actor. Output goes to a queue, not directly to the world.

Generalizable instances:

Trending books RSS → “what should we order?”
Event listings → “what should we host?”
Customer feedback → “what should we change?”
Inventory data → “what should we discount or promote?”
Calendar of book releases → “what pre-orders should we set up?”
Sales velocity by category → “what’s at risk of stockout?”

Cost-bounded (one run per cycle, not per user message). Deterministic input source (easier to debug than open-ended web search). Auditable by construction (output is a tracked issue, not a side effect). Cheap to ship — extends RSS plumbing already in marty (feeds.py).

First implementation: rss-as-the-entry-point-for-agent-research-jobs — trending-books feed extending the existing feeds.py shape, then research agent layered on top.

The two patterns compose: a scheduled research job’s output is itself an escalation. Agent doesn’t order books; agent files an order recommendation issue Carrie reviews. Read everywhere, write only via tracked work-units, run on cycles instead of constantly.
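The research-job shape can be sketched in a few lines, assuming an RSS 2.0 feed and a stock lookup keyed by title. This is an illustration, not the feeds.py implementation: the sample feed, the function names, and the issue fields are hypothetical, and the recommend function is a trivial stand-in for the agent reasoning step.

```python
import xml.etree.ElementTree as ET
from typing import Callable, Dict, List

# Hypothetical sample feed; a real job would fetch this on a schedule.
SAMPLE_RSS = """<rss version="2.0"><channel>
<item><title>The Long Quest</title></item>
<item><title>Dice and Dragons</title></item>
</channel></rss>"""


def read_titles(rss_xml: str) -> List[str]:
    """Deterministic input: pull item titles out of the feed."""
    root = ET.fromstring(rss_xml)
    return [item.findtext("title", default="") for item in root.iter("item")]


def recommend(titles: List[str], stock: Dict[str, int]) -> List[str]:
    """Stand-in for agent reasoning: flag trending titles with no stock."""
    return [t for t in titles if stock.get(t, 0) == 0]


def run_research_job(
    rss_xml: str,
    stock: Dict[str, int],
    reason: Callable[[List[str], Dict[str, int]], List[str]] = recommend,
) -> Dict[str, str]:
    """One run per cycle. The output is a draft issue, never an order."""
    picks = reason(read_titles(rss_xml), stock)
    body = "\n".join(f"- [ ] {title}" for title in picks)
    return {
        "title": "Order recommendation: trending books",
        "body": body,
        "status": "needs-review",
    }
```

Calling run_research_job(SAMPLE_RSS, {"The Long Quest": 3}) yields a draft whose body lists only "Dice and Dragons". The job returns a work-unit for review and has no code path that places an order, which is the point of the pattern.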

What’s actually next

Two decisions worth making before the pitch hardens:

Pick one operations category and formalize it. Bookkeeping/reconciliation is the strongest candidate — it has clear inputs (Square exports, bank statements), clear outputs (categorized entries), and is already partially agent-touched. Move it onto GitHub Issues (or a dedicated repo) for 30 days. Measure.
Decide the cost ceiling. “Not ready to full-send 24/7” is the right posture, but it is also a strategic constraint worth making explicit. What monthly agent spend is acceptable for what leverage? Without a number, the cost-conscious posture quietly caps every adjacent decision.

Leverage ratio (to measure, not yet measured)

The pitch and pricing depend on quantifying leverage. Once one operations category runs through Issues for 30 days, capture:

Issues per week, by category.
% closed by agent action vs. human action.
% requiring >1 review iteration.
Operator weekly hours in tracker vs. POS vs. other surfaces.
Estimated hours/week saved vs. doing the same work directly.
Agent spend per operator hour saved.

Target framing once data exists: “Operator spends ~N hours/week reviewing ~M agent-handled issues vs. ~K hours/week doing the work directly, at $X/mo agent cost.”
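The metrics above reduce to simple arithmetic once each closed issue carries a few fields. A hedged sketch follows: the record shape and field names are assumptions, no such export exists yet, and "leverage ratio" is taken here to mean direct hours divided by review hours, which this note does not define precisely.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class ClosedIssue:
    # Hypothetical record; these fields would be logged during the 30-day trial.
    category: str
    closed_by_agent: bool
    review_iterations: int
    review_hours: float           # operator time spent reviewing
    direct_hours_estimate: float  # estimated time to do the work by hand


def leverage_summary(
    issues: List[ClosedIssue], weeks: float, agent_spend: float
) -> Dict[str, Optional[float]]:
    """Summarize a measurement window. agent_spend is total spend in that window."""
    if not issues or weeks <= 0:
        raise ValueError("need at least one issue and a positive window")
    n = len(issues)
    review = sum(i.review_hours for i in issues)
    direct = sum(i.direct_hours_estimate for i in issues)
    saved = direct - review
    return {
        "issues_per_week": n / weeks,
        "pct_closed_by_agent": 100 * sum(i.closed_by_agent for i in issues) / n,
        "pct_multi_review": 100 * sum(i.review_iterations > 1 for i in issues) / n,
        "review_hours_per_week": review / weeks,
        "direct_hours_per_week": direct / weeks,
        "hours_saved_per_week": saved / weeks,
        # Undefined when nothing was saved or no review time was logged.
        "spend_per_hour_saved": agent_spend / saved if saved > 0 else None,
        "leverage_ratio": direct / review if review > 0 else None,
    }
```

The two None cases matter for the pitch: a window where review time exceeds the direct estimate should show up as "no leverage yet" rather than as a negative cost per hour.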

Pricing implication (platform-pricing): the leverage ratio is what justifies a price above the floor. Without numbers, $179/mo is a guess.

Conventions playbook (to write)

Even at current scope (Guild engineering only), the conventions aren’t documented. Worth capturing before Phil onboards:

Issue templates Panat uses today.
What context the agent expects in the issue body (links, attachments, schemas, vault references).
Review patterns: when to comment, when to close, when to kick back.
Failure modes: where humans intervene most, what doesn’t work.
How GitHub Issues and the vault divide labor (durable knowledge in vault, transient work-units in Issues).

Working artifact, not a polished doc. Seed of Option B in symphony-pattern-agent-control-plane.

Why this matters more than “AI for small teams”

The standard framing is “AI tools help lean teams ship more.” The sharper framing:

Lean teams now have access to an operating model that previously required a mid-size company.

A two-person bookstore can run ops sophistication that used to require a five-person ops team plus a PM, because the agent absorbs the procedural middle. The tracker-as-control-plane is what makes that legible and reviewable instead of chaotic.

This is a structural change in what scale of operation a small operator can credibly run. Not a productivity bump.

Generalization beyond software

Domains where the pattern already works or is close:

Indie bookstores (guild, emporium thesis).
Galleries, small museums. Inventory, donor records, exhibition planning, programming.
Restaurants. Inventory, menu engineering, staffing, reservations, supplier coordination.
Repair shops. Ticketing, parts inventory, customer history, diagnostics.
Small farms / CSAs. Crop planning, member management, distribution logistics.
Indie publishers. Acquisitions, production schedule, royalties, author comms.
Independent professionals (lawyers, accountants, therapists in small practice). Client state, deadlines, intake.

Common shape: small operator team, enough digital surface area to have a context graph, work that’s been historically too procedural for the team size.

What’s actually scarce

Not the technology. Three real constraints:

Operator willingness to direct rather than execute. Cultural shift. Many small operators built their identity around doing the work. “I review the agent’s plan” feels like loss of control until they live with it.
A context graph rich enough for the agent to act on. Most legacy small businesses have data scattered across paper, vendors’ admin panels, spreadsheets, owner’s head. The graph has to be assembled before agents can use it. This is real work.
A technical partner who knows how to wire it up. Carrie has Panat. Phil is his own technical partner. Most operators have neither. The platform play is being that partner-as-a-service.

The constraint isn’t technology. It’s organizational and cultural. Which is why most legacy small operators won’t adopt this even when tools are free.

Implications for Guild + Emporium positioning

Stronger fractional-ops thesis

Not selling: “we built you software.” Selling: “we run the ops layer of your store using the same pattern we use to run ours.”

The platform is the context graph + agent capability + operator UX. The service is the partnership that makes a non-technical operator able to direct it. See platform-pricing, expansion-strategy, outsiderpg-platform-vision.

Operator persona clarity

Engineer-operators (project_phil-pilot-candidate) recognize the shape immediately. Sell them the platform.
Non-engineer operators (carrie) need an embedded technical partner. Sell them the platform + ops partnership.
Carrie + Panat is the proof point. Phil + himself is the leveraged version. Future shops without a Phil need a Panat-as-a-service.

Vertical platform = Panat-as-a-service for indie bookstores

Reframing of the platform thesis: Guild + Emporium is the productized version of “what Panat does for Carrie.” The product is the context graph (Payload schema, Square integration, Discord, member data, financial state) plus the agent capability plus the operator surface. The service tier on top is the partnership for shops without a technical co-owner.

This explains why pure-software competitors won’t catch up easily — they’re building tools, not the partnership. And why generic AI startups won’t catch this market — they’re not embedded in the operator’s actual day.

Marty/Shopkeeper as the user-facing surface

Earlier framing treated marty as a Dungeon-specific application. Wrong. The right framing:

Marty (the wizard book bot) is Dungeon-specific. Stays Dungeon-only.
Shopkeeper is the platform persona-template. Every partner shop gets its own Shopkeeper persona configured for their store: voice, allowed tools, allowed channels, escalation rules, policies. Phil at Victory Point gets shopkeeper-victorypoint.
The agent infrastructure (Claude Agent SDK loop + tool dispatch + Langfuse + escalation primitive + persona-as-YAML) is the platform layer. Shopkeeper and Marty are both consumers of it.

Persona-as-YAML is the per-shop tenancy primitive on the agent layer — same role the Payload globals + multi-tenant plugin play on the data layer. Architecturally consistent with how the platform handles tenancy elsewhere.
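To make the tenancy idea concrete, here is a sketch of what a loaded persona might look like on the Python side. The dict stands in for a parsed YAML file. The field names mirror the list above (voice, allowed tools, allowed channels, escalation rules, policies), but the actual schema is not specified in this note, so every key and value below is hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass(frozen=True)
class Persona:
    # Hypothetical schema; one instance per partner shop.
    name: str
    voice: str
    allowed_tools: List[str]
    allowed_channels: List[str]
    escalation_target: str
    policies: List[str] = field(default_factory=list)

    def can_use(self, tool: str, channel: str) -> bool:
        """A tool call is permitted only if both the tool and the channel are listed."""
        return tool in self.allowed_tools and channel in self.allowed_channels


def load_persona(raw: Dict[str, Any]) -> Persona:
    """Build a Persona from parsed config. Missing or unknown keys raise TypeError."""
    return Persona(**raw)


# Stand-in for the contents of a per-shop YAML file.
VICTORY_POINT = {
    "name": "shopkeeper-victorypoint",
    "voice": "friendly, concise",
    "allowed_tools": ["search_catalog", "open_issue", "notify_human"],
    "allowed_channels": ["discord"],
    "escalation_target": "phil",
}

persona = load_persona(VICTORY_POINT)
```

With this shape, persona.can_use("open_issue", "discord") is true and any tool not listed for the shop is refused, so the tenancy boundary lives in config rather than in per-shop code.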

Partner-shop onboarding will include: “configure your shop’s Shopkeeper persona.” This is the platform’s first AI-native feature shipping to partner shops, and probably the most differentiated. See marty-roadmap for the build sequence.

Long-term: this is true for every vertical

The same shape applies to galleries, restaurants, etc. Different verticals, different context graphs, different operator partnerships. Each one is a platform business with a services moat. Most will be built by people embedded in the vertical, not by horizontal AI companies.

This is why user_panat-context’s long-term vision (vertical integration for the indie bookstore industry) is structurally correct. Pick a vertical, own the context graph, partner with operators, layer agents on top. The vertical embedding is the moat.

What this changes about the pitch

Old pitch: “Loyalty platform for indie bookstores.” Better pitch: “We run the ops layer of indie bookstores. Loyalty is the first surface.”

The loyalty platform is the wedge. The context graph is the product. The partnership is the moat.

Worth threading through manifesto-draft, director-curator-onepager, founding-statement, and the mana-application framing.

Criterion for committing to platform play

Previous criterion was “vibes” or revenue thresholds. Better, measurable criterion:

When partner shop A demonstrates the same leverage ratio Dungeon Books achieves, the pattern has generalized and we commit.

Phil at Victory Point is the first test. If we can stand up the GitHub Issues + agent loop at Victory Point and Phil reaches a comparable hours-saved-per-issue-reviewed ratio within 90 days, the pattern is portable. If he can’t, one of three things is true: the conventions need work, the operator partnership requires more hands-on time than a productized service can sustain, or the pattern is genuinely Dungeon-specific.

This is more aligned with what we’re actually building than revenue or shop-count thresholds alone.

Open questions

What’s the minimum-viable operator partnership offering? Hours/month? Embedded Slack? On-call?
How much of the partnership can itself be agent-mediated over time? (The recursion: agents helping operators direct other agents.)
Where does this leave Brian’s potential cofounder role (project_brian-buyout)? Retail + game producer background maps onto the operator-partnership side, not the platform side. Worth thinking through.
How does this story land with funders (mana-application, grant inquiries)? “Vertical AI ops platform with embedded partnership model” is a different pitch than “loyalty SaaS.”

Cross-references

symphony-pattern-agent-control-plane — the engineering-side architectural plan.
coordination-economics-reading — theoretical framing for coordination as the scarce input.
building-a-genius-swarm — adjacent thinking on multi-agent coordination.
outsiderpg-platform-vision — vertical platform thesis.
sales-lessons-from-withfriends — what doesn’t work for indie operators.
