2026-05-03 — Discord help channel setup
Summary
Set up the public help forum channel on the Dungeon Books Discord. Channel serves both bookstore customers and dungeon.club members, all posts public. Designed around: Marty deflects FAQs, @ Staff (Panat, Carrie, Lan) handles the rest, low friction to ask, feels like the shop not a ticket queue.
Decisions
Tags (8, customer-facing)
- hours
- orders
- returns
- events
- members: the membership program (signup, perks at the shop, tier, billing)
- dungeon.club: the website/game/loyalty surface (login, points, site bugs)
- lost and found
- requests: “do you have this book,” “can you order this,” book recs, “please carry more X”
No moderator-only tags yet. Will add staff followup if/when threads start slipping through. Skipped mtg — add later if the volume shows up.
Forum settings
- Default reaction: 👍 (changed from initial ✅ recommendation, Panat’s call)
- Post slowmode: off
- Message slowmode: off
- Tag matching: Match ANY
- Require tag on post creation: on
- Auto-archive after inactivity: 1 week
Other server settings to confirm
- Channel default notifications set to “Only @mentions”
- AutoMod spam + link filters on
- @ Staff role mentionable by everyone
Post guidelines (drafted)
Friendly tone, no em dashes, no hyphens as dashes. Covers Marty deflection, privacy warning (channel is public), what to include, response time expectation.
Key elements:
- Ask @marty first. He covers hours, location, events, orders, returns, ticket policy, book recs, MTG cards from the docs.
- Channel is public. Don’t share order numbers, full names, addresses, email, payment info. If needed, ping @ Staff to move to DM.
- Title = the actual question, not the topic.
- Same-day weekday replies, slower on weekends. POS down at the shop = DM @ Staff directly.
Pinned posts to write
- What @marty can do (capabilities list: book lookups via Hardcover with bookshop.org affiliate links, MTG via Scryfall, store/policy questions from the docs site, weekly RPG news digest)
- What is dungeon.club (perks, signup link, “ping @ Staff if your tier or points look wrong”)
- Privacy + safety (sticky version of the warning so it doesn’t depend on people reading the new post prompt)
- Meet the staff (Panat, Carrie, Lan) — makes the channel feel like a shop not a service desk
Why this matters
Threads the same loop as the Carrie interview: route inbound to the right surface, let Marty handle the bot-tractable cases, free Carrie/Lan/Panat for the human stuff. The forum format also makes prior answers searchable, which is what email queues fail at. Every solved thread reduces the next person’s question.
Also: this is the first public surface where dungeon.club and the bookstore live under one roof for customers. Worth watching the tag distribution to see how the audiences actually split.
Action items
- Write and pin the four pinned posts above
- Decide privacy cleanup ownership (who deletes accidentally posted PII, how fast). Three of us, needs to not sit for a day while everyone assumes someone else has it.
- Confirm @ Staff role is mentionable by @everyone
- Make sure Marty has the docs in context for the policies he’s expected to quote (already partially done)
- Watch first week of tag usage. If anything keeps showing up untagged, add a tag. If a tag never gets used, drop it.
Cross-references
- 2026-05-02 — Carrie interview, Marty deflection plan, two-persona architecture
- marty / marty
- dungeon.club
- event-ticket-policy
- return-policy
2026-05-03 evening — Marty docs system, get_doc tool, voice tightening, Kimi attempt + revert
Long session. Built the docs-as-source-of-truth system Marty fetches at runtime, shipped the tool that fetches it, tightened his voice, set up CI properly, then spent 3+ hours evaluating Kimi K2.5 on Together as a faster/cheaper Sonnet replacement and abandoned it because tool-call reliability isn’t there.
What shipped
dungeonbooks/docs repo, public, content-only
Created the public docs vault as a sibling to the private notes/ vault. Trust boundary is the repo boundary, not a config check — same content used by the future Quartz site, by Marty at runtime, and by humans browsing GitHub.
Initial files:
- index.md: root catalog, customer prose + agent_index HTML comment block (Marty inlines this into his system prompt at boot)
- store.md: hours, location (was policies/store-hours.md, moved out; it’s a fact, not a policy)
- events.md, orders.md: drafts, publish: false until Carrie input
- policies/event-ticket-policy.md, policies/return-policy.md
- dungeon-club.md: added late in session
- policies/membership-tiers.md: added late in session
get_doc tool in marty (PR #25, merged)
Fetches markdown from raw.githubusercontent.com/dungeonbooks/docs/main/{slug}.md. Parses YAML frontmatter, enforces publish: true gate, extracts HTML-comment agent_guidance directives. In-memory TTL cache (15 min). No GitHub API auth.
Plus: inlined the docs index into Marty’s system prompt per request, added generic dispatch logging, tightened voice rules.
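A minimal sketch of the fetch URL, frontmatter split, and publish gate described above. This is not the code from PR #25: the function names and the DocNotPublished exception are hypothetical, and the frontmatter handling is a simplification that only understands flat key: value pairs where the real tool parses YAML.

```python
from typing import Dict, Tuple

RAW_BASE = "https://raw.githubusercontent.com/dungeonbooks/docs/main"


class DocNotPublished(Exception):
    """Hypothetical error for docs without an explicit publish: true."""


def doc_url(slug: str) -> str:
    """Build the raw GitHub URL for a slug such as 'policies/return-policy'."""
    return f"{RAW_BASE}/{slug}.md"


def split_frontmatter(raw: str) -> Tuple[Dict[str, str], str]:
    """Split markdown into (frontmatter, body).

    Simplification: flat key: value pairs only. A real implementation
    would hand the block between the fences to a YAML parser.
    """
    lines = raw.replace("\r\n", "\n").split("\n")  # treat CRLF like LF
    if lines[0].strip() != "---":
        return {}, "\n".join(lines)  # no frontmatter at all
    for i in range(1, len(lines)):
        if lines[i].strip() == "---":
            meta: Dict[str, str] = {}
            for line in lines[1:i]:
                if ":" in line:
                    key, _, value = line.partition(":")
                    meta[key.strip()] = value.strip()
            return meta, "\n".join(lines[i + 1:])
    # Unterminated fence: fail loudly instead of guessing.
    raise ValueError("frontmatter fence is never closed")


def load_doc(raw: str) -> str:
    """Return the body only when the doc is explicitly published."""
    meta, body = split_frontmatter(raw)
    if meta.get("publish", "").lower() != "true":
        raise DocNotPublished("doc is not marked publish: true")
    return body
```

A doc with no frontmatter at all is treated as unpublished, which matches the default-unpublished stance: forgetting the flag hides the page, it never leaks one.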
CI + slim pre-commit (PR #26, merged)
Pre-commit reduced to file hygiene + ruff. CI workflow added with parallel jobs (lint, typecheck, tests, bandit), skip-on-draft, env placeholders for Hardcover/Anthropic. Pattern lifted from the guild repo’s CI. Pre-commit ruff version had to be bumped to match dev deps version — the skew caused “passes pre-commit, fails CI” early on.
Railway builder switched (PR #28, merged)
Production deploy was failing because railway.json still pointed at nixpacks while the project had moved to railpack. One-line fix.
Cache stampede + stale-on-failure + tests (PR #27, open, Copilot review addressed)
Per-slug asyncio.Lock collapsed to a single global refill lock (Copilot flagged unbounded growth from 404 spam — fair). Stale-on-failure fallback narrowed to transport errors only (aiohttp.ClientError, asyncio.TimeoutError, ConnectionError) — parse errors propagate so a bad docs commit surfaces loudly. touch() on stale-serve grants a 60s grace window so concurrent waiters don’t all retry against the broken upstream.
19 unit tests covering frontmatter parsing edge cases (LF/CRLF/empty), HTML extraction, cache policy, publish gate, error paths, stampede collapse, parse-error propagation, stale-grace window. 212/212 passing.
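A rough sketch of the cache policy described above, not the code in PR #27. DocCache, the injected fetch callable, and the injected clock are hypothetical names. To stay stdlib-only, the transport error tuple leaves out aiohttp.ClientError, which the real list includes.

```python
import asyncio
import time
from typing import Awaitable, Callable, Dict, Optional, Tuple

TTL_SECONDS = 15 * 60      # 15 min TTL, per the notes
STALE_GRACE_SECONDS = 60   # grace window granted when serving stale

# Assumption: stdlib stand-ins only; the real tuple also has aiohttp.ClientError.
TRANSPORT_ERRORS = (asyncio.TimeoutError, ConnectionError)


class DocCache:
    def __init__(
        self,
        fetch: Callable[[str], Awaitable[str]],
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._clock = clock
        # slug -> (body, fresh_until)
        self._entries: Dict[str, Tuple[str, float]] = {}
        # One global refill lock, so 404 spam cannot grow a per-slug lock map.
        self._refill_lock = asyncio.Lock()

    def _fresh(self, slug: str) -> Optional[str]:
        entry = self._entries.get(slug)
        if entry is not None and entry[1] > self._clock():
            return entry[0]
        return None

    async def get(self, slug: str) -> str:
        body = self._fresh(slug)
        if body is not None:
            return body
        async with self._refill_lock:
            # Re-check: an earlier waiter may have refilled while we queued.
            body = self._fresh(slug)
            if body is not None:
                return body
            try:
                body = await self._fetch(slug)
            except TRANSPORT_ERRORS:
                stale = self._entries.get(slug)
                if stale is None:
                    raise  # nothing to fall back on
                # touch(): extend the stale entry so queued waiters are served
                # from cache instead of retrying the broken upstream.
                grace_until = self._clock() + STALE_GRACE_SECONDS
                self._entries[slug] = (stale[0], grace_until)
                return stale[0]
            # Any other exception (e.g. a parse error) propagates untouched.
            self._entries[slug] = (body, self._clock() + TTL_SECONDS)
            return body
```

The tradeoff of the single lock is that refills for different slugs serialize behind each other. For a handful of small markdown files that seems acceptable.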
Voice rules tightened in marty_system_prompt.md
Cherry-picked from the humanizer skill. New rule 9 bans negative parallelisms (“not X, but Y”), rule-of-three lists, AI vocab (delve, realm, embark, transformative, etc.), superlatives (best/perfect/amazing), filler phrases (“it’s worth noting”). Rule 8 tightened to forbid hyphen-as-em-dash. Greeting variations rewritten to be neutral instead of book-rec-only, since Marty now handles store/event/policy questions too.
dungeon-club.md + policies/membership-tiers.md
Wrote both topic and policy file. Verified mechanics against the guild repo source rather than my own (stale) recollection. Two corrections to my mental model:
- Check-ins grant a flat 100 XP + 100 points per visit. No tier multiplier.
- The XP tier multiplier (1x/1.5x/2x/2.5x/3x) applies to purchases only. Confirmed in guild/src/lib/loyalty.ts:48-50 docstring: “the tier multiplier does NOT apply to non-purchase XP.”
- Discounts apply automatically at Square checkout via customer groups + pricing rules. No coupon code, no manual staff intervention.
- Birthday credit and ARC pickup are currently staff-fulfilled (no auto-issuance code yet). Doc reflects that.
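The XP rules above, restated as a tiny sketch in Python rather than the guild repo’s TypeScript. It is an illustration, not a port of loyalty.ts. The 0-4 tier index, the base_xp argument (how a purchase amount turns into base XP is not covered in these notes), and the truncating rounding are all assumptions.

```python
from typing import Tuple

CHECK_IN_XP = 100       # flat per visit, per the notes
CHECK_IN_POINTS = 100   # flat per visit, per the notes

# Multipliers from the notes; the 0-4 tier index is a hypothetical stand-in
# for however the guild repo actually identifies tiers.
TIER_MULTIPLIERS = (1.0, 1.5, 2.0, 2.5, 3.0)


def _check_tier(tier: int) -> None:
    if not 0 <= tier < len(TIER_MULTIPLIERS):
        raise ValueError(f"unknown tier index: {tier}")


def check_in_reward(tier: int) -> Tuple[int, int]:
    """Return (xp, points) for a check-in. Tier is validated, then ignored."""
    _check_tier(tier)
    return CHECK_IN_XP, CHECK_IN_POINTS


def purchase_xp(base_xp: int, tier: int) -> int:
    """Apply the tier multiplier to purchase XP only.

    Assumption: fractional XP is truncated; the real rounding rule is
    whatever loyalty.ts does.
    """
    _check_tier(tier)
    return int(base_xp * TIER_MULTIPLIERS[tier])
```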
What I tried and abandoned: Kimi K2.5 on Together (PR #29, closed)
Spent ~3 hours evaluating Kimi K2.5 via Together AI as a Sonnet replacement. Built the full migration: rewrote ai_client.py against the OpenAI SDK pointing at Together, swapped the tool registry to OpenAI envelope, updated query_optimizer.py, env vars, tests, docs. 191/191 unit tests passed. Boot smoke clean. Cost projection ~3-5x cheaper, latency ~0.56s TTFT vs Sonnet’s 2-4s.
Killed it after the first Discord smoke run.
The failure mode: Kimi reaches for tools less aggressively than Sonnet. On the standard 9-prompt suite it skipped get_doc for the return policy, the events question, the membership question, and the refund-paragraph question — instead answering from training/system-prompt context with fabricated policies. It made up “card refunds take a few days” (not in our policy), invented an entire event catalog (D&D, Shadowrun, MTG prereleases — events.md is publish: false), confidently said “we don’t do memberships” (we do), and added “no returns on RPG books or items bought during events / sales” (also fabricated).
Customers getting fabricated refund rules is worse than paying 5x more for Sonnet. Not a “tighten the prompt and hope” problem when it’s already costing policy correctness on a smoke test. Closed the PR, killed the branch.
This matches Together’s own model recommendation table — they list GLM-5 ahead of Kimi K2.5 for function calling. We saw it bear out in real conditions.
Things learned about Together / Kimi (kept as future reference)
- Kimi K2.5 is a hybrid reasoning model. Reasoning must be disabled per request via extra_body={"reasoning": {"enabled": False}} or the model fills the token budget with a thinking trace and message.content comes back empty. Together puts the reasoning trace in message.reasoning, not message.content.
- Together’s automatic prompt caching is always on (the disable_prompt_cache flag is being deprecated Feb 2026). Cached input ~80% off uncached. Marty’s system prompt order is already stable→variable so the prefix hits cache for free.
- K2.6 is slower than K2.5 on Fireworks (78 tok/s vs 174 for K2 0905, 399 for K2.5 reasoning). The Pi sample I’d seen was on K2.5; assuming K2.6 inherits the speed was wrong.
- Fireworks had documented chat-workload reliability issues (23% HTTP 429 even with adaptive throttling) in older benchmarks. Together has the better latency profile for chat shape (0.56s TTFT vs 5.45s on Fireworks).
- Use OpenAI SDK pointed at provider URLs, not provider SDKs. One dep, one API surface, one-line failover by changing base_url + env var. Together SDK is fine for Together-only shops.
- One-line revert path between models when migration is already in OpenAI shape: LLM_MODEL constant in two files. Cheap insurance.
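For future reference, a sketch of the “OpenAI SDK pointed at a provider URL” shape with reasoning disabled per request. This is not the code from the closed PR #29. Provider, PROVIDERS, chat_kwargs, and make_client are hypothetical names, the TOGETHER_API_KEY env var name is an assumption, and the model ID is a placeholder to check against the provider catalog.

```python
import os
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass(frozen=True)
class Provider:
    base_url: str
    api_key_env: str
    model: str


PROVIDERS = {
    "together": Provider(
        base_url="https://api.together.xyz/v1",
        api_key_env="TOGETHER_API_KEY",   # assumed env var name
        model="moonshotai/Kimi-K2.5",     # placeholder ID, verify before use
    ),
}


def chat_kwargs(
    provider: Provider,
    messages: List[Dict[str, str]],
    disable_reasoning: bool = True,
) -> Dict[str, Any]:
    """Build keyword arguments for client.chat.completions.create()."""
    kwargs: Dict[str, Any] = {"model": provider.model, "messages": messages}
    if disable_reasoning:
        # Per the notes: without this, the thinking trace can eat the token
        # budget and message.content comes back empty.
        kwargs["extra_body"] = {"reasoning": {"enabled": False}}
    return kwargs


def make_client(provider: Provider):
    """Create an OpenAI SDK client aimed at the provider's endpoint."""
    from openai import OpenAI  # lazy import so the helpers above need no SDK

    return OpenAI(
        base_url=provider.base_url,
        api_key=os.environ[provider.api_key_env],
    )
```

Usage would look like make_client(p).chat.completions.create(**chat_kwargs(p, messages)) with p = PROVIDERS["together"]. Failing over to another provider means adding one more entry to PROVIDERS and changing which key gets looked up.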
Decisions
Topic vs policy file ontology
Two file shapes in dungeonbooks/docs:
- Topic files (root): customer-shaped, descriptive/procedural, one per area of interest. store.md, events.md, orders.md, dungeon-club.md. Self-contained for the common case.
- Policy files (policies/): atomic rules with conditions and exceptions. event-ticket-policy.md, return-policy.md, membership-tiers.md. Referenced from topic files; never duplicated.
Distinction test: descriptive/procedural answer → topic. Conditional yes/no → policy.
publish: true is the boolean for both Quartz render and Marty visibility
Mirrors Quartz’s ExplicitPublish plugin. Same flag controls both consumers. Default unpublished — forgetting the flag means a finished page doesn’t render (annoying, recoverable). The opposite default would mean leaking unfinished content (not recoverable).
Agent guidance in HTML comments
Customer-facing prose is visible markdown. Agent-only directives (when to escalate, what not to promise, never-name-individuals) live in <!-- ... --> blocks. Quartz strips comments from rendered HTML; Marty parses them from raw markdown. Same file, two audiences, no leak.
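A small sketch of the two-audience split, assuming agent directives are ordinary HTML comments. split_audiences is a hypothetical name and the sample comment format is made up. The actual extraction in get_doc may key on a specific marker such as agent_guidance.

```python
import re
from typing import List, Tuple

# Non-greedy, and DOTALL so a directive can span several lines.
_COMMENT = re.compile(r"<!--(.*?)-->", re.DOTALL)


def split_audiences(markdown: str) -> Tuple[str, List[str]]:
    """Return (customer_text, agent_directives) from one raw markdown file.

    customer_text is what a renderer that strips comments would show;
    agent_directives are the comment bodies, whitespace-trimmed.
    """
    directives = [body.strip() for body in _COMMENT.findall(markdown)]
    customer_text = _COMMENT.sub("", markdown)
    return customer_text, directives
```

One caveat with a regex this simple: it also matches comment-looking text inside fenced code blocks, so docs that show HTML examples would need a real markdown parser.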
Stay on Sonnet 4.6 for now
Cost-and-speed thesis for Kimi was right. Tool-call reliability gap is the dealbreaker. Sonnet’s tool-calling consistency is a real moat at the support tier. Worth re-evaluating if Kimi K2.7+ closes the function-calling gap or if we want to A/B GLM-5 specifically (Together’s recommended function-calling model).
Anthropic prompt caching as the next cheap-cost move
Was always the safer answer to “make Sonnet cheaper” — 4-5x reduction on input tokens with no model swap, no migration risk, no reliability tradeoff. Deferred to a follow-up; needs messages.create restructure to use content blocks with cache_control. ~1 hour.
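What the restructure might look like, as a sketch. The system prompt moves from a plain string to a list of content blocks, and the stable block carries cache_control. cached_request_kwargs is a hypothetical helper, and the model ID is passed in rather than hard-coded because the exact identifier is not in these notes.

```python
from typing import Any, Dict, List


def cached_request_kwargs(
    model: str,
    system_prompt: str,
    messages: List[Dict[str, Any]],
    max_tokens: int = 1024,
) -> Dict[str, Any]:
    """Build keyword arguments for anthropic's client.messages.create().

    The system prompt is sent as one text block marked cacheable, so the
    stable prefix can be reused across requests.
    """
    return {
        "model": model,
        "max_tokens": max_tokens,
        "system": [
            {
                "type": "text",
                "text": system_prompt,
                "cache_control": {"type": "ephemeral"},
            }
        ],
        "messages": messages,
    }
```

Usage would be client.messages.create(**cached_request_kwargs(model, prompt, messages)). Anything that varies per request should stay out of the cached block, since a change in the prefix misses the cache.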
Things learned about the migration shape itself
- Stack PRs via gt for review hygiene. This session opened #25, #26 (stacked on #25), #27 (off main), #28 (hotfix). #25, #26, and #28 each merged independently (#27 is still open pending re-review), Copilot got a clean review surface, and the rebase-when-parent-merges flow was painless.
- Default gt submit opens drafts, so CI doesn’t fire until gh pr ready. Good for iterating without burning runners.
- Pre-commit hooks should be cheap. Slow checks (pytest, bandit, ty) belong in CI, not local. Sub-second commits change how often you commit.
- Verify ground truth against source code, not memory or notes. I was about to write dungeon-club.md saying check-ins use the tier multiplier. Reading guild/src/lib/loyalty.ts corrected that. The notes/plans/marty-roadmap.md was vague enough that going to source was the only way to be accurate.
Action items
- Merge #27 once Copilot re-reviews the cache hardening commit
- Wire Anthropic prompt caching on Marty’s system prompt (~1hr, big cost win at no reliability cost)
- Flesh out events.md and orders.md drafts; flip publish: true. Both have todo: HTML comments listing the questions for Carrie
- Confirm with Carrie:
  - Early-access RSVP mechanism for special events (window, gate)
  - ARC pickup deadline / what happens to unclaimed
  - Birthday credit handling at month end
  - Tier upgrade/downgrade mid-cycle behavior
  - What “points” do beyond ranking the leaderboard (vs XP)
- open_issue / escalation tool against dungeonbooks/ops repo (next big roadmap item)
- Self-host Langfuse + OTel before any tool that touches live store data (Square reads, Hi.events reads)
- Tune voice further if Carrie spots Marty sounding canned. Attribute triads (rule-of-three lists) in book descriptions still happen occasionally
Cross-references
- 2026-05-02 — Carrie interview that seeded all of this
- marty-roadmap — plan file updated mid-session to reflect topic/policy split and the new dungeon-club content
- guild — verified XP / discount / check-in mechanics against this repo’s source
- dungeonbooks/docs GitHub: the new public docs vault
- dungeonbooks/marty PRs #25, #26, #27, #28, #29