2026-04-05

payload-auth abandoned, Better Auth standalone integrated

The problem

Last session attempted payload-auth (Better Auth plugin for Payload) on feat/payload-auth. Complete failure — the admin panel couldn’t accept keyboard input in any collection. The form-state server action re-rendered the entire RSC tree on every keystroke, resetting all inputs. Confirmed on a fresh DB with a fresh user. The plugin was built against Payload 3.67.0; we’re on 3.80.0.

The solution: dual auth architecture

Instead of a plugin that takes over Payload’s internals, run Better Auth as a standalone service alongside Payload. Clear boundary:

  • Better Auth owns member authentication — passwords, sessions, email verification, OAuth, password reset
  • Payload native auth stays for admin/staff — Users collection, admin panel, staff API endpoints
  • Members collection becomes data-only — no auth: true, no password hashing, linked to BA via betterAuthUserId

Implementation

New files:

  • src/lib/auth-server.ts — Better Auth instance (Kysely/pg adapter, Resend emails, Discord OAuth config)
  • src/lib/auth-client.ts — createAuthClient() for frontend components
  • src/app/api/auth/[...all]/route.ts — BA API catch-all handler

Key changes:

  • Removed auth config from Members collection entirely (no password, no hash, no salt)
  • Added explicit email field (was previously provided by auth) + betterAuthUserId field
  • getMember() reads BA session → looks up Member by betterAuthUserId (not email)
  • All frontend forms (login, signup, verify-email, forgot/reset password) wired to authClient
  • Logout uses authClient.signOut() instead of Payload’s /api/members/logout
  • Billing portal uses BA session instead of payload.auth()
  • Email verification banner re-enabled on dashboard (reads emailVerified from BA session)
  • Login uses window.location.href instead of router.push to avoid session cookie race
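
A minimal sketch of the session-to-member lookup in getMember(), as described in the list above. The session reader and the member lookup are injected as plain functions so the flow is visible without the real libraries; in the app they would presumably wrap the Better Auth session call and a Payload query on betterAuthUserId. Apart from getMember and betterAuthUserId, every name and type here is hypothetical.

```typescript
// Hypothetical shapes: only the fields this sketch needs.
interface AuthSession {
  user: { id: string; email: string; emailVerified: boolean };
}

interface Member {
  id: string;
  email: string;
  betterAuthUserId: string;
}

// Injected dependencies (assumptions): in the real app these would wrap the
// Better Auth session lookup and a Payload query on the Members collection.
type ReadSession = () => Promise<AuthSession | null>;
type FindMemberByAuthId = (betterAuthUserId: string) => Promise<Member | null>;

function createGetMember(readSession: ReadSession, findMember: FindMemberByAuthId) {
  return async function getMember(): Promise<Member | null> {
    const session = await readSession();
    if (!session) return null; // not signed in

    // Join on the stable BA user ID, not on email, so an email change
    // in either system cannot break the link.
    return findMember(session.user.id);
  };
}
```

Because the lookup key is the BA user ID, a member whose session email differs from the email stored on the Member doc still resolves to the same record.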

Architectural issues caught in code review

First implementation had 5 problems, all fixed:

  1. Dual password storage — both BA and Payload hashing the same password independently. Password resets would drift. Fix: removed auth from Members entirely, BA is single source of truth for credentials.

  2. Dead Payload forgotPassword config — Members still had Payload’s forgot-password email template, but all forms used BA. Would fail silently if triggered via admin API. Fix: removed all auth config from Members.

  3. Email-only bridge — getMember() joined BA session to Member by email. No FK, breaks if email changes in one system. Fix: added betterAuthUserId field, lookup by ID.

  4. OAuth bypasses member creation — Discord login would create a BA user but no Payload Member doc. Fix: OAuth gated behind signup flow (must pick tier first).

  5. Two ORMs, one database — Payload dev mode uses drizzle-kit push on startup, which tried to delete BA’s tables (user, session, account, verification) since Payload doesn’t know about them. Fix: tablesFilter: ['!user', '!session', '!account', '!verification'] in postgres adapter config.
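
The fix for item 5 might look roughly like this in the Payload config. This is a config fragment, not the project's actual file: the DATABASE_URL variable name and the exported db constant are assumptions, and only the tablesFilter value comes from the notes above.

```typescript
import { postgresAdapter } from '@payloadcms/db-postgres'

// Assumption: the connection string lives in DATABASE_URL.
export const db = postgresAdapter({
  pool: { connectionString: process.env.DATABASE_URL },
  // Handed to drizzle-kit. A leading "!" excludes a table, so the dev-mode
  // schema push no longer treats Better Auth's tables as strays to delete.
  tablesFilter: ['!user', '!session', '!account', '!verification'],
})
```

The resulting adapter would then be passed as the db option of buildConfig().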

Kysely vs Drizzle adapter

Better Auth has two modes:

  • Full mode — pass a pg.Pool, BA uses Kysely internally. CLI migrate works.
  • Drizzle adapter mode — pass a drizzle instance. CLI migrate doesn’t work, need generate + drizzle-kit push.

Started with drizzle adapter (since Payload uses drizzle), but switched to full mode with new Pool() because:

  • BA is standalone, doesn’t share a drizzle instance with Payload
  • CLI migration just works (npx @better-auth/cli migrate --config src/lib/auth-server.ts)
  • Simpler, fewer dependencies (dropped drizzle-orm from direct deps)
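
A sketch of what full mode can look like in src/lib/auth-server.ts, combined with the globalThis pool singleton listed under the Copilot review fixes. The DATABASE_URL variable name and the baPool key are assumptions, and the real file also configures Resend emails and Discord OAuth, which are omitted here.

```typescript
import { betterAuth } from 'better-auth'
import { Pool } from 'pg'

// Dev HMR re-evaluates this module on every change. Parking the Pool on
// globalThis means each reload reuses it instead of opening new connections.
const globalForPool = globalThis as unknown as { baPool?: Pool }

const pool =
  globalForPool.baPool ??
  new Pool({ connectionString: process.env.DATABASE_URL }) // assumed env var name

if (process.env.NODE_ENV !== 'production') {
  globalForPool.baPool = pool
}

export const auth = betterAuth({
  // Passing a pg Pool (rather than a drizzle adapter) selects the built-in
  // Kysely path, which is what lets the CLI migrate command work.
  database: pool,
  emailAndPassword: { enabled: true },
})
```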

Migration lessons (the hard way)

Several columns (nickname, lingo_points, lingo_member, lingo_staff) existed on staging/prod via dev schema push but were never captured in migrations. When we nuked and rebuilt from migrations, they were missing. Added catch-up migrations for all of them.

BA tables also need to be created via Payload migration — the BA CLI doesn’t work in Next.js standalone output (no node_modules/tsx in the runner). Created 20260405_160000_better_auth_tables with CREATE TABLE IF NOT EXISTS so it’s idempotent.
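
For illustration, an idempotent Payload migration of this kind could be shaped like the sketch below. It shows a single table only, and the column list is an assumption based on Better Auth's default verification schema; the actual 20260405_160000_better_auth_tables migration should be treated as the source of truth. Whether the real migration drops the tables in down() is also an assumption.

```typescript
import { type MigrateDownArgs, type MigrateUpArgs, sql } from '@payloadcms/db-postgres'

export async function up({ db }: MigrateUpArgs): Promise<void> {
  // IF NOT EXISTS makes this safe to run against a database where the
  // BA CLI (or an earlier deploy) already created the table.
  await db.execute(sql`
    CREATE TABLE IF NOT EXISTS "verification" (
      "id" text PRIMARY KEY,
      "identifier" text NOT NULL,
      "value" text NOT NULL,
      "expiresAt" timestamp NOT NULL,
      "createdAt" timestamp,
      "updatedAt" timestamp
    );
  `)
}

export async function down({ db }: MigrateDownArgs): Promise<void> {
  // Assumption: rolling back removes the table. IF EXISTS keeps it idempotent.
  await db.execute(sql`DROP TABLE IF EXISTS "verification";`)
}
```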

Migration ordering: Payload migrations include BA table creation now, so everything runs automatically via prodMigrations on first request. No separate CLI step needed.

The dev row: Payload writes a dev row to payload_migrations when using dev schema push. If this exists on staging/prod, it triggers an interactive prompt that blocks container startup. Delete it: DELETE FROM payload_migrations WHERE name = 'dev'.

Staging deployment

Deployed feat/better-auth directly to Railway staging (CI/CD disabled for this branch).

Issues hit and resolved:

  1. BA tables missing — BA CLI migration ran against wrong port (5432 vs 5433). Fixed by creating BA tables via Payload migration instead.
  2. NEXT_PUBLIC_SERVER_URL not baked in — client-side auth calls went to localhost:3000 (CORS errors). Fix: must be set before build, not just restart.
  3. Missing columns — nickname, lingo_* fields not in migrations. Added catch-up migrations.
  4. dev migration row — caused interactive prompt blocking Railway container. Deleted manually.
  5. Login form hanging — router.push + router.refresh raced with session cookie. Fix: window.location.href = '/dashboard'.
  6. Double verification email — BA sends verification email on signup, Stripe webhook sends welcome email. User gets two emails with different CTAs. Functional but confusing — UX improvement for later.
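
The login fix from item 5 can be sketched as follows. The sign-in call and the navigation step are injected so the logic runs outside a browser; the SignInResult shape is a hypothetical stand-in loosely modeled on a { data, error } style auth client, and loginAndRedirect is not a name from the codebase.

```typescript
// Hypothetical result shape for the sign-in call.
interface SignInResult {
  error: { message: string } | null;
}

type SignIn = (creds: { email: string; password: string }) => Promise<SignInResult>;
type Navigate = (url: string) => void;

// Returns an error message to show on the form, or null after redirecting.
async function loginAndRedirect(
  signIn: SignIn,
  navigate: Navigate,
  email: string,
  password: string,
): Promise<string | null> {
  const { error } = await signIn({ email, password });
  if (error) return error.message; // stay on the form

  // A full page load means the browser sends the freshly set session cookie
  // with the request for /dashboard, instead of a client-side transition
  // racing the cookie write.
  navigate('/dashboard');
  return null;
}

// In the browser, the navigate argument would be something like:
//   (url) => { window.location.href = url }
```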

The data migration script (scripts/migrate-members-to-ba.ts) was used to create BA users for existing staging members. Members had to reset their passwords because Payload’s PBKDF2 salt/hash pairs can’t be verified by BA’s scrypt hasher.

Production deployment

Same playbook as staging:

  1. Deployed feat/better-auth to prod Railway (CI/CD disabled)
  2. Backed up all prod data to /tmp/prod_*.csv
  3. Nuked prod DB (DROP SCHEMA public CASCADE)
  4. Migrations ran automatically on first request (Payload + BA tables)
  5. Created admin user via admin panel
  6. Restored via SQL: tiers (with prod Stripe price IDs), tier benefits, classes, shop (with prod Square location ID LTWW04J1BDRD7), kiosk (with prod API key), Square location
  7. Signed up fresh as Gold member
  8. Square webhook auto-linked and backfilled 15 purchases (11,886 pts)
  9. Manually restored: NFC card UID, nickname, Mage class

Prod-specific values (different from staging):

  • Stripe price IDs: price_1TGqa* (prod) vs price_1TEIa* (test)
  • Square location: LTWW04J1BDRD7 (prod) vs LDNFBHTJA8XNH (staging)
  • Kiosk API key: kiosk_7f3a9b2e (prod) vs kiosk_dev_test_key_dungeonbooks (staging)

Copilot review fixes

  • Made migration 20260405_145538 idempotent (IF EXISTS/IF NOT EXISTS)
  • Singleton pg.Pool via globalThis to prevent connection leaks in dev HMR
  • HTML-escape user.name in email templates (XSS prevention)
  • Handle BA signUpEmail error response explicitly in checkout route
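
The HTML-escaping fix can be illustrated with a small helper. This is a generic sketch rather than the project's actual code; escapeHtml and greetingHtml are hypothetical names.

```typescript
// Characters that can change meaning inside HTML text or attribute values.
const HTML_ESCAPES: Record<string, string> = {
  '&': '&amp;',
  '<': '&lt;',
  '>': '&gt;',
  '"': '&quot;',
  "'": '&#39;',
};

function escapeHtml(value: string): string {
  return value.replace(/[&<>"']/g, (ch) => HTML_ESCAPES[ch] ?? ch);
}

// Example template: user-supplied names are escaped before interpolation,
// so a name containing markup renders as text instead of executing.
function greetingHtml(name: string): string {
  return `<p>Welcome, ${escapeHtml(name)}!</p>`;
}
```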

Test results

  • 90 unit tests passing
  • 9 e2e tests passing
  • All auth flows tested on staging and prod

User feedback (Carrie)

Carrie tested the full signup flow on staging. Key issues:

  1. Double email confusion — verification email and Stripe welcome email arrive simultaneously. The welcome email has a “Member Dashboard” link that doesn’t work until email is verified. She clicked the dashboard link first, got stuck on the login screen (no session yet), then found the verification link and got in. The Stripe welcome email should only send after verification, or at minimum after the member has a session.

  2. Login hanging — after clicking the verification link and being auto-signed in, she tried to manually log in on a different device. The form hung because the old build had NEXT_PUBLIC_SERVER_URL baked as localhost:3000 (CORS errors). Fixed by rebuilding after setting the env var. Also fixed router.push/router.refresh race → window.location.href.

  3. UX gap — no clear indication on the login page that you need to verify your email first. The error message exists but isn’t prominent enough. Future: consider a dedicated “check your email” interstitial page after signup instead of going straight to the login form.

Future plan: free tier signup (verify email first, no payment required), then upgrade to paid tier later. This naturally solves the double-email ordering issue.

dungeon.club — brand direction session

Worked through the rebrand/naming today. Key decisions:

  • dungeon.club is the name of the game. The portal between the real world and the guild system. We own the domain.
  • Guilds are factions within dungeon.club. Right now only Dungeon Books Guild exists, but the world implies more (Victory Point Guild, etc.).
  • The platform stays “Guild” — Guild Points, Guild Hall, Join the Guild are all correct and universal across guilds.
  • Dungeon Books is still the bookstore. Tote bags, stickers, etc. still reference Dungeon Books.
  • Considered “Dungeon Club” but there’s a Scholastic D&D middle grade book series with that name. The . differentiates visually but not verbally. Dropped it.
  • Considered dungeon.game ($1.4k premium domain, someone squatting). Staying with dungeon.club.

Aesthetic direction captured in references/dungeon-club-digital-aesthetic.md:

  • .dungeon (John Battle / snow) — file-system naming conventions, .exe/.txt/.dll suffixes, minimal high-contrast layout
  • Guilded Youth (Jim Munroe) — BBS/terminal palette, soft greens, earnest melancholic tone, dual-layer retro-over-contemporary

The landing page (/) will eventually be the dungeon.club portal — cryptic, iykyk, the threshold into the game. No changes made to code today. Brand is a future design direction.

Action items

Done

  • Submit PR for feat/better-auth (#42)
  • Deploy and test on staging
  • Deploy to prod
  • Merge stack (#38 → #42) into main

Immediate

  • Re-enable CI/CD (disabled for this deploy)
  • Add access control — only super-admins create orgs, admins create shops, staff can edit members/tiers but not create orgs/shops

Auth & UX

  • Discord OAuth — Better Auth config is ready, needs Discord app credentials. Decision: link Discord after signup (from dashboard), not during signup. Avoids email mismatch with Square — member signs up with their store email, then connects Discord as an add-on. Better Auth supports account linking for this.
  • Improve double-email UX — verification + Stripe welcome email arrive simultaneously, confusing. Options: delay welcome email until after verification, or combine into one email.
  • Rate limiting on auth endpoints — no protection against brute force on login/forgot-password/verification resend. Better Auth may have built-in rate limiting worth investigating.
  • “Check your email” interstitial — after signup, show a dedicated page instead of redirecting to login. Reduces confusion about what to do next.

Onboarding rethink (blocking)

Problem: Existing WithFriends members already pay for their membership. The current signup flow shows a Stripe checkout, which confuses them — they think they’re paying twice. They just want to see their dashboard and points.

New onboarding flow (design needed):

  1. Sign up with any email or Discord — no payment required
  2. Verify email (or Discord OAuth handles it)
  3. Onboarding wizard walks them through:
    • “What email do you use at Dungeon Books?” (for Square purchase matching)
    • Or NFC card tap / phone number to link identity
  4. Staff approves the Square link (prevents abuse — free member can’t claim someone else’s purchases by entering their email)
  5. Tier assignment happens separately (staff sets it, or member upgrades via Stripe later)

Abuse vector: A free member enters someone else’s Square email and gets credit for their purchases. Mitigation options:

  • Staff approval required for Square linking (manual review)
  • Verification email sent to the Square email address (“confirm this is your store email”)
  • Only link if member can prove identity (NFC card tap, phone number match)
  • Rate limit / flag suspicious linking attempts

Key insight: Decouple auth identity (email/Discord) from store identity (Square email). They may be different people-shaped things. The link between them needs verification, not just a text field.

Future

  • Free tier signup — verify email first, no payment required, upgrade to paid tier later. Naturally solves the double-email ordering (Stripe email only sends when they subscribe). Changes the entire signup flow — needs design.
  • Square email mismatch handling — if a member’s auth email differs from their Square email (e.g. Discord OAuth), auto-linking breaks. Option 1: staff manually links Square customer from POS. Option 2: member enters “store email” during onboarding. Option 3: match by phone number as fallback.

PostHog analytics integration

PRs merged earlier today

Before starting on analytics, cleared a backlog of PRs:

  • #43 — fix: auth client baseURL and RSC Vary headers
  • #44 — feat: free signup with invite codes
  • #47 — feat: repurpose checkout as upgrade endpoint for free members
  • #48 — feat: CI workflow and test restructure
  • #49 — feat: replace mocked rewards and quest log with coming soon state
  • #50 — fix: prevent hydration mismatch in active buff countdown timer
  • #51 — feat: add role-based access control to all Payload collections
  • #52 — feat: add per-member Square purchase backfill script

Setup

Ran the PostHog wizard, which produced instrumentation-client.ts (Next.js 15.3+ client init pattern, no provider needed) and src/lib/posthog-server.ts (posthog-node singleton). Set up an /ingest reverse proxy in next.config.ts to bypass ad-blockers. Setup report at references/guild-posthog-setup.md.
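
For reference, an /ingest reverse proxy in next.config.ts typically looks like the fragment below. The destination hosts assume PostHog's US cloud region, which the notes do not confirm; an EU project would use the eu hosts instead.

```typescript
import type { NextConfig } from 'next'

const nextConfig: NextConfig = {
  async rewrites() {
    return [
      // Static assets (the JS bundle) are served from a separate host.
      {
        source: '/ingest/static/:path*',
        destination: 'https://us-assets.i.posthog.com/static/:path*', // assumed region
      },
      // Everything else under /ingest goes to the ingestion API.
      {
        source: '/ingest/:path*',
        destination: 'https://us.i.posthog.com/:path*', // assumed region
      },
    ]
  },
  // PostHog endpoints use trailing slashes; stop Next.js from redirecting them.
  skipTrailingSlashRedirect: true,
}

export default nextConfig
```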

Key fixes

Prod-only gating — gated init on NEXT_PUBLIC_POSTHOG_PROJECT_TOKEN being present. Token only set on Railway prod. getPostHogClient() returns null when not configured; all call sites use ?.capture().
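
A sketch of the gating pattern. To keep it runnable without posthog-node, the client is reduced to a minimal interface and constructed through an injected factory, and the environment is passed in explicitly; the real getPostHogClient() takes no arguments, so the signature here is an assumption made for illustration.

```typescript
// Minimal stand-in for the part of the server client this sketch uses.
interface AnalyticsClient {
  capture(event: {
    distinctId: string;
    event: string;
    properties?: Record<string, unknown>;
  }): void;
}

// Hypothetical factory; in the app this would construct the posthog-node client.
type ClientFactory = (token: string) => AnalyticsClient;

let cachedClient: AnalyticsClient | null = null;

function getPostHogClient(
  env: Record<string, string | undefined>,
  create: ClientFactory,
): AnalyticsClient | null {
  const token = env.NEXT_PUBLIC_POSTHOG_PROJECT_TOKEN;
  if (!token) return null; // not configured (dev, staging): analytics is a no-op

  if (cachedClient === null) {
    cachedClient = create(token); // build once, reuse afterwards
  }
  return cachedClient;
}

// Call sites use optional chaining, so a null client silently skips the event:
//   getPostHogClient(env, create)?.capture({ distinctId, event: 'check_in' })
```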

Docker build args — NEXT_PUBLIC_ vars are baked at build time by Next.js. Railway doesn’t inject service env vars into Docker build stages unless declared as ARG. Added ARG NEXT_PUBLIC_POSTHOG_PROJECT_TOKEN + ARG NEXT_PUBLIC_POSTHOG_HOST to the builder stage. Railway automatically passes matching service vars as build args when ARG is declared.

Ad-blocker — AdGuard browser extension blocks /ingest/e/ even on own domain. Pi-hole wasn’t the issue (DNS shows “Processed”). Fix: @@||dungeon.club^ in AdGuard, @@||dungeon.club/ingest/* in uBlock.

Events instrumented

Client-side: signup_submitted, login_succeeded, login_failed, email_verified, tier_upgrade_clicked, manage_subscription_clicked

Server-side: checkout_started, subscription_activated, subscription_canceled, payment_failed, purchase_points_earned (with tier, points_earned, earn_multiplier), points_redeemed, check_in (with tier, method, kiosk_id), check_in_duplicate (balance-checking signal), level_up (from check-ins and purchases, with previous_level, new_level, tier, source)

Person identification

posthog.identify(email, { email, name, tier }) on login, signup, and every dashboard load via PostHogIdentify client component. posthog.reset() on logout.

Data warehouse

Connected Stripe live mode to PostHog data warehouse as stripe_prod via restricted API key.

Level-up UX (future)

level_up is worth designing a real moment around — see plans/guild-level-up-ux.md and dungeonbooks/guild#54. Ideas: animate XP bar from last-seen value on dashboard load, manual level-up trigger (bar fills and holds, member taps to spend XP into new level). Kiosk is a good candidate for real-time XP animation post-tap.

Open PR

feat/posthog-integration (#53) — Copilot review in progress, not yet merged.