The single registry of what must stay in sync across the monorepo's components, so there are no gaps between api/supabase · ws/realtime · mcp · dashboard · docs · proxy-server · cdp-connector · iOS. Every contract here is either machine-checked by
scripts/sync-audit.mjs (node scripts/sync-audit.mjs) or flagged as a manual confirm at ship time. Each per-component CLAUDE.md links back here; the component-sync agent runs the audit and closes any gap it finds.
Component map
- supabase is the schema source of truth. The dashboard, proxy-server, iOS, cdp-connector, and mcp all read/write the same tables — a schema change fans out.
- mcp must expose every dashboard read/write (full parity).
- docs must document every shipped feature of any component.
- ws/realtime = the Supabase Realtime channels every client subscribes to.
The 5-surface parity rule (STANDING)
If a capability is possible in the DASHBOARD, it must align across MCP + REST + WS + BusyBro too — the same capability is reachable on every surface a human or agent might use.
For every dashboard read/write capability, all five of these must line up (a missing one is a gap the component-sync agent + the ship checklist must close):
- Dashboard: the UI surface itself (web/dashboard).
- MCP: a JSON-RPC tool in supabase/functions/_shared/mcpRegistry.ts (the shared registry both the MCP server and the BusyBro brain consume). A dashboard capability with no registry tool = gap.
- REST: reachable over PostgREST (table/view/RPC) or an Edge Function, and documented in the dashboard /rest explorer (web/dashboard/app/(settings)/rest/manifest.ts). A capability not queryable/mutable via REST, or queryable but undocumented, = gap.
- WS: any Realtime the dashboard consumes (postgres_changes / broadcast / broadcast-from-db / presence) is documented + subscribable in the /ws explorer (web/dashboard/app/(settings)/ws/channels.ts). A publisher with no subscriber (or vice-versa), or an undocumented channel/event, = gap.
- BusyBro: because the brain sources its tools from the same MCP registry, a new MCP tool flows through automatically, but VERIFY the capability is not wrongly denylisted/curated-out in supabase/functions/_shared/busybroDispatch.ts (REGISTRY_DENYLIST / REGISTRY_CURATED_OVERRIDE, imported by busybro.ts) and that its gating/confirm/ownership checkpoint is correct. Since the full-sync pass the exclusion list is MINIMAL; only three classes are intentionally off the brain:
  - (a) raw power primitives (db_*, realtime_broadcast, list_tables; no dashboard surface);
  - (b) curated-superseded raw reads/writes (list_entries, get_entry, get_settings, set_{global,device}_settings, clear_device_settings, search_entries, export_har, get_device_settings, get_device_status; the brain owns a safer impl, so exposing them would create two impls per name);
  - (c) brain-native personal memory (list_memories, save_memory, forget_memory, get_memory_stats).

  Everything else, including skills/agents/scripting authoring, imports, global-memory governance, and BusyBro settings, is now EXPOSED at its existing RBAC + adminOnly + confirm gates (the confirm prompt surfaces the full payload: script body / instruction bundle / exact memory / settings diff). Those three excluded classes are documented exclusions, not gaps. Contract 3 additionally records later denylist additions outside these three classes: the app-secret mutators (a 4th class) and the payment-initiating create_checkout_session and cancel_subscription (a 5th class).
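
The five-surface walk above can be sketched as a small gap finder. This is a minimal illustration, not the logic of scripts/sync-audit.mjs: the Capability shape, the SURFACES list, findParityGaps, and the example capability names are all hypothetical, and it simplifies by treating every surface as required unless an exclusion is documented.

```typescript
// Hypothetical model of the 5-surface parity rule; none of these names come from the repo.
const SURFACES = ["dashboard", "mcp", "rest", "ws", "busybro"] as const;
type Surface = (typeof SURFACES)[number];

interface Capability {
  name: string;
  // Surfaces on which the capability is currently reachable and documented.
  reachableOn: Surface[];
  // Surfaces intentionally excluded (for example a denylisted BusyBro tool).
  documentedExclusions?: Surface[];
}

interface ParityGap {
  capability: string;
  missing: Surface[];
}

// Returns one gap per capability that lacks a surface without a documented exclusion.
function findParityGaps(capabilities: Capability[]): ParityGap[] {
  const gaps: ParityGap[] = [];
  for (const cap of capabilities) {
    const reachable = new Set<Surface>(cap.reachableOn);
    const excluded = new Set<Surface>(cap.documentedExclusions ?? []);
    const missing = SURFACES.filter((s) => !reachable.has(s) && !excluded.has(s));
    if (missing.length > 0) gaps.push({ capability: cap.name, missing });
  }
  return gaps;
}

// Illustrative input only: the capability names are made up.
const exampleGaps = findParityGaps([
  { name: "example_read", reachableOn: ["dashboard", "mcp", "rest", "ws", "busybro"] },
  { name: "example_write", reachableOn: ["dashboard", "rest"] },
  {
    name: "example_excluded",
    reachableOn: ["dashboard", "mcp", "rest", "ws"],
    documentedExclusions: ["busybro"],
  },
]);
```

Here only example_write is reported, because the missing BusyBro surface of example_excluded is a documented exclusion rather than a gap.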
Run node scripts/sync-audit.mjs (machine-checks the cheap subset: version/build, source enum, MCP tool count, code-path refs) — the four MANUAL checks (docs / MCP feature parity / realtime channels / schema→types) are where the 5-surface walk lives.
TEST-1 — Every change ships its own tester artifact (STANDING)
Owned by the tester component (tester/CLAUDE.md). The executable arm of this rule is the prove verdict + scripts/test-system.mjs (issue #89) + scripts/coverage-audit.mjs.
TEST-1. Every implementation or modification of a capability MUST, in the same ship, add or update a tester artifact that reproduces and tests the functionality:
- a suite/script registered in scripts/test-system.mjs SUITES[] (with its phase and, for a load-bearing capability, a CRITICAL_SUITES entry keyed by the stable component/suite id), and
- a row in notes/test-coverage-map.md whose Test column names that suite (or an acknowledged gap row carrying a P4/P5 phase tag), and
- where the capability spans surfaces, an entry in the relevant tester skill (tester/skills/test-<section>/SKILL.md) describing HOW to test it and WHAT is expected.
A change to product code with no corresponding tester-artifact change is a contract violation.
"Tests all cases, not the happy path" applies: the artifact MUST assert the failure/denied path,
and for any auth/gate change BOTH the denied caller AND the legit caller. A new MCP tool ALSO
regenerates the registry golden (node scripts/mcp-golden.mjs --update) so the change lands as a
reviewed diff (contract #13). The BINDING machine-check is coverage-audit.mjs cross-validating the
map's Test column against the LIVE SUITES[] registry (a named suite must exist + run) — the
tests-required Danger check (a test file changed alongside code) is the WEAKER, advisory layer.
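
The binding cross-validation described above can be sketched as follows. This is an assumption-laden illustration rather than the contents of scripts/coverage-audit.mjs: the CoverageRow shape and auditCoverageMap are hypothetical, the map is assumed to be already parsed into rows, and only the "named suite must exist" half is modelled (whether the suite ran is not).

```typescript
// Hypothetical parsed form of a notes/test-coverage-map.md row.
type RowStatus = "covered" | "partial" | "gap";

interface CoverageRow {
  capability: string;
  status: RowStatus;
  suite?: string; // suite id named in the Test column
  phase?: string; // phase tag carried by an acknowledged gap row
}

// Returns a human-readable violation for every row that breaks the TEST-1 map rules.
function auditCoverageMap(rows: CoverageRow[], registeredSuites: ReadonlySet<string>): string[] {
  const violations: string[] = [];
  for (const row of rows) {
    if (row.status === "gap") {
      // An acknowledged gap must carry a phase tag.
      if (!row.phase) violations.push(`${row.capability}: gap row has no phase tag`);
      continue;
    }
    if (!row.suite) {
      violations.push(`${row.capability}: ${row.status} row names no suite`);
    } else if (!registeredSuites.has(row.suite)) {
      violations.push(`${row.capability}: suite "${row.suite}" is not registered`);
    }
  }
  return violations;
}

// Illustrative data: "dashboard/ui-missing" is a made-up id that is deliberately unregistered.
const exampleRegistry = new Set<string>(["dashboard/ui-tester-responsive", "repo/path-coverage"]);
const exampleViolations = auditCoverageMap(
  [
    { capability: "tester responsive layout", status: "covered", suite: "dashboard/ui-tester-responsive" },
    { capability: "made-up feature", status: "covered", suite: "dashboard/ui-missing" },
    { capability: "untagged gap", status: "gap" },
    { capability: "tagged gap", status: "gap", phase: "P4" },
  ],
  exampleRegistry,
);
```

The two reported rows are the one naming an unregistered suite and the gap with no phase tag; a tagged gap is acknowledged and passes.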
TEST-1 for CSS-only / visual / responsive changes (the sanctioned path). A purely-visual change
(a styling tweak, a responsive/layout fix, a spacing/color refinement) has no unit-testable logic, but
it is STILL product code (.tsx) and STILL owes a tester artifact — the tests-required gate does not
exempt it. Satisfy TEST-1 with these three artifacts in the SAME ship:
- a notes/test-coverage-map.md row for the visual behavior (e.g. "/tester — mobile-responsive, no horizontal overflow at phone widths"), and
- an entry in the relevant tester skill (tester/skills/test-<section>/SKILL.md) describing the assertion, and
- ideally a repeatable Playwright LAYOUT-INVARIANT assertion (NOT a brittle pixel screenshot): e.g. at a phone viewport (390px / 360px), assert the route mounts and grows NO horizontal overflow (document.documentElement.scrollWidth ≤ clientWidth (+1px) AND no <main> descendant's bounding right spills past the viewport). It rides the e2e tier (creds + a launched :3838) and skips cleanly on a bare runner. Reference impl: web/dashboard/e2e/specs/tester-responsive.spec.ts (@tester-responsive, suite dashboard/ui-tester-responsive), shipped as the TEST-1 artifact for the dashboard-401 /tester responsive pass; the test-dashboard skill carries the pattern.
The coverage-map row + the skill entry alone satisfy the tests-required file-level gate (and the ship
gate); the Playwright invariant is the repeatable proof that re-checks the behavior every e2e run.
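
The layout invariant itself reduces to a pure predicate over measured numbers, which a spec could evaluate after collecting the metrics in the page. The sketch below is not the reference spec: LayoutMetrics and violatesNoOverflowInvariant are hypothetical names, and applying the 1px tolerance only to the document-level check (not to descendants) is an assumption read off the wording above.

```typescript
// Hypothetical container for values a spec would measure in the browser.
interface LayoutMetrics {
  scrollWidth: number; // document.documentElement.scrollWidth
  clientWidth: number; // document.documentElement.clientWidth
  viewportWidth: number; // the phone viewport width under test, e.g. 390 or 360
  descendantRights: number[]; // getBoundingClientRect().right for each <main> descendant
}

const TOLERANCE_PX = 1; // the "(+1px)" allowance on the document-level check

// True when the page breaks the "no horizontal overflow" invariant.
function violatesNoOverflowInvariant(m: LayoutMetrics): boolean {
  const documentOverflows = m.scrollWidth > m.clientWidth + TOLERANCE_PX;
  // Assumption: descendants get no tolerance; any right edge past the viewport counts.
  const descendantSpills = m.descendantRights.some((right) => right > m.viewportWidth);
  return documentOverflows || descendantSpills;
}

const withinTolerance = violatesNoOverflowInvariant({
  scrollWidth: 391,
  clientWidth: 390,
  viewportWidth: 390,
  descendantRights: [200, 390],
});
const wideTable = violatesNoOverflowInvariant({
  scrollWidth: 390,
  clientWidth: 390,
  viewportWidth: 390,
  descendantRights: [120, 402.5],
});
```

Keeping the predicate separate from the browser measurement means the threshold logic can be unit-checked on a bare runner, while the measurement step still rides the e2e tier.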
TEST-1b — Exhaustive PATH coverage (the #105 extension). TEST-1 proves a feature's HAPPY path; it
does NOT force enumeration of the FAILURE / edge / denied paths. #105 (the busydrivers/Directus SSO
login) shipped a working happy path but untested failure paths (suspended account, wrong creds, the
password-less mirror, the Edge-fn config-secret presence) → a prod bug. For a feature whose paths
warrant exhaustive drilling (auth/session/RBAC, data-loss/destructive, security boundaries — priority
order), enumerate EVERY path in a path manifest, notes/test-paths/<section>.md (grammar:
notes/test-paths/README.md): one - id: per path, kind: happy|edge|failure|denied|security,
status: covered|partial|gap, naming the harness suite: (covered) or a phase: tag (gap). The
binding machine-check is scripts/path-coverage-audit.mjs (contract suite repo/path-coverage,
critical:true): every covered path maps to a registered suite, a critical: true manifest's un-covered
path is a hard fail, and a feature enumerating ONLY a happy path is WARNed (the #105 anti-pattern). It is
folded into prove's comprehensiveness block as pathsCovered/pathsEnumerated. The framework is
INCREMENTAL — a feature with no manifest still owes its TEST-1 coverage-map row (nothing regresses) and
gains a manifest when its paths are drilled. Reference: notes/test-paths/auth.md#directus-login (the
#105 demo, a critical manifest, 17 paths covered via supabase/directusAuthPure +
supabase/edge-fn-probes + repo/edge-secret-presence). Owned by the tester component.
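
The manifest rules above can be sketched as a small audit function. This is an illustration under stated assumptions, not scripts/path-coverage-audit.mjs: the manifest is assumed to be already parsed, the PathEntry/PathManifest shapes and auditPathManifest are hypothetical, any status other than covered is treated as "un-covered", and the example path ids are made up.

```typescript
type PathKind = "happy" | "edge" | "failure" | "denied" | "security";
type PathStatus = "covered" | "partial" | "gap";

interface PathEntry {
  id: string;
  kind: PathKind;
  status: PathStatus;
  suite?: string; // harness suite for a covered/partial path
  phase?: string; // phase tag for a gap
}

interface PathManifest {
  section: string;
  critical: boolean;
  paths: PathEntry[];
}

interface PathAudit {
  failures: string[];
  warnings: string[];
  pathsCovered: number;
  pathsEnumerated: number;
}

function auditPathManifest(manifest: PathManifest, registeredSuites: ReadonlySet<string>): PathAudit {
  const failures: string[] = [];
  const warnings: string[] = [];
  let pathsCovered = 0;
  const isRegistered = (suite?: string): boolean => suite !== undefined && registeredSuites.has(suite);

  for (const p of manifest.paths) {
    if (p.status === "covered") {
      if (isRegistered(p.suite)) pathsCovered += 1;
      else failures.push(`${p.id}: covered path names no registered suite`);
    } else if (manifest.critical) {
      // Assumption: partial and gap both count as un-covered in a critical manifest.
      failures.push(`${p.id}: un-covered path in a critical manifest`);
    } else if (p.status === "partial" && !isRegistered(p.suite)) {
      failures.push(`${p.id}: partial path names no registered suite`);
    } else if (p.status === "gap" && !p.phase) {
      failures.push(`${p.id}: gap path carries no phase tag`);
    }
  }
  // The #105 anti-pattern: only happy paths were enumerated.
  if (manifest.paths.length > 0 && manifest.paths.every((p) => p.kind === "happy")) {
    warnings.push(`${manifest.section}: only happy paths enumerated`);
  }
  return { failures, warnings, pathsCovered, pathsEnumerated: manifest.paths.length };
}

const exampleSuites = new Set<string>(["supabase/directusAuthPure"]);
const criticalAudit = auditPathManifest(
  {
    section: "auth",
    critical: true,
    paths: [
      { id: "login-ok", kind: "happy", status: "covered", suite: "supabase/directusAuthPure" },
      { id: "suspended-account", kind: "denied", status: "gap", phase: "P4" },
    ],
  },
  exampleSuites,
);
```

In this example the phase-tagged gap would be acceptable in an ordinary manifest, but it is a hard fail because the manifest is critical.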
TEST-2 — Green means PROVEN, not merely executed (STANDING). The prove verdict is GREEN only
when (a) every critical:true suite RAN (not skipped/quarantined) AND passed, (b) every
covered/partial coverage-map row maps to a suite that actually passed this run, (c) the
5-surface parity matrix holds. A bare runner (no e2e creds / no rig) yields a DISTINCT verdict
(GREEN-EXCEPT-CREDS / GREEN-EXCEPT-DEVICE / GREEN-EXCEPT-CREDS-AND-DEVICE), never plain GREEN —
a skip is never a pass. A coverage-map "covered" that names a non-existent or non-passing suite
is a violation. (The mutation-testing oracle + the behavioral 5-surface drive that complete TEST-2
are Phases D/F of the tester roadmap — notes/tester/RESEARCH-design.md §7; scoped to Node/TS, with
Deno/Swift mutation an acknowledged gap, not pretended parity.)
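
One way to read the TEST-2 verdict rule is as a function over suite outcomes. The sketch below is a hedged interpretation, not the prove implementation: the SuiteOutcome values, the ProveInput shape, and proveVerdict are hypothetical, skips are assumed to carry a reason (creds or device), and treating a quarantined or missing required suite as RED is an assumption the text does not spell out.

```typescript
type Verdict =
  | "GREEN"
  | "GREEN-EXCEPT-CREDS"
  | "GREEN-EXCEPT-DEVICE"
  | "GREEN-EXCEPT-CREDS-AND-DEVICE"
  | "RED";

// Assumption: each skip records why it was skipped.
type SuiteOutcome = "passed" | "failed" | "quarantined" | "skipped-no-creds" | "skipped-no-device";

interface SuiteResult {
  id: string;
  critical: boolean;
  outcome: SuiteOutcome;
}

interface ProveInput {
  suites: SuiteResult[];
  coveredSuiteIds: string[]; // suites named by covered/partial coverage-map rows
  parityHolds: boolean; // result of the 5-surface parity matrix
}

function proveVerdict(input: ProveInput): Verdict {
  if (!input.parityHolds) return "RED";

  const byId = new Map<string, SuiteResult>();
  for (const s of input.suites) byId.set(s.id, s);

  // Required = every critical suite plus every suite a coverage row relies on.
  const required = new Set<string>(input.coveredSuiteIds);
  for (const s of input.suites) if (s.critical) required.add(s.id);

  let missingCreds = false;
  let missingDevice = false;
  for (const id of Array.from(required)) {
    const result = byId.get(id);
    // A named suite that does not exist, failed, or was quarantined can never be green.
    if (!result || result.outcome === "failed" || result.outcome === "quarantined") return "RED";
    if (result.outcome === "skipped-no-creds") missingCreds = true;
    if (result.outcome === "skipped-no-device") missingDevice = true;
  }

  // A skip is never a pass: it downgrades the verdict to a distinct label.
  if (missingCreds && missingDevice) return "GREEN-EXCEPT-CREDS-AND-DEVICE";
  if (missingCreds) return "GREEN-EXCEPT-CREDS";
  if (missingDevice) return "GREEN-EXCEPT-DEVICE";
  return "GREEN";
}

const bareRunnerVerdict = proveVerdict({
  suites: [
    { id: "repo/path-coverage", critical: true, outcome: "passed" },
    { id: "dashboard/ui-tester-responsive", critical: false, outcome: "skipped-no-creds" },
  ],
  coveredSuiteIds: ["dashboard/ui-tester-responsive"],
  parityHolds: true,
});
```

The example run has a passing critical suite, but a coverage row depends on a suite skipped for lack of creds, so the verdict is GREEN-EXCEPT-CREDS rather than plain GREEN.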
TEST-3 — The tester is itself a 5-surface capability. Running + reading test results MUST be
reachable from the dashboard tester view, an MCP tool (run_test_suite / get_test_status), the
/rest manifest, a /ws channel (live progress), and BusyBro — per the standing 5-surface parity
rule. (The view + its MCP/REST/WS/BusyBro surfaces are tester Phase E; Phase A ships the contract
arm + the harness + this wording.)
How it's enforced / machine-checked:
| Mechanism | What it does |
|---|---|
| scripts/coverage-audit.mjs --strict | Cross-validates every covered/partial Test column against the live SUITES[] registry (--list-suites); --run=<report> warns when a covered suite did not actually run; --since=<ref> PR-delta. The binding TEST-1 check. |
| scripts/path-coverage-audit.mjs --strict | Parses the path manifests (notes/test-paths/*.md); every enumerated path is covered/partial (naming a registered suite) or an acknowledged gap (phase-tagged); a critical:true manifest's un-covered path is a hard fail; WARNs the #105 happy-only anti-pattern. Contract suite repo/path-coverage. The binding TEST-1b (exhaustive-path) check. |
| scripts/test-system.mjs contract tier | sync-audit + the drift detectors (content-parity, mcp-golden, source-enum-mirror, contracts-consistency, 5-surface-parity), all critical:true, seconds, no creds. |
| .github/workflows/tests-required.yml (Danger) | A PR changing source with no matching *.test.*/*.spec.*/test-system/coverage-map change → fails on a main-targeting PR, advisory on a feature branch (mirrors the pre-push strictness). The weakest layer, paired with coverage-audit + (Phase F) patch-coverage so it isn't gameable. |
| .githooks/pre-push + scripts/ship.mjs step 5b | The contract additions run in the fast gate (pre-push, hard on push-to-main); the ship gate runs the component's tier from ship-gate-map.json. Both also run tests-required-audit.mjs --hard (pre-push: committed diff vs origin/main, hard on push-to-main; ship.mjs step 5b: --include-worktree, hard, EVERY ship) so a product-without-test change is BLOCKED before the push, not surfaced after as a red CI run. The ship gate is the deterministic layer (fires for every ship regardless of whether the local pre-push hook is wired via git config core.hooksPath .githooks). |
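
The file-level tests-required rule in the table above can be sketched as a pure function over a changed-file list. This is not the Danger check or tests-required-audit.mjs: testsRequiredGate and its patterns are hypothetical, and the definition of "product code" as any .ts/.tsx/.mjs/.swift file that is not itself a test artifact is an assumption made for the example.

```typescript
type GateResult = "pass" | "advisory" | "fail";

// Files that count as a tester-artifact change at the file level.
const TEST_ARTIFACT_PATTERNS: RegExp[] = [
  /\.test\.[^/]+$/,
  /\.spec\.[^/]+$/,
  /(^|\/)scripts\/test-system\.mjs$/,
  /(^|\/)notes\/test-coverage-map\.md$/,
];

// Assumption: these extensions approximate "product code" for the sketch.
const PRODUCT_CODE = /\.(ts|tsx|mjs|swift)$/;

function isTestArtifact(path: string): boolean {
  return TEST_ARTIFACT_PATTERNS.some((re) => re.test(path));
}

// Hard fail on a main-targeting change, advisory elsewhere, pass when a test artifact rides along.
function testsRequiredGate(changedFiles: string[], targetBranch: string): GateResult {
  const touchesProduct = changedFiles.some((p) => PRODUCT_CODE.test(p) && !isTestArtifact(p));
  const touchesTests = changedFiles.some(isTestArtifact);
  if (!touchesProduct || touchesTests) return "pass";
  return targetBranch === "main" ? "fail" : "advisory";
}

const codeOnlyToMain = testsRequiredGate(["web/dashboard/app/page.tsx"], "main");
const codeWithMapRow = testsRequiredGate(
  ["web/dashboard/app/page.tsx", "notes/test-coverage-map.md"],
  "main",
);
```

As the table notes, a check of this kind only proves that some test-shaped file changed, which is why it is paired with the registry cross-validation rather than trusted on its own.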
Contracts
| # | Contract | Lives in | Change-together rule | Audit |
|---|---|---|---|---|
| 1 | source connection enum (pac \| vpn \| cdp) | web/dashboard/types.ts (ConnectionSource), web/proxy-server/src/types.ts, cli/cdp-connector/src/types.ts, ios/Shared/TrafficLogEntry.swift | The three TS mirrors must be identical. cdp-connector is the component that produces source:'cdp' (daemon/heartbeat/host-device path), so its types.ts is the canonical home of the cdp value. iOS is a capture source (vpn/pac), a subset by design (no cdp). | ✅ TS set; ⚠ iOS manual |
| 2 | Version + build | version.json (components.<x>.build + top-level build) ↔ each subproject package.json build | Only the changed component's build bumps; top-level = max(components); package.json must mirror version.json. iOS = pbxproj CURRENT_PROJECT_VERSION; supabase has no package build. | ✅ |
| 3 | MCP ↔ dashboard parity | supabase/functions/mcp/index.ts (TOOLS array + TOOL_FNS handlers) vs the dashboard's reads/writes; tool count documented in README.md + root CLAUDE.md | Every dashboard read/write has an MCP tool; every TOOLS entry has a TOOL_FNS handler; the documented count matches TOOLS.length — currently 188. The three-tier-scope (global → user → device) phase added the USER-scope writers set_scripts_user (scripts:edit + admin + confirm — IDENTICAL bar to the global/device script writers; the user tier does NOT lower it), set_block_rules_user (users:edit + confirm; an inline {type:'script'} action stays scripts:edit + admin + confirm via the extended mcp/index.ts privilege-guard name-check) and set_breakpoint_patterns_user (users:edit + confirm — low-risk, no code-exec), 177→180; get_scripts/get_block_rules now return the 3-tier union(global, user, device) (de-duped by id, device wins then user then global) backed by migration 20260617160000 (the settings_user table + the 3-way fold in effective_settings_for_device + the settings_user_scripts_gate + the settings_user:all/per-device settings-updated broadcast). All 3 flow to BusyBro automatically at those gates (NOT denylisted). 
The FOUR-tier-scope (global → user → SERVICE → device) phase added the SERVICE-scope writers set_scripts_service (scripts:edit + admin + confirm — IDENTICAL bar; the service tier does NOT lower it, the DB service_groups_scripts_gate enforces it server-side), set_block_rules_service (services:edit + confirm; the inline {type:'script'} privilege-guard name-check was extended to set_block_rules_service) and set_breakpoint_patterns_service (services:edit + confirm — low-risk), 180→183; get_scripts/get_block_rules now return the 4-tier union(global, service, user, device) (de-duped by id, device wins then user then service then global; multi-group collisions resolve by service-group name) backed by migration 20260618120000 (the service_groups.data + .breakpoint_patterns columns + the NET-NEW service-tier aggregation/fold in effective_settings_for_device + the service_groups_scripts_gate + the extended broadcast_service_groups_change per-device settings-updated fan-out — NO new channel). All 3 flow to BusyBro automatically at those gates (NOT denylisted; the {service_group} arg is id-or-name). The tester full-control phase added the 6 operator tools (cancel_test_run / rerun_test_suite — tester:run + confirm — and list_test_suites / get_test_coverage / get_test_path_coverage / get_test_suite_history — tester:view), and extended run_test_suite with an optional suite arg. 
The BusyBro agentic-expansion phase added skills (list/get/upsert/delete_skill + set_skill_enabled), agents (list/get/upsert/delete_agent + set_service_group_agents — FULL-REPLACE the busybro_agents linked to a service group via the service_group_agents join, exactly one is_primary, services:edit + confirm; list_service_groups/get_service_group now also return each group's primary_agent + ordered agents[]), memory imports (import_resource (GitHub/GitLab, private via a write-only credential_ref PAT; a category arg — 'general' = docs-only | 'code' = force all files through the code extractor | 'both' = WHOLE-REPO AUTO, per-file general/code; the DEFAULT when omitted is SOURCE-TYPE aware, mirroring the dashboard agenticApi.defaultCategoryFor (a repo → 'both' so MCP/BusyBro repo imports capture code like the dashboard, a file → 'general') — so BusyBro answers implementation questions, code facts tagged category='code', embedding the repo-relative file path + capturing a code snippet; migration 20260617010000 adds the category STORE dimension to busybro_global_memories/busybro_memory_imports + an optional p_category on match_global_memories, and 20260617030000 (IMPORT-V2) widens the imports CHECK to 'both', adds live facts_general/facts_code/tokens_*/cost_usd/current_file stat columns + busybro_memory_sources.snippet, and projects source_provider/source_ref/source_path/snippet from match_global_memories for code-answer source links; recall BLENDS by default, narrowing to code only on explicit code intent) / list_imports / get_import / delete_import), and usage/cost (get_busybro_usage); the Settings phase added BusyBro settings (get_busybro_settings / set_busybro_settings — the configurable brain knobs in the busybro_settings singleton: model + max_tokens + loop limits + prompt_cache/web_search/memory_auto_save toggles + persona_addendum/default_prefs/suggested_prompts; read users:view, write users:edit + admin + confirm). 
Authoring writes are skills:edit/agents:edit/users:edit + admin + confirm; reads are users:view. Since the full-sync pass these are ALSO exposed to the BusyBro free-text LLM at those same gates — the brain can now author skills/agents/scripts, run imports, govern global memory, and adjust its own settings (the confirm prompt surfaces the full payload before an admin approves); only raw primitives + curated-superseded duplicates + brain-native personal memory stay denylisted. The app-secrets phase added the Vault-backed secret store (list_app_secrets — metadata ONLY, no value, global:view — / create_app_secret / update_app_secret / delete_app_secret — value write-only, encrypted via native supabase_vault, NEVER read back; global:edit + admin + confirm), 159→163; BusyBro is LIST-ONLY (the 3 mutators added to REGISTRY_DENYLIST as a 4th class — value comes from the user, never the brain) — see docs/architecture/app-secrets-vault.md. The Stripe billing (Phase 1) phase added the billing reads (get_subscription / list_invoices / get_usage — owner-scoped via ctx.sub, billing:view; all_users:true for operators) + the confirm-gated create_checkout_session (the server-side-customer-derived checkout/portal mutator), 163→167; BusyBro keeps the 3 reads but create_checkout_session joins REGISTRY_DENYLIST as a 5th class (payment-initiating — the create_app_secret precedent) — see docs/architecture/stripe.md. The Stripe ADMIN (Phase A) phase added the 5 operator READS (get_stripe_config — value-blind { mode, publishable_key } only, mode derived from the pub-key prefix, secret/whsec_ NEVER read — / list_prices / list_customers / get_webhook_events / get_billing_settings, all billing:view; the fleet reads hit the local stripe_* tables service-role, NOT the db_* allowlists), 167→172; all 5 flow to BusyBro automatically (no secret values, NOT denylisted). 
Backed by the value-blind stripe-admin Edge fn + the billing_settings singleton + the billing_admin:all WS channel + the stripe_usage_report_status/_now + stripe_admin_orphans DEFINER RPCs — see docs/architecture/stripe.md. The BusyBro team-memory ATTRIBUTION phase added list_memory_contributors (operator-only, users:view) — joins the leak-proof busybro_global_contributions ledger + the row's approved_by to profiles to NAME who proposed/approved a team fact, 172→173; it is STRICTLY off the recall path (match_global_memories is unchanged, still projects NO user_id, so the recall store stays leak-proof — contributor identity only ever surfaces on this explicit lookup); a READ → flows to BusyBro automatically (NOT denylisted). The Stripe ADMIN (Phase B) phase added the 4 operator MUTATORS (set_default_price / set_billing_settings / report_usage_now — NON-secret config, route through the set_billing_settings/stripe_usage_report_now DEFINER RPCs which re-check billing:edit+is_admin() in-body — + cancel_subscription — DESTRUCTIVE/state-changing, straight to Stripe via the Vault-backed client; all billing:edit + admin + confirm), 173→177; the 3 config mutators flow to BusyBro automatically (parity with set_busybro_settings, NOT denylisted) but cancel_subscription joins REGISTRY_DENYLIST (money/state — the create_checkout_session precedent). The secret-touching rotate_webhook (fresh endpoint → captures the new whsec_ → writes it value-blind to the Vault) is a stripe-admin Edge-fn WRITE action ONLY, deliberately NOT exposed over MCP. Backed by the stripe-admin Edge fn's WRITE ops (create_price/update_price/archive_price/rotate_webhook/cancel_subscription, billing:edit+admin) — see docs/architecture/stripe.md. 
The BusyBro MULTI-SESSION phase (notes/specs/busybro-multisession.md) added the 5 owner-scoped session tools (list_busybro_sessions / get_busybro_session — users:view — / create_busybro_session / rename_busybro_session / delete_busybro_session — users:edit + confirm; a "session" = one busybro_threads row, threaded to the OAuth caller's sub, reusing _shared/busybroThreads.ts), 183→188; all 5 flow to BusyBro automatically (owner-scoped own data, NOT denylisted). Backed by migration 20260618170000 (the additive context_key/last_active_at columns + the partial-unique (user_id, context_key) find-or-create lock + the (user_id, last_active_at desc) ordering index) + the busybro-threads Edge fn's create/rename/delete/find ops — RLS unchanged (deny-all, service-role only, ownership by user_id). | ✅ count; ⚠ feature parity manual |
| 4 | Realtime channels (ws) | publishers (triggers/pgRealtime.ts/PostgresStreamer) ↔ subscribers (dashboard, iOS RealtimeSubscriber, proxy PgRealtime); documented home = web/dashboard/app/(settings)/ws/channels.ts (the /ws explorer) | A new channel/broadcast must (a) have a subscriber on every client that needs it AND (b) be documented in the /ws explorer registry (5-surface rule, WS). Current set: ws:<workspace_id>, ws:<workspace_id>:<owner_user_id>, device:<uuid>, devices:all, device_status:all, settings:global, settings_device:all, settings_user:all, service_groups:all, breakpoint_events:all, workspace_tabs:<workspace_id>, service_status:all, service_stats:all, proxy-control, tags:all, app_secrets:admin, builds:all, device_jwts:<uuid>, browser_profiles:all, profiles:all, roles:all, rbac:caps, device_pac_state:dashboard, busybro_global_memories:<view>, busybro_skills:authoring, busybro_agents:authoring, service_group_agents:links, busybro_agent_runs:history, busybro_memory_imports:list, ws:tester, subscriptions:<user_id>, billing_admin:all. The device:<uuid> events bundle open-sheet (sheet key settings|cert|pac, iOS RemoteSheetController), live-activity-message ({ message }, fired by device_live_activity_message trigger → iOS LiveActivityController; sibling pg_net → push-notify liveactivity push covers backgrounded), push-response (#75 — DEVICE→sender answer to an actionable push, fired by broadcast_push_response → payload.record = a push_responses row, filtered by correlation_id in lib/pushActions.ts), and cdp-command/cdp-result (contract #9). The entries firehose has TWO variants (#104 part 2): the entries trigger DUAL-EMITs every captured row to BOTH the FLEET ws:<workspace_id> topic AND the OWNER-SCOPED ws:<workspace_id>:<owner_user_id> topic; realtime.messages RLS authorizes the fleet topic only for a devices:view holder (operator/admin) and the owner-scoped topic for the matching owner (or any devices:view holder). 
The dashboard (DashboardClient firehoseTopic, web/dashboard/app/lib/feedFirehoseTopic.ts) branches on useCapabilities().can('devices','view') — operator → fleet, plain owner → their own owner-scoped topic — with identical payload + handlers; cold-load (PostgREST) is already owner-scoped by RLS. ws:tester is the tester (architecture-prove) run-progress firehose (Stage 1 backend, supabase migration 296): the broadcast_test_run trigger fans every test_runs INSERT (a new run) AND UPDATE (an ingest fills in the verdict/counts) to this private topic so the /tester view shows new runs + live per-phase progress without polling; payload.record = a test_runs row ({ id, tier, verdict (GREEN|GREEN-EXCEPT-DEVICE|GREEN-EXCEPT-CREDS|GREEN-EXCEPT-CREDS-AND-DEVICE|RED), host, source, pass/fail/skip/quarantined, … }); realtime.messages RLS authorizes it only for a has_capability('tester','view') holder (admin via grants-all). app_secrets:admin is the Vault-backed app-secret METADATA lifecycle firehose (the dashboard Environment → Secrets (encrypted) panel subscribes it for live updates): the public.app_secrets trigger fans INSERT/UPDATE/DELETE — but CRITICALLY the mirror table has NO value column, so the envelope is structurally value-free (name, secret_class, hint, note, created_by, last_modified_by, last_accessed_at, timestamps only — never ciphertext); realtime.messages RLS gates this PRIVATE topic to a has_capability('global','view') holder (admin). The encrypted value never traverses Realtime — it's decryptable only by the service-role Edge resolver resolve_app_secret (see App Secrets & Vault). 
subscriptions:<user_id> is the per-user Stripe billing firehose (Stripe Phase 2): the stripe_subscriptions_broadcast DEFINER trigger fans every stripe_subscriptions INSERT/UPDATE/DELETE (the webhook-driven status change) to this topic via realtime.broadcast_changes, so the dashboard /billing page (BillingClient) flips its subscription/tier card live on a webhook event — no polling; realtime.messages RLS gates this PRIVATE topic via the consolidated realtime_messages_read_human CASE's anchored subscriptions:<uuid> arm (owner reads their own uid; a billing:view operator — admin via grants_all — reads any), inserted BEFORE the trailing ELSE true so a billing topic can't leak cross-tenant. settings_user:all is the per-USER-settings (3-tier scope, global → user → device) operator fan-out (supabase migration 20260617160000): the broadcast_settings_user DEFINER trigger fans every settings_user INSERT/UPDATE/DELETE to this coarse topic (the dashboard refetches the user-tier scripts/blockRules/breakpoints lists), AND — critically — emits a settings-updated event on the device:<uuid> channel of EVERY device owned by that user, so the always-on iOS-tunnel + cdp subscribers invalidate on the channel they already subscribe to (the user tier reaches them with ZERO new RLS, since the server-side effective_settings_for_device fold already returns the union). realtime.messages RLS gates the settings_user:all topic via the consolidated realtime_messages_read_human CASE's anchored arm — owner-or-users:view: a user reads only the events whose payload.record/old_record.user_id is their own, a users:view operator (admin via grants_all) reads all — inserted BEFORE the trailing ELSE true. 
service_groups:all is the SERVICE-tier (4-tier scope, global → user → service → device) operator fan-out (supabase migration 20260618120000): a service_groups row's data.scripts/data.blockRules/breakpoint_patterns now fold into the effective settings of every device that has applied the group, so the broadcast_service_groups_change DEFINER trigger KEEPS the coarse service_groups:all emit (dashboard + proxy refresh-all) AND ADDS a settings-updated emit on the device:<uuid> channel of EVERY device that has applied the changed group — explicitly (settings_device.applied_service_groups) or via the global default (settings_global.applied_service_groups, which fans to ALL devices) — so iOS-tunnel + cdp (which do NOT subscribe service_groups:all) invalidate on the channel they already subscribe to (ZERO new RLS; NO new channel — it reuses the existing owner-scoped device:<uuid> arm). | ⚠ manual |
| 5 | Supabase schema → typed clients | supabase/migrations/** → web/dashboard/types/*, web/proxy-server/src/types.ts, iOS Codable structs | A migration that changes a shared table updates every reader. Prefer supabase gen types over hand-kept mirrors where possible. The settings_global.connection_type (global default) + nullable profiles.connection_type (per-user default) + nullable settings_device.connection_type (per-device override) columns flow through effective_settings_for_device (device → user → global → 'vpn') into iOS's effective-settings blob. The device_live_activity_message.message column maps to iOS BusymateActivityAttributes.ContentState.note (no new Codable — iOS reads it via a { message } row decode in LiveActivityController.fetchMessage). The settings_{global,device}.data.scripts array folds through effective_settings_for_device (global ++ device — migration 20260611150100_effective_settings_scripts_union.sql) into the same effective-settings blob iOS/proxy/cli consume; the scripts:view/scripts:edit capability + a Postgres write-gate on the settings tables (migration 20260611150000_scripts_rbac_write_gate.sql) gate every writer (dashboard, MCP, raw device token). | ⚠ manual |
| 6 | Docs coverage | docs/** (+ per-folder _meta.json nav) | Every shipped, user-visible feature has/updates a docs/ page; the dashboard ↔ mcp ↔ docs trio is updated in the same change. | ⚠ manual |
| 7 | cdp-connector distribution | cli/cdp-connector/package.json build ↔ dashboard /install.sh + /bmc.tar.gz + /api/version; bmc --help/README | Shipping a connector build needs BOTH the package.json bump AND a dashboard deploy (the tarball is served live). bmc command/flag changes update --help + README. | ✅ build only |
| 8 | Component identity (name/label/host/description) | version.json components.<x>.{name,label,host,description} = the single source; every component reads from it (dashboard/docs/proxy/cdp bundle the values via their package.json mirror; the notifier + /api/version read version.json directly). | version.json is canonical; scripts/sync-version.mjs propagates name/description/version/build into each subproject package.json and the iOS main-app CFBundleDisplayName (pbxproj × Debug+Release, run from the fastlane beta pre-archive step); never hardcode a component name/description anywhere else. iOS build stays ASC→version.json (TestFlight is authoritative for the accepted number); iOS display NAME flows version.json→pbxproj. | ✅ presence + uniqueness + package mirror + iOS display name |
| 9 | Remote browser control (CDP) wire | device:<uuid> cdp-command {id,method,params?,sessionId?} → cdp-result {id,ok,result?,error?} between the callers (_shared/cdp.ts sendCdpAndAwait, used by busybro.ts + mcp/index.ts; dashboard apiFetch.ts sendDeviceCommand) and the device (cli/cdp-connector RemoteCdpExecutor). | Event names + payload shapes match on both sides; meta methods bmc.targets/bmc.activePage are connector-local. Opt-in key settings_device.data.cdpControlEnabled (default OFF) is read by the device AND written by the dashboard/MCP/BusyBro toggles. Oversize results upload to the cdp-artifacts Storage bucket and return {url}. browser_cdp (raw) is admin-only; wrappers are devices:edit. A device JWT may write its own device:<uuid> topic (RLS 0017); the read policy lets a non-admin owner receive cdp-result. | ⚠ manual |
| 10 | Script config (the onRequest/onResponse hook contract) | canonical web/proxy-server/src/scriptEngine.ts (isolated-vm) + scriptConfig.ts; mirrors cli/cdp-connector/src/scriptEngine.ts, dashboard web/dashboard/types.ts (Script), ios/Shared/Script.swift, MCP SCRIPT_SCHEMA (supabase/functions/_shared/mcpRegistry.ts). Storage settings_{global,device}.data.scripts; effective = global ++ device via effective_settings_for_device. | The frozen req/res/ctx/busymate.*/Response.json API runs byte-identically on every engine — a change to the shape, return-value semantics, or caps updates all four mirrors at once. New MCP script tools keep parity (contract 3). See Scripts. | ⚠ manual |
| 11 | TrafficLogEntry script-result mirror (scriptRan \| scriptError \| scriptTrace) | web/proxy-server/src/types.ts, cli/cdp-connector/src/types.ts, web/dashboard/types.ts, ios/Shared/TrafficLogEntry.swift | The three fields must be the same JSON type on every writer: scriptRan?: string[] \| null (executed hook ids, execution order; inline block-rule actions namespaced inline:<ruleId>), scriptError?: string, scriptTrace?: string[]; empty/nil OMITS the key. RESOLVED 2026-06-12: iOS reconciled to [String]? + emits the proxy shape (cb209d8f; the boolean it briefly emitted in build 180 crashed the feed; dashboard 379 normalizes defensively as a backstop). Residual: iOS attaches script fields only on SYNTHETIC entries (the proxy also attaches on normal pair-logged entries), an open parity follow-up. | ⚠ manual |
| 12 | Block-rule run-count cap + auto-disable (BlockRule.maxRuns) | canonical web/proxy-server/src/blockRules.ts (maxRuns field + parseMaxRuns) + runCounter.ts (executor enforcement) + pgRunCounter.ts (record_block_rule_run RPC call); mirrors cli/cdp-connector/src/blockRules.ts + its executor, ios/Shared/ block-rule reader + executor, MCP set_block_rules_* (carry maxRuns), dashboard block-rule editor (render the cap + auto-disabled state); supabase owns the record_block_rule_run RPC + the settings_device.data.{blockRuleRuns,blockRuleDisabled} storage. | See the FULL spec below this table. In one line: maxRuns?: number on a BlockRule (any action type) caps how many times the rule fires PER (device, ruleId); on the run that reaches the cap the executor serves the final action then AUTO-DISABLES the rule for that device only — DEVICE rule → its own enabled=false, GLOBAL rule → its id added to that device's settings_device.data.blockRuleDisabled[] (the global rule stays live fleet-wide). The count persists in settings_device.data.blockRuleRuns[ruleId] so it survives proxy restart + device reconnect ("once-ever"). Every executor enforces at the SAME chokepoint (firstBlockMatch + a per-device suppress predicate + a consume-before-fire gate); the write-back is the record_block_rule_run RPC which fires the existing settings_device:all Realtime broadcast so the dashboard reflects the auto-disable live. | ⚠ manual |
| 13 | MCP registry golden (gate/shape snapshot) | supabase/functions/_shared/mcpRegistry.ts (MCP_TOOLS) ↔ notes/test-goldens/mcp-registry.golden.json | An INTENTIONAL change to a tool's cap / adminOnly / confirm / deviceArg / inputSchema, or a tool added/removed, MUST regenerate the golden (node scripts/mcp-golden.mjs --update) in the SAME ship so the change lands as a reviewed diff. The description is deliberately NOT hashed (prose churns); the gate is the gating + shape. Owned by the tester component (TEST-1). | ✅ |
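
The wire shapes in contract 9 can be written down as types plus a small guard that pairs a cdp-result with the cdp-command it answers. This is a minimal sketch, not code from _shared/cdp.ts or RemoteCdpExecutor: only the field names come from the table; the field types (string ids, string errors) and the helper names isCdpResult and answers are assumptions.

```typescript
// Hypothetical types: field NAMES are from contract 9, field TYPES are assumed.
type CdpCommand = {
  id: string;
  method: string;
  params?: Record<string, unknown>;
  sessionId?: string;
};

type CdpResult = {
  id: string;
  ok: boolean;
  result?: unknown;
  error?: string;
};

// Runtime guard for an incoming broadcast payload (assumed to arrive as unknown JSON).
function isCdpResult(value: unknown): value is CdpResult {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v.id === "string" &&
    typeof v.ok === "boolean" &&
    (v.error === undefined || typeof v.error === "string")
  );
}

// A result answers a command only when it is well-formed and the ids match.
function answers(cmd: CdpCommand, payload: unknown): payload is CdpResult {
  return isCdpResult(payload) && payload.id === cmd.id;
}
```

A caller waiting on a device:<uuid> topic could drop every payload for which answers(...) is false, so that a malformed or unrelated cdp-result never resolves the wrong request.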
Contract 12 — Block-rule run-count cap + auto-disable (#87)
Canonical owner: proxy-server (web/proxy-server/src/blockRules.ts is the source-of-truth type; runCounter.ts is the reference executor). cdp-connector + iOS mirror it byte-for-byte; supabase owns the RPC + storage; the dashboard renders it. This section is the exact spec the mirrors follow.
Why: a mock block rule with no run cap answers EVERY matching request — a teammate's 5 one-shot
mock 401s on Spark/Walmart driver endpoints (to force a token refresh) made the app refresh in an
infinite loop until disabled by hand. maxRuns lets a rule fire once (or N times) then stop.
1. The field (canonical type — web/proxy-server/src/blockRules.ts):

    interface BlockRule {
      id: string; enabled: boolean; method?: string; pattern: string;
      action: BlockAction; note?: string;
      maxRuns?: number; // (#87) cap, applies to ANY action type
    }

- maxRuns is on the RULE (not the action) and applies to block/mock/drop/script alike.
- Absent / null / non-positive / fractional / non-numeric = unbounded (today's behavior). 1 = fire once.
- Tolerant coercion (parseMaxRuns, exported, copy verbatim): accept a number or numeric string; require finite + Number.isInteger + > 0; everything else → undefined. A malformed maxRuns is dropped but the rule is otherwise kept (never strands the rule).
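
The coercion rule above can be sketched as follows. This is an illustrative reading of the rule, not the exported parseMaxRuns from web/proxy-server/src/blockRules.ts (mirrors copy that one verbatim). The names parseMaxRunsSketch, RawRule, CleanRule, and cleanRule are hypothetical, and using Number() to read numeric strings is an assumption.

```typescript
// Sketch only: the canonical parseMaxRuns lives in web/proxy-server/src/blockRules.ts.
function parseMaxRunsSketch(raw: unknown): number | undefined {
  let n: number;
  if (typeof raw === "number") {
    n = raw;
  } else if (typeof raw === "string" && raw.trim() !== "") {
    n = Number(raw); // assumption: Number() semantics; yields NaN for non-numeric strings
  } else {
    return undefined; // null, undefined, booleans, objects, empty strings
  }
  // finite + integer + strictly positive; everything else means "unbounded"
  return Number.isFinite(n) && Number.isInteger(n) && n > 0 ? n : undefined;
}

// Hypothetical trimmed-down rule shapes, just enough to show the "keep the rule" behavior.
type RawRule = { id: string; maxRuns?: unknown };
type CleanRule = { id: string; maxRuns?: number };

// A malformed maxRuns is dropped, but the rule itself survives.
function cleanRule(raw: RawRule): CleanRule {
  const parsed = parseMaxRunsSketch(raw.maxRuns);
  return parsed === undefined ? { id: raw.id } : { id: raw.id, maxRuns: parsed };
}
```

Under this reading, cleanRule({ id: "r1", maxRuns: "abc" }) keeps rule r1 and simply treats it as unbounded.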
2. Storage (per-device, in settings_device.data — supabase owns the columns):
- settings_device.data.blockRuleRuns: { [ruleId: string]: number } is the per-device run count.
- settings_device.data.blockRuleDisabled: string[] is the per-device suppress set for GLOBAL rule ids whose cap was reached on this device (distinct from a device rule's own enabled=false).
- Both are strictly per-device: they live ONLY on settings_device, NOT settings_global, and are NOT folded into the effective_settings_for_device block-rules union (they'd leak across devices). Readers fetch them from the device's own settings_device row directly.
3. Auto-disable semantics (per-device, even for global rules):
- DEVICE rule (id present only in settings_device.data.blockRules): on cap, flip THAT rule's enabled = false in settings_device.data.blockRules → firstBlockMatch skips it (!enabled).
- GLOBAL rule (id present in settings_global.data.blockRules): on cap, add the id to settings_device.data.blockRuleDisabled[] (set semantics). The global rule's OWN enabled is untouched; it keeps firing on every other device. Scope is decided by membership in the global rule-id set.
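
The scope decision and the two per-device "off" states can be sketched like this. It is a minimal illustration under assumptions: the types are trimmed to the fields named in sections 2 and 3, and the helper names autoDisableScope and isSuppressedOnDevice are hypothetical, not functions from any executor.

```typescript
// Trimmed shapes; only the fields this section talks about.
type Rule = { id: string; enabled: boolean };
type DeviceData = {
  blockRules?: Rule[];          // device-scoped rules
  blockRuleDisabled?: string[]; // GLOBAL rule ids spent on this device
};

// Scope is decided purely by membership in the global rule-id set.
function autoDisableScope(
  ruleId: string,
  globalRuleIds: ReadonlySet<string>,
): "device" | "global" {
  return globalRuleIds.has(ruleId) ? "global" : "device";
}

// Is this rule currently skipped on THIS device (whatever the reason)?
function isSuppressedOnDevice(
  ruleId: string,
  globalRuleIds: ReadonlySet<string>,
  device: DeviceData,
): boolean {
  if (autoDisableScope(ruleId, globalRuleIds) === "global") {
    // The global rule's own enabled flag is never consulted here.
    return (device.blockRuleDisabled ?? []).indexOf(ruleId) !== -1;
  }
  const own = (device.blockRules ?? []).find((r) => r.id === ruleId);
  return own !== undefined && !own.enabled;
}
```

Because the global branch reads only this device's blockRuleDisabled, a rule spent on one device stays live on every other device.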
4. Write-back path (race-free — supabase RPC, NOT proxy read-modify-write):
The executor must NOT read-modify-write the whole settings_device.data blob (it races a concurrent
dashboard edit and can't flip a nested array element via a top-level || merge). Instead it calls a
SECURITY DEFINER RPC, service-role only:

    record_block_rule_run(
      p_device_uuid uuid,    -- device the cap is counted for
      p_rule_id     text,    -- BlockRule.id
      p_scope       text,    -- 'device' | 'global' (which auto-disable to do)
      p_runs        int,     -- new run count; stored as greatest(stored, p_runs)
      p_disable     boolean  -- true on the cap-reaching transition
    ) returns void

Effect: upsert the device's settings_device row; set
data.blockRuleRuns[p_rule_id] = greatest(stored, p_runs); and when p_disable: if p_scope='device'
flip data.blockRules[id==p_rule_id].enabled=false, else add p_rule_id to data.blockRuleDisabled[].
The settings_device write fires the existing settings_device:all Realtime broadcast (contract #4),
so the dashboard reflects the auto-disable live with no extra wire hop.
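
To make the effect concrete, here is a pure in-memory model of what the RPC does to one device's data blob. It is a sketch for reasoning about mirrors and tests, not the SQL function itself; the name applyRunRecord and the RunRecord shape are hypothetical, and the real write happens inside the database.

```typescript
type Rule = { id: string; enabled: boolean };
type DeviceData = {
  blockRules?: Rule[];
  blockRuleRuns?: Record<string, number>;
  blockRuleDisabled?: string[];
};
// Mirrors the RPC parameters (minus the device uuid, since we model one row).
type RunRecord = { ruleId: string; scope: "device" | "global"; runs: number; disable: boolean };

// Returns a new blob; the input is left untouched.
function applyRunRecord(data: DeviceData, rec: RunRecord): DeviceData {
  const stored = data.blockRuleRuns?.[rec.ruleId] ?? 0;
  const next: DeviceData = {
    ...data,
    // greatest(stored, p_runs): a late or stale write never lowers the count
    blockRuleRuns: { ...data.blockRuleRuns, [rec.ruleId]: Math.max(stored, rec.runs) },
  };
  if (!rec.disable) return next;

  if (rec.scope === "device") {
    // flip only the matching device rule
    next.blockRules = (data.blockRules ?? []).map((r) =>
      r.id === rec.ruleId ? { ...r, enabled: false } : r,
    );
  } else {
    // set semantics: adding the same id twice leaves one entry
    const disabled = new Set<string>(data.blockRuleDisabled ?? []);
    disabled.add(rec.ruleId);
    next.blockRuleDisabled = Array.from(disabled);
  }
  return next;
}
```

The greatest(...) step is what makes a repeated or out-of-order call harmless: replaying the same record leaves the blob unchanged.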
5. Enforcement point (every executor — proxy/cdp/iOS — at the SAME chokepoint):
- Build a per-device suppress predicate isSuppressed(ruleId) = "is this rule's cap already reached for this device?" (in-memory disabled set, seeded from persisted blockRuleDisabled + device-rule enabled=false). Pass it to firstBlockMatch(method, url, rules, isSuppressed) so a spent rule is skipped and a later matching block/mock/drop rule takes over.
- On the matched rule, before firing, call consume(deviceUuid, rule) → { allowed, reachedCap }. allowed=false ⇒ do not fire (cap reached; fall through to forward). reachedCap=true ⇒ fire this FINAL action, then the gate auto-disables (persist + suppress).
- Unbounded rules (maxRuns===undefined) bypass the counter entirely (zero overhead, today's path). Unattributed traffic (no device UUID) is fail-open: allowed, not counted (nowhere to persist).
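
The chokepoint described above can be sketched as an in-memory gate plus a matcher that takes the suppress predicate. This is an illustration under assumptions, not runCounter.ts: the class name RunGate, its seed method, and firstMatchSketch are hypothetical, persistence is omitted, and the matcher uses a plain substring test in place of the real pattern matching.

```typescript
type Rule = { id: string; enabled: boolean; method?: string; pattern: string; maxRuns?: number };
type Verdict = { allowed: boolean; reachedCap: boolean };

class RunGate {
  private runs = new Map<string, number>(); // key: deviceUuid|ruleId
  private disabled = new Set<string>();     // spent rules, same key

  // Seed from persisted blockRuleDisabled / device-rule enabled=false.
  seed(deviceUuid: string, spentRuleIds: string[]): void {
    for (const id of spentRuleIds) this.disabled.add(`${deviceUuid}|${id}`);
  }

  isSuppressed(deviceUuid: string, ruleId: string): boolean {
    return this.disabled.has(`${deviceUuid}|${ruleId}`);
  }

  // Consume-before-fire: decide, count, and mark spent in one step.
  consume(deviceUuid: string | undefined, rule: Rule): Verdict {
    // Unbounded rules and unattributed traffic bypass the counter (fail-open).
    if (rule.maxRuns === undefined || deviceUuid === undefined) {
      return { allowed: true, reachedCap: false };
    }
    const key = `${deviceUuid}|${rule.id}`;
    const used = this.runs.get(key) ?? 0;
    if (used >= rule.maxRuns) return { allowed: false, reachedCap: false };
    const now = used + 1;
    this.runs.set(key, now);
    const reachedCap = now >= rule.maxRuns;
    if (reachedCap) this.disabled.add(key); // suppress from the next request on
    return { allowed: true, reachedCap };
  }
}

// Stand-in for firstBlockMatch: substring match is an assumption.
function firstMatchSketch(
  method: string,
  url: string,
  rules: Rule[],
  isSuppressed: (ruleId: string) => boolean,
): Rule | undefined {
  return rules.find(
    (r) =>
      r.enabled &&
      !isSuppressed(r.id) &&
      (r.method === undefined || r.method === method) &&
      url.indexOf(r.pattern) !== -1,
  );
}
```

With a one-shot rule listed ahead of an unbounded rule on the same pattern, the first request fires the one-shot with reachedCap=true, and the second request skips it and lands on the unbounded rule.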
6. Durability + perf:
- Counter is in-memory, persisted (a) immediately on the auto-disable transition (the durability-critical write) and (b) on a debounce for high-N intermediate counts (best-effort).
- The one-shot (maxRuns=1) case is exactly one write (fire + disable atomically) and is fully restart-durable: a DEVICE one-shot via its persisted enabled=false, a GLOBAL one-shot via the persisted blockRuleDisabled[] re-seeded into the in-memory disabled set at device prime / first-CONNECT (ensureDeviceSettings awaits the seed). Trade-off: for a high-N cap, a crash between debounce flushes loses at most the un-flushed delta (the rule may fire a few extra times after a restart before re-reaching the cap); the security-relevant guarantee, "a one-shot fires once", is exact.
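
The two persistence paths can be sketched as a small write policy: write immediately on the cap-reaching transition, and coalesce intermediate counts until a flush. This is a minimal model under assumptions, not the proxy's implementation: RunPersister and its Write shape are hypothetical, and flush() stands in for whatever timer drives the real debounce.

```typescript
type Write = { ruleId: string; runs: number; disable: boolean };

class RunPersister {
  private pending = new Map<string, number>(); // ruleId -> latest unflushed count
  private write: (w: Write) => void;

  constructor(write: (w: Write) => void) {
    this.write = write; // e.g. a wrapper around the record_block_rule_run call
  }

  record(ruleId: string, runs: number, reachedCap: boolean): void {
    if (reachedCap) {
      // Durability-critical: persist count + disable in one immediate write.
      this.pending.delete(ruleId);
      this.write({ ruleId, runs, disable: true });
      return;
    }
    // Best-effort: keep only the latest count until the next flush.
    this.pending.set(ruleId, runs);
  }

  // Called by the debounce timer in a real executor (timer omitted here).
  flush(): void {
    this.pending.forEach((runs, ruleId) => this.write({ ruleId, runs, disable: false }));
    this.pending.clear();
  }
}
```

In this model a one-shot produces exactly one write, while a crash before flush() loses only the counts still sitting in pending, which matches the trade-off described above.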
7. MCP / dashboard: set_block_rules_global / set_block_rules_device accept maxRuns on each
rule (tolerant parse, contract #3 parity). The dashboard block-rule editor exposes a "max runs" field
and renders the auto-disabled state (a device-rule enabled=false or a blockRuleDisabled membership)
distinctly from an operator-disabled rule.
Shipping a feature — the no-gap checklist
When you implement a user-visible feature, update the whole chain in the same change. Items 3–6 are the 5-surface parity rule — a dashboard capability must align across MCP + REST + WS + BusyBro:
- Implement in the owning component (dashboard / proxy-server / iOS / cdp / supabase).
- Schema? Add a migration in supabase/migrations/; update every reader's types (contract 5).
- MCP (surface 2): add/extend the tool in supabase/functions/_shared/mcpRegistry.ts so MCP keeps full parity (contract 3); update the documented tool count in README.md + CLAUDE.md.
- REST (surface 3): if the feature added a table/view/RPC/Edge-Function, document it in the /rest explorer manifest (web/dashboard/app/(settings)/rest/manifest.ts).
- WS (surface 4): if the feature added a Realtime channel/event the dashboard consumes, document it in the /ws explorer registry (web/dashboard/app/(settings)/ws/channels.ts); the header count auto-derives from CHANNELS.length.
- BusyBro (surface 5): confirm the new MCP tool flows to the brain (it consumes the same registry) and is not wrongly denylisted/curated-out in supabase/functions/_shared/busybroDispatch.ts (the shared REGISTRY_DENYLIST the brain imports); gating/confirm/ownership correct. The MINIMAL intentional exclusions are only raw power primitives (db_*/realtime_broadcast/list_tables), curated-superseded raw reads/writes (the brain owns a safer impl), and brain-native personal memory; these are documented, not gaps. Authoring/governance/settings tools are now exposed at their existing gates.
- Docs: add/update the docs/ page (+ _meta.json) (contract 6).
- Test (TEST-1): add/update the tester artifact: a SUITES[] suite in scripts/test-system.mjs + a notes/test-coverage-map.md row + the relevant tester/skills/test-<section>/SKILL.md entry; assert the failure/denied path too. A new MCP tool ALSO regenerates the golden (node scripts/mcp-golden.mjs --update, contract 13). node scripts/coverage-audit.mjs --strict must pass.
- Version: bump only the changed component's build in version.json + its package.json (contract 2).
- README: banner (●/○) + Component status table.
- Notes: write notes/next-ship.md before node scripts/notify-telegram.mjs ship --component=<x>.
- Audit: node scripts/sync-audit.mjs + node scripts/test-system.mjs contract must pass (exit 0). Then deploy.
Running the audit

    node scripts/sync-audit.mjs            # full report, exit 1 on hard drift (CI-friendly)
    node scripts/sync-audit.mjs --quiet    # prints only on drift (hooks)
    node scripts/sync-audit.mjs --json     # machine-readable
    # The tester's contract-tier drift detectors (TEST-1, owned by the tester component):
    node scripts/test-system.mjs contract       # sync-audit + content-parity + mcp-golden + source-enum-mirror + contracts-consistency + 5-surface-parity
    node scripts/coverage-audit.mjs --strict    # every covered/partial row maps to a real, registered suite

Add a new machine-checkable contract here and in scripts/sync-audit.mjs (or a dedicated scripts/*.mjs detector registered in test-system.mjs's contract tier) together.