
Architecture Decision Log (ADR)

This document records the major architectural decisions made during the development of GrantMaster.

1. Multi-Tenancy: Logical Isolation vs. Physical Isolation

  • Status: Accepted
  • Decision: Use Logical Isolation (shared collections, organizationId filter).
  • Rationale:
    • Firestore scaling handles large collections efficiently.
    • Easier to implement platform-wide analytics and cross-tenant support (impersonation).
    • Keeps operational costs lower by avoiding per-tenant database instances.
  • Consequence: Strictly requires organizationId checks at the Service and Security Rule layers.
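
The service-layer half of this requirement can be sketched as follows. This is an illustrative stand-in, not the actual dataService.ts API — the function names and shapes are hypothetical:

```typescript
// Hypothetical sketch of the service-layer tenant guard implied by this ADR.
interface TenantScoped {
  organizationId: string;
}

// Refuse to return a document that belongs to another tenant, even when
// the caller holds a valid document id from a different organization.
function assertTenant<T extends TenantScoped>(doc: T, organizationId: string): T {
  if (doc.organizationId !== organizationId) {
    throw new Error(`Cross-tenant access denied for org ${organizationId}`);
  }
  return doc;
}

// Logical isolation: one shared collection, filtered per tenant on every read.
function listForTenant<T extends TenantScoped>(all: T[], organizationId: string): T[] {
  return all.filter((d) => d.organizationId === organizationId);
}
```

The same predicate must be mirrored in Firestore Security Rules, since client-side filtering alone is not a security boundary.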

2. Infrastructure: Firebase as Primary Backend

  • Status: Accepted
  • Decision: Use Firebase (Auth, Firestore, Cloud Functions).
  • Rationale:
    • Rapid development velocity.
    • Real-time capabilities (onSnapshot) are native.
    • Generous free tier for NGOs in early stages.
  • Consequence: Vendor lock-in; requires careful abstraction in dataService.ts to allow for potential future migration.

3. Communication: In-Memory EventBus

  • Status: Accepted
  • Decision: Use an in-memory EventBus for inter-module communication.
  • Rationale:
    • Avoids complexity of a message broker (RabbitMQ/PubSub) for internal sync.
    • Immediate UI consistency.
    • Persistence only where needed (Audit Logs).
  • Consequence: Listeners are synchronous; long-running tasks must be offloaded to Cloud Functions or background fetches.
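
A minimal sketch of the synchronous in-memory pattern (the shape is illustrative; the production EventBus API may differ):

```typescript
// Minimal synchronous in-memory EventBus sketch, assuming a topic-string API.
type Handler<T> = (payload: T) => void;

class EventBus {
  private handlers = new Map<string, Handler<unknown>[]>();

  // Returns an unsubscribe function, mirroring the onSnapshot convention.
  subscribe<T>(topic: string, handler: Handler<T>): () => void {
    const list = this.handlers.get(topic) ?? [];
    list.push(handler as Handler<unknown>);
    this.handlers.set(topic, list);
    return () => {
      const current = this.handlers.get(topic) ?? [];
      this.handlers.set(topic, current.filter((h) => h !== handler));
    };
  }

  // Listeners run synchronously in the caller's frame — hence the rule that
  // long-running work must be offloaded rather than done in a handler.
  emit<T>(topic: string, payload: T): void {
    for (const h of this.handlers.get(topic) ?? []) h(payload);
  }
}
```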

4. Frontend: React Context vs. Redux/Zustand

  • Status: Accepted
  • Decision: Use React Context + Feature Providers.
  • Rationale:
    • Native to React; no additional library weight.
    • Data is naturally scoped to sections of the tree.
    • Encourages a “Domain-Driven” frontend structure.
  • Consequence: Careful management needed to avoid unnecessary re-renders in large lists.

5. Security: Auditor Access Grants

  • Status: Accepted
  • Decision: Implement temporary, scope-limited access tokens for auditors.
  • Rationale:
    • Compliance requirement: Auditors should not have full Admin credentials.
    • Allows for time-boxed reviews of organizational data.
  • Consequence: Requires a dedicated AuditorContext and complex security rules that check for “Grant” active windows.
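
The active-window check can be illustrated with a small sketch. Field names here are hypothetical, not the real security-rule schema:

```typescript
// Illustrative time-boxed auditor access grant; the real document shape
// checked by AuditorContext and the security rules may differ.
interface AuditorGrant {
  auditorId: string;
  organizationId: string;
  scopes: string[];   // e.g. ['reports:read', 'expenses:read']
  validFrom: number;  // epoch ms
  validUntil: number; // epoch ms
}

function isGrantActive(grant: AuditorGrant, now: number): boolean {
  return now >= grant.validFrom && now <= grant.validUntil;
}

// Access requires an active window, a matching tenant, and the right scope.
function canAudit(
  grant: AuditorGrant,
  scope: string,
  organizationId: string,
  now: number,
): boolean {
  return (
    isGrantActive(grant, now) &&
    grant.organizationId === organizationId &&
    grant.scopes.includes(scope)
  );
}
```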

6. Billing: Stripe Payment Service Migration

  • Status: Accepted
  • Decision: Replace monolithic stripeService.ts with stripePaymentService.ts backed by Firebase Cloud Functions.
  • Rationale:
    • All sensitive Stripe operations (secret keys, subscription mutations) must run server-side.
    • Cloud Functions enforce authentication and authorization before touching Stripe.
    • Frontend only handles the publishable key for Stripe Elements.
  • Consequence: Frontend calls httpsCallable wrappers; no direct Stripe SDK usage in the browser.

7. Billing: Credit & Entitlement System

  • Status: Accepted
  • Decision: Introduce a credit-based metering system (creditService.ts) with a reservation/consume/release lifecycle.
  • Rationale:
    • AI agent runs are expensive; per-call billing requires atomic credit accounting.
    • Firestore transactions prevent concurrent agent runs from overdrawing the credit balance.
    • Credit packs can be purchased as one-time top-ups alongside the subscription.
  • Consequence: AgentExecutionService must reserve credits before a run and release unused credits on completion or failure. Entitlements are gated via src/config/entitlements.ts.
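
The reserve/consume/release lifecycle can be modeled in memory as follows. This is a sketch of the accounting logic only — the real creditService.ts wraps these invariants in Firestore transactions to stay atomic under concurrency:

```typescript
// In-memory model of the credit lifecycle described above (illustrative).
interface CreditAccount {
  balance: number;  // purchased + subscription credits
  reserved: number; // held for in-flight agent runs
}

// Reserve before a run: the available amount is balance minus outstanding holds.
function reserve(acc: CreditAccount, amount: number): CreditAccount {
  if (acc.balance - acc.reserved < amount) throw new Error('Insufficient credits');
  return { ...acc, reserved: acc.reserved + amount };
}

// Settle after a run: consume the actual cost, release the unused remainder.
function settle(acc: CreditAccount, reservedAmount: number, consumed: number): CreditAccount {
  if (consumed > reservedAmount) throw new Error('Consumed more than reserved');
  return {
    balance: acc.balance - consumed,
    reserved: acc.reserved - reservedAmount,
  };
}
```

A failed run settles with `consumed: 0`, which releases the full hold back to the balance.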

8. Extensions: Pluggable Feature Architecture

  • Status: Accepted
  • Decision: Build a multi-milestone extension system with a stable public API facade, contribution registry, settings panels, lifecycle hooks, dependency validation, data migrations, observability, and DX scaffolding.
  • Rationale:
    • Features like Grant Calendar or Impact Tracking can be independently enabled/disabled per tenant.
    • Marketplace model allows pricing and trial periods per extension.
    • Clean separation from core via ExtensionAPI facade prevents internal coupling.
  • Consequence: Extensions register contributions (routes, sidebar items, widgets) through the contribution registry. Dependency graph validation prevents circular or missing dependencies.

9. AI Agents: Execution Architecture

  • Status: Accepted
  • Decision: Implement autonomous AI agents with a step-based execution model, tool registry, quota enforcement, and human-in-the-loop escalation.
  • Rationale:
    • Agents need a bounded execution model (max steps, credit budgets) for cost control.
    • Tool registry (AgentToolRegistry) restricts each agent to its declared allowed tools.
    • Escalation pattern (awaiting_human status) lets agents pause for human approval on sensitive actions.
  • Consequence: Agent runs follow a state machine (queued → running → paused/awaiting_human → completed/failed/cancelled). Every tool execution is scoped to the triggering user’s RBAC permissions.
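
The state machine above can be sketched as an explicit transition table. The exact set of legal edges is illustrative — only the statuses themselves come from this ADR:

```typescript
// Sketch of the agent run state machine; the transition table is an
// assumption consistent with the statuses listed in this ADR.
type RunStatus =
  | 'queued' | 'running' | 'paused'
  | 'awaiting_human' | 'completed' | 'failed' | 'cancelled';

const TRANSITIONS: Record<RunStatus, RunStatus[]> = {
  queued: ['running', 'cancelled'],
  running: ['paused', 'awaiting_human', 'completed', 'failed', 'cancelled'],
  paused: ['running', 'cancelled'],
  awaiting_human: ['running', 'cancelled'], // human approves → resume
  completed: [],  // terminal
  failed: [],     // terminal
  cancelled: [],  // terminal
};

function transition(from: RunStatus, to: RunStatus): RunStatus {
  if (!TRANSITIONS[from].includes(to)) {
    throw new Error(`Illegal transition ${from} -> ${to}`);
  }
  return to;
}
```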

10. Frontend: Tailwind CSS 4 with OKLCH Design Tokens

  • Status: Accepted
  • Decision: Use Tailwind CSS 4 with @theme block and OKLCH color functions for the design token system.
  • Rationale:
    • OKLCH produces perceptually uniform color scales (consistent perceived lightness across hues).
    • Tailwind 4 @theme replaces tailwind.config.js for token definitions.
    • primary-* tokens decouple brand color from hardcoded blue-* classes.
  • Consequence: All color references must use semantic tokens (primary-*, surface-*). Direct blue-*, gray-*, red-*, green-* Tailwind classes are prohibited (use primary-*, slate-*, rose-*, emerald-*).
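
As a hedged illustration, a Tailwind 4 @theme block defining such tokens might look like this — the token names follow the primary-*/surface-* convention above, but the OKLCH values are invented for the example:

```css
/* Illustrative @theme token definitions; actual design-system values differ. */
@theme {
  --color-primary-500: oklch(0.55 0.15 250);
  --color-primary-600: oklch(0.48 0.15 250);
  --color-surface-100: oklch(0.97 0.01 250);
  /* Perceptually uniform scales: hold L and C constant, vary only hue. */
  --color-accent-500: oklch(0.55 0.15 140);
}
```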

11. Module Boundaries: Feature Public API Enforcement

  • Status: Accepted
  • Decision: Enforce a feature boundary rule where cross-feature imports must use the feature public API (@/features/<feature> or @/features/<feature>/index), not internal feature files.
  • Rationale:
    • Reduces accidental tight coupling across feature implementations.
    • Makes each feature easier to refactor internally without cascading breakages.
    • Encourages explicit public surface design at feature boundaries.
  • Consequence: CI runs check:feature-public-api with a no-regression baseline (config/feature-public-api-violations-baseline.json) and publishes a report artifact (artifacts/feature-public-api-report.json).

12. Code Stewardship: Layer Ownership Coverage

  • Status: Accepted
  • Decision: Require explicit ownership mapping for all source files under src/ via config/layer-ownership.json.
  • Rationale:
    • Clarifies accountability for architectural layers.
    • Prevents unowned code paths from accumulating hidden maintenance risk.
    • Supports review routing and faster incident response.
  • Consequence: CI runs check:layer-ownership and fails on unmapped files. A machine-readable report is published at artifacts/layer-ownership-report.json.

13. EventBus Migration: Superadmin → Platform Feature

  • Status: Accepted
  • Decision: Move the EventBus monitoring page from src/features/superadmin/ to src/features/platform/eventbus/ and simplify it to a 2-tab layout (Stream + Dead Letter).
  • Rationale:
    • The EventBus page serves as operational infrastructure monitoring, not tenant-specific admin — it belongs in platform/.
    • The old 4-tab layout (Overview, Events, Topics, Config) was backed by mock data from PlatformConsole seeds and provided no real operational value.
    • Replacing mock-backed tabs with Firestore-backed Stream and Dead Letter tabs provides actionable monitoring.
    • The Dead Letter tab now reads from the eventDlq collection and supports Cloud Function replay.
  • Consequence: Old superadmin EventBus components, seeds, and routes were deleted. The useEventBus and useDeadLetterQueue hooks now live under platform/eventbus/hooks/.

14. Procurement P2P: Server-Side tRPC Architecture

  • Status: Accepted
  • Decision: Implement the Procurement (Procure-to-Pay) feature as a server-side tRPC router with ServerProcurementService in Cloud Functions, rather than using client-side Firestore services.
  • Rationale:
    • Procurement involves multi-step approval workflows and vendor qualification — these benefit from server-side validation and authorization.
    • Zod schemas in packages/domain-schema/src/trpc/procurement.ts enforce input validation at the API boundary.
    • Aligns with the established pattern for new features (expenses and journals also gained server-side services in this iteration).
  • Consequence: Frontend consumes procurement data via tRPC hooks. Domain schemas are shared between the frontend and Cloud Functions via the domain-schema package.

15. Integration OAuth: Server-Side Token Delegation

  • Status: Accepted
  • Decision: Move OAuth token exchange and refresh for third-party integrations (Google Calendar, HubSpot) to Cloud Functions. Raw tokens are stored in Firestore and never returned to the client.
  • Rationale:
    • Client-side OAuth token handling exposes client secrets in the browser bundle.
    • Server-side delegation keeps secrets in environment variables and tokens in Firestore integrationConfigs/ subcollections.
    • The pattern is consistent with the existing Stripe payment service architecture (ADR #6).
  • Consequence: Frontend calls exchangeGoogleCalendarCode, refreshGoogleCalendarToken, and refreshHubSpotToken Cloud Functions. Environment variables GOOGLE_CLIENT_ID, GOOGLE_CLIENT_SECRET, HUBSPOT_CLIENT_ID, HUBSPOT_CLIENT_SECRET must be configured on the server.

16. Skeleton Loading States: AppShell Skeleton System

  • Status: Accepted
  • Decision: Introduce a set of skeleton loading components (AppShellSkeleton, DashboardSkeleton, TablePageSkeleton, FormPageSkeleton, WorkspaceSkeleton, ApprovalsSkeleton) for use as React.Suspense fallbacks in route-level code splitting.
  • Rationale:
    • Default spinner fallbacks cause layout shift when chunked route components load.
    • Purpose-built skeletons match the layout of common page patterns (dashboard grids, data tables, form pages) and provide instant perceived responsiveness.
  • Consequence: Route lazy() boundaries should use the appropriate skeleton as the Suspense fallback. Skeletons live in src/components/ui/skeletons/ with the shell wrapper in src/components/ui/AppShellSkeleton.tsx.

17. Grant→Project Auto-Linking on Conversion

  • Status: Accepted (2026-04-14)
  • Decision: When a pipeline entry is converted to an ActiveGrant via ServerActiveGrantServiceClass.convert(), automatically create a linked Project document and bidirectionally link the two records.
  • Rationale:
    • Previously, convertToActiveGrant() created only an ActiveGrant document; no project was spawned. This left won grants as data islands, disconnecting the entire downstream MVP flow (expenses, journals, compliance, reports) from the grant origin.
    • The Project schema already carried grantId/activeGrantId fields but they were never populated automatically.
    • Operational delivery (tasks, budget, expenses, journals, compliance) is scoped by projectId, so a missing project meant the grant had no operational hooks.
  • Consequence: ServerActiveGrantServiceClass.convert() (in functions/src/api/services/ServerGrantService.ts) now:
    1. Creates the ActiveGrant document.
    2. Looks up the pipeline entry to derive the project name and grant title.
    3. Calls serverProjectService.create(...) with grantId, managerId, startDate/endDate, and budget derived from the award.
    4. Back-links the project’s id onto the ActiveGrant as projectId.
    5. Emits GRANT_WON with { grantId, pipelineId, projectId } in the payload so downstream subscribers receive the project link.
  • Client impact: useGrants.convertToActiveGrant invalidates utils.projects.list and utils.projects.stats on success so the new project appears immediately. Local-workspace (demo) mode displays a toast noting the linked project was created.
  • Follow-up: Budget cascade from grant-approved line items to project budget lines is not yet wired; see docs/planning/ for the work-tracking entry.
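
The five conversion steps above can be sketched with in-memory stand-ins for the server services (names and shapes hypothetical, not the actual ServerActiveGrantServiceClass signature):

```typescript
// In-memory sketch of the convert() flow described in this ADR.
interface PipelineEntry { id: string; title: string; award: number; managerId: string; }
interface ActiveGrant { id: string; pipelineId: string; projectId?: string; }
interface Project { id: string; grantId: string; budget: number; }

type Emit = (event: string, payload: unknown) => void;

function convert(
  entry: PipelineEntry,
  createGrant: (pipelineId: string) => ActiveGrant,
  createProject: (grantId: string, name: string, budget: number) => Project,
  emit: Emit,
): { grant: ActiveGrant; project: Project } {
  const grant = createGrant(entry.id);                                // 1. create ActiveGrant
  const project = createProject(grant.id, entry.title, entry.award);  // 2–3. derive name/budget, create project
  grant.projectId = project.id;                                       // 4. back-link projectId onto the grant
  emit('GRANT_WON', { grantId: grant.id, pipelineId: entry.id, projectId: project.id }); // 5. notify subscribers
  return { grant, project };
}
```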

18. Grant-Aware Compliance Policy Auto-Attach

  • Status: Accepted (2026-04-14)
  • Decision: When a grant is converted to active (ADR #17), the server also auto-provisions draft compliance policies derived from the grant opportunity’s complianceRequirements and grantorType.
  • Rationale:
    • Funder-specific compliance rules (e.g. EU procurement thresholds, USAID branding requirements) should flow from the grant context into the organization’s compliance surface without manual re-entry.
    • Seeding policies as status: 'draft' with adoptionSource: 'grant_conversion' gives compliance officers visibility and a reviewable starting point rather than silent auto-enforcement.
  • Consequence: ServerActiveGrantServiceClass.convert() reads the source opportunity from grantOpportunities/{id}, extracts string-based complianceRequirements plus any overrides from grantDetails, and fans out serverCompliancePolicyService.create(...) calls with:
    • name (truncated to 80 chars), description, category: 'Grant Compliance'
    • severity: 'high', status: 'draft', frequency: 'ongoing'
    • projectIds: [<newProjectId>], adoptionSource: 'grant_conversion', grantId, grantorType
    • Failures are logged but non-fatal to the grant conversion (Promise.allSettled).
  • Follow-up: Full integration with GrantorComplianceService.getRecommendedRules() (platform rule recommendations by donor type) is a future enhancement; current implementation seeds from grant-specific requirements only.

19. Report Narrative Enrichment with Financial Context

  • Status: Accepted (2026-04-14)
  • Decision: generateReportNarrative() (in src/features/ai/services/geminiForecast.ts) accepts an optional ReportFinancialContext parameter that summarizes expenses and journal hours for the reporting period.
  • Rationale:
    • Prior to this change, report generation received only projects[] and user. The AI produced generic, data-starved narratives with no real expenditure or effort figures.
    • Funders expect concrete numbers (total spend, budget utilization %, hours by activity). Providing a structured summary into the prompt lets the AI cite specific figures rather than hallucinate.
  • Consequence:
    • New exported type ReportFinancialContext encapsulates totalExpenses, expensesByCategory, currency, budgetUtilization, totalJournalHours, journalHoursByProject.
    • useReportGeneration({ projects, user, financialContext }) forwards the context to the AI service.
    • The Reports container (src/features/reports/components/Reports.tsx) composes the context from useExpenses() and useJournals() via useMemo and passes it to the hook.
    • Dev-mode fallback narrative (when no Gemini API key is present) also emits a financial overview section when financialContext is populated.
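
The composition step in the Reports container can be sketched as a pure function. This is illustrative — the real ReportFinancialContext also carries a currency field, and the exact aggregation in Reports.tsx may differ:

```typescript
// Sketch of composing a financial context from expenses and journal entries.
interface Expense { category: string; amount: number; }
interface JournalEntry { projectId: string; hours: number; }

interface FinancialContext {
  totalExpenses: number;
  expensesByCategory: Record<string, number>;
  totalJournalHours: number;
  journalHoursByProject: Record<string, number>;
  budgetUtilization: number; // fraction of total budget spent, 0–1
}

function composeFinancialContext(
  expenses: Expense[],
  journals: JournalEntry[],
  totalBudget: number,
): FinancialContext {
  const expensesByCategory: Record<string, number> = {};
  let totalExpenses = 0;
  for (const e of expenses) {
    expensesByCategory[e.category] = (expensesByCategory[e.category] ?? 0) + e.amount;
    totalExpenses += e.amount;
  }
  const journalHoursByProject: Record<string, number> = {};
  let totalJournalHours = 0;
  for (const j of journals) {
    journalHoursByProject[j.projectId] = (journalHoursByProject[j.projectId] ?? 0) + j.hours;
    totalJournalHours += j.hours;
  }
  return {
    totalExpenses,
    expensesByCategory,
    totalJournalHours,
    journalHoursByProject,
    budgetUtilization: totalBudget > 0 ? totalExpenses / totalBudget : 0,
  };
}
```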

20. Partnership Routes and Mission Routes Build-Gated for Launch

  • Status: Superseded by ADR #22 (2026-04-15)
  • Decision: Set BUILD_ENABLE_PARTNERSHIP_ROUTES = false in src/config/launchScope.ts. The single flag gated both partnershipRoutes and missionRoutes in src/routes/config/grantDomainRoutes.tsx.
  • Rationale:
    • Partnerships were already enumerated in DISABLED_ROUTE_PATHS, creating a contradictory state where the route bundle was emitted but the paths were filtered at the router. Setting the build flag to false removes dead code from the production bundle.
    • Mission, invitations, and partner detail pages are not part of the core MVP operational flow (discover → pipeline → win → deliver → report).
  • Consequence:
    • partnershipRoutes and missionRoutes resolved to empty arrays at build time; their lazy imports tree-shook out.
    • PlatformNavItemDef gained an optional launchCheck: () => boolean field so the platform sidebar can hide the Partnerships item at runtime.
    • The tenant sidebar Mission nav item carried launchCheck: () => BUILD_ENABLE_PARTNERSHIP_ROUTES and was filtered out by useNavResolver.
    • routeConfig.test.ts was updated to assert Portal/Stakeholders routes remain while Mission is gated.

21. Route-Level Error Boundaries on Suspense-Wrapped Tab Content

  • Status: Accepted (2026-04-14)
  • Decision: Wrap Suspense boundaries that load tab/section content with RouteErrorBoundary in DashboardTabs.tsx and ComplianceWorkspace.tsx.
  • Rationale:
    • Without an error boundary, a failed React.lazy chunk load or a runtime error inside the loaded component causes a blank white screen for the whole dashboard/compliance workspace.
    • RouteErrorBoundary already existed for page-level route boundaries (withRouteBoundary). Re-using it at the nested Suspense level isolates errors to the active tab/section.
  • Consequence:
    • DashboardTabs wraps KeepAliveTabPanels in <RouteErrorBoundary isolationLevel="route">. A crashed widget tab shows the error UI inline while the rest of the dashboard (KPIs, header, nav) remains interactive.
    • ComplianceWorkspace wraps the section Suspense the same way — a failed AlertsSection or TrendsSection load no longer takes down the PageTabs header.

22. Decouple Partnerships from Mission Behind Independent Launch Flags

  • Status: Accepted (2026-04-15)
  • Decision: Split the shared BUILD_ENABLE_PARTNERSHIP_ROUTES gate into two flags in src/config/launchScope.ts:
    • BUILD_ENABLE_PARTNERSHIP_ROUTES = true — governs partnershipRoutes and the platform sidebar Partnerships item.
    • BUILD_ENABLE_MISSION_ROUTE = false — governs missionRoutes (mission + invitations) and the tenant sidebar Mission item.
  • Rationale:
    • The Partnerships platform workspace was wanted for launch, but Mission/invitations are still not part of MVP scope. Gating both under one flag forced an all-or-nothing choice.
    • Keeping the gates independent lets each surface ship when its content is ready without ceremony around edits to the shared flag or the DISABLED_ROUTE_PATHS set.
  • Consequence:
    • partnershipRoutes now emits its lazy bundles; missionRoutes still resolves to [] and tree-shakes out.
    • Platform sidebar Partnerships nav item becomes visible; tenant sidebar Mission remains filtered out by useNavResolver via its new launchCheck: () => BUILD_ENABLE_MISSION_ROUTE.
    • partnerships and partnerships/:partnerId removed from DISABLED_ROUTE_PATHS — the build flag is the single source of truth now.
    • LAUNCH_BUILD_CONTROLS gains a mission-route entry alongside partnership-routes.
    • routeConfig.test.ts was updated to assert Mission/invitations are absent while the rest of the Impact surface (Portal, Stakeholders) remains.
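
The launchCheck filtering behavior described above can be sketched as follows — a simplified model of how useNavResolver might drop gated items (the real nav item type has more fields):

```typescript
// Sketch of launchCheck-based nav filtering; illustrative, not the
// actual useNavResolver implementation.
interface NavItemDef {
  label: string;
  path: string;
  launchCheck?: () => boolean; // optional gate; absent means always visible
}

function resolveNavItems(items: NavItemDef[]): NavItemDef[] {
  return items.filter((item) => item.launchCheck?.() ?? true);
}
```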

23. Repository Pattern for Persistence (IFirestoreRepository<T>)

  • Status: Accepted (2026-04-18)
  • Decision: Services do not import from firebase/firestore (client) or firebase-admin/firestore (server) directly. Persistence is routed through IFirestoreRepository<T> from @/core/repository (client) and ./core/repository (server). The interface is symmetric across both runtimes.
  • Rationale:
    • The pre-existing dual-path pattern on BaseService (optional IFirestoreClient) left a ~1,400-site raw-SDK surface area in feature services; BaseService’s audit / quota / validation guarantees only fired when subclasses remembered to call them.
    • A narrow, typed persistence interface lets the repository enforce tenant scoping (organizationId required on reads), schema validation (Zod), and policy (cache invalidation, error taxonomy) in one place rather than 1,400.
    • Client/server symmetry means service migration patterns are portable — a migration on the client (e.g. grantDataAccess.ts) applies almost verbatim to the equivalent server handler.
  • Consequence:
    • Two baselines enforce the rule: check:no-raw-firestore-in-services (client) and check:no-raw-firestore-in-functions (server). New violations fail CI; migrations monotonically shrink the baselines.
    • API surface (both runtimes): getById, getByIdUnscoped, list, listByIds, listAcrossTenants, paginate, paginateAcrossTenants, create, update, setMerge, delete, batchUpdate, cursorById (admin), plus options transform (pre-validation hook) and skipTenantCheck (explicit opt-out).
    • Client-only: stream(options) returns an Unsubscribe and routes through listenerManager for automatic listener pooling (30–50% fewer onSnapshot calls when the same query fans out to multiple components).
    • Cross-repo coordination helpers: fallbackGetByIdUnscoped / fallbackList (for COLLECTION_FALLBACKS read order), createBatchWriter (cross-collection atomic writes), runInTransaction (admin; read-then-write atomic correctness).
    • Shared taxonomy: @grantmaster/shared/errors (AppError hierarchy) and @grantmaster/shared/schemas (passthrough schemas for server-side collections) ensure both runtimes throw/catch the same classes and parse the same shapes.

When to use which repo method

  • Single doc, tenant-scoped → getById(id, organizationId)
  • Single doc, cross-tenant or subcollection → getByIdUnscoped(id)
  • List, tenant-scoped → list({ organizationId, where?, orderBy?, limit? })
  • List, cross-tenant / platform → listAcrossTenants({ where?, orderBy?, limit? })
  • Cursor pagination, tenant-scoped → paginate({ organizationId, cursor, pageSize })
  • Cursor pagination, platform → paginateAcrossTenants({ cursor, pageSize })
  • Bulk by id list → listByIds(ids, { organizationId })
  • Real-time subscription (client) → stream({ organizationId, onNext, onError })
  • Create (auto-id) → create(data)
  • Create (explicit id) → create(data, { id })
  • Update existing → update(id, data, { organizationId? })
  • Upsert (create-or-merge) → setMerge(id, data, { organizationId? })
  • Delete → delete(id, { organizationId? })
  • Bulk same-collection update → batchUpdate(entries)
  • Cross-collection atomic write → createBatchWriter(db).set(repo, id, data).commit()
  • Read-then-write atomic (server) → runInTransaction(db, tx => { ... })
  • Multi-collection fallback read → fallbackGetByIdUnscoped([r1, r2], id) / fallbackList([r1, r2], opts)

Subcollection pattern

Subcollections under a parent id (e.g. grantTracker/<pid>/tasks) are addressed by constructing a per-parent repo on the fly:
function pipelineTasksRepo(pipelineId: string) {
  return new FirestoreRepository<PipelineTask>({
    collectionName: `grantTracker/${pipelineId}/tasks`,
    schema: pipelineTaskSchema,
    getCollectionRef: () => collections.path(`grantTracker`, pipelineId, `tasks`),
    getDocRef: (id) => docs.path(`grantTracker`, pipelineId, `tasks`, id),
  });
}
Subcollection docs typically don’t carry their own organizationId (the parent path is the tenant boundary), so these repos use listAcrossTenants + getByIdUnscoped — tenancy is enforced by the parent lookup.
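
The tenant-scoping guarantee the repository layer centralizes can be modeled in memory. This sketch mirrors a slice of the getById / getByIdUnscoped / list surface; the real IFirestoreRepository<T> wraps Firestore and adds schema validation, caching policy, and the error taxonomy:

```typescript
// In-memory model of the repository's tenant-scoping contract (illustrative).
interface Doc { id: string; organizationId?: string; }

class InMemoryRepository<T extends Doc> {
  private store = new Map<string, T>();

  create(data: T): T {
    this.store.set(data.id, data);
    return data;
  }

  // Tenant-scoped read: a doc belonging to another org behaves as not found.
  getById(id: string, organizationId: string): T | null {
    const doc = this.store.get(id);
    return doc && doc.organizationId === organizationId ? doc : null;
  }

  // Explicit opt-out, mirroring getByIdUnscoped for subcollection repos
  // where the parent path is the tenant boundary.
  getByIdUnscoped(id: string): T | null {
    return this.store.get(id) ?? null;
  }

  list(organizationId: string): T[] {
    return [...this.store.values()].filter((d) => d.organizationId === organizationId);
  }
}
```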

24. Observability: Cloud Trace Disabled at Launch (H0.1.2)

  • Status: Accepted — 2026-04-22, launch posture
  • Decision: Do NOT set OTEL_CLOUD_TRACE_ENABLED=true in production Cloud Functions at launch. Ship with the env var unset. Sentry backend tracesSampleRate=0.2 and frontend tracesSampleRate=0.1 provide the launch-critical observability baseline.
  • Rationale:
    • Cloud Trace implementation already exists in functions/src/core/sentry.ts (installCloudTraceExporter) and is one env-var flip to enable — this is a reversible default, not a permanent exclusion.
    • At launch-scale traffic (3 pilot tenants, ~100 sessions/day) cost is effectively zero either way — under the 2.5M-span/month free tier until ≥100 tenants. Cost is not the gating factor.
    • Real gating concerns are (a) operational complexity — one more sink to monitor, one more “is it healthy?” to check weekly — and (b) an unfinished data-residency audit on the production GCP project region. Shipping a trace pipeline to an unverified region would be a self-inflicted credibility problem for an EU-first NGO tool under GDPR scrutiny.
    • Sentry already covers 90% of the debug surface. Cloud Trace’s marginal value is distributed traces across Cloud Functions + Firestore/Pub-Sub spans — a real benefit for “why is this slow?” investigations, but not launch-blocking.
  • Consequence:
    • Perf debugging for pilots relies on Sentry’s 20% sample. If a specific session isn’t sampled, we can raise the relevant handler’s rate to 1.0 via Sentry.startSpan on demand — documented workaround.
    • No GCP-native alerting on Cloud Trace metrics (e.g. “alert if p95 of sendNotification exceeds 2s”). This gap is accepted for the launch window; Sentry performance thresholds cover it.
    • Revisit trigger: 30 days post-launch, review on the H1 agenda. Flip the flag ON iff we have seen ≥1 perf incident that full-sample traces would have resolved faster. Pre-flip: run gcloud config get-value compute/region --project=<prod> and confirm EU region; if not, set GCP data-residency restrictions before enabling.
    • Decision is reversible in ~5 minutes via firebase functions:config:set or the v2 env-var mechanism. No code change needed — the exporter picks up the flag on init.
  • Related: docs/planning/2026-04-22-h0-1-2-cloud-trace-decision.md (full decision memo with cost model, decision matrix, and revisit criteria).