AI Engine and Prompt Management

GrantMaster leverages a multi-model AI architecture to provide specialized intelligence across the grant lifecycle. This document details our AI stack, prompt engineering strategies, and cost-control mechanisms.

AI Stack Architecture

GrantMaster uses Google Gemini exclusively as the AI provider, accessed through the @google/genai SDK and routed via Firebase Cloud Functions powered by Genkit for tracing, monitoring, and server-side key management.
  • Primary Model: gemini-2.5-flash — used for all generation tasks: journal drafting, compliance auditing, grant proposal writing, report generation, and RAG query answering.
  • Embeddings: text-embedding-004 (Google) — powers semantic similarity search in the RAG pipeline.
  • Execution Mode: USE_CLOUD_FUNCTIONS = true in both geminiService.ts and ragService.ts. Production calls are routed through Firebase Callable Functions (Genkit-instrumented) rather than direct client-side SDK calls. Client-side fallback paths remain in code but are not active in production.
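
A minimal sketch of that flag-based routing, assuming a hypothetical callable named generateContent and a simplified payload; the real services wrap this with retries, schemas, and context injection:

    // Sketch: route between the Genkit-instrumented Callable Function
    // (production) and the inactive client-side SDK fallback.
    import { getFunctions, httpsCallable } from "firebase/functions";
    import { GoogleGenAI } from "@google/genai";

    declare const GEMINI_API_KEY: string; // stand-in for however the client key is injected

    const USE_CLOUD_FUNCTIONS = true;

    async function generateText(prompt: string): Promise<string> {
      if (USE_CLOUD_FUNCTIONS) {
        // Production path: server-side key, auth checks, Genkit tracing.
        const call = httpsCallable<{ prompt: string }, { text: string }>(
          getFunctions(),
          "generateContent" // hypothetical callable name
        );
        const result = await call({ prompt });
        return result.data.text;
      }
      // Fallback path: direct client-side SDK call (not active in production).
      const ai = new GoogleGenAI({ apiKey: GEMINI_API_KEY });
      const response = await ai.models.generateContent({
        model: "gemini-2.5-flash",
        contents: prompt,
      });
      return response.text ?? "";
    }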

Source File Map

Since March 2026, the formerly monolithic geminiService.ts and ragService.ts have been decomposed into domain-specific modules. The tables below organize every file by architectural layer.

Gemini Core Infrastructure

File | Role
ai/services/geminiClient.ts | Initializes GoogleGenAI client; exports MODEL_NAME constant and getAIClient()
ai/services/geminiRetry.ts | callWithRetry() wrapper — exponential backoff on HTTP 429 and 5xx (max 3 retries)
ai/services/geminiJsonParser.ts | Strips Markdown code fences and safely parses Gemini JSON responses
ai/services/geminiSchemas.ts | Schema / Type definitions from @google/genai for typed JSON output (journal, audit, forecast, etc.)
ai/services/geminiServiceCore.ts | Central AI entrypoint — generateJournalEntries() and delegation to specialized modules
ai/services/geminiServiceFacade.ts | Thin BaseService wrapper class for dependency injection of geminiServiceCore functions
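
A minimal sketch of the callWithRetry() pattern named above, assuming errors expose a numeric status field:

    // Up to 3 retries with exponential backoff on HTTP 429 and 5xx.
    async function callWithRetry<T>(fn: () => Promise<T>, maxRetries = 3): Promise<T> {
      let attempt = 0;
      for (;;) {
        try {
          return await fn();
        } catch (err) {
          const status = (err as { status?: number }).status ?? 0;
          const retryable = status === 429 || (status >= 500 && status < 600);
          if (!retryable || attempt >= maxRetries) throw err;
          // 1s, 2s, 4s... exponential backoff between attempts.
          await new Promise((r) => setTimeout(r, 1000 * 2 ** attempt));
          attempt++;
        }
      }
    }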

Specialized Gemini Feature Modules

File | Role
ai/services/geminiAssistant.ts | queryGrantAssistant() — grant assistant query handler with project/grant/team context injection
ai/services/geminiCompliance.ts | analyzeCompliance(), checkExpenseEligibility() — audit and compliance analysis
ai/services/geminiDocuments.ts | analyzeReceipt() (OCR), analyzeDocumentContent() — receipt and document analysis
ai/services/geminiForecast.ts | generateReportNarrative(), generateProjectForecast() — burn rate and narrative generation
ai/services/geminiServiceProposalImpact.ts | generateProposalSection(), generateFullProposal(), suggestMEIndicators(), generateMENarrative(), detectMEAnomalies()

RAG Pipeline (Modular)

The RAG pipeline has been decomposed from a single ragService.ts into 13 focused modules:

File | Role
ai/services/ragShared.ts | Shared constants: MODEL_NAME, EMBEDDING_MODEL, caching TTLs, freshness TTLs by category
ai/services/ragTypes.ts | Type definitions: RAGCategory, RAGQueryType, RAGQueryOptions, RAGQueryResponse, CacheStatus
ai/services/ragCache.ts | Query result caching with Firestore persistence — getCachedQuery(), cacheQuery(), hashQuery()
ai/services/ragDocumentProcessing.ts | Document chunking (chunkText()), category detection (detectCategory()), embedding generation
ai/services/ragDocumentManagement.ts | Upload/processing workflow — uploadAndProcessDocument(), getDocumentStatus(), subscribeToDocumentStatus()
ai/services/ragRetrieval.ts | Semantic chunk retrieval with freshness filtering and deduplication — retrieveRelevantChunks()
ai/services/ragMetadata.ts | Deadline and activity rule extraction — extractDeadlines(), extractActivityRules()
ai/services/ragPipeline.ts | Main query orchestration — queryRAG() combining retrieval, context assembly, and safety checks
ai/services/ragCloud.ts | Cloud Function fallback — queryRAGCloudFunction(), queryRAGAuto()
ai/services/ragServiceFacade.ts | RAGService class (extends BaseService) delegating to RAG modules
ai/services/ragLogging.ts | Query logging to ai-query-logs Firestore collection with cache/safety metadata
ai/services/ranking.ts | cosineSimilarity(), jaccardSimilarity(), dedupeScoredChunks(), isChunkFresh()
ai/services/retrievalQueryBuilder.ts | Firestore query builder for document-chunks collection filtered by project/org/category

Context Assembly and Safety

File | Role
ai/services/contextAssembler.ts | assembleContextFromChunks(), buildRAGPrompt(), getSystemInstruction(), extractCitations() — RAG prompt building and citation extraction
ai/services/responseSafety.ts | applySafetyGuard() — prevents prompt injection, secret exfiltration, and PII exposure

Other AI Services

File | Role
ai/services/promptLibraryService.ts | Prompt storage and retrieval from Firestore
ai/data/defaultPrompts.ts | Default prompt templates
ai/services/complianceExtractionService.ts | AI-driven compliance rule extraction from uploaded documents
ai/services/visionService.ts | (Removed April 2026 — dead code)
ai/services/reportGeneration.ts | AI-assisted report section generation
ai/services/chatHistoryService.ts | localStorage-based chat session management (all sessions, save, add message, archive)
ai/services/documentService.ts | Document upload/processing management with Firebase Storage integration and progress tracking
ai/services/roleChangeRecommendations.ts | RoleChangeRecommendationsService — AI-driven role change recommendations with audit logging

Hooks

File | Role
ai/hooks/useGrantAssistantState.ts | Manages Grant Assistant panel state (open/close, input, loading, message scroll)

Agent Services

File | Role
agents/services/AgentExecutionService.ts | Autonomous agent run lifecycle (start, step, pause, complete/fail/cancel)
agents/services/AgentRunService.ts | AgentRun query/list service with filtering by status, agentType, triggeredBy
agents/services/AgentQuotaService.ts | Agent-specific quota enforcement: concurrent limits, step/token budgets, monthly/hourly rate limits
agents/services/AgentToolRegistry.ts | Registry of agent-callable tool adapters
agents/services/agentDefinitions.ts | Built-in agent definitions (IDs, budgets, permissions)

All file paths are relative to src/features/.

Retrieval-Augmented Generation (RAG)

The RAG pipeline is spread across the rag*.ts modules in src/features/ai/services/. RAG responses are grounded in uploaded project documents (grant agreements, guidelines).
  1. Ingestion (ragDocumentProcessing.ts): Uploaded files are chunked by chunkText() using sentence-boundary-aware splitting with a 200-token overlap window. Each chunk is categorised as budget, compliance, reporting, or general by keyword detection in detectCategory(). In production, processing is triggered by a Firebase Storage upload event via the processDocument Cloud Function.
  2. Embeddings (ragDocumentProcessing.ts): generateEmbedding() calls the text-embedding-004 model via @google/genai. Embeddings are stored on DocumentChunk records in the document-chunks Firestore collection, scoped by organizationId and projectId for tenant isolation.
  3. Retrieval (ragRetrieval.ts, ranking.ts, retrievalQueryBuilder.ts): retrieveRelevantChunks() builds a Firestore query via buildDocumentChunkQuery(), fetches tenant-scoped chunks, computes cosine and Jaccard similarity, applies freshness filtering via isChunkFresh(), deduplicates via dedupeScoredChunks(), and returns the top-N results (default 5; see the ranking sketch after this list).
  4. Context Assembly (contextAssembler.ts): assembleContextFromChunks() and buildRAGPrompt() construct the final prompt. The system switches structure based on queryType. getSystemInstruction() provides the GrantControl AI persona. After generation, extractCitations() parses [Source N] references.
  5. Safety (responseSafety.ts): Before returning any RAG response, applySafetyGuard() checks for prompt injection patterns, secret exfiltration attempts, and PII in the output.
  6. Query Orchestration (ragPipeline.ts): queryRAG() is the internal orchestrator combining retrieval, context assembly, LLM call, and safety checks into one pipeline.
  7. Cloud Function Routing (ragCloud.ts): queryRAGAuto() is the public entry point. In production it delegates to queryRAGCloudFunction(), which calls the queryRAG Callable Function (server-side key, auth checks, audit logging).
  8. Caching (ragCache.ts): Query results are cached for 1 hour (configurable per category via ragShared.ts) in the query-cache Firestore collection keyed by a hash of queryText + projectId. Cache hits are recorded in the AI query audit log.
  9. Audit Logging (ragLogging.ts): Every query (cache hit or miss) is written to the ai-query-logs collection via logQuery(), including cache status and safety check metadata.
  10. Document Management (ragDocumentManagement.ts): uploadAndProcessDocument() handles the full upload-to-processing lifecycle. subscribeToDocumentStatus() enables real-time status updates in the UI.
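
A minimal sketch of the ranking step (step 3), assuming an illustrative ScoredChunk shape; the real modules also apply Jaccard similarity, freshness TTLs, and deduplication:

    interface ScoredChunk {
      id: string;
      text: string;
      embedding: number[];
      score: number;
    }

    function cosineSimilarity(a: number[], b: number[]): number {
      let dot = 0, normA = 0, normB = 0;
      for (let i = 0; i < a.length; i++) {
        dot += a[i] * b[i];
        normA += a[i] * a[i];
        normB += b[i] * b[i];
      }
      return dot / (Math.sqrt(normA) * Math.sqrt(normB) || 1);
    }

    // Score every tenant-scoped chunk against the query embedding and
    // keep the top-N (default 5, matching the docs above).
    function rankChunks(
      queryEmbedding: number[],
      chunks: Omit<ScoredChunk, "score">[],
      topN = 5
    ): ScoredChunk[] {
      return chunks
        .map((c) => ({ ...c, score: cosineSimilarity(queryEmbedding, c.embedding) }))
        .sort((a, b) => b.score - a.score)
        .slice(0, topN);
    }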

Supported Query Types

queryType | Focus
expense-eligibility | Eligibility (YES/NO/CONDITIONAL), amount limits, documentation requirements
activity-eligibility | Allowed activities, hour/percentage caps, approval requirements
deadline | Reporting and submission deadlines extracted from grant documents
budget | Budget-related compliance questions
general | Open-ended questions answered from uploaded context

Metadata Extraction

ragMetadata.ts exports standalone extractors that use gemini-2.5-flash with structured JSON output:
  • extractDeadlines() — pulls reporting deadlines from document text and returns DeadlineInfo[]
  • extractActivityRules() — pulls allowed/prohibited activity rules and returns ActivityRule[]
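
A minimal sketch of such an extractor using typed JSON output from @google/genai; the DeadlineInfo fields shown are assumptions, not the real schema:

    import { GoogleGenAI, Type } from "@google/genai";

    declare const GEMINI_API_KEY: string; // stand-in for real key management

    const ai = new GoogleGenAI({ apiKey: GEMINI_API_KEY });

    async function extractDeadlines(documentText: string) {
      const response = await ai.models.generateContent({
        model: "gemini-2.5-flash",
        contents: `Extract all reporting deadlines from:\n${documentText}`,
        config: {
          responseMimeType: "application/json",
          // The model is constrained to this schema, so the output can be
          // parsed directly into a typed array.
          responseSchema: {
            type: Type.ARRAY,
            items: {
              type: Type.OBJECT,
              properties: {
                description: { type: Type.STRING },
                dueDate: { type: Type.STRING },
              },
              required: ["description", "dueDate"],
            },
          },
        },
      });
      return JSON.parse(response.text ?? "[]");
    }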

Prompt Versioning and Management

Prompts are managed via promptLibraryService.ts and data/defaultPrompts.ts within src/features/ai/:
  • System Personas: Each AI task (RAG assistant, compliance auditor, journal generator, etc.) has a dedicated system instruction string. The RAG persona is built by getSystemInstruction() in contextAssembler.ts.
  • Default Templates: Seed prompts are defined in defaultPrompts.ts and loaded by promptLibraryService.ts.
  • Dynamic Injection: Context is injected at call time (project data, document chunks, user role). The RAG system builds prompts via buildRAGPrompt() in contextAssembler.ts which switches structure based on queryType.
  • Structured Output: geminiSchemas.ts defines Gemini response schemas (Schema, Type from @google/genai) to enforce typed JSON output for all structured generation tasks. geminiJsonParser.ts safely parses the responses.

Autonomous Agent Execution

The src/features/agents/ feature provides a full autonomous agent framework on top of the AI services above.

Agent Definitions (agentDefinitions.ts)

Five built-in agents are registered in AGENT_DEFINITIONS_MAP:
ID | Name | Feature gate | Credit budget | Max steps
compliance_checker | Compliance Checker | AGENT_MULTI_STEP | 30 | 10
report_generator | Report Generator | AGENT_MULTI_STEP | 50 | 15
grant_proposal_writer | Grant Proposal Writer | AGENT_AUTONOMOUS | 80 | 20
expense_auditor | Expense Auditor | AGENT_MULTI_STEP | 40 | 15
journal_assistant | Journal Assistant | AGENT_BASIC | 15 | 5

Each definition declares the required entitlement Feature, required RBAC Permission[], and the set of allowedTools the agent may invoke.
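
A sketch of what a definition might look like, inferred from the table and prose above; defaultCreditBudget and allowedTools appear elsewhere in these docs, while the other field names are assumptions:

    interface AgentDefinition {
      id: string;
      name: string;
      featureGate: string;           // required entitlement Feature
      requiredPermissions: string[]; // required RBAC Permission[]
      allowedTools: string[];        // tools the agent may invoke
      defaultCreditBudget: number;
      maxSteps: number;
    }

    const complianceChecker: AgentDefinition = {
      id: "compliance_checker",
      name: "Compliance Checker",
      featureGate: "AGENT_MULTI_STEP",
      requiredPermissions: ["compliance:read"], // illustrative permission
      allowedTools: ["analyze_compliance", "query_documents"],
      defaultCreditBudget: 30,
      maxSteps: 10,
    };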

Run Lifecycle (AgentExecutionService)

AgentExecutionService (extends BaseService) manages the full state machine stored in organizations/{orgId}/agent_runs:
queued → running → completed
                 → failed
                 → cancelled
                 → awaiting_human → running (after resolution)
Key lifecycle methods:
  • startRun() — validates definition, checks RBAC, verifies quota via AgentQuotaService.validateRunStart() (concurrent limits, monthly/hourly rate limits), reserves credits via CreditService.reserveCredits(), creates the AgentRun document, emits AGENT_TASK_STARTED.
  • executeStep() — validates step budget via AgentQuotaService.validateStepExecution(), invokes a tool from AgentToolRegistry, consumes credits via CreditService.consumeCredits(), appends an AgentStep to the run, emits AGENT_STEP_COMPLETED.
  • pauseRun() — transitions to awaiting_human, creates an AgentEscalation record with resolution options, emits AGENT_ESCALATION_REQUIRED.
  • resumeRun() — records resolution, transitions back to running, emits AGENT_ESCALATION_RESOLVED.
  • completeRun() / failRun() / cancelRun() — release unused reserved credits via CreditService.releaseCredits(), emit the corresponding terminal event.
All state transitions emit typed EventBus events for observability and notification workflows.
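
The diagram above can be captured as a transition table; a minimal sketch (the real service may allow additional transitions, such as cancelling a queued run):

    type AgentRunStatus =
      | "queued"
      | "running"
      | "awaiting_human"
      | "completed"
      | "failed"
      | "cancelled";

    // Legal transitions, mirroring only what the diagram documents.
    const TRANSITIONS: Record<AgentRunStatus, AgentRunStatus[]> = {
      queued: ["running"],
      running: ["completed", "failed", "cancelled", "awaiting_human"],
      awaiting_human: ["running"],
      completed: [],
      failed: [],
      cancelled: [],
    };

    function assertTransition(from: AgentRunStatus, to: AgentRunStatus): void {
      if (!TRANSITIONS[from].includes(to)) {
        throw new Error(`Illegal agent run transition: ${from} -> ${to}`);
      }
    }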

Agent Quota Service (AgentQuotaService)

Added in March 2026, AgentQuotaService extends the generic QuotaService with agent-specific enforcement:
  • validateRunStart() — checks concurrent agent limit (per org, per tier), monthly run limit, and hourly rate limit before any agent run begins.
  • validateStepExecution() — checks step budget and token budget per run before each step.
  • getCreditUsageSummary() — returns dashboard/billing summary of credit consumption.
  • Warning thresholds at 80% and 90% trigger QUOTA_WARNING events via EventBus for proactive alerting.
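
A sketch of how the 80%/90% thresholds might be checked, assuming counters increment by one and a hypothetical emit() callback standing in for the EventBus:

    const WARNING_THRESHOLDS = [0.8, 0.9];

    function checkQuotaWarnings(
      used: number,
      limit: number,
      emit: (event: string, payload: object) => void
    ): void {
      const ratio = used / limit;
      for (const threshold of WARNING_THRESHOLDS) {
        // Fire once, exactly when the latest increment crosses a threshold.
        if (ratio >= threshold && (used - 1) / limit < threshold) {
          emit("QUOTA_WARNING", { threshold, used, limit });
        }
      }
    }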

Agent Run Service (AgentRunService)

AgentRunService (extends BaseService<AgentRun>) provides query capabilities for the agent runs collection:
  • listRuns(params) — queries agent_runs with filtering by status, agentType, and triggeredBy.
  • parseAgentRun() — schema validation with Sentry error capture for malformed run documents.
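
A sketch of listRuns()-style filtering with the modular Firestore SDK; the parameter shape and collection path are taken from the docs above, the rest is assumed:

    import { getFirestore, collection, query, where, getDocs } from "firebase/firestore";

    async function listRuns(orgId: string, filters: {
      status?: string;
      agentType?: string;
      triggeredBy?: string;
    }) {
      const db = getFirestore();
      let q = query(collection(db, `organizations/${orgId}/agent_runs`));
      // Apply only the filters the caller provided.
      if (filters.status) q = query(q, where("status", "==", filters.status));
      if (filters.agentType) q = query(q, where("agentType", "==", filters.agentType));
      if (filters.triggeredBy) q = query(q, where("triggeredBy", "==", filters.triggeredBy));
      const snapshot = await getDocs(q);
      return snapshot.docs.map((d) => ({ id: d.id, ...d.data() }));
    }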

Tool Registry (AgentToolRegistry)

AgentToolRegistryImpl is a singleton (agentToolRegistry) backed by a Map<string, AgentTool>. Tools are adapter wrappers around existing services — they do not modify the underlying services. Each tool:
  1. Checks the agent’s inherited user permissions.
  2. Calls the underlying service method.
  3. Returns a structured AgentToolResult with creditsUsed and tokensUsed.
Tools referenced by built-in agents include: analyze_compliance, query_documents, generate_report, forecast_budget, scan_expense, generate_journal.
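
A sketch of the adapter pattern, with hypothetical assertPermission() and complianceService stand-ins for the real RBAC helper and underlying service:

    interface AgentToolResult {
      output: unknown;
      creditsUsed: number;
      tokensUsed: number;
    }

    interface AgentTool {
      name: string;
      execute(input: unknown, userId: string): Promise<AgentToolResult>;
    }

    // Stand-ins for existing project code (names are illustrative).
    declare function assertPermission(userId: string, permission: string): Promise<void>;
    declare const complianceService: { analyze(input: unknown): Promise<unknown> };

    const agentToolRegistry = new Map<string, AgentTool>();

    agentToolRegistry.set("analyze_compliance", {
      name: "analyze_compliance",
      async execute(input, userId) {
        // 1. Check the agent's inherited user permissions.
        await assertPermission(userId, "compliance:read");
        // 2. Call the underlying service method unchanged.
        const output = await complianceService.analyze(input);
        // 3. Return a structured result with usage accounting.
        return { output, creditsUsed: 2, tokensUsed: 1500 };
      },
    });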

Credit Accounting

Agent runs use a reserve-then-consume model:
  1. Credits are reserved upfront at startRun() based on definition.defaultCreditBudget.
  2. Credits are consumed incrementally per step at executeStep().
  3. Unused credits are released at run completion/failure/cancellation.
Credit operations use Firestore transactions for atomic consistency (prevents concurrent overdraft). Reservations have a 1-hour TTL and auto-expire.
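
A sketch of step-level credit consumption inside a Firestore transaction; the reservation document path and field names are illustrative:

    import { getFirestore, doc, runTransaction } from "firebase/firestore";

    async function consumeCredits(orgId: string, runId: string, amount: number) {
      const db = getFirestore();
      const reservationRef = doc(db, `organizations/${orgId}/credit_reservations/${runId}`);
      await runTransaction(db, async (tx) => {
        const snap = await tx.get(reservationRef);
        const remaining = (snap.data()?.remaining ?? 0) as number;
        if (remaining < amount) {
          // The read and write commit atomically, so two concurrent steps
          // cannot both pass this check and overdraw the reservation.
          throw new Error("Insufficient reserved credits");
        }
        tx.update(reservationRef, { remaining: remaining - amount });
      });
    }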

UI

  • AgentDashboard — lists runs and current status.
  • AgentRunDetail — step-by-step execution view.
  • CreditUsageWidget — shows credits consumed per run.
  • AgentAttributionBadge — inline badge for AI-generated content attribution.
  • useAgentRuns hook — React Query integration with filtering and computed summary statistics (totalRuns, completedRuns, failedRuns, creditsConsumed, averageCreditCost, averageDurationMs).
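
A sketch of a useAgentRuns-style hook with React Query; the run fields and the listRuns() signature (in the spirit of the service sketched earlier) are assumptions:

    import { useQuery } from "@tanstack/react-query";
    import { useMemo } from "react";

    declare function listRuns(orgId: string, filters: object): Promise<
      { status: string; creditsConsumed: number; durationMs: number }[]
    >;

    function useAgentRuns(orgId: string, filters: object = {}) {
      const query = useQuery({
        queryKey: ["agent-runs", orgId, filters],
        queryFn: () => listRuns(orgId, filters),
      });
      // Derive the summary statistics mentioned above from the cached data.
      const summary = useMemo(() => {
        const runs = query.data ?? [];
        return {
          totalRuns: runs.length,
          completedRuns: runs.filter((r) => r.status === "completed").length,
          failedRuns: runs.filter((r) => r.status === "failed").length,
          creditsConsumed: runs.reduce((sum, r) => sum + r.creditsConsumed, 0),
          averageDurationMs:
            runs.length ? runs.reduce((s, r) => s + r.durationMs, 0) / runs.length : 0,
        };
      }, [query.data]);
      return { ...query, summary };
    }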

Cost Control and Quotas

To prevent runaway API costs, the system implements:
  1. Credit budgets: Each agent run reserves a credit budget from definition.defaultCreditBudget. Steps consume credits atomically via Firestore transactions; unused credits are released on completion.
  2. Agent quota service: AgentQuotaService.validateRunStart() enforces per-tenant concurrent agent limits, monthly run limits, and hourly rate limits — all tied to subscription tier via TIER_LIMITS.
  3. Step and token budgets: AgentQuotaService.validateStepExecution() prevents runaway agent runs by checking per-run step and token budgets.
  4. Usage tracking: All agent operations are metered via usageTrackingTrackers.ts: trackAgentRun(), trackAgentStep(), trackCreditConsumption(), trackCreditPurchase().
  5. Usage alerts: usageAlerts.ts triggers notifications at 80%, 90%, 95%, and 100% thresholds for API calls, AI generations, storage, and other metered dimensions.
  6. Query caching: RAG query results are cached with configurable TTLs per category (via ragShared.ts) in Firestore (query-cache) to avoid redundant Gemini calls.
  7. Retry with backoff: geminiRetry.ts wraps all Gemini calls in callWithRetry() — retries up to 3 times with exponential backoff on HTTP 429 and 5xx responses.
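
A sketch of the cache-keying scheme from item 6; the FNV-1a hash and field names are illustrative assumptions, not the actual hashQuery() implementation:

    function hashQuery(queryText: string, projectId: string): string {
      let h = 0x811c9dc5; // FNV-1a 32-bit offset basis
      for (const ch of `${projectId}:${queryText}`) {
        h ^= ch.codePointAt(0)!;
        h = Math.imul(h, 0x01000193); // FNV-1a 32-bit prime
      }
      return (h >>> 0).toString(16); // document ID in query-cache
    }

    interface CachedQuery {
      response: string;
      cachedAt: number; // epoch millis
      ttlMs: number;    // per-category TTL from ragShared.ts
    }

    function isCacheEntryFresh(entry: CachedQuery, now = Date.now()): boolean {
      return now - entry.cachedAt < entry.ttlMs;
    }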

Responsible AI & Privacy

  • Safety Guards: responseSafety.ts runs applySafetyGuard() on all RAG responses to detect and block prompt injection, secret exfiltration, and PII exposure before returning results to the user.
  • PII Stripping: Sensitive data is sanitized before being sent to the LLM provider.
  • Zero-Retention: We opt out of “Data Training” programs with AI providers to ensure customer data is never used to re-train base models.
  • Audit Trail: Every AI query is logged to ai-query-logs via ragLogging.ts with full metadata (cache status, safety check results, token counts).
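
A sketch of a pattern-based guard in the spirit of applySafetyGuard(); the patterns shown are illustrative, not the production rule set:

    const BLOCKLIST: { label: string; pattern: RegExp }[] = [
      { label: "prompt-injection", pattern: /ignore (all )?previous instructions/i },
      { label: "secret-exfiltration", pattern: /(api[_-]?key|service account|BEGIN PRIVATE KEY)/i },
      { label: "pii-email", pattern: /[\w.+-]+@[\w-]+\.[\w.]+/ },
    ];

    // Scan generated text and report which categories of violation matched.
    function applySafetyGuard(text: string): { safe: boolean; violations: string[] } {
      const violations = BLOCKLIST
        .filter(({ pattern }) => pattern.test(text))
        .map(({ label }) => label);
      return { safe: violations.length === 0, violations };
    }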