AI Engine and Prompt Management
GrantMaster leverages a multi-model AI architecture to provide specialized intelligence across the grant lifecycle. This document details our AI stack, prompt engineering strategies, and cost-control mechanisms.
AI Stack Architecture
GrantMaster uses Google Gemini exclusively as the AI provider, accessed through the @google/genai SDK and routed via Firebase Cloud Functions powered by Genkit for tracing, monitoring, and server-side key management.
- Primary Model: gemini-2.5-flash — used for all generation tasks: journal drafting, compliance auditing, grant proposal writing, report generation, and RAG query answering.
- Embeddings: text-embedding-004 (Google) — powers semantic similarity search in the RAG pipeline.
- Execution Mode: USE_CLOUD_FUNCTIONS = true in both geminiService.ts and ragService.ts. Production calls are routed through Firebase Callable Functions (Genkit-instrumented) rather than direct client-side SDK calls. Client-side fallback paths remain in code but are not active in production.
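A minimal sketch of how such a routing flag can select between the two paths. The names generateViaCloudFunction, generateDirect, and pickGenerator are hypothetical stand-ins for illustration, not the real geminiService.ts wiring:

```typescript
// Hypothetical sketch of flag-based routing between a Callable Function
// path and a direct client SDK fallback. Not the actual production code.
const USE_CLOUD_FUNCTIONS = true;

type Generator = (prompt: string) => Promise<string>;

async function generateViaCloudFunction(prompt: string): Promise<string> {
  // Stand-in for invoking a Firebase Callable Function (production path).
  return `cloud:${prompt}`;
}

async function generateDirect(prompt: string): Promise<string> {
  // Stand-in for the client-side SDK fallback (present but inactive).
  return `direct:${prompt}`;
}

// Route generation through the cloud path when the flag is set.
function pickGenerator(useCloud: boolean = USE_CLOUD_FUNCTIONS): Generator {
  return useCloud ? generateViaCloudFunction : generateDirect;
}
```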
Source File Map
Since March 2026, the formerly monolithic geminiService.ts and ragService.ts have been decomposed into domain-specific modules. The table below organizes every file by architectural layer.
Gemini Core Infrastructure
| File | Role |
|---|---|
| ai/services/geminiClient.ts | Initializes GoogleGenAI client; exports MODEL_NAME constant and getAIClient() |
| ai/services/geminiRetry.ts | callWithRetry() wrapper — exponential backoff on HTTP 429 and 5xx (max 3 retries) |
| ai/services/geminiJsonParser.ts | Strips Markdown code fences and safely parses Gemini JSON responses |
| ai/services/geminiSchemas.ts | Schema / Type definitions (from @google/genai) for typed JSON output (journal, audit, forecast, etc.) |
| ai/services/geminiServiceCore.ts | Central AI entrypoint — generateJournalEntries() and delegation to specialized modules |
| ai/services/geminiServiceFacade.ts | Thin BaseService wrapper class for dependency injection of geminiServiceCore functions |
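The callWithRetry() behaviour described above (exponential backoff on HTTP 429 and 5xx, max 3 retries) can be sketched as follows; the signature and delay schedule are assumptions, not the real module:

```typescript
// Sketch of the callWithRetry() pattern: retry on HTTP 429 and 5xx up to
// 3 times with exponential backoff. The real implementation may differ.
interface HttpError { status: number }

function isRetryable(status: number): boolean {
  return status === 429 || (status >= 500 && status < 600);
}

async function callWithRetry<T>(
  fn: () => Promise<T>,
  maxRetries = 3,
  baseDelayMs = 500,
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      const status = (err as HttpError).status;
      if (attempt >= maxRetries || !isRetryable(status)) throw err;
      // Exponential backoff: 500ms, 1s, 2s (no jitter in this sketch).
      await new Promise((r) => setTimeout(r, baseDelayMs * 2 ** attempt));
    }
  }
}
```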
Specialized Gemini Feature Modules
| File | Role |
|---|---|
| ai/services/geminiAssistant.ts | queryGrantAssistant() — grant assistant query handler with project/grant/team context injection |
| ai/services/geminiCompliance.ts | analyzeCompliance(), checkExpenseEligibility() — audit and compliance analysis |
| ai/services/geminiDocuments.ts | analyzeReceipt() (OCR), analyzeDocumentContent() — receipt and document analysis |
| ai/services/geminiForecast.ts | generateReportNarrative(), generateProjectForecast() — burn rate and narrative generation |
| ai/services/geminiServiceProposalImpact.ts | generateProposalSection(), generateFullProposal(), suggestMEIndicators(), generateMENarrative(), detectMEAnomalies() |
RAG Pipeline (Modular)
The RAG pipeline has been decomposed from a single ragService.ts into the 13 focused modules below:
| File | Role |
|---|---|
| ai/services/ragShared.ts | Shared constants: MODEL_NAME, EMBEDDING_MODEL, caching TTLs, freshness TTLs by category |
| ai/services/ragTypes.ts | Type definitions: RAGCategory, RAGQueryType, RAGQueryOptions, RAGQueryResponse, CacheStatus |
| ai/services/ragCache.ts | Query result caching with Firestore persistence — getCachedQuery(), cacheQuery(), hashQuery() |
| ai/services/ragDocumentProcessing.ts | Document chunking (chunkText()), category detection (detectCategory()), embedding generation |
| ai/services/ragDocumentManagement.ts | Upload/processing workflow — uploadAndProcessDocument(), getDocumentStatus(), subscribeToDocumentStatus() |
| ai/services/ragRetrieval.ts | Semantic chunk retrieval with freshness filtering and deduplication — retrieveRelevantChunks() |
| ai/services/ragMetadata.ts | Deadline and activity rule extraction — extractDeadlines(), extractActivityRules() |
| ai/services/ragPipeline.ts | Main query orchestration — queryRAG() combining retrieval, context assembly, and safety checks |
| ai/services/ragCloud.ts | Cloud Function fallback — queryRAGCloudFunction(), queryRAGAuto() |
| ai/services/ragServiceFacade.ts | RAGService class (extends BaseService) delegating to RAG modules |
| ai/services/ragLogging.ts | Query logging to ai-query-logs Firestore collection with cache/safety metadata |
| ai/services/ranking.ts | cosineSimilarity(), jaccardSimilarity(), dedupeScoredChunks(), isChunkFresh() |
| ai/services/retrievalQueryBuilder.ts | Firestore query builder for document-chunks collection filtered by project/org/category |
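The ranking primitives listed for ranking.ts can be illustrated with their standard definitions. These are sketches for orientation, not the production implementations, which may handle edge cases and tokenization differently:

```typescript
// Standard cosine similarity over embedding vectors (as used for
// semantic ranking) — sketch, not the production ranking.ts code.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  const denom = Math.sqrt(na) * Math.sqrt(nb);
  return denom === 0 ? 0 : dot / denom;
}

// Jaccard similarity over word sets — useful as a cheap lexical signal
// alongside embeddings. Whitespace tokenization is an assumption here.
function jaccardSimilarity(a: string, b: string): number {
  const setA = new Set(a.toLowerCase().split(/\s+/));
  const setB = new Set(b.toLowerCase().split(/\s+/));
  let intersection = 0;
  for (const t of setA) if (setB.has(t)) intersection++;
  const union = setA.size + setB.size - intersection;
  return union === 0 ? 0 : intersection / union;
}
```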
Context Assembly and Safety
| File | Role |
|---|---|
| ai/services/contextAssembler.ts | assembleContextFromChunks(), buildRAGPrompt(), getSystemInstruction(), extractCitations() — RAG prompt building and citation extraction |
| ai/services/responseSafety.ts | applySafetyGuard() — prevents prompt injection, secret exfiltration, and PII exposure |
Other AI Services
| File | Role |
|---|---|
| ai/services/promptLibraryService.ts | Prompt storage and retrieval from Firestore |
| ai/data/defaultPrompts.ts | Default prompt templates |
| ai/services/complianceExtractionService.ts | AI-driven compliance rule extraction from uploaded documents |
| ai/services/visionService.ts | (Removed April 2026 — dead code) |
| ai/services/reportGeneration.ts | AI-assisted report section generation |
| ai/services/chatHistoryService.ts | localStorage-based chat session management (all sessions, save, add message, archive) |
| ai/services/documentService.ts | Document upload/processing management with Firebase Storage integration and progress tracking |
| ai/services/roleChangeRecommendations.ts | RoleChangeRecommendationsService — AI-driven role change recommendations with audit logging |
Hooks
| File | Role |
|---|---|
| ai/hooks/useGrantAssistantState.ts | Manages Grant Assistant panel state (open/close, input, loading, message scroll) |
Agent Services
| File | Role |
|---|---|
| agents/services/AgentExecutionService.ts | Autonomous agent run lifecycle (start, step, pause, complete/fail/cancel) |
| agents/services/AgentRunService.ts | AgentRun query/list service with filtering by status, agentType, triggeredBy |
| agents/services/AgentQuotaService.ts | Agent-specific quota enforcement: concurrent limits, step/token budgets, monthly/hourly rate limits |
| agents/services/AgentToolRegistry.ts | Registry of agent-callable tool adapters |
| agents/services/agentDefinitions.ts | Built-in agent definitions (IDs, budgets, permissions) |
All file paths are relative to src/features/.
Retrieval-Augmented Generation (RAG)
The RAG pipeline is spread across the rag*.ts modules in src/features/ai/services/. RAG responses are grounded in uploaded project documents (grant agreements, guidelines).
- Ingestion (ragDocumentProcessing.ts): Uploaded files are chunked by chunkText() using sentence-boundary-aware splitting with a 200-token overlap window. Each chunk is categorised as budget, compliance, reporting, or general by keyword detection in detectCategory().
- Embeddings (ragDocumentProcessing.ts): generateEmbedding() calls the text-embedding-004 model via @google/genai. Embeddings are stored on DocumentChunk records in the document-chunks Firestore collection, scoped by organizationId and projectId for tenant isolation.
- Retrieval (ragRetrieval.ts, ranking.ts, retrievalQueryBuilder.ts): retrieveRelevantChunks() builds a Firestore query via buildDocumentChunkQuery(), fetches tenant-scoped chunks, computes cosine and Jaccard similarity, applies freshness filtering via isChunkFresh(), deduplicates via dedupeScoredChunks(), and returns the top-N results (default 5). Production document processing is triggered by a Firebase Storage upload event via the processDocument Cloud Function.
- Context Assembly (contextAssembler.ts): assembleContextFromChunks() and buildRAGPrompt() construct the final prompt, switching structure based on queryType. getSystemInstruction() provides the GrantControl AI persona. After generation, extractCitations() parses [Source N] references.
- Safety (responseSafety.ts): Before returning any RAG response, applySafetyGuard() checks for prompt injection patterns, secret exfiltration attempts, and PII in the output.
- Query Orchestration (ragPipeline.ts): queryRAG() is the internal orchestrator combining retrieval, context assembly, LLM call, and safety checks into one pipeline.
- Cloud Function Routing (ragCloud.ts): queryRAGAuto() is the public entry point. In production it delegates to queryRAGCloudFunction(), which calls the queryRAG Callable Function (server-side key, auth checks, audit logging).
- Caching (ragCache.ts): Query results are cached for 1 hour (configurable per category via ragShared.ts) in the query-cache Firestore collection, keyed by a hash of queryText + projectId. Cache hits are recorded in the AI query audit log.
- Audit Logging (ragLogging.ts): Every query (cache hit or miss) is written to the ai-query-logs collection via logQuery(), including cache status and safety check metadata.
- Document Management (ragDocumentManagement.ts): uploadAndProcessDocument() handles the full upload-to-processing lifecycle. subscribeToDocumentStatus() enables real-time status updates in the UI.
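The sentence-boundary-aware chunking in the ingestion step can be sketched as follows. This simplified version counts sentences rather than tokens (the real chunkText() uses a 200-token overlap window), so the sizes here are purely illustrative:

```typescript
// Simplified sketch of sentence-boundary-aware chunking with overlap.
// The production chunkText() works in tokens; this version works in
// sentences so the overlap idea stays visible.
function chunkBySentences(
  text: string,
  sentencesPerChunk = 3,
  overlapSentences = 1,
): string[] {
  // Naive sentence split on ., !, ? — an assumption for illustration.
  const sentences = (text.match(/[^.!?]+[.!?]+/g) ?? [text]).map((s) => s.trim());
  const chunks: string[] = [];
  // Advance by (chunk size - overlap) so consecutive chunks share context.
  const step = Math.max(1, sentencesPerChunk - overlapSentences);
  for (let i = 0; i < sentences.length; i += step) {
    chunks.push(sentences.slice(i, i + sentencesPerChunk).join(" "));
    if (i + sentencesPerChunk >= sentences.length) break;
  }
  return chunks;
}
```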
Supported Query Types
| queryType | Focus |
|---|---|
| expense-eligibility | Eligibility (YES/NO/CONDITIONAL), amount limits, documentation requirements |
| activity-eligibility | Allowed activities, hour/percentage caps, approval requirements |
| deadline | Reporting and submission deadlines extracted from grant documents |
| budget | Budget-related compliance questions |
| general | Open-ended questions answered from uploaded context |
ragMetadata.ts exports standalone extractors that use gemini-2.5-flash with structured JSON output:
- extractDeadlines() — pulls reporting deadlines from document text and returns DeadlineInfo[]
- extractActivityRules() — pulls allowed/prohibited activity rules and returns ActivityRule[]
Prompt Versioning and Management
Prompts are managed via promptLibraryService.ts and data/defaultPrompts.ts within src/features/ai/:
- System Personas: Each AI task (RAG assistant, compliance auditor, journal generator, etc.) has a dedicated system instruction string. The RAG persona is built by getSystemInstruction() in contextAssembler.ts.
- Default Templates: Seed prompts are defined in defaultPrompts.ts and loaded by promptLibraryService.ts.
- Dynamic Injection: Context is injected at call time (project data, document chunks, user role). The RAG system builds prompts via buildRAGPrompt() in contextAssembler.ts, which switches structure based on queryType.
- Structured Output: geminiSchemas.ts defines Gemini response schemas (Schema, Type from @google/genai) to enforce typed JSON output for all structured generation tasks. geminiJsonParser.ts safely parses the responses.
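The fence-stripping parse attributed to geminiJsonParser.ts can be sketched like this; the real module's API and error handling may differ:

```typescript
// Sketch of safe JSON parsing for LLM output: Gemini often wraps JSON
// in ```json fences, so strip them before parsing. Returning null on
// malformed output (instead of throwing) is an assumption of this sketch.
function parseGeminiJson<T>(raw: string): T | null {
  const stripped = raw
    .trim()
    .replace(/^```(?:json)?\s*/i, "") // leading fence, optional "json" tag
    .replace(/```\s*$/, "");          // trailing fence
  try {
    return JSON.parse(stripped) as T;
  } catch {
    return null; // malformed output: caller decides whether to retry
  }
}
```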
Autonomous Agent Execution
The src/features/agents/ feature provides a full autonomous agent framework on top of the AI services above.
Agent Definitions (agentDefinitions.ts)
Five built-in agents are registered in AGENT_DEFINITIONS_MAP:
| ID | Name | Feature gate | Credit budget | Max steps |
|---|---|---|---|---|
| compliance_checker | Compliance Checker | AGENT_MULTI_STEP | 30 | 10 |
| report_generator | Report Generator | AGENT_MULTI_STEP | 50 | 15 |
| grant_proposal_writer | Grant Proposal Writer | AGENT_AUTONOMOUS | 80 | 20 |
| expense_auditor | Expense Auditor | AGENT_MULTI_STEP | 40 | 15 |
| journal_assistant | Journal Assistant | AGENT_BASIC | 15 | 5 |
Each definition declares the required entitlement Feature, required RBAC Permission[], and the set of allowedTools the agent may invoke.
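A hypothetical sketch of a definition lookup plus feature-gate check; the AgentDefinition field names and the canStart() helper are assumptions modelled on the table above, not the real agentDefinitions.ts:

```typescript
// Sketch of AGENT_DEFINITIONS_MAP and a feature-gate check. Field names
// mirror the table above but are assumptions for illustration.
interface AgentDefinition {
  id: string;
  featureGate: string;
  creditBudget: number;
  maxSteps: number;
}

const AGENT_DEFINITIONS_MAP = new Map<string, AgentDefinition>([
  ["journal_assistant", { id: "journal_assistant", featureGate: "AGENT_BASIC", creditBudget: 15, maxSteps: 5 }],
  ["compliance_checker", { id: "compliance_checker", featureGate: "AGENT_MULTI_STEP", creditBudget: 30, maxSteps: 10 }],
]);

// An agent may start only if it is defined and its feature gate is enabled
// for the tenant (RBAC permission checks are omitted from this sketch).
function canStart(agentId: string, enabledFeatures: Set<string>): boolean {
  const def = AGENT_DEFINITIONS_MAP.get(agentId);
  return def !== undefined && enabledFeatures.has(def.featureGate);
}
```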
Run Lifecycle (AgentExecutionService)
AgentExecutionService (extends BaseService) manages the full state machine stored in organizations/{orgId}/agent_runs:
- queued → running
- running → completed | failed | cancelled | awaiting_human
- awaiting_human → running (after resolution)
Key lifecycle methods:
- startRun() — validates the definition, checks RBAC, verifies quota via AgentQuotaService.validateRunStart() (concurrent limits, monthly/hourly rate limits), reserves credits via CreditService.reserveCredits(), creates the AgentRun document, and emits AGENT_TASK_STARTED.
- executeStep() — validates the step budget via AgentQuotaService.validateStepExecution(), invokes a tool from AgentToolRegistry, consumes credits via CreditService.consumeCredits(), appends an AgentStep to the run, and emits AGENT_STEP_COMPLETED.
- pauseRun() — transitions to awaiting_human, creates an AgentEscalation record with resolution options, and emits AGENT_ESCALATION_REQUIRED.
- resumeRun() — records the resolution, transitions back to running, and emits AGENT_ESCALATION_RESOLVED.
- completeRun() / failRun() / cancelRun() — release unused reserved credits via CreditService.releaseCredits() and emit the corresponding terminal event.
All state transitions emit typed EventBus events for observability and notification workflows.
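The run state machine above can be expressed as an allowed-transitions table. This is a sketch of the invariant only; the actual AgentExecutionService enforces these transitions inside its Firestore writes:

```typescript
// Allowed-transitions table for the agent run state machine described
// above. Only the transitions named in the docs are listed.
type RunStatus =
  | "queued" | "running" | "completed"
  | "failed" | "cancelled" | "awaiting_human";

const TRANSITIONS: Record<RunStatus, RunStatus[]> = {
  queued: ["running"],
  running: ["completed", "failed", "cancelled", "awaiting_human"],
  awaiting_human: ["running"], // resume after escalation resolution
  completed: [],               // terminal
  failed: [],                  // terminal
  cancelled: [],               // terminal
};

function canTransition(from: RunStatus, to: RunStatus): boolean {
  return TRANSITIONS[from].includes(to);
}
```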
Agent Quota Service (AgentQuotaService)
Added in March 2026, AgentQuotaService extends the generic QuotaService with agent-specific enforcement:
- validateRunStart() — checks the concurrent agent limit (per org, per tier), monthly run limit, and hourly rate limit before any agent run begins.
- validateStepExecution() — checks the step budget and token budget per run before each step.
- getCreditUsageSummary() — returns a dashboard/billing summary of credit consumption.
- Warning thresholds at 80% and 90% trigger QUOTA_WARNING events via EventBus for proactive alerting.
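Threshold-crossing detection of this kind can be sketched as below; the function name and shape are assumptions, not the real AgentQuotaService code:

```typescript
// Sketch: fire a warning only when usage crosses a threshold, i.e. the
// previous usage was below it and the new usage is at or above it.
const WARNING_THRESHOLDS = [0.8, 0.9];

function crossedThresholds(
  prevUsed: number,
  nowUsed: number,
  limit: number,
): number[] {
  return WARNING_THRESHOLDS.filter(
    (t) => prevUsed / limit < t && nowUsed / limit >= t,
  );
}
```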
Agent Run Service (AgentRunService)
AgentRunService (extends BaseService<AgentRun>) provides query capabilities for the agent runs collection:
- listRuns(params) — queries agent_runs with filtering by status, agentType, and triggeredBy.
- parseAgentRun() — schema validation with Sentry error capture for malformed run documents.
Agent Tool Registry (AgentToolRegistry)
AgentToolRegistryImpl is a Map<string, AgentTool> singleton (agentToolRegistry). Tools are adapter wrappers around existing services — they do not modify the underlying services. Each tool:
- Checks the agent’s inherited user permissions.
- Calls the underlying service method.
- Returns a structured AgentToolResult with creditsUsed and tokensUsed.
Tools referenced by built-in agents include: analyze_compliance, query_documents, generate_report, forecast_budget, scan_expense, generate_journal.
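The Map-backed registry pattern described above can be sketched as follows; the interface shapes are assumptions based on the field names mentioned (creditsUsed, tokensUsed), not the real AgentToolRegistryImpl:

```typescript
// Sketch of a Map-backed tool registry. Tools are adapters around
// existing services and report usage back to the run.
interface AgentToolResult {
  output: unknown;
  creditsUsed: number;
  tokensUsed: number;
}

interface AgentTool {
  name: string;
  execute(input: unknown): Promise<AgentToolResult>;
}

class AgentToolRegistry {
  private tools = new Map<string, AgentTool>();

  register(tool: AgentTool): void {
    this.tools.set(tool.name, tool);
  }

  get(name: string): AgentTool {
    const tool = this.tools.get(name);
    // Fail fast on unknown tool names rather than letting a run stall.
    if (!tool) throw new Error(`Unknown agent tool: ${name}`);
    return tool;
  }
}
```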
Credit Accounting
Agent runs use a reserve-then-consume model:
- Credits are reserved upfront at startRun() based on definition.defaultCreditBudget.
- Credits are consumed incrementally per step at executeStep().
- Unused credits are released at run completion/failure/cancellation.
Credit operations use Firestore transactions for atomic consistency (prevents concurrent overdraft). Reservations have a 1-hour TTL and auto-expire.
Agent UI Components and Hooks
- AgentDashboard — lists runs and current status.
- AgentRunDetail — step-by-step execution view.
- CreditUsageWidget — shows credits consumed per run.
- AgentAttributionBadge — inline badge for AI-generated content attribution.
- useAgentRuns hook — React Query integration with filtering and computed summary statistics (totalRuns, completedRuns, failedRuns, creditsConsumed, averageCreditCost, averageDurationMs).
Cost Control and Quotas
To prevent runaway API costs, the system implements:
- Credit budgets: Each agent run reserves a credit budget from definition.defaultCreditBudget. Steps consume credits atomically via Firestore transactions; unused credits are released on completion.
- Agent quota service: AgentQuotaService.validateRunStart() enforces per-tenant concurrent agent limits, monthly run limits, and hourly rate limits — all tied to subscription tier via TIER_LIMITS.
- Step and token budgets: AgentQuotaService.validateStepExecution() prevents runaway agent runs by checking per-run step and token budgets.
- Usage tracking: All agent operations are metered via usageTrackingTrackers.ts — trackAgentRun(), trackAgentStep(), trackCreditConsumption(), trackCreditPurchase().
- Usage alerts: usageAlerts.ts triggers notifications at 80%, 90%, 95%, and 100% thresholds for API calls, AI generations, storage, and other metered dimensions.
- Query caching: RAG query results are cached with configurable TTLs per category (via ragShared.ts) in Firestore (query-cache) to avoid redundant Gemini calls.
- Retry with backoff: geminiRetry.ts wraps all Gemini calls in callWithRetry() — retries up to 3 times with exponential backoff on HTTP 429 and 5xx responses.
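The cache keying and TTL check behind the query-caching bullet can be sketched as follows. The djb2 hash and both function shapes are assumptions for illustration, not the real hashQuery() in ragCache.ts:

```typescript
// Sketch of cache keying by queryText + projectId with a TTL freshness
// check. djb2 hashing and the normalization rules are assumptions.
function hashQuery(queryText: string, projectId: string): string {
  const input = `${projectId}::${queryText.trim().toLowerCase()}`;
  let hash = 5381;
  for (let i = 0; i < input.length; i++) {
    hash = ((hash << 5) + hash + input.charCodeAt(i)) | 0; // djb2
  }
  return (hash >>> 0).toString(16);
}

// A cached entry is fresh while its age is below the TTL (default 1 hour,
// mirroring the per-category TTLs described above).
function isCacheFresh(
  cachedAtMs: number,
  nowMs: number,
  ttlMs = 60 * 60 * 1000,
): boolean {
  return nowMs - cachedAtMs < ttlMs;
}
```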
Responsible AI & Privacy
- Safety Guards: responseSafety.ts runs applySafetyGuard() on all RAG responses to detect and block prompt injection, secret exfiltration, and PII exposure before returning results to the user.
- PII Stripping: Sensitive data is sanitized before being sent to the LLM provider.
- Zero-Retention: We opt out of “Data Training” programs with AI providers to ensure customer data is never used to re-train base models.
- Audit Trail: Every AI query is logged to ai-query-logs via ragLogging.ts with full metadata (cache status, safety check results, token counts).