Search and Indexing Strategy
This document details how GrantMaster implements high-performance search across structured data (tenants, users) and unstructured content (grant listings, documents).
Hybrid Search Architecture
We use a three-tier strategy to balance speed, consistency, and offline resilience:
1. Firestore Native Search (Point Queries)
Used for simple, exact-match filtering (e.g., “Find user by email”, “List all grants for Tenant A”).
- Pros: Real-time consistency, no extra cost.
- Cons: No partial-match or fuzzy (typo-tolerant) search, and no support for complex “OR” queries.
2. Typesense (Full-Text Search — Primary)
Typesense Cloud is the primary full-text search engine (typesense v3.0.1). It powers the command palette global search and cross-entity queries.
- Multi-collection search: A single query fans out across all indexed collections via client.multiSearch.perform().
- Fuzzy matching: Supports typo-tolerant search on configurable query fields per collection.
- Faceted filters: Status, project, user, and date-range filters are applied server-side via Typesense filter_by clauses.
- Hosted on Typesense Cloud (*.a2.typesense.net); no self-hosted infrastructure required.
3. Fuse.js (Client-Side Fallback)
When Typesense is unreachable (offline, misconfigured), the platform degrades to in-memory fuzzy search via Fuse.js. The hybridSearch() function in searchService.ts orchestrates the fallback:
| Mode | Behavior |
|---|---|
| server | Always queries Typesense; throws on failure. |
| local | Always uses Fuse.js against a pre-built in-memory index. |
| auto | Tries Typesense when online and configured; falls back to Fuse.js on error. |
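The three modes in the table can be sketched as a small dispatcher. This is an illustration only: hybridSearchSketch, SearchBackends, and the injected backend functions are hypothetical names, and the real hybridSearch() in searchService.ts may be structured differently.

```typescript
// Hypothetical names throughout; not the actual searchService.ts API.
type SearchMode = 'server' | 'local' | 'auto';

interface SearchBackends<T> {
  typesense: (query: string) => Promise<T[]>; // remote full-text search
  fuse: (query: string) => T[];               // in-memory fuzzy search
  isOnline: () => boolean;
  isConfigured: () => boolean;
}

async function hybridSearchSketch<T>(
  query: string,
  mode: SearchMode,
  backends: SearchBackends<T>,
): Promise<{ source: 'typesense' | 'fuse'; results: T[] }> {
  if (mode === 'local') {
    return { source: 'fuse', results: backends.fuse(query) };
  }
  if (mode === 'server') {
    // No try/catch here: failures propagate to the caller.
    return { source: 'typesense', results: await backends.typesense(query) };
  }
  // auto: only attempt Typesense when online and configured.
  if (backends.isOnline() && backends.isConfigured()) {
    try {
      return { source: 'typesense', results: await backends.typesense(query) };
    } catch {
      // Fall through to the local index on any Typesense error.
    }
  }
  return { source: 'fuse', results: backends.fuse(query) };
}
```

Injecting the backends keeps the mode logic testable without a network connection or a real Fuse.js index.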
Indexed Collections
Seven Firestore collections are synced to Typesense. Schemas are defined in functions/src/search/typesenseSchema.ts and collection names use the gm_ prefix.
| Typesense Collection | Firestore Source | Query Fields | Default Sort |
|---|---|---|---|
| gm_projects | projects | name, description, funder, grantNumber, projectManager | createdAt |
| gm_employees | employees | name, email, jobTitle, department | createdAt |
| gm_expenses | expenses | description, vendor, category, employeeName, projectName | createdAt |
| gm_journals | journals | activityDescription, activityType, employeeName, projectName | date |
| gm_documents | documents | filename, title, content, projectName | createdAt |
| gm_contacts | contacts | name, email, company, notes | createdAt |
| gm_compliance_rules | complianceRules | name, description, category, projectName | createdAt |
Each schema includes faceted fields for organizationId, status, projectId, and domain-specific dimensions (e.g., funder, category, severity). See typesenseSchema.ts for the full field definitions.
Data Synchronization
Firestore-to-Typesense Triggers
Every indexed Firestore collection has an onDocumentWritten Cloud Function trigger defined in functions/src/search/indexingTriggers.ts:
| Cloud Function | Firestore Path | Typesense Collection |
|---|---|---|
| searchIndexProject | projects/{projectId} | gm_projects |
| searchIndexEmployee | employees/{employeeId} | gm_employees |
| searchIndexExpense | expenses/{expenseId} | gm_expenses |
| searchIndexJournal | journals/{journalId} | gm_journals |
| searchIndexDocument | documents/{documentId} | gm_documents |
| searchIndexContact | contacts/{contactId} | gm_contacts |
| searchIndexComplianceRule | complianceRules/{ruleId} | gm_compliance_rules |
Sync Lifecycle
- Trigger: onDocumentWritten fires on any create, update, or delete in the source collection.
- Guard: If Typesense is not configured (isTypesenseConfigured() returns false), the trigger exits silently, so the system degrades gracefully.
- Schema Assurance: ensureCollection() creates the Typesense collection on first write (idempotent).
- Transformation: The Cloud Function extracts a flat search document, converting Firestore Timestamp values to Unix milliseconds and stripping internal-only fields.
- Upsert / Delete: upsertDocument() or deleteDocument() is called on the Typesense client.
- Error Logging: Failures are logged via the firebase-functions/v2 structured logger with document ID and collection name for observability.
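The guard, transformation, and upsert/delete steps above can be sketched as pure functions. The names planSync, toSearchDocument, and INTERNAL_FIELDS are hypothetical, and the list of internal-only fields is invented for illustration; the actual triggers in indexingTriggers.ts may differ. Only top-level fields are handled here.

```typescript
// Duck-typed stand-in for a Firestore Timestamp, which exposes toMillis().
interface TimestampLike { toMillis(): number }

type FirestoreData = Record<string, unknown>;

type SyncAction =
  | { kind: 'skip' }
  | { kind: 'delete'; id: string }
  | { kind: 'upsert'; id: string; document: FirestoreData };

// Hypothetical internal-only fields; the real list is not documented here.
const INTERNAL_FIELDS = new Set(['_internalNotes', 'auditTrail']);

function isTimestampLike(value: unknown): value is TimestampLike {
  return typeof value === 'object' && value !== null &&
    typeof (value as { toMillis?: unknown }).toMillis === 'function';
}

// Builds the flat search document from top-level fields.
function toSearchDocument(id: string, data: FirestoreData): FirestoreData {
  const doc: FirestoreData = { id };
  for (const [key, value] of Object.entries(data)) {
    if (INTERNAL_FIELDS.has(key)) continue; // strip internal-only fields
    doc[key] = isTimestampLike(value) ? value.toMillis() : value;
  }
  return doc;
}

// Decides what the trigger should do for a single write event.
function planSync(
  id: string,
  after: FirestoreData | undefined, // undefined means the document was deleted
  typesenseConfigured: boolean,
): SyncAction {
  if (!typesenseConfigured) return { kind: 'skip' }; // guard: exit silently
  if (after === undefined) return { kind: 'delete', id };
  return { kind: 'upsert', id, document: toSearchDocument(id, after) };
}
```

Separating the decision from the Typesense call means the transformation rules can be unit-tested without deploying a Cloud Function.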
Manual Reindex
The reindexCollection callable function lets admins trigger a full reindex of a specific collection for a given organization. It requires the admin custom claim on the caller’s auth token.
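```typescript
// Client-side invocation (assumes the default Firebase app is already initialized)
import { getFunctions, httpsCallable } from 'firebase/functions';

const functions = getFunctions();
const reindex = httpsCallable(functions, 'reindexCollection');
await reindex({ collection: 'gm_projects', organizationId: 'org_abc' });
```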
Frontend Search Service
Client Initialization
The frontend Typesense client (src/shared/platform/typesenseSearchService.ts) is a lazy-initialized singleton configured via environment variables:
| Variable | Purpose |
|---|---|
| VITE_TYPESENSE_HOST | Typesense Cloud node hostname |
| VITE_TYPESENSE_SEARCH_API_KEY | Read-only Search API key (no write access) |
| VITE_TYPESENSE_PORT | Port (default 443) |
| VITE_TYPESENSE_PROTOCOL | Protocol (default https) |
| VITE_TYPESENSE_USE_CLOUD_FUNCTIONS | If true, route searches through callable searchTypesenseSecure (recommended for production) |
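A minimal sketch of how these variables could be read with the documented defaults and cached lazily. The function and type names (readTypesenseConfig, getTypesenseConfig, TypesenseClientConfig) are assumptions, not the actual exports of typesenseSearchService.ts; the environment is passed in as a plain record so the logic works with import.meta.env or a test stub.

```typescript
interface TypesenseClientConfig {
  host: string;
  apiKey: string;
  port: number;
  protocol: 'http' | 'https';
  useCloudFunctions: boolean;
}

type EnvRecord = Record<string, string | undefined>;

function readTypesenseConfig(env: EnvRecord): TypesenseClientConfig | null {
  const host = env.VITE_TYPESENSE_HOST;
  const apiKey = env.VITE_TYPESENSE_SEARCH_API_KEY;
  if (!host || !apiKey) return null; // not configured: caller should fall back

  const port = Number(env.VITE_TYPESENSE_PORT ?? '443');
  return {
    host,
    apiKey,
    port: Number.isInteger(port) && port > 0 ? port : 443, // default 443
    protocol: env.VITE_TYPESENSE_PROTOCOL === 'http' ? 'http' : 'https', // default https
    useCloudFunctions: env.VITE_TYPESENSE_USE_CLOUD_FUNCTIONS === 'true',
  };
}

// Lazy singleton: the environment is parsed on first use only.
let cachedConfig: TypesenseClientConfig | null | undefined;

function getTypesenseConfig(env: EnvRecord): TypesenseClientConfig | null {
  if (cachedConfig === undefined) cachedConfig = readTypesenseConfig(env);
  return cachedConfig;
}
```

Returning null instead of throwing lets an orchestrator treat a missing configuration as a signal to use the local fallback.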
Multi-Collection Search
searchTypesense() builds a multiSearch request that fans out across all permitted collections in a single round-trip. Each sub-request includes:
- query_by: collection-specific text fields.
- filter_by: always starts with organizationId:={orgId} (mandatory tenant isolation), then appends optional status/project/user filters.
- per_page: configurable limit per collection (default 10).
- highlight_full_fields: fields highlighted in full, rather than as truncated snippets, for UI display.
When VITE_TYPESENSE_USE_CLOUD_FUNCTIONS=true, the frontend does not query Typesense directly. It calls the callable Cloud Function searchTypesenseSecure, which executes multi-search server-side and returns only mapped result payloads.
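The sub-request fields described above could be assembled as follows. This is a sketch under assumptions: buildFilterBy and buildSubRequests are hypothetical helpers, only two collections from the Indexed Collections table are included, the filter field names (status, projectId, userId) are assumed, and filter values are restricted to simple tokens rather than escaped.

```typescript
// Query fields for two collections, copied from the Indexed Collections table.
const QUERY_FIELDS: Record<string, string[]> = {
  gm_projects: ['name', 'description', 'funder', 'grantNumber', 'projectManager'],
  gm_contacts: ['name', 'email', 'company', 'notes'],
};

interface SearchFilters { status?: string; projectId?: string; userId?: string }

interface SubRequest {
  collection: string;
  q: string;
  query_by: string;
  filter_by: string;
  per_page: number;
  highlight_full_fields: string;
}

// Assumption: IDs and statuses are simple tokens; anything else is rejected.
const SAFE_VALUE = /^[A-Za-z0-9_-]+$/;

function buildFilterBy(orgId: string, filters: SearchFilters = {}): string {
  const clauses: Array<[string, string | undefined]> = [
    ['organizationId', orgId], // tenant isolation always comes first
    ['status', filters.status],
    ['projectId', filters.projectId],
    ['userId', filters.userId],
  ];
  return clauses
    .filter((c): c is [string, string] => c[1] !== undefined)
    .map(([field, value]) => {
      if (!SAFE_VALUE.test(value)) throw new Error(`Unsafe filter value for ${field}`);
      return `${field}:=${value}`;
    })
    .join(' && ');
}

function buildSubRequests(
  q: string,
  orgId: string,
  collections: string[],
  filters?: SearchFilters,
  perPage = 10,
): SubRequest[] {
  const filterBy = buildFilterBy(orgId, filters);
  return collections.map((collection) => {
    const fields = (QUERY_FIELDS[collection] ?? []).join(',');
    return {
      collection,
      q,
      query_by: fields,
      filter_by: filterBy,
      per_page: perPage,
      highlight_full_fields: fields,
    };
  });
}
```

Rejecting values that contain operators or whitespace is one conservative way to keep user input from altering the filter expression; a production implementation might escape values instead.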
RBAC Post-Filtering
After Typesense returns results, applyRBACFilter() removes items the current user lacks permission to view:
- Collection-level: Each collection maps to one or more Permission enums. If the user lacks the required permission, the entire collection’s results are dropped.
- Ownership-level: For journals and expenses, non-approvers only see their own records (filtered by userId).
- Pre-optimization: Collections the user cannot access are excluded from the multiSearch request entirely, reducing payload and latency.
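The three rules above might look like this in code. The permission names, the mapping tables, and the "any one permission is enough" rule are all assumptions made for illustration; the real Permission enum and applyRBACFilter() are not shown in this document.

```typescript
// Hypothetical permission names; the real Permission enum is not shown here.
type PermissionSketch =
  | 'projects.read'
  | 'expenses.read' | 'expenses.approve'
  | 'journals.read' | 'journals.approve';

interface RbacUser { id: string; permissions: PermissionSketch[] }
interface RbacHit { collection: string; id: string; userId?: string }

// Assumption: holding any one of the listed permissions grants access.
const REQUIRED_PERMISSIONS: Record<string, PermissionSketch[]> = {
  gm_projects: ['projects.read'],
  gm_expenses: ['expenses.read', 'expenses.approve'],
  gm_journals: ['journals.read', 'journals.approve'],
};

// Collections where non-approvers only see their own records.
const APPROVER_PERMISSION: Record<string, PermissionSketch> = {
  gm_expenses: 'expenses.approve',
  gm_journals: 'journals.approve',
};

function canAccessCollection(user: RbacUser, collection: string): boolean {
  // Unknown collections have no entry and are denied by default.
  const required = REQUIRED_PERMISSIONS[collection] ?? [];
  return required.some((p) => user.permissions.includes(p));
}

// Pre-optimization: only request collections the user may see at all.
function permittedCollections(user: RbacUser, collections: string[]): string[] {
  return collections.filter((c) => canAccessCollection(user, c));
}

// Post-filter: collection-level check first, then the ownership rule.
function applyRBACFilterSketch(user: RbacUser, hits: RbacHit[]): RbacHit[] {
  return hits.filter((hit) => {
    if (!canAccessCollection(user, hit.collection)) return false;
    const approver = APPROVER_PERMISSION[hit.collection];
    if (approver === undefined || user.permissions.includes(approver)) return true;
    return hit.userId === user.id;
  });
}
```

Running the post-filter even when the pre-optimization has already narrowed the request gives a second line of defense if the two mappings ever drift apart.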
Each Typesense hit is transformed into a unified SearchResult object that includes:
- title, description, subtitle: human-readable display fields.
- navigationTarget: deep-link path (e.g., /projects/{id}, /expenses?id={id}).
- icon: Lucide icon name for UI rendering.
- score: Typesense text_match score for relevance ranking.
- matchedFields: fields that contributed to the match (for highlight display).
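As an illustration of that transformation, the sketch below maps a simplified hit onto the listed fields. HitLike models only the parts of a Typesense hit used here; the PRESENTATION table, the icon choices, and the subtitle strings are hypothetical, and only two collections are covered.

```typescript
// Minimal stand-in for the parts of a Typesense hit used here.
interface HitLike {
  document: { id: string; [field: string]: unknown };
  text_match: number;
  highlights?: Array<{ field: string }>;
}

interface SearchResultSketch {
  title: string;
  description: string;
  subtitle: string;
  navigationTarget: string;
  icon: string;
  score: number;
  matchedFields: string[];
}

interface Presentation {
  icon: string; // Lucide icon name (illustrative choices)
  titleField: string;
  subtitle: string;
  target: (id: string) => string;
}

const PRESENTATION: Record<string, Presentation> = {
  gm_projects: {
    icon: 'folder', titleField: 'name', subtitle: 'Project',
    target: (id) => `/projects/${encodeURIComponent(id)}`,
  },
  gm_expenses: {
    icon: 'receipt', titleField: 'description', subtitle: 'Expense',
    target: (id) => `/expenses?id=${encodeURIComponent(id)}`,
  },
};

function asText(value: unknown): string {
  return typeof value === 'string' ? value : '';
}

function toSearchResult(collection: string, hit: HitLike): SearchResultSketch | null {
  const view = PRESENTATION[collection];
  if (!view) return null; // unknown collection: nothing to render
  return {
    title: asText(hit.document[view.titleField]),
    description: asText(hit.document.description),
    subtitle: view.subtitle,
    navigationTarget: view.target(hit.document.id),
    icon: view.icon,
    score: hit.text_match,
    matchedFields: (hit.highlights ?? []).map((h) => h.field),
  };
}
```

Keeping the per-collection presentation in one table means adding a collection to the UI is a single entry rather than a new branch.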
Vector Search (AI Discovery)
For advanced semantic matching (e.g., “Find grants related to rural healthcare for seniors”), the platform uses Vector Embeddings via the RAG service (src/features/ai/services/ragService.ts). This is a separate system from the Typesense full-text search:
- Embedding: Text content is converted into vector arrays using Google Gemini models.
- Similarity search: Cosine similarity queries run against the vectorized grant listings.
- Hybrid scoring: Semantic results are combined with traditional filters (location, dollar amount) to produce a final Match Score.
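A minimal sketch of the similarity and hybrid-scoring steps, assuming embeddings are already available as number arrays. How ragService.ts actually combines filters with similarity is not described here, so this version simply applies the filters first and then ranks by cosine similarity; GrantListing, GrantFilters, and rankGrants are hypothetical names.

```typescript
interface GrantListing {
  id: string;
  location: string;
  amount: number;
  embedding: number[];
}

interface GrantFilters { location?: string; minAmount?: number; maxAmount?: number }

function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length || a.length === 0) {
    throw new Error('Vectors must share a non-zero length');
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // zero vector: treat as no similarity
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Assumption: filters are hard constraints and the Match Score is the raw similarity.
function rankGrants(
  queryEmbedding: number[],
  listings: GrantListing[],
  filters: GrantFilters = {},
): Array<{ id: string; matchScore: number }> {
  const min = filters.minAmount ?? -Infinity;
  const max = filters.maxAmount ?? Infinity;
  return listings
    .filter((g) => filters.location === undefined || g.location === filters.location)
    .filter((g) => g.amount >= min && g.amount <= max)
    .map((g) => ({ id: g.id, matchScore: cosineSimilarity(queryEmbedding, g.embedding) }))
    .sort((x, y) => y.matchScore - x.matchScore);
}
```

Filtering before scoring avoids computing similarity for listings that could never be shown, which matters once the candidate set is large.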
Security and Tenancy
Search indices are strictly partitioned by organizationId.
Server-Side (Cloud Functions)
- Every indexing trigger validates that organizationId exists before upserting. Documents without an organizationId are skipped and logged as warnings.
- The Typesense admin API key (used for writes) is stored as a Firebase Secret (TYPESENSE_API_KEY), never exposed to the client.
Client-Side (Frontend)
- The frontend uses a read-only Search API key (VITE_TYPESENSE_SEARCH_API_KEY) that cannot modify indices.
- Every search query programmatically injects organizationId:=${orgId} into filter_by; this is enforced in code, not via Typesense scoped keys.
- Data masking: Search results contain only the metadata needed for display. Full records are always fetched from Firestore when the user navigates to a detail view.
Callable Search Proxy (Recommended)
searchTypesenseSecure (Cloud Functions callable):
- Verifies authentication.
- Resolves the caller’s tenant context from the people collection.
- Rejects cross-tenant search unless the caller is superadmin.
- Applies mandatory organizationId filters server-side before querying Typesense.
- Supports phased rollout with a frontend feature flag:
  - Enable: VITE_TYPESENSE_USE_CLOUD_FUNCTIONS=true
  - Roll back: set VITE_TYPESENSE_USE_CLOUD_FUNCTIONS=false
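The authentication and tenant checks listed above can be sketched as a single pure function. CallerContext, SearchAccessError, and resolveSearchOrg are hypothetical names, and the role values other than superadmin are invented; the actual searchTypesenseSecure implementation may differ.

```typescript
interface CallerContext {
  uid: string;
  organizationId: string; // resolved server-side, e.g. from the people collection
  role: 'member' | 'admin' | 'superadmin';
}

type AccessErrorCode = 'unauthenticated' | 'permission-denied';

class SearchAccessError extends Error {
  readonly code: AccessErrorCode;

  constructor(code: AccessErrorCode, message: string) {
    super(message);
    this.code = code;
  }
}

// Returns the organization ID that must be injected into filter_by.
function resolveSearchOrg(caller: CallerContext | null, requestedOrgId?: string): string {
  if (!caller) {
    throw new SearchAccessError('unauthenticated', 'Sign-in required');
  }
  const target = requestedOrgId ?? caller.organizationId;
  if (target !== caller.organizationId && caller.role !== 'superadmin') {
    throw new SearchAccessError('permission-denied', 'Cross-tenant search is not allowed');
  }
  return target;
}
```

Because the organization ID comes from the server-resolved caller context rather than the request body, a tampered client cannot widen its own search scope.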
Environment Configuration
Frontend (src/.env.local)
```
VITE_TYPESENSE_HOST=<cluster>.a2.typesense.net
VITE_TYPESENSE_SEARCH_API_KEY=<search-only-key>
VITE_TYPESENSE_PORT=443
VITE_TYPESENSE_PROTOCOL=https
VITE_TYPESENSE_ENABLED=true
VITE_TYPESENSE_USE_CLOUD_FUNCTIONS=false
```
Cloud Functions (functions/.env)
```
TYPESENSE_API_KEY=<admin-key> # Write access; use Firebase Secrets in production
TYPESENSE_HOST=<cluster>.a2.typesense.net
TYPESENSE_PORT=443
TYPESENSE_PROTOCOL=https
```
Rollout Playbook (Secure Search Mode)
- Deploy Functions containing searchTypesenseSecure.
- Keep VITE_TYPESENSE_USE_CLOUD_FUNCTIONS=false and verify no regressions.
- Enable VITE_TYPESENSE_USE_CLOUD_FUNCTIONS=true in the staging frontend.
- Validate command palette search:
  - works for a normal user in their own organization
  - rejects cross-tenant access
  - logs show a non-zero searchTime and the expected collections
- Promote the env flag to production.
- If issues occur, immediately roll back by setting VITE_TYPESENSE_USE_CLOUD_FUNCTIONS=false.
Search Observability and Operations
Metrics Counters
Search telemetry is aggregated into searchMetricsDaily/{YYYY-MM-DD} with per-day counters:
- typesense_success
- secure_callable_error
- fuse_fallback_count

The frontend reports telemetry through the callable reportSearchTelemetry; the secure callable search also records success and error metrics server-side.
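The per-day counter layout can be illustrated with an in-memory stand-in for the searchMetricsDaily documents. The class and function names are hypothetical, and deriving the day key in UTC is an assumption; the deployed functions may use a different timezone and would write to Firestore rather than a Map.

```typescript
type SearchCounter = 'typesense_success' | 'secure_callable_error' | 'fuse_fallback_count';
type DailyCounters = Record<SearchCounter, number>;

// Assumption: the YYYY-MM-DD key is derived in UTC.
function dailyMetricsPath(date: Date): string {
  return `searchMetricsDaily/${date.toISOString().slice(0, 10)}`;
}

// In-memory stand-in for the per-day counter documents.
class InMemorySearchMetrics {
  private readonly days = new Map<string, DailyCounters>();

  increment(counter: SearchCounter, date: Date = new Date(), by = 1): void {
    const path = dailyMetricsPath(date);
    const current = this.days.get(path) ?? {
      typesense_success: 0,
      secure_callable_error: 0,
      fuse_fallback_count: 0,
    };
    current[counter] += by;
    this.days.set(path, current);
  }

  read(date: Date): DailyCounters | undefined {
    return this.days.get(dailyMetricsPath(date));
  }
}
```

One document per day keeps reads cheap for dashboards, since a date range maps directly onto a small set of document IDs.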
No-Result Query Capture
Queries that return no results are stored in searchNoResultQueries (aggregated daily) and feed the nightly search tuning job.
Nightly Jobs
Two scheduled Cloud Functions run daily (Europe/Amsterdam timezone):
- nightlyTuneSearchFromNoResults (02:00): derives synonym updates and typo-tolerance configuration from the top no-result queries.
- nightlyReconcileSearchIndex (02:30): compares Firestore and Typesense collection counts and writes drift reports to searchReconciliationReports.
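The count comparison performed by the reconciliation job could be expressed roughly as below. DriftEntry and buildDriftReport are hypothetical names, both inputs are assumed to be keyed by Typesense collection name, and the real report format in searchReconciliationReports is not documented here.

```typescript
interface DriftEntry {
  collection: string;
  firestoreCount: number;
  typesenseCount: number;
  drift: number; // positive: documents missing from Typesense; negative: stale entries
}

// Both inputs are keyed by Typesense collection name (e.g. gm_projects).
function buildDriftReport(
  firestoreCounts: Record<string, number>,
  typesenseCounts: Record<string, number>,
): DriftEntry[] {
  return Object.keys(firestoreCounts)
    .sort()
    .map((collection) => {
      const firestoreCount = firestoreCounts[collection] ?? 0;
      const typesenseCount = typesenseCounts[collection] ?? 0;
      return {
        collection,
        firestoreCount,
        typesenseCount,
        drift: firestoreCount - typesenseCount,
      };
    })
    .filter((entry) => entry.drift !== 0); // only report collections that diverge
}
```

A count comparison is cheap but coarse: equal counts do not prove the two stores hold the same documents, so a non-empty report is best read as a prompt to run reindexCollection for the affected collection.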
Key Source Files
| File | Purpose |
|---|---|
| functions/src/search/typesenseSchema.ts | Collection schemas and field definitions |
| functions/src/search/typesenseClient.ts | Server-side Typesense client (upsert, delete, search, health check) |
| functions/src/search/indexingTriggers.ts | Firestore onDocumentWritten triggers for all 7 collections |
| functions/src/search/searchCallable.ts | Callable secure search endpoint (searchTypesenseSecure) |
| functions/src/search/index.ts | Barrel export for the search module |
| src/shared/platform/typesenseSearchService.ts | Frontend Typesense client (multi-search, RBAC, result transformation) |
| src/shared/platform/searchService.ts | Hybrid search orchestrator (Typesense + Fuse.js fallback) |