Search and Indexing Strategy

This document details how GrantMaster implements high-performance search across structured data (tenants, users) and unstructured content (grant listings, documents).

Hybrid Search Architecture

We use a three-tier strategy to balance speed, consistency, and offline resilience:

1. Firestore Native Search (Point Queries)

Used for simple, exact-match filtering (e.g., “Find user by email”, “List all grants for Tenant A”).
  • Pros: Real-time consistency, no extra cost.
  • Cons: No support for partial or typo-tolerant (fuzzy) matching, or for complex “OR” queries.

2. Typesense (Full-Text Search — Primary)

Typesense Cloud is the primary full-text search engine (typesense v3.0.1). It powers the command palette global search and cross-entity queries.
  • Multi-collection search: A single query fans out across all indexed collections via client.multiSearch.perform().
  • Fuzzy matching: Supports typo-tolerant search on configurable query fields per collection.
  • Faceted filters: Status, project, user, date-range filters are applied server-side via Typesense filter_by clauses.
  • Hosted on Typesense Cloud (*.a2.typesense.net); no self-hosted infra required.

3. Fuse.js (Client-Side Fallback)

When Typesense is unreachable (offline, misconfigured), the platform degrades to in-memory fuzzy search via Fuse.js. The hybridSearch() function in searchService.ts orchestrates the fallback:

| Mode | Behavior |
| --- | --- |
| server | Always queries Typesense; throws on failure. |
| local | Always uses Fuse.js against a pre-built in-memory index. |
| auto | Tries Typesense when online and configured; falls back to Fuse.js on error. |
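
The three modes can be sketched as a small orchestrator. This is an illustration only, not the contents of searchService.ts: the names `hybridSearchSketch`, `pickBackend`, and the `HybridDeps` shape are hypothetical, and the two searchers are injected so the example stays self-contained.

```typescript
type SearchMode = 'server' | 'local' | 'auto';
type Searcher<T> = (query: string) => Promise<T[]>;

// Hypothetical dependency bundle; the real service may wire these differently.
interface HybridDeps<T> {
  searchServer: Searcher<T>; // stand-in for the Typesense call
  searchLocal: Searcher<T>;  // stand-in for the Fuse.js in-memory index
  isOnline: () => boolean;
  isConfigured: () => boolean;
}

// Decide which backend to try first for a given mode.
function pickBackend(mode: SearchMode, online: boolean, configured: boolean): 'server' | 'local' {
  if (mode === 'server') return 'server';
  if (mode === 'local') return 'local';
  return online && configured ? 'server' : 'local';
}

async function hybridSearchSketch<T>(
  query: string,
  mode: SearchMode,
  deps: HybridDeps<T>,
): Promise<T[]> {
  const backend = pickBackend(mode, deps.isOnline(), deps.isConfigured());
  if (backend === 'local') return deps.searchLocal(query);
  try {
    return await deps.searchServer(query);
  } catch (err) {
    // 'server' mode surfaces the failure; 'auto' degrades to the local index.
    if (mode === 'server') throw err;
    return deps.searchLocal(query);
  }
}
```

Keeping the backend decision in a pure function makes the mode table easy to unit test without a network.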

Indexed Collections

Seven Firestore collections are synced to Typesense. Schemas are defined in functions/src/search/typesenseSchema.ts and collection names use the gm_ prefix.

| Typesense Collection | Firestore Source | Query Fields | Default Sort |
| --- | --- | --- | --- |
| gm_projects | projects | name, description, funder, grantNumber, projectManager | createdAt |
| gm_employees | employees | name, email, jobTitle, department | createdAt |
| gm_expenses | expenses | description, vendor, category, employeeName, projectName | createdAt |
| gm_journals | journals | activityDescription, activityType, employeeName, projectName | date |
| gm_documents | documents | filename, title, content, projectName | createdAt |
| gm_contacts | contacts | name, email, company, notes | createdAt |
| gm_compliance_rules | complianceRules | name, description, category, projectName | createdAt |

Each schema includes faceted fields for organizationId, status, projectId, and domain-specific dimensions (e.g., funder, category, severity). See typesenseSchema.ts for the full field definitions.
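
As a rough illustration of what one of these schemas might look like, the sketch below models gm_projects in the general shape of a Typesense collection schema (name, fields, default_sorting_field). The field types, the facet and optional flags, and the `sortFieldIsValid` helper are assumptions made for this example; typesenseSchema.ts remains the authoritative definition.

```typescript
// Local types so the example runs without the typesense package.
interface FieldDef {
  name: string;
  type: 'string' | 'int64';
  facet?: boolean;
  optional?: boolean;
}

interface CollectionSchema {
  name: string;
  fields: FieldDef[];
  default_sorting_field: string;
}

// Assumed shape for gm_projects; types and flags are illustrative guesses.
const projectsSchema: CollectionSchema = {
  name: 'gm_projects',
  fields: [
    { name: 'name', type: 'string' },
    { name: 'description', type: 'string', optional: true },
    { name: 'funder', type: 'string', facet: true, optional: true },
    { name: 'grantNumber', type: 'string', optional: true },
    { name: 'projectManager', type: 'string', optional: true },
    { name: 'organizationId', type: 'string', facet: true },
    { name: 'status', type: 'string', facet: true, optional: true },
    { name: 'createdAt', type: 'int64' }, // Unix milliseconds
  ],
  default_sorting_field: 'createdAt',
};

// Typesense expects the default sorting field to be a required numeric field.
function sortFieldIsValid(schema: CollectionSchema): boolean {
  const field = schema.fields.find((f) => f.name === schema.default_sorting_field);
  return field !== undefined && field.type === 'int64' && field.optional !== true;
}
```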

Data Synchronization

Firestore-to-Typesense Triggers

Every indexed Firestore collection has an onDocumentWritten Cloud Function trigger defined in functions/src/search/indexingTriggers.ts:

| Cloud Function | Firestore Path | Typesense Collection |
| --- | --- | --- |
| searchIndexProject | projects/{projectId} | gm_projects |
| searchIndexEmployee | employees/{employeeId} | gm_employees |
| searchIndexExpense | expenses/{expenseId} | gm_expenses |
| searchIndexJournal | journals/{journalId} | gm_journals |
| searchIndexDocument | documents/{documentId} | gm_documents |
| searchIndexContact | contacts/{contactId} | gm_contacts |
| searchIndexComplianceRule | complianceRules/{ruleId} | gm_compliance_rules |

Sync Lifecycle

  1. Trigger: onDocumentWritten fires on any create, update, or delete in the source collection.
  2. Guard: If Typesense is not configured (isTypesenseConfigured() returns false), the trigger exits silently — the system degrades gracefully.
  3. Schema Assurance: ensureCollection() creates the Typesense collection on first write (idempotent).
  4. Transformation: The Cloud Function extracts a flat search document, converting Firestore Timestamp values to Unix milliseconds and stripping internal-only fields.
  5. Upsert / Delete: upsertDocument() or deleteDocument() is called on the Typesense client.
  6. Error Logging: Failures are logged via firebase-functions/v2 structured logger with document ID and collection name for observability.
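
The transformation and upsert/delete decision in the lifecycle above can be sketched without the Firebase SDK. Everything here is a hypothetical stand-in: `planSync`, `toSearchDocument`, and the `INTERNAL_FIELDS` list are invented for illustration, and the timestamp is duck-typed on `toMillis()`, which Firestore Timestamp objects expose. Only top-level fields are handled.

```typescript
// Duck-typed stand-in for a Firestore Timestamp.
interface TimestampLike {
  toMillis(): number;
}

type FirestoreData = Record<string, unknown>;

type SyncAction =
  | { kind: 'delete'; id: string }
  | { kind: 'upsert'; id: string; document: FirestoreData }
  | { kind: 'skip'; reason: string };

// Hypothetical internal-only field names; the real list is not documented here.
const INTERNAL_FIELDS = new Set<string>(['internalNotes', 'auditTrail']);

function isTimestampLike(value: unknown): value is TimestampLike {
  return (
    typeof value === 'object' &&
    value !== null &&
    typeof (value as TimestampLike).toMillis === 'function'
  );
}

// Copy top-level fields, dropping internal ones and converting timestamps to Unix ms.
function toSearchDocument(id: string, data: FirestoreData): FirestoreData {
  const doc: FirestoreData = { id };
  for (const [key, value] of Object.entries(data)) {
    if (INTERNAL_FIELDS.has(key)) continue;
    doc[key] = isTimestampLike(value) ? value.toMillis() : value;
  }
  return doc;
}

// `after` is undefined when the Firestore document was deleted.
function planSync(id: string, after: FirestoreData | undefined): SyncAction {
  if (after === undefined) return { kind: 'delete', id };
  const orgId = after['organizationId'];
  if (typeof orgId !== 'string' || orgId === '') {
    return { kind: 'skip', reason: 'missing organizationId' };
  }
  return { kind: 'upsert', id, document: toSearchDocument(id, after) };
}
```

Returning a plain action object keeps the trigger body thin: it only has to execute the plan and log failures.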

Manual Reindex

The reindexCollection callable function allows admins to trigger a full reindex of a specific collection for a given organization. Requires the admin custom claim on the caller’s auth token.
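// Client-side invocation (the caller needs the admin custom claim)
import { getFunctions, httpsCallable } from 'firebase/functions';

const functions = getFunctions();
const reindex = httpsCallable(functions, 'reindexCollection');
await reindex({ collection: 'gm_projects', organizationId: 'org_abc' });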

Frontend Search Service

Client Initialization

The frontend Typesense client (src/shared/platform/typesenseSearchService.ts) is a lazy-initialized singleton configured via environment variables:

| Variable | Purpose |
| --- | --- |
| VITE_TYPESENSE_HOST | Typesense Cloud node hostname |
| VITE_TYPESENSE_SEARCH_API_KEY | Read-only Search API key (no write access) |
| VITE_TYPESENSE_PORT | Port (default 443) |
| VITE_TYPESENSE_PROTOCOL | Protocol (default https) |
| VITE_TYPESENSE_USE_CLOUD_FUNCTIONS | If true, route searches through callable searchTypesenseSecure (recommended for production) |

searchTypesense() builds a multiSearch request that fans out across all permitted collections in a single round-trip. Each sub-request includes:
  • query_by — collection-specific text fields.
  • filter_by — always starts with organizationId:={orgId} (mandatory tenant isolation), then appends optional status/project/user filters.
  • per_page — configurable limit per collection (default 10).
  • highlight_full_fields — enables snippet highlighting for UI display.
When VITE_TYPESENSE_USE_CLOUD_FUNCTIONS=true, the frontend does not query Typesense directly. It calls the callable Cloud Function searchTypesenseSecure, which executes multi-search server-side and returns only mapped result payloads.
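
A minimal sketch of how such a multi-search payload might be assembled is shown below. The helper names (`buildFilterBy`, `buildSearches`), the `SearchFilters` shape, and the filter field names `status`, `projectId`, and `userId` are assumptions for illustration; filter values are assumed to be plain IDs with no characters that need escaping, and each targeted collection is assumed to define every filtered field.

```typescript
interface SearchFilters {
  status?: string;
  projectId?: string;
  userId?: string;
}

interface SubSearch {
  collection: string;
  q: string;
  query_by: string;
  filter_by: string;
  per_page: number;
  highlight_full_fields: string;
}

// Query fields copied from the Indexed Collections table (two shown for brevity).
const QUERY_FIELDS: Record<string, string[]> = {
  gm_projects: ['name', 'description', 'funder', 'grantNumber', 'projectManager'],
  gm_contacts: ['name', 'email', 'company', 'notes'],
};

// The tenant clause always comes first; optional filters are appended with &&.
function buildFilterBy(orgId: string, filters: SearchFilters = {}): string {
  const clauses = [`organizationId:=${orgId}`];
  if (filters.status) clauses.push(`status:=${filters.status}`);
  if (filters.projectId) clauses.push(`projectId:=${filters.projectId}`);
  if (filters.userId) clauses.push(`userId:=${filters.userId}`);
  return clauses.join(' && ');
}

function buildSearches(
  q: string,
  orgId: string,
  collections: string[],
  filters: SearchFilters = {},
  perPage = 10,
): { searches: SubSearch[] } {
  const filterBy = buildFilterBy(orgId, filters);
  const searches = collections
    .filter((collection) => collection in QUERY_FIELDS) // ignore unknown collections
    .map((collection) => {
      const fields = QUERY_FIELDS[collection].join(',');
      return {
        collection,
        q,
        query_by: fields,
        filter_by: filterBy,
        per_page: perPage,
        highlight_full_fields: fields,
      };
    });
  return { searches };
}
```

The returned object has the shape one would pass to client.multiSearch.perform(); the call itself is omitted so the example runs without a Typesense client.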

RBAC Post-Filtering

After Typesense returns results, applyRBACFilter() removes items the current user lacks permission to view:
  • Collection-level: Each collection maps to one or more Permission enums. If the user lacks the required permission, the entire collection’s results are dropped.
  • Ownership-level: For journals and expenses, non-approvers only see their own records (filtered by userId).
  • Pre-optimization: Collections the user cannot access are excluded from the multiSearch request entirely, reducing payload and latency.
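
The three rules above could be combined roughly as follows. This is a sketch under stated assumptions, not the real applyRBACFilter(): the permission names, the `REQUIRED` and `APPROVER` maps, and the choice that any one of a collection's permissions is sufficient are all hypothetical.

```typescript
// Hypothetical permission names; the real Permission enum is not shown in this document.
type Permission =
  | 'projects.view'
  | 'expenses.view'
  | 'expenses.approve'
  | 'journals.view'
  | 'journals.approve';

interface Hit {
  collection: string;
  id: string;
  userId?: string;
}

interface User {
  id: string;
  permissions: Set<Permission>;
}

// Collection-level rule: holding any listed permission grants access (assumption).
const REQUIRED: Record<string, Permission[]> = {
  gm_projects: ['projects.view'],
  gm_expenses: ['expenses.view', 'expenses.approve'],
  gm_journals: ['journals.view', 'journals.approve'],
};

// Ownership-level rule: without this permission, users see only their own records.
const APPROVER: Partial<Record<string, Permission>> = {
  gm_expenses: 'expenses.approve',
  gm_journals: 'journals.approve',
};

// Pre-optimization: only these collections need to be queried at all.
function permittedCollections(user: User, collections: string[]): string[] {
  return collections.filter((c) => (REQUIRED[c] ?? []).some((p) => user.permissions.has(p)));
}

function applyRBACFilterSketch(hits: Hit[], user: User): Hit[] {
  const allowed = new Set(permittedCollections(user, Object.keys(REQUIRED)));
  return hits.filter((hit) => {
    if (!allowed.has(hit.collection)) return false;
    const approverPermission = APPROVER[hit.collection];
    if (approverPermission === undefined) return true; // no ownership rule applies
    return user.permissions.has(approverPermission) || hit.userId === user.id;
  });
}
```

Collections missing from the map are denied by default, which fails closed if a new collection is indexed before its permissions are defined.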

Result Transformation

Each Typesense hit is transformed into a unified SearchResult object that includes:
  • title, description, subtitle — human-readable display fields.
  • navigationTarget — deep-link path (e.g., /projects/{id}, /expenses?id={id}).
  • icon — Lucide icon name for UI rendering.
  • score — Typesense text_match score for relevance ranking.
  • matchedFields — fields that contributed to the match (for highlight display).
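
For a concrete picture, the sketch below maps a gm_projects hit to the unified shape. The `TypesenseHit` type covers only the parts of a hit used here (document, text_match, highlights), and the mapper name, the fallback title, and the 'folder' icon are hypothetical choices rather than the service's actual ones.

```typescript
// Subset of a Typesense hit used by this example.
interface TypesenseHit {
  document: Record<string, unknown>;
  text_match: number;
  highlights?: { field: string; snippet?: string }[];
}

interface SearchResult {
  id: string;
  title: string;
  description: string;
  subtitle: string;
  navigationTarget: string;
  icon: string;
  score: number;
  matchedFields: string[];
}

// Read a string field defensively; indexed documents may omit optional fields.
function str(value: unknown, fallback = ''): string {
  return typeof value === 'string' ? value : fallback;
}

function projectHitToResult(hit: TypesenseHit): SearchResult {
  const doc = hit.document;
  const id = str(doc['id']);
  return {
    id,
    title: str(doc['name'], '(untitled)'),
    description: str(doc['description']),
    subtitle: str(doc['funder']),
    navigationTarget: `/projects/${id}`,
    icon: 'folder', // assumed Lucide icon name for projects
    score: hit.text_match,
    matchedFields: (hit.highlights ?? []).map((h) => h.field),
  };
}
```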

Vector Search (AI Discovery)

For advanced semantic matching (e.g., “Find grants related to rural healthcare for seniors”), the platform uses Vector Embeddings via the RAG service (src/features/ai/services/ragService.ts). This is a separate system from the Typesense full-text search:
  • Embedding: Text content is converted into vector arrays using Google Gemini models.
  • Similarity search: Cosine similarity queries run against the vectorized grant listings.
  • Hybrid scoring: Semantic results are combined with traditional filters (location, dollar amount) to produce a final Match Score.
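
The similarity and scoring steps can be illustrated with plain arrays. Cosine similarity is standard; the `matchScore` blend, its 0.7 weight, and the idea of a normalized `filterScore` in [0, 1] are assumptions for this sketch, since the document does not specify how ragService.ts combines the signals.

```typescript
// Cosine similarity: dot(a, b) / (|a| * |b|).
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length === 0 || a.length !== b.length) {
    throw new Error('vectors must be non-empty and of equal length');
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // treat zero vectors as having no similarity
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Hypothetical blend of semantic similarity with a filter score in [0, 1].
function matchScore(similarity: number, filterScore: number, semanticWeight = 0.7): number {
  const semantic = Math.max(0, similarity); // negative similarity contributes nothing
  return semanticWeight * semantic + (1 - semanticWeight) * filterScore;
}
```

For example, a listing with similarity 0.8 that satisfies half of the traditional filters would score roughly 0.71 under these assumed weights.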

Security and Tenancy

Search indices are strictly partitioned by organizationId.

Server-Side (Cloud Functions)

  • Every indexing trigger validates that organizationId exists before upserting. Documents without an organizationId are skipped and logged as warnings.
  • The Typesense admin API key (used for writes) is stored as a Firebase Secret (TYPESENSE_API_KEY), never exposed to the client.

Client-Side (Frontend)

  • The frontend uses a read-only Search API key (VITE_TYPESENSE_SEARCH_API_KEY) that cannot modify indices.
  • Every search query programmatically injects organizationId:=${orgId} into filter_by — this is enforced in code, not via Typesense scoped keys.
  • Data masking: Search results contain only the metadata needed for display. Full records are always fetched from Firestore when the user navigates to a detail view.

searchTypesenseSecure (Cloud Functions callable):
  • Verifies authentication.
  • Resolves the caller's tenant context from the people collection.
  • Rejects cross-tenant search unless the caller is a superadmin.
  • Applies mandatory organizationId filters server-side before querying Typesense.
  • Supports phased rollout via a frontend feature flag:
    • Enable: set VITE_TYPESENSE_USE_CLOUD_FUNCTIONS=true
    • Roll back: set VITE_TYPESENSE_USE_CLOUD_FUNCTIONS=false
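
The tenant checks performed by the callable might reduce to something like the following. The `CallerContext` shape, the `resolveSearchOrg` name, and the error messages are hypothetical; in the real function the context would be loaded from the people collection rather than passed in.

```typescript
// Hypothetical caller context, assumed to be resolved from the people collection.
interface CallerContext {
  uid: string;
  organizationId: string;
  isSuperadmin: boolean;
}

// Returns the organizationId the search must be filtered by, or throws.
function resolveSearchOrg(caller: CallerContext | null, requestedOrgId?: string): string {
  if (caller === null) {
    throw new Error('unauthenticated');
  }
  const target = requestedOrgId ?? caller.organizationId;
  if (target !== caller.organizationId && !caller.isSuperadmin) {
    throw new Error('permission-denied');
  }
  return target;
}
```

The returned value is what the server would place in the mandatory organizationId filter, so a client-supplied organization is never trusted on its own.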

Environment Configuration

Frontend (src/.env.local)

VITE_TYPESENSE_HOST=<cluster>.a2.typesense.net
VITE_TYPESENSE_SEARCH_API_KEY=<search-only-key>
VITE_TYPESENSE_PORT=443
VITE_TYPESENSE_PROTOCOL=https
VITE_TYPESENSE_ENABLED=true
VITE_TYPESENSE_USE_CLOUD_FUNCTIONS=false

Cloud Functions (functions/.env)

TYPESENSE_API_KEY=<admin-key>       # Write access; use Firebase Secrets in production
TYPESENSE_HOST=<cluster>.a2.typesense.net
TYPESENSE_PORT=443
TYPESENSE_PROTOCOL=https

Rollout Playbook (Secure Search Mode)

  1. Deploy Functions containing searchTypesenseSecure.
  2. Keep VITE_TYPESENSE_USE_CLOUD_FUNCTIONS=false and verify no regressions.
  3. Enable VITE_TYPESENSE_USE_CLOUD_FUNCTIONS=true in staging frontend.
  4. Validate command palette search:
    • works for normal user in own organization
    • rejects cross-tenant access
    • logs show non-zero searchTime and expected collections
  5. Promote env flag to production.
  6. If issues occur, immediately roll back by setting VITE_TYPESENSE_USE_CLOUD_FUNCTIONS=false.

Search Observability and Operations

Metrics Counters

Search telemetry is aggregated into searchMetricsDaily/{YYYY-MM-DD} with per-day counters:
  • typesense_success
  • secure_callable_error
  • fuse_fallback_count
The frontend reports telemetry through the callable reportSearchTelemetry; the secure callable search also records success/error metrics server-side.
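
The per-day counter layout can be modeled in memory as below. The `dayKey` and `recordMetric` helpers are hypothetical, and deriving the document ID from the UTC date is an assumption; the document does not say which timezone determines the YYYY-MM-DD key.

```typescript
type MetricName = 'typesense_success' | 'secure_callable_error' | 'fuse_fallback_count';
type DailyCounters = Record<MetricName, number>;

// Assumption: the day key uses the UTC calendar date.
function dayKey(date: Date): string {
  return date.toISOString().slice(0, 10);
}

// In-memory stand-in for the searchMetricsDaily collection, keyed by YYYY-MM-DD.
function recordMetric(
  store: Map<string, DailyCounters>,
  metric: MetricName,
  at: Date,
): DailyCounters {
  const key = dayKey(at);
  const counters =
    store.get(key) ??
    { typesense_success: 0, secure_callable_error: 0, fuse_fallback_count: 0 };
  counters[metric] += 1;
  store.set(key, counters);
  return counters;
}
```

Against Firestore, the increment would more likely be an atomic server-side operation than this read-modify-write, to stay correct under concurrent callers.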

No-Result Query Capture

No-result queries are stored in searchNoResultQueries (aggregated daily) and used for nightly search tuning.
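
One way to aggregate such queries before tuning is sketched here. The normalization rules (trim, lowercase, collapse whitespace) and the `topNoResultQueries` helper are assumptions for illustration; the stored document format is not described in this page.

```typescript
// Normalize so that "Budget  Form" and "budget form" count as the same query.
function normalizeQuery(raw: string): string {
  return raw.trim().toLowerCase().replace(/\s+/g, ' ');
}

// Count normalized queries and return the most frequent ones.
function topNoResultQueries(
  queries: string[],
  limit = 3,
): { query: string; count: number }[] {
  const counts = new Map<string, number>();
  for (const raw of queries) {
    const query = normalizeQuery(raw);
    if (query === '') continue; // ignore blank input
    counts.set(query, (counts.get(query) ?? 0) + 1);
  }
  return Array.from(counts, ([query, count]) => ({ query, count }))
    // Most frequent first; ties broken alphabetically for stable output.
    .sort((a, b) => b.count - a.count || a.query.localeCompare(b.query))
    .slice(0, limit);
}
```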

Nightly Jobs

Two scheduled Cloud Functions run daily (Europe/Amsterdam timezone):
  • nightlyTuneSearchFromNoResults (02:00): derives synonym updates and typo configuration from the top no-result queries.
  • nightlyReconcileSearchIndex (02:30): compares Firestore and Typesense collection counts and writes drift reports to searchReconciliationReports.
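
The count comparison behind the reconciliation job could look like the sketch below. The `DriftEntry` shape and `computeDrift` helper are hypothetical, and the counts are passed in as plain objects; the actual report format written to searchReconciliationReports is not documented here.

```typescript
interface DriftEntry {
  collection: string;
  firestoreCount: number;
  typesenseCount: number;
  drift: number; // positive: documents missing from Typesense; negative: stale extras
}

// Compare per-collection counts and keep only collections that are out of sync.
function computeDrift(
  firestoreCounts: Record<string, number>,
  typesenseCounts: Record<string, number>,
): DriftEntry[] {
  return Object.keys(firestoreCounts)
    .sort()
    .map((collection) => {
      const firestoreCount = firestoreCounts[collection];
      const typesenseCount = typesenseCounts[collection] ?? 0; // collection not yet created
      return {
        collection,
        firestoreCount,
        typesenseCount,
        drift: firestoreCount - typesenseCount,
      };
    })
    .filter((entry) => entry.drift !== 0);
}
```

Matching counts do not prove the two stores hold the same documents, so a check like this is a cheap first signal rather than a full consistency audit.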

Key Source Files

| File | Purpose |
| --- | --- |
| functions/src/search/typesenseSchema.ts | Collection schemas and field definitions |
| functions/src/search/typesenseClient.ts | Server-side Typesense client (upsert, delete, search, health check) |
| functions/src/search/indexingTriggers.ts | Firestore onDocumentWritten triggers for all 7 collections |
| functions/src/search/searchCallable.ts | Callable secure search endpoint (searchTypesenseSecure) |
| functions/src/search/index.ts | Barrel export for the search module |
| src/shared/platform/typesenseSearchService.ts | Frontend Typesense client (multi-search, RBAC, result transformation) |
| src/shared/platform/searchService.ts | Hybrid search orchestrator (Typesense + Fuse.js fallback) |