Disaster Recovery and Business Continuity

This document outlines the protocols for handling major system failures, data loss, or regional outages.

🏗️ Backup & Recovery Visual

📅 Backup Policy

Resource      | Frequency | Retention | Method
Firestore     | Daily     | 30 Days   | GCP Managed Export
Cloud Storage | Real-time | 30 Days   | Versioning enabled
GCP Secrets   | On Change | Manual    | Export to Cold Storage
Auth Users    | Weekly    | 90 Days   | firebase auth:export
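
The policy above can also be expressed as data so that monitoring can flag a missed or expired backup. The following is a minimal sketch, not the platform's actual tooling: the `BackupRule` type, the `POLICY` keys, and both helper functions are hypothetical names, and "Real-time", "On Change", and "Manual" are modelled as `None` because the table gives no fixed interval for them.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass(frozen=True)
class BackupRule:
    resource: str
    max_age: Optional[timedelta]    # longest acceptable gap between backups; None = no fixed schedule
    retention: Optional[timedelta]  # how long a backup is kept; None = managed manually


# Values copied from the Backup Policy table; the dictionary keys are illustrative.
POLICY = {
    "firestore": BackupRule("Firestore", timedelta(days=1), timedelta(days=30)),
    "cloud_storage": BackupRule("Cloud Storage", None, timedelta(days=30)),
    "gcp_secrets": BackupRule("GCP Secrets", None, None),
    "auth_users": BackupRule("Auth Users", timedelta(days=7), timedelta(days=90)),
}


def is_stale(key: str, last_backup: datetime, now: datetime) -> bool:
    """True if the newest backup is older than the scheduled frequency allows."""
    rule = POLICY[key]
    if rule.max_age is None:
        return False
    return now - last_backup > rule.max_age


def is_expired(key: str, backup_time: datetime, now: datetime) -> bool:
    """True if a backup has outlived its retention window."""
    rule = POLICY[key]
    if rule.retention is None:
        return False
    return now - backup_time > rule.retention


if __name__ == "__main__":
    now = datetime(2024, 1, 31, tzinfo=timezone.utc)
    print(is_stale("firestore", now - timedelta(hours=25), now))
    print(is_expired("auth_users", now - timedelta(days=10), now))
```

Keeping the table and the checks in one structure means a change to the policy only has to be made in one place.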

🚨 Recovery Objectives

  • RPO (Recovery Point Objective): 24 hours. In the worst case, we lose up to the last 24 hours of data.
  • RTO (Recovery Time Objective): 4 hours. The platform must be back online within 4 hours of a critical failure.
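
As a worked illustration of the two objectives, the sketch below checks an incident timeline against the 24-hour RPO and 4-hour RTO. The function names and the example timestamps are hypothetical and are not taken from any real incident.

```python
from datetime import datetime, timedelta, timezone

RPO = timedelta(hours=24)  # maximum tolerated data-loss window
RTO = timedelta(hours=4)   # maximum tolerated downtime


def data_loss_window(last_good_backup: datetime, failure_time: datetime) -> timedelta:
    """Data written after the last good backup is what a restore cannot recover."""
    return failure_time - last_good_backup


def meets_rpo(last_good_backup: datetime, failure_time: datetime) -> bool:
    return data_loss_window(last_good_backup, failure_time) <= RPO


def meets_rto(failure_time: datetime, restored_time: datetime) -> bool:
    return restored_time - failure_time <= RTO


if __name__ == "__main__":
    backup = datetime(2024, 3, 1, 2, 0, tzinfo=timezone.utc)     # nightly export
    failure = datetime(2024, 3, 1, 23, 30, tzinfo=timezone.utc)  # corruption detected
    restored = datetime(2024, 3, 2, 3, 0, tzinfo=timezone.utc)   # service back online
    print(data_loss_window(backup, failure))  # 21:30:00
    print(meets_rpo(backup, failure), meets_rto(failure, restored))  # True True
```

In this example 21.5 hours of data is lost and the outage lasts 3.5 hours, so both objectives are met.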

🏃 Emergency Procedures (SOPs)

1. Data Corruption / Accidental Wipe

If a large segment of tenant data is lost:
  1. Identify the last healthy backup timestamp.
  2. Run the restore-firestore-shard script (targets specific collections/tenants to avoid overwriting newer healthy data).
  3. Notify affected organizations via the “Incident Status” banner.
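
Step 1 can be sketched as picking the newest backup taken strictly before the corruption began. This is only an illustration: `last_healthy_backup` is a hypothetical helper, and it assumes the backup timestamps and the estimated corruption start time are already known. It does not reproduce the restore-firestore-shard script.

```python
from bisect import bisect_left
from datetime import datetime, timezone
from typing import Optional, Sequence


def last_healthy_backup(
    backups: Sequence[datetime], corruption_start: datetime
) -> Optional[datetime]:
    """Return the newest backup taken strictly before corruption_start, or None."""
    ordered = sorted(backups)
    # bisect_left gives the index of the first backup at or after corruption_start,
    # so the element just before it is the last one known to be clean.
    idx = bisect_left(ordered, corruption_start)
    return ordered[idx - 1] if idx > 0 else None


if __name__ == "__main__":
    daily = [datetime(2024, 3, day, 2, 0, tzinfo=timezone.utc) for day in (1, 2, 3, 4)]
    corrupted_at = datetime(2024, 3, 3, 14, 0, tzinfo=timezone.utc)
    print(last_healthy_backup(daily, corrupted_at))  # 2024-03-03 02:00:00+00:00
```

A backup taken at the same instant as the corruption is treated as unsafe, which errs on the side of restoring slightly older data.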

2. Regional Outage

If us-central1 experiences a complete blackout:
  1. Redirect Traffic: Update Cloudflare DNS to point to the europe-west3 standby hosting endpoint.
  2. State: The platform operates in read-only mode until Firestore cross-region replication is confirmed healthy or the failover is complete.
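
The failover rule above amounts to a small decision function. The sketch below is one possible reading of it, assuming three boolean health signals that the source does not define; the `Mode` enum, `failover_state`, and its parameters are hypothetical names, and no DNS change is performed.

```python
from enum import Enum
from typing import Tuple

PRIMARY_REGION = "us-central1"
STANDBY_REGION = "europe-west3"


class Mode(Enum):
    READ_WRITE = "read-write"
    READ_ONLY = "read-only"


def failover_state(
    primary_up: bool, replication_healthy: bool, failover_complete: bool
) -> Tuple[str, Mode]:
    """Return the region that should serve traffic and the mode it runs in."""
    if primary_up:
        return PRIMARY_REGION, Mode.READ_WRITE
    # Primary is down: traffic goes to the standby. Writes stay blocked until
    # replication is confirmed healthy or the failover has finished.
    if replication_healthy or failover_complete:
        return STANDBY_REGION, Mode.READ_WRITE
    return STANDBY_REGION, Mode.READ_ONLY


if __name__ == "__main__":
    print(failover_state(primary_up=False, replication_healthy=False, failover_complete=False))
```

Blocking writes by default protects against two regions accepting conflicting changes while the state of replication is unknown.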

🔐 The “Break Glass” Protocol

In extreme scenarios where the primary owner account is compromised:
  • Use the Secondary Custodian hardware key stored in the company vault.
  • Trigger the “Revoke All Service Account Keys” script to freeze current API access.
  • Regenerate all Firebase API keys and Novu secrets.
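
The contents of the “Revoke All Service Account Keys” script are not shown here. As a hedged sketch of what such a step could look like, the planner below turns a list of user-managed key IDs into `gcloud iam service-accounts keys delete` commands without running them. The function name, the input mapping, and the example account and key IDs are hypothetical.

```python
from typing import Dict, List


def plan_key_revocation(keys_by_account: Dict[str, List[str]]) -> List[List[str]]:
    """Build one gcloud delete command per key; nothing is executed here."""
    commands = []
    for account in sorted(keys_by_account):
        for key_id in keys_by_account[account]:
            commands.append([
                "gcloud", "iam", "service-accounts", "keys", "delete", key_id,
                f"--iam-account={account}",
                "--quiet",  # skip the interactive confirmation prompt
            ])
    return commands


if __name__ == "__main__":
    # Placeholder values; real key IDs would come from
    # `gcloud iam service-accounts keys list`.
    example = {"api@example-project.iam.gserviceaccount.com": ["key-id-1", "key-id-2"]}
    for command in plan_key_revocation(example):
        print(" ".join(command))
```

Printing the plan first gives the responder a chance to review exactly which keys will be deleted before anything irreversible happens.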

📉 Testing the DRP

We perform a Simulated Outage every 6 months to confirm that:
  • Engineers know how to access backup buckets.
  • The restore scripts are still compatible with the current schema.
  • The business communication tree is up to date.
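
To keep the six-month cadence from slipping, the next drill date can be derived from the last one. This is a small sketch with hypothetical helper names; it clamps the day of the month when the target month is shorter.

```python
import calendar
from datetime import date

DRILL_INTERVAL_MONTHS = 6


def add_months(d: date, months: int) -> date:
    """Add calendar months, clamping the day to the end of a shorter month."""
    total = d.month - 1 + months
    year = d.year + total // 12
    month = total % 12 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def next_drill_due(last_drill: date) -> date:
    return add_months(last_drill, DRILL_INTERVAL_MONTHS)


def drill_overdue(last_drill: date, today: date) -> bool:
    return today > next_drill_due(last_drill)


if __name__ == "__main__":
    print(next_drill_due(date(2024, 1, 15)))                   # 2024-07-15
    print(drill_overdue(date(2024, 1, 15), date(2024, 8, 1)))  # True
```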