Disaster Recovery and Business Continuity
This document outlines the protocols for handling major system failures, data loss, or regional outages.
🏗️ Backup & Recovery Visual
📅 Backup Policy
| Resource | Frequency | Retention | Method |
|---|---|---|---|
| Firestore | Daily | 30 Days | GCP Managed Export |
| Cloud Storage | Real-time | 30 Days | Versioning enabled |
| GCP Secrets | On Change | Manual | Export to Cold Storage |
| Auth Users | Weekly | 90 Days | firebase auth:export |
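To make the retention column concrete, here is a minimal Python sketch that encodes the table as data and checks whether a given backup has aged out. The `RETENTION` mapping and the `is_expired` helper are illustrative names, not part of the platform's tooling, and treating "Manual" retention as "never pruned automatically" is an assumption.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Retention windows copied from the Backup Policy table.
# None stands for "Manual" retention (assumed: never pruned automatically).
RETENTION: dict[str, Optional[timedelta]] = {
    "Firestore": timedelta(days=30),
    "Cloud Storage": timedelta(days=30),
    "GCP Secrets": None,
    "Auth Users": timedelta(days=90),
}


def is_expired(resource: str, backup_time: datetime, now: datetime) -> bool:
    """Return True if a backup is older than its resource's retention window."""
    window = RETENTION[resource]
    if window is None:
        return False  # manual retention: a person decides when to delete
    return now - backup_time > window


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    sample = now - timedelta(days=45)
    for name in RETENTION:
        print(name, "expired" if is_expired(name, sample, now) else "retained")
```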
🚨 Recovery Objectives
- RPO (Recovery Point Objective): 24 hours. In the worst case, we lose the last day of data.
- RTO (Recovery Time Objective): 4 hours. The platform must be back online within 4 hours of a critical failure.
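The two objectives reduce to simple time comparisons. The sketch below shows one way to check an incident against them. The function names are hypothetical, and the timestamps are assumed to be timezone-aware values taken from backup metadata and the incident log.

```python
from datetime import datetime, timedelta, timezone

RPO = timedelta(hours=24)  # maximum tolerated data-loss window
RTO = timedelta(hours=4)   # maximum tolerated downtime


def rpo_met(last_backup: datetime, failure_time: datetime) -> bool:
    """Data written after last_backup is lost; that window must fit within the RPO."""
    return failure_time - last_backup <= RPO


def rto_met(failure_time: datetime, restored_time: datetime) -> bool:
    """Downtime runs from the failure until service is restored."""
    return restored_time - failure_time <= RTO


if __name__ == "__main__":
    failure = datetime(2024, 6, 1, 12, 0, tzinfo=timezone.utc)
    print(rpo_met(failure - timedelta(hours=20), failure))
    print(rto_met(failure, failure + timedelta(hours=5)))
```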
🏃 Emergency Procedures (SOPs)
1. Data Corruption / Accidental Wipe
If a large segment of tenant data is lost:
- Identify the last healthy backup timestamp.
- Run the `restore-firestore-shard` script (targets specific collections/tenants to avoid overwriting newer healthy data).
- Notify affected organizations via the “Incident Status” banner.
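The first step, picking the last healthy backup, can be sketched as follows. This assumes you already have the list of backup timestamps and an estimate of when the corruption began. `last_healthy_backup` is an illustrative helper and is not part of the `restore-firestore-shard` script.

```python
from datetime import datetime, timezone
from typing import Iterable, Optional


def last_healthy_backup(
    backups: Iterable[datetime], corrupted_at: datetime
) -> Optional[datetime]:
    """Return the newest backup taken strictly before the corruption began.

    Returns None when no backup predates the incident.
    """
    candidates = [b for b in backups if b < corrupted_at]
    return max(candidates, default=None)


if __name__ == "__main__":
    daily = [datetime(2024, 6, d, 2, 0, tzinfo=timezone.utc) for d in range(1, 6)]
    incident = datetime(2024, 6, 4, 9, 30, tzinfo=timezone.utc)
    print(last_healthy_backup(daily, incident))
```

If the corruption time is only an estimate, choosing a slightly earlier `corrupted_at` errs on the side of restoring clean data.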
2. Regional Outage
If `us-central1` experiences a complete blackout:
- Redirect Traffic: Update Cloudflare DNS to point to the `europe-west3` standby hosting endpoint.
- State: Platform operates in Read-Only mode until Firestore cross-region replication is confirmed healthy or failover is complete.
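The read-only rule above amounts to a small decision function. The sketch below is one possible reading of it. The `Mode` enum and the three boolean inputs are hypothetical, and how each signal is actually obtained (health checks, replication monitoring) is outside its scope.

```python
from enum import Enum


class Mode(Enum):
    READ_WRITE = "read-write"
    READ_ONLY = "read-only"


def platform_mode(
    primary_up: bool, replication_healthy: bool, failover_complete: bool
) -> Mode:
    """Decide whether writes are allowed during a regional outage.

    While the primary region is down, writes stay disabled until
    cross-region replication is confirmed healthy or failover is complete.
    """
    if primary_up:
        return Mode.READ_WRITE
    if replication_healthy or failover_complete:
        return Mode.READ_WRITE
    return Mode.READ_ONLY


if __name__ == "__main__":
    print(platform_mode(primary_up=False, replication_healthy=False,
                        failover_complete=False).value)
```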
🔐 The “Break Glass” Protocol
In extreme scenarios where the primary owner account is compromised:
- Use the Secondary Custodian hardware key stored in the company vault.
- Trigger the “Revoke All Service Account Keys” script to freeze current API access.
- Regenerate all Firebase API keys and Novu secrets.
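The internals of the “Revoke All Service Account Keys” script are not described here. As a rough illustration only, the sketch below builds (but does not run) the `gcloud iam service-accounts keys delete` commands such a script might issue for one service account. The `revoke_commands` helper, the example account email, and the key IDs are all hypothetical. In practice the key IDs would come from `gcloud iam service-accounts keys list`, and only user-managed keys can be deleted this way.

```python
from typing import List


def revoke_commands(sa_email: str, key_ids: List[str]) -> List[List[str]]:
    """Build one gcloud delete command per user-managed key (dry run: nothing is executed)."""
    return [
        [
            "gcloud", "iam", "service-accounts", "keys", "delete", key_id,
            f"--iam-account={sa_email}",
            "--quiet",  # skip the interactive confirmation prompt
        ]
        for key_id in key_ids
    ]


if __name__ == "__main__":
    # Hypothetical service account and key IDs, for illustration only.
    plan = revoke_commands("api@example-project.iam.gserviceaccount.com",
                           ["key-aaa", "key-bbb"])
    for cmd in plan:
        print(" ".join(cmd))
```

Printing the plan first gives the on-call engineer a chance to review exactly which keys will be revoked before anything is executed.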
📉 Testing the DRP
We perform a Simulated Outage every 6 months to ensure:
- Engineers know how to access backup buckets.
- The restore scripts are still compatible with the current schema.
- The business communication tree is up to date.
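To keep the six-month cadence from slipping, the next drill date can be computed from the last one. This is a small sketch with hypothetical helper names. It clamps the day to the end of the target month so that, for example, a drill on the 31st does not produce an invalid date.

```python
import calendar
from datetime import date

DRILL_INTERVAL_MONTHS = 6


def next_drill_due(last_drill: date) -> date:
    """Return the date six calendar months after the last Simulated Outage."""
    month_index = last_drill.month - 1 + DRILL_INTERVAL_MONTHS
    year = last_drill.year + month_index // 12
    month = month_index % 12 + 1
    # Clamp to the last day of the target month (e.g. Aug 31 -> Feb 28).
    day = min(last_drill.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def drill_overdue(last_drill: date, today: date) -> bool:
    """True once the due date has passed without a new drill."""
    return today > next_drill_due(last_drill)


if __name__ == "__main__":
    print(next_drill_due(date(2024, 1, 15)))
```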