Data Scraper Portfolio
The Grant Discovery engine is powered by a fleet of automated scrapers that monitor the global grant ecosystem. This document catalogs our sources and describes the normalization process.
🛰️ Active Data Sources
We categorize our scrapers by region and source type:
🇺🇸 North America
- Grants.gov (Federal): High-frequency API sync (every 6 hours).
- SAM.gov: Scraping and PDF extraction from federal opportunity notices.
- Foundation Directory (Candid): Semi-annual bulk import for private foundation data.
🇪🇺 European Union
- TED (Tenders Electronic Daily): Continuous monitoring of EU-funded projects.
- Erasmus+ / Horizon Europe: Specialized scrapers for research-heavy grants.
🌐 International & Private
- World Bank / UN: Periodic ingestion of global developmental aid grants.
- Major Corporate Foundations: Focused scrapers for Google.org, Bill & Melinda Gates Foundation, etc.
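The sync cadences in the catalog above could be expressed as a small schedule table. The following is a minimal sketch, not the actual scheduler: the source keys, the `SYNC_INTERVALS` name, and the `is_due` helper are hypothetical, and "semi-annual" is approximated as 182 days. Only the two sources with an explicit cadence are included.

```python
from datetime import datetime, timedelta

# Hypothetical schedule table; only sources with a stated cadence are listed.
SYNC_INTERVALS = {
    "grants.gov": timedelta(hours=6),   # "every 6 hours" per the catalog
    "candid": timedelta(days=182),      # "semi-annual", approximated (assumption)
}


def is_due(source: str, last_sync: datetime, now: datetime) -> bool:
    """Return True when the source's interval has elapsed since its last sync."""
    return now - last_sync >= SYNC_INTERVALS[source]
```

A scheduler loop would call `is_due` for each source and trigger the matching scraper when it returns True.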
⚙️ The Normalization Pipeline
- Ingestion: Capture raw HTML/JSON and any attached PDFs.
- AI Extraction: Use LLMs (Gemini Flash) to parse unstructured text into the Unified Grant Schema: title, summary, eligibility, deadline, amountRange.
- Entity Resolution: Link the grantor to a canonical list of “Grantor Entities” to prevent duplicates.
- Taxonomy Tagging: Assign categories based on the Mission Impact Taxonomy (e.g., “Sustainability”, “Community Health”).
- Quality Gate: If a record is incomplete (missing deadline or amount), it is marked as PENDING_VERIFICATION and queued for SuperAdmin review.
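The later pipeline stages can be illustrated with a minimal sketch. It assumes a Python record whose fields mirror the Unified Grant Schema (amountRange becomes `amount_range`). The `Grant` class, the `grantor` field, the default `ACTIVE` status, and both helper functions are hypothetical names chosen for illustration. The grantor key is a simple text normalization and stands in for whatever matching the real Entity Resolution step performs.

```python
import re
from dataclasses import dataclass
from datetime import date
from typing import Optional, Tuple


@dataclass
class Grant:
    """Hypothetical record shaped after the Unified Grant Schema fields."""
    title: str
    summary: str
    eligibility: str
    deadline: Optional[date]                  # None when no deadline was extracted
    amount_range: Optional[Tuple[int, int]]   # (min, max); None when missing
    grantor: str = ""                         # assumed field for entity resolution
    status: str = "ACTIVE"                    # assumed default status name


def canonical_grantor_key(name: str) -> str:
    """Reduce a grantor name to a comparison key (illustrative only)."""
    key = re.sub(r"[^a-z0-9 ]", " ", name.lower())
    return " ".join(key.split())


def apply_quality_gate(grant: Grant) -> Grant:
    """Mark the grant PENDING_VERIFICATION if deadline or amount is missing."""
    if grant.deadline is None or grant.amount_range is None:
        grant.status = "PENDING_VERIFICATION"
    return grant
```

Two spellings of the same grantor, such as "Bill & Melinda Gates Foundation" and "BILL & MELINDA GATES FOUNDATION.", produce the same key, so they can be linked to one canonical entity rather than stored as duplicates.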
🛠️ Monitoring and Health
Scraper health is monitored via the Intelligence Hub:
- Failure Alerts: Triggered if a portal changes its structure (DOM breakage) or if an API returns HTTP 403 (Forbidden).
- Freshness Score: Measures how many days it has been since a source last added a new grant listing.
- Diffing: We track “Opportunity Versioning” to alert users if a grant they are pursuing has changed its terms or deadline.
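The Freshness Score and the diffing check described above can be sketched in a few lines. This is an illustration under assumptions, not the Intelligence Hub's implementation: the function names are hypothetical, and each grant version is assumed to be a flat dictionary of schema fields.

```python
from datetime import date
from typing import Any, Dict, Tuple


def freshness_days(last_new_listing: date, today: date) -> int:
    """Days elapsed since a source last added a new grant listing."""
    return (today - last_new_listing).days


def diff_versions(old: Dict[str, Any], new: Dict[str, Any]) -> Dict[str, Tuple[Any, Any]]:
    """Return {field: (before, after)} for every field that changed."""
    changed = {}
    for field in sorted(old.keys() | new.keys()):
        before, after = old.get(field), new.get(field)
        if before != after:
            changed[field] = (before, after)
    return changed
```

A non-empty result from `diff_versions` on two stored versions of the same opportunity is the condition under which a user pursuing that grant would be alerted.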