Data Scraper Portfolio

The Grant Discovery engine is powered by a fleet of automated scrapers that monitor the global grant ecosystem. This document catalogs our data sources and describes the normalization pipeline applied to each record.

🛰️ Active Data Sources

We categorize our scrapers by region and source type:

🇺🇸 North America

  • Grants.gov (Federal): High-frequency API sync (every 6 hours).
  • SAM.gov: Scraping and PDF extraction from federal opportunity notices.
  • Foundation Directory (Candid): Semi-annual bulk import for private foundation data.

🇪🇺 European Union

  • TED (Tenders Electronic Daily): Continuous monitoring of EU-funded projects.
  • Erasmus+ / Horizon Europe: Specialized scrapers for research-heavy grants.

🌐 International & Private

  • World Bank / UN: Periodic ingestion of global developmental aid grants.
  • Major Corporate Foundations: Focused scrapers for Google.org, Bill & Melinda Gates Foundation, etc.
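As an illustration only, the catalog above could be represented as a small source registry with per-source sync cadences. The `Source` class, the `SOURCES` list, and the `is_due` helper are hypothetical names, not part of the actual scraper fleet. The 6-hour and semi-annual cadences come from the list above; 182 days is an approximation of "semi-annual", and sources without a stated cadence are left as `None`.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass(frozen=True)
class Source:
    """Hypothetical registry entry for one scraper source."""
    name: str
    region: str
    # None means no fixed cadence is documented for this source.
    sync_interval: Optional[timedelta]


# Cadences taken from the catalog above; 182 days approximates "semi-annual".
SOURCES = [
    Source("Grants.gov", "North America", timedelta(hours=6)),
    Source("Foundation Directory (Candid)", "North America", timedelta(days=182)),
    Source("TED", "European Union", None),
]


def is_due(source: Source, last_sync: datetime, now: datetime) -> bool:
    """Return True when a fixed-cadence source is ready for its next sync."""
    if source.sync_interval is None:
        # Assumption: sources without a fixed cadence are scheduled elsewhere.
        return False
    return now - last_sync >= source.sync_interval


now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
print(is_due(SOURCES[0], now - timedelta(hours=7), now))
```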

⚙️ The Normalization Pipeline

  1. Ingestion: Capture raw HTML/JSON and any attached PDFs.
  2. AI Extraction: Use LLMs (Gemini Flash) to parse unstructured text into the Unified Grant Schema:
    • title, summary, eligibility, deadline, amountRange.
  3. Entity Resolution: Link the grantor to a canonical list of “Grantor Entities” to prevent duplicates.
  4. Taxonomy Tagging: Assign categories based on the Mission Impact Taxonomy (e.g., “Sustainability”, “Community Health”).
  5. Quality Gate: If a record is incomplete (missing deadline or amount), it is marked PENDING_VERIFICATION and held for SuperAdmin review.
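A minimal sketch of the schema and the quality gate from the steps above, written in Python for illustration. The field names come from the Unified Grant Schema; the field types, the `Grant` class, the `ACTIVE` default status, and the `apply_quality_gate` function are assumptions, not the production implementation.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional, Tuple


@dataclass
class Grant:
    # Field names follow the Unified Grant Schema; the types are assumed.
    title: str
    summary: str
    eligibility: str
    deadline: Optional[date] = None
    amountRange: Optional[Tuple[int, int]] = None  # assumed (min, max)
    status: str = "ACTIVE"  # hypothetical default status


def apply_quality_gate(grant: Grant) -> Grant:
    """Flag a grant for manual review if its deadline or amount is missing."""
    if grant.deadline is None or grant.amountRange is None:
        grant.status = "PENDING_VERIFICATION"
    return grant


# A record with no deadline or amount is held for review.
incomplete = apply_quality_gate(Grant("Clean Water Fund", "Summary", "Nonprofits"))
print(incomplete.status)
```

Keeping the gate as a separate function after extraction means an incomplete LLM parse is held for review rather than silently published.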

🛠️ Monitoring and Health

Scraper health is monitored via the Intelligence Hub:
  • Failure Alerts: Triggered if a portal changes its structure (DOM breakage) or if an API returns HTTP 403 Forbidden.
  • Freshness Score: Measures how many days it has been since a source last added a new grant listing.
  • Diffing: We track “Opportunity Versioning” to alert users if a grant they are pursuing has changed its terms or deadline.
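The Freshness Score and the diffing check above can be sketched roughly as follows. Both function names and the dictionary representation of a grant version are hypothetical; the real Intelligence Hub may compute these differently.

```python
from datetime import date
from typing import Any, Dict, Tuple


def freshness_days(last_new_listing: date, today: date) -> int:
    """Days elapsed since the source last added a new grant listing."""
    return (today - last_new_listing).days


def diff_versions(
    old: Dict[str, Any], new: Dict[str, Any]
) -> Dict[str, Tuple[Any, Any]]:
    """Return {field: (old_value, new_value)} for every field that changed."""
    changed = {}
    for key in sorted(old.keys() | new.keys()):
        if old.get(key) != new.get(key):
            changed[key] = (old.get(key), new.get(key))
    return changed


v1 = {"title": "Community Health Grant", "deadline": date(2025, 3, 1)}
v2 = {"title": "Community Health Grant", "deadline": date(2025, 4, 15)}

# Only the deadline differs, so it is the only field a user would be alerted on.
print(diff_versions(v1, v2))
print(freshness_days(date(2025, 1, 1), date(2025, 1, 11)))
```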