HeadwayForge
Transparency

Where the numbers come from.

Claims about Title VI, NTD, Census, R5, GTFS, and real-time data deserve transparency. Here are the sources HeadwayForge uses, how fresh they are, how they're validated, and how the key calculations work.

Inputs

Data sources

Every analysis draws from these sources. Cadence describes how often each is refreshed.

Source | What it provides | Cadence
GTFS Schedule (MobilityData catalog, open GitHub mirror) | ~1,150 US schedule feeds: routes, stops, trips, stop times, calendars, shapes. Normalized into PostGIS on demand. | Catalog synced nightly; per-agency feed parsed on demand.
GTFS-Realtime | Hundreds of US RT feeds: vehicle positions and trip updates for live operations and validation. ~40% require agency credentials. | Polled per-feed (≈15–60s) when active.
National Transit Database (NTD, FTA via Socrata) | 2,200+ reporting agencies: ridership (UPT), passenger miles, vehicle revenue miles/hours, fleet size (VOMS), operating expenses, peer context. | Annual NTD release.
Census ACS (vintage 2022) | 239k+ block groups across 50 states: race/ethnicity, income for Title VI equity overlays. | ACS annual vintage.
LEHD LODES (workplace area characteristics) | Jobs by block for access-to-opportunity analysis. | Annual (loaded per state).
APTA Public Transportation Fact Book | Agency stats plus the national quarterly ridership-by-mode trend. | Annual Fact Book + quarterly ridership.
Canonical GTFS Validator (v8) | Schedule feed quality notices, rolled up per feed version. | Run on each feed ingest.

Methodology

How key calculations work

Each output is built from a documented method. No black boxes: here is what is actually computed.

GTFS & GTFS-Realtime validation

Schedule feeds run through the Canonical GTFS Validator, and notice counts are summarized per feed version. Real-time feeds are checked for staleness, missing vehicles, and decode errors, so analysis starts from trustworthy data.
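
As a rough illustration of the real-time checks, the sketch below flags a stale snapshot and lists scheduled trips with no reporting vehicle. The 90-second threshold, the function name, and the input shapes are assumptions made for this example, not HeadwayForge's actual values; decode-error handling is left out.

```python
from dataclasses import dataclass

# Assumed threshold: a snapshot older than this many seconds counts as stale.
STALE_AFTER_S = 90


@dataclass
class RtCheck:
    stale: bool
    age_s: int
    missing_trip_ids: list


def check_rt_snapshot(header_ts, now_ts, active_trip_ids, vehicle_trip_ids):
    """Compare one decoded GTFS-Realtime snapshot against the schedule.

    header_ts        -- POSIX timestamp taken from the feed header
    now_ts           -- POSIX timestamp at which the feed was polled
    active_trip_ids  -- trip_ids the schedule says should be running now
    vehicle_trip_ids -- trip_ids reported by vehicle positions in the feed
    """
    # Clamp at zero so a slightly fast agency clock does not give a negative age.
    age_s = max(0, now_ts - header_ts)
    # Trips that should be running but have no vehicle reporting for them.
    missing = sorted(set(active_trip_ids) - set(vehicle_trip_ids))
    return RtCheck(stale=age_s > STALE_AFTER_S, age_s=age_s, missing_trip_ids=missing)
```

A snapshot that is two minutes old with one of three active trips reporting would come back as stale with two missing trips.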

R5 access analysis assumptions

Travel-time isochrones and cumulative-opportunity counts (jobs/people reachable in 15/30/45 min) are computed with an embedded R5 (Conveyal) routing engine over the GTFS network plus walking; departure window and walk access are configurable.
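
The cumulative-opportunity count itself is a simple sum once travel times exist. The sketch below assumes the routing engine has already produced a travel time in minutes from one origin to each destination zone; the function name and dictionary inputs are hypothetical, and no R5 call is shown.

```python
def cumulative_opportunities(travel_time_min, jobs, cutoffs=(15, 30, 45)):
    """Sum jobs in destinations reachable within each travel-time cutoff.

    travel_time_min -- {zone_id: minutes from the origin, or None if unreachable}
    jobs            -- {zone_id: job count}
    """
    totals = {cutoff: 0 for cutoff in cutoffs}
    for zone, minutes in travel_time_min.items():
        if minutes is None:
            continue  # unreachable zones contribute nothing
        for cutoff in cutoffs:
            if minutes <= cutoff:
                totals[cutoff] += jobs.get(zone, 0)
    return totals
```

Because each cutoff includes everything reachable under the smaller cutoffs, the totals never decrease from 15 to 30 to 45 minutes.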

Census & ACS handling

ACS block-group attributes (race/ethnicity, income) join to TIGER block-group geometry; equity overlays count population within walk buffers of frequent service, broken out by minority and low-income status per FTA Circular 4702.1B.
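
To make the overlay concrete, here is a simplified sketch that counts population near frequent-service stops. It assumes a 400 m straight-line buffer and treats a block group as served when its centroid falls inside the buffer; a production overlay would more likely intersect buffer polygons with block-group geometry. The field names and the buffer radius are illustrative assumptions.

```python
import math


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two WGS84 points."""
    radius_m = 6_371_000  # mean Earth radius
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * radius_m * math.asin(math.sqrt(a))


def served_population(block_groups, stops, buffer_m=400):
    """Total population, minority population, and low-income population
    inside the walk buffer, alongside the same totals for the whole area.

    block_groups -- dicts with lat, lon, total, minority, low_income
    stops        -- (lat, lon) pairs for frequent-service stops
    """
    keys = ("total", "minority", "low_income")
    served = dict.fromkeys(keys, 0)
    overall = dict.fromkeys(keys, 0)
    for bg in block_groups:
        inside = any(
            haversine_m(bg["lat"], bg["lon"], s_lat, s_lon) <= buffer_m
            for s_lat, s_lon in stops
        )
        for key in keys:
            overall[key] += bg[key]
            if inside:
                served[key] += bg[key]
    return served, overall
```

Dividing each served count by its overall count gives the share of each group within walking distance, which is the comparison an equity overlay usually reports.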

NTD peer selection

Peer sets are built from agencies sharing the same urbanized area and primary mode, following standard FTA/NTD peer-grouping convention, with a curated GTFS↔NTD crosswalk to match feeds to reporters.
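
The peer rule can be expressed as a filter over agency records. This sketch assumes each record carries an NTD ID, an urbanized-area identifier, and a primary mode code; the field names and the sample IDs are hypothetical, and the GTFS↔NTD crosswalk is not modeled here.

```python
def select_peers(agencies, target_ntd_id):
    """Return NTD IDs of agencies sharing the target's urbanized area and primary mode.

    agencies -- dicts with ntd_id, uza, primary_mode (hypothetical field names)
    """
    target = next(a for a in agencies if a["ntd_id"] == target_ntd_id)
    return [
        a["ntd_id"]
        for a in agencies
        if a["ntd_id"] != target_ntd_id
        and a["uza"] == target["uza"]
        and a["primary_mode"] == target["primary_mode"]
    ]


# Illustrative records only; IDs and urbanized areas are placeholders.
sample = [
    {"ntd_id": "A", "uza": "UZA-1", "primary_mode": "MB"},
    {"ntd_id": "B", "uza": "UZA-1", "primary_mode": "MB"},
    {"ntd_id": "C", "uza": "UZA-1", "primary_mode": "HR"},
    {"ntd_id": "D", "uza": "UZA-2", "primary_mode": "MB"},
]
peers = select_peers(sample, "A")
```

Here only agency B matches agency A on both urbanized area and mode, so it is the single peer returned.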

Service supply metrics

Trips/day, headways by period, and span of service are derived directly from the agency's parsed GTFS for the representative service day; route directness and stop spacing come from shapes and stop geometry.
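
As a minimal sketch of the supply metrics, the code below derives trip count, span of service, and mean headway from a list of first-stop departure times for one route and direction. It assumes the departures were already filtered to the representative service day; the function names are hypothetical. GTFS allows hours above 24 for trips that run past midnight, so times are converted to seconds rather than parsed as clock times.

```python
def parse_gtfs_time(value):
    """Convert a GTFS HH:MM:SS string to seconds; hours may exceed 24."""
    hours, minutes, seconds = (int(part) for part in value.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def supply_metrics(departures):
    """Trip count, span of service, and mean headway for one route and direction.

    departures -- departure_time strings at the first stop of each trip
    """
    times = sorted(parse_gtfs_time(d) for d in departures)
    # Gap between each pair of consecutive departures, in seconds.
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]
    return {
        "trips": len(times),
        "span_min": (times[-1] - times[0]) / 60 if times else 0,
        "mean_headway_min": sum(gaps) / len(gaps) / 60 if gaps else None,
    }
```

Headways by period follow from calling the same function on the departures that fall inside each period's time window.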

Confidence & traceability

Every figure traces back to a named source and the feed version or data vintage it came from, so outputs can be audited and explained.

Start from data you can defend.

Open any US agency and see exactly which feeds, vintages, and validation results back the analysis.