Basic geographic and demographic fields: State Postal Code, State
FIPS Code, District ID, Name
Population estimates: Estimated Total Population, Estimated
Population 5-17, and the estimated number of relevant children 5 to 17
years old in poverty
Adjustments:
Convert population fields to numeric
Construct a combined NCES district identifier by concatenating state
FIPS and District ID
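The two adjustments above can be sketched as follows. This is an illustrative Python sketch, not the package's R code. The field names (`state_fips`, `district_id`, `est_pop_5_17`), the record values, and the zero-padding step are assumptions. Padding is included because the NCES IDs quoted elsewhere in this document are seven characters long and leading zeros are easily lost when files are read as numbers.

```python
def to_number(raw):
    """Convert a text field such as '1,234' to an int; None if blank or non-numeric."""
    cleaned = raw.replace(",", "").strip()
    return int(cleaned) if cleaned.isdigit() else None


def make_ncesid(state_fips, district_id):
    """Concatenate zero-padded state FIPS (2 digits) and district ID (5 digits)."""
    return str(state_fips).strip().zfill(2) + str(district_id).strip().zfill(5)


# Hypothetical SAIPE-style record; the values are made up for illustration.
record = {"state_fips": "9", "district_id": "30", "est_pop_5_17": "1,234"}
ncesid = make_ncesid(record["state_fips"], record["district_id"])
pop_5_17 = to_number(record["est_pop_5_17"])
print(ncesid, pop_5_17)
```

Keeping the identifier as a string rather than a number is what preserves the leading zero for states with single-digit FIPS codes.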
ACS 5-Year Estimates
Data source: American Community Survey 5-Year Estimates accessed via
the tidycensus package
Raw variables selected:
Economic indicators: Median household income (B19013_001) and median
property value (B25077_001)
Educational attainment: Total population 25 years or older
(B15003_001) and subsets of that population holding bachelor’s degrees
(B15003_022), master’s degrees (B15003_023), professional degrees
(B15003_024), and doctoral degrees (B15003_025).
Data are pulled for different geographic breakdowns (unified,
elementary, and secondary school districts)
Adjustments:
Reshape data from long to wide format
Rename “GEOID” to a standard ncesid and ensure proper
formatting of district identifiers
Convert estimates to numeric as needed
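The reshape and rename steps above might look like the following Python sketch (illustrative only; the package itself does this in R). The input rows imitate a long layout with one row per district per variable. The estimate values are made up, and the seven-character padding of the identifier is an assumption.

```python
from collections import defaultdict

# Hypothetical long-format rows (one row per district per ACS variable).
# The estimates are placeholders, not real ACS values.
long_rows = [
    {"GEOID": "0900030", "variable": "B19013_001", "estimate": "50000"},
    {"GEOID": "0900030", "variable": "B25077_001", "estimate": "200000"},
    {"GEOID": "0900060", "variable": "B19013_001", "estimate": "60000"},
]


def pivot_wider(rows):
    """Return one dict per district, with one numeric column per ACS variable."""
    wide = defaultdict(dict)
    for row in rows:
        ncesid = row["GEOID"].zfill(7)  # GEOID becomes ncesid; keep leading zeros
        wide[ncesid][row["variable"]] = float(row["estimate"])
    return [{"ncesid": key, **cols} for key, cols in sorted(wide.items())]


wide_rows = pivot_wider(long_rows)
for row in wide_rows:
    print(row)
```

A district that lacks an estimate for some variable simply has no column for it in this sketch; a data-frame reshape would typically fill that cell with a missing value instead.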
CPI
Data source: U.S. Bureau of Labor Statistics, specifically the
Consumer Price Index for All Urban Consumers (CPI-U)
Raw variables selected:
CPI time series data (specific variable names as provided in the raw
file)
Adjustments:
Calculate an averaged CPI value using the second half of one year
and the first half of the following year to align with the academic
calendar, with the 2011-12 school year as the baseline year
Clean and reformat CPI data for consistency across processing
scripts
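As a rough illustration of the school-year averaging described above, the Python sketch below averages the second half of one calendar year with the first half of the next and rescales dollar amounts to the 2011-12 baseline. It is not the package's R code. The index values are placeholders rather than published BLS figures, labeling each school year by its ending calendar year is an assumption, and the ratio-based conversion is one common way to apply such an index.

```python
# Hypothetical half-year CPI-U averages keyed by (calendar year, half).
# The index values are placeholders, not published BLS figures.
half_year_cpi = {
    (2011, 2): 100.0,
    (2012, 1): 102.0,
    (2012, 2): 103.0,
    (2013, 1): 105.0,
}


def school_year_cpi(cpi, end_year):
    """Average the second half of the prior year with the first half of end_year."""
    return (cpi[(end_year - 1, 2)] + cpi[(end_year, 1)]) / 2


def to_baseline_dollars(amount, cpi, end_year, base_year=2012):
    """Express a nominal amount from school year end_year in base-year dollars."""
    factor = school_year_cpi(cpi, base_year) / school_year_cpi(cpi, end_year)
    return amount * factor


cpi_2012 = school_year_cpi(half_year_cpi, 2012)  # 2011-12 school year
cpi_2013 = school_year_cpi(half_year_cpi, 2013)  # 2012-13 school year
adjusted = to_baseline_dollars(10400.0, half_year_cpi, 2013)
print(cpi_2012, cpi_2013, adjusted)
```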
Joining Data
The joining process is implemented in the
07_edfinr_join_and_exclude.R script.
Data from the F-33 survey, CCD Directory, ACS (unified, elementary,
and secondary), and SAIPE sources are merged using left joins on shared
district identifiers (ncesid) and fiscal year.
The procedure ensures that each district record is enriched with
revenue, expenditure, demographic, and economic data.
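A left join on district identifier and fiscal year can be sketched in a few lines of Python (illustrative only; the actual implementation is the R script named above). The table contents and column names other than ncesid are hypothetical, treating the F-33 table as the left-hand side is an assumption, and the sketch assumes each key appears at most once in the right-hand table.

```python
def left_join(left_rows, right_rows, keys=("ncesid", "year")):
    """Keep every left row; add right-hand columns, or None where no key matches."""
    right_cols = [c for c in right_rows[0] if c not in keys] if right_rows else []
    lookup = {tuple(row[k] for k in keys): row for row in right_rows}
    joined = []
    for row in left_rows:
        match = lookup.get(tuple(row[k] for k in keys))
        extra = {c: (match[c] if match else None) for c in right_cols}
        joined.append({**row, **extra})
    return joined


# Hypothetical finance and poverty tables; values are made up.
f33 = [
    {"ncesid": "0900030", "year": 2019, "rev_total": 1000.0},
    {"ncesid": "0900060", "year": 2019, "rev_total": 2000.0},
]
saipe = [{"ncesid": "0900030", "year": 2019, "est_pop_5_17": 1234}]

merged = left_join(f33, saipe)
for row in merged:
    print(row)
```

Because the join is a left join, the second district is retained even though it has no match in the right-hand table; its added column is simply empty.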
Revenue Adjustments
Additional transformations are applied after the join:
- Capital expenditures and debt service (C11) are subtracted from state revenues
- Property sales (U11) are subtracted from local revenues
- For Texas LEAs in 2012-13 and earlier, payments to state governments (L12) are subtracted from local revenues
- Payments to other school systems (V91, V92, and Q11) are proportionally subtracted from local, state, and federal revenues
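One way to read these adjustments is sketched below in Python (illustrative only, not the package's R code). The revenue field names (`rev_local`, `rev_state`, `rev_fed`), the `state` and `year` fields, and the example values are hypothetical. Two further assumptions are labeled in the comments: the fiscal year is keyed by its ending calendar year, and "proportionally" means each source absorbs the payments in proportion to its share of total revenue after the earlier subtractions.

```python
def adjust_revenues(d):
    """Return adjusted local, state, and federal revenue for one district-year."""
    local = d["rev_local"] - d["U11"]   # remove property sales
    state = d["rev_state"] - d["C11"]   # remove capital and debt service amounts
    federal = d["rev_fed"]
    # Assumption: "2012-13 and earlier" means fiscal year (ending year) <= 2013.
    if d["state"] == "TX" and d["year"] <= 2013:
        local -= d["L12"]
    # Assumption: payments to other school systems are split by each source's
    # share of the total computed after the subtractions above.
    payments = d["V91"] + d["V92"] + d["Q11"]
    total = local + state + federal
    if total > 0:
        shares = (local / total, state / total, federal / total)
        local -= payments * shares[0]
        state -= payments * shares[1]
        federal -= payments * shares[2]
    return {"rev_local": local, "rev_state": state, "rev_fed": federal}


# Hypothetical district-year; all amounts are made up.
example = {
    "state": "CT", "year": 2019,
    "rev_local": 600.0, "rev_state": 450.0, "rev_fed": 100.0,
    "C11": 50.0, "U11": 100.0, "L12": 0.0,
    "V91": 50.0, "V92": 30.0, "Q11": 20.0,
}
adjusted_rev = adjust_revenues(example)
print(adjusted_rev)
```

With this reading, the three adjusted revenues sum to the pre-payment total minus the payments, so nothing is double counted.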
Exclusions
Districts with enrollment less than 0 are removed.
Districts with total revenue less than 0 are removed.
Districts with an invalid LEA type (i.e., where lea_type_id is not
one of 1, 2, 3, or 7) are excluded.
Districts with an invalid LEA/school level type (i.e., where schlev is
not one of “01”, “02”, or “03”, except for specified CA exceptions) are
excluded.
Districts where total revenue per pupil is greater than $70,000 in
2011-12 dollars are excluded.
Districts where total revenue per pupil is less than $500 in 2011-12
dollars are excluded.
Connecticut LEAs consisting of semi-private high schools are removed
(NCES IDs “0905371”, “0905372”, and “0905373”).
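The exclusion rules above can be expressed as a single predicate, as in this Python sketch (illustrative only, not the package's R code). The field names `enroll`, `rev_total`, and `rev_total_pp_2012` are hypothetical, with the last assumed to hold revenue per pupil already converted to 2011-12 dollars. The California exceptions to the schlev rule are not spelled out in this section, so the sketch leaves them out.

```python
VALID_LEA_TYPES = {1, 2, 3, 7}
VALID_SCHLEV = {"01", "02", "03"}
CT_SEMI_PRIVATE = {"0905371", "0905372", "0905373"}


def keep_district(d):
    """Return True if a district-year passes every exclusion rule listed above."""
    if d["enroll"] < 0 or d["rev_total"] < 0:
        return False
    if d["lea_type_id"] not in VALID_LEA_TYPES:
        return False
    # The California exceptions are not specified here, so they are omitted.
    if d["schlev"] not in VALID_SCHLEV:
        return False
    if d["ncesid"] in CT_SEMI_PRIVATE:
        return False
    # Hypothetical field: revenue per pupil in 2011-12 dollars.
    return 500 <= d["rev_total_pp_2012"] <= 70000


# Hypothetical district-year that passes every rule; values are made up.
ok = {
    "ncesid": "0900030", "enroll": 1200, "rev_total": 18000000.0,
    "lea_type_id": 1, "schlev": "03", "rev_total_pp_2012": 15000.0,
}
print(keep_district(ok))
print(keep_district(dict(ok, ncesid="0905371")))
```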
Data Notes and Cautions
Users should note the following when working with the
edfinr datasets:
Some variables were originally coded with -1 to
indicate missing values; these have been replaced with NA
during processing.
During data processing, we identified a sharp rise in the number of
California districts that appear only from 2019 onward. This
reflects the fact that many charter schools became separate LEAs in
those years. Beginning in 2018–19, a wave of California charter schools
switched to independent CALPADS/CBEDS reporting and thus were assigned
their own NCES LEA IDs for the first time. Once in the NCES LEA
universe, those new charter LEAs automatically appear in the F-33
finance survey (with blanks or flags if they report no finance data)
and in the Census Bureau’s SAIPE and ACS school-district products
(which mirror NCES LEA boundaries).
The joined dataset represents a synthesis of data from multiple
sources; discrepancies in source data formats may lead to minor
variations.
Inflation and adjustment factors (e.g., CPI adjustments) are based
on averages and may not perfectly reflect local cost variations.
Caution is advised when comparing data across fiscal years
due to potential differences in data collection and processing
methods.