A 10% Sample Covering All Years
Description
This section provides a consolidated dataset comprising a 10% random sample of the 30-year fixed-rate mortgages in Fannie Mae’s July 27, 2023 release of its Single-Family Loan Performance primary dataset. For each acquisition year from 2000 to 2022 inclusive (see footnote 1), we consolidate the data into one row per loan as described in the Data Consolidation section, drop loans whose original loan term is not 360 months, and draw a 10% simple random sample of the remaining loans for that year. We then combine the sampled loans from all acquisition years into a single dataset of 3,897,758 loans, which we provide in several file formats.
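The filter-then-sample procedure described above can be sketched as follows. This is only an illustration using the Python standard library, not the code used to build the dataset: the function name `sample_by_year`, the tuple layout of the input, the fixed seed, and the rounding of each year's sample size are all assumptions made for the example.

```python
import random
from collections import defaultdict


def sample_by_year(loans, fraction=0.10, seed=0):
    """Draw a simple random sample of 360-month loans within each year.

    `loans` is an iterable of (loan_id, acquisition_year, original_term_months)
    tuples. This layout is hypothetical and chosen only for illustration.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible

    # Keep only loans with a 360-month original term, grouped by year.
    by_year = defaultdict(list)
    for loan_id, year, term in loans:
        if term == 360:
            by_year[year].append(loan_id)

    # Sample without replacement inside each year, then combine all years.
    sampled = []
    for year in sorted(by_year):
        ids = by_year[year]
        k = round(len(ids) * fraction)  # assumed rounding rule
        sampled.extend((year, loan_id) for loan_id in rng.sample(ids, k))
    return sampled
```

Sampling within each year, rather than once over the pooled loans, keeps each acquisition year's share of the sample close to its share of the eligible loans.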
Download
Delimited <.csv> flat file (delimiter: “|”)
- Uncompressed (~1,150MB) filename: <10_perc_sample.csv>
- Compressed (~235MB) filename: <10_perc_sample_csv.zip>
Feather <.feather> file
- Uncompressed (~560MB) filename: <10_perc_sample.feather>
- Compressed (~275MB) filename: <10_perc_sample_feather.zip>
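As a quick check after downloading, the first few rows of the compressed delimited file can be read without extracting the whole archive. The sketch below uses only the Python standard library. The helper name `read_first_rows` is hypothetical, and the code assumes that the archive holds a single UTF-8 encoded file; it makes no assumption about column names or about whether a header row is present.

```python
import csv
import io
import zipfile
from itertools import islice


def read_first_rows(zip_path, limit=5, member=None):
    """Return the first `limit` rows of a pipe-delimited file inside a zip.

    If `member` is None, the first file in the archive is read
    (assumption: the archive contains a single delimited file).
    """
    with zipfile.ZipFile(zip_path) as archive:
        name = member or archive.namelist()[0]
        with archive.open(name) as raw:
            # Decode the byte stream as text; UTF-8 is an assumption.
            text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
            reader = csv.reader(text, delimiter="|")
            return list(islice(reader, limit))
```

For example, `read_first_rows("10_perc_sample_csv.zip")` would return up to five rows, each as a list of strings. Reading the full file this way is possible but slow at this size; a dataframe library is likely a better fit for analysis.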
Footnotes
1. We ignore the first quarter of 2023.