A 10% Sample Covering All Years

Description

This section provides a consolidated dataset: a 10% random sample of the 30-year fixed-rate mortgages in Fannie Mae’s July 27, 2023 release of its Single-Family Loan Performance primary dataset. For each acquisition year from 2000 to 2022 inclusive[1], we consolidate the data into one row per loan as described in the Data Consolidation section, drop loans whose original loan term is not 360 months, and draw a 10% simple random sample of the remaining loans in that year. The sampled loans from all acquisition years are then combined into a single dataset of 3,897,758 loans, which we provide in several file formats.
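The filter-then-sample procedure can be illustrated with a minimal sketch on toy data. This is not the code used to build the dataset: the field names `loan_id`, `acq_year`, and `orig_term`, the function `sample_by_year`, the seed, and the rounding of each year's sample size are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def sample_by_year(loans, frac=0.10, seed=0):
    """Keep 360-month loans, then draw a simple random sample within each year.

    `loans` is an iterable of dicts with the hypothetical keys
    "acq_year" and "orig_term" (one dict per consolidated loan).
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    by_year = defaultdict(list)
    for loan in loans:
        # Filter step: drop loans whose original term is not 360 months.
        if loan["orig_term"] == 360:
            by_year[loan["acq_year"]].append(loan)

    sampled = []
    for year in sorted(by_year):
        group = by_year[year]
        k = round(frac * len(group))  # assumed rounding of the sample size
        # random.sample draws k distinct items, i.e. without replacement.
        sampled.extend(rng.sample(group, k))
    return sampled


# Toy data: 100 loans per year for 2000-2022, of which 80 have a 360-month term.
loans = [
    {"loan_id": f"{year}-{i}", "acq_year": year, "orig_term": 360 if i % 5 else 180}
    for year in range(2000, 2023)
    for i in range(100)
]
sample = sample_by_year(loans)
print(len(sample))
```

Sampling within each year, rather than once over the pooled loans, keeps every acquisition year represented at the same 10% rate.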

Download

Delimited <.csv> flat file (delimiter: “|”)

  • Uncompressed (~1,150MB) filename: <10_perc_sample.csv>
  • Compressed (~235MB) filename: <10_perc_sample_csv.zip>
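Because the flat file is delimited with “|” rather than a comma, the delimiter must be set explicitly when reading it. The following is a minimal sketch using only the Python standard library; the helper names `iter_rows` and `iter_rows_from_zip` are hypothetical, and we assume (without having verified it) that the zip archive contains a single UTF-8 encoded member.

```python
import csv
import io
import zipfile


def iter_rows(text_stream):
    """Yield each record of a pipe-delimited text stream as a list of strings."""
    yield from csv.reader(text_stream, delimiter="|")


def iter_rows_from_zip(zip_path):
    """Stream records from the compressed download without extracting it first."""
    with zipfile.ZipFile(zip_path) as zf:
        member = zf.namelist()[0]  # assumption: the archive holds one file
        with zf.open(member) as raw:
            # Wrap the binary stream so the csv module receives text.
            text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
            yield from iter_rows(text)


# Small in-memory demonstration with made-up values.
demo = io.StringIO("100001|360|2005\n100002|360|2006\n")
for row in iter_rows(demo):
    print(row)
```

Streaming row by row avoids holding the roughly 1,150MB uncompressed file in memory at once. For the real file one would call, for example, `iter_rows_from_zip("10_perc_sample_csv.zip")`.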

Feather <.feather> file

  • Uncompressed (~560MB) filename: <10_perc_sample.feather>
  • Compressed (~275MB) filename: <10_perc_sample_feather.zip>

Footnotes

  1. We ignore the first quarter of 2023.