clean_all_sections.Rd
Applies section-specific cleaning and reshaping to each element of the raw scraped data list. Each section has unique quirks that require custom handling before the data can be combined into a lookup table.
clean_all_sections(datas, url_pdf)
datas
A named list of raw tibbles as returned by scrape_all_sections.
url_pdf
URL to the CDC mortality public use data page, used to scrape mortality user guide links separately.
The same named list with each element cleaned and pivoted to wide format, with one row per year and columns for each subsection's URL, file size, and file type.
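The wide shape described here can be illustrated with a minimal sketch. It assumes a long-format input with hypothetical columns `year`, `subsection`, `url`, `file_size`, and `file_type`, and uses `tidyr::pivot_wider()`; the real column names, URLs, and reshaping code inside `clean_all_sections` may differ.

```r
library(tibble)
library(tidyr)

# Toy long-format table: one row per year and subsection.
# All column names and URLs are hypothetical placeholders.
raw <- tribble(
  ~year, ~subsection,  ~url,                             ~file_size, ~file_type,
  2015,  "user_guide", "https://example.org/ug2015.pdf", "1 MB",     "pdf",
  2015,  "us_data",    "https://example.org/us2015.zip", "200 MB",   "zip",
  2016,  "us_data",    "https://example.org/us2016.zip", "210 MB",   "zip"
)

# Pivot to one row per year, with a url / file_size / file_type
# column for each subsection (e.g. url_user_guide, url_us_data).
wide <- pivot_wider(
  raw,
  id_cols     = year,
  names_from  = subsection,
  values_from = c(url, file_size, file_type)
)
```

Years that lack a given subsection, such as the 2016 user guide in this toy data, end up with `NA` in the corresponding columns.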
Section-specific handling:
Addenda filtered out; user guide URL forward-filled for years without a dedicated guide.
Rows with no U.S. Data URL are dropped.
Redundant 1995-1997 file dropped in favour of the superseding 1995-2000 file.
User guides are hosted on a separate page and scraped independently. 1997 and 1998 require a further dedicated scrape to extract the Detail Record Layout PDF. User guide URL is forward-filled for years without a dedicated guide.
For 2014 and 2015, the plain U.S. Data and U.S. Territories files are dropped in favour of the richer "with cause of death" versions.
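Two of the rules above, forward-filling a user guide URL and dropping rows with no U.S. Data URL, can be sketched on a single toy table. The description applies them to separate sections; they are combined here only for brevity. The column names `url_user_guide` and `url_us_data` and all URLs are hypothetical, and the function's actual implementation may differ.

```r
library(tibble)
library(dplyr)
library(tidyr)

# Toy wide table; column names and URLs are hypothetical placeholders.
wide <- tribble(
  ~year, ~url_user_guide,                  ~url_us_data,
  2010,  "https://example.org/ug2010.pdf", "https://example.org/us2010.zip",
  2011,  NA,                               "https://example.org/us2011.zip",
  2012,  NA,                               NA,
  2013,  "https://example.org/ug2013.pdf", "https://example.org/us2013.zip"
)

cleaned <- wide %>%
  arrange(year) %>%
  # Carry the most recent user guide URL forward into years without one.
  fill(url_user_guide, .direction = "down") %>%
  # Drop rows that have no U.S. Data URL.
  filter(!is.na(url_us_data))
```

Sorting by year before filling matters: `fill()` copies values in row order, so an unsorted table would carry the wrong guide forward.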