scrape_cdc_section.Rd

Extracts downloadable file links from a CDC Vital Statistics page section identified by an anchor ID. The function navigates the HTML structure, collects links from listScroll elements, and returns a tidy tibble with metadata about each file.
scrape_cdc_section(page, anchor_id, section_name, subsection_names)

page: An HTML document returned by rvest::read_html().
anchor_id: Character string giving the HTML anchor ID for the section.
section_name: Human-readable name of the section.
subsection_names: Character vector of subsection names. Its length must match the number of listScroll elements found in the section.
A tibble with columns:
Section name
Subsection name
Text of the download link
Extracted year or leading label
File size string, if present
Absolute URL to the file
File extension
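As a rough illustration of how the year or leading label, the file size, and the file extension might be derived from a link, here is a minimal base R sketch. The link text format ("2020 (12 MB)"), the example.org URLs, and the variable names label, size, and ext are all assumptions made for this example; the function's actual parsing rules and column names are not documented here.

```r
# Hypothetical link text and URLs; the real CDC page may format these differently.
link_text <- c("2020 (12 MB)", "User Guide (1.2 MB)", "2019")
url <- c(
  "https://example.org/files/births2020.zip",
  "https://example.org/files/guide.pdf",
  "https://example.org/files/births2019.csv"
)

# Year if the text starts with four digits, otherwise the text before any "(".
label <- ifelse(
  grepl("^[0-9]{4}", link_text),
  substr(link_text, 1, 4),
  trimws(sub("\\(.*$", "", link_text))
)

# Size string taken from inside parentheses, NA when no parentheses are present.
size <- ifelse(
  grepl("\\([^)]*\\)", link_text),
  sub("^.*\\(([^)]*)\\).*$", "\\1", link_text),
  NA_character_
)

# File extension taken from the end of the URL.
ext <- tools::file_ext(url)

data.frame(link_text, label, size, url, ext, stringsAsFactors = FALSE)
```

Keeping each derived field as its own vectorised step makes it easier to adjust one rule (for example, the size pattern) without touching the others.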
The function assumes the CDC page structure uses .listScroll
containers and that the anchor is nested three levels below the section
root. Changes to page structure may require updating the DOM traversal.
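The traversal described above can be sketched roughly as follows. This is not the function's source: the miniature HTML, the anchor ID "Births", the subsection names, the example.org base URL, and the output column names are all hypothetical stand-ins chosen to mirror the stated assumptions (an anchor three levels below the section root, with links grouped in .listScroll containers).

```r
library(rvest)  # xml2 is installed as an rvest dependency and is called via xml2::

# Hypothetical miniature of the assumed layout; the real CDC markup may differ.
html <- '
<div class="section">
  <div><div><a id="Births"></a></div></div>
  <div class="listScroll">
    <a href="/files/births2020.zip">2020 (12 MB)</a>
    <a href="/files/births2019.zip">2019 (11 MB)</a>
  </div>
  <div class="listScroll">
    <a href="/files/guide2020.pdf">2020 User Guide</a>
  </div>
</div>'

page <- xml2::read_html(html)

# Locate the anchor, then climb three levels to reach the section root.
anchor <- html_element(page, "#Births")
root <- xml2::xml_parent(xml2::xml_parent(xml2::xml_parent(anchor)))

# One .listScroll container per subsection; the counts must agree.
lists <- html_elements(root, ".listScroll")
subsection_names <- c("Data files", "User guides")
stopifnot(length(lists) == length(subsection_names))

# Collect the links from each container and label them with their subsection.
links <- lapply(seq_along(lists), function(i) {
  a <- html_elements(lists[[i]], "a")
  data.frame(
    subsection = subsection_names[i],
    link_text = html_text2(a),
    url = xml2::url_absolute(html_attr(a, "href"), "https://example.org/"),
    stringsAsFactors = FALSE
  )
})
result <- do.call(rbind, links)
result
```

If the page layout changes, the two places most likely to need attention in a sketch like this are the number of xml_parent() calls and the ".listScroll" selector.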