Get an HTML page from a URL — get_html

Downloads a web page with httr2 and returns it as a parsed HTML document. The result can be passed directly to scraping functions such as scrape_cdc_section().

get_html_page(url)

Arguments

url: Character string containing the URL of the web page to retrieve.

Value

An HTML document of class xml_document, ready for use with rvest functions.
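
As a rough illustration of what "ready for use with rvest functions" means, the sketch below builds a stand-in xml_document from a literal string, so it runs offline, and queries it with rvest. The markup is invented for the example and is not taken from any real page.

```r
library(rvest)

# Hypothetical markup standing in for a downloaded page.
page <- minimal_html('<h2>Births</h2><a href="birth.pdf">User Guide</a>')

# Any object of class xml_document can be queried with CSS selectors.
links <- html_elements(page, "a")

link_text <- html_text2(links)        # visible text of each link
link_href <- html_attr(links, "href") # value of each href attribute
```

An object returned by get_html_page() should be usable in place of `page` here, since both are of class xml_document.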

Details

The function builds a GET request with httr2, sets a browser-like user agent, performs the request, and parses the response body into an HTML document.
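
The steps above could be sketched roughly as follows. This is not the package's source: the names build_page_request() and get_html_page_sketch() and the user-agent string are assumptions made for illustration, and the real function may differ in error handling or headers.

```r
library(httr2)

# Hypothetical helper: build the request without sending it.
build_page_request <- function(url) {
  stopifnot(is.character(url), length(url) == 1)
  req <- request(url)
  # Assumed placeholder user agent; the real function may use another string.
  req_user_agent(req, "Mozilla/5.0 (compatible; example-scraper)")
}

# Hypothetical sketch of the download-and-parse step.
get_html_page_sketch <- function(url) {
  # req_perform() sends a GET by default and raises an error on HTTP 4xx/5xx.
  resp <- req_perform(build_page_request(url))
  # resp_body_html() parses the body with xml2 and returns an xml_document.
  resp_body_html(resp)
}
```

Splitting request construction from execution keeps the network call in one place, so the request itself can be inspected or tested without contacting a server.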

Examples

if (FALSE) { # \dontrun{
page <- get_html_page("https://www.cdc.gov/nchs/data_access/VitalStatsOnline.htm")
births <- scrape_cdc_section(
    page, "Births", "birth",
    c("User Guide", "U.S. Data", "U.S. Territories")
)
} # }