# install.packages("CDCPLACES")
# install.packages("tidyverse")
library(CDCPLACES)
library(tidyverse)
Brenden Smith
September 19, 2024
This is part of the CDCPLACES blog series. To view the other posts in this series click here.
This brief blog post explains the new additions to CDCPLACES 1.1.8. This update provides a few new features:
Updated 2024 release data, including several new measures under the health-related social needs category
Two new arguments to help filter your data, cat and age_adjust
The ability to query Zip Code Tabulation Areas (ZCTAs)
Improved functionality when querying counties with the same name across different states
In addition to these features, CDCPLACES now depends on yyjsonr instead of jsonlite. Originally, the get_places function included a step to clean the returned geolocation variable (essentially a centroid of the geography queried). This step was removed because it was computationally intensive on larger queries and unnecessary given the support for shapefiles with the geometry argument. These changes drastically improve the speed of the package.
To begin, we will load the required packages.
With the 2024 release of the PLACES data, the default option for the release argument in get_places has been updated to “2024”. You can find all the details of these updated data in the release notes.
An exciting addition to the PLACES data is the set of new health-related social needs variables. These include: social isolation, food stamps, food insecurity, housing insecurity, utility services threat, transportation barriers, and lack of social and emotional support. These measures are only available in 39 states and the District of Columbia (DC).
You can view these measures by calling get_dictionary. The category ID for these new measures is “SOCLNEED”.
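The call behind the output that follows is not preserved in this text. A minimal sketch, assuming the dictionary is simply filtered on its categoryid column with dplyr (the object name soc_need is ours), might look like this:

```r
library(CDCPLACES)
library(dplyr)

# get_dictionary() returns one row per PLACES measure.
# Assumption: the new measures are isolated by filtering on the
# categoryid column that appears in the printed output.
soc_need <- get_dictionary() |>
  filter(categoryid == "SOCLNEED")

soc_need
```

Both calls need an internet connection, since the dictionary is retrieved from the CDC rather than shipped with the package.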
# A tibble: 7 × 17
  measureid   measure_full_name      measure_short_name categoryid category_name
  <chr>       <chr>                  <chr>              <chr>      <chr>
1 ISOLATION   Feeling socially isol… Social Isolation   SOCLNEED   Health-Relat…
2 FOODSTAMP   Received food stamps … Food Stamps        SOCLNEED   Health-Relat…
3 FOODINSECU  Food insecurity in th… Food Insecurity    SOCLNEED   Health-Relat…
4 HOUSINSECU  Housing insecurity in… Housing Insecurity SOCLNEED   Health-Relat…
5 SHUTUTILITY Utility services thre… Utilities Service… SOCLNEED   Health-Relat…
6 LACKTRPT    Lack of reliable tran… Transportation Ba… SOCLNEED   Health-Relat…
7 EMOTIONSPT  Lack of social and em… Lack of Social/Em… SOCLNEED   Health-Relat…
# ℹ 12 more variables: places_release_2024 <chr>, measurename16_23 <chr>,
# places_release_2023 <chr>, places_release_2022 <chr>,
# places_release_2021 <chr>, places_release_2020 <chr>,
# `_500_cities_release_2019` <chr>, `_500_cities_release_2018` <chr>,
# `_500_cities_release_2017` <chr>, `_500_cities_release_2016` <chr>,
# frequency_brfss_year <chr>, shortname16_23 <chr>
Some minor quality-of-life improvements are introduced with the new arguments cat and age_adjust.
We can filter our results to return a set of measures by category ID.
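The query itself is missing from this text. Judging from the output that follows (Alabama counties, 938 rows), it was probably close to the sketch below. The state value and the use of the default county geography are inferred from that output rather than stated in the post, and the object name al_needs is ours.

```r
library(CDCPLACES)

# Assumption: Alabama, at the default (county) geography.
# cat takes a category ID, here the new health-related social needs group.
al_needs <- get_places(state = "AL", cat = "SOCLNEED")

al_needs
```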
# A tibble: 938 × 20
   year  stateabbr statedesc locationname datasource category            measure
   <chr> <chr>     <chr>     <chr>        <chr>      <chr>               <chr>
 1 2022  AL        Alabama   Choctaw      BRFSS      Health-Related Soc… Feelin…
 2 2022  AL        Alabama   Colbert      BRFSS      Health-Related Soc… Food i…
 3 2022  AL        Alabama   Escambia     BRFSS      Health-Related Soc… Food i…
 4 2022  AL        Alabama   Greene       BRFSS      Health-Related Soc… Utilit…
 5 2022  AL        Alabama   Morgan       BRFSS      Health-Related Soc… Receiv…
 6 2022  AL        Alabama   Franklin     BRFSS      Health-Related Soc… Feelin…
 7 2022  AL        Alabama   Dale         BRFSS      Health-Related Soc… Food i…
 8 2022  AL        Alabama   Crenshaw     BRFSS      Health-Related Soc… Food i…
 9 2022  AL        Alabama   Greene       BRFSS      Health-Related Soc… Feelin…
10 2022  AL        Alabama   Tuscaloosa   BRFSS      Health-Related Soc… Housin…
# ℹ 928 more rows
# ℹ 13 more variables: data_value_unit <chr>, data_value_type <chr>,
# data_value <dbl>, low_confidence_limit <dbl>, high_confidence_limit <dbl>,
# totalpopulation <chr>, totalpop18plus <chr>, locationid <chr>,
# categoryid <chr>, measureid <chr>, datavaluetypeid <chr>,
# short_question_text <chr>, geolocation <list>
If a measure is provided as well as a category, the category will override it. A message is displayed in the console noting this when it occurs.
To return only the age-adjusted prevalence rates, we can set the argument age_adjust to TRUE. Age-adjusted rates are only available at the county level.
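As a sketch of what such a call might look like, the same Alabama query can be repeated with age_adjust = TRUE. The state and category values are assumptions inferred from the output that follows, which has 469 rows, half of the 938 returned without the argument. The object name al_needs_adj is ours.

```r
library(CDCPLACES)

# Assumption: same Alabama / SOCLNEED query as before, keeping only
# the age-adjusted prevalence rows.
al_needs_adj <- get_places(
  state = "AL",
  cat = "SOCLNEED",
  age_adjust = TRUE
)

al_needs_adj
```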
# A tibble: 469 × 20
   year  stateabbr statedesc locationname datasource category            measure
   <chr> <chr>     <chr>     <chr>        <chr>      <chr>               <chr>
 1 2022  AL        Alabama   Escambia     BRFSS      Health-Related Soc… Food i…
 2 2022  AL        Alabama   Morgan       BRFSS      Health-Related Soc… Receiv…
 3 2022  AL        Alabama   Dale         BRFSS      Health-Related Soc… Food i…
 4 2022  AL        Alabama   Etowah       BRFSS      Health-Related Soc… Lack o…
 5 2022  AL        Alabama   Wilcox       BRFSS      Health-Related Soc… Housin…
 6 2022  AL        Alabama   Limestone    BRFSS      Health-Related Soc… Food i…
 7 2022  AL        Alabama   Coosa        BRFSS      Health-Related Soc… Food i…
 8 2022  AL        Alabama   Crenshaw     BRFSS      Health-Related Soc… Housin…
 9 2022  AL        Alabama   Cleburne     BRFSS      Health-Related Soc… Lack o…
10 2022  AL        Alabama   Jackson      BRFSS      Health-Related Soc… Receiv…
# ℹ 459 more rows
# ℹ 13 more variables: data_value_unit <chr>, data_value_type <chr>,
# data_value <dbl>, low_confidence_limit <dbl>, high_confidence_limit <dbl>,
# totalpopulation <chr>, totalpop18plus <chr>, locationid <chr>,
# categoryid <chr>, measureid <chr>, datavaluetypeid <chr>,
# short_question_text <chr>, geolocation <list>
A new option has been added to query ZCTAs. To do this, simply set the geography argument equal to “zcta”.
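The query that creates the w_sleep object used in the plotting code further down is not preserved in this text. Based on that plot's title and subtitle, it was probably close to the following sketch. The measure ID "SLEEP", the county value "Barry", and geometry = TRUE are all assumptions inferred from the plot rather than taken from the post.

```r
library(CDCPLACES)

# Assumptions: measure ID "SLEEP" (sleeping less than 7 hours) and
# Barry County, Michigan, both taken from the plot labels.
# geometry = TRUE requests shapefiles so the result can be drawn with geom_sf().
w_sleep <- get_places(
  geography = "zcta",
  state = "MI",
  county = "Barry",
  measure = "SLEEP",
  geometry = TRUE
)
```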
Like other geographies, we can query shapefiles in the same call and easily plot the output:
w_sleep |>
  ggplot(aes(fill = data_value, label = locationname)) +
  geom_sf() +
  geom_sf_label(fill = "white") +
  theme_void() +
  scale_fill_viridis_c(labels = scales::percent_format(scale = 1)) +
  labs(title = "% Sleeping less than 7 hours among adults aged >=18 years",
       subtitle = "In Barry County, Michigan ZCTAs")
This update provides the ability to query ZCTAs in different states and counties at the same time. This can raise issues when we want to look at counties that have the same name in multiple states.
Consider the following example. If we were interested in looking at dental health access around the Michigan/Ohio border and the Toledo area, we might query Monroe and Lucas Counties. We would set the query up like this:
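The original query is missing from this text. A sketch of what it may have looked like is given below, using the object name tol from the plotting code later in the post. The measure ID "DENTAL" and the ZCTA geography are assumptions inferred from that plot's title and subtitle.

```r
library(CDCPLACES)

# Assumptions: measure ID "DENTAL" and ZCTA geography, inferred from the plot.
# Both states and both county names are passed as vectors; because Ohio also
# has a Monroe County, an interactive session is expected to show the
# overlap prompt described in the text.
tol <- get_places(
  geography = "zcta",
  state = c("MI", "OH"),
  county = c("Monroe", "Lucas"),
  measure = "DENTAL",
  geometry = TRUE
)
```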
The main issue here is that Ohio also has a Monroe County. CDCPLACES will automatically check whether your returned data contain these overlaps. You will see output in the console that looks like this:
The package will ask whether you want to include these overlaps. It will then ask you to specify the counties you wish to exclude from your returned data:
If you choose not to make any exclusions, you will get the full data with overlaps. In this example, I excluded Ohio’s Monroe County because it is far from the area of interest. [1]
We can now plot our returned data:
tol |>
  ggplot(aes(fill = data_value)) +
  geom_sf() +
  theme_void() +
  scale_fill_viridis_c(labels = scales::percent_format(scale = 1)) +
  labs(title = "Visited a dentist or dental clinic in the past year among adults aged >=18 years",
       subtitle = "In Monroe County, Michigan and Lucas County, Ohio ZCTAs")
[1] Note that when this function is called in an R Markdown or Quarto document, the interactive prompt is bypassed: the full data, overlaps included, are returned when knitting/rendering the document. If this is your use case, it is recommended to disregard this functionality and filter your data once it is returned.
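To illustrate filtering after the fact, here is a minimal sketch on a toy data frame that stands in for a county-level result. The rows and values are made up for illustration. Only the column names stateabbr, locationname, and data_value come from the outputs shown earlier, where locationname holds the county name. For ZCTA results the county is not identified by locationname in the same way, so the equivalent filter would depend on the columns actually returned.

```r
library(dplyr)

# Toy stand-in for a county-level get_places() result (values are invented).
places <- data.frame(
  stateabbr    = c("MI", "OH", "OH"),
  locationname = c("Monroe", "Monroe", "Lucas"),
  data_value   = c(60.1, 55.2, 58.3)
)

# Drop Ohio's Monroe County while keeping Michigan's Monroe County.
filtered <- places |>
  filter(!(stateabbr == "OH" & locationname == "Monroe"))

filtered
```

Filtering on the state and county together, rather than on the county name alone, is what keeps the Michigan county in the data.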