EPA Border

Maps
R
Author

George Girton

Published

January 28, 2024

Get the geography

I searched for, and found, the 2022 Congressional districts shapefile.

I downloaded the file and unzipped it into a folder.

You may use a different folder.

```{r}
library(tidyverse)
library(sf)

# https://catalog.data.gov/dataset/2022-cartographic-boundary-file-shp-118th-congressional-districts-for-united-states-1-20000000

wd <- getwd()

# You could easily do this a different way!

folder <- paste0(wd,"/","posts/RWithoutStatistics/")
datafolder <- paste0(folder,"CD2022/")  # folder already ends in "/"

# Then, read the file
CD2022 <- read_sf(paste0(datafolder,"cb_2022_us_cd118_20m.shp"))

#Check it out on the map

CD2022 %>%
  ggplot() +
  geom_sf(
    color = "#00ff00",
    linewidth = 1
  )
## Ok, this worked great.
# it maps HI and AK including the Aleutians.
## Let's focus on the US-Mexico border.
```
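As the comment in the chunk says, the path can be put together in other ways. Here is a minimal sketch using base R's `file.path()`, which places exactly one separator between components. The relative folder layout is only an assumption for illustration; substitute wherever you unzipped the shapefile.

```r
# Assumed layout for illustration: a CD2022 folder under the post's folder
post_folder <- file.path("posts", "RWithoutStatistics")
data_folder <- file.path(post_folder, "CD2022")
shp_path <- file.path(data_folder, "cb_2022_us_cd118_20m.shp")

shp_path
# file.exists(shp_path) reports whether the download is where you expect it
```

Because `file.path()` handles the separators, none of the pieces needs a leading or trailing slash.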

Border areas

I’m sure there’s a way in R to work out the US-Mexico border districts, but I have my Scan/US subscription at hand (Scan/US web page). Disclaimer: I am a Scan/US employee; all opinions expressed here are my own. I load the shapefile into Scan/US, use the mouse to draw a polygon around the congressional districts on the border to create a grouping, and then export the GEOIDs (geographic IDs) of those districts with Scan/US’s “Export Data” menu entry. The border congressional districts are shown in orange below.

grouped Congressional Border districts
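For readers without Scan/US, one possible R route is a spatial test with `sf`. This is only a sketch on toy geometry, not the method used in this post: the squares, the made-up IDs, and the "border" line are all invented for illustration, and a real border line would have to come from a separate data source.

```r
library(sf)

# Toy geometry, not real data: three unit squares stacked north of a "border"
# line that runs along y = 0. No CRS is set, so sf treats this as planar.
make_square <- function(ymin) {
  st_polygon(list(rbind(
    c(0, ymin), c(1, ymin), c(1, ymin + 1), c(0, ymin + 1), c(0, ymin)
  )))
}
districts <- st_sf(
  GEOID = c("9901", "9902", "9903"),  # made-up IDs
  geometry = st_sfc(make_square(0), make_square(1), make_square(2))
)
border <- st_sfc(st_linestring(rbind(c(-1, 0), c(2, 0))))

# TRUE for every district that shares at least one point with the border line
on_border <- st_intersects(districts, border, sparse = FALSE)[, 1]
border_ids <- districts$GEOID[on_border]
border_ids
```

With real boundaries the district edges rarely coincide exactly with a border line from another source, so a distance tolerance (for example with `st_is_within_distance()`) would probably be needed.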

Format the vector!

Get the district array in the right format. 🍸🙀

```{r}
# Not correct: R reads 0648 as the number 648, so the leading zeros are lost
BorderDistricts <- c(0648,0650,0651,0652,0406,0625,0407,0409,3502,4823,4834,4828,4815,0649)

# Also not correct -- I thought it was in character, with elided zeroes.  Nope!
BorderDistricts <- c(648,650,651,652,406,625,407,409,3502,4823,4834,4828,4815,649)

# Correct!  GEOID is a zero-padded character string, so a character match will work
BorderDistricts <- c("0648","0650","0651","0652","0406","0625","0407","0409","3502","4823","4834","4828","4815","0649")

BorderCD <- CD2022 |> filter(GEOID %in% BorderDistricts)
```
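If the exported IDs arrive as numbers, the zero-padded strings can be rebuilt rather than retyped. A minimal sketch in base R, using two of the district IDs from above:

```r
# Numeric literals do not keep leading zeros: 0648 is just the number 648
as_number <- c(0648, 3502)
as_number

# Pad back to four characters: two-digit state FIPS + two-digit district
padded <- sprintf("%04d", as_number)
padded
```

`sprintf("%04d", ...)` accepts whole-valued numbers and returns character strings, which is the form `GEOID %in% BorderDistricts` needs.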

Ok now get the EPA points

This URL (below) might change, you never know, but at the moment it works:

Get the EPA FRS file here

```{r}
# Get the file https://www.epa.gov/frs/epa-frs-facilities-state-single-file-csv-download
### Referred to on this site: https://www.epa.gov/frs/geospatial-data-download-service

epa_frs <- read_csv("/Users/georgegirton/Downloads/Facilities/national_single/NATIONAL_SINGLE.CSV")
problems(epa_frs)

what <- unique(epa_frs$NAICS_CODE_DESCRIPTIONS)
what
# what |> write_file("epa_frs.txt") # Error: Expected string vector of length 1

# 'cats' is short for categories.  It has nothing to do with actual 🐈🐈🐈
outfile <- paste0(folder,"cats.txt")
# write_lines() writes one element per line and overwrites the file,
# so re-running the chunk does not append duplicates
readr::write_lines(what, outfile)
```

Some filtering

Choose the fields to keep (with dplyr’s ‘select’), filter out other countries, and distill a list of locations within 100 km of the US-Mexico border, using the criterion

US_MEXICO_BORDER_IND == "Yes"

```{r}
# Column names as strings; bare names inside c() would be looked up as objects and fail
keepfields <- c("REGISTRY_ID", "PRIMARY_NAME", "US_MEXICO_BORDER_IND", "INTEREST_TYPES", "NAICS_CODE_DESCRIPTIONS", "LOCATION_ADDRESS", "CITY_NAME", "STATE_CODE", "COUNTRY_NAME", "CONGRESSIONAL_DIST_NUM", "LONGITUDE83", "LATITUDE83")

OtherCountries <- c("AFGHANISTAN","ALBANIA","ALGERIA","AMERICAN SAMOA","AUSTRALIA","BASSAS DA INDIA","BELARUS","BR","BRAZIL","BRITISH VIRGIN ISLANDS","BURKINA FASO","CANADA","CHINA","COOK ISLANDS","DOMINICAN REPUBLIC","EAST TIMOR","FRANCE","GEORGIA","GERMANY","GREAT BRITAIN (UK)","GREECE","GUADELOUPE","GUAM","HONG KONG","INDIA","ISRAEL","KIRIBATI","MALAYSIA","MAURITIUS","MEXICO","MX","NETHERLANDS","NORTHERN MARIANA ISLANDS","NORWAY","PORTUGAL","PUERTO RICO","RQ","SAINT KITTS AND NEVIS","SAIPAN","SENEGAL","SUDAN","TAIWAN","THE GAMBIA","UGANDA","UNIT","UNITED ARAB EMIRATES","UNITED KINGDOM","UNITED STATES MINOR OUTLYING ISLANDS","URUGUAY","US MINOR OUTLYING ISLANDS","UZBEKISTAN","VANUATU","VATICAN CITY STATE (HOLY SEE)","VENEZUELA","VIET NAM","VIRGIN ISLANDS (U.S.)","VQ")

reduced <- epa_frs |>
  select(all_of(keepfields))

usa_only_coords <- reduced |>
  filter(!COUNTRY_NAME %in% OtherCountries) |>
  filter(!is.na(LONGITUDE83))

## 'Yes' means within 100 km of the border
nearborder <- usa_only_coords |>
    filter(US_MEXICO_BORDER_IND =="Yes")

```
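One detail of the country filter is worth knowing: `%in%` returns `FALSE` for `NA`, so negating it keeps rows whose `COUNTRY_NAME` is missing. A minimal sketch with made-up values (how the US entries are actually spelled in the EPA file is not shown in this post, so "UNITED STATES" here is an assumption):

```r
# Hypothetical country values, including a missing one
country <- c("MEXICO", "CANADA", NA, "UNITED STATES")
excluded <- c("MEXICO", "CANADA")

# ! applies to the whole %in% result, so this reads as !(country %in% excluded)
keep <- !country %in% excluded
keep
```

Rows with a missing country therefore survive the exclusion list, which is probably what you want when most US facilities leave the field blank, but it is worth checking with `table(reduced$COUNTRY_NAME, useNA = "ifany")`.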

Aaaand …. mapping

```{r}
nearborder |>  ggplot(aes(x=LONGITUDE83, y=LATITUDE83)) + geom_point()
# that worked great

BorderCD$centroid <- st_centroid(BorderCD$geometry)

# I got these three lines from Bing. And not Bing Crosby either.  Thanks, Bing!
BorderCD$coords <- st_coordinates(BorderCD$centroid )
# The x and y coordinates can be accessed as follows:
BorderCD$x <- BorderCD$coords[,1]
BorderCD$y <- BorderCD$coords[,2]
```

Now, getting back to business

```{r}
BorderCD |>
  ggplot() +
  geom_sf(color = "#ff0000", linewidth = 1) +
  geom_point(data = nearborder,
    aes(x=LONGITUDE83, y=LATITUDE83),
    shape = 21, size = 1, fill = "#ff7400", color = "purple"
  ) +
  geom_text(data= BorderCD,
            aes(x=x, y=y, label=GEOID),
            color = "darkblue", fontface = "bold",
            check_overlap = FALSE)
```

Border districts with locations from EPA facility registry service
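As an alternative to computing centroid columns by hand, ggplot2 also provides `geom_sf_text()`, which derives a label position from each geometry itself. A minimal sketch on toy squares with made-up IDs, not the real districts:

```r
library(sf)
library(ggplot2)

# Toy data: two adjacent unit squares with made-up IDs
sq <- function(xmin) {
  st_polygon(list(rbind(
    c(xmin, 0), c(xmin + 1, 0), c(xmin + 1, 1), c(xmin, 1), c(xmin, 0)
  )))
}
toy <- st_sf(GEOID = c("9901", "9902"), geometry = st_sfc(sq(0), sq(1)))

# geom_sf_text() picks a point for each feature and draws the label there
p <- ggplot(toy) +
  geom_sf(color = "#ff0000", linewidth = 1) +
  geom_sf_text(aes(label = GEOID), color = "darkblue", fontface = "bold")
p
```

The explicit `x` and `y` columns are still handy if you want to nudge individual labels or reuse the coordinates elsewhere.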

Some more data exploration

```{r}
unique(nearborder$STATE_CODE)
table(nearborder$STATE_CODE)

nearborder$CD <- paste0(nearborder$STATE_CODE,nearborder$CONGRESSIONAL_DIST_NUM)
table(nearborder$CD)

nearborder <- nearborder |> filter(STATE_CODE %in% c("CA","AZ","NM","TX"))
table(nearborder$CD)
```
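The `CD` labels built above look like "CA52", while the shapefile's `GEOID` looks like "0652". To line the two up, the state abbreviation can be swapped for its two-digit FIPS code. A sketch under assumptions: the example rows are made up, and the district number is assumed to be numeric (if the EPA column is character, convert it with `as.integer()` first).

```r
# State FIPS codes for the four border states
state_fips <- c(CA = "06", AZ = "04", NM = "35", TX = "48")

# Made-up example rows; in the post these would come from nearborder
state_code <- c("CA", "AZ", "NM", "TX")
district_num <- c(52, 7, 2, 23)

# Two-digit state FIPS followed by the zero-padded district number
geoid <- paste0(state_fips[state_code], sprintf("%02d", district_num))
geoid
```

A key in this form can be compared directly with `BorderDistricts` or joined to `BorderCD` by `GEOID`.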

Welcome to the Tacos Cult of Actions!

Did you follow along successfully? Welcome to the action cult & treat yourself to a taco!

— all photos Copyright © 2022-2024 George D Girton all rights reserved