Seven States: AZ, GA, MI, NV, NC, PA, WI

El Ohmbre! (Hombre)

I loaded a file covering the seven states in a recent poll, reported in The New Republic and the New York Post.

It shows the Hispanic ancestry in these states, but in tabular form. Perhaps ‘too many numbers’?

tibble [7 × 25] (S3: tbl_df/tbl/data.frame)
 $ statefips                          : chr [1:7] "#04" "#32" "#26" "#55" ...
 $ name                               : chr [1:7] "04 (AZ) Arizona" "32 (NV) Nevada" "26 (MI) Michigan" "55 (WI) Wisconsin" ...
 $ pop_spanish_speakers               : num [1:7] 1326443 597414 282048 260083 670397 ...
 $ pop_hispanic_mexican               : num [1:7] 1969385 669072 366932 285171 153031 ...
 $ pop_hispanic_puerto_rican          : num [1:7] 63237 31854 49425 66547 424104 ...
 $ pop_hispanic_cuban                 : num [1:7] 22918 38003 17289 7586 26190 ...
 $ pop_hispanic_dominican             : num [1:7] 6921 4639 7440 4121 116899 ...
 $ pop_hispanic_costa_rican           : num [1:7] 4006 2932 1950 1053 4245 ...
 $ pop_hispanic_guatemalan            : num [1:7] 23736 22225 12535 6508 17682 ...
 $ pop_hispanic_honduran              : num [1:7] 5705 6244 4651 3292 12825 ...
 $ pop_hispanic_nicaraguan            : num [1:7] 6217 7017 1355 4451 3165 ...
 $ pop_hispanic_salvadoran            : num [1:7] 21734 40227 4955 3233 13593 ...
 $ pop_hispanic_other_central_american: num [1:7] 5113 4178 2647 1274 5890 ...
 $ pop_hispanic_argentinean           : num [1:7] 5433 4047 2338 2611 8020 ...
 $ pop_hispanic_bolivian              : num [1:7] 1562 1546 454 544 1518 ...
 $ pop_hispanic_chilean               : num [1:7] 3978 1939 1784 1085 5299 ...
 $ pop_hispanic_colombian             : num [1:7] 14393 9477 6520 6473 25701 ...
 $ pop_hispanic_ecuadorian            : num [1:7] 5051 3226 2303 1949 16344 ...
 $ pop_hispanic_peruvian              : num [1:7] 9410 7745 4036 2941 12597 ...
 $ pop_hispanic_uruguayan             : num [1:7] 341 936 259 1052 1492 ...
 $ pop_hispanic_venezuelan            : num [1:7] 4429 3311 2979 2398 7333 ...
 $ pop_hispanic_south_american_nec    : num [1:7] 739 408 601 378 1716 ...
 $ pop_hispanic_spanish               : num [1:7] 81475 32891 22000 11113 31918 ...
 $ pop_hispanic_all_other             : num [1:7] 76936 22760 19886 10128 33755 ...
 $ state                              : chr [1:7] "AZ" "NV" "MI" "WI" ...

Yes, I did. I found the table of values to be ‘too many numbers’, so I asked one of the LLMs to send me some R code using ggplot so I could highlight the two larger populations, Mexican and Puerto Rican, in the seven states. It gave me this code, which adds the border color directly into the data frame.


# Define custom colors for borders
border_colors <- c("pop_hispanic_mexican" = "blue",
                   "pop_hispanic_puerto_rican" = "purple")

fill_colors <- c("pop_hispanic_mexican" = "#F8766D", "pop_hispanic_puerto_rican" = "#00BFC4")

# Add a new column for border color
df_long$border_color <- ifelse(df_long$ancestry %in% names(border_colors), df_long$ancestry, NA)
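
The snippet above relies on a long-format `df_long` with `ancestry` and `population` columns. As a minimal sketch of that reshaping and of what `border_color` ends up holding, here is a two-state version using the AZ and NV figures from the `str()` output. The names `reduced_demo`, `df_long_demo`, and `highlighted` are made up for this illustration.

```r
library(tidyr)

# Two states and three ancestry columns, values copied from the str() output above
reduced_demo <- data.frame(
  name = c("04 (AZ) Arizona", "32 (NV) Nevada"),
  pop_hispanic_mexican = c(1969385, 669072),
  pop_hispanic_puerto_rican = c(63237, 31854),
  pop_hispanic_cuban = c(22918, 38003)
)

# One row per state/ancestry pair instead of one column per ancestry
df_long_demo <- pivot_longer(
  reduced_demo,
  cols = starts_with("pop_hispanic"),
  names_to = "ancestry",
  values_to = "population"
)

# The two groups to outline (hypothetical helper vector)
highlighted <- c("pop_hispanic_mexican", "pop_hispanic_puerto_rican")

# Keep the ancestry name for highlighted groups, NA for everything else
df_long_demo$border_color <- ifelse(
  df_long_demo$ancestry %in% highlighted,
  df_long_demo$ancestry,
  NA
)

print(df_long_demo)
```

The Cuban rows get `NA` in `border_color`, which is what lets the color scale treat them differently from the two highlighted groups.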


myplot <- ggplot(df_long, aes(x = name, y = population, fill = ancestry)) +
  geom_bar(stat = "identity", position = "stack", aes(color = border_color)) +
  theme_minimal() +
  labs(title = "Hispanic Ancestry Distribution by State",
       x = "State",
       y = "Population",
       fill = "Ancestry",
       color = "Highlighted Ancestry") +
  theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
  scale_color_manual(values = border_colors, na.translate = TRUE)
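
One possible variation, not what the LLM produced: with `na.translate = TRUE`, the segments whose `border_color` is `NA` are still given the scale's `na.value` as an outline. Setting `na.translate = FALSE` should leave them without a border and drop the `NA` entry from the legend. The sketch below uses a small stand-in data frame (`demo`, a made-up name) with the AZ and NV values from the `str()` output.

```r
library(ggplot2)

# Small long-format stand-in for df_long (AZ and NV values from the str() output)
demo <- data.frame(
  name = rep(c("04 (AZ) Arizona", "32 (NV) Nevada"), each = 3),
  ancestry = rep(
    c("pop_hispanic_mexican", "pop_hispanic_puerto_rican", "pop_hispanic_cuban"),
    times = 2
  ),
  population = c(1969385, 63237, 22918, 669072, 31854, 38003)
)

border_colors <- c("pop_hispanic_mexican" = "blue",
                   "pop_hispanic_puerto_rican" = "purple")

# Same idea as in the post: ancestry name for highlighted groups, NA otherwise
demo$border_color <- ifelse(demo$ancestry %in% names(border_colors),
                            demo$ancestry, NA)

demo_plot <- ggplot(demo, aes(x = name, y = population, fill = ancestry)) +
  # geom_col() is shorthand for geom_bar(stat = "identity"); stacking is its default
  geom_col(aes(color = border_color)) +
  # na.translate = FALSE: NA borders are not mapped to a color or shown in the legend
  scale_color_manual(values = border_colors, na.translate = FALSE) +
  theme_minimal()

demo_plot
```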


# Convert the ggplot2 chart to an interactive plotly chart
ggplotly(myplot, tooltip = "fill")

You can see that Pennsylvania has a different ancestry balance from the other states: its Puerto Rican population is larger than its Mexican one.
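
To put a number on "ancestry balance", one option is to convert the counts to within-state shares. This is only a sketch: it uses the AZ and NV values from the `str()` output with three ancestry groups, so the shares are relative to those three groups, not to each state's full Hispanic population. The names `demo` and `shares` are made up for the example.

```r
library(dplyr)

# Long-format stand-in for df_long (AZ and NV values from the str() output)
demo <- data.frame(
  name = rep(c("04 (AZ) Arizona", "32 (NV) Nevada"), each = 3),
  ancestry = rep(
    c("pop_hispanic_mexican", "pop_hispanic_puerto_rican", "pop_hispanic_cuban"),
    times = 2
  ),
  population = c(1969385, 63237, 22918, 669072, 31854, 38003)
)

# Share of each ancestry within its state, largest first
shares <- demo |>
  group_by(name) |>
  mutate(share = population / sum(population)) |>
  ungroup() |>
  arrange(name, desc(share))

print(shares)
```

Running the same `group_by()` / `mutate()` step on the full `df_long` would give comparable shares for all seven states.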

---
title: "7 states poll"
author: "George Girton"
date: "2024-08-12"
categories:
  - R
  - ggplot
image: "Ohmbre.png"
execute:
  cache: true
code-fold: show
draft: false
---

# Seven States: AZ, GA, MI, NV, NC, PA, WI

El Ohmbre! (Hombre)

![electric hombre](Ohmbre.png)

```{r, echo=FALSE, message=FALSE, warning=FALSE}
#| code-fold: true
#| code-summary: "Show the R libraries used"
#|
library(tidyverse)
library(gt)
library(gtExtras)
library(htmltools)
library(plotly)

anc <- rgtableanc <- read_csv("HispanicAncestryDetail.csv")
# str(anc)
```

```{r, echo=FALSE, message=FALSE, warning=FALSE}
# Rename the columns to more reasonable names
HispAnc <- anc |>
  rename(
    state = `_sysOL.key:Object key`,
    type = `_sysOL.type:Object type`,
    name = `_sysOL.name:Object name`,
    pop_spanish_speakers = `_acssalng.langSH.1:24 Pop 5+, speak Spanish`,
    pop_hispanic_mexican = `_acssalng.Horg.1:24 Pop Hispanic, Mexican`,
    pop_hispanic_puerto_rican = `_acssalng.Horg.2:24 Pop Hispanic, Puerto Rican`,
    pop_hispanic_cuban = `_acssalng.Horg.3:24 Pop Hispanic, Cuban`,
    pop_hispanic_dominican = `_acssalng.Horg.4:24 Pop Hispanic, Dominican`,
    pop_hispanic_costa_rican = `_acssalng.Horg.5:24 Pop Hispanic, Costa Rican`,
    pop_hispanic_guatemalan = `_acssalng.Horg.6:24 Pop Hispanic, Guatemalan`,
    pop_hispanic_honduran = `_acssalng.Horg.7:24 Pop Hispanic, Honduran`,
    pop_hispanic_nicaraguan = `_acssalng.Horg.8:24 Pop Hispanic, Nicaraguan`,
    pop_hispanic_salvadoran = `_acssalng.Horg.9:24 Pop Hispanic, Salvadoran`,
    pop_hispanic_other_central_american = `_acssalng.Horg.10:24 Pop Hispanic, Other Central American`,
    pop_hispanic_argentinean = `_acssalng.Horg.11:24 Pop Hispanic, Argentinean`,
    pop_hispanic_bolivian = `_acssalng.Horg.12:24 Pop Hispanic, Bolivian`,
    pop_hispanic_chilean = `_acssalng.Horg.13:24 Pop Hispanic, Chilean`,
    pop_hispanic_colombian = `_acssalng.Horg.14:24 Pop Hispanic, Colombian`,
    pop_hispanic_ecuadorian = `_acssalng.Horg.15:24 Pop Hispanic, Ecuadorian`,
    pop_hispanic_peruvian = `_acssalng.Horg.16:24 Pop Hispanic, Peruvian`,
    pop_hispanic_uruguayan = `_acssalng.Horg.17:24 Pop Hispanic, Uruguayan`,
    pop_hispanic_venezuelan = `_acssalng.Horg.18:24 Pop Hispanic, Venezuelan`,
    pop_hispanic_south_american_nec = `_acssalng.Horg.19:24 Pop Hispanic, South American (nec)`,
    pop_hispanic_spanish = `_acssalng.Horg.20:24 Pop Hispanic, Spanish`,
    pop_hispanic_all_other = `_acssalng.Horg.21:24 Pop Hispanic, All other Hispanic`
  )

# Pull the two-letter state code out of names like "04 (AZ) Arizona"
HispAnc <- HispAnc |>
  mutate(state_code = str_extract(name, "(?<=\\()\\w{2}(?=\\))"))

somestates = c("AZ", "NV", "PA", "MI", "WI", "GA", "NC", "TX", "CA", "IL", "MT")

reduced <- HispAnc |>
  filter(state_code %in% somestates) |>
  select(-type) |>
  rename(statefips = state, state = state_code)
```

I loaded a file covering the seven states in a recent poll, reported in [The New Republic](https://newrepublic.com/article/184533/surprise-poll-reveals-key-trump-weakness-kamala-harris "Harris leads Trump by 55 percent to 37 percent in the head-to-head finding, which sampled 800 Latinos across Arizona, Nevada, Pennsylvania, Michigan, Wisconsin, Georgia, and North Carolina.") and the [New York Post](https://nypost.com/2024/08/05/us-news/kamala-harris-up-18-points-with-battleground-hispanics-cutting-trumps-support-democratic-aligned-pollster/ "Harris, per the survey, leads 55% to 37% among voters from that demographic in seven key battlegrounds: Arizona, Nevada, Pennsylvania, Michigan, Wisconsin, Georgia and North Carolina").

It shows the Hispanic ancestry in these states, but in tabular form. Perhaps 'too many numbers'?

```{r, echo=FALSE, message=FALSE, warning=FALSE}
str(reduced)
```

```{r, echo=FALSE, message=FALSE, warning=FALSE}
# Then making it long
# Pivot the data frame longer for easier manipulation
df_long <- reduced %>%
  pivot_longer(cols = starts_with("pop_hispanic"),
               names_to = "ancestry",
               values_to = "population")
```

Yes, I did. I found the table of values to be 'too many numbers', so I asked one of the LLMs to send me some R code using ggplot so I could highlight the two larger populations, Mexican and Puerto Rican, in the seven states. It gave me this code, which adds the border color directly into the data frame.

```{r}
# Define custom colors for borders
border_colors <- c("pop_hispanic_mexican" = "blue",
                   "pop_hispanic_puerto_rican" = "purple")

fill_colors <- c("pop_hispanic_mexican" = "#F8766D", "pop_hispanic_puerto_rican" = "#00BFC4")

# Add a new column for border color
df_long$border_color <- ifelse(df_long$ancestry %in% names(border_colors), df_long$ancestry, NA)
```

```{r}
myplot <- ggplot(df_long, aes(x = name, y = population, fill = ancestry)) +
  geom_bar(stat = "identity", position = "stack", aes(color = border_color)) +
  theme_minimal() +
  labs(title = "Hispanic Ancestry Distribution by State",
       x = "State",
       y = "Population",
       fill = "Ancestry",
       color = "Highlighted Ancestry") +
  theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
  scale_color_manual(values = border_colors, na.translate = TRUE)
```

```{r}
# Convert the ggplot2 chart to an interactive plotly chart
ggplotly(myplot, tooltip = "fill")
```

You can see that Pennsylvania has a different ancestry balance from the other states: its Puerto Rican population is larger than its Mexican one.