Exercise 2_3_1: Neighborhood Matrices

Thus far, we have only used Queen neighborhood matrices with our data. Let’s use this exercise to try out different variations. First of all, run the code below to compile the data that were also used in the lecture.

## Reading layer `Stimmbezirk' from data source 
##   `C:\Users\mueller2\a_talks_presentations\gesis-workshop-geospatial-techniques-R-2024\data\Stimmbezirk.shp' using driver `ESRI Shapefile'
## Simple feature collection with 543 features and 14 fields
## Geometry type: MULTIPOLYGON
## Dimension:     XY
## Bounding box:  xmin: 343914.7 ymin: 5632759 xmax: 370674.3 ymax: 5661475
## Projected CRS: ETRS89 / UTM zone 32N

## ℹ Using "','" as decimal and "'.'" as grouping mark. Use `read_delim()` for more control.

## Rows: 949 Columns: 79
## ── Column specification ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────
## Delimiter: ";"
## chr  (3): wahl, ags, gebiet-name
## dbl (71): gebiet-nr, max-schnellmeldungen, anz-schnellmeldungen, A1, A2, A3, A, B, B1, C, D, E, F, D1, F1, D2, F2, D3, F3, D4, F4, D5,...
## num  (1): datum
## lgl  (4): D30, F30, D31, F31
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

voting_districts <-
  sf::st_read("./data/Stimmbezirk.shp") |> 
  dplyr::transmute(Stimmbezirk = as.numeric(nummer)) |> 
  sf::st_transform(3035)

afd_votes <-
  glue::glue(
    "https://www.stadt-koeln.de/wahlen/bundestagswahl/09-2021/praesentation/\\
    Open-Data-Bundestagswahl476.csv"
  ) |> 
  readr::read_csv2() |> 
  dplyr::transmute(Stimmbezirk = `gebiet-nr`, afd_share = (F1 / F) * 100)

election_results <-
  dplyr::left_join(
    voting_districts,
    afd_votes,
    by = "Stimmbezirk"
  )

immigrants_cologne <-
  z11::z11_get_100m_attribute(STAATSANGE_KURZ_2) |> 
  terra::crop(election_results) |> 
  terra::mask(election_results)


inhabitants_cologne <-
  z11::z11_get_100m_attribute(Einwohner) |> 
  terra::crop(election_results) |> 
  terra::mask(election_results)

immigrant_share_cologne <-
  (immigrants_cologne / inhabitants_cologne) * 100

election_results <-
  election_results |> 
  dplyr::mutate(
    immigrant_share = 
      exactextractr::exact_extract(
        immigrant_share_cologne, election_results, 'mean', progress = FALSE
        ),
    inhabitants = 
      exactextractr::exact_extract(
        inhabitants_cologne, election_results, 'mean', progress = FALSE
        )
  )

1

As in the lecture, create a neighborhood (weight) matrix, but this time for both Queen and Rook neighborhoods. Also apply row normalization.

Clues

You can either use the spdep package with its function spdep::poly2nb() or the more modern approach of the sfdep package with the function sfdep::st_contiguity(). In both cases, you have to set the option queen = FALSE for Rook neighborhoods.
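To see what the queen option changes, here is a minimal base R sketch on a hypothetical 3 x 3 grid (toy data, not the Cologne districts, and not using spdep or sfdep). It builds Rook and Queen adjacency by hand and row-normalizes both; all object names are made up for illustration.

```r
# Toy 3 x 3 grid of cells, numbered row by row (hypothetical data)
grid <- expand.grid(col = 1:3, row = 1:3)

# Absolute row and column offsets between every pair of cells
d_row <- abs(outer(grid$row, grid$row, "-"))
d_col <- abs(outer(grid$col, grid$col, "-"))

# Rook: cells share an edge (offset of 1 in exactly one direction)
rook_adj <- (d_row + d_col) == 1
# Queen: cells share an edge or a corner
queen_adj <- pmax(d_row, d_col) == 1

# Row normalization: divide each row by its number of neighbors,
# so that the weights of every cell sum to 1
row_normalize <- function(adj) {
  adj <- adj * 1
  adj / rowSums(adj)
}
rook_W_toy <- row_normalize(rook_adj)
queen_W_toy <- row_normalize(queen_adj)

# Neighbor counts of the center cell (cell 5)
sum(rook_adj[5, ])
sum(queen_adj[5, ])
```

The center cell has four Rook neighbors but eight Queen neighbors, so each neighbor gets a weight of 1/4 or 1/8 after row normalization.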

Solution

# spdep
queen_neighborhood <-
  spdep::poly2nb(
    election_results,
    queen = TRUE
  )

queen_W <- spdep::nb2listw(queen_neighborhood, style = "W")

rook_neighborhood <-
  spdep::poly2nb(
    election_results,
    queen = FALSE
  )

rook_W <- spdep::nb2listw(rook_neighborhood, style = "W")

# sfdep
election_results <-
  election_results |> 
  dplyr::mutate(
    queen_neighborhood = sfdep::st_contiguity(election_results, queen = TRUE),
    queen_W = sfdep::st_weights(queen_neighborhood),
    rook_neighborhood = sfdep::st_contiguity(election_results, queen = FALSE),
    rook_W = sfdep::st_weights(rook_neighborhood)
  )

2

We have not used them so far, but you can also create distance-based weight matrices. Use the package of your choice again and create weights for a distance band between 0 and 5000 meters. Again, apply row normalization.

For this exercise, you must also convert the polygon data to point coordinates. We suggest using the centroids for this task:

election_results_centroids <- sf::st_centroid(election_results)

Use a map to verify that the conversion was successful.

Clues

If you use spdep, use the function spdep::dnearneigh(); if you use sfdep, use the function sfdep::st_dist_band().
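The idea behind a distance band can be sketched in base R with a handful of hypothetical points (coordinates in meters, assuming a projected CRS). This is only an illustration of the logic, not how spdep or sfdep implement it internally.

```r
# Hypothetical centroid coordinates in meters (toy data)
coords <- cbind(
  x = c(0, 3000, 6000, 20000),
  y = c(0, 0, 0, 0)
)

# Pairwise Euclidean distances between all points
dist_m <- as.matrix(dist(coords))

# Neighbors are all other points within the 0 to 5000 meter band
band_adj <- dist_m > 0 & dist_m <= 5000
n_neighbors <- rowSums(band_adj)

# Row-normalize; points without neighbors keep a row of zeros
band_W <- band_adj / ifelse(n_neighbors > 0, n_neighbors, 1)

n_neighbors
```

The last point lies more than 5000 meters from all others and ends up without neighbors. With real data, such isolates need attention: spdep::nb2listw() has a zero.policy argument for this case.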

Solution

# convert to centroids
election_results_centroids <- sf::st_centroid(election_results)

# map the centroids to check the conversion (tmap is not attached, so use the prefix)
tmap::tm_shape(election_results_centroids) +
  tmap::tm_dots()

# spdep
distance_neighborhood_5000 <-
  spdep::dnearneigh(election_results_centroids, 0, 5000)

distance_neighborhood_5000_W <- 
  spdep::nb2listw(distance_neighborhood_5000, style = "W")

# sfdep
election_results_centroids <-
  election_results_centroids |> 
  dplyr::mutate(
    neighbors_5000 = sfdep::st_dist_band(election_results_centroids, 0, 5000),
    weights_5000 = sfdep::st_weights(neighbors_5000)
  )

3

Now, let’s see how these different spatial weights perform in an analysis. Calculate Moran’s I and Geary’s C for each of the weights and report the results for the variable immigrant_share.

Clues

Which path you took before, spdep or sfdep, matters here, as it determines how you solve this exercise.
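Whichever package you use, it helps to know what the two statistics compute. The following base R sketch applies the textbook formulas to a hypothetical chain of five cells with row-normalized weights; the data and object names are invented for illustration.

```r
# Toy data: five cells in a row with steadily increasing values
x <- c(1, 2, 3, 4, 5)
n <- length(x)

# Each cell neighbors the cell directly before and after it
adj <- abs(outer(seq_len(n), seq_len(n), "-")) == 1
W <- adj / rowSums(adj)  # row-normalized weights

z <- x - mean(x)  # deviations from the mean
S0 <- sum(W)      # sum of all weights

# Moran's I: weighted cross-products of deviations
morans_i <- (n / S0) * sum(W * outer(z, z)) / sum(z^2)

# Geary's C: weighted squared differences between neighbors
gearys_c <- ((n - 1) / (2 * S0)) * sum(W * outer(x, x, "-")^2) / sum(z^2)

c(moran = morans_i, geary = gearys_c)
```

Because neighboring values are similar, Moran's I is clearly positive, and Geary's C falls below its expectation of 1. Both point to positive spatial autocorrelation.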

Solution

# spdep
spdep::moran.test(election_results$immigrant_share, listw = queen_W)

## 
##  Moran I test under randomisation
## 
## data:  election_results$immigrant_share  
## weights: queen_W    
## 
## Moran I statistic standard deviate = 20.428, p-value < 2.2e-16
## alternative hypothesis: greater
## sample estimates:
## Moran I statistic       Expectation          Variance 
##      0.5408375504     -0.0018450185      0.0007057415

spdep::moran.test(election_results$immigrant_share, listw = rook_W)

## 
##  Moran I test under randomisation
## 
## data:  election_results$immigrant_share  
## weights: rook_W    
## 
## Moran I statistic standard deviate = 19.86, p-value < 2.2e-16
## alternative hypothesis: greater
## sample estimates:
## Moran I statistic       Expectation          Variance 
##      0.5473356435     -0.0018450185      0.0007646997

spdep::moran.test(
  election_results_centroids$immigrant_share, 
  listw = distance_neighborhood_5000_W
)

## 
##  Moran I test under randomisation
## 
## data:  election_results_centroids$immigrant_share  
## weights: distance_neighborhood_5000_W    
## 
## Moran I statistic standard deviate = 22.666, p-value < 2.2e-16
## alternative hypothesis: greater
## sample estimates:
## Moran I statistic       Expectation          Variance 
##      1.227739e-01     -1.845018e-03      3.022943e-05

spdep::geary.test(election_results$immigrant_share, listw = queen_W)

## 
##  Geary C test under randomisation
## 
## data:  election_results$immigrant_share 
## weights: queen_W   
## 
## Geary C statistic standard deviate = 17.355, p-value < 2.2e-16
## alternative hypothesis: Expectation greater than statistic
## sample estimates:
## Geary C statistic       Expectation          Variance 
##       0.442840220       1.000000000       0.001030604

spdep::geary.test(election_results$immigrant_share, listw = rook_W)

## 
##  Geary C test under randomisation
## 
## data:  election_results$immigrant_share 
## weights: rook_W   
## 
## Geary C statistic standard deviate = 17.169, p-value < 2.2e-16
## alternative hypothesis: Expectation greater than statistic
## sample estimates:
## Geary C statistic       Expectation          Variance 
##       0.433792252       1.000000000       0.001087528

spdep::geary.test(
  election_results_centroids$immigrant_share, 
  listw = distance_neighborhood_5000_W
)

## 
##  Geary C test under randomisation
## 
## data:  election_results_centroids$immigrant_share 
## weights: distance_neighborhood_5000_W   
## 
## Geary C statistic standard deviate = 11.529, p-value < 2.2e-16
## alternative hypothesis: Expectation greater than statistic
## sample estimates:
## Geary C statistic       Expectation          Variance 
##      0.8815008215      1.0000000000      0.0001056496

# sfdep
library(magrittr)

election_results %$% 
  sfdep::global_moran_test(immigrant_share, queen_neighborhood, queen_W)

## 
##  Moran I test under randomisation
## 
## data:  x  
## weights: listw    
## 
## Moran I statistic standard deviate = 20.428, p-value < 2.2e-16
## alternative hypothesis: greater
## sample estimates:
## Moran I statistic       Expectation          Variance 
##      0.5408375504     -0.0018450185      0.0007057415

election_results %$% 
  sfdep::global_moran_test(immigrant_share, rook_neighborhood, rook_W)

## 
##  Moran I test under randomisation
## 
## data:  x  
## weights: listw    
## 
## Moran I statistic standard deviate = 19.86, p-value < 2.2e-16
## alternative hypothesis: greater
## sample estimates:
## Moran I statistic       Expectation          Variance 
##      0.5473356435     -0.0018450185      0.0007646997

election_results_centroids %$% 
  sfdep::global_moran_test(immigrant_share, neighbors_5000, weights_5000)

## 
##  Moran I test under randomisation
## 
## data:  x  
## weights: listw    
## 
## Moran I statistic standard deviate = 22.666, p-value < 2.2e-16
## alternative hypothesis: greater
## sample estimates:
## Moran I statistic       Expectation          Variance 
##      1.227739e-01     -1.845018e-03      3.022943e-05

election_results %$% 
  sfdep::global_c_test(immigrant_share, queen_neighborhood, queen_W)

## 
##  Geary C test under randomisation
## 
## data:  x 
## weights: listw   
## 
## Geary C statistic standard deviate = 17.355, p-value < 2.2e-16
## alternative hypothesis: Expectation greater than statistic
## sample estimates:
## Geary C statistic       Expectation          Variance 
##       0.442840220       1.000000000       0.001030604

election_results %$% 
  sfdep::global_c_test(immigrant_share, rook_neighborhood, rook_W)

## 
##  Geary C test under randomisation
## 
## data:  x 
## weights: listw   
## 
## Geary C statistic standard deviate = 17.169, p-value < 2.2e-16
## alternative hypothesis: Expectation greater than statistic
## sample estimates:
## Geary C statistic       Expectation          Variance 
##       0.433792252       1.000000000       0.001087528

election_results_centroids %$% 
  sfdep::global_c_test(immigrant_share, neighbors_5000, weights_5000)

## 
##  Geary C test under randomisation
## 
## data:  x 
## weights: listw   
## 
## Geary C statistic standard deviate = 11.529, p-value < 2.2e-16
## alternative hypothesis: Expectation greater than statistic
## sample estimates:
## Geary C statistic       Expectation          Variance 
##      0.8815008215      1.0000000000      0.0001056496
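The printed test outputs can be sanity-checked by hand. The sketch below copies the Queen-weights values from the outputs above and recomputes the expectation and the standard deviates. The formula used for Geary's C, (expectation - statistic) / sqrt(variance), is our reading of the printed numbers rather than a statement about spdep internals.

```r
# Values copied from the printed Queen-weights outputs
n_districts <- 543
moran_i <- 0.5408375504
moran_expectation <- -0.0018450185
moran_variance <- 0.0007057415

# Under the null hypothesis, the expected Moran's I is -1 / (n - 1)
expected_i <- -1 / (n_districts - 1)

# Standard deviate: distance from the expectation in standard deviations
moran_deviate <- (moran_i - moran_expectation) / sqrt(moran_variance)

geary_c <- 0.442840220
geary_expectation <- 1
geary_variance <- 0.001030604

# For Geary's C, values below 1 indicate positive autocorrelation,
# so the difference is taken as expectation minus statistic
geary_deviate <- (geary_expectation - geary_c) / sqrt(geary_variance)

round(c(moran = moran_deviate, geary = geary_deviate), 3)
```

Both deviates should come out close to the printed values of 20.428 and 17.355, and the expectation matches -1 / 542.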


Stefan Jünger & Anne-Kathrin Stroppe

Introduction to Geospatial Techniques for Social Scientists in R
