Exercise 7: Neighborhood Matrices and Autocorrelation
Introduction to Geospatial Techniques for Social Scientists in R
Author
Stefan Jünger, Anne Stroppe & Dennis Abel
Thus far, we have only used Queen neighborhood matrices with our data. Let’s use this exercise to try out different variations. First of all, run the code below to compile the data that were also used in the lecture.
Reading layer `Stimmbezirk' from data source
`C:\Users\abelds\Desktop\gesis-workshop-geospatial-techniques-R-2026\data\Stimmbezirk.shp'
using driver `ESRI Shapefile'
Simple feature collection with 543 features and 14 fields
Geometry type: MULTIPOLYGON
Dimension: XY
Bounding box: xmin: 343914.7 ymin: 5632759 xmax: 370674.3 ymax: 5661475
Projected CRS: ETRS89 / UTM zone 32N
Exercises
🏋 Exercise 1
As in the lecture, create a neighborhood (weight) matrix, but this time, do it for Queen and Rook neighborhoods. Also, apply a row normalization.
💡 Tip
You could either use the spdep package with its function spdep::poly2nb() or the more modern approach of the sfdep package using the function sfdep::st_contiguity(). In both cases, you must set the option queen = FALSE for Rook neighborhoods.
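As a minimal sketch of the spdep route, the following builds both neighborhood types on a hypothetical 5 x 5 grid of unit squares. The object `grid` is a stand-in invented for illustration, not the workshop data; with the real data, `election_results` would take its place. Row normalization is assumed to be requested via `style = "W"` in `spdep::nb2listw()`.

```r
library(sf)
library(spdep)

# Hypothetical toy data: a 5 x 5 grid of unit squares standing in for the districts
grid <- sf::st_sf(
  id = 1:25,
  geometry = sf::st_make_grid(
    sf::st_as_sfc(sf::st_bbox(c(xmin = 0, ymin = 0, xmax = 5, ymax = 5))),
    n = c(5, 5)
  )
)

# Queen: polygons sharing an edge or a single corner point are neighbors
queen_nb <- spdep::poly2nb(grid, queen = TRUE)
# Rook: only polygons sharing an edge are neighbors
rook_nb <- spdep::poly2nb(grid, queen = FALSE)

# style = "W" row-normalizes: the weights of each unit sum to 1
queen_W <- spdep::nb2listw(queen_nb, style = "W")
rook_W <- spdep::nb2listw(rook_nb, style = "W")

# Number of neighbors per cell under each definition
spdep::card(queen_nb)
spdep::card(rook_nb)
```

In this toy grid, the central cell has eight Queen neighbors but only four Rook neighbors, because Rook contiguity ignores cells that touch only at a corner.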
🏋 Exercise 2
We have not used them yet, but you can also create distance-based weight matrices. Use the package of your choice again and create weights for a distance band between 0 and 5000 meters. Again, apply row normalization.
You must also convert the polygon data to point coordinates for this exercise. We suggest using the centroids for this task.
Use a map to check that the conversion was successful.
💡 Tip
If you use spdep, use the function spdep::dnearneigh(); if you use sfdep, use the function sfdep::st_dist_band().
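A sketch of the distance-based variant with spdep, again on a hypothetical 5 x 5 grid of unit squares rather than the workshop data. The object names and the band of 0 to 1.5 grid units are assumptions chosen for the toy example; because the real data are projected in ETRS89 / UTM zone 32N, whose units are meters, the analogous call there would use `d2 = 5000`.

```r
library(sf)
library(spdep)

# Hypothetical toy data: a 5 x 5 grid of unit squares standing in for the districts
grid <- sf::st_sf(
  id = 1:25,
  geometry = sf::st_make_grid(
    sf::st_as_sfc(sf::st_bbox(c(xmin = 0, ymin = 0, xmax = 5, ymax = 5))),
    n = c(5, 5)
  )
)

# Convert polygons to points via their centroids
grid_centroids <- sf::st_centroid(sf::st_geometry(grid))
coords <- sf::st_coordinates(grid_centroids)

# All centroids whose distance lies between d1 and d2 count as neighbors
distance_nb <- spdep::dnearneigh(coords, d1 = 0, d2 = 1.5)

# Row-normalize the weights, as in the contiguity case
distance_W <- spdep::nb2listw(distance_nb, style = "W")

# Quick visual check: centroids drawn on top of the polygons
plot(sf::st_geometry(grid))
plot(grid_centroids, add = TRUE, pch = 19)
```

If some units have no neighbor within the band, `spdep::nb2listw()` stops with an error unless `zero.policy = TRUE` is set, so it is worth inspecting `spdep::card(distance_nb)` first.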
🏋 Exercise 3
Now, let’s see how these different spatial weights perform in an analysis. Calculate Moran’s I and Geary’s C for each one of the weights and report their results for the variable spd_share.
💡 Tip
It is essential to know which path you have taken before (spdep or sfdep), as it determines how you solve this exercise.
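For the spdep path, a minimal sketch could look as follows. It uses a hypothetical 5 x 5 grid with an invented variable `value` that increases smoothly across the grid; with the workshop data, `spd_share` and the weights from the previous exercises would be used instead. For the sfdep path, the analogous functions should be `sfdep::global_moran_test()` and `sfdep::global_c_test()`, which take the neighbor list and the weights as separate arguments.

```r
library(sf)
library(spdep)

# Hypothetical toy data: a 5 x 5 grid of unit squares standing in for the districts
grid <- sf::st_sf(
  id = 1:25,
  geometry = sf::st_make_grid(
    sf::st_as_sfc(sf::st_bbox(c(xmin = 0, ymin = 0, xmax = 5, ymax = 5))),
    n = c(5, 5)
  )
)

# Invented variable with a smooth spatial trend (sum of the centroid coordinates)
coords <- sf::st_coordinates(sf::st_centroid(sf::st_geometry(grid)))
grid$value <- coords[, "X"] + coords[, "Y"]

# Row-normalized Queen weights
queen_W <- spdep::nb2listw(spdep::poly2nb(grid, queen = TRUE), style = "W")

# Global tests of spatial autocorrelation
moran_queen <- spdep::moran.test(grid$value, listw = queen_W)
geary_queen <- spdep::geary.test(grid$value, listw = queen_W)

# Statistic, expectation, and variance of each test
moran_queen$estimate
geary_queen$estimate
```

Under the null hypothesis of no spatial autocorrelation, the expectation of Moran's I is -1 / (n - 1) and that of Geary's C is 1. Positive autocorrelation therefore shows up as a Moran's I above its expectation and a Geary's C below 1.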
🏋 Exercise 4
To wrap up, create two Moran scatterplots to visualize the relationship between the actual values and their spatially lagged values. Base one on the contiguity-based Queen or Rook weights matrix and the other on the distance-based weights matrix. Compare the results.
Moran I test under randomisation
data: election_results$spd_share
weights: queen_W
Moran I statistic standard deviate = 20.869, p-value < 2.2e-16
alternative hypothesis: greater
sample estimates:
Moran I statistic Expectation Variance
0.5404799889 -0.0018450185 0.0006753055
Moran I test under randomisation
data: election_results$spd_share
weights: rook_W
Moran I statistic standard deviate = 20.332, p-value < 2.2e-16
alternative hypothesis: greater
sample estimates:
Moran I statistic Expectation Variance
0.5444447649 -0.0018450185 0.0007219029
Moran I test under randomisation
data: election_results_centroids$spd_share
weights: distance_neighborhood_5000_W
Moran I statistic standard deviate = 45.601, p-value < 2.2e-16
alternative hypothesis: greater
sample estimates:
Moran I statistic Expectation Variance
2.494222e-01 -1.845018e-03 3.036107e-05
Geary C test under randomisation
data: election_results$spd_share
weights: queen_W
Geary C statistic standard deviate = 19.232, p-value < 2.2e-16
alternative hypothesis: Expectation greater than statistic
sample estimates:
Geary C statistic Expectation Variance
0.4556471765 1.0000000000 0.0008011317
Geary C test under randomisation
data: election_results$spd_share
weights: rook_W
Geary C statistic standard deviate = 18.947, p-value < 2.2e-16
alternative hypothesis: Expectation greater than statistic
sample estimates:
Geary C statistic Expectation Variance
0.4477206733 1.0000000000 0.0008496821
Geary C test under randomisation
data: election_results_centroids$spd_share
weights: distance_neighborhood_5000_W
Geary C statistic standard deviate = 31.847, p-value < 2.2e-16
alternative hypothesis: Expectation greater than statistic
sample estimates:
Geary C statistic Expectation Variance
7.535200e-01 1.000000e+00 5.990103e-05
# sfdep
library(magrittr)
Attaching package: 'magrittr'
The following objects are masked from 'package:terra':
extract, inset
Moran I test under randomisation
data: x
weights: listw
Moran I statistic standard deviate = 20.869, p-value < 2.2e-16
alternative hypothesis: greater
sample estimates:
Moran I statistic Expectation Variance
0.5404799889 -0.0018450185 0.0006753055
Moran I test under randomisation
data: x
weights: listw
Moran I statistic standard deviate = 20.332, p-value < 2.2e-16
alternative hypothesis: greater
sample estimates:
Moran I statistic Expectation Variance
0.5444447649 -0.0018450185 0.0007219029
Moran I test under randomisation
data: x
weights: listw
Moran I statistic standard deviate = 45.601, p-value < 2.2e-16
alternative hypothesis: greater
sample estimates:
Moran I statistic Expectation Variance
2.494222e-01 -1.845018e-03 3.036107e-05
Geary C test under randomisation
data: x
weights: listw
Geary C statistic standard deviate = 19.232, p-value < 2.2e-16
alternative hypothesis: Expectation greater than statistic
sample estimates:
Geary C statistic Expectation Variance
0.4556471765 1.0000000000 0.0008011317
Geary C test under randomisation
data: x
weights: listw
Geary C statistic standard deviate = 18.947, p-value < 2.2e-16
alternative hypothesis: Expectation greater than statistic
sample estimates:
Geary C statistic Expectation Variance
0.4477206733 1.0000000000 0.0008496821
Geary C test under randomisation
data: x
weights: listw
Geary C statistic standard deviate = 31.847, p-value < 2.2e-16
alternative hypothesis: Expectation greater than statistic
sample estimates:
Geary C statistic Expectation Variance
7.535200e-01 1.000000e+00 5.990103e-05
✅ Solution 4
# Moran scatterplot based on the row-normalized Queen weights
spdep::moran.plot(
  x = election_results$spd_share,
  listw = queen_W,
  labels = election_results$district_id,
  xlab = "SPD share (Queen W)",
  ylab = "Spatial lag of SPD share (Queen W)"
)
# Moran scatterplot based on the row-normalized 5 km distance weights
spdep::moran.plot(
  x = election_results_centroids$spd_share,
  listw = distance_neighborhood_5000_W,
  labels = election_results_centroids$district_id,
  xlab = "SPD share (5km)",
  ylab = "Spatial lag of SPD share (5km)"
)
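When comparing the two plots, it may help to recall what they display: the y-axis is the spatial lag, i.e., the weighted average of the neighboring values, and with row-normalized weights (and no units without neighbors) the slope of the fitted line equals Moran's I. The sketch below illustrates this on a hypothetical 5 x 5 grid with an invented variable `value`; none of these objects are part of the workshop data.

```r
library(sf)
library(spdep)

# Hypothetical toy data: a 5 x 5 grid of unit squares standing in for the districts
grid <- sf::st_sf(
  id = 1:25,
  geometry = sf::st_make_grid(
    sf::st_as_sfc(sf::st_bbox(c(xmin = 0, ymin = 0, xmax = 5, ymax = 5))),
    n = c(5, 5)
  )
)

# Invented variable with a smooth spatial trend (sum of the centroid coordinates)
coords <- sf::st_coordinates(sf::st_centroid(sf::st_geometry(grid)))
grid$value <- coords[, "X"] + coords[, "Y"]

# Row-normalized Queen weights
queen_W <- spdep::nb2listw(spdep::poly2nb(grid, queen = TRUE), style = "W")

# Spatial lag: for each cell, the weighted mean of its neighbors' values
value_lag <- spdep::lag.listw(queen_W, grid$value)

# Slope of the regression of the lag on the original values
slope <- unname(coef(lm(value_lag ~ grid$value))[2])

# Moran's I from the global test
moran_i <- unname(
  spdep::moran.test(grid$value, listw = queen_W)$estimate["Moran I statistic"]
)

# The two quantities should agree up to numerical precision
all.equal(slope, moran_i)
```

Read this way, the flatter line in the 5 km plot corresponds directly to the smaller Moran's I reported for the distance-based weights.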