Exercise 6: Using SoRa

Introduction to Geospatial Techniques for Social Scientists in R

Author

Stefan Jünger, Anne-Kathrin Stroppe, Dennis Abel

In this exercise, we will work with the SoRa API to search for data and link them to input datasets. The exercise is deliberately open about which datasets you use and how you link them. We recommend having a closer look at the cheat sheet in the slides folder.

Exercises

🏋Exercise 1

Go ahead and register an API key for the SoRa Geolinking API at https://sora.gesis.org/public/sora-user-mod/users/request-api-key

Make sure to save this token somewhere safe.

🏋Exercise 2

Have a look at the IOER Monitor at https://monitor.ioer.de/ and search for some attributes you might find interesting.

There’s a small British flag in the upper right corner to set the language to English.

Have you found something nice? Now let’s switch to the SoRa interfaces.

🏋Exercise 3

Check whether the attribute you find interesting can also be found in the SoRa datapicker.

You can either use the online datapicker https://sora.gesis.org/public/datapicker/ or navigate the datapicker in R. You may want to have a look at the corresponding help file ?sora::sora_datapicker or the cheat sheet.
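Once the datapicker table is available in R, it can be narrowed down with ordinary data frame tools. The following is a minimal sketch, assuming the table has the `dataset_id` and `title` columns listed in the printed tibble. `find_datasets()` is a hypothetical helper and not part of sora, the first toy row reuses the dataset ID and description that appear in the request messages of the solutions, and the second toy row is a made-up placeholder.

```r
# Hypothetical helper (not part of sora): keep rows whose title matches a pattern.
find_datasets <- function(picker, pattern) {
  hits <- grepl(pattern, picker$title, ignore.case = TRUE)
  picker[hits, c("dataset_id", "title"), drop = FALSE]
}

# Toy stand-in for the datapicker table, so the sketch runs without an API call.
# Row 1 reuses an ID and description from this exercise; row 2 is a placeholder.
toy_picker <- data.frame(
  dataset_id = c("ioer-monitor-r01rg-2021-1000m", "placeholder-dataset-id"),
  title = c("Percentage of flood zones to reference area", "Placeholder title"),
  stringsAsFactors = FALSE
)

flood_hits <- find_datasets(toy_picker, "flood")
print(flood_hits)

# With the package installed and a connection to SoRa, the same helper could be
# applied to the real table (assumption: the title column holds such descriptions):
# find_datasets(sora::sora_datapicker("spatial"), "flood")
```

Searching by title is only one option; the tibble also lists `keywords`, `description`, and `spatial_resolution` columns that could be filtered in the same way.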

🏋Exercise 4

We are getting closer. It is time to test the connection to the API.

  1. Register your API key in R
  2. Test the connection

There are several ways to register your API key. To do it just for this session, you can store the key in a text file and set the environment variable in R with Sys.setenv(SORA_API_KEY = readLines(<sora_key>)), where <sora_key> stands for the path to that file.
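The session-only approach can be sketched in a few lines of base R. This is a minimal sketch under stated assumptions: the file location and the placeholder key are invented so that the example runs on its own, and in practice you would point `key_file` to the file that holds your real key. Note that running it overwrites any `SORA_API_KEY` already set in the session.

```r
# Hypothetical file location; replace with the path to your own key file.
key_file <- file.path(tempdir(), "sora_key")

# For illustration only: write a placeholder key so the sketch runs end to end.
writeLines("my-placeholder-key", key_file)

# Read the first line and strip accidental whitespace around the key.
api_key <- trimws(readLines(key_file, n = 1, warn = FALSE))

# Register the key for the current R session only.
Sys.setenv(SORA_API_KEY = api_key)

# Confirm that the variable is set without printing the key itself.
key_is_set <- nzchar(Sys.getenv("SORA_API_KEY"))
print(key_is_set)
```

Keeping the key in a separate file rather than in the script makes it less likely that the key ends up in version control or shared code.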

🏋Exercise 5

Load the synthetic geocoordinates in the data folder and define them as a custom dataset for sora. If you have your own coordinates, you can also use them.

Custom datasets in sora can be defined using sora::sora_custom().
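Before handing your own coordinates to sora, it can help to check them locally. The sketch below assumes the layout shown in the printed custom dataset of the solutions (an `id` column plus `x` and `y` in EPSG:3035) and reuses its first three rows. `check_coords()` is a hypothetical helper and not part of sora; whether a plain data frame like this one is accepted as-is by `sora::sora_custom()` is an assumption to verify in `?sora::sora_custom`.

```r
# First three rows of the synthetic coordinates as printed in the solutions
# (id, x, y; coordinates in EPSG:3035).
coords <- data.frame(
  id = c(142, 263, 839),
  x  = c(4294500, 4119500, 4132500),
  y  = c(3306500, 3159500, 3169500)
)

# Hypothetical pre-flight check (not part of sora): unique IDs, no missing values.
check_coords <- function(df) {
  stopifnot(all(c("id", "x", "y") %in% names(df)))
  c(
    unique_ids = !anyDuplicated(df$id),
    no_missing = !anyNA(df$x) && !anyNA(df$y)
  )
}

coord_checks <- check_coords(coords)
print(coord_checks)
```

These two checks mirror what the API reports back during a request: unique identifiers and the number of coordinates with null values.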

🏋Exercise 6

From now on, experiment freely with different linking endeavors. You can request data with a selection area comprising circles, squares or whatever you want.

Solutions

Well, I just clicked on the link of the website.

Well, I just clicked on the link of the website again.

Well, I just clicked on the link of the website again. Just kidding. If you want to see the geospatial datasets in the SoRa datapicker in R, use this code:

sora::sora_datapicker("spatial")
# A tibble: 9,443 × 14
   data_provider dataset_id    datatype description geometry_type keywords label
   <chr>         <chr>         <chr>    <chr>       <chr>         <chr>    <chr>
 1 IOER          ioer-monitor… numeric  Number of … Raster        ""       1000…
 2 IOER          ioer-monitor… numeric  Number of … Raster        ""       1000…
 3 IOER          ioer-monitor… numeric  Number of … Raster        ""       5000…
 4 IOER          ioer-monitor… numeric  Number of … Vector: Poly… ""       Citi…
 5 IOER          ioer-monitor… numeric  Number of … Vector: Poly… ""       Dist…
 6 IOER          ioer-monitor… numeric  Number of … Vector: Poly… ""       Muni…
 7 IOER          ioer-monitor… numeric  Number of … Vector: Poly… ""       Muni…
 8 IOER          ioer-monitor… numeric  Number of … Vector: Poly… ""       Spat…
 9 IOER          ioer-monitor… numeric  Number of … Vector: Poly… ""       Stat…
10 IOER          ioer-monitor… numeric  Number of … Raster        ""       1000…
# ℹ 9,433 more rows
# ℹ 7 more variables: required_permissions <chr>, service_provider <chr>,
#   spatial_resolution <chr>, time_frame <int>, title <chr>, unit <chr>,
#   url <chr>

# register the API key
Sys.setenv(SORA_API_KEY = readLines("sora_key"))

# test the connection
sora::sora_available()
[1] TRUE

synthetic_survey_geocoordinates <-
  readRDS("./data/synthetic_survey_geocoordinates.rds") |> 
  sora::sora_custom()

synthetic_survey_geocoordinates
<sora_custom>
Using custom data with EPSG code 3035. 

     id       x       y
1   142 4294500 3306500
2   263 4119500 3159500
3   839 4132500 3169500
4  1400 4319500 3397500
5  1783 4582500 3264500
6   705 4225500 3231500
7   392 4127500 3052500
8   120 4204500 3272500
9  1612 4619500 3140500
10 1533 4302500 2840500

# circles
sora_circles <-
  sora::sora_request(
    dataset = synthetic_survey_geocoordinates,
    link_to = "ioer-monitor-r01rg-2021-1000m",
    method = "aggregate_attribute",
    selection_area = "circle",
    radius = 10000,
    output = "mean",
    wait = TRUE
  )
→ The provided coordinates have unique identifiers and are in a valid format.
→ The requested sora-provided geospatial dataset exists.
→ Chosen geospatial dataset: Percentage of flood zones to reference area (2021,
  1000m Raster) from IOER-Monitor (IOER)
→ Chosen linkage: Aggregate attribute within circle on raster with numeric
  field - Geocoded Dataset: Vector: Point | Geospatial Dataset: Raster -
  numeric
→ The requested linkage is plausible and fits the chosen geospatial dataset.
  All provided parameters are valid, including their values.
→ All required permissions for accessing the data are available.
→ Total number of provided coordinates: 500 (valid: 500, having null values:
  0).
→ Number of coordinates located within the bounding box of the geospatial
  dataset: 500 (outside: 0).
→ All required external services are available.
Information from SoRa: 
• Hello from the SoRa-X-API. Everything is in good order. Have a nice day!
Waiting for results...
→ The linking job was created at 2026-04-22 15:31:48
→ The processing of the linking jobs has started at 2026-04-22 15:31:54
→ The processing of the linking job has finished at 2026-04-22 15:32:31
→ The total count of input items is 500 (successfully linked: 500)
→ Linking method: aggregate_attribute
→ Chosen linking parameter: output = ['mean']; radius = 10000; selection_area =
  circle
→ Chosen geospatial dataset: ioer-monitor-r01rg-2021-1000m - Percentage of
  flood zones to reference area (2021, 1000m Raster) from IOER-Monitor (IOER)
Information from SoRa: 
• Hello from the SoRa-X-API. Everything is in good order. Have a nice day!
sora_circles
# A tibble: 500 × 4
   id          area   mean count
   <chr>      <dbl>  <dbl> <int>
 1 142   312144515.  1.96    309
 2 263   312144515.  1.78    309
 3 839   312144515.  0.629   309
 4 1400  312144515.  0.779   309
 5 1783  312144515.  5.52    309
 6 705   312144515.  5.53    309
 7 392   312144515.  6.69    309
 8 120   312144515. 15.0     309
 9 1612  312144515.  4.51    309
10 1533  312144515.  2.02    309
# ℹ 490 more rows
# squares
sora_squares <-
  sora::sora_request(
    dataset = synthetic_survey_geocoordinates,
    link_to = "ioer-monitor-r01rg-2021-1000m",
    method = "aggregate_attribute",
    selection_area = "square",
    length = 10000,
    output = "mean",
    wait = TRUE
  )
→ The provided coordinates have unique identifiers and are in a valid format.
→ The requested sora-provided geospatial dataset exists.
→ Chosen geospatial dataset: Percentage of flood zones to reference area (2021,
  1000m Raster) from IOER-Monitor (IOER)
→ Chosen linkage: Aggregate attribute within square on raster with numeric
  field - Geocoded Dataset: Vector: Point | Geospatial Dataset: Raster -
  numeric
→ The requested linkage is plausible and fits the chosen geospatial dataset.
  All provided parameters are valid, including their values.
→ All required permissions for accessing the data are available.
→ Total number of provided coordinates: 500 (valid: 500, having null values:
  0).
→ Number of coordinates located within the bounding box of the geospatial
  dataset: 500 (outside: 0).
→ All required external services are available.
Information from SoRa: 
• Hello from the SoRa-X-API. Everything is in good order. Have a nice day!
Waiting for results...
→ The linking job was created at 2026-04-22 15:32:35
→ The processing of the linking jobs has started at 2026-04-22 15:32:44
→ The processing of the linking job has finished at 2026-04-22 15:33:20
→ The total count of input items is 500 (successfully linked: 500)
→ Linking method: aggregate_attribute
→ Chosen linking parameter: length = 10000; output = ['mean']; selection_area =
  square
→ Chosen geospatial dataset: ioer-monitor-r01rg-2021-1000m - Percentage of
  flood zones to reference area (2021, 1000m Raster) from IOER-Monitor (IOER)
Information from SoRa: 
• Hello from the SoRa-X-API. Everything is in good order. Have a nice day!
sora_squares
# A tibble: 500 × 4
   id         area   mean count
   <chr>     <dbl>  <dbl> <int>
 1 142   100000000  2.68    121
 2 263   100000000  0.655   121
 3 839   100000000  0.715   121
 4 1400  100000000  0.201   121
 5 1783  100000000  2.95    121
 6 705   100000000  7.97    121
 7 392   100000000 11.4     121
 8 120   100000000 16.4     121
 9 1612  100000000  5.30    121
10 1533  100000000  2.03    121
# ℹ 490 more rows
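The `area` and `count` columns of the circle and square results can be cross-checked with a little arithmetic. The sketch below is an interpretation of the printed numbers, not documented SoRa behavior: the reported circle area (312144515) is slightly smaller than the exact area of a circle with a 10000 m radius and coincides with the area of a regular 32-sided polygon inscribed in that circle, which suggests the circle is approximated by a polygon. The 121 cells in the square are consistent with 11 x 11 cells of the 1000 m raster being touched when a point sits on a cell centre.

```r
radius <- 10000  # metres, as in the circle request
side   <- 10000  # metres, as in the square request

# Exact areas of the two selection shapes.
circle_area <- pi * radius^2
square_area <- side^2

# Area of a regular polygon with n vertices on the circle.
# n = 32 is an assumption inferred from the reported area.
n <- 32
polygon_area <- 0.5 * n * radius^2 * sin(2 * pi / n)

# Cells per side touched by the square if the point lies on a cell centre
# of the 1000 m raster: 10 full cells plus a half cell on each edge.
cells_per_side <- side / 1000 + 1

areas <- round(c(circle = circle_area, polygon_32 = polygon_area, square = square_area))
print(areas)
print(cells_per_side^2)
```

The practical point is that the circle and the square cover different areas (roughly 312 versus 100 square kilometres), so their means are not directly comparable even though both use a value of 10000.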
# isochrones
sora_isochrones <-
  sora::sora_request(
    dataset = synthetic_survey_geocoordinates,
    link_to = "ioer-monitor-r01rg-2021-1000m",
    method = "aggregate_attribute",
    selection_area = "isochrone",
    transport_mode = "foot-walking",
    routing_type = "time",
    interval = 10,
    output = "mean",
    wait = TRUE
  )
→ The provided coordinates have unique identifiers and are in a valid format.
→ The requested sora-provided geospatial dataset exists.
→ Chosen geospatial dataset: Percentage of flood zones to reference area (2021,
  1000m Raster) from IOER-Monitor (IOER)
→ Chosen linkage: Aggregate attribute within isochrone on raster with numeric
  field - Geocoded Dataset: Vector: Point | Geospatial Dataset: Raster -
  numeric
→ The requested linkage is plausible and fits the chosen geospatial dataset.
  All provided parameters are valid, including their values.
→ All required permissions for accessing the data are available.
→ Total number of provided coordinates: 500 (valid: 500, having null values:
  0).
→ Number of coordinates located within the bounding box of the geospatial
  dataset: 500 (outside: 0).
→ All required external services are available.
Information from SoRa: 
• Hello from the SoRa-X-API. Everything is in good order. Have a nice day!
Waiting for results...
→ The linking job was created at 2026-04-22 15:33:24
→ The processing of the linking jobs has started at 2026-04-22 15:33:34
→ The processing of the linking job has finished at 2026-04-22 15:34:21
→ The total count of input items is 500 (successfully linked: 498)
→ Linking method: aggregate_attribute
→ Chosen linking parameter: output = ['mean']; interval = 10; routing_type =
  time; selection_area = isochrone; transport_mode = foot-walking
→ Chosen geospatial dataset: ioer-monitor-r01rg-2021-1000m - Percentage of
  flood zones to reference area (2021, 1000m Raster) from IOER-Monitor (IOER)
Warning in "function `sora_results()`": ! There are problems with rows 295 and 404.
ℹ To inspect the problems, call `sora_problems(...)` on your output.
Information from SoRa: 
• Hello from the SoRa-X-API. Everything is in good order. Have a nice day!
sora_isochrones
# A tibble: 500 × 4
   id        area  mean count
   <chr>    <dbl> <dbl> <int>
 1 142     68962.  0        1
 2 263   1476842   0        1
 3 839   1020117.  2.53     1
 4 1400  1433308.  0        1
 5 1783  1036146.  0        1
 6 705   1485221.  0        1
 7 392    724347. 91.8      1
 8 120    632333.  0        1
 9 1612   309423. NA        0
10 1533  1472702.  7.22     1
# ℹ 490 more rows
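The isochrone areas are much smaller than the circles and squares, so most points draw on a single raster cell (`count` of 1), and a point without any matching cell gets a missing mean. The following minimal sketch rebuilds the ten printed rows by hand so that it runs without an API call and picks out the rows without a usable value. It is separate from the rows flagged in the warning (295 and 404), which the warning suggests inspecting with `sora_problems()`.

```r
# The ten rows printed above, rebuilt by hand (area column omitted).
iso_head <- data.frame(
  id    = c("142", "263", "839", "1400", "1783", "705", "392", "120", "1612", "1533"),
  mean  = c(0, 0, 2.53, 0, 0, 0, 91.8, 0, NA, 7.22),
  count = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 0L, 1L),
  stringsAsFactors = FALSE
)

# Rows where no raster cell contributed to the mean.
no_value <- iso_head[is.na(iso_head$mean) | iso_head$count == 0, ]
print(no_value)

# Share of rows with a usable mean in this excerpt.
share_usable <- mean(!is.na(iso_head$mean))
print(share_usable)
```

With the full result, the same filter could be applied to `sora_isochrones` directly to decide how to treat missing values before any analysis.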