As in the presentation, we will use data from the German General Social Survey (ALLBUS) 2021 for this exercise. You should have downloaded the dataset in .sav format and saved it in a folder called data within the folder containing the materials for this workshop. Also remember that it is helpful to consult the codebook for the data set. That being said, let’s get wrangling…
…but before we can do that, we need to load the tidyverse and haven packages (along with sjlabelled, which provides remove_all_labels()) and import the data.
library(tidyverse)
library(haven)
## Warning: package 'haven' was built under R version 4.1.3
library(sjlabelled)
## Warning: package 'sjlabelled' was built under R version 4.1.3
allbus_2021 <- read_sav("./data/allbus_2021ZA5280_v1-0-0.sav") %>%
  remove_all_labels() %>%
  as_tibble()
Using base R, create a new object called allbus_institut_trust that contains all variables that assess how much people trust institutions (e.g., European Commission, Bundestag, police). To find the required variable names, you can check the codebook (search for “trust”) or have a look at the clue for this task.
The first variable is pt01, and the last one is pt20. They appear consecutively in the data set. Remember that there are two options for selecting columns in base R: one is subsetting using [ ], the other is the subset() function.
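The two base R options named in the clue can be sketched on a small toy data frame. The object toy and its columns (id, v01 to v03, age) are made up for illustration and are not ALLBUS variables, so this shows the technique rather than the solution.

```r
# Toy data frame; all object and column names are hypothetical
toy <- data.frame(
  id  = 1:3,
  v01 = c(1, 2, 3),
  v02 = c(4, 5, 6),
  v03 = c(7, 8, 9),
  age = c(25, 40, 61)
)

# Option 1: subsetting with [ ], using the positions of the
# first and the last column of the consecutive block
first <- which(names(toy) == "v01")
last  <- which(names(toy) == "v03")
toy_sub1 <- toy[, first:last]

# Option 2: subset() accepts a range of column names in `select`
toy_sub2 <- subset(toy, select = v01:v03)

# Both approaches keep the same three columns
names(toy_sub1)
names(toy_sub2)
```

Looking up the positions by name avoids hard-coding column indices, which can shift when a data set is updated.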
Use the dplyr package to create a new object named allbus_2021_info that only contains the (binary) variables that asked about the use of different devices for individual Internet use. Again, you can consult the codebook to find the right variable names (search for “Internet”) or have a look at the clue for this task instead.
The first variable is lm27, and the last one is lm34. They appear consecutively in the data set.
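Selecting a block of consecutive columns with dplyr might look like the following sketch. The tibble toy and its columns (dev_a to dev_c and the others) are invented for this example and do not come from the ALLBUS data.

```r
library(dplyr)

# Toy tibble; all object and column names are hypothetical
toy <- tibble(
  id    = 1:3,
  dev_a = c(0, 1, 1),
  dev_b = c(1, 0, 1),
  dev_c = c(0, 0, 1),
  age   = c(25, 40, 61)
)

# first_column:last_column selects every column in between as well
toy_devices <- toy %>%
  select(dev_a:dev_c)

names(toy_devices)
```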
Using the tidyverse package dplyr, select only the character variables from the allbus_2021 data set and assign them to an object named allbus_char.
You can use where() for this task.
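As a minimal sketch of how where() works inside select(), assuming dplyr 1.0.0 or later: it takes a function that returns TRUE or FALSE for each column and keeps the columns for which it returns TRUE. The toy tibble below is made up for illustration.

```r
library(dplyr)

# Toy tibble with mixed column types; all names are hypothetical
toy <- tibble(
  id    = 1:3,
  name  = c("Ada", "Ben", "Cem"),
  city  = c("Kiel", "Bonn", "Ulm"),
  score = c(2.5, 3.0, 1.5)
)

# Keep only the columns for which is.character() returns TRUE
toy_char <- toy %>%
  select(where(is.character))

names(toy_char)
```

Other predicate functions such as is.numeric can be passed to where() in the same way.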
After creating subsets of variables, let’s now rename those variables using dplyr functions again for the allbus_2021_info object in one step. First, rename the variables lm27 to internet_use_pc, lm28 to internet_use_laptop, and lm29 to internet_use_tablet. Then rename the variables lm30, lm31, lm32, lm33, and lm34 to internet_use_smartphone, internet_use_TV, internet_use_playstation, internet_use_ebook, and internet_use_other, respectively, using a function from dplyr.
You can rename variables within the select() command.
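Renaming while selecting could be sketched as follows. select() accepts new_name = old_name pairs, so the selection and the renaming happen in a single call. The toy tibble and all names in it are hypothetical.

```r
library(dplyr)

# Toy tibble; all object and column names are hypothetical
toy <- tibble(
  id = 1:3,
  q1 = c(1, 0, 1),
  q2 = c(0, 0, 1),
  q3 = c(1, 1, 0)
)

# Select q1 and q2 and rename them in the same step;
# columns that are not listed (id, q3) are dropped
toy_renamed <- toy %>%
  select(first_item = q1, second_item = q2)

names(toy_renamed)
```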
As the final task in this set of exercises, do the previous selection and renaming procedure in base R. That is, first rename the variables lm27 to internet_use_pc, lm28 to internet_use_laptop, and lm29 to internet_use_tablet with base R.
Then, rename the variables lm30, lm31, lm32, lm33, and lm34 to internet_use_smartphone, internet_use_TV, internet_use_playstation, internet_use_ebook, and internet_use_other, respectively, using a function from dplyr.
When you use the dplyr function for renaming the variables, assign the result to the same object name as before (i.e., overwrite the allbus_2021_info object).
The base R function we need here is colnames(), and the dplyr function is rename(). Remember that the correct syntax for the rename() function is new_name = old_name.
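The two renaming functions from the clue can be compared on toy data. This is a sketch under the assumption of a made-up data frame toy with columns q1 to q3; none of these names appear in the ALLBUS data.

```r
library(dplyr)

# Toy data frame; all object and column names are hypothetical
toy <- data.frame(q1 = 1:3, q2 = 4:6, q3 = 7:9)

# base R: find the position of the old name and assign the new one
colnames(toy)[colnames(toy) == "q1"] <- "first_item"

# dplyr: rename() uses new_name = old_name; assigning back to `toy`
# overwrites the object so that the new names are kept
toy <- toy %>%
  rename(second_item = q2, third_item = q3)

colnames(toy)
```

Unlike select(), rename() keeps all columns and only changes the names that are listed.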