In this first short set of exercises for the second data wrangling
session, we will focus on creating and transforming variables using
functions from the dplyr package.
NB: For the following exercises, it will be helpful to look at the bespoke (mini-)codebook we have created for this workshop.
Before we can start, we need to load the packages we have seen on the slides for this session so far.
library(sjlabelled)
library(tidyverse)
library(haven)
To avoid potential issues due to different data set versions produced
by coding along during the session, we will start with a blank slate.
Hence, we will import the ALLBUS 2021 .sav file (again) and assign it to
an object named allbus_2021. We want the user-defined missing values from
the SPSS file to be converted to NA during the import, and we also want to
remove all labels.
To import a .sav file, we can use a function from the haven package.
Converting SPSS user-defined missings to NA is the default in this
function (meaning we do not need to specify that optional argument). To
get rid of all labels, we can use a function from the sjlabelled package.
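As a minimal sketch of this import step: the location of the ALLBUS file is not given here, so the example below first writes a small toy data set (hypothetical values, not real ALLBUS data) to a temporary .sav file and then reads it back. It assumes that the functions meant above are read_sav() from haven and remove_all_labels() from sjlabelled. The object name allbus_toy is made up for illustration.

```r
library(haven)
library(sjlabelled)

# Toy stand-in for the ALLBUS file (hypothetical values, not real data)
toy <- tibble::tibble(
  pa01 = haven::labelled(c(1, 5, 10), labels = c(left = 1, right = 10)),
  pt12 = c(1, 4, 7)
)

# Write the toy data to a temporary .sav file so the example is self-contained
toy_path <- tempfile(fileext = ".sav")
haven::write_sav(toy, toy_path)

# user_na = FALSE is the default in read_sav(), so SPSS user-defined
# missing values are converted to NA without any extra argument
allbus_toy <- haven::read_sav(toy_path)

# Drop all variable and value labels
allbus_toy <- sjlabelled::remove_all_labels(allbus_toy)
```

In the workshop itself, we would pass the path of the actual ALLBUS 2021 file to read_sav() instead of toy_path and assign the result to allbus_2021.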
Use the dplyr function for creating and transforming variables to create
a new variable representing political orientation (on the left-right
spectrum) named pol_view_new that ranges from 0 to 9 instead of from 1 to
10, as is the case for the original variable.
The name of the original variable is pa01. For this transformation, we
simply need to subtract 1 from the existing variable.
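The shift by one can be sketched with mutate() on a small toy tibble. The values below are hypothetical and only mimic the 1 to 10 coding of pa01. With the real data, the same mutate() call would be applied to allbus_2021.

```r
library(dplyr)

# Toy data with the same 1-10 coding as pa01 (hypothetical values)
toy <- tibble(pa01 = c(1, 4, 10, NA))

# Subtract 1 so that the new variable runs from 0 to 9;
# missing values stay missing
toy <- toy %>%
  mutate(pol_view_new = pa01 - 1)
```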
Using a function from the dplyr package, recode the values of the
variable measuring trust in the federal government into a new variable
named distrust_gov that captures distrust instead of trust.
The name of the original variable is pt12, and it ranges from 1 to 7.
Remember that the correct syntax for recoding values with the
corresponding dplyr function is old value (enclosed in backticks) = new
value, for example `1` = 7.
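A minimal sketch of this reversal, assuming the function meant here is recode() from dplyr (it still works, but is marked as superseded in recent dplyr versions). The toy values are hypothetical and only mimic the 1 to 7 coding of pt12.

```r
library(dplyr)

# Toy data with the same 1-7 coding as pt12 (hypothetical values)
toy <- tibble(pt12 = c(1, 4, 7, 2))

# Reverse the scale: numeric old values go in backticks on the left,
# new values on the right
toy <- toy %>%
  mutate(
    distrust_gov = recode(
      pt12,
      `1` = 7, `2` = 6, `3` = 5, `4` = 4,
      `5` = 3, `6` = 2, `7` = 1
    )
  )
```

Because the scale is simply reversed, computing 8 - pt12 inside mutate() would give the same values. The explicit mapping is shown here because the exercise asks for recoding.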