For this exercise, we will use the same subset of the ALLBUS 2021 data as in the lecture. If you have stored that data set as an .rds file as shown in the slides, you can simply load it with the following command:
allbus_2021_eda <- readRDS("./data/allbus_2021_eda.rds")
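As a reminder of how the .rds round trip works, here is a minimal sketch using a small made-up data frame and a temporary file. The object names toy and toy_loaded and the values are purely illustrative and not part of the ALLBUS data; only saveRDS() and readRDS() from base R are used.

```r
# Hypothetical toy data, only for illustrating the save/load round trip
toy <- data.frame(
  xenophobia = c(1, 3, 2, NA),
  contact = c(2, 4, NA, 1)
)

# Write the object to a temporary .rds file ...
toy_path <- tempfile(fileext = ".rds")
saveRDS(toy, toy_path)

# ... and read it back in; the result is a copy of the original object
toy_loaded <- readRDS(toy_path)
identical(toy, toy_loaded)
```

Unlike .csv files, .rds files keep column types and attributes, which is why the wrangled data can be reloaded without repeating the wrangling steps.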
If you have not yet saved the wrangled data as an .rds file, you need to go through the data wrangling pipeline shown in the EDA slides again.
In addition to base R and packages from the tidyverse, we will use the datawizard package in this exercise, so make sure that you have it installed.
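One possible way to check whether the required packages are available is sketched below. The vector name pkgs and the choice of packages to check (dplyr as the relevant tidyverse package, plus datawizard) are assumptions made for this illustration; requireNamespace() is a base R function that returns TRUE if a package can be loaded.

```r
# Packages assumed to be needed for this exercise (illustrative selection)
pkgs <- c("dplyr", "datawizard")

# TRUE/FALSE for each package: can it be loaded?
is_available <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)

# Names of the packages that are not installed yet
missing_pkgs <- pkgs[!is_available]

if (length(missing_pkgs) > 0) {
  message("Please install first: ", paste(missing_pkgs, collapse = ", "))
  # install.packages(missing_pkgs)  # uncomment to install them
}
```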
Using a base R function, print some basic summary statistics for the variables xenophobia and contact.
You can use the dplyr function for selecting variables and pipe the result into the required function.
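The approach described in this task could look roughly like the following sketch. It uses a small made-up data frame (toy) instead of the ALLBUS data, so the values are purely illustrative; select() comes from dplyr and summary() from base R.

```r
library(dplyr)

# Hypothetical toy data standing in for allbus_2021_eda
toy <- data.frame(
  sat_dem = c(1, 2, 3, 4, 5, NA),
  xenophobia = c(2, 2, 3, 4, NA, 1),
  contact = c(0, 1, 1, 2, 3, 4)
)

# Keep only the two variables of interest, then summarise them
toy_summary <- toy %>%
  select(xenophobia, contact) %>%
  summary()

toy_summary
```

For numeric variables, summary() reports the minimum, quartiles, median, mean, and maximum, plus the number of missing values if there are any.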
Use a function from the datawizard package to get summary statistics (descriptions of the distribution) for the following variables in our data set: sat_dem, xenophobia, contact. We do not want information on quartiles or the IQR.
Have a look at ?describe_distribution.
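A minimal sketch of this task, again with a made-up data frame (toy) rather than the ALLBUS data, might look as follows. It assumes that the iqr and quartiles arguments of describe_distribution() are used to switch off the respective output, as described on the function's help page.

```r
library(dplyr)
library(datawizard)

# Hypothetical toy data standing in for allbus_2021_eda
toy <- data.frame(
  sat_dem = c(1, 2, 3, 4, 5, NA),
  xenophobia = c(2, 2, 3, 4, NA, 1),
  contact = c(0, 1, 1, 2, 3, 4)
)

# Describe the three variables without IQR and quartiles
toy_dist <- toy %>%
  select(sat_dem, xenophobia, contact) %>%
  describe_distribution(iqr = FALSE, quartiles = FALSE)

toy_dist
```

The result contains one row per variable, including the number of valid and missing observations.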
Use dplyr to create grouped summary statistics. Compute separate means for the variables xenophobia and contact for the different age groups in the data set. The resulting summary variables should be called xenophobia_mean and contact_mean. You should exclude respondents with missing values for the variables of interest.
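The grouped summary could be sketched as shown below. The data frame toy, its values, and the grouping variable name age_group are hypothetical stand-ins; the age group variable in the actual data set may have a different name. Missing values are handled here by filtering out incomplete rows before grouping, which is one of several possible ways to do this.

```r
library(dplyr)

# Hypothetical toy data; age_group is an assumed variable name
toy <- data.frame(
  age_group = c("young", "young", "old", "old", "old"),
  xenophobia = c(1, 3, 2, NA, 4),
  contact = c(2, 4, NA, 1, 3)
)

toy_means <- toy %>%
  # Drop respondents with a missing value on either variable
  filter(!is.na(xenophobia), !is.na(contact)) %>%
  group_by(age_group) %>%
  summarise(
    xenophobia_mean = mean(xenophobia),
    contact_mean = mean(contact)
  )

toy_means
```

An alternative would be to skip the filter() step and use mean(..., na.rm = TRUE) inside summarise(). That variant drops missing values separately for each variable, so the two means may then be based on different sets of respondents.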