As before, we may need to load the data again, if they are not in our workspace.

allbus_2021_eda <- readRDS("./data/allbus_2021_eda.rds")

Besides base R and the tidyverse, this set of exercises also requires the janitor and correlation packages. So, make sure to install and load them.
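One possible way to handle the setup is sketched below. It assumes the packages are installed from CRAN under the names tidyverse, janitor, and correlation, and it only installs what is not yet available; the object names `pkgs` and `to_install` are made up for this illustration.

```r
# Packages needed for this set of exercises
pkgs <- c("tidyverse", "janitor", "correlation")

# Find the ones that are not installed yet and install only those
to_install <- pkgs[!vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)]
if (length(to_install) > 0) install.packages(to_install)

# Load the packages
library(tidyverse)
library(janitor)
library(correlation)
```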

1

As a first exercise, use base R to create a crosstab for the variables agec (rows) and party_vote (columns) showing row percentages.
We need to combine round(), table(), and prop.table() here, add the margin argument to prop.table() to get row proportions, and multiply the result by 100 to express it as percentages. Extra hint: Rows are the first dimension of tables and data frames in R, so margin = 1 refers to rows.
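A minimal sketch of this combination is shown below. To keep it runnable on its own, it uses a small made-up data frame called `toy` instead of allbus_2021_eda; the category labels and values are assumptions for illustration only.

```r
# Hypothetical toy data standing in for allbus_2021_eda (values are made up)
toy <- data.frame(
  agec = c("<= 25", "<= 25", "<= 25", "26 to 30", "26 to 30", "26 to 30"),
  party_vote = c("SPD", "Gruene", "SPD", "SPD", "SPD", NA)
)

# table() drops NA by default; margin = 1 computes proportions within each row
row_pct <- round(prop.table(table(toy$agec, toy$party_vote), margin = 1) * 100, 2)
row_pct
```

With the real data, the same call should work after replacing `toy` with allbus_2021_eda.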

2

Now, let’s use the janitor package to get the same results.
We want to create a tabyl object and add some additional functions to get the row percentages. As the table() function excludes missing values by default, we need to make sure that missing values for the party_vote variable are excluded here as well.
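One way this could look is sketched below, again on a made-up data frame `toy` rather than the ALLBUS data. The intermediate object `row_props` is introduced only for illustration; with the real data the steps would typically be written as a single pipe.

```r
library(dplyr)
library(janitor)

# Hypothetical toy data standing in for allbus_2021_eda (values are made up)
toy <- data.frame(
  agec = c("<= 25", "<= 25", "<= 25", "26 to 30", "26 to 30", "26 to 30"),
  party_vote = c("SPD", "Gruene", "SPD", "SPD", "SPD", NA)
)

# Drop missing party_vote values, build the crosstab, and compute row proportions
row_props <- toy %>%
  filter(!is.na(party_vote)) %>%
  tabyl(agec, party_vote) %>%
  adorn_percentages(denominator = "row")

# Format the proportions as percentages with two decimals and a percent sign
row_props %>%
  adorn_pct_formatting(digits = 2)
```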

3

As a final exercise on crosstabs, compute a chi-square test for the tabyl we have created before.
This time, we need to filter out missing values for both of our variables of interest. We need neither the row percentages nor the percentage formatting for this.
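The following sketch illustrates the idea on made-up data. It assumes that janitor is loaded, so that chisq.test() dispatches to janitor's method for two-way tabyl objects; the data frame `toy` and the object `chi_res` are hypothetical names.

```r
library(dplyr)
library(janitor)

# Hypothetical toy data standing in for allbus_2021_eda (values are made up)
toy <- data.frame(
  agec = rep(c("<= 25", "26 to 30"), each = 30),
  party_vote = c(
    rep(c("SPD", "CDU-CSU", "Gruene"), times = c(10, 10, 10)),
    rep(c("SPD", "CDU-CSU", "Gruene"), times = c(15, 5, 8)),
    NA, NA
  )
)

# Remove missing values on both variables, build the count table, run the test
chi_res <- toy %>%
  filter(!is.na(agec), !is.na(party_vote)) %>%
  tabyl(agec, party_vote) %>%
  chisq.test()

chi_res
```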

4

Let’s turn to correlations: Use the correlation package to calculate and print correlations between the following variables: left_right, sat_dem, xenophobia, and contact.
The name of the function you need is the same as that of the package we use here.
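As a rough sketch, the call could look like the one below. The data frame `toy` contains randomly generated values under the exercise's variable names and is purely hypothetical, so the resulting coefficients have no substantive meaning.

```r
library(dplyr)
library(correlation)

# Hypothetical toy data with the variable names from the exercise (random values)
set.seed(42)
n <- 80
toy <- data.frame(
  left_right = sample(1:10, n, replace = TRUE),
  sat_dem = sample(1:6, n, replace = TRUE),
  xenophobia = sample(1:7, n, replace = TRUE),
  contact = sample(0:4, n, replace = TRUE)
)

# Select the variables of interest and compute all pairwise correlations
cor_res <- toy %>%
  select(left_right, sat_dem, xenophobia, contact) %>%
  correlation()

cor_res
```

With four variables, the result contains one row for each of the six variable pairs.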

5

As a final exercise, compute the correlations between sat_dem, xenophobia, and contact with the same function as in the previous exercise, but group them by agec this time.
You need to group the data by agec before computing the correlations.
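A possible version of the grouped computation is sketched below. It relies on correlation() accepting a data frame grouped with dplyr's group_by(); the data frame `toy`, its age categories, and its random values are assumptions made for this illustration.

```r
library(dplyr)
library(correlation)

# Hypothetical toy data with the variable names from the exercise (random values)
set.seed(42)
n <- 80
toy <- data.frame(
  agec = rep(c("<= 25", "26 to 30"), each = n / 2),
  sat_dem = sample(1:6, n, replace = TRUE),
  xenophobia = sample(1:7, n, replace = TRUE),
  contact = sample(0:4, n, replace = TRUE)
)

# Group by age category first, then compute the correlations within each group
grouped_res <- toy %>%
  select(agec, sat_dem, xenophobia, contact) %>%
  group_by(agec) %>%
  correlation()

grouped_res
```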