class: center, middle, inverse, title-slide

.title[
# Introduction to R for Data Analysis
]
.subtitle[
## Outlook
]
.author[
### Johannes Breuer, Stefan Jünger, Veronika Batzdorfer
]
.date[
### 2021-08-19
]

---
layout: true

---

## Recap: Course schedule - Day 1

<table class="table" style="margin-left: auto; margin-right: auto;">
 <thead>
  <tr>
   <th style="text-align:left;"> Day </th>
   <th style="text-align:left;"> Time </th>
   <th style="text-align:left;"> Topic </th>
  </tr>
 </thead>
<tbody>
  <tr>
   <td style="text-align:left;color: gray !important;"> Monday </td>
   <td style="text-align:left;color: gray !important;"> 09:30 - 10:30 </td>
   <td style="text-align:left;font-weight: bold;"> Getting Started with R and RStudio </td>
  </tr>
  <tr>
   <td style="text-align:left;color: gray !important;color: gray !important;"> Monday </td>
   <td style="text-align:left;color: gray !important;color: gray !important;"> 10:30 - 10:45 </td>
   <td style="text-align:left;font-weight: bold;color: gray !important;"> Break </td>
  </tr>
  <tr>
   <td style="text-align:left;color: gray !important;"> Monday </td>
   <td style="text-align:left;color: gray !important;"> 10:45 - 12:00 </td>
   <td style="text-align:left;font-weight: bold;"> Getting Started with R and RStudio </td>
  </tr>
  <tr>
   <td style="text-align:left;color: gray !important;color: gray !important;"> Monday </td>
   <td style="text-align:left;color: gray !important;color: gray !important;"> 12:00 - 13:00 </td>
   <td style="text-align:left;font-weight: bold;color: gray !important;"> Lunch Break </td>
  </tr>
  <tr>
   <td style="text-align:left;color: gray !important;"> Monday </td>
   <td style="text-align:left;color: gray !important;"> 13:00 - 14:00 </td>
   <td style="text-align:left;font-weight: bold;"> Data Import & Export </td>
  </tr>
  <tr>
   <td style="text-align:left;color: gray !important;color: gray !important;"> Monday </td>
   <td style="text-align:left;color: gray !important;color: gray !important;"> 14:00 - 14:15 </td>
   <td style="text-align:left;font-weight: bold;color: gray !important;"> Break </td>
  </tr>
  <tr>
   <td style="text-align:left;color: gray !important;"> Monday </td>
   <td style="text-align:left;color: gray !important;"> 14:15 - 15:30 </td>
   <td style="text-align:left;font-weight: bold;"> Data Import & Export </td>
  </tr>
</tbody>
</table>

---

## Recap: Course schedule - Day 2

<table class="table" style="margin-left: auto; margin-right: auto;">
 <thead>
  <tr>
   <th style="text-align:left;"> Day </th>
   <th style="text-align:left;"> Time </th>
   <th style="text-align:left;"> Topic </th>
  </tr>
 </thead>
<tbody>
  <tr>
   <td style="text-align:left;color: gray !important;"> Tuesday </td>
   <td style="text-align:left;color: gray !important;"> 09:30 - 10:30 </td>
   <td style="text-align:left;font-weight: bold;"> Data Wrangling - Part 1 </td>
  </tr>
  <tr>
   <td style="text-align:left;color: gray !important;color: gray !important;"> Tuesday </td>
   <td style="text-align:left;color: gray !important;color: gray !important;"> 10:30 - 10:45 </td>
   <td style="text-align:left;font-weight: bold;color: gray !important;"> Break </td>
  </tr>
  <tr>
   <td style="text-align:left;color: gray !important;"> Tuesday </td>
   <td style="text-align:left;color: gray !important;"> 10:45 - 12:00 </td>
   <td style="text-align:left;font-weight: bold;"> Data Wrangling - Part 1 </td>
  </tr>
  <tr>
   <td style="text-align:left;color: gray !important;color: gray !important;"> Tuesday </td>
   <td style="text-align:left;color: gray !important;color: gray !important;"> 12:00 - 13:00 </td>
   <td style="text-align:left;font-weight: bold;color: gray !important;"> Lunch Break </td>
  </tr>
  <tr>
   <td style="text-align:left;color: gray !important;"> Tuesday </td>
   <td style="text-align:left;color: gray !important;"> 13:00 - 14:00 </td>
   <td style="text-align:left;font-weight: bold;"> Data Wrangling - Part 2 </td>
  </tr>
  <tr>
   <td style="text-align:left;color: gray !important;color: gray !important;"> Tuesday </td>
   <td style="text-align:left;color: gray !important;color: gray !important;"> 14:00 - 14:15 </td>
   <td style="text-align:left;font-weight: bold;color: gray !important;"> Break </td>
  </tr>
  <tr>
   <td style="text-align:left;color: gray !important;"> Tuesday </td>
   <td style="text-align:left;color: gray !important;"> 14:15 - 15:30 </td>
   <td style="text-align:left;font-weight: bold;"> Data Wrangling - Part 2 </td>
  </tr>
</tbody>
</table>

---

## Recap: Course schedule - Day 3

<table class="table" style="margin-left: auto; margin-right: auto;">
 <thead>
  <tr>
   <th style="text-align:left;"> Day </th>
   <th style="text-align:left;"> Time </th>
   <th style="text-align:left;"> Topic </th>
  </tr>
 </thead>
<tbody>
  <tr>
   <td style="text-align:left;color: gray !important;"> Wednesday </td>
   <td style="text-align:left;color: gray !important;"> 09:30 - 10:30 </td>
   <td style="text-align:left;font-weight: bold;"> Exploratory Data Analysis </td>
  </tr>
  <tr>
   <td style="text-align:left;color: gray !important;color: gray !important;"> Wednesday </td>
   <td style="text-align:left;color: gray !important;color: gray !important;"> 10:30 - 10:45 </td>
   <td style="text-align:left;font-weight: bold;color: gray !important;"> Break </td>
  </tr>
  <tr>
   <td style="text-align:left;color: gray !important;"> Wednesday </td>
   <td style="text-align:left;color: gray !important;"> 10:45 - 12:00 </td>
   <td style="text-align:left;font-weight: bold;"> Exploratory Data Analysis </td>
  </tr>
  <tr>
   <td style="text-align:left;color: gray !important;color: gray !important;"> Wednesday </td>
   <td style="text-align:left;color: gray !important;color: gray !important;"> 12:00 - 13:00 </td>
   <td style="text-align:left;font-weight: bold;color: gray !important;"> Lunch Break </td>
  </tr>
  <tr>
   <td style="text-align:left;color: gray !important;"> Wednesday </td>
   <td style="text-align:left;color: gray !important;"> 13:00 - 14:00 </td>
   <td style="text-align:left;font-weight: bold;"> Data Visualization - Part 1 </td>
  </tr>
  <tr>
   <td style="text-align:left;color: gray !important;color: gray !important;"> Wednesday </td>
   <td style="text-align:left;color: gray !important;color: gray !important;"> 14:00 - 14:15 </td>
   <td style="text-align:left;font-weight: bold;color: gray !important;"> Break </td>
  </tr>
  <tr>
   <td style="text-align:left;color: gray !important;"> Wednesday </td>
   <td style="text-align:left;color: gray !important;"> 14:15 - 15:30 </td>
   <td style="text-align:left;font-weight: bold;"> Data Visualization - Part 1 </td>
  </tr>
</tbody>
</table>

---

## Recap: Course schedule - Day 4

<table class="table" style="margin-left: auto; margin-right: auto;">
 <thead>
  <tr>
   <th style="text-align:left;"> Day </th>
   <th style="text-align:left;"> Time </th>
   <th style="text-align:left;"> Topic </th>
  </tr>
 </thead>
<tbody>
  <tr>
   <td style="text-align:left;color: gray !important;"> Thursday </td>
   <td style="text-align:left;color: gray !important;"> 09:30 - 10:30 </td>
   <td style="text-align:left;font-weight: bold;"> Confirmatory Data Analysis </td>
  </tr>
  <tr>
   <td style="text-align:left;color: gray !important;color: gray !important;"> Thursday </td>
   <td style="text-align:left;color: gray !important;color: gray !important;"> 10:30 - 10:45 </td>
   <td style="text-align:left;font-weight: bold;color: gray !important;"> Break </td>
  </tr>
  <tr>
   <td style="text-align:left;color: gray !important;"> Thursday </td>
   <td style="text-align:left;color: gray !important;"> 10:45 - 12:00 </td>
   <td style="text-align:left;font-weight: bold;"> Confirmatory Data Analysis </td>
  </tr>
  <tr>
   <td style="text-align:left;color: gray !important;color: gray !important;"> Thursday </td>
   <td style="text-align:left;color: gray !important;color: gray !important;"> 12:00 - 13:00 </td>
   <td style="text-align:left;font-weight: bold;color: gray !important;"> Lunch Break </td>
  </tr>
  <tr>
   <td style="text-align:left;color: gray !important;"> Thursday </td>
   <td style="text-align:left;color: gray !important;"> 13:00 - 14:00 </td>
   <td style="text-align:left;font-weight: bold;"> Data Visualization - Part 2 </td>
  </tr>
  <tr>
   <td style="text-align:left;color: gray !important;color: gray !important;"> Thursday </td>
   <td style="text-align:left;color: gray !important;color: gray !important;"> 14:00 - 14:15 </td>
   <td style="text-align:left;font-weight: bold;color: gray !important;"> Break </td>
  </tr>
  <tr>
   <td style="text-align:left;color: gray !important;"> Thursday </td>
   <td style="text-align:left;color: gray !important;"> 14:15 - 15:30 </td>
   <td style="text-align:left;font-weight: bold;"> Data Visualization - Part 2 </td>
  </tr>
</tbody>
</table>

---

## Recap: Course schedule - Day 5

<table class="table" style="margin-left: auto; margin-right: auto;">
 <thead>
  <tr>
   <th style="text-align:left;"> Day </th>
   <th style="text-align:left;"> Time </th>
   <th style="text-align:left;"> Topic </th>
  </tr>
 </thead>
<tbody>
  <tr>
   <td style="text-align:left;color: gray !important;"> Friday </td>
   <td style="text-align:left;color: gray !important;"> 09:30 - 10:30 </td>
   <td style="text-align:left;font-weight: bold;"> Reporting with R Markdown </td>
  </tr>
  <tr>
   <td style="text-align:left;color: gray !important;color: gray !important;"> Friday </td>
   <td style="text-align:left;color: gray !important;color: gray !important;"> 10:30 - 10:45 </td>
   <td style="text-align:left;font-weight: bold;color: gray !important;"> Break </td>
  </tr>
  <tr>
   <td style="text-align:left;color: gray !important;"> Friday </td>
   <td style="text-align:left;color: gray !important;"> 10:45 - 12:30 </td>
   <td style="text-align:left;font-weight: bold;"> Reporting with R Markdown </td>
  </tr>
  <tr>
   <td style="text-align:left;color: gray !important;color: gray !important;"> Friday </td>
   <td style="text-align:left;color: gray !important;color: gray !important;"> 12:30 - 13:30 </td>
   <td style="text-align:left;font-weight: bold;color: gray !important;"> Lunch Break </td>
  </tr>
  <tr>
   <td style="text-align:left;color: gray !important;"> Friday </td>
   <td style="text-align:left;color: gray !important;"> 13:45 - 14:30 </td>
   <td style="text-align:left;font-weight: bold;"> Outlook, Q&A </td>
  </tr>
</tbody>
</table>

---

## Our jou`R`ney this week (hopefully)

<img src="data:image/png;base64,#C:\Users\breuerjs\Documents\Lehre\r-intro-gesis-2022\content\img\r_pkgs_mindblowing.png" width="85%" style="display: block; margin: auto;" />

.footnote[[Source](https://res.cloudinary.com/syknapptic/image/upload/v1521320144/tidyverse_meme_oceake.png)]

---

## Where to go from here?

Hopefully, after this week, you feel prepared to take your next steps in `R`. Some recommendations for continuing your jou`R`ney:

- Keep working with `R`!

- If time permits, do the things you would usually do in `SPSS` or `Stata` in `R`, even when it's harder

- Try to do at least one research task solely in `R` (one analysis, a whole paper, a report, etc.)

- Look for tutorials and guides online - trust us, there's way more (good & free) online material for `R` than there is, e.g., for `SPSS` or `Stata`

---

## Resources: Introductory books

[R for Data Science](https://r4ds.had.co.nz/) by Hadley Wickham & Garrett Grolemund

[R Cookbook: Proven recipes for data analysis, statistics, and graphics](https://rc2e.com/) by JD Long & Paul Teetor

[Hands-On Programming with R](https://rstudio-education.github.io/hopr/) by Garrett Grolemund

[R Programming for Data Science](https://bookdown.org/rdpeng/rprogdatascience/) by Roger D. Peng

[Quantitative Social Science Data with R](https://uk.sagepub.com/en-gb/eur/quantitative-social-science-data-with-r/book257236) by Brian J. Fogarty

[Introduction to R for Social Scientists - A Tidy Programming Approach](https://www.routledge.com/Introduction-to-R-for-Social-Scientists-A-Tidy-Programming-Approach/Kennedy-Waggoner/p/book/9780367460723) by Ryan Kennedy & Philip D. Waggoner

[Quantitative Social Science: An Introduction in tidyverse](https://press.princeton.edu/books/hardcover/9780691222271/quantitative-social-science) by Kosuke Imai & Nora Webb Williams

---

## Resources: Online courses & tutorials

- Overview of resources
  - [*learnR4free*](https://www.learnr4free.com/) by Mine Dogucu
  - the [*Big Book of R*](https://www.bigbookofr.com/) by Oscar Baruffa

- [*swirl* - Learn `R` in `R`](https://swirlstats.com/)

- One of my personal favorites, which teaches `R` (and statistics) with a cute story and beautiful illustrations: [Teacups, Giraffes, & Statistics](https://tinystats.github.io/teacups-giraffes-and-statistics/) by Hasse Walum & Desirée de Leon

---

## Working with other data types

Johannes, Stefan, and Veronika use different data types in their daily work:

- digital trace data (Johannes/Veronika)<sup>*</sup>

- georeferenced/geospatial data (Stefan)<sup>**</sup>

**Remember that `R` is data-agnostic! It can serve as a fancy data science tool for extracting social media data but also as a full-blown Geographic Information System (GIS).**

.small[
<sup>*</sup> see, e.g., https://github.com/jobreu/twitter-linking-workshop-2022 or https://github.com/jobreu/youtube-workshop-gesis-2022

<sup>**</sup> see, e.g., https://github.com/StefanJuenger/gesis-workshop-geospatial-techniques-R or https://github.com/StefanJuenger/esra-workshop-first-steps-R-GIS
]

---

## Working with text data in `R`

As with almost everything else, there are many great resources for working with text data in `R`.
Two good options (and starting points) are:

- the [`tidytext` package](https://juliasilge.github.io/tidytext/) and the accompanying book [*Text Mining with R: A Tidy Approach*](https://www.tidytextmining.com/) by Julia Silge & David Robinson

- the [`quanteda` package](https://quanteda.io/) and its accompanying [tutorials](https://tutorials.quanteda.io/)

In addition, a nice, free, self-paced online course is [*Text mining in R for the social sciences and digital humanities*](https://tm4ss.github.io/docs/index.html) by Andreas Niekler and Gregor Wiedemann

---

## What Are Geospatial Data?

.pull-left[
Data with a direct spatial reference `\(\rightarrow\)` **geo-coordinates**

- Information about geometries
- Optional: Content in relation to the geometries

Can be projected jointly in one single space

- Allows data linking and extraction of substantive information
]

.pull-right[
<img src="data:image/png;base64,#C:\Users\breuerjs\Documents\Lehre\r-intro-gesis-2022\content\img\fig_geometries.png" width="85%" style="display: block; margin: auto;" />

.tinyisher[Sources: OpenStreetMap / GEOFABRIK (2018), City of Cologne (2014), and the Statistical Offices of the Federation and the Länder (2016) / Jünger, 2019]
]

---

## Mapping is so easy nowadays

.pull-left[

```r
library(mapsf)

# Load the sample dataset of Martinique municipalities shipped with mapsf
mtq <- mf_get_mtq()

# Draw the base map of the municipalities
mf_map(x = mtq)

# Add proportional symbols sized by the population variable "POP"
mf_map(x = mtq, var = "POP", type = "prop")

# Add a title and credits to the map
mf_layout(
  title = "Population in Martinique",
  credits = "T. Giraud; Sources: INSEE & IGN, 2018"
)
```
]

.pull-right[
<img src="data:image/png;base64,#5_2_Outlook_files/figure-html/mapsf-print-1.png" style="display: block; margin: auto;" />
]

Example from: https://riatelab.github.io/mapsf/

---

## 'Web development' using `R`

These days, many `R` packages provide tools originally developed for the web.
For example:

- [bookdown](https://pkgs.rstudio.com/bookdown/) enables you to publish books written in `R Markdown` online

- [pkgdown](https://pkgdown.r-lib.org/) builds a documentation website for your own `R` packages

- [blogdown](https://pkgs.rstudio.com/blogdown/) is more general and helps you with creating websites (examples to follow)

---

## Shiny apps

> Shiny is an R package that makes it easy to build interactive web apps straight from R. You can host standalone apps on a webpage or embed them in R Markdown documents or build dashboards. You can also extend your Shiny apps with CSS themes, htmlwidgets, and JavaScript actions.

https://shiny.rstudio.com/

---
class: middle

## Example 1: Movie Explorer

.center[https://shiny.rstudio.com/gallery/movie-explorer.html]

---
class: middle

## Example 2: CRAN explorer

.center[https://gallery.shinyapps.io/cran-explorer/]

---

## Creating your own homepage with `R`

.pull-left[
<img src="data:image/png;base64,#C:\Users\breuerjs\Documents\Lehre\r-intro-gesis-2022\content\img\homepage_johannes.png" width="1665" style="display: block; margin: auto;" />

.center[.small[https://www.johannesbreuer.com/]]
]

.pull-right[
<img src="data:image/png;base64,#C:\Users\breuerjs\Documents\Lehre\r-intro-gesis-2022\content\img\homepage_stefan.png" width="1315" style="display: block; margin: auto;" />

.center[.small[https://stefanjuenger.github.io/]]
]

.center[Powered by [`blogdown`](https://cran.r-project.org/web/packages/blogdown/index.html) & [Hugo Academic](https://academic-demo.netlify.app/)]

---

## Writing your own `R` packages

.pull-left[
At a certain point (not now!), you may want to consider writing your own `R` package

- useful for creating reproducible code

- great for distributing your work to others

- for example, we created an [`R` package](https://stefanjuenger.github.io/woRkshoptools/) to facilitate working on our workshop materials
]

.pull-right[
<img src="data:image/png;base64,#C:\Users\breuerjs\Documents\Lehre\r-intro-gesis-2022\content\img\r_packages.jpg" width="1381" style="display: block; margin: auto;" />

[Read the book here!](https://r-pkgs.org/)
]

---
class: middle

## It's straightforward in `RStudio`

<img src="data:image/png;base64,#C:\Users\breuerjs\Documents\Lehre\r-intro-gesis-2022\content\img\new_package.png" width="75%" style="display: block; margin: auto;" />

---

## Acknowledgements ❤️

All slides were created with the `R` package [`xaringan`](https://github.com/yihui/xaringan), which builds on [`remark.js`](https://remarkjs.com), [`knitr`](http://yihui.name/knitr), and [`R Markdown`](https://rmarkdown.rstudio.com).

The exercises were created with the [`unilur` package](https://github.com/koncina/unilur).

Please make sure to properly cite all data that you use for your research (archives usually provide suggested citations). Also make sure to cite the free and open-source software (FOSS) that you use, such as `R` and the packages for it.

Veronika, Stefan, and Johannes want to thank Theresa for her excellent support before and during the course!

Finally, all of us want to thank the *GESIS Training* team for taking good care of the organization of this course (and the whole Summer School) and all of you for participating!
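
As a minimal sketch of the software-citation advice above: base `R` ships a `citation()` function that returns the suggested reference for `R` itself or for an installed package. The package name `stats` below is only an illustrative assumption (it comes with every `R` installation); swap in the packages you actually used.

```r
# Suggested citation for R itself (returned as a citation/bibentry object)
r_citation <- citation()
print(r_citation)

# Suggested citation for a single package; "stats" is only an example
# that is available in every R installation
pkg_citation <- citation("stats")
print(pkg_citation)

# The same reference as a BibTeX entry, e.g., for a reference manager
r_bibtex <- toBibtex(r_citation)
print(r_bibtex)
```

The printed output depends on your `R` version, so it is worth regenerating the citations on the machine that ran the analysis.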