Welcome to this set of workshops on R, Open Research, and Reproducibility. My name is Andrew Stewart and I am an Experimental Psychologist in the Division of Neuroscience and Experimental Psychology at the University of Manchester and a Fellow of The Software Sustainability Institute. You can contact me via email or via direct message on Twitter. If you are interested in my academic activities, you can check out my website here.

In the summer of 2020 I started recording these workshops as part of a couple of Masters level courses I teach at the University of Manchester. I realised they might have broader interest so I decided to share a subset of them via this website. Hopefully you will find them useful. Workshop 1 provides a general introduction to Open Research and Reproducibility. The other workshops assume you have some training in undergraduate-level statistics and research design.

All the videos in these workshops are best viewed in fullscreen mode at 1080p resolution. I have recorded the audio with a podcasting microphone, so it will sound best through headphones. YouTube generates subtitles automatically, so please turn those on if you’d find them useful. They are fairly accurate - although sometimes ‘open research’ gets mis-subtitled as ‘urban research’…

In this workshop I will first introduce you to the key concepts in open research, and talk about the so-called “replication crisis” in the Psychological, Biomedical, and Life Sciences that has resulted in the Open Research movement. I will also discuss the importance of adopting reproducible research practices in your own research, and provide an introduction to various tools and processes you can incorporate into your own research workflows that will allow you to conduct reproducible research. To go to this workshop, just click on the image below.

In this workshop I’ll introduce you to R (the language) and RStudio Desktop (the environment we use to interact with the language). I’ve also added a link to a great talk by the founder of RStudio, J.J. Allaire. At the end of the workshop I have put together a video which will show you how to run your first R script.

In this workshop I will introduce you to a number of key packages known as the `Tidyverse`. These packages contain a large number of functions for working with data in tidy format. By making our data wrangling reproducible (i.e., by coding it in R), we can easily re-run this stage of our analysis pipeline as new data gets added. Data wrangling is a key part of the analysis process, and its reproducibility often gets overlooked. There are two parts to this workshop. The first focuses on data wrangling/tidying. To go to this first part, just click on the image below.
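To give a flavour of what reproducible data wrangling looks like, here is a minimal sketch rather than material taken from the workshop itself. The data, the column names, and the choice of `pivot_longer()` and `summarise()` are all my own assumptions for illustration.

```r
library(dplyr)
library(tidyr)

# Hypothetical wide-format data: one row per participant,
# one column of reaction times per condition
wide_data <- tibble(
  participant = 1:4,
  condition_a = c(520, 610, 480, 555),
  condition_b = c(590, 640, 530, 600)
)

# Reshape to tidy (long) format: one row per observation
tidy_data <- wide_data %>%
  pivot_longer(
    cols = starts_with("condition_"),
    names_to = "condition",
    names_prefix = "condition_",
    values_to = "rt"
  ) %>%
  mutate(condition = factor(condition))

# Summarise by condition
summary_data <- tidy_data %>%
  group_by(condition) %>%
  summarise(mean_rt = mean(rt), n = n())

summary_data
```

Because every step is written as code, re-running the script after new participants are added repeats exactly the same wrangling.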

In this workshop we will explore Simple Regression in the context of the General Linear Model (GLM). You will also have the opportunity to build some regression models where you predict an outcome variable on the basis of one predictor. You will also learn how to run model diagnostics to ensure you are not violating any key assumptions of regression.
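As a rough illustration of the idea, a simple regression with one predictor might look something like the sketch below. It uses the `mtcars` dataset that ships with R as a stand-in for the workshop data, and base R diagnostic plots; both choices are assumptions on my part, and the workshop itself may use different data and tools.

```r
# mtcars ships with R and stands in here for the workshop data
# Predict fuel efficiency (mpg) from car weight (wt)
simple_model <- lm(mpg ~ wt, data = mtcars)

# Coefficients, standard errors, and R-squared
summary(simple_model)

# Base R diagnostic plots: residuals vs fitted, Q-Q plot,
# scale-location, and residuals vs leverage
par(mfrow = c(2, 2))
plot(simple_model)
```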

In this workshop we will explore Multiple Regression in the context of the General Linear Model (GLM). Multiple Regression builds on Simple Regression, except that instead of having one predictor (as is the case with Simple Regression) we will be dealing with multiple predictors. Again, you will have the opportunity to build some regression models and use various methods to decide which one is ‘best’. You will also learn how to run model diagnostics for these models as you did in the case of Simple Regression.
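One possible way of comparing candidate models is sketched below. Again I am using the built-in `mtcars` data as a stand-in, and the comparison methods shown (an F-test on nested models and AIC) are just two common options that I am assuming for illustration, not necessarily the ones covered in the workshop.

```r
# Two nested models using the built-in mtcars data (a stand-in dataset)
model_one <- lm(mpg ~ wt, data = mtcars)        # one predictor
model_two <- lm(mpg ~ wt + hp, data = mtcars)   # two predictors

# F-test: does adding hp significantly improve the fit?
anova(model_one, model_two)

# AIC: lower values indicate a better trade-off between fit and complexity
AIC(model_one, model_two)
```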

In this workshop I will show you how to generate a report in `.html` format using R Markdown. Reports written using R Markdown allow you to combine narrative that you’ve written along with R code chunks, and the output associated with those code chunks, all in one `knitted` document. This workshop was written for my M.Res. class, who need to submit their assignments as an .html document knitted using R Markdown.
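In RStudio you would normally just press the Knit button, but the same thing can be done from code. The sketch below is only an illustration: the report content and file location are hypothetical, and it assumes the `rmarkdown` package and pandoc are installed (the knitting step is skipped if they are not).

```r
# Three backticks, built with strrep() so this example does not
# contain a literal code fence
fence <- strrep("`", 3)

# A minimal (hypothetical) R Markdown document: YAML header,
# some narrative, and one R code chunk
rmd_lines <- c(
  "---",
  'title: "My first report"',
  "output: html_document",
  "---",
  "",
  "Here is a summary of the built-in cars dataset.",
  "",
  paste0(fence, "{r}"),
  "summary(cars)",
  fence
)

# Write the document to a temporary .Rmd file
rmd_file <- tempfile(fileext = ".Rmd")
writeLines(rmd_lines, rmd_file)

# Knit to .html only if rmarkdown and pandoc are available
if (requireNamespace("rmarkdown", quietly = TRUE) &&
    rmarkdown::pandoc_available()) {
  html_file <- rmarkdown::render(rmd_file, quiet = TRUE)
  print(html_file)
}
```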

In this workshop we will explore Analysis of Variance (ANOVA) in the context of model building in R for between participants designs, repeated measures designs, and factorial designs. You will learn how to use the `afex::` package for building models with Type III Sums of Squares, and the `emmeans::` package to conduct follow-up tests to explore main effects and interactions. This workshop is intended to keep you occupied for *two* sessions (i.e., 4 hours in total).
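For a sense of how these two packages fit together, here is a minimal sketch for a between participants design. The data are simulated and the variable names are hypothetical; I am also assuming `aov_4()` as the model-building function, which is only one of several that `afex` provides.

```r
library(afex)
library(emmeans)

set.seed(42)

# Hypothetical between participants data: 3 groups of 10 participants
sim_data <- data.frame(
  id = factor(1:30),
  group = factor(rep(c("low", "medium", "high"), each = 10)),
  score = c(rnorm(10, 50, 5), rnorm(10, 55, 5), rnorm(10, 60, 5))
)

# One-way ANOVA; afex uses Type III Sums of Squares by default
anova_model <- aov_4(score ~ group + (1 | id), data = sim_data)
anova_model

# Follow-up pairwise comparisons between the group means
emmeans(anova_model, pairwise ~ group)
```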

In this workshop we will build on Workshop 7 to explore Analysis of Covariance (ANCOVA). We will also examine ANOVA and ANCOVA as special cases of regression and see how we can build both via a linear model. By then doing this yourselves, you will hopefully be convinced that ANOVA and regression are really the same thing.
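The equivalence can be seen in a few lines of base R. This is a sketch using the built-in `PlantGrowth` dataset, which I have chosen purely for illustration: the same one-way design is fitted once with `aov()` and once with `lm()`, and both give the same F-test for the group effect.

```r
# PlantGrowth ships with R: plant weight for three groups (ctrl, trt1, trt2)

# The design fitted as a one-way ANOVA
anova_model <- aov(weight ~ group, data = PlantGrowth)
summary(anova_model)

# The same design fitted as a linear regression with a categorical predictor
lm_model <- lm(weight ~ group, data = PlantGrowth)
anova(lm_model)

# The F statistics for the group effect are identical
f_from_aov <- summary(anova_model)[[1]][["F value"]][1]
f_from_lm <- anova(lm_model)[["F value"]][1]
all.equal(f_from_aov, f_from_lm)
```

Adding a continuous covariate to the right-hand side of the `lm()` formula would turn this into an ANCOVA.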

In this workshop we will see how mixed models combine aspects of linear regression (for model fitting) while circumventing the need for observations to be independent of each other. We will also examine how we model the influence of random effects in our mixed models, and see how mixed models can cope with unbalanced designs and missing data.
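As a small taste, a linear mixed model might be specified along the lines below. This sketch assumes the `lme4` package and uses its built-in `sleepstudy` data as a stand-in; the workshop may use different data and model structures.

```r
library(lme4)

# sleepstudy ships with lme4: reaction times over 10 days of
# sleep deprivation, with repeated observations for each subject
# Fixed effect of Days, plus random intercepts and random slopes
# for Days by Subject to account for the non-independent observations
mixed_model <- lmer(Reaction ~ Days + (1 + Days | Subject),
                    data = sleepstudy)

summary(mixed_model)

# The fixed effect estimates
fixef(mixed_model)
```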

In this workshop we will examine mixed models for factorial designs, and explore how to model non-continuous dependent variables (e.g., binary and ordinal outcome variables) using the `glmer()` family of mixed models.
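For illustration only, a generalised linear mixed model with a binomial outcome could be sketched as follows. It assumes the `lme4` package and its built-in `cbpp` data (disease incidence counts per herd), which is my own choice of stand-in. It covers just the binomial case, not ordinal outcomes.

```r
library(lme4)

# cbpp ships with lme4: number of disease cases (incidence) out of
# herd size (size), measured over four periods in each herd
# Binomial outcome with a random intercept for each herd
binomial_model <- glmer(
  cbind(incidence, size - incidence) ~ period + (1 | herd),
  data = cbpp,
  family = binomial
)

summary(binomial_model)
```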

All of the material in this workshop was created using open source software where possible, on an Entroware Apollo laptop running the GNU/Linux distro Ubuntu 20.04 LTS (Focal Fossa). The audio was captured with a Fifine USB Podcasting microphone and the video with a Razer Kiyo webcam. The audio and video were recorded using Open Broadcaster Software (OBS) and edited using Shotcut. The R code was written using R 3.6.3, and run in the RStudio Desktop IDE version 1.3.959. Ubuntu 20.04 LTS (Focal Fossa), OBS, Shotcut, R, and RStudio Desktop are all open source.

The structure for this unit was very much inspired by the Sharing At Short Notice webinar by Alison Hill and Desirée De Leon.

The repo for each workshop can be accessed via the ‘Improve this Workshop’ link at the bottom of each workshop page. The workshops and this website were all written using R Markdown and the website is hosted on Netlify via continuous deployment from this GitHub repository.

The source code for each of the Workshops above is licensed under the MIT license, and the lecture content under CC-BY.