Instructor Notes
Timing
Leave about 30 minutes at the start of each workshop and another 15 minutes at the start of each session for technical difficulties such as WiFi problems and software installation. Allow this time even if you asked students to install in advance, and allow longer if you did not.
Lesson Plans
The lesson contains much more material than can be taught in a day. Instructors will need to pick an appropriate subset of episodes to use in a standard one-day course.
Some suggested paths through the material are:
(suggested by @liz-is)
- 01 Introduction to R and RStudio
- 04 Data Structures
- 05 Exploring Data Frames (“Realistic example” section onwards)
- 08 Creating Publication-Quality Graphics with ggplot2
- 10 Functions Explained
- 13 Dataframe Manipulation with dplyr
- 15 Producing Reports With knitr
(suggested by @naupaka)
- 01 Introduction to R and RStudio
- 02 Project Management With RStudio
- 03 Seeking Help
- 04 Data Structures
- 05 Exploring Data Frames
- 06 Subsetting Data
- 09 Vectorization
- 08 Creating Publication-Quality Graphics with ggplot2 OR 13 Dataframe Manipulation with dplyr
- 15 Producing Reports With knitr
A half-day course could consist of (suggested by @karawoo):
- 01 Introduction to R and RStudio
- 04 Data Structures (only creating vectors with c())
- 05 Exploring Data Frames (“Realistic example” section onwards)
- 06 Subsetting Data (excluding factor, matrix and list subsetting)
- 08 Creating Publication-Quality Graphics with ggplot2
Setting up git in RStudio
There can be difficulties linking git to RStudio depending on the operating system and the version of the operating system. To make sure Git is properly installed and configured, the learners should go to the Options window in the RStudio application.
- Mac OS X:
  - Go to RStudio -> Preferences… -> Git/SVN
  - Check whether there is a path to a file in the “Git executable” window. If not, the next challenge is figuring out where Git is located.
  - In the terminal, enter ‘which git’ to get the path to the git executable. In the “Git executable” window you may have difficulty finding the directory, since OS X hides many of the operating system files. While the file selection window is open, pressing “Command-Shift-G” will pop up a text entry box where you can type or paste the full path to your git executable, e.g. /usr/bin/git or whatever else it might be.
- Windows:
  - Go to Tools -> Global options… -> Git/SVN
  - If you use the Software Carpentry Installer, then ‘git.exe’ should be installed at C:/Program Files/Git/bin/git.exe.
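As a cross-check on the paths above, learners can also ask R itself where it finds Git. This is a minimal sketch using base R's Sys.which(); it assumes git is on the PATH that the R session sees, which can differ from the PATH in a separate terminal.

```r
# Ask R where (or whether) it can find a git executable on its PATH.
# Sys.which() returns an empty string when nothing is found.
git_path <- Sys.which("git")

if (nzchar(git_path)) {
  message("git found at: ", git_path)
} else {
  message("git was not found on the PATH; locate it manually as described above.")
}
```

If a path is printed, it can be pasted into the “Git executable” field.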
To prevent the learners from having to re-enter their password each time they push a commit to GitHub, this command (which can be run from a bash prompt) will make it so they only have to enter their password once:
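The mechanism is presumably Git's credential helper. The sketch below assumes the built-in cache helper and an illustrative one-hour timeout (both are assumptions, not taken from these notes). It builds the equivalent command from R and prints it rather than running it, so that nothing in the learner's global Git configuration changes by accident.

```r
# Assumption: the built-in "cache" credential helper is the one intended.
# The timeout (in seconds) is an illustrative value, not a recommendation.
cache_timeout <- 3600

git_args <- c(
  "config", "--global", "credential.helper",
  shQuote(paste0("cache --timeout=", cache_timeout), type = "sh")
)

# Show the command that would be typed at a bash prompt.
cat("git", git_args, "\n")

# To apply the setting from R instead, uncomment the next line:
# system2("git", git_args)
```

The cache helper may not be available on every platform, so check that it works on the operating systems your learners use before the workshop.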
RStudio Color Preview
RStudio has a feature to preview the color for certain named colors and hexadecimal colors. This may confuse or distract learners (and instructors) who are not expecting it.
Mainly, this is likely to come up during the episode on “Data Structures” with the following code block:
R
cats <- data.frame(coat = c("calico", "black", "tabby"),
                   weight = c(2.1, 5.0, 3.2),
                   likes_string = c(1, 0, 1))
This option can be turned off and on in the following menu setting: Tools -> Global Options -> Code -> Display -> Enable preview of named and hexadecimal colors (under “Syntax”)
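If learners ask why only some strings get a preview, the sketch below checks the coat values against base R's named colours. It assumes RStudio's preview recognises the same names that colors() returns, which these notes do not confirm.

```r
# The coat values from the "Data Structures" example.
coat <- c("calico", "black", "tabby")

# colors() lists the colour names that base R recognises.
is_named_colour <- coat %in% colors()
is_named_colour
# → FALSE  TRUE FALSE
```

Only "black" is a named colour, so it is the string most likely to be highlighted.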
Pulling in Data
The easiest way to get the data used in this lesson during a workshop is to have attendees download the raw data from gapminder-data and gapminder-data-wide.
Attendees can use the File - Save As dialog in their browser to save the file.
Overall
Make sure to emphasize good practices: put code in scripts, and make sure they’re version controlled. Encourage students to create script files for challenges.
If you’re working in a cloud environment, get them to upload the gapminder data after the second lesson.
Make sure to emphasize that matrices are vectors under the hood and data frames are lists under the hood: this explains a lot of the esoteric behaviour encountered in basic operations.
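A short live demonstration can make this point concrete. The sketch below is one possible version, reusing the cats data frame from the “Data Structures” episode; the matrix m is an illustrative example, not part of the lesson.

```r
# A matrix is an atomic vector with a "dim" attribute.
m <- matrix(1:6, nrow = 2)
attributes(m)      # only $dim: 2 3
m[4]               # a single index walks the underlying vector (column-major): 4
dim(m) <- NULL     # drop the attribute and a plain vector remains
m                  # 1 2 3 4 5 6

# A data frame is a list of equal-length columns.
cats <- data.frame(coat = c("calico", "black", "tabby"),
                   weight = c(2.1, 5.0, 3.2),
                   likes_string = c(1, 0, 1))
typeof(cats)       # "list"
length(cats)       # 3: the number of columns, not rows
cats[["weight"]]   # list-style extraction returns the column vector
```

This also explains why length() on a data frame surprises novices: it counts list elements (columns), not rows.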
Vector recycling and function stacks are probably best explained with diagrams on a whiteboard.
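Alongside the whiteboard diagram, a few lines typed live can show recycling in action. This is a minimal sketch with illustrative values; the variable names are not from the lesson.

```r
x <- c(1, 2, 3, 4)

# The shorter vector is repeated to match the longer one: 10 20 10 20.
recycled <- x + c(10, 20)
recycled           # 11 22 13 24

# A single number is just a length-one vector, recycled across all elements.
doubled <- x * 2
doubled            # 2 4 6 8

# When the lengths do not divide evenly, R still recycles but issues a warning.
uneven <- suppressWarnings(1:5 + 1:2)
uneven             # 2 4 4 6 6
```

Running the last line without suppressWarnings() lets learners see the warning text for themselves.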
Be sure to actually go through examples of an R help page: help files can be intimidating at first, but knowing how to read them is tremendously useful.
Be sure to show the CRAN task views and look at one of the topics.
There’s a lot of content: move quickly through the earlier lessons. Their extensive detail is mostly there for learning by osmosis, so that learners’ memory will be triggered later when they encounter a problem or some esoteric behaviour.
Key lessons to take time on:
- Data subsetting - conceptually difficult for novices
- Functions - learners especially struggle with this
- Data structures - worth being thorough, but you can go through it quickly.
Don’t worry about being correct or knowing the material back-to-front. Use mistakes as teaching moments: the most vital skill you can impart is how to debug and recover from unexpected errors.
Introduction to R and RStudio
Instructor Note
When installing ggplot2, some users may need to set the dependencies flag because lazy loading can affect the install. This suggestion is not tied to any known bug discussion; it is based on instructor feedback and experience in resolving sporadic occurrences of errors encountered while delivering this workshop:
R
install.packages("ggplot2", dependencies = TRUE)
Project Management With RStudio
Seeking Help
Data Structures
Exploring Data Frames
Navigating Files and Directories
Instructor Note
Introducing and navigating the filesystem in the shell (covered in the Navigating Files and Directories section) can be confusing. You may want to have both the terminal and a GUI file explorer open side by side so learners can see the content and file structure while they’re using the terminal to navigate the system.
Automated Version Control
Setting Up Git
Creating a Repository
Tracking Changes
Exploring History
Ignoring Things
Supplemental: Using Git from RStudio
Subsetting Data
Creating Publication-Quality Graphics with ggplot2
Writing Data
Remotes in GitHub
Collaborating
Conflicts
Data Frame Manipulation with dplyr
Data Frame Manipulation with tidyr
Producing Reports With Quarto
Basic Statistics: describing, modelling and reporting
Describing data
Inferential statistics
Regression Modelling
Instructor Note
Emphasise that parametric is not equal to normal.
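One way to make this concrete is to fit a model that is parametric but assumes a non-normal distribution. The sketch below uses simulated count data and a Poisson regression; the data, seed, and coefficients are illustrative assumptions, not taken from the lesson.

```r
set.seed(42)

# Simulated predictor and Poisson-distributed counts (illustrative values).
x <- runif(100, min = 0, max = 2)
y <- rpois(100, lambda = exp(0.5 + 0.8 * x))

# A Poisson GLM is parametric: it assumes a specific distribution family
# described by a fixed set of parameters, but that family is not the normal.
fit <- glm(y ~ x, family = poisson)
coef(fit)
```

The point for learners is that "parametric" refers to assuming some distribution family, of which the normal is only one.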
Instructor Note
Get them to plot the graphs. Explain that we are generating random data from different distributions and plotting them.
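If a fallback is needed for learners who get stuck, something along these lines could be typed live. It is a minimal sketch: the choice of distributions, sample size, and seed are assumptions, and the lesson's own examples may differ.

```r
set.seed(1)
n <- 1000

# Random draws from three different distributions.
samples <- list(
  normal      = rnorm(n, mean = 0, sd = 1),
  uniform     = runif(n, min = 0, max = 1),
  exponential = rexp(n, rate = 1)
)

# Plot one histogram per distribution, side by side.
op <- par(mfrow = c(1, 3))
for (name in names(samples)) {
  hist(samples[[name]], main = name, xlab = "value")
}
par(op)  # restore the previous plotting layout
```

Comparing the three shapes side by side gives a natural opening to discuss how the same sample size can look very different depending on the underlying distribution.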