The Road to Progress
November 14, 2022
Step-by-step guide for vectorizing/parallelizing your R code
The best way to follow this tutorial is to check out the individual commits and look at the diffs. This lets you see how the code evolved, and that evolution shows the typical workflow.
What you'll need
install.packages(c("pbapply", "mgcv"))
Steps
These steps demonstrate the usual workflow: develop code interactively, wrap it in a loop, then encapsulate it in a function. This sets us up for vectorized functions, which are also well suited to parallel computing.
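As a rough illustration of that progression (not the contents of the repository's example.R), the sketch below fits a linear model to bootstrap resamples of the built-in `mtcars` data. The data set, the use of `lm`, and the names `B` and `fit_one` are assumptions made only for this example.

```r
# Illustrative only: data, model, and names are assumptions,
# not the repository's example.R.
set.seed(1)
B <- 20  # number of bootstrap resamples

# 1. Interactive: get a single iteration working first
i <- sample(nrow(mtcars), replace = TRUE)
coef(lm(mpg ~ wt, data = mtcars[i, ]))

# 2. Loop: repeat the same code and store each result
res_loop <- matrix(NA_real_, nrow = B, ncol = 2)
for (b in seq_len(B)) {
  i <- sample(nrow(mtcars), replace = TRUE)
  res_loop[b, ] <- coef(lm(mpg ~ wt, data = mtcars[i, ]))
}

# 3. Function: encapsulate one iteration
fit_one <- function(b, data) {
  i <- sample(nrow(data), replace = TRUE)
  coef(lm(mpg ~ wt, data = data[i, ]))
}

# 4. Vectorized: apply the function over the iteration index
res_vec <- t(sapply(seq_len(B), fit_one, data = mtcars))
```

Once each iteration is a function of its index, the same call can be handed to a progress-bar or parallel variant of `sapply` with few changes.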
Locally with Git
Clone the repository:
git clone https://github.com/psolymos/the-road-to-progress.git
Open the repository as an R project in RStudio Desktop, VSCode, or R GUI. Check out revisions using git tags to follow the steps:
- Step 1: git checkout 45d5a67 or git checkout step-1
- Step 2: git checkout 59eacb9 or git checkout step-2
- Step 3: git checkout da685ae or git checkout step-3
- Step 4: git checkout 8321cdc or git checkout step-4
- Step 5: git checkout 9fc2c61 or git checkout step-5
- Step 6: git checkout c0e1973 or git checkout step-6
- Step 7: git checkout 370432f or git checkout step-7
- Step 8: git checkout 8ea4cd9 or git checkout step-8
- Step 9a: git checkout b6c7729 or git checkout step-9a
- Step 9b: git checkout db7c892 or git checkout step-9b
The example.R code will change along the steps, introducing new tricks.

Locally without Git
Download the zip file for this release: https://github.com/psolymos/the-road-to-progress/releases/tag/start.
Then follow along with the commit history: https://github.com/psolymos/the-road-to-progress/commits/master/example.R.

In your browser with Gitpod
This link will open up a preinstalled Gitpod environment where you can run the scripts from each step by launching R and copy-pasting the contents from the step-*.R files.

Exercise
Check out Step 4 (git checkout 8321cdc), creating a new branch from it with git checkout -b <new-branch-name> 8321cdc, or download this release: https://github.com/psolymos/the-road-to-progress/releases/tag/middle. Then:
- Develop modular code by splitting the function into 2 pieces: (1) data processing + model training, and (2) prediction.
- Use lapply/sapply to run the code in a vectorized fashion.
- Adapt the vectorized format to show the progress and do it in parallel.
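One possible shape for the exercise is sketched below; it is not the repository's solution. The function names `train_model`, `predict_model`, and `run_one`, the `mtcars` data, and the `lm` model are hypothetical stand-ins. The parallel part uses a two-worker cluster from the base `parallel` package and switches to `pbapply::pbsapply` for a progress bar only if pbapply is installed.

```r
# A sketch only: function names, data, and model are hypothetical.
library(parallel)

# (1) Data processing + model training for one resample
train_model <- function(b, data) {
  i <- sample(nrow(data), replace = TRUE)
  lm(mpg ~ wt, data = data[i, ])
}

# (2) Prediction from a fitted model
predict_model <- function(fit, newdata) {
  unname(predict(fit, newdata = newdata))
}

# Glue the two pieces together for one iteration
run_one <- function(b, data, newdata) {
  predict_model(train_model(b, data), newdata)
}

newdata <- data.frame(wt = c(2.5, 3.5))

# Vectorized, sequential
set.seed(1)
pred_seq <- sapply(1:10, run_one, data = mtcars, newdata = newdata)

# Parallel on a 2-worker cluster; the workers need the helper functions
cl <- makeCluster(2)
clusterExport(cl, c("train_model", "predict_model"))
pred_par <- if (requireNamespace("pbapply", quietly = TRUE)) {
  # Same call shape as sapply, plus a progress bar and the cluster
  pbapply::pbsapply(1:10, run_one, data = mtcars, newdata = newdata, cl = cl)
} else {
  parSapply(cl, 1:10, run_one, data = mtcars, newdata = newdata)
}
stopCluster(cl)
```

Keeping training and prediction separate means either piece can be swapped out or timed on its own, and the call that drives the iterations changes very little between the sequential and parallel versions.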
Additional topics
- Promises: the future API
- RNGs
- foreach: %do% and %dopar%
- purrr & map-reduce
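Of these topics, RNGs come up as soon as parallel code draws random numbers, because the workers need independent yet reproducible streams. A minimal sketch using only the base `parallel` package follows; the seed value and the number of workers are arbitrary choices for illustration.

```r
library(parallel)

cl <- makeCluster(2)

# Give each worker its own L'Ecuyer-CMRG stream derived from one seed
clusterSetRNGStream(cl, iseed = 123)
x1 <- parSapply(cl, 1:4, function(i) runif(1))

# Resetting the streams with the same seed reproduces the same draws
clusterSetRNGStream(cl, iseed = 123)
x2 <- parSapply(cl, 1:4, function(i) runif(1))

stopCluster(cl)

# Compare the two runs
identical(x1, x2)
```

Calling `set.seed()` in the main session alone would not control what the workers draw, which is why the streams are set on the cluster itself.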