Time and place:
The course consists of two sessions:
Monday April 7th, 12:15-15:00, in seminar room Java, Ole-Johan Dahls hus
Thursday April 10th, 12:15-15:00, in seminar room Prolog, Ole-Johan Dahls hus
Language:
English
Note: At least 10 people need to participate for the course to be held
Target audience:
UiO researchers and students who want to get started with machine learning in R.
A video (approximately 25 minutes), with example use cases in research, has been prepared and might be useful for those who are completely new to machine learning.
Prerequisites:
It is an advantage, but not necessary, to be accustomed to writing code in R. Basic knowledge of descriptive statistics and the tidyverse is a plus.
Contents:
- Exploratory data analysis
- Binary classification
- Feature importance
- Multiclass classification
- Cross-validation
- Additional topics
- Preprocessing data with "recipe"
- Building and evaluating multiple models simultaneously
- Statistically comparing models
- Hyperparameter tuning
- Predicting a continuous variable
Briefly about the course:
The focus will be on building and evaluating machine learning models in R rather than an in-depth breakdown of specific algorithms. We will be building models to distinguish between different categories of text based on linguistic features (including number of nouns, adjectives, etc.) using XGBoost.
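As a rough illustration of the kind of workflow described above, the following sketch fits an XGBoost classifier with tidymodels. The course dataset has not been published yet, so the built-in iris data (restricted to two species) stands in for the linguistic features. The object names and settings (100 trees, a 75/25 split) are assumptions made for illustration, not the course's actual code.

```r
library(tidymodels)  # also attaches dplyr, rsample, parsnip, workflows, yardstick

set.seed(123)

# Stand-in data (assumption): two iris species instead of text categories
data_bin <- iris |>
  filter(Species != "setosa") |>
  mutate(Species = droplevels(Species))

# Hold out 25% of the rows for evaluation, stratified by class
data_split <- initial_split(data_bin, prop = 0.75, strata = Species)
train_data <- training(data_split)
test_data  <- testing(data_split)

# Model specification: boosted trees via the xgboost engine
xgb_spec <- boost_tree(trees = 100) |>
  set_engine("xgboost") |>
  set_mode("classification")

# Bundle the formula and the model into one workflow, then fit it
xgb_wf <- workflow() |>
  add_formula(Species ~ .) |>
  add_model(xgb_spec)

xgb_fit <- fit(xgb_wf, data = train_data)

# Evaluate on the held-out rows
test_metrics <- predict(xgb_fit, test_data) |>
  bind_cols(test_data) |>
  accuracy(truth = Species, estimate = .pred_class)

test_metrics
```

The native pipe `|>` used here is available from R 4.1.0, which matches the minimum version required for the course.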
Important: Participants must use their own PC or Mac (laptop) with both R and RStudio installed. Both R (≥ 4.1.0) and RStudio are free and do not require a licence. R can be installed from https://cran.r-project.org and RStudio from https://www.rstudio.com/products/rstudio/download/.
Contact IT support at your faculty or department if you need help with installation. You can use UiO Programkiosk ("Statistikk fullskjerm") if it is not possible to install R or RStudio on your own computer.
Install the following packages in R (or RStudio) before the start of the course:
tidyverse, tidymodels, xgboost, vip, patchwork, workflowsets
Extra packages: doParallel, discrim, kernlab
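One possible way to install everything in one go is sketched below. The variable names are only illustrative, and the snippet assumes a working internet connection and the default CRAN mirror. It installs only the packages that are not already present.

```r
# Packages listed for the course
course_packages <- c("tidyverse", "tidymodels", "xgboost",
                     "vip", "patchwork", "workflowsets")

# Extra packages
extra_packages <- c("doParallel", "discrim", "kernlab")

all_packages <- c(course_packages, extra_packages)

# Find the packages that are not yet installed
missing_packages <- setdiff(all_packages, rownames(installed.packages()))

# Install only what is missing
if (length(missing_packages) > 0) {
  install.packages(missing_packages)
}
```

Afterwards, running library(tidymodels) without an error message is a quick check that the installation worked.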
Optional (for more experienced R users): An alternative way to install the above R packages is to use a package manager such as renv. First, install the "renv" package, then initialize "renv" in a specific RStudio project, and install each of the above packages with "renv":
renv::install("the_package")
The correct libraries should be loaded once the RStudio project is opened. If not, you can run
renv::restore()
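The renv steps above could look roughly like this when typed into the R console of the course project. This is a sketch rather than a required procedure: the variable name is illustrative, and the snapshot step is an addition that records the installed versions so that a later restore has something to restore from. Run the lines one at a time, since renv::init() may restart the R session in RStudio.

```r
# Run inside the RStudio project you will use for the course.
install.packages("renv")  # install renv itself
renv::init()              # create a project-local library and renv.lock

# Packages listed for the course, including the extra ones
course_packages <- c("tidyverse", "tidymodels", "xgboost", "vip",
                     "patchwork", "workflowsets",
                     "doParallel", "discrim", "kernlab")

# renv::install() accepts a character vector of package names
renv::install(course_packages)

# Record all installed packages in renv.lock; type = "all" is used here
# because the project may not contain any scripts that load them yet
renv::snapshot(type = "all")

# Later, if the project library is incomplete, reinstall from renv.lock
renv::restore()
```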
Links to course material
- Dataset and R code (TBA)