Introduction to Machine Learning in R: Classification

An introduction to machine learning in R focusing on classification (supervised learning)

Time and place:
The course consists of two sessions:

Monday April 7th, 12:15-15:00, in seminar room Java, Ole-Johan Dahls hus

Thursday April 10th, 12:15-15:00, in seminar room Prolog, Ole-Johan Dahls hus

Language: 
English

Note: The course will only be held if at least 10 people participate.

Target audience:
UiO researchers and students who want to get started with machine learning in R.

A video (approximately 25 minutes) with example use cases in research has been prepared. It may be useful for those who are completely new to machine learning.

 

Prerequisites:
It is an advantage, but not a requirement, to be accustomed to writing code in R. Basic knowledge of descriptive statistics and the tidyverse is a plus.

Contents:

  • Exploratory data analysis
  • Binary classification
    • Feature importance
  • Multiclass classification
  • Cross-validation
  • Additional topics
    • Preprocessing data with "recipe" 
    • Building and evaluating multiple models simultaneously
    • Statistically comparing models
    • Hyperparameter tuning
    • Predicting a continuous variable
Instructor:
Luigi Maglanoc

Briefly about the course: 
The focus will be on building and evaluating machine learning models in R rather than an in-depth breakdown of specific algorithms. We will be building models to distinguish between different categories of text based on linguistic features (including number of nouns, adjectives, etc.) using XGBoost.
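To give a rough idea of what building and evaluating a model looks like, here is a minimal sketch of a binary classifier using tidymodels with the XGBoost engine. It is not the course material: the built-in iris data (restricted to two species) stands in for the text dataset, and the object names (toy, xgb_spec, xgb_wf) and the choice of 100 trees are assumptions made for illustration only.

```r
library(tidymodels)

set.seed(123)

# Stand-in data (assumption): two iris species, NOT the course dataset.
toy <- iris |>
  dplyr::filter(Species != "setosa") |>
  dplyr::mutate(Species = droplevels(Species))

# Hold out 20% of the rows for evaluation, stratified by class.
data_split <- initial_split(toy, prop = 0.8, strata = Species)
train_data <- training(data_split)
test_data  <- testing(data_split)

# Model specification: boosted trees fitted with the xgboost engine.
xgb_spec <- boost_tree(trees = 100) |>
  set_engine("xgboost") |>
  set_mode("classification")

# A workflow bundles the formula (or a recipe) with the model.
xgb_wf <- workflow() |>
  add_formula(Species ~ .) |>
  add_model(xgb_spec)

xgb_fit <- fit(xgb_wf, data = train_data)

# Predict the held-out rows and compute accuracy.
preds <- predict(xgb_fit, new_data = test_data) |>
  dplyr::bind_cols(test_data)

accuracy(preds, truth = Species, estimate = .pred_class)
```

The same workflow object can be passed to resampling functions such as vfold_cv() and fit_resamples() to obtain cross-validated estimates instead of a single hold-out score.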

Important: Participants must use their own PC or Mac (laptop) with both R and RStudio installed. Both R (≥ 4.1.0) and RStudio are free and do not require a licence. R can be installed from https://cran.r-project.org and RStudio from https://www.rstudio.com/products/rstudio/download/

Contact IT support at your faculty or department if you need help with installation. You can use UiO Programkiosk ("Statistikk fullskjerm") if you cannot install R or RStudio on your own computer.

Install the following packages in R (or RStudio) before the start of the course:
tidyverse, tidymodels, xgboost, vip, patchwork, workflowsets
Extra packages: doParallel, discrim, kernlab

How to install packages in R
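As one possible way to do this, the sketch below uses base R's install.packages() to install only those course packages that are not yet present. The variable names (course_packages, missing_packages) are made up for this example, and the interactive() guard is an assumption so that nothing is installed when the script is run non-interactively.

```r
# Packages listed for the course (core packages first, then the extras).
course_packages <- c(
  "tidyverse", "tidymodels", "xgboost", "vip", "patchwork", "workflowsets",
  "doParallel", "discrim", "kernlab"
)

# Work out which of them are not installed yet.
missing_packages <- setdiff(course_packages, rownames(installed.packages()))

# Install only what is missing, and only in an interactive session
# (for example the RStudio console).
if (interactive() && length(missing_packages) > 0) {
  install.packages(missing_packages)
}
```

Running it a second time should leave missing_packages empty if every installation succeeded.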

Optional (for more experienced R users): An alternative way to install the above R packages is to use a package manager such as renv. First, install the "renv" package, then initialize renv in a specific RStudio project, and install each of the above packages with renv:

renv::install("the_package")
The correct libraries should be loaded once the RStudio project is opened. If not, you can run:
renv::restore()
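Put together, the renv steps above might look like the sketch below. The helper name setup_course_renv() is hypothetical, and the specific arguments (bare = TRUE, restart = FALSE, type = "all") are assumptions chosen so that the project starts empty, the R session is not restarted midway, and every installed package is recorded in the lockfile that renv::restore() later reads.

```r
# Hypothetical helper: set up renv for the course inside the current project.
setup_course_renv <- function(packages) {
  # Install renv itself if it is not available yet.
  if (!requireNamespace("renv", quietly = TRUE)) {
    install.packages("renv")
  }

  # Initialize an empty project library without restarting the session.
  renv::init(bare = TRUE, restart = FALSE)

  # Install the course packages into the project library.
  renv::install(packages)

  # Record all installed packages in renv.lock so that
  # renv::restore() can reinstall them later.
  renv::snapshot(type = "all", prompt = FALSE)
}

# Usage (run from the console of the RStudio project you want to use):
# setup_course_renv(c("tidyverse", "tidymodels", "xgboost",
#                     "vip", "patchwork", "workflowsets"))
```

The call is left commented out because it changes the project it is run in.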

 

Links to course material

  • Dataset and R code (TBA)

Organizer

TASK
Published Jan. 20, 2025 2:43 PM - Last modified Jan. 20, 2025 2:58 PM