Introduction to Machine Learning in R: Classification

This course is for anyone who wants an introduction to machine learning in R, focusing on classification, a type of supervised learning.

The course is held over two half-days:

Day 1: Wednesday 23 June, 10:15 - 13:00, Zoom

Day 2: Friday 25 June, 10:15 - 13:00, Zoom

Aim

Learn how to build machine learning models in R (using tidymodels), how to interpret them, and how to 'improve' model evaluation using cross-validation.

Content

The two algorithms that will be used as examples are linear discriminant analysis (LDA) and XGBoost. Important: the focus will be on building and evaluating machine learning models in R rather than on an in-depth breakdown of specific algorithms. We will build models that distinguish between different categories of text based on linguistic features (such as the number of nouns, adjectives, etc.).

  • Exploratory data analysis
  • Binary classification
    • Feature importance
  • Multiclass classification
  • Cross-validation
  • *Extra (if enough time)*
    • Hyperparameter tuning
    • PCA 
    • Cluster analysis

Target audience

This workshop is for UiO-affiliated students and researchers who are comfortable using R and would like to learn more about machine learning (classification) and how it can be used in research, but who do not have a strong background in mathematics or data science. Basic knowledge of descriptive statistics is a plus, and some knowledge of the tidyverse is preferable, as the main package used in this course (tidymodels) is based on tidyverse principles.

Duration

2 x 3 hours

Signing up

The course is full. To be put on the waiting list, fill in this form.

Important: Participants must use their own PC or Mac (laptop) with both R and RStudio installed. Both R (≥ 3.3.0) and RStudio are free and do not require a license. R can be installed from https://cran.r-project.org and RStudio from https://www.rstudio.com/products/rstudio/download/

Contact IT support at your faculty or department if you need help with the installation. If it is not possible to install R or RStudio on your own computer, you can use UiO Programkiosk ("Statistikk fullskjerm") instead.

Install the following packages in R (or RStudio) before the start of the course:
tidyverse, tidymodels, discrim, mda, xgboost, vip, patchwork
*Extra packages:* doParallel, factoextra

How to install packages in R
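
As a minimal sketch, the packages listed above can be installed from the R console in one go. This assumes an internet connection and a working CRAN mirror; the variable names (required, extra, to_install) are only illustrative, and packages that are already installed are skipped.

```r
# Packages required for the course (as listed above)
required <- c("tidyverse", "tidymodels", "discrim", "mda",
              "xgboost", "vip", "patchwork")

# Optional packages for the extra topics
extra <- c("doParallel", "factoextra")

# Keep only the packages that are not installed yet
to_install <- setdiff(c(required, extra), rownames(installed.packages()))

# Install the missing packages from CRAN (does nothing if all are present)
if (length(to_install) > 0) {
  install.packages(to_install)
}
```

Running library(tidymodels) afterwards without an error is a quick way to check that the installation worked.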

Number of participants

30 

    Language

    The course will be held in English.

    Instructor

    Luigi Maglanoc, PhD

    Contact information

    If you have any questions about the course, send an email to statistikk@usit.uio.no

     

    Links to course material

    • Dataset
    • R-code (to be added)
    Keywords: Machine learning, R, Data science
    Published 28 May 2021 13:30 - Last modified 21 Oct. 2021 13:10