Messages
There are 2 projects (formally take-home exams), each split into 3 parts. Each one takes 2-4 hours and is partly done in a tutorial session.
Each question is weighted equally in each home exam, so that by correctly answering the elementary parts of each question, students can be guaranteed a passing grade. Each exam counts for 40% of the final score. The students also give a 15-minute seminar, which counts for the remaining 20% of the final score.
Criteria for full marks in each part of the exam are the following.
- Documenting the work in a way that enables reproduction.
- Technical correctness of their analysis.
- Demonstrating that they have understood the assumptions underlying their analysis.
- Addressing issues of reproducibility in...
1 | Harald Werner Grannes     | T13:15 |
2 | Ivar Kristoffer Huitfeldt | T12:55 |
3 | Chris Ghai                | W13:00 |
  | Vebjørn Nevland           | W13:00 |
4 | Sindre Grønmyr            | T12:35 |
  | Victor Skeide Undli       | T12:35 |
5 | Aleksander Wang-Hansen    | W13:20 |
6 | Espen H. Kristensen       | W13:40 |
7 | Fredrik Wollert Hansen    | T13:35 |
8 | Lars Henry Berge Olsen    | T12:15 |
  | Oda Johanne Kristensen    | T12:15 |
A YouTube live stream starts here in 10 minutes:
https://www.youtube.com/watch?v=HxcGjf0YFGI
Identify the sensitive variable in the Credit project data and measure the conditional independence of your model with respect to that variable.
You can use ci_test.py as a basis for this analysis.
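If it helps to see the idea before opening ci_test.py, here is a minimal sketch of one way to measure conditional dependence. It is not the contents of ci_test.py. It assumes binary decisions `a`, true outcomes `y` and a sensitive variable `z` (all hypothetical names), and reports the largest gap in P(a=1 | y, z) across values of z for a fixed y. A gap near 0 suggests the decisions are roughly independent of z given y.

```python
import numpy as np

def conditional_dependence_gap(a, y, z):
    """Largest gap in P(a=1 | y, z) across values of z, over all fixed y."""
    a, y, z = (np.asarray(v) for v in (a, y, z))
    worst = 0.0
    for y_val in np.unique(y):
        rates = []
        for z_val in np.unique(z):
            mask = (y == y_val) & (z == z_val)
            if mask.any():  # skip (y, z) cells with no data
                rates.append(a[mask].mean())
        if len(rates) > 1:
            worst = max(worst, max(rates) - min(rates))
    return worst

# Synthetic illustration only (not the Credit data).
rng = np.random.default_rng(0)
n = 5000
z = rng.integers(0, 2, size=n)              # hypothetical sensitive variable
y = rng.integers(0, 2, size=n)              # hypothetical true outcome
a_fair = (rng.random(n) < 0.2 + 0.6 * y).astype(int)              # depends on y only
a_biased = (rng.random(n) < 0.2 + 0.4 * y + 0.3 * z).astype(int)  # also depends on z

print("gap, decisions depending on y only:", conditional_dependence_gap(a_fair, y, z))
print("gap, decisions also depending on z:", conditional_dependence_gap(a_biased, y, z))
```

With finite data the gap is never exactly zero, so compare it against what you get from decisions you know to be independent of z.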
Go through the already implemented analysis of the COMPAS data by ProPublica and understand how the results depend conditionally on race and other factors.
https://github.com/olethrosdc/ml-society-science/blob/master/src/fairness/COMPAS.ipynb
For the project part of the lab:
1. Implement one of two possible differentially private decision
functions for a given model. (This model can be calculated on
non-private data). The choices are:
(a) Laplace mechanism where each new person's data is randomised before
a decision is made.
(b) Exponential mechanism where the utility function is used directly to
make a random decision.
In either case, try to plot the effect of the privacy level on
the utility.
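As a starting point for option (b), here is a minimal sketch of the exponential mechanism. The utilities, the action set (refuse or grant) and the sensitivity value are all hypothetical placeholders; in your project they would come from your own model and utility function. Each action is chosen with probability proportional to exp(epsilon * u / (2 * sensitivity)), where sensitivity bounds how much the utility can change when one person's data changes.

```python
import numpy as np

def exponential_mechanism_probs(utilities, epsilon, sensitivity):
    """Action probabilities proportional to exp(epsilon * u / (2 * sensitivity))."""
    utilities = np.asarray(utilities, dtype=float)
    scores = epsilon * utilities / (2.0 * sensitivity)
    scores = scores - scores.max()  # subtract the max for numerical stability
    weights = np.exp(scores)
    return weights / weights.sum()

def private_decision(utilities, epsilon, sensitivity, rng):
    """Sample one action index from the exponential mechanism."""
    probs = exponential_mechanism_probs(utilities, epsilon, sensitivity)
    return int(rng.choice(len(probs), p=probs))

# Hypothetical expected utilities for one applicant: [refuse, grant].
utilities = np.array([0.0, 0.3])
sensitivity = 1.0  # assumed bound on the utility's sensitivity
rng = np.random.default_rng(0)

for epsilon in [0.01, 0.1, 1.0, 10.0, 100.0]:
    probs = exponential_mechanism_probs(utilities, epsilon, sensitivity)
    expected_utility = float(probs @ utilities)
    action = private_decision(utilities, epsilon, sensitivity, rng)
    print(f"epsilon={epsilon:>6}: expected utility={expected_utility:.3f}, sampled action={action}")
```

Plotting the expected utility against epsilon gives the privacy-utility curve asked for above: small epsilon gives an almost uniform random decision, and large epsilon approaches the non-private best action.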
In tomorrow's session, bring your own laptops for implementing the following tasks.
1. Run salary.py with different parameters (number of people, epsilon)
to see what happens. Compare the results by plotting the error.
2. Extend the ideas to credit.py, for multiple features. Use the upper and lower bounds of each variable when calculating the Laplace noise. Generate data for different values of epsilon and analyse how the accuracy of the classifier changes. Plot the model accuracy as a function of epsilon.
https://github.com/olethrosdc/ml-society-science/tree/master/src/privacy
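To prepare for the two tasks, here is a minimal sketch of Laplace noise in both settings. It is not the code in salary.py or credit.py; the function names and the synthetic salaries are assumptions for illustration. `private_mean` adds noise to an average of n bounded values, whose sensitivity is (upper - lower) / n. `laplace_randomise` adds noise to each record separately and splits epsilon evenly over the k features, so feature j gets scale (upper_j - lower_j) * k / epsilon. It assumes numeric features and bounds chosen without looking at the data.

```python
import numpy as np

def private_mean(x, lower, upper, epsilon, rng):
    """Mean of bounded values plus Laplace noise calibrated to its sensitivity."""
    x = np.clip(np.asarray(x, dtype=float), lower, upper)
    scale = (upper - lower) / (len(x) * epsilon)
    return x.mean() + rng.laplace(0.0, scale)

def laplace_randomise(X, lower, upper, epsilon, rng):
    """Add Laplace noise to every feature of every record (rows of X)."""
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    X = np.clip(np.asarray(X, dtype=float), lower, upper)
    k = X.shape[1]
    scale = (upper - lower) * k / epsilon  # each feature gets budget epsilon / k
    return X + rng.laplace(0.0, scale, size=X.shape)

rng = np.random.default_rng(0)

# Task 1 style: error of the private mean for different n and epsilon.
for n in [10, 100, 1000]:
    salaries = rng.uniform(20_000, 80_000, size=n)  # synthetic data
    for epsilon in [0.1, 1.0, 10.0]:
        estimate = private_mean(salaries, 0.0, 100_000.0, epsilon, rng)
        print(f"n={n:>5}, epsilon={epsilon:>4}: abs error={abs(estimate - salaries.mean()):.1f}")

# Task 2 style: randomise a small two-feature data set.
X = np.array([[25.0, 1000.0], [60.0, 5000.0]])  # hypothetical [age, amount]
print(laplace_randomise(X, lower=[18.0, 0.0], upper=[100.0, 20_000.0], epsilon=1.0, rng=rng))
```

For the accuracy plot, one option is to train the classifier on `laplace_randomise(X, ...)` for each value of epsilon and record the test accuracy.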
Projects can be done together in groups of 2-3 students. Students may also do their projects individually.
Students who want to work on their projects in groups should email the names of their group members to summayam@ifi.uio.no. You will be assigned a group number on Piazza that you will need to join.
Feel free to post your queries on Piazza or send them by email.
You can download the main course book from this link: Notes. The chapters relevant to each lecture are now listed on the UiO course webpage under Schedule.
The additional chapters from other books are not mandatory. However, students are encouraged to read them for a better understanding of the concepts.
You will work on the implementation of the first part of the Credit Risk project. The skeleton code is provided at
Credit Risk Python Skeleton Code
Note: Bring your laptops.