Module 1: Introduction to R and R Markdown (Torkild Hovde Lyngstad)
Syllabus: Wickham, Hadley (2017) R for Data Science. O'Reilly Press. Available online at http://r4ds.had.co.nz
This course module introduces the student to the statistical software system R and to R Markdown, its Markdown-based document format. It puts special emphasis on the role of R, Markdown and related tools in reproducible research. The objective of the module is to provide a foundation for further self-study and practical work, not mastery. The first day is a short introduction to R from the standpoint of a basic-level user of another statistics package (e.g. Stata, SPSS or SAS). The second day is a short introduction to the concept of reproducible research, with examples of using R Markdown and other tools to make one's research more reproducible.
After the module, you should be somewhat familiar with the R system, the integrated development environment RStudio, R Markdown, and the basic ideas of robust and reproducible research using literate programming.
Prerequisites
To follow this module, you only need some familiarity with any statistical software package. Before the course, it is advised that you also:
- Download and install the latest version of RStudio and R on your laptop
- Familiarize yourself with the RStudio interface, for example by using RStudio's online learning materials
- In order to open your mind, read: “Choosing Your Workflow Applications.” by Kieran Healy in The Political Methodologist, 18 (2), 9–18. PDF is here
Further readings and materials
There are several useful books on mastering R that are available for free on the web:
- R for Data Science by Hadley Wickham and Garrett Grolemund is a starter (even if a bit slow).
- Data visualization: A practical introduction by Kieran Healy is a wonderful book you can use to become a good graph designer using R. The whole book is available online: here.
- For more advanced R use, see Wickham, Hadley. 2014. Advanced R. CRC Press. Available here.
When learning R, you are strongly advised to become proficient in formulating questions and searching for answers in the massive amount of material available at various sites on the internet. Seriously, all you need is available.
- http://StackOverflow.com is a major resource for aspiring R users.
- There are also plenty of R course materials and tutorials available on GitHub. For example, you could check out Thomas Leeper's materials here
- http://DataCamp.com has an online R tutorial.
- Check out RStudio's collection of cheatsheets: https://www.rstudio.com/resources/cheatsheets/ The most relevant ones are at the bottom of the page.
Assignment
For credits, you must complete an assignment and hand it in as a running RMarkdown document.
The assignment requires you to
- obtain data according to instructions,
- transform these data into a usable shape,
- analyze the data using statistical models,
- prepare the results from these models
These steps should be done within an R Markdown document. You will only pass the module if the Rmd document can be compiled without error into either HTML or PDF. We will use example data for the assignment.
Module 2: Quantile regressions (Nicolai T. Borgen)
Borah, Bijan J, and Anirban Basu. 2013. "Highlighting differences between conditional and unconditional quantile regression approaches through an application to assess medication adherence." Health economics 22(9):1052-1070.
Borgen, Nicolai T. 2016. "Fixed effects in unconditional quantile regression." Stata Journal 16(2):403-415.
*Budig, Michelle J., and Melissa J. Hodges. 2014. "Statistical Models and Empirical Evidence for Differences in the Motherhood Penalty across the Earnings Distribution." American Sociological Review 79(2):358-364.
*Firpo, Sergio, Nicole Fortin, and Thomas Lemieux. 2007. "Unconditional Quantile Regressions. Technical working paper, 339." National Bureau of Economic Research.
Firpo, Sergio, Nicole M Fortin, and Thomas Lemieux. 2009. "Unconditional quantile regressions." Econometrica 77(3):953-973.
Frölich, Markus, and Blaise Melly. 2008. "Estimation of quantile treatment effects with Stata." Stata Journal 10(3):423-457.
Hao, Lingxin, and Daniel Q. Naiman. 2007. Quantile regression. Sage.
*Killewald, Alexandra, and Jonathan Bearak. 2014. "Is the Motherhood Penalty Larger for Low-Wage Women? A Comment on Quantile Regression." American Sociological Review 79(2):350-357.
Koenker, Roger, and Kevin Hallock. 2001. "Quantile regression: An introduction." Journal of Economic Perspectives 15(4):43-56.
*Porter, Stephen R. 2015. "Quantile regression: analyzing changes in distributions instead of means." Pp. 335-381 In Higher Education: Handbook of Theory and Research. Springer.
Guide to the reading list
The references marked with an asterisk are the most important. You will benefit from having read Porter (2015), Killewald and Bearak (2014), and Budig and Hodges (2014) before the course begins. After reading these texts, you should consider reading Koenker and Hallock (2001) and Firpo et al. (2007). The remaining texts are supplementary.
Introductory texts on conditional and unconditional quantile regressions:
Porter, Stephen R. 2015. "Quantile regression: analyzing changes in distributions instead of means." Pp. 335-381 In Higher Education: Handbook of Theory and Research. Springer.
Killewald, Alexandra, and Jonathan Bearak. 2014. "Is the Motherhood Penalty Larger for Low-Wage Women? A Comment on Quantile Regression." American Sociological Review 79(2):350-357.
Budig, Michelle J., and Melissa J. Hodges. 2014. "Statistical Models and Empirical Evidence for Differences in the Motherhood Penalty across the Earnings Distribution." American Sociological Review 79(2):358-364.
Conditional quantile regressions
Koenker, Roger, and Kevin Hallock. 2001. "Quantile regression: An introduction." Journal of Economic Perspectives 15(4):43-56.
Hao, Lingxin, and Daniel Q. Naiman. 2007. Quantile regression. Sage.
Unconditional quantile regressions
Firpo, Sergio, Nicole Fortin, and Thomas Lemieux. 2007. "Unconditional Quantile Regressions. Technical working paper, 339." National Bureau of Economic Research.
Firpo, Sergio, Nicole M Fortin, and Thomas Lemieux. 2009. "Unconditional quantile regressions." Econometrica 77(3):953-973.
Borah, Bijan J, and Anirban Basu. 2013. "Highlighting differences between conditional and unconditional quantile regression approaches through an application to assess medication adherence." Health economics 22(9):1052-1070.
Estimation in Stata
Frölich, Markus, and Blaise Melly. 2008. "Estimation of quantile treatment effects with Stata." Stata Journal 10(3):423-457.
Borgen, Nicolai T. 2016. "Fixed effects in unconditional quantile regression." Stata Journal 16(2):403-415.
Module 3: Quantitative Text Analysis (Zoltán Fazekas)
Module 4: Introduction to event history analysis (Torkild Hovde Lyngstad)
Course content
This module offers an introduction to event history analysis, a family of statistical methods that are increasingly popular in the social sciences. Event history analysis is a tool used for analyzing the occurrence and timing of events. Typical examples are life course transitions such as the transition to parenthood and partnership formation processes, labor market processes such as job promotions, mortality, and transitions to and from sickness and disability. The researcher may be interested in examining how the rate of a particular event varies over time or with individual characteristics, social conditions, or other factors. Event history analysis lets the researcher handle censoring and truncation, include time-varying independent variables, account for unobserved heterogeneity (frailty), and more.
The course takes an intuitive approach to the subject with an emphasis on practical applications. Formal theory will only be covered to a limited extent. The course will rely on Stata as the main computing tool, but users of other statistical software will still benefit from the course. The course is taught through both lectures and lab sessions.
Learning outcomes
The module will cover a range of topics including the following:
- Event history data structures, collection instruments, coding schemes
- The concepts of censoring and truncation
- Numerical and graphical descriptions of survival data, including life tables and Kaplan-Meier and related estimators
- The Cox proportional hazards model
- Time-varying covariates
- Discrete-time hazard models
- Frailty models (i.e. models incorporating unobserved heterogeneity)
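As an informal illustration of two of the topics listed above, right-censoring and the Kaplan-Meier estimator, the following is a minimal sketch in plain Python. It is not part of the course materials, which rely on Stata; the function name `kaplan_meier` and the example data are hypothetical and chosen only for illustration. At each observed event time, the estimate multiplies the running survival probability by one minus the share of subjects at risk who experience the event. Censored subjects count as at risk up to and including their censoring time, then drop out of the risk set without contributing an event.

```python
from collections import Counter


def kaplan_meier(durations, observed):
    """Return a list of (time, survival probability) pairs, one per event time.

    durations: time until the event or until censoring, one value per subject
    observed:  True if the event occurred, False if the subject was right-censored
    """
    # Number of events at each distinct time (censored exits are not counted here).
    events = Counter(t for t, d in zip(durations, observed) if d)
    # Number of subjects leaving the risk set at each time, for any reason.
    exits = Counter(durations)

    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        if events[t] > 0:
            # Product-limit step: conditional probability of surviving past t.
            survival *= 1 - events[t] / at_risk
            curve.append((t, survival))
        # Everyone who had an event or was censored at t leaves the risk set.
        at_risk -= exits[t]
    return curve


# Hypothetical data: months until a first job promotion for six employees.
# False marks employees who were still unpromoted when observation ended.
durations = [2, 3, 3, 5, 8, 8]
observed = [True, True, False, True, False, True]

for t, s in kaplan_meier(durations, observed):
    print(f"t={t}: S(t)={s:.3f}")
```

Dropping the censored subjects from the data, or treating their censoring times as event times, would bias the estimated survival curve. This is the reason censoring receives separate attention in the module.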
Recommended previous knowledge
The course is for anyone who wants to understand and apply basic event history analysis. It is intended for graduate students in the social and medical sciences. Participants should have a good working knowledge of applied regression analysis and some prior exposure to Stata.
Readings
The main readings are course materials and articles (listed below). However, students would benefit from having a book that covers the basics of event history analysis. There are plenty of such books available. Consult with the teacher if you choose a book not mentioned here. A classic book that covers the most basic event history concepts and models is
- Allison, P. (1984) Event History Analysis. Sage Publications. There is also a later, 2nd edition.
A Stata-oriented book is
- Blossfeld, Golsch, Rohwer (2009) Event History Analysis with Stata. Psychology Press.
For R users, an alternative is:
- Mills, M. (2011) Introduction to Survival and Event History Analysis with R. Sage Publications.
Other readings
This list of papers contains simple (and not-so-simple) examples of how the methods taught in the module are used, as well as some introductory texts that offer alternative approaches to the subject matter.
- Singer, J. and Willett, J. (1993) “It’s About Time: Using Discrete-Time Survival Analysis to Study Duration and the Timing of Events”, J. Educ. & Behav. Stat. 18: 155.
- Skardhamar, Torbjørn, and Kjetil Telle. “Post-release employment and recidivism in Norway.” Journal of Quantitative Criminology 28.4 (2012): 629-649.
- Lappegård T. and Rønsen M., 2005, The multifaceted impact of education on entry into motherhood, European Journal of Population 21: 31–49.
- Western, B. 1995. “A Comparative Study of Working-Class Disorganization: Union Decline in Eighteen Advanced Capitalist Countries” American Sociological Review, Vol. 60, No. 2
Work requirements
To pass the module, students must submit an assignment that includes a simple analysis. The analysis should include both a non-parametric part and a regression-model part.