Practical exercise 9
At the university hospital of the University of Massachusetts, the survival of patients admitted with an acute myocardial infarction (AMI) has been studied for a number of years. One aim of the study has been to investigate whether the survival of AMI patients has improved over time. A number of covariates were measured at hospitalization. In addition to the time of hospitalization (here given in five-year periods), we will in this assignment restrict attention to the two demographic covariates age and sex, one covariate related to the circumstances of the AMI (the amount of "heart enzyme", a measure of the seriousness of the AMI), and one covariate that indicates whether or not the patient has had an earlier AMI. Further, we will restrict attention to the first three years after the AMI, so patients who live longer than 1095 days have been censored at that time.
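The three-year restriction is a form of administrative censoring. The data file already has it applied, so the sketch below only illustrates the rule. The vectors `days_raw` and `status_raw` are hypothetical values made up for this illustration and are not taken from the AMI data.

```r
# Hypothetical follow-up before the three-year restriction (not AMI data)
days_raw   <- c(200, 1095, 1500, 3000, 40)
status_raw <- c(1, 1, 1, 0, 0)      # 1 = death, 0 = censored

cutoff <- 1095                      # three years, in days

# Follow-up stops at the cutoff; anything observed after it is discarded,
# so a death after day 1095 turns into a censoring at day 1095
days   <- pmin(days_raw, cutoff)
status <- ifelse(days_raw > cutoff, 0, status_raw)

data.frame(days_raw, status_raw, days, status)
```

In this toy example the third patient, who died on day 1500, is recorded as censored at day 1095, while the death on day 1095 itself is kept as a death.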
You may read the data into R by the command:
ami=read.table("http://www.uio.no/studier/emner/matnat/math/STK4080/h14/ami.txt",header=TRUE)
The data are organized with one line for each of the 481 patients, and with the following variables in the eight columns:
id: patient number
days: number of days from hospitalization to death or censoring
status: indicator for death (1) or censoring (0)
per: five-year period (1 = 1975-79, 2 = 1980-84, 3 = 1985-89)
age: age at hospitalization (in years)
sex: sex (0 = male, 1 = female)
enzym: amount of "heart enzyme" (measured as "international units" divided by 100)
prev: information on earlier AMIs (0 = no earlier AMI, 1 = at least one earlier AMI).
In this exercise we will use additive regression to study the effect of the covariates.
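For reference, the additive regression model fitted by `aareg` is Aalen's additive hazards model. The notation below is not from the exercise text and is chosen here for illustration: the hazard at time t of a patient with covariate values x_1, ..., x_p is a sum of time-varying regression functions, and B_j(t) denotes the corresponding cumulative regression function.

```latex
\alpha(t \mid x_1, \dots, x_p)
  = \beta_0(t) + \beta_1(t)\,x_1 + \dots + \beta_p(t)\,x_p,
\qquad
B_j(t) = \int_0^t \beta_j(s)\,ds .
```

The plots of a fitted model show estimates of the B_j(t). The slope of a curve at time t indicates the size of the effect of covariate j on the hazard at that time, so a curve that flattens out suggests an effect that fades.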
We will first consider additive regressions for one covariate at a time. To study the effect of the five-year period of hospitalization, we may give the commands:
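library(survival)   # provides Surv() and aareg()
fit.per=aareg(Surv(days,status)~factor(per),data=ami)
par(mfrow=c(1,3))   # one panel each for the baseline and the two period effects
print(fit.per)
plot(fit.per)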
a) Fit the additive model for each of the covariates (one at a time) and discuss the results. It is useful to centre the numeric covariate age (by subtracting its mean). The distribution of heart enzyme is very skewed, so it may be useful to first log-transform this covariate (base 10 logs are a good choice) and then centre the log-transformed values.
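The two transformations suggested in a) could be sketched as follows. The helper names `centre` and `log10_centred`, and the column names `cage` and `clogenz` in the comments, are assumptions made for illustration. The sketch also assumes that all enzyme values are strictly positive, since the logarithm is otherwise not finite.

```r
# Centre a numeric covariate by subtracting its mean
centre <- function(x) x - mean(x)

# Take base 10 logs first, then centre the log-transformed values
log10_centred <- function(x) centre(log10(x))

# Toy values (not from the AMI data) to show the effect of the transform
toy_enzym <- c(1, 10, 100)
toy_clog  <- log10_centred(toy_enzym)
toy_clog

# With the data read in as ami, the helpers would be applied as
# ami$cage    <- centre(ami$age)
# ami$clogenz <- log10_centred(ami$enzym)
```

After centring, the baseline in the fitted model refers to a patient with average age (or average log enzyme) rather than to a covariate value of zero, which makes the baseline plot easier to interpret.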
b) Study the simultaneous effects of all of the covariates using the additive regression model. Reduce the model by removing non-significant covariates (if there are any), so that you end up with a "final model" where all covariates have significant effects. Give an interpretation of the cumulative regression functions for your "final model".
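A possible workflow for b) is sketched below. So that the block runs without the download, it uses a simulated stand-in data frame `sim` with the same column names as `ami`. The simulated hazards, the derived columns `cage` and `clogenz`, and the choice of which covariate to drop are all assumptions for illustration and say nothing about the real data.

```r
library(survival)

# Simulated stand-in data (NOT the AMI data), same column names as ami
set.seed(1)
n <- 300
sim <- data.frame(
  per   = sample(1:3, n, replace = TRUE),
  age   = round(rnorm(n, mean = 65, sd = 10)),
  sex   = rbinom(n, 1, 0.4),
  enzym = rlnorm(n, meanlog = 1.5, sdlog = 1),
  prev  = rbinom(n, 1, 0.3)
)

# Constant additive hazard per day; only age and prev have an effect here
haz     <- 0.0004 + 0.0004 * sim$prev + 0.00002 * pmax(sim$age - 40, 0)
t_death <- ceiling(rexp(n, rate = haz))
sim$days   <- pmin(t_death, 1095)            # censor at three years
sim$status <- as.integer(t_death <= 1095)    # 1 = death, 0 = censored

# Centred age and centred base 10 log of enzyme
sim$cage    <- sim$age - mean(sim$age)
sim$clogenz <- log10(sim$enzym) - mean(log10(sim$enzym))

# Full model with all covariates
fit.all <- aareg(Surv(days, status) ~ factor(per) + cage + sex + clogenz + prev,
                 data = sim)
print(fit.all)

# Refit without one covariate; sex is dropped here only because it has
# no effect in the simulation by construction
fit.red <- aareg(Surv(days, status) ~ factor(per) + cage + clogenz + prev,
                 data = sim)
print(fit.red)

# Cumulative regression functions of the reduced model (baseline + 5 terms)
par(mfrow = c(2, 3))
plot(fit.red)
```

With the real data, `sim` would be replaced by `ami`, and the covariate to remove at each step would be chosen from the tests printed for the current model rather than fixed in advance.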