PRACTICAL EXERCISE 10

 

Based on data from the Danish census in 1970 and official mortality statistics from the following 10 years, suicides among non-manual workers have been ascertained according to sex, age group and job status category. (Source: Andersen, P.K., Borgan, Ø., Gill, R.D. and Keiding, N. (1993). Statistical models based on counting processes. Springer-Verlag, New York.)

The data set contains the following variables:

 

· jobgr: Job status category (1=academics, 2=advanced nonacademic training, 3=extensive practical training, 4=other nonmanual worker).

· sex: Sex (1=male, 2=female).

· agegr: Age group (1=20-24 years, 2=25-29 years, 3=30-34 years, 4=35-39 years, 5=40-44 years, 6=45-49 years, 7=50-54 years, 8=55-59 years, 9=60-64 years).

· suicides: Number of suicides.

· pyears: Person years.

 

 

You may read the data into R by the command:

 

suicides=read.table("http://folk.uio.no/borgan/BGC1-2012/data/suicides.txt", header=TRUE)

 

In this exercise we will compute and plot occurrence/exposure rates by age group, separately for males and females, disregarding job status.

 

a) We start by computing the number of suicides and the number of person years in each age group, summed over the job status categories. For males this can be done by the commands:

suicides.aggr=aggregate(suicides,list(age=suicides$agegr,gender=suicides$sex),sum)

males.suicides=suicides.aggr[suicides.aggr$gender==1,]$suicides

males.pyears=suicides.aggr[suicides.aggr$gender==1,]$pyears
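To see what the aggregate command does, it may help to run it on a small data frame first. The sketch below uses made-up numbers (not the Danish data) and a hypothetical data frame called toy with the same column names as the data set. Note that aggregate sums every column, so the summed jobgr, sex and agegr columns are meaningless; this is why the rows are selected with the grouping variable gender rather than with sex.

```r
# Toy data with made-up numbers (NOT the Danish data):
# 2 job groups x 2 sexes x 2 age groups
toy <- data.frame(
  jobgr    = c(1, 2, 1, 2, 1, 2, 1, 2),
  sex      = c(1, 1, 2, 2, 1, 1, 2, 2),
  agegr    = c(1, 1, 1, 1, 2, 2, 2, 2),
  suicides = c(3, 5, 1, 2, 4, 6, 2, 3),
  pyears   = c(1000, 2000, 1500, 2500, 1200, 2200, 1600, 2600)
)

# Sum every column within each (age, gender) cell, collapsing over job group
toy.aggr <- aggregate(toy, list(age = toy$agegr, gender = toy$sex), sum)

# Rows for males (gender == 1), one row per age group
toy.males <- toy.aggr[toy.aggr$gender == 1, ]
print(toy.males[, c("age", "gender", "suicides", "pyears")])
```

For the toy data, the male rows contain 3+5=8 suicides over 3000 person years in age group 1, and 4+6=10 suicides over 3400 person years in age group 2.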

 

 

We then compute the occurrence/exposure rates and their standard errors.

males.occexp=males.suicides/males.pyears

males.se=males.occexp/sqrt(males.suicides)
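The standard error formula can be motivated as follows. This is a sketch under an assumption not stated in the exercise text: the number of suicides D in an age group is treated as approximately Poisson distributed, with the person years R regarded as fixed. The variance of D is then estimated by D itself, which gives (the symbols D, R and \hat\lambda are introduced here for illustration):

```latex
\hat\lambda = \frac{D}{R},
\qquad
\widehat{\mathrm{se}}(\hat\lambda) = \frac{\sqrt{D}}{R} = \frac{\hat\lambda}{\sqrt{D}}
```

The last expression corresponds to males.occexp/sqrt(males.suicides) in the command above.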

 

 

Perform these commands (make sure that you understand the commands!) and plot the occurrence/exposure rates for males with standard 95% confidence limits. Use the mid-points of the age intervals, i.e. seq(22.5,62.5,5), as plotting positions for the rates. (To avoid very small numbers, it may be convenient to plot the rates per 10000 person years.)
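One possible way to set up such a plot is sketched below. It assumes that "standard" limits means the estimate plus/minus 1.96 standard errors, and it uses made-up counts and person years (the vectors d and r are hypothetical, not the Danish data), so only the structure of the commands carries over.

```r
# Made-up counts and person years for the 9 age groups (NOT the Danish data)
d <- c(12, 20, 25, 30, 34, 40, 38, 30, 22)
r <- c(90000, 95000, 88000, 80000, 75000, 70000, 62000, 50000, 40000)

age.mid <- seq(22.5, 62.5, 5)   # mid-points of the age intervals
rate <- d / r                   # occurrence/exposure rates
se   <- rate / sqrt(d)          # standard errors

# Standard 95% limits: estimate +/- 1.96 standard errors,
# expressed per 10000 person years
per   <- 10000
lower <- per * (rate - 1.96 * se)
upper <- per * (rate + 1.96 * se)

# Rates as points joined by lines, limits as dashed lines
plot(age.mid, per * rate, type = "b", ylim = range(lower, upper),
     xlab = "Age", ylab = "Suicides per 10000 person years")
lines(age.mid, lower, lty = 2)
lines(age.mid, upper, lty = 2)
```

Setting ylim from the limits rather than from the rates keeps the dashed lines inside the plotting region.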

 

b) Also compute the log-transformed confidence intervals and add these to the plot from question a.
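A common form of the log-transformed interval is obtained by computing a standard interval for the logarithm of the rate and transforming back. The sketch below assumes this is the form intended; by the delta method the standard error of the log rate is approximately se/rate, which equals 1/sqrt(d). The counts and person years are again made up (hypothetical vectors d and r).

```r
# Made-up counts and person years for the 9 age groups (NOT the Danish data)
d <- c(12, 20, 25, 30, 34, 40, 38, 30, 22)
r <- c(90000, 95000, 88000, 80000, 75000, 70000, 62000, 50000, 40000)

rate <- d / r
se   <- rate / sqrt(d)
per  <- 10000

# Delta method: se of log(rate) is approximately se / rate (= 1 / sqrt(d)).
# Interval on the log scale, transformed back with exp()
log.lower <- per * rate * exp(-1.96 * se / rate)
log.upper <- per * rate * exp( 1.96 * se / rate)

# Standard limits for comparison
std.lower <- per * (rate - 1.96 * se)
std.upper <- per * (rate + 1.96 * se)

print(round(cbind(std.lower, log.lower, rate = per * rate,
                  log.upper, std.upper), 2))

# With an existing plot open, the limits could be added by e.g.
# lines(seq(22.5, 62.5, 5), log.lower, lty = 3)
# lines(seq(22.5, 62.5, 5), log.upper, lty = 3)
```

Unlike the standard limits, the log-transformed limits are always positive and are not symmetric around the estimate.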

 

c) Repeat the analysis in questions a and b for females.

 

d) Plot the rates for both genders in the same plot (without confidence limits).

 

e) Discuss what you may learn from the plots in questions a-d.