R-help to exercise 3.3 in BSS

 

 

# Read the data into a dataframe, give names to the variables, and inspect the data:

firms<-read.table("http://www.math.uio.no/avdc/kurs/STK4900/data/exer3_3.dat")

names(firms)<-c("months","size","type")

firms

 

# Check that the data correspond to those given in the exercise.
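
# One possible way to do this check, as a sketch: look at the number of rows, the
# storage type of each column, and whether any values are missing. The numbers to
# compare against are those printed in the exercise, not anything stated here.
nrow(firms)
str(firms)
colSums(is.na(firms))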

 

# Attach the dataframe, so that its variables can be referred to directly by name:

attach(firms)
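
# Note that attach() places the columns on the search path, where they can be masked
# by objects with the same names in the workspace. An alternative, sketched here and
# not needed for the rest of this help file, is to point to the dataframe explicitly:
with(firms, mean(months))
lm(months ~ size, data = firms)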

 

 

# Compute summary measures for the variables:

summary(firms)

 

# Make sure that you understand what the summary measures tell you!
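
# summary() treats type as a number if it is stored as one (an assumption; check
# with str(firms)). As a sketch, counts per category and standard deviations of the
# two numeric variables can be added like this:
summary(factor(firms$type))
sapply(firms[c("months", "size")], sd)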

 

 

# Make plots (side by side) of months versus each of the other two variables:

# We make a scatterplot for the numeric covariate size and a box plot for the categorical covariate type:

par(mfrow=c(1,2))

plot(size,months)

boxplot(months~type)

par(mfrow=c(1,1))

 

# What do the plots tell you?
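
# A numerical companion to the plots, as a sketch: the correlation between size and
# months summarises the scatterplot, and the group means of months complement the
# medians shown in the box plot.
cor(firms$size, firms$months)
tapply(firms$months, firms$type, mean)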

 

 

# Do univariate regression analyses of months versus each of the other two variables:

fit1<-lm(months~size)

fit2<-lm(months~type)

summary(fit1)

summary(fit2)
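
# As a supplement to the summaries, 95% confidence intervals for the coefficients
# can be obtained with confint() (a sketch using fit1 and fit2 from above):
confint(fit1)
confint(fit2)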

 

# Which of the two variables, size and type, is more important for explaining the variation in the number of months elapsed?

# Does either variable (alone) have a significant effect?

# In the second of the two regression models, the only covariate is the categorical variable type.

# Could we have estimated/tested its effect using another method? Would that give different conclusions?
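
# One possible answer, as a sketch: with a single covariate that has exactly two
# categories (an assumption about how type is coded; check table(firms$type)), the
# t-test for the slope in lm() coincides with the classical two-sample t-test with
# equal variances. The names tt and p_lm are only illustrative.
tt <- t.test(months ~ type, data = firms, var.equal = TRUE)
tt
# p-value for the type coefficient in the regression (second row, fourth column)
p_lm <- summary(lm(months ~ type, data = firms))$coefficients[2, 4]
# The two p-values should agree up to rounding error
all.equal(tt$p.value, p_lm)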

 

 

 

# Do a regression analysis including both size and type:

fit3<-lm(months~size+type)

summary(fit3)

 

# What does this model tell you? Does it look better than the best of the two models with only one covariate?
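
# A sketch of one way to put the three models side by side, using fit1, fit2 and
# fit3 from above: the adjusted R-squared penalises the extra covariate, and anova()
# gives the F-test for whether type improves on the model with size alone.
# (adjR2 is only an illustrative name.)
adjR2 <- sapply(list(size = fit1, type = fit2, both = fit3),
                function(f) summary(f)$adj.r.squared)
adjR2
anova(fit1, fit3)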

 

 

# Try fitting models with an interaction term and/or a second-order term for size yourself.
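
# A possible starting point, as a sketch (fit4 and fit5 are only illustrative names):
# size*type expands to size + type + size:type, and I(size^2) protects the square
# so that it is read as arithmetic rather than as formula syntax.
fit4 <- lm(months ~ size * type, data = firms)
fit5 <- lm(months ~ size + I(size^2) + type, data = firms)
summary(fit4)
summary(fit5)
# F-tests of each extended model against the model with size and type only
anova(fit3, fit4)
anova(fit3, fit5)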