Q2: In problem 2a, I need some help to figure out the likelihood p(yt|xt).
A2: Note that yt is the observation, and it has a deterministic link to xt. The only information yt gives you about xt is which interval xt lies in, so the likelihood is zero if there is a mismatch and one if there is a match, e.g. L(y=1|x=-1.3)=1 whereas L(y=1|x=1.3)=0, and similarly L(y=3|x=-1.3)=0 and L(y=3|x=1.3)=1.
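As a rough illustration, the indicator likelihood can be sketched in Python. The interval edges -1 and 1 (three intervals labelled 1, 2, 3) and the function names are assumptions made only for this sketch; use the intervals given in the problem text.

```python
import bisect

# ASSUMPTION: hypothetical interval edges, not taken from the problem text.
# With these edges: y=1 if x < -1, y=2 if -1 <= x < 1, y=3 if x >= 1.
EDGES = [-1.0, 1.0]

def observe(x, edges=EDGES):
    """Deterministic map from the state x to its interval label y (1-based)."""
    return bisect.bisect_right(edges, x) + 1

def likelihood(y, x, edges=EDGES):
    """p(y | x): 1 if y labels the interval containing x, otherwise 0."""
    return 1.0 if observe(x, edges) == y else 0.0

# The four cases from the answer above.
for y, x in [(1, -1.3), (1, 1.3), (3, -1.3), (3, 1.3)]:
    print(f"L(y={y}|x={x}) = {likelihood(y, x)}")
```

Under these assumed edges, the four cases in the answer evaluate to 1, 0, 0 and 1 respectively.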
Q1: I have problems with 1b. I do not get how we derive g(x). Why are the transitions between the models at -0.5 and 0.5 when we take the gradient at -1, 0 and 1?
A1: If the function is -0.5x^2, then the derivative is -x. Thus the tangent lines at the three locations are
-1: x+0.5
0 : 0
1 : 0.5-x
Each of these three tangents lies on or above the function -0.5x^2 everywhere, so to get the closest approximation we take the minimum of the three. Neighbouring tangents cross at -0.5 and 0.5, so these are the points where we change from one approximation to the next.
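A small numerical check of this argument, as a sketch: the helper names f, tangent and g below are chosen for illustration, and g is assumed to be the pointwise minimum of the tangents at -1, 0 and 1.

```python
def f(x):
    """The function being approximated."""
    return -0.5 * x**2

def tangent(x0):
    """Tangent line of f at x0, returned as a function of x."""
    slope = -x0  # f'(x0) = -x0
    return lambda x: f(x0) + slope * (x - x0)

# Tangents at the three expansion points.
TANGENTS = [tangent(x0) for x0 in (-1.0, 0.0, 1.0)]

def g(x):
    """Piecewise-linear upper bound: the smallest of the three tangents."""
    return min(t(x) for t in TANGENTS)

# g never falls below f on a grid, and neighbouring tangents agree at +/-0.5.
grid = [i / 100 for i in range(-300, 301)]
print(all(g(x) >= f(x) - 1e-12 for x in grid))
print(TANGENTS[0](-0.5), TANGENTS[1](-0.5))
print(TANGENTS[1](0.5), TANGENTS[2](0.5))
```

Evaluating g on either side of -0.5 and 0.5 shows which tangent is active in each region.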