Messages
Place: "Handy Dandy", 3rd floor of the Department of Mathematics premises at Ullevål Stadion. Date: Friday June 15. We allow about 25 minutes per candidate. Tentative schedule:
09:00 Christoffer Haug Laache
09:35 Martin Tveten
10:15 Jonas Gjesvik
10:45 Jens Kristoffer Haug
11:15 Knut Haneborg Ringnes
A certain detail of my text for Exercise 1(e) didn't come out well and might cause confusion. The main points are in order, by all means -- but since I've said that h(v) is the density for V, not for \sqrt{V}, the density for \theta ought to be expressed as the integral of \phi(\theta / \sqrt{v}) (1/\sqrt{v}) h(v) d v.
Nils
The Exam Project is out, Fri June 1 at 10:01, with reports to be handed in, in duplicate, by Tue June 12 at 11:55 (or earlier). The oral examinations take place Fri June 15, with a list of candidates and timeslots to be given here later on.
Note that the written reports need to contain *Page A* (self-declaration form) and also *Page B*, a one-page self-assessment of your efforts. Here you may comment on your work process, what you found more challenging than other points, perhaps what you consider more interesting than other parts, etc. Good luck.
1. No teaching Tue 22-v, because of the Vårens Vakreste Variabler FocuStat Conference. Have a look through the programme (perhaps there are lurking exam questions there).
http://www.mn.uio.no/math/english/research/projects/focustat/workshops%20and%20conference/vvv18_finalprogramme.pdf
2. So last day of teaching is Tue 29-v, where I'll sum up various matters, and answer questions, etc.
3. Then the Exam Project: will be made public & visible here at the course website, Fri 1-vi, with reports to be handed in, in duplicate, by Tue 12-vi, before 11:58, to the Reception at the Department of Mathematics (still at Ullevål).
4. Importantly, Part Two of the exam, the 25 minute per candidate oral examination, takes place *Friday June 15*, with place and time slots to come. Those candidates who have time constraints, please send me a mail, quickly, about this. If Friday June 15 is too complicated, for one or two of you, we find time slots during Thursd...
There is no teaching on the Slava Trudu Day.
Exercises for May 8: Print out the most recent Version E of the Nils Collection (version 0.74 of 1-May-2018, 47 pages). Do Exercises 30, 51, 52, 53.
The PhD candidates taking the course need to give 30 minute presentations to us, as part of their examinations (these are Faculty rules, not my own invention for the occasion). *Martin Tveten* is on for Tue May 8, and *Christoffer Laache* for Tue May 15. Can each of you send me a title and a brief summary (a few lines)?
Here's Martin's title & summary:
** Cluster analysis with the Dirichlet process mixture
I will present the Dirichlet process mixture (DPM) and explain its close relation to the Chinese Restaurant Process as well as the Polya Urn model. Through an example, we will see that the DPM can be used as a flexible basis for cluster analysis, where one does not have to spe...
1. On Tue April 10 I spent time on two "supplementary dimensions" of the BNP course -- (A), how to use the machinery to demonstrate that certain estimators or tests or decision functions have good performance properties, and (B) how to use the new tools and machines and probabilistic constructions to build new models. I went into minimaxity estimators and survival type models based on Gamma processes.
2. I've extended the Lecture Notes & Exercises Notes, and there is a 42 page "version 0.69" uploaded to the course site Fri 13th evening.
3. Exercises for next week: from this extended exercises set, do #24, on the biggest jumps of Gamma processes, with ensuing models; and then the minimaxity exercises 42, 43, 44, 45, 46. These include genuine nonparametric minimaxity estimators, e.g. for a distribution function.
4. On Tue Apr 17 I also continue with Gaussian nonparametric regression.
1. For a Gamma process Z(t), with parameters (a t, 1), followed over the time interval [0, 1], find the distribution of its biggest jump J. Generalise to the biggest jump J(\tau) when Z(t) is followed over [0, \tau], and also to the case where Z is Gamma with parameters (a M(t), 1). Hint: work out that the cdf \Gamma_\eps(t) for a Gamma(\eps, 1) can be expressed as 1 - \eps E_1(t) + small, where E_1(t) = \int_t^\infty (1/x) \exp(-x) dx is the so-called exponential integral. Study aspects of this distribution for J(\tau).
2. Suppose certain creatures have gamma processes (a M(t), 1) in their rucksacks, and that they live until the biggest jump reaches a threshold c. Find the distribution for their lifetimes T. Create regression models that are consistent with the Cox regression model (and others that are not).
3. Consider iid data x_1, .., x_n from an unknown distribution P on [0, 1], where the problem is estimating the mean \theta(P) = \int_0^1 x dP(x), with s...
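For Exercise 1 above, here is a minimal numerical sketch, written in Python rather than the course's R. It assumes the standard Lévy-measure fact that the number of jumps of a Gamma (a t, 1) process exceeding x over [0, \tau] is Poisson with mean a \tau E_1(x), so that P(J(\tau) \le x) = \exp(-a \tau E_1(x)). The function names and the series evaluation of E_1 are illustrative assumptions, not taken from the course scripts.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def exp_integral_e1(x, terms=200):
    """E_1(x) = int_x^inf exp(-u)/u du, via the series
    -gamma - log(x) - sum_{k>=1} (-x)^k / (k * k!).
    Adequate for moderate x (say x <= 10); not meant for large x."""
    if x <= 0:
        raise ValueError("x must be positive")
    series = 0.0
    power_over_factorial = 1.0  # holds (-x)^k / k!
    for k in range(1, terms + 1):
        power_over_factorial *= -x / k
        series += power_over_factorial / k
    return -EULER_GAMMA - math.log(x) - series


def biggest_jump_cdf(x, a, tau=1.0):
    """P(J(tau) <= x) for a Gamma process with parameters (a t, 1) on [0, tau],
    under the Poisson-count assumption stated in the lead-in."""
    if x <= 0:
        return 0.0
    return math.exp(-a * tau * exp_integral_e1(x))


if __name__ == "__main__":
    # Tabulate the cdf of the biggest jump for a = 2 on [0, 1].
    for x in (0.1, 0.5, 1.0, 2.0):
        print(x, round(biggest_jump_cdf(x, a=2.0), 4))
```

Replacing a \tau by a M(\tau) in `biggest_jump_cdf` would give the corresponding sketch for the (a M(t), 1) case, under the same assumption.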
As I mentioned before Easter, the first Tuesday after Easter is a working day, even at UiO, but by tradition it's not a teaching day.
I'll put up a message very soon about exercises and themes for Tue April 10.
Martin T, thanks for mail -- the other PhDs should also send me a mail, so that we can set up a little time table for the oral presentations.
1. There are apparently 3 PhD and as many as 14 Master students who are planning to take the exam for the STK 9190 / 4190 course. The main part of the examination process is *The Project*. I give a set of exercises on day t_0, currently set to Fri June 1, and the students hand in their written reports on day t_1, currently set to Tue June 12.
2. There will then be *oral examination*, so far stipulated to take place on Thu June 14 and Fri June 15. This might be 20 minute sessions, depending on the actual number of Master students taking the exam.
3. The 3 PhD students are required by the system to prove their worth & maturity by giving *oral presentations*, on topics to be selected and agreed upon. These might be 25 minute presentations in class (i.e. for all of us to listen to), followed by some discussion. These presentations should take place during the first half of May. Can these three PhD students please send me a mail? Martin T, Christoffer Laache...
New version 0.61, as of 19-iii-2018, now 35 pages, uploaded to the website, and with more to come. It might be a total of 45 pages by the end of April.
Tue 20 March: we finish off the Roman Era Egypt story, and continue with Bayesian Kriging and Bayesian Regression. We'll soon analyse the Bjørnholt skiing days time series data, from 1897 to 2012, and where there's a gap from 1938 to 1954. The task will be (a) to interpolate, for the missing data window, with a Gaussian process prior, complete with a 90% pointwise credibility band, and (b) to extrapolate, for the years 2013 to 2028, again with a prediction curve and with a 90% credibility band.
We'll also attempt to set up dates for the Exam Project, in June, plus some extra practicalities for the exam.
I've uploaded an updated 32-page version of the Nils Collection of Exercises and Lecture Notes (12-iii-2018). You should print it out for your convenience.
1. On Tue Mar 6, I went through Bernoulli and (nonhomogeneous) Poisson processes, and we discussed the updating of a Beta process to observed Bernoulli time points. There's an emerging Nils-Emil project here, of volume to be decided upon later, but at least it's clear how to analyse the cumulative intensity process for a given observed Bernoulli process. The extended Gamma might be used in lieu of the Beta process. I also started discussing Bayesian Kriging and Bayesian nonparametric regression.
2. Over the past 36 hours I've finally come round to work with my Brageløfte, and have produced a 23-page preliminary version of "Nils Exercises and Lecture Notes" (so far version 0.31, as of 9-iii-2018), now placed at the course website. Over the coming weeks that Nils Collection will be suitably polished, supplemented, and extended. Comments from the Crowd of Eager Students are warmly welcomed, as always.
3. Exercises for T...
(Owing to a so-called oversight, this message is delayed, and I apologise.)
1. Finalise the Bayesian nonparametric analysis of the Old Egypt life-times, men and women. Invent an interesting parameter or two, say \gamma(A_men, A_women), which in a suitable fashion helps to see the difference between the two populations, and, with your own priors for A_men and A_women, produce the posterior distribution for \gamma.
2. Let A be a Beta process with parameters (c, A_0). On top of A, there is a Bernoulli process Z with A as parameter. This is taken to mean that Z given A has independent increments, with lots of 0 and occasional 1, and with dZ(s), again given A, a Bernoulli variable with probability dA(s). Show that A given Z is another Beta process, and identify its updated parameters.
3. Create an illustration of Exercise 2, in the following fashion. (i) Let A be a Beta process on say [0, 10], with A_0 the integral of \alpha_0(s) = 1 + a s, with some constant a to reflect hi...
1. On Tue Feb 20 we had a good and interesting round on Gamma processes, extended Gamma processes, Levy representations, and Nils' Beta processes. An idea was thrown out that we could make an extended Gamma process generating increments as parameters for a nonhomogeneous Poisson process, and which should be updatable. Emil S has pursued this idea. I also indicated how a prior Beta process (c, A_0) is updated to a posterior Beta process (c_n, \hat A_n), after observing a survival dataset (t_1,\delta_1), ..., (t_n,\delta_n).
2. I've uploaded (a) the R script com6a and (b) the dataset egypt_data to the course website. com6a does prior and posterior Beta process simulation; egypt_data has lifelengths for 82 men and 59 women, from the first century B.C.
3. Exercises:
(i) First, run the com6a programme, playing with its ingredients, including the prior parameters (c, A_0) of the Beta process.
(ii) Then, use variations of this programme to analyse the Egy...
For Tue Feb 20, we attempt to finish what was not completed for the exercises last week. In addition:
Look at the Beta process construction via a "fine grid": Let A_m(t) be the sum of independent B_i \sim Beta(c a_0(i/m) (1/m), c (1 - a_0(i/m)(1/m))), over all i/m \le t. Work out the mean and variance of A_m(t), and take the limit as m goes to infinity. Then work with the Laplace transform E \exp(-\theta A_m(t)), and find an expression for its limit.
Then *apply* the Beta process, with A the cumulative hazard for survival data. Let A be a Beta process with parameters (c, A_0), with A_0(t) = a_0 t, i.e. an exponential. Simulate paths from A(t). Play with a_0 and c.
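The fine-grid construction and the path simulation above can be sketched as follows, in Python rather than the course's R and only as an approximation on a grid of width 1/m, for the constant prior hazard a_0. The function name, its arguments, and the default grid size are illustrative assumptions and are not taken from com6a.

```python
import random


def simulate_beta_process_path(c, a0, t_max=1.0, m=200, rng=random):
    """One fine-grid path A_m(i/m), i = 1, ..., m * t_max, with A_0(t) = a0 * t.

    Increments are independent Beta(c * a0 / m, c * (1 - a0 / m)),
    which requires 0 < a0 / m < 1. Returns the list of cumulative values.
    """
    p = a0 / m  # prior mean of each increment
    if not 0.0 < p < 1.0:
        raise ValueError("need 0 < a0/m < 1; increase m")
    path, total = [], 0.0
    for _ in range(int(round(m * t_max))):
        total += rng.betavariate(c * p, c * (1.0 - p))
        path.append(total)
    return path


if __name__ == "__main__":
    rng = random.Random(2018)
    # Play with c: small c gives wild paths, large c hugs A_0(t) = a0 * t.
    for c in (0.5, 5.0, 50.0):
        ends = [simulate_beta_process_path(c, a0=1.0, rng=rng)[-1]
                for _ in range(500)]
        print("c =", c, "average A(1) =", round(sum(ends) / len(ends), 3))
```

Each increment has mean a_0/m, so the average of A_m(1) over many paths should sit close to a_0, whatever the value of c; the spread around it shrinks as c grows.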
Then take n data t_1, ..., t_n from some given distribution on (0,\infty). Update the A process, following the result that A given data is another Beta process, with parameters (c + Y, \hat A), with Y(s) = number of individuals at risk at time s, and \h...
1. On Feb 6 I went through more Dirichlet Process matters, including the Sethuraman stick-breaking representation, and how to simulate from a simple hierarchical setup where parameters \theta_i stem from Dir(a P_0) and observations are seen as y_i from f(y_i | x_i).
2. Check my R scripts com3a, com4a, com5a.
3. I also did a bit of Bayesian nonparametrics for the quantile function Q(y) = Finverse(y) = the minimum x for which F(x) \ge y. See also the Hjort-Petrone paper about such things which is now at the course webpage.
4. I'll rather soon tex up some Nils Exercises and Lecture Notes.
5. Exercises for Tue Feb 13:
(i) Use the stick-breaking representation to put an infinite sum representation for P(A), with A a fixed set. Show that E P(A) = p_0 = P_0(A), and find the 2nd and 3rd central moments, E (P(A) - p_0)^2 and E (P(A) - p_0)^3. Check that these are as they should be, from the Beta distribution for P(A).
(ii) With n = 100 data poi...
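As a numerical companion to Exercise (i), the stick-breaking representation can be sketched as below, in Python rather than the course's R. The sketch truncates the infinite sum at a finite number of atoms, takes P_0 to be the uniform on [0, 1], and uses A = [0, upper]; these choices, and all function names, are illustrative assumptions rather than anything from the com scripts.

```python
import random


def stick_breaking_weights(a, n_atoms, rng=random):
    """Truncated Sethuraman weights w_j = V_j * prod_{l<j} (1 - V_l), V_j ~ Beta(1, a)."""
    weights, remaining = [], 1.0
    for _ in range(n_atoms):
        v = rng.betavariate(1.0, a)
        weights.append(remaining * v)
        remaining *= 1.0 - v
    return weights


def random_probability_of_interval(a, upper, n_atoms=100, rng=random):
    """One draw of P([0, upper]) for P ~ Dir(a P_0), with P_0 uniform on [0, 1].

    Uses the truncated sum of w_j over atoms xi_j <= upper, xi_j iid from P_0.
    """
    weights = stick_breaking_weights(a, n_atoms, rng)
    atoms = [rng.random() for _ in weights]
    return sum(w for w, xi in zip(weights, atoms) if xi <= upper)


if __name__ == "__main__":
    rng = random.Random(13)
    a, upper = 5.0, 0.3
    draws = [random_probability_of_interval(a, upper, rng=rng) for _ in range(2000)]
    mean = sum(draws) / len(draws)
    var = sum((d - mean) ** 2 for d in draws) / len(draws)
    # Compare with p_0 = 0.3 and the Beta variance p_0 (1 - p_0) / (a + 1).
    print("mean", round(mean, 3), "variance", round(var, 4))
```

The simulated mean and variance can be held up against p_0 and p_0 (1 - p_0)/(a + 1), the values implied by P(A) being Beta(a p_0, a (1 - p_0)).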
1. On Jan 30 I went through some basic properties of the Dirichlet process, including existence, big support, the posterior, the marginal distribution of a new sample point, etc. We also discussed the war-and-peace dataset, cf. Nils's FocuStat Blog Post. Use my "com2a" to look more into this, with Dirichlet process priors for the two cumulatives for the battle deaths before and after Vietnam.
2. Next week, still in pre-Olympian modus, we discuss aspects of Ch 2, including density estimation and clustering models.
3. Soon I'll TeX up some Nils Exercises and Lecture Notes (but not this week).
4. Information regarding curriculum and exam project: next week.
5. Exercises for next week are as follows.
(i) Do some more analyses for the war data, including inventing and examining your own interest parameter \delta(F_L, F_R), and with a reasonable setup for the two priors.
(ii) Let P be a Dir(a P_0) on [0,1], with P_0 the uniform, and let \the...
1. On Jan 23 I went through more of the "gentle introduction to Bayesian Nonparametrics" material, and did also discuss how to simulate from the Dirichlet process, which is one of the basic tools of the course. See my "com1a" R script. Next week there'll be more about the Dirichlet process and its properties, before and after data.
2. Exercises for Jan 30 are as follows.
(i) Go to the krigogfred dataset (on the webpage), comprising (x, z) for 95 horrible wars, from 1823 to 2003, with x = time of onset and z = the number of battle deaths. This is the set of all wars with z \ge 1000. Now form the subset of 51 wars where z \ge 7061, where the power law tail behaviour is meant to hold. This means that v_i = \log z_i - \log 7061 are seen as Expo(\theta). Do a simple ML analysis to see the size of \theta, and also a logLprofile to detect the Cunen-Hjort Vietnam Hypothesis of Peace-and-War statistics (read the FocuStat blog, which Pinker liked so much, etc.). Di...
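The simple ML step of this exercise can be sketched as follows, in Python rather than the course's R. With v_i seen as Expo(\theta), the log-likelihood is n \log\theta - \theta \sum v_i, maximised at \hat\theta = n / \sum v_i, with approximate standard error \hat\theta / \sqrt{n}. The function name is an illustrative assumption, and the numbers in the demo are synthetic, NOT the krigogfred data.

```python
import math
import random

THRESHOLD = 7061  # tail threshold given in the exercise


def expo_mle(battle_deaths, threshold=THRESHOLD):
    """ML estimate of theta for v_i = log z_i - log threshold ~ Expo(theta).

    Only wars with z >= threshold are used. Returns (theta_hat, approx_se, n).
    """
    v = [math.log(z) - math.log(threshold) for z in battle_deaths if z >= threshold]
    n, total = len(v), sum(v)
    theta_hat = n / total  # maximiser of n log(theta) - theta * sum(v)
    approx_se = theta_hat / math.sqrt(n)  # from the observed information n / theta^2
    return theta_hat, approx_se, n


if __name__ == "__main__":
    # SYNTHETIC stand-in data: 51 values with a power-law tail, true theta = 0.6.
    rng = random.Random(1823)
    fake_deaths = [THRESHOLD * math.exp(rng.expovariate(0.6)) for _ in range(51)]
    theta_hat, se, n = expo_mle(fake_deaths)
    print("n =", n, "theta_hat =", round(theta_hat, 3), "se =", round(se, 3))
```

Running the same function separately on the wars before and after a candidate change-point, and summing the two maximised log-likelihoods, is one way to start on the profile over the change-point asked for in the exercise.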
1. We started the course Tue Jan 16, where I gave a broad introduction, partly via the "A gentle introduction to Bayesian Nonparametrics" pdf, and which I'll continue using next week. I defined the first Real Thing of the course, the Dirichlet process, with P \sim Dir(a P_0), which when relevant or practical is written F \sim Dir(a F_0). I dared to mention my Fame Parameter, which had a Pinker-caused peak Mon Jan 15, and which via its World of Wars theme will also lead to a couple of exercises in this course, reasonably soon.
https://www.mn.uio.no/math/english/research/projects/focustat/the-focustat-blog!/krigogfred.html
2. On the course website there are now two documents which you should read through: the "Gentle introduction" (a talk I gave in Feb 2017) and the intro chapter to the Hjort, Holmes, Müller, Walker book (2010).
3. Pretty soon we start on Ch 2 of the course book Müller, Quintana, Jara, Hanson (2015), but we'll need some more time on "general...
This is the first time we've given a course on Bayesian Nonparametrics, and I look forward to the experience. The intended level is PhD plus upper level Master, and it's assumed that students know the basics of "usual" parametric Bayesian statistics, with a bit of computational experience etc.
Lectures are held Tuesdays 9:15 to 12:00 in Undervisningsrom 107, the Abel Building, and we start out Tuesday January 16. The typical structure for these three times forty-five minutes will be two parts lecture plus one part exercises. Exercises might partly be developed during the course, and will also involve analysis of real datasets.
There are several books on Bayesian Nonparametrics, including Hjort, Holmes, Müller, Walker (Cambridge, 2010), whose introductory chapter will be part of the curriculum. The main course material will however be from Müller, Quintana, Jara, Hanson (Springer, 2015), "Bayesian Nonparametric Data Analysis". Please get hold of your own copy,...