Book by book
Manning and Schütze: Foundations of Statistical Natural Language Processing (FSNLP):
- Ch 1
- Ch 2:
- Sec 2.1 except 2.1.10,
- the essentials from Sec 2.2 up to and including 2.2.3
- Ch 3 is considered known background and should be studied by those who lack this background
- Ch 4
- Ch 5, except 5.3.4
- Ch 6:
- Introduction
- Sec 6.1
- Sec 6.2 up to (but not including) Sec. 6.2.4
- Ch 7, except 7.3-7.4
- Ch 8:
- Sec 8.1
- Sec 8.5
- Ch 14:
- Sec 14.2: Introduction + 14.2.1
- Ch 15:
- Sec 15.1-15.2
- Ch 16:
- Introduction (up to but not including Sec 16.1)
- Sec 16.2
- Sec 16.4
Nivre’s web course: Statistical Natural Language Processing (NW)
- Lect. 1-4
Bird, Klein and Loper: Natural Language Processing with Python (NLTK)
- Ch 1: Sec 1.1-1.3, 1.5
- Ch 2: Sec 2.1-2.2
- Ch 3: Sec 3.1-3.2
- Ch 6: Everything except Sec 6.4
Manning, Raghavan and Schütze: Introduction to Information Retrieval (IIR):
- Ch 13, except 13.2.1
- Ch 14:
- Introduction
- Sec 14.1-14.3
Jurafsky and Martin: Speech and Language Processing (J&M)
- Ch 6: Sec. 6.6-6.8 (except 6.6.4)
- Ch 20: Sec 20.7
By subject
Basics: “Working with texts”
- FSNLP: Ch 1, Ch 4
- NLTK: Ch 1, 2.1, 3.1-3.2
- Slides from lecture 22 Aug
Probability theory
- FSNLP: Sec 2.1 except 2.1.10
- Nivre’s web course: Lect. 1-3
Main concepts of Entropy
- FSNLP 2.2-2.2.3 (We do not expect all details here, but you should know formulas 2.26 and 2.36 from FSNLP and have some idea of why entropy is an essential concept.)
Statistics and inference
- FSNLP 5.1-5.3.3
- Nivre’s web course, lect. 4
- Slides from lecture 12 Sept.
- It could be useful to consult other sources as well
Collocations
- FSNLP: Ch. 5, except 5.3.4
Methodology, evaluation, smoothing
- FSNLP:
- Ch 6 up to (but not including) Sec. 6.2.4
- Sec 8.1
- Ch 16: Introduction (up to 16.1)
Naïve Bayes classification and word sense disambiguation
- FSNLP Ch. 7, except 7.3-7.4
- Manning, Raghavan, Schütze, IIR, Ch. 13
- NLTK 6.1-6.3, 6.5
Vector space semantics and IR
- FSNLP Sec 8.1, 8.5, 15.1-2
- J&M, Sec. 20.7
- Slides from lecture 14 Nov.
Vector space classification: Rocchio and k nearest neighbors
- FSNLP 16.4
- IIR 14-14.3
- Slides from lecture 14 Nov.
Vector space flat clustering: k means
- FSNLP 14.2: intro+14.2.1
- Slides from lecture 14 Nov.
Linear classifiers, logistic regression, maximum entropy classifiers and tagging
- FSNLP, Sec. 16.2
- Jurafsky & Martin, Sec. 6.6-6.8 (except 6.6.4)
- Ratnaparkhi 1996
- NLTK, sec. 6.6
- Slides from lectures 21 & 28 Nov.