Young Wu's Homepage



Lecture Note

* Slides
Lecture 9: Slides, With Quiz
Lecture 10: Slides, With Quiz
Annotated Lecture 9 Section 1: Slides
Annotated Lecture 10 Section 1: Slides
Annotated Week 5 Section 2: Part I, Part II

* Typos and mistakes
(not a mistake) In the common effect configuration, A -> B <- C, A and C are independent but not conditionally independent given B.
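A small numeric check of this statement, as a sketch only: the toy model below (A and C are independent fair coins, B = A OR C) is an assumption chosen for illustration and is not taken from the slides.

```python
from itertools import product

# Assumed toy model: A and C are independent fair coins, B = A OR C.
p_a = {0: 0.5, 1: 0.5}
p_c = {0: 0.5, 1: 0.5}

def p_b_given(b, a, c):
    """P(B = b | A = a, C = c) for the deterministic OR gate."""
    return 1.0 if b == (a | c) else 0.0

# Joint distribution from the factorization P(A) P(C) P(B | A, C).
joint = {(a, b, c): p_a[a] * p_c[c] * p_b_given(b, a, c)
         for a, b, c in product((0, 1), repeat=3)}

def prob(**fixed):
    """Sum the joint over all entries that match the fixed values."""
    return sum(p for (a, b, c), p in joint.items()
               if all({"a": a, "b": b, "c": c}[k] == v for k, v in fixed.items()))

# Marginally, P(A=1, C=1) equals P(A=1) P(C=1).
marginal_joint = prob(a=1, c=1)
marginal_product = prob(a=1) * prob(c=1)

# Given B=1, P(A=1, C=1 | B=1) differs from P(A=1 | B=1) P(C=1 | B=1).
p_b1 = prob(b=1)
cond_joint = prob(a=1, b=1, c=1) / p_b1
cond_product = (prob(a=1, b=1) / p_b1) * (prob(c=1, b=1) / p_b1)

print(marginal_joint, marginal_product)  # both 0.25
print(cond_joint, cond_product)          # about 0.333 vs about 0.444
```

Observing the common effect B makes A and C dependent: once B = 1 is known, learning A = 1 makes C = 1 less necessary as an explanation.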

* Websites
Markov Chain: Link
Matrix Calculator: Link
Zipf's Law: Link
Google N-Gram: Link
Simple Bayes Net: Link, Link 2
Pathfinder: Link
Minimum Spanning Tree: Link

* YouTube Videos
How to find maximum likelihood estimates for Bernoulli distribution? Link
How to generate realizations of discrete random variables using CDF inversion? Link
Example: How to compute the joint probability given the conditional probability table? Link
Example (Quiz): How to compute conditional probability table given training data? Link
Example (Quiz): How to do inference (find joint and conditional probability) given conditional probability table? Link
Example (Quiz): How to find the conditional probabilities for a common cause configuration? Link
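For the CDF inversion question above, a minimal sketch may help. The function name sample_by_cdf_inversion and its optional u argument are hypothetical, introduced only so the example can be checked with fixed inputs; they do not come from the videos.

```python
import random
from bisect import bisect_right
from itertools import accumulate

def sample_by_cdf_inversion(values, probs, u=None):
    """Return the first value whose cumulative probability exceeds u."""
    if u is None:
        u = random.random()  # uniform draw on [0, 1)
    cdf = list(accumulate(probs))
    # bisect_right returns the first index i with cdf[i] > u.
    i = bisect_right(cdf, u)
    # Guard against round-off leaving the last CDF entry slightly below 1.
    return values[min(i, len(values) - 1)]

values = ["a", "b", "c"]
probs = [0.2, 0.5, 0.3]  # CDF is 0.2, 0.7, 1.0
picks = [sample_by_cdf_inversion(values, probs, u) for u in (0.1, 0.5, 0.95)]
print(picks)  # ['a', 'b', 'c']
```

The same idea applies to the programming problem: a row of a transition matrix plays the role of probs, and the 27 characters play the role of values.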

Written (Math) Problems

Submit on Canvas: PDF

Programming Problem

* Short Instruction
(1) Download the script of a movie from IMSDB or any other website. You can pick your favorite movie or use the one assigned according to your ID.
(2) Create a bigram and a trigram table for characters (26 letters + space). Ignore case and punctuation. All punctuation can be converted to spaces.
(3) Generate 10 sentences (strings) of length 100 starting from each letter using the trigram model.
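The three steps above can be sketched roughly as follows. This is one possible approach, not the TAs' solution. Assumptions: every non-letter becomes a space and runs of spaces are collapsed, add-one smoothing is applied when sampling, the second character comes from the bigram table, and a short inline string stands in for the downloaded script. All function names are hypothetical.

```python
import random
import re
from collections import Counter

ALPHABET = "abcdefghijklmnopqrstuvwxyz "  # 26 letters, space last

def clean(text):
    """Lowercase, turn every non-letter into a space, collapse runs of spaces."""
    text = re.sub(r"[^a-z]", " ", text.lower())
    return re.sub(r" +", " ", text).strip()

def ngram_counts(text, n):
    """Count every run of n consecutive characters."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))

def next_char(context, counts, rng):
    """Sample the next character with weight count(context + c) + 1."""
    weights = [counts[context + c] + 1 for c in ALPHABET]  # add-one smoothing
    return rng.choices(ALPHABET, weights=weights)[0]

def generate(start, bigrams, trigrams, length=100, rng=random):
    """Grow a string from one letter: bigram for the 2nd char, trigram after."""
    out = start + next_char(start, bigrams, rng)
    while len(out) < length:
        out += next_char(out[-2:], trigrams, rng)
    return out

# Stand-in text; in practice this would be the contents of the script file.
script = clean("To be, or not to be: that is the question.")
bigrams, trigrams = ngram_counts(script, 2), ngram_counts(script, 3)

# 10 strings for each of the 26 starting letters, 260 lines in total.
lines = [generate(c, bigrams, trigrams) for c in ALPHABET[:26] for _ in range(10)]
print(len(lines), lines[0])
```

With a real script the counts dominate the add-one term, so the generated strings begin to resemble English spelling patterns; with the tiny stand-in text they are close to uniform noise.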

* Files to submit
(1) output.txt contains 260 lines. Each line is a generated string: 10 starting with "a", 10 starting with "b", and so on through "z".
(2) bigram.txt contains the 27 x 27 bigram transition matrix (after smoothing). The order of the states should be "a", "b", ..., "y", "z", " " (space at the end).
(3) comments.txt contains instructions for running your program; in particular, it must give the names of the data files.
(4) code.

* Things to try
(1) (Not required) Try Laplace smoothing and smoothing with other weights between 0 and 1.
(2) Repeat the process and find interesting sentences with actual words.
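For the smoothed 27 x 27 bigram matrix and the smoothing weights mentioned above, here is a hedged sketch. The parameter name delta, the comma-separated output, and the 4-decimal precision are assumptions; the assignment does not specify a file format. The input text is assumed to be already lowercased and restricted to letters and spaces.

```python
from collections import Counter

ALPHABET = "abcdefghijklmnopqrstuvwxyz "  # row and column order, space last

def bigram_matrix(text, delta=1.0):
    """Return 27 rows; row i holds P(next = ALPHABET[j] | current = ALPHABET[i]).

    delta = 1 is Laplace (add-one) smoothing; any delta > 0 works.
    """
    pairs = Counter(zip(text, text[1:]))  # counts of adjacent character pairs
    matrix = []
    for a in ALPHABET:
        row = [pairs[(a, b)] + delta for b in ALPHABET]
        total = sum(row)
        matrix.append([x / total for x in row])
    return matrix

def format_matrix(matrix, digits=4):
    """One matrix row per line, entries separated by commas (assumed format)."""
    return "\n".join(",".join(f"{p:.{digits}f}" for p in row) for row in matrix)

m = bigram_matrix("ab ab", delta=1.0)
print(round(m[0][1], 4))  # P(b | a) = (2 + 1) / (2 + 27)
print(format_matrix(m).splitlines()[0][:20])
```

A smaller delta keeps the matrix closer to the raw counts, while a larger one pulls every row toward the uniform distribution; rows for characters that never appear are exactly uniform.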

More (nonessential) details and hints: PDF.

* TAs' Solution
(1) Java: Link written by Ainur
(2) Python: Link written by Dandi
Important note: You are not allowed to copy any code from the solution. MOSS will be used to check for code similarity: changing only variable names, spacing, and the like is still considered cheating. You can read the solution and learn what it is doing, but you MUST write all code yourself. The deadline for resubmission without the 50 percent penalty is July 14.





Last Updated: July 01, 2019 at 7:07 PM

  2010 layout design created by Francis Poulin