0. Backup to volume
a. Volume can be found in `/vol_c`
b. directories to back up
- `raw_data`, `data`, `mfcc`, `exp`
c. command to use
- `cp -r [dir_to_copy] [path_to_volume]`
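The backup in item 0 can also be scripted so that all four directories are copied in one go. The following is only a sketch: the function name `back_up` and the idea of a single `source_root` holding all four directories are assumptions for illustration, not part of the course setup.

```python
import shutil
from pathlib import Path

# Directory names are taken from item 0b above.
DIRS_TO_BACK_UP = ["raw_data", "data", "mfcc", "exp"]


def back_up(source_root, volume, dirs=DIRS_TO_BACK_UP):
    """Copy each listed directory found under source_root into volume.

    Returns the destination paths that were written. Directories that
    do not exist under source_root are skipped.
    """
    source_root, volume = Path(source_root), Path(volume)
    copied = []
    for name in dirs:
        src = source_root / name
        if not src.is_dir():
            continue  # nothing to back up for this name
        dst = volume / name
        # dirs_exist_ok=True (Python 3.8+) allows re-running the backup
        # over a destination that already exists.
        shutil.copytree(src, dst, dirs_exist_ok=True)
        copied.append(dst)
    return copied
```

Assuming the four directories sit in the current working directory, `back_up(".", "/vol_c")` would mirror running `cp -r` once per directory.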
1. General Questions/Comments
2. Big picture contribution of LM
3. Brief summary of language modeling process: from corpus to LM
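As a rough illustration of item 3, the path from corpus to LM can be reduced to two steps: count n-grams, then turn counts into conditional probabilities. The sketch below builds an unsmoothed bigram model in plain Python. The function names and the toy corpus are invented for illustration, and IRSTLM does considerably more than this (smoothing, backoff weights, ARPA output).

```python
from collections import Counter


def ngram_counts(sentences, n):
    """Count n-grams, padding each sentence with <s> and </s> markers."""
    counts = Counter()
    for sentence in sentences:
        tokens = ["<s>"] * (n - 1) + sentence.split() + ["</s>"]
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return counts


def train_bigram_mle(sentences):
    """Return {(prev, word): P(word | prev)} using maximum likelihood."""
    bigrams = ngram_counts(sentences, 2)
    history = Counter()
    for (prev, _word), count in bigrams.items():
        history[prev] += count
    return {bg: count / history[bg[0]] for bg, count in bigrams.items()}


# Toy corpus, invented for this example.
corpus = ["the cat ate the mouse", "the lion ate the cat"]
lm = train_bigram_mle(corpus)
```

With this toy corpus, `lm[("the", "cat")]` is 0.5, while a bigram such as `("the", "dog")` is simply absent from the model, which is the gap that smoothing addresses.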
4. Smoothing
a. Why we need it
b. Which approach to use
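To make the "why we need it" point concrete, here is a minimal sketch comparing an unsmoothed estimate with add-one (Laplace) smoothing. Add-one is used here only because it is the simplest method to show; it is not necessarily the approach recommended in class, and IRSTLM offers other methods such as Witten-Bell and Kneser-Ney variants. The counts and vocabulary size below are invented.

```python
from collections import Counter


def bigram_prob(bigram_counts, history_counts, vocab_size, prev, word, k=0.0):
    """P(word | prev) with add-k smoothing; k=0 is the unsmoothed MLE."""
    numerator = bigram_counts[(prev, word)] + k
    denominator = history_counts[prev] + k * vocab_size
    return numerator / denominator if denominator else 0.0


# Invented toy counts over a 5-word vocabulary: the, cat, mouse, lion, ate.
bigram_counts = Counter({
    ("ate", "the"): 2,
    ("the", "cat"): 2,
    ("the", "mouse"): 1,
    ("the", "lion"): 1,
})
history_counts = Counter({"ate": 2, "the": 4})
VOCAB_SIZE = 5

unseen_mle = bigram_prob(bigram_counts, history_counts, VOCAB_SIZE, "the", "ate")
unseen_add1 = bigram_prob(bigram_counts, history_counts, VOCAB_SIZE, "the", "ate", k=1.0)
seen_mle = bigram_prob(bigram_counts, history_counts, VOCAB_SIZE, "the", "cat")
seen_add1 = bigram_prob(bigram_counts, history_counts, VOCAB_SIZE, "the", "cat", k=1.0)
```

Without smoothing, the unseen bigram gets probability 0, which would zero out any sequence containing it. With add-one it gets 1/9, paid for by lowering the seen bigram from 1/2 to 1/3.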
5. IRSTLM
a. manual: http://hermes.fbk.eu/people/bertoldi/te … manual.pdf
b. already compiled; binaries and code are in `/scratch/kaldi/tools/irstlm/bin` (must be viewed from *inside* `docker`!)
c. how to use (see notebook 2.1)
6. Next week's HW
a. submission
- `File -> Download As -> HTML`
b. copy template
7. Generating the probability of a sequence (see notebook 2.2)
a. Default situation (len(sequence) <= size(n-gram) and n-gram in LM)
b. Two special situations
- n-gram not in LM
- sequence is longer than the LM's n-gram size
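The default case and the two special situations in item 7 can be sketched with a toy backoff model. This assumes ARPA-style entries (a log10 probability plus a log10 backoff weight per n-gram). The table `TOY_LM`, its numbers, and the function names are all invented for illustration and do not reproduce the values in notebook 2.2. Sentence boundary markers are left out to keep the example short.

```python
# Toy ARPA-style table: n-gram tuple -> (log10 probability, log10 backoff weight).
# All numbers are invented.
TOY_LM = {
    ("<unk>",): (-2.0, 0.0),
    ("ate",): (-1.0, -0.3),
    ("the",): (-0.7, -0.4),
    ("lion",): (-1.2, -0.2),
    ("mouse",): (-1.5, -0.2),
    ("ate", "the"): (-0.2, -0.5),
    ("the", "mouse"): (-0.6, 0.0),
}


def log10_prob(lm, history, word):
    """log10 P(word | history), backing off when the n-gram is missing."""
    ngram = tuple(history) + (word,)
    if ngram in lm:
        return lm[ngram][0]  # default case: n-gram is in the LM
    if not history:
        return lm[("<unk>",)][0]  # unknown word at the unigram level
    # Special case 1: n-gram not in LM. Add the backoff weight of the
    # history (0.0 if the history itself is missing) and retry with a
    # history shortened by its oldest word.
    backoff = lm.get(tuple(history), (0.0, 0.0))[1]
    return backoff + log10_prob(lm, history[1:], word)


def sequence_log10_prob(lm, words, order):
    """Special case 2: sequence longer than the n-gram size.

    Apply the chain rule, conditioning each word on at most the
    previous (order - 1) words.
    """
    total = 0.0
    for i, word in enumerate(words):
        history = tuple(words[max(0, i - (order - 1)):i])
        total += log10_prob(lm, history, word)
    return total


mouse_score = sequence_log10_prob(TOY_LM, ["ate", "the", "mouse"], order=2)
lion_score = sequence_log10_prob(TOY_LM, ["ate", "the", "lion"], order=2)
```

Here `("the", "mouse")` is found directly, while `("the", "lion")` is missing, so its score is the backoff weight of `("the",)` plus the unigram score of `lion`.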
8. "ate the lion" v. "ate the mouse" problem
a. Why did it happen?
9. Impact of LM n-gram size
10. Impact of LM size
a. space
b. speed
c. alternative options
- pruning (IRSTLM manual, section 5)
- hyperparameter = ???
- rescoring
11. Intuitions about ARPA-style LMs
12. n-gram LM v. RNN
13. What to expect next week
a. Week 3 items
1. `kaldi_config.json` usage
2. Building the `data` directory
b. Week 2 HW
1. identifying a case study
2. reviewing others' case studies
Week 2 Agenda
2018-01-25 20:24:57