0. Skim the paper describing how the dataset was built: https://www.danielpovey.com/files/2015_ … speech.pdf
    1. Table 1
    2. 2.2-2.3 Alignment
    3. 2.4 Data Segmentation
    4. Table 2
1. General Questions/Concerns
2. LibriSpeech Dataset
    a. LibriVox v. Project Gutenberg
        - https://librivox.org/
    b. See the paper describing how the dataset was built: https://www.danielpovey.com/files/2015_ … speech.pdf
        - See 2.4 Data Segmentation
        - See Table 1
        - See Table 2
        - See 2.2-2.3 Alignment
2. Review of "Data Files"
    a. splits
        - train: `dev-clean`
        - test: `test-clean`
        - language model: `3-gram.pruned.3e-7.arpa`
    b. audio
        - 8 kHz v. 16 kHz v. FLAC
            - ~20% reduction in performance at 8 kHz compared with 16 kHz (https://www.superlectures.com/odyssey20 … on-systems)
            - listen to some samples
        - male v. female
            - listen to some samples
    c. segmented v. unsegmented
        - split on "pause"
            - pause = silence lasting more than X seconds
            - silence = no signal above Y dB
    d. phones
        - see http://www.speech.cs.cmu.edu/cgi-bin/cmudict
        - silence phones
    e. lexicon
        - see http://www.speech.cs.cmu.edu/cgi-bin/cmudict
        - find stressed v. unstressed examples
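
The pause rule from item c (segmented v. unsegmented) can be made concrete with a short sketch. This is only an illustration under stated assumptions, not how the LibriSpeech authors segmented their data: the function names, the 10 ms frame size, and the use of RMS level in dB relative to full scale are all choices made here, and X (`min_pause_s`) and Y (`silence_db`) stay as parameters.

```python
import math

def frame_db(frame):
    """RMS level of one frame in dB relative to full scale (samples in [-1, 1])."""
    rms = math.sqrt(sum(s * s for s in frame) / len(frame))
    return 20 * math.log10(max(rms, 1e-10))  # floor avoids log10(0)

def split_on_pauses(samples, rate, min_pause_s, silence_db, frame_s=0.01):
    """Return (start, end) sample indices of the stretches between pauses.

    silence = a frame whose level is not above `silence_db` (Y)
    pause   = a run of silent frames lasting more than `min_pause_s` (X)
    """
    frame_len = max(1, int(rate * frame_s))
    n_frames = len(samples) // frame_len
    segments, seg_start, silent_run = [], None, 0
    for i in range(n_frames):
        frame = samples[i * frame_len:(i + 1) * frame_len]
        if frame_db(frame) > silence_db:          # non-silent frame
            if seg_start is None:
                seg_start = i
            silent_run = 0
        elif seg_start is not None:               # silent frame inside a segment
            silent_run += 1
            if silent_run * frame_len / rate > min_pause_s:
                end = i - silent_run + 1          # frame after the last non-silent one
                segments.append((seg_start * frame_len, end * frame_len))
                seg_start, silent_run = None, 0
    if seg_start is not None:                     # close the final segment
        end = n_frames - silent_run
        segments.append((seg_start * frame_len, end * frame_len))
    return segments
```

Short silences (below X) stay inside a segment; only longer ones cause a split. Trying a few values of X and Y on the samples you listened to is a quick way to see how sensitive the segmentation is.
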
3. Out-of-vocabulary (OOV)
    - see http://www.speech.cs.cmu.edu/tools/lextool.html
    - see https://github.com/sequitur-g2p/sequitur-g2p
        - (`tmux session=sequitur` on desktop for demo)
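
To tie the lexicon and OOV items together, here is a minimal sketch of reading a CMUdict-style lexicon, looking at stress markers, and listing the words in a transcript that the lexicon does not cover. The sample entries are illustrative (written in the CMUdict style, not copied from the real file), and the function names are made up for this example. In CMUdict, vowels carry a digit: 0 = no stress, 1 = primary stress, 2 = secondary stress.

```python
import re

# Illustrative entries in CMUdict style: "WORD  PHONE PHONE ..."
SAMPLE_LEXICON = """\
HELLO  HH AH0 L OW1
WORLD  W ER1 L D
RECORD  R EH1 K ER0 D
RECORD(1)  R IH0 K AO1 R D
"""

def parse_lexicon(text):
    """Map each word to a list of pronunciations (each a list of phones)."""
    lexicon = {}
    for line in text.splitlines():
        if not line.strip() or line.startswith(";;;"):  # skip blanks and comments
            continue
        word, *phones = line.split()
        word = re.sub(r"\(\d+\)$", "", word)  # RECORD(1) is a variant of RECORD
        lexicon.setdefault(word, []).append(phones)
    return lexicon

def strip_stress(phones):
    """Drop the stress digit from each vowel, e.g. AH0 -> AH."""
    return [p.rstrip("012") for p in phones]

def find_oov(transcript, lexicon):
    """Return the sorted set of transcript words missing from the lexicon."""
    words = transcript.upper().split()
    return sorted({w for w in words if w not in lexicon})

lexicon = parse_lexicon(SAMPLE_LEXICON)
oov_words = find_oov("hello brave new world", lexicon)
```

With these sample entries, `oov_words` holds BRAVE and NEW. Words like these are the ones a grapheme-to-phoneme tool (lextool or Sequitur above) would need to supply pronunciations for.
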
4. What to expect next week
    - resources, see schedule: https://docs.google.com/document/d/1pXt … mWpKbIeyqc
    - 2.1
        - in shell using IRSTLM (manual in resource_files)
        - using a toy corpus
            - "real" language model will be built in Week 3
    - 2.2
        - in Python
        - "exploratory" with "case study"
    - 2.HW
        - due before Week 3 class
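
As a preview of the toy-corpus language model in 2.1, the sketch below counts n-grams and computes unsmoothed (maximum-likelihood) probabilities in plain Python. It is not what IRSTLM does internally: real toolkits also apply smoothing and write ARPA files. The corpus, the function names, and the choice to pad each sentence with n-1 `<s>` symbols are assumptions made for this example.

```python
from collections import Counter

def ngram_counts(sentences, n):
    """Count n-grams, padding each sentence with <s> and </s> markers."""
    counts = Counter()
    for sentence in sentences:
        tokens = ["<s>"] * (n - 1) + sentence.lower().split() + ["</s>"]
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return counts

def mle_probability(ngram, counts):
    """P(last word | history) = count(history + word) / count(history + anything)."""
    history = ngram[:-1]
    history_total = sum(c for g, c in counts.items() if g[:-1] == history)
    return counts[ngram] / history_total if history_total else 0.0

toy_corpus = ["the cat sat", "the cat ran", "the dog sat"]
bigrams = ngram_counts(toy_corpus, 2)
trigrams = ngram_counts(toy_corpus, 3)

p_cat_given_the = mle_probability(("the", "cat"), bigrams)             # 2 of 3
p_sat_given_the_cat = mle_probability(("the", "cat", "sat"), trigrams)  # 1 of 2
```

Any n-gram that never appears in the toy corpus gets probability 0 here, which is exactly the problem smoothing addresses.
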

Last edited by Michael Capizzi (2018-01-17 01:20:44)