HMM-based continuous speech recognition and the HTK toolkit
1. Introduction to the classification of speech recognition systems
Speaker: speaker-dependent, speaker-independent
Vocabulary: small, medium, large
Speaking style: isolated words, connected words, continuous speech
Language: Chinese, English, French ...
We build a speaker-independent continuous speech recognition system for Chinese,
also called a Chinese speech dictation machine.
Hidden Markov models (HMM)
A hidden Markov model is a Markov chain whose states cannot be observed directly. The states can only be inferred from a sequence of observation vectors: each observation vector is emitted by a state according to that state's probability density distribution. An HMM is therefore a doubly stochastic process: a hidden Markov chain with a finite number of states, together with a set of observable random functions.
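The two layers of randomness can be made concrete with a small sampler. This is only an illustrative sketch: the two-state chain, the scalar Gaussian emissions, and every name in it (`sample_hmm`, `trans`, `means`, `stds`) are assumptions made for the example, not part of HTK or of the system described here.

```python
import random

def sample_hmm(trans, means, stds, length, seed=0):
    """Draw a hidden state path and an observation sequence from a toy HMM.

    trans[i][j] is P(next state = j | current state = i); state i emits a
    scalar observation from a Gaussian with mean means[i] and std stds[i].
    """
    rng = random.Random(seed)
    state = 0  # assumption: the chain always starts in state 0
    states, observations = [], []
    for _ in range(length):
        states.append(state)
        # Observable layer: the emitted value depends on the hidden state.
        observations.append(rng.gauss(means[state], stds[state]))
        # Hidden layer: the Markov chain moves to its next state.
        state = rng.choices(range(len(trans)), weights=trans[state])[0]
    return states, observations

trans = [[0.6, 0.4],
         [0.3, 0.7]]
states, obs = sample_hmm(trans, means=[0.0, 5.0], stds=[1.0, 1.0], length=10)
print(states)  # hidden path: a recognizer never sees this
print(obs)     # observations: the only thing a recognizer gets
```

A recognizer works in the opposite direction: given only `obs`, it searches for the most likely hidden path.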
~o <VecSize> 39 <MFCC_0_D_A>
~h "Proto"
<BeginHMM>
<NumStates> 5
<State> 2
<Mean> 39
0.0 ... 0.0
<Variance> 39
1.0 ... 1.0
<State> 3
<Mean> 39
0.0 ... 0.0
<Variance> 39
1.0 ... 1.0
......
<TransP> 5
0.0 1.0 0.0 0.0 0.0
0.0 0.6 0.4 0.0 0.0
0.0 0.0 0.6 0.4 0.0
0.0 0.0 0.0 0.7 0.3
0.0 0.0 0.0 0.0 0.0
<EndHMM>
An example HMM definition
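A quick sanity check on the transition matrix can catch typing mistakes before training. The sketch below copies the five rows of the `<TransP>` block and verifies that every row except the exit state's sums to 1; it also reports the mean number of frames each emitting state is expected to hold, using the standard geometric-duration result 1 / (1 - a_ii). The helper names `check_transp` and `expected_durations` are made up for this example.

```python
# Transition matrix copied from the <TransP> block of the prototype.
TRANSP = [
    [0.0, 1.0, 0.0, 0.0, 0.0],
    [0.0, 0.6, 0.4, 0.0, 0.0],
    [0.0, 0.0, 0.6, 0.4, 0.0],
    [0.0, 0.0, 0.0, 0.7, 0.3],
    [0.0, 0.0, 0.0, 0.0, 0.0],
]

def check_transp(matrix, tol=1e-6):
    """Rows of all states but the last must sum to 1; the exit row is all zeros."""
    for i, row in enumerate(matrix[:-1]):
        if abs(sum(row) - 1.0) > tol:
            raise ValueError(f"row {i + 1} sums to {sum(row)}, expected 1.0")
    if any(matrix[-1]):
        raise ValueError("the exit state must have no outgoing transitions")

def expected_durations(matrix):
    """Mean frames spent in each emitting state (1-based index): 1 / (1 - a_ii)."""
    return {i + 1: 1.0 / (1.0 - matrix[i][i]) for i in range(1, len(matrix) - 1)}

check_transp(TRANSP)
print(expected_durations(TRANSP))
```

States 1 and 5 are the non-emitting entry and exit states, which is why they are excluded from the duration estimate.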
The HTK Toolkit includes:
Data preparation tools
HDMan, HCopy, HLEd, HSGen, HBuild, HLStats, HParse
Model training and optimization tools
HERest, HInit, HRest, HHEd, HCompV
Recognition tool
HVite
Performance evaluation tools
HResults, HRec
2. Building a continuous speech recognition system
Data preparation
Grammar definition
$word = a | ai | an | ang | ao | ba | bai | ban | bang |
... | silence;
( SENT-START < $word > SENT-END )
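Because the alternatives list is long, it is easier to generate the grammar than to type it. The sketch below builds a grammar of the same shape from a list of units; in HParse notation the angle brackets mean one or more repetitions. The function name `build_grammar`, its parameters, and the short unit list are illustrative assumptions; the full inventory is elided in the text and is not reconstructed here.

```python
def build_grammar(units, start="SENT-START", end="SENT-END"):
    """Return an HParse-style grammar accepting one or more units per sentence."""
    alternatives = " | ".join(units)
    return (
        f"$word = {alternatives};\n"
        f"( {start} < $word > {end} )\n"
    )

# Assumption: a tiny illustrative subset of the syllable inventory.
units = ["a", "ai", "an", "ang", "ao", "ba", "bai", "ban", "bang", "silence"]
grammar = build_grammar(units)
print(grammar)
```

In the usual HTK workflow a grammar file like this is then compiled into a word network with HParse.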
Building an acoustic model
b, p, m, f, d, t, n, l, x, zh, ch, sh, z, c, ...
Finally, the models are upgraded to context-dependent acoustic models:
z+uo, z-uo, h+ao, h-ao, n+a, n-a, sh+i, sh-i, l+i, sh-ang, y+ou, y-ou, d+e, ...
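Read in HTK's usual l-p+r context notation, each syllable in this list expands into two units: the initial with its right context (z+uo) and the final with its left context (z-uo). A minimal sketch of that expansion follows. The inventory in `INITIALS` (including y and w, which the list treats as initials, as in y+ou) and the handling of zero-initial syllables are assumptions for illustration and may differ from the unit set actually used.

```python
# Assumption: illustrative initial inventory; two-letter initials come first
# so that "zh", "ch", "sh" are matched before "z", "c", "s".
INITIALS = ["zh", "ch", "sh", "b", "p", "m", "f", "d", "t", "n", "l",
            "g", "k", "h", "j", "q", "x", "r", "z", "c", "s", "y", "w"]

def split_syllable(syllable):
    """Split a toned pinyin syllable such as 'zuo4' into (initial, final)."""
    base = syllable.rstrip("0123456789").lower()
    for initial in INITIALS:
        if base.startswith(initial) and len(base) > len(initial):
            return initial, base[len(initial):]
    return "", base  # zero-initial syllable such as 'er'

def to_context_units(syllable):
    """Expand one syllable into initial+final and initial-final labels."""
    initial, final = split_syllable(syllable)
    if not initial:
        return [final]  # assumption: zero-initial syllables stay as one unit
    return [f"{initial}+{final}", f"{initial}-{final}"]

for syl in "zuo4 pin3 yi1 hao4".split():
    print(syl, to_context_units(syl))
```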
Corpus
Sentence 010001: Work No. 1
Sentence spell: zuo4 pin3 yi1 hao4
Sentence 010002: It is a kind of tree that strives upward.
Sentence spell: na4 shi4 li4 zheng1 shang4 you2 de0 yi1 zhong3 shu4
Sentence 010003: A straight trunk
Sentence spell: bi3 zhi2 de0 gan4
Sentence 010004: straight branches
Sentence spell: bi3 zhi2 de0 zhi1
Sentence 010005: As for its trunk
Sentence spell: ta1 de0 gan4 ne0
Sentence 010006: it is usually about one zhang tall
Sentence spell: tong1 chang2 shi4 zhang4 ba3 gao1
Sentence 010007: as if shaped by hand
Sentence spell: xiang4 shi4 jia1 yi3 ren2 gong1 shi4 de0
Sentence 010008: within one zhang
Sentence spell: yi1 zhang4 yi3 nei4
Sentence 010009: with absolutely no side branches
Sentence spell: jve2 wu2 pang2 zhi1
Sentence 010010: As for all its branches
Sentence spell: ta1 suo3 you3 de0 ya1 zhi1 ne0
Sentence 010011: they all point upward
Sentence spell: yi1 lv4 xiang4 shang4
Sentence 010012: and press closely together
Sentence spell: er2 qie2 jin3 jin3 kao4 long3
Sentence 010013: also as if shaped by hand.
Sentence spell: ye3 xiang4 shi4 jia1 yi3 ren2 gong1 shi4 de0
Our own recordings plus corpora exchanged online: about 3 GB of speech data in total
Feature extraction
MFCC features, extracted with the HCopy tool
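The 39-dimensional MFCC_0_D_A vectors used in the HMM definition are normally 13 static coefficients (12 MFCCs plus C0) extended with their deltas and accelerations. The sketch below shows that arithmetic with the regression formula documented for HTK deltas, assuming a window of 2 frames and repetition of the first and last frame at the edges. It is a plain-Python illustration with made-up function names, not a replacement for HCopy, and the input frames are hypothetical.

```python
def deltas(frames, window=2):
    """Regression deltas over a list of equal-length feature vectors.

    d_t = sum_{k=1..window} k * (c_{t+k} - c_{t-k}) / (2 * sum_k k^2),
    with the first and last frames repeated at the edges (assumption).
    """
    n = len(frames)
    denom = 2 * sum(k * k for k in range(1, window + 1))
    out = []
    for t in range(n):
        row = []
        for d in range(len(frames[0])):
            num = 0.0
            for k in range(1, window + 1):
                ahead = frames[min(t + k, n - 1)][d]
                behind = frames[max(t - k, 0)][d]
                num += k * (ahead - behind)
            row.append(num / denom)
        out.append(row)
    return out

def add_dynamic_features(static):
    """Append deltas (_D) and accelerations (_A) to each static vector."""
    d = deltas(static)
    a = deltas(d)
    return [s + dv + av for s, dv, av in zip(static, d, a)]

# Hypothetical input: 5 frames of 13 static coefficients (12 MFCCs + C0).
static = [[float(t)] * 13 for t in range(5)]
features = add_dynamic_features(static)
print(len(features[0]))  # 13 static + 13 delta + 13 acceleration = 39
```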
Data training
Build a hidden Markov model for each modeling unit
+
Context-independent training
+
Context-dependent training
+
Training with an increased number of Gaussian mixture components
Recognition rate
------------ Overall Results ------------
WORD: %Corr=85.71, Acc=79.15
=========================================
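The two figures follow the HResults definitions: %Corr = H / N x 100 and Acc = (H - I) / N x 100, where N is the number of reference words, H the number of correctly recognized words, and I the number of insertions. A minimal sketch of that arithmetic is given here; the counts are hypothetical and chosen only to illustrate the formulas, since the underlying counts for this system are not given.

```python
def hresults_scores(n, deletions, substitutions, insertions):
    """Return (%Corr, Acc) from the reference length N and the error counts."""
    hits = n - deletions - substitutions
    corr = 100.0 * hits / n
    acc = 100.0 * (hits - insertions) / n
    return corr, acc

# Hypothetical counts, not taken from the experiment above.
corr, acc = hresults_scores(n=1000, deletions=40, substitutions=100, insertions=60)
print(f"%Corr={corr:.2f}, Acc={acc:.2f}")
```

By the same definitions, the 6.56-point gap between %Corr and Acc in the reported result is the number of insertions expressed as a percentage of the reference words.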