Phonetic modeling for large-vocabulary continuous speech recognition
When adding a new pronunciation, only the dict and gram files need to be modified.
Converting recognition-result times in HTK
13600000 16320000 hao -1452.207031
Divide the numbers directly by 10 to the 7th power: the pronunciation "hao" runs from 1.36 s to 1.632 s, i.e. HTK outputs times in units of 100 ns (1e-7 s).
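A minimal sketch of the conversion in Python (the line format is start time, end time, label, log likelihood; the function name is my own):

HTK_UNITS_PER_SEC = 10_000_000  # HTK times are multiples of 100 ns (1e-7 s)

def parse_htk_label_line(line):
    # e.g. "13600000 16320000 hao -1452.207031"
    start, end, label, score = line.split()
    return int(start) / HTK_UNITS_PER_SEC, int(end) / HTK_UNITS_PER_SEC, label, float(score)

print(parse_htk_label_line("13600000 16320000 hao -1452.207031"))
# (1.36, 1.632, 'hao', -1452.207031)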
HTK "bad data or over pruning" warnings
Found a relevant e-mail thread.
Question:
Hi
Omer Moav wrote:
Processing Data: cmu_us_arctic_awb_a0015.cmp; Label cmu_us_arctic_awb_a0015.lab
Unable to traverse states in 1 frames
WARNING [-7324] StepBack: File
/home/omergil/downloads/hts-demo_cmu-arctic-awb/cmp/cmu_us_arctic_awb_a0015.cmp
- bad data or over pruning
in /home/omergil/downloads/htk/bin.linux/HERest
Answer:
The above warning means that cmu_us_arctic_awb_a0015.cmp is corrupted.
I recommend re-running it.
Using the warning message I located the corresponding file. In my case the recorded clip had been cut too short, containing only about half of the pronunciation, so this warning appeared during training and that utterance could not be recognized.
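One way to catch such files before training is to read the 12-byte header of each HTK parameter (.cmp) file and check the frame count; a clip that is too short to traverse all the HMM states produces exactly this warning. A rough sketch, assuming the default big-endian HTK parameter-file format (the function name is my own):

import struct

def htk_frames(path):
    # HTK header: nSamples (int32), sampPeriod in 100 ns (int32),
    # sampSize in bytes (int16), parmKind (int16), big-endian by default.
    with open(path, "rb") as f:
        n_samples, samp_period, samp_size, parm_kind = struct.unpack(">iihh", f.read(12))
    return n_samples, n_samples * samp_period / 1e7  # frame count, duration in seconds

frames, seconds = htk_frames("cmu_us_arctic_awb_a0015.cmp")
print(frames, seconds)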
Triphone model state tying
... (the reason the sp silence model is not included in the triphone expansion)
WB sp
WB sil
TC
sp and sil act as word-boundary markers, i.e. they handle the segmentation
sil th ih s sp m ae n sp ...
HLEd -n ./lists/triphones1 -l '*' -i ./labels/wintri.mlf mktri.led ./labels/aligned.mlf
the monophone labels then become
sil th+ih th-ih+s ih-s sp m+ae m-ae+n ae-n sp ...
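A rough Python sketch of what the TC expansion does when sp and sil are declared as word boundaries with WB (illustrative only, not the HLEd implementation):

def to_word_internal_triphones(phones, boundaries=("sp", "sil")):
    # sp/sil are word-boundary symbols: they are not expanded and are not
    # used as left or right context, so the triphones stay word-internal.
    out = []
    for i, p in enumerate(phones):
        if p in boundaries:
            out.append(p)
            continue
        left = phones[i - 1] if i > 0 and phones[i - 1] not in boundaries else None
        right = phones[i + 1] if i + 1 < len(phones) and phones[i + 1] not in boundaries else None
        name = p
        if left:
            name = left + "-" + name
        if right:
            name = name + "+" + right
        out.append(name)
    return out

print(" ".join(to_word_internal_triphones(
    ["sil", "th", "ih", "s", "sp", "m", "ae", "n", "sp"])))
# sil th+ih th-ih+s ih-s sp m+ae m-ae+n ae-n sp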
mktri.hed is generated by perl ./scripts/maketrihed ./lists/monophones1 ./lists/triphones1; its contents are explained below:
CL ./lists/triphones1
TI T_b {(*-b+*,b+*,*-b).transP}
TI T_p {(*-p+*,p+*,*-p).transP}
...
CL means clone: clone each monophone into all of its triphone contexts.
TI means tie: tie together the transition matrices of all triphones sharing the same base phone.
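A rough sketch of what the maketrihed script writes into mktri.hed (illustrative only; the real tool is the Perl script shipped with the HTK tutorial):

def write_mktri_hed(monophones_file, triphones_file, out_file="mktri.hed"):
    with open(monophones_file) as f:
        monophones = [line.strip() for line in f if line.strip()]
    with open(out_file, "w") as f:
        f.write("CL " + triphones_file + "\n")  # clone monophones into triphones
        for p in monophones:
            # tie the transition matrices of every triphone of base phone p
            f.write("TI T_%s {(*-%s+*,%s+*,*-%s).transP}\n" % (p, p, p, p))

write_mktri_hed("./lists/monophones1", "./lists/triphones1")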
HERest -B -C ./config/config2 -I ./labels/wintri.mlf -t 250.0 150.0 1000.0 -s stats -S train.scp -H ./hmms/hmm11/macros -H ./hmms/hmm11/hmmsdef -M ./hmms/hmm12 ./lists/triphones1
-B stores the output HMM files in binary; -s stats generates a state-occupation statistics file (used later when constructing the decision trees).
Constructing Decision Trees
Recognition results: Correct rate vs. Accuracy
1) Word correct rate:
Correct = (N - D - S) / N * 100%
2) Word accuracy:
Accuracy = (N - D - S - I) / N * 100%
(both measures are illustrated in the sketch after these definitions)
N: number of words in the reference (original script) transcription
D: number of words in the reference transcription that are deleted in the recognition result
S: number of words in the reference transcription that are substituted in the recognition result
I: number of words inserted in the recognition result relative to the reference transcription
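A minimal sketch of the two measures in Python (function names are my own; the counts N, D, S, I are the ones reported by HTK's HResults):

def correct_rate(n, d, s):
    # Correct = (N - D - S) / N * 100%
    return (n - d - s) / n * 100.0

def accuracy(n, d, s, i):
    # Accuracy = (N - D - S - I) / N * 100%
    return (n - d - s - i) / n * 100.0

# Example: 100 reference words, 3 deletions, 5 substitutions, 2 insertions.
print(correct_rate(100, 3, 5))   # 92.0
print(accuracy(100, 3, 5, 2))    # 90.0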
In the gram.txt file, change (SENT-START (<$word>|<SENT-START>) SENT-END) to (SENT-START <$word> SENT-END); putting SENT-START among the word alternatives made recognition of the word pronunciations fail.
The grammar that works:
$word = a|ai|an|ang|ao|ba|bai|ban|bang|bao|bei|ben|beng|bi|bian|biao|bie|bin|bing|
bo|bu|ca|cai|can|cang|cao|ce|cen|ceng|cha|chai|chan|chang|chao|che|chen|cheng|chi|
chong|chou|chu|chua|chuai|chuan|chuang|chui|chun|chuo|ci|cong|cou|cu|cuan|cui|cun|
cuo|da|dai|dan|dang|dao|de|den|deng|di|dia|dian|diao|die|ding|diu|dong|dou|du|duan|
dui|dun|duo|e|en|eng|er|fa|fan|fang|fei|fen|feng|fo|fu|ga|gai|gan|gang|gao|ge|gei|
gen|geng|gong|gou|gu|gua|guai|guan|guang|gui|gun|guo|ha|hai|han|hang|hao|he|hei|hen|
heng|hong|hou|hu|hua|huai|huan|huang|hui|hun|huo|ji|jia|jian|jiang|jiao|jie|jin|jing|jiong|jiu|jv|jvan|jve|jvn|ka|kai|kan|kang|kao|ke|ken|keng|kong|kou|ku|kua|kuai|kuan|
kuang|kui|kun|kuo|la|lai|lan|lang|lao|le|lei|leng|li|lia|lian|liang|liao|lie|lin|
ling|liu|lo|long|lou|lu|luan|lve|lun|luo|lv|ma|mai|man|mang|mao|me|mei|men|meng|mi|
mian|miao|mie|min|ming|miu|mo|mou|mu|na|nai|nan|nang|nao|ne|nei|nen|neng|ni|nian|
niang|niao|nie|nin|ning|niu|nong|nu|nve|nv|nuan|nuo|nun|o|ou|pa|pai|pan|pang|pao|
pei|pen|peng|pi|pian|piao|pie|pin|ping|po|pou|pu|qi|qia|qian|qiang|qiao|qie|qin|
qing|qiong|qiu|qv|qvan|qve|qvn|ran|rang|rao|re|ren|reng|ri|rong|rou|ru|rua|ruan|rui|
run|ruo|sa|sai|san|sang|sao|se|sen|seng|sha|shai|shan|shang|shao|she|shen|sheng|shi|
shou|shu|shua|shuai|shuan|shui|shun|shuo|si|song|sou|su|suan|sui|sun|suo|ta|tai|tan|
tang|tao|te|tei|teng|ti|tian|tiao|tie|ting|tong|tou|tu|tuan|tui|tun|tuo|wa|wai|wan|
wang|wei|wen|weng|wo|wu|xi|xia|xian|xiang|xiao|xie|xin|xing|xiong|xiu|xv|xvan|xve|
xvn|ya|yan|yang|yao|ye|yv|yi|yin|ying|yo|yong|you|yvan|yve|yvn|za|zai|zan|zang|zao|
ze|zei|zen|zeng|zha|zhai|zhan|zhang|zhao|zhe|zhen|zheng|zhi|zhong|zhou|zhu|zhua|
zhuai|zhuan|zhuang|zhui|zhun|zhuo|zi|zong|zou|zu|zuan|zui|zun|zuo|fou|shuang|silence;
(SENT-START <$word> SENT-END)
This grammar works.
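A small sketch that writes such a gram.txt from a syllable list (the file names follow the example above; HParse would then compile it into the word network):

def write_gram(syllables, path="gram.txt"):
    with open(path, "w") as f:
        f.write("$word = " + "|".join(syllables) + ";\n")
        f.write("(SENT-START <$word> SENT-END)\n")

write_gram(["a", "ai", "an", "ang", "ao", "hao", "silence"])
# then: HParse gram.txt wdnet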
- Context-dependent modeling is a good way to handle:
  - refined acoustic modeling
  - co-articulation
  - pronunciation variation
  - accent
- Parameter-sharing levels:
  - model level (Model-level)
  - state level (State-level)
  - mixture level (Mixture-level)
  - sharing of various other parameters (e.g. transition matrices, means, variances, mixture weights)
Generation and function of vFloors
HCompV has a number of options specified for it. The -f option causes a variance floor macro (called vFloors) to be generated which is equal to 0.01 times the global variance. This is a vector of values which is used to set a floor on the variances estimated in the subsequent steps.
In other words, it provides floor (initialization) values for the variances estimated in the next stage.
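A rough sketch of the idea in Python (not HCompV itself; random data stands in for the pooled training features):

import numpy as np

def variance_floor(features, factor=0.01):
    # features: (num_frames, feature_dim) array pooled over the training set
    return factor * features.var(axis=0)

def apply_floor(variances, floor):
    # raise any estimated variance that falls below the floor
    return np.maximum(variances, floor)

frames = np.random.randn(1000, 39)   # stand-in for pooled MFCC frames
floor = variance_floor(frames)       # roughly 0.01 per dimension here
print(apply_floor(np.full(39, 1e-6), floor))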
D:/tryputong> HVite -H ./hmms/hmm12/macros -H ./hmms/hmm12/hmmsdef -S test.scp -l '*' -i ./results/recout_step1.mlf -w wdnet -p 0.0 -s 5.0 ./dict/dict2 ./lists/triphones1
ERROR [+8231] GetHCIModel: Cannot find hmm [t-]ei[+???]
FATAL ERROR - Terminating program HVite
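The error means the recognition network needs a triphone of "ei" with left context "t" that is not present in the model list given to HVite. A small sketch to list which contexts of a base phone the model list actually covers (paths taken from the command above; the function name is my own):

def contexts_of(base, hmm_list="./lists/triphones1"):
    with open(hmm_list) as f:
        models = [line.strip() for line in f if line.strip()]
    # match the monophone and the left/right/both-context triphone forms
    return [m for m in models
            if m == base
            or ("-" + base + "+") in m
            or m.startswith(base + "+")
            or m.endswith("-" + base)]

print(contexts_of("ei"))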
1. Regular-expression notation in the grammar: < > means one or more repetitions; [ ] means zero or one (optional).
The Beauty of Mathematics, Part 3 -- Applications of hidden Markov models in language processing
http://www.google.cn/ggblog/googlechinablog/2006/04/blog-post_1583.html