"Sphinx" sphinx4 Study notes

Last Update:2015-12-22 Source: Internet

Author: User


The sphinx-core project is a Java project that ships with two examples. The first is HelloWorld, which contains several functions such as recording and alignment (not yet tested).
The second is HelloNGram, which performs speech recognition. The configuration files available for it are hellongram0.xml, hellongram9.xml, and hellongram1.xml. hellongram1.xml does not use a language model; instead it uses JSGF to define the grammar of the sentences. The rule looks much like a regular expression and restricts recognition to sentences of the following form: (hello) (jim|kate|tom).
hellongram0.xml is an example that uses the N-gram language model; it defines the storage paths of the acoustic model, the language model, and the dictionary.
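
To make the effect of the rule (hello) (jim|kate|tom) concrete, here is a minimal Java sketch that treats it as an ordinary regular expression and checks which sentences it admits. This is only an analogy: it does not use the sphinx4 JSGF parser, and the class and method names (`HelloGrammarSketch`, `accepts`) are hypothetical.

```java
import java.util.regex.Pattern;

public class HelloGrammarSketch {
    // Regex analogue of the rule (hello) (jim|kate|tom).
    // Assumption: this is NOT how sphinx4 parses JSGF; it only lists the same sentences.
    private static final Pattern SENTENCE = Pattern.compile("hello (jim|kate|tom)");

    /** Returns true if the utterance is one of the sentences the rule allows. */
    public static boolean accepts(String utterance) {
        // Normalize case and collapse runs of whitespace before matching.
        String normalized = utterance.trim().toLowerCase().replaceAll("\\s+", " ");
        return SENTENCE.matcher(normalized).matches();
    }

    public static void main(String[] args) {
        for (String s : new String[] {"hello jim", "hello kate", "hello bob"}) {
            System.out.println(s + " -> " + accepts(s));
        }
    }
}
```

Only three sentences pass the check (hello jim, hello kate, hello tom), which is why a grammar-based configuration needs no N-gram language model.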
Recognition begins by loading all available models. In the loading stage, the loader reads the model definition file mdef, then allocates pools of the appropriate size for the means, variances, and transformation matrices respectively. It then creates a pool of senones (distFloor: the lowest value, which appears to be a lower threshold; varianceFloor: the lowest variance).

    variancePool = loadDensityFile(dataLocation + "variances", varianceFloor);
    mixtureWeightsPool = loadMixtureWeights(dataLocation + "mixture_weights" ...
    ... = loadTransitionMatrices(dataLocation + "transition_matrices");
    transformMatrix = loadTransformMatrix(dataLocation + "feature_transform");
    senonePool = createSenonePool(distFloor, varianceFloor);
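
As a rough illustration of what a floor such as varianceFloor does, the following Java sketch clamps every value below the floor up to the floor. It rests on the assumption stated in the notes that the floor is a lower threshold; it is not the loader's actual code, and `FloorSketch` and `applyFloor` are hypothetical names.

```java
public class FloorSketch {
    /**
     * Raises every value below {@code floor} up to {@code floor}, in place.
     * Assumption: this mirrors the "lowest threshold" reading of varianceFloor.
     *
     * @return the number of values that were changed
     */
    public static int applyFloor(float[] values, float floor) {
        int changed = 0;
        for (int i = 0; i < values.length; i++) {
            if (values[i] < floor) {
                values[i] = floor;
                changed++;
            }
        }
        return changed;
    }

    public static void main(String[] args) {
        float[] variances = {0.5f, 0.00001f, 2.0f};
        int changed = applyFloor(variances, 0.0001f);
        System.out.println("values floored: " + changed);
    }
}
```

Flooring the variances keeps very small values from dominating the Gaussian score when a density was trained on too little data.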

Current problem: loading the WSJ model directory that comes with the demo succeeds, but loading the model I trained myself fails with errors. Debugging showed that the two models differ as follows:
```
================ WSG-0. XML model format: ----------------
senone: 4147
numGaussPerSenone: 8
means: 33176 = 4147*8
variances: 33176
streams: 1
================ Male_result (my) model format: ----------------
senone: 186
Gaussianpersenone:
means: 1024x768
variances:
streams: 4
```
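
The arithmetic in the WSJ listing (means = senones × Gaussians per senone) can be turned into a quick sanity check. The Java sketch below encodes only what the listing shows for the continuous case with a single stream; the class and method names are hypothetical, and the rule that streams must equal 1 is an assumption drawn from these notes, not a documented sphinx4 constraint.

```java
public class ModelShapeSketch {
    /** Number of mean vectors expected for a continuous model with one stream. */
    public static int expectedMeans(int senones, int gaussiansPerSenone) {
        return senones * gaussiansPerSenone;
    }

    /**
     * Returns true if the counts match the shape seen in the WSJ model:
     * one stream, and means equal to senones * gaussiansPerSenone.
     */
    public static boolean looksLikeContinuousModel(int senones, int gaussiansPerSenone,
                                                   int means, int streams) {
        return streams == 1 && means == expectedMeans(senones, gaussiansPerSenone);
    }

    public static void main(String[] args) {
        // Values taken from the WSJ model listing above.
        System.out.println(expectedMeans(4147, 8));
        System.out.println(looksLikeContinuousModel(4147, 8, 33176, 1));
    }
}
```

A model reporting 4 streams, like the self-trained one, fails this check regardless of its other counts.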
Analysis of the cause: it may be that, for models loaded by sphinx4, some parameters are fixed, such as the number of streams and the number of Gaussians.

Solution

Modify the Sphinx-config training parameter file, changing semi to cont. Note that PocketSphinx uses the semi format, while sphinx3 uses the cont format, for which the number of streams is 1 and the number of Gaussians is 8. After retraining to obtain a cont model, I loaded it into the sphinx4 environment and compiled. It ran smoothly. Testing my own model with my own voice gave the following results:

Start speaking. Press Ctrl-C to Quit.
resultList.size=1
Bestfinaltoken=0050-6.8291255e06 0.0000000e00-1.0008177e04 lt-wordnode </s> (*sil) p 0.0-10008.177{[Great Wall][</s>]}
</s>-10008.177 0.050 Great Wall 68886.47 0.04 <sil> 0.0 0.00 <s> 0.0 0.00
result=<s> <sil> Great Wall </s>
resultlist.size=1
Bestfinaltoken=0050-6.8291255e06 0.0000000e00-1.0008177e04 lt-wordnode </s> (*sil) p 0.0-10008.177{[Great Wall][</s>]}
</s>-10008.177 0.050 Great Wall 68886.47 0.04 <sil> 0.0 0.00 <s> 0.0 0.0
resultlist.size=2
Bestfinaltoken=0077-7.2286605e06 0.0000000e00-1.0008177e04 lt-wordnode </s> (*sil) p 0.0-10008.177{[Great Wall][</s>]}
</s> 10008.177 0.059 Great Wall 68886.47 0.04 <sil> 0.0 0.00 <s> 0.0 0.01
result=<s> <sil> Great Wall </s>
resultlist.size=2
Bestfinaltoken=0077-7.2286605e06 0.0000000e00-1.0008177e04 lt-wordnode </s> (*sil) p 0.0-10008.177{[Great Wall][</s>]}
best Token=0077-7.2286605e06 0.0000000e00-1.0008177e04 lt-wordnode </s> (*sil) p 0.0-10008.177{[Great Wall][</s>]}
</s> 10008.177 0.059 Great Wall 68886.47 0.04 <sil> 0.0 0.00 <s> 0.0 0.0
You said: [Great Wall]
start speaking. Press Ctrl-C to quit.
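
The result= lines in the output wrap the recognized words in marker tokens such as <s>, <sil>, and </s>. As a small illustration of how the "You said" text relates to them, the Java sketch below strips every token enclosed in angle brackets. It is a hypothetical helper for reading this log, not part of the sphinx4 API.

```java
import java.util.ArrayList;
import java.util.List;

public class ResultLineSketch {
    /**
     * Extracts the spoken words from a log line such as
     * "result=<s> <sil> Great Wall </s>" by dropping tokens in angle brackets.
     * Assumption: every non-word marker has the form <...>.
     */
    public static String hypothesis(String resultLine) {
        String prefix = "result=";
        String body = resultLine.startsWith(prefix)
                ? resultLine.substring(prefix.length())
                : resultLine;
        List<String> words = new ArrayList<>();
        for (String token : body.trim().split("\\s+")) {
            boolean isMarker = token.startsWith("<") && token.endsWith(">");
            if (!isMarker && !token.isEmpty()) {
                words.add(token);
            }
        }
        return String.join(" ", words);
    }

    public static void main(String[] args) {
        System.out.println(hypothesis("result=<s> <sil> Great Wall </s>"));
    }
}
```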

sphinx4 White Paper


This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only.
