"Sphinx" sphinx4 Study notes

Source: Internet
Author: User

    • The Sphinx-core project is a Java project that ships with two examples. The first is HelloWorld, which covers several functions such as recording and alignment (not yet tested).
    • The other is HelloNGram, which performs speech recognition. The configuration files that can be used with it are hellongram0.xml, hellongram9.xml, and hellongram1.xml. hellongram1.xml does not use a language model; instead it uses JSGF to define the grammar of the sentence. The rule reads much like a regular expression and restricts the sentences that can be recognized to the form: (hello) (jim|kate|tom).
    • hellongram0.xml is an example that uses an N-gram language model. It defines the storage paths of the acoustic model, the language model, and the dictionary.
    • Recognition begins by loading all available models, in the following stages. The loader first reads the model definition file mdef, then allocates pools, sized accordingly, for the means, the variances, and the transformation matrices. It then creates the pool of senones (distFloor appears to be the lowest allowed threshold value; varianceFloor is the lowest allowed variance).
      variancePool = loadDensityFile(dataLocation + "variances", varianceFloor);
      mixtureWeightsPool = loadMixtureWeights(dataLocation + "mixture_weights" ...
      ... = loadTransitionMatrices(dataLocation + "transition_matrices");
      transformMatrix = loadTransformMatrix(dataLocation + "feature_transform");
      senonePool = createSenonePool(distFloor, varianceFloor);
    • Current problem: loading the WSJ model directory that comes with the demo succeeds, but loading the model I trained myself fails. Debugging showed that the two models differ as follows:
      ================
      WSG-0.xml model format:
      -------------------------------
      senone: 4147
      Numgausepersenone: 8
      means: 33176 = 4147*8
      variances: 33176
      streams: 1
      ================
      Male_result (my) model format:
      -----------------------------------
      senone: 186
      Gaussianpersenone:
      means: 1024x768
      varians:
      streams: 4

      Analysis: the likely cause is that sphinx4 expects some parameters of a loaded model to be fixed, such as the number of streams and the number of Gaussians per senone.
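The counts in the comparison above can be cross-checked with a little arithmetic. The following is only a minimal sketch, not sphinx4 code: the class and method names are hypothetical, and the rule "number of mean vectors = senones × Gaussians per senone" is taken from the note's own line `33176=4147*8` for the single-stream demo model. Treating `streams == 1` as the shape the loader expects is an assumption based on the working demo model, not a documented sphinx4 requirement.

```java
public class ModelShapeCheck {

    /** Expected number of mean (or variance) vectors for a single-stream model. */
    static int expectedDensities(int senones, int gaussiansPerSenone) {
        return senones * gaussiansPerSenone;
    }

    /** Assumption: a loadable model has exactly one feature stream, like the demo model. */
    static boolean hasDemoLikeStreams(int streams) {
        return streams == 1;
    }

    public static void main(String[] args) {
        // Demo model values from the notes: 4147 senones, 8 Gaussians per senone, 1 stream.
        System.out.println("demo means expected: " + expectedDensities(4147, 8));
        System.out.println("demo streams ok:     " + hasDemoLikeStreams(1));

        // Self-trained model from the notes: 4 streams.
        System.out.println("own model streams ok: " + hasDemoLikeStreams(4));
    }
}
```

A check like this, run on the values printed while debugging, makes it obvious which field of the self-trained model departs from the demo model before any loader code is touched.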

Solution

In the Sphinx training configuration (parameter) file, change the model type from semi to cont. Note the following: PocketSphinx uses the semi (semi-continuous) format, while sphinx3 uses the cont (continuous) format, for which the number of streams is 1 and the number of Gaussians is 8. With this change I obtained a cont model, loaded it into the sphinx4 environment, and compiled: OK! It ran smoothly. Testing my own model with my own voice gave the following results:

Start speaking. Press Ctrl-C to Quit.
resultList.size=1
Bestfinaltoken=0050-6.8291255e06 0.0000000e00-1.0008177e04 lt-wordnode </s> (*sil) p 0.0-10008.177{[Great Wall][</s>]}
</s>-10008.177 0.050 Great Wall 68886.47 0.04 <sil> 0.0 0.00 <s> 0.0 0.00
result=<s> <sil> Great Wall </s>
resultlist.size=1
Bestfinaltoken=0050-6.8291255e06 0.0000000e00-1.0008177e04 lt-wordnode </s> (*sil) p 0.0-10008.177{[Great Wall][</s>]}
</s>-10008.177 0.050 Great Wall 68886.47 0.04 <sil> 0.0 0.00 <s> 0.0 0.0
resultlist.size=2
Bestfinaltoken=0077-7.2286605e06 0.0000000e00-1.0008177e04 lt-wordnode </s> (*sil) p 0.0-10008.177{[Great Wall][</s>]}
</s> 10008.177 0.059 Great Wall 68886.47 0.04 <sil> 0.0 0.00 <s> 0.0 0.01
result=<s> <sil> Great Wall </s>
resultlist.size=2
Bestfinaltoken=0077-7.2286605e06 0.0000000e00-1.0008177e04 lt-wordnode </s> (*sil) p 0.0-10008.177{[Great Wall][</s>]}
best Token=0077-7.2286605e06 0.0000000e00-1.0008177e04 lt-wordnode </s> (*sil) p 0.0-10008.177{[Great Wall][</s>]}
</s> 10008.177 0.059 Great Wall 68886.47 0.04 <sil> 0.0 0.00 <s> 0.0 0.0
You said: [Great Wall]
start speaking. Press Ctrl-C to quit.
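In the output above, the `result=` lines carry the recognized words surrounded by marker tokens such as `<s>`, `<sil>`, and `</s>`. As a rough sketch of how the spoken text could be pulled out of such a line, the helper below drops every token wrapped in angle brackets. The class and method names are hypothetical, and this is plain string handling, not part of the sphinx4 API.

```java
import java.util.ArrayList;
import java.util.List;

public class ResultLine {

    /** Returns the words of a "result=..." line with the <...> marker tokens removed. */
    static String hypothesis(String resultLine) {
        String prefix = "result=";
        String text = resultLine.startsWith(prefix)
                ? resultLine.substring(prefix.length())
                : resultLine;

        List<String> words = new ArrayList<>();
        for (String token : text.trim().split("\\s+")) {
            // Skip empty tokens and markers such as <s>, <sil>, </s>.
            if (token.isEmpty() || (token.startsWith("<") && token.endsWith(">"))) {
                continue;
            }
            words.add(token);
        }
        return String.join(" ", words);
    }

    public static void main(String[] args) {
        // Line copied from the log in these notes.
        System.out.println(hypothesis("result=<s> <sil> Great Wall </s>"));
    }
}
```

For the line copied from the log, this yields `Great Wall`, which matches the final `You said: [Great Wall]` message.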
