Software requirements:
First you need Moses itself (obviously, haha). You also need GIZA++ for word alignment (it is used by train-model.perl) and IRSTLM for building language models.
General steps:
- Prepare the parallel data (sentence alignment is required): tokenise, truecase, and clean the corpus before feeding it to the machine translation system (haha, I could not resist jumping straight into the detailed steps).
- Train your language model (using IRSTLM). This also involves several steps, which are explained in detail later.
- Then train your translation system (this may take an hour or two). Training runs through the following steps:
  (1) Prepare data
  (2) Run GIZA++
  (3) Align words
  (4) Learn lexical translation
  (5) Extract phrases
  (6) Score phrases
  (7) Learn reordering model
  (8) Learn generation model
  (9) Create decoder config file
- Next comes tuning, which takes a few hours.
- Finally, you can run the system. If the decoder is too slow to start up, you can convert the model to a binarised model, which loads faster. You do need to change a few things for this, but it is very simple.
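To make the "cleaning" part of data preparation more concrete, here is a minimal Python sketch of the idea: drop sentence pairs that are empty, too long, or badly mismatched in length. This is only an illustration and not the script that ships with Moses; the function name `clean_parallel` and the thresholds (`min_len`, `max_len`, `max_ratio`) are assumptions chosen for the example.

```python
from typing import Iterable, Iterator, Tuple


def clean_parallel(
    pairs: Iterable[Tuple[str, str]],
    min_len: int = 1,       # assumed lower bound on tokens per sentence
    max_len: int = 80,      # assumed upper bound on tokens per sentence
    max_ratio: float = 9.0  # assumed limit on the source/target length ratio
) -> Iterator[Tuple[str, str]]:
    """Yield only the sentence pairs that pass simple length checks."""
    for src, tgt in pairs:
        # Splitting on whitespace assumes the text is already tokenised.
        src_tokens = src.split()
        tgt_tokens = tgt.split()
        n_src, n_tgt = len(src_tokens), len(tgt_tokens)

        # Drop pairs where either side is empty, too short, or too long.
        if not (min_len <= n_src <= max_len and min_len <= n_tgt <= max_len):
            continue

        # Drop pairs whose lengths differ too much; they are often misaligned.
        shorter = max(1, min(n_src, n_tgt))
        if max(n_src, n_tgt) / shorter > max_ratio:
            continue

        # Re-join the tokens so redundant whitespace is normalised.
        yield " ".join(src_tokens), " ".join(tgt_tokens)


# Hypothetical toy corpus: one good pair, one empty source, one bad ratio.
example_pairs = [
    ("the house is small", "das Haus ist klein"),
    ("", "leer"),
    ("a", "x " * 20),
]
kept = list(clean_parallel(example_pairs))
print(len(kept), "of", len(example_pairs), "pairs kept")
```

Filtering like this matters because very long or mismatched pairs slow down word alignment and tend to produce poor alignments.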
Detailed steps and instructions:
This is a record of the basic process of building a translation system with Moses. Each step is described in detail below, along with the parameters used at that step.