A simple Chinese-to-English translation script implemented in Node.js

Source: Internet
Author: User


I helped a former colleague with a requirement: translating a Chinese-language project into an English one.

As for the implementation, a truly intelligent solution would require Chinese syntax analysis, which is difficult.

So the final approach is to traverse the files, match the Chinese phrases, translate them (manually, or with the help of Google Translate as in the script below), and replace each Chinese phrase with its translation. The result still has to be verified by hand afterwards, because the program logic may depend on Chinese strings in the code.
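The match-and-replace idea can be sketched on a single string before any files are involved. This is only an illustration: the helper names `extractPhrases` and `replacePhrases` and the sample dictionary are assumptions, not part of the original script. Unlike the original, which substitutes an empty string for an untranslated phrase, this sketch leaves untranslated phrases in place.

```javascript
// Matches runs of CJK ideographs, the same character range the article's script uses.
const CHINESE_RUN = /[\u4e00-\u9faf]+/g;

// Collect the distinct Chinese phrases found in a piece of source text.
function extractPhrases(text) {
  return Array.from(new Set(text.match(CHINESE_RUN) || []));
}

// Replace each phrase with its translation; untranslated phrases are kept as-is.
function replacePhrases(text, dictionary) {
  return text.replace(CHINESE_RUN, (phrase) => dictionary[phrase] || phrase);
}

// Hypothetical sample input and dictionary.
const source = 'var title = "用户登录"; alert("登录失败");';
const dictionary = { '用户登录': 'User login', '登录失败': 'Login failed' };

console.log(extractPhrases(source));
console.log(replacePhrases(source, dictionary));
// prints: var title = "User login"; alert("Login failed");
```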

The task obviously involves concurrency and file reading/writing, so Node.js was the first thing that came to mind. Although Node.js runs JavaScript on a single main thread, its asynchronous file I/O and event mechanism do use threads internally; in actual programming, however, you do not need to deal with thread-related issues.

The code is not complicated. Here it is after some encapsulation:

var fs = require('fs');
var http = require('http');

var filePath = 'd:\\WORK_new\\';   // project root to scan
var logPath = 'd:\\chinese.log';   // dictionary file (Chinese phrase -> translation)

// In-memory dictionary with load/save helpers.
var dictionary = (function () {
    var map = {};
    return {
        logPath: 'd:\\chinese.log',
        set: function (key, val) { map[key] = val || ''; },
        get: function (key) { return map[key] || ''; },
        // Write the dictionary as JSON, one entry per line so it is easy to edit by hand.
        save2File: function () {
            fs.writeFile(this.logPath, JSON.stringify(map).replace(/","/g, '",\r\n"'),
                {encoding: 'utf8', flag: 'w'}, function (err) { if (err) throw err; });
        },
        loadFile: function (callback) {
            fs.readFile(this.logPath, {encoding: 'utf8'}, function (err, data) {
                if (err) throw err;
                map = JSON.parse(data);
                callback();
            });
        },
        // Fill in every untranslated phrase through the Google Translate endpoint.
        translateByGoogle: function (callback) {
            var index = 0; // number of requests still in flight
            for (var key in map) {
                if (map[key] === '') {
                    index++;
                    (function (key) {
                        http.get('http://translate.google.cn/translate_a/t?client=t&hl=zh-CN&sl=zh-CN&tl=en&ie=UTF-8&oe=UTF-8&oc=2&otf=1&ssel=3&tsel=6&sc=2&q=' + encodeURIComponent(key), function (res) {
                            res.setEncoding('utf8');
                            var body = '';
                            res.on('data', function (chunk) { body += chunk; })
                               .on('end', function () {
                                   // The original evaluates the response with eval rather than
                                   // JSON.parse; only do this with a trusted response.
                                   var obj = eval('(' + body + ')');
                                   map[key] = obj[0][0][0];
                                   index--;
                                   if (index === 0) { callback(); }
                               });
                        }).on('error', function (e) {
                            console.log('Got error: ' + e.message);
                            index--;
                            if (index === 0) { callback(); }
                        });
                    })(key);
                }
            }
            if (index === 0) { callback(); } // nothing needed translating
        }
    };
})();

// Recursive directory walker. The method name walkDir is a reconstruction;
// the identifier is garbled in the source text.
function File() {
    var pending = 0; // outstanding directory listings and file reads
    var finish = function (doneBack) {
        pending--;
        if (pending === 0) { doneBack(); }
    };
    var _readFile = function (pathStr, fileBack, doneBack) {
        fs.readFile(pathStr, {encoding: 'utf8'}, function (err, data) {
            if (err) { data = ''; console.log(err, pathStr); }
            fileBack(data, pathStr);
            finish(doneBack);
        });
    };
    var _walkDir = function (pathStr, fileBack, doneBack) {
        pending++;
        fs.readdir(pathStr, function (err, files) {
            if (err) { console.log(err, pathStr); files = []; }
            files.forEach(function (file) {
                var full = pathStr + '/' + file;
                if (fs.statSync(full).isDirectory()) {
                    _walkDir(full, fileBack, doneBack);
                } else if (/\.(js|html?|jsp)$/.test(file)) {
                    pending++;
                    _readFile(full, fileBack, doneBack);
                }
            });
            finish(doneBack);
        });
    };
    this.walkDir = function (pathStr, fileBack, doneBack) {
        pending = 0;
        _walkDir(pathStr, fileBack, doneBack);
    };
}

// Step 1: collect the Chinese phrases into the dictionary file
dictionary.logPath = logPath;
new File().walkDir(filePath, function (data) {
    var match = data ? data.match(/[\u4e00-\u9faf]+/g) : null;
    if (match) {
        match.forEach(function (mat) { dictionary.set(mat); });
    }
}, function () {
    console.log('get Chinese OK');
    dictionary.save2File();
});

// Step 2: translate with Google (uncomment to run)
/*
dictionary.loadFile(function () {
    dictionary.translateByGoogle(function () {
        dictionary.save2File();
    });
});
*/

// Step 3: replace the Chinese phrases in the files (uncomment to run)
/*
dictionary.loadFile(function () {
    new File().walkDir(filePath, function (data, pathStr) {
        if (!data) { return; } // skip files that are empty or could not be read
        fs.writeFile(pathStr, data.replace(/[\u4e00-\u9faf]+/g, function (ch) {
            return dictionary.get(ch);
        }), {encoding: 'ascii', flag: 'w'}, function (err) { if (err) throw err; });
    }, function () {
        console.log('replace Chinese OK');
    });
});
*/

Remaining problems:

1. Node.js encoding: GBK is poorly supported on Windows, so the script mainly handles UTF-8 files.

2. Efficiency could perhaps be improved with threads; I did not look into this in depth.

3. When a matched phrase turns out to be a single punctuation mark, it has to be checked manually.
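For the first problem, newer Node.js releases expose the WHATWG `TextDecoder` as a global, and builds that include full ICU data accept the `gbk` label. The sketch below is not what the original script did; it assumes such a build and returns `null` otherwise. The helper name `decodeGbk` is hypothetical. Note that this only covers reading: the built-in `TextEncoder` produces UTF-8 only, so writing GBK back would need another tool.

```javascript
// Decode a GBK-encoded buffer into a JavaScript string.
// Returns null when this Node.js build does not support the 'gbk' label.
function decodeGbk(buf) {
  let decoder;
  try {
    decoder = new TextDecoder('gbk');
  } catch (err) {
    return null; // e.g. a build without full ICU data
  }
  return decoder.decode(buf);
}

// The bytes D6 D0 CE C4 are the GBK encoding of the two characters "中文".
console.log(decodeGbk(Buffer.from([0xd6, 0xd0, 0xce, 0xc4])));
```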

In practice, most of the files were GBK and some were UTF-8. Later, we considered reimplementing the tool quickly in a scripting language, which raised two points:

1. Detecting the file encoding (found by searching online)

Check whether the first three bytes of the file are EF BB BF. This only works for UTF-8 files with a BOM.

For UTF-8 files without a BOM, you have to examine the byte patterns. This is difficult, and with limited energy I used the approach above and left files without a BOM to manual checking.

2. Because multithreading is easy to program in kuaishou, I had always assumed that multiple threads would be more efficient than a single thread. The reality was different: the single-threaded version was much faster than the multithreaded one. The main bottleneck appears to be file I/O.
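The BOM check described here is short enough to sketch in Node.js. The function names `hasUtf8Bom` and `fileHasUtf8Bom` are hypothetical; the second one reads only the first three bytes rather than the whole file.

```javascript
const fs = require('fs');

// True when the buffer starts with the UTF-8 byte order mark EF BB BF.
function hasUtf8Bom(buf) {
  return buf.length >= 3 && buf[0] === 0xef && buf[1] === 0xbb && buf[2] === 0xbf;
}

// Read at most the first three bytes of a file and test them for the BOM.
function fileHasUtf8Bom(filePath) {
  const fd = fs.openSync(filePath, 'r');
  try {
    const head = Buffer.alloc(3);
    const bytesRead = fs.readSync(fd, head, 0, 3, 0);
    return hasUtf8Bom(head.subarray(0, bytesRead));
  } finally {
    fs.closeSync(fd);
  }
}

console.log(hasUtf8Bom(Buffer.from([0xef, 0xbb, 0xbf, 0x61]))); // prints: true
```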
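One way to examine the byte patterns is to check that every multi-byte sequence has a valid lead byte followed by the right number of continuation bytes (10xxxxxx). The sketch below is a simplified heuristic, not something the article implemented: the name `looksLikeUtf8` is hypothetical, it does not reject every overlong or surrogate encoding, pure ASCII passes because it is valid in both encodings, and short GBK text can occasionally pass by coincidence.

```javascript
// Returns true when the buffer is structurally valid UTF-8 (simplified check).
function looksLikeUtf8(buf) {
  let i = 0;
  while (i < buf.length) {
    const b = buf[i];
    let extra; // number of continuation bytes expected after the lead byte
    if (b <= 0x7f) extra = 0;
    else if (b >= 0xc2 && b <= 0xdf) extra = 1;
    else if (b >= 0xe0 && b <= 0xef) extra = 2;
    else if (b >= 0xf0 && b <= 0xf4) extra = 3;
    else return false; // not a valid UTF-8 lead byte
    if (i + extra >= buf.length) return false; // sequence is cut off
    for (let k = 1; k <= extra; k++) {
      if ((buf[i + k] & 0xc0) !== 0x80) return false; // not a continuation byte
    }
    i += extra + 1;
  }
  return true;
}

console.log(looksLikeUtf8(Buffer.from('中文', 'utf8')));           // prints: true
console.log(looksLikeUtf8(Buffer.from([0xd6, 0xd0, 0xce, 0xc4]))); // GBK bytes, prints: false
```

A file that fails this check is certainly not UTF-8; a file that passes is only probably UTF-8, so manual review remains useful for short files.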

The above is all the content in this article. I hope you will like it.
