I have been chatting with our front-end colleagues a lot lately, and in one of those chats they mentioned a strange error: the JSON the server sent to the client was malformed and could not be parsed. The front-end colleague sneered at the JSON library used on the server, which apparently could not even encode correctly. I checked the libraries in use; the front end and back end do not use the same json.lua, so I ran a simple stress test. Of course, it found nothing. Two days later the same problem showed up again, and this time I paid attention to the details of the error:
Coco/game.lua: the: Coco/lib/json.lua:525: Coco/lib/json.lua: -: expected colon at char 119 of: [{"res_amount":5,"id":1,"status":2,"res_type":1},{"res_amount":50000,"id":2,"status":0,"res_type":2},{"res_amount2:5,"id":3,"status":2,"res_type":1},{"res_amount":50000,"id":4,"status":0,"res_type":3},{"res_amount":7,"id":5,"status":0,"res_type":1},{"res_amount":50000,"id":6,"status":0,"res_type":5},{"res_amount":8,"id":7,"status":0,"res_type":1}]
The broken part is this object:
{"res_amount2:5,"id":3,"status":2,"res_type":1}
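Where there should have been a closing quotation mark, there was a 2. Compare the binary representations of the two characters: the quote is 0x22 (00100010) and the 2 is 0x32 (00110010).
Only one bit differs! I then dug up the previous error, where a comma (0x2C, 00101100) had become a < (0x3C, 00111100): again a single flipped bit.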
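The comparison can be checked with a few lines of Python. This is only a small sketch: `bit_diff` is a hypothetical helper name, and reading the second corruption as "a comma became <" is my interpretation of the earlier error.

```python
def bit_diff(a: str, b: str) -> int:
    """Return the XOR of the code points of two single characters."""
    return ord(a) ^ ord(b)


def flipped_bits(a: str, b: str) -> int:
    """Count how many bit positions differ between the two characters."""
    return bin(bit_diff(a, b)).count("1")


# (expected character, character actually seen in the broken JSON)
pairs = [('"', "2"), (",", "<")]

for good, bad in pairs:
    print(
        f"{good!r} = {ord(good):08b}   {bad!r} = {ord(bad):08b}   "
        f"xor = {bit_diff(good, bad):08b}   bits flipped: {flipped_bits(good, bad)}"
    )
```

In both pairs the XOR has exactly one bit set, and it is the same bit position (0x10) each time.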
It looks like we have run into the legendary bit flip! For background, see this article.
My curiosity was piqued, so I wrote a crawler and scraped the last 10 days of log records (I have no permission to query the database, orz). Out of roughly 800,000 log entries, there were 6 cases of this kind of error in total. The affected devices were mostly domestic models. Time to name names:
sm-g3502
Coolpad 8705
Coolpad 8720L
HUAWEI g521-l076
asus_t00f
Vivo x3t
The brands range from Samsung and Huawei all the way to the down-to-earth Coolpad.
I believe bit flips actually happen more often than this, because a flip in the middle of a JSON string or in the middle of a JSON value still leaves valid JSON and goes unnoticed. In addition, our binary packets are serialized with Protobuf, and PB decoding errors are in fact fairly common; for those we just cannot tell whether a single bit or several bits were corrupted. According to the reference, a PC with 4 GB of memory sees anywhere from 3 bit errors per hour to 3 bit errors per month, and there are far more mobile devices than PCs, so this is not surprising.
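To illustrate why most flips would go undetected, here is a rough Python sketch. The payload is one object modeled on the log above, and `flip_bit` and `try_parse` are hypothetical helpers written for this example, not part of our code.

```python
import json

original = '{"res_amount":50000,"id":2}'


def flip_bit(text: str, index: int, bit: int) -> str:
    """Flip one bit of the byte at `index` and return the damaged text."""
    data = bytearray(text.encode("latin-1"))
    data[index] ^= 1 << bit
    return data.decode("latin-1")


def try_parse(text: str):
    """Return the parsed object, or None if the text is not valid JSON."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return None


# Flip inside a number: '5' (0x35) becomes '4' (0x34). Still valid JSON.
in_value = flip_bit(original, 14, 0)
# Flip inside a key: 'r' (0x72) becomes 's' (0x73). Still valid JSON.
in_key = flip_bit(original, 2, 0)
# Flip in a structural character: the closing quote becomes '2'. Parse error.
in_structure = flip_bit(original, 12, 4)

for label, text in [("value", in_value), ("key", in_key), ("structure", in_structure)]:
    print(label, text, "->", try_parse(text))
```

Only the third case surfaces as a parser error; the first two silently deliver wrong data, which is why the 6 cases in the logs are probably a lower bound.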
Afterwards I went through the send/receive path of our protocol: when it was implemented, no checksum was added. With a checksum in place we could add a retransmission mechanism, so that a client that receives a corrupted packet asks the server to send it again. Or we could add an error-correcting code, similar to the ECC used in server memory, so that a packet with only 1 bad bit can be recovered. This is the first time I have found TCP to be unreliable, haha.
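As a rough idea of what such a check could look like, here is a Python sketch. It is not our protocol: the frame layout (payload followed by a 4-byte big-endian CRC32) and the names `pack`, `check` and `repair_single_bit` are assumptions made up for this example. It also uses a plain CRC32 rather than a real ECC such as a Hamming code, and recovers a single flipped bit by brute force.

```python
import struct
import zlib


def pack(payload: bytes) -> bytes:
    """Append the CRC32 of the payload as a 4-byte big-endian trailer."""
    return payload + struct.pack(">I", zlib.crc32(payload))


def check(frame: bytes) -> bool:
    """Return True if the trailer matches the CRC32 of the payload."""
    payload = frame[:-4]
    (crc,) = struct.unpack(">I", frame[-4:])
    return zlib.crc32(payload) == crc


def repair_single_bit(frame: bytes):
    """Try every single-bit flip; return a frame that passes, else None."""
    if check(frame):
        return frame
    buf = bytearray(frame)
    for i in range(len(buf) * 8):
        buf[i // 8] ^= 1 << (i % 8)   # flip bit i
        if check(bytes(buf)):
            return bytes(buf)
        buf[i // 8] ^= 1 << (i % 8)   # undo and try the next bit
    return None                        # more than one bit is bad: ask for a resend


frame = pack(b'{"res_amount":5,"id":3}')
damaged = bytearray(frame)
damaged[12] ^= 0x10                    # closing quote of the key becomes '2'
damaged = bytes(damaged)

print(check(damaged))                  # the corruption is detected
print(repair_single_bit(damaged) == frame)
```

If `check` fails and the brute-force repair returns None, the client would fall back to requesting the packet again. The repair loop costs one CRC per bit of the frame, which is fine for small packets but would need a proper error-correcting code for large ones.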
Houston, we have a problem.