Today we ran into a strange problem while testing the hardware communication module: frequent data errors showed up in a test that sends data, receives it back, and compares the two.
The test works as follows: send a value, receive a value, and compare them. Whenever the returned data differs from the sent data, the error counter is incremented. The send/receive logic, in simplified form:

    uint16 i = 0;
    uint16 j = 0;
    uint32 error_num = 0;

    xx_send_data(i++);
    j = xx_rece_data();
    if (i != (j + 1)) {
        error_num++;
    }

In theory, as long as the data sent and the data received are the same, the error counter should never increase, but in practice it did. At first I suspected a problem with the send/receive timing of the FPGA hardware module, but repeated functional and timing simulations showed nothing wrong. Then I began to doubt whether this short C program was really working correctly. So I looked at the data at the error_num++ statement: every error was reported with i = 0 and j = 65535. Why? I was puzzled. Later I asked Fei Ge, a senior engineer. He spotted the problem at a glance, and I was ashamed!

Here is how the error occurs (the program runs on a 32-bit processor). On the surface the program looks fine, but a detailed analysis shows what happens when i = 65535 is sent: after the send, i++ executes and i wraps around to 0. The receive function then correctly returns 65535 into j. The problem is in if (i != (j + 1)). The arithmetic result of j + 1 is held in a 32-bit register by default, so j + 1 is a 32-bit value. When j = 65535, (j + 1) is therefore 65536, and the comparison becomes (0 != 65536). The values do not match, and the error counter is incremented.

There are many ways to eliminate this error. One is to modify the code as follows:

    uint16 i = 0;
    uint16 j = 0;
    uint32 error_num = 0;

    xx_send_data(i++);
    j = xx_rece_data();
    if (i != (uint16)(j + 1)) {
        error_num++;
    }

The problem was finally found. The lesson for me is to write code more carefully in the future.

// After reading the replies, I found that many people look at this code purely from a software point of view, but embedded C has to be analyzed in detail together with the processor's behavior. Below is a detailed explanation of the assembly code that corresponds to the C code. Reading the assembly makes the problem clear.

The first instruction:

    ldhu R3, -10(FP)

ldhu loads a half word from memory or cache and zero-extends it as an unsigned value. This instruction loads the variable i into register R3. R3 is a 32-bit register, so although i is a uint16, the comparison is still carried out on 32 bits.

The second instruction:

    ldhu R2, -12(FP)

This works the same way as the previous instruction. Here the variable j is loaded into register R2.

The third instruction:

    addi R2, R2, 1    /* the problem is here */

This instruction executes j + 1. The result is saved in R2, which is a 32-bit register. So when i = 0 and j = 65535, j + 1 = 65536 does not overflow; it is written straight into R2. At this point R2 is 65536 and R3 is 0.

The last instruction:

    beq R3, R2, 0x800258 <main+92>

This compares the values of R3 and R2. If they are equal, execution jumps to address 0x800258 <main+92>. If they are not equal, the next statement runs, which is error_num++.

As the analysis above shows, embedded developers need to know their processor well in order to track down strange problems like this one.
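
The behavior described above can be reproduced on a PC with a small sketch. In C terms, the 32-bit result comes from integer promotion: a 16-bit unsigned operand is converted to int before the addition. The sketch assumes int is wider than 16 bits (as on the 32-bit processor in this post) and uses uint16_t and uint32_t from stdint.h in place of the project's uint16 and uint32 typedefs. The three function names are hypothetical and exist only for illustration; mismatch_registers is a rough model of the ldhu/addi/beq sequence, not the compiler's actual output.

```c
#include <stdint.h>

/* Original comparison: returns 1 when the test would count an error.
 * j is promoted to int before the addition, so 65535 + 1 is 65536
 * and never wraps back to 0 (assumes int is wider than 16 bits). */
static int mismatch_unfixed(uint16_t i, uint16_t j)
{
    return i != (j + 1);
}

/* Fixed comparison: the cast truncates the sum back to 16 bits,
 * so 65536 becomes 0 and matches the wrapped-around i. */
static int mismatch_fixed(uint16_t i, uint16_t j)
{
    return i != (uint16_t)(j + 1);
}

/* Rough model of the assembly sequence, using 32-bit "registers". */
static int mismatch_registers(uint16_t i, uint16_t j)
{
    uint32_t r3 = i;   /* ldhu: load half word, zero-extend to 32 bits */
    uint32_t r2 = j;   /* ldhu: same for j */
    r2 = r2 + 1;       /* addi: 65535 + 1 = 65536, no 16-bit wraparound */
    return r3 != r2;   /* beq not taken: falls through to error_num++ */
}
```

With i = 0 and j = 65535, mismatch_unfixed and mismatch_registers both return 1, while mismatch_fixed returns 0. For ordinary values such as i = 5 and j = 4, all three return 0, which is why the error only appeared at the wraparound point.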