Embedded System Programming and Debugging Skills

Source: Internet
Author: User

Stable and reliable software is very important in embedded system development. Software is not a physical part of a chip, yet its quality can determine whether the chip succeeds or fails. For a chip to meet its design requirements, hardware design and hardware/software co-design matter, and on the software side programming technique and skill matter just as much.

This article describes some programming and debugging skills I have used during chip firmware development. It covers common issues in embedded system development, such as synchronization problems in real-time systems and memory leaks with dynamic memory allocation: how to prevent bugs during the programming stage, and how to discover and locate problems promptly during the debugging stage. By summarizing this experience, the aim is to develop stable firmware while improving development efficiency and performance.

I. Programming and debugging skills

(1) Dynamic memory allocation or static memory allocation?

In embedded system development, dynamic and static memory allocation each have advantages and disadvantages. Dynamic memory allocation is flexible and convenient, but it consumes additional memory, costs performance, and causes memory fragmentation.

In the development of firmware for wireless chips, we mainly consider the following points:

  • CPU Performance
  • Available memory size
  • How frequently memory allocation is called
  • Performance impact

In the actual design:

  • Because of the overhead of calling malloc, the data structures used for data frame processing and management in all transmit and receive channels are allocated statically. The purpose is to reduce processing time.
  • Site (station) management, key management, external commands passed in over the SDIO interface, and SDIO events passed upward use dynamic memory allocation. These calls occur orders of magnitude less often than data frame transmission and reception, and some occur only occasionally.
  • One-time allocation: data structure variables that exist for the entire life cycle of the software are allocated statically.
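
The static-allocation approach for per-frame data structures can be sketched as a fixed pool with a free list. This is only an illustration under assumptions: the structure `frame_desc_t`, the pool depth `FRAME_DESC_COUNT`, and the function names are hypothetical and do not come from the firmware described here.

```c
#include <stddef.h>

#define FRAME_DESC_COUNT 8            /* hypothetical pool depth */

/* Hypothetical per-frame bookkeeping structure. */
typedef struct frame_desc {
    struct frame_desc *next;          /* free-list link */
    unsigned char     *data;          /* frame payload  */
    unsigned short     len;           /* payload length */
} frame_desc_t;

static frame_desc_t  frame_pool[FRAME_DESC_COUNT]; /* reserved at link time */
static frame_desc_t *frame_free_list;

/* Chain every descriptor onto the free list; call once at start-up. */
void frame_pool_init(void)
{
    size_t i;

    frame_free_list = NULL;
    for (i = 0; i < FRAME_DESC_COUNT; i++) {
        frame_pool[i].next = frame_free_list;
        frame_free_list = &frame_pool[i];
    }
}

/* Constant-time allocation: pop the list head, or NULL if the pool is empty. */
frame_desc_t *frame_desc_get(void)
{
    frame_desc_t *d = frame_free_list;

    if (d != NULL) {
        frame_free_list = d->next;
        d->next = NULL;
    }
    return d;
}

/* Constant-time release: push the descriptor back onto the free list. */
void frame_desc_put(frame_desc_t *d)
{
    d->next = frame_free_list;
    frame_free_list = d;
}
```

Both operations take constant time and never fragment the heap, which is the property the data path needs. The sketch is not thread-safe; if several tasks share the pool, the get and put calls need the kind of protection discussed in section (6).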

(2) Use the ARM C library or write your own?

Some C library functions are used often in embedded system development, such as malloc, free, memcpy, memset, and printf. Should you write your own or use the C library provided with the ARM development tools?

My habit, and my suggestion, is to use ARM's C library. Advantages:

  • It is easy to use and shortens the development cycle.
  • ARM's C library delivers better performance.

Take memcpy as an example. There is an article on the Internet that analyzes the assembly code of ARM's memcpy. Why is the code written by ARM better optimized and faster? It mainly takes the following into account:

  • The byte alignment of the source and destination addresses
  • The number of bytes to copy
  • The alignment of the trailing bytes
  • Copying whole words wherever possible
  • Using ARM assembly instructions that copy in batches

So I do not write my own memcpy. The only thing to note when using memcpy is that for values only 4 to 8 bytes long, direct assignment is more efficient. Where performance is not affected, either approach is fine.

The ARM C library comes in two variants: the standard library and the micro library (microlib). The latter is not thread-safe and is intended for bare-metal systems. The former is thread-safe and can be used in real-time systems. If malloc is used from multiple threads, you need to implement the lock and unlock functions that the C library calls, which prevents synchronization problems when several threads call malloc at the same time.

(3) How to prevent and detect memory leaks

With dynamic memory allocation, the system may suffer from memory leaks, buffer overflows, and repeated frees. If malloc and free are used directly, such bugs are hard to find. Redefining malloc and free during the programming stage makes it possible to discover and locate these problems promptly. Let the program find the problem, rather than hunting for it yourself or never knowing about it.

The redefined malloc uses a doubly linked list to manage all dynamically allocated memory. The following figure shows the data structure used for memory management:

 


Each allocation reserves a block laid out as above, consisting of three parts. The yellow part is the memory_block data structure, the gray part is the memory area actually used by the caller, and the red part (4 bytes) is the tail mark.

memory_block stores the doubly linked list pointers, a pointer to the name of the allocating file and its line number, the allocated length, and the header mark.

In this way, when memory is released, the marks can be checked and an error message printed if a check fails. The memory problems that the redefined malloc and free can detect and report include:

  • The tail mark is damaged. The block has probably overflowed, or some other code has written to it by mistake.
  • The header mark is damaged. It was probably overwritten by an overflow from the memory block just before it, or illegally written from elsewhere.
  • Repeated release. When a block is released for the first time, its header mark is set to a different value. If the block is released again, the check finds that the header mark no longer matches the value set at allocation.
  • Memory leaks. When the system exits, all memory should have been released. Check the memory management linked list at that point; any remaining nodes are leaks.
  • Invalid release. If a pointer to memory that was never allocated is released, no matching pointer is found in the memory management linked list.
  • When a problem is found on release, the file name and line number of the allocating call and the allocated size can be printed. If the memory_block area itself is damaged, the contents of the data block can be dumped as a reference for further analysis.

 

The redefined malloc and free can detect and locate the vast majority of memory usage bugs, which helps prevent memory leaks. If the memory_block area itself is damaged, the problem can still be detected, but locating its cause requires further analysis.
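
A minimal sketch of such a wrapper is shown below. It is not the firmware's actual code: the names `dbg_malloc`, `dbg_free`, `dbg_live_blocks`, the mark values, and the return codes are all assumptions made for illustration. To avoid reading memory that has already been returned to the heap, the sketch looks the pointer up in the list first, so a repeated free and a free of a never-allocated pointer are reported through the same "not found" path.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define HEAD_MAGIC  0xA5A5A5A5u   /* header mark of a live block (assumed value) */
#define FREED_MAGIC 0xDEADBEEFu   /* header mark written on release              */
#define TAIL_MAGIC  0x5A5A5A5Au   /* 4-byte tail mark (assumed value)            */

enum { DBG_OK = 0, DBG_ERR_NOT_FOUND = -1, DBG_ERR_HEAD = -2, DBG_ERR_TAIL = -3 };

typedef struct memory_block {
    struct memory_block *prev, *next;  /* doubly linked list of live blocks */
    const char *file;                  /* allocating file name              */
    int         line;                  /* allocating line number            */
    size_t      size;                  /* requested length                  */
    unsigned    head;                  /* header mark                       */
} memory_block;

static memory_block *block_list;

#define DBG_MALLOC(n) dbg_malloc((n), __FILE__, __LINE__)

void *dbg_malloc(size_t size, const char *file, int line)
{
    unsigned tail = TAIL_MAGIC;
    memory_block *b = malloc(sizeof *b + size + sizeof tail);

    if (b == NULL)
        return NULL;
    b->file = file; b->line = line; b->size = size; b->head = HEAD_MAGIC;
    b->prev = NULL; b->next = block_list;
    if (block_list != NULL)
        block_list->prev = b;
    block_list = b;
    memcpy((unsigned char *)(b + 1) + size, &tail, sizeof tail);
    return b + 1;                      /* caller's area follows the header */
}

/* Returns DBG_OK, or a negative code naming the problem that was found. */
int dbg_free(void *p)
{
    memory_block *b;
    unsigned tail;
    int rc = DBG_OK;

    for (b = block_list; b != NULL && (void *)(b + 1) != p; b = b->next)
        ;
    if (b == NULL) {                   /* never allocated, or already freed */
        printf("free(%p): pointer not in allocation list\n", p);
        return DBG_ERR_NOT_FOUND;
    }
    if (b->prev != NULL) b->prev->next = b->next; else block_list = b->next;
    if (b->next != NULL) b->next->prev = b->prev;

    if (b->head != HEAD_MAGIC) {
        printf("free(%p): header mark damaged\n", p);
        rc = DBG_ERR_HEAD;
    } else {
        memcpy(&tail, (unsigned char *)p + b->size, sizeof tail);
        if (tail != TAIL_MAGIC) {
            printf("free(%p): tail mark damaged, allocated at %s:%d, size %lu\n",
                   p, b->file, b->line, (unsigned long)b->size);
            rc = DBG_ERR_TAIL;
        }
    }
    b->head = FREED_MAGIC;
    free(b);
    return rc;
}

/* Leak check: print and count every block that is still allocated. */
int dbg_live_blocks(void)
{
    int n = 0;
    memory_block *b;

    for (b = block_list; b != NULL; b = b->next, n++)
        printf("leak: %lu bytes from %s:%d\n",
               (unsigned long)b->size, b->file, b->line);
    return n;
}
```

Application code would call `DBG_MALLOC(n)` instead of `malloc(n)`, and `dbg_live_blocks()` at shutdown. On a firmware heap whose freed memory stays readable, the changed header mark can additionally be checked to tell a repeated free apart from a wild pointer, as described above.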

(4) Pay attention to endianness

In embedded system programming, paying attention to endianness is a basic requirement.

  • Check whether the fields being accessed are in big-endian or little-endian format.
  • Consider how those fields are accessed on CPUs of different architectures (with different endianness), and consider code portability.
  • Use bit-field definitions and operations with care; they are prone to endianness problems. In addition, bit-field operations often compile to a large number of instructions. You can disassemble the code and compare the results for bit-field operations and for explicit bit operations. Some embedded C coding standards prohibit bit fields altogether.
  • Do not think: my code will only ever run on a little-endian CPU.

 

If you do not want to modify a large number of read/write operations when porting code from a little-endian CPU to a big-endian CPU, pay attention to this problem when you first write the code, to avoid suffering later.

Therefore, it is necessary to write macros for big-endian and little-endian access. The code is as follows:
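
The sketch below shows one possible form of such accessors; it is not the original listing. The names `get_le16`, `get_le32`, `get_be32`, `put_le32`, and `put_be32` are hypothetical, and inline functions are used in place of macros for type checking. Because each accessor assembles the value from individual bytes, the result is the same on big-endian and little-endian CPUs, and unaligned addresses are handled as well.

```c
#include <stdint.h>

/* Read a 16-bit little-endian field from a byte buffer. */
static inline uint16_t get_le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Read a 32-bit little-endian field from a byte buffer. */
static inline uint32_t get_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Read a 32-bit big-endian (network order) field from a byte buffer. */
static inline uint32_t get_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Store a 32-bit value as a little-endian field. */
static inline void put_le32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)v;
    p[1] = (uint8_t)(v >> 8);
    p[2] = (uint8_t)(v >> 16);
    p[3] = (uint8_t)(v >> 24);
}

/* Store a 32-bit value as a big-endian field. */
static inline void put_be32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v >> 24);
    p[1] = (uint8_t)(v >> 16);
    p[2] = (uint8_t)(v >> 8);
    p[3] = (uint8_t)v;
}
```

Code that goes through accessors like these never depends on the byte order of the CPU it runs on, only on the byte order defined for the field.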


(5) Pay attention to byte alignment

Most current embedded development platforms use 32-bit CPUs, which this section assumes; 51-series and 64-bit CPUs need to be considered separately.

Generally, addresses returned by dynamic memory allocation are word-aligned, and the compiler also places the start address of a struct variable on a word boundary. Alignment of the members defined inside a struct, and of accesses to them, is something the programmer has to pay attention to.

Basic principles of member definition:

  • A half-word member (2 bytes long) must be aligned on a 2-byte boundary.
  • A word member (4 bytes long) must be aligned on a 4-byte boundary.
  • A char member (1 byte) can be aligned on any boundary. For array members, take the length into account.

For the following data structure definition (left),


Although the compiler inserts padding bytes during compilation, we recommend explicit padding (as shown in the data structure on the right above).
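
The difference between implicit and explicit padding can be illustrated as follows. The two structures are hypothetical examples, not the ones from the original figure, and the offsets assume a typical ABI in which `uint32_t` is 4-byte aligned and `uint16_t` is 2-byte aligned.

```c
#include <stddef.h>
#include <stdint.h>

/* Implicit padding: the compiler silently inserts 3 bytes after 'type'
 * and 1 byte after 'flags' to satisfy the alignment rules. */
struct frame_hdr_implicit {
    uint8_t  type;
    uint32_t seq;
    uint8_t  flags;
    uint16_t len;
};

/* Explicit padding: the same layout, but every padding byte is visible
 * in the source and the layout no longer depends on compiler settings. */
struct frame_hdr_explicit {
    uint8_t  type;
    uint8_t  pad0[3];
    uint32_t seq;
    uint8_t  flags;
    uint8_t  pad1;
    uint16_t len;
};

/* Compile-time checks (C11) that the explicit layout is what we expect. */
_Static_assert(offsetof(struct frame_hdr_explicit, seq) == 4, "seq offset");
_Static_assert(offsetof(struct frame_hdr_explicit, len) == 10, "len offset");
_Static_assert(sizeof(struct frame_hdr_explicit) == 12, "header size");
```

With the padding written out, anyone reading the definition can see the real size and member offsets, and a compile-time assertion catches the case where a later change breaks the layout.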

Considering alignment when variables are defined avoids problems when they are read and written. For example, code may read or write a word across a word boundary. On some CPU architectures this raises an access exception; on others the CPU reads (or writes) a wrong value without raising any exception. Intel desktop CPUs handle reads and writes across word boundaries without error because the CPU resolves the problem for you, at the cost of lower execution efficiency. Such code runs normally on Windows but may fail on an embedded platform. The fundamental solution is therefore to pay attention to this problem during the programming stage.

For example, the firmware of a wireless network card has to process fields inside data packets. The start addresses of some fields are arbitrary and may or may not be word-aligned. When accessing such word variables, add the __packed keyword. The code is as follows:

u32data = *((u32 __packed *)da);

In this way, the compiler generates assembly code that reads the data byte by byte and then merges the bytes into a word, instead of reading the word in a single access.
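
The `__packed` qualifier is specific to the ARM compiler. Where portability to other toolchains matters, the same effect can be had with `memcpy`, as in the sketch below. The helper names `read_u32_unaligned` and `write_u32_unaligned` are hypothetical; the value is interpreted in the CPU's native byte order.

```c
#include <stdint.h>
#include <string.h>

/* Read a 32-bit value from an address that may not be word-aligned.
 * memcpy makes no alignment assumption, and compilers typically turn
 * it into byte loads or a single load where that is safe. */
static inline uint32_t read_u32_unaligned(const void *p)
{
    uint32_t v;

    memcpy(&v, p, sizeof v);
    return v;
}

/* Write a 32-bit value to an address that may not be word-aligned. */
static inline void write_u32_unaligned(void *p, uint32_t v)
{
    memcpy(p, &v, sizeof v);
}
```

If the field has a defined byte order that differs from the CPU's, byte-wise accessors that build the value with shifts handle alignment and byte order in one step.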

(6) Always pay attention to synchronized access

In real-time system development, synchronization bugs are common and hard to locate. Consider them during the programming stage to avoid pain in later debugging. While coding, keep asking yourself: can a synchronization problem occur when this variable is accessed? Do multiple threads write to it? Could another thread still be using it when it is released? The following describes the protection techniques used during development.

1. Synchronizing register (variable) writes

In embedded real-time systems, reading and writing registers or variables in memory is routine. Taking registers as an example: in a bare-metal system (without a real-time operating system), you only need to declare the registers volatile. This prevents compiler optimizations from breaking accesses to values that the hardware changes asynchronously.

In a real-time multitasking system, you also have to consider multiple tasks writing the same registers, which requires protection, for example by disabling interrupts or by using a semaphore or mutex.

For example, suppose task 1 and task 2 both read and write the MAC address registers (two long-word registers, mac_low_reg and mac_hi_reg). Task 1 has just finished writing the four-byte low register (mac_low_reg) when task 2 starts to run and writes different contents to the MAC address registers. After task 2 finishes, execution switches back to task 1, which goes on to write mac_hi_reg. As a result, the MAC address registers end up holding an invalid value.

The basic principles for register operations in the wireless network card firmware are as follows:

  • Registers that are read and written only during system initialization generally do not need protection.
  • Register writes in tasks (or threads) are protected by semaphores, and register accesses are grouped by function. The whole function is protected so that it cannot be interrupted halfway through. At the same time, keep the protected code as short as possible.
  • Registers accessed from only one thread do not need protection, although in principle the previous rule is still applied.
  • UART output is called only during debugging and is not included in release code, so it is not protected. This does not affect the system in actual use.
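
The MAC address example can be sketched as one protected function that writes both registers. Everything here is an assumption for illustration: the registers are simulated as ordinary volatile variables, the byte packing of the address is invented, and `reg_lock`/`reg_unlock` are placeholders for whatever the RTOS provides (a semaphore, a mutex, or disabling interrupts).

```c
#include <stdint.h>

/* Stand-ins for the memory-mapped registers. On hardware these would be
 * volatile accesses to fixed addresses taken from the chip manual. */
static volatile uint32_t mac_low_reg;
static volatile uint32_t mac_hi_reg;

static int reg_lock_depth;   /* only tracks nesting in this sketch */

/* Placeholders: a real system would take and give a semaphore or mutex,
 * or disable and restore interrupts, at these two points. */
static void reg_lock(void)   { reg_lock_depth++; }
static void reg_unlock(void) { reg_lock_depth--; }

/* Write the whole 6-byte MAC address as a single protected operation,
 * so another task can never run between the two register writes. */
void mac_addr_write(const uint8_t mac[6])
{
    uint32_t low = (uint32_t)mac[0] | ((uint32_t)mac[1] << 8) |
                   ((uint32_t)mac[2] << 16) | ((uint32_t)mac[3] << 24);
    uint32_t hi  = (uint32_t)mac[4] | ((uint32_t)mac[5] << 8);

    reg_lock();
    mac_low_reg = low;       /* values are computed before locking, so  */
    mac_hi_reg  = hi;        /* the protected section is just two writes */
    reg_unlock();
}
```

The register values are prepared outside the lock, which keeps the protected section as short as the principles above ask for.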

 

2. Atomic operations for synchronized access to dynamically managed struct variables

The wireless network card firmware has some struct variables that are allocated and released dynamically, such as those for site management, and that are accessed and released by multiple threads. Such structs also need protection; otherwise another thread may access a struct after it has been released. Bugs that read or write released memory are hard to find and locate during debugging, so they need to be prevented during the programming stage.

In the code, the protection method is atomic operations. The specific code and steps are as follows:

 

In the site management implementation:

  • When a site joins, call kref_init(sta->kref), which initializes the atomic variable to 1.
  • To access the site, call kref_get(sta->kref); the atomic variable becomes 2.
  • When the access is complete, call kref_put(sta->kref, release); the atomic variable drops back to 1.
  • To release the site, call kref_put(sta->kref, release) again. The atomic variable drops to 0 and the release function is called to free the site memory.
  • Access from multiple threads causes no problem because kref_get and kref_put are always used in pairs.
  • If task 2 calls the function that releases the site just after task 1 has called kref_get, task 2's call does not invoke the release function. The site memory is actually freed only after task 1 calls kref_put. In this way, site memory that has already been freed is never accessed.

 

The code above is modeled on the Linux kernel source. You need to implement the atomic operations yourself, because UCOS provides no atomic operation functions. On ARM7, ARM9, and Cortex-M3, they can be implemented with ARM's atomic-operation assembly instructions.
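
As a rough illustration of the reference-counting steps, the sketch below uses C11 `<stdatomic.h>` in place of hand-written assembly, which assumes a toolchain that supports it. The `kref` functions follow the Linux kernel's naming but take a pointer to the counter; `struct sta_info` and `sta_release` are hypothetical.

```c
#include <stdatomic.h>

struct kref {
    atomic_int refcount;
};

/* Start with one reference, held by whoever created the object. */
static inline void kref_init(struct kref *k)
{
    atomic_init(&k->refcount, 1);
}

/* Take an additional reference before using the object. */
static inline void kref_get(struct kref *k)
{
    atomic_fetch_add(&k->refcount, 1);
}

/* Drop a reference. Whichever caller drops the last one runs release().
 * Returns 1 if release() was called, 0 otherwise. */
static inline int kref_put(struct kref *k, void (*release)(struct kref *))
{
    if (atomic_fetch_sub(&k->refcount, 1) == 1) {
        release(k);
        return 1;
    }
    return 0;
}

/* Hypothetical station entry with the counter as its first member. */
struct sta_info {
    struct kref kref;
    int released;            /* set by sta_release, for illustration */
};

static void sta_release(struct kref *k)
{
    /* Valid because kref is the first member of struct sta_info. */
    struct sta_info *sta = (struct sta_info *)k;

    sta->released = 1;       /* real firmware would free the memory here */
}
```

A reference must be taken only while the caller already holds a valid way to reach the object, for example while looking it up in a list under the list's lock. Calling `kref_get` on an object whose count may already have reached zero is not safe.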

 

3. Synchronized access to doubly linked lists

The NIC firmware uses doubly linked lists in many places, such as site management and data transmit and receive management. Operations on a doubly linked list include adding and deleting nodes. Unless a list is used by only one thread, access to it is protected by a semaphore.

(7) Add some printed statistics

1. Output statistics to help locate bugs

In the debug version of the project, statistics are printed periodically by the first task that the real-time system creates at start-up. To avoid affecting functionality and performance, the interval is set to 30 seconds.

The output can include total CPU usage, the number of context switches, transmit and receive statistics, and the number of dynamic memory allocations.

During early system debugging, when there are many problems, the printed information helps locate them quickly. For example, if the receive frame count stays fixed at a particular value, you can basically determine why reception has stopped.

In addition, from the total number of received frames, the total number released, and where the frames went after reception, you can determine whether any have not been released, and in which module.

The printed dynamic memory allocation information can be used to determine whether there is a memory leak and roughly how large a heap the system needs.

UCOS provides a CPU usage function that can be called directly. Some real-time systems have no such function, and you can implement one yourself. The principle: the real-time system has a system clock tick, and on every tick the total tick count is incremented by 1. If the system is idle at that moment, the idle task's tick count is also incremented by 1. CPU usage is then (1 - idle task ticks / total ticks).
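
The tick-counting principle can be sketched as follows. The function names `tick_hook` and `cpu_usage_percent` are hypothetical; how the tick handler learns that the idle task was running depends on the RTOS and is simply passed in as a flag here.

```c
#include <stdint.h>

static volatile uint32_t total_ticks;   /* every system tick           */
static volatile uint32_t idle_ticks;    /* ticks that hit the idle task */

/* Call from the system tick handler. idle_running is nonzero when the
 * tick interrupted the idle task. */
void tick_hook(int idle_running)
{
    total_ticks++;
    if (idle_running)
        idle_ticks++;
}

/* CPU usage in percent: (1 - idle ticks / total ticks) * 100,
 * computed with integer arithmetic. */
unsigned cpu_usage_percent(void)
{
    uint32_t total = total_ticks;
    uint32_t idle  = idle_ticks;

    if (total == 0)
        return 0;
    return (unsigned)(100u - (100u * (uint64_t)idle) / total);
}
```

In practice the two counters are usually reset after each report, so that the figure reflects the last statistics interval instead of the whole run.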

2. Output task-related information of the real-time system

During real-time system debugging, pay attention to how much of its stack each task (thread) uses, and whether the allocated task stack is too large or too small. A stack that is too large wastes memory; one that is too small overflows. Once you know the actual stack usage, you can allocate a stack of appropriate size.

UCOS has a function that measures stack usage. It is generally not enabled in the build configuration because the measurement affects performance. The principle is to initialize the task stack to all zeros. As the task uses its stack, the memory it touches becomes non-zero. Because it is a stack, the measurement starts from the unused end and counts zero bytes until it reaches the first non-zero location. The stack size minus the number of remaining zero bytes is the amount of stack used.
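
A minimal sketch of this measurement is shown below, assuming a descending stack (the stack starts at the high address and grows toward the low address, as is usual on ARM). The function names are hypothetical and this is not the UCOS implementation.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Fill the whole stack area with zeros before the task is started. */
void stack_fill(uint8_t *stack_base, size_t size)
{
    memset(stack_base, 0, size);
}

/* Return the number of stack bytes that have been used so far.
 * With a descending stack the untouched zero bytes sit at the low
 * end, so count zeros upward from stack_base until the first
 * non-zero byte is found. */
size_t stack_used(const uint8_t *stack_base, size_t size)
{
    size_t free_bytes = 0;

    while (free_bytes < size && stack_base[free_bytes] == 0)
        free_bytes++;
    return size - free_bytes;
}
```

The result is a high-water mark, not the current depth. It can slightly underestimate usage if the deepest bytes the task ever wrote happened to be zero.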

 

(8) Control print output by module

The wireless network card firmware is divided into multiple modules (threads and interfaces). Each module uses its own output macro, and a global variable, wl_debug_components, controls which modules print output.

For example, it is initialized as follows:

#if DEBUG
u32 wl_debug_components = comp_info | comp_tx | comp_rx;
#endif

With this setting, the code outputs the print statements at the comp_info, comp_tx, and comp_rx levels. You can also control the output externally, for example by sending a command through an external interface to modify wl_debug_components, so that the program starts or stops printing the output of a given module.

Design requirements for print output:

  • Print by module.
  • Print output can be controlled externally.
  • Print code is compiled into the debug version but not into the release version.
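
One way these three requirements might be met is sketched below. The macro `WL_PRINT`, the bit values, and the extra `comp_sdio` component are hypothetical; only `wl_debug_components` and the `comp_info`, `comp_tx`, and `comp_rx` names come from the text above.

```c
#include <stdio.h>

#define DEBUG 1                  /* normally set by the build system */

/* One bit per module; the values are assumptions. */
enum {
    comp_info = 1u << 0,
    comp_tx   = 1u << 1,
    comp_rx   = 1u << 2,
    comp_sdio = 1u << 3          /* hypothetical extra module */
};

#if DEBUG
/* Run-time mask: can be changed by an external command. */
unsigned int wl_debug_components = comp_info | comp_tx | comp_rx;

/* Print only if the module's bit is set. Evaluates to printf's return
 * value, or 0 when the module is switched off. */
#define WL_PRINT(comp, ...) \
    ((wl_debug_components & (comp)) ? printf(__VA_ARGS__) : 0)
#else
/* Release build: the print code is not compiled in at all. */
#define WL_PRINT(comp, ...) 0
#endif
```

A transmit path would then call, for example, `WL_PRINT(comp_tx, "tx frame len=%d\n", len);`, and a command handler could clear or set bits in `wl_debug_components` to silence or enable a module while the firmware is running.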


(9) Make clever use of the development environment and debugging tools

1. ARM's semihosting mechanism

Semihosting is one of ARM's distinctive features. Over JTAG, the program can exchange information with the command window of the debugging environment, without a serial port. For example, you can write a menu program that displays its menu in the command window; after the user makes a selection, the program performs the corresponding operation. Semihosting is easy to use and handy for some functional tests, but printf over semihosting performs worse than printf over a serial port.

2. ARM debugger breakpoint tools

RealView is a powerful debugging tool. Besides ordinary breakpoints, which stop the program when execution reaches them, it lets you set breakpoint expressions that stop the CPU under specified conditions.

1) Trigger a breakpoint when a specified memory address is read or written. During debugging, you often find that some area has been overwritten with invalid content and want to know which line of code did it.


This breakpoint stops the CPU when a write to address 0x0e001800 is executed, so you can see which code performed the write. The command can also have the CPU perform actions when it stops, such as printing a message to the console or executing a function.

2) Stop after a function, or a given line of code, has executed N times.


3) Stop on a condition. For example, you can set a breakpoint that stops when the global variable setchar equals 10. In this way, the program stops only when the condition you specify is met.

(10) Use CPU features to locate bugs

1. Use the exception modes of the ARM CPU to locate bugs

The common ARM CPU exceptions are the prefetch abort, the data abort, and the undefined instruction exception. For example, an undefined instruction exception occurs when an invalid instruction is executed (either the program has run away or the code area has been corrupted); a data abort occurs when data is illegally written to a read-only region; and a prefetch abort occurs when an instruction is fetched from a location for which the correct access permission has not been granted. When such a problem occurs, check the value of the R14 link register of the normal mode (user mode or supervisor mode) to find the return address of the code that was executing, then look it up in the map file to determine which function was running and where the exception may have occurred. This method has a good chance of locating the problem.

2. Use the MPU or MMU of the ARM9 CPU, in combination with the previous method, to locate bugs

The ARM946E CPU has an MPU. During programming, you can place the code segment and RO data in one region and set it to read-only, and place the RW and ZI data in another region and set it to read/write.

In this way, every write by the code to the read-only region is caught. Many bugs show up as data being written to the area around address 0x0. For example, if you write through a pointer variable whose initial value is 0, a data abort occurs.

(11) Software and hardware cooperation for debugging

  • The serial port is a good device for print output to assist debugging.
  • I/O port input and output can also be used, for example buttons and LEDs.
  • Use the system timer. In a real-time system, you can use the OS timer, with a resolution of 1 ms or 10 ms. For microsecond resolution, use a timer on the CPU directly, for example to measure the execution time of a code segment, or to check when errors occur and at what intervals.
  • Combine these with an oscilloscope or logic analyzer to observe the time and interval of the operations.

 

II. Conclusion

For an embedded developer, continuous learning, accumulated experience, and broad knowledge are a great help in improving development efficiency.

Be familiar with your tools and with the CPU architecture. Carefully read the manuals of your development tools, ARM's free CPU architecture documents, and ARM's free compiler tool documents. These documents are more valuable than the ARM development books sold in bookstores.

Read excellent code and build up programming skills and debugging methods. Many skills and techniques are shared across kernel development, Windows driver development, Linux driver development, and embedded firmware development.

Read chip manuals, including chip development manuals, to build up skills in hardware/software co-design and to understand the implementation mechanisms behind the code written for the chip.

When debugging, pay attention to symptoms and details. Your knowledge will help you locate problems more easily.

The above are some of my experiences in embedded development and debugging.
