Usage of cache in embedded processors

Authors: Wang Yan, Wu Xuguang, Zhao Xunfeng (Northwestern University of Technology)

As embedded computer applications develop, the clock speed of embedded CPUs keeps increasing, and slow system memory can no longer keep up with the processing capability of a high-speed CPU. To solve this problem, many high-performance embedded processors integrate a high-speed cache. One of them, Samsung's b0x, integrates an 8 KB unified instruction and data cache.

A cache is a high-speed buffer memory located between the CPU and main memory. The CPU fetches the instructions and data it needs from main memory, but it operates much faster than main memory can be read or written, and this gap limits the performance of the whole system. Cache technology keeps frequently used instructions and data in the cache and transfers them from main memory according to certain algorithms and policies, so the CPU can keep running at high speed without waiting for main memory. This meets the real-time and efficiency requirements of embedded systems. However, using a cache also introduces consistency problems, which deserve special attention in applications.

1. Discovery of cache consistency problems

The target board for this project uses the ARM-based b0x as its processor, with two flash memory chips and one SDRAM chip as memory. During debugging, a keyboard is used for input, a display for output, and an RS232 serial port for communication.

During development, a program that had been debugged successfully in software simulation aborted unexpectedly after being burned into the target board. Reading back the memory contents showed that the program could not run properly because the data written to memory did not match the data produced by compilation: some bytes were always wrong.

After a period of debugging, we found that as long as the cache was disabled in the program, the data written to memory contained no errors and the program ran normally, although noticeably more slowly. Analysis showed that the problem was caused by inconsistency between the data in the cache and the data in main memory.

What does it mean for cache data to be inconsistent with main memory data? In a system with a cache, the same data may exist in both the cache and main memory. If the two copies are identical, the data is consistent; if they differ, it is inconsistent. If consistency cannot be ensured, the program may fail later in its run.

2. Analysis of cache consistency problems

To solve the cache consistency problem, you must first understand how the cache operates. A cache can work in one of two modes: write-through and write-back. In write-through mode, whenever the CPU writes data to the cache, the cache controller immediately writes the same data to the corresponding location in main memory. Main memory therefore always holds the latest version of the cached data, and no new data can be lost. In write-back mode, by contrast, a modified cache line is written to main memory only later, typically when the line is replaced. Write-through is simple, but every cache update also requires a write to main memory, which causes frequent bus activity. The cache in the b0x uses write-through mode. On output, data is written to the cache and to main memory at the same time, which guarantees cache consistency for output. Consistency on input, however, is not guaranteed in this mode.
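The difference between output and input consistency can be illustrated with a toy model. The sketch below is not the b0x cache: the sizes, the one-byte lines, the structure `toy_system` and the function names are all assumptions made for illustration, and the "device" stands for any agent (for example a DMA transfer) that writes main memory without going through the cache.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MEM_SIZE   64u   /* toy main memory size in bytes (assumption) */
#define NUM_LINES  8u    /* toy cache with 8 one-byte lines (assumption) */

typedef struct {
    uint8_t mem[MEM_SIZE];     /* main memory */
    uint8_t data[NUM_LINES];   /* cached bytes */
    uint8_t tag[NUM_LINES];    /* zone number held by each line */
    uint8_t valid[NUM_LINES];  /* 1 if the line holds data */
} toy_system;

void toy_init(toy_system *s) { memset(s, 0, sizeof *s); }

/* CPU write in write-through mode: cache and main memory are updated together. */
void cpu_write(toy_system *s, unsigned addr, uint8_t value)
{
    assert(addr < MEM_SIZE);
    unsigned line = addr % NUM_LINES;
    s->data[line]  = value;
    s->tag[line]   = (uint8_t)(addr / NUM_LINES);
    s->valid[line] = 1;
    s->mem[addr]   = value;    /* the write-through step */
}

/* CPU read: served from the cache on a hit, filled from memory on a miss. */
uint8_t cpu_read(toy_system *s, unsigned addr)
{
    assert(addr < MEM_SIZE);
    unsigned line = addr % NUM_LINES;
    if (!s->valid[line] || s->tag[line] != (uint8_t)(addr / NUM_LINES)) {
        s->data[line]  = s->mem[addr];
        s->tag[line]   = (uint8_t)(addr / NUM_LINES);
        s->valid[line] = 1;
    }
    return s->data[line];
}

/* Input path: a device writes main memory without updating the cache. */
void device_write(toy_system *s, unsigned addr, uint8_t value)
{
    assert(addr < MEM_SIZE);
    s->mem[addr] = value;
}
```

In this model a `cpu_write` always leaves main memory up to date, but a `cpu_read` that follows a `device_write` to a cached address still returns the old cached byte, which is the input-side inconsistency described above.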

Next, let's look at how a cache is organized. Depending on how main memory is mapped onto the cache, there are three organizations: fully associative, direct mapped, and set associative. The principle of direct mapping is shown in Figure 1.

      
Figure 1. Direct mapping

Let main memory contain N blocks and the cache contain M lines. Main memory is divided into N/M zones of M blocks each, and blocks 0 to M-1 of every zone are mapped one-to-one onto cache lines L0 to LM-1. A tag therefore only needs to record the zone address (zone number) to determine uniquely which memory block a cache line holds. When the CPU issues a memory access, part of the memory address is used as the line index to select one cache line, and that line's tag is checked. If the tag matches the corresponding part of the memory address, the access is a cache hit, and this cache line is the only current image of the memory block being accessed. From this analysis we can see that in write-through mode every cache update also writes main memory, which causes frequent bus activity. If the bus suffers interference during a cache hit, data inconsistency may occur.
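The direct-mapping rule can be written down as simple address arithmetic. The following is a minimal sketch, assuming a 16-byte line and 512 lines (8 KB in total) purely for illustration; it does not claim to describe how the b0x cache is actually organized, and the type and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define LINE_BYTES  16u    /* bytes per cache line (assumed) */
#define NUM_LINES   512u   /* number of cache lines M (assumed: 8 KB / 16 B) */

typedef struct {
    uint32_t tag;     /* zone number */
    uint32_t line;    /* cache line index, 0 .. NUM_LINES-1 */
    uint32_t offset;  /* byte offset inside the line */
} cache_addr;

/* Split a memory address into zone number (tag), line index and offset. */
cache_addr split_address(uint32_t addr)
{
    cache_addr a;
    a.offset = addr % LINE_BYTES;
    a.line   = (addr / LINE_BYTES) % NUM_LINES;
    a.tag    = addr / (LINE_BYTES * NUM_LINES);
    return a;
}

/* Two addresses compete for one cache line when they share the line
 * index but lie in different zones. */
int same_line_different_zone(uint32_t a, uint32_t b)
{
    cache_addr x = split_address(a);
    cache_addr y = split_address(b);
    return x.line == y.line && x.tag != y.tag;
}

void split_address_example(void)
{
    cache_addr a = split_address(0x1234u);
    assert(a.tag == 0u && a.line == 0x123u && a.offset == 0x4u);
    /* 0x0010 and 0x2010 lie in zones 0 and 1 but both map to line 1. */
    assert(same_line_different_zone(0x0010u, 0x2010u));
}
```

A hit then simply means that the tag stored in line `a.line` equals `a.tag`.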

3. Solution to the cache consistency problem

The problem can be addressed in both software and hardware.

3.1 Software solution

The b0x cache provides complete enable and disable modes. Setting the CM field, bits [2:1] of the SYSCFG register, to 01 or 11 enables the cache (01 enables a 4 KB cache and 11 enables the full 8 KB cache); clearing the field to 00 disables the cache. Disabling the cache eliminates the data inconsistency. The code is as follows:

#define rsyscfg    (*(volatile unsigned *)0x1c00000)  // SYSCFG register
#define wrbufopt   (0x8)               // write_buf_on
#define syscfg_0kb (0x0 | wrbufopt)    // CM = 00: cache disabled
#define syscfg_4kb (0x2 | wrbufopt)    // CM = 01: 4 KB cache
#define syscfg_8kb (0x6 | wrbufopt)    // CM = 11: 8 KB cache
#define cachecfg   syscfg_0kb
rsyscfg = cachecfg;                    // disable cache
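The constants above overwrite the whole register. Where other bits of the register must be preserved, a read-modify-write of the two-bit field at [2:1] is an alternative. The sketch below works on a plain value so it can be tried on a host; the enum and the function name are hypothetical and not taken from any vendor header.

```c
#include <assert.h>
#include <stdint.h>

#define CM_SHIFT  1u                  /* the field occupies bits [2:1] */
#define CM_MASK   (0x3u << CM_SHIFT)

enum cache_mode {
    CM_DISABLED = 0x0,   /* 00: cache disabled */
    CM_4KB      = 0x1,   /* 01: 4 KB cache */
    CM_8KB      = 0x3    /* 11: 8 KB cache */
};

/* Return syscfg with only bits [2:1] replaced; all other bits are kept. */
uint32_t syscfg_with_cache_mode(uint32_t syscfg, enum cache_mode mode)
{
    assert(mode == CM_DISABLED || mode == CM_4KB || mode == CM_8KB);
    return (syscfg & ~CM_MASK) | ((uint32_t)mode << CM_SHIFT);
}
```

With the write buffer bit (0x8) set, the three modes give 0x8, 0xA and 0xE, which match the 0 KB, 4 KB and 8 KB constants in the listing. On the target, the result would be written back to the register.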

In addition, the b0x provides two non-cacheable areas. Each area needs two cache control fields that give its start and end addresses. Within a non-cacheable area, the cache is not updated on a cache miss during a read. If the addresses affected by the inconsistency are known, they can be placed in a non-cacheable area to prevent it. Placing a data area in a non-cacheable area can sometimes even make the program run faster, because most of its variables are never reused, and filling a 16-byte cache line for a variable that will not be reused is wasted work. In this system, the non-cacheable area is set to 0x2000000 ~ 0xc000000 to solve the data inconsistency problem. The code is as follows:

#define rsyscfg    (*(volatile unsigned *)0x1c00000)  // SYSCFG register
#define wrbufopt   (0x8)               // write_buf_on
#define syscfg_0kb (0x0 | wrbufopt)
#define syscfg_4kb (0x2 | wrbufopt)
#define syscfg_8kb (0x6 | wrbufopt)
#define cachecfg   syscfg_8kb
#define rncachbe0  (*(volatile unsigned *)0x1c00004)
#define rncachbe1  (*(volatile unsigned *)0x1c00008)
#define non_cache_start (0x2000000)
// Start address of the non-cacheable area
#define non_cache_end   (0xc000000)
// End address of the non-cacheable area
rsyscfg = cachecfg;
// 8 KB cache, write buffer enabled, data abort enabled
rncachbe0 = ((non_cache_end >> 12) << 16) | (non_cache_start >> 12); // the cache is not used for the data area above
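As a worked example of the register value, the sketch below packs a start and end address on the assumption that the intended encoding is ((end >> 12) << 16) | (start >> 12), that is, the end address in 4 KB units in the upper half-word and the start address in 4 KB units in the lower half-word. The function name and the alignment checks are illustrative additions, not part of the original listing.

```c
#include <assert.h>
#include <stdint.h>

/* Pack a non-cacheable range into one 32-bit value.
 * Assumes both addresses are 4 KB aligned and that end >> 12 fits in 16 bits. */
uint32_t ncachbe_value(uint32_t start, uint32_t end)
{
    assert((start & 0xFFFu) == 0u && (end & 0xFFFu) == 0u);
    assert(start < end);
    assert((end >> 12) <= 0xFFFFu);
    return ((end >> 12) << 16) | (start >> 12);
}
```

For the range used in this system, 0x2000000 >> 12 is 0x2000 and 0xc000000 >> 12 is 0xC000, so the packed value is 0xC0002000.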

Both methods eliminate the data inconsistency. However, a high-performance system needs its cache, and giving up the cache greatly reduces system performance. The design of an embedded system should therefore also address the hardware side to prevent data inconsistency at its root.

3.2 Hardware Solution

Because the clock speeds of current embedded processors keep rising and the number of address and data lines keeps growing, hardware design and soldering must pay special attention to high-frequency interference. High-frequency interference degrades signal integrity, and the degraded signals can produce bad bytes during bus transfers. High-speed PCB design is therefore particularly important. In high-speed PCB design, techniques based on the characteristics of high-speed signal networks and on trace control have become the key to successful high-speed digital designs. Pay attention to the following points in the design:

① When cost permits, use a multilayer PCB wherever possible.

② Route high-frequency traces as straight lines where possible. Where a bend is needed, use a 45° segment or an arc. In high-frequency circuits, this reduces the radiation and mutual coupling of high-frequency signals.

③ The fewer layer changes on the traces between pins of high-frequency devices, the better. According to measurements, a single via adds about 0.5 pF of distributed capacitance, so reducing the number of vias can significantly increase speed.

④ In high-frequency routing, watch for the crosstalk introduced by signal lines running in parallel at close range. If parallel runs cannot be avoided, a large ground area can be placed on the opposite side of the parallel signal lines to reduce interference. Parallel routing within one layer is almost unavoidable, but the routing directions on two adjacent layers must be perpendicular to each other.

⑤ Place a high-frequency decoupling capacitor near each IC.

⑥ Analog circuits and digital circuits should have separate ground lines.

⑦ Surround particularly important signal lines or local units with ground traces. Signal lines must not form loops, and ground lines must not form current loops.

A PCB that follows the design rules above can generally meet the requirements of high-speed signals.

Finally, solder joints must be smooth, because sharp points on solder joints produce high-frequency interference.

Following these rules ensures that the bus is free of unnecessary interference during signal transmission, which prevents data inconsistency.

Conclusion

Embedded processors are now widely used. The methods described in this article for handling cache data inconsistency also apply to other models of high-frequency embedded processors. Mastering some basic design and debugging experience can greatly improve efficiency and avoid unnecessary trouble during system development.

Wang Yan: Master's degree. His work focuses on ARM applications in the control field and the embedded operating system VxWorks.
Professor Wu Xuguang: He is mainly engaged in embedded operating systems, bus and distributed control, robust control theory, system modeling and simulation.
