In DSP algorithm design we often face the following two options. Or can they be combined? I haven't figured that out yet.
Option 1: Cache
Option 2: EDMA + L2 SRAM
TI's website has a comparison worked out for the VLIB Canny edge detection. If the pipeline is well designed, option 2 is faster. In practice, though, it depends on many factors:
1. Data Locality
2. Processing complexity. The more complex the processing, the more advantageous working out of L2 SRAM seems to be.
3. SRAM size. At present, most TI DSPs can be configured with up to kb or kb of L2 SRAM, so the amount of data EDMA moves per transfer directly affects speed.
4. EDMA settings. Whether the data is moved in one large transfer, or with linked, chained, or ping-pong transfers, has a great impact.
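To make the ping-pong idea in point 4 concrete, here is a minimal host-side sketch in plain C. It is not TI code: `dma_start`, `dma_wait`, `process_block`, `pingpong_sum` and the `BLOCK` size are all hypothetical names I made up, and `memcpy` stands in for the EDMA transfer. On real hardware `dma_start` would return immediately, so the transfer of block b+1 would overlap with the processing of block b.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLOCK 256  /* elements per block; hypothetical, size it to fit your L2 SRAM */

/* Two working buffers (these would be placed in L2 SRAM on the DSP). */
static int16_t l2_buf[2][BLOCK];

/* Stand-in for kicking off an EDMA transfer. Here it is a blocking memcpy;
   on hardware the copy would run in the background. */
static void dma_start(int16_t *dst, const int16_t *src, size_t n)
{
    memcpy(dst, src, n * sizeof *dst);
}

/* Stand-in for polling the transfer-complete flag. Nothing to wait for here. */
static void dma_wait(void) { }

/* Example kernel: sum the samples of one block. */
static int32_t process_block(const int16_t *p, size_t n)
{
    int32_t acc = 0;
    for (size_t i = 0; i < n; i++)
        acc += p[i];
    return acc;
}

/* Process nblocks blocks of BLOCK samples from external memory, alternating
   between the two buffers. */
int32_t pingpong_sum(const int16_t *ext, size_t nblocks)
{
    int32_t total = 0;
    int cur = 0;

    assert(ext != NULL);
    if (nblocks == 0)
        return 0;

    dma_start(l2_buf[cur], ext, BLOCK);          /* prime the first buffer */
    for (size_t b = 0; b < nblocks; b++) {
        dma_wait();                              /* block b is now in l2_buf[cur] */
        if (b + 1 < nblocks)                     /* fetch block b+1 into the other buffer */
            dma_start(l2_buf[cur ^ 1], ext + (b + 1) * BLOCK, BLOCK);
        total += process_block(l2_buf[cur], BLOCK);
        cur ^= 1;                                /* swap ping and pong */
    }
    return total;
}
```

Calling `pingpong_sum(ext, 4)` on a buffer of `4 * BLOCK` samples gives the same result as summing the buffer directly. The point of the structure is only that the next transfer is started before the current block is processed.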
Summary before proceeding:
1. Use the cache first, and pay attention to cache coherence.
2. If that is not fast enough, try EDMA next, and finally consider ping-pong buffering.
Any better advice? Thanks.
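On the cache coherence point, the usual ordering as I understand it is: write back the source range before the DMA reads it, and invalidate the destination range after the DMA has written it. Below is a small sketch of that ordering only. `cache_writeback`, `cache_invalidate`, `dma_copy` and `coherent_dma_copy` are hypothetical stubs, not a real TI API; here they just record the order in which they were called, and `memcpy` stands in for the transfer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Records the order of operations: W = writeback, D = DMA, I = invalidate. */
static char op_log[16];
static size_t op_len;

static void log_op(char c)
{
    assert(op_len + 1 < sizeof op_log);
    op_log[op_len++] = c;
    op_log[op_len] = '\0';
}

/* Hypothetical stub: push dirty cache lines for [p, p+n) out to memory. */
static void cache_writeback(const void *p, size_t n)
{
    (void)p; (void)n;
    log_op('W');
}

/* Hypothetical stub: discard cached copies of [p, p+n). */
static void cache_invalidate(void *p, size_t n)
{
    (void)p; (void)n;
    log_op('I');
}

/* Hypothetical stub: a blocking memcpy standing in for a completed DMA transfer. */
static void dma_copy(void *dst, const void *src, size_t n)
{
    memcpy(dst, src, n);
    log_op('D');
}

/* Copy via "DMA" with the cache maintenance around it.
   Assumption: dst holds no dirty lines when this is called; otherwise those
   lines would also need to be dealt with before the transfer starts. */
void coherent_dma_copy(void *dst, const void *src, size_t n)
{
    cache_writeback(src, n);   /* make sure the DMA reads what the CPU wrote */
    dma_copy(dst, src, n);
    cache_invalidate(dst, n);  /* make sure the CPU reads what the DMA wrote */
}

const char *coherence_log(void)
{
    return op_log;
}
```

On a real device the buffers would normally also be aligned and sized to whole cache lines, so that maintaining one buffer does not touch neighbouring data.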