This article tests the performance of two Nios II processor cores, the economy core (Nios II/e) and the fast core (Nios II/f), under different compiler optimization settings. The following operations are timed:
1. printf // print a string
2. usleep(1000) // sleep for 1 ms
3. IOWR_ALTERA_AVALON_PIO_DATA // PIO port write
The test code is as follows:
#include <stdio.h>
#include <unistd.h>
#include "system.h"
#include "altera_avalon_performance_counter.h"
#include "altera_avalon_pio_regs.h"

int main(void)
{
    /* Reset all counters, then start the global measurement. */
    PERF_RESET(PERFORMANCE_COUNTER_BASE);
    PERF_START_MEASURING(PERFORMANCE_COUNTER_BASE);

    PERF_BEGIN(PERFORMANCE_COUNTER_BASE, 1);
    printf("Hello from Nios II!\n"); // test printf
    PERF_END(PERFORMANCE_COUNTER_BASE, 1);

    PERF_BEGIN(PERFORMANCE_COUNTER_BASE, 2);
    usleep(1000); // test usleep
    PERF_END(PERFORMANCE_COUNTER_BASE, 2);

    PERF_BEGIN(PERFORMANCE_COUNTER_BASE, 3);
    // IOWR_ALTERA_AVALON_PIO_DATA(PIO_BASE, 0x00); // test IOWR
    IOWR(PIO_BASE, 0, 0x00);
    PERF_END(PERFORMANCE_COUNTER_BASE, 3);

    PERF_STOP_MEASURING(PERFORMANCE_COUNTER_BASE);

    /* Print one report row per section: printf, usleep, IOWR. */
    perf_print_formatted_report((void *)PERFORMANCE_COUNTER_BASE, alt_get_cpu_freq(),
                                3, "printf", "usleep", "IOWR");
    return 0;
}
The test results are as follows:
Table 1: Nios II/e, no optimization
Section | % | Time (sec) | Time (clocks) | Occurrences
printf | 20.9 | 0.00113 | 56317 | 1
usleep | 79 | 0.00427 | 213400 | 1
IOWR | 0.0459 | 0.00000 | 124 | 1
Table 2: Nios II/e, optimization: -O3
Section | % | Time (sec) | Time (clocks) | Occurrences
printf | 13.9 | 0.00065 | 32648 | 1
usleep | 86 | 0.00404 | 201865 | 1
IOWR | 0.0383 | 0.00000 | 90 | 1
Table 3: Nios II/e, optimization: -Os
Section | % | Time (sec) | Time (clocks) | Occurrences
printf | 13.4 | 0.00063 | 31439 | 1
usleep | 86.5 | 0.00405 | 202681 | 1
IOWR | 0.0358 | 0.00000 | 84 | 1
Table 4: Nios II/f, no optimization
Section | % | Time (sec) | Time (clocks) | Occurrences
printf | 18.7 | 0.00023 | 11387 | 1
usleep | 81.1 | 0.00099 | 49290 | 1
IOWR | 0.0428 | 0.00000 | 26 | 1
Table 5: Nios II/f, optimization: -O3
Section | % | Time (sec) | Time (clocks) | Occurrences
printf | 11 | 0.00012 | 5969 | 1
usleep | 88.9 | 0.00097 | 48281 | 1
IOWR | 0.0147 | 0.00000 | 8 | 1
Table 6: Nios II/f, optimization: -Os
Section | % | Time (sec) | Time (clocks) | Occurrences
printf | 12.1 | 0.00013 | 6653 | 1
usleep | 87.8 | 0.00097 | 48315 | 1
IOWR | 0.0473 | 0.00000 | 26 | 1
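The Time (sec) and % columns in these tables can be reproduced from the clock counts. The following is a minimal sketch, not part of the Altera driver: the helper names are hypothetical, the 50 MHz clock is an assumption (it is the frequency this article uses for its DMIPS estimate, and the reported times are consistent with it), and the percentage is approximated against the sum of the three sections rather than the counter's own global total.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed CPU clock: 50 MHz (an assumption, see the lead-in). */
#define ASSUMED_CPU_HZ 50000000u

/* Convert a clock count to seconds at the given clock frequency. */
double clocks_to_seconds(uint64_t clocks, uint32_t cpu_hz)
{
    assert(cpu_hz > 0);
    return (double)clocks / (double)cpu_hz;
}

/* Share of the total measured clocks spent in one section, in percent. */
double section_percent(uint64_t section_clocks, uint64_t total_clocks)
{
    assert(total_clocks > 0);
    return 100.0 * (double)section_clocks / (double)total_clocks;
}
```

For the printf row of Table 1, clocks_to_seconds(56317, ASSUMED_CPU_HZ) gives about 0.00113 s, and section_percent(56317, 56317 + 213400 + 124) gives about 20.9%.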
Comparing the data in the six tables above shows that Nios II/e and Nios II/f still differ considerably in performance: the measured clock counts differ by more than a factor of 4. On the same processor, enabling optimization improves performance by about 25% over the unoptimized build, and -O3 and -Os give essentially the same results. In addition, usleep(1000), which requests a 1 ms delay, delivers the requested delay only on Nios II/f (0.97 to 0.99 ms measured). On Nios II/e the actual delay is about 4 times the requested time, that is, about 4 ms.
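The comparisons above come down to two ratios over the clock counts in the tables. A minimal sketch follows; the function names are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* How many times fewer clocks the faster run needs than the slower one. */
double speedup(uint64_t slow_clocks, uint64_t fast_clocks)
{
    assert(fast_clocks > 0);
    return (double)slow_clocks / (double)fast_clocks;
}

/* Percentage of clocks saved when going from `before` to `after`. */
double percent_saved(uint64_t before, uint64_t after)
{
    assert(before > 0);
    return 100.0 * ((double)before - (double)after) / (double)before;
}
```

With the unoptimized figures, speedup(56317, 11387) is about 4.9 for printf and speedup(213400, 49290) is about 4.3 for usleep. The gain from optimization varies by section: on Nios II/e with -O3, percent_saved(56317, 32648) is about 42% for printf, while percent_saved(213400, 201865) is about 5% for usleep.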
According to information provided by Altera:
The Nios II performance benchmarks (an Altera document) give the following DMIPS/MHz ratios: Nios II/f 1.105, Nios II/s 0.518, Nios II/e 0.107.
With a 50 MHz clock, the DMIPS of each CPU is:
Nios II/f: 1.105 * 50 = 55.25 DMIPS
Nios II/s: 0.518 * 50 = 25.9 DMIPS
Nios II/e: 0.107 * 50 = 5.35 DMIPS
These figures apply to a program running from on-chip memory, compiled with -O3, on a Cyclone II device. The program in this article runs from external SDRAM.
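The DMIPS estimates above are the DMIPS/MHz ratio multiplied by the clock frequency in MHz. A small sketch of that arithmetic, with hypothetical macro and function names and the ratios copied from the figures quoted in this article:

```c
#include <assert.h>

/* DMIPS/MHz ratios as quoted in this article from Altera's benchmarks. */
#define DMIPS_PER_MHZ_F 1.105
#define DMIPS_PER_MHZ_S 0.518
#define DMIPS_PER_MHZ_E 0.107

/* Estimated DMIPS = (DMIPS per MHz) * (clock frequency in MHz). */
double estimated_dmips(double dmips_per_mhz, double clock_mhz)
{
    assert(clock_mhz >= 0.0);
    return dmips_per_mhz * clock_mhz;
}
```

For example, estimated_dmips(DMIPS_PER_MHZ_F, 50.0) gives 55.25 and estimated_dmips(DMIPS_PER_MHZ_E, 50.0) gives 5.35. The benchmark ratio between Nios II/f and Nios II/e is therefore about 10, whereas the clock counts measured in this article differ by a factor of 4 to 5; the two sets of numbers were obtained under different conditions, as noted above.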