I. Introduction
To evaluate the performance of a system, there are usually different metrics and, accordingly, different test methods and testing tools. Generally, to ensure that test results are fair and authoritative, relatively mature commercial testing software is used. In some situations, however, such as a quick comparison of different systems or of the performance of some function libraries, an excellent tool from the open-source world can do the job.
This section briefly introduces lmbench, a comprehensive system performance test suite.
II. Test Software
lmbench is a simple, portable suite of micro-benchmarks written in ANSI C for UNIX/POSIX systems. In general, it measures two key properties: latency (response time) and bandwidth. It aims to give system developers insight into the basic costs of key operations.
Software description: lmbench is an open-source benchmark used to evaluate overall system performance. It can measure file reads and writes, memory operations, process creation and destruction overhead, and network performance, and the test method is simple.
lmbench runs on many platforms, so it can be used to compare systems of the same class and show their relative strengths and weaknesses. By selecting different library functions, we can also compare the performance of those functions. More importantly, as open-source software, lmbench provides a testing framework: a tester with more demanding requirements can meet them by modifying a small amount of source code. For example, lmbench currently measures only process creation and termination and the overhead of process context switches; thread-level tests could be added by modifying some of the code.
Download: www.bitmover.com/lmbench (latest version: 3.0-a9)
lmbench main functions:
* Bandwidth benchmarks
- cached file read
- memory copy
- memory read
- memory write
- pipe
- TCP
* Latency benchmarks
- context switching
- networking: connection establishment, pipe, TCP, UDP, and RPC hot potato
- file system creates and deletes
- process creation
- signal handling
- system call overhead
- memory read latency
* Miscellaneous
- processor clock rate calculation
lmbench main features:
- Portability: the benchmarks are written in C and are quite portable (although they compile most easily with GCC). This is useful for producing detailed comparisons between systems.
- Adaptive tuning: lmbench is useful for catching bad behavior under load. When "BloatOS" turns out to be 4 times slower than all of its competitors, resources tend to get allocated to fixing the problem.
- Results database: the database of results includes runs from most mainstream workstation vendors.
- Memory latency results: the memory latency test shows the latency of all of the system's (data) caches, such as the level 1, 2, and 3 caches, as well as main memory and TLB miss latency. In addition, the cache sizes can be read off a properly plotted result set. Hardware people like this. The benchmark has also found errors in operating system paging policies.
- Context switching results: many people like context switch numbers. This benchmark does not settle for quoting only the "in cache" number. It varies the number and size of the processes, and the results are plotted so that the user can see when the context no longer fits in the cache. You can also get the real cost of a cold-cache context switch.
- Regression testing: Sun and SGI have used the benchmarks to find and fix performance problems. Intel used them for the P6. Linux uses them for performance tuning.
- New benchmarks: the source code is fairly small, readable, and easy to extend. It can readily be reworked into a different shape to measure something else. For example, the library routines that handle network connection establishment, server shutdown, and so on are included.
III. Test
I ran two sets of tests: one on my PC and one on the SEP4020 (ARM720T) platform.
(1) PC test
Test platform: HP Compaq, Fedora 7, Linux 2.6.21
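The numbered steps that follow can be condensed into a small script. This is only a sketch: the tarball name and /root/test come from this article, while the helper names run and lmbench_pc, the DRY_RUN switch, and the choice of gcc as the compiler to look for are my own illustrative assumptions, not part of lmbench.

```shell
#!/bin/sh
# Sketch of the PC procedure. run, lmbench_pc and DRY_RUN are hypothetical names.

run() {
    echo "+ $*"                        # show each command before running it
    [ "${DRY_RUN:-0}" = 1 ] || "$@"    # execute it unless DRY_RUN=1
}

lmbench_pc() {
    workdir=${1:-/root/test}           # directory holding lmbench-3.0-a9.tgz
    run command -v gcc &&              # step 1: is a C compiler installed? (gcc assumed)
    run cd "$workdir" &&
    run tar xzf lmbench-3.0-a9.tgz &&  # step 2: unpack in place
    run cd lmbench-3.0-a9 &&
    run make results &&                # steps 3-5: build, answer the prompts, run
    run make see                       # step 6: write the summary report
}
```

Setting DRY_RUN=1 before calling lmbench_pc only prints the commands, which is a cheap way to check the sequence before the long real run. Note that make results is interactive, so a real run still stops at the configuration prompts described in step 4.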
1. Confirm that a C compiler is installed. If not, install one first.
2. Copy the lmbench source package lmbench-3.0-a9.tgz to the /root/test directory on Fedora and extract it in place.
3. cd lmbench-3.0-a9, then enter make results on the command line to compile and start the tests.
4. If there are no compilation errors, a series of prompts appears to configure the test and generate a configuration script. This script is used for the current run and can be reused to repeat the test with the same configuration later. Apart from the memory range to test (for example "MB [default 371]"; on a machine with a lot of memory, avoid choosing too large a value, or the test will take a long time) and the "Mail results" prompt, the default values can be accepted.
5. lmbench runs the test items according to the configuration file and creates a subdirectory under the results directory named after the system type and operating system type. The result file (system name + serial number) is stored in that subdirectory.
6. After the test completes, run make see to produce the test report. It exports the test data files under results/i686-pc-linux-gnu/ to the report file results/summary.out; view summary.out to see the results.
(2) sep4020 test
Test Platform: sep4020 evb1.5, Linux 2.6.16
1. Confirm that the cross compiler arm-linux-gcc is installed on the host machine. If not, install it first.
2. Copy the lmbench source package lmbench-3.0-a9.tgz to the /root/test directory on Fedora and extract it in place.
3. cd lmbench-3.0-a9 and enter make CC=arm-linux-gcc OS=arm-linux on the command line to compile the test cases. After compilation, lmbench-3.0-a9/bin contains an arm-linux directory holding the test binaries for the target. Because our target platform does not support the make command, we have to write a separate run script, run_all.sh, placed under scripts, with the following content:
    #!/bin/sh
    echo run lmbench on sep4020 arm-linux
    # generate the configuration interactively, then run the benchmarks
    env OS=arm-linux ./config-run
    env OS=arm-linux ./results
4. Then copy the lmbench-3.0-a9 directory to the NFS root directory of the target machine, and on the target's serial console go to /lmbench-3.0-a9/scripts and run ./run_all.sh. If the cross-compilation went without errors, a series of prompts appears to configure the test and generate a configuration script. This script is used for the current run and can be reused to repeat the test with the same configuration later. Apart from the memory range to test (for example "MB [default 19]"; avoid choosing too large a value, or the test will take a long time) and the "Mail results" prompt, the default values can be accepted.
5. lmbench runs the test items according to the configuration file and creates a subdirectory under the results directory named after the system type and operating system type. The result file (system name + serial number) is stored in that subdirectory.
6. After the test completes, go to /nfs/lmbench-3.0-a9 in the Fedora 7 virtual machine and enter make see to generate the test report. It exports the test data files under the results/ subdirectories (arm-linux/ for the board, alongside the earlier i686-pc-linux-gnu/ data) to the report file results/summary.out; view summary.out to see the results.
IV. Test results and description
make[1]: Entering directory `/nfs/lmbench-3.0-a9/results'
                 L M B E N C H  3 . 0   S U M M A R Y
                 ------------------------------------
                 (Alpha software, do not distribute)
Basic system parameters
------------------------------------------------------------------------------
Host                 OS Description              Mhz   tlb cache    mem scal
                                                     pages  line    par load
                                                           bytes
--------- ------------- ----------------------- ---- ----- ----- ------ ----
192.168.0  Linux 2.6.16 arm-linux                 85    60     8 1.0000    1
192.168.0  Linux 2.6.27 arm-linux                 86    63    16 1.0000    1
192.168.0  Linux 2.6.16 arm-linux                 86    63    16 1.0000    1
192.168.0  Linux 2.6.16 arm-linux                 86    63    16 1.0000    1
192.168.0  Linux 2.6.16 arm-linux                 86    63    16 1.0000    1
localhost Linux 2.6.21- i686-pc-linux-gnu       1817     8   128 1.3300    1
localhost Linux 2.6.21- i686-pc-linux-gnu       1864     8   128 1.2900    1
Processor, Processes - times in microseconds - smaller is better
------------------------------------------------------------------------------
Host                 OS  Mhz null null      open slct  sig  sig fork exec   sh
                             call  I/O stat clos  TCP inst hndl proc proc proc
--------- ------------- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ----
192.168.0  Linux 2.6.16   85 2.04 8.44 187. 2064      21.0 81.2 9655 42.K 63.K
192.168.0  Linux 2.6.27   86 2.69 8.44 266. 5338      20.7 94.7 10.K 44.K 73.K
192.168.0  Linux 2.6.16   86 2.03 8.34 185. 5100      20.7 85.9 9468 63.K 121K
192.168.0  Linux 2.6.16   86 2.03 8.72 185. 19.K      20.7 84.9 9556 53.K 72.K
192.168.0  Linux 2.6.16   86 2.04 8.33 185. 5321      20.7 80.5 9395 42.K 101K
localhost Linux 2.6.21- 1817 1.11 1.26 3.08 5.17 10.2 1.70 2.85 674. 1922 5177
localhost Linux 2.6.21- 1864 1.09 1.26 2.98 5.05 8.94 1.48 3.27 1083 2086 6119
Basic integer operations - times in nanoseconds - smaller is better
-------------------------------------------------------------------
Host                 OS  intgr  intgr  intgr  intgr  intgr
                           bit    add    mul    div    mod
--------- ------------- ------ ------ ------ ------ ------
192.168.0  Linux 2.6.16   11.6 8.6900   52.1 1489.3  255.9
192.168.0  Linux 2.6.27   11.5 8.5800   52.2 1469.2  252.6
192.168.0  Linux 2.6.16   11.5 8.5400   52.2 1472.0  252.9
192.168.0  Linux 2.6.16   11.5 8.6200   52.0 1472.8  251.9
192.168.0  Linux 2.6.16   11.5 8.6400   52.2 1472.5  254.5
localhost Linux 2.6.21- 0.5600 0.2800 0.2000   20.6   10.9
localhost Linux 2.6.21- 0.6100 0.2700 0.1700   20.0 9.8600
Basic uint64 operations - times in nanoseconds - smaller is better
------------------------------------------------------------------
Host                 OS  int64  int64  int64  int64  int64
                           bit    add    mul    div    mod
--------- ------------- ------ ------ ------ ------ ------
192.168.0  Linux 2.6.16    23.         691.6 4295.6 3895.0
192.168.0  Linux 2.6.27    23.         685.4 4192.8 4074.3
192.168.0  Linux 2.6.16    23.         683.0 4199.0 4082.1
192.168.0  Linux 2.6.16    23.         680.7 4202.6 4082.9
192.168.0  Linux 2.6.16    23.         686.9 4235.7 4080.3
localhost Linux 2.6.21-  0.690        0.6200   34.5   41.4
localhost Linux 2.6.21-  0.660        0.6100   36.8   40.2
Basic float operations - times in nanoseconds - smaller is better
-----------------------------------------------------------------
Host                 OS  float  float  float  float
                           add    mul    div   bogo
--------- ------------- ------ ------ ------ ------
192.168.0  Linux 2.6.16 6902.1 7781.9  12.1K  42.2K
192.168.0  Linux 2.6.27 6911.0 6568.4  11.6K  43.0K
192.168.0  Linux 2.6.16 6757.4 7578.5  11.9K  43.5K
192.168.0  Linux 2.6.16 6763.1 7611.3  11.7K  43.5K
192.168.0  Linux 2.6.16 6759.3 7640.4  11.9K  43.5K
localhost Linux 2.6.21- 1.6600 2.7900   21.7   20.6
localhost Linux 2.6.21- 1.6300 2.7200   20.9   20.1
Basic double operations - times in nanoseconds - smaller is better
------------------------------------------------------------------
Host                 OS double double double double
                           add    mul    div   bogo
--------- ------------- ------ ------ ------ ------
192.168.0  Linux 2.6.16 9955.5  10.6K  22.8K  79.8K
192.168.0  Linux 2.6.27 9157.0 9909.4  20.6K  79.4K
192.168.0  Linux 2.6.16 9793.3  10.3K  22.4K  79.8K
192.168.0  Linux 2.6.16 9703.9  10.4K  22.2K  79.9K
192.168.0  Linux 2.6.16 9746.9  10.3K  22.3K  79.7K
localhost Linux 2.6.21- 1.6900 2.7900   21.2   20.6
localhost Linux 2.6.21- 1.6300 2.8800   21.0   20.2
Context switching - times in microseconds - smaller is better
-------------------------------------------------------------------------
Host                 OS  2p/0K 2p/16K 2p/64K 8p/16K 8p/64K 16p/16K 16p/64K
                         ctxsw  ctxsw  ctxsw  ctxsw  ctxsw   ctxsw   ctxsw
--------- ------------- ------ ------ ------ ------ ------ ------- -------
192.168.0  Linux 2.6.16  164.8  120.0  311.9  165.3  162.5   165.9   151.1
192.168.0  Linux 2.6.27  247.5  196.1  198.4  238.0  254.9   262.9   291.2
192.168.0  Linux 2.6.16  164.4  118.5  115.2  161.1  156.4   164.4   164.3
192.168.0  Linux 2.6.16  167.2  116.6  119.6  166.9  161.9   171.3   158.1
192.168.0  Linux 2.6.16  172.5  117.4  114.3  161.3  147.6   163.8   127.5
localhost Linux 2.6.21-   11.0   11.6   11.7   15.3   19.2    16.8    25.1
localhost Linux 2.6.21-   10.2   11.4   11.3   14.3   20.9    17.4    26.0
*Local* Communication latencies in microseconds - smaller is better
---------------------------------------------------------------------
Host                 OS 2p/0K  Pipe   AF   UDP  RPC/   TCP  RPC/  TCP
                        ctxsw       UNIX         UDP         TCP conn
--------- ------------- ----- ----- ---- ----- ----- ----- ----- ----
192.168.0  Linux 2.6.16 164.8 482.3 925.
192.168.0  Linux 2.6.27 247.5 770.7 1069
192.168.0  Linux 2.6.16 164.4 477.4 917.
192.168.0  Linux 2.6.16 167.2 472.9 926.
192.168.0  Linux 2.6.16 172.5 474.9 913.
localhost Linux 2.6.21-  11.0  28.3 50.8  45.9  55.2  48.2  59.8 126.
localhost Linux 2.6.21-  10.2  32.1 55.7  36.7  49.2  40.2  53.1 113.
*Remote* Communication latencies in microseconds - smaller is better
---------------------------------------------------------------------
Host                 OS   UDP  RPC/   TCP  RPC/  TCP
                                UDP         TCP conn
--------- ------------- ----- ----- ----- ----- ----
192.168.0  Linux 2.6.16
192.168.0  Linux 2.6.27
192.168.0  Linux 2.6.16
192.168.0  Linux 2.6.16
192.168.0  Linux 2.6.16
localhost Linux 2.6.21-
localhost Linux 2.6.21-
File & VM system latencies in microseconds - smaller is better
-------------------------------------------------------------------------------
Host                 OS    0K File      10K File       Mmap  Prot    Page 100fd
                        Create Delete Create Delete Latency Fault   Fault selct
--------- ------------- ------ ------ ------ ------ ------- ----- ------- -----
192.168.0  Linux 2.6.16 6410.3 6135.0  37.0K 6896.6  5112.0 3.124    36.8 280.8
192.168.0  Linux 2.6.27  18.9K  71.4K  55.6K  28.6K   16.2K  15.9    54.2 194.3
192.168.0  Linux 2.6.16  22.7K  15.4K 1000.K  47.6K  4926.0 5.213    37.1 284.2
192.168.0  Linux 2.6.16  31.2K  29.4K  41.7K  50.0K  4907.0 1.087    36.0 277.1
192.168.0  Linux 2.6.16  33.3K  25.0K  58.8K 9434.0  5108.0 9.428    37.1 285.6
localhost Linux 2.6.21-  112.0   12.4   88.5  130.8  7413.0 2.360 5.98870 4.635
localhost Linux 2.6.21-   36.1   19.0  181.2  138.4  9006.0 2.134   482.1 4.148
*Local* Communication bandwidths in MB/s - bigger is better
-----------------------------------------------------------------------------
Host                 OS Pipe   AF  TCP   File   Mmap  Bcopy  Bcopy  Mem   Mem
                             UNIX      reread reread (libc) (hand) read write
--------- ------------- ---- ---- ---- ------ ------ ------ ------ ---- -----
192.168.0  Linux 2.6.16 10.2 11.2        13.1   32.8   19.1   17.8 32.8  72.9
192.168.0  Linux 2.6.27 8.96 11.4        12.9   32.6   19.1   17.7 32.7  71.2
192.168.0  Linux 2.6.16 10.2 11.2        13.0   32.8   19.0   17.8 32.7  71.2
192.168.0  Linux 2.6.16 10.2 11.2        12.9   32.9   19.0   17.8 32.9  71.6
192.168.0  Linux 2.6.16 10.2 11.2        12.9   32.9   19.0   17.8 32.7  71.6
localhost Linux 2.6.21- 1153 436. 640. 1742.8 3463.7 1239.0 1116.5 3502 1589.
localhost Linux 2.6.21- 1194 451. 744. 1742.3 3443.5 1217.8 1159.0 3357 1555.
Memory latencies in nanoseconds - smaller is better
    (WARNING - may not be correct, check graphs)
------------------------------------------------------------------------------
Host                 OS  Mhz   L1 $   L2 $ Main mem Rand mem Guesses
--------- ------------- ---- ------ ------ -------- -------- ------------
192.168.0  Linux 2.6.16   85   33.6  293.4    296.6    856.8 No L2 cache?
192.168.0  Linux 2.6.27   86   35.2  293.8    309.8    863.1 No L2 cache?
192.168.0  Linux 2.6.16   86   35.4  293.7    310.3    861.4 No L2 cache?
192.168.0  Linux 2.6.16   86   35.4  293.7    309.9    863.6 No L2 cache?
192.168.0  Linux 2.6.16   86   35.4  293.6    308.2    860.2 No L2 cache?
localhost Linux 2.6.21- 1817 1.6620 7.9160     98.6    191.7
localhost Linux 2.6.21- 1864 1.7240 7.7130    104.3    205.4
make[1]: Leaving directory `/nfs/lmbench-3.0-a9/results'
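The report is long, so it can help to pull out one table at a time. A minimal sketch, assuming the tables in summary.out are separated by blank lines (worth checking against your own file); show_section is a hypothetical helper name, not an lmbench command.

```shell
#!/bin/sh
# show_section TITLE FILE: print the table whose title line starts with TITLE.
# Assumption: a blank line ends each table in FILE.
show_section() {
    awk -v title="$1" '
        index($0, title) == 1 { on = 1 }   # table starts at its title line
        on && NF == 0         { exit }     # and ends at the next blank line
        on                    { print }
    ' "$2"
}
```

For example, show_section 'Context switching' results/summary.out would print only the context-switch table.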
Key technical parameters:
In the tables above, the host localhost is the virtual machine I used, and 192.168.0 is the SEP4020 board.
The categories, their technical parameters, and a description of each parameter are listed below.
(1) Basic system parameters
Tlb pages: number of pages in the TLB (translation lookaside buffer)
Cache line bytes: number of bytes in a cache line
Mem par: memory hierarchy parallelism
Scal load: number of lmbench instances run in parallel
(2) Processor, Processes (time taken by processor and process operations)
Null call: a trivial system call (fetching a process ID)
Null I/O: trivial I/O operations (average of a null read and a null write)
Stat: retrieving the status of a file
Open clos: opening a file and closing it immediately
Slct TCP: a select call on TCP descriptors
Sig inst: installing a signal handler
Sig hndl: catching and handling a signal
Fork proc: fork a process that exits immediately
Exec proc: fork, call execve, then exit
Sh proc: fork, run a command through the shell, then exit
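Each of these columns comes from one of the small binaries that lmbench builds under bin/, so a single item can be re-run without repeating the whole suite. The mapping below is a sketch based on my reading of the lmbench 3 driver script; treat the exact arguments as assumptions and confirm them against the man pages shipped with the source. bench_cmd is a hypothetical helper, and FILE stands for any existing file.

```shell
#!/bin/sh
# bench_cmd COLUMN: print the lmbench command assumed to produce a summary column.
bench_cmd() {
    case $1 in
        "null call") echo "lat_syscall null" ;;
        "stat")      echo "lat_syscall stat FILE" ;;
        "open clos") echo "lat_syscall open FILE" ;;
        "sig inst")  echo "lat_sig install" ;;
        "sig hndl")  echo "lat_sig catch" ;;
        "fork proc") echo "lat_proc fork" ;;
        "exec proc") echo "lat_proc exec" ;;
        "sh proc")   echo "lat_proc shell" ;;
        *)           echo "unknown column: $1" >&2; return 1 ;;
    esac
}
```

On the board, for instance, the command printed by bench_cmd 'null call' would be run from the bin/arm-linux directory mentioned in the sep4020 steps.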
(3) Basic integer/float/double operations
Omitted
(4) Context switching (context switch time)
2p/16K: context switches between two concurrent processes, each with a 16 KB working set
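The column labels encode the two parameters of the context-switch benchmark: the number of processes and the per-process size in kilobytes. As a sketch, the hypothetical helper below turns a label into the matching lat_ctx invocation; the -s option (size in KB) followed by the process count is how I understand lat_ctx to be called, so verify it against its man page.

```shell
#!/bin/sh
# ctx_cmd LABEL: turn a column label such as "8p/64K" into a lat_ctx command line.
ctx_cmd() {
    procs=${1%%p/*}    # text before "p/": number of processes
    size=${1#*p/}      # text after "p/": size with unit, e.g. "64K"
    size=${size%K}     # drop the trailing K, leaving kilobytes
    echo "lat_ctx -s $size $procs"
}
```

So ctx_cmd 8p/64K prints lat_ctx -s 64 8.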
(5) *Local* Communication latencies (local communication latency: data is sent through each communication mechanism and read back immediately)
Pipe: pipe communication
AF UNIX: UNIX domain sockets
UDP: UDP
RPC/UDP: RPC over UDP
TCP: TCP
RPC/TCP: RPC over TCP
TCP conn: establishing a TCP connection and closing the descriptor
(6) File & VM system latencies (file and memory latency)
File create & delete: creating and deleting a file
Mmap latency: memory mapping
Prot fault: protection fault
Page fault: page fault
100fd selct: time for a select call on 100 file descriptors
(7) *Local* Communication bandwidths (local communication bandwidth)
Pipe: pipe operations
AF UNIX: UNIX domain sockets
TCP: TCP communication
File reread: repeated reading of a file
Mmap reread: repeated reading of a memory-mapped file
Bcopy (libc): memory copy using the libc routine
Bcopy (hand): memory copy using a hand-written loop
Mem read: memory read
Mem write: memory write
(8) Memory latencies (latency of memory operations)
L1 $: level-1 cache
L2 $: level-2 cache
Main mem: sequential main memory access
Rand mem: random memory access
Guesses:
If L1 and L2 are similar, "No L1 cache?" is displayed.
If L2 and Main mem are similar, "No L2 cache?" is displayed.
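The guess rule can be written down in a few lines. This is a sketch only: the idea that similar latencies suggest a missing cache level comes from the description above, but the 1.5x ratio used to decide "similar" is my own assumption, not lmbench's actual threshold, and cache_guess is a hypothetical name.

```shell
#!/bin/sh
# cache_guess L1 L2 MAIN: print a hint when two adjacent levels have similar latency.
# The factor t=1.5 is an illustrative assumption, not the value lmbench uses.
cache_guess() {
    awk -v l1="$1" -v l2="$2" -v mem="$3" -v t=1.5 'BEGIN {
        if (l1 <= 0 || l2 <= 0) exit          # nothing sensible to compare
        if (l2 / l1 < t)        print "No L1 cache?"
        else if (mem / l2 < t)  print "No L2 cache?"
    }'
}
```

With the SEP4020 figures from the first row of the table (L1 33.6, L2 293.4, main memory 296.6), cache_guess 33.6 293.4 296.6 prints No L2 cache?, while the PC figures print nothing.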