Readers who have studied database theory will probably remember the classic performance comparison between CHAR and VARCHAR: CHAR is faster than VARCHAR because CHAR has a fixed length, whereas VARCHAR needs a length identifier, which costs an extra operation during processing.
To check this, I ran a benchmark. The test environment is as follows:
[Hardware configuration]
Hardware | Configuration
CPU | Intel(R) Xeon(R) CPU E5620, clock speed 2.40 GHz, 2 physical CPUs, 16 logical CPUs
Memory | 24G (6 blocks * 4G DDR3 1333 REG)
Hard disk | 300 GB * 3, SAS hard drive 15000 rpm, no RAID, RAID card, and write-back function
OS | RHEL5
MySQL | 5.1.49/5.1.54
[MySQL configuration]
Configuration item | Configuration
innodb_buffer_pool_size | 18G
innodb_log_file_size | 200M
innodb_log_files_in_group | 3
sync_binlog | 100
innodb_flush_log_at_trx_commit | 2
[Table configuration]
The VARCHAR values have an average length of 200, and the CHAR column has a length of 250. The other settings are as follows:
Configuration item | Configuration
Number of records | 10 million, 20 million, 50 million, 100 million
Storage engine | InnoDB
Row format | Compact
The performance test results are as follows:
[Query]
[Insert]
[Update]
During the update test, the new VARCHAR values are also of random length.
[Delete]
The test results show something that does not match the theory: when the table is smaller than the InnoDB buffer pool, there is no difference between CHAR and VARCHAR, but when the table is larger than the InnoDB buffer pool, VARCHAR performs better! Why?
First, performance is the combined result of many factors, such as hardware, configuration, number of table records, and business model. A difference in a single factor may have almost no effect on the whole.
For example, if an operation takes 100 ms to execute and CHAR is only 1 microsecond faster than VARCHAR, the final performance is practically unaffected.
This is why there is no difference between CHAR and VARCHAR when the InnoDB buffer pool is large enough.
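The arithmetic behind this example can be checked with a minimal sketch. The 100 ms and 1 microsecond figures come from the example above; the function name `relative_saving` is only illustrative.

```python
def relative_saving(total_seconds: float, saved_seconds: float) -> float:
    """Return the fraction of the total time that a saving removes."""
    if total_seconds <= 0:
        raise ValueError("total_seconds must be positive")
    return saved_seconds / total_seconds


# Figures from the example: a 100 ms operation, with CHAR 1 microsecond faster.
fraction = relative_saving(100e-3, 1e-6)
print(f"CHAR advantage as a share of the whole operation: {fraction:.6%}")
```

The saving is about one hundred-thousandth of the operation, far below what a benchmark like this can measure.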
Second, the theory that CHAR is faster than VARCHAR only considers CPU cost. However, performance is the final result of many factors combined. When the InnoDB buffer pool is smaller than the table, disk reads and writes become the key factor in performance, and because the VARCHAR data is shorter, less data has to be read and written, so VARCHAR performs better than CHAR.
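To get a rough feel for when the data stops fitting in memory, the table sizes can be estimated as below. This is only a sketch under stated assumptions: a single-byte character set, a fixed per-row overhead of 22 bytes, one length byte per VARCHAR value, completely full pages, and no secondary indexes. None of these are given in the test description. The row counts, string lengths, and buffer pool size are taken from the test setup.

```python
GIB = 1024 ** 3

BUFFER_POOL_BYTES = 18 * GIB     # innodb_buffer_pool_size from the test setup
CHAR_LENGTH = 250                # CHAR length from the test setup
VARCHAR_AVG_LENGTH = 200         # average VARCHAR length from the test setup
ROW_OVERHEAD_BYTES = 22          # ASSUMPTION: rough fixed overhead per row
VARCHAR_PREFIX_BYTES = 1         # ASSUMPTION: one length byte per VARCHAR value

ROW_COUNTS = [10_000_000, 20_000_000, 50_000_000, 100_000_000]


def estimated_table_bytes(rows: int, payload_bytes: int) -> int:
    """Rough data size: rows * (payload + assumed overhead).

    Ignores page fill factor, indexes, and fragmentation.
    """
    return rows * (payload_bytes + ROW_OVERHEAD_BYTES)


def compare(rows: int) -> tuple[int, int]:
    """Return (CHAR bytes, VARCHAR bytes) for a table with the given row count."""
    char_bytes = estimated_table_bytes(rows, CHAR_LENGTH)
    varchar_bytes = estimated_table_bytes(
        rows, VARCHAR_AVG_LENGTH + VARCHAR_PREFIX_BYTES
    )
    return char_bytes, varchar_bytes


for rows in ROW_COUNTS:
    char_bytes, varchar_bytes = compare(rows)
    print(
        f"{rows:>11,} rows: "
        f"CHAR {char_bytes / GIB:6.2f} GiB "
        f"(fits in pool: {char_bytes <= BUFFER_POOL_BYTES}), "
        f"VARCHAR {varchar_bytes / GIB:6.2f} GiB "
        f"(fits in pool: {varchar_bytes <= BUFFER_POOL_BYTES})"
    )
```

Whatever overhead is assumed, the CHAR table is always the larger of the two, so once the data no longer fits in the buffer pool it needs more page reads and writes for the same number of rows.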
Finally, some may think that when a VARCHAR value is updated and the new data is longer than the old data, the data may need to be moved, resulting in lower performance. In the tests, this operation had no significant impact on the final performance. This may be because InnoDB manages data in pages: the data movement is completed in memory first and then written to disk, so even when data has to be moved, it is moved quickly.
[Application tips]
Based on the above test results and analysis, I personally think VARCHAR should be preferred in general, especially when the average length of a string is much smaller than the maximum length.
Of course, if your strings are really short, for example only 10 characters, then CHAR is preferred.
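The trade-off in these tips can be illustrated with a small per-value storage comparison. This is a sketch, not a measurement: it assumes a single-byte character set and a one-byte length prefix for VARCHAR, and the function names are made up for illustration.

```python
def char_bytes(declared_length: int) -> int:
    """CHAR(n) is padded to its declared length (single-byte charset assumed)."""
    return declared_length


def varchar_bytes(actual_length: int, length_prefix: int = 1) -> int:
    """VARCHAR stores the actual bytes plus a length prefix (1 byte assumed)."""
    return actual_length + length_prefix


def saving_ratio(declared_length: int, average_length: int) -> float:
    """Fraction of CHAR storage that VARCHAR saves; negative means VARCHAR is larger."""
    return 1 - varchar_bytes(average_length) / char_bytes(declared_length)


# (declared length, average actual length)
cases = [(250, 200), (10, 10), (10, 8)]
for declared, average in cases:
    print(
        f"declared {declared:>3}, average {average:>3}: "
        f"VARCHAR saves {saving_ratio(declared, average):+.1%}"
    )
```

With long columns that are rarely full, VARCHAR saves a noticeable share of the storage. With short values that mostly fill the column, the length prefix cancels out the saving or even makes VARCHAR larger, which fits the advice above.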
Appendix:
1) If you are interested, try to work out: why is VARCHAR about 20% faster than CHAR for the 100-million-row table in the test results?
2) The test data is for comparison only. It does not mean that MySQL generally performs this well. To make the comparison possible, a lot of preparation was done for the test, and the test operations are special.
From yah99_wolf's column