PostgreSQL Hardware Performance Tuning

Translated from the article "PostgreSQL Hardware Performance Tuning".

PostgreSQL is an object-relational database developed by developers from all over the Internet. It is an open-source alternative to commercial databases such as Oracle and Informix.

PostgreSQL was originally developed at the University of California, Berkeley. In 1996, a group began development of the database on the Internet, using email to share ideas and file servers to share code. PostgreSQL is now comparable to commercial databases in features, performance, and reliability. It has transactions, views, stored procedures, and referential integrity constraints. It supports many programming interfaces, including ODBC, Java (JDBC), TCL/TK, PHP, Perl, and Python. PostgreSQL continues to improve at an astonishing rate thanks to a large pool of talented Internet developers.

Performance Concepts

There are two aspects of database performance tuning. One is improving the database's use of the CPU, memory, and disk drives in the server. The second is optimizing the queries sent to the database. This article discusses the hardware side of performance tuning. Query optimization is done with SQL commands such as CREATE INDEX, VACUUM, VACUUM FULL, ANALYZE, CLUSTER, and EXPLAIN, which are discussed in "PostgreSQL: Introduction and Concepts" at http://momjian.us/main/writings/pgsql/aw_pgsql_book/.
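
As a quick illustration of those query-side commands (the orders table and its columns are made up for the example):

    -- create an index to speed lookups on the hypothetical table
    CREATE INDEX orders_customer_idx ON orders (customer_id);

    -- reclaim dead row space and refresh planner statistics
    VACUUM ANALYZE orders;

    -- ask the planner how it intends to run a query
    EXPLAIN SELECT * FROM orders WHERE customer_id = 42;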

To understand hardware tuning, it helps to understand what is happening inside the computer. Simply put, a computer can be imagined as a CPU surrounded by storage. On the same chip as the CPU are several CPU registers, which hold intermediate results and various pointers and counters. Around these is the CPU cache, which holds the most recently accessed information. Beyond the CPU cache is a large amount of random-access main memory (RAM), which holds executing programs and data. Beyond main memory are disk drives, which store even larger amounts of information. Disk drives are the only persistent storage area, so anything that must survive a shutdown has to be placed there. In summary, these are the storage areas surrounding the CPU:

Storage area     Measured in
CPU registers    bytes
CPU cache        kilobytes
RAM              megabytes
Disk drives      gigabytes

You can see that the farther storage is from the CPU, the larger it is. Ideally, a huge amount of permanent storage could sit right next to the CPU, but that would be too slow and too expensive. In practice, the most frequently used information is stored next to the CPU, and less frequently accessed information is stored farther away and brought to the CPU as needed.

Keeping Information Near the CPU

Moving information between these storage areas happens automatically. Compilers determine which information is kept in registers. CPU chip logic keeps the most recently used information in the CPU cache. The operating system controls which information is kept in RAM and shuttles it between RAM and the disk drives.

CPU registers and the CPU cache cannot be tuned effectively by the database administrator. Effective database tuning means keeping as much useful information in RAM as possible, thereby preventing disk access wherever it can.

You might think this is easy, but it is not. A computer's RAM holds many things:

    • Currently executing programs
    • Program data and stacks
    • PostgreSQL shared buffer cache
    • Kernel disk cache
    • Kernel

Proper tuning means keeping as much database information in RAM as possible while not adversely affecting the other areas of the operating system.

PostgreSQL Shared Buffer Cache

PostgreSQL does not change information on disk directly. Instead, it requests that data be read into the PostgreSQL shared buffer cache. PostgreSQL backends then read and write these blocks, which are eventually flushed back to disk.

Backends that need to access tables first look for the needed blocks in this cache. If the blocks are already there, processing can continue right away. If not, an operating system request is made to load the blocks, which come either from the kernel disk cache or directly from disk. Reading from disk can be an expensive operation.
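
One way to watch the shared buffer cache at work is through the statistics views (availability depends on your version and on statistics collection being enabled); heap_blks_hit counts blocks found in the cache, while heap_blks_read counts blocks that had to be requested from the operating system:

    SELECT relname, heap_blks_read, heap_blks_hit
    FROM pg_statio_user_tables
    ORDER BY heap_blks_read DESC;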

How big is too big?

You might think, "I'll give all my RAM to the PostgreSQL cache." However, if you did, there would be no room left for the kernel or for any other programs to run. The proper size for the PostgreSQL shared buffer cache is the largest size that does not produce other undesirable activity.

To understand undesirable activity, you need to understand how UNIX operating systems manage memory. If there is enough memory to hold all programs and data, little memory management is needed. However, if everything does not fit in RAM, the kernel starts forcing memory pages out to a disk area called swap, removing the pages that have been used least recently. This operation is called a swap pageout. Pageouts are not a problem in themselves, because they happen during periods of inactivity. What is bad is when those pages have to be brought back in, that is, when a page that was moved out to swap must be moved back into RAM. This is called a swap pagein, and it is bad because the program must be suspended until the pagein completes.

Pagein activity is shown by system analysis tools such as vmstat and sar, and sustained pageins mean there is not enough memory for the system to run efficiently. Do not confuse swap pageins with ordinary pageins, which include pages read from the filesystem as part of normal operation. If you cannot see swap pageins directly, heavy pageout activity is a good clue that swap pageins are happening as well.
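
On many systems a quick way to watch for this (the exact column names and units vary by platform) is:

    $ vmstat 5
    (watch the "si" and "so" columns: "si" is swap pageins, "so" is
    swap pageouts; sustained nonzero "si" values mean the system is
    short of memory)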

Impact of Cache Size

You may wonder why cache size is so important. First, imagine the PostgreSQL shared buffer cache is large enough to hold an entire table. Repeated sequential scans of the table require no disk access, because all the data is already in the cache. Now imagine the cache is one block smaller than the table. A sequential scan of the table loads all of the table's blocks into the cache, up to the last one. When that block is needed, the oldest block is evicted, which in this case is the first block of the table. When another sequential scan starts, the first block is no longer in the cache, and loading it evicts the oldest cached block, now the second block of the table. This keeps pushing out the next needed block all the way to the end of the table. This is an extreme example, but it shows that shrinking the cache by a single block can change its efficiency from 100% to 0%, and that finding the right cache size can dramatically affect performance.

Proper Shared Buffer Cache Size

Ideally, the PostgreSQL shared buffer cache should be:

    • Large enough to hold the most frequently accessed tables
    • Small enough to avoid swap pagein activity

Keep in mind that the postmaster allocates all of this shared memory when it starts, and the area stays the same size even if no one is accessing the database. Some operating systems page out unreferenced shared memory, while others lock shared memory into RAM; locked shared memory is preferred. The PostgreSQL Administrator's Guide has information about configuring the kernel parameters of the various operating systems.

Batch Sort Memory Size

Another tuning parameter is the amount of memory used for sort batches. When sorting large tables or result sets, PostgreSQL sorts them in parts, placing intermediate results in temporary files. These files are then merged and re-sorted until all rows are in order. Increasing the batch size creates fewer temporary files and often allows faster sorting. However, if the sort batches are too large, they cause pageins, because parts of the sort get paged out to swap while sorting is in progress. In these cases it is much faster to use smaller sort batches and more temporary files, so once again, swap pageins indicate that too much memory has been allocated. Keep in mind that this parameter is used by every backend performing a sort, whether for ORDER BY, CREATE INDEX, or a merge join. Several simultaneous sorts will use several times this amount of memory.

Cache Size and Sort Size

Cache size and sort size both affect memory usage, so you cannot maximize one without affecting the other. Keep in mind that cache size is allocated when the postmaster starts, while sort size varies with the number of sorts being performed. Generally, cache size is more significant than sort size. However, queries that are known to use ORDER BY, CREATE INDEX, or merge joins can be sped up with larger batch sort sizes.

Also, many operating systems limit how much shared memory can be allocated. Increasing these limits requires operating-system-specific knowledge to recompile or reconfigure the kernel. More information can be found in the PostgreSQL Administrator's Guide.

As a starting point for tuning, if you have only a few large sessions, use 15% of main memory for the cache size and 2-4% for the sort size; with many small sessions, use less. You can try increasing them to see whether performance improves and whether swapping occurs. If the shared buffer cache is too large, you waste memory on the overhead of maintaining too many cached buffers, and you take RAM away from other processes and from additional kernel disk cache.
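
As a concrete sketch of that starting point, for a hypothetical machine with 1 GB of RAM (parameter names are those of the PostgreSQL versions this article describes; buffer counts assume 8 kB blocks):

    # postgresql.conf
    shared_buffers = 19200   # 8 kB buffers: about 150 MB, roughly 15% of RAM
    sort_mem = 20480         # kB per sort batch: about 20 MB, roughly 2% of RAM
                             # (sort_mem was renamed work_mem in later releases)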

A valuable parameter is effective_cache_size. Tuners use it to estimate the size of the kernel disk cache. In kernels without a fixed-size disk cache, it can be set to the amount of otherwise-unused RAM, because such kernels use unused RAM to cache recently accessed disk pages. In kernels with a fixed-size disk cache, you should set this parameter to match your kernel cache size, typically 10% of RAM.
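
Continuing the 1 GB example (again assuming 8 kB pages; the right value depends entirely on how much RAM your kernel actually devotes to disk cache):

    effective_cache_size = 65536   # 8 kB pages: about 512 MB of expected kernel disk cache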

Disk Limitations

The physical nature of disk drives gives them performance characteristics different from the other storage areas mentioned in this article. The other storage areas can access any byte with equal speed. Disk drives, with their spinning platters and moving heads, access data near the head's current position much faster than data far away from it.

Moving the disk head to another cylinder on the platter takes considerable time. UNIX kernel developers know this. When storing a large file on disk, they try to place the pieces of the file near each other. For example, suppose a file requires ten blocks on disk. The operating system may place blocks 1-5 on one cylinder and blocks 6-10 on another. If the file is read from beginning to end, only two head movements are required: one to reach the cylinder holding blocks 1-5, and another to reach blocks 6-10. However, if the file is read non-sequentially, for example in the block order 1, 6, 2, 7, 3, 8, 4, 9, 5, 10, ten head movements are required. As you can see, on disk, sequential access is much faster than random access. This is why PostgreSQL prefers sequential scans to index scans when most of a table needs to be read. It also highlights the value of the cache.
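
You can watch the planner make this choice with EXPLAIN (table and column names here are hypothetical; the plans chosen depend on your data and statistics):

    -- reading most of the table: the planner usually picks a sequential scan
    EXPLAIN SELECT * FROM orders;

    -- a selective lookup on an indexed column may use an index scan instead
    EXPLAIN SELECT * FROM orders WHERE order_id = 42;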

Multiple Disk Spindles

The disk head moves around quite a bit during database activity. If too many read and write requests are made, the drive becomes saturated and performance suffers. (vmstat and sar can report the amount of activity on each disk.)

One solution to disk saturation is to move some of the PostgreSQL data files to other disks. Remember, moving files to other filesystems on the same disk drive does not help; all filesystems on a drive share the same disk heads.

Database access can be spread across disk drives in the following ways:

Moving databases, tables, and indexes
Tablespaces allow you to create objects on different drives.

Moving the write-ahead log (WAL)
initdb -X and symbolic links can be used to move the pg_xlog directory to a different disk drive. Unlike other writes, PostgreSQL's write-ahead log must be flushed to disk before a transaction completes, so the cache cannot be used to delay these writes. Giving the log its own disk lets the head stay on the current log cylinder, so log writes incur no head-movement delay. (The postgres -F parameter can be used to prevent the log from being flushed to disk, but an operating system crash may then require a restore from backup.) A sketch of both techniques follows.
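
A hedged sketch of both techniques (the paths, the tablespace name, and the orders table are made up; tablespaces require PostgreSQL 8.0 or later, and moving pg_xlog requires the server to be stopped first):

    -- put a busy table on a second drive using a tablespace
    CREATE TABLESPACE disk2 LOCATION '/mnt/disk2/pgdata';
    ALTER TABLE orders SET TABLESPACE disk2;

    # move the write-ahead log to a third drive with a symbolic link
    $ mv $PGDATA/pg_xlog /mnt/disk3/pg_xlog
    $ ln -s /mnt/disk3/pg_xlog $PGDATA/pg_xlog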

Another option is to use a disk array, which spreads a single filesystem across several drives. Mirroring can slow database writes but speeds database reads, because data can be retrieved from either drive. Many sites use RAID 1+0 or RAID 0+1, as these offer both performance and reliability benefits. RAID 5 is faster with six or more drives. Ideally, a RAID 5 controller has a battery-backed cache, so writes can be flushed to disk efficiently instead of slowing down applications while they wait for writes to complete.

Disk Drive Caches

Modern disk drives have both read and write caches. The read cache keeps recently requested disk blocks available in disk memory. The write cache holds write requests until they can be stored efficiently on the disk platters. You can probably see the problem: what if the drive loses power while unwritten data is still sitting in its write cache? Some disk drives and RAID controllers have battery-backed write caches that keep the data safe and write it to the platters when power is restored. However, most drives do not have this feature, and they are therefore unreliable.

Fortunately, write caching can be turned off on most drives. SCSI drives usually have write caching off by default. IDE drives usually have write caching on, but it can be turned off with operating system commands such as hdparm -W0, sysctl hw.ata.wc=0, or scsicmd. However, some IDE drives report that write caching is off while still using it, and such drives are unreliable. Without careful testing, it is hard to tell which drives are lying.
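
For illustration (the device name is an assumption; check your own system's documentation before running these):

    $ hdparm -W0 /dev/hda    # Linux: disable an IDE drive's write cache
    $ sysctl hw.ata.wc=0     # FreeBSD: disable IDE write caching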

Because PostgreSQL uses fsync() to write the WAL to disk on every transaction commit, and waits for that write to complete, users see a huge performance boost when write caching is enabled. Therefore, for both performance and reliability, the ideal setup is for PostgreSQL to use a battery-backed write cache.

SCSI vs IDE

SCSI drives are usually much more expensive than IDE drives. SCSI drives use a protocol for communication between the operating system and the controller, while IDE drives are much simpler and can accept only one request at a time. SCSI drives with tagged queueing can accept multiple requests and reorder them to improve efficiency. This is why SCSI and IDE drives have similar performance for single-user or single-file I/O, while SCSI outperforms IDE when multiple users or processes are making requests. SCSI is therefore better suited to heavily used database servers.

In fact, SCSI vs. IDE is really just a marker for two deeper distinctions between drives: enterprise drives, designed for high performance and high reliability, and personal computer drives, designed for minimal cost. The paper at http://www.seagate.com/content/docs/pdf/whitepaper/D2c_More_than_Interface_ATA_vs_SCSI_042003.pdf describes this split well and is an excellent guide to choosing drives by these criteria.

File Systems

Some operating systems support several disk filesystems, and in those cases it can be hard to know which filesystem performs best. PostgreSQL usually performs best on traditional UNIX filesystems such as the BSD UFS/FFS family, which many operating systems support. The default 8K block size of UFS matches PostgreSQL's page size. Journal-based and log-based filesystems can be used, but they add overhead to the fsync calls used for the WAL. Older SVR3-based filesystems become too fragmented to deliver good performance.

Filesystem choice is particularly difficult on Linux because there are so many options and none is optimal: ext2 is not fully crash-safe; ext3, XFS, and JFS are journal-based; and Reiser is optimized for small files and does journaling. Although ext2 is much faster than the journal-based filesystems, it is not an option where crash recovery is required. If you must use ext2, mount it with sync enabled. Some people recommend mounting ext3 filesystems with data=writeback.
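
A minimal /etc/fstab sketch of that mount option (the device and mount point are made up for the example):

    /dev/sdb1  /var/lib/pgsql  ext3  defaults,data=writeback  0 2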

NFS and other remote filesystems are not recommended for PostgreSQL. NFS does not have the same filesystem semantics as local filesystems, and these inconsistencies can cause data reliability and crash recovery problems.

Multiple CPUs

PostgreSQL uses a multi-process model, meaning each database connection has its own UNIX process. Because of this, all multi-CPU operating systems can spread multiple database connections across the available CPUs. However, if only a single database connection is active, it can use only one CPU. PostgreSQL does not use a multi-threaded model that would let a single process use multiple CPUs.

Checkpoints

When WAL files fill up, a checkpoint is performed to force all dirty buffers to disk so the log file can be recycled. Checkpoints are also performed automatically at regular intervals, typically every five minutes. If there is a lot of database write activity, WAL segments can fill very quickly, causing checkpoints that flush all dirty disk buffers and slow the system excessively.

Checkpoints should happen every few minutes. If they happen several times a minute, performance suffers. Look for "checkpoint_warning" messages in the server log to determine whether your checkpoints are happening too frequently; this message appears whenever checkpoints occur more often than once every 30 seconds.

Reducing checkpoint frequency involves increasing the number of WAL files kept in data/pg_xlog. Each file is 16 megabytes, so this does affect disk usage. The default setup uses a minimum number of log files. To decrease checkpoint frequency, you need to increase this parameter:

checkpoint_segments = 3

The default value is 3. Increase it until checkpoints happen only every few minutes. Another log message you may see is:

LOG: XLogWrite: new log file created - consider increasing WAL_FILES

This message indicates that the wal_files parameter in postgresql.conf should be increased.
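
A small postgresql.conf sketch of these checkpoint knobs (the values are illustrative starting points, not recommendations):

    checkpoint_segments = 10    # more 16 MB WAL segments between checkpoints
    checkpoint_timeout = 300    # seconds between automatic checkpoints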

Conclusion

Fortunately, PostgreSQL does not require much tuning. Most parameters are automatically adjusted to maintain optimum performance. Cache size and sort size are two parameters administrators can control to make better use of available memory. Disk access can be spread across multiple drives. Other parameters can be found in share/postgresql.conf.sample; you can copy these into data/postgresql.conf to experiment with some of PostgreSQL's more exotic parameters.
