Understanding Postgres Performance

For many application developers, database performance is a black box: data goes in, answers come out, and ideally the time in between is short. You shouldn't need to be a DBA to judge whether it is. Here are some metrics most application developers can read to tell whether their database performance is good enough, along with tips to help you determine whether the database is holding back your application and what to do about it.

Understanding the cache and cache hit rate

In a typical application, only a portion of the data is accessed frequently. As elsewhere, the 80/20 rule applies: roughly 20% of the data accounts for 80% of the reads, and the skew is often even higher. Postgres tracks your access patterns and keeps frequently accessed data in its cache. Generally, you want the database to have a cache hit rate of about 99%. You can view the cache hit rate with:

SELECT
  sum(heap_blks_read) as heap_read,
  sum(heap_blks_hit) as heap_hit,
  sum(heap_blks_hit) / (sum(heap_blks_hit) + sum(heap_blks_read)) as ratio
FROM pg_statio_user_tables;

On one dataclip, the cache hit rate of Heroku Postgres is 99.99%. If you find your ratio is lower than 99%, you likely want to increase the cache available to your database. On Heroku Postgres you can do this with a fast changeover to a larger plan; on EC2 you can dump and restore to a larger instance.

Understanding index usage

Indexes are the other major way to improve performance. Some frameworks add indexes on your primary keys for you, but if you search on other fields or have a large number of joins, you may need to add indexes yourself. Indexes are most valuable on large tables. While accessing data from cache is faster than reading it from disk, even in-memory queries can be slow when Postgres must parse hundreds of thousands of rows to determine which ones match. To list the tables in your database, largest first, with the percentage of accesses that used an index, you can run:

SELECT
  relname,
  100 * idx_scan / (seq_scan + idx_scan) percent_of_times_index_used,
  n_live_tup rows_in_table
FROM pg_stat_user_tables
WHERE seq_scan + idx_scan > 0
ORDER BY n_live_tup DESC;

There is no perfect answer here, but as a rule of thumb: if a table has more than 10,000 rows and fewer than 99% of its row accesses go through an index, you should consider adding one. When deciding where to add an index, look at the kinds of queries you run. In general, you should index the ids you query by and the values you frequently filter on, such as a created_at field.

Pro tip: if you add indexes to a production database, use CREATE INDEX CONCURRENTLY, which builds the index in the background without holding a lock on the table. The trade-offs are that a concurrent build usually takes 2-3x longer and cannot be run inside a transaction. For any site with real production traffic, these trade-offs are worthwhile for your end users. A short sketch of this follows below.
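As a minimal sketch of that pro tip, assuming a hypothetical users table whose queries frequently filter on created_at (the table and index names here are illustrative, not from the dashboard example that follows):

-- Build the index in the background without locking the table.
-- Assumes a hypothetical "users" table with a "created_at" column.
-- A concurrent build takes roughly 2-3x longer than a plain
-- CREATE INDEX and cannot run inside a transaction block.
CREATE INDEX CONCURRENTLY idx_users_created_at ON users (created_at);

-- If a concurrent build fails partway through, it leaves an invalid
-- index behind; drop it and retry:
-- DROP INDEX IF EXISTS idx_users_created_at;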
The Heroku Dashboard example

Taking the recently launched Heroku Dashboard as a real-world example, we can run that query and look at the results:

# SELECT relname, 100 * idx_scan / (seq_scan + idx_scan) percent_of_times_index_used, n_live_tup rows_in_table FROM pg_stat_user_tables ORDER BY n_live_tup DESC;

       relname       | percent_of_times_index_used | rows_in_table
---------------------+-----------------------------+---------------
 events              |                           0 |        669917
 app_infos_user_info |                           0 |        198218
 app_infos           |                          50 |        175640
 user_info           |                           3 |         46718
 rollouts            |                           0 |         34078
 favorites           |                           0 |          3059
 schema_migrations   |                           0 |             2
 authorizations      |                           0 |             0
 delayed_jobs        |                          23 |             0

Here we can see that the events table, with nearly 700,000 rows, has never used an index. From here I could dig into the application and look at some of the common queries it runs. One example is the query that fetches all events for a given app. You can run EXPLAIN ANALYZE on it to see the execution plan, which gives you a much better idea of how a specific query performs:

EXPLAIN ANALYZE SELECT * FROM events WHERE app_info_id = 7559;

                                                QUERY PLAN
-----------------------------------------------------------------------------------------------------------
 Seq Scan on events  (cost=0.00..63749.03 rows=38 width=688) (actual time=2.538..660.785 rows=89 loops=1)
   Filter: (app_info_id = 7559)
 Total runtime: 660.885 ms

The sequential scan means Postgres is traversing all of the table's data in order rather than using an index. We can add the index without locking the table, then check the performance again:

CREATE INDEX CONCURRENTLY idx_events_app_info_id ON events (app_info_id);

EXPLAIN ANALYZE SELECT * FROM events WHERE app_info_id = 7559;

                                                             QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------
 Index Scan using idx_events_app_info_id on events  (cost=0.00..23.40 rows=38 width=688) (actual time=0.021..0.115 rows=89 loops=1)
   Index Cond: (app_info_id = 7559)
 Total runtime: 0.200 ms

While we can see a dramatic improvement in this single query, we can also check New Relic and see that this index, along with a few others, reduced the overall time spent in the database.

Index cache hit rate

Finally, if you're curious how often your indexes are served from the cache, you can run:

SELECT
  sum(idx_blks_read) as idx_read,
  sum(idx_blks_hit) as idx_hit,
  (sum(idx_blks_hit) - sum(idx_blks_read)) / sum(idx_blks_hit) as ratio
FROM pg_statio_user_indexes;

In general, this should also be around 99%, the same as your overall cache hit rate.
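If you want the table and index hit rates side by side, here is a minimal sketch that combines the two queries above. The ::float casts and nullif guard are additions of mine: the casts prevent Postgres's integer division from truncating the ratio to zero, and nullif avoids a division-by-zero error on freshly reset statistics.

-- Sketch: table and index cache hit rates in one result set,
-- built from the same pg_statio views used above.
SELECT 'table hit rate' AS name,
       sum(heap_blks_hit)::float
         / nullif(sum(heap_blks_hit) + sum(heap_blks_read), 0) AS ratio
FROM pg_statio_user_tables
UNION ALL
SELECT 'index hit rate' AS name,
       sum(idx_blks_hit)::float
         / nullif(sum(idx_blks_hit) + sum(idx_blks_read), 0) AS ratio
FROM pg_statio_user_indexes;

On a healthy database, both rows should come back at roughly 0.99 or better.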