Record-based (row-oriented) and column-based (column-oriented) are two different data layouts used in database and storage systems. Our usual mental model follows row records: in a record-based layout, each data record is stored and accessed as one row. In many database applications, however (especially read-dominated ones), a query typically needs only a few attributes of each record yet has to read the entire row, so much of the I/O and of the data transferred is redundant. If that redundant I/O and data access can be avoided, the performance and throughput of database access improve substantially. C-Store was developed against this background as an early and influential column-store system. Its basic idea is to store the values of each column of a table together, so that a query on one attribute reads only the corresponding column rather than whole rows, which greatly reduces the amount of data read and improves throughput.
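The difference between the two layouts can be illustrated with a minimal Python sketch. It is not C-Store's actual storage format: the table, the function names, and the "fields read" counter are all hypothetical, and counting fields is only a rough stand-in for real disk I/O.

```python
# Hypothetical example table; names and values are illustrative only.
ROWS = [
    {"id": 1, "name": "alice", "age": 30, "city": "Beijing"},
    {"id": 2, "name": "bob", "age": 25, "city": "Shanghai"},
    {"id": 3, "name": "carol", "age": 41, "city": "Shenzhen"},
]


def to_columns(rows):
    """Regroup row records so that each attribute's values are stored together."""
    columns = {}
    for row in rows:
        for attr, value in row.items():
            columns.setdefault(attr, []).append(value)
    return columns


def scan_row_layout(rows, attr):
    """Read one attribute from a row layout: every field of every row is touched."""
    values, fields_read = [], 0
    for row in rows:
        fields_read += len(row)  # the whole record is fetched to get one field
        values.append(row[attr])
    return values, fields_read


def scan_column_layout(columns, attr):
    """Read one attribute from a column layout: only that column is touched."""
    values = list(columns[attr])
    return values, len(values)
```

For the three four-field records above, scanning `age` in the row layout touches 12 fields, while the same scan over the output of `to_columns` touches only 3. Both scans return the same values.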
C-Store set off a wave of research and development on column-based, read-optimized (read-friendly) databases, and its core column-based idea has since been extended to many other areas. Earlier, I was in charge of the Typhoon Twister Form system, whose data management model was built around column storage; it provided high-speed access to massive data sets for research applications and ad feature data analysis. Later, we abstracted the core logic for generating column storage and reassembling records into a base component of our common library, column IO, which sits alongside the Google public library's record io as one of the two major I/O engine components.
This week's recommended reading is the paper that started the column-based database line of work, "C-Store: A Column-oriented DBMS", published at VLDB in 2005. From it you can learn the basic design ideas of column stores along with their advantages and disadvantages.
[Reprint] "Weekly Recommended Reading" C-Store: A Column-Store Database