Anatomy of SQL Server, Part 12: OrcaMDF row compression support (translated)

Source: Internet
Author: User

http://improve.dk/orcamdf-row-compression-support/


After two months of on-and-off development work, I have finally merged the OrcaMDF compression branch into the master branch.
This means that OrcaMDF now officially supports row compression.


Supported data types
Implementing row compression required me to modify almost every data type I had already implemented, because compression changes how each of them is stored. Integers are compressed, decimals are stored as variable length, and variable-length types are essentially truncated and then padded. Row compression is supported for all data types previously implemented by OrcaMDF, and I have added a few new data types on top of those.
The current list of supported data types is as follows:

bigint
binary
bit
char
date
datetime
decimal/numeric (including vardecimal, both with and without row compression)
image
int
money
nchar
ntext
nvarchar
smallint
smallmoney
text
time
uniqueidentifier
varbinary
varchar
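To make the remark that integers are compressed more concrete, here is a minimal Python sketch of one way such a value can be encoded and decoded. The article itself does not spell out the byte layout, so everything here is an assumption: the sketch uses the commonly described scheme in which a row-compressed integer takes the fewest bytes that can hold it, is written big-endian with a bias of half the range (so the sign bit appears flipped), and zero takes no bytes at all. The function names are hypothetical and this is not OrcaMDF's C# code.

```python
def encode_compressed_int(value: int) -> bytes:
    """Encode an integer in the assumed row-compressed form (illustrative only)."""
    if value == 0:
        return b""  # assumption: zero is stored as a zero-length value
    size = 1
    # Find the smallest byte count whose signed range contains the value.
    while not -(1 << (8 * size - 1)) <= value < (1 << (8 * size - 1)):
        size += 1
    bias = 1 << (8 * size - 1)
    # Adding the bias maps the signed range onto 0..2^(8*size)-1.
    return (value + bias).to_bytes(size, "big")


def decode_compressed_int(data: bytes) -> int:
    """Reverse of encode_compressed_int, under the same assumptions."""
    if not data:
        return 0
    bias = 1 << (8 * len(data) - 1)
    return int.from_bytes(data, "big") - bias


if __name__ == "__main__":
    for n in (0, 1, -1, 127, 128, -129, 2_000_000_000):
        raw = encode_compressed_int(n)
        print(n, raw.hex(), decode_compressed_int(raw))
```

Under these assumptions a small value stored in an int column occupies a single byte instead of four, which is where the space saving comes from.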

Unicode compression
nchar and nvarchar proved trickier than the other types because they use the SCSU Unicode compression format.
I found a .NET implementation of SCSU, but when I embedded its code in OrcaMDF it popped up a license window telling me I needed to buy a license.
There are also several open source Java implementations, but none of them were what I wanted. I chose to implement SCSU myself, based on the reference implementation provided by Unicode, Inc.

I only implemented decompression, and ended up with a very slim and simple SCSU decompressor.
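To give a feel for what an SCSU decompressor does, here is a minimal Python sketch that handles only a small subset of SCSU single-byte mode: ASCII pass-through, bytes mapped through the active dynamic window, the SC0..SC7 tags that switch windows, and the SQU tag that quotes one UTF-16 code unit. The tag values and initial window offsets follow the SCSU specification (UTS #6). The function name is hypothetical, and this is not the author's decompressor; every other tag is deliberately left unimplemented.

```python
# Initial dynamic window offsets as defined by the SCSU specification (UTS #6).
DEFAULT_DYNAMIC_WINDOWS = [
    0x0080, 0x00C0, 0x0400, 0x0600, 0x0900, 0x3040, 0x30A0, 0xFF00,
]

SQU = 0x0E               # quote one UTF-16 code unit (two bytes, big-endian)
SC0, SC7 = 0x10, 0x17    # select dynamic window 0..7


def scsu_decode_subset(data: bytes) -> str:
    """Decode a subset of SCSU single-byte mode (illustrative sketch only)."""
    windows = list(DEFAULT_DYNAMIC_WINDOWS)
    active = 0  # window 0 is active when decoding starts
    out = []
    i = 0
    while i < len(data):
        b = data[i]
        i += 1
        if b >= 0x80:
            # High bytes are offsets into the active dynamic window.
            out.append(chr(windows[active] + (b - 0x80)))
        elif b >= 0x20 or b in (0x00, 0x09, 0x0A, 0x0D):
            # Printable ASCII plus NUL, TAB, LF and CR pass through unchanged.
            out.append(chr(b))
        elif SC0 <= b <= SC7:
            active = b - SC0
        elif b == SQU:
            if i + 2 > len(data):
                raise ValueError("truncated SQU sequence")
            # This sketch assumes the quoted unit is a BMP character,
            # not half of a surrogate pair.
            out.append(chr(int.from_bytes(data[i:i + 2], "big")))
            i += 2
        else:
            raise NotImplementedError(f"SCSU tag 0x{b:02X} is outside this sketch")
    return "".join(out)


if __name__ == "__main__":
    print(scsu_decode_subset(bytes([0x63, 0x61, 0x66, 0xE9])))  # Latin-1 window
    print(scsu_decode_subset(bytes([0x12, 0xA0, 0xA1])))        # Cyrillic window
```

A complete decompressor also has to handle the remaining tags (window definitions, single-character window quotes, and Unicode mode), which this sketch rejects with an error instead of guessing.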

I will write a separate blog post to introduce the decompressor, and I will split it out of OrcaMDF as a standalone class with some default values.

Architecture changes
I thought I would be able to finish compression support within a week or so; after all, the format is well documented. What I had not considered was how many things I would have to change to make it work. The record parser must know whether the page is compressed. But how does the record parser find out whether the page is compressed? All it received was the page pointer. Now I have to query the metadata (the partitions table) and pass the compression information along the whole path: from the DataScanner to the page parser, to the record parser, and finally to the data type parsers.
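The plumbing described above can be sketched in a few lines of Python. All names here (CompressionContext, PARTITION_COMPRESSION, scan and the parse_* functions) are hypothetical stand-ins for illustration and are not OrcaMDF's actual classes. The partition lookup is a plain dictionary, and the compressed integer decoding assumes a biased big-endian layout that the article does not confirm. The point is only the shape: the compression setting is looked up once from metadata and then handed down through every layer.

```python
from dataclasses import dataclass
from enum import Enum


class CompressionLevel(Enum):
    NONE = 0
    ROW = 1


@dataclass(frozen=True)
class CompressionContext:
    """Carries the compression setting down through every parsing layer."""
    level: CompressionLevel


# Hypothetical stand-in for the partition metadata: partition id -> level.
PARTITION_COMPRESSION = {1: CompressionLevel.NONE, 2: CompressionLevel.ROW}


def parse_int(raw: bytes, ctx: CompressionContext) -> int:
    """Data type parser: picks the decoding based on the context."""
    if ctx.level is CompressionLevel.ROW:
        # Assumed layout: variable length, big-endian, biased by half the range.
        if not raw:
            return 0
        return int.from_bytes(raw, "big") - (1 << (8 * len(raw) - 1))
    # Uncompressed int: fixed four bytes, little-endian, signed.
    return int.from_bytes(raw, "little", signed=True)


def parse_record(fields, ctx):
    """Record parser: forwards the context to each data type parser."""
    return [parse_int(field, ctx) for field in fields]


def parse_page(records, ctx):
    """Page parser: forwards the context to the record parser."""
    return [parse_record(record, ctx) for record in records]


def scan(partition_id, pages):
    """Scanner: resolves the context from metadata once, then passes it down."""
    ctx = CompressionContext(PARTITION_COMPRESSION[partition_id])
    return [row for page in pages for row in parse_page(page, ctx)]


if __name__ == "__main__":
    print(scan(1, [[[b"\x01\x00\x00\x00"]]]))  # uncompressed partition
    print(scan(2, [[[b"\x81"], [b""]]]))       # row-compressed partition
```

Passing an explicit context object keeps the lower layers free of any metadata lookups, at the cost of touching every parser signature, which matches the amount of change the text describes.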


I had to introduce several abstractions in the record parser to abstract over compressed and uncompressed records.
Overall, this resulted in a better architecture, but it took more time than expected. Parsing the compressed record format itself turned out to be only a small part of the ordeal, because it is documented and fairly straightforward. The data types took considerably more work to figure out.


Preview
As usual, the code is on GitHub and you can download it to take a look! If you are not a programmer, I have also uploaded OrcaMDF Studio binaries (dated 2012-02-06).


Statistical data
As a numbers geek, I like reading statistics. Here is a set of random statistics about OrcaMDF:

123 commits, the first one made on April 15, 2011. That was almost a year ago!
11,700 lines of C# code (excluding blank lines).
1,000 lines of comments.
35% of the code is test code, and the test suite contains more than 200 tests.
Ohloh estimates the development cost of OrcaMDF at $144,090.

End of Part 12
