Anatomy of SQL Server, Part 16: OrcaMDF RawDatabase, a Swiss Army Knife for MDF Files (translated)
http://improve.dk/orcamdf-rawdatabase-a-swiss-army-knife-for-mdf-files/
When I first started developing OrcaMDF, I had only one goal: to gain a deeper knowledge of the internals of MDF files than most books on the market could give me.
As time went on, OrcaMDF did just that. Although I had not planned for it at the outset, OrcaMDF grew to the point where it could parse system tables and metadata, and even expose DMVs. I also built a simple UI that makes OrcaMDF easier to use.
This is good, but it comes at the cost of considerable complexity. Automatically parsing metadata such as schemas, partitions and allocation units, not to mention abstracting away the details of heaps and indexes, takes a lot of code and a deeper understanding of the database. Because the metadata changes between SQL Server versions, OrcaMDF currently supports only SQL Server 2008 R2. The data structures themselves are relatively stable; what changes is how the metadata is stored, how DMVs expose data, and so on. OrcaMDF also needs the metadata to be intact in order to work, which leaves it just as stuck as SQL Server when corruption occurs. Hit a damaged boot page? Then neither SQL Server nor OrcaMDF can parse the database.
Say hello to RawDatabase
I have been thinking a lot about the future of OrcaMDF and where it is most useful. I could keep adding features to match whatever SQL Server supports, until it could eventually parse 100% of an MDF file. But what would be the point? It is a great learning opportunity, certainly, but you would be using software to read data that SQL Server itself reads far better. So, what is the alternative?
RawDatabase, in contrast to the Database class, does not attempt to parse anything unless you tell it to.
It does not parse schemas automatically. It knows nothing about system tables. It knows nothing about DMVs. It does, however, know the SQL Server data structures, and it gives you an interface for reading MDF files directly.
Because RawDatabase parses only the data structures, it can work around corrupt system tables and corrupt data.
Examples
The tool is still at an early stage of development, but let me show you what can be done with RawDatabase.
I run the code in LINQPad, which makes it easy to display the results; the results themselves are just standard .NET objects.
All examples are run against the AdventureWorksLT 2008 R2 (lightweight) database.
Getting a single page
Often we just need to parse a single page.
// Get page 197 in file 1
var db = new RawDatabase(@"C:\AWLT2008R2.mdf");
db.GetPage(1, 197).Dump();
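To illustrate what a page lookup amounts to at the file level, here is a minimal sketch that is independent of OrcaMDF. It relies only on the fact that SQL Server data files are made up of 8 KB (8192-byte) pages, so page N starts at byte offset N * 8192. PageReader and ReadPage are hypothetical names invented for this sketch, and the "file" is a fake in-memory buffer rather than a real MDF file.

```csharp
using System;
using System.IO;

// Demo: build a fake three-page "file" in memory and read page 2 back.
var fakeFile = new byte[3 * PageReader.PageSize];
fakeFile[2 * PageReader.PageSize] = 0xAB;   // marker at the first byte of page 2
using var fakeStream = new MemoryStream(fakeFile);
byte[] pageBytes = PageReader.ReadPage(fakeStream, 2);
Console.WriteLine($"{pageBytes.Length} bytes, first byte 0x{pageBytes[0]:X2}");   // prints "8192 bytes, first byte 0xAB"

// Hypothetical helper for illustration only; this is not OrcaMDF's API.
static class PageReader
{
    public const int PageSize = 8192;

    // Reads the raw bytes of one page from a seekable stream.
    public static byte[] ReadPage(Stream stream, int pageId)
    {
        var buffer = new byte[PageSize];
        stream.Seek((long)pageId * PageSize, SeekOrigin.Begin);

        // Stream.Read may return fewer bytes than requested, so loop until the page is full.
        int read = 0;
        while (read < PageSize)
        {
            int n = stream.Read(buffer, read, PageSize - read);
            if (n == 0)
                throw new EndOfStreamException($"Page {pageId} lies beyond the end of the stream.");
            read += n;
        }
        return buffer;
    }
}
```

Because nothing here consults metadata, the same offset arithmetic applies no matter what state the system tables are in.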
Parsing the page header
Now that we have the page, how about dumping its header?
// Get the header of page 197 in file 1
var db = new RawDatabase(@"C:\AWLT2008R2.mdf");
db.GetPage(1, 197).Header.Dump();
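For a feel of what the header contains, here is a minimal sketch that decodes a few fields from the first 96 bytes of a page. It is not OrcaMDF's implementation: HeaderSketch and HeaderFields are hypothetical names, and the byte offsets are the ones commonly documented for the SQL Server 2005 to 2012 page header, so treat them as assumptions. The header being parsed is synthetic.

```csharp
using System;
using System.Buffers.Binary;

// Demo on a synthetic header; the values are made up for illustration.
var rawHeader = new byte[96];
rawHeader[1] = 1;                                                        // page type 1 = data page
BinaryPrimitives.WriteInt16LittleEndian(rawHeader.AsSpan(22), 3);        // slot count
BinaryPrimitives.WriteInt32LittleEndian(rawHeader.AsSpan(24), 34);       // object ID
BinaryPrimitives.WriteInt16LittleEndian(rawHeader.AsSpan(28), 7000);     // free bytes
BinaryPrimitives.WriteInt32LittleEndian(rawHeader.AsSpan(32), 197);      // page ID
BinaryPrimitives.WriteInt16LittleEndian(rawHeader.AsSpan(36), 1);        // file ID
Console.WriteLine(HeaderSketch.Parse(rawHeader));

// Hypothetical types for illustration only; not OrcaMDF's header class.
record HeaderFields(byte Type, short SlotCnt, int ObjectId, short FreeCnt, int PageId, short FileId);

static class HeaderSketch
{
    public const int HeaderSize = 96;

    // Decodes a handful of little-endian fields at their assumed offsets.
    public static HeaderFields Parse(ReadOnlySpan<byte> page)
    {
        if (page.Length < HeaderSize)
            throw new ArgumentException("Need at least the 96-byte header.", nameof(page));

        return new HeaderFields(
            Type: page[1],
            SlotCnt: BinaryPrimitives.ReadInt16LittleEndian(page.Slice(22, 2)),
            ObjectId: BinaryPrimitives.ReadInt32LittleEndian(page.Slice(24, 4)),
            FreeCnt: BinaryPrimitives.ReadInt16LittleEndian(page.Slice(28, 2)),
            PageId: BinaryPrimitives.ReadInt32LittleEndian(page.Slice(32, 4)),
            FileId: BinaryPrimitives.ReadInt16LittleEndian(page.Slice(36, 2)));
    }
}
```

The sketch reads only a subset of the header; the real header carries more fields, such as the previous and next page pointers.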
Parsing the slot array
Just like the header, we can also dump the slot array entries stored at the end of the page.
// Get the slot array entries of page 197 in file 1
var db = new RawDatabase(@"C:\AWLT2008R2.mdf");
db.GetPage(1, 197).SlotArray.Dump();
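The slot array is simple enough to decode by hand. The following minimal sketch assumes the commonly documented layout: the slot count is a 2-byte value at header offset 22, and the array is filled backwards from the end of the 8192-byte page with one 2-byte record offset per slot, so slot 0 occupies the last two bytes. SlotArraySketch is a hypothetical name and the page is synthetic; this is not how OrcaMDF itself is implemented.

```csharp
using System;
using System.Buffers.Binary;

// Demo: a synthetic page with two slots.
var demoPage = new byte[SlotArraySketch.PageSize];
BinaryPrimitives.WriteInt16LittleEndian(demoPage.AsSpan(22), 2);        // slot count in the header
BinaryPrimitives.WriteUInt16LittleEndian(demoPage.AsSpan(8190), 96);    // slot 0: record right after the header
BinaryPrimitives.WriteUInt16LittleEndian(demoPage.AsSpan(8188), 150);   // slot 1
Console.WriteLine(string.Join(", ", SlotArraySketch.Read(demoPage)));   // prints "96, 150"

// Hypothetical helper for illustration only.
static class SlotArraySketch
{
    public const int PageSize = 8192;

    // Returns the record offsets in slot order (slot 0 first).
    public static ushort[] Read(ReadOnlySpan<byte> page)
    {
        if (page.Length != PageSize)
            throw new ArgumentException("Expected a full 8192-byte page.", nameof(page));

        int slotCount = BinaryPrimitives.ReadInt16LittleEndian(page.Slice(22, 2));
        if (slotCount < 0)
            throw new InvalidOperationException("Negative slot count; the header may be corrupt.");

        var offsets = new ushort[slotCount];
        for (int i = 0; i < slotCount; i++)
        {
            // Slot i lives 2 * (i + 1) bytes before the end of the page.
            offsets[i] = BinaryPrimitives.ReadUInt16LittleEndian(page.Slice(PageSize - 2 * (i + 1), 2));
        }
        return offsets;
    }
}
```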
Parsing records
While the raw slot array entries can be useful, you will usually want to look at the records themselves. Fortunately, that is just as easy.
// Get all records on page 197 in file 1
var db = new RawDatabase(@"C:\AWLT2008R2.mdf");
db.GetPage(1, 197).Records.Dump();
Retrieving data from records
Once you have a record, you can use its FixedLengthData and VariableLengthOffsetValues properties to get at the raw fixed-length and variable-length data. Most likely, though, you want the actual parsed values.
OrcaMDF will do the parsing for you; you only need to provide it with a schema.
// Read the record contents of the first record on page 197 of file 1
var db = new RawDatabase(@"C:\AWLT2008R2.mdf");
RawPrimaryRecord firstRecord = (RawPrimaryRecord)db.GetPage(1, 197).Records.First();

var values = RawColumnParser.Parse(firstRecord, new IRawType[] {
    RawType.Int("AddressID"),
    RawType.NVarchar("AddressLine1"),
    RawType.NVarchar("AddressLine2"),
    RawType.NVarchar("City"),
    RawType.NVarchar("StateProvince"),
    RawType.NVarchar("CountryRegion"),
    RawType.NVarchar("PostalCode"),
    RawType.UniqueIdentifier("rowguid"),
    RawType.DateTime("ModifiedDate")
});

values.Dump();
What RawColumnParser.Parse does, given a schema, is convert the raw bytes into a Dictionary<string, object>. The key is the column name taken from the schema, and the value is the actual column value, typed as int, short, Guid, string and so on. Because you supply the schema yourself, OrcaMDF can skip all the metadata it would otherwise depend on, and with it the read failures that corrupt metadata might cause.
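To make the idea of schema-driven parsing concrete, here is a minimal sketch that decodes one int column and one nvarchar column from a hand-built record. It assumes the commonly documented row layout (two status bytes, a 2-byte offset to the end of the fixed-length data, the fixed-length data, a 2-byte column count, the null bitmap, a 2-byte variable-length column count, one 2-byte end offset per variable-length column, then the variable-length data). RecordSketch and its parameters are hypothetical, the sketch ignores NULLs, off-row data and every other data type, and it is not how RawColumnParser is implemented.

```csharp
using System;
using System.Buffers.Binary;
using System.Collections.Generic;
using System.Text;

// Demo: build a 19-byte record holding (int 42, nvarchar "Hi") by hand.
var demoRecord = new byte[19];
demoRecord[0] = 0x30;                                                   // status: null bitmap + variable-length columns
BinaryPrimitives.WriteInt16LittleEndian(demoRecord.AsSpan(2), 8);       // fixed-length data ends at offset 8
BinaryPrimitives.WriteInt32LittleEndian(demoRecord.AsSpan(4), 42);      // the int column
BinaryPrimitives.WriteInt16LittleEndian(demoRecord.AsSpan(8), 2);       // total column count
demoRecord[10] = 0;                                                     // null bitmap: nothing is NULL
BinaryPrimitives.WriteInt16LittleEndian(demoRecord.AsSpan(11), 1);      // one variable-length column
BinaryPrimitives.WriteUInt16LittleEndian(demoRecord.AsSpan(13), 19);    // its data ends at offset 19
Encoding.Unicode.GetBytes("Hi").CopyTo(demoRecord, 15);                 // UTF-16LE payload

var demoValues = RecordSketch.ParseIntThenNVarchar(demoRecord, "id", "name");
Console.WriteLine($"{demoValues["id"]}, {demoValues["name"]}");         // prints "42, Hi"

// Hypothetical helper for illustration only.
static class RecordSketch
{
    // Parses a record whose schema is exactly: one int, then one nvarchar.
    public static Dictionary<string, object> ParseIntThenNVarchar(
        ReadOnlySpan<byte> record, string intName, string nvarcharName)
    {
        int fixedEnd = BinaryPrimitives.ReadInt16LittleEndian(record.Slice(2, 2));
        int intValue = BinaryPrimitives.ReadInt32LittleEndian(record.Slice(4, 4));

        int columnCount = BinaryPrimitives.ReadInt16LittleEndian(record.Slice(fixedEnd, 2));
        int nullBitmapBytes = (columnCount + 7) / 8;

        // The variable-length section starts right after the null bitmap.
        int pos = fixedEnd + 2 + nullBitmapBytes;
        int varCount = BinaryPrimitives.ReadInt16LittleEndian(record.Slice(pos, 2));
        int offsetArrayStart = pos + 2;
        int dataStart = offsetArrayStart + 2 * varCount;

        // The first variable-length column runs from dataStart to its recorded end offset.
        int firstEnd = BinaryPrimitives.ReadUInt16LittleEndian(record.Slice(offsetArrayStart, 2));
        string text = Encoding.Unicode.GetString(record.Slice(dataStart, firstEnd - dataStart));

        return new Dictionary<string, object>
        {
            [intName] = intValue,
            [nvarcharName] = text
        };
    }
}
```

The point of the sketch is that the column names and types come entirely from the caller, which is what lets this style of parsing carry on when the metadata is unreadable.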
Since the page header exposes the NextPageID and PreviousPageID properties, it is simple to traverse all the pages in a linked list and parse the records on them, which essentially amounts to scanning a given allocation unit.
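The traversal described above can be sketched in a few lines. This is a minimal, self-contained illustration rather than OrcaMDF code: FakePage and ChainSketch are hypothetical stand-ins, pages are held in an in-memory dictionary, file IDs are ignored, and a next-page ID of 0 is assumed to mean "end of chain". The visited set guards against a corrupt chain that loops back on itself.

```csharp
using System;
using System.Collections.Generic;

// Demo: three fake pages chained 197 -> 201 -> 305.
var fakePages = new Dictionary<int, FakePage>
{
    [197] = new FakePage(197, 201, new[] { "a", "b" }),
    [201] = new FakePage(201, 305, new[] { "c" }),
    [305] = new FakePage(305, 0, new[] { "d" }),
};
Console.WriteLine(string.Join(", ", ChainSketch.Scan(fakePages, 197)));   // prints "a, b, c, d"

// Hypothetical stand-in for a parsed page: its ID, its next-page pointer and its records.
record FakePage(int PageId, int NextPageId, string[] Records);

static class ChainSketch
{
    // Yields every record reachable by following next-page pointers from firstPageId.
    public static IEnumerable<string> Scan(IReadOnlyDictionary<int, FakePage> pages, int firstPageId)
    {
        var visited = new HashSet<int>();
        int current = firstPageId;

        // Stop at the end of the chain, at a page we have already seen, or at a missing page.
        while (current != 0 && visited.Add(current) && pages.TryGetValue(current, out var page))
        {
            foreach (var record in page.Records)
                yield return record;
            current = page.NextPageId;
        }
    }
}
```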
Filtering pages
Besides retrieving a specific page, RawDatabase also has a Pages property that enumerates all the pages in the database.
Using this property you can, for example, get a list of all the IAM pages in the database.
// Get a list of all IAM pages in the database
var db = new RawDatabase(@"C:\AWLT2008R2.mdf");
db.Pages
    .Where(x => x.Header.Type == PageType.IAM)
    .Dump();
And since this is just LINQ, it is easy to project exactly the properties you want.
For example, you can get all of the index pages and their slot counts like this:
// Get all index pages and their slot counts
var db = new RawDatabase(@"C:\AWLT2008R2.mdf");
db.Pages
    .Where(x => x.Header.Type == PageType.Index)
    .Select(x => new {
        x.PageID,
        x.Header.SlotCnt
    }).Dump();
Or suppose you want all data pages that meet the following conditions:
1. There is at least one record on the page
2. The page has more than 7000 bytes of free space
The following outputs the page ID, free byte count, record count and average record size for each such page:
var db = new RawDatabase(@"C:\AWLT2008R2.mdf");
db.Pages
    .Where(x => x.Header.FreeCnt > 7000)
    .Where(x => x.Header.SlotCnt >= 1)
    .Where(x => x.Header.Type == PageType.Data)
    .Select(x => new {
        x.PageID,
        x.Header.FreeCnt,
        RecordCount = x.Records.Count(),
        RecordSize = (8096 - x.Header.FreeCnt) / x.Records.Count()
    }).Dump();
For the final example, suppose you have nothing but an MDF file and you have forgotten which objects are stored in it.
No problem, we just need to query the sysschobjs system table! sysschobjs holds data for all objects,
and fortunately its object ID is 34. Using this information, we can filter down to all the data pages that belong to object ID 34,
read the records from those pages, and parse just the first two columns of the table (you may define a partial schema, as long as the columns you omit are at the end).
Finally we dump just the names (we could, of course, read all the columns of the table if we wanted to).
var db = new RawDatabase(@"C:\AWLT2008R2.mdf");

// Object ID 34 is sysschobjs; keep only its data pages
var records = db.Pages
    .Where(x => x.Header.ObjectID == 34 && x.Header.Type == PageType.Data)
    .SelectMany(x => x.Records);

// Partial schema: only the first two columns are parsed
var rows = records.Select(x => RawColumnParser.Parse((RawPrimaryRecord)x, new IRawType[] {
    RawType.Int("id"),
    RawType.NVarchar("name")
}));

rows.Select(x => x["name"]).Dump();
Compatibility
As you can see, RawDatabase does not depend on metadata, which makes it much easier to support multiple versions of SQL Server.
I am therefore pleased to announce that RawDatabase is fully compatible with SQL Server 2005, 2008 R2 and 2012.
It is probably compatible with 2014 as well, but I have not tested that yet. Speaking of testing, all unit tests run automatically
against AdventureWorksLT for 2005, 2008 R2 and 2012.
There are currently tests demonstrating that OrcaMDF RawDatabase parses every record of every table in the AdventureWorksLT databases.
Data corruption
One of the more interesting uses of RawDatabase is on corrupted databases. You can retrieve all the pages belonging to a particular object ID and then brute-force parse each one,
whether it turns out to be readable or not. If the metadata is corrupt, you can ignore it, supply the schema manually (the columns of the table), and simply follow the linked list of pages,
or parse the IAM pages to read the data of a heap. Over the next few weeks I will be blogging about usage scenarios for OrcaMDF RawDatabase, including data corruption.
Source Code and Feedback
I am very excited about the new RawDatabase addition to OrcaMDF, and I hope I am not the only one who sees its potential.
If you have any ideas, suggestions or other feedback, I would love to hear it.
If you want to try it out, check out the OrcaMDF project on GitHub. Once the tool has matured, I will make it available on NuGet.
Like the rest of OrcaMDF, it is released under the GPL v3 license.
End of Part 16