How expensive are page splits in terms of transaction log?
By: Paul Randal
Page splits are always thought of as expensive, but just how expensive are they? In this post I want to create an example to show how much more transaction log is created when a page in an index has to split. I'm going to use the sys.dm_tran_database_transactions DMV to show how much transaction log is generated by a transaction. You can find the list of its columns, with a small amount of explanation of each, in Books Online. I was reminded of its existence by someone recently (sorry, I don't remember who it was and I couldn't find it in search).
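As a quick sketch of how the DMV can be used (the wrapping transaction and the particular columns selected here are just an illustration; the column names are the documented ones from Books Online):

```sql
-- Sketch: inspect per-database log usage for the current transaction.
-- Run inside an explicit transaction that has done some logged work.
BEGIN TRAN;

-- ... some logged operation here ...

SELECT
    [database_id],
    [database_transaction_begin_time],
    [database_transaction_log_record_count],
    [database_transaction_log_bytes_used],
    [database_transaction_log_bytes_reserved]
FROM sys.dm_tran_database_transactions;

ROLLBACK TRAN;
```

The rows only appear once the transaction has actually generated log in a database, which is why the SELECT sits after the logged operation.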
In the example, I'm going to create a table with approximately 1000-byte long rows:
CREATE DATABASE PageSplitTest;
GO
USE PageSplitTest;
GO

CREATE TABLE BigRows (c1 INT, c2 CHAR (1000));
CREATE CLUSTERED INDEX BigRows_CL ON BigRows (c1);
GO

INSERT INTO BigRows VALUES (1, 'a');
INSERT INTO BigRows VALUES (2, 'a');
INSERT INTO BigRows VALUES (3, 'a');
INSERT INTO BigRows VALUES (4, 'a');
INSERT INTO BigRows VALUES (6, 'a');
INSERT INTO BigRows VALUES (7, 'a');
GO
I've engineered the case where the clustered index data page has space for just one more row, and I've left a 'gap' at c1 = 5. Let's add a row as part of an explicit transaction and see how much transaction log is generated:
BEGIN TRAN;
INSERT INTO BigRows VALUES (8, 'a');
GO
SELECT [database_transaction_log_bytes_used]
FROM sys.dm_tran_database_transactions
WHERE [database_id] = DB_ID (N'PageSplitTest');
GO
database_transaction_log_bytes_used
-----------------------------------
1228
That's about what I'd expect for that row. Now what happens when I cause a page split by inserting the 'missing' c1 = 5 row into the full page?
-- Commit the previous transaction
COMMIT TRAN;
GO
BEGIN TRAN;
INSERT INTO BigRows VALUES (5, 'a');
GO
SELECT [database_transaction_log_bytes_used]
FROM sys.dm_tran_database_transactions
WHERE [database_id] = DB_ID (N'PageSplitTest');
GO
database_transaction_log_bytes_used
-----------------------------------
6724
Wow. 5.5x more bytes are written to the transaction log as part of the system transaction that does the split.
The ratio gets worse as the row size gets smaller. For approximately 100-byte long rows (use the same code as above but with a CHAR(100)), insert 67 rows with a 'gap', then insert the 68th to cause the split; the numbers I got were 328 and 5924, so the split caused about 18 times more log to be generated! For approximately 10-byte long rows, the split generated 10436 bytes of log, because I created skewed data and then inserted key value 5, which forced a (rare) non-middle page split; that's an even bigger multiple of log generated! You can try this yourself if you want: I changed the code to a CHAR(10), inserted values 1, 2, 3, 4, 6, 7, then many rows with key value 8, and then inserted key value 5. The resulting page had only 6 rows after the split, as it split after the key value 5; the Storage Engine doesn't always do a 50/50 page split. And that's not even counting nasty cascading page splits, or splits that have to split a page multiple times to fit a new (variable-sized) row in.
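If you want to reproduce the 100-byte case, here's a sketch of the setup described above. The fill count is my own back-of-the-envelope calculation (a CHAR(100) plus INT record is roughly 113 bytes including record and slot overhead, so about 71 rows fit on an 8 KB page), not a number from the original test; adjust the loop bound if your page splits earlier or doesn't split:

```sql
-- Sketch of the ~100-byte-row variant, run in the PageSplitTest database.
-- The table/index names and the fill count of 71 rows are assumptions.
CREATE TABLE BigRows100 (c1 INT, c2 CHAR (100));
CREATE CLUSTERED INDEX BigRows100_CL ON BigRows100 (c1);
GO

-- Fill the single page, leaving a 'gap' at c1 = 5.
SET NOCOUNT ON;
DECLARE @i INT = 1;
WHILE @i <= 72
BEGIN
    IF @i <> 5
        INSERT INTO BigRows100 VALUES (@i, 'a');
    SET @i = @i + 1;
END;
GO

-- Now measure the log generated by the row that forces the split.
BEGIN TRAN;
INSERT INTO BigRows100 VALUES (5, 'a');
SELECT [database_transaction_log_bytes_used]
FROM sys.dm_tran_database_transactions
WHERE [database_id] = DB_ID (N'PageSplitTest');
COMMIT TRAN;
GO
```

For comparison, run the same measurement around one of the ordinary inserts in the fill loop; the gap insert should show a much larger log-bytes figure.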
Bottom line: page splits don't just cause extra I/Os and index fragmentation, they generate a *lot* more transaction log. And all that log has to be (potentially) backed up, log shipped, mirrored...