Many people assume that DBCC CHECKDB with REPAIR_ALLOW_DATA_LOSS can fix any database corruption, but that is not necessarily so. Paul's article describes in detail which situations cannot be repaired:
In my previous post on interpreting CHECKDB output, plus in my DBCC Internals session at TechEd IT Forum yesterday, I mentioned there are some things that CHECKDB can't repair. In this post I want to go into a bit more detail, based on a post from my old Storage Engine blog.
Before anyone takes this the wrong way, what do I mean by "can't be repaired"? Remember that the purpose of repair is to make the database structurally consistent, and that to do this usually means deleting the corrupt data/structure (that's why the option to do this was aptly named REPAIR_ALLOW_DATA_LOSS; see this post for more explanation on why repair can be bad). A corruption is deemed unrepairable when it doesn't make sense to repair it given the damage the repair would cause, or the corruption is so rare and so complicated to repair correctly that it's not worth the engineering effort to provide a repair. Remember also that recovery from corruptions should be based on a sound backup strategy, not on running repair, so making this trade-off in functionality makes sense.
Here are a few of the more common unrepairable corruptions that people run into, along with the reasons they can't be repaired by DBCC.
PFS page header corruption
An example of this is on SQL Server 2005:
Msg 8946, Level 16, State 12, Line 1
Table error: Allocation page (1:13280496) has invalid PFS_PAGE page header values.
Type is 0. Check type, alloc unit ID and page ID on the page.
CHECKDB uses the PFS pages to determine which pages are allocated, and so which pages to read to drive the various consistency checks. The only repair for a PFS page is to reconstruct it; they can't simply be deleted as they're a fixed part of the fabric of the database. PFS pages cannot be rebuilt because there is no infallible way to determine which pages are allocated or not. There are various algorithms I've experimented with to rebuild them, with optimistic or pessimistic setting of page allocation statuses and then re-running the various consistency checks to try to sort out the incorrect choices, but they all require very long run-times. Given the frequency with which these corruptions are seen, and the engineering effort required to come up with an (imperfect) solution, I made the choice to leave this as unrepairable, and I don't think that will change in future.
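To illustrate what "a fixed part of the fabric" means in practice, here is a minimal sketch. It assumes the commonly documented layout in which the first PFS page of a data file is page 1 and another PFS page follows every 8,088 pages. That interval and the helper names `is_pfs_page` and `covering_pfs_page` are not from this post; they are assumptions for illustration only.

```python
# Sketch only: assumes the commonly documented PFS layout (page 1, then
# every 8,088 pages in each data file). All names here are hypothetical.
PFS_INTERVAL = 8088


def is_pfs_page(page_id: int) -> bool:
    """Return True if page_id sits at a PFS page position in a data file."""
    if page_id < 0:
        raise ValueError("page_id must be non-negative")
    return page_id == 1 or (page_id > 0 and page_id % PFS_INTERVAL == 0)


def covering_pfs_page(page_id: int) -> int:
    """Return the page ID of the PFS page that tracks page_id."""
    if page_id < 0:
        raise ValueError("page_id must be non-negative")
    if page_id < PFS_INTERVAL:
        return 1
    return (page_id // PFS_INTERVAL) * PFS_INTERVAL
```

Under that assumption, page 13280496 from the error above is an exact multiple of 8,088, which is consistent with it being reported as an allocation page. Because these positions are fixed, a damaged PFS page cannot simply be deallocated the way an ordinary data page can.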
Critical system table clustered-index leaf-page corruption
An example of this is on SQL Server 2000:
Server: Msg 8966, Level 16, State 1, Line 1
Could not read and latch page (1:18645) with latch type SH. sysindexes failed.
And on SQL Server 2005:
Msg 7985, Level 16, State 2, Server sunart, Line 1
System table pre-checks: Object ID 4. Could not read and latch page () with
latch type SH. Check statement terminated due to unrepairable error.
In a previous post in the series I described how and why we do special checks of the clustered indexes of the critical system tables. If any of the pages at the leaf level of these indexes are corrupt, we cannot repair them. Repairing would mean deallocating the page, wiping out the most important metadata for potentially hundreds of user tables and so effectively deleting all of these tables. That's obviously an unpalatable repair for anyone to allow and so CHECKDB doesn't do it.
Column value corruption
Here's an example of this on SQL Server 2005:
Msg 2570, Level 16, State 3, Line 1
Page (), slot 0 in object ID 2073058421, index ID 0, partition ID 72057594038321152, alloc unit ID 72057594042318848 (type "In-row data"). Column "C1" value is out of range for data type "datetime". Update column to a legal value.
This is where a column has a stored value that is outside the valid range for the column type. There are a couple of repairs we could do for this:
1. Delete the entire record
2. Insert a dummy value
#1 isn't very palatable because then data is lost, and it's not a structural problem in the database so it doesn't have to be repaired. #2 is dangerous: what value should be chosen as the dummy value? Any value put in may adversely affect business logic, or fire a trigger, or have some unwelcome meaning in the context of the table, even a NULL. Given these problems, I chose to allow people to fix the corrupt values themselves.
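As a rough picture of what "out of range" means for the datetime error above, the sketch below checks a raw value against the documented datetime range of 1753-01-01 through 9999-12-31. It assumes the value is stored as a signed day offset from 1900-01-01 plus a count of 1/300-second ticks since midnight. That storage layout and the name `is_legal_datetime` are assumptions for illustration, not something described in this post.

```python
from datetime import date

# Sketch only: assumes datetime is stored as a signed day offset from
# 1900-01-01 plus a count of 1/300-second ticks since midnight, and that
# the legal range is 1753-01-01 through 9999-12-31.
BASE_DATE = date(1900, 1, 1)
MIN_DAYS = (date(1753, 1, 1) - BASE_DATE).days
MAX_DAYS = (date(9999, 12, 31) - BASE_DATE).days
TICKS_PER_DAY = 300 * 60 * 60 * 24


def is_legal_datetime(days: int, ticks: int) -> bool:
    """Return True if both stored parts fall inside the legal range."""
    return MIN_DAYS <= days <= MAX_DAYS and 0 <= ticks < TICKS_PER_DAY
```

A check like this only detects the bad value. It says nothing about what the value should have been, which is the reason the choice of a replacement is left to whoever owns the data.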
Metadata corruption
Here's an example of this on SQL Server 2005:
Msg 3854, Level 16, State 1, Line 2
Attribute (referenced_major_id = 2089058478) of row (class = 0, object_id = 2105058535, column_id = 0, referenced_major_id = 2089058478, referenced_minor_id = 0) in sys.sql_dependencies has a matching row (object_id = 2089058478) in sys.objects (type = SN) that is invalid.
This example is relatively benign. There are other examples that will cause CHECKDB to terminate; they are not as bad as the critical system table corruption example above, but bad enough that CHECKDB doesn't trust the metadata enough to use it to drive consistency checks. Repairing metadata corruption has the same problems as repairing critical system table corruption: any repair means deleting metadata about one or more tables, and hence deleting the tables themselves. It's far better to leave the corruption unrepaired so that as much data as possible can be extracted from the remaining tables.
Summary
Repair can't fix everything. You may end up having to perform manual and time-consuming data extraction from the corrupt database and losing lots of data because of, say, a critical system table corruption. Bottom line (as usual): make sure you have valid backups so you don't get into this state!
This article is reposted from Paul's blog at SQLskills: http://www.sqlskills.com/blogs/paul/post/CHECKDB-From-Every-Angle-Can-CHECKDB-repair-everything.aspx