Common Scenarios to avoid with datawarehousing

Source: Internet
Author: User
<span id="Label3"></p><p class="Label"><p class="Label">Database Design</p></p><p class="TableSpacing"><p class="TableSpacing"></p></p> <table style="width: 608px;" border="1" cellspacing="0" cellpadding="0"> <tbody> <tr> <td valign="top" width="53"><p class="Label">Rule</p></td> <td valign="top" width="120"><p class="Label">Description</p></td> <td valign="top" width="57"><p class="Label">Value</p></td> <td valign="top" width="166"><p class="Label">Source</p></td> <td valign="top" width="213"><p class="Label">Problem Description</p></td> </tr> <tr> <td valign="top" width="><p class=" text ">1 <p></p> </td> <td valign=" top " width=" + "><p class=" Text "> Excessive sorting and RID lookup operations   should is reduced with covered indexes. </p> </td> <td valign="top" width="><p class=" "text">  <p></p> </td> <td valign="top" width="166"><p class="text">sys.dm_exec_sql_text </p> <p class="text">sys.dm_exec_cached_plans </p> </td> <td valign="top" width="213"><p class="Text">large Data Warehouse can benefit from   Indexes. Indexes can used to cover queries and avoid sorting. The cost of   index overhead are only paid if data is Loaded. </p> </td> </tr> <tr> <td valign="top" width="><p class=" text ">2 <p></p> </td> <td valign=" top " width=" + "><p class=" Text "> Excessive fragmentation: </p> <p class="Text">average fragmentation_in_percent should be   <25% </p> </td> <td valign="top" width="><p" class="text">>25% <p></p> </td> <td valign="top" width="166"><p class="Text">sys.dm_db _ Index_physical_stats </p> </td> <td valign="top" width="213"><p class="Text">reducing index fragmentation through Index   rebuilds can benefit big range scans, common in Data Warehouse and Reporting   scenarios.  </p> </td> </tr> <tr> <td valign="top" width="><p class=" text ">3 <p></p> </td> <td valign=" top " width=" + "><p class=" Text ">scans and ranges are Common. Look for missing   indexes </p> </td> <td valign="top" width=""><p class="Text">>= 1 </p> </td> <td valign="top" width="166"><p class="text">perfmon object </p> <p class="text">sql server Access Methods </p> <p class="text"> Sys.dm_db_missing_index_group_stats </p> <p class="text">sys.dm_db_missing_index_groups </p> <p class="text"> Sys.dm_db_missing_index_details </p> </td> <td valign="top" width="213"><p class="Text">a Missing index flushes the Cache. </p> </td> </tr> <tr> <td valign="top" width="53"><p class="Text">4</p></td> <td valign="top" width="120"><p class="Text">Unused Indexes should be avoided</p></td> <td colspan="2" valign="top" width="223"><p class="Text">If an index is never used, it won't appear in the DMV sys.dm_db_index_usage_stats</p></td> <td valign="top" width="213"><p class="Text">Index maintenance for unused indexes should is Avoided.</p></td> </tr> </tbody> </table><p class="TableSpacing"><p class="TableSpacing"></p></p><p class="Label"><p class="Label">Resource ISSUE:CPU</p></p><p class="TableSpacing"><p class="TableSpacing"></p></p> <table style="width: 608px;" border="1" cellspacing="0" cellpadding="0"> <tbody> <tr> <td valign="top" width="53"><p class="Label">Rule</p></td> <td valign="top" width="120"><p class="Label">Description</p></td> <td valign="top" width="62"><p class="Label">Value</p></td> <td valign="top" width="146"><p class="Label">Source</p></td> <td valign="top" width="228"><p class="Label">Problem Description</p></td> </tr> <tr> <td valign="top" width="53"><p class="Text">1</p></td> <td valign="top" width="120"><p class="Text">Signal Waits</p></td> <td valign="top" width="62"><p class="Text">> 25%</p></td> <td valign="top" width="146"><p class="Text">Sys.dm_os_wait_stats</p></td> <td valign="top" width="228"><p class="Text">Time in runnable queue is pure CPU wait.</p></td> </tr> <tr> <td valign="top" width="53"><p class="Text">2</p></td> <td valign="top" width="120"><p class="Text">Avoid Plan Reuse</p></td> <td valign="top" width="62"><p class="Text">> 25%</p></td> <td valign="top" width="146"><p class="Text">Perfmon Object</p><p class="Text">SQL Server Statistics</p></td> <td valign="top" width="228"><p class="Text">Data warehouse have fewer transactions than OLTP, each with significantly bigger IO. Therefore, has the correct plan is more important than reusing a plan. Unlike OLTP, Data Warehouse queries is <em></em> not identical.</p></td> </tr> <tr> <td valign="top" width="53"><p class="Text">3</p></td> <td valign="top" width="120"><p class="Text">Parallelism:cxpacket Waits</p></td> <td valign="top" width="62"><p class="Text"><10%</p></td> <td valign="top" width="146"><p class="Text">Sys.dm_os_wait_stats</p></td> <td valign="top" width="228"><p class="Text">Parallelism is desirable in data Warehouse or reporting WORKLOADS.</p></td> </tr> </tbody> </table><p class="TableSpacing"><p class="TableSpacing"></p></p><p class="Label"><p class="Label">Resource issue:memory</p></p><p class="TableSpacing"><p class="TableSpacing"></p></p> <table style="width: 608px;" border="1" cellspacing="0" cellpadding="0"> <tbody> <tr> <td valign="top" width="52"><p class="Label">Rule</p></td> <td valign="top" width="120"><p class="Label">Description</p></td> <td valign="top" width="68"><p class="Label">Value</p></td> <td valign="top" width="143"><p class="Label">Source</p></td> <td valign="top" width="227"><p class="Label">Problem Description</p></td> </tr> <tr> <td valign="top" width="><p" class="text">1 <p></p> </td> <td valign="top" width="" "><p class=" text ">memory Grants pending </p> </td> <td valign="top" width=""><p class="Text">>1 </p> </td> <td valign="top" width="143">< P class= "text" >perfmon object <p></p> <p class="text">sql server Memory Manager </p> </td> <td valign="top" width="227"><p class="Text">memory Grant not available for query to run.  Check for </p> <p class=" Text ">sufficient Memory and page life Expectancy. </p> </td> </tr> <tr> <td valign="top" width="52"><p class="Text">2</p></td> <td valign="top" width="120"><p class="Text">Page Life expectancy</p></td> <td valign="top" width="68"><p class="Text">Drops by 50%</p></td> <td valign="top" width="143"><p class="Text">Perfmon Object</p><p class="Text">SQL Server Buffer Manager</p></td> <td valign="top" width="227"><p class="Text">Page life expectancy are the average number of seconds a data page stays in Cache. Low values could indicate a cache flush this is caused by a big read.</p><p class="Text">Look for possible missing Index.</p></td> </tr> </tbody> </table><p class="TableSpacing"><p class="TableSpacing"></p></p><p class="Label"><p class="Label">Resource Issue:io</p></p><p class="TableSpacing"><p class="TableSpacing"></p></p> <table style="width: 608px;" border="1" cellspacing="0" cellpadding="0"> <tbody> <tr> <td valign="top" width="52"><p class="Label">Rule</p></td> <td valign="top" width="192"><p class="Label">Description</p></td> <td valign="top" width="68"><p class="Label">Value</p></td> <td valign="top" width="166"><p class="Label">Source</p></td> <td valign="top" width="130"><p class="Label">Problem Description</p></td> </tr> <tr> <td valign="top" width="52"><p class="Text">1</p></td> <td valign="top" width="192"><p class="Text">Average Disk Sec/read</p></td> <td valign="top" width="68"><p class="Text">>20 ms</p></td> <td valign="top" width="166"><p class="Text">Perfmon Object</p><p class="Text">Physical Disk</p></td> <td valign="top" width="130"><p class="Text">Reads should take 4-8ms without any IO pressure.</p></td> </tr> <tr> <td valign="top" width="52"><p class="Text">2</p></td> <td valign="top" width="192"><p class="Text">Average Disk Sec/write</p></td> <td valign="top" width="68"><p class="Text">>20 ms</p></td> <td valign="top" width="166"><p class="Text">Perfmon Object</p><p class="Text">Physical Disk</p></td> <td valign="top" width="130"><p class="Text">Writes (sequential) can is as fast as 1 ms for transaction log.</p></td> </tr> <tr> <td valign="top" width="52"><p class="Text">3</p></td> <td valign="top" width="192"><p class="Text">Big Scans</p></td> <td valign="top" width="68"><p class="Text">>1</p></td> <td valign="top" width="166"><p class="Text">Perfmon Object</p><p class="Text">SQL Server Access Methods</p></td> <td valign="top" width="130"><p class="Text">A Missing index flushes the Cache.</p></td> </tr> <tr> <td valign="top" width="52"><p class="Text">4</p></td> <td valign="top" width="192"><p class="Text">If Top 2 values for wait stats is any of the following:</p><p class="Text">Asynch_io_completion</p><p class="Text">Io_completion</p><p class="Text">Logmgr</p><p class="Text">Writelog</p><p class="Text">Pageiolatch_x</p></td> <td valign="top" width="68"><p class="Text">Top 2</p></td> <td valign="top" width="166"><p class="Text">Sys.dm_os_wait_stats</p></td> <td valign="top" width="130"><p class="Text">If top 2 wait_stats values include io, there is an IO bottleneck</p></td> </tr> </tbody> </table><p class="TableSpacing"><p class="TableSpacing"></p></p><p class="Label"><p class="Label">Resource issue:blocking</p></p><p class="TableSpacing"><p class="TableSpacing"></p></p> <table style="width: 608px;" border="1" cellspacing="0" cellpadding="0"> <tbody> <tr> <td valign="top" width="52"><p class="Label">Rule</p></td> <td valign="top" width="100"><p class="Label">Description</p></td> <td valign="top" width="72"><p class="Label">Value</p></td> <td valign="top" width="208"><p class="Label">Source</p></td> <td valign="top" width="177"><p class="Label">Problem Description</p></td> </tr> <tr> <td valign="top" width="52"><p class="Text">1</p></td> <td valign="top" width="100"><p class="Text">Block percentage</p></td> <td valign="top" width="72"><p class="Text">>2%</p></td> <td valign="top" width="208"><p class="Text">Sys.dm_db_index_operational_stats</p></td> <td valign="top" width="177"><p class="Text">Frequency of Blocks.</p></td> </tr> <tr> <td valign="top" width="52"><p class="Text">2</p></td> <td valign="top" width="100"><p class="Text">Block Process Report</p></td> <td valign="top" width="72"><p class="Text">Sec</p></td> <td valign="top" width="208"><p class="Text">sp_configure, Profiler</p></td> <td valign="top" width="177"><p class="Text">Report of Statements.</p></td> </tr> <tr> <td valign="top" width="52"><p class="Text">3</p></td> <td valign="top" width="100"><p class="Text">Average Row Lock Waits</p></td> <td valign="top" width="72"><p class="Text">>100ms</p></td> <td valign="top" width="208"><p class="Text">Sys.dm_db_index_operational_stats</p></td> <td valign="top" width="177"><p class="Text">Duration of Blocks.</p></td> </tr> <tr> <td valign="top" width="52"><p class="Text">4</p></td> <td valign="top" width="100"><p class="Text">If Top 2 values for wait stats is any of the following:</p><p class="Text">Lck_m_bu</p><p class="Text">Lck_m_is</p><p class="Text">Lck_m_iu</p><p class="Text">Lck_m_ix</p><p class="Text">Lck_m_rin_nl</p><p class="Text">Lck_m_rin_s</p><p class="Text">Lck_m_rin_u</p><p class="Text">Lck_m_rin_x</p><p class="Text">lck_m_rs_s</p><p class="Text">Lck_m_rs_u</p><p class="Text">lck_m_rx_s</p><p class="Text">Lck_m_rx_u</p><p class="Text">Lck_m_rx_x</p><p class="Text">lck_m_s</p><p class="Text">Lck_m_sch_m</p><p class="Text">lck_m_sch_s</p><p class="Text">Lck_m_siu</p><p class="Text">Lck_m_six</p><p class="Text">Lck_m_u</p><p class="Text">Lck_m_uix</p><p class="Text">Lck_m_x</p></td> <td valign="top" width="72"><p class="Text">Top 2</p></td> <td valign="top" width="208"><p class="Text">Sys.dm_os_wait_stats</p></td> <td valign="top" width="177"><p class="Text">If top 2 wait_stats values include IO, there is a blocking bottleneck.</p><p class="Text">Consider using row versioning to minimize shared locking blocks.</p></td> </tr> </tbody> </table><p class="TableSpacing"><p class="TableSpacing"></p></p><p class="Text"><p class="Text">Exactly the opposite of OLTP applications, reporting or relational data warehouse applications is characterized by small Numbers of (different) Big Transactions. These is frequently SELECT intensive operations. The implications is significant for database design, resource usage, and system Performance.</p></p><p class="Text"><p class="Text">Reporting and Data Warehouse performance objectives is as Follows:</p></p> <ol> <li><li>Data Warehouse and relational data Warehouse designs can has more indexes as the cost of index maintenance are paid only O NE time, during the batch update process.</li></li> <li><li>Plan reuse should generally be avoided. Plan reuse may result in picking-a plan that is good for some other query (with different data Distribution)  Not is good for this query. The time taken for plan generation of a large datawarehouse query are not nearly as important as has the right Plan.</li></li> <li><li>Sorts can and should is minimized with correct index Usage.</li></li> <li><li>Missing index situations should be investigated and Corrected.</li></li> <li><li>Large IOs such as range scans benefits from on disk Contiguity. Index fragmentation should is frequently monitored and kept to a minimum with index rebuilds.</li></li> <li><li>Blocking is generally uncommon as most data warehouse transactions is read operations.</li></li> <li><li>Parallelism is generally desirable for data Warehouse Applications.</li></li> </ol><p><p>Common Scenarios to avoid with datawarehousing</p></p></span>

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.