Attilax Summary of the implementation of packet aggregation groupby for atitit data storage
1. Aggregation Operations 1
1.1. A, scalar aggregation Stream aggregation 1
1.2. b, hash aggregation 2
1.3. All optimal plan choices are based on existing statistics to evaluate 3
1.4. Reference 3
1.
Aggregation Operations
aggregation is what we're writing . T-SQL statements are often encountered, let us analyze the characteristics of some commonly used aggregate operators and the items that can be optimized.
1.1.
a, scalar aggregation Stream Aggregation
scalar aggregation is a common way of aggregating data, such as the following aggregate functions used in the statements we write: MAX (),MIN (), AVG (),COUNT (),SUM ()
The output of these data result items above is basically generated by the flow aggregation, and this operator is also known as: Scalar aggregation
in fact, Stream Aggregation The most common way for this algorithm is to group ( GROUP BY) calculates that the scalar calculations above also take advantage of this feature, except that the whole is formed into a large group to be aggregated.
I understand it through the following code
Clear the current aggredate results
Clear The current group by columns for Each input row
Begin
If the input row does not match the current group by columns
Begin
output The current aggreagate results (if any )
Clear the current aggreagate results
Set the current group by columns to the input row
End
Update the aggregate results with the input row
End
The flow aggregation operator is a simple process that maintains an aggregation group and aggregated values, scans the data in the table sequentially, ignores if the aggregation group can not match, and if so, joins the aggregation group and updates the aggregated value result item.
As an example,
SELECT Shipaddress,shipcity,count (*)
From Orders
GROUP by Shipaddress,shipcity
So here we have summed up an optimization method for flow aggregation: avoid sorting as much as possible, but avoiding sorting requires grouping (groupby) fields within the index coverage.
1.2.
b, hash aggregation
The method of flow aggregation above needs to be ordered in advance, we know that sorting is a very large consumption process, so it is not suitable for large table grouping aggregation operation, in order to solve this problem, another kind of aggregation operation is introduced: hash aggregation
The so-called hash aggregation internal method is the same as the hash connection mechanism mentioned earlier in this article.
Hash aggregation does not require sorting and excessive memory consumption, and it is easy to execute the plan in parallel, leveraging multiple CPU synchronization, but one drawback is that the process is blocked, and that the hash aggregation does not produce any results until the full input.
Therefore, it is a good application scenario to use hash aggregation in large data tables.
We know that sequencing is a very resource-intensive thing, but using hash matching requires only a different column value to be extracted, so there is no doubt that the hash matching algorithm is the notch above algorithm here compared to performance.
1.3.
all optimal planning choices are based on existing statistical information to assess
and the above about these two columns of content distribution type How does SQL Server know? This is where the powerful statistics of SQL Server are supported.
in the SQL Server is not a fixed statement that will form a specific plan, and the resulting specific plan is not always optimal, which is related to the content distribution, the amount of data, the type of data in the database's existing data table, and the statistical information that records these details.
all optimal planning choices are based on existing statistical information to assess , if our statistical information is not updated in a timely manner, then the optimal execution plan will not be the best, and sometimes the worst.
1.4.
References
SQL Server Tuning Series Basics (Summary of common operators-three ways to analyze physical connections) - finger tips - Blog Park . html
author :: Nickname : Old Wow's paw claw of Eagle idol wreck iconoclast image-smasher
Bird catcherWang"Bird Catcher King of kings of Kings devout pious religiousFaithDefenders DefenderOf the Faith. Kara CarlaRed Cloak CaracallaRed Cloak
abbreviation: : Emir Attilax Akbar Emil Attilax Akbar
Full Name::EmirAttilax AkbarbinMahmud bin Attila binSolomon Al Rapanui
Emil Attilax Akbar this Mahmoud this Attila Solomon al Rapa Nui
common name : Span style= "font-family: Song body; Color: #333333; font-size:12pt "> Etila ( Ayron"   Email:[email protected]
reprint Please indicate source: attilax 's Column http://www.cnblogs.com/attilax/
--atiend
Attilax Summary of the implementation of packet aggregation groupby for atitit data storage