A Note on Grouped Statistics

Source: Internet
Author: User

In practice, grouped statistics are very common. For example, the People's Bank of China requires commercial banks to submit anti-money-laundering reports, and one item in such a report is the monthly number and total amount of large transactions. A large transaction is defined as a customer's cumulative amount for a single day reaching 200,000 yuan, or a foreign-currency equivalent of more than 10,000 U.S. dollars. The report is produced by taking a large chronological transaction log, grouping it by transaction date, and computing statistics for each group.
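
The reporting example above can be sketched as a group-by over a transaction log. This is only an illustration under stated assumptions: the `Transaction` type, its field names, and the `FindLargeDays` helper are hypothetical, only the yuan threshold is handled (foreign-currency conversion is left out), and "reaching" the threshold is read as greater than or equal.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical record type; the field names are assumptions for illustration.
public sealed class Transaction
{
    public string CustomerId { get; set; }
    public DateTime Time { get; set; }
    public decimal AmountYuan { get; set; }
}

public static class LargeTransactionReport
{
    // Threshold taken from the text: a cumulative 200,000 yuan per customer per day.
    public const decimal DailyThresholdYuan = 200000m;

    // Returns (date, customer, daily total) for every customer-day
    // whose cumulative amount reaches the threshold.
    public static List<Tuple<DateTime, string, decimal>> FindLargeDays(IEnumerable<Transaction> log)
    {
        return log
            .GroupBy(t => new { Date = t.Time.Date, t.CustomerId })
            .Select(g => Tuple.Create(g.Key.Date, g.Key.CustomerId, g.Sum(t => t.AmountYuan)))
            .Where(x => x.Item3 >= DailyThresholdYuan)
            .OrderBy(x => x.Item1).ThenBy(x => x.Item2)
            .ToList();
    }
}
```

Counting the returned rows per month, and summing their totals, would then give the two figures the report asks for.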

First, let's generate the data to be aggregated, as follows:

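// Requires: using System; using System.Collections.Generic;
IEnumerable<Tuple<int, double>> GetTuples(int n)
{
    var tuples = new Tuple<int, double>[n];
    var rand = new Random();
    for (int k = 1, i = 0; i < n; i++)
    {
        var r = rand.Next(n);
        // Top 3 values of r: key jumps by 2; next 6 values: key advances by 1;
        // otherwise the item stays in the current group.
        k += (r >= n - 3) ? 2 : ((r >= n - 9) ? 1 : 0);
        tuples[i] = new Tuple<int, double>(k, rand.NextDouble());
    }
    return tuples;
}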

This method generates n items that are already sorted by key (the first element of each tuple).
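
The key never decreases because each step adds 0, 1, or 2 to it. For n of at least 9, each item changes the key with probability 9/n, so the data tends to fall into roughly ten groups regardless of n. The sketch below checks the sortedness claim. It repeats the generation logic with an added `seed` parameter, which is an assumption introduced here for repeatable runs, and `IsSortedByKey` is a hypothetical helper.

```csharp
using System;
using System.Collections.Generic;

public static class SampleData
{
    // Same generation logic as GetTuples; the seed parameter is an addition
    // (not in the original) so that runs are repeatable.
    public static Tuple<int, double>[] Generate(int n, int seed)
    {
        var tuples = new Tuple<int, double>[n];
        var rand = new Random(seed);
        for (int k = 1, i = 0; i < n; i++)
        {
            var r = rand.Next(n);
            k += (r >= n - 3) ? 2 : ((r >= n - 9) ? 1 : 0);
            tuples[i] = Tuple.Create(k, rand.NextDouble());
        }
        return tuples;
    }

    // True when every key is greater than or equal to the one before it.
    public static bool IsSortedByKey(IReadOnlyList<Tuple<int, double>> tuples)
    {
        for (int i = 1; i < tuples.Count; i++)
        {
            if (tuples[i].Item1 < tuples[i - 1].Item1) return false;
        }
        return true;
    }
}
```

Calling `SampleData.IsSortedByKey(SampleData.Generate(1000, 42))` should return true for any seed, since the key is only ever incremented.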

Now, let's group the items by key and compute the count and the average of each group.

First, use C#'s foreach loop, as follows:

IEnumerable<Tuple<int, int, double>> ForEach(IEnumerable<Tuple<int, double>> tuples)
{
    var result = new List<Tuple<int, int, double>>();
    var count = 0;
    var sum = 0.0;
    int? key = null; // null until the first item has been seen
    foreach (var v in tuples)
    {
        if (key != v.Item1)
        {
            // The key changed: emit the finished group, then start a new one.
            if (key != null) result.Add(new Tuple<int, int, double>(key.Value, count, sum / count));
            sum = count = 0;
            key = v.Item1;
        }
        count++;
        sum += v.Item2;
    }
    // The last group is still pending when the loop ends and must be emitted here.
    if (key != null) result.Add(new Tuple<int, int, double>(key.Value, count, sum / count));
    return result;
}

The biggest drawback of this approach is that the statement that emits a group's statistics has to be repeated after the foreach loop ends, in order to flush the last group. That duplication is a code smell.

So let's refactor, this time driving the loop with the iterator directly:

IEnumerable<Tuple<int, int, double>> Iterate(IEnumerable<Tuple<int, double>> tuples)
{
    var result = new List<Tuple<int, int, double>>();
    var count = 0;
    var sum = 0.0;
    int? key = null;
    // Dispose the enumerator when done, as foreach would do implicitly.
    using (var iter = tuples.GetEnumerator())
    {
        // The iteration clause accumulates the current item after each pass of the body.
        for (; ; count++, sum += iter.Current.Item2)
        {
            var hasValue = iter.MoveNext();
            if (!hasValue || key != iter.Current.Item1)
            {
                // End of input or key change: the finished group is emitted in one place.
                if (key != null) result.Add(new Tuple<int, int, double>(key.Value, count, sum / count));
                if (!hasValue) break;
                sum = count = 0;
                key = iter.Current.Item1;
            }
        }
    }
    return result;
}

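In this way, the group statistics are emitted in a single place, both on a key change and at the end of the input, and the code smell is eliminated.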
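
For comparison, the same statistics can also be expressed with LINQ's `GroupBy`. This is a different technique from the two loops in this article, shown only as a sketch; the class and method names `GroupStats.WithLinq` are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class GroupStats
{
    // Groups by Item1 and returns (key, count, average of Item2) per group.
    // GroupBy does not rely on the input being sorted, but it reads the whole
    // input before yielding the first group, whereas the hand-written loops
    // make a single pass and depend on equal keys being adjacent.
    public static List<Tuple<int, int, double>> WithLinq(IEnumerable<Tuple<int, double>> tuples)
    {
        return tuples
            .GroupBy(t => t.Item1)
            .Select(g => Tuple.Create(g.Key, g.Count(), g.Average(t => t.Item2)))
            .ToList();
    }
}
```

On sorted input such as the output of GetTuples, `GroupStats.WithLinq` should produce the same groups in the same order as the loops, because `GroupBy` yields groups in the order their keys first appear.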
