BZOJ 2468: [Zhongshan Select 2010] Trinucleotide

Description
A trinucleotide is a basic fragment of a DNA sequence. There are 4 nucleotides in total, denoted by 'A', 'G', 'C' and 'T', and a trinucleotide is a DNA fragment made of 3 consecutive nucleotides. There are 64 kinds of trinucleotides, namely 'AAA', 'AAG', ..., 'GGG'. A DNA sequence of length L contains (L-2) trinucleotides. We want to analyse them statistically, as follows:
1. Number the (L-2) trinucleotides from left to right, 1 to L-2.
2. Choose a pair from these (L-2) trinucleotides; there are (L-2)*(L-3)/2 possible pairs. If the two trinucleotides in a pair are identical, record the distance between them, defined as the difference between their numbers.
3. Treat the recorded distances as the sample data and compute their variance. The variance is s^2 = [(x1 - x)^2 + (x2 - x)^2 + ... + (xn - x)^2] / n, where x = (x1 + x2 + ... + xn) / n is the mean. If the sample size is n = 0, we take s^2 = x = 0.

For example, for the DNA sequence 'ATATATA':
1. Number the trinucleotides: 1: ATA, 2: TAT, 3: ATA, 4: TAT, 5: ATA.
2. (1,3) = 2, (1,5) = 4, (3,5) = 2, (2,4) = 2, so the sample data is 2, 4, 2, 2.
3. The sample mean is x = (2 + 4 + 2 + 2) / 4 = 2.5, and the variance is s^2 = [(2 - 2.5)^2 + (4 - 2.5)^2 + (2 - 2.5)^2 + (2 - 2.5)^2] / 4 = 0.75.

Given a DNA sequence, compute its variance.

Input
The input contains multiple test cases. The first line contains a positive integer T, the number of test cases. Each test case is a string consisting of 'A', 'G', 'C' and 'T', the DNA sequence to analyse. The length of the sequence is at least 3 and at most 100000.

Output
For each test case, output one line containing a real number with 6 digits after the decimal point, the value of s^2.
If the relative error between your answer and the reference answer is less than 1e-8, your answer is considered correct.

Sample Input
1
ATATATA
Sample Output
0.750000

Solution:
Nothing much to say here: just derive the formula by hand. One thing to watch out for: the square in the last step can overflow long long, so convert to double and do the division first, then square.
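The formula mentioned above is not written out in the post. One way to reconstruct it, consistent with the accumulators used in the code, is sketched here. The symbols c, P and Q are notation introduced for this note only: for a fixed trinucleotide, c is the number of earlier occurrences before position i, P is the sum of their positions and Q is the sum of their squared positions.

```latex
% Variance as "mean of squares minus square of the mean"
s^2 = \frac{1}{n}\sum_{k=1}^{n} x_k^2 - \left(\frac{1}{n}\sum_{k=1}^{n} x_k\right)^2

% Contribution of position i, paired with every earlier occurrence j
% of the same trinucleotide:
\sum_{j} (i - j)   = c\,i - P
\sum_{j} (i - j)^2 = c\,i^2 - 2\,i\,P + Q
```

Summing both contributions over all positions and all 64 trinucleotides gives the two totals needed for the first identity, and n is the total number of matching pairs. Each position is processed in constant time, so the whole computation is linear in L.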
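#include <cstdio>
#include <cstring>

typedef long long ll;
const int N = 100005;

char c[N];
int T, n, i, a[N], b[N];
// All tables are indexed by a trinucleotide's 3-digit code (111..444).
// cnt: occurrences so far, s2: sum of their positions, s1: sum of their squared positions,
// sum: sum of pair distances, s: sum of squared pair distances.
ll s[505], sum[505], cnt[505], s1[505], s2[505], pairs, totSq, tot;
double fans;

int main() {
    scanf("%d", &T);
    while (T--) {
        scanf("%s", c + 1);
        n = (int)strlen(c + 1);
        for (i = 111; i <= 444; i++) s[i] = sum[i] = s1[i] = s2[i] = cnt[i] = 0;
        // Map each nucleotide to a digit 1..4.
        for (i = 1; i <= n; i++) {
            if (c[i] == 'A') a[i] = 1;
            else if (c[i] == 'G') a[i] = 2;
            else if (c[i] == 'C') a[i] = 3;
            else a[i] = 4;
        }
        // Encode the trinucleotide starting at position i as a 3-digit number.
        for (i = 1; i <= n - 2; i++) b[i] = a[i] * 100 + a[i + 1] * 10 + a[i + 2];
        for (i = 1; i <= n - 2; i++) {
            // Pair position i with every earlier occurrence j of the same code:
            // sum of (i-j)^2 = cnt*i^2 - 2*i*sum(j) + sum(j^2), sum of (i-j) = cnt*i - sum(j).
            s[b[i]] += cnt[b[i]] * i * i + s1[b[i]] - s2[b[i]] * i * 2;
            sum[b[i]] += cnt[b[i]] * i - s2[b[i]];
            s1[b[i]] += (ll)i * i;
            s2[b[i]] += i;
            cnt[b[i]]++;
        }
        totSq = tot = pairs = 0;
        for (i = 111; i <= 444; i++) {
            totSq += s[i];
            tot += sum[i];
            pairs += cnt[i] * (cnt[i] - 1) / 2;
        }
        // Variance = E[x^2] - (E[x])^2; divide in double before squaring to avoid overflow.
        if (pairs == 0) fans = 0;
        else fans = 1.0 * totSq / pairs - (1.0 * tot / pairs) * (1.0 * tot / pairs);
        printf("%.6f\n", fans);
    }
    return 0;
}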
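For checking a fast solution on small inputs, a direct O(L^2) version that follows the three steps of the statement literally can be handy. The sketch below is not part of the original solution; the function name bruteVariance is made up for illustration, and it is far too slow for L = 100000.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not from the original post): O(L^2) reference
// implementation that follows the problem statement step by step.
double bruteVariance(const std::string& dna) {
    if (dna.size() < 3) return 0.0;
    const std::size_t m = dna.size() - 2;   // number of trinucleotides
    std::vector<double> sample;             // recorded distances

    // Steps 1 and 2: compare every pair of trinucleotides and record
    // the distance whenever the two are identical.
    for (std::size_t i = 0; i < m; ++i)
        for (std::size_t j = i + 1; j < m; ++j)
            if (dna.compare(i, 3, dna, j, 3) == 0)
                sample.push_back(static_cast<double>(j - i));

    // Step 3: an empty sample has variance 0 by definition.
    if (sample.empty()) return 0.0;

    double mean = 0.0;
    for (double x : sample) mean += x;
    mean /= sample.size();

    double var = 0.0;
    for (double x : sample) var += (x - mean) * (x - mean);
    return var / sample.size();
}
```

Calling bruteVariance("ATATATA") reproduces the sample: the recorded distances are 2, 4, 2, 2 and the result is 0.75. Comparing it against the linear-time program on random short strings is a cheap way to catch sign or indexing mistakes in the accumulators.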