This is a great problem: an application of the AC automaton to constructing a DFA transition diagram. The DFA transition diagram shows how the states move from one to another, and it can be used to count the DNA sequences of any length that end in any given state. Once the DFA matrix has been built from the automaton, we have [dp[n][0] dp[n][1] ... dp[n][m]] = [dp[1][0] dp[1][1] ... dp[1][m]] * dfa^(n-1) (m refers to the total number of states). DP boundary matrix: |dp[1
The original problem link is here: https://leetcode.com/problems/repeated-dna-sequences/ Starting at the beginning of the string, take each substring of length 10, shifting by one character each time, and put it into a HashSet; if the HashSet already contains it, add it to res, while making sure that res contains no duplicates. This method takes O(n) time, where n is the length of s. It uses a set, so the space is O(n). A quicker way exploits the fact that there are only four characters, so you can use a string of length
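The hash-set approach described above can be sketched in Python roughly as follows. This is a minimal sketch, not the original author's code: the function name `find_repeated_dna_sequences` and the parameter `k` are made up for illustration, and the result is sorted only to make the output deterministic.

```python
def find_repeated_dna_sequences(s, k=10):
    """Return every length-k substring that occurs more than once in s."""
    seen = set()      # windows seen at least once
    repeated = set()  # windows seen at least twice; a set keeps res free of duplicates
    for i in range(len(s) - k + 1):
        window = s[i:i + k]
        if window in seen:
            repeated.add(window)
        else:
            seen.add(window)
    return sorted(repeated)


print(find_repeated_dna_sequences("AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT"))
# prints ['AAAAACCCCC', 'CCCCCAAAAA']
```

Each window is hashed once, so the loop runs n - 9 times for a string of length n.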
Description: One measure of "unsortedness" in a sequence is the number of pairs of entries that are out of order with respect to each other. For instance, in the letter sequence "DAABEC", this measure is 5, since D is greater than four letters to its right and E is greater than one letter to its right. This measure is called the number of inversions in the sequence. The sequence "AACEDGG" has only one inversion (E and D)---it is nearly sorted---while the sequence "ZWQM" has 6 inversions (it is as un
We run into many obstacles when using Python for statistics, and problems involving DNA sequences are something we need to keep learning about. Below we introduce one such problem, in the hope that it will be useful when using Python for statistics in the future.
Given a set of DNA sequences, that is, strings consisting of the characters A, C, G, and T, count the occurrence frequency of all subsequences of length n. For
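A minimal sketch of this kind of counting, assuming that "subsequences of length n" means contiguous substrings (the snippet is cut off before it says so). The function name `count_kmers` is hypothetical.

```python
from collections import Counter


def count_kmers(sequences, n):
    """Count every contiguous length-n substring across all sequences."""
    counts = Counter()
    for seq in sequences:
        # Slide a window of width n over the sequence, one base at a time.
        for i in range(len(seq) - n + 1):
            counts[seq[i:i + n]] += 1
    return counts


counts = count_kmers(["ACGTAC", "ACG"], 2)
for kmer, freq in counts.most_common():
    print(kmer, freq)
```

`Counter.most_common()` lists the substrings from most to least frequent, which is usually the order wanted for a frequency report.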
Source: http://acm.pku.edu.cn/JudgeOnline/problem?id=1007
The algorithm is relatively slow: first calculate the number of inversions in each string, then insert the counts into a multimap so that they are kept in sorted order, and then output the strings.
#include
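The two steps (count the inversions of each string, then order the strings by that count) can be sketched in Python as follows. This is not the original C++ code: it swaps the multimap for Python's built-in `sorted`, which is stable, and the function names are made up for illustration.

```python
def count_inversions(seq):
    """Number of pairs (i, j) with i < j and seq[i] > seq[j]."""
    return sum(
        1
        for i in range(len(seq))
        for j in range(i + 1, len(seq))
        if seq[i] > seq[j]
    )


def sort_by_inversions(seqs):
    """Order strings from most sorted to least sorted.

    sorted() is stable, so strings with equal counts keep their input order.
    """
    return sorted(seqs, key=count_inversions)


print(count_inversions("DAABEC"))  # prints 5
print(sort_by_inversions(["ZWQM", "AACEDGG", "DAABEC"]))
# prints ['AACEDGG', 'DAABEC', 'ZWQM']
```

The quadratic count is adequate for short strings like those in the problem statement.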
Appendix:
DNA sorting
Time limit: 1000 ms
Memory limit: 10000 K
Total submissions: 44012
Accepted: 17106
Description: One measure of "unsortedness" in a seq
In bioinformatics analysis, a series of operations are often performed on DNA sequences, including extracting a subsequence and obtaining the complement, the reverse, and the reverse complement of a sequence. In Python, you can write the following functions to perform these simple operations. Subsequence extraction: Python's string slicing can be used to extract a subsequence, for examp
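The four operations named above might look like this in Python. The original article's functions are cut off, so this is only a sketch under my own naming; it assumes plain A/C/G/T input and leaves any other character unchanged.

```python
# Translation table that maps each base to its complement (both cases).
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def subsequence(seq, start, end):
    """Extract seq[start:end] using 0-based, end-exclusive slicing."""
    return seq[start:end]


def complement(seq):
    """Replace every base with its complement."""
    return seq.translate(COMPLEMENT)


def reverse(seq):
    """Reverse the sequence."""
    return seq[::-1]


def reverse_complement(seq):
    """Complement the sequence, then reverse it."""
    return complement(seq)[::-1]


print(subsequence("ATGCATGC", 2, 5))  # prints GCA
print(reverse_complement("ATGC"))     # prints GCAT
```

Biological coordinates are often 1-based and inclusive, so a caller working with such positions would pass `start - 1` as the first slice index.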
Principles of bioinformatics, part two: global alignment of DNA sequences using the Needleman–Wunsch algorithm. For the principle, see https://en.wikipedia.org/wiki/Needleman%E2%80%93Wunsch_algorithm. The Python code follows:
# -*- coding: utf-8 -*-
"""
Created on Sat 18:20:01

@author: Zxzhu
To be modified after:
1. Add Command line Parameters
2. Give a variety of comparison results
"""

import numpy as np
import pandas as pd
Sequence
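Since the article's own code is cut off, here is an independent, minimal sketch of Needleman–Wunsch global alignment. It uses plain lists instead of NumPy and pandas, and the scoring values (match +1, mismatch -1, gap -1) are assumptions chosen for illustration, not taken from the article. It returns the optimal score and one optimal alignment.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Return (score, aligned_a, aligned_b) for a global alignment of a and b."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]

    # Boundary: aligning a prefix against the empty string costs only gaps.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap

    def pair(i, j):
        # Score for aligning a[i-1] with b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    # Fill the table: each cell is the best of diagonal, up, and left moves.
    for i in range(1, rows):
        for j in range(1, cols):
            score[i][j] = max(
                score[i - 1][j - 1] + pair(i, j),
                score[i - 1][j] + gap,
                score[i][j - 1] + gap,
            )

    # Trace back from the bottom-right corner to recover one alignment.
    out_a, out_b = [], []
    i, j = len(a), len(b)
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[-1][-1], "".join(reversed(out_a)), "".join(reversed(out_b))


print(needleman_wunsch("ACGT", "AGT"))  # prints (2, 'ACGT', 'A-GT')
```

When several alignments share the best score, the traceback prefers the diagonal move, so only one of them is reported.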
"topic description" to write a program (unlimited language), can produce random DNA sequence (string) as required.DNA sequences consist of four bases (characters) of a, T, C and G. It is required to produce a DNA sequence of 100 bases in the proportion of a 10%, T 20%, C 30%, G 40%.Pay attention to the random degree of good, and meet the proportional requirements.I first: (Everyone should not look at my pro
> Analysis
>> Two steps:
1. Compute the inversion count of each sequence
2. Sort by inversion count
>> Since there are at most 100 sequences, speed is not a concern; insertion sort is enough
> Note:
>> The array that stores each sequence should have length 50+1, which makes output easier
> Code attached
#include "stdio.h"

int main(void)
{
    char seqs[100][50+1] = {0};
    int inversions[100] = {0};
    int sortseqs[100] = {0};
    int TMP = 0;
    int SeqNum = 0
#include
#include
#include <string>
#include
using namespace std;

bool cmp(string a, string b);
bool match(string a, string b);

int main(void) {
    int testsize, couple;
    string str[101];
    int len[101];
    bool used[101];

    // For each test case: (t
    cin >> testsize;
    while (testsize--) {
        // Scan and store n strings (n
        cin >> couple;
        for (int i = 0; i < couple; ++i) {
            cin >> str[i];
        }

        // Sort by length
        sort(str, str + couple, cmp);
Problem summary: given m disease gene fragments (m
Analysis: The problem requires building an AC automaton over the m disease gene fragments, and each node in the automaton represents a state. The leaf nodes in the AC automaton represent viruses, so they are illegal. Likewise, if a suffix of the string from the root to a node is a virus, then that node is also an illegal state. Eliminate all illegal states; the remaining nodes are the legal states. The node's nxt pointer is then used to
Sort by the number of inversions in ascending order. A stable sort is required. However, quicksort still gets the job done.
#include
#include
#include
using namespace std;
const int N = -;
const int M = 104;
char str[M][N];
struct point { int num, id; } p[M];
int n, m;
void getinverse(int id) {
    int i;
    int numA = 0, numC = 0, numT = 0, numG = 0;
    p[id].num = 0;
    p[id].id = id;
    for (i = 0; i ) {
        if (str[id][i] == 'A') {
            p[id].num += numC;
            p[id].num += numT;
            p[id].num += numG;
            numA++;
        }
        else if (str[id][i] == 'C') {
            p[id].n
http://acm.hdu.edu.cn/showproblem.php?pid=1560 Read the problem carefully (!) and you will find that it asks for a shortest string such that every string given in the problem is a (not necessarily contiguous) subsequence of it. Because there are only 40 characters in total, try searching with A*. 1. Storing the state: storing all 40 characters directly, with 4 possibilities for each character, is not feasible. Because only a non-contiguous subsequence is required, just remember that the current string length is the length of
I've been misreading problems a lot recently, orz
The problem asks for a string with the smallest total Hamming distance to the given strings, rather than for the best one among the given strings.
So we process the strings column by column: the character at each position of the answer should be the one that occurs most often in that column, with ties broken by the smallest ASCII value.
The code is a bit clumsy, and there are too many if statements.
//#define LOCAL
#incl
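The column-by-column rule described above can be written without a chain of if statements. The following Python sketch is an illustration under my own naming, not the author's C/C++ code, and it assumes all input strings have the same length.

```python
from collections import Counter


def consensus(sequences):
    """Return (consensus string, total Hamming distance to all sequences)."""
    result = []
    total_distance = 0
    # zip(*sequences) yields one tuple per column.
    for column in zip(*sequences):
        counts = Counter(column)
        # Most frequent character first; ties go to the smallest character.
        best = min(counts, key=lambda ch: (-counts[ch], ch))
        result.append(best)
        # Every sequence that differs from the chosen character adds 1.
        total_distance += len(column) - counts[best]
    return "".join(result), total_distance


print(consensus(["TATGATAC", "TAAGCTAC", "AAAGATCC", "TGAGATAC", "TAAGATGT"]))
# prints ('TAAGATAC', 7)
```

Choosing the most frequent character in a column minimizes the number of mismatches in that column, and the columns are independent, so the total is minimal too.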
I once wrote an article about the relationship between DNA and binary. I reread it a few days ago and it stirred up a lot of feelings. I studied biology first and later learned computer software development, and I found that information in different fields is encoded in similar ways. Nowadays, there are more and more computer science and fewer professional courses. Maybe you are more interested in the IT world. By reviewing this article, I would like to
Anyone who has seen the Winamp AVS chord curves must have been impressed by their gorgeous lighting effects.
This time I use Fireworks to draw this brightly colored DNA-pattern desktop wallpaper, mainly using Fireworks paths together with complementary combinations of components. I hope it offers something new and lets everybody discover more of Fireworks' potential!
Here are the drawing steps. Take your time, it is not too long.
1. The main technique used this time is th