All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: "ACGAATTCCG". When studying DNA, it is sometimes useful to identify repeated sequences within the DNA.
Write a function to find all the 10-letter-long sequences (substrings) that occur more than once in a DNA molecule.
For example,
Given s = "AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT", return: ["AAAAACCCCC", "CCCCCAAAAA"].
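Before looking at the bit-packed solution, it may help to see the example checked with a plain baseline. The sketch below is not the approach used in this post: it simply stores every 10-letter substring in a set, which is easier to read but keeps whole strings in memory. The class name RepeatedDnaBaseline and the method name findRepeated are made up for this illustration.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical baseline for illustration only: a set of substrings, no bit tricks.
class RepeatedDnaBaseline {
    static List<String> findRepeated(String s) {
        Set<String> seen = new HashSet<>();
        // LinkedHashSet keeps repeated windows in the order they are first detected.
        Set<String> repeated = new LinkedHashSet<>();
        for (int i = 0; i + 10 <= s.length(); i++) {
            String window = s.substring(i, i + 10);
            // add() returns false when the window was already seen.
            if (!seen.add(window)) {
                repeated.add(window);
            }
        }
        return new ArrayList<>(repeated);
    }
}
```

Calling RepeatedDnaBaseline.findRepeated on the example string should give the two sequences listed above, in that order.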
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class Solution {
    public List<String> findRepeatedDnaSequences(String s) {
        // There are only 4 letters, so we can build our own hash key:
        // each incoming character takes 2 bits, so 10 characters take 20 bits.
        // Once more than 10 characters have been shifted in, keep only the low 20 bits.
        Map<Character, Integer> map = new HashMap<Character, Integer>();
        map.put('A', 0);
        map.put('C', 1);
        map.put('G', 2);
        map.put('T', 3);
        List<String> res = new ArrayList<String>();
        int hash = 0;
        Set<Integer> set = new HashSet<Integer>();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (i < 9) {
                // Fewer than 10 characters so far: just build up the key.
                hash = (hash << 2) + map.get(c);
            } else {
                hash = (hash << 2) + map.get(c);
                hash &= (1 << 20) - 1; // drop the character that left the window
                if (set.contains(hash)) {
                    // Window seen before: report it once.
                    if (!res.contains(s.substring(i - 9, i + 1)))
                        res.add(s.substring(i - 9, i + 1));
                } else {
                    set.add(hash);
                }
            }
        }
        return res;
    }
}
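To make the 2-bit key more concrete, here is a small sketch that packs a 10-letter window into 20 bits and unpacks it again. It follows the same A=0, C=1, G=2, T=3 mapping as the solution above; the class DnaWindowCodec and its methods encode and decode are hypothetical helpers written for this note, not part of the original solution.

```java
// Hypothetical helper for illustration: packs and unpacks a 10-letter DNA window.
class DnaWindowCodec {
    static final int WINDOW = 10;
    // 20 one-bits: 2 bits per letter times 10 letters.
    static final int MASK = (1 << (2 * WINDOW)) - 1;

    // Same mapping as the solution: A=0, C=1, G=2, T=3.
    static int code(char c) {
        switch (c) {
            case 'A': return 0;
            case 'C': return 1;
            case 'G': return 2;
            case 'T': return 3;
            default: throw new IllegalArgumentException("not a nucleotide: " + c);
        }
    }

    // Shifts each letter in from the right; the mask keeps only the last 10 letters.
    static int encode(String letters) {
        int hash = 0;
        for (int i = 0; i < letters.length(); i++) {
            hash = ((hash << 2) | code(letters.charAt(i))) & MASK;
        }
        return hash;
    }

    // Reads the key back two bits at a time, starting from the last letter.
    static String decode(int hash) {
        char[] alphabet = {'A', 'C', 'G', 'T'};
        char[] out = new char[WINDOW];
        for (int i = WINDOW - 1; i >= 0; i--) {
            out[i] = alphabet[hash & 3];
            hash >>>= 2;
        }
        return new String(out);
    }
}
```

For instance, "AAAAACCCCC" encodes to binary 00000000000101010101, which is 341, and feeding in an 11th letter pushes the oldest one out of the 20-bit window. Because every distinct 10-letter window maps to a distinct key, the set of integers in the solution never produces false matches.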
[LeetCode 187] Repeated DNA Sequences