All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: "ACGAATTCCG". When studying DNA, it is sometimes useful to identify repeated sequences within the DNA.
Write a function to find all the 10-letter-long sequences (substrings) that occur more than once in a DNA molecule.
For example,
Given s = "AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT", return: ["AAAAACCCCC", "CCCCCAAAAA"].
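The expected output for this example can be checked with a direct version that keeps every 10-letter window as a string. This is only a baseline sketch: the class name NaiveRepeatedDna and the method name findRepeated are made up for illustration, and holding full strings costs more memory than a compact encoding would.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical baseline for illustration; names are assumptions, not from the original.
final class NaiveRepeatedDna {
    static List<String> findRepeated(String s) {
        Set<String> seen = new HashSet<>();
        // LinkedHashSet keeps repeated windows in the order they were first detected.
        Set<String> repeated = new LinkedHashSet<>();
        for (int i = 0; i + 10 <= s.length(); i++) {
            String window = s.substring(i, i + 10);
            // add() returns false when the window was already present.
            if (!seen.add(window)) {
                repeated.add(window);
            }
        }
        return new ArrayList<>(repeated);
    }
}
```

Calling NaiveRepeatedDna.findRepeated on the string above should yield the two sequences listed in the example.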
Slide a 10-character window forward one character at a time. Storing each substring directly as a string exceeds the memory limit, so each window is converted to an int, using two bits per letter:
A - 0
C - 1
G - 2
T - 3
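The mapping above can be sketched as a small encoder that packs a window into an int, two bits per letter. The class name DnaEncoder and the methods code and encode are hypothetical, and rejecting characters outside A, C, G, T is an assumption added here for safety.

```java
// Hypothetical helper for illustration; names are assumptions, not from the original.
final class DnaEncoder {
    // Maps a nucleotide to its 2-bit code: A=0, C=1, G=2, T=3.
    static int code(char c) {
        switch (c) {
            case 'A': return 0;
            case 'C': return 1;
            case 'G': return 2;
            case 'T': return 3;
            default: throw new IllegalArgumentException("Not a nucleotide: " + c);
        }
    }

    // Packs s[start, end) into an int by shifting in two bits per letter.
    static int encode(String s, int start, int end) {
        int res = 0;
        for (int i = start; i < end; i++) {
            res = (res << 2) | code(s.charAt(i));
        }
        return res;
    }
}
```

A 10-letter window needs 20 bits, so it fits comfortably in a 32-bit int. For instance, "ACGT" packs to binary 00 01 10 11, which is 27.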
Time complexity: O(n). Space complexity: O(4^10), since there are at most 4^10 = 1,048,576 distinct 10-letter sequences.
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class Solution {
    public List<String> findRepeatedDnaSequences(String s) {
        // Maps each encoded 10-letter window to the number of times it has been seen.
        Map<Integer, Integer> map = new HashMap<>();
        List<String> result = new ArrayList<>();
        for (int i = 0; i < s.length() - 9; i++) {
            int subStr = convertToInt(s, i, i + 10);
            if (map.containsKey(subStr)) {
                // Report a sequence only on its second occurrence to avoid duplicates.
                if (map.get(subStr) == 1) {
                    result.add(s.substring(i, i + 10));
                    map.put(subStr, map.get(subStr) + 1);
                }
            } else {
                map.put(subStr, 1);
            }
        }
        return result;
    }

    // Packs s[start, end) into an int, two bits per nucleotide.
    private int convertToInt(String s, int start, int end) {
        int res = 0;
        while (start < end) {
            char c = s.charAt(start);
            int v = 0;
            switch (c) {
                case 'A': v = 0; break;
                case 'C': v = 1; break;
                case 'G': v = 2; break;
                case 'T': v = 3; break;
            }
            res = res << 2 | v;
            start++;
        }
        return res;
    }
}
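The solution above re-encodes each window from scratch, which is fine because the window length is fixed at 10. As an alternative (not the original author's method), the encoding can be updated incrementally: shift in the new letter and mask off everything above the low 20 bits. The sketch below assumes the input contains only A, C, G, and T; the class name RollingRepeatedDna and its members are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical rolling-encoding variant; names are assumptions, not from the original.
final class RollingRepeatedDna {
    private static final int WINDOW = 10;
    // Keeps only the low 20 bits, i.e. the last 10 letters.
    private static final int MASK = (1 << (2 * WINDOW)) - 1;

    static List<String> findRepeated(String s) {
        Map<Integer, Integer> counts = new HashMap<>();
        List<String> result = new ArrayList<>();
        int key = 0;
        for (int i = 0; i < s.length(); i++) {
            // Shift in the new letter and drop the letter that left the window.
            key = ((key << 2) | code(s.charAt(i))) & MASK;
            if (i < WINDOW - 1) {
                continue; // the first full window ends at index WINDOW - 1
            }
            // merge returns the updated count; report only on the second sighting.
            if (counts.merge(key, 1, Integer::sum) == 2) {
                result.add(s.substring(i - WINDOW + 1, i + 1));
            }
        }
        return result;
    }

    // Assumes c is one of A, C, G, T; anything else is rejected.
    private static int code(char c) {
        switch (c) {
            case 'A': return 0;
            case 'C': return 1;
            case 'G': return 2;
            case 'T': return 3;
            default: throw new IllegalArgumentException("Not a nucleotide: " + c);
        }
    }
}
```

Both versions run in O(n) time; the rolling form just does constant work per character instead of ten steps per window.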
187. Repeated DNA sequences