LeetCode: Repeated DNA Sequences

Source: Internet
Author: User

Description:

All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: "ACGAATTCCG". When studying DNA, it is sometimes useful to identify repeated sequences within the DNA.

Write a function to find all the 10-letter-long sequences (substrings) that occur more than once in a DNA molecule.

For example,

Given s = "AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT", return: ["AAAAACCCCC", "CCCCCAAAAA"].

Ideas:

1. Brute force is of course one approach, but it is not a practical one.

2. First look at the ASCII codes of the letters "A", "C", "G", and "T": 65, 67, 71, and 84, or 1000001, 1000011, 1000111, and 1010100 in binary. The lowest three bits (001, 011, 111, 100) are all different, so they are enough to tell the four letters apart. At 3 bits per letter, 10 letters need only 30 bits, which fits in an int. Characters 0 through 9 of the substring occupy bits 29 through 0 of the int, and the resulting 30-bit integer serves as the key for the substring in a hash table, which records whether that substring has already occurred.
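The encoding described in idea 2 can be sketched on its own as follows. This is a minimal illustration, not the solution itself; the class name DnaKeyDemo and the helpers code and encode are hypothetical names introduced here.

```java
public class DnaKeyDemo {
    // Low three bits of a nucleotide's ASCII code: A=1, C=3, G=7, T=4.
    static int code(char c) {
        return c & 0x7;
    }

    // Packs the characters of s into an int, 3 bits per character.
    // The 0x3fffffff mask keeps only the low 30 bits, so for inputs
    // longer than 10 characters only the last 10 contribute to the key.
    static int encode(String s) {
        int key = 0;
        for (int i = 0; i < s.length(); i++) {
            key = ((key << 3) | code(s.charAt(i))) & 0x3fffffff;
        }
        return key;
    }

    public static void main(String[] args) {
        for (char c : "ACGT".toCharArray()) {
            System.out.println(c + " -> " + code(c));
        }
        // An 11-character string yields the same key as its last 10 characters.
        System.out.println(encode("TAAAAACCCCC") == encode("AAAAACCCCC")); // prints true
    }
}
```

Because the mask discards everything above bit 29, the key can be updated one character at a time while sliding across the string, without re-encoding the whole 10-letter window.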

Code:
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;

public List<String> findRepeatedDnaSequences(String s) {
    List<String> list = new ArrayList<String>();
    int strLen = s.length();
    if (strLen <= 10) return list;
    HashMap<Integer, Integer> map = new HashMap<Integer, Integer>();
    int key = 0;
    for (int i = 0; i < strLen; i++) {
        // key << 3 shifts the key left by 3 bits to make room for the new character.
        // s.charAt(i) & 0x7 takes the low 3 bits that identify character s.charAt(i).
        // & 0x3fffffff clears the bits above bit 29 after the shift,
        // which removes the leftmost (oldest) character from the key.
        key = ((key << 3) | (s.charAt(i) & 0x7)) & 0x3fffffff;
        if (i < 9) continue; // the first full 10-letter window ends at index 9
        if (map.get(key) == null) {
            // No substring with this key has been seen yet, so add it to the map.
            map.put(key, 1);
        } else if (map.get(key) == 1) {
            // The key is already present, so the substring is a repeat: add it to the result list.
            list.add(s.substring(i - 9, i + 1));
            map.put(key, 2); // prevents adding the same string again
        }
    }
    return list;
}
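For comparison, here is a simpler variant that stores the 10-letter substrings themselves in a set instead of packing them into an int key. It is not the method used in the code above and uses more memory per entry, but it is handy for checking results on small inputs. The class name RepeatedDnaCheck and the method findRepeatedBySet are hypothetical names introduced for this sketch.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class RepeatedDnaCheck {
    // Returns every 10-letter substring that occurs more than once,
    // in the order in which each repeat is first detected.
    static List<String> findRepeatedBySet(String s) {
        Set<String> seen = new HashSet<>();
        Set<String> repeated = new LinkedHashSet<>();
        for (int i = 0; i + 10 <= s.length(); i++) {
            String window = s.substring(i, i + 10);
            // add returns false when the window was already in the set.
            if (!seen.add(window)) {
                repeated.add(window);
            }
        }
        return new ArrayList<>(repeated);
    }

    public static void main(String[] args) {
        String s = "AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT";
        System.out.println(findRepeatedBySet(s)); // prints [AAAAACCCCC, CCCCCAAAAA]
    }
}
```

Running both versions on the same input and comparing the returned lists is a quick way to confirm that the 3-bit keys never map two different substrings to the same integer.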

