- vừa được xem lúc

Repeated Dna sequence - Funny

0 0 3

Người đăng: MinhDrake

Theo Viblo Asia

Problem description

  • Given a string s repeating a DNA sequence ( For you don't know what is DNA sequence, it includes 'A', 'C', 'G', 'T'). Your mission is figuring out all the 10 letter long subsequences (substrings) that occur more than once in the sequence.

Approach

  1. Use a sliding window technique to extract all substrings of length 10 from s.
  2. Maintain a seen to store all substrings your loop has encountered.
  3. Maintain a duplicated to store all substrings that encoutered more than once.
  4. Iterate through a list (0 -> len- L + 1)
    • If it not seen, add to seen list
    • If it has been in seen list before, add to duplicated ( there are two ways to implement duplciated here, use set first or when return)
  5. Converted duplicated to set if needed

Time complexity: O(NL) (N len of str, L is len of substring) - because when you create a substring, your effort is equal to substring too.

Space complexity O(NL)

Solution

from typing import List class Solution: def findRepeatedDnaSequences(self, s: str) -> List[str]: return self.findRepeatedLenN(s, 10) def findRepeatedLenN(self, s: str, n: int) -> List[str]: seen = set() duplicated = set() for i in range(len(s) - n + 1): sub = s[i: i + n] if sub in seen: duplicated.add(sub) else: seen.add(sub) return list(duplicated) solution = Solution()
print(solution.findRepeatedDnaSequences("AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT")) 

Bình luận

Bài viết tương tự

- vừa được xem lúc

Kỹ Thuật Sliding Window Trong Lập Trình

1. Giới Thiệu Về Sliding Window.

0 0 3