Problem
Given a string, find the length of the longest substring without repeating characters.
Solution
Approach 1 : Brute Force
Intuition
Check every substring one by one to see whether it contains a duplicate character.
Algorithm
def lengthOfLongestSubstring(self, s):
    """
    :type s: str
    :rtype: int
    """
    if len(s) == 0:
        return 0
    result = 1
    for i in range(len(s)):
        for j in range(i + 1, len(s)):
            # Stop extending from i once s[j] repeats a character in s[i:j].
            if s[j] in s[i:j]:
                break
            else:
                result = max(result, j - i + 1)
    return result
Complexity Analysis
•Time complexity : O(n^3).
To verify that the characters in the index range [i, j) are all unique, we need to scan all of them, which costs O(j-i) time.
For a given i, the total time spent over every j in [i+1, n] is
$$\sum_{j=i+1}^{n} O(j-i)$$
Thus, the total time over all i is:
$$O\left(\sum_{i=0}^{n-1}\left(\sum_{j=i+1}^{n}(j-i)\right)\right) = O\left(\sum_{i=0}^{n-1}\frac{(1+n-i)(n-i)}{2}\right) = O(n^3)$$
•Space complexity : O(min(n,m)). We need O(k) space to check that a substring has no duplicate characters, where k is the size of the Set (in the Python code above, the slice s[i:j] plays the role of the Set). The size of the Set is upper bounded by the size of the string n and the size of the charset/alphabet m.
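As a sanity check on the double sum in the time analysis, the sketch below evaluates it directly for a few small n and compares it with the closed form n(n+1)(n+2)/6, which is cubic in n. The function names `scan_cost` and `closed_form` are made up for this illustration and are not part of the original solution.

```python
def scan_cost(n):
    """Sum of (j - i) over 0 <= i < n and i + 1 <= j <= n."""
    return sum(j - i for i in range(n) for j in range(i + 1, n + 1))


def closed_form(n):
    """n(n+1)(n+2)/6, the value the double sum simplifies to."""
    return n * (n + 1) * (n + 2) // 6


# The two agree for every n tried here, so the cost grows like n^3.
for n in (1, 2, 5, 10, 50):
    assert scan_cost(n) == closed_form(n)
```

This counts the worst case, where every substring is scanned in full. The code above breaks out of the inner loop at the first repeat, so it often does less work in practice.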
Approach 2 : Sliding Window
Algorithm
The naive approach is very straightforward. But it is too slow. So how can we optimize it?
In the naive approach, we repeatedly check a substring to see if it has a duplicate character. But that is unnecessary. If a substring s[i:j], from index i to j-1, has already been checked to have no duplicate characters, we only need to check whether s[j] is already in the substring s[i:j].
To check whether a character is already in the substring, we can scan the substring, which leads to an O(n^2) algorithm. But we can do better.
By using a HashSet as a sliding window, checking whether a character is in the current window can be done in O(1).
A sliding window is an abstract concept commonly used in array/string problems. A window is a range of elements in the array/string, usually defined by its start and end indices, i.e. [i, j). A sliding window is a window that "slides" its two boundaries in a certain direction. For example, if we slide [i, j) to the right by 1 element, it becomes [i+1, j+1).
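A minimal sketch of that sliding step, representing a half-open window [i, j) as a tuple. The helper `slide_right` and the sample string are hypothetical and only serve this illustration.

```python
def slide_right(window, steps=1):
    """Shift a half-open window (i, j) to the right by `steps` elements."""
    i, j = window
    return (i + steps, j + steps)


s = "abcdef"
window = (0, 3)                # covers s[0:3] == "abc"
moved = slide_right(window)    # (1, 4), covers s[1:4] == "bcd"
print(s[window[0]:window[1]], "->", s[moved[0]:moved[1]])  # abc -> bcd
```

In the problem at hand the two boundaries do not always move together: j advances while the window stays duplicate-free, and i advances only when a duplicate has to be dropped.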
Back to our problem. We use a HashSet to store the characters in the current window [i, j). Then we slide the index j to the right: if s[j] is not in the HashSet, we add it and slide j further, and we keep doing so until s[j] is already in the HashSet. At this point, we have found the longest substring without duplicate characters that starts at index i. If we do this for all i, we get our answer.
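def lengthOfLongestSubstring(self, s):
    n = len(s)
    hashSet = set()  # characters in the current window [i, j)
    ans, i, j = 0, 0, 0
    while i < n and j < n:
        if s[j] not in hashSet:
            # Extend the window to the right.
            hashSet.add(s[j])
            j += 1
            ans = max(ans, j - i)
        else:
            # Shrink the window from the left until s[j] fits.
            hashSet.remove(s[i])
            i += 1
    return ans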
Complexity Analysis
•Time complexity : O(2n) = O(n). In the worst case each character is visited twice, once by j and once by i.
•Space complexity : O(min(m, n)). Same as the previous approach. We need O(k) space for the sliding window, where k is the size of the Set. The size of the Set is upper bounded by the size of the string n and the size of the charset/alphabet m.
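The 2n bound can be checked empirically. The sketch below is an instrumented copy of the sliding-window loop that also counts iterations. The function name `longest_with_step_count` and the sample strings are assumptions made for this illustration.

```python
def longest_with_step_count(s):
    """Sliding-window scan that also counts loop iterations."""
    seen = set()
    best = i = j = steps = 0
    n = len(s)
    while i < n and j < n:
        steps += 1
        if s[j] not in seen:
            seen.add(s[j])
            j += 1
            best = max(best, j - i)
        else:
            seen.remove(s[i])
            i += 1
    return best, steps


# Every iteration advances either i or j by one, so steps never exceeds 2n.
for text in ("abcabcbb", "bbbbb", "pwwkew", ""):
    best, steps = longest_with_step_count(text)
    assert steps <= 2 * len(text)
    print(repr(text), best, steps)
```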
Approach 3 : Sliding Window Optimized
The above solution requires at most 2n steps. In fact, it can be optimized to require only n steps. Instead of using a set to tell whether a character exists, we can define a mapping from each character to its most recent index. Then we can skip characters immediately when we find a repeated character.
The reason is that if s[j] has a duplicate in the range [i, j) at index j', we don't need to increase i little by little. We can skip all the elements in the range [i, j'] and set i to j' + 1 directly.
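def lengthOfLongestSubstring(self, s):
    dic = dict()  # character -> index of its most recent occurrence
    start, ans = 0, 0
    for i, ch in enumerate(s):
        if ch in dic and start <= dic[ch]:
            # ch repeats inside the window: jump start past the earlier occurrence.
            start = dic[ch] + 1
        else:
            ans = max(ans, i - start + 1)
        dic[ch] = i
    return ans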
Complexity Analysis
• Time complexity : O(n). The index j (the loop variable i in the code above) iterates n times.
• Space complexity : O(min(m, n)). Same as the previous approach.
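To see the jump of Approach 3 in action, the sketch below records where the window starts after each character is processed. It is a variant written for illustration rather than the exact code above: the name `trace_window_starts` is hypothetical, and it updates the best length on every step, which gives the same result.

```python
def trace_window_starts(s):
    """Return the best length and the window start after each character."""
    last_index = {}            # character -> most recent index seen
    start = best = 0
    starts = []
    for j, ch in enumerate(s):
        prev = last_index.get(ch)
        if prev is not None and prev >= start:
            start = prev + 1   # jump past the earlier occurrence at j'
        best = max(best, j - start + 1)
        last_index[ch] = j
        starts.append(start)
    return best, starts


best, starts = trace_window_starts("abcdbe")
print(best, starts)  # 4 [0, 0, 0, 0, 2, 2]
```

When the second "b" arrives at index 4, the start moves from 0 straight to 2 in a single step. The check `prev >= start` matters for inputs such as "abba", where the earlier "a" already lies outside the window and must not pull the start backwards.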
The better way
Source : https://leetcode.com/problems/longest-substring-without-repeating-characters/