LeetCode: 3 Longest Substring Without Repeating Characters

Given a string s, find the length of the longest substring without repeating characters.

Example:

Input: s = "abcabcbb"
Output: 3
Explanation: The answer is "abc", with the length of 3.

Example:

Input: s = "bbbbb"
Output: 1
Explanation: The answer is "b", with the length of 1.

问题

给定一个字符串 s,找到没有重复字符的最长子字符串的长度。

解决方案

Approach 1: Sliding Window / 滑动窗口法

This solution uses a sliding window approach to maintain a substring without repeating characters and dynamically adjust its length as we iterate through the string.

该解决方案使用滑动窗口的方法来维护一个没有重复字符的子字符串,并在遍历字符串时动态调整其长度。

Implementation / 实现

class Solution:
    def lengthOfLongestSubstring(self, s: str) -> int:
        char_set = set()
        left = 0
        max_length = 0

        for right in range(len(s)):
            while s[right] in char_set:
                char_set.remove(s[left])
                left += 1
            char_set.add(s[right])
            max_length = max(max_length, right - left + 1)

        return max_length

Explanation / 解释

  1. Initialize Data Structures / 初始化数据结构

    char_set = set()
    left = 0
    max_length = 0
    • English: We initialize a set char_set to store characters in the current substring, a left pointer to indicate the start of the window, and max_length to store the maximum length found.
    • Chinese: 我们初始化一个集合 char_set 来存储当前子字符串中的字符,left 指针表示窗口的起始位置,max_length 用于存储找到的最大长度。
  2. Traverse the String / 遍历字符串

    for right in range(len(s)):
    • English: We use a right pointer to iterate through the string.
    • Chinese: 我们使用 right 指针遍历字符串。
  3. Adjust the Window / 调整窗口

    while s[right] in char_set:
       char_set.remove(s[left])
       left += 1
    • English: If the character at the right pointer is already in the set, we remove characters from the set starting from the left pointer until the character is unique.
    • Chinese: 如果 right 指针指向的字符已经在集合中,我们从 left 指针开始移除集合中的字符,直到该字符是唯一的。
  4. Expand the Window / 扩展窗口

    char_set.add(s[right])
    max_length = max(max_length, right - left + 1)
    • English: We add the current character to the set and update max_length to the maximum length found so far.
    • Chinese: 我们将当前字符添加到集合中,并更新 max_length 为迄今为止找到的最大长度。
  5. Return the Result / 返回结果

    return max_length
    • English: We return the length of the longest substring without repeating characters.
    • Chinese: 我们返回没有重复字符的最长子字符串的长度。

Approach 2: HashMap-based Sliding Window / 基于哈希映射的滑动窗口

This approach enhances the basic sliding window method by using a HashMap (or dictionary) to keep track of the last seen position of each character. This allows us to efficiently adjust the window without needing to manually remove characters.

这种方法通过使用哈希映射(或字典)来跟踪每个字符最后出现的位置,从而增强了基本的滑动窗口方法。这使我们能够高效地调整窗口,而无需手动删除字符。

Implementation / 实现

class Solution:
    def lengthOfLongestSubstring(self, s: str) -> int:
        seen = {}
        l = 0
        length = 0
        for r in range(len(s)):
            c = s[r]
            if c in seen and seen[c] >= l:
                l = seen[c] + 1
            else:
                length = max(length, r - l + 1 )
            seen[c] = r

        return length

Explanation / 解释

  1. Initialize Data Structures / 初始化数据结构

    seen = {}
    l = 0
    length = 0
    • English: We initialize a dictionary seen to store the last index of each character, a left pointer (l) to indicate the start of the window, and length to store the maximum length found.
    • Chinese: 我们初始化一个字典 seen 来存储每个字符的最后一个索引,left 指针 (l) 表示窗口的起始位置,length 用于存储找到的最大长度。
  2. Traverse the String / 遍历字符串

    for r in range(len(s)):
       c = s[r]
    • English: We use a right pointer (r) to iterate through the string and c to represent the current character.
    • Chinese: 我们使用 right 指针 (r) 来遍历字符串,并用 c 表示当前字符。
  3. Check for Repeating Characters / 检查重复字符

    if c in seen and seen[c] >= l:
       l = seen[c] + 1
    • English: If the current character c has been seen before and its last seen index is within the current window, we move the left pointer to the right of the last occurrence.
    • Chinese: 如果当前字符 c 之前出现过并且其最后一次出现的索引在当前窗口内,我们将 left 指针移动到最后一次出现的位置的右侧。
  4. Update Maximum Length / 更新最大长度

    else:
       length = max(length, r - l + 1 )
    • English: If the character is not a repeat within the current window, we update length to the maximum length found so far.
    • Chinese: 如果该字符在当前窗口内不是重复的,我们将 length 更新为迄今为止找到的最大长度。
  5. Store the Character Index / 存储字符索引

    seen[c] = r
    • English: We store the current index of the character c in the seen dictionary.
    • Chinese: 我们将字符 c 的当前索引存储在 seen 字典中。
  6. Return the Result / 返回结果

    return length
    • English: We return the length of the longest substring without repeating characters.
    • Chinese: 我们返回没有重复字符的最长子字符串的长度。

Approach 3: Recursive Approach / 递归方法

While this problem is typically solved using iterative methods like sliding windows, a recursive approach can also be devised, although it is less efficient and more complex. The idea is to recursively explore each substring, checking for repeating characters.

尽管这个问题通常通过滑动窗口等迭代方法来解决,但也可以设计一种递归方法,尽管它效率较低且更复杂。其思路是递归地探索每个子字符串,检查是否有重复字符。

Implementation / 实现

def lengthOfLongestSubstringRec(s, start, seen, max_length):
    if start == len(s):
        return max_length

    current_char = s[start]
    if current_char in seen:
        return max(max_length, len(seen))

    seen.add(current_char)
    return max(max_length, lengthOfLongestSubstringRec(s, start + 1, seen, max_length))

class Solution:
    def lengthOfLongestSubstring(self, s: str) -> int:
        max_length = 0
        for i in range(len(s)):
            max_length = max(max_length, lengthOfLongestSubstringRec(s, i, set(), 0))
        return max_length

Explanation / 解释

  1. Recursive Function / 递归函数

    def lengthOfLongestSubstringRec(s, start, seen, max_length):
    • English: The recursive function explores substrings starting from the index start, using a set seen to track characters and max_length to store the maximum length found.
    • Chinese: 递归函数从索引 start 开始探索子字符串,使用集合 seen 跟踪字符,并用 max_length 存储找到的最大长度。
  2. Base Case / **

基本情况**

   if start == len(s):
       return max_length
  • English: If the start index reaches the end of the string, return the maximum length found.
  • Chinese: 如果起始索引达到字符串末尾,则返回找到的最大长度。
  1. Check for Repeating Characters / 检查重复字符

    current_char = s[start]
    if current_char in seen:
       return max(max_length, len(seen))
    • English: If the current character is in the seen set, return the maximum length between max_length and the length of seen.
    • Chinese: 如果当前字符在 seen 集合中,则返回 max_lengthseen 长度之间的最大值。
  2. Add Character to Set / 将字符添加到集合

    seen.add(current_char)
    return max(max_length, lengthOfLongestSubstringRec(s, start + 1, seen, max_length))
    • English: Add the current character to the seen set and recursively call the function for the next index.
    • Chinese: 将当前字符添加到 seen 集合中,并递归调用函数处理下一个索引。
  3. Iterate Over All Starting Points / 遍历所有起始点

    for i in range(len(s)):
       max_length = max(max_length, lengthOfLongestSubstringRec(s, i, set(), 0))
    • English: The main function iterates over all possible starting points to find the overall maximum length.
    • Chinese: 主函数遍历所有可能的起始点,以找到总体最大长度。

Complexity Analysis / 复杂度分析

  • Time Complexity / 时间复杂度: O(n^2), where n is the length of the string s. The recursive approach checks all substrings, making it less efficient.
  • Space Complexity / 空间复杂度: O(n), due to the recursion stack and the set used to track characters.

Key Concept / 关键概念

  • Recursive Exploration / 递归探索: This approach is more theoretical and demonstrates the use of recursion in exploring substrings, though it’s not the most efficient method for this problem.
  • 递归探索: 这种方法更具理论性,展示了在探索子字符串时使用递归的方式,尽管这不是解决这个问题的最有效方法。

Summary / 总结

  • English: The recursive approach offers a different perspective on solving the problem but is less practical compared to iterative methods like sliding windows.
  • Chinese: 递归方法提供了解决问题的不同视角,但与滑动窗口等迭代方法相比,实用性较差。

Tips / 提示

  • English: Practice recognizing when recursive methods are applicable and compare them with iterative approaches for different types of problems.
  • Chinese: 练习识别递归方法适用的情况,并将其与迭代方法进行比较,以应对不同类型的问题。

Complexity Analysis for All Approaches / 所有方法的复杂度分析

  1. Sliding Window / 滑动窗口

    • Time Complexity / 时间复杂度: O(n)
    • Space Complexity / 空间复杂度: O(min(n, m))
  2. HashMap-based Sliding Window / 基于哈希映射的滑动窗口

    • Time Complexity / 时间复杂度: O(n)
    • Space Complexity / 空间复杂度: O(min(n, m))
  3. Recursive Approach / 递归方法

    • Time Complexity / 时间复杂度: O(n^2)
    • Space Complexity / 空间复杂度: O(n)

Solution Template for Similar Questions / 提示

  • English: For substring or subarray problems, consider sliding window techniques first, and explore recursive methods if necessary.
  • Chinese: 对于子字符串或子数组问题,首先考虑滑动窗口技术,并在必要时探索递归方法。

5 More Similar Questions / 推荐5问题

  1. LeetCode 159. Longest Substring with At Most Two Distinct Characters
  2. LeetCode 340. Longest Substring with At Most K Distinct Characters
  3. LeetCode 76. Minimum Window Substring
  4. LeetCode 424. Longest Repeating Character Replacement
  5. LeetCode 30. Substring with Concatenation of All Words

Recommended Resources / 推荐资源

  • English: Practice problems involving substrings and sliding window techniques to improve problem-solving skills.
  • Chinese: 练习涉及子字符串和滑动窗口技术的问题,以提高问题解决能力。

LeetCode #3: Longest Substring Without Repeating Characters by Algo Engine

Comments

4 responses to “LeetCode: 3 Longest Substring Without Repeating Characters”

  1. admin Avatar

    In Python, a set is an unordered collection of unique elements. The `remove()` function for sets doesn’t have a concept of order when deleting elements. It simply removes the specified element from the set, regardless of when it was added or where it might be conceptually located.

    In the context of this specific code:

    “`python
    while s[right] in char_set:
    char_set.remove(s[left])
    left += 1
    “`

    The `remove()` function is not deleting elements in any particular order. Instead, it’s removing the specific element `s[left]` from the set. The order in which elements are removed is determined by the `left` pointer, which is incremented after each removal.

    So while the removals may appear to be “in order” from the perspective of the string `s`, this is due to how the `left` pointer is managed, not because of any inherent ordering in the set or its `remove()` function.

    To summarize:
    1. Python sets are unordered.
    2. The `remove()` function doesn’t delete elements “in order”.
    3. The apparent order in this code comes from how the `left` pointer is used, not from the set itself.

  2. admin Avatar

    当然,我来用中文解释这个问题:

    在Python中,集合(set)是一个无序的唯一元素集合。集合的`remove()`函数在删除元素时没有顺序的概念。它只是简单地从集合中移除指定的元素,不考虑该元素是何时添加的或者在概念上可能位于何处。

    在这段特定代码的上下文中:

    “`python
    while s[right] in char_set:
    char_set.remove(s[left])
    left += 1
    “`

    `remove()`函数并不是按照任何特定顺序删除元素。相反,它是在移除集合中的特定元素`s[left]`。元素被移除的顺序是由`left`指针决定的,该指针在每次移除后递增。

    所以虽然从字符串`s`的角度来看,这些移除操作可能看起来是”有序的”,但这是由于`left`指针的管理方式,而不是因为集合或其`remove()`函数有任何固有的顺序。

    总结一下:
    1. Python的集合是无序的。
    2. `remove()`函数不会”按顺序”删除元素。
    3. 这段代码中看似有序的删除操作是由`left`指针的使用方式决定的,而不是由集合本身决定的。

  3. admin Avatar

    上述代码没有问题。虽然 Python 的集合是无序的,但这段代码的逻辑是正确的。让我解释一下为什么:

    1. 代码的目的:这段代码是为了找到最长的不含重复字符的子串。

    2. 滑动窗口的使用:代码使用了滑动窗口技术,通过 `left` 和 `right` 指针来定义当前的子串。

    3. 集合的作用:`char_set` 用于快速检查字符是否重复,而不是为了保持字符的顺序。

    4. 删除操作的逻辑:
    – 当发现重复字符时,代码从左侧开始移除字符,直到重复被消除。
    – 虽然 `remove()` 操作本身不保证顺序,但 `left` 指针确保了我们总是按照字符串中字符出现的顺序来移除。

    5. 正确性:这种方法确保了我们总是维护一个不含重复字符的窗口,并正确计算其长度。

    所以,尽管 Python 的集合是无序的,但在这个算法中,元素被添加和删除的顺序是由字符串和指针控制的,而不是依赖于集合的顺序。因此,这个代码是正确的,没有问题。

  4. admin Avatar

    当然,我会重新详细解释”pwwkew”这个例子。让我们一步步来看:

    1. 初始状态:
    – 字符串:p w w k e w
    – 指针: l
    r
    – char_set: {p}

    2. 右指针移动到第一个’w’:
    – 字符串:p w w k e w
    – 指针: l
    r
    – char_set: {p, w}

    3. 右指针移动到第二个’w’(发现重复):
    – 字符串:p w w k e w
    – 指针: l
    r
    – 此时’w’重复,开始从左侧移除字符
    – char_set.remove(‘p’)
    – left指针右移

    4. 移除’p’后的状态:
    – 字符串:p w w k e w
    – 指针: l
    r
    – char_set: {w}
    – 注意:’w’仍然重复,继续移除

    5. 移除第一个’w’:
    – 字符串:p w w k e w
    – 指针: l
    r
    – char_set: {}
    – 添加当前的’w’到char_set

    6. 移除后的最终状态:
    – 字符串:p w w k e w
    – 指针: l
    r
    – char_set: {w}

    7. 继续向右移动,添加’k’:
    – 字符串:p w w k e w
    – 指针: l
    r
    – char_set: {w, k}

    8. 添加’e’:
    – 字符串:p w w k e w
    – 指针: l
    r
    – char_set: {w, k, e}

    9. 移动到最后一个’w’(再次发现重复):
    – 字符串:p w w k e w
    – 指针: l
    r
    – 开始从左侧移除字符,直到没有重复
    – 移除’w’,left指针移动到’k’
    – 最终状态:char_set: {k, e, w}

    10. 算法结束,最长无重复子串为”wke”,长度为3

    这个例子展示了如何处理多次出现的重复字符。尽管set是无序的,但通过left指针的移动,我们能够按照字符在原字符串中的顺序来处理字符,确保了算法的正确性。

Leave a Reply

Your email address will not be published. Required fields are marked *