# [Problem 3714: Longest Balanced Substring II](https://leetcode.com/problems/longest-balanced-substring-ii/description/?envType=daily-question)

## Initial thoughts (stream-of-consciousness)
Summary of the high-level idea:
- Because the alphabet is only {'a','b','c'}, consider each of the 7 non-empty subsets of characters as the candidate set of distinct characters in a balanced substring.
- For a given subset T, a valid substring must contain only characters from T, and each character in T must appear the same nonzero number of times.
- Split the string into segments that contain only characters from T (characters not in T act as separators). Within each segment, use prefix-difference hashing:
  - For |T|=1: take the longest run of that character.
  - For |T|=2: map one character to +1 and the other to -1, then find the longest subarray with sum 0 (prefix sum with a first-seen map).
  - For |T|=3: track two differences (count_a - count_b, count_a - count_c) and find the longest subarray whose two endpoints share the same pair.

This yields one O(n) scan per subset, so O(7n) = O(n) time overall and O(n) space in the worst case.
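
The two-character case can be sketched in isolation as follows. This is a minimal illustration, not the submitted solution: the helper name `longest_zero_sum_two_chars` and its parameters `x` and `y` are hypothetical, and it folds the segment split into a single pass by resetting the map whenever a separator is hit.

```python
from typing import Dict


def longest_zero_sum_two_chars(s: str, x: str, y: str) -> int:
    """Longest substring of s made only of x and y with equal counts of both.

    Hypothetical helper for illustration; any other character is a separator.
    """
    best = 0
    prefix = 0  # (#x - #y) since the start of the current segment
    first_seen: Dict[int, int] = {0: -1}  # prefix value -> earliest index
    for i, ch in enumerate(s):
        if ch == x:
            prefix += 1
        elif ch == y:
            prefix -= 1
        else:
            # Separator: the next segment starts right after index i.
            prefix = 0
            first_seen = {0: i}
            continue
        if prefix in first_seen:
            # Same prefix value seen before: the span in between sums to 0.
            best = max(best, i - first_seen[prefix])
        else:
            first_seen[prefix] = i
    return best


if __name__ == "__main__":
    print(longest_zero_sum_two_chars("abba", "a", "b"))
    print(longest_zero_sum_two_chars("aabcab", "a", "b"))
```

Storing only the earliest index for each prefix value is what makes the reported span the longest one ending at `i`.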

## Refining the problem, round 2 thoughts
Refinements and edge considerations:
- A substring with a single distinct character is always balanced, since its only distinct character trivially appears the same number of times as itself.
- For a two-character subset, equal counts reduce to a net difference of zero; a prefix sum with a hashmap of first occurrences finds the maximum length efficiently.
- For the three-character subset, two independent differences fully characterize equality of all three counts; use the pair as a 2D key in a hashmap.
- The prefix bookkeeping must be reset whenever we hit a character outside the current subset (a segment boundary).
- Complexity: we scan s once for each of the 7 subsets, so time is O(7n) = O(n). Space is O(n) in the worst case for the hashmaps used inside segments.
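
To make the 2D key concrete, here is a small sketch of the three-character case on its own. The function name `longest_equal_abc` is hypothetical, and the sketch assumes the input contains only 'a', 'b', and 'c', so no separators can occur.

```python
from typing import Dict, Tuple


def longest_equal_abc(s: str) -> int:
    """Longest substring with equal counts of 'a', 'b' and 'c'.

    Illustrative sketch; assumes s consists only of 'a', 'b', 'c'.
    """
    best = 0
    ca = cb = cc = 0
    # Key (ca - cb, ca - cc) -> earliest index where that pair was reached.
    first_seen: Dict[Tuple[int, int], int] = {(0, 0): -1}
    for i, ch in enumerate(s):
        if ch == 'a':
            ca += 1
        elif ch == 'b':
            cb += 1
        else:
            cc += 1
        key = (ca - cb, ca - cc)
        if key in first_seen:
            # Both differences unchanged, so the span added the same
            # number of 'a', 'b' and 'c'.
            best = max(best, i - first_seen[key])
        else:
            first_seen[key] = i
    return best


if __name__ == "__main__":
    print(longest_equal_abc("aabcbc"))
    print(longest_equal_abc("abcab"))
```

If the pair is identical at two indices, the substring between them added equal amounts to all three counters; since the substring is non-empty, each count is at least 1.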

## Attempted solution(s)
```python
from typing import Dict, Tuple


class Solution:
    def longestBalanced(self, s: str) -> int:
        n = len(s)
        if n == 0:
            return 0
        chars = ['a', 'b', 'c']
        ans = 1  # a single character is always balanced when s is non-empty

        # Iterate over all non-empty subsets of {'a','b','c'} via bitmasks 1..7.
        for mask in range(1, 1 << 3):
            T = {chars[i] for i in range(3) if (mask >> i) & 1}
            d = len(T)
            if d == 1:
                # Longest run of the single character.
                target = next(iter(T))
                cur = 0
                for ch in s:
                    if ch == target:
                        cur += 1
                        if cur > ans:
                            ans = cur
                    else:
                        cur = 0
                continue

            # For d == 2 or d == 3, process maximal segments made only of chars in T.
            i = 0
            while i < n:
                # Skip characters outside T; they act as separators.
                if s[i] not in T:
                    i += 1
                    continue
                j = i
                while j < n and s[j] in T:
                    j += 1
                seg = s[i:j]

                if d == 2:
                    # Deterministic order: `a` counts as +1, the other char as -1.
                    a, _ = sorted(T)
                    prefix = 0
                    first_seen: Dict[int, int] = {0: -1}
                    for idx, ch in enumerate(seg):
                        if ch == a:
                            prefix += 1
                        else:
                            prefix -= 1
                        if prefix in first_seen:
                            length = idx - first_seen[prefix]
                            if length > ans:
                                ans = length
                        else:
                            first_seen[prefix] = idx
                else:  # d == 3
                    a, b, c = sorted(T)
                    ca = cb = cc = 0
                    # Separate name so the two maps keep distinct type annotations.
                    first_pair: Dict[Tuple[int, int], int] = {(0, 0): -1}
                    for idx, ch in enumerate(seg):
                        if ch == a:
                            ca += 1
                        elif ch == b:
                            cb += 1
                        else:
                            cc += 1
                        key = (ca - cb, ca - cc)
                        if key in first_pair:
                            length = idx - first_pair[key]
                            if length > ans:
                                ans = length
                        else:
                            first_pair[key] = idx

                i = j

        return ans
```
- Notes:
  - We consider all 7 non-empty subsets of {'a','b','c'}. For each subset we only allow segments composed solely of those characters (any other character breaks the segment).
  - For one character: the longest run is the answer for that subset.
  - For two characters: convert to a +1/-1 prefix sum and track the earliest occurrence of each prefix value to get the maximum zero-sum subarray length.
  - For three characters: track two differences (count_a - count_b, count_a - count_c); a repeated pair of differences means the subarray between the two positions has equal per-character counts.
  - Time complexity: O(7 * n) = O(n). Space: O(n) in the worst case for the hashmaps used per segment.
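
A quadratic brute force is handy for cross-checking the linear solution on short inputs. The sketch below is an assumption-labeled test aid, not part of the original write-up: the name `brute_longest_balanced` is hypothetical, and it checks the balanced condition directly by comparing the counts of the characters present in each substring.

```python
from collections import Counter


def brute_longest_balanced(s: str) -> int:
    """O(n^2) reference: longest substring whose distinct chars all have equal counts.

    Hypothetical checker intended only for validating faster solutions.
    """
    best = 0
    for i in range(len(s)):
        counts: Counter = Counter()
        for j in range(i, len(s)):
            counts[s[j]] += 1
            # Balanced iff every character present has the same count.
            if len(set(counts.values())) == 1:
                best = max(best, j - i + 1)
    return best


if __name__ == "__main__":
    for sample in ["abbac", "aabcc", "aba", "aaaa"]:
        print(sample, brute_longest_balanced(sample))
```

Comparing this against `Solution().longestBalanced` on random short strings over {'a','b','c'} is one way to gain confidence in the subset-based approach.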