Use dict keys for order-preserving dedupes instead of set + list #15105
hauntsaninja merged 8 commits into python:master from
Conversation
```diff
 # Use a dict for O(1) lookups that preserve order; values are ignored
 seen: dict[RType, Any] = {}
 for item in items:
     if item not in seen:
         new_items.append(item)
-        seen.add(item)
+        seen[item] = None
 if len(new_items) > 1:
     return RUnion(new_items)
```
This for loop could be written more simply as:
```python
seen = dict.fromkeys(items)
```
Just pushed this, I swear you read my mind :)
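For reference, `dict.fromkeys` keeps only the first occurrence of each key, and dicts preserve insertion order (guaranteed since Python 3.7), so it dedupes in one call. A toy example with strings standing in for the PR's `RType` items:

```python
items = ["int", "str", "int", "bytes", "str"]

# dict.fromkeys keeps the first occurrence of each key;
# dict keys preserve insertion order, so this is an ordered dedupe
seen = dict.fromkeys(items)
print(list(seen))  # ['int', 'str', 'bytes']
```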
```diff
-        return RUnion(new_items)
-        seen[item] = None
+if len(seen) > 1:
+    return RUnion(list(seen.keys()))
```
No need to call .keys() here, since dictionaries are perfectly iterable :)
```diff
-    return RUnion(list(seen.keys()))
+    return RUnion(list(seen))
```
Ah yep I'll blame the time
```diff
 unique_items = list(dict.fromkeys(items))
 if len(unique_items) > 1:
     return RUnion(unique_items)
 else:
-    return new_items[0]
+    return unique_items[0]
```
No idea if it actually makes a difference in terms of performance, but I feel like I mildly preferred your earlier idea of only casting it to a list if it's actually necessary, i.e.:
```python
unique_items = dict.fromkeys(items)
if len(unique_items) > 1:
    return RUnion(list(unique_items))
else:
    return next(iter(unique_items))
```
So yeah reverted to that
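As a sanity check on that shape (a hypothetical `collapse` helper with plain strings standing in for the PR's `RType` values and the deduped list standing in for `RUnion`): only the multi-item branch materializes a list, and `next(iter(...))` pulls out the single key without building one either.

```python
def collapse(items):
    # Stand-in for the PR's union-building logic: return the deduped
    # items when there is more than one unique value, else the single value
    unique_items = dict.fromkeys(items)
    if len(unique_items) > 1:
        return list(unique_items)  # only build a list when actually needed
    else:
        return next(iter(unique_items))  # first (and only) key

print(collapse(["int", "str", "int"]))  # ['int', 'str']
print(collapse(["int", "int"]))  # 'int'
```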
I figured I might be in the wrong place altogether but this was low-hanging fruit so I decided to pick it. I'll try to take a look at

I figured this might be faster and less code. Also just curious what the process is like to contribute to mypy.
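A minimal sketch of the two patterns the title contrasts (hypothetical helper names, plain ints instead of the PR's `RType` items):

```python
def dedupe_set_plus_list(items):
    # Old pattern: a set for O(1) membership checks plus a list for order
    seen = set()
    out = []
    for item in items:
        if item not in seen:
            seen.add(item)
            out.append(item)
    return out

def dedupe_dict_keys(items):
    # New pattern: dict keys act as an insertion-ordered set (Python 3.7+)
    return list(dict.fromkeys(items))

print(dedupe_dict_keys([3, 1, 3, 2, 1]))  # [3, 1, 2]
```

Both return the same result; the dict version is one line and avoids tracking two containers.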