Use dict keys for order-preserving dedupes instead of set + list by adriangb · Pull Request #15105 · python/mypy

adriangb · 2023-04-23T06:31:49Z

I figured this might be faster and less code. Also just curious what the process is like to contribute to mypy.

AlexWaygood · 2023-04-23T06:35:45Z

+        # Use a dict for O(1) lookups that preserve order; values are ignored
+        seen: dict[RType, Any] = {}
        for item in items:
            if item not in seen:
-                new_items.append(item)
-                seen.add(item)
-        if len(new_items) > 1:
-            return RUnion(new_items)
+                seen[item] = None


This for loop could be written more simply as:

seen = dict.fromkeys(items)

Just pushed this, I swear you read my mind :)

AlexWaygood · 2023-04-23T06:36:15Z

-            return RUnion(new_items)
+                seen[item] = None
+        if len(seen) > 1:
+            return RUnion(list(seen.keys()))


No need to call .keys() here, since dictionaries are perfectly iterable :)

Suggested change

return RUnion(list(seen.keys()))

return RUnion(list(seen))

Ah yep I'll blame the time

AlexWaygood · 2023-04-23T06:44:22Z

+        unique_items = list(dict.fromkeys(items))
+        if len(unique_items) > 1:
+            return RUnion(unique_items)
        else:
-            return new_items[0]
+            return unique_items[0]


No idea if it actually makes a difference in terms of performance, but I feel like I mildly preferred your earlier idea of only casting it to a list if it's actually necessary, i.e.:

unique_items = dict.fromkeys(items) if len(unique_items) > 1: return RUnion(list(unique_items)) else: return next(iter(unique_items))

It seems to indeed have a small impact:

So yeah reverted to that

hauntsaninja

This looks great, thank you!

(note the make_simplified_union I mention in #15104 is the one in typeops.py, specifically _remove_redundant_union_items. The same kind of idea in #15104 can be applied, but the truthiness logic adds an extra wrinkle that is poorly tested e.g. #15098)

adriangb · 2023-04-23T06:50:53Z

I figured I might be in the wrong place altogether but this was low-hanging fruit so I decided to pick it. I'll try to take a look at _remove_redundant_union_items tomorrow.

AlexWaygood

Looks great to me!

adriangb added 2 commits April 23, 2023 00:31

Use dict keys for order-preserving dedupes instead of set + list

caccc1e

use fromkeys

eaf250f

AlexWaygood reviewed Apr 23, 2023

View reviewed changes

adriangb added 4 commits April 23, 2023 00:36

use fromkeys

8bfb994

don't call keys

6d0dbd5

code golf

282dc44

remove test

4a498d2

AlexWaygood reviewed Apr 23, 2023

View reviewed changes

hauntsaninja approved these changes Apr 23, 2023

View reviewed changes

remove import

3bf1364

Only create a list if necessary

5ecc7ee

AlexWaygood approved these changes Apr 23, 2023

View reviewed changes

hauntsaninja merged commit 315b466 into python:master Apr 23, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Use dict keys for order-preserving dedupes instead of set + list#15105

Use dict keys for order-preserving dedupes instead of set + list#15105
hauntsaninja merged 8 commits intopython:masterfrom
adriangb:use-dict-for-duplicates

adriangb commented Apr 23, 2023 •

edited

Loading

Uh oh!

AlexWaygood Apr 23, 2023 •

edited

Loading

Uh oh!

adriangb Apr 23, 2023

Uh oh!

AlexWaygood Apr 23, 2023

Uh oh!

adriangb Apr 23, 2023

Uh oh!

Uh oh!

AlexWaygood Apr 23, 2023

Uh oh!

adriangb Apr 23, 2023

Uh oh!

adriangb Apr 23, 2023

Uh oh!

hauntsaninja left a comment •

edited

Loading

Uh oh!

adriangb commented Apr 23, 2023

Uh oh!

AlexWaygood left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Uh oh!

Conversation

adriangb commented Apr 23, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

AlexWaygood Apr 23, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

adriangb Apr 23, 2023

Choose a reason for hiding this comment

Uh oh!

AlexWaygood Apr 23, 2023

Choose a reason for hiding this comment

Uh oh!

adriangb Apr 23, 2023

Choose a reason for hiding this comment

Uh oh!

Uh oh!

AlexWaygood Apr 23, 2023

Choose a reason for hiding this comment

Uh oh!

adriangb Apr 23, 2023

Choose a reason for hiding this comment

Uh oh!

adriangb Apr 23, 2023

Choose a reason for hiding this comment

Uh oh!

hauntsaninja left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

adriangb commented Apr 23, 2023

Uh oh!

AlexWaygood left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

adriangb commented Apr 23, 2023 •

edited

Loading

AlexWaygood Apr 23, 2023 •

edited

Loading

hauntsaninja left a comment •

edited

Loading