Fix large message read performance by enforcing max read_buffer_size read chunks #496

Merged
daniel-abramov merged 3 commits into snapview:master from alexheretic:fix-large-message-read-perf
May 23, 2025

Conversation

@alexheretic (Contributor) commented May 16, 2025

A non-breaking change that uses read_buffer_size as a maximum read chunk size, so a large message is read in chunks of at most 128 KiB by default. This significantly improves the performance of large messages.

See #493 (comment)

Resolves #493
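The chunking idea can be sketched with a plain `Read` loop (a minimal illustration of the approach, not the actual patch; `read_in_chunks` and its signature are hypothetical):

```rust
use std::io::{Cursor, Read};

/// Read `len` payload bytes from `reader`, but never ask the underlying
/// `read` for more than `max_chunk` bytes at a time. This mirrors the idea
/// of capping each read at `read_buffer_size` (128 KiB by default) instead
/// of growing one huge read buffer for the whole message.
fn read_in_chunks<R: Read>(
    reader: &mut R,
    len: usize,
    max_chunk: usize,
) -> std::io::Result<Vec<u8>> {
    let mut buf = Vec::with_capacity(len.min(max_chunk));
    let mut chunk = vec![0u8; max_chunk];
    while buf.len() < len {
        // Only request what is still missing, capped at max_chunk.
        let want = (len - buf.len()).min(max_chunk);
        let n = reader.read(&mut chunk[..want])?;
        if n == 0 {
            return Err(std::io::Error::new(
                std::io::ErrorKind::UnexpectedEof,
                "eof within message",
            ));
        }
        buf.extend_from_slice(&chunk[..n]);
    }
    Ok(buf)
}

fn main() {
    // 1 MiB "message" read through a 128 KiB cap: 8 read calls at most.
    let data = vec![7u8; 1024 * 1024];
    let mut cursor = Cursor::new(data.clone());
    let out = read_in_chunks(&mut cursor, data.len(), 128 * 1024).unwrap();
    assert_eq!(out, data);
    println!("read {} bytes in <=128KiB chunks", out.len());
}
```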

Benchmarks

Using the benches provided in #493, this fix addresses the regression and in fact yields better performance than 0.24 at all message sizes.

# 0.24
>>> 1.0 MB took ~1.993457ms with speed ~501.6 MB/s
>>> 3.0 MB took ~4.074073ms with speed ~736.4 MB/s
>>> 6.0 MB took ~5.40309ms with speed ~1.1 GB/s
>>> 10.0 MB took ~7.491196ms with speed ~1.3 GB/s
>>> 20.0 MB took ~16.038946ms with speed ~1.2 GB/s
>>> 30.0 MB took ~24.521725ms with speed ~1.2 GB/s
>>> 40.0 MB took ~37.509191ms with speed ~1.0 GB/s
>>> 50.0 MB took ~45.244393ms with speed ~1.1 GB/s
>>> 60.0 MB took ~53.531487ms with speed ~1.1 GB/s
>>> 100.0 MB took ~102.38638ms with speed ~976.7 MB/s
>>> 1000.0 MB took ~3.816925448s with speed ~262.0 MB/s

# 0.26.2
>>> 1.0 MB took ~1.402537ms with speed ~713.0 MB/s
>>> 3.0 MB took ~2.322722ms with speed ~1.3 GB/s
>>> 6.0 MB took ~3.837006ms with speed ~1.5 GB/s
>>> 10.0 MB took ~3.960994ms with speed ~2.5 GB/s
>>> 20.0 MB took ~31.777352ms with speed ~629.4 MB/s
>>> 30.0 MB took ~72.20983ms with speed ~415.5 MB/s
>>> 40.0 MB took ~95.672327ms with speed ~418.1 MB/s
>>> 50.0 MB took ~177.217286ms with speed ~282.1 MB/s
>>> 60.0 MB took ~261.481951ms with speed ~229.5 MB/s
>>> 100.0 MB took ~647.759606ms with speed ~154.4 MB/s
>>> 1000.0 MB took ~45.621549647s with speed ~21.9 MB/s

# PR
>>> 1.0 MB took ~550.883µs with speed ~1.8 GB/s
>>> 3.0 MB took ~1.555067ms with speed ~1.9 GB/s
>>> 6.0 MB took ~1.648007ms with speed ~3.6 GB/s
>>> 10.0 MB took ~2.242798ms with speed ~4.4 GB/s
>>> 20.0 MB took ~5.894016ms with speed ~3.3 GB/s
>>> 30.0 MB took ~11.865491ms with speed ~2.5 GB/s
>>> 40.0 MB took ~12.758398ms with speed ~3.1 GB/s
>>> 50.0 MB took ~21.80129ms with speed ~2.2 GB/s
>>> 60.0 MB took ~25.910093ms with speed ~2.3 GB/s
>>> 100.0 MB took ~44.506567ms with speed ~2.2 GB/s
>>> 1000.0 MB took ~635.858695ms with speed ~1.5 GB/s

I think we should add some high-quality benches to this repo so we can track general performance better with a real Read impl. The current read benches are quite specific to batched small writes/reads. If I get time I'd like to add some as a follow-up.
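As a sketch of the shape such a measurement could take (a hypothetical std-only micro-bench, not one of this repo's benches; a real one would use criterion and a real socket):

```rust
use std::io::{Cursor, Read};
use std::time::Instant;

// Time reading an in-memory "message" through a plain `Read` impl in
// fixed-size chunks, and report throughput in MiB/s.
fn bench_read(payload_len: usize, chunk_size: usize) -> f64 {
    let data = vec![0u8; payload_len];
    let mut reader = Cursor::new(data);
    let mut buf = vec![0u8; chunk_size];
    let mut total = 0usize;
    let start = Instant::now();
    loop {
        let n = reader.read(&mut buf).unwrap();
        if n == 0 {
            break; // EOF: the whole payload has been consumed
        }
        total += n;
    }
    let secs = start.elapsed().as_secs_f64();
    assert_eq!(total, payload_len);
    (total as f64 / (1024.0 * 1024.0)) / secs
}

fn main() {
    let mbps = bench_read(10 * 1024 * 1024, 128 * 1024);
    println!("10 MiB read at ~{:.0} MiB/s", mbps);
}
```

A Cursor over a Vec is of course far cheaper than a socket, so absolute numbers are meaningless; the point is the loop shape a real-Read bench would time.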

@alexheretic force-pushed the fix-large-message-read-perf branch from b9f2acc to 58f6d5a on May 16, 2025 16:36
Review comment threads: src/protocol/frame/frame.rs (outdated), src/protocol/frame/mod.rs
Co-authored-by: Daniel Abramov <inetcrack2@gmail.com>
@alexheretic (Contributor, Author) commented May 18, 2025

With the e2e benchmarks proposed in #497, the improvement is also visible there:

send+recv/512 B         time:   [13.378 µs 13.446 µs 13.464 µs]
                        thrpt:  [72.534 MiB/s 72.626 MiB/s 72.998 MiB/s]
                 change:
                        time:   [+0.4288% +0.9211% +1.4151%] (p = 0.14 > 0.05)
                        thrpt:  [−1.3954% −0.9127% −0.4269%]
                        No change in performance detected.

send+recv/4 KiB         time:   [15.385 µs 15.409 µs 15.504 µs]
                        thrpt:  [503.89 MiB/s 507.02 MiB/s 507.81 MiB/s]
                 change:
                        time:   [−0.8401% +0.6998% +2.2758%] (p = 0.73 > 0.05)
                        thrpt:  [−2.2252% −0.6949% +0.8473%]
                        No change in performance detected.

send+recv/32 KiB        time:   [28.154 µs 28.175 µs 28.257 µs]
                        thrpt:  [2.1600 GiB/s 2.1663 GiB/s 2.1679 GiB/s]
                 change:
                        time:   [−9.1622% −8.8444% −8.5256%] (p = 0.07 > 0.05)
                        thrpt:  [+9.3202% +9.7026% +10.086%]
                        No change in performance detected.

send+recv/256 KiB       time:   [114.37 µs 115.23 µs 118.65 µs]
                        thrpt:  [4.1155 GiB/s 4.2376 GiB/s 4.2692 GiB/s]
                 change:
                        time:   [−11.231% −9.0861% −6.9179%] (p = 0.05 < 0.05)
                        thrpt:  [+7.4321% +9.9942% +12.652%]
                        Performance has improved.

send+recv/2 MiB         time:   [944.88 µs 947.06 µs 955.79 µs]
                        thrpt:  [4.0869 GiB/s 4.1246 GiB/s 4.1341 GiB/s]
                 change:
                        time:   [−13.055% −11.016% −8.9031%] (p = 0.05 > 0.05)
                        thrpt:  [+9.7732% +12.379% +15.015%]
                        No change in performance detected.

send+recv/16 MiB        time:   [12.886 ms 12.943 ms 13.172 ms]
                        thrpt:  [2.3724 GiB/s 2.4144 GiB/s 2.4251 GiB/s]
                 change:
                        time:   [−55.081% −54.294% −53.496%] (p = 0.07 > 0.05)
                        thrpt:  [+115.04% +118.79% +122.62%]
                        No change in performance detected.

send+recv/128 MiB       time:   [166.54 ms 172.08 ms 173.47 ms]
                        thrpt:  [1.4412 GiB/s 1.4528 GiB/s 1.5012 GiB/s]
                 change:
                        time:   [−83.027% −82.534% −82.033%] (p = 0.07 > 0.05)
                        thrpt:  [+456.57% +472.53% +489.16%]
                        No change in performance detected.

send+recv/1 GiB         time:   [1.1908 s 1.1914 s 1.1937 s]
                        thrpt:  [1.6755 GiB/s 1.6787 GiB/s 1.6795 GiB/s]
                 change:
                        time:   [−96.623% −96.619% −96.615%] (p = 0.00 < 0.05)
                        thrpt:  [+2854.1% +2857.7% +2861.3%]
                        Performance has improved.

You have to love that criterion performance improvement detection logic 😅

time: [−83.027% −82.534% −82.033%] (p = 0.07 > 0.05)
thrpt: [+456.57% +472.53% +489.16%]
No change in performance detected.

@XavDesbordes

Basically, it should dramatically close the performance gap with fastwebsockets, which is good

@alexheretic (Contributor, Author) commented May 18, 2025

critcmp using #497 e2e benches

group                0.24                                   master                                 #496
-----                ----                                   ------                                 ----
send+recv/1 GiB      1.45  1757.1±36.73ms  1165.6 MB/sec    29.06     35.2±0.00s    58.1 MB/sec    1.00   1212.3±2.19ms  1689.4 MB/sec
send+recv/128 MiB    1.29    231.0±0.01ms  1108.1 MB/sec    5.66  1012.8±65.35ms   252.8 MB/sec    1.00    179.0±0.85ms  1429.9 MB/sec
send+recv/16 MiB     1.76     22.4±0.48ms  1428.0 MB/sec    2.39     30.4±0.63ms  1052.6 MB/sec    1.00     12.7±0.32ms     2.5 GB/sec
send+recv/2 MiB      2.35      2.2±0.06ms  1819.0 MB/sec    1.01   948.1±23.03µs     4.1 GB/sec    1.00   935.2±21.91µs     4.2 GB/sec
send+recv/256 KiB    1.46    168.7±1.08µs     2.9 GB/sec    1.09    125.7±2.52µs     3.9 GB/sec    1.00    115.2±2.30µs     4.2 GB/sec
send+recv/32 KiB     1.24     36.4±0.01µs  1716.2 MB/sec    1.01     29.8±0.61µs     2.1 GB/sec    1.00     29.3±0.50µs     2.1 GB/sec
send+recv/4 KiB      1.14     17.6±0.33µs   445.1 MB/sec    1.05     16.2±0.25µs   483.4 MB/sec    1.00     15.4±0.02µs   506.5 MB/sec
send+recv/512 B      1.02     13.6±0.26µs    72.0 MB/sec    1.02     13.6±0.51µs    72.0 MB/sec    1.00     13.3±0.24µs    73.4 MB/sec

@XavDesbordes commented May 18, 2025

That's crazy! well done Alex



Development

Successfully merging this pull request may close these issues.

Performance regression (compared to 0.24.0) when reading large (100 MB+) messages in 0.25.0+

3 participants