Hydra Tail Simulation: Payment Window & Client Snapshots by KtorZ · Pull Request #17 · input-output-hk/hydra-sim

KtorZ · 2021-05-21T17:34:00Z

📍 Remove the options' prefix, now redundant since client/server options are passed in separate commands.
📍 Introduce payment window to the simulation.
This is done in a somewhat backward-compatible way: the '--payment-window' option is optional and, when not set, the simulation keeps its current behavior.
📍 Increase the number of buckets for amount and size plots.
This gives finer granularity and provides better generators for the simulation.
📍 Generate random transaction sizes and amounts derived from real data
With the introduction of the payment window, using default static amounts does not really lead to a useful simulation schedule. Since we have real data readily available, I've generated amounts and sizes using a distribution similar to the one observed in the data. This should help the 'prepare' command produce rather 'realistic' schedules.
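One way to draw amounts that follow an observed distribution is weighted sampling over histogram buckets. The snippet below is only a sketch of that idea and is not taken from the PR: the `Bucket` type, the small `Gen` generator and the function names are all hypothetical, and the histogram must be non-empty with positive weights.

```haskell
-- A histogram bucket: a representative value and how often it was observed.
-- (Hypothetical type; any figures fed to it are placeholders, not the real data.)
type Bucket = (Integer, Int)

-- Tiny linear congruential generator, so the sketch depends on 'base' only.
newtype Gen = Gen Integer

nextInt :: Gen -> (Integer, Gen)
nextInt (Gen s) = (s' `div` 65536, Gen s')
  where
    s' = (s * 1103515245 + 12345) `mod` 2147483648

-- Walk the buckets, subtracting weights until the target falls inside one.
pick :: [Bucket] -> Int -> Integer
pick [] _ = error "pick: empty histogram"
pick [(v, _)] _ = v
pick ((v, w) : rest) target
  | target < w = v
  | otherwise = pick rest (target - w)

-- Draw one value with probability proportional to its bucket weight.
sample :: [Bucket] -> Gen -> (Integer, Gen)
sample buckets g = (pick buckets target, g')
  where
    total = sum (map snd buckets)
    (r, g') = nextInt g
    target = fromIntegral (r `mod` fromIntegral total)
```

With more buckets in the plots, the same sampling gets a finer-grained approximation of the observed data.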
📍 Fix a bug in counting clients before running the simulation.
📍 Handle settlement delays on the server side
The server now prevents transactions from going through if one of the recipients is performing a snapshot. The transaction is queued, then re-enqueued once the snapshot is done.
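The queueing behaviour described for this commit can be pictured with a small pure model. This is a sketch under assumptions, not the server code from the PR: `Server`, `Tx`, `submit` and `snapshotDone` are hypothetical names, and the real server works on messages rather than on a pure state.

```haskell
type ClientId = Int

data Tx = Tx { txId :: Int, txRecipients :: [ClientId] }
  deriving (Eq, Show)

data Server = Server
  { snapshotting :: [ClientId] -- clients currently performing a snapshot
  , pending :: [Tx]            -- transactions held back, oldest first
  }
  deriving (Eq, Show)

data Outcome = Delivered Tx | Queued Tx
  deriving (Eq, Show)

-- Hold a transaction back if any of its recipients is snapshotting.
submit :: Tx -> Server -> (Outcome, Server)
submit tx server
  | any (`elem` snapshotting server) (txRecipients tx) =
      (Queued tx, server { pending = pending server ++ [tx] })
  | otherwise = (Delivered tx, server)

-- Mark a snapshot as done and release every transaction it was blocking.
-- Transactions still blocked by another snapshot stay in the queue.
snapshotDone :: ClientId -> Server -> ([Tx], Server)
snapshotDone client server = (released, server' { pending = stillBlocked })
  where
    server' = server { snapshotting = filter (/= client) (snapshotting server) }
    blocked tx = any (`elem` snapshotting server') (txRecipients tx)
    released = filter (not . blocked) (pending server)
    stillBlocked = filter blocked (pending server)
```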
📍 Report progress on the simulation's analysis (but allow to turn it off)
📍 Summarize events before analyzing the simulation
Used to not matter due to laziness, but now the analysis is monadic.
📍 Interrupt simulation after the prepared duration.
So far, we let the simulation resolve all outstanding events and only terminated afterwards. As a consequence, when the server is overwhelmed and reaches its max TPS, it needs a lot more than the planned duration to finish processing all messages, making some simulations run extremely long. Plus, the analysis is then a bit skewed because clients no longer emit transactions after the end of the planned duration, so the remaining slots are mostly about the server catching up. In the end, it makes more sense to stop the simulation right at the end of the planned duration and look at the number of confirmed transactions. If there are some unconfirmed transactions in flight, then they aren't counted. Besides, there's no real point in continuing to run the simulation if the server is already saturated, because in practice that means the server will eventually blow up or start rejecting connections. So the idea is to tweak simulation settings up until we start seeing a discrepancy between the max throughput and the actual throughput.
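To make the comparison concrete, here is a rough sketch of how the two figures could be computed once the run is cut at the planned duration. It rests on an assumption about terminology: "max throughput" is read here as the rate reached if every scheduled transaction were confirmed in time. `TxRecord` and the function names are hypothetical.

```haskell
-- A transaction as seen by the analysis: the slot it was submitted in and,
-- if the server confirmed it, the slot of the confirmation.
data TxRecord = TxRecord
  { submittedAt :: Int
  , confirmedAt :: Maybe Int
  }

-- Transactions confirmed no later than the last planned slot.
-- Anything still in flight at that point is not counted.
confirmedWithin :: Int -> [TxRecord] -> Int
confirmedWithin lastSlot = length . filter isConfirmed
  where
    isConfirmed tx = maybe False (<= lastSlot) (confirmedAt tx)

-- Transactions per second over the planned duration.
throughput :: Int -> Double -> Int -> Double
throughput lastSlot slotLength count =
  fromIntegral count / (fromIntegral lastSlot * slotLength)

-- (max, actual): a growing gap between the two suggests a saturated server.
maxAndActual :: Int -> Double -> [TxRecord] -> (Double, Double)
maxAndActual lastSlot slotLength txs =
  ( throughput lastSlot slotLength (length txs)
  , throughput lastSlot slotLength (confirmedWithin lastSlot txs)
  )
```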
📍 Better (+fancy) progress reporting for ongoing simulation analysis.
📍 Pick recipient across the whole spectrum of available recipients when sending payment
In the early version of the simulation, recipients were picked in a very deterministic fashion: the next client in line. As a consequence, this creates a ring where each client depends on the next one, which could lead to completely skewed observations. It doesn't really matter in the context of an unlimited payment window, but with a limited one, a single blocking client impacts the entire ring and creates a 'traffic jam'. A fully connected ring like this is perhaps one of the worst-case scenarios we can imagine for the tail? So, to stay closer to real patterns, recipients are now picked at random amongst all possible participants, so that they all have an equal chance of being picked.
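The difference between the two selection strategies can be sketched as follows. This is an illustration only: clients are assumed to be numbered 1..n with n >= 2, the sender is assumed to be excluded from its own recipients (the description does not say), and `Gen`, `nextInLine` and `pickRecipient` are hypothetical names.

```haskell
type ClientId = Int

-- Tiny linear congruential generator, so the sketch depends on 'base' only.
newtype Gen = Gen Integer

nextInt :: Gen -> (Integer, Gen)
nextInt (Gen s) = (s' `div` 65536, Gen s')
  where
    s' = (s * 1103515245 + 12345) `mod` 2147483648

-- Earlier behaviour as described: always the next client in line,
-- wrapping around, which forms a ring.
nextInLine :: Int -> ClientId -> ClientId
nextInLine n self = (self `mod` n) + 1

-- New behaviour: every other client has the same chance of being picked.
pickRecipient :: Int -> ClientId -> Gen -> (ClientId, Gen)
pickRecipient n self g = (others !! i, g')
  where
    others = filter (/= self) [1 .. n]
    (r, g') = nextInt g
    i = fromIntegral (r `mod` fromIntegral (length others))
```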
📍 Reduce memory usage of the 'prepare' command by not constructing full clients
In the end, types were re-used for 'readability' and to keep the run and prepare steps sort of similar. However, the multiplexer is really not needed during the preparation. Similarly, since all steps are done sequentially, we need not use n number generators but can instead rely on a single one passed from client to client.
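Passing a single generator from client to client maps naturally onto `mapAccumL`. The following is a sketch of that pattern rather than the actual 'prepare' code: `Gen`, `prepare` and the amount range 1..100 are made up for illustration.

```haskell
import Data.List (mapAccumL)

type ClientId = Int

-- Tiny linear congruential generator, so the sketch depends on 'base' only.
newtype Gen = Gen Integer

nextInt :: Gen -> (Integer, Gen)
nextInt (Gen s) = (s' `div` 65536, Gen s')
  where
    s' = (s * 1103515245 + 12345) `mod` 2147483648

-- Build one schedule entry per client, handing the generator left by one
-- client to the next instead of allocating a generator per client.
prepare :: Gen -> [ClientId] -> (Gen, [(ClientId, Integer)])
prepare = mapAccumL step
  where
    step g client =
      let (r, g') = nextInt g
       in (g', (client, 1 + r `mod` 100))
```

Since the clients are processed sequentially, only one generator is alive at any time, and no multiplexer is involved.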
📍 More fine-grained stats on the dataset w.r.t. the number of transactions
To adjust the payment window, it's good to know how many transactions are actually within the payment window. If the vast majority is within, and well within (for instance, smaller than w/10), then it means that we can likely reduce the size of the payment window without too much impact.
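The kind of statistic described above could look like the sketch below. It assumes "within the payment window" means an amount of at most w, and it uses "at most w/10" for the stricter bound; `shareWithin` and `windowStats` are hypothetical names, not functions from the PR.

```haskell
-- Share of amounts that are at most the given threshold (0 for no data).
shareWithin :: Integer -> [Integer] -> Double
shareWithin threshold amounts
  | null amounts = 0
  | otherwise =
      fromIntegral (length (filter (<= threshold) amounts))
        / fromIntegral (length amounts)

-- (share within w, share within w/10) for a payment window w.
windowStats :: Integer -> [Integer] -> (Double, Double)
windowStats w amounts =
  (shareWithin w amounts, shareWithin (w `div` 10) amounts)
```

If both shares are close to 1, most transactions sit well inside the window, which is the situation where shrinking w should have little impact.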

ch1bo

Interesting to see how you did it. Some things are a bit unexpected, but it should still serve the purpose and provide meaningful results.

ch1bo · 2021-06-02T10:56:11Z

+
+    (e@(Event _ _ (NewTx MockTx{txAmount} _)):q) | slot e <= currentSlot -> do
+      atomically (paymentWindow <$> readTVar balance) >>= \case
+        InPaymentWindow -> do


Does this mean that the client determines whether it is inside or outside of the payment window? I would've expected the server to keep track of the payment window / balances of its clients. It should work just the same, though.

KtorZ · 2021-06-02

Yes indeed. It was actually much simpler to do it that way, since the clients are driving the simulation. I originally wanted to do it the "intuitive way", which is to have the server control the behavior and instruct clients to take snapshots and whatnot. In the end, this was getting complex very fast and... while it would be necessary for a real implementation, in the simulation, clients are actually well behaved and very disciplined in managing their own payment window ^.^ ...

Conceptually, it doesn't change much in the overall system behavior: clients are still blocked when they go out of their window, and the server will not acknowledge or broadcast transactions to blocked clients. All in all, it's simpler to do it in that direction.
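For readers who want a picture of the client-side check discussed here, a pure sketch follows. Only the constructor names `InPaymentWindow` and `OutOfPaymentWindow` come from the diff; the `Balance` fields, the rule "drift from the last snapshot is at most w" and the functions `windowStatus` and `snapshot` are assumptions made for illustration.

```haskell
data PaymentWindowStatus = InPaymentWindow | OutOfPaymentWindow
  deriving (Eq, Show)

-- Hypothetical balance: the value at the last snapshot and the current one.
data Balance = Balance { initial :: Integer, current :: Integer }
  deriving (Eq, Show)

-- 'Nothing' models an unlimited window (no '--payment-window' given).
windowStatus :: Maybe Integer -> Balance -> PaymentWindowStatus
windowStatus Nothing _ = InPaymentWindow
windowStatus (Just w) b
  | abs (current b - initial b) <= w = InPaymentWindow
  | otherwise = OutOfPaymentWindow

-- After a snapshot, the current balance becomes the new reference point,
-- which puts the client back inside its window.
snapshot :: Balance -> Balance
snapshot b = b { initial = current b }
```

A client that finds itself out of its window would take a snapshot, wait for the settlement delay, and then carry on from the reset balance.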

ch1bo · 2021-06-02T10:58:26Z

+
+        OutOfPaymentWindow -> do
+          sendTo multiplexer serverId SnapshotStart
+          threadDelay (secondsToDiffTime (unSlotNo (opts ^. #settlementDelay)) * opts ^. #slotLength)


Also here, I thought the client would send a "reset window tx" (signed or so later) to the server, and the server would put it "on chain"... Again, this is merely shifting where the time is spent, I suppose.

KtorZ added 6 commits May 21, 2021 15:19

remove options' prefix now redundant since client/server options are …

5b06b8f

…passed in separate commands.

Introduce payment window to the simulation.

eef2128

This is done in a somewhat backward compatible way. The '--payment-window' option is optional and, when not set, will result in the current behavior.

Increase number of buckets for amount and size plots.

8517919

This gives finer granularity and provides better generators for the simulation.

Fix bug when it comes to counting clients before running the simulation.

8b9ec9e

Handle settlement delays on the server side

fea5d36

The server now prevents transactions from going through if one of the recipients is performing a snapshot. The transaction is queued, then re-enqueued once the snapshot is done.

KtorZ requested review from ch1bo and kantp May 21, 2021 17:34

KtorZ self-assigned this May 21, 2021

KtorZ added 3 commits June 1, 2021 17:50

Report progress on the simulation's analysis (but allow to turn it off)

6af86c1

summarize events before analyzing the simulation

bba0b70

Used to not matter due to laziness, but now the analysis is monadic.

ch1bo approved these changes Jun 2, 2021

KtorZ added 4 commits June 2, 2021 13:27

Better (+fancy) progress reporting for ongoing simulation analysis.

3e19bcd

KtorZ merged commit 5a56f54 into master Jun 2, 2021

KtorZ deleted the KtorZ/hydra-tail-payment-window-and-snapshots branch June 2, 2021 13:48

KtorZ merged 13 commits into master from KtorZ/hydra-tail-payment-window-and-snapshots
