Hydra Tail Simulation: Payment Window & Client Snapshots#17

Merged: KtorZ merged 13 commits into master from KtorZ/hydra-tail-payment-window-and-snapshots on Jun 2, 2021
@KtorZ KtorZ commented May 21, 2021

Contributor
  • 📍 Remove the options' prefixes, now redundant since client/server options are passed in separate commands.

  • 📍 Introduce payment window to the simulation.
    This is done in a somewhat backward compatible way. The '--payment-window' option is optional and, when not set, will result in the current behavior.

  • 📍 Increase the number of buckets for the amount and size plots.
    This gives finer granularity and provides better generators in the simulation.

  • 📍 Generate random transaction sizes and amounts derived from real data
    With the introduction of the payment window, using default static amounts does not really lead to a useful simulation schedule. Since we have real data readily available, I've generated amounts and sizes using a distribution similar to the one observed in the data. This should help the 'prepare' command produce rather 'realistic' schedules.

  • 📍 Fix bug when it comes to counting clients before running the simulation.

  • 📍 Handle settlement delays on the server side
    The server now prevents transactions from going through if one of the recipients is performing a snapshot. The transaction is then queued and re-enqueued once the snapshot is done.

  • 📍 Report progress on the simulation's analysis (but allow to turn it off)

  • 📍 Summarize events before analyzing the simulation
    This used not to matter due to laziness, but the analysis is now monadic.

  • 📍 Interrupt simulation after the prepared duration.
    So far, we let the simulation resolve all outstanding events and terminated only afterwards. As a consequence, when the server is overwhelmed and reaches its max TPS, it needs a lot more than the planned duration to finish processing all messages, making some simulations run extremely long. Moreover, the analysis is then a bit skewed because clients no longer emit transactions after the end of the planned duration, so the remaining slots are mostly about the server catching up. It makes more sense to stop the simulation right at the end of the planned duration and look at the number of confirmed transactions. Unconfirmed transactions still in flight are not counted. Besides, there is no real point in continuing the simulation if the server is already saturated: in practice, that means the server would eventually blow up or start rejecting connections. So the idea is to tweak the simulation settings up until we start seeing a discrepancy between the max throughput and the actual throughput.

  • 📍 Better (+fancy) progress reporting for ongoing simulation analysis.

  • 📍 Pick recipient across the whole spectrum of available recipients when sending payment
    In the early version of the simulation, recipients were picked in a very deterministic fashion: the next client in line. This creates a circle where each client depends on the next one, which could lead to completely skewed observations. It doesn't really matter with an unlimited payment window, but with a limited one, a single blocking client impacts the entire ring and creates a 'traffic jam'. A fully connected ring like this is perhaps one of the worst-case scenarios we can imagine for the tail? So, to stay closer to real patterns, recipients are now picked at random among all possible participants so that they all have an equal chance of being picked.

  • 📍 Reduce memory usage of the 'prepare' command by not constructing full clients
    Types were re-used for 'readability' and to keep the run and prepare steps sort of similar. However, the multiplexer is really not needed during the preparation. Similarly, since all steps are done sequentially, we need not use n number generators, but can instead rely on a single one passed from client to client.

  • 📍 More fine-grained stats on the dataset w.r.t. the number of transactions
    To adjust the payment window, it's good to know how many transactions are actually within the payment window. If the vast majority is within, and well within (for instance, smaller than w/10), then we can likely reduce the size of the payment window without too much impact.
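
The kind of statistic described in the last item can be sketched as follows. This is a minimal illustration only, not the code from this PR: `WindowStats`, `windowStats` and the plain `Integer` amounts are hypothetical names, and the choice of an inclusive bound for "within" and a strict bound for "well within" is an assumption.

```haskell
-- Hypothetical sketch: none of these names are taken from the PR.

-- | How many transaction amounts fall inside a payment window of size w.
data WindowStats = WindowStats
  { total            :: Int -- ^ number of transactions in the dataset
  , withinWindow     :: Int -- ^ amounts no larger than w (assumed inclusive)
  , wellWithinWindow :: Int -- ^ amounts strictly smaller than w/10
  } deriving (Eq, Show)

-- | Classify a list of (signed) amounts against a window of size w.
windowStats :: Integer -> [Integer] -> WindowStats
windowStats w amounts = WindowStats
  { total            = length amounts
  , withinWindow     = count (\a -> abs a <= w)
  -- Multiply instead of dividing to avoid integer-division rounding.
  , wellWithinWindow = count (\a -> abs a * 10 < w)
  }
  where
    count p = length (filter p amounts)
```

For example, `windowStats 100 [5, 50, 150, 9, 10]` counts 5 transactions, 4 of them within the window and 2 well within it. A high ratio of `wellWithinWindow` to `total` would suggest that the window can shrink.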

KtorZ added 6 commits May 21, 2021 15:19
@KtorZ KtorZ requested review from ch1bo and kantp May 21, 2021 17:34
@KtorZ KtorZ self-assigned this May 21, 2021
KtorZ added 3 commits June 1, 2021 17:50

@ch1bo ch1bo left a comment

Member

Interesting to see how you did it. Some things are a bit unexpected, but it should still serve the purpose and provide meaningful results.


(e@(Event _ _ (NewTx MockTx{txAmount} _)):q) | slot e <= currentSlot -> do
    atomically (paymentWindow <$> readTVar balance) >>= \case
        InPaymentWindow -> do

Member

Does this mean that the client determines whether it is inside or outside of the payment window? I would've expected the server to keep track of the payment window / balances of its clients. It should work just the same though.

Contributor Author

Yes indeed. It was actually much simpler to do it that way, since the clients are driving the simulation. I originally wanted to do it the "intuitive way", which is to have the server control the behavior and instruct clients to do snapshots and whatnot. In the end, this was getting complex very fast and... while it would be necessary for a real implementation, in the simulation clients are actually well behaved and very disciplined in managing their own payment window ^.^ ...

Conceptually, it doesn't change much in the overall system behavior: clients are still blocked when they go out of their window, and the server will not acknowledge or broadcast transactions to blocked clients. All in all, it's simpler done in that direction.
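
To make the client-side check concrete, here is a minimal sketch of how a client might decide whether a payment keeps it inside its window. Only the constructor names `InPaymentWindow` and `OutOfPaymentWindow` appear in the snippet under review. The `Balance` record, `checkPaymentWindow`, `snapshot`, the symmetric window around the balance at the last snapshot, and `Nothing` standing for an unset `--payment-window` are all assumptions made for illustration.

```haskell
-- Hypothetical sketch; only the constructor names InPaymentWindow and
-- OutOfPaymentWindow come from the snippet under review.

data PaymentWindowStatus
  = InPaymentWindow
  | OutOfPaymentWindow
  deriving (Eq, Show)

-- | A client's balance: what it held at its last snapshot, and what it holds now.
data Balance = Balance
  { initial :: Integer
  , current :: Integer
  } deriving (Eq, Show)

-- | Would applying `delta` (negative when paying, positive when receiving)
-- keep the balance within `w` of its value at the last snapshot?
-- `Nothing` models an unlimited window (assumed to match an unset option).
checkPaymentWindow :: Maybe Integer -> Balance -> Integer -> PaymentWindowStatus
checkPaymentWindow Nothing _ _ = InPaymentWindow
checkPaymentWindow (Just w) b delta
  | abs (current b + delta - initial b) > w = OutOfPaymentWindow
  | otherwise                               = InPaymentWindow

-- | Assumed effect of a snapshot: the window is re-centred on the current balance.
snapshot :: Balance -> Balance
snapshot b = b { initial = current b }
```

With a window of 100 and a balance that started at 1000 and now stands at 950, paying 50 stays in the window while paying 60 leaves it. After `snapshot`, the same payment of 60 is accepted again.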


        OutOfPaymentWindow -> do
            sendTo multiplexer serverId SnapshotStart
            threadDelay (secondsToDiffTime (unSlotNo (opts ^. #settlementDelay)) * opts ^. #slotLength)

Member

Also here, I thought the client would send a "reset window tx" (signed or so later) to the server, and the server would put it "on chain"... Again, this is merely shifting where the time is spent, I suppose.
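
For the server-side half of this exchange, where transactions are held back while a recipient is snapshotting, a pure sketch could look like the following. It is an illustration under stated assumptions and not the PR's implementation: `Server`, `Tx`, `submit`, `snapshotStart` and `snapshotDone` are hypothetical names, and the real server works with messages and concurrency rather than a pure state record.

```haskell
import Data.List (partition)

-- Hypothetical sketch: all names below are illustrative, not from the PR.

type ClientId = Int

data Tx = Tx
  { sender     :: ClientId
  , recipients :: [ClientId]
  } deriving (Eq, Show)

data Server = Server
  { snapshotting :: [ClientId] -- ^ clients currently performing a snapshot
  , pending      :: [Tx]       -- ^ transactions held back, oldest first
  } deriving (Eq, Show)

-- | A client announces that it starts a snapshot.
snapshotStart :: ClientId -> Server -> Server
snapshotStart c s = s { snapshotting = c : snapshotting s }

-- | Submit a transaction: it goes through (Just) unless one of its
-- recipients is snapshotting, in which case it is queued (Nothing).
submit :: Tx -> Server -> (Maybe Tx, Server)
submit tx s
  | any (`elem` snapshotting s) (recipients tx) =
      (Nothing, s { pending = pending s ++ [tx] })
  | otherwise = (Just tx, s)

-- | A snapshot completes: release every queued transaction whose
-- recipients are no longer snapshotting, keep the rest queued.
snapshotDone :: ClientId -> Server -> ([Tx], Server)
snapshotDone c s =
  let remaining     = filter (/= c) (snapshotting s)
      blocked tx    = any (`elem` remaining) (recipients tx)
      (held, ready) = partition blocked (pending s)
  in (ready, s { snapshotting = remaining, pending = held })
```

Whether the snapshot is driven by the client (as in this PR) or by the server (as suggested above), the queueing logic stays the same. Only the origin of the start and done signals differs.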

KtorZ added 4 commits June 2, 2021 13:27
@KtorZ KtorZ merged commit 5a56f54 into master Jun 2, 2021
@KtorZ KtorZ deleted the KtorZ/hydra-tail-payment-window-and-snapshots branch June 2, 2021 13:48