feat: support deterministic load generator values #26174

Merged
waynr merged 1 commit into main from core/load-generator/refactor-and-support-sequential-field-data
Mar 20, 2025

@waynr waynr commented Mar 20, 2025

This PR refactors the load generator to make it easier to use in a set of end-to-end tests I am writing for an Antithesis PoC. It includes the following changes:

  • Add a lib target to the crate, make public all the methods & types needed in the binary target.
  • Make a bunch of other types public (e.g. Generator, GeneratorRunner, FieldValue, etc.) to support their use in a separate e2e crate.
  • Fix a bug in the load generator where a cardinality value less than the lines per sample value produced duplicate LP lines, so the field values of the last generated LP line overwrote all the previous values.
    • The fix here was basically to give each line in a sample a unique timestamp by incrementing the nanosecond for each generated line in a batch. It's worth noting we could potentially run into the same problem if a spec ends up with > 1_000_000 lines per sample. I think that's not very likely, but I'll file a follow-up issue to take a closer look at that if necessary.
  • Introduce the GeneratorRunner type to encapsulate state that was being passed around to multiple methods and simplify usage at the e2e level.
  • Introduce two new measurement field variants:
    • FieldValue::Integer(IntegerValue::Sequential(u64)) - causes values to be written sequentially for each generated LP line starting from 0
    • FieldValue::String(StringValue::Sequential(Arc<str>, u64)) - causes values to be written sequentially for each generated LP line starting with {str}0
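
The duplicate-line fix described in the list above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `sample_lines`, the `host` tag, and the `value` field are hypothetical names chosen for the example. It assumes that lines sharing the same table, tag set, and timestamp overwrite one another, which is why each line in a sample gets its own nanosecond offset.

```rust
/// Hypothetical helper (not from the PR): build the line protocol lines for
/// one sample. Each line's timestamp is the sample's base timestamp plus the
/// line index in nanoseconds, so no two lines in the sample share a timestamp.
///
/// Panics if `tag_values` is empty.
fn sample_lines(
    table: &str,
    tag_values: &[&str],
    base_ts_ns: i64,
    lines_per_sample: usize,
) -> Vec<String> {
    (0..lines_per_sample)
        .map(|i| {
            // When cardinality < lines_per_sample, tag values repeat, so the
            // tag set alone cannot tell the lines apart.
            let tag = tag_values[i % tag_values.len()];
            // The per-line nanosecond increment keeps repeated tag sets distinct.
            let ts = base_ts_ns + i as i64;
            format!("{table},host={tag} value={i}i {ts}")
        })
        .collect()
}
```

With two tag values and four lines per sample, lines 0 and 2 both carry `host=a`, but their timestamps differ by two nanoseconds, so neither replaces the other.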

The new measurement field variants are what enable writing long-running test cases where we can assert on the values we expect to get from queries in a deterministic fashion. That is, we don't need to keep a buffer of written values; we just need to know the range of values written, query for them ordered by timestamp, then generate expected values incrementally as we iterate over the queried values, asserting that the expected and queried values are the same.
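
As a rough sketch of that assertion strategy: the `Sequential` enum below is a simplified stand-in for the two new variants, and `first_mismatch` is a hypothetical helper, so neither should be read as the crate's actual API. The point it illustrates is that expected values are regenerated on the fly while walking the query results, with no buffer of written values.

```rust
use std::sync::Arc;

/// Simplified stand-in (an assumption, not the crate's types) for the two
/// sequential variants: an integer counter and a string prefix plus counter.
enum Sequential {
    Integer(u64),
    String(Arc<str>, u64),
}

impl Sequential {
    /// Render the current value as text, then advance the counter by one.
    fn next_value(&mut self) -> String {
        match self {
            Sequential::Integer(n) => {
                let v = n.to_string();
                *n += 1;
                v
            }
            Sequential::String(prefix, n) => {
                let v = format!("{prefix}{n}");
                *n += 1;
                v
            }
        }
    }
}

/// Hypothetical check: walk values returned by a query (already ordered by
/// timestamp) and return the index of the first one that differs from the
/// regenerated sequence, or `None` if every value matches.
fn first_mismatch(queried: &[String], mut expected: Sequential) -> Option<usize> {
    queried
        .iter()
        .position(|got| *got != expected.next_value())
}
```

A test would call `first_mismatch` with the queried column and a generator seeded at the first written value, then assert that the result is `None`.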

* simplify FieldValue types by making load generator functions
  generic over RngCore and passing the RNG in to methods rather than
  depending on it being available on every type instance that needs it
* expose influxdb3_load_generator as library crate
* export config, spec, and measurement types publicly to support use in
  the antithesis-e2e crate
* fix bug that surfaced whenever the cardinality value was less than the
  lines per sample value by forcing LP lines in a set of samples to be
  distinct from one another with nanosecond increments
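
The first point in the commit message, passing the RNG into methods instead of storing it on each value type, might look roughly like the sketch below. To keep the example free of external crates, it defines a local one-method trait named `RngCore` as a stand-in for `rand::RngCore`; `XorShift` and `IntegerRange` are hypothetical types invented for illustration.

```rust
/// Local stand-in for `rand::RngCore`, reduced to a single method so this
/// sketch compiles without dependencies. This is an assumption for the example.
trait RngCore {
    fn next_u64(&mut self) -> u64;
}

/// Tiny deterministic xorshift64 generator, for illustration only.
/// The seed must be non-zero.
struct XorShift(u64);

impl RngCore for XorShift {
    fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }
}

/// Hypothetical field type that stores only its configuration, not an RNG.
struct IntegerRange {
    min: i64,
    max: i64,
}

impl IntegerRange {
    /// Draw a value in `[min, max)` from an RNG supplied by the caller.
    /// Assumes `max > min`; the modulo introduces a slight bias, which is
    /// acceptable for a sketch.
    fn generate<R: RngCore>(&self, rng: &mut R) -> i64 {
        let span = (self.max - self.min) as u64;
        self.min + (rng.next_u64() % span) as i64
    }
}
```

Because the value type carries no RNG state, the same `IntegerRange` can be shared freely, and a caller that seeds the RNG identically gets identical output.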
@waynr waynr requested review from hiltontj and pauldix and removed request for hiltontj March 20, 2025 16:36

@pauldix pauldix left a comment

good stuff

@waynr waynr merged commit 4fd0a3b into main Mar 20, 2025
12 checks passed
waynr added a commit that referenced this pull request Mar 21, 2025
feat: support deterministic load generator values (#26174)