feat: DH-18079: Record data read info across performance logs QPL, QOPL, and UPL#7936
feat: DH-18079: Record data read info across performance logs QPL, QOPL, and UPL#7936cpwright wants to merge 18 commits intodeephaven:mainfrom
Conversation
No docs changes detected for f1d3f0f |
There was a problem hiding this comment.
Pull request overview
Adds instrumentation to track data-read and metadata-read activity and surfaces it in performance logs (QPL / QOPL / UPL).
Changes:
- Add per-thread counters for data reads and metadata operations and plumb them into performance entries / nuggets.
- Expose new read/metadata columns in QueryPerformanceStreamPublisher and QueryOperationPerformanceStreamPublisher.
- Instrument Parquet layout discovery, FileHandle I/O, and S3 reads/HEAD requests; add an S3 test asserting data-read bytes.
Reviewed changes
Copilot reviewed 17 out of 17 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| extensions/s3/src/test/java/io/deephaven/extensions/s3/S3SeekableChannelSimpleTestBase.java | Adds assertion that S3 reads increment per-thread data-read bytes |
| extensions/s3/src/main/java/io/deephaven/extensions/s3/S3SeekableChannelProvider.java | Records metadata-op timing for S3 HEAD (size) |
| extensions/s3/src/main/java/io/deephaven/extensions/s3/S3ReadRequest.java | Tracks read timings/sizes and attempts to attribute reads to the blocking thread |
| extensions/s3/src/main/java/io/deephaven/extensions/s3/S3ReadContext.java | Records the read after blocking fill completes |
| extensions/s3/build.gradle | Adds engine-table dependency for perf recorder state access |
| extensions/parquet/table/src/main/java/io/deephaven/parquet/table/layout/ParquetKeyValuePartitionedLayout.java | Records metadata-op timing for directory walk |
| extensions/parquet/table/src/main/java/io/deephaven/parquet/table/layout/ParquetFlatPartitionedLayout.java | Records metadata-op timing for directory list |
| extensions/parquet/table/src/main/java/io/deephaven/parquet/table/ParquetTools.java | Wraps readTable in a QueryPerformanceRecorder nugget |
| engine/table/src/main/java/io/deephaven/engine/util/file/FileHandle.java | Records metadata size timing and data read timing/bytes for file reads |
| engine/table/src/main/java/io/deephaven/engine/table/impl/util/QueryPerformanceStreamPublisher.java | Adds new DataRead*/MetadataRead* columns to QPL stream |
| engine/table/src/main/java/io/deephaven/engine/table/impl/util/QueryOperationPerformanceStreamPublisher.java | Adds new DataRead*/MetadataRead* columns to QOPL stream |
| engine/table/src/main/java/io/deephaven/engine/table/impl/perf/QueryPerformanceRecorderState.java | Introduces per-thread read counters and metadata-op stats caching |
| engine/table/src/main/java/io/deephaven/engine/table/impl/perf/QueryPerformanceRecorderImpl.java | Updates log-threshold checks to include read count |
| engine/table/src/main/java/io/deephaven/engine/table/impl/perf/QueryPerformanceNugget.java | Minor comment/format tweak |
| engine/table/src/main/java/io/deephaven/engine/table/impl/perf/QueryPerformanceLogThreshold.java | Adds minimum read-count threshold support |
| engine/table/src/main/java/io/deephaven/engine/table/impl/perf/PerformanceEntry.java | Applies updated threshold logic for UPL intervals |
| engine/table/src/main/java/io/deephaven/engine/table/impl/perf/BasePerformanceEntry.java | Captures deltas for data/metadata reads into performance entries |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 17 out of 17 changed files in this pull request and generated 8 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
…/QueryPerformanceRecorderState.java Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 17 out of 17 changed files in this pull request and generated 2 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
…equest.java Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 18 out of 18 changed files in this pull request and generated 1 comment.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| return readKeyValuePartitionedTable(sourceURI, readInstructions, null); | ||
| case METADATA_PARTITIONED: | ||
| return readPartitionedTableWithMetadata(sourceURI, readInstructions, null); | ||
| return QueryPerformanceRecorder.withNugget("ParquetTools.readTable(" + source + ")", () -> { |
There was a problem hiding this comment.
The nugget description includes the raw source string ("ParquetTools.readTable(" + source + ")"), which can leak sensitive information into performance logs (e.g., credentials embedded in URIs, presigned URL query params, user home paths). Consider using a sanitized/redacted identifier (e.g., scheme + path without userinfo/query/fragment, or a hash) instead of logging the full input.
No description provided.