parquet: Make page_index/pushdown metrics consistent with row_group metrics #12545
Merged
alamb merged 3 commits into apache:main on Sep 22, 2024
…etrics
1. Rename `{pushdown,page_index}_filtered` to `{pushdown,page_index}_pruned`
2. Add `{pushdown,page_index}_matched`
The latter makes it clearer in EXPLAIN ANALYZE when the Page Index is
not checked because the row groups were already eliminated
(with a Bloom Filter or row group statistics).
alamb approved these changes on Sep 20, 2024
    }

    /// returns the number of rows not skipped in the selection
    /// TODO should this be upstreamed to RowSelection?
This looks the same as https://docs.rs/parquet/latest/parquet/arrow/arrow_reader/struct.RowSelection.html#method.row_count
It would be great to upstream this and rows_skipped to parquet -- any chance you are willing to file a ticket to do so?
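As a rough illustration of the two helpers discussed here, the sketch below counts selected and skipped rows over a list of runs. The `Selector` struct and both function names are hypothetical stand-ins written for this example; they are not the parquet crate's types, and the PR's actual helper may differ.

```rust
/// Hypothetical stand-in for one run of a row selection: `row_count`
/// consecutive rows that are either all skipped or all selected.
#[derive(Debug, Clone, Copy)]
struct Selector {
    row_count: usize,
    skip: bool,
}

/// Number of rows the selection keeps (the "not skipped" count).
fn rows_selected(selectors: &[Selector]) -> usize {
    selectors
        .iter()
        .filter(|s| !s.skip)
        .map(|s| s.row_count)
        .sum()
}

/// Number of rows the selection skips.
fn rows_skipped(selectors: &[Selector]) -> usize {
    selectors
        .iter()
        .filter(|s| s.skip)
        .map(|s| s.row_count)
        .sum()
}

fn main() {
    let selection = [
        Selector { row_count: 100, skip: true },
        Selector { row_count: 20, skip: false },
        Selector { row_count: 30, skip: true },
        Selector { row_count: 50, skip: false },
    ];
    // prints: selected=70 skipped=130
    println!(
        "selected={} skipped={}",
        rows_selected(&selection),
        rows_skipped(&selection)
    );
}
```

The two counts always add up to the total number of rows covered by the selection, which is why having both upstream would be convenient for metrics.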
- `SortPreservingMergeExec`
- `output_rows=5`, `elapsed_compute=2.375µs`: Produced the final 5 rows in 2.375µs (microseconds)

When predicate pushdown is enabled, `ParquetExec` gains the following metrics:
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
This was referenced Sep 20, 2024
Thanks again @progval
bgjackma pushed a commit to bgjackma/datafusion that referenced this pull request on Sep 25, 2024
…etrics (apache#12545)
* parquet: Make page_index/pushdown metrics consistent with row_group metrics
  1. Rename `{pushdown,page_index}_filtered` to `{pushdown,page_index}_pruned`
  2. Add `{pushdown,page_index}_matched`
  The latter makes it clearer in EXPLAIN ANALYZE when the Page Index is not checked because the row groups were already eliminated (with a Bloom Filter or row group statistics).
* Add missing metric definitions in the docs
  Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
* s/pass/select/
---------
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
Which issue does this PR close?
Closes #12543.
Closes #12544.
What changes are included in this PR?
1. Rename `{pushdown,page_index}_filtered` to `{pushdown,page_index}_pruned`
2. Add `{pushdown,page_index}_matched`
Rationale for this change
The latter makes it clearer in EXPLAIN ANALYZE when the Page Index is not checked because the row groups were already eliminated (with a Bloom Filter or row group statistics).
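The point of the paired counters can be sketched as follows. This is a minimal model, not DataFusion's implementation: `PruningMetrics`, `RowGroup`, and `page_index_metrics` are hypothetical names, and counting rows (rather than pages) is an assumption made for the example. With only a pruned counter, a value of 0 could mean either "checked, nothing pruned" or "never checked"; with a matched counter alongside, the two cases look different.

```rust
/// Hypothetical counters following the `_pruned` / `_matched` naming.
#[derive(Debug, Default, PartialEq)]
struct PruningMetrics {
    pruned: usize,
    matched: usize,
}

/// Hypothetical row group description used only for this sketch.
struct RowGroup {
    /// True if a Bloom filter or row group statistics already eliminated it.
    eliminated_earlier: bool,
    /// One entry per page: (rows in page, page may contain matching rows).
    pages: Vec<(usize, bool)>,
}

/// Consults the (modelled) page index only for row groups that survived
/// the earlier stages, and counts rows as pruned or matched.
fn page_index_metrics(row_groups: &[RowGroup]) -> PruningMetrics {
    let mut metrics = PruningMetrics::default();
    for rg in row_groups.iter().filter(|rg| !rg.eliminated_earlier) {
        for &(rows, may_match) in &rg.pages {
            if may_match {
                metrics.matched += rows;
            } else {
                metrics.pruned += rows;
            }
        }
    }
    metrics
}

fn main() {
    let row_groups = vec![
        // Already eliminated earlier: the page index is never consulted.
        RowGroup { eliminated_earlier: true, pages: vec![(100, false), (100, false)] },
        // Survived earlier stages: one page pruned, one page matched.
        RowGroup { eliminated_earlier: false, pages: vec![(100, false), (100, true)] },
    ];
    let m = page_index_metrics(&row_groups);
    // prints: page_index pruned=100 matched=100
    println!("page_index pruned={} matched={}", m.pruned, m.matched);
}
```

If every row group had been eliminated earlier, both counters would be 0, which signals that the page index was not checked at all.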
Are these changes tested?
Yes.
Are there any user-facing changes?
New metrics in EXPLAIN ANALYZE, documented in docs/source/user-guide/explain-usage.md