Refactor QueryStageExec in preparation for implementing map-side shuffle #459

Merged
alamb merged 6 commits into apache:master from andygrove:query-stage-refactor
Jun 1, 2021

Conversation

Member

@andygrove andygrove commented May 31, 2021

Which issue does this PR close?

Closes #458

Rationale for this change

What changes are included in this PR?

  • Moves logic from Ballista executor to QueryStageExec (which we can potentially move to DataFusion later on)
  • Puts some plumbing in place in preparation for supporting map-side shuffle
  • Implements a unit test

Are there any user-facing changes?

No

@andygrove andygrove self-assigned this May 31, 2021
@andygrove
Member Author

@edrevo fyi


codecov-commenter commented May 31, 2021

Codecov Report

Merging #459 (c6ad3de) into master (c8ab5a4) will increase coverage by 0.30%.
The diff coverage is 68.22%.

@@            Coverage Diff             @@
##           master     #459      +/-   ##
==========================================
+ Coverage   75.30%   75.60%   +0.30%     
==========================================
  Files         152      152              
  Lines       25275    25336      +61     
==========================================
+ Hits        19033    19156     +123     
+ Misses       6242     6180      -62     
Impacted Files                                          Coverage Δ
ballista/rust/core/src/utils.rs                         27.58% <0.00%> (+27.58%) ⬆️
ballista/rust/executor/src/executor.rs                  0.00% <0.00%> (ø)
ballista/rust/scheduler/src/lib.rs                      20.81% <0.00%> (ø)
ballista/rust/scheduler/src/planner.rs                  66.91% <61.53%> (-0.74%) ⬇️
...lista/rust/core/src/execution_plans/query_stage.rs   75.78% <82.27%> (+32.93%) ⬆️
ballista/rust/core/src/serde/scheduler/mod.rs           58.92% <0.00%> (+44.64%) ⬆️
ballista/rust/core/src/memory_stream.rs                 60.00% <0.00%> (+60.00%) ⬆️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

  /// Create a new query stage
  pub fn try_new(
-     job_id: String,
+     job_id: &str,
Contributor

FWIW I think for "constructor methods" it is OK to take an owned String as the &str will be copied anyway.

Contributor

FWIW another pattern that I like is job_id: impl Into<String> so that the caller can pass in an owned String if they have one or a &str if they don't (or anything else that knows how to turn itself into a String)
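
For readers following this thread, here is a minimal sketch of the `impl Into<String>` pattern suggested above. `QueryStage`, its fields, and its accessors are simplified, hypothetical stand-ins for illustration only; they are not the actual `QueryStageExec` definition from this PR.

```rust
/// Hypothetical, simplified stand-in for a query stage (not the real
/// `QueryStageExec`); it only carries the identifiers needed for the demo.
#[derive(Debug)]
pub struct QueryStage {
    job_id: String,
    stage_id: usize,
}

impl QueryStage {
    /// Accepts anything that can be converted into a `String`.
    /// An owned `String` is moved in without a new allocation, while a
    /// `&str` is copied exactly once, here, inside the constructor.
    pub fn new(job_id: impl Into<String>, stage_id: usize) -> Self {
        Self {
            job_id: job_id.into(),
            stage_id,
        }
    }

    /// Borrow the job identifier.
    pub fn job_id(&self) -> &str {
        &self.job_id
    }

    /// Return the stage identifier.
    pub fn stage_id(&self) -> usize {
        self.stage_id
    }
}
```

With this signature both `QueryStage::new("job-1", 0)` and `QueryStage::new(String::from("job-2"), 1)` compile. A `&str` parameter would force a caller who already owns a `String` to pay for a second copy, and a `String` parameter would force a caller holding a `&str` to write `.to_string()` at the call site.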

//TODO re-use code from RepartitionExec to split each batch into
// partitions and write to one IPC file per partition
// See https://github.com/apache/arrow-datafusion/issues/456
unimplemented!()
Contributor

Could use DataFusionError::NotImplemented
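
As a rough illustration of this suggestion, the sketch below returns an error value instead of panicking via `unimplemented!()`. To stay self-contained it defines `PlanError`, a hypothetical stand-in that mirrors only the `NotImplemented(String)` variant of `DataFusionError`; the function `write_stage_output` and its parameter are likewise assumptions made for the example, not code from this PR.

```rust
use std::fmt;

/// Hypothetical stand-in for `DataFusionError`, reduced to the one
/// variant discussed in this review comment.
#[derive(Debug, PartialEq)]
pub enum PlanError {
    NotImplemented(String),
}

impl fmt::Display for PlanError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            PlanError::NotImplemented(msg) => write!(f, "not implemented: {}", msg),
        }
    }
}

impl std::error::Error for PlanError {}

/// Hypothetical entry point for writing a stage's output.
/// `None` means no repartitioning is requested (a single output partition);
/// `Some(n)` asks for a map-side shuffle into `n` partitions, which this
/// sketch does not support.
pub fn write_stage_output(shuffle_partitions: Option<usize>) -> Result<usize, PlanError> {
    match shuffle_partitions {
        // Supported path: report that one output partition was written.
        None => Ok(1),
        // Unsupported path: return an error the caller can handle,
        // rather than panicking with `unimplemented!()`.
        Some(_) => Err(PlanError::NotImplemented(
            "map-side shuffle is not supported yet".to_string(),
        )),
    }
}
```

Returning an error lets whatever process calls this function report the failure and keep running, whereas `unimplemented!()` panics the current thread.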

Member

@jorgecarleitao jorgecarleitao left a comment

Looks great 👍

@alamb alamb merged commit e5264f6 into apache:master Jun 1, 2021
@andygrove andygrove deleted the query-stage-refactor branch June 1, 2021 18:17
@houqp houqp added the ballista label Jul 30, 2021
H0TB0X420 pushed a commit to H0TB0X420/datafusion that referenced this pull request Oct 7, 2025
Development

Successfully merging this pull request may close these issues.

Ballista refactor QueryStageExec in preparation for map-side shuffle

6 participants