Skip to content

allow Matrices in transformations of GroupedDataFrame#2782

Merged
bkamins merged 2 commits intomainfrom
bk/gdf_matrix_pair_agg
Jun 15, 2021
Merged

allow Matrices in transformations of GroupedDataFrame#2782
bkamins merged 2 commits intomainfrom
bk/gdf_matrix_pair_agg

Conversation

@bkamins
Copy link
Copy Markdown
Member

@bkamins bkamins commented Jun 7, 2021

Fixes #2781

@bkamins bkamins added this to the patch milestone Jun 7, 2021
@bkamins bkamins requested a review from pdeffebach June 7, 2021 14:39
Copy link
Copy Markdown
Contributor

@pdeffebach pdeffebach left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like all the action with select(df::AbstractDataFrame...) is happening inside manipulate here.

Why not use the same structure / logic with SplitApplyCombine?

@bkamins
Copy link
Copy Markdown
Member Author

bkamins commented Jun 7, 2021

We essentially use the same structure just the function is not called manipulate, but _combine_prepare and then super complex logic.

Originally I preferred to add type assertions in select etc. in GroupedDataFrame case to be on a safe side - if I am not able to write a correct signature then it means that the design is too complex :) (and here what happened is that we have added support for matrices, but forgotten to add signatures).

@pdeffebach
Copy link
Copy Markdown
Contributor

Okay. It sounds like we might want to re-factor in the future, but shouldn't use this PR to do it.

@bkamins
Copy link
Copy Markdown
Member Author

bkamins commented Jun 7, 2021

Yes - refactoring is something I wanted to do for a long time, but you know how it is: we add features, the code gest more complex, at some point it is super hard to change anything unless you rewrite it from scratch (that is why I put so much attention to tests and documentation).

for cei in cs
@assert cei isa Union{Pair, Base.Callable, ColumnIndex, MultiColumnIndex}
@assert cei isa Union{Pair, Base.Callable, ColumnIndex, MultiColumnIndex,
AbstractVecOrMat{<:Pair}}
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How were vectors supported before this PR?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AbstractVector <: MultiColumnIndex

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could add AbstractMatrix{<:Pair} in the union, but I thought that it is cleaner to use AbstractVecOrMat{<:Pair} to signal the design intention.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nalimilan - do we agree on this approach?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nalimilan - ok to merge?

@bkamins bkamins merged commit ce481d7 into main Jun 15, 2021
@bkamins bkamins deleted the bk/gdf_matrix_pair_agg branch June 15, 2021 06:10
@bkamins
Copy link
Copy Markdown
Member Author

bkamins commented Jun 15, 2021

Thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

missing method combine(gd::GroupedDataFrame, ::Matrix)

3 participants