Allow multicolumn transformations for AbstractDataFrame #2461
bkamins merged 21 commits into JuliaData:master
nalimilan left a comment
Thanks, that's impressive. I haven't looked at the tests in detail yet, feel free to point me at interesting cases that I may have missed.
|
Could you please clarify what |
It is a type. Types in Julia are also values, and we use this fact here.
The first is an instance of a type, the second is a type (
What do you mean by "pattern"? In general the transformation mini-language is DataFrames.jl specific. What is important is that
We do not dispatch on type (actually if you look at the implementation there is a problem with this - we have to dynamically check for
We could use
In summary: it is not a common pattern, but do you have a better proposal for what to use instead? The benefit of this approach is:
(and just to stress - we do not dispatch on This situation is kind-of similar to |
|
As an afterthought: we could use and it would be a valid transformation specification (note that there is no need for parens for trailing |
|
Thank you for the clarification. I'm glad I understand your thought process here. I think |
|
@nalimilan - what do you think? I dislike EDIT: sorry, actually |
|
I prefer |
|
Let us keep |
|
Sounds good. Perhaps the best mental model is for it to be a |
Co-authored-by: Milan Bouchet-Valat <[email protected]>
|
Is the following expected behavior? I would have thought with the |
|
Additionally, should the following work? |
No - this would be
What is
This is also expected - and follows your request to disallow In general |
|
Thanks, this is all very clear.
Yes, that is expected. This is really impressive work! I really appreciate it and the thought you've put into this. |
|
I think a |
|
Why allow returning matrices at all if we are deprecating the |
Only for backward compatibility reasons. Note that we will not disallow returning them. The only question is what happens with them and we have two options:
I was thinking about which behavior the user would prefer when returning a matrix, and I thought that the second is more natural. Would you prefer the first? In general - under current rules the only case when we throw an error is Note that this is a different case from what we are discussing with @nalimilan, as he has raised a case when |
The first option reminds me of |
|
So option 2 is what we currently have 😄. |
|
I have updated the documentation (so essentially, once we accept this, it should be good to merge). @nalimilan - as usual - feel free to rewrite the docstrings 😄 (and sorry for the mistakes, as there will surely be some). |
|
Thank you for all the comments. If there are no more issues with this proposal I will merge the PR tomorrow and follow up with a small |
|
Thank you! |
This PR partially addresses #2410 and #2457.
It covers:
- select etc. for AbstractDataFrame.
If we are OK with the functionality I will update the documentation.
TODO:
- select etc. for GroupedDataFrame (this will be a separate PR to keep PRs more atomic)
- ByRow with no columns passed to filter (also a separate PR)
CC @nalimilan @pdeffebach @matthieugomez - this is a rather complex PR, so independent testing (especially for corner cases) would be welcome (if you have suggestions for types of tests to add, please comment and I will add them).