Skip to content

Add broadcasting of selectors to the minilanguage#2918

Merged
bkamins merged 11 commits intomainfrom
bk/broadcast_selectors
Nov 1, 2021
Merged

Add broadcasting of selectors to the minilanguage#2918
bkamins merged 11 commits intomainfrom
bk/broadcast_selectors

Conversation

@bkamins
Copy link
Copy Markdown
Member

@bkamins bkamins commented Oct 20, 2021

@nalimilan + @pdeffebach: I am sharing the preview of the functionality making e.g. Not(:x1) .=> sum work in the minilanguage.

The core mechanism should be working for intended cases.

What is todo:

  • implement expansion engine
  • inject the inner mechanics to select, subset etc. functions that should call it
  • add tests of intended use cases
  • add tests for target functions
  • add tests of corner cases (i.e. to check if we are good if we pass something that is not intended)
  • add NEWS.md
  • add docstring
  • add documentation

@bkamins bkamins added this to the 1.3 milestone Oct 20, 2021
@bkamins bkamins linked an issue Oct 20, 2021 that may be closed by this pull request
@bkamins bkamins added the bug label Oct 20, 2021
@bkamins
Copy link
Copy Markdown
Member Author

bkamins commented Oct 20, 2021

Additionally I have caught two bugs in the existing implementation:

julia> df = DataFrame(a=1,b=2,c=3)
1×3 DataFrame
 Row │ a      b      c
     │ Int64  Int64  Int64
─────┼─────────────────────
   1 │     1      2      3

julia> select(view(df, 1:1, 1:2), [:x1, :x2] .=> [sum length])
ERROR: ArgumentError: Unrecognized column selector: Pair{Symbol, _A} where _A[:x1 => sum :x1 => length; :x2 => sum :x2 => length]

julia> select(groupby(df, :a), Symbol[] .=> [sum length])
ERROR: MethodError: no method matching select(::GroupedDataFrame{DataFrame}, ::Matrix{Any})

(both are fixed)

@bkamins bkamins marked this pull request as ready for review October 21, 2021 09:43
@bkamins bkamins requested a review from nalimilan October 21, 2021 09:43
for cei in cs
@assert cei isa Union{Pair, Base.Callable, ColumnIndex, MultiColumnIndex,
AbstractVecOrMat{<:Pair}}
if !(cei isa AbstractMatrix && isempty(cei))
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this is again because of broadcasting returning Matrix{Any}? It's somewhat ugly to have to handle this special case here. Would there be any way to fix this elsewhere?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes - this is the Matrix{Any} case special handling. This is essentially the only place we can handle this in a short way. The point is that we want to do AbstractVecOrMat{<:Pair} check below (not to allow arbitrary matrices to leak further unless they are empty as then it does not matter what eltype they have - and thus the additional check).

bkamins and others added 2 commits October 26, 2021 21:33
Co-authored-by: Milan Bouchet-Valat <nalimilan@club.fr>
@bkamins
Copy link
Copy Markdown
Member Author

bkamins commented Oct 27, 2021

CI failure here is unrelated.

is not a valid transformation specification.

Note! If `cols` or `target_cols` is one of `All`, `Cols`, `Between`, or
`Not` and is used in a broadcasting context are properly expanded to
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a syntax problem here. Just copy the text from above?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have forgotten to sync the docstrings. Fixed.

@bkamins bkamins merged commit cccf74d into main Nov 1, 2021
@bkamins bkamins deleted the bk/broadcast_selectors branch November 1, 2021 11:59
@bkamins
Copy link
Copy Markdown
Member Author

bkamins commented Nov 1, 2021

Thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add support for All, Between and Not broadcating

2 participants