allow function in allowduplicates in unstack #2998
Yeah I have to admit the name is a bit weird for passing a function. So you'd call the argument |
Co-authored-by: Milan Bouchet-Valat
It is hard to say what is best. Let us first decide if we want the API the way I proposed (i.e. |
|
I have pushed the branch using Probably things can be further optimized but I think it is already OK. The only decision is about the name of the argument. Do we deprecate |
|
This approach is fast when the number of duplicates is large, but it's hard to know whether that's the case in general. Sometimes you might have only a few duplicates. I guess there's no way to be efficient all the time, except by allowing users to choose the algorithm. Maybe not a big deal.
Yeah that's probably better. |
|
I would lean towards names that reference |
|
Most of the time it will be a reducer, but if you e.g. pass |
That sounds similar to |
|
Yes - internally we call For now I have proposed to call the keyword argument |
how about |
|
The issue is that the operation does not have to be an aggregation. We allow any operation. But maybe indeed something like |
|
So you propose to use |
|
I would also put in competition |
|
I'm not sure, I was just thinking out loud. I'm trying to find a similar case in the existing API, but it turns out most of the time we don't use keyword arguments for functions. Maybe just |
This is also what I have checked. And for positional arguments If we feel |
I think |
|
I am ok with |
|
bump (as otherwise we will forget what we discussed). The question is if we accept the Thank you! |
|
I was going to say that combiner is OK but then I read the docstring again and I noticed we speak a lot about "combinations" when describing this argument (and others), and yet these "combinations" have nothing to do with the "combiner" (i.e. it doesn't combine values from different combinations). So maybe we should find another term to avoid the confusion. Maybe |
|
If no other comment is made on the best choice in a few days, I will switch the implementation to use |
|
currently the argument for values is called |
|
@nalimilan - I think we need to close the discussion and make a decision (naming is always super hard unfortunately). I think |
|
The plural indeed sounds better, given that the function will get passed all values for a given combination. Regarding the positional argument, it matters less, but note that we also use the singular for |
|
I am aware of |
|
The PR is updated. |
|
Thank you! |
That is true but those uses of "combinations" are all informal and not part of the API. I think it is more important to keep the formal usage of |
|
@adkabo - but what is your proposal for a name of this keyword argument then? Do you propose CC @nalimilan |
|
IIUC it is exactly a |
|
Well |
|
Yes - I think |
|
I think any of these proposals is better than |
|
I opened #3184 to keep track of it. |
Follow up to #2995
Replaces #1181
What I would discuss is whether allowduplicates is a good name for this keyword argument now. Maybe we should introduce a new keyword argument (a single one) and deprecate allowduplicates (in a long-term deprecation fashion, i.e. we do not need to remove it any time soon).