Skip to content

[SPARK-56313][PYTHON][FOLLOWUP] Remove the generic for rddsampler methods#55144

Closed
gaogaotiantian wants to merge 1 commit intoapache:masterfrom
gaogaotiantian:fix-rddsampler-generic
Closed

[SPARK-56313][PYTHON][FOLLOWUP] Remove the generic for rddsampler methods#55144
gaogaotiantian wants to merge 1 commit intoapache:masterfrom
gaogaotiantian:fix-rddsampler-generic

Conversation

@gaogaotiantian
Copy link
Copy Markdown
Contributor

What changes were proposed in this pull request?

Remove the generic ([T]) for rddsampler methods

Why are the changes needed?

It's supported starting from 3.12 but we support python since 3.10.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

mypy passed locally.

Was this patch authored or co-authored using generative AI tooling?

No.

@ueshin
Copy link
Copy Markdown
Member

ueshin commented Apr 1, 2026

Thanks! merging to master.

@ueshin ueshin closed this in e0bc7aa Apr 1, 2026
@pan3793
Copy link
Copy Markdown
Member

pan3793 commented Apr 2, 2026

another failure ... why can't PR GHA capture it?

*****************************
* Building Python API docs. *
*****************************
Running Sphinx v4.5.0

Configuration error:
There is a syntax error in your configuration file: invalid syntax (rddsampler.py, line 104)

https://github.com/apache/spark/actions/runs/23874309148/job/69613658492

@zhengruifeng
Copy link
Copy Markdown
Contributor

@pan3793 it seems the this job Build and deploy documentation is not included in PR builder, and PR builder checks doc building in a different task Documentation generation.
I think the two tasks are under different env setting.

@pan3793
Copy link
Copy Markdown
Member

pan3793 commented Apr 2, 2026

I think the two tasks are under different env setting.

should they be aligned? or it's by design?

@gaogaotiantian
Copy link
Copy Markdown
Contributor Author

These are two different workflows that use different versions of Python. The page.yml actually runs without docker. If we have some bandwidth, we should clean up our workflows (not put everything inside build_and_test.yml, it's too difficult to find the relevant part). If we have a "doc" task that all workflows can use, the result will be much more consistent.

@pan3793
Copy link
Copy Markdown
Member

pan3793 commented Apr 2, 2026

inconsistency comes here, #54310 upgraded Python from 3.11 to 3.12 for doc build

@gaogaotiantian
Copy link
Copy Markdown
Contributor Author

Yeah, but even if the doc CI is consistent, we would've broken the scheduled 3.10/3.11 CI. We can't catch all issues with python compatibilities before merging. Catching it the next day is not too bad.

@raulcd
Copy link
Copy Markdown
Member

raulcd commented Apr 2, 2026

We test with Python 3.11 on the Apache Arrow nightlies integration job with Spark. Just sharing that I was here looking for this :) Our nightlies trigger here:
https://github.com/ursacomputing/crossbow/actions/runs/23878274473/job/69625942335

Maybe a minimal 3.10 CI validation when there are changes on Python files could be worth it? Just to validate there are no breaking changes for the oldest supported Python? @gaogaotiantian @HyukjinKwon

@gaogaotiantian
Copy link
Copy Markdown
Contributor Author

We are limited by the number of concurrent tasks per PR. That's ASF policy. Switching to another version of Python is definitely not impossible, but kind of conflict with our current testing framework - with all the docker images and everything.

@gaogaotiantian gaogaotiantian deleted the fix-rddsampler-generic branch April 2, 2026 20:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants