[pyspark] Add support for Spark Connect ML#11970

Merged
trivialfis merged 48 commits into dmlc:master from medb:sc_python
Mar 24, 2026

Conversation

@medb
Contributor

@medb medb commented Jan 29, 2026

Fixes #9780 #11510

ci todos:

  • Revert change to CI tag.

@medb medb marked this pull request as draft January 29, 2026 00:34
@medb medb marked this pull request as ready for review January 29, 2026 00:35
@medb medb marked this pull request as draft January 29, 2026 00:35
@medb
Contributor Author

medb commented Jan 29, 2026

@WeichenXu123 @wbo4958 could you take a look at this draft PR? It's at a very early stage, but I would like feedback on whether this is the right direction.

@trivialfis
Member

trivialfis commented Jan 30, 2026

Thank you for the PR! I will leave the review to @WeichenXu123 and @wbo4958

Out of curiosity, the last time @wbo4958 worked on this topic, it was suggested to call the XGBoost JVM package from the Python client. Is this still relevant? (I was against the idea; I'm just asking to understand the current status.)

@wbo4958
Contributor

wbo4958 commented Feb 2, 2026

Could you add some unit tests for it?

@medb medb marked this pull request as ready for review February 9, 2026 16:00
@medb
Contributor Author

medb commented Feb 9, 2026

@wbo4958 I have modified the existing unit tests to run for both Spark Classic and Spark Connect. PTAL.

@medb medb changed the title from "[pyspark] Migrate to Spark Connect ML" to "[pyspark] Add support for Spark Connect ML" Feb 10, 2026
@medb
Contributor Author

medb commented Feb 17, 2026

@trivialfis @RAMitchell @wbo4958 @WeichenXu123 I have rebased the PR onto the HEAD of the main branch, and it's ready for review now. PTAL.

@medb medb requested a review from wbo4958 March 14, 2026 06:51
@trivialfis
Member

trivialfis commented Mar 15, 2026

E pyspark.errors.exceptions.base.PySparkImportError: [PACKAGE_NOT_INSTALLED] zstandard >= 0.25.0 must be installed; however, it was not found.

err.

I will run some tests locally first.

@medb medb requested a review from wbo4958 March 16, 2026 17:39
@trivialfis
Member

OK, I managed to sort out the dependencies. Once the review from @wbo4958 is resolved, I will merge the devops PR and update the tag here.

@medb medb requested a review from wbo4958 March 19, 2026 20:11
@wbo4958
Contributor

wbo4958 commented Mar 23, 2026

LGTM, Thx.

@trivialfis trivialfis merged commit 96d4eac into dmlc:master Mar 24, 2026
77 of 78 checks passed
@medb medb deleted the sc_python branch March 24, 2026 17:48

Development

Successfully merging this pull request may close these issues.

[spark] Make xgboost.spark support spark connect ML

4 participants