[WIP][SPARK-56015][INFRA][DOCS] Cleanup docs container, remove unused R deps, and fix x86 build. #54838

Closed
holdenk wants to merge 6 commits into apache:master from holdenk:SPARK-56015-bump-r-versions-for-consistent-build

Contributor

holdenk commented Mar 16, 2026

What changes were proposed in this pull request?

Bump package versions in the docs Dockerfile to currently working versions and switch to a non-deprecated installer.

Why are the changes needed?

The docs container uses deprecated installation methods. The build was also failing, but that failure has since been fixed in 54164e9.

Does this PR introduce any user-facing change?

No

How was this patch tested?

Built the docs Docker container locally.

Was this patch authored or co-authored using generative AI tooling?

Asked ChatGPT about the different CRAN install methods.

holdenk added 4 commits March 16, 2026 14:00
…tm_source=chatgpt.com devtools install_version deprecated. Left devtools for downstream usage.

Co-authored-by: Holden Karau <holden@pigscanfly.ca>
…ike archived projects. Also temp put roxygen2 back to 7.2

Co-authored-by: Holden Karau <holden@pigscanfly.ca>
Co-authored-by: Holden Karau <holden@pigscanfly.ca>
Co-authored-by: Holden Karau <holden@pigscanfly.ca>
holdenk changed the title from "[WIP][SPARK-56015][DOCS] Bump package versions in Docs Docker file" to "[SPARK-56015][DOCS] Bump package versions in Docs Docker file" on Mar 16, 2026
…-build

Co-authored-by: Holden Karau <holden@pigscanfly.ca>
Rscript -e "devtools::install_version('preferably', version='0.4', repos='https://cloud.r-project.org')"
RUN Rscript -e "pak::pak('roxygen2@7.2.0')" && \
Rscript -e "pak::pak('pkgdown@2.2.0')"
RUN Rscript -e "devtools::install_version('preferably', version='0.4.1', repos='https://cloud.r-project.org')"
Member

Why is this not using pak, @holdenk? For consistency, I guess this PR is supposed to remove devtools::install_version completely.

Contributor Author

preferably is a deprecated/archived package, so we need the legacy installer to install it. Elsewhere, where we don't build docs and don't use the preferably template, we could avoid devtools entirely. We could also drop the template since it's been removed from CRAN, but that feels like a bigger conversation. See https://cran.r-project.org/web/packages/preferably/index.html

&& rm -rf /var/lib/apt/lists/*

# Broken up for caching since CRAN can change.
RUN Rscript -e "install.packages(c('devtools', 'knitr', 'markdown', 'rmarkdown', 'testthat', 'remotes', 'pak'), repos='https://cloud.r-project.org/')"
Member

I'm wondering if we can remove devtools installation completely. Does pak depend on devtools?

Contributor Author

The reason is that the preferably package (https://cran.r-project.org/web/packages/preferably/index.html) needs to be installed for the current docs, and it is deprecated/archived on CRAN, so we use the legacy installer.

holdenk changed the title from "[SPARK-56015][DOCS] Bump package versions in Docs Docker file" to "[SPARK-56015][DOCS] Switch to non-deprecated package installation where possible for R in Docs Docker file" on Mar 17, 2026
dongjoon-hyun changed the title from "[SPARK-56015][DOCS] Switch to non-deprecated package installation where possible for R in Docs Docker file" to "[SPARK-56015][INFRA][DOCS] Switch to non-deprecated package installation where possible for R in Docs Docker file" on Mar 17, 2026
Co-authored-by: Holden Karau <holden@pigscanfly.ca>
holdenk force-pushed the SPARK-56015-bump-r-versions-for-consistent-build branch from 20cf4e0 to f5aa4bb on March 17, 2026 21:30
Member

dongjoon-hyun left a comment

Thank you for updating, @holdenk.

Is there no side effect from the pkgdown change (from 2.0.1 to 2.2.0)?

Member

dongjoon-hyun commented Mar 17, 2026

IIUC, this simply aims to remove the warning rather than fix any outage (as of now). So did you verify the generated Rdoc manually, @holdenk? If there is no noticeable change, this sounds okay to me.

Contributor Author

holdenk commented Mar 17, 2026

@dongjoon-hyun it started out as unblocking the docs build, but in the meantime you fixed the blocking issue, so it's less critical now :) I've got a local build and am going to do another one with the rebase on top of 54164e9.

Contributor Author

holdenk commented Mar 17, 2026

Actually, looking at this, I think we can/should drop R from the docs container; see the run-in-container script:

# 1.Set env variable.
export JAVA_HOME=/usr/lib/jvm/java-17-openjdk-arm64
export PATH=$JAVA_HOME/bin:$PATH
export SPARK_DOCS_IS_BUILT_ON_HOST=1
# We expect to compile the R document on the host.
export SKIP_RDOC=1
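
The snippet above hardcodes the arm64 JDK path, which would not exist on an x86 image. As a minimal sketch of one way to avoid that (not the change made in this PR), the path could be chosen from the machine architecture. The helper name `java_home_for_arch` is hypothetical, and the amd64 path assumes the usual Debian/Ubuntu OpenJDK 17 layout:

```shell
# Hypothetical helper (not part of the Spark scripts): map an architecture
# string to a Debian/Ubuntu-style OpenJDK 17 install path (assumed layout).
java_home_for_arch() {
  case "$1" in
    aarch64|arm64) echo "/usr/lib/jvm/java-17-openjdk-arm64" ;;
    x86_64|amd64)  echo "/usr/lib/jvm/java-17-openjdk-amd64" ;;
    *) echo "unsupported architecture: $1" >&2; return 1 ;;
  esac
}

# Pick JAVA_HOME from the running machine instead of a fixed arm64 path.
# If the architecture is not recognized, the environment is left untouched.
if JAVA_HOME="$(java_home_for_arch "$(uname -m)")"; then
  export JAVA_HOME
  export PATH="$JAVA_HOME/bin:$PATH"
fi
```

Whether the real script should branch on `uname -m` or on something else (for example a build argument) is a design choice this sketch does not settle.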

holdenk closed this Mar 17, 2026
Member

dongjoon-hyun commented Mar 17, 2026

> Actually looking at this, I think we can / should drop R from the docs container see the run-in-container script
>
> # 1.Set env variable.
> export JAVA_HOME=/usr/lib/jvm/java-17-openjdk-arm64
> export PATH=$JAVA_HOME/bin:$PATH
> export SPARK_DOCS_IS_BUILT_ON_HOST=1
> # We expect to compile the R document on the host.
> export SKIP_RDOC=1

+1 for the direction to drop R from the docs containers (for Apache Spark 4.2+) since it's deprecated already.

holdenk reopened this Mar 24, 2026
holdenk changed the title from "[SPARK-56015][INFRA][DOCS] Switch to non-deprecated package installation where possible for R in Docs Docker file" to "[WIP][SPARK-56015][INFRA][DOCS] Cleanup docs container, remove unused R deps, and fix x86 build." on Mar 24, 2026
holdenk closed this Mar 24, 2026
3 participants