Skip to content

ci: improve caches in rust.yml #21819

Draft
lgingerich wants to merge 2 commits into apache:main from lgingerich:lg-21818-ci

@lgingerich (Contributor) commented Apr 24, 2026

Which issue does this PR close?

Part of #13813.

Rationale for this change

Several jobs in .github/workflows/rust.yml rebuild dependencies cold on every run because they don't attach Swatinem/rust-cache. The workflow also maintains four separate cache keys covering roughly the same workspace, and taplo-cli is still installed from source on every run.

What changes are included in this PR?

  1. Add read-only Swatinem/rust-cache@v2 (shared-key: amd-ci, save-if: false) to the jobs that previously had no dependency cache: linux-datafusion-proto-features, linux-cargo-check-datafusion-functions, linux-test-doc, linux-rustdoc, verify-benchmark-results, sqllogictest-postgres, sqllogictest-substrait, config-docs-check, vendor.

     msrv is intentionally excluded: cargo msrv verify builds against MSRV Rust versions, so the amd-ci cache (built with stable) wouldn't match and would only add download overhead.

  2. Consolidate amd-ci-check, amd-ci-linux-test-example, and amd-ci-clippy onto the shared amd-ci key. linux-test-example and clippy now use save-if: false; linux-build-lib and linux-test remain the savers.

  3. Replace cargo install taplo-cli in cargo-toml-formatting-checks with taiki-e/install-action, matching the pattern already used for wasm-pack and cargo-msrv.

  4. Cache the deterministic tpch-dbgen output in verify-benchmark-results, keyed on the dbgen fork and scale factor, so that cache hits skip the clone + make + dbgen steps.
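For readers unfamiliar with the saver/reader split described in the list above, it might look roughly like the sketch below. This is an illustration, not the diff from this PR: the runs-on values, checkout steps, and cargo commands are placeholders, and only the job names and the shared-key / save-if settings are taken from the description.

```yaml
# Illustrative sketch only; not the actual contents of rust.yml.
jobs:
  # Saver: restores the shared cache and writes it back when the job ends.
  linux-build-lib:
    runs-on: ubuntu-latest            # placeholder runner
    steps:
      - uses: actions/checkout@v4
      - uses: Swatinem/rust-cache@v2
        with:
          shared-key: amd-ci          # save-if is left at its default, so this job saves
      - run: cargo check --workspace  # placeholder command

  # Reader: restores the same cache but never saves it.
  linux-rustdoc:
    runs-on: ubuntu-latest            # placeholder runner
    steps:
      - uses: actions/checkout@v4
      - uses: Swatinem/rust-cache@v2
        with:
          shared-key: amd-ci
          save-if: false
      - run: cargo doc --workspace --no-deps  # placeholder command
```

With this arrangement only the saver jobs upload cache archives, so read-only jobs pay the restore cost but not the save cost, and concurrent jobs do not race to write the same key.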

No changes to the needs: graph or to test coverage.
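The taplo-cli install and the tpch-dbgen cache listed above could be sketched as follows. This is a hedged illustration rather than the PR's diff: the tool name passed to taiki-e/install-action, the DBGEN_REF and SCALE_FACTOR variables, the cached path, and the generate-tpch-data.sh script are all assumptions made for the example.

```yaml
# Illustrative sketch only; names marked "hypothetical" are not from the PR.
jobs:
  cargo-toml-formatting-checks:
    runs-on: ubuntu-latest              # placeholder runner
    steps:
      - uses: actions/checkout@v4
      # Install a prebuilt binary instead of compiling with cargo install.
      - uses: taiki-e/install-action@v2
        with:
          tool: taplo-cli               # assumed tool name
      - run: taplo format --check

  verify-benchmark-results:
    runs-on: ubuntu-latest              # placeholder runner
    env:
      DBGEN_REF: example-ref            # hypothetical: identifies the dbgen fork/revision
      SCALE_FACTOR: "1"                 # hypothetical scale factor
    steps:
      - uses: actions/checkout@v4
      - id: tpch-cache
        uses: actions/cache@v4
        with:
          path: benchmarks/data         # hypothetical output directory
          key: tpch-dbgen-${{ env.DBGEN_REF }}-sf${{ env.SCALE_FACTOR }}
      # Only clone, build, and run dbgen when the cache missed.
      - if: steps.tpch-cache.outputs.cache-hit != 'true'
        run: ./generate-tpch-data.sh    # hypothetical script
```

Keying on the fork and the scale factor is safe only because the generated data is deterministic for those inputs; if either changes, the key changes and the data is regenerated.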

Are these changes tested?

Validated locally with actionlint; pre-existing warnings in untouched shell blocks are unchanged. The actual cache behaviour needs to be observed on real CI runs across multiple pushes.

Test Evidence

[Image: rust.yml Performance (Cold)]
[Image: rust.yml Performance (Warm)]

Are there any user-facing changes?

No — this is CI infrastructure only.

Made with Cursor

Several jobs in rust.yml rebuild dependencies cold on every run because
they don't attach Swatinem/rust-cache. The workflow also maintained
four separate cache keys covering roughly the same workspace, and
taplo-cli was still being installed from source on every run.

- Add read-only Swatinem/rust-cache (shared-key: amd-ci, save-if: false)
  to the jobs that previously had no dep cache:
  linux-datafusion-proto-features,
  linux-cargo-check-datafusion-functions,
  linux-test-doc, linux-rustdoc, verify-benchmark-results,
  sqllogictest-postgres, sqllogictest-substrait,
  config-docs-check, vendor.

  msrv is intentionally excluded; cargo msrv verify builds against
  MSRV rust versions, so the amd-ci cache (built with stable) wouldn't
  match.

- Consolidate amd-ci-check, amd-ci-linux-test-example, and
  amd-ci-clippy onto the shared amd-ci key. linux-test-example and
  clippy now use save-if: false; linux-build-lib and linux-test remain
  the savers.

- Replace cargo install taplo-cli in cargo-toml-formatting-checks with
  taiki-e/install-action, matching the pattern already used for
  wasm-pack and cargo-msrv.

- Cache the deterministic tpch-dbgen output in
  verify-benchmark-results keyed on the dbgen fork and scale factor,
  skipping the clone + make + dbgen on cache hits.

Part of apache#13813.

Made-with: Cursor
@github-actions (bot) added the development-process (Related to development process of DataFusion) label Apr 24, 2026


Development

Successfully merging this pull request may close these issues.

CI: Improve caches in rust.yml
