Skip to content

Use PCA and TSVD from raft#7802

Merged
rapids-bot[bot] merged 13 commits intorapidsai:release/26.04from
aamijar:pca-raft
Mar 20, 2026
Merged

Use PCA and TSVD from raft#7802
rapids-bot[bot] merged 13 commits intorapidsai:release/26.04from
aamijar:pca-raft

Conversation

@aamijar
Copy link
Copy Markdown
Member

@aamijar aamijar commented Feb 16, 2026

Resolves #7801. Depends on rapidsai/raft#2952.
Remove code duplication, since PCA and TSVD are in raft now.

@copy-pr-bot
Copy link
Copy Markdown

copy-pr-bot bot commented Feb 16, 2026

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

@aamijar aamijar added non-breaking Non-breaking change improvement Improvement / enhancement to an existing function labels Feb 16, 2026
@aamijar aamijar changed the base branch from main to release/26.04 March 14, 2026 00:05
rapids-bot bot pushed a commit to rapidsai/raft that referenced this pull request Mar 18, 2026
Required for rapidsai/cuvs#1207 and rapidsai/cuml#7802.

This PR moves `pca.cuh`, `tsvd.cuh`, and gtests into raft.

Resolves #2978

Authors:
  - Anupam (https://github.com/aamijar)

Approvers:
  - Jinsol Park (https://github.com/jinsolp)
  - Divye Gala (https://github.com/divyegala)

URL: #2952
@aamijar aamijar marked this pull request as ready for review March 19, 2026 00:02
@aamijar aamijar requested a review from a team as a code owner March 19, 2026 00:02
@aamijar aamijar requested review from dantegd and divyegala March 19, 2026 00:02
@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Mar 19, 2026

📝 Walkthrough

Summary by CodeRabbit

Release Notes

  • New Features

    • PCA operations now provide additional output statistics: explained variance, explained variance ratio, singular values, mean, and noise variances.
    • TSVD fit-transform now returns explained variance metrics.
  • API Improvements

    • Improved const-correctness: handle parameters are now const references across PCA and TSVD operations.
    • Streamlined parameter passing in decomposition functions.
  • Breaking Changes

    • PCA and TSVD function signatures updated; code using these functions requires updates.
    • Removed default arguments for certain boolean parameters in fit operations.

Walkthrough

Refactors PCA and TSVD to call raft::linalg implementations, replaces local solver enums with raft::linalg::solver aliases, updates public APIs to accept const raft::handle_t&, adds conversion helpers to raft params, removes legacy internal CUDA headers, and updates call sites and tests to the new signatures.

Changes

Cohort / File(s) Summary
Parameter Type Alignment
cpp/include/cuml/decomposition/params.hpp
Replaced local enum class solver and mg_solver with using ... = raft::linalg::solver;, added #include <raft/linalg/pca_types.hpp>, and added to_raft_params template converters for PCA/TSVD param structs.
Public API Header Updates
cpp/include/cuml/decomposition/pca.hpp, cpp/include/cuml/decomposition/tsvd.hpp
Made handle parameters const raft::handle_t&, expanded PCA/TSVD fit/fit-transform signatures to expose explained_var/explained_var_ratio/singular_vals/mu/noise_vars, and removed default bool defaults.
PCA Implementation
cpp/src/pca/pca.cu, cpp/src/pca/pca_mg.cu
Added templated impls that wrap raft::linalg::pca_* (device mdspan/views), forward to_raft_params(prms), replaced previous internal calls with raft equivalents, removed passing raw streams to raft calls, and updated MG code to use raft resources and sign_flip_components.
PCA Header Removal
cpp/src/pca/pca.cuh
Deleted legacy PCA CUDA header and its device helpers/templates (fit/transform, helpers for explained vars, mean/covariance, inverse/transform flows).
TSVD Implementation
cpp/src/tsvd/tsvd.cu, cpp/src/tsvd/tsvd_mg.cu
Added templated impls that wrap raft::linalg::tsvd_* with device views, convert params via to_raft_params, updated MG variant to call raft::linalg::cal_eig/sign_flip_components with raft resources.
TSVD Header Removal
cpp/src/tsvd/tsvd.cuh
Removed legacy TSVD CUDA header and its internal routines (calCompExpVarsSvd, calEig, signFlipComponents, fit/transform/inverse, etc.).
Call-site & Test Updates
cpp/src/tsne/tsne_runner.cuh, cpp/tests/sg/pca_test.cu, cpp/tests/sg/tsvd_test.cu
Replaced private header includes with public headers, updated pca/tsvd calls to pass boolean flags instead of CUDA streams and to match new signatures; updated copyright years.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~75 minutes

Possibly related PRs

  • Update raft headers #7752: Migrates cuML code to raft::linalg/mdspan device-view APIs and performs similar PCA/TSVD wiring and parameter conversions.

Suggested reviewers

  • dantegd
  • divyegala
  • jcrist
🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 3.13% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title 'Use PCA and TSVD from raft' clearly summarizes the main change in the PR: migrating from cuml's local PCA/TSVD implementations to using raft's implementations.
Description check ✅ Passed The description is directly related to the changeset, explaining the motivation (resolve #7801, depend on raft PR) and intent (remove code duplication by using raft's PCA and TSVD).
Linked Issues check ✅ Passed The PR successfully implements the objective from #7801 by removing local PCA/TSVD implementations and delegating to raft functions through parameter conversion helpers and wrapper functions across all relevant files.
Out of Scope Changes check ✅ Passed All changes are directly scoped to the objective of migrating PCA and TSVD from local implementations to raft. The copyright year updates (2025→2026) and include path changes are necessary supporting modifications.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Comment @coderabbitai help to get the list of available commands and usage tips.

Tip

You can disable the changed files summary in the walkthrough.

Disable the reviews.changed_files_summary setting to disable the changed files summary in the walkthrough.

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
cpp/include/cuml/decomposition/pca.hpp (1)

16-57: ⚠️ Potential issue | 🔴 Critical

Provide backward compatibility for pcaFit and pcaFitTransform API changes.

The expanded function signatures for pcaFit and pcaFitTransform (adding 5 new output parameters: explained_var, explained_var_ratio, singular_vals, mu, noise_vars) break the public C++ API without a deprecation path. Per coding guidelines for public headers in cpp/include/cuml/, breaking changes must include deprecation warnings and a migration guide.

Keep the old signatures with [[deprecated]] attributes and create overloads or wrapper functions that internally call the new implementations with default/computed values for the new parameters.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/include/cuml/decomposition/pca.hpp` around lines 16 - 57, Add
backward-compatible deprecated overloads for the old pcaFit and pcaFitTransform
signatures that call the new expanded APIs: create [[deprecated]] versions of
pcaFit(const raft::handle_t&, float* input, float* components, const paramsPCA&
prms, bool flip_signs_based_on_U) and its double variant, and similarly for
pcaFitTransform(old signatures). In each deprecated overload, allocate or pass
nullptrs/default buffers for the new outputs (explained_var,
explained_var_ratio, singular_vals, mu, noise_vars) and call the new
implementations (pcaFit and pcaFitTransform with the expanded parameter lists),
ensuring the deprecated functions forward to the new ones so existing callers
keep working; also add a brief comment directing users to the new API and
migration steps.
cpp/include/cuml/decomposition/tsvd.hpp (1)

48-65: ⚠️ Potential issue | 🔴 Critical

All internal call sites properly updated with new parameters.

The expanded tsvdFitTransform signature with explained_var and explained_var_ratio parameters has been consistently applied across all internal usages:

  • Cython extern declarations and fit_transform calls (tsvd.pyx)
  • C++ test file allocations and calls (tsvd_test.cu)
  • Python fit_transform implementation properly allocates and passes these parameters

However, this constitutes a breaking change to the public C++ API in cpp/include/cuml/decomposition/tsvd.hpp. Per the coding guidelines, breaking changes to public headers require deprecation warnings. No deprecation mechanism is currently in place for the old signature. Consider adding a deprecated overload or compiler warning to ease migration for external consumers.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/include/cuml/decomposition/tsvd.hpp` around lines 48 - 65, The public API
changed for tsvdFitTransform by adding explained_var and explained_var_ratio
parameters; add a deprecated overload that preserves the old signature and
forwards to the new implementation (e.g., an overload of tsvdFitTransform(float*
... , const paramsTSVD&, bool) and the double version) so external consumers
continue to compile, and annotate those forwarding overloads with a deprecation
attribute (e.g., [[deprecated]] or compiler-specific macro) to generate
warnings; ensure the deprecated overloads call the new tsvdFitTransform
variants, passing nullptr (or sensible defaults) for explained_var and
explained_var_ratio to maintain behavior.
🧹 Nitpick comments (3)
cpp/include/cuml/decomposition/pca.hpp (2)

87-87: Remove unnecessary semicolon after namespace closing brace.

The semicolon after the closing brace is not required for namespace declarations.

Proposed fix
-};  // end namespace ML
+}  // end namespace ML
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/include/cuml/decomposition/pca.hpp` at line 87, Remove the stray
semicolon following the closing brace of namespace ML: locate the namespace
closing token "namespace ML { ... }" (the closing brace for namespace ML) and
delete the trailing semicolon so the namespace ends with only the closing brace.

16-85: Consider adding minimal Doxygen documentation for maintainability.

Similar to the TSVD header, brief documentation would clarify parameter semantics, especially for the multiple output arrays in pcaFit and pcaFitTransform.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/include/cuml/decomposition/pca.hpp` around lines 16 - 85, Add minimal
Doxygen comments above each PCA API declaration (pcaFit, pcaFitTransform,
pcaTransform, pcaInverseTransform for both float and double overloads)
explaining parameter semantics: expected input layout and shapes (rows/cols),
which arrays are outputs (components, explained_var, explained_var_ratio,
singular_vals, mu, noise_vars, trans_input, input), ownership/allocator
expectations (caller-allocated), and meaning of flags like flip_signs_based_on_U
and paramsPCA; keep comments brief (one or two lines per param) similar to the
TSVD header style to improve maintainability and clarity.
cpp/include/cuml/decomposition/tsvd.hpp (1)

16-65: Consider adding minimal Doxygen documentation.

While these headers serve as internal implementation details for the Python API, adding brief Doxygen comments would help future maintainers understand:

  • Parameter semantics (input vs output pointers)
  • Memory ownership expectations
  • The purpose of flip_signs_based_on_U
📝 Example documentation pattern
/**
 * `@brief` Fit TSVD model to input data.
 * `@param`[in]  handle       RAFT handle for resource management
 * `@param`[in,out] input     Input matrix (may be modified)
 * `@param`[out] components   Principal components (n_components x n_cols)
 * `@param`[out] singular_vals Singular values (n_components)
 * `@param`[in]  prms         TSVD parameters
 * `@param`[in]  flip_signs_based_on_U Whether to flip signs for deterministic output
 */
void tsvdFit(const raft::handle_t& handle, ...);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/include/cuml/decomposition/tsvd.hpp` around lines 16 - 65, Add minimal
Doxygen-style comments to each declaration (e.g., tsvdFit, tsvdTransform,
tsvdInverseTransform, tsvdFitTransform) describing parameter semantics (which
pointers are input vs output vs in/out), memory ownership expectations
(caller-allocated vs callee-allocated), the role of the RAFT handle, and the
purpose/behavior of flip_signs_based_on_U (deterministic sign flipping based on
U); keep each comment short (one-liner brief description + `@param` tags for
handle, input/trans_input, components, singular_vals, explained_var(s), prms,
and flip_signs_based_on_U).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@cpp/src/pca/pca_mg.cu`:
- Around line 108-115: The code incorrectly calls the single-GPU helper
sign_flip_components with input_data[0]->ptr and global dimensions (prms.n_rows,
prms.n_cols), causing OOB reads and wrong sign selection for partitioned
multi-GPU inputs; replace this with the MG-aware path by calling
sign_flip_components_u (or iterate over input_desc.blocksOwnedBy(rank) and call
the sign-flip logic per-partition using each partition's local row/col sizes)
and pass the correct local device matrix views for each partition and the
components buffer; apply the same fix in tsvd_mg.cu where sign_flip_components
is used.

In `@cpp/src/tsvd/tsvd_mg.cu`:
- Around line 97-104: The current else branch uses
raft::linalg::sign_flip_components with a wrapped single-partition matrix
(input_data[0]->ptr) which assumes prms.n_rows is local and causes OOB reads in
MNMG; replace that call with the partition-aware API by invoking
raft::linalg::sign_flip_components_u with the same parameters used in the true
branch (pass handle_stream_zero, the full input_data vector and input_desc,
components, prms.n_components, prms.n_cols, and the same booleans), or remove
the single-partition path entirely so MNMG always uses sign_flip_components_u to
ensure per-rank partitioning is respected.

---

Outside diff comments:
In `@cpp/include/cuml/decomposition/pca.hpp`:
- Around line 16-57: Add backward-compatible deprecated overloads for the old
pcaFit and pcaFitTransform signatures that call the new expanded APIs: create
[[deprecated]] versions of pcaFit(const raft::handle_t&, float* input, float*
components, const paramsPCA& prms, bool flip_signs_based_on_U) and its double
variant, and similarly for pcaFitTransform(old signatures). In each deprecated
overload, allocate or pass nullptrs/default buffers for the new outputs
(explained_var, explained_var_ratio, singular_vals, mu, noise_vars) and call the
new implementations (pcaFit and pcaFitTransform with the expanded parameter
lists), ensuring the deprecated functions forward to the new ones so existing
callers keep working; also add a brief comment directing users to the new API
and migration steps.

In `@cpp/include/cuml/decomposition/tsvd.hpp`:
- Around line 48-65: The public API changed for tsvdFitTransform by adding
explained_var and explained_var_ratio parameters; add a deprecated overload that
preserves the old signature and forwards to the new implementation (e.g., an
overload of tsvdFitTransform(float* ... , const paramsTSVD&, bool) and the
double version) so external consumers continue to compile, and annotate those
forwarding overloads with a deprecation attribute (e.g., [[deprecated]] or
compiler-specific macro) to generate warnings; ensure the deprecated overloads
call the new tsvdFitTransform variants, passing nullptr (or sensible defaults)
for explained_var and explained_var_ratio to maintain behavior.

---

Nitpick comments:
In `@cpp/include/cuml/decomposition/pca.hpp`:
- Line 87: Remove the stray semicolon following the closing brace of namespace
ML: locate the namespace closing token "namespace ML { ... }" (the closing brace
for namespace ML) and delete the trailing semicolon so the namespace ends with
only the closing brace.
- Around line 16-85: Add minimal Doxygen comments above each PCA API declaration
(pcaFit, pcaFitTransform, pcaTransform, pcaInverseTransform for both float and
double overloads) explaining parameter semantics: expected input layout and
shapes (rows/cols), which arrays are outputs (components, explained_var,
explained_var_ratio, singular_vals, mu, noise_vars, trans_input, input),
ownership/allocator expectations (caller-allocated), and meaning of flags like
flip_signs_based_on_U and paramsPCA; keep comments brief (one or two lines per
param) similar to the TSVD header style to improve maintainability and clarity.

In `@cpp/include/cuml/decomposition/tsvd.hpp`:
- Around line 16-65: Add minimal Doxygen-style comments to each declaration
(e.g., tsvdFit, tsvdTransform, tsvdInverseTransform, tsvdFitTransform)
describing parameter semantics (which pointers are input vs output vs in/out),
memory ownership expectations (caller-allocated vs callee-allocated), the role
of the RAFT handle, and the purpose/behavior of flip_signs_based_on_U
(deterministic sign flipping based on U); keep each comment short (one-liner
brief description + `@param` tags for handle, input/trans_input, components,
singular_vals, explained_var(s), prms, and flip_signs_based_on_U).

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: de869307-e4ac-4987-aef7-b539040a82ad

📥 Commits

Reviewing files that changed from the base of the PR and between b027fff and dea771d.

📒 Files selected for processing (12)
  • cpp/include/cuml/decomposition/params.hpp
  • cpp/include/cuml/decomposition/pca.hpp
  • cpp/include/cuml/decomposition/tsvd.hpp
  • cpp/src/pca/pca.cu
  • cpp/src/pca/pca.cuh
  • cpp/src/pca/pca_mg.cu
  • cpp/src/tsne/tsne_runner.cuh
  • cpp/src/tsvd/tsvd.cu
  • cpp/src/tsvd/tsvd.cuh
  • cpp/src/tsvd/tsvd_mg.cu
  • cpp/tests/sg/pca_test.cu
  • cpp/tests/sg/tsvd_test.cu
💤 Files with no reviewable changes (2)
  • cpp/src/tsvd/tsvd.cuh
  • cpp/src/pca/pca.cuh

Comment on lines +108 to +115
raft::linalg::sign_flip_components(
handle_stream_zero,
raft::make_device_matrix_view<T, std::size_t, raft::col_major>(
input_data[0]->ptr, prms.n_rows, prms.n_cols),
raft::make_device_matrix_view<T, std::size_t, raft::col_major>(
components, prms.n_components, prms.n_cols),
true,
false);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

# First, understand the file structure and locate the exact code
find . -type f -name "pca_mg.cu" | head -5

Repository: rapidsai/cuml

Length of output: 80


🏁 Script executed:

# Read the file around lines 108-115 with sufficient context
wc -l ./cpp/src/pca/pca_mg.cu

Repository: rapidsai/cuml

Length of output: 84


🏁 Script executed:

# Get the code context around the flagged lines
sed -n '100,125p' ./cpp/src/pca/pca_mg.cu | cat -n

Repository: rapidsai/cuml

Length of output: 1188


🏁 Script executed:

# Get broader context to understand input_data and prms structure
sed -n '1,120p' ./cpp/src/pca/pca_mg.cu | cat -n

Repository: rapidsai/cuml

Length of output: 5309


🏁 Script executed:

# Find TSVD MG code that was mentioned as having identical issue
find . -type f -name "*tsvd*mg*.cu" -o -name "*tsvd*mg*.cuh" 2>/dev/null | head -5

Repository: rapidsai/cuml

Length of output: 82


🏁 Script executed:

# Look at the sign_flip_components_u function (MG-aware variant used when flip_signs_based_on_U==true)
rg -n "sign_flip_components_u" --type cpp --type cuda -B 3 -A 10 | head -50

Repository: rapidsai/cuml

Length of output: 85


🏁 Script executed:

# Search for PartDescriptor usage to understand data partitioning
rg -n "PartDescriptor" cpp/src/pca/pca_mg.cu -A 2 -B 2

Repository: rapidsai/cuml

Length of output: 2766


🏁 Script executed:

# Check what input_data[0] represents and its relationship to prms.n_rows
grep -n "input_data\[0\]" cpp/src/pca/pca_mg.cu

Repository: rapidsai/cuml

Length of output: 115


🏁 Script executed:

# Read TSVD MG around sign_flip_components calls
grep -n "sign_flip_components" ./cpp/src/tsvd/tsvd_mg.cu -B 5 -A 5

Repository: rapidsai/cuml

Length of output: 1215


🏁 Script executed:

# Search for input_data[0]->ptr in TSVD MG file
grep -n "input_data\[0\]->ptr" ./cpp/src/tsvd/tsvd_mg.cu -B 3 -A 3

Repository: rapidsai/cuml

Length of output: 408


🏁 Script executed:

# Understand PartDescriptor to see how partitions relate to global dimensions
rg -n "class PartDescriptor" --type cpp -A 20 | head -40

Repository: rapidsai/cuml

Length of output: 39


🏁 Script executed:

# Check the fit_impl signature in TSVD to understand what prms.n_rows represents
sed -n '1,80p' ./cpp/src/tsvd/tsvd_mg.cu | cat -n

Repository: rapidsai/cuml

Length of output: 3556


🏁 Script executed:

# Find PartDescriptor definition
find . -type f -name "*.hpp" -o -name "*.h" | xargs grep -l "class PartDescriptor" 2>/dev/null | head -3

Repository: rapidsai/cuml

Length of output: 39


🏁 Script executed:

# Look at PartDescriptor definition
rg -n "class PartDescriptor" -A 30 | head -60

Repository: rapidsai/cuml

Length of output: 497


🏁 Script executed:

# Understand how input_data is populated - look at the caller side
sed -n '280,290p' ./cpp/src/pca/pca_mg.cu | cat -n

Repository: rapidsai/cuml

Length of output: 730


🏁 Script executed:

# Check what local_size or partition size would be from PartDescriptor
rg -n "get_*_size\|local.*row\|partition" ./cpp/src/pca/pca_mg.cu -i | head -20

Repository: rapidsai/cuml

Length of output: 39


🏁 Script executed:

# Look at the fit_impl entry point to understand input_data structure
sed -n '250,295p' ./cpp/src/pca/pca_mg.cu | cat -n

Repository: rapidsai/cuml

Length of output: 2337


🏁 Script executed:

# Check PartDescriptor member functions to get local size
rg -n "PartDescriptor::" -A 3 --type cpp | grep -E "get_|local|size" | head -20

Repository: rapidsai/cuml

Length of output: 39


🏁 Script executed:

# Look at sign_flip_components_u to see how MG is handled correctly
sed -n '1,50p' ./cpp/src/pca/sign_flip_mg.cu | cat -n

Repository: rapidsai/cuml

Length of output: 1778


🏁 Script executed:

# Look at how other parts of fit_impl use input_data partitions correctly
sed -n '49,100p' ./cpp/src/pca/pca_mg.cu | cat -n

Repository: rapidsai/cuml

Length of output: 2752


🏁 Script executed:

# Check RankSizePair to understand local partition size tracking
rg -n "RankSizePair" --type cpp -B 2 -A 5 | head -40

Repository: rapidsai/cuml

Length of output: 2834


🏁 Script executed:

# Let me check if input_data[0] is actually a partition by looking at how it's used in the cov function
rg -n "opg::cov\|opg::mean" ./cpp/src/pca/pca_mg.cu -B 2 -A 2

Repository: rapidsai/cuml

Length of output: 39


🏁 Script executed:

# Check how sign_flip_components_u properly handles partitions with input_desc
sed -n '100,200p' ./cpp/src/pca/sign_flip_mg.cu | cat -n

Repository: rapidsai/cuml

Length of output: 5097


🏁 Script executed:

# Verify partition sizes by checking RankSizePair usage
rg -n "get_parts\|n_rows\|partition" ./cpp/src/pca/sign_flip_mg.cu | head -30

Repository: rapidsai/cuml

Length of output: 39


🏁 Script executed:

# Let me look at the actual sign_flip_components function signature to understand what it expects
rg -n "void sign_flip_components" --type cpp -A 5 | head -30

Repository: rapidsai/cuml

Length of output: 1395


🏁 Script executed:

# Look at more of sign_flip_components_u_imp to confirm partition-aware usage
sed -n '49,150p' ./cpp/src/pca/sign_flip_mg.cu | cat -n

Repository: rapidsai/cuml

Length of output: 4459


🏁 Script executed:

# Check if raft::linalg::sign_flip_components is the single-GPU variant
find . -type f \( -name "*.cuh" -o -name "*.hpp" \) -path "*/raft/*" | xargs grep -l "sign_flip_components" 2>/dev/null | head -3

Repository: rapidsai/cuml

Length of output: 39


🏁 Script executed:

# Verify the RAFT sign_flip_components signature
rg -n "sign_flip_components\(" --type cpp --type cuh -A 3 | grep -A 3 "raft::linalg::sign_flip_components" | head -20

Repository: rapidsai/cuml

Length of output: 84


🏁 Script executed:

# Confirm the issue: local_blocks from input_desc would give correct local size
sed -n '63,90p' ./cpp/src/pca/sign_flip_mg.cu | cat -n

Repository: rapidsai/cuml

Length of output: 995


Don't use the single-GPU sign-flip helper on partitioned multi-GPU data.

Lines 108–115 pass input_data[0]->ptr (a single partition) with global matrix dimensions (prms.n_rows, prms.n_cols) to raft::linalg::sign_flip_components(). This causes out-of-bounds reads or sign selection based only on the first partition instead of all distributed data. The MG-aware variant, sign_flip_components_u(), correctly iterates partitions and uses input_desc.blocksOwnedBy(rank) with each partition's local size (see sign_flip_mg.cu:63–87). Use the partition-aware path or handle all partitions in a loop with local dimensions from input_desc.

Identical issue exists in tsvd_mg.cu:97–103.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/pca/pca_mg.cu` around lines 108 - 115, The code incorrectly calls the
single-GPU helper sign_flip_components with input_data[0]->ptr and global
dimensions (prms.n_rows, prms.n_cols), causing OOB reads and wrong sign
selection for partitioned multi-GPU inputs; replace this with the MG-aware path
by calling sign_flip_components_u (or iterate over
input_desc.blocksOwnedBy(rank) and call the sign-flip logic per-partition using
each partition's local row/col sizes) and pass the correct local device matrix
views for each partition and the components buffer; apply the same fix in
tsvd_mg.cu where sign_flip_components is used.

Comment on lines +97 to +104
raft::linalg::sign_flip_components(
handle_stream_zero,
raft::make_device_matrix_view<T, std::size_t, raft::col_major>(
input_data[0]->ptr, prms.n_rows, prms.n_cols),
raft::make_device_matrix_view<T, std::size_t, raft::col_major>(
components, prms.n_components, prms.n_cols),
false,
false);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "Inspect how MNMG inputs are partitioned"
rg -n -C2 'blocksOwnedBy|partsToRanks|RankSizePair' cpp

echo
echo "Inspect the new sign-flip call sites"
rg -n -C3 'input_data\(input, input \+ n_parts\)|sign_flip_components\(|input_data\[0\]->ptr, prms\.n_rows, prms\.n_cols' \
  cpp/src/tsvd/tsvd_mg.cu cpp/src/pca/pca_mg.cu

Repository: rapidsai/cuml

Length of output: 50369


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "=== Lines 97-104 and context in tsvd_mg.cu ==="
sed -n '85,115p' cpp/src/tsvd/tsvd_mg.cu

echo
echo "=== How input_data is assembled in fit_impl ==="
sed -n '119,145p' cpp/src/tsvd/tsvd_mg.cu

echo
echo "=== Check how sign_flip_components is used in PCA (for comparison) ==="
rg -n -A5 'raft::linalg::sign_flip_components|sign_flip_mg' cpp/src/pca/pca_mg.cu | head -40

Repository: rapidsai/cuml

Length of output: 3051


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "=== PCA fit_impl around sign_flip call ==="
sed -n '80,120p' cpp/src/pca/pca_mg.cu

echo
echo "=== Check if there's a partition-aware sign flip function in sign_flip_mg ==="
head -100 cpp/src/pca/sign_flip_mg.cu | grep -n 'namespace\|void\|template'

Repository: rapidsai/cuml

Length of output: 2034


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "=== Check paramsTSVDMG structure to see what n_rows represents ==="
rg -n 'struct paramsTSVDMG|class paramsTSVDMG' cpp/src/tsvd/ -A 10

echo
echo "=== Check how prms is initialized in the public fit function ==="
sed -n '410,445p' cpp/src/tsvd/tsvd_mg.cu

Repository: rapidsai/cuml

Length of output: 123


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "=== Search for paramsTSVDMG definition ==="
rg -n 'paramsTSVDMG' cpp/src/tsvd/tsvd_mg.cu | head -20

echo
echo "=== Check the include files in tsvd_mg.cu ==="
head -50 cpp/src/tsvd/tsvd_mg.cu | grep '#include'

echo
echo "=== Search in header files for paramsTSVDMG ==="
find cpp -name "*.h" -o -name "*.hpp" | xargs grep -l 'paramsTSVDMG' 2>/dev/null

Repository: rapidsai/cuml

Length of output: 1482


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "=== Look at paramsTSVDMG definition in params.hpp ==="
grep -A 20 'struct paramsTSVDMG' cpp/include/cuml/decomposition/params.hpp

echo
echo "=== Look at the public fit wrapper around line 440 ==="
sed -n '410,460p' cpp/src/tsvd/tsvd_mg.cu

echo
echo "=== Check how input_data[0]->totalSize relates to local partition size ==="
rg -n 'totalSize' cpp/src/tsvd/tsvd_mg.cu | head -5

Repository: rapidsai/cuml

Length of output: 110


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "=== Search for paramsTSVDMG in params.hpp ==="
cat cpp/include/cuml/decomposition/params.hpp | grep -A 20 'TSVD'

echo
echo "=== Look at the public fit function around line 440 ==="
sed -n '410,460p' cpp/src/tsvd/tsvd_mg.cu

echo
echo "=== Check what input_data[0] contains by examining Data structure ==="
rg -n 'struct Data|template.*Data' cpp/src_prims/opg/matrix/data.h -A 5 | head -30

Repository: rapidsai/cuml

Length of output: 3259


🏁 Script executed:

#!/bin/bash
set -euo zipefail

echo "=== Search for where prms.n_rows and prms.n_cols are set ==="
rg -n 'n_rows\s*=' cpp/src/tsvd/tsvd_mg.cu | head -10

echo
echo "=== Look at how fit_transform initializes prms ==="
sed -n '460,520p' cpp/src/tsvd/tsvd_mg.cu

echo
echo "=== Check the public API entry point ==="
grep -n 'void fit(' cpp/include/cuml/decomposition/tsvd_mg.hpp | head -3
</parameter>

Repository: rapidsai/cuml

Length of output: 110


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "=== Search for where prms.n_rows and prms.n_cols are set ==="
rg -n 'prms\.n_rows\s*=' cpp/src/tsvd/tsvd_mg.cu

echo
echo "=== Look at the public API wrapper around line 414 ==="
sed -n '414,435p' cpp/src/tsvd/tsvd_mg.cu

echo
echo "=== Check if there's a rank-size wrapper that assembles the params ==="
rg -n 'void fit\(' cpp/include/cuml/decomposition/tsvd_mg.hpp -A 8 | head -40

Repository: rapidsai/cuml

Length of output: 117


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "=== Check the tsvd_mg.hpp header for the public API ==="
cat cpp/include/cuml/decomposition/tsvd_mg.hpp

echo
echo "=== Look at full fit_impl signature ==="
sed -n '119,150p' cpp/src/tsvd/tsvd_mg.cu

Repository: rapidsai/cuml

Length of output: 7509


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "=== Check how the wrapper fit() function sets up prms ==="
sed -n '1,50p' cpp/src/tsvd/tsvd_mg.cu | grep -A 30 'namespace'

echo
echo "=== Look at actual test or usage to see what prms.n_rows contains ==="
rg -n 'paramsTSVDMG.*n_rows' cpp/tests/mg/ -B 2 -A 2 | head -40

echo
echo "=== Verify if PCA has same issue ==="
sed -n '100,125p' cpp/src/pca/pca_mg.cu

Repository: rapidsai/cuml

Length of output: 1077


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "=== Check what sign_flip_components_u does (the correct partition-aware path) ==="
head -50 cpp/src/pca/sign_flip_mg.hpp

echo
echo "=== Look at how input_data[0] is actually allocated/sized ==="
rg -n 'Data<T>\*.*new\|input_data\[0\]' cpp/src/tsvd/tsvd_mg.cu | head -10

echo
echo "=== Check if there's documentation on what prms.n_rows means in MNMG context ==="
rg -n 'prms.n_rows|n_rows.*total\|global' cpp/src/tsvd/tsvd_mg.cu cpp/src/pca/pca_mg.cu | head -15

Repository: rapidsai/cuml

Length of output: 243


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "=== Find sign_flip header ==="
find cpp -name "*sign_flip*" -type f

echo
echo "=== Check the include path used in pca_mg.cu ==="
grep 'sign_flip' cpp/src/pca/pca_mg.cu | head -3

echo
echo "=== Look at the signature of sign_flip_components_u ==="
rg -n 'sign_flip_components_u' cpp --type hpp -A 5 | head -30

Repository: rapidsai/cuml

Length of output: 436


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "=== Check sign_flip_mg.hpp for sign_flip_components_u ==="
cat cpp/include/cuml/decomposition/sign_flip_mg.hpp

echo
echo "=== Compare the two code paths in fit_impl ==="
sed -n '85,110p' cpp/src/tsvd/tsvd_mg.cu

Repository: rapidsai/cuml

Length of output: 4502


Use the partition-aware sign-flip path (sign_flip_components_u) for all MNMG sign flipping.

The else branch wraps only input_data[0]->ptr (one local partition) as a prms.n_rows × prms.n_cols matrix. In MNMG, prms.n_rows is the global row count across all ranks; input_data[0] holds only the rows owned by this rank. This causes out-of-bounds device reads and incorrect sign flips from a single partition. The true branch correctly uses sign_flip_components_u with the full input_data vector and input_desc. Apply the same partition-aware approach to the false branch, or remove the single-partition path entirely.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/tsvd/tsvd_mg.cu` around lines 97 - 104, The current else branch uses
raft::linalg::sign_flip_components with a wrapped single-partition matrix
(input_data[0]->ptr) which assumes prms.n_rows is local and causes OOB reads in
MNMG; replace that call with the partition-aware API by invoking
raft::linalg::sign_flip_components_u with the same parameters used in the true
branch (pass handle_stream_zero, the full input_data vector and input_desc,
components, prms.n_components, prms.n_cols, and the same booleans), or remove
the single-partition path entirely so MNMG always uses sign_flip_components_u to
ensure per-rank partitioning is respected.

Copy link
Copy Markdown
Member

@jcrist jcrist left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall LGTM, just a small comment on maintainability.

raft::linalg::paramsTSVD raft_prms;
raft_prms.algorithm = prms.algorithm;
raft_prms.tol = prms.tol;
raft_prms.n_iterations = prms.n_iterations;
Copy link
Copy Markdown
Member

@jcrist jcrist Mar 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a way we could keep this translation at a single common location and not copy-pasted here to avoid missing parameters if new ones are added? There's a to_raft_params defined in tsvd.cuh, perhaps that could be reused (or moved to a method or 🤷).

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed in 5b23bb9

raft_prms.tol = prms.tol;
raft_prms.n_iterations = prms.n_iterations;
raft_prms.whiten = prms.whiten;
raft_prms.copy = prms.copy;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here - if we could extract the parameter translation out to a single common function/method I'd feel a lot better.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed in 5b23bb9

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (2)
cpp/src/pca/pca.cu (1)

14-40: Consider wrapping _impl functions in an anonymous namespace for consistency.

The pca_fit_impl and other _impl templates are defined at namespace scope in ML, while cpp/src/tsvd/tsvd.cu uses an anonymous namespace for similar helpers. Using an anonymous namespace limits internal linkage and prevents symbol leakage.

♻️ Proposed change for consistency with tsvd.cu
 namespace ML {

+namespace {
+
 template <typename math_t>
 void pca_fit_impl(const raft::handle_t& handle,
 ...
+
+}  // anonymous namespace
+
 void pcaFit(const raft::handle_t& handle,
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/pca/pca.cu` around lines 14 - 40, The pca_fit_impl template (and
other *_impl helpers in this file) are currently at ML namespace scope and
should be given internal linkage like tsvd.cu helpers; wrap pca_fit_impl (and
sibling *_impl functions) inside an anonymous namespace to prevent symbol
leakage and keep linkage consistent. Locate the pca_fit_impl definition and move
it (and any other implementation-only templates in this translation unit) into
an unnamed namespace block (namespace { ... }) while leaving public API
declarations in ML; ensure no external references rely on external linkage to
these symbols.
cpp/include/cuml/decomposition/params.hpp (1)

68-88: Missing Doxygen documentation for public conversion helpers.

The to_raft_params template functions are defined in a public header but lack Doxygen documentation. Per coding guidelines, all public C++ API functions in cpp/include/cuml/ should include complete Doxygen documentation.

📝 Proposed documentation
+/**
+ * `@brief` Convert cuML TSVD parameters to RAFT TSVD parameters.
+ * `@tparam` enum_solver The solver enum type (solver or mg_solver).
+ * `@param` ml_prms The cuML TSVD parameters.
+ * `@return` raft::linalg::paramsTSVD with fields copied from ml_prms.
+ */
 template <typename enum_solver>
 inline raft::linalg::paramsTSVD to_raft_params(const paramsTSVDTemplate<enum_solver>& ml_prms)
 {
   ...
 }

+/**
+ * `@brief` Convert cuML PCA parameters to RAFT PCA parameters.
+ * `@tparam` enum_solver The solver enum type (solver or mg_solver).
+ * `@param` ml_prms The cuML PCA parameters.
+ * `@return` raft::linalg::paramsPCA with fields copied from ml_prms.
+ */
 template <typename enum_solver>
 inline raft::linalg::paramsPCA to_raft_params(const paramsPCATemplate<enum_solver>& ml_prms)
 {
   ...
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/include/cuml/decomposition/params.hpp` around lines 68 - 88, Add Doxygen
comments to the two public conversion helpers so they are documented in the
public API: document the template function to_raft_params(const
paramsTSVDTemplate<enum_solver>&) that returns raft::linalg::paramsTSVD and the
template function to_raft_params(const paramsPCATemplate<enum_solver>&) that
returns raft::linalg::paramsPCA; each comment should describe the purpose,
explain template parameter enum_solver, describe the ml_prms parameter and which
fields are mapped (tol, n_iterations, algorithm, and for PCA: copy, whiten), and
state the return value; place the comments immediately above the corresponding
function declarations (the two to_raft_params overloads) in the header.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@cpp/include/cuml/decomposition/params.hpp`:
- Around line 14-15: Add Doxygen comments for the two inline conversion helpers
named to_raft_params so public header documentation covers them: document the
template that converts paramsTSVDTemplate<enum_solver> to
raft::linalg::paramsTSVD (describe purpose, `@param`[in] ml_prms, and `@return` RAFT
TSVD params) and the template that converts paramsSVDTemplate<enum_solver> to
raft::linalg::paramsSVD with the same tags; place the comments immediately above
each to_raft_params function declaration and reference the corresponding input
types paramsTSVDTemplate and paramsSVDTemplate and the return types
raft::linalg::paramsTSVD and raft::linalg::paramsSVD.

---

Nitpick comments:
In `@cpp/include/cuml/decomposition/params.hpp`:
- Around line 68-88: Add Doxygen comments to the two public conversion helpers
so they are documented in the public API: document the template function
to_raft_params(const paramsTSVDTemplate<enum_solver>&) that returns
raft::linalg::paramsTSVD and the template function to_raft_params(const
paramsPCATemplate<enum_solver>&) that returns raft::linalg::paramsPCA; each
comment should describe the purpose, explain template parameter enum_solver,
describe the ml_prms parameter and which fields are mapped (tol, n_iterations,
algorithm, and for PCA: copy, whiten), and state the return value; place the
comments immediately above the corresponding function declarations (the two
to_raft_params overloads) in the header.

In `@cpp/src/pca/pca.cu`:
- Around line 14-40: The pca_fit_impl template (and other *_impl helpers in this
file) are currently at ML namespace scope and should be given internal linkage
like tsvd.cu helpers; wrap pca_fit_impl (and sibling *_impl functions) inside an
anonymous namespace to prevent symbol leakage and keep linkage consistent.
Locate the pca_fit_impl definition and move it (and any other
implementation-only templates in this translation unit) into an unnamed
namespace block (namespace { ... }) while leaving public API declarations in ML;
ensure no external references rely on external linkage to these symbols.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 8c0595e2-a7b1-4c90-a1e2-ccf1f6c25c68

📥 Commits

Reviewing files that changed from the base of the PR and between dea771d and ee628fd.

📒 Files selected for processing (5)
  • cpp/include/cuml/decomposition/params.hpp
  • cpp/src/pca/pca.cu
  • cpp/src/pca/pca_mg.cu
  • cpp/src/tsvd/tsvd.cu
  • cpp/src/tsvd/tsvd_mg.cu
✅ Files skipped from review due to trivial changes (1)
  • cpp/src/tsvd/tsvd_mg.cu
🚧 Files skipped from review as they are similar to previous changes (1)
  • cpp/src/pca/pca_mg.cu

@jcrist
Copy link
Copy Markdown
Member

jcrist commented Mar 20, 2026

/merge

@rapids-bot rapids-bot bot merged commit eb56525 into rapidsai:release/26.04 Mar 20, 2026
101 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

CUDA/C++ improvement Improvement / enhancement to an existing function non-breaking Non-breaking change

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Use PCA and TSVD from Raft

3 participants