Use PCA and TSVD from raft#7802
Conversation
|
Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually. Contributors can view more details about this message here. |
Required for rapidsai/cuvs#1207 and rapidsai/cuml#7802. This PR moves `pca.cuh`, `tsvd.cuh`, and gtests into raft. Resolves #2978 Authors: - Anupam (https://github.com/aamijar) Approvers: - Jinsol Park (https://github.com/jinsolp) - Divye Gala (https://github.com/divyegala) URL: #2952
📝 WalkthroughSummary by CodeRabbitRelease Notes
WalkthroughRefactors PCA and TSVD to call raft::linalg implementations, replaces local solver enums with raft::linalg::solver aliases, updates public APIs to accept const raft::handle_t&, adds conversion helpers to raft params, removes legacy internal CUDA headers, and updates call sites and tests to the new signatures. Changes
Estimated code review effort🎯 4 (Complex) | ⏱️ ~75 minutes Possibly related PRs
Suggested reviewers
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment Tip You can disable the changed files summary in the walkthrough.Disable the |
There was a problem hiding this comment.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
cpp/include/cuml/decomposition/pca.hpp (1)
16-57:⚠️ Potential issue | 🔴 CriticalProvide backward compatibility for
pcaFitandpcaFitTransformAPI changes.The expanded function signatures for
pcaFitandpcaFitTransform(adding 5 new output parameters:explained_var,explained_var_ratio,singular_vals,mu,noise_vars) break the public C++ API without a deprecation path. Per coding guidelines for public headers incpp/include/cuml/, breaking changes must include deprecation warnings and a migration guide.Keep the old signatures with
[[deprecated]]attributes and create overloads or wrapper functions that internally call the new implementations with default/computed values for the new parameters.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/include/cuml/decomposition/pca.hpp` around lines 16 - 57, Add backward-compatible deprecated overloads for the old pcaFit and pcaFitTransform signatures that call the new expanded APIs: create [[deprecated]] versions of pcaFit(const raft::handle_t&, float* input, float* components, const paramsPCA& prms, bool flip_signs_based_on_U) and its double variant, and similarly for pcaFitTransform(old signatures). In each deprecated overload, allocate or pass nullptrs/default buffers for the new outputs (explained_var, explained_var_ratio, singular_vals, mu, noise_vars) and call the new implementations (pcaFit and pcaFitTransform with the expanded parameter lists), ensuring the deprecated functions forward to the new ones so existing callers keep working; also add a brief comment directing users to the new API and migration steps.cpp/include/cuml/decomposition/tsvd.hpp (1)
48-65:⚠️ Potential issue | 🔴 CriticalAll internal call sites properly updated with new parameters.
The expanded
tsvdFitTransformsignature withexplained_varandexplained_var_ratioparameters has been consistently applied across all internal usages:
- Cython extern declarations and fit_transform calls (tsvd.pyx)
- C++ test file allocations and calls (tsvd_test.cu)
- Python fit_transform implementation properly allocates and passes these parameters
However, this constitutes a breaking change to the public C++ API in
cpp/include/cuml/decomposition/tsvd.hpp. Per the coding guidelines, breaking changes to public headers require deprecation warnings. No deprecation mechanism is currently in place for the old signature. Consider adding a deprecated overload or compiler warning to ease migration for external consumers.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/include/cuml/decomposition/tsvd.hpp` around lines 48 - 65, The public API changed for tsvdFitTransform by adding explained_var and explained_var_ratio parameters; add a deprecated overload that preserves the old signature and forwards to the new implementation (e.g., an overload of tsvdFitTransform(float* ... , const paramsTSVD&, bool) and the double version) so external consumers continue to compile, and annotate those forwarding overloads with a deprecation attribute (e.g., [[deprecated]] or compiler-specific macro) to generate warnings; ensure the deprecated overloads call the new tsvdFitTransform variants, passing nullptr (or sensible defaults) for explained_var and explained_var_ratio to maintain behavior.
🧹 Nitpick comments (3)
cpp/include/cuml/decomposition/pca.hpp (2)
87-87: Remove unnecessary semicolon after namespace closing brace.The semicolon after the closing brace is not required for namespace declarations.
Proposed fix
-}; // end namespace ML +} // end namespace ML🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/include/cuml/decomposition/pca.hpp` at line 87, Remove the stray semicolon following the closing brace of namespace ML: locate the namespace closing token "namespace ML { ... }" (the closing brace for namespace ML) and delete the trailing semicolon so the namespace ends with only the closing brace.
16-85: Consider adding minimal Doxygen documentation for maintainability.Similar to the TSVD header, brief documentation would clarify parameter semantics, especially for the multiple output arrays in
pcaFitandpcaFitTransform.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/include/cuml/decomposition/pca.hpp` around lines 16 - 85, Add minimal Doxygen comments above each PCA API declaration (pcaFit, pcaFitTransform, pcaTransform, pcaInverseTransform for both float and double overloads) explaining parameter semantics: expected input layout and shapes (rows/cols), which arrays are outputs (components, explained_var, explained_var_ratio, singular_vals, mu, noise_vars, trans_input, input), ownership/allocator expectations (caller-allocated), and meaning of flags like flip_signs_based_on_U and paramsPCA; keep comments brief (one or two lines per param) similar to the TSVD header style to improve maintainability and clarity.cpp/include/cuml/decomposition/tsvd.hpp (1)
16-65: Consider adding minimal Doxygen documentation.While these headers serve as internal implementation details for the Python API, adding brief Doxygen comments would help future maintainers understand:
- Parameter semantics (input vs output pointers)
- Memory ownership expectations
- The purpose of
flip_signs_based_on_U📝 Example documentation pattern
/** * `@brief` Fit TSVD model to input data. * `@param`[in] handle RAFT handle for resource management * `@param`[in,out] input Input matrix (may be modified) * `@param`[out] components Principal components (n_components x n_cols) * `@param`[out] singular_vals Singular values (n_components) * `@param`[in] prms TSVD parameters * `@param`[in] flip_signs_based_on_U Whether to flip signs for deterministic output */ void tsvdFit(const raft::handle_t& handle, ...);🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/include/cuml/decomposition/tsvd.hpp` around lines 16 - 65, Add minimal Doxygen-style comments to each declaration (e.g., tsvdFit, tsvdTransform, tsvdInverseTransform, tsvdFitTransform) describing parameter semantics (which pointers are input vs output vs in/out), memory ownership expectations (caller-allocated vs callee-allocated), the role of the RAFT handle, and the purpose/behavior of flip_signs_based_on_U (deterministic sign flipping based on U); keep each comment short (one-liner brief description + `@param` tags for handle, input/trans_input, components, singular_vals, explained_var(s), prms, and flip_signs_based_on_U).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@cpp/src/pca/pca_mg.cu`:
- Around line 108-115: The code incorrectly calls the single-GPU helper
sign_flip_components with input_data[0]->ptr and global dimensions (prms.n_rows,
prms.n_cols), causing OOB reads and wrong sign selection for partitioned
multi-GPU inputs; replace this with the MG-aware path by calling
sign_flip_components_u (or iterate over input_desc.blocksOwnedBy(rank) and call
the sign-flip logic per-partition using each partition's local row/col sizes)
and pass the correct local device matrix views for each partition and the
components buffer; apply the same fix in tsvd_mg.cu where sign_flip_components
is used.
In `@cpp/src/tsvd/tsvd_mg.cu`:
- Around line 97-104: The current else branch uses
raft::linalg::sign_flip_components with a wrapped single-partition matrix
(input_data[0]->ptr) which assumes prms.n_rows is local and causes OOB reads in
MNMG; replace that call with the partition-aware API by invoking
raft::linalg::sign_flip_components_u with the same parameters used in the true
branch (pass handle_stream_zero, the full input_data vector and input_desc,
components, prms.n_components, prms.n_cols, and the same booleans), or remove
the single-partition path entirely so MNMG always uses sign_flip_components_u to
ensure per-rank partitioning is respected.
---
Outside diff comments:
In `@cpp/include/cuml/decomposition/pca.hpp`:
- Around line 16-57: Add backward-compatible deprecated overloads for the old
pcaFit and pcaFitTransform signatures that call the new expanded APIs: create
[[deprecated]] versions of pcaFit(const raft::handle_t&, float* input, float*
components, const paramsPCA& prms, bool flip_signs_based_on_U) and its double
variant, and similarly for pcaFitTransform(old signatures). In each deprecated
overload, allocate or pass nullptrs/default buffers for the new outputs
(explained_var, explained_var_ratio, singular_vals, mu, noise_vars) and call the
new implementations (pcaFit and pcaFitTransform with the expanded parameter
lists), ensuring the deprecated functions forward to the new ones so existing
callers keep working; also add a brief comment directing users to the new API
and migration steps.
In `@cpp/include/cuml/decomposition/tsvd.hpp`:
- Around line 48-65: The public API changed for tsvdFitTransform by adding
explained_var and explained_var_ratio parameters; add a deprecated overload that
preserves the old signature and forwards to the new implementation (e.g., an
overload of tsvdFitTransform(float* ... , const paramsTSVD&, bool) and the
double version) so external consumers continue to compile, and annotate those
forwarding overloads with a deprecation attribute (e.g., [[deprecated]] or
compiler-specific macro) to generate warnings; ensure the deprecated overloads
call the new tsvdFitTransform variants, passing nullptr (or sensible defaults)
for explained_var and explained_var_ratio to maintain behavior.
---
Nitpick comments:
In `@cpp/include/cuml/decomposition/pca.hpp`:
- Line 87: Remove the stray semicolon following the closing brace of namespace
ML: locate the namespace closing token "namespace ML { ... }" (the closing brace
for namespace ML) and delete the trailing semicolon so the namespace ends with
only the closing brace.
- Around line 16-85: Add minimal Doxygen comments above each PCA API declaration
(pcaFit, pcaFitTransform, pcaTransform, pcaInverseTransform for both float and
double overloads) explaining parameter semantics: expected input layout and
shapes (rows/cols), which arrays are outputs (components, explained_var,
explained_var_ratio, singular_vals, mu, noise_vars, trans_input, input),
ownership/allocator expectations (caller-allocated), and meaning of flags like
flip_signs_based_on_U and paramsPCA; keep comments brief (one or two lines per
param) similar to the TSVD header style to improve maintainability and clarity.
In `@cpp/include/cuml/decomposition/tsvd.hpp`:
- Around line 16-65: Add minimal Doxygen-style comments to each declaration
(e.g., tsvdFit, tsvdTransform, tsvdInverseTransform, tsvdFitTransform)
describing parameter semantics (which pointers are input vs output vs in/out),
memory ownership expectations (caller-allocated vs callee-allocated), the role
of the RAFT handle, and the purpose/behavior of flip_signs_based_on_U
(deterministic sign flipping based on U); keep each comment short (one-liner
brief description + `@param` tags for handle, input/trans_input, components,
singular_vals, explained_var(s), prms, and flip_signs_based_on_U).
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: de869307-e4ac-4987-aef7-b539040a82ad
📒 Files selected for processing (12)
cpp/include/cuml/decomposition/params.hppcpp/include/cuml/decomposition/pca.hppcpp/include/cuml/decomposition/tsvd.hppcpp/src/pca/pca.cucpp/src/pca/pca.cuhcpp/src/pca/pca_mg.cucpp/src/tsne/tsne_runner.cuhcpp/src/tsvd/tsvd.cucpp/src/tsvd/tsvd.cuhcpp/src/tsvd/tsvd_mg.cucpp/tests/sg/pca_test.cucpp/tests/sg/tsvd_test.cu
💤 Files with no reviewable changes (2)
- cpp/src/tsvd/tsvd.cuh
- cpp/src/pca/pca.cuh
| raft::linalg::sign_flip_components( | ||
| handle_stream_zero, | ||
| raft::make_device_matrix_view<T, std::size_t, raft::col_major>( | ||
| input_data[0]->ptr, prms.n_rows, prms.n_cols), | ||
| raft::make_device_matrix_view<T, std::size_t, raft::col_major>( | ||
| components, prms.n_components, prms.n_cols), | ||
| true, | ||
| false); |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
# First, understand the file structure and locate the exact code
find . -type f -name "pca_mg.cu" | head -5Repository: rapidsai/cuml
Length of output: 80
🏁 Script executed:
# Read the file around lines 108-115 with sufficient context
wc -l ./cpp/src/pca/pca_mg.cuRepository: rapidsai/cuml
Length of output: 84
🏁 Script executed:
# Get the code context around the flagged lines
sed -n '100,125p' ./cpp/src/pca/pca_mg.cu | cat -nRepository: rapidsai/cuml
Length of output: 1188
🏁 Script executed:
# Get broader context to understand input_data and prms structure
sed -n '1,120p' ./cpp/src/pca/pca_mg.cu | cat -nRepository: rapidsai/cuml
Length of output: 5309
🏁 Script executed:
# Find TSVD MG code that was mentioned as having identical issue
find . -type f -name "*tsvd*mg*.cu" -o -name "*tsvd*mg*.cuh" 2>/dev/null | head -5Repository: rapidsai/cuml
Length of output: 82
🏁 Script executed:
# Look at the sign_flip_components_u function (MG-aware variant used when flip_signs_based_on_U==true)
rg -n "sign_flip_components_u" --type cpp --type cuda -B 3 -A 10 | head -50Repository: rapidsai/cuml
Length of output: 85
🏁 Script executed:
# Search for PartDescriptor usage to understand data partitioning
rg -n "PartDescriptor" cpp/src/pca/pca_mg.cu -A 2 -B 2Repository: rapidsai/cuml
Length of output: 2766
🏁 Script executed:
# Check what input_data[0] represents and its relationship to prms.n_rows
grep -n "input_data\[0\]" cpp/src/pca/pca_mg.cuRepository: rapidsai/cuml
Length of output: 115
🏁 Script executed:
# Read TSVD MG around sign_flip_components calls
grep -n "sign_flip_components" ./cpp/src/tsvd/tsvd_mg.cu -B 5 -A 5Repository: rapidsai/cuml
Length of output: 1215
🏁 Script executed:
# Search for input_data[0]->ptr in TSVD MG file
grep -n "input_data\[0\]->ptr" ./cpp/src/tsvd/tsvd_mg.cu -B 3 -A 3Repository: rapidsai/cuml
Length of output: 408
🏁 Script executed:
# Understand PartDescriptor to see how partitions relate to global dimensions
rg -n "class PartDescriptor" --type cpp -A 20 | head -40Repository: rapidsai/cuml
Length of output: 39
🏁 Script executed:
# Check the fit_impl signature in TSVD to understand what prms.n_rows represents
sed -n '1,80p' ./cpp/src/tsvd/tsvd_mg.cu | cat -nRepository: rapidsai/cuml
Length of output: 3556
🏁 Script executed:
# Find PartDescriptor definition
find . -type f -name "*.hpp" -o -name "*.h" | xargs grep -l "class PartDescriptor" 2>/dev/null | head -3Repository: rapidsai/cuml
Length of output: 39
🏁 Script executed:
# Look at PartDescriptor definition
rg -n "class PartDescriptor" -A 30 | head -60Repository: rapidsai/cuml
Length of output: 497
🏁 Script executed:
# Understand how input_data is populated - look at the caller side
sed -n '280,290p' ./cpp/src/pca/pca_mg.cu | cat -nRepository: rapidsai/cuml
Length of output: 730
🏁 Script executed:
# Check what local_size or partition size would be from PartDescriptor
rg -n "get_*_size\|local.*row\|partition" ./cpp/src/pca/pca_mg.cu -i | head -20Repository: rapidsai/cuml
Length of output: 39
🏁 Script executed:
# Look at the fit_impl entry point to understand input_data structure
sed -n '250,295p' ./cpp/src/pca/pca_mg.cu | cat -nRepository: rapidsai/cuml
Length of output: 2337
🏁 Script executed:
# Check PartDescriptor member functions to get local size
rg -n "PartDescriptor::" -A 3 --type cpp | grep -E "get_|local|size" | head -20Repository: rapidsai/cuml
Length of output: 39
🏁 Script executed:
# Look at sign_flip_components_u to see how MG is handled correctly
sed -n '1,50p' ./cpp/src/pca/sign_flip_mg.cu | cat -nRepository: rapidsai/cuml
Length of output: 1778
🏁 Script executed:
# Look at how other parts of fit_impl use input_data partitions correctly
sed -n '49,100p' ./cpp/src/pca/pca_mg.cu | cat -nRepository: rapidsai/cuml
Length of output: 2752
🏁 Script executed:
# Check RankSizePair to understand local partition size tracking
rg -n "RankSizePair" --type cpp -B 2 -A 5 | head -40Repository: rapidsai/cuml
Length of output: 2834
🏁 Script executed:
# Let me check if input_data[0] is actually a partition by looking at how it's used in the cov function
rg -n "opg::cov\|opg::mean" ./cpp/src/pca/pca_mg.cu -B 2 -A 2Repository: rapidsai/cuml
Length of output: 39
🏁 Script executed:
# Check how sign_flip_components_u properly handles partitions with input_desc
sed -n '100,200p' ./cpp/src/pca/sign_flip_mg.cu | cat -nRepository: rapidsai/cuml
Length of output: 5097
🏁 Script executed:
# Verify partition sizes by checking RankSizePair usage
rg -n "get_parts\|n_rows\|partition" ./cpp/src/pca/sign_flip_mg.cu | head -30Repository: rapidsai/cuml
Length of output: 39
🏁 Script executed:
# Let me look at the actual sign_flip_components function signature to understand what it expects
rg -n "void sign_flip_components" --type cpp -A 5 | head -30Repository: rapidsai/cuml
Length of output: 1395
🏁 Script executed:
# Look at more of sign_flip_components_u_imp to confirm partition-aware usage
sed -n '49,150p' ./cpp/src/pca/sign_flip_mg.cu | cat -nRepository: rapidsai/cuml
Length of output: 4459
🏁 Script executed:
# Check if raft::linalg::sign_flip_components is the single-GPU variant
find . -type f \( -name "*.cuh" -o -name "*.hpp" \) -path "*/raft/*" | xargs grep -l "sign_flip_components" 2>/dev/null | head -3Repository: rapidsai/cuml
Length of output: 39
🏁 Script executed:
# Verify the RAFT sign_flip_components signature
rg -n "sign_flip_components\(" --type cpp --type cuh -A 3 | grep -A 3 "raft::linalg::sign_flip_components" | head -20Repository: rapidsai/cuml
Length of output: 84
🏁 Script executed:
# Confirm the issue: local_blocks from input_desc would give correct local size
sed -n '63,90p' ./cpp/src/pca/sign_flip_mg.cu | cat -nRepository: rapidsai/cuml
Length of output: 995
Don't use the single-GPU sign-flip helper on partitioned multi-GPU data.
Lines 108–115 pass input_data[0]->ptr (a single partition) with global matrix dimensions (prms.n_rows, prms.n_cols) to raft::linalg::sign_flip_components(). This causes out-of-bounds reads or sign selection based only on the first partition instead of all distributed data. The MG-aware variant, sign_flip_components_u(), correctly iterates partitions and uses input_desc.blocksOwnedBy(rank) with each partition's local size (see sign_flip_mg.cu:63–87). Use the partition-aware path or handle all partitions in a loop with local dimensions from input_desc.
Identical issue exists in tsvd_mg.cu:97–103.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@cpp/src/pca/pca_mg.cu` around lines 108 - 115, The code incorrectly calls the
single-GPU helper sign_flip_components with input_data[0]->ptr and global
dimensions (prms.n_rows, prms.n_cols), causing OOB reads and wrong sign
selection for partitioned multi-GPU inputs; replace this with the MG-aware path
by calling sign_flip_components_u (or iterate over
input_desc.blocksOwnedBy(rank) and call the sign-flip logic per-partition using
each partition's local row/col sizes) and pass the correct local device matrix
views for each partition and the components buffer; apply the same fix in
tsvd_mg.cu where sign_flip_components is used.
| raft::linalg::sign_flip_components( | ||
| handle_stream_zero, | ||
| raft::make_device_matrix_view<T, std::size_t, raft::col_major>( | ||
| input_data[0]->ptr, prms.n_rows, prms.n_cols), | ||
| raft::make_device_matrix_view<T, std::size_t, raft::col_major>( | ||
| components, prms.n_components, prms.n_cols), | ||
| false, | ||
| false); |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "Inspect how MNMG inputs are partitioned"
rg -n -C2 'blocksOwnedBy|partsToRanks|RankSizePair' cpp
echo
echo "Inspect the new sign-flip call sites"
rg -n -C3 'input_data\(input, input \+ n_parts\)|sign_flip_components\(|input_data\[0\]->ptr, prms\.n_rows, prms\.n_cols' \
cpp/src/tsvd/tsvd_mg.cu cpp/src/pca/pca_mg.cuRepository: rapidsai/cuml
Length of output: 50369
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "=== Lines 97-104 and context in tsvd_mg.cu ==="
sed -n '85,115p' cpp/src/tsvd/tsvd_mg.cu
echo
echo "=== How input_data is assembled in fit_impl ==="
sed -n '119,145p' cpp/src/tsvd/tsvd_mg.cu
echo
echo "=== Check how sign_flip_components is used in PCA (for comparison) ==="
rg -n -A5 'raft::linalg::sign_flip_components|sign_flip_mg' cpp/src/pca/pca_mg.cu | head -40Repository: rapidsai/cuml
Length of output: 3051
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "=== PCA fit_impl around sign_flip call ==="
sed -n '80,120p' cpp/src/pca/pca_mg.cu
echo
echo "=== Check if there's a partition-aware sign flip function in sign_flip_mg ==="
head -100 cpp/src/pca/sign_flip_mg.cu | grep -n 'namespace\|void\|template'Repository: rapidsai/cuml
Length of output: 2034
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "=== Check paramsTSVDMG structure to see what n_rows represents ==="
rg -n 'struct paramsTSVDMG|class paramsTSVDMG' cpp/src/tsvd/ -A 10
echo
echo "=== Check how prms is initialized in the public fit function ==="
sed -n '410,445p' cpp/src/tsvd/tsvd_mg.cuRepository: rapidsai/cuml
Length of output: 123
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "=== Search for paramsTSVDMG definition ==="
rg -n 'paramsTSVDMG' cpp/src/tsvd/tsvd_mg.cu | head -20
echo
echo "=== Check the include files in tsvd_mg.cu ==="
head -50 cpp/src/tsvd/tsvd_mg.cu | grep '#include'
echo
echo "=== Search in header files for paramsTSVDMG ==="
find cpp -name "*.h" -o -name "*.hpp" | xargs grep -l 'paramsTSVDMG' 2>/dev/nullRepository: rapidsai/cuml
Length of output: 1482
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "=== Look at paramsTSVDMG definition in params.hpp ==="
grep -A 20 'struct paramsTSVDMG' cpp/include/cuml/decomposition/params.hpp
echo
echo "=== Look at the public fit wrapper around line 440 ==="
sed -n '410,460p' cpp/src/tsvd/tsvd_mg.cu
echo
echo "=== Check how input_data[0]->totalSize relates to local partition size ==="
rg -n 'totalSize' cpp/src/tsvd/tsvd_mg.cu | head -5Repository: rapidsai/cuml
Length of output: 110
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "=== Search for paramsTSVDMG in params.hpp ==="
cat cpp/include/cuml/decomposition/params.hpp | grep -A 20 'TSVD'
echo
echo "=== Look at the public fit function around line 440 ==="
sed -n '410,460p' cpp/src/tsvd/tsvd_mg.cu
echo
echo "=== Check what input_data[0] contains by examining Data structure ==="
rg -n 'struct Data|template.*Data' cpp/src_prims/opg/matrix/data.h -A 5 | head -30Repository: rapidsai/cuml
Length of output: 3259
🏁 Script executed:
#!/bin/bash
set -euo zipefail
echo "=== Search for where prms.n_rows and prms.n_cols are set ==="
rg -n 'n_rows\s*=' cpp/src/tsvd/tsvd_mg.cu | head -10
echo
echo "=== Look at how fit_transform initializes prms ==="
sed -n '460,520p' cpp/src/tsvd/tsvd_mg.cu
echo
echo "=== Check the public API entry point ==="
grep -n 'void fit(' cpp/include/cuml/decomposition/tsvd_mg.hpp | head -3
</parameter>Repository: rapidsai/cuml
Length of output: 110
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "=== Search for where prms.n_rows and prms.n_cols are set ==="
rg -n 'prms\.n_rows\s*=' cpp/src/tsvd/tsvd_mg.cu
echo
echo "=== Look at the public API wrapper around line 414 ==="
sed -n '414,435p' cpp/src/tsvd/tsvd_mg.cu
echo
echo "=== Check if there's a rank-size wrapper that assembles the params ==="
rg -n 'void fit\(' cpp/include/cuml/decomposition/tsvd_mg.hpp -A 8 | head -40Repository: rapidsai/cuml
Length of output: 117
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "=== Check the tsvd_mg.hpp header for the public API ==="
cat cpp/include/cuml/decomposition/tsvd_mg.hpp
echo
echo "=== Look at full fit_impl signature ==="
sed -n '119,150p' cpp/src/tsvd/tsvd_mg.cuRepository: rapidsai/cuml
Length of output: 7509
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "=== Check how the wrapper fit() function sets up prms ==="
sed -n '1,50p' cpp/src/tsvd/tsvd_mg.cu | grep -A 30 'namespace'
echo
echo "=== Look at actual test or usage to see what prms.n_rows contains ==="
rg -n 'paramsTSVDMG.*n_rows' cpp/tests/mg/ -B 2 -A 2 | head -40
echo
echo "=== Verify if PCA has same issue ==="
sed -n '100,125p' cpp/src/pca/pca_mg.cuRepository: rapidsai/cuml
Length of output: 1077
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "=== Check what sign_flip_components_u does (the correct partition-aware path) ==="
head -50 cpp/src/pca/sign_flip_mg.hpp
echo
echo "=== Look at how input_data[0] is actually allocated/sized ==="
rg -n 'Data<T>\*.*new\|input_data\[0\]' cpp/src/tsvd/tsvd_mg.cu | head -10
echo
echo "=== Check if there's documentation on what prms.n_rows means in MNMG context ==="
rg -n 'prms.n_rows|n_rows.*total\|global' cpp/src/tsvd/tsvd_mg.cu cpp/src/pca/pca_mg.cu | head -15Repository: rapidsai/cuml
Length of output: 243
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "=== Find sign_flip header ==="
find cpp -name "*sign_flip*" -type f
echo
echo "=== Check the include path used in pca_mg.cu ==="
grep 'sign_flip' cpp/src/pca/pca_mg.cu | head -3
echo
echo "=== Look at the signature of sign_flip_components_u ==="
rg -n 'sign_flip_components_u' cpp --type hpp -A 5 | head -30Repository: rapidsai/cuml
Length of output: 436
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "=== Check sign_flip_mg.hpp for sign_flip_components_u ==="
cat cpp/include/cuml/decomposition/sign_flip_mg.hpp
echo
echo "=== Compare the two code paths in fit_impl ==="
sed -n '85,110p' cpp/src/tsvd/tsvd_mg.cuRepository: rapidsai/cuml
Length of output: 4502
Use the partition-aware sign-flip path (sign_flip_components_u) for all MNMG sign flipping.
The else branch wraps only input_data[0]->ptr (one local partition) as a prms.n_rows × prms.n_cols matrix. In MNMG, prms.n_rows is the global row count across all ranks; input_data[0] holds only the rows owned by this rank. This causes out-of-bounds device reads and incorrect sign flips from a single partition. The true branch correctly uses sign_flip_components_u with the full input_data vector and input_desc. Apply the same partition-aware approach to the false branch, or remove the single-partition path entirely.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@cpp/src/tsvd/tsvd_mg.cu` around lines 97 - 104, The current else branch uses
raft::linalg::sign_flip_components with a wrapped single-partition matrix
(input_data[0]->ptr) which assumes prms.n_rows is local and causes OOB reads in
MNMG; replace that call with the partition-aware API by invoking
raft::linalg::sign_flip_components_u with the same parameters used in the true
branch (pass handle_stream_zero, the full input_data vector and input_desc,
components, prms.n_components, prms.n_cols, and the same booleans), or remove
the single-partition path entirely so MNMG always uses sign_flip_components_u to
ensure per-rank partitioning is respected.
jcrist
left a comment
There was a problem hiding this comment.
Overall LGTM, just a small comment on maintainability.
cpp/src/tsvd/tsvd_mg.cu
Outdated
| raft::linalg::paramsTSVD raft_prms; | ||
| raft_prms.algorithm = prms.algorithm; | ||
| raft_prms.tol = prms.tol; | ||
| raft_prms.n_iterations = prms.n_iterations; |
There was a problem hiding this comment.
Is there a way we could keep this translation at a single common location and not copy-pasted here to avoid missing parameters if new ones are added? There's a to_raft_params defined in tsvd.cuh, perhaps that could be reused (or moved to a method or 🤷).
cpp/src/pca/pca_mg.cu
Outdated
| raft_prms.tol = prms.tol; | ||
| raft_prms.n_iterations = prms.n_iterations; | ||
| raft_prms.whiten = prms.whiten; | ||
| raft_prms.copy = prms.copy; |
There was a problem hiding this comment.
Same here - if we could extract the parameter translation out to a single common function/method I'd feel a lot better.
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (2)
cpp/src/pca/pca.cu (1)
14-40: Consider wrapping_implfunctions in an anonymous namespace for consistency.The
pca_fit_impland other_impltemplates are defined at namespace scope inML, whilecpp/src/tsvd/tsvd.cuuses an anonymous namespace for similar helpers. Using an anonymous namespace limits internal linkage and prevents symbol leakage.♻️ Proposed change for consistency with tsvd.cu
namespace ML { +namespace { + template <typename math_t> void pca_fit_impl(const raft::handle_t& handle, ... + +} // anonymous namespace + void pcaFit(const raft::handle_t& handle,🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/src/pca/pca.cu` around lines 14 - 40, The pca_fit_impl template (and other *_impl helpers in this file) are currently at ML namespace scope and should be given internal linkage like tsvd.cu helpers; wrap pca_fit_impl (and sibling *_impl functions) inside an anonymous namespace to prevent symbol leakage and keep linkage consistent. Locate the pca_fit_impl definition and move it (and any other implementation-only templates in this translation unit) into an unnamed namespace block (namespace { ... }) while leaving public API declarations in ML; ensure no external references rely on external linkage to these symbols.cpp/include/cuml/decomposition/params.hpp (1)
68-88: Missing Doxygen documentation for public conversion helpers.The
to_raft_paramstemplate functions are defined in a public header but lack Doxygen documentation. Per coding guidelines, all public C++ API functions incpp/include/cuml/should include complete Doxygen documentation.📝 Proposed documentation
+/** + * `@brief` Convert cuML TSVD parameters to RAFT TSVD parameters. + * `@tparam` enum_solver The solver enum type (solver or mg_solver). + * `@param` ml_prms The cuML TSVD parameters. + * `@return` raft::linalg::paramsTSVD with fields copied from ml_prms. + */ template <typename enum_solver> inline raft::linalg::paramsTSVD to_raft_params(const paramsTSVDTemplate<enum_solver>& ml_prms) { ... } +/** + * `@brief` Convert cuML PCA parameters to RAFT PCA parameters. + * `@tparam` enum_solver The solver enum type (solver or mg_solver). + * `@param` ml_prms The cuML PCA parameters. + * `@return` raft::linalg::paramsPCA with fields copied from ml_prms. + */ template <typename enum_solver> inline raft::linalg::paramsPCA to_raft_params(const paramsPCATemplate<enum_solver>& ml_prms) { ... }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/include/cuml/decomposition/params.hpp` around lines 68 - 88, Add Doxygen comments to the two public conversion helpers so they are documented in the public API: document the template function to_raft_params(const paramsTSVDTemplate<enum_solver>&) that returns raft::linalg::paramsTSVD and the template function to_raft_params(const paramsPCATemplate<enum_solver>&) that returns raft::linalg::paramsPCA; each comment should describe the purpose, explain template parameter enum_solver, describe the ml_prms parameter and which fields are mapped (tol, n_iterations, algorithm, and for PCA: copy, whiten), and state the return value; place the comments immediately above the corresponding function declarations (the two to_raft_params overloads) in the header.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@cpp/include/cuml/decomposition/params.hpp`:
- Around line 14-15: Add Doxygen comments for the two inline conversion helpers
named to_raft_params so public header documentation covers them: document the
template that converts paramsTSVDTemplate<enum_solver> to
raft::linalg::paramsTSVD (describe purpose, `@param`[in] ml_prms, and `@return` RAFT
TSVD params) and the template that converts paramsSVDTemplate<enum_solver> to
raft::linalg::paramsSVD with the same tags; place the comments immediately above
each to_raft_params function declaration and reference the corresponding input
types paramsTSVDTemplate and paramsSVDTemplate and the return types
raft::linalg::paramsTSVD and raft::linalg::paramsSVD.
---
Nitpick comments:
In `@cpp/include/cuml/decomposition/params.hpp`:
- Around line 68-88: Add Doxygen comments to the two public conversion helpers
so they are documented in the public API: document the template function
to_raft_params(const paramsTSVDTemplate<enum_solver>&) that returns
raft::linalg::paramsTSVD and the template function to_raft_params(const
paramsPCATemplate<enum_solver>&) that returns raft::linalg::paramsPCA; each
comment should describe the purpose, explain template parameter enum_solver,
describe the ml_prms parameter and which fields are mapped (tol, n_iterations,
algorithm, and for PCA: copy, whiten), and state the return value; place the
comments immediately above the corresponding function declarations (the two
to_raft_params overloads) in the header.
In `@cpp/src/pca/pca.cu`:
- Around line 14-40: The pca_fit_impl template (and other *_impl helpers in this
file) are currently at ML namespace scope and should be given internal linkage
like tsvd.cu helpers; wrap pca_fit_impl (and sibling *_impl functions) inside an
anonymous namespace to prevent symbol leakage and keep linkage consistent.
Locate the pca_fit_impl definition and move it (and any other
implementation-only templates in this translation unit) into an unnamed
namespace block (namespace { ... }) while leaving public API declarations in ML;
ensure no external references rely on external linkage to these symbols.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 8c0595e2-a7b1-4c90-a1e2-ccf1f6c25c68
📒 Files selected for processing (5)
cpp/include/cuml/decomposition/params.hppcpp/src/pca/pca.cucpp/src/pca/pca_mg.cucpp/src/tsvd/tsvd.cucpp/src/tsvd/tsvd_mg.cu
✅ Files skipped from review due to trivial changes (1)
- cpp/src/tsvd/tsvd_mg.cu
🚧 Files skipped from review as they are similar to previous changes (1)
- cpp/src/pca/pca_mg.cu
|
/merge |
Resolves #7801. Depends on rapidsai/raft#2952.
Remove code duplication, since PCA and TSVD are in raft now.