Bug Report: CUDA_ERROR_NO_DEVICE when Binette is executed via Snakemake #93

@hackerzone85

Environment

  • OS: Ubuntu 20.04 LTS
  • GPU: NVIDIA Quadro RTX 4000 (8GB)
  • NVIDIA Driver: 580.126.09
  • Maximum CUDA version supported by the driver: 13.0
  • System CUDA Toolkit: 12.9
  • Binette conda environment: created with mamba create -n binette -c conda-forge -c bioconda binette -y
  • TensorFlow version: 2.17.0
  • CheckM2 version: (installed as binette dependency)

Issue Description

When TensorFlow GPU availability is verified directly within the activated binette conda environment, the GPU is correctly detected:

(binette) $ python -c "import tensorflow as tf; print('TF:', tf.__version__); print('GPUs:', tf.config.list_physical_devices('GPU'))"
TF: 2.17.0
GPUs: [PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]

However, when Binette is executed via Snakemake, TensorFlow fails to detect the GPU:

failed call to cuInit: CUDA_ERROR_NO_DEVICE: no CUDA-capable device is detected
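
To narrow down what differs between the two contexts, the process environment can be captured once in the interactive shell and once inside the Snakemake job, then compared. The sketch below is only a diagnostic aid: the list of variables and the helper names (`gpu_env_snapshot`, `diff_snapshots`) are assumptions chosen for illustration, not something Binette or Snakemake defines.

```python
import json
import os

# Variables that commonly influence whether the CUDA driver and TensorFlow
# can see a GPU. This selection is an assumption made for this report.
GPU_ENV_VARS = (
    "CUDA_VISIBLE_DEVICES",
    "NVIDIA_VISIBLE_DEVICES",
    "LD_LIBRARY_PATH",
    "CONDA_PREFIX",
    "PATH",
)


def gpu_env_snapshot(environ=None):
    """Return the GPU-related variables of an environment (None means unset)."""
    env = os.environ if environ is None else environ
    return {name: env.get(name) for name in GPU_ENV_VARS}


def diff_snapshots(first, second):
    """Return {name: (first_value, second_value)} for variables that differ."""
    names = sorted(set(first) | set(second))
    return {
        name: (first.get(name), second.get(name))
        for name in names
        if first.get(name) != second.get(name)
    }


if __name__ == "__main__":
    # Print the snapshot as JSON so the output from the shell and from the
    # Snakemake job can be saved to files and compared.
    print(json.dumps(gpu_env_snapshot(), indent=2))
```

Running the script in the activated environment and again from within the failing rule, then passing both results to `diff_snapshots`, should show whether a variable such as CUDA_VISIBLE_DEVICES or LD_LIBRARY_PATH changes between the two.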

Snakemake command used

snakemake --use-conda --conda-frontend mamba \
  --snakefile Snakefile_binette \
  --cores 30 \
  --rerun-triggers mtime \
  --keep-going \
  --reason \
  --show-failed-logs \
  --stats ASM_MAGs/Refined_MAGs/Binette/snakemake_stats.json \
  --rerun-incomplete

Full error output when run via Snakemake

2026-02-28 18:48:21.569394: E xla/stream_executor/cuda/cuda_fft.cc:485] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2026-02-28 18:48:21.590951: E xla/stream_executor/cuda/cuda_dnn.cc:8454] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2026-02-28 18:48:21.599839: E xla/stream_executor/cuda/cuda_blas.cc:1452] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2026-02-28 18:48:28.013378: E xla/stream_executor/cuda/cuda_driver.cc:266] failed call to cuInit: CUDA_ERROR_NO_DEVICE: no CUDA-capable device is detected

Additional observations

  • The cuFFT/cuDNN/cuBLAS factory registration errors appear both in direct invocation and via Snakemake, so they are unlikely to explain the difference between the two. They may point to duplicate CUDA library registration within the conda environment (standalone nvidia-cuda packages present alongside the TensorFlow-bundled versions)
  • GPU is confirmed accessible to the system (nvidia-smi shows no issues)
  • /dev/nvidia* devices are present and accessible
  • Binette continues to run using CPU fallback, but GPU acceleration for CheckM2 scoring is lost
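
The device-node check above was made from the interactive shell. A minimal sketch for repeating it from inside the Snakemake job follows. The helper name `nvidia_device_access` is hypothetical, and the check only tests file permissions for the current process, not whether the CUDA driver can initialise.

```python
import glob
import os


def nvidia_device_access(pattern="/dev/nvidia*"):
    """Map each matching device node to True if this process can read and write it."""
    return {
        path: os.access(path, os.R_OK | os.W_OK)
        for path in sorted(glob.glob(pattern))
    }


if __name__ == "__main__":
    report = nvidia_device_access()
    if not report:
        print("no /dev/nvidia* nodes visible to this process")
    for path, accessible in report.items():
        print(f"{path}: {'ok' if accessible else 'NOT accessible'}")
```

An empty result or a "NOT accessible" entry inside the job, but not in the shell, would suggest the job runs under a different user, container, or cgroup than the interactive session.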

Expected behavior

Binette should detect and utilize the available GPU for TensorFlow/CheckM2 scoring when executed via Snakemake, consistent with direct invocation behavior.


Possible cause

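Snakemake's conda environment activation via --use-conda may not propagate GPU device visibility (CUDA_VISIBLE_DEVICES) or other required environment variables to the job's subprocess. CUDA_ERROR_NO_DEVICE is also what the driver returns when CUDA_VISIBLE_DEVICES is set to an empty or invalid value. In addition, if the rule's conda directive points to an environment file, Snakemake builds its own environment under .snakemake/conda rather than reusing the binette environment tested above, so the installed CUDA packages may differ. Fixing this may require explicit environment variable passthrough in the Snakemake rule or in Binette's internal process launch.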
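
If the environment does turn out to carry a CUDA_VISIBLE_DEVICES value that hides every GPU, explicit passthrough could look roughly like the sketch below. Nothing in this report confirms that this variable is the cause. The helper `child_env_with_gpu` and the default device index "0" are assumptions for illustration.

```python
import os

# Values of CUDA_VISIBLE_DEVICES that hide every GPU from the CUDA driver.
HIDING_VALUES = {"", "-1"}


def child_env_with_gpu(parent=None, device="0"):
    """Copy an environment, replacing a GPU-hiding CUDA_VISIBLE_DEVICES value.

    An unset variable is left unset (all devices stay visible), and any other
    explicit selection such as "0,1" is preserved.
    """
    env = dict(os.environ if parent is None else parent)
    value = env.get("CUDA_VISIBLE_DEVICES")
    if value is not None and value.strip() in HIDING_VALUES:
        env["CUDA_VISIBLE_DEVICES"] = device
    return env


# Example use (not executed here): rerun the TensorFlow check from this report
# in a child process with the adjusted environment.
#   import subprocess, sys
#   check = "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"
#   subprocess.run([sys.executable, "-c", check], env=child_env_with_gpu())
```

The function returns a copy instead of mutating os.environ, so the parent process is left untouched and the change applies only to the launched command.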


Workaround

Currently running with CPU fallback. Pipeline completes successfully but without GPU acceleration for CheckM2 neural network scoring.
