Initial support for ppc64le #1316

Merged
matthewdouglas merged 1 commit into bitsandbytes-foundation:main from mgiessing:main
Aug 22, 2024

@mgiessing
Contributor

Initial support for PowerPC architecture following the design pattern introduced by the aarch64 PR

Signed-off-by: mgiessing <marvin.giessing@gmail.com>
@matthewdouglas
Member

Thanks @mgiessing! I am not sure that we'll be able to commit to distributing binary wheels for ppc64le, but certainly welcome source compatibility!

I'll do some quick regression testing on this over the next couple days, but at first glance it looks good!

PowerPC support was deprecated in CUDA 12.4 and removed in 12.5. However, my understanding is that there are still many AC922 systems with V100 GPUs out there, and maybe even some Power8 S822LC systems with P100s, so to add context and further advocate:

Example operational supercomputers:

Just a reference note: relates to #652

@mgiessing
Contributor Author

Thanks a lot @matthewdouglas!
Yeah, I do not expect binary wheels, but as you said, many people and organisations run P100/V100/T4 GPUs on Power Systems, so source compatibility is desired :-)

Btw, I figured out I had to rename libbitsandbytes_cuda122.so to libbitsandbytes_cuda122_nocublaslt.so, otherwise it would crash during the test. I'm not sure whether I did something wrong during the build, which was the following:

## System: AC922 // CUDA 12.2 // RHEL8.9

# Create bnb environment and install dependencies via mamba/conda and rocketce channel
micromamba create -n bnb \
    -c rocketce \
    -c defaults \
    python=3.10 \
    pytorch==2.1.2 \
    pandas \
    scipy \
    matplotlib && micromamba clean --all --yes

micromamba activate bnb

# Install remaining dependencies via PyPI
# (the version spec is quoted so the shell does not treat ">" as a redirect)
pip3 install lion-pytorch wheel einops pytest "setuptools>=63" transformers accelerate

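git clone https://github.com/mgiessing/bitsandbytes
# Enter the checkout so that "cmake -S ." and "pip install -e ." run in the repository root
cd bitsandbytes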

export PATH=$PATH:/usr/local/cuda/bin

cmake -DCMAKE_CUDA_ARCHITECTURES=70 -DCOMPUTE_BACKEND=cuda -S .

make -j$(nproc)
pip install -e .

# Workaround: provide the built library under the _nocublaslt name (see the note above)
cp bitsandbytes/libbitsandbytes_cuda122.so bitsandbytes/libbitsandbytes_cuda122_nocublaslt.so

# Simple test to check if it works
python3 -m bitsandbytes

@matthewdouglas
Member

matthewdouglas commented Aug 16, 2024

If you add -DNO_CUBLASLT=ON to the cmake step it will build libbitsandbytes_cuda122_nocublaslt.so.
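
As a rough sketch of that suggestion, the configure step from the build script earlier in this thread could look like the following. It assumes the same setup (CUDA 12.2 on PATH, a V100 with compute capability 7.0) and that the current directory is the bitsandbytes checkout. The `CMAKE_ARGS` variable and the guard around the build are only there for illustration and are not part of the project.

```shell
# Same flags as the original build, plus -DNO_CUBLASLT=ON as suggested above.
# CMAKE_ARGS is a hypothetical helper variable, not something the project defines.
CMAKE_ARGS="-DCMAKE_CUDA_ARCHITECTURES=70 -DCOMPUTE_BACKEND=cuda -DNO_CUBLASLT=ON"

# Only build when run from a source checkout with cmake available.
if [ -f CMakeLists.txt ] && command -v cmake >/dev/null 2>&1; then
    # $CMAKE_ARGS is left unquoted on purpose so it splits into separate flags.
    cmake $CMAKE_ARGS -S .
    make -j"$(nproc)"
else
    echo "Run this from the bitsandbytes checkout with cmake installed." >&2
fi
```

With this flag the build should produce libbitsandbytes_cuda122_nocublaslt.so directly, so the manual copy step would presumably no longer be needed.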

@matthewdouglas matthewdouglas merged commit 432a4f4 into bitsandbytes-foundation:main Aug 22, 2024
matthewdouglas pushed a commit to matthewdouglas/bitsandbytes that referenced this pull request Oct 28, 2024
Signed-off-by: mgiessing <marvin.giessing@gmail.com>

2 participants