Installation Guide

Welcome to the installation guide for the bitsandbytes library! This document provides step-by-step instructions to install bitsandbytes across various platforms and hardware configurations. The library primarily supports CUDA-based GPUs, but the team is actively working on enabling support for additional backends like AMD ROCm, Intel, and Apple Silicon.

Tip

For a high-level overview of backend support and compatibility, see the Multi-backend Support section.

CUDA[[cuda]]

bitsandbytes is currently only supported on CUDA GPUs for CUDA versions 11.0 - 12.6. However, a multi-backend effort is under development and currently in alpha. If you're interested in providing feedback or testing, check out the multi-backend section below.

Supported CUDA Configurations[[cuda-pip]]

The latest version of the distributed bitsandbytes package is built with the following configurations:

| OS | CUDA Toolkit | Host Compiler |
|---|---|---|
| Linux | 11.7 - 12.3 | GCC 11.4 |
| | 12.4 - 12.6 | GCC 13.2 |
| Windows | 11.7 - 12.6 | MSVC 19.42+ (VS2022) |
| | 12.4+ | GCC 13.2 |
| Windows | 11.7 - 12.6 | MSVC 19.38+ (VS2022) |

For CUDA systems, ensure your hardware meets the following requirements:

| Feature | Minimum Hardware Requirement |
|---|---|
| LLM.int8() | NVIDIA Turing (RTX 20 series, T4) or newer GPUs |
| 8-bit optimizers/quantization | NVIDIA Maxwell (GTX 900 series, TITAN X, M40) or newer GPUs * |
| NF4/FP4 quantization | NVIDIA Maxwell (GTX 900 series, TITAN X, M40) or newer GPUs * |

Warning

bitsandbytes >= 0.45.0 no longer supports Kepler GPUs.

Support for Maxwell GPUs is deprecated and will be removed in a future release. For the best results, a Turing generation device or newer is recommended.
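To check a particular GPU against the table above, the requirements can be expressed in terms of CUDA compute capability. The following is a minimal sketch, not part of bitsandbytes: the function name `bnb_features_for` is hypothetical, and it assumes that Maxwell starts at compute capability 5.0 and Turing at 7.5, which are NVIDIA's published values.

```shell
# Print the bitsandbytes features available for a CUDA compute capability
# given as MAJOR.MINOR (for example 7.5).
# Assumed thresholds (not stated in this guide): Maxwell = 5.0, Turing = 7.5.
bnb_features_for() (
    cc_major=${1%%.*}                      # text before the first dot
    cc_minor=${1#*.}                       # text after the first dot
    cc=$((cc_major * 10 + cc_minor))       # e.g. 7.5 -> 75
    if [ "$cc" -ge 75 ]; then
        echo "LLM.int8() 8-bit-optimizers NF4/FP4"
    elif [ "$cc" -ge 50 ]; then
        echo "8-bit-optimizers NF4/FP4"
    else
        echo "unsupported"
    fi
)
```

On recent NVIDIA drivers the compute capability can usually be read with `nvidia-smi --query-gpu=compute_cap --format=csv,noheader`, and the result passed to the function, for example `bnb_features_for 7.5`.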

pip install bitsandbytes

pip install pre-built wheel from latest main commit

If you would like to use new features before they are officially released and help us test them, feel free to install the wheel directly from our CI (the wheel links will remain stable!):

# Note: if you don't want to reinstall BNB's dependencies, append the `--no-deps` flag!
pip install --force-reinstall 'https://github.com/bitsandbytes-foundation/bitsandbytes/releases/download/continuous-release_main/bitsandbytes-0.44.2.dev0-py3-none-manylinux_2_24_x86_64.whl'
# Note: if you don't want to reinstall BNB's dependencies, append the `--no-deps` flag!
pip install --force-reinstall 'https://github.com/bitsandbytes-foundation/bitsandbytes/releases/download/continuous-release_multi-backend-refactor/bitsandbytes-0.44.1.dev0-py3-none-macosx_13_1_arm64.whl'

Compile from source[[cuda-compile]]

Tip

Don't hesitate to compile from source! The process is straightforward and resilient. This might be needed for older CUDA versions or other less common configurations, which we don't support out of the box due to package size.

For Linux and Windows systems, compiling from source allows you to customize the build configuration. Platform-specific instructions follow; see CMakeLists.txt for the specifics and for additional build options.

To compile from source, you need CMake >= 3.22.1 and Python >= 3.9 installed. Make sure you have a compiler installed to compile C++ (gcc, make, headers, etc.).

For example, to install a compiler and CMake on Ubuntu:

apt-get install -y build-essential cmake
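Before building, it can help to confirm that the installed tools meet the minimum versions listed above (CMake >= 3.22.1, Python >= 3.9). The helper below is a small sketch for comparing dotted numeric version strings; `version_ge` is a hypothetical name, not a command shipped with bitsandbytes or CMake.

```shell
# Succeed if dotted numeric version $1 is greater than or equal to $2.
# Only purely numeric components are handled (no "rc1"-style suffixes).
version_ge() (
    a=$1
    b=$2
    while [ -n "$a" ] || [ -n "$b" ]; do
        # Leading component of each version; a missing component counts as 0.
        x=${a%%.*}
        y=${b%%.*}
        [ "${x:-0}" -gt "${y:-0}" ] && exit 0
        [ "${x:-0}" -lt "${y:-0}" ] && exit 1
        # Drop the component that was just compared.
        case $a in *.*) a=${a#*.} ;; *) a= ;; esac
        case $b in *.*) b=${b#*.} ;; *) b= ;; esac
    done
    exit 0
)

# Example checks (these assume cmake and python are on PATH):
#   version_ge "$(cmake --version | head -n 1 | awk '{print $3}')" 3.22.1 || echo "CMake too old"
#   version_ge "$(python --version | awk '{print $2}')" 3.9 || echo "Python too old"
```

Comparing component by component avoids the pitfall of plain string comparison, where 3.9 would sort after 3.22.1.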

You should also install the CUDA Toolkit by following NVIDIA's CUDA Installation Guide for Linux. The expected CUDA Toolkit version is 11.1 or newer. GCC 6 is the minimum required compiler version, and GCC 7.3 or newer is recommended.

Refer to the following table if you're using another CUDA Toolkit version.

| CUDA Toolkit | GCC |
|---|---|
| >= 11.4.1 | >= 11 |
| >= 12.0 | >= 12 |
| >= 12.4 | >= 13 |

Now to install the bitsandbytes package from source, run the following commands:

git clone https://github.com/bitsandbytes-foundation/bitsandbytes.git && cd bitsandbytes/
pip install -r requirements-dev.txt
cmake -DCOMPUTE_BACKEND=cuda -S .
make
pip install -e .   # `-e` for "editable" install, when developing BNB (otherwise leave that out)

Tip

If you have multiple versions of CUDA installed, or installed CUDA in a non-standard location, please refer to the CMake CUDA documentation for how to configure the CUDA compiler.

Windows systems require Visual Studio with C++ support as well as an installation of the CUDA SDK.

To compile from source, you need CMake >= 3.22.1 and Python >= 3.9 installed. You should also install the CUDA Toolkit by following NVIDIA's CUDA Installation Guide for Windows.

Refer to the following table if you're using another CUDA Toolkit version.

| CUDA Toolkit | MSVC |
|---|---|
| >= 11.6 | 19.30+ (VS2022) |

git clone https://github.com/bitsandbytes-foundation/bitsandbytes.git && cd bitsandbytes/
cmake -DCOMPUTE_BACKEND=cuda -S .
cmake --build . --config Release
pip install -e .   # `-e` for "editable" install, when developing BNB (otherwise leave that out)

Big thanks to wkpark, Jamezo97, rickardp, akx for their amazing contributions to make bitsandbytes compatible with Windows.

PyTorch CUDA versions[[pytorch-cuda-versions]]

Some bitsandbytes features may need a newer CUDA version than the one currently supported by PyTorch binaries from Conda and pip. In this case, you should follow these instructions to load a precompiled bitsandbytes binary.

  1. Determine the path of the CUDA version you want to use. Common paths include:
  • /usr/local/cuda
  • /usr/local/cuda-XX.X where XX.X is the CUDA version number

Then locally install the CUDA version you need with this script from bitsandbytes:

wget https://raw.githubusercontent.com/bitsandbytes-foundation/bitsandbytes/main/install_cuda.sh
# Syntax cuda_install CUDA_VERSION INSTALL_PREFIX EXPORT_TO_BASH
#   CUDA_VERSION in {110, 111, 112, 113, 114, 115, 116, 117, 118, 120, 121, 122, 123, 124, 125, 126}
#   EXPORT_TO_BASH in {0, 1} with 0=False and 1=True

# For example, the following installs CUDA 12.6 to ~/local/cuda-12.6 and exports the path to your .bashrc

bash install_cuda.sh 126 ~/local 1
  1. Set the environment variables BNB_CUDA_VERSION and LD_LIBRARY_PATH to manually override the CUDA version installed by PyTorch.

Tip

It is recommended to add the following lines to the .bashrc file to make them permanent.

export BNB_CUDA_VERSION=<VERSION>
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:<PATH>

For example, to use a local install path:

export BNB_CUDA_VERSION=126
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/home/YOUR_USERNAME/local/cuda-12.6
  1. Now, when you launch bitsandbytes with these environment variables, the PyTorch CUDA version is overridden by the new CUDA version (in this example, version 12.6) and a different bitsandbytes library is loaded.
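The two variables from the steps above follow a simple pattern: BNB_CUDA_VERSION is the CUDA version without the dot, and the library path ends in cuda-XX.X under the install prefix. As a sketch under those assumptions, the hypothetical helper below (not part of bitsandbytes or install_cuda.sh) prints the matching export lines for a given version and prefix.

```shell
# Print the export lines for a locally installed CUDA toolkit.
# Usage: bnb_cuda_exports <CUDA version, e.g. 12.6> <install prefix, e.g. /home/me/local>
# The <prefix>/cuda-<version> layout mirrors the example paths shown above.
bnb_cuda_exports() (
    version=$1
    prefix=$2
    tag=$(printf '%s' "$version" | tr -d '.')   # 12.6 -> 126
    printf 'export BNB_CUDA_VERSION=%s\n' "$tag"
    # $LD_LIBRARY_PATH is printed literally so it expands when the line is run.
    printf 'export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:%s/cuda-%s\n' "$prefix" "$version"
)

# Example: print the lines for review, or append them to ~/.bashrc
#   bnb_cuda_exports 12.6 "$HOME/local"
#   bnb_cuda_exports 12.6 "$HOME/local" >> ~/.bashrc
```

Printing the lines instead of exporting directly makes it easy to inspect them before they are added to a shell profile.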

Multi-backend Support (Alpha Release)[[multi-backend]]

Tip

This functionality is currently in preview and not yet production-ready. We very much welcome community feedback, contributions and leadership on topics like Apple Silicon as well as other less common accelerators! For more information, see this guide on multi-backend support.

Links for giving us feedback (bugs, install issues, performance results, requests, etc.):

Multi-backend refactor: Alpha release (AMD ROCm ONLY)

Multi-backend refactor: Alpha release (INTEL ONLY)

GitHub Discussion space on coordinating the kickoff of MPS backend development

Supported Backends[[multi-backend-supported-backends]]

| Backend | Supported Versions | Python versions | Architecture Support | Status |
|---|---|---|---|---|
| AMD ROCm | 6.1+ | 3.10+ | minimum CDNA - gfx90a, RDNA - gfx1100 | Alpha |
| Apple Silicon (MPS) | WIP | 3.10+ | M1/M2 chips | Planned |
| Intel CPU | v2.4.0+ (ipex) | 3.10+ | Intel CPU | Alpha |
| Intel GPU | v2.4.0+ (ipex) | 3.10+ | Intel GPU | Experimental |
| Ascend NPU | 2.1.0+ (torch_npu) | 3.10+ | Ascend NPU | Experimental |

For each supported backend, follow the respective instructions below:

Pre-requisites[[multi-backend-pre-requisites]]

To use bitsandbytes non-CUDA backends, be sure to install:

pip install "transformers>=4.45.1"

Warning

Pre-compiled binaries are only built for ROCm versions 6.1.0/6.1.1/6.1.2/6.2.0 and gfx90a, gfx942, gfx1100 GPU architectures. Find the pip install instructions here.

Other supported versions that don't come with pre-compiled binaries can be compiled from source with these instructions.

Windows is not supported for the ROCm backend and, to our knowledge, neither is WSL2.
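The pre-compiled coverage described in this warning can be written down as a quick lookup. This is only a sketch of the lists stated above; the function name `bnb_rocm_prebuilt` is hypothetical and does not exist in bitsandbytes.

```shell
# Succeed if a pre-compiled binary is listed for this ROCm version and GPU arch.
# Usage: bnb_rocm_prebuilt <ROCm version> <GPU architecture>
# The accepted values are copied from the warning above.
bnb_rocm_prebuilt() {
    case "$1" in
        6.1.0|6.1.1|6.1.2|6.2.0) ;;        # ROCm versions with binaries
        *) return 1 ;;
    esac
    case "$2" in
        gfx90a|gfx942|gfx1100) return 0 ;; # GPU architectures with binaries
        *) return 1 ;;
    esac
}

# Example:
#   bnb_rocm_prebuilt 6.1.2 gfx90a && echo "use the wheel" || echo "compile from source"
```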

Tip

If you would like to install ROCm and PyTorch on bare metal, skip the Docker steps and refer to ROCm's official guides at ROCm installation overview and Installing PyTorch for ROCm (Step 3 of wheels build for quick installation). Special note: make sure to get the ROCm-specific PyTorch wheel that matches the installed ROCm version, e.g. https://download.pytorch.org/whl/nightly/rocm6.2/

# Create a docker container with latest ROCm image, which includes ROCm libraries
docker pull rocm/dev-ubuntu-22.04:6.1.2-complete
docker run -it --device=/dev/kfd --device=/dev/dri --group-add video rocm/dev-ubuntu-22.04:6.1.2-complete
apt-get update && apt-get install -y git && cd home

# Install pytorch compatible with above ROCm version
pip install torch --index-url https://download.pytorch.org/whl/rocm6.1/

The Intel CPU / XPU backend requires compatible hardware and an environment in which import intel_extension_for_pytorch as ipex works, with Python 3.10 as the minimum version.

Please refer to the official Intel installation instructions for guidance on how to pip install the necessary intel_extension_for_pytorch dependency.

The Ascend NPU backend requires compatible hardware and an environment in which import torch_npu works, with Python 3.10 as the minimum version.

Please refer to the official Ascend installation instructions for guidance on how to pip install the necessary torch_npu dependency.

Tip

Apple Silicon support is still a WIP. Please join the GitHub Discussion space on coordinating the kickoff of MPS backend development to help coordinate a community-led effort to implement this backend.

Installation

You can install the pre-built wheels for each backend, or compile from source for custom configurations.

Pre-built Wheel Installation (recommended)[[multi-backend-pip]]

# Note: if you don't want to reinstall BNB's dependencies, append the `--no-deps` flag!
pip install --force-reinstall 'https://github.com/bitsandbytes-foundation/bitsandbytes/releases/download/continuous-release_multi-backend-refactor/bitsandbytes-0.44.1.dev0-py3-none-manylinux_2_24_x86_64.whl'
# Note: if you don't want to reinstall BNB's dependencies, append the `--no-deps` flag!
pip install --force-reinstall 'https://github.com/bitsandbytes-foundation/bitsandbytes/releases/download/continuous-release_multi-backend-refactor/bitsandbytes-0.44.1.dev0-py3-none-win_amd64.whl'

Warning

bitsandbytes does not yet support Apple Silicon / Metal with a dedicated backend. However, the build infrastructure is in place and the below pip install will eventually provide Apple Silicon support as it becomes available on the multi-backend-refactor branch based on community contributions.

# Note: if you don't want to reinstall BNB's dependencies, append the `--no-deps` flag!
pip install --force-reinstall 'https://github.com/bitsandbytes-foundation/bitsandbytes/releases/download/continuous-release_multi-backend-refactor/bitsandbytes-0.44.1.dev0-py3-none-macosx_13_1_arm64.whl'
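Since the three multi-backend wheels above differ only in their platform tag, the choice can be sketched as a small lookup. The function name `bnb_multibackend_wheel` is hypothetical; the OS and architecture labels are assumed to match `uname -s` and `uname -m` output on Linux and macOS, and the "Windows" label is an assumption for illustration only. The URLs are the ones listed above and may change as new builds are published.

```shell
# Print the multi-backend pre-built wheel URL for an OS / architecture pair.
# Usage: bnb_multibackend_wheel <OS> <ARCH>, e.g. bnb_multibackend_wheel Linux x86_64
bnb_multibackend_wheel() (
    base='https://github.com/bitsandbytes-foundation/bitsandbytes/releases/download/continuous-release_multi-backend-refactor'
    case "$1-$2" in
        Linux-x86_64)  platform='manylinux_2_24_x86_64' ;;
        Windows-AMD64) platform='win_amd64' ;;           # label is an assumption
        Darwin-arm64)  platform='macosx_13_1_arm64' ;;   # no dedicated backend yet, see warning above
        *) exit 1 ;;                                     # no pre-built wheel listed
    esac
    printf '%s/bitsandbytes-0.44.1.dev0-py3-none-%s.whl\n' "$base" "$platform"
)

# Example (Linux or macOS):
#   pip install --force-reinstall "$(bnb_multibackend_wheel "$(uname -s)" "$(uname -m)")"
```

A non-zero exit status signals that no wheel is listed for the platform, in which case compiling from source is the remaining option.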

Compile from Source[[multi-backend-compile]]

AMD GPU

bitsandbytes supports ROCm 6.1 and later; this support is currently in alpha release.

# Install bitsandbytes from source
# Clone bitsandbytes repo, ROCm backend is currently enabled on multi-backend-refactor branch
git clone -b multi-backend-refactor https://github.com/bitsandbytes-foundation/bitsandbytes.git && cd bitsandbytes/

# Install dependencies
pip install -r requirements-dev.txt

# Compile & install
apt-get install -y build-essential cmake  # install build tools dependencies, unless present
cmake -DCOMPUTE_BACKEND=hip -S .  # Use -DBNB_ROCM_ARCH="gfx90a;gfx942" to target specific gpu arch
make
pip install -e .   # `-e` for "editable" install, when developing BNB (otherwise leave that out)

Intel CPU / XPU

Tip

Intel CPU / XPU backend only supports building from source; for now, please follow the instructions below.

Similar to the CUDA case, you can compile bitsandbytes from source for Linux and Windows systems.

The commands below are for Linux. To install on Windows, adapt them following the same pattern as described in the section above on compiling from source under the Windows tab.

git clone --depth 1 -b multi-backend-refactor https://github.com/bitsandbytes-foundation/bitsandbytes.git && cd bitsandbytes/
pip install intel_extension_for_pytorch
pip install -r requirements-dev.txt
cmake -DCOMPUTE_BACKEND=cpu -S .
make
pip install -e .   # `-e` for "editable" install, when developing BNB (otherwise leave that out)

Ascend NPU

Tip

Ascend NPU backend only supports building from source; for now, please follow the instructions below.

# Install bitsandbytes from source
# Clone bitsandbytes repo, Ascend NPU backend is currently enabled on multi-backend-refactor branch
git clone -b multi-backend-refactor https://github.com/bitsandbytes-foundation/bitsandbytes.git && cd bitsandbytes/

# Install dependencies
pip install -r requirements-dev.txt

# Compile & install
apt-get install -y build-essential cmake  # install build tools dependencies, unless present
cmake -DCOMPUTE_BACKEND=npu -S .
make
pip install -e .   # `-e` for "editable" install, when developing BNB (otherwise leave that out)

Apple Silicon

WIP