
[ROCm] Conditionally enable 4bit on AMD CDNA GPUs for bitsandbytes >= v0.49.2#4161

Merged
danielhanchen merged 1 commit into unslothai:main from sstamenk:cdna_4bit on Mar 7, 2026

Conversation

@sstamenk (Contributor) commented Mar 5, 2026

  • Added a conditional check that enables 4bit for CDNA Instinct GPUs with bitsandbytes version >= 0.49.2
  • Updated comments to reflect newly supported block sizes
    • CUDA/RDNA block size 32 support - #1854
    • CDNA block size 32 support - #1887
  • Changed the RDNA 4bit version check from Version(bitsandbytes.__version__) > Version("0.49.0") to Version(bitsandbytes.__version__) >= Version("0.49.0"), since PR #1748, which adds RDNA support, is included in v0.49.0
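The gating described in the bullets above can be sketched as follows. This is a hypothetical simplification for illustration (the helper name `allow_4bit` is invented; the real logic lives in unsloth/device_type.py):

```python
from packaging.version import Version

def allow_4bit(gpu_arch: str, bnb_version: str) -> bool:
    """Hypothetical helper mirroring the version checks described above."""
    v = Version(bnb_version)
    if gpu_arch == "rdna":
        # PR #1748 (RDNA support) shipped in v0.49.0, so the check is inclusive.
        return v >= Version("0.49.0")
    if gpu_arch == "cdna":
        # CDNA Instinct 4bit needs bitsandbytes v0.49.2 or newer.
        return v >= Version("0.49.2")
    # CUDA path: no extra version gate in this sketch.
    return True
```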

@gemini-code-assist (Contributor) commented

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request enhances compatibility and expands 4-bit quantization support for AMD GPUs within the unsloth framework. It specifically enables 4-bit quantization for CDNA Instinct GPUs when using bitsandbytes version 0.49.2 or newer, and refines version checks for RDNA GPUs to ensure broader compatibility with bitsandbytes 0.49.0. Additionally, documentation comments have been updated to reflect the latest supported block sizes across various GPU architectures.

Highlights

  • CDNA 4-bit Quantization Support: Enabled conditional 4-bit quantization support for CDNA Instinct GPUs when the bitsandbytes library version is 0.49.2 or higher.
  • Updated Block Size Comments: Revised internal comments to reflect updated supported block sizes (now 32) for CUDA, Radeon, and Instinct GPUs, and added a note about bitsandbytes 0.49.2 supporting blocksize=64 on CDNA.
  • RDNA Version Check Adjustment: Adjusted the version check for RDNA 4-bit support to be inclusive of bitsandbytes version 0.49.0 (changed from > to >=).
Changelog
  • unsloth/device_type.py
    • Updated comments regarding GPU device types, warp sizes, and block sizes, specifically changing block size values for CUDA, Radeon, and Instinct from 64/128 to 32.
    • Added a new comment indicating bitsandbytes 0.49.2 supports blocksize=64 4-bit quantization on CDNA GPUs.
    • Modified the ALLOW_BITSANDBYTES logic to introduce a specific check for bitsandbytes version 0.49.2 or higher, and adjusted the 0.49.0 check to be inclusive (>=).
  • unsloth/models/loader.py
    • Updated comments in the from_pretrained function to clarify that AMD Instinct GPUs require blocksize=128 only for bitsandbytes versions older than 0.49.2.
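The loader.py change above amounts to picking the 4-bit blocksize from the installed bitsandbytes version. A minimal sketch (the helper name is hypothetical, not the actual unsloth code):

```python
from packaging.version import Version

def instinct_4bit_blocksize(bnb_version: str) -> int:
    # Per the changelog: AMD Instinct requires blocksize=128 only for
    # bitsandbytes older than 0.49.2; from 0.49.2 on, blocksize=64 works.
    if Version(bnb_version) >= Version("0.49.2"):
        return 64
    return 128
```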
Activity
  • No specific activity (comments, reviews, etc.) has been recorded for this pull request.

@chatgpt-codex-connector (Bot) left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: dbdce096b8


Comment thread: unsloth/device_type.py, lines +110 to +111

```python
if Version(bitsandbytes.__version__) >= Version("0.49.2"):
    pass
```

P2: Preserve HIP extension probe for bitsandbytes >= 0.49.2

This branch now does pass for HIP when bitsandbytes>=0.49.2, which skips the guarded bitsandbytes.cextension import that previously caught broken ROCm installs (the same HSA failure mode noted in the comments). In that scenario, ALLOW_BITSANDBYTES remains True, so the loader still enables 4-bit paths and fails later at runtime instead of falling back safely. Please keep an explicit health check in this branch so invalid HIP/bitsandbytes setups are disabled early.
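An explicit health check of the kind this review asks for could look like the following sketch. It assumes, per the comment above, that a broken ROCm install fails when bitsandbytes loads its native extension; the function name is invented for illustration:

```python
def bitsandbytes_is_healthy() -> bool:
    """Probe sketch: import the native extension inside a guard instead of
    `pass`-ing the branch, so broken HIP/ROCm setups are disabled early."""
    try:
        # Loading cextension triggers the native-library load that fails
        # on invalid HIP/bitsandbytes installs (the HSA failure mode).
        import bitsandbytes.cextension  # noqa: F401
    except Exception:
        return False
    return True
```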


@gemini-code-assist (Bot) left a comment

Code Review

This pull request correctly adds support for 4-bit quantization on CDNA Instinct GPUs for bitsandbytes version 0.49.2 and newer. The version checks are updated accordingly, and comments are refreshed to reflect new block size support. My review includes one suggestion to refactor duplicated code in unsloth/models/loader.py to improve maintainability.

@sstamenk sstamenk changed the title Conditionally enable 4bit on CDNA for bitsandbytes >= v0.49.2 [ROCm] Conditionally enable 4bit on AMD CDNA GPUs for bitsandbytes >= v0.49.2 Mar 5, 2026
@danielhanchen (Contributor) commented
Oh marvelous thanks!

@danielhanchen danielhanchen merged commit 6933095 into unslothai:main Mar 7, 2026
1 check passed