[issue] Surprising Performance Drop When Using <think> Instead of <reasoning> as Custom Tags for Fine-tuning #3029

Hello Unsloth team!

Please excuse this beginner question. I'm new to the world of fine-tuning, and your library has been a fantastic and accessible starting point for me. While experimenting, I've encountered some model behavior that I don't understand and was hoping to get some clarification on what feels like a fundamental concept.

#### **1. Did you update?**
Yes, I ran `pip install --upgrade unsloth` and the install is up to date.

#### **2. `Colab` or `Kaggle` or local / cloud**
Local.

#### **3. Number GPUs used**
1x NVIDIA GeForce RTX 4090

#### **4. Which notebook? Please link!**
I only changed the custom tags in the official Qwen3 (4B) GRPO example notebook and removed some unnecessary output checks. Here is the link to my modified notebook: https://colab.research.google.com/drive/1id4WqGn3yDZ4uOEmQI5HCR8UM1S64H07?usp=sharing

#### **5. Which Unsloth version, TRL version, etc.?**
Transformers: 4.53.2. vLLM: 0.9.2.
NVIDIA GeForce RTX 4090. Num GPUs = 2. Max memory: 23.514 GB. Platform: Linux.
Torch: 2.7.0+cu126. CUDA: 8.9. CUDA Toolkit: 12.6. Triton: 3.3.0

#### **6. Which trainer?**
`GRPOTrainer` (but the same issue is observable with `SFTTrainer`).

### **Problem Description**

I am trying to fine-tune the `unsloth/Qwen3-8B-Base` model for mathematical reasoning. My goal is to teach the model to first "think" about the problem and then provide a final answer, using a specific format.

I conducted an experiment with two scenarios. The only difference between them was the custom tags I used in my data formatting.

**Scenario A: This works perfectly.**
I used `<reasoning>` and `<answer>` as my custom tags. The model learns the format very well and generates responses that follow the `assistant: <reasoning>...</reasoning><answer>...</answer>` structure.
```python
reasoning_start = "<reasoning>" 
reasoning_end   = "</reasoning>"   
solution_start  = "<answer>"
solution_end    = "</answer>"

system_prompt = \
f"""You are given a problem.
Think about the problem and provide your working out.
Place it between {reasoning_start} and {reasoning_end}.
Then, provide your solution between {solution_start}{solution_end}"""
```
<table>
  <tr>
    <td><img src="https://github.com/user-attachments/assets/5c7436a2-5ed9-4669-92ce-3c88dcc5e10e" alt="Image 2" width="200"></td>
    <td><img src="https://github.com/user-attachments/assets/283c914c-ba49-4583-939b-f050e8bb99a8" alt="Image 3" width="400"></td>
  </tr>
</table>
<table>
  <tr>
    <td><img src="https://github.com/user-attachments/assets/f19e45d7-0eb5-4415-a3f2-16608d5785e8" alt="Image 4" width="400"></td>
    <td><img src="https://github.com/user-attachments/assets/d8394540-4cc5-4d5f-9a7e-c211de25a9d9" alt="Image 5" width="600"></td>
  </tr>
</table>
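
One way to put a number on "follows the structure" is a strict regex check over a batch of generated completions. The sketch below is only an illustration: the helper names `build_format_regex` and `format_match_rate` are hypothetical and are not part of Unsloth or TRL, and the check assumes the completion contains nothing except the reasoning block followed by the answer block.

```python
import re


def build_format_regex(reasoning_start, reasoning_end, solution_start, solution_end):
    """Compile a regex for: reasoning block, then answer block, nothing else.

    Hypothetical helper for illustration; the tag strings are passed in so the
    same check works for <reasoning> and for <think>.
    """
    pattern = (
        rf"^\s*{re.escape(reasoning_start)}(?P<reasoning>.+?){re.escape(reasoning_end)}"
        rf"\s*{re.escape(solution_start)}(?P<answer>.+?){re.escape(solution_end)}\s*$"
    )
    # DOTALL lets the reasoning and answer span multiple lines.
    return re.compile(pattern, flags=re.DOTALL)


def format_match_rate(completions, regex):
    """Return the fraction of completions that match the expected structure."""
    if not completions:
        return 0.0
    hits = sum(1 for text in completions if regex.match(text))
    return hits / len(completions)
```

Building one regex per scenario (for example with `"<reasoning>"`, `"</reasoning>"` versus `"<think>"`, `"</think>"`) and calling `format_match_rate` on the same number of sampled completions would give a comparable figure for both runs.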

**Scenario B: This performs very poorly.**
I changed the tags from `<reasoning>` to `<think>`. So the target format became `assistant: <think>...</think><answer>...</answer>`. To my surprise, the model completely fails to learn this format. The output is often incoherent, and it doesn't follow the desired structure at all.
```python
reasoning_start = "<think>" 
reasoning_end   = "</think>"   
solution_start  = "<answer>"
solution_end    = "</answer>"

system_prompt = \
f"""You are given a problem.
Think about the problem and provide your working out.
Place it between {reasoning_start} and {reasoning_end}.
Then, provide your solution between {solution_start}{solution_end}"""
```
<table>
  <tr>
    <td><img src="https://github.com/user-attachments/assets/bc73212b-48cb-4340-a624-8248ef409825" alt="Image 6" width="200"></td>
    <td><img src="https://github.com/user-attachments/assets/548fd5bb-fad6-44be-8b15-d375e8ecaab3" alt="Image 7" width="400"></td>
  </tr>
</table>
<table>
  <tr>
    <td><img src="https://github.com/user-attachments/assets/56a74157-c3f7-4258-b93e-8dfaf992ebb0" alt="Image 8" width="400"></td>
    <td><img src="https://github.com/user-attachments/assets/9ba8c60f-2dc7-43bd-a490-f85eb81a184c" alt="Image 9" width="600"></td>
  </tr>
</table>
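
Since the tag strings are the only thing that changed, one possible diagnostic (not something that was run for this issue) is to look at how the tokenizer splits each tag. The sketch below assumes a Hugging Face style tokenizer exposing `encode(..., add_special_tokens=False)` and `convert_ids_to_tokens`; the helper name `inspect_tags` is hypothetical.

```python
def inspect_tags(tokenizer, tags):
    """Map each tag string to the list of token pieces the tokenizer produces.

    Hypothetical helper for illustration. `tokenizer` is assumed to behave like
    a Hugging Face tokenizer (encode + convert_ids_to_tokens).
    """
    report = {}
    for tag in tags:
        # add_special_tokens=False so only the tag itself is encoded.
        ids = tokenizer.encode(tag, add_special_tokens=False)
        report[tag] = tokenizer.convert_ids_to_tokens(ids)
    return report


# Example usage (assumes `transformers` is installed and the model is reachable):
#
# from transformers import AutoTokenizer
# tokenizer = AutoTokenizer.from_pretrained("unsloth/Qwen3-8B-Base")
# tags = ["<reasoning>", "</reasoning>", "<think>", "</think>"]
# for tag, pieces in inspect_tags(tokenizer, tags).items():
#     print(tag, len(pieces), pieces)
```

If the two pairs of tags come out with different piece counts, that would be one concrete difference between the scenarios worth mentioning to the maintainers; it would not by itself explain the behavior.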

Is there something wrong with my code? How should I fix it? Thank you for your time!
