Book: https://github.com/PacktPublishing/Reinforcement-Learning-for-LLMs
Authors: Arun Shankar & Michael Chertushkin
Click any badge to open a notebook in Google Colab — no installation needed.
| # | Chapter | Colab |
|---|---|---|
| 1 | Essential Math Toolkit | |
| 2 | Why LLMs Need RL: The Alignment Gap | |
| 3 | RL Fundamentals: The Complete Picture | |
| 4 | Setting Up Your Free Environment | |

| # | Chapter | Colab |
|---|---|---|
| 17 | Recipe: Chatbot | |
| 18 | Recipe: Reasoner | |
| 19 | Recipe: Agent | |
rl-made-easy-code/
├── notebooks/
│   ├── part1_foundations/   # Chapters 1–4
│   ├── part2_core/          # Chapters 5–10
│   ├── part3_advanced/      # Chapters 11–16
│   └── part4_recipes/       # Chapters 17–19
├── utils/
│   ├── data.py              # Dataset loaders
│   ├── eval.py              # Win rate, KL divergence helpers
│   └── viz.py               # Training curve plots
├── data/samples/            # Toy datasets; notebooks run offline
│   ├── preferences.jsonl
│   └── prompts.jsonl
└── requirements.txt         # Installed automatically by each notebook
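
The toy datasets under `data/samples/` are JSON Lines files, so they can be inspected without any project code. The sketch below is a minimal stand-alone reader using only the Python standard library; `read_jsonl` is a hypothetical helper written for illustration, not the loader in `utils/data.py`, and it makes no assumption about which fields each record contains.

```python
import json
from pathlib import Path


def read_jsonl(path):
    """Parse a JSON Lines file into a list, one parsed value per non-empty line.

    Hypothetical helper for illustration; not the repository's own loader.
    """
    records = []
    with Path(path).open(encoding="utf-8") as f:
        for line_number, line in enumerate(f, start=1):
            line = line.strip()
            if not line:
                continue  # tolerate blank lines between records
            try:
                records.append(json.loads(line))
            except json.JSONDecodeError as exc:
                raise ValueError(
                    f"{path}:{line_number}: invalid JSON ({exc.msg})"
                ) from exc
    return records


# Path taken from the layout above; the check keeps this runnable elsewhere.
sample_path = Path("data/samples/preferences.jsonl")
if sample_path.exists():
    records = read_jsonl(sample_path)
    print(f"Loaded {len(records)} records from {sample_path}")
    if records:
        print("First record:", records[0])
else:
    print(f"{sample_path} not found; run this from the repository root.")
```

Printing the first record is a quick way to see the actual schema before relying on any particular field name.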
All notebooks target the free Colab T4 GPU. To select it, open Runtime → Change runtime type and choose T4 GPU.
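
After switching the runtime, it can be worth confirming that a GPU is actually attached before starting a long run. The following is a minimal sketch that shells out to `nvidia-smi` using only the standard library; `detect_gpu_name` is a hypothetical helper, and the assumption that the reported device name contains "T4" is ours, not something the notebooks check.

```python
import shutil
import subprocess


def detect_gpu_name():
    """Return the first GPU name reported by nvidia-smi, or None if unavailable."""
    if shutil.which("nvidia-smi") is None:
        return None  # no NVIDIA driver tools on this machine (e.g. CPU runtime)
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name", "--format=csv,noheader"],
        capture_output=True,
        text=True,
        check=False,
    )
    if result.returncode != 0:
        return None
    names = [line.strip() for line in result.stdout.splitlines() if line.strip()]
    return names[0] if names else None


gpu = detect_gpu_name()
if gpu is None:
    print("No GPU detected. Part 1 runs on CPU; other parts expect a GPU runtime.")
elif "T4" in gpu:  # assumption: the device name string includes "T4"
    print(f"Found {gpu}; this matches the target runtime.")
else:
    print(f"Found {gpu}; the runtime estimates below are for a T4 and may differ.")
```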
| Part | Estimated runtime on T4 |
|---|---|
| Part 1 — Foundations | < 5 min (CPU) |
| Part 2 — Core methods | 10–30 min |
| Part 3 — Advanced | 20–60 min |
| Part 4 — Recipes | 30–90 min |
Code: Apache 2.0 · Text & figures: © Arun Shankar & Michael Chertushkin, All Rights Reserved