Commit 1902fa6

Add epilogue subtiling documentation
1 parent a274a36 commit 1902fa6

2 files changed, 30 additions and 0 deletions

docs/api/config.md

Lines changed: 29 additions & 0 deletions
````diff
@@ -142,6 +142,35 @@ Configs are typically discovered automatically through autotuning, but can also
 ``[load1, load2, ..., loadN, store1, store2, ..., storeM]``
 ```
 
+### Epilogue Optimization
+
+```{eval-rst}
+.. autoattribute:: Config.epilogue_subtile
+
+Split factor for the epilogue (pointwise ops + store) along a tile dimension.
+Splits the store from ``[BLOCK_M, BLOCK_N]`` into
+``SUBTILE_FACTOR × [BLOCK_M, BLOCK_N / SUBTILE_FACTOR]``, reducing the accumulator
+shared-memory footprint and enabling extra pipeline stages.
+
+**Valid values:**
+
+- ``None``: Disabled (default)
+- ``2``: Split epilogue into 2 sub-tiles
+- ``4``: Split epilogue into 4 sub-tiles (when K ≥ 16384)
+
+**Requirements:**
+
+- Blackwell (sm_100+) GPU with tensor descriptor support
+- Automatically discovered by the autotuner when the K dimension is ≥ 1024
+
+**Interactions:**
+
+- Incompatible with ``flatten_loops=True``
+- Forces store indexing to ``"tensor_descriptor"``
+
+See the :doc:`epilogue subtiling example </examples/epilogue_subtiling>` for usage patterns.
+```
+
 ### Memory and Caching
 
 ```{eval-rst}
````
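
The split described in the `Config.epilogue_subtile` text can be modeled in plain Python. The sketch below is not Helion code and does not touch a GPU: nested lists stand in for the accumulator tile, and every name in it (`subtile_column_ranges`, `run_epilogue`, `relu_scale`) is hypothetical. It assumes the split runs along the `BLOCK_N` dimension, as the documented shape `[BLOCK_M, BLOCK_N / SUBTILE_FACTOR]` suggests. It only illustrates that processing the epilogue one sub-tile at a time gives the same output while staging fewer elements per step.

```python
# Illustrative model only; none of these names come from Helion.

def subtile_column_ranges(block_n, subtile_factor):
    """Return (start, stop) column ranges that cover [0, block_n) in equal sub-tiles."""
    if subtile_factor is None:
        # Subtiling disabled: the whole tile is a single range.
        return [(0, block_n)]
    if block_n % subtile_factor != 0:
        raise ValueError("block_n must be divisible by subtile_factor")
    width = block_n // subtile_factor
    return [(i * width, (i + 1) * width) for i in range(subtile_factor)]


def run_epilogue(acc, out, epilogue, subtile_factor=None):
    """Apply a pointwise epilogue and store into out, one sub-tile at a time.

    Returns the largest number of elements staged in a single step.
    """
    block_m, block_n = len(acc), len(acc[0])
    peak_staged = 0
    for start, stop in subtile_column_ranges(block_n, subtile_factor):
        # Stage only a [block_m, stop - start] slice of the accumulator.
        staged = [[epilogue(x) for x in row[start:stop]] for row in acc]
        peak_staged = max(peak_staged, block_m * (stop - start))
        # Store the staged slice into the matching output columns.
        for r in range(block_m):
            out[r][start:stop] = staged[r]
    return peak_staged


BLOCK_M, BLOCK_N = 4, 8
acc = [[float(r * BLOCK_N + c) for c in range(BLOCK_N)] for r in range(BLOCK_M)]


def relu_scale(x):
    """Example pointwise epilogue: shift, clamp at zero, then scale."""
    return max(x - 10.0, 0.0) * 0.5


full = [[0.0] * BLOCK_N for _ in range(BLOCK_M)]
split = [[0.0] * BLOCK_N for _ in range(BLOCK_M)]
peak_full = run_epilogue(acc, full, relu_scale)
peak_split = run_epilogue(acc, split, relu_scale, subtile_factor=2)
```

In this toy model, `full` and `split` hold identical values, while the per-step staging size drops from `BLOCK_M * BLOCK_N` to `BLOCK_M * BLOCK_N / 2` elements. How that maps onto real shared-memory usage and pipeline stages depends on the compiler and hardware and is not captured here.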

docs/index.md

Lines changed: 1 addition & 0 deletions
````diff
@@ -66,6 +66,7 @@ portable between different hardware. Helion automates and autotunes over:
 * PID swizzling for improved L2 cache reuse.
 * Loop reordering.
 * Persistent kernel strategies.
+* Epilogue subtiling for matmul-heavy kernels (Blackwell GPUs).
 * Warp specialization choices, unrolling, and more.
 
 ## Try Helion Now
````
