Commit d848c74
Drop the escape in canUseDPP.
With the escape, the DPP path could be taken when blockSize >
maxActiveReductionThreads, leaving extra threads with nrtid >= 1
(outside the valid [0, 1) range) that would compute out-of-bounds
LDS coordinates. Tuning data across f16/f32/int8 attention configs
shows nrDimProd is always >= 16, so this escape was never actually
triggered and removing it does not change behavior for any current
configuration.

1 parent aac9312 commit d848c74
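The failure mode described in the commit message can be illustrated with a toy model. This is only a sketch under stated assumptions, not the code touched by this commit: it assumes a flat thread id is split into `nrtid` and `rtid` by division and remainder, and the names `splitThreadId`, `countOutOfRangeThreads`, and `reductionThreadsPerRow` are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Toy model only. The tid -> (nrtid, rtid) split below is an assumption made
// for illustration; the real index computation may differ.
struct ThreadCoord {
  int64_t nrtid; // non-reduction index, valid range [0, nrDimProd)
  int64_t rtid;  // reduction index, valid range [0, reductionThreadsPerRow)
};

// Hypothetical helper: split a flat thread id into the two indices.
inline ThreadCoord splitThreadId(int64_t tid, int64_t reductionThreadsPerRow) {
  assert(reductionThreadsPerRow > 0);
  return {tid / reductionThreadsPerRow, tid % reductionThreadsPerRow};
}

// Hypothetical helper: count threads in the block whose nrtid falls outside
// [0, nrDimProd). In this model those are the threads that would compute
// out-of-bounds coordinates.
inline int64_t countOutOfRangeThreads(int64_t blockSize, int64_t nrDimProd,
                                      int64_t reductionThreadsPerRow) {
  int64_t count = 0;
  for (int64_t tid = 0; tid < blockSize; ++tid) {
    if (splitThreadId(tid, reductionThreadsPerRow).nrtid >= nrDimProd)
      ++count;
  }
  return count;
}
```

Under these assumptions, `countOutOfRangeThreads(128, 1, 64)` returns 64: the block has twice as many threads as the 64 active ones, and every extra thread gets `nrtid == 1`, outside `[0, 1)`. With `blockSize == 64` the count is 0.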
1 file changed
Lines changed: 5 additions & 4 deletions
Diff (line content not captured):
Hunk 1: original lines 1363-1364 removed; new lines 1363-1366 added.
Hunk 2: original lines 1373-1374 removed; new line 1375 added.