Fix _benchmark dropping configs that fail compilation #1942

Open
fulvius31 wants to merge 2 commits into pytorch:main from fulvius31:fix-crash-benchmark

Conversation

@fulvius31
Collaborator

@fulvius31 fulvius31 commented Apr 3, 2026

Summary

  • _benchmark skips configs that fail compile_config but does not emit a placeholder result for them, so the results list ends up shorter than the input population
  • parallel_benchmark_population zips members against results 1:1, and the misaligned pairing trips assert result.config is member.config
  • Fix: track the indices of configs that compile successfully during compilation, then splice inf-perf placeholder results for the failed configs at the end
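A minimal sketch of the failure and the fix described above. Names like compile_config, benchmark, and the (config, perf) result tuples are illustrative stand-ins for Helion's real internals, not its actual API:

```python
import math

def benchmark_population(configs, compile_config, benchmark):
    """Compile each config, benchmark the survivors, then splice
    inf-perf placeholders for failed configs so the results list
    stays 1:1 with the input population."""
    valid_indices = []  # positions of configs that compiled successfully
    compiled = []
    for i, config in enumerate(configs):
        try:
            compiled.append(compile_config(config))
            valid_indices.append(i)
        except Exception:
            pass  # previously dropped entirely, shrinking the results list

    measured = [benchmark(fn) for fn in compiled]

    # One placeholder per input slot, then overwrite only the slots
    # whose configs compiled: len(results) == len(configs) holds by
    # construction, so a downstream 1:1 zip stays aligned.
    results = [(config, math.inf) for config in configs]
    for i, perf in zip(valid_indices, measured):
        results[i] = (configs[i], perf)
    return results
```

With this shape, a downstream check like assert result.config is member.config holds even when some configs fail to compile, because failed slots still carry their original config object.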
Helion compiler triton codegen error for @helion.kernel(config=helion.Config(block_sizes=[16, 8], epilogue_subtile=2, flatten_loops=[True], indexing=['tensor_descriptor', 'tensor_descriptor', 'tensor_descriptor'], l2_groupings=[64], load_eviction_policies=['', 'first'], loop_orders=[[0, 1]], num_sm_multiplier=128, num_stages=7, num_warps=16, pid_type='persistent_blocked', range_flattens=[False], range_multi_buffers=[False], range_unroll_factors=[3], range_warp_specializes=[None]), static_shapes=True)
Traceback (most recent call last):
  File "/home/asangior/redhat/helion/helion/_compiler/inductor_lowering.py", line 1258, in run_node
    result = lowering.codegen(self, n)
             ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/asangior/redhat/helion/helion/_compiler/inductor_lowering.py", line 915, in codegen
    return codegen_fn(
           ^^^^^^^^^^^
  File "/home/asangior/redhat/helion/helion/language/memory_ops.py", line 141, in _
    return strategy.codegen_store(state, tensor, [*subscript], value, extra_mask)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/asangior/redhat/helion/helion/_compiler/indexing_strategy.py", line 525, in codegen_store
    return PointerIndexingStrategy().codegen_store(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/asangior/redhat/helion/helion/_compiler/indexing_strategy.py", line 230, in codegen_store
    indexing = SubscriptIndexing.create(state, fake_tensor, subscript, extra_mask)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/asangior/redhat/helion/helion/_compiler/indexing_strategy.py", line 1141, in create
    per_dim = SubscriptIndexing.compute_per_dim_indexing(
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/asangior/redhat/helion/helion/_compiler/indexing_strategy.py", line 968, in compute_per_dim_indexing
    base_offset = state.codegen.offset_var(block_id)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/asangior/redhat/helion/helion/_compiler/generate_ast.py", line 113, in offset_var
    return self.active_device_loops[block_idx][-1].strategy.offset_var(block_idx)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/asangior/redhat/helion/helion/_compiler/tile_strategy.py", line 737, in offset_var
    raise NotImplementedError("offset_var not used in FlattenedTileStrategy")
NotImplementedError: offset_var not used in FlattenedTileStrategy

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/asangior/redhat/helion/helion/runtime/kernel.py", line 619, in compile_config
    triton_code = self.to_triton_code(
                  ^^^^^^^^^^^^^^^^^^^^
  File "/home/asangior/redhat/helion/helion/runtime/kernel.py", line 555, in to_triton_code
    root = generate_ast(self.host_function, config, emit_repro_caller)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/asangior/redhat/helion/helion/_compiler/generate_ast.py", line 725, in generate_ast
    codegen.add_statement(codegen.visit(stmt))
                          ^^^^^^^^^^^^^^^^^^^
  File "/home/asangior/redhat/helion/helion/_compiler/ast_extension.py", line 284, in visit
    return visitor(node)
           ^^^^^^^^^^^^^
  File "/home/asangior/redhat/helion/helion/_compiler/generate_ast.py", line 509, in visit_For
    codegen_call_with_graph(self, root, [])
  File "/home/asangior/redhat/helion/helion/_compiler/inductor_lowering.py", line 1340, in codegen_call_with_graph
    return GraphInterpreter(graph, cg).run(*new_args)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/asangior/redhat/venv-uv-helion-upstream/lib/python3.12/site-packages/torch/fx/interpreter.py", line 200, in run
    self.env[node] = self.run_node(node)
                     ^^^^^^^^^^^^^^^^^^^
  File "/home/asangior/redhat/helion/helion/_compiler/inductor_lowering.py", line 1307, in run_node
    raise InductorLoweringError(
helion.exc.InductorLoweringError: Error in codegen for node store_1 (<function store at 0x7f561194ce00>): offset_var not used in FlattenedTileStrategy
While processing:
  File "/home/asangior/redhat/helion/examples/add.py", line 50, in add
    out[tile] = x[tile] + y[tile]


While executing %store_1 : [num_users=0] = call_function[target=helion.language.memory_ops.store](args = (%out, [%block_size_0, %add_1], %add_tensor, None), kwargs = {})
Original traceback:
  File "/home/asangior/redhat/helion/examples/add.py", line 50, in add
    out[tile] = x[tile] + y[tile]

Use tlparse to see full graph. (https://github.com/pytorch/tlparse?tab=readme-ov-file#tlparse-parse-structured-pt2-logs)
[0s] Skipping config that failed to compile: %s @helion.kernel(config=helion.Config(block_sizes=[16, 8], epilogue_subtile=2, flatten_loops=[True], indexing=['tensor_descriptor', 'tensor_descriptor', 'tensor_descriptor'], l2_groupings=[64], load_eviction_policies=['', 'first'], loop_orders=[[0, 1]], num_sm_multiplier=128, num_stages=7, num_warps=16, pid_type='persistent_blocked', range_flattens=[False], range_multi_buffers=[False], range_unroll_factors=[3], range_warp_specializes=[None]), static_shapes=True)
Initial population precompiling 100% ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 99/99 62.7 configs/s
Initial population exploring neighbors 100% ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 99/99 17.6 configs/s
Traceback (most recent call last):
  File "/home/asangior/redhat/helion/examples/add.py", line 87, in <module>
    main()
  File "/home/asangior/redhat/helion/examples/add.py", line 83, in main
    check(1024, 1024)
  File "/home/asangior/redhat/helion/examples/add.py", line 70, in check
    run_example(add, torch.add, (x, y))
  File "/home/asangior/redhat/helion/helion/_testing.py", line 1048, in run_example
    result = func(*cloned_args).clone()
             ^^^^^^^^^^^^^^^^^^
  File "/home/asangior/redhat/helion/helion/runtime/kernel.py", line 370, in __call__
    return self.bind(args)(*args)
           ^^^^^^^^^^^^^^^^^^^^^^
  File "/home/asangior/redhat/helion/helion/runtime/kernel.py", line 956, in __call__
    self.ensure_config_exists(args)
  File "/home/asangior/redhat/helion/helion/runtime/kernel.py", line 924, in ensure_config_exists
    self.autotune(args, force=False)
  File "/home/asangior/redhat/helion/helion/runtime/kernel.py", line 776, in autotune
    config = self.env.backend.autotune(self, args, force=force, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/asangior/redhat/helion/helion/_compiler/backend.py", line 534, in autotune
    ).autotune(skip_cache=force)
      ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/asangior/redhat/helion/helion/autotuner/base_cache.py", line 266, in autotune
    config = self.autotuner.autotune()
             ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/asangior/redhat/helion/helion/autotuner/base_search.py", line 1099, in autotune
    best = self._autotune()
           ^^^^^^^^^^^^^^^^
  File "/home/asangior/redhat/helion/helion/autotuner/surrogate_pattern_search.py", line 373, in _autotune
    self.parallel_benchmark_population(self.population, desc="Initial population")
  File "/home/asangior/redhat/helion/helion/autotuner/base_search.py", line 1477, in parallel_benchmark_population
    assert result.config is member.config
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AssertionError

meta-cla bot added the CLA Signed label (managed by the Meta Open Source bot) on Apr 3, 2026
@hinriksnaer
Collaborator

These changes make sense, but I'm not a big fan of adding an extra layer of complexity and tracking to an already complex function. Ideally we'd handle broken configs inline as we progress through the function, e.g. emitting error results directly in the exception handler rather than doing a post-hoc fixup at the end. That said, doing it cleanly requires restructuring how the filtered lists flow through precompilation and benchmarking, which is a bigger change.
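For illustration, a serial sketch of that inline alternative, with hypothetical names rather than Helion's API. The real function precompiles configs in parallel batches, which is exactly why restructuring the filtered lists would be the harder part:

```python
import math

def benchmark_population_inline(configs, compile_config, benchmark):
    """Emit an error result the moment compilation fails, so the
    results list stays 1:1 with configs by construction and no
    post-hoc fixup pass is needed."""
    results = []
    for config in configs:
        try:
            fn = compile_config(config)
        except Exception:
            results.append((config, math.inf))  # error result, inline
            continue
        results.append((config, benchmark(fn)))
    return results
```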

@fulvius31
Collaborator Author

> These changes make sense but I'm not a big fan of adding an extra layer of complexity and tracking to an already complex function. Ideally we'd handle broken configs inline as we progress through the function e.g. emitting error results directly in the exception handler rather than doing a post-hoc fixup at the end. That said, doing it cleanly requires restructuring how the filtered lists flow through precompilation and benchmarking, which is a bigger change.

I agree, but this is a bug that needs to be addressed now: autotuning crashed outright, so I couldn't run it at all.

@hinriksnaer
Collaborator

hinriksnaer commented Apr 3, 2026

I think the general changes are good for now. This function could however use some tidying up in the future.
