Fix _benchmark dropping configs that fail compilation #1942

Open
fulvius31 wants to merge 2 commits into pytorch:main from fulvius31:fix-crash-benchmark

Conversation

@fulvius31
Collaborator

@fulvius31 fulvius31 commented Apr 3, 2026

Summary

  • _benchmark skips configs that fail compile_config but does not emit a placeholder result for them, so the results list ends up shorter than the input population
  • parallel_benchmark_population zips members against results 1:1, and the misaligned pairing trips assert result.config is member.config
  • Fix: track the indices of configs that compile successfully during compilation, then splice inf-perf placeholder results for the failed configs at the end
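A minimal sketch of the failure and the fix described above. Names like compile_config, benchmark, and the (config, perf) result tuples are illustrative stand-ins for Helion's real internals, not its actual API:

```python
import math

def benchmark_population(configs, compile_config, benchmark):
    """Compile each config, benchmark the survivors, then splice
    inf-perf placeholders for failed configs so the results list
    stays 1:1 with the input population."""
    valid_indices = []  # positions of configs that compiled successfully
    compiled = []
    for i, config in enumerate(configs):
        try:
            compiled.append(compile_config(config))
            valid_indices.append(i)
        except Exception:
            pass  # previously dropped entirely, shrinking the results list

    measured = [benchmark(fn) for fn in compiled]

    # One placeholder per input slot, then overwrite only the slots
    # whose configs compiled: len(results) == len(configs) holds by
    # construction, so a downstream 1:1 zip stays aligned.
    results = [(config, math.inf) for config in configs]
    for i, perf in zip(valid_indices, measured):
        results[i] = (configs[i], perf)
    return results
```

With this shape, a downstream check like assert result.config is member.config holds even when some configs fail to compile, because failed slots still carry their original config object.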
Helion compiler triton codegen error for @helion.kernel(config=helion.Config(block_sizes=[16, 8], epilogue_subtile=2, flatten_loops=[True], indexing=['tensor_descriptor', 'tensor_descriptor', 'tensor_descriptor'], l2_groupings=[64], load_eviction_policies=['', 'first'], loop_orders=[[0, 1]], num_sm_multiplier=128, num_stages=7, num_warps=16, pid_type='persistent_blocked', range_flattens=[False], range_multi_buffers=[False], range_unroll_factors=[3], range_warp_specializes=[None]), static_shapes=True)
Traceback (most recent call last):
  File "/home/asangior/redhat/helion/helion/_compiler/inductor_lowering.py", line 1258, in run_node
    result = lowering.codegen(self, n)
             ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/asangior/redhat/helion/helion/_compiler/inductor_lowering.py", line 915, in codegen
    return codegen_fn(
           ^^^^^^^^^^^
  File "/home/asangior/redhat/helion/helion/language/memory_ops.py", line 141, in _
    return strategy.codegen_store(state, tensor, [*subscript], value, extra_mask)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/asangior/redhat/helion/helion/_compiler/indexing_strategy.py", line 525, in codegen_store
    return PointerIndexingStrategy().codegen_store(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/asangior/redhat/helion/helion/_compiler/indexing_strategy.py", line 230, in codegen_store
    indexing = SubscriptIndexing.create(state, fake_tensor, subscript, extra_mask)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/asangior/redhat/helion/helion/_compiler/indexing_strategy.py", line 1141, in create
    per_dim = SubscriptIndexing.compute_per_dim_indexing(
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/asangior/redhat/helion/helion/_compiler/indexing_strategy.py", line 968, in compute_per_dim_indexing
    base_offset = state.codegen.offset_var(block_id)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/asangior/redhat/helion/helion/_compiler/generate_ast.py", line 113, in offset_var
    return self.active_device_loops[block_idx][-1].strategy.offset_var(block_idx)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/asangior/redhat/helion/helion/_compiler/tile_strategy.py", line 737, in offset_var
    raise NotImplementedError("offset_var not used in FlattenedTileStrategy")
NotImplementedError: offset_var not used in FlattenedTileStrategy

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/asangior/redhat/helion/helion/runtime/kernel.py", line 619, in compile_config
    triton_code = self.to_triton_code(
                  ^^^^^^^^^^^^^^^^^^^^
  File "/home/asangior/redhat/helion/helion/runtime/kernel.py", line 555, in to_triton_code
    root = generate_ast(self.host_function, config, emit_repro_caller)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/asangior/redhat/helion/helion/_compiler/generate_ast.py", line 725, in generate_ast
    codegen.add_statement(codegen.visit(stmt))
                          ^^^^^^^^^^^^^^^^^^^
  File "/home/asangior/redhat/helion/helion/_compiler/ast_extension.py", line 284, in visit
    return visitor(node)
           ^^^^^^^^^^^^^
  File "/home/asangior/redhat/helion/helion/_compiler/generate_ast.py", line 509, in visit_For
    codegen_call_with_graph(self, root, [])
  File "/home/asangior/redhat/helion/helion/_compiler/inductor_lowering.py", line 1340, in codegen_call_with_graph
    return GraphInterpreter(graph, cg).run(*new_args)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/asangior/redhat/venv-uv-helion-upstream/lib/python3.12/site-packages/torch/fx/interpreter.py", line 200, in run
    self.env[node] = self.run_node(node)
                     ^^^^^^^^^^^^^^^^^^^
  File "/home/asangior/redhat/helion/helion/_compiler/inductor_lowering.py", line 1307, in run_node
    raise InductorLoweringError(
helion.exc.InductorLoweringError: Error in codegen for node store_1 (<function store at 0x7f561194ce00>): offset_var not used in FlattenedTileStrategy
While processing:
  File "/home/asangior/redhat/helion/examples/add.py", line 50, in add
    out[tile] = x[tile] + y[tile]


While executing %store_1 : [num_users=0] = call_function[target=helion.language.memory_ops.store](args = (%out, [%block_size_0, %add_1], %add_tensor, None), kwargs = {})
Original traceback:
  File "/home/asangior/redhat/helion/examples/add.py", line 50, in add
    out[tile] = x[tile] + y[tile]

Use tlparse to see full graph. (https://github.com/pytorch/tlparse?tab=readme-ov-file#tlparse-parse-structured-pt2-logs)
[0s] Skipping config that failed to compile: %s @helion.kernel(config=helion.Config(block_sizes=[16, 8], epilogue_subtile=2, flatten_loops=[True], indexing=['tensor_descriptor', 'tensor_descriptor', 'tensor_descriptor'], l2_groupings=[64], load_eviction_policies=['', 'first'], loop_orders=[[0, 1]], num_sm_multiplier=128, num_stages=7, num_warps=16, pid_type='persistent_blocked', range_flattens=[False], range_multi_buffers=[False], range_unroll_factors=[3], range_warp_specializes=[None]), static_shapes=True)
Initial population precompiling 100% ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 99/99 62.7 configs/s
Initial population exploring neighbors 100% ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 99/99 17.6 configs/s
Traceback (most recent call last):
  File "/home/asangior/redhat/helion/examples/add.py", line 87, in <module>
    main()
  File "/home/asangior/redhat/helion/examples/add.py", line 83, in main
    check(1024, 1024)
  File "/home/asangior/redhat/helion/examples/add.py", line 70, in check
    run_example(add, torch.add, (x, y))
  File "/home/asangior/redhat/helion/helion/_testing.py", line 1048, in run_example
    result = func(*cloned_args).clone()
             ^^^^^^^^^^^^^^^^^^
  File "/home/asangior/redhat/helion/helion/runtime/kernel.py", line 370, in __call__
    return self.bind(args)(*args)
           ^^^^^^^^^^^^^^^^^^^^^^
  File "/home/asangior/redhat/helion/helion/runtime/kernel.py", line 956, in __call__
    self.ensure_config_exists(args)
  File "/home/asangior/redhat/helion/helion/runtime/kernel.py", line 924, in ensure_config_exists
    self.autotune(args, force=False)
  File "/home/asangior/redhat/helion/helion/runtime/kernel.py", line 776, in autotune
    config = self.env.backend.autotune(self, args, force=force, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/asangior/redhat/helion/helion/_compiler/backend.py", line 534, in autotune
    ).autotune(skip_cache=force)
      ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/asangior/redhat/helion/helion/autotuner/base_cache.py", line 266, in autotune
    config = self.autotuner.autotune()
             ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/asangior/redhat/helion/helion/autotuner/base_search.py", line 1099, in autotune
    best = self._autotune()
           ^^^^^^^^^^^^^^^^
  File "/home/asangior/redhat/helion/helion/autotuner/surrogate_pattern_search.py", line 373, in _autotune
    self.parallel_benchmark_population(self.population, desc="Initial population")
  File "/home/asangior/redhat/helion/helion/autotuner/base_search.py", line 1477, in parallel_benchmark_population
    assert result.config is member.config
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AssertionError

meta-cla bot added the CLA Signed label (managed by the Meta Open Source bot) on Apr 3, 2026
@hinriksnaer
Collaborator

These changes make sense, but I'm not a big fan of adding an extra layer of complexity and tracking to an already complex function. Ideally we'd handle broken configs inline as we progress through the function, e.g. emitting error results directly in the exception handler rather than doing a post-hoc fixup at the end. That said, doing it cleanly requires restructuring how the filtered lists flow through precompilation and benchmarking, which is a bigger change.
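For illustration, a serial sketch of that inline alternative, with hypothetical names rather than Helion's API. The real function precompiles configs in parallel batches, which is exactly why restructuring the filtered lists would be the harder part:

```python
import math

def benchmark_population_inline(configs, compile_config, benchmark):
    """Emit an error result the moment compilation fails, so the
    results list stays 1:1 with configs by construction and no
    post-hoc fixup pass is needed."""
    results = []
    for config in configs:
        try:
            fn = compile_config(config)
        except Exception:
            results.append((config, math.inf))  # error result, inline
            continue
        results.append((config, benchmark(fn)))
    return results
```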

@fulvius31
Collaborator Author

> These changes make sense but I'm not a big fan of adding an extra layer of complexity and tracking to an already complex function. Ideally we'd handle broken configs inline as we progress through the function e.g. emitting error results directly in the exception handler rather than doing a post-hoc fixup at the end. That said, doing it cleanly requires restructuring how the filtered lists flow through precompilation and benchmarking, which is a bigger change.

I agree, but this is a bug that needs to be addressed now: autotuning crashed outright, so I couldn't run it at all.

@hinriksnaer
Collaborator

hinriksnaer commented Apr 3, 2026

I think the general changes are good for now. This function could however use some tidying up in the future.
