[Auto-Recovery] Add checkpoint save/load/resume #1947
Open
yf225 wants to merge 1 commit into yf225/stack/95
This was referenced Apr 4, 2026
yf225 added a commit that referenced this pull request on Apr 4, 2026:

Add opt-in checkpoint support gated behind HELION_AUTOTUNE_CHECKPOINT_DIR. When set, the autotuner saves in-progress state each generation and can resume from a checkpoint on subsequent runs. The checkpoint file is deleted on successful completion.

Includes pickle serialization support for BaseSearch and PopulationMember, stable-hash-based checkpoint file naming, atomic writes, and kernel recompilation on checkpoint load.

stack-info: PR: #1947, branch: yf225/stack/96
Stacked PRs:
[Auto-Recovery] Add checkpoint save/load/resume
Add opt-in checkpoint support gated behind HELION_AUTOTUNE_CHECKPOINT_DIR.
When set, the autotuner saves in-progress state each generation and can
resume from a checkpoint on subsequent runs. The checkpoint file is
deleted on successful completion.
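
As a rough illustration of the save/resume/delete lifecycle described above, here is a minimal sketch. Only the environment variable name comes from this PR; `checkpoint_path`, `run_search`, the `stop_after` parameter, and the toy state dict are hypothetical and are not Helion's actual search classes. The write here is deliberately a plain overwrite to keep the example short.

```python
from __future__ import annotations

import os
import pickle
from pathlib import Path

ENV_VAR = "HELION_AUTOTUNE_CHECKPOINT_DIR"  # name taken from the PR description


def checkpoint_path(name: str) -> Path | None:
    """Return the checkpoint file path, or None when checkpointing is disabled."""
    directory = os.environ.get(ENV_VAR)
    if not directory:
        return None  # opt-in: unset or empty variable means no checkpointing
    return Path(directory) / f"{name}.pkl"


def run_search(num_generations: int, name: str = "demo",
               stop_after: int | None = None) -> dict:
    """Toy generation loop (hypothetical stand-in for an autotuner search)."""
    path = checkpoint_path(name)
    state = {"generation": 0, "best": float("inf")}

    # Resume: pick up where a previous, interrupted run left off.
    if path is not None and path.exists():
        with path.open("rb") as f:
            state = pickle.load(f)

    while state["generation"] < num_generations:
        gen = state["generation"]
        # Placeholder "benchmark" result for this generation.
        state["best"] = min(state["best"], 100.0 / (gen + 1))
        state["generation"] = gen + 1

        # Save in-progress state after every generation.
        if path is not None:
            path.parent.mkdir(parents=True, exist_ok=True)
            with path.open("wb") as f:
                pickle.dump(state, f)

        # Simulate an interruption: return early and leave the checkpoint behind.
        if stop_after is not None and state["generation"] >= stop_after:
            return state

    # Successful completion: the checkpoint is no longer needed.
    if path is not None and path.exists():
        path.unlink()
    return state
```

Calling `run_search(5, stop_after=2)` and then `run_search(5)` with the variable set continues from generation 2 instead of starting over, and the second call removes the file once it finishes.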
Includes pickle serialization support for BaseSearch and PopulationMember,
stable-hash-based checkpoint file naming, atomic writes, and kernel
recompilation on checkpoint load.
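
The three mechanisms listed above can be sketched as follows. This is an illustration under stated assumptions, not the code in this PR: `stable_checkpoint_name`, `atomic_write_bytes`, and the `Member` class are hypothetical names, and the key string and file suffix are made up. The sketch uses `hashlib` because Python's built-in `hash()` on strings is randomized per process by default, so it cannot name a file that a later run needs to find.

```python
from __future__ import annotations

import hashlib
import os
import tempfile
from pathlib import Path


def stable_checkpoint_name(key: str) -> str:
    """Derive a file name that is identical across processes for the same key."""
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()[:16]
    return f"{digest}.checkpoint"


def atomic_write_bytes(path: Path, data: bytes) -> None:
    """Write to a temp file in the same directory, then rename over the target.

    A reader sees either the old file or the complete new one, never a
    partially written checkpoint.
    """
    path.parent.mkdir(parents=True, exist_ok=True)
    fd, tmp = tempfile.mkstemp(dir=path.parent, prefix=path.name, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp, path)  # atomic rename within one filesystem
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise


class Member:
    """Hypothetical stand-in for a population member holding a compiled kernel."""

    def __init__(self, config: dict, perf: float) -> None:
        self.config = config
        self.perf = perf
        self.fn = self.compile()

    def compile(self):
        # Placeholder for kernel compilation; a lambda cannot be pickled.
        scale = self.config["scale"]
        return lambda x: x * scale

    def __getstate__(self) -> dict:
        state = self.__dict__.copy()
        state["fn"] = None  # drop the unpicklable compiled callable
        return state

    def __setstate__(self, state: dict) -> None:
        self.__dict__.update(state)
        self.fn = self.compile()  # recompile when the checkpoint is loaded
```

With these pieces, a checkpoint would be written as `atomic_write_bytes(directory / stable_checkpoint_name(key), pickle.dumps(member))`, and `pickle.loads` on the stored bytes yields a member whose `fn` has been rebuilt from its config.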