JIT: support TYP_LONG induction variables in loop analysis and cloning#129388
JIT: support TYP_LONG induction variables in loop analysis and cloning#129388AndyAyersMS wants to merge 8 commits into
Conversation
AnalyzeIteration and the loop cloner now recognize long IVs end-to-end, covering loop limits like long params, `arr.Length`, and `arr.LongLength` (the bounds-check length is widened via a non-overflowing cast as needed). A new LoopsWithLongIV metric reports how often the analysis fires. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
|
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch |
|
@tannergooding FYI Local runs show no new cloning in SPMI, all long-IV controlled loops seen are pointer math so no bounds check removal opportunities to inspire cloning. However we might consider taking this regardless, just for future-proofing. |
There was a problem hiding this comment.
Pull request overview
This PR updates CoreCLR JIT loop iteration analysis and loop cloning to recognize and carry TYP_LONG induction variables through more of the pipeline, and adds a new JIT metadata metric to track when long-IV iteration analysis succeeds.
Changes:
- Extend loop iteration analysis (and related consumers) to accept
TYP_LONGIVs, including limit/offset handling and cast “peeling” for widened limits. - Update loop cloning symbolic identifiers/conditions to support long-typed constants/offsets and to widen int-vs-long comparisons.
- Add a
LoopsWithLongIVJIT metadata metric.
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| src/coreclr/jit/optimizer.cpp | Loosens IV increment-constant recognition and blocks unrolling for non-TYP_INT IVs. |
| src/coreclr/jit/loopcloning.h | Expands loop-cloning symbolic identifier representation to carry long constants/offsets. |
| src/coreclr/jit/loopcloning.cpp | Emits long constants/offsets in generated trees and widens int/long compare operands for cloning conditions; adds long-IV metric increment. |
| src/coreclr/jit/jitmetadatalist.h | Adds LoopsWithLongIV metric to the metadata metric list. |
| src/coreclr/jit/flowgraph.cpp | Extends AnalyzeIteration limit/init recognition to handle TYP_LONG, including peeling casts/offsets and evaluating trivial base cases for long. |
| src/coreclr/jit/compiler.h | Updates NaturalLoopIterInfo to store constants/offsets as pointer-sized integers and exposes them via updated signatures. |
Comments suppressed due to low confidence (1)
src/coreclr/jit/compiler.h:1975
- NaturalLoopIterInfo now stores IV constants/offsets in
ssize_t(ConstInitValue, LimitOffset) and exposes them viassize_t IterConst()/ConstLimit().ssize_tis pointer-sized, so on 32-bit builds it cannot represent full 64-bit TYP_LONG IV values (genTypeSize(TYP_LONG)==8) and will truncate long IV constants/offsets. These fields/return types should use an explicit 64-bit type (e.g. int64_t) for TYP_LONG support rather thanssize_t.
// Constant value that the induction variable is initialized with, outside
// the loop. Only valid if HasConstInit is true. Width matches the IV
// (TYP_INT IV: int-sized; TYP_LONG IV: long-sized).
ssize_t ConstInitValue = 0;
// Tree that has the loop test for the induction variable.
GenTree* TestTree = nullptr;
// Block that has the loop test.
BasicBlock* TestBlock = nullptr;
// Tree that mutates the induction variable.
GenTree* IterTree = nullptr;
I wonder how much of this is limited because we don't recognize the loop limit as being special.... One scenario that I see is typically something like the below, where they arrays can index directly and spans often use int sum = 0;
// similar for span, but there is no `nint` indexer, so devs either use `Unsafe.Add` or cast back to `int`
for (nint i = 0; i < array.Length; i++)
{
sum += array[i];
}
return sum;But the other case is how public ref struct CustomSpan<T>
{
private readonly ref T _reference;
private readonly nint _length;
public nint Length => _length;
public ref T this[nint index]
{
get
{
// Type invariant is that `_length` is never negative, the JIT specially handles this for `System.Span` but we can't relay that for our custom type
if ((nuint)(index) >= (nuint)(_length))
{
ThrowIndexOutOfRangeException();
}
return ref Unsafe.Add(ref _reference, index);
}
}
} |
Widen the local stride/limit/offset/init variables that hold ssize_t accessor results so the implicit int truncation no longer trips MSVC C4244 under /WX. For long IVs, route the limit and init emission through a small helper that picks CreateLongConst when ivType is TYP_LONG; for int IVs the result is identical to before. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Use int64_t instead of ssize_t for fields and accessors that hold IV- typed constants and offsets so 64-bit values are representable on a 32-bit JIT host. Respect the comparison's signedness when widening an int operand to long. Tighten the stride form-check to require the constant's actual type to match the IV's. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 6 out of 6 changed files in this pull request and generated 10 comments.
Comments suppressed due to low confidence (1)
src/coreclr/jit/flowgraph.cpp:5901
MatchLimituseslimitOp->IsCnsIntOrI()to recognize constant limits, but that only matches GT_CNS_INT. For long-IV loops on 32-bit targets, constant limits may be GT_CNS_LNG, so the loop won't be recognized. Consider usinglimitOp->IsIntegralConst()(and ensure downstreamConstLimit()uses IntConCommon APIs).
// Check what type of limit we have - constant, variable or arr-len.
if (limitOp->IsCnsIntOrI())
{
info->HasConstLimit = true;
if ((limitOp->gtFlags & GTF_ICON_SIMD_COUNT) != 0)
On 32-bit targets long constants are GT_CNS_LNG. Switch the loop IV form-checks and accessors to IsIntegralConst / AsIntConCommon / IntegralValue so long IV inits, limits, strides, and limit-offset peels work across both target widths. Replace abs(stride) with an INT64_MIN-safe negation and widen the stride/offset JITDUMP format specifiers to %lld. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
If LimitOffset was peeled from a long-typed addend that does not fit in int32, peeling the CAST long <- int would leave an int base paired with an out-of-range offset; ToGenTree would then truncate the offset when materializing the int-sized add. Refuse the cast peel in that case. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Only peel the long-from-int limit CAST when LimitOffset is non-positive so that a positive offset cannot overflow the int-domain add performed during downstream materialization. Add a LongInductionVar test under JIT/opt/Cloning covering long IVs with array-length, long-length, parameter, constant, and arr.Length-3 limits. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
AnalyzeIteration and the loop cloner now recognize long IVs end-to-end, covering loop limits like long params,
arr.Length, andarr.LongLength(the bounds-check length is widened via a non-overflowing cast as needed). A new LoopsWithLongIV metric reports how often the analysis fires.