[BPF] Consider alignment when selecting memcpy/memmove/memset op type by yonghong-song · Pull Request #204042 · llvm/llvm-project

yonghong-song · 2026-06-16T01:20:12Z

Currently getOptimalMemOpType() forces i64 for size >= 8 and i32 otherwise without considering the operation alignment. In [1], memcpy and memmove destination and source alignments are passed separately instead of being collapsed to their minimum. As a result, poor code may be generated for a copy with a well aligned destination but a poorly aligned source: a wide store ends up fed by a wide load that the legalizer has to synthesize from byte loads via a chain of shifts and ORs.

Return MVT::Other from getOptimalMemOpType() so that findOptimalMemOpLowering() in SelectionDAG/TargetLowering.cpp selects a proper value type and alignment.

Newly added memcpy-align.ll uses u8 type for load and stores. Without BPFISelLowering.h change, a chain of shifts and ORs will happen.

The rodata_1.ll change is only instruction scheduling / register allocation churn caused by the different but equivalent lowering.

[1] #201119

Currently getOptimalMemOpType() forces i64 for size >= 8 and i32 otherwise without considering the operation alignment. In [1], memcpy and memmove destination and source alignments are passed separately instead of being collapsed to their minimum. As a result, poor code may be generated for a copy with a well aligned destination but a poorly aligned source: a wide store ends up fed by a wide load that the legalizer has to synthesize from byte loads via a chain of shifts and ORs. Return MVT::Other from getOptimalMemOpType() so that findOptimalMemOpLowering() in SelectionDAG/TargetLowering.cpp selects a proper value type and alignment. Newly added memcpy-align.ll uses u8 type for load and stores. Without BPFISelLowering.h change, a chain of shifts and ORs will happen. The rodata_1.ll change is only instruction scheduling / register allocation churn caused by the different but equivalent lowering. [1] llvm#201119

yonghong-song · 2026-06-16T02:11:50Z

In BPFSelectionDAGInfo.cpp, we have

static cl::opt<unsigned> BPFMaxStoresPerMemFunc(
    "bpf-max-stores-per-memfunc", cl::Hidden, cl::init(128),
    cl::desc("Set the maximum number of stores for inlined BPF memory "
             "intrinsics"));

So the default value for MaxStoresPerMemcpy/MaxStoresPerMemset etc. is 128.
The '128' is for the number of actual copy.

    // inline memcpy() for kernel to see explicit copy
    unsigned CommonMaxStores =
      STI.getSelectionDAGInfo()->getCommonMaxStoresPerMemFunc();

    MaxStoresPerMemset = MaxStoresPerMemsetOptSize = CommonMaxStores;
    MaxStoresPerMemcpy = MaxStoresPerMemcpyOptSize = CommonMaxStores;
    MaxStoresPerMemmove = MaxStoresPerMemmoveOptSize = CommonMaxStores;
    MaxLoadsPerMemcmp = MaxLoadsPerMemcmpOptSize = CommonMaxStores;

For example, for alignment 8, the maximum to-be-copied memory size is 1024.
But for alignment 1, the maximum to-be-copied memory size is 128.

I attempt to think whether we should implement findOptimalMemOpLowering() in bpf backend so for alignment 1, we can set MaxStoresPerMemcpy to be 1024, so the same number of maximum memory is used for alignment 1 and alignment 8. But I checked
other architectures (AArch64 and SystemZ) and their usage didn't change 'limit' (count).

But if we increase alignment 1 to say alignment 8 for bpf source code, it won't be a problem. So probably findOptimalMemOpLowering() is not really needed. Your opinion is appreciated.

yonghong-song requested review from 4ast, eddyz87 and efriedma-quic June 16, 2026 02:12

yonghong-song added the backend:BPF label Jun 16, 2026

yonghong-song mentioned this pull request Jun 16, 2026

[SelectionDAG] Pass dest and src alignments separately to memcpy and memmove lowering functions #201119

Merged

eddyz87 approved these changes Jun 16, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[BPF] Consider alignment when selecting memcpy/memmove/memset op type#204042

[BPF] Consider alignment when selecting memcpy/memmove/memset op type#204042
yonghong-song wants to merge 1 commit into
llvm:mainfrom
yonghong-song:fix-selection-dag

yonghong-song commented Jun 16, 2026

Uh oh!

yonghong-song commented Jun 16, 2026 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

yonghong-song commented Jun 16, 2026

Uh oh!

yonghong-song commented Jun 16, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

yonghong-song commented Jun 16, 2026 •

edited

Loading