[all-commits] [llvm/llvm-project] 92a065: [LowerMemIntrinsics] Lower llvm.memmove to wide me...
Fabian Ritter via All-commits
all-commits at lists.llvm.org
Thu Jul 25 23:43:52 PDT 2024
Branch: refs/heads/main
Home: https://github.com/llvm/llvm-project
Commit: 92a06546ab50c2b6beac766d989b30960c353e61
https://github.com/llvm/llvm-project/commit/92a06546ab50c2b6beac766d989b30960c353e61
Author: Fabian Ritter <fabian.ritter at amd.com>
Date: 2024-07-26 (Fri, 26 Jul 2024)
Changed paths:
M llvm/lib/Transforms/Utils/LowerMemIntrinsics.cpp
M llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.memmove.ll
M llvm/test/CodeGen/AMDGPU/lower-mem-intrinsics.ll
M llvm/test/CodeGen/NVPTX/lower-aggr-copies.ll
Log Message:
-----------
[LowerMemIntrinsics] Lower llvm.memmove to wide memory accesses (#100122)
So far, the IR-level lowering of llvm.memmove intrinsics generates loops
that copy each byte individually. This can be wasteful for targets that
provide wider memory access operations.
This patch makes the memmove lowering more similar to the lowering of
memcpy with unknown length.
TargetTransformInfo::getMemcpyLoopLoweringType() is queried for an
adequate type for the memory accesses, and if it is wider than a single
byte, the greatest multiple of the type's size that is less than or
equal to the length is copied with corresponding wide memory accesses. A
residual loop with byte-wise accesses (or a sequence of suitable memory
accesses in case the length is statically known) is introduced for the
remaining bytes.
For memmove, this construct is required in two variants: one for copying
forward and one for copying backwards, to handle overlapping memory
ranges. For the backwards case, the residual code still covers the bytes
at the end of the copied region and is therefore executed before the
wide main loop. This implementation choice is based on the assumption
that we are more likely to encounter memory ranges whose start aligns
with the access width than ones whose end does.
In microbenchmarks on gfx1030 (AMDGPU), this change yields speedups up
to 16x for memmoves with variable or large constant lengths.
Part of SWDEV-455845.
To unsubscribe from these emails, change your notification settings at https://github.com/llvm/llvm-project/settings/notifications
More information about the All-commits
mailing list