[llvm] [LoopVectorize] Perform loop versioning for some early exit loops (PR #120603)
Shih-Po Hung via llvm-commits
llvm-commits at lists.llvm.org
Fri Feb 21 00:42:52 PST 2025
arcbbb wrote:
> Do I understand correctly that this is generating out-of-bounds loads and relying on page granularity to guarantee they don't trap? If so, we cannot perform this transform with normal loads, because it is UB at the IR level -- the behavior of the underlying hardware is irrelevant.
>
> You're going to need a new load intrinsic to support this. I have this RFC draft on the topic lying around: https://hackmd.io/@nikic/S1O4QWYZkx I haven't submitted it, because I'm not particularly happy with the `%defined_size` parameter, which is needed to specify the operational semantics of the intrinsic, but not relevant for lowering.
For RVV first faulting loads, a VP variant intrinsic could support this, for example `declare { <8 x i8>, i32 } @llvm.vp.load.ff.v8i8.p0(ptr %ptr, <8 x i1> %mask, i32 %evl)`.
It differs from a regular VP load by returning a structure that contains both the loaded data and the updated EVL.
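
To make the intended return value concrete, here is a rough scalar model of such a load in C++. It is only a sketch under my own assumptions, not LLVM code and not a proposed lowering: `modelVPLoadFF` is a hypothetical name, `mem` stands in for the accessible bytes, and reading at or past `mem.size()` plays the role of a fault. It follows the RVV fault-only-first rule that a fault traps only on element 0 and otherwise just shortens the vector length.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical scalar model of a first-faulting VP load on <8 x i8>.
// Returns the loaded lanes together with the updated EVL.
std::pair<std::array<std::uint8_t, 8>, std::uint32_t>
modelVPLoadFF(const std::vector<std::uint8_t> &mem, std::size_t offset,
              const std::array<bool, 8> &mask, std::uint32_t evl) {
  std::array<std::uint8_t, 8> data{}; // lanes not loaded stay zero in this model
  for (std::uint32_t i = 0; i < evl && i < 8; ++i) {
    if (!mask[i])
      continue; // masked-off lanes never access memory
    if (offset + i >= mem.size()) {
      if (i == 0)
        throw std::out_of_range("fault on first element"); // models a real trap
      return {data, i}; // later fault: shrink the EVL instead of trapping
    }
    data[i] = mem[offset + i];
  }
  return {data, evl < 8 ? evl : 8u}; // no fault: EVL unchanged (clamped to 8 lanes)
}
```

With five accessible bytes and a load starting at offset 2, the model returns three valid lanes and an EVL of 3, so the caller knows to process only those lanes before retrying from the next address.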
https://github.com/llvm/llvm-project/pull/120603