[llvm] [VPlan] Add ExtractLane VPInst to extract across multiple parts. (PR #148817)
Florian Hahn via llvm-commits
llvm-commits at lists.llvm.org
Wed Jul 23 07:22:30 PDT 2025
================
@@ -860,6 +860,31 @@ Value *VPInstruction::generate(VPTransformState &State) {
Res = Builder.CreateOr(Res, State.get(Op));
return Builder.CreateOrReduce(Res);
}
+ case VPInstruction::ExtractLane: {
+ Value *LaneToExtract = State.get(getOperand(0), true);
+ Type *IdxTy = State.TypeAnalysis.inferScalarType(getOperand(0));
+ Value *Res = nullptr;
+ Value *RuntimeVF = getRuntimeVF(State.Builder, IdxTy, State.VF);
+
+ for (unsigned Idx = 1; Idx != getNumOperands(); ++Idx) {
+ Value *VectorStart =
+ Builder.CreateMul(RuntimeVF, ConstantInt::get(IdxTy, Idx - 1));
+ Value *VectorIdx = Idx == 1
+ ? LaneToExtract
+ : Builder.CreateSub(LaneToExtract, VectorStart);
----------------
fhahn wrote:
> Will this generate extracts from negative indices? E.g. extractlane 3, a, b for VF = 4 will cause the extract for b to be at -1. Which I think is treated as unsigned according to the langref, and on neon that would be lowered to an illegal memory access? E.g. I thought the select might have blocked the poison but we always perform two loads here:
Yep, if the index is out of range, the result of the extract is poison. The select then picks the non-poison element, so the poison value is never used. Backends still need to make sure that extracting an out-of-range lane is lowered safely.
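To illustrate the point about the select, here is a rough scalar model in plain C++. It is only a sketch, not the VPlan codegen: `std::nullopt` stands in for poison, `VF` is fixed at 4, and the `Lane >= Start` condition is an assumption about how the select is formed, since that part of the patch is not in the hunk quoted above. `Part`, `extractElement` and `extractLane` are hypothetical names.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical scalar model; std::nullopt stands in for poison.
constexpr unsigned VF = 4;
using Part = std::array<int, VF>;

// Models extractelement: an out-of-range index yields poison.
std::optional<int> extractElement(const Part &V, uint64_t Idx) {
  if (Idx >= VF)
    return std::nullopt;
  return V[Idx];
}

// Models extracting lane `Lane` from the concatenation of all parts.
std::optional<int> extractLane(uint64_t Lane, const std::vector<Part> &Parts) {
  assert(!Parts.empty() && "need at least one part");
  std::optional<int> Res;
  for (unsigned P = 0; P != Parts.size(); ++P) {
    uint64_t Start = uint64_t(VF) * P;
    // For Lane < Start the subtraction wraps, giving an out-of-range index.
    uint64_t Idx = Lane - Start;
    std::optional<int> Ext = extractElement(Parts[P], Idx);
    // Assumed select: keep the earlier result unless Lane >= Start.
    if (P == 0 || Lane >= Start)
      Res = Ext;
  }
  return Res;
}
```

With parts `{10, 11, 12, 13}` and `{20, 21, 22, 23}`, lane 3 yields 13 in this model: the extract from the second part uses a wrapped index and is poison, but the select never picks it.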
In the AArch64 case above, `bfi x9, x0, #2, #2` inserts only the low 2 bits of the index into the address, so the load should be in range.
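A small C++ model of the `bfi` makes the masking explicit. This is a sketch under assumptions: it assumes `x9` holds the base address of a 16-byte-aligned slot containing a 4 x i32 vector (modelled here as base 0), and `bfi` and `laneByteOffset` are hypothetical helper names.

```cpp
#include <cassert>
#include <cstdint>

// Models AArch64 `bfi Xd, Xn, #lsb, #width`: copies the low `Width` bits of
// Src into Dst starting at bit `Lsb`, leaving all other bits of Dst unchanged.
inline uint64_t bfi(uint64_t Dst, uint64_t Src, unsigned Lsb, unsigned Width) {
  assert(Width >= 1 && Width < 64 && Lsb + Width <= 64);
  uint64_t Mask = ((uint64_t(1) << Width) - 1) << Lsb;
  return (Dst & ~Mask) | ((Src << Lsb) & Mask);
}

// Hypothetical helper: byte offset of the loaded element within a 16-byte
// slot (assumed base 0) after `bfi x9, x0, #2, #2` with x0 = Index.
inline uint64_t laneByteOffset(uint64_t Index) {
  return bfi(/*Dst=*/0, /*Src=*/Index, /*Lsb=*/2, /*Width=*/2);
}
```

Under these assumptions, an index of -1 (all ones as an unsigned value) maps to byte offset 12, i.e. the last i32 in the slot, so the load stays within the 16 bytes.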
https://github.com/llvm/llvm-project/pull/148817