[llvm] [IA]: Construct (de)interleave4 out of (de)interleave2 (PR #89276)
Paul Walker via llvm-commits
llvm-commits at lists.llvm.org
Tue Jul 30 09:56:17 PDT 2024
================
@@ -16958,40 +17079,94 @@ bool AArch64TargetLowering::lowerDeinterleaveIntrinsicToLoad(
LdN = Builder.CreateCall(LdNFunc, {Pred, Address}, "ldN");
else
LdN = Builder.CreateCall(LdNFunc, Address, "ldN");
-
Value *Idx =
Builder.getInt64(I * LdTy->getElementCount().getKnownMinValue());
- Left = Builder.CreateInsertVector(
- VTy, Left, Builder.CreateExtractValue(LdN, 0), Idx);
- Right = Builder.CreateInsertVector(
- VTy, Right, Builder.CreateExtractValue(LdN, 1), Idx);
+ for (unsigned J = 0; J < Factor; ++J) {
+ WideValues[J] = Builder.CreateInsertVector(
+ VTy, WideValues[J], Builder.CreateExtractValue(LdN, J), Idx);
+ }
+ }
+ if (Factor == 2)
+ Result = PoisonValue::get(StructType::get(VTy, VTy));
+ else
+ Result = PoisonValue::get(StructType::get(VTy, VTy, VTy, VTy));
+ // Construct the wide result out of the small results.
+ for (unsigned J = 0; J < Factor; ++J) {
+ Result = Builder.CreateInsertValue(Result, WideValues[J], J);
}
-
- Result = PoisonValue::get(DI->getType());
- Result = Builder.CreateInsertValue(Result, Left, 0);
- Result = Builder.CreateInsertValue(Result, Right, 1);
} else {
if (UseScalable)
Result = Builder.CreateCall(LdNFunc, {Pred, BaseAddr}, "ldN");
else
Result = Builder.CreateCall(LdNFunc, BaseAddr, "ldN");
}
-
- DI->replaceAllUsesWith(Result);
+ // Iterate over old deinterleaved values to replace them with
+ // the new values.
+ for (unsigned I = 0; I < DeinterleavedValues.size(); I++) {
+ Value *NewExtract = Builder.CreateExtractValue(Result, I);
----------------
paulwalker-arm wrote:
Here you're extracting a field from a struct that you've only just built, so I think you can use `WideValues` directly?
I'm assuming the code is the way it is because of the `NumLoads==1` case, so I'm wondering if you can move the extract values into the else block. That would ensure `WideValues` (or perhaps a better name is now required) is set on both sides of the `NumLoads` comparison, and thus here you'd have access to the extracted values directly (i.e. for `Factor==4` there would be no need to create the intermediary struct, and my two previous comments become irrelevant).
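To make the suggested shape concrete, here is a minimal sketch of the control flow only. It is a toy model, not the patch: plain strings stand in for `Value *`, and the names `ToyValue`, `lowerToy` and `Fields` are hypothetical. The point it illustrates is that when both sides of the `NumLoads` comparison fill the same per-field array, the later replacement loop can index that array directly instead of building a struct and extracting from it again.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy stand-in for llvm::Value*: each "value" is just a descriptive string.
// This is an assumption made so the sketch compiles without LLVM.
using ToyValue = std::string;

// Hypothetical model of the lowering's control flow. Both branches populate
// Fields with one entry per deinterleaved result, so the caller never needs
// an intermediate struct.
std::vector<ToyValue> lowerToy(unsigned Factor, unsigned NumLoads) {
  std::vector<ToyValue> Fields(Factor);
  if (NumLoads > 1) {
    // Several loads: accumulate each field across the partial loads
    // (models the repeated CreateInsertVector calls in the patch).
    for (unsigned I = 0; I < NumLoads; ++I)
      for (unsigned J = 0; J < Factor; ++J)
        Fields[J] += "ldN" + std::to_string(I) + "." + std::to_string(J) + " ";
  } else {
    // Single load: do the extracts here, inside the else block, rather
    // than after the branch.
    for (unsigned J = 0; J < Factor; ++J)
      Fields[J] = "extract(ldN, " + std::to_string(J) + ")";
  }
  return Fields;
}
```

With this shape the replacement loop would simply pair each old deinterleaved value with `Fields[I]`, and the `Factor == 2` versus `Factor == 4` struct type selection would no longer be needed. Whether that maps cleanly onto the real code depends on details of the patch not shown in this hunk.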
https://github.com/llvm/llvm-project/pull/89276