[PATCH] D138203: [AArch64][InstCombine] Simplify repeated complex patterns in dupqlane
Sander de Smalen via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Jan 9 04:24:02 PST 2023
sdesmalen added inline comments.
================
Comment at: llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp:1468-1469
+ IntrinsicInst &II) {
+ if (II.getType()->isIntOrIntVectorTy())
+ return std::nullopt;
+
----------------
Why is this optimisation limited to integer types?
================
Comment at: llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp:1471-1472
+
+ auto VecInsert = dyn_cast<IntrinsicInst>(II.getOperand(0));
+ if (!VecInsert || VecInsert->getIntrinsicID() != Intrinsic::vector_insert ||
+ !isa<FixedVectorType>(VecInsert->getOperand(1)->getType()))
----------------
It might be easier to write this as:
Value *CurrentInsertElt = nullptr, *Default = nullptr;
if (!match(II, m_Intrinsic<Intrinsic::vector_insert>(m_Value(Default), m_Value(CurrentInsertElt), m_Value()) ||
!isa<FixedVectorType(CurrentInsertElt->getType())
as that matches the operands in one go.
================
Comment at: llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp:1481
+ SmallVector<Value *> InsertEltVec(NumElements, nullptr);
+ for (uint64_t I = NumElements; I > 0; I--) {
+ InsertElementInst *InsertElt =
----------------
Does this need to be a decrementing loop? Or could you write:
while (InsertElementInst *InsertElt =
dyn_cast<InsertElementInst>(CurrentInsertElt)) {
auto Idx = cast<ConstantInt>(InsertElt->getOperand(2));
InsertEltVec[Idx->getValue().getZExtValue()] = InsertElt->getOperand(1);
CurrentInsertElt = InsertElt->getOperand(0);
}
================
Comment at: llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp:1491
+
+ bool AllowPoison = isa<PoisonValue>(VecInsert->getOperand(0));
+ // Bail out if there is no pattern found
----------------
You'll also need to check the lanes of the original vector being inserted into:
%1 = insertelement <4 x float> poison, float %x, i64 0 ; <--- this poison value must be checked instead of the llvm.vector.insert one.
%2 = insertelement <4 x float> %1, float %y, i64 1
%3 = tail call <vscale x 4 x float> @llvm.vector.insert.nxv4f32.v4f32(<vscale x 4 x float> poison, <4 x float> %2, i64 0)
%4 = tail call <vscale x 4 x float> @llvm.aarch64.sve.dupq.lane.nxv4f32(<vscale x 4 x float> %3, i64 0)
However, you'll also need to check the default value of the llvm.vector.insert for this case:
%1 = insertelement <2 x float> poison, float %x, i64 0
%2 = insertelement <2 x float> %1, float %y, i64 1
%3 = tail call <vscale x 4 x float> @llvm.vector.insert.nxv4f32.v2f32(<vscale x 4 x float> poison, <2 x float> %2, i64 0) ; the full subvector is defined, but not all lanes in the subvector that's being dupq'ed.
%4 = tail call <vscale x 4 x float> @llvm.aarch64.sve.dupq.lane.nxv4f32(<vscale x 4 x float> %3, i64 0)
================
Comment at: llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp:1437
+static SmallVector<Value *> SimplifyValuePattern(SmallVector<Value *> Vec,
+ bool AllowPoison) {
----------------
MattDevereau wrote:
> MattDevereau wrote:
> > sdesmalen wrote:
> > > It's better to not pass/return large objects by value. If you can't use ArrayRef because you have to modify the values, perhaps you can pass it by reference, and make it both the input and output array (for example by truncating the result)
> > I'm questioning whether this is adding gold plating. The SmallVector's are of pointers to values so it's just the SmallVector objects that are being copied, and this function will recurse at most 15 times. This will also need rewiring to use bool returns as the input and output vectors from this function are compared for equality to check if any pattern was found. It will be beneficial though so I'll have a go at implementing it.
> I've re-written this to now use a single vector passed by reference.
Thanks, that looks much better!
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D138203/new/
https://reviews.llvm.org/D138203
More information about the llvm-commits
mailing list