[PATCH] D146737: [AMDGPU] Default component broadcast store
Matt Arsenault via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri Mar 24 07:56:29 PDT 2023
arsenm added inline comments.
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUInstCombineIntrinsic.cpp:354
+// zeros at the end of the passed vector.
+static APInt findDemandedElts(Value *V, unsigned DMaskVal = 15) {
+ unsigned VWidth = cast<FixedVectorType>(V->getType())->getNumElements();
----------------
Can you use possiblyDemandedEltsInMask or something else from vector utils? This doesn't feel like something you would need to invent yourself
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUInstCombineIntrinsic.cpp:356
+ unsigned VWidth = cast<FixedVectorType>(V->getType())->getNumElements();
+ Instruction *CurInst = dyn_cast<Instruction>(V);
+ Value *FirstArg = CurInst;
----------------
This is an unchecked dyn_cast. You already did a dyn_cast at the call site, so just pass in Instruction to begin with?
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUInstCombineIntrinsic.cpp:359
+
+ std::vector<int> Components(VWidth, -1);
+ std::map<Value *, int> MapValueInt;
----------------
SmallVector
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUInstCombineIntrinsic.cpp:360
+ std::vector<int> Components(VWidth, -1);
+ std::map<Value *, int> MapValueInt;
+ int LastElem = 0;
----------------
Don't use std::map. Also, can this just be a SmallVector in the reverse direction per component?
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUInstCombineIntrinsic.cpp:380
+ }
+ if (CurInst->getOpcode() == Instruction::InsertElement) {
+ InsertElementInst *IEI = dyn_cast<InsertElementInst>(CurInst);
----------------
Replace the opcode check with the dyn_cast
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUInstCombineIntrinsic.cpp:395-396
+ FirstArg = CurInst->getOperand(0);
+ } else if (CurInst->getOpcode() == Instruction::ShuffleVector) {
+ ShuffleVectorInst *Shuffle = dyn_cast<ShuffleVectorInst>(CurInst);
+
----------------
Replace the opcode check with the dyn_cast
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUInstCombineIntrinsic.cpp:1177
+ // that means that the intrinsic has already been optimized.
+ if (4 - APInt(4, DMaskVal).countLeadingZeros() > VWidth)
+ break;
----------------
Don't bother using APInt here?
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D146737/new/
https://reviews.llvm.org/D146737
More information about the llvm-commits
mailing list