[PATCH] D81172: [AMDGPU] Implement hardware bug workaround for image instructions
Matt Arsenault via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri Aug 28 11:27:49 PDT 2020
arsenm added inline comments.
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp:3410
+ if (StoreVT.getNumElements() == 3) {
+ SmallVector<Register, 4> PackedRegs;
+ auto Unmerge = B.buildUnmerge(S16, Reg);
----------------
If you just resize to 8 below, why not use an array?
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp:3411-3415
+ auto Unmerge = B.buildUnmerge(S16, Reg);
+ for (int I = 0, E = Unmerge->getNumOperands() - 1; I != E; ++I)
+ PackedRegs.push_back(Unmerge.getReg(I));
+ PackedRegs.resize(8, B.buildUndef(S16).getReg(0));
+ Reg = B.buildBuildVector(LLT::vector(8, S16), PackedRegs).getReg(0);
----------------
It would be preferable to emit a concat_vectors of <2 x s16> pieces here
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp:3414
+ PackedRegs.push_back(Unmerge.getReg(I));
+ PackedRegs.resize(8, B.buildUndef(S16).getReg(0));
+ Reg = B.buildBuildVector(LLT::vector(8, S16), PackedRegs).getReg(0);
----------------
Resize here is weird. You can push back, or constructed PackedRegs to the desired size?
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp:3420
+ if (StoreVT.getNumElements() == 4) {
+ SmallVector<Register, 4> PackedRegs;
+ Reg = B.buildBitcast(LLT::vector(2, S32), Reg).getReg(0);
----------------
Same as above
================
Comment at: llvm/lib/Target/AMDGPU/SIISelLowering.cpp:7249
+ SmallVector<SDValue, 4> PackedElts;
+ for (unsigned I = 0; I < Elts.size() / 2; I += 1) {
+ SDValue Pair =
----------------
++I
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D81172/new/
https://reviews.llvm.org/D81172
More information about the llvm-commits
mailing list