[PATCH] D81172: [AMDGPU] Implement hardware bug workaround for image instructions

Matt Arsenault via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Sep 16 07:58:16 PDT 2020


arsenm added inline comments.


================
Comment at: llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp:3411-3415
+      auto Unmerge = B.buildUnmerge(S16, Reg);
+      for (int I = 0, E = Unmerge->getNumOperands() - 1; I != E; ++I)
+        PackedRegs.push_back(Unmerge.getReg(I));
+      PackedRegs.resize(8, B.buildUndef(S16).getReg(0));
+      Reg = B.buildBuildVector(LLT::vector(8, S16), PackedRegs).getReg(0);
----------------
rdomingu wrote:
> arsenm wrote:
> > It would be preferable to emit a concat_vectors of <2 x s16> pieces here
> Sorry, I'm new to this. Why would concat_vectors be preferable than build_vector? Could you please elaborate?
Because a G_BUILD_VECTOR with 16-bit sources isn't naturally legal. This works, it just adds more work for the legalizer to reprocess these when you could produce something that's legal to begin with to save compile time


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D81172/new/

https://reviews.llvm.org/D81172



More information about the llvm-commits mailing list