[PATCH] D63751: AMDGPU: Select G_SEXT/G_ZEXT/G_ANYEXT

Nicolai Hähnle via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Jun 25 05:14:02 PDT 2019


nhaehnle accepted this revision.
nhaehnle added a comment.
This revision is now accepted and ready to land.

Trivial nitpick, but essentially LGTM.



================
Comment at: lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp:386-387
+      .addReg(AMDGPU::SCC);
+    bool Ret = constrainSelectedInstRegOperands(*ICmp, TII, TRI, RBI) &&
+      RBI.constrainGenericRegister(CCReg, AMDGPU::SReg_32RegClass, MRI);
     I.eraseFromParent();
----------------
clang-format?


================
Comment at: lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp:690-696
+    const unsigned BFE = Signed ? AMDGPU::V_BFE_I32 : AMDGPU::V_BFE_U32;
+    MachineInstr *ExtI =
+      BuildMI(MBB, I, DL, TII.get(BFE), DstReg)
+      .addReg(SrcReg)
+      .addImm(0) // Offset
+      .addImm(SrcSize); // Width
+    return constrainSelectedInstRegOperands(*ExtI, TII, TRI, RBI);
----------------
Out of curiosity: The unsigned case of this can be implement using `v_and_b32` as well, which would be more compact for small source sizes if `s4` or similar was legal, since it can be encoded as VOP2.

This is a micro-optimization that I believe doesn't even apply right now, but in general where is that kind of thing expected to be handled ultimately? Here in the instruction selector, or elsewhere in a separate machine instruction combining pass?


================
Comment at: lib/Target/AMDGPU/AMDGPUInstructionSelector.h:67
   bool selectG_TRUNC(MachineInstr &I) const;
+  bool selectEXT(MachineInstr &I) const;
   bool selectG_CONSTANT(MachineInstr &I) const;
----------------
selectG_EXT perhaps?


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D63751/new/

https://reviews.llvm.org/D63751





More information about the llvm-commits mailing list