[PATCH] D63751: AMDGPU: Select G_SEXT/G_ZEXT/G_ANYEXT

Matt Arsenault via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Jun 25 05:45:52 PDT 2019


arsenm marked an inline comment as done.
arsenm added inline comments.


================
Comment at: lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp:690-696
+    const unsigned BFE = Signed ? AMDGPU::V_BFE_I32 : AMDGPU::V_BFE_U32;
+    MachineInstr *ExtI =
+      BuildMI(MBB, I, DL, TII.get(BFE), DstReg)
+      .addReg(SrcReg)
+      .addImm(0) // Offset
+      .addImm(SrcSize); // Width
+    return constrainSelectedInstRegOperands(*ExtI, TII, TRI, RBI);
----------------
nhaehnle wrote:
> Out of curiosity: The unsigned case of this can be implement using `v_and_b32` as well, which would be more compact for small source sizes if `s4` or similar was legal, since it can be encoded as VOP2.
> 
> This is a micro-optimization that I believe doesn't even apply right now, but in general where is that kind of thing expected to be handled ultimately? Here in the instruction selector, or elsewhere in a separate machine instruction combining pass?
It would only really be smaller if the mask value is an inline immediate. For this and the scalar 64-bit case, you could make a code size tradeoff by materializing the constant for multiple uses.

We do similar things in SIShrinkInstructions, but I think most of the cases that appear there are due to the lack of context in SelectionDAG. I would probably want to try to handle it up front here.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D63751/new/

https://reviews.llvm.org/D63751





More information about the llvm-commits mailing list