[PATCH] D63751: AMDGPU: Select G_SEXT/G_ZEXT/G_ANYEXT
Matt Arsenault via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Jun 25 05:45:52 PDT 2019
arsenm marked an inline comment as done.
arsenm added inline comments.
================
Comment at: lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp:690-696
+ const unsigned BFE = Signed ? AMDGPU::V_BFE_I32 : AMDGPU::V_BFE_U32;
+ MachineInstr *ExtI =
+ BuildMI(MBB, I, DL, TII.get(BFE), DstReg)
+ .addReg(SrcReg)
+ .addImm(0) // Offset
+ .addImm(SrcSize); // Width
+ return constrainSelectedInstRegOperands(*ExtI, TII, TRI, RBI);
----------------
nhaehnle wrote:
> Out of curiosity: The unsigned case of this can be implement using `v_and_b32` as well, which would be more compact for small source sizes if `s4` or similar was legal, since it can be encoded as VOP2.
>
> This is a micro-optimization that I believe doesn't even apply right now, but in general where is that kind of thing expected to be handled ultimately? Here in the instruction selector, or elsewhere in a separate machine instruction combining pass?
It would only really be smaller if the mask value is an inline immediate. For this and the scalar 64-bit case, you could make a code size tradeoff by materializing the constant for multiple uses.
We do similar things in SIShrinkInstructions, but I think most of the cases that appear there are due to the lack of context in SelectionDAG. I would probably want to try to handle it up front here.
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D63751/new/
https://reviews.llvm.org/D63751
More information about the llvm-commits
mailing list