[PATCH] D63751: AMDGPU: Select G_SEXT/G_ZEXT/G_ANYEXT
Nicolai Hähnle via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Jun 25 05:14:02 PDT 2019
nhaehnle accepted this revision.
nhaehnle added a comment.
This revision is now accepted and ready to land.
Trivial nitpick, but essentially LGTM.
================
Comment at: lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp:386-387
+ .addReg(AMDGPU::SCC);
+ bool Ret = constrainSelectedInstRegOperands(*ICmp, TII, TRI, RBI) &&
+ RBI.constrainGenericRegister(CCReg, AMDGPU::SReg_32RegClass, MRI);
I.eraseFromParent();
----------------
clang-format?
================
Comment at: lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp:690-696
+ const unsigned BFE = Signed ? AMDGPU::V_BFE_I32 : AMDGPU::V_BFE_U32;
+ MachineInstr *ExtI =
+ BuildMI(MBB, I, DL, TII.get(BFE), DstReg)
+ .addReg(SrcReg)
+ .addImm(0) // Offset
+ .addImm(SrcSize); // Width
+ return constrainSelectedInstRegOperands(*ExtI, TII, TRI, RBI);
----------------
Out of curiosity: The unsigned case of this can be implement using `v_and_b32` as well, which would be more compact for small source sizes if `s4` or similar was legal, since it can be encoded as VOP2.
This is a micro-optimization that I believe doesn't even apply right now, but in general where is that kind of thing expected to be handled ultimately? Here in the instruction selector, or elsewhere in a separate machine instruction combining pass?
================
Comment at: lib/Target/AMDGPU/AMDGPUInstructionSelector.h:67
bool selectG_TRUNC(MachineInstr &I) const;
+ bool selectEXT(MachineInstr &I) const;
bool selectG_CONSTANT(MachineInstr &I) const;
----------------
selectG_EXT perhaps?
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D63751/new/
https://reviews.llvm.org/D63751
More information about the llvm-commits
mailing list