[PATCH] D40158: AMDGPU: Use gfx9 carry-less add/sub instructions

Matt Arsenault via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Nov 17 12:19:20 PST 2017


arsenm added inline comments.


================
Comment at: lib/Target/AMDGPU/SIInstrInfo.cpp:2847
            AMDGPU::COPY : AMDGPU::V_MOV_B32_e32;
   case AMDGPU::S_ADD_I32:
+  case AMDGPU::S_ADD_U32:
----------------
rampitec wrote:
> Should not we always return V_ADD_I32_e32 here?
One of these is dead. We always select S_ADD_I32/S_SUB_I32. We don't select these with uses of the SCC value, so I don't think it matters which one we use.

I tried swapping this to always select s_add_u32 for add, but that has a similar problem later. We select s_addk_i32 from s_add_i32, and not s_add_u32. That's formed a lot later, where it's more questionable to make assumptions about how SCC is getting used. We're missing optimizations to try to use SCC conditions, but ideally there would be some.
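
For illustration only, here is a minimal sketch of why the choice is harmless while SCC is unused. It assumes the usual reading of the ISA documentation: s_add_u32 sets SCC to the unsigned carry-out, and s_add_i32 sets SCC to signed overflow. The helper names (ScalarAddResult, modelSAddU32, modelSAddI32, checkModels) are hypothetical and are not part of the patch or of LLVM.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of a scalar add: the 32-bit result plus the SCC bit.
struct ScalarAddResult {
  uint32_t Value;
  bool SCC;
};

// Assumed s_add_u32 semantics: SCC is the unsigned carry out of bit 31.
ScalarAddResult modelSAddU32(uint32_t A, uint32_t B) {
  uint64_t Wide = uint64_t(A) + uint64_t(B);
  return {uint32_t(Wide), (Wide >> 32) != 0};
}

// Assumed s_add_i32 semantics: SCC is signed overflow, i.e. both operands
// have the same sign and the sum has the opposite sign.
ScalarAddResult modelSAddI32(uint32_t A, uint32_t B) {
  uint32_t Sum = A + B; // unsigned wraparound is well defined
  bool Overflow = ((~(A ^ B) & (A ^ Sum)) >> 31) != 0;
  return {Sum, Overflow};
}

// The value is identical for both forms; only SCC can differ.
void checkModels() {
  // -1 + 1: unsigned carry, but no signed overflow.
  assert(modelSAddU32(0xFFFFFFFFu, 1u).Value == modelSAddI32(0xFFFFFFFFu, 1u).Value);
  assert(modelSAddU32(0xFFFFFFFFu, 1u).SCC != modelSAddI32(0xFFFFFFFFu, 1u).SCC);
}
```

Under those assumptions the two opcodes are interchangeable for the result value, and they only become distinguishable once something reads SCC, which is the concern with forming s_addk_i32 late.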


https://reviews.llvm.org/D40158




