[PATCH] D128123: [SDAG] try to replace subtract-from-constant with xor

Thu Jul 7 08:25:29 PDT 2022

arsenm added inline comments.

================
Comment at: llvm/test/CodeGen/AMDGPU/ds-sub-offset.ll:132
+; GFX10-NEXT:    ds_write_b8 v0, v1
 ; GFX10-NEXT:    s_endpgm
   %x.i = call i32 @llvm.amdgcn.workitem.id.x() #0
----------------
deadalnix wrote:
> foad wrote:
> > deadalnix wrote:
> > > spatel wrote:
> > > > RKSimon wrote:
> > > > > RKSimon wrote:
> > > > > > Based off @foad's comments on D128080 - we need to either add a TLI override to control this or add a way for AMDGPU to reverse it.
> > > > > Hmm - how similar is TargetLowering::preferIncOfAddToSubOfNot ?
> > > > At first glance, that seems more about constant materialization while this is about a "sub-from" instruction with an immediate operand. 
> > > > 
> > > > That hook is default true but overridden to false for vector types by ARM/AArch/PowerPC.
> > > TBH, it seems like a missed opportunity from the AMDGPU backend rather than a major problem with this patch.
> > > 
> > > Isn't there someone from AMD we can get help from here?
> > AMDGPUDAGToDAGISel::SelectDS1Addr1Offset would have to be taught to match appropriate uses of ISD::XOR as well as ISD::SUB (similar to how addressing mode matchers have to match OR as well as ADD?). This code was added by @arsenm in D12283. It doesn't sound like a very important real world use case so maybe we could just ignore the regression?
> I think there is the regression, which is probably okay, but then there is what the test is actually testing for: maximum offset.
> 
> So maybe a second test case for the maximum offset - that do not trigger the regression - should be added here, and then we can move on.
It might be nice if isBaseWithConstantOffset would recognize this like for add/or but I don't think this is that important

================
Comment at: llvm/test/CodeGen/AMDGPU/setcc-multiple-use.ll:18-19
 ; CHECK-NEXT:    s_waitcnt lgkmcnt(0)
+; CHECK-NEXT:    v_cmp_eq_u32_e32 vcc_lo, 0, v0
+; CHECK-NEXT:    v_cndmask_b32_e64 v1, 0, 1, vcc_lo
 ; CHECK-NEXT:    v_cmp_ne_u32_e32 vcc_lo, 0, v0
----------------
It's a bit weird to see a vector operation traded for a scalar one but this seems fine

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D128123/new/

https://reviews.llvm.org/D128123