[PATCH] D153479: [NFC] Tests for future commit in DAGCombiner

Wed Jun 21 21:43:39 PDT 2023

kmitropoulou added inline comments.

================
Comment at: llvm/test/CodeGen/AMDGPU/combine_andor_with_cmps.ll:2
+; NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 2
+; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx1100 -verify-machineinstrs -stop-after=finalize-isel < %s | FileCheck %s
+
----------------
kmitropoulou wrote:
> kmitropoulou wrote:
> > arsenm wrote:
> > > kmitropoulou wrote:
> > > > arsenm wrote:
> > > > > Why use mir for this?
> > > > CSE changes my optimization. Therefore, I need to do the checking earlier. 
> > > > 
> > > > For example, the following test:
> > > > 
> > > > define i1 @test1(i32 %arg1, i32 %arg2) #0 {
> > > >   %cmp1 = icmp slt i32 %arg1, 1000
> > > >   %cmp2 = icmp slt i32 %arg2, 1000
> > > >   %or  = or i1 %cmp1, %cmp2
> > > >   ret i1 %or
> > > > }
> > > > 
> > > > will be optimized as follows with my optimization:
> > > > 
> > > >   bb.0 (%ir-block.0):
> > > >     liveins: $vgpr0, $vgpr1
> > > >     %1:vgpr_32 = COPY $vgpr1
> > > >     %0:vgpr_32 = COPY $vgpr0
> > > >     %2:vgpr_32 = V_MIN_I32_e64 %0, %1, implicit $exec
> > > >     %3:sreg_32 = S_MOV_B32 1000
> > > >     %4:sreg_32_xm0_xexec = V_CMP_LT_I32_e64 killed %2, killed %3, implicit $exec
> > > >     %5:vgpr_32 = V_CNDMASK_B32_e64 0, 0, 0, 1, killed %4, implicit $exec
> > > >     $vgpr0 = COPY %5
> > > >     SI_RETURN implicit $vgpr0
> > > > 
> > > > This is the output after the instruction selection. After CSE, the predicate of the compare instruction changes:
> > > > 
> > > > ; %bb.0:
> > > >         s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
> > > >         s_waitcnt_vscnt null, 0x0
> > > >         v_min_i32_e32 v0, v0, v1
> > > >         s_delay_alu instid0(VALU_DEP_1)
> > > >         v_cmp_gt_i32_e32 vcc_lo, 0x3e8, v0
> > > >         v_cndmask_b32_e64 v0, 0, 1, vcc_lo
> > > >         s_setpc_b64 s[30:31]
> > > > 
> > > I don't understand. I assume you mean MachineCSE? Is your patch not actually a DAG combine as the description states?
> > > 
> > > Can you stop somewhere after SIFixSGPRCopies instead? 
> > I am sorry I meant MachineCSE. The patch will upload implements is in DAGCombiner. 
> > The new checks are generated after amdgpu-isel .
> *The patch that I will upload implements the optimization in DAGCombiner.
I am sorry I did not understand your comment earlier :) I update the test.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D153479/new/

https://reviews.llvm.org/D153479