[PATCH] D153479: [NFC] Tests for future commit in DAGCombiner
Konstantina Mitropoulou via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Wed Jun 21 21:43:39 PDT 2023
kmitropoulou added inline comments.
================
Comment at: llvm/test/CodeGen/AMDGPU/combine_andor_with_cmps.ll:2
+; NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 2
+; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx1100 -verify-machineinstrs -stop-after=finalize-isel < %s | FileCheck %s
+
----------------
kmitropoulou wrote:
> kmitropoulou wrote:
> > arsenm wrote:
> > > kmitropoulou wrote:
> > > > arsenm wrote:
> > > > > Why use mir for this?
> > > > CSE changes my optimization. Therefore, I need to do the checking earlier.
> > > >
> > > > For example, the following test:
> > > >
> > > > define i1 @test1(i32 %arg1, i32 %arg2) #0 {
> > > >   %cmp1 = icmp slt i32 %arg1, 1000
> > > >   %cmp2 = icmp slt i32 %arg2, 1000
> > > >   %or = or i1 %cmp1, %cmp2
> > > >   ret i1 %or
> > > > }
> > > >
> > > > will be optimized as follows with my optimization:
> > > >
> > > > bb.0 (%ir-block.0):
> > > >   liveins: $vgpr0, $vgpr1
> > > >   %1:vgpr_32 = COPY $vgpr1
> > > >   %0:vgpr_32 = COPY $vgpr0
> > > >   %2:vgpr_32 = V_MIN_I32_e64 %0, %1, implicit $exec
> > > >   %3:sreg_32 = S_MOV_B32 1000
> > > >   %4:sreg_32_xm0_xexec = V_CMP_LT_I32_e64 killed %2, killed %3, implicit $exec
> > > >   %5:vgpr_32 = V_CNDMASK_B32_e64 0, 0, 0, 1, killed %4, implicit $exec
> > > >   $vgpr0 = COPY %5
> > > >   SI_RETURN implicit $vgpr0
> > > >
> > > > This is the output after instruction selection. After CSE, the predicate of the compare instruction changes:
> > > >
> > > > ; %bb.0:
> > > >   s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
> > > >   s_waitcnt_vscnt null, 0x0
> > > >   v_min_i32_e32 v0, v0, v1
> > > >   s_delay_alu instid0(VALU_DEP_1)
> > > >   v_cmp_gt_i32_e32 vcc_lo, 0x3e8, v0
> > > >   v_cndmask_b32_e64 v0, 0, 1, vcc_lo
> > > >   s_setpc_b64 s[30:31]
> > > >
> > > I don't understand. I assume you mean MachineCSE? Is your patch not actually a DAG combine as the description states?
> > >
> > > Can you stop somewhere after SIFixSGPRCopies instead?
> > I am sorry I meant MachineCSE. The patch will upload implements is in DAGCombiner.
> > The new checks are generated after amdgpu-isel.
> *The patch that I will upload implements the optimization in DAGCombiner.
I am sorry, I did not understand your comment earlier :) I have updated the test.
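
For reference, the rewrite discussed in the quoted example (an or of two signed less-than compares against the same constant becoming a single compare of the minimum) can be sanity-checked with a small sketch. This is only an illustration of the equivalence, not the DAGCombiner implementation: the function names are hypothetical, and plain Python integers stand in for i32 values. The third function mirrors the operand-swapped compare seen in the final assembly (v_cmp_gt_i32 with 0x3e8 as the first operand).

```python
# Sketch only: checks the arithmetic identity behind the quoted test1 example.
# All names here are hypothetical; Python ints stand in for i32 values.

C = 1000  # the constant used in test1 (0x3e8 in the final assembly)


def or_of_cmps(a, b, c=C):
    """Original IR form: (a slt c) or (b slt c)."""
    return (a < c) or (b < c)


def cmp_of_min(a, b, c=C):
    """Form after instruction selection: min(a, b) slt c."""
    return min(a, b) < c


def cmp_of_min_swapped(a, b, c=C):
    """Operand-swapped form in the final assembly: c sgt min(a, b)."""
    return c > min(a, b)


def forms_agree(lo=990, hi=1010):
    """Compare all three forms on every pair of values in [lo, hi]."""
    values = range(lo, hi + 1)
    return all(
        or_of_cmps(a, b) == cmp_of_min(a, b) == cmp_of_min_swapped(a, b)
        for a in values
        for b in values
    )
```

The swapped form is the same predicate with its operands exchanged, which is why the compare reads as "gt" in the assembly while still computing the same result.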
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D153479/new/
https://reviews.llvm.org/D153479