[PATCH] D109889: AMDGPU: Lower one copy from SCC early for SelectionDAG

Ruiling, Song via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Sep 16 08:13:19 PDT 2021


ruiling created this revision.
ruiling added reviewers: arsenm, foad, critson, piotr, alex-t.
Herald added subscribers: kerbowa, hiraditya, t-tye, tpr, dstuttard, yaxunl, nhaehnle, jvesely, kzhuravl.
ruiling requested review of this revision.
Herald added subscribers: llvm-commits, wdng.
Herald added a project: LLVM.

In SelectionDAG path, we are currently using -1/0 when copying from SCC.
But in GlobalISel path, we are requesting 1/0 when copying from SCC. The
later copy lowering after register allocator follows the expectation of
GlobalISel path. So, we have to lower the copy from SCC early.

This is used to replace D109754 <https://reviews.llvm.org/D109754>.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D109889

Files:
  llvm/lib/Target/AMDGPU/SIISelLowering.cpp
  llvm/test/CodeGen/AMDGPU/expand-scalar-carry-out-select-user.ll


Index: llvm/test/CodeGen/AMDGPU/expand-scalar-carry-out-select-user.ll
===================================================================
--- llvm/test/CodeGen/AMDGPU/expand-scalar-carry-out-select-user.ll
+++ llvm/test/CodeGen/AMDGPU/expand-scalar-carry-out-select-user.ll
@@ -14,7 +14,7 @@
 ; GFX7-NEXT:    s_or_b32 s4, s4, s5
 ; GFX7-NEXT:    s_cmp_lg_u32 s4, 0
 ; GFX7-NEXT:    s_addc_u32 s4, s6, 0
-; GFX7-NEXT:    s_cselect_b64 vcc, 1, 0
+; GFX7-NEXT:    s_cselect_b64 vcc, -1, 0
 ; GFX7-NEXT:    v_mov_b32_e32 v1, s4
 ; GFX7-NEXT:    s_cmp_gt_u32 s6, 31
 ; GFX7-NEXT:    v_cndmask_b32_e32 v1, 0, v1, vcc
@@ -31,7 +31,7 @@
 ; GFX9-NEXT:    v_add_co_u32_e64 v0, s[4:5], s6, s6
 ; GFX9-NEXT:    s_cmp_lg_u64 s[4:5], 0
 ; GFX9-NEXT:    s_addc_u32 s4, s6, 0
-; GFX9-NEXT:    s_cselect_b64 vcc, 1, 0
+; GFX9-NEXT:    s_cselect_b64 vcc, -1, 0
 ; GFX9-NEXT:    v_mov_b32_e32 v1, s4
 ; GFX9-NEXT:    s_cmp_gt_u32 s6, 31
 ; GFX9-NEXT:    v_cndmask_b32_e32 v1, 0, v1, vcc
@@ -49,7 +49,7 @@
 ; GFX10-NEXT:    v_add_co_u32 v0, s5, s4, s4
 ; GFX10-NEXT:    s_cmpk_lg_u32 s5, 0x0
 ; GFX10-NEXT:    s_addc_u32 s5, s4, 0
-; GFX10-NEXT:    s_cselect_b32 s6, 1, 0
+; GFX10-NEXT:    s_cselect_b32 s6, -1, 0
 ; GFX10-NEXT:    s_cmp_gt_u32 s4, 31
 ; GFX10-NEXT:    v_cndmask_b32_e64 v1, 0, s5, s6
 ; GFX10-NEXT:    s_cselect_b32 vcc_lo, -1, 0
Index: llvm/lib/Target/AMDGPU/SIISelLowering.cpp
===================================================================
--- llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+++ llvm/lib/Target/AMDGPU/SIISelLowering.cpp
@@ -4168,8 +4168,16 @@
 
     BuildMI(*BB, MII, DL, TII->get(Opc), Dest.getReg()).add(Src0).add(Src1);
 
-    BuildMI(*BB, MII, DL, TII->get(AMDGPU::COPY), CarryDest.getReg())
-      .addReg(AMDGPU::SCC);
+    const TargetRegisterClass *CarryDestRC =
+        MRI.getRegClass(CarryDest.getReg());
+    unsigned SelOpc = (TRI->getRegSizeInBits(*CarryDestRC) == 64)
+                          ? AMDGPU::S_CSELECT_B64
+                          : AMDGPU::S_CSELECT_B32;
+
+    BuildMI(*BB, MII, DL, TII->get(SelOpc), CarryDest.getReg())
+        .addImm(-1)
+        .addImm(0);
+
     MI.eraseFromParent();
     return BB;
   }


-------------- next part --------------
A non-text attachment was scrubbed...
Name: D109889.372952.patch
Type: text/x-patch
Size: 2164 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20210916/d21413ca/attachment.bin>


More information about the llvm-commits mailing list