[PATCH] D154488: [PowerPC] Define SchedModel for Power8

Sun Jul 16 19:53:05 PDT 2023

shchenz added a comment.

Thanks for the big effort.
I think moving from old `ProcessorItineraries` form to new `ProcessorModel` form should be correct. Some comments from me. And also please note that there are some failures in the pre-merge checks.

================
Comment at: llvm/lib/Target/PowerPC/PPCScheduleP8.td:18
+  let LoopMicroOpBufferSize = 60;
+  let MicroOpBufferSize = 60;
+  let CompleteModel = 0;
----------------
why the `MicroOpBufferSize` is 60? The pwr8 UM seems showing this value as 32 * 2?

================
Comment at: llvm/lib/Target/PowerPC/PPCScheduleP8.td:19
+  let MicroOpBufferSize = 60;
+  let CompleteModel = 0;
+  let UnsupportedFeatures = [HasSPE, PrefixInstrs, MMA,
----------------
We need to comment here why on pwr8 this field is set as 0.

================
Comment at: llvm/lib/Target/PowerPC/PPCScheduleP8.td:47
+  // Units for scalar, 2xDouble and 4xSingle
+  def P8_FP_1 : ProcResource<2> { let Super = P8_FPU; }
+  def P8_FP_2 : ProcResource<4> { let Super = P8_FPU; }
----------------
minor: _1, _2, _4 does not sound like a good name...

================
Comment at: llvm/lib/Target/PowerPC/PPCScheduleP8.td:48
+  def P8_FP_1 : ProcResource<2> { let Super = P8_FPU; }
+  def P8_FP_2 : ProcResource<4> { let Super = P8_FPU; }
+  def P8_FP_4 : ProcResource<2> { let Super = P8_VMX; }
----------------
Why this unit number for type "2xdouble" is set to 4? Its parent `P8_VMX` only has 2.

================
Comment at: llvm/lib/Target/PowerPC/PPCScheduleP8.td:61
+  def P8_PORT_FXLD : ProcResource<4> { let Super = P8_PORT_ALLFX; }
+  def P8_PORT_1 : ProcResource<2> { let Super = P8_PORT_FXLD; }
+  def P8_PORT_2 : ProcResource<2> { let Super = P8_PORT_FXLD; }
----------------
Same with the name issue _1, ..._6.

================
Comment at: llvm/test/CodeGen/PowerPC/BreakableToken-reduced.ll:244
+; CHECK-NEXT:    std 0, 120(1)
+; CHECK-NEXT:    std 3, 104(1)
 ; CHECK-NEXT:    mr 3, 12
----------------
Seems positive, now we use less CSRs.

================
Comment at: llvm/test/CodeGen/PowerPC/P10-stack-alignment.ll:31
+; CHECK-LE-NEXT:    addi r3, r3, .LCPI0_1 at toc@l
+; CHECK-LE-NEXT:    lxvd2x vs0, 0, r3
 ; CHECK-LE-NEXT:    addi r3, r1, 32
----------------
low ILP but less registers?

================
Comment at: llvm/test/CodeGen/PowerPC/aix32-p8-scalar_vector_conversions.ll:30
+; CHECK-NEXT:    addi 3, 1, -16
+; CHECK-NEXT:    lxvw4x 34, 0, 3
 ; CHECK-NEXT:    vsplth 2, 2, 0
----------------
lower ILP with less register, I think it is fine only if we get better perf in benchmarks.

================
Comment at: llvm/test/CodeGen/PowerPC/atomics-i128.ll:510
 ; PPC-PWR8-NEXT:    bl __atomic_compare_exchange
-; PPC-PWR8-NEXT:    cmplwi r3, 0
+; PPC-PWR8-NEXT:    mr r7, r3
 ; PPC-PWR8-NEXT:    lwz r6, 44(r1)
----------------
negative change, now we have more `mr` which is not free on pwr8.

================
Comment at: llvm/test/CodeGen/PowerPC/atomics-i16-ldst.ll:184
+; CHECK-PREP10-NEXT:    lbzx r3, r3, r4
+; CHECK-PREP10-NEXT:    blr
 entry:
----------------
We may need another name for check prefix `PREP10`? And also double check if prefixes `P8` and `P9` are still used in this file?

================
Comment at: llvm/test/CodeGen/PowerPC/atomics-i16-ldst.ll:4519
+; CHECK-P8: {{.*}}
+; CHECK-P9: {{.*}}
----------------
Ah, it show P8 and P9 are not used, we should delete them in the run lines.

================
Comment at: llvm/test/CodeGen/PowerPC/atomics-i32-ldst.ll:4779
+; CHECK-P8: {{.*}}
+; CHECK-P9: {{.*}}
----------------
Ditto

================
Comment at: llvm/test/CodeGen/PowerPC/atomics-i8-ldst.ll:4219
 }
+;; NOTE: These prefixes are unused and the list is autogenerated. Do not add tests below this line:
+; CHECK-P8: {{.*}}
----------------
Ditto

================
Comment at: llvm/test/CodeGen/PowerPC/atomics-regression.ll:456
 ; PPC64LE:       # %bb.0:
-; PPC64LE-NEXT:    clrlwi 4, 4, 24
 ; PPC64LE-NEXT:    lwsync
+; PPC64LE-NEXT:    clrlwi 4, 4, 24
----------------
Aren't the atomic related instructions are barriers to the scheduler? Can you check why there are changes around them? Thanks.

================
Comment at: llvm/test/CodeGen/PowerPC/bool-math.ll:120

 define i8 @low_bit_select_constants_bigger_true_same_size_result(i8 %x) {
   %a = and i8 %x, 1
----------------
Weird. Now there is no check for this function?

================
Comment at: llvm/test/CodeGen/PowerPC/branch_coalesce.ll:18
+; CHECK-NEXT:    addis 3, 2, .LCPI0_1 at toc@ha
+; CHECK-NEXT:    lfd 3, .LCPI0_1 at toc@l(3)
 ; CHECK-NEXT:  .LBB0_2: # %entry
----------------
lower ILP and less registers

================
Comment at: llvm/test/CodeGen/PowerPC/ctrloop-constrained-fp.ll:27
+; CHECK-NEXT:    stfd 1, 0(30)
+; CHECK-NEXT:    cmpldi 29, 0
 ; CHECK-NEXT:    bc 12, 1, .LBB0_1
----------------
Positive

================
Comment at: llvm/test/CodeGen/PowerPC/elf64-byval-cc.ll:35
 ; CHECK:       # %bb.0: # %entry
-; CHECK-NEXT:    mr 4, 3
+; CHECK-NEXT:    stb 3, -8(1)
 ; CHECK-NEXT:    clrldi 3, 3, 56
----------------
postive

================
Comment at: llvm/test/CodeGen/PowerPC/int128_ldst.ll:1100
 }
+;; NOTE: These prefixes are unused and the list is autogenerated. Do not add tests below this line:
+; CHECK-P8: {{.*}}
----------------
Same with above

================
Comment at: llvm/test/CodeGen/PowerPC/testComparesi32ltu.ll:37
 ; BE-NEXT:    addis r4, r2, testCompare1 at toc@ha
-; BE-NEXT:    lbz r4, testCompare1 at toc@l(r4)
+; BE-NEXT:    std r0, 128(r1)
 ; BE-NEXT:    lbz r3, 0(r3)
----------------
This looks positive. We had a request to move the lr store instruction be away from `mflr`

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D154488/new/

https://reviews.llvm.org/D154488