[PATCH] D154488: [PowerPC] Define SchedModel for Power8

Tue Aug 1 22:32:33 PDT 2023

shchenz added inline comments.

================
Comment at: llvm/lib/Target/PowerPC/PPCScheduleP8.td:53

-def P8_LU1     : FuncUnit; // Loads or fixed-point operations 1
-def P8_LU2     : FuncUnit; // Loads or fixed-point operations 2
+  // Power8 Dispatch Ports:
+  // Two ports to do loads or fixed-point operations.
----------------
These look like issue resources rather than dispatch resources?

================
Comment at: llvm/lib/Target/PowerPC/PPCScheduleP8.td:181
+    (instregex "^VRFI(M|N|P|Z)$"),
+    XVRSQRTESP, XVSUBSP, VADDFP, VEXPTEFP, VLOGEFP, VMADDFP, VNMSUBFP, VREFP,
+    VRSQRTEFP, VSUBFP, XVCVSXWSP, XVCVUXWSP, XVMULSP, XVNABSSP, XVNEGSP, XVRESP,
----------------
I don't know how you arranged these instructions, but according to the UM, the pipeline for `vaddfp` is `FPU`, so it should use `P8_FP_4x32`, and the execution unit should be `P8_FPU`?

================
Comment at: llvm/lib/Target/PowerPC/PPCScheduleP8.td:267
+  def : InstRW<[P8_LS_LU, P8_DISP_ST], (instrs
+    (instregex "^ST(B|H|W|D)(U)?(X)?(8|TLS)?(_)?(32)?$"),
+    STBCIX, STBCX, STBEPX, STDBRX, STDCIX, STDCX, STHBRX, STHCIX, STHCX, STHEPX,
----------------
An instruction like `STB` seems to occupy two pipelines, `LSU` and `LU`, but here it seems to occupy only the `LSU` pipeline?

================
Comment at: llvm/lib/Target/PowerPC/PPCScheduleP8.td:51
+  def P8_FP_2x64 : ProcResource<4> { let Super = P8_FPU; }
+  def P8_FP_4x32 : ProcResource<2> { let Super = P8_VMX; }

----------------
qiucf wrote:
> shchenz wrote:
> > Setting `P8_VMX` as the super of "4xSingle" also seems weird. In the pwr8 ISA, I think most instructions that handle the 4xSingle type are VSX related, and in the UM section "10.3.2 Instructional Latencies and Throughputs" I saw that most of them use the `FPU` pipeline. So I guess we should use `P8_FPU` as Super instead?
> > **2.1.3 Speculative Superscalar Inner Core Organization**:
> > 
> > - Two VMX execution units capable of executing simple FX, permute, complex FX, and 4-way SIMD single-precision floating-point operations
> 
> I think the reason is that, before VSX, Altivec already had instructions for 4xSingle vectors, and they are implemented within the VMX units. So their VSX equivalents use the same execution units.
I believe we should use `P8_VSX` as the super of `P8_FP_4x32`. All of the instructions that use `P8_FP_4x32` are listed with pipeline `FPU`, which should correspond to the 4 FPU units.
Alternatively, we can create two `P8_FP_4x32` resources, one a child of `P8_FPU` and one a child of `P8_VMX`. However, no instruction in the "Instruction Latencies and Throughputs" sheet seems to use the child of `P8_VMX`.
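
A rough sketch of what the two-resource alternative could look like, only to make the suggestion concrete. The `_FPU` / `_VMX` suffixed names are hypothetical and not in the patch, the unit counts are placeholders copied from the existing `P8_FP_4x32` def, and the fragment assumes `P8_FPU` and `P8_VMX` are already defined in this file within the model's `let SchedModel` block:

```
// Hypothetical names: the _FPU / _VMX suffixes do not exist in the patch.
// Assumes P8_FPU and P8_VMX are ProcResources already defined in this file.

// 4 x single-precision work that the UM lists with pipeline FPU.
// The unit count (2) is a placeholder taken from the current P8_FP_4x32 def.
def P8_FP_4x32_FPU : ProcResource<2> { let Super = P8_FPU; }

// 4 x single-precision work executed on the VMX units.
// May end up unused if no instruction in the UM sheet maps to it.
def P8_FP_4x32_VMX : ProcResource<2> { let Super = P8_VMX; }
```

If the VMX child really has no users, keeping a single `P8_FP_4x32` and changing only its `Super` would be the smaller change.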

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D154488/new/

https://reviews.llvm.org/D154488