[PATCH] D33099: AMD Jaguar scheduler doesn't correctly model 256-bit AVX instructions
Simon Pilgrim via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue May 16 14:27:27 PDT 2017
RKSimon added inline comments.
================
Comment at: lib/Target/X86/X86ScheduleBtVer2.td:71
+def JFPIntCluster : ProcResGroup<[JVALU0, JVALU1, JSTC]>;
+
// Integer loads are 3 cycles, so ReadAfterLd registers needn't be available until 3
----------------
I don't think adding these Cluster groups is necessary. TBH most of the ProcResource defs appear to be superfluous - most aren't used at all - we're just using the JFPU0/JFPU1/JFPU01 defs, with a few others for the longer op chain instructions.
================
Comment at: lib/Target/X86/X86ScheduleBtVer2.td:349
+
+def WriteFAddYY: SchedWriteRes<[JFPA]> {
+ let Latency = 3;
----------------
Better off using JFPU0 as that's what is actually bound to the buffer. Same for the others below.
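For illustration only, a rough sketch of what that swap might look like. The rest of the def isn't visible in the quoted context, so the ResourceCycles value here is an assumption (a 256-bit op occupying the pipe for two cycles), not something taken from the patch:
```
// Sketch: bind the 256-bit FP add to JFPU0 instead of JFPA.
def WriteFAddYY: SchedWriteRes<[JFPU0]> {
  let Latency = 3;
  let ResourceCycles = [2]; // assumed value, not shown in the quoted diff
}
```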
================
Comment at: lib/Target/X86/X86ScheduleBtVer2.td:357
+ let Latency = 8;
+ let ResourceCycles = [2];
+}
----------------
Shouldn't this def be something like the below, to show it will consume the AGU for a cycle? Same for the other loads.
```
def WriteFAddYMLd: SchedWriteRes<[JLAGU,JFPU0]> {
  let Latency = 8;
  let ResourceCycles = [1,2];
}
```
================
Comment at: test/CodeGen/X86/slow-unaligned-mem.ll:89
; FAST: # BB#0:
-; FAST-NEXT: movl {{[0-9]+}}(%esp), %eax
+; FAST: movl {{[0-9]+}}(%esp), %eax
; FAST-NOT: movl
----------------
????
https://reviews.llvm.org/D33099