[all-commits] [llvm/llvm-project] 0a0d2f: [X86] Ensure 256-bit inlane shuffles are set to 2 ...
Simon Pilgrim via All-commits
all-commits at lists.llvm.org
Sat Oct 29 04:03:57 PDT 2022
Branch: refs/heads/main
Home: https://github.com/llvm/llvm-project
Commit: 0a0d2f540076d9fee1ee722b5f47cc31be9fa53e
https://github.com/llvm/llvm-project/commit/0a0d2f540076d9fee1ee722b5f47cc31be9fa53e
Author: Simon Pilgrim <llvm-dev at redking.me.uk>
Date: 2022-10-29 (Sat, 29 Oct 2022)
Changed paths:
M llvm/lib/Target/X86/X86ScheduleZnver1.td
M llvm/test/tools/llvm-mca/X86/Znver1/resources-avx1.s
M llvm/test/tools/llvm-mca/X86/Znver1/resources-avx2.s
Log Message:
-----------
[X86] Ensure 256-bit inlane shuffles are set to 2 uops + half rate
znver1 double-pumps regular 256-bit shuffles (crosslane shuffles are messier).
Fixes yet another mismatch between the numbers coming out of the script from D103695 and the znver1 scheduler model.
Confirmed against the AMD SoG (Software Optimization Guide), Agner Fog's instruction tables and instlatx64.
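As a rough illustration of what "2 uops + half rate" means, here is a minimal Python sketch of the double-pumping arithmetic. It is not taken from the commit or from X86ScheduleZnver1.td: the OpCost type, the double_pump helper and the 128-bit baseline cost (1 uop, one instruction per cycle) are hypothetical values chosen only to show the scaling.

```python
from dataclasses import dataclass
from fractions import Fraction


@dataclass(frozen=True)
class OpCost:
    """Hypothetical cost record, not an LLVM type."""
    uops: int               # micro-ops issued per instruction
    rthroughput: Fraction   # reciprocal throughput, in cycles per instruction


def double_pump(op128: OpCost) -> OpCost:
    """Cost of a 256-bit op executed as two 128-bit halves on the same units.

    Twice the uops occupy the same pipes, so the instruction rate halves
    (the reciprocal throughput doubles).
    """
    return OpCost(uops=op128.uops * 2, rthroughput=op128.rthroughput * 2)


# Assumed baseline for illustration only: a 128-bit in-lane shuffle
# costing 1 uop with one instruction issued per cycle.
xmm_shuffle = OpCost(uops=1, rthroughput=Fraction(1))
ymm_shuffle = double_pump(xmm_shuffle)

print(ymm_shuffle.uops, ymm_shuffle.rthroughput)
```

Under those assumed inputs the 256-bit form comes out at twice the uop count and twice the reciprocal throughput of the 128-bit form. Crosslane shuffles do not follow this simple rule, as the log message notes.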
Commit: eea6a2782e852ee38a56af8245a27d864b56b592
https://github.com/llvm/llvm-project/commit/eea6a2782e852ee38a56af8245a27d864b56b592
Author: Simon Pilgrim <llvm-dev at redking.me.uk>
Date: 2022-10-29 (Sat, 29 Oct 2022)
Changed paths:
M llvm/lib/Target/X86/X86ScheduleZnver1.td
M llvm/lib/Target/X86/X86ScheduleZnver2.td
M llvm/test/tools/llvm-mca/X86/Znver1/resources-avx2.s
M llvm/test/tools/llvm-mca/X86/Znver2/resources-avx2.s
Log Message:
-----------
[X86] WriteFShuffle256 shuffles aren't microcoded in the llvm sense
znver1/2 might have poor throughput for crosslane shuffles, but they don't consume 100 cycles of resources.
I think there was a misunderstanding between the AMD definition of microcoding (more than 2-3 uops) and the LLVM one ("here be dragons": the instruction is impossible to model even approximately).
This is more yak shaving arising from D103695, this time working out why codegen involving broadcasts gives such weird numbers.
Compare: https://github.com/llvm/llvm-project/compare/c4dd260f9269...eea6a2782e85