[PATCH] D15875: AMDGPU/SI: Fold operands with sub-registers
Nicolai Hähnle via llvm-commits
llvm-commits at lists.llvm.org
Mon Jan 4 15:00:03 PST 2016
nhaehnle created this revision.
nhaehnle added reviewers: tstellarAMD, arsenm, mareko.
nhaehnle added a subscriber: llvm-commits.
Herald added a subscriber: arsenm.
Multi-dword constant loads generated unnecessary moves from SGPRs into VGPRs,
increasing the code size and VGPR pressure. These moves are now folded away.
Note that this lack of operand folding was not a problem for VMEM loads,
because COPY nodes from VReg_Nnn to VGPR32 are eliminated by the register
coalescer.
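As a rough illustration of the folding described above, here is a minimal toy sketch in C++. It is not the code from this patch and does not use LLVM's MachineInstr API: the Operand and Inst types and the foldSubRegCopies function are hypothetical names invented for this example. The sketch also assumes every fold is legal, whereas a real pass must check operand constraints (for example, whether the using instruction may read an SGPR in that operand slot).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical toy IR for illustration only (not LLVM's MachineInstr).
enum class RegClass { SGPR, VGPR };

struct Operand {
  RegClass cls;     // register class of the value being read
  unsigned reg;     // virtual register number
  unsigned subIdx;  // 0 = whole register, 1.. = sub-register index
};

struct Inst {
  enum Kind { Copy, Use } kind;
  unsigned def;  // virtual VGPR defined by a Copy; unused (0) for a Use
  Operand src;   // the single source operand
};

// True if any instruction still reads VGPR `reg`.
static bool hasReader(const std::vector<Inst>& insts, unsigned reg) {
  for (const Inst& i : insts)
    if (i.src.cls == RegClass::VGPR && i.src.reg == reg)
      return true;
  return false;
}

// Rewrites users of "vN = COPY sM:sub" to read sM:sub directly, then drops
// copies that have no readers left. Returns the number of folded operands.
// Assumption: every fold is legal; no hardware operand limits are modelled.
std::size_t foldSubRegCopies(std::vector<Inst>& insts) {
  std::size_t folded = 0;
  for (const Inst& copy : insts) {
    if (copy.kind != Inst::Copy || copy.src.cls != RegClass::SGPR)
      continue;
    for (Inst& user : insts) {
      if (user.kind == Inst::Use && user.src.cls == RegClass::VGPR &&
          user.src.reg == copy.def) {
        user.src = copy.src;  // keeps the sub-register index
        ++folded;
      }
    }
  }

  // Keep everything except SGPR-to-VGPR copies that are now dead.
  std::vector<Inst> kept;
  for (const Inst& i : insts) {
    bool dead = i.kind == Inst::Copy && i.src.cls == RegClass::SGPR &&
                !hasReader(insts, i.def);
    if (!dead)
      kept.push_back(i);
  }
  insts.swap(kept);
  return folded;
}
```

In this toy model, a program consisting of one copy from a sub-register of an SGPR tuple into a VGPR, followed by one use of that VGPR, shrinks to the single use reading the SGPR sub-register directly.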
Some tests are updated. Note that the fsub.ll test explicitly checks that
the move is elided.
With the IR generated by current Mesa, the changes are relatively minor:
7063 shaders in 3531 tests
Totals:
SGPRS: 351872 -> 352560 (0.20 %)
VGPRS: 199984 -> 200732 (0.37 %)
Code Size: 9876968 -> 9881112 (0.04 %) bytes
LDS: 91 -> 91 (0.00 %) blocks
Scratch: 1779712 -> 1767424 (-0.69 %) bytes per wave
Wait states: 295164 -> 295337 (0.06 %)
Totals from affected shaders:
SGPRS: 65784 -> 66472 (1.05 %)
VGPRS: 38064 -> 38812 (1.97 %)
Code Size: 1993828 -> 1997972 (0.21 %) bytes
LDS: 42 -> 42 (0.00 %) blocks
Scratch: 795648 -> 783360 (-1.54 %) bytes per wave
Wait states: 54026 -> 54199 (0.32 %)
http://reviews.llvm.org/D15875
Files:
lib/Target/AMDGPU/SIFixSGPRCopies.cpp
lib/Target/AMDGPU/SIFoldOperands.cpp
lib/Target/AMDGPU/SIInstrInfo.cpp
lib/Target/AMDGPU/SIRegisterInfo.cpp
test/CodeGen/AMDGPU/fmin_legacy.ll
test/CodeGen/AMDGPU/fsub.ll
test/CodeGen/AMDGPU/llvm.round.f64.ll
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D15875.43929.patch
Type: text/x-patch
Size: 5868 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20160104/2a44b2d3/attachment.bin>