[all-commits] [llvm/llvm-project] 27139b: [SME] Stop RA from coalescing COPY instructions th...
Sander de Smalen via All-commits
all-commits at lists.llvm.org
Wed Jan 31 11:43:58 PST 2024
Branch: refs/heads/release/18.x
Home: https://github.com/llvm/llvm-project
Commit: 27139bceb27d0b551e6e9d18fb91c703cbc3d7b8
https://github.com/llvm/llvm-project/commit/27139bceb27d0b551e6e9d18fb91c703cbc3d7b8
Author: Sander de Smalen <sander.desmalen at arm.com>
Date: 2024-01-31 (Wed, 31 Jan 2024)
Changed paths:
M llvm/lib/Target/AArch64/AArch64ExpandPseudoInsts.cpp
M llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
M llvm/lib/Target/AArch64/AArch64ISelLowering.h
M llvm/lib/Target/AArch64/AArch64RegisterInfo.cpp
M llvm/lib/Target/AArch64/AArch64SMEInstrInfo.td
M llvm/test/CodeGen/AArch64/sme-disable-gisel-fisel.ll
A llvm/test/CodeGen/AArch64/sme-pstate-sm-changing-call-disable-coalescing.ll
M llvm/test/CodeGen/AArch64/sme-streaming-body.ll
M llvm/test/CodeGen/AArch64/sme-streaming-compatible-interface.ll
M llvm/test/CodeGen/AArch64/sme-streaming-interface.ll
M llvm/test/CodeGen/AArch64/sme-streaming-mode-changing-call-disable-stackslot-scavenging.ll
Log Message:
-----------
[SME] Stop RA from coalescing COPY instructions that transcend beyond smstart/smstop. (#78294)
This patch introduces a 'COALESCER_BARRIER' which is a pseudo node that
expands to
a 'nop', but which stops the register allocator from coalescing a COPY
node when
its use/def crosses a SMSTART or SMSTOP instruction.
For example:
%0:fpr64 = COPY killed $d0
undef %2.dsub:zpr = COPY %0 // <- Do not coalesce this COPY
ADJCALLSTACKDOWN 0, 0
MSRpstatesvcrImm1 1, 0, csr_aarch64_smstartstop, implicit-def dead $d0
$d0 = COPY killed %0
BL @use_f64, csr_aarch64_aapcs
If the COPY would be coalesced, that would lead to:
$d0 = COPY killed %0
being replaced by:
$d0 = COPY killed %2.dsub
which means the whole ZPR reg would be live upto the call, causing the
MSRpstatesvcrImm1 (smstop) to spill/reload the ZPR register:
str q0, [sp] // 16-byte Folded Spill
smstop sm
ldr z0, [sp] // 16-byte Folded Reload
bl use_f64
which would be incorrect for two reasons:
1. The program may load more data than it has allocated.
2. If there are other SVE objects on the stack, the compiler might use
the
'mul vl' addressing modes to access the spill location.
By disabling the coalescing, we get the desired results:
str d0, [sp, #8] // 8-byte Folded Spill
smstop sm
ldr d0, [sp, #8] // 8-byte Folded Reload
bl use_f64
(cherry picked from commit dd736661826e215ac70ff3a4a4ccd75bda0c5ccd)
More information about the All-commits
mailing list