<div dir="ltr">Hi Florian,<br><br>This commit seems to cause an unused-variable error for the variable MI in non-assert builds, in the file <span class="gmail-pl-k" style="box-sizing:border-box;color:rgb(215,58,73);font-family:SFMono-Regular,Consolas,"Liberation Mono",Menlo,monospace;font-size:12px;white-space:pre;background-color:rgb(230,255,237)"><a title="llvm/lib/Target/AArch64/AArch64LoadStoreOptimizer.cpp" class="gmail-link-gray-dark" href="https://github.com/llvm/llvm-project/commit/17554b89617e084848784dfd9ac58e2718d8f8f7#diff-fa3932354e0ceda4288dbea02847e8e0" style="box-sizing:border-box;background-color:rgb(250,251,252);outline-width:0px;white-space:normal;color:rgb(3,102,214)">AArch64LoadStoreOptimizer.cpp</a></span>.<span style="color:rgb(36,41,46);font-family:SFMono-Regular,Consolas,"Liberation Mono",Menlo,monospace;font-size:12px;white-space:pre;background-color:rgb(230,255,237)"><br></span><div><br></div><div>Could you please take a look?<br><br>Thanks,<br>Rumeet</div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Wed, Dec 11, 2019 at 5:54 AM Florian Hahn via llvm-commits <<a href="mailto:llvm-commits@lists.llvm.org">llvm-commits@lists.llvm.org</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><br>
Author: Florian Hahn<br>
Date: 2019-12-11T13:50:11Z<br>
New Revision: 17554b89617e084848784dfd9ac58e2718d8f8f7<br>
<br>
URL: <a href="https://github.com/llvm/llvm-project/commit/17554b89617e084848784dfd9ac58e2718d8f8f7" rel="noreferrer" target="_blank">https://github.com/llvm/llvm-project/commit/17554b89617e084848784dfd9ac58e2718d8f8f7</a><br>
DIFF: <a href="https://github.com/llvm/llvm-project/commit/17554b89617e084848784dfd9ac58e2718d8f8f7.diff" rel="noreferrer" target="_blank">https://github.com/llvm/llvm-project/commit/17554b89617e084848784dfd9ac58e2718d8f8f7.diff</a><br>
<br>
LOG: [AArch64] Teach Load/Store optimizer to rename store operands for pairing.<br>
<br>
In some cases, we can rename a store operand in order to enable pairing<br>
of stores. For store pairs that cannot be merged because the first<br>
stored register is defined between the first and the second store, we try<br>
to find a suitable rename register.<br>
<br>
First, we check if we can rename the given register:<br>
<br>
1. The first store register must be killed at the store, which means we<br>
do not have to rename instructions after the first store.<br>
2. We scan backwards from the first store to find the definition of the<br>
stored register and check that all uses in between are renamable. Along<br>
the way, we collect the minimal register classes of the uses for<br>
overlapping (sub/super)registers.<br>
<br>
Second, we try to find an available register from the minimal physical<br>
register class of the original register. A suitable register must not be<br>
<br>
1. defined before FirstMI,<br>
2. used between the paired instruction and the previous definition of the<br>
register to rename, or<br>
3. a callee-saved register.<br>
<br>
We use KILL flags to clear defined registers while scanning from the<br>
beginning to the end of the block.<br>
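<br>
As an illustration only (not part of the commit), the selection step described above can be sketched as follows. This is a simplified model under stated assumptions: registers are plain integers rather than MCPhysReg, liveness is tracked with std::set rather than LiveRegUnits, and the names Reg, RenameQuery and pickRenameReg are hypothetical. The real pass also checks sub- and super-registers and required register classes, which this sketch omits.<br>

```cpp
#include <cassert>
#include <optional>
#include <set>
#include <vector>

// Hypothetical stand-in for a physical register number.
using Reg = int;

// Hypothetical bundle of the information the selection step consults.
struct RenameQuery {
  std::vector<Reg> Candidates; // registers of the required class, in order
  std::set<Reg> DefinedInBB;   // defined (and not yet killed) before FirstMI
  std::set<Reg> UsedInBetween; // touched between the previous def and the
                               // paired instruction
  std::set<Reg> CalleeSaved;   // callee-saved registers
};

// Returns the first candidate that violates none of the three conditions,
// or std::nullopt if no candidate qualifies.
std::optional<Reg> pickRenameReg(const RenameQuery &Q) {
  for (Reg R : Q.Candidates) {
    if (Q.DefinedInBB.count(R) || Q.UsedInBetween.count(R) ||
        Q.CalleeSaved.count(R))
      continue; // R breaks at least one condition; try the next one.
    return R;
  }
  return std::nullopt;
}
```

In this model, a candidate list of {8, 9, 10, 19} with register 8 already defined, 9 used in between and 19 callee-saved would leave 10 as the only usable rename register.<br>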
<br>
This triggers quite often. Here are the top changes for MultiSource,<br>
SPEC2000, and SPEC2006 compiled with -O3 for iOS:<br>
<br>
Metric: aarch64-ldst-opt.NumPairCreated<br>
<br>
Program base patch diff<br>
test-suite...nch/fourinarow/fourinarow.test 2.00 39.00 1850.0%<br>
test-suite...s/ASC_Sequoia/IRSmk/IRSmk.test 46.00 80.00 73.9%<br>
test-suite...chmarks/Olden/power/power.test 70.00 96.00 37.1%<br>
test-suite...cations/hexxagon/hexxagon.test 29.00 39.00 34.5%<br>
test-suite...nchmarks/McCat/05-eks/eks.test 100.00 132.00 32.0%<br>
test-suite.../Trimaran/enc-rc4/enc-rc4.test 46.00 59.00 28.3%<br>
test-suite...T2006/473.astar/473.astar.test 160.00 200.00 25.0%<br>
test-suite.../Trimaran/enc-md5/enc-md5.test 8.00 10.00 25.0%<br>
test-suite...telecomm-gsm/telecomm-gsm.test 113.00 139.00 23.0%<br>
test-suite...ediabench/gsm/toast/toast.test 113.00 139.00 23.0%<br>
test-suite...Source/Benchmarks/sim/sim.test 91.00 111.00 22.0%<br>
test-suite...C/CFP2000/179.art/179.art.test 41.00 49.00 19.5%<br>
test-suite...peg2/mpeg2dec/mpeg2decode.test 245.00 279.00 13.9%<br>
test-suite...marks/Olden/health/health.test 16.00 18.00 12.5%<br>
test-suite...ks/Prolangs-C/cdecl/cdecl.test 90.00 101.00 12.2%<br>
test-suite...fice-ispell/office-ispell.test 91.00 100.00 9.9%<br>
test-suite...oxyApps-C/miniGMG/miniGMG.test 430.00 465.00 8.1%<br>
test-suite...lowfish/security-blowfish.test 39.00 42.00 7.7%<br>
test-suite.../Applications/spiff/spiff.test 42.00 45.00 7.1%<br>
test-suite...arks/mafft/pairlocalalign.test 2473.00 2646.00 7.0%<br>
test-suite.../VersaBench/ecbdes/ecbdes.test 29.00 31.00 6.9%<br>
test-suite...nch/beamformer/beamformer.test 220.00 235.00 6.8%<br>
test-suite...CFP2000/177.mesa/177.mesa.test 2110.00 2252.00 6.7%<br>
test-suite...ve-susan/automotive-susan.test 109.00 116.00 6.4%<br>
test-suite...s-C/unix-smail/unix-smail.test 65.00 69.00 6.2%<br>
test-suite...CI_Purple/SMG2000/smg2000.test 1194.00 1265.00 5.9%<br>
test-suite.../Benchmarks/nbench/nbench.test 472.00 500.00 5.9%<br>
test-suite...oxyApps-C/miniAMR/miniAMR.test 248.00 262.00 5.6%<br>
test-suite...quoia/CrystalMk/CrystalMk.test 18.00 19.00 5.6%<br>
test-suite...rks/tramp3d-v4/tramp3d-v4.test 7331.00 7710.00 5.2%<br>
test-suite.../Benchmarks/Bullet/bullet.test 5651.00 5938.00 5.1%<br>
test-suite...ternal/HMMER/hmmcalibrate.test 750.00 788.00 5.1%<br>
test-suite...T2006/456.hmmer/456.hmmer.test 764.00 802.00 5.0%<br>
test-suite...ications/JM/ldecod/ldecod.test 1028.00 1079.00 5.0%<br>
test-suite...CFP2006/444.namd/444.namd.test 1368.00 1434.00 4.8%<br>
test-suite...marks/7zip/7zip-benchmark.test 4471.00 4685.00 4.8%<br>
test-suite...6/464.h264ref/464.h264ref.test 3122.00 3271.00 4.8%<br>
test-suite...pplications/oggenc/oggenc.test 1497.00 1565.00 4.5%<br>
test-suite...T2000/300.twolf/300.twolf.test 742.00 774.00 4.3%<br>
test-suite.../Prolangs-C/loader/loader.test 24.00 25.00 4.2%<br>
test-suite...0.perlbench/400.perlbench.test 1983.00 2058.00 3.8%<br>
test-suite...ications/JM/lencod/lencod.test 4612.00 4785.00 3.8%<br>
test-suite...yApps-C++/PENNANT/PENNANT.test 995.00 1032.00 3.7%<br>
test-suite...arks/VersaBench/dbms/dbms.test 54.00 56.00 3.7%<br>
<br>
Reviewers: efriedma, thegameg, samparker, dmgreen, paquette, evandro<br>
<br>
Reviewed By: paquette<br>
<br>
Differential Revision: <a href="https://reviews.llvm.org/D70450" rel="noreferrer" target="_blank">https://reviews.llvm.org/D70450</a><br>
<br>
Added: <br>
llvm/test/CodeGen/AArch64/stp-opt-with-renaming.mir<br>
<br>
Modified: <br>
llvm/lib/Target/AArch64/AArch64LoadStoreOptimizer.cpp<br>
llvm/test/CodeGen/AArch64/arm64-abi-varargs.ll<br>
llvm/test/CodeGen/AArch64/arm64-abi_align.ll<br>
llvm/test/CodeGen/AArch64/arm64-variadic-aapcs.ll<br>
llvm/test/CodeGen/AArch64/machine-outliner-remarks.ll<br>
llvm/test/CodeGen/AArch64/machine-outliner.ll<br>
<br>
Removed: <br>
<br>
<br>
<br>
################################################################################<br>
diff --git a/llvm/lib/Target/AArch64/AArch64LoadStoreOptimizer.cpp b/llvm/lib/Target/AArch64/AArch64LoadStoreOptimizer.cpp<br>
index a0c4a25bb5b9..d60dc43d19b4 100644<br>
--- a/llvm/lib/Target/AArch64/AArch64LoadStoreOptimizer.cpp<br>
+++ b/llvm/lib/Target/AArch64/AArch64LoadStoreOptimizer.cpp<br>
@@ -32,10 +32,12 @@<br>
#include "llvm/Pass.h"<br>
#include "llvm/Support/CommandLine.h"<br>
#include "llvm/Support/Debug.h"<br>
+#include "llvm/Support/DebugCounter.h"<br>
#include "llvm/Support/ErrorHandling.h"<br>
#include "llvm/Support/raw_ostream.h"<br>
#include <cassert><br>
#include <cstdint><br>
+#include <functional><br>
#include <iterator><br>
#include <limits><br>
<br>
@@ -51,6 +53,9 @@ STATISTIC(NumUnscaledPairCreated,<br>
STATISTIC(NumZeroStoresPromoted, "Number of narrow zero stores promoted");<br>
STATISTIC(NumLoadsFromStoresPromoted, "Number of loads from stores promoted");<br>
<br>
+DEBUG_COUNTER(RegRenamingCounter, DEBUG_TYPE "-reg-renaming",<br>
+ "Controls which pairs are considered for renaming");<br>
+<br>
// The LdStLimit limits how far we search for load/store pairs.<br>
static cl::opt<unsigned> LdStLimit("aarch64-load-store-scan-limit",<br>
cl::init(20), cl::Hidden);<br>
@@ -76,6 +81,11 @@ using LdStPairFlags = struct LdStPairFlags {<br>
// to be extended, 0 means I, and 1 means the returned iterator.<br>
int SExtIdx = -1;<br>
<br>
+ // If not none, RenameReg can be used to rename the result register of the<br>
+ // first store in a pair. Currently this only works when merging stores<br>
+ // forward.<br>
+ Optional<MCPhysReg> RenameReg = None;<br>
+<br>
LdStPairFlags() = default;<br>
<br>
void setMergeForward(bool V = true) { MergeForward = V; }<br>
@@ -83,6 +93,10 @@ using LdStPairFlags = struct LdStPairFlags {<br>
<br>
void setSExtIdx(int V) { SExtIdx = V; }<br>
int getSExtIdx() const { return SExtIdx; }<br>
+<br>
+ void setRenameReg(MCPhysReg R) { RenameReg = R; }<br>
+ void clearRenameReg() { RenameReg = None; }<br>
+ Optional<MCPhysReg> getRenameReg() const { return RenameReg; }<br>
};<br>
<br>
struct AArch64LoadStoreOpt : public MachineFunctionPass {<br>
@@ -99,6 +113,7 @@ struct AArch64LoadStoreOpt : public MachineFunctionPass {<br>
<br>
// Track which register units have been modified and used.<br>
LiveRegUnits ModifiedRegUnits, UsedRegUnits;<br>
+ LiveRegUnits DefinedInBB;<br>
<br>
void getAnalysisUsage(AnalysisUsage &AU) const override {<br>
AU.addRequired<AAResultsWrapperPass>();<br>
@@ -599,8 +614,8 @@ static void getPrePostIndexedMemOpInfo(const MachineInstr &MI, int &Scale,<br>
}<br>
}<br>
<br>
-static const MachineOperand &getLdStRegOp(const MachineInstr &MI,<br>
- unsigned PairedRegOp = 0) {<br>
+static MachineOperand &getLdStRegOp(MachineInstr &MI,<br>
+ unsigned PairedRegOp = 0) {<br>
assert(PairedRegOp < 2 && "Unexpected register operand idx.");<br>
unsigned Idx = isPairedLdSt(MI) ? PairedRegOp : 0;<br>
return MI.getOperand(Idx);<br>
@@ -783,6 +798,44 @@ AArch64LoadStoreOpt::mergeNarrowZeroStores(MachineBasicBlock::iterator I,<br>
return NextI;<br>
}<br>
<br>
+// Apply Fn to all instructions between MI and the beginning of the block, until<br>
+// a def for DefReg is reached. Returns true, iff Fn returns true for all<br>
+// visited instructions. Stop after visiting Limit iterations.<br>
+static bool forAllMIsUntilDef(MachineInstr &MI, MCPhysReg DefReg,<br>
+ const TargetRegisterInfo *TRI, unsigned Limit,<br>
+ std::function<bool(MachineInstr &, bool)> &Fn) {<br>
+ auto MBB = MI.getParent();<br>
+ for (MachineBasicBlock::reverse_iterator I = MI.getReverseIterator(),<br>
+ E = MBB->rend();<br>
+ I != E; I++) {<br>
+ if (!Limit)<br>
+ return false;<br>
+ --Limit;<br>
+<br>
+ bool isDef = any_of(I->operands(), [DefReg, TRI](MachineOperand &MOP) {<br>
+ return MOP.isReg() && MOP.isDef() &&<br>
+ TRI->regsOverlap(MOP.getReg(), DefReg);<br>
+ });<br>
+ if (!Fn(*I, isDef))<br>
+ return false;<br>
+ if (isDef)<br>
+ break;<br>
+ }<br>
+ return true;<br>
+}<br>
+<br>
+static void updateDefinedRegisters(MachineInstr &MI, LiveRegUnits &Units,<br>
+ const TargetRegisterInfo *TRI) {<br>
+<br>
+ for (const MachineOperand &MOP : phys_regs_and_masks(MI))<br>
+ if (MOP.isReg() && MOP.isKill())<br>
+ Units.removeReg(MOP.getReg());<br>
+<br>
+ for (const MachineOperand &MOP : phys_regs_and_masks(MI))<br>
+ if (MOP.isReg() && !MOP.isKill())<br>
+ Units.addReg(MOP.getReg());<br>
+}<br>
+<br>
MachineBasicBlock::iterator<br>
AArch64LoadStoreOpt::mergePairedInsns(MachineBasicBlock::iterator I,<br>
MachineBasicBlock::iterator Paired,<br>
@@ -803,6 +856,70 @@ AArch64LoadStoreOpt::mergePairedInsns(MachineBasicBlock::iterator I,<br>
int OffsetStride = IsUnscaled ? getMemScale(*I) : 1;<br>
<br>
bool MergeForward = Flags.getMergeForward();<br>
+<br>
+ Optional<MCPhysReg> RenameReg = Flags.getRenameReg();<br>
+ if (MergeForward && RenameReg) {<br>
+ MCRegister RegToRename = getLdStRegOp(*I).getReg();<br>
+ DefinedInBB.addReg(*RenameReg);<br>
+<br>
+ // Return the sub/super register for RenameReg, matching the size of<br>
+ // OriginalReg.<br>
+ auto GetMatchingSubReg = [this,<br>
+ RenameReg](MCPhysReg OriginalReg) -> MCPhysReg {<br>
+ for (MCPhysReg SubOrSuper : TRI->sub_and_superregs_inclusive(*RenameReg))<br>
+ if (TRI->getMinimalPhysRegClass(OriginalReg) ==<br>
+ TRI->getMinimalPhysRegClass(SubOrSuper))<br>
+ return SubOrSuper;<br>
+ llvm_unreachable("Should have found matching sub or super register!");<br>
+ };<br>
+<br>
+ std::function<bool(MachineInstr &, bool)> UpdateMIs =<br>
+ [this, RegToRename, GetMatchingSubReg](MachineInstr &MI, bool IsDef) {<br>
+ if (IsDef) {<br>
+ bool SeenDef = false;<br>
+ for (auto &MOP : MI.operands()) {<br>
+ // Rename the first explicit definition and all implicit<br>
+ // definitions matching RegToRename.<br>
+ if (MOP.isReg() &&<br>
+ (!SeenDef || (MOP.isDef() && MOP.isImplicit())) &&<br>
+ TRI->regsOverlap(MOP.getReg(), RegToRename)) {<br>
+ assert((MOP.isImplicit() ||<br>
+ (MOP.isRenamable() && !MOP.isEarlyClobber())) &&<br>
+ "Need renamable operands");<br>
+ MOP.setReg(GetMatchingSubReg(MOP.getReg()));<br>
+ SeenDef = true;<br>
+ }<br>
+ }<br>
+ } else {<br>
+ for (auto &MOP : MI.operands()) {<br>
+ if (MOP.isReg() && TRI->regsOverlap(MOP.getReg(), RegToRename)) {<br>
+ assert(MOP.isImplicit() ||<br>
+ (MOP.isRenamable() && !MOP.isEarlyClobber()) &&<br>
+ "Need renamable operands");<br>
+ MOP.setReg(GetMatchingSubReg(MOP.getReg()));<br>
+ }<br>
+ }<br>
+ }<br>
+ LLVM_DEBUG(dbgs() << "Renamed " << MI << "\n");<br>
+ return true;<br>
+ };<br>
+ forAllMIsUntilDef(*I, RegToRename, TRI, LdStLimit, UpdateMIs);<br>
+<br>
+ // Make sure the register used for renaming is not used between the paired<br>
+ // instructions. That would trash the content before the new paired<br>
+ // instruction.<br>
+ for (auto &MI :<br>
+ iterator_range<MachineInstrBundleIterator<llvm::MachineInstr>>(<br>
+ std::next(I), std::next(Paired)))<br>
+ assert(all_of(MI.operands(),<br>
+ [this, &RenameReg](const MachineOperand &MOP) {<br>
+ return !MOP.isReg() ||<br>
+ !TRI->regsOverlap(MOP.getReg(), *RenameReg);<br>
+ }) &&<br>
+ "Rename register used between paired instruction, trashing the "<br>
+ "content");<br>
+ }<br>
+<br>
// Insert our new paired instruction after whichever of the paired<br>
// instructions MergeForward indicates.<br>
MachineBasicBlock::iterator InsertionPoint = MergeForward ? Paired : I;<br>
@@ -931,6 +1048,11 @@ AArch64LoadStoreOpt::mergePairedInsns(MachineBasicBlock::iterator I,<br>
}<br>
LLVM_DEBUG(dbgs() << "\n");<br>
<br>
+ if (MergeForward)<br>
+ for (const MachineOperand &MOP : phys_regs_and_masks(*I))<br>
+ if (MOP.isReg() && MOP.isKill())<br>
+ DefinedInBB.addReg(MOP.getReg());<br>
+<br>
// Erase the old instructions.<br>
I->eraseFromParent();<br>
Paired->eraseFromParent();<br>
@@ -1207,6 +1329,144 @@ static bool areCandidatesToMergeOrPair(MachineInstr &FirstMI, MachineInstr &MI,<br>
// FIXME: Can we also match a mixed sext/zext unscaled/scaled pair?<br>
}<br>
<br>
+static bool<br>
+canRenameUpToDef(MachineInstr &FirstMI, LiveRegUnits &UsedInBetween,<br>
+ SmallPtrSetImpl<const TargetRegisterClass *> &RequiredClasses,<br>
+ const TargetRegisterInfo *TRI) {<br>
+ if (!FirstMI.mayStore())<br>
+ return false;<br>
+<br>
+ // Check if we can find an unused register which we can use to rename<br>
+ // the register used by the first load/store.<br>
+ auto *RegClass = TRI->getMinimalPhysRegClass(getLdStRegOp(FirstMI).getReg());<br>
+ MachineFunction &MF = *FirstMI.getParent()->getParent();<br>
+ if (!RegClass || !MF.getRegInfo().tracksLiveness())<br>
+ return false;<br>
+<br>
+ auto RegToRename = getLdStRegOp(FirstMI).getReg();<br>
+ // For now, we only rename if the store operand gets killed at the store.<br>
+ if (!getLdStRegOp(FirstMI).isKill() &&<br>
+ !any_of(FirstMI.operands(),<br>
+ [TRI, RegToRename](const MachineOperand &MOP) {<br>
+ return MOP.isReg() && MOP.isImplicit() && MOP.isKill() &&<br>
+ TRI->regsOverlap(RegToRename, MOP.getReg());<br>
+ })) {<br>
+ LLVM_DEBUG(dbgs() << " Operand not killed at " << FirstMI << "\n");<br>
+ return false;<br>
+ }<br>
+ auto canRenameMOP = [](const MachineOperand &MOP) {<br>
+ return MOP.isImplicit() ||<br>
+ (MOP.isRenamable() && !MOP.isEarlyClobber() && !MOP.isTied());<br>
+ };<br>
+<br>
+ bool FoundDef = false;<br>
+<br>
+ // For each instruction between FirstMI and the previous def for RegToRename,<br>
+ // we<br>
+ // * check if we can rename RegToRename in this instruction<br>
+ // * collect the registers used and required register classes for RegToRename.<br>
+ std::function<bool(MachineInstr &, bool)> CheckMIs = [&](MachineInstr &MI,<br>
+ bool IsDef) {<br>
+ LLVM_DEBUG(dbgs() << "Checking " << MI << "\n");<br>
+ // Currently we do not try to rename across frame-setup instructions.<br>
+ if (MI.getFlag(MachineInstr::FrameSetup)) {<br>
+ LLVM_DEBUG(dbgs() << " Cannot rename framesetup instructions currently ("<br>
+ << MI << ")\n");<br>
+ return false;<br>
+ }<br>
+<br>
+ UsedInBetween.accumulate(MI);<br>
+<br>
+ // For a definition, check that we can rename the definition and exit the<br>
+ // loop.<br>
+ FoundDef = IsDef;<br>
+<br>
+ // For defs, check if we can rename the first def of RegToRename.<br>
+ if (FoundDef) {<br>
+ for (auto &MOP : MI.operands()) {<br>
+ if (!MOP.isReg() || !MOP.isDef() ||<br>
+ !TRI->regsOverlap(MOP.getReg(), RegToRename))<br>
+ continue;<br>
+ if (!canRenameMOP(MOP)) {<br>
+ LLVM_DEBUG(dbgs()<br>
+ << " Cannot rename " << MOP << " in " << MI << "\n");<br>
+ return false;<br>
+ }<br>
+ RequiredClasses.insert(TRI->getMinimalPhysRegClass(MOP.getReg()));<br>
+ }<br>
+ return true;<br>
+ } else {<br>
+ for (auto &MOP : MI.operands()) {<br>
+ if (!MOP.isReg() || !TRI->regsOverlap(MOP.getReg(), RegToRename))<br>
+ continue;<br>
+<br>
+ if (!canRenameMOP(MOP)) {<br>
+ LLVM_DEBUG(dbgs()<br>
+ << " Cannot rename " << MOP << " in " << MI << "\n");<br>
+ return false;<br>
+ }<br>
+ RequiredClasses.insert(TRI->getMinimalPhysRegClass(MOP.getReg()));<br>
+ }<br>
+ }<br>
+ return true;<br>
+ };<br>
+<br>
+ if (!forAllMIsUntilDef(FirstMI, RegToRename, TRI, LdStLimit, CheckMIs))<br>
+ return false;<br>
+<br>
+ if (!FoundDef) {<br>
+ LLVM_DEBUG(dbgs() << " Did not find definition for register in BB\n");<br>
+ return false;<br>
+ }<br>
+ return true;<br>
+}<br>
+<br>
+// Check if we can find a physical register for renaming. This register must:<br>
+// * not be defined up to FirstMI (checking DefinedInBB)<br>
+// * not used between the MI and the defining instruction of the register to<br>
+// rename (checked using UsedInBetween).<br>
+// * is available in all used register classes (checked using RequiredClasses).<br>
+static Optional<MCPhysReg> tryToFindRegisterToRename(<br>
+ MachineInstr &FirstMI, MachineInstr &MI, LiveRegUnits &DefinedInBB,<br>
+ LiveRegUnits &UsedInBetween,<br>
+ SmallPtrSetImpl<const TargetRegisterClass *> &RequiredClasses,<br>
+ const TargetRegisterInfo *TRI) {<br>
+ auto &MF = *FirstMI.getParent()->getParent();<br>
+<br>
+ // Checks if any sub- or super-register of PR is callee saved.<br>
+ auto AnySubOrSuperRegCalleePreserved = [&MF, TRI](MCPhysReg PR) {<br>
+ return any_of(TRI->sub_and_superregs_inclusive(PR),<br>
+ [&MF, TRI](MCPhysReg SubOrSuper) {<br>
+ return TRI->isCalleeSavedPhysReg(SubOrSuper, MF);<br>
+ });<br>
+ };<br>
+<br>
+ // Check if PR or one of its sub- or super-registers can be used for all<br>
+ // required register classes.<br>
+ auto CanBeUsedForAllClasses = [&RequiredClasses, TRI](MCPhysReg PR) {<br>
+ return all_of(RequiredClasses, [PR, TRI](const TargetRegisterClass *C) {<br>
+ return any_of(TRI->sub_and_superregs_inclusive(PR),<br>
+ [C, TRI](MCPhysReg SubOrSuper) {<br>
+ return C == TRI->getMinimalPhysRegClass(SubOrSuper);<br>
+ });<br>
+ });<br>
+ };<br>
+<br>
+ auto *RegClass = TRI->getMinimalPhysRegClass(getLdStRegOp(FirstMI).getReg());<br>
+ for (const MCPhysReg &PR : *RegClass) {<br>
+ if (DefinedInBB.available(PR) && UsedInBetween.available(PR) &&<br>
+ !AnySubOrSuperRegCalleePreserved(PR) && CanBeUsedForAllClasses(PR)) {<br>
+ DefinedInBB.addReg(PR);<br>
+ LLVM_DEBUG(dbgs() << "Found rename register " << printReg(PR, TRI)<br>
+ << "\n");<br>
+ return {PR};<br>
+ }<br>
+ }<br>
+ LLVM_DEBUG(dbgs() << "No rename register found from "<br>
+ << TRI->getRegClassName(RegClass) << "\n");<br>
+ return None;<br>
+}<br>
+<br>
/// Scan the instructions looking for a load/store that can be combined with the<br>
/// current instruction into a wider equivalent or a load/store pair.<br>
MachineBasicBlock::iterator<br>
@@ -1215,6 +1475,7 @@ AArch64LoadStoreOpt::findMatchingInsn(MachineBasicBlock::iterator I,<br>
bool FindNarrowMerge) {<br>
MachineBasicBlock::iterator E = I->getParent()->end();<br>
MachineBasicBlock::iterator MBBI = I;<br>
+ MachineBasicBlock::iterator MBBIWithRenameReg;<br>
MachineInstr &FirstMI = *I;<br>
++MBBI;<br>
<br>
@@ -1226,6 +1487,13 @@ AArch64LoadStoreOpt::findMatchingInsn(MachineBasicBlock::iterator I,<br>
int OffsetStride = IsUnscaled ? getMemScale(FirstMI) : 1;<br>
bool IsPromotableZeroStore = isPromotableZeroStoreInst(FirstMI);<br>
<br>
+ Optional<bool> MaybeCanRename = None;<br>
+ SmallPtrSet<const TargetRegisterClass *, 5> RequiredClasses;<br>
+ LiveRegUnits UsedInBetween;<br>
+ UsedInBetween.init(*TRI);<br>
+<br>
+ Flags.clearRenameReg();<br>
+<br>
// Track which register units have been modified and used between the first<br>
// insn (inclusive) and the second insn.<br>
ModifiedRegUnits.clear();<br>
@@ -1237,6 +1505,8 @@ AArch64LoadStoreOpt::findMatchingInsn(MachineBasicBlock::iterator I,<br>
for (unsigned Count = 0; MBBI != E && Count < Limit; ++MBBI) {<br>
MachineInstr &MI = *MBBI;<br>
<br>
+ UsedInBetween.accumulate(MI);<br>
+<br>
// Don't count transient instructions towards the search limit since there<br>
// may be different numbers of them if e.g. debug information is present.<br>
if (!MI.isTransient())<br>
@@ -1329,7 +1599,9 @@ AArch64LoadStoreOpt::findMatchingInsn(MachineBasicBlock::iterator I,<br>
!(MI.mayLoad() &&<br>
!UsedRegUnits.available(getLdStRegOp(MI).getReg())) &&<br>
!mayAlias(MI, MemInsns, AA)) {<br>
+<br>
Flags.setMergeForward(false);<br>
+ Flags.clearRenameReg();<br>
return MBBI;<br>
}<br>
<br>
@@ -1337,18 +1609,41 @@ AArch64LoadStoreOpt::findMatchingInsn(MachineBasicBlock::iterator I,<br>
// between the two instructions and none of the instructions between the<br>
// first and the second alias with the first, we can combine the first<br>
// into the second.<br>
- if (ModifiedRegUnits.available(getLdStRegOp(FirstMI).getReg()) &&<br>
- !(MayLoad &&<br>
+ if (!(MayLoad &&<br>
!UsedRegUnits.available(getLdStRegOp(FirstMI).getReg())) &&<br>
!mayAlias(FirstMI, MemInsns, AA)) {<br>
- Flags.setMergeForward(true);<br>
- return MBBI;<br>
+<br>
+ if (ModifiedRegUnits.available(getLdStRegOp(FirstMI).getReg())) {<br>
+ Flags.setMergeForward(true);<br>
+ Flags.clearRenameReg();<br>
+ return MBBI;<br>
+ }<br>
+<br>
+ if (DebugCounter::shouldExecute(RegRenamingCounter)) {<br>
+ if (!MaybeCanRename)<br>
+ MaybeCanRename = {canRenameUpToDef(FirstMI, UsedInBetween,<br>
+ RequiredClasses, TRI)};<br>
+<br>
+ if (*MaybeCanRename) {<br>
+ Optional<MCPhysReg> MaybeRenameReg = tryToFindRegisterToRename(<br>
+ FirstMI, MI, DefinedInBB, UsedInBetween, RequiredClasses,<br>
+ TRI);<br>
+ if (MaybeRenameReg) {<br>
+ Flags.setRenameReg(*MaybeRenameReg);<br>
+ Flags.setMergeForward(true);<br>
+ MBBIWithRenameReg = MBBI;<br>
+ }<br>
+ }<br>
+ }<br>
}<br>
// Unable to combine these instructions due to interference in between.<br>
// Keep looking.<br>
}<br>
}<br>
<br>
+ if (Flags.getRenameReg())<br>
+ return MBBIWithRenameReg;<br>
+<br>
// If the instruction wasn't a matching load or store. Stop searching if we<br>
// encounter a call instruction that might modify memory.<br>
if (MI.isCall())<br>
@@ -1680,7 +1975,13 @@ bool AArch64LoadStoreOpt::tryToPairLdStInst(MachineBasicBlock::iterator &MBBI) {<br>
++NumUnscaledPairCreated;<br>
// Keeping the iterator straight is a pain, so we let the merge routine tell<br>
// us what the next instruction is after it's done mucking about.<br>
+ auto Prev = std::prev(MBBI);<br>
MBBI = mergePairedInsns(MBBI, Paired, Flags);<br>
+ // Collect liveness info for instructions between Prev and the new position<br>
+ // MBBI.<br>
+ for (auto I = std::next(Prev); I != MBBI; I++)<br>
+ updateDefinedRegisters(*I, DefinedInBB, TRI);<br>
+<br>
return true;<br>
}<br>
return false;<br>
@@ -1742,6 +2043,7 @@ bool AArch64LoadStoreOpt::tryToMergeLdStUpdate<br>
<br>
bool AArch64LoadStoreOpt::optimizeBlock(MachineBasicBlock &MBB,<br>
bool EnableNarrowZeroStOpt) {<br>
+<br>
bool Modified = false;<br>
// Four tranformations to do here:<br>
// 1) Find loads that directly read from stores and promote them by<br>
@@ -1786,8 +2088,17 @@ bool AArch64LoadStoreOpt::optimizeBlock(MachineBasicBlock &MBB,<br>
// ldr x1, [x2, #8]<br>
// ; becomes<br>
// ldp x0, x1, [x2]<br>
+<br>
+ if (MBB.getParent()->getRegInfo().tracksLiveness()) {<br>
+ DefinedInBB.clear();<br>
+ DefinedInBB.addLiveIns(MBB);<br>
+ }<br>
+<br>
for (MachineBasicBlock::iterator MBBI = MBB.begin(), E = MBB.end();<br>
MBBI != E;) {<br>
+ // Track currently live registers up to this point, to help with<br>
+ // searching for a rename register on demand.<br>
+ updateDefinedRegisters(*MBBI, DefinedInBB, TRI);<br>
if (TII->isPairableLdStInst(*MBBI) && tryToPairLdStInst(MBBI))<br>
Modified = true;<br>
else<br>
@@ -1825,11 +2136,14 @@ bool AArch64LoadStoreOpt::runOnMachineFunction(MachineFunction &Fn) {<br>
// or store.<br>
ModifiedRegUnits.init(*TRI);<br>
UsedRegUnits.init(*TRI);<br>
+ DefinedInBB.init(*TRI);<br>
<br>
bool Modified = false;<br>
bool enableNarrowZeroStOpt = !Subtarget->requiresStrictAlign();<br>
- for (auto &MBB : Fn)<br>
- Modified |= optimizeBlock(MBB, enableNarrowZeroStOpt);<br>
+ for (auto &MBB : Fn) {<br>
+ auto M = optimizeBlock(MBB, enableNarrowZeroStOpt);<br>
+ Modified |= M;<br>
+ }<br>
<br>
return Modified;<br>
}<br>
<br>
diff --git a/llvm/test/CodeGen/AArch64/arm64-abi-varargs.ll b/llvm/test/CodeGen/AArch64/arm64-abi-varargs.ll<br>
index ec3b51bd37a8..7caa4c06d698 100644<br>
--- a/llvm/test/CodeGen/AArch64/arm64-abi-varargs.ll<br>
+++ b/llvm/test/CodeGen/AArch64/arm64-abi-varargs.ll<br>
@@ -14,10 +14,9 @@ define void @fn9(i32* %a1, i32 %a2, i32 %a3, i32 %a4, i32 %a5, i32 %a6, i32 %a7,<br>
; CHECK-NEXT: stp w6, w5, [sp, #36]<br>
; CHECK-NEXT: str w7, [sp, #32]<br>
; CHECK-NEXT: str w8, [x0]<br>
-; CHECK-NEXT: ldr w8, [sp, #72]<br>
-; CHECK-NEXT: str w8, [sp, #20]<br>
+; CHECK-NEXT: ldr w9, [sp, #72]<br>
; CHECK-NEXT: ldr w8, [sp, #80]<br>
-; CHECK-NEXT: str w8, [sp, #16]<br>
+; CHECK-NEXT: stp w8, w9, [sp, #16]<br>
; CHECK-NEXT: add x8, sp, #72 ; =72<br>
; CHECK-NEXT: add x8, x8, #24 ; =24<br>
; CHECK-NEXT: str x8, [sp, #24]<br>
@@ -65,22 +64,18 @@ define i32 @main() nounwind ssp {<br>
; CHECK: ; %bb.0:<br>
; CHECK-NEXT: sub sp, sp, #96 ; =96<br>
; CHECK-NEXT: stp x29, x30, [sp, #80] ; 16-byte Folded Spill<br>
-; CHECK-NEXT: mov w8, #1<br>
-; CHECK-NEXT: str w8, [sp, #76]<br>
+; CHECK-NEXT: mov w9, #1<br>
; CHECK-NEXT: mov w8, #2<br>
-; CHECK-NEXT: str w8, [sp, #72]<br>
-; CHECK-NEXT: mov w8, #3<br>
-; CHECK-NEXT: str w8, [sp, #68]<br>
+; CHECK-NEXT: stp w8, w9, [sp, #72]<br>
+; CHECK-NEXT: mov w9, #3<br>
; CHECK-NEXT: mov w8, #4<br>
-; CHECK-NEXT: str w8, [sp, #64]<br>
-; CHECK-NEXT: mov w8, #5<br>
-; CHECK-NEXT: str w8, [sp, #60]<br>
+; CHECK-NEXT: stp w8, w9, [sp, #64]<br>
+; CHECK-NEXT: mov w9, #5<br>
; CHECK-NEXT: mov w8, #6<br>
-; CHECK-NEXT: str w8, [sp, #56]<br>
-; CHECK-NEXT: mov w8, #7<br>
-; CHECK-NEXT: str w8, [sp, #52]<br>
+; CHECK-NEXT: stp w8, w9, [sp, #56]<br>
+; CHECK-NEXT: mov w9, #7<br>
; CHECK-NEXT: mov w8, #8<br>
-; CHECK-NEXT: str w8, [sp, #48]<br>
+; CHECK-NEXT: stp w8, w9, [sp, #48]<br>
; CHECK-NEXT: mov w8, #9<br>
; CHECK-NEXT: mov w9, #10<br>
; CHECK-NEXT: stp w9, w8, [sp, #40]<br>
<br>
diff --git a/llvm/test/CodeGen/AArch64/arm64-abi_align.ll b/llvm/test/CodeGen/AArch64/arm64-abi_align.ll<br>
index 7db3ea76de05..189546a4553b 100644<br>
--- a/llvm/test/CodeGen/AArch64/arm64-abi_align.ll<br>
+++ b/llvm/test/CodeGen/AArch64/arm64-abi_align.ll<br>
@@ -392,10 +392,8 @@ entry:<br>
define i32 @caller43() #3 {<br>
entry:<br>
; CHECK-LABEL: caller43<br>
-; CHECK-DAG: str {{q[0-9]+}}, [sp, #48]<br>
-; CHECK-DAG: str {{q[0-9]+}}, [sp, #32]<br>
-; CHECK-DAG: str {{q[0-9]+}}, [sp, #16]<br>
-; CHECK-DAG: str {{q[0-9]+}}, [sp]<br>
+; CHECK-DAG: stp q1, q0, [sp, #32]<br>
+; CHECK-DAG: stp q1, q0, [sp]<br>
; CHECK: add x1, sp, #32<br>
; CHECK: mov x2, sp<br>
; Space for s1 is allocated at sp+32<br>
@@ -434,10 +432,8 @@ entry:<br>
; CHECK-LABEL: caller43_stack<br>
; CHECK: sub sp, sp, #112<br>
; CHECK: add x29, sp, #96<br>
-; CHECK-DAG: stur {{q[0-9]+}}, [x29, #-16]<br>
-; CHECK-DAG: stur {{q[0-9]+}}, [x29, #-32]<br>
-; CHECK-DAG: str {{q[0-9]+}}, [sp, #48]<br>
-; CHECK-DAG: str {{q[0-9]+}}, [sp, #32]<br>
+; CHECK-DAG: stp q1, q0, [x29, #-32]<br>
+; CHECK-DAG: stp q1, q0, [sp, #32]<br>
; Space for s1 is allocated at x29-32 = sp+64<br>
; Space for s2 is allocated at sp+32<br>
; CHECK: add x[[B:[0-9]+]], sp, #32<br>
<br>
diff --git a/llvm/test/CodeGen/AArch64/arm64-variadic-aapcs.ll b/llvm/test/CodeGen/AArch64/arm64-variadic-aapcs.ll<br>
index db87d7fae803..5f4c17d0cfba 100644<br>
--- a/llvm/test/CodeGen/AArch64/arm64-variadic-aapcs.ll<br>
+++ b/llvm/test/CodeGen/AArch64/arm64-variadic-aapcs.ll<br>
@@ -26,11 +26,11 @@ define void @test_simple(i32 %n, ...) {<br>
<br>
; CHECK: add [[GR_TOPTMP:x[0-9]+]], sp, #[[GR_BASE]]<br>
; CHECK: add [[GR_TOP:x[0-9]+]], [[GR_TOPTMP]], #56<br>
-; CHECK: str [[GR_TOP]], [x[[VA_LIST]], #8]<br>
+<br>
<br>
; CHECK: mov [[VR_TOPTMP:x[0-9]+]], sp<br>
; CHECK: add [[VR_TOP:x[0-9]+]], [[VR_TOPTMP]], #128<br>
-; CHECK: str [[VR_TOP]], [x[[VA_LIST]], #16]<br>
+; CHECK: stp [[GR_TOP]], [[VR_TOP]], [x[[VA_LIST]], #8]<br>
<br>
; CHECK: mov [[GRVR:x[0-9]+]], #-56<br>
; CHECK: movk [[GRVR]], #65408, lsl #32<br>
@@ -62,11 +62,10 @@ define void @test_fewargs(i32 %n, i32 %n1, i32 %n2, float %m, ...) {<br>
<br>
; CHECK: add [[GR_TOPTMP:x[0-9]+]], sp, #[[GR_BASE]]<br>
; CHECK: add [[GR_TOP:x[0-9]+]], [[GR_TOPTMP]], #40<br>
-; CHECK: str [[GR_TOP]], [x[[VA_LIST]], #8]<br>
<br>
; CHECK: mov [[VR_TOPTMP:x[0-9]+]], sp<br>
; CHECK: add [[VR_TOP:x[0-9]+]], [[VR_TOPTMP]], #112<br>
-; CHECK: str [[VR_TOP]], [x[[VA_LIST]], #16]<br>
+; CHECK: stp [[GR_TOP]], [[VR_TOP]], [x[[VA_LIST]], #8]<br>
<br>
; CHECK: mov [[GRVR_OFFS:x[0-9]+]], #-40<br>
; CHECK: movk [[GRVR_OFFS]], #65424, lsl #32<br>
<br>
diff --git a/llvm/test/CodeGen/AArch64/machine-outliner-remarks.ll b/llvm/test/CodeGen/AArch64/machine-outliner-remarks.ll<br>
index 19351262b82b..f188301579e3 100644<br>
--- a/llvm/test/CodeGen/AArch64/machine-outliner-remarks.ll<br>
+++ b/llvm/test/CodeGen/AArch64/machine-outliner-remarks.ll<br>
@@ -4,7 +4,7 @@<br>
; CHECK-SAME: Bytes from outlining all occurrences (16) >=<br>
; CHECK-SAME: Unoutlined instruction bytes (16)<br>
; CHECK-SAME: (Also found at: <UNKNOWN LOCATION>)<br>
-; CHECK: remark: <unknown>:0:0: Saved 48 bytes by outlining 14 instructions<br>
+; CHECK: remark: <unknown>:0:0: Saved 36 bytes by outlining 11 instructions<br>
; CHECK-SAME: from 2 locations. (Found at: <UNKNOWN LOCATION>,<br>
; CHECK-SAME: <UNKNOWN LOCATION>)<br>
; RUN: llc %s -enable-machine-outliner -mtriple=aarch64-unknown-unknown -o /dev/null -pass-remarks-missed=machine-outliner -pass-remarks-output=%t.yaml<br>
@@ -38,10 +38,10 @@<br>
; YAML-NEXT: Function: OUTLINED_FUNCTION_0<br>
; YAML-NEXT: Args:<br>
; YAML-NEXT: - String: 'Saved '<br>
-; YAML-NEXT: - OutliningBenefit: '48'<br>
+; YAML-NEXT: - OutliningBenefit: '36'<br>
; YAML-NEXT: - String: ' bytes by '<br>
; YAML-NEXT: - String: 'outlining '<br>
-; YAML-NEXT: - Length: '14'<br>
+; YAML-NEXT: - Length: '11'<br>
; YAML-NEXT: - String: ' instructions '<br>
; YAML-NEXT: - String: 'from '<br>
; YAML-NEXT: - NumOccurrences: '2'<br>
<br>
diff --git a/llvm/test/CodeGen/AArch64/machine-outliner.ll b/llvm/test/CodeGen/AArch64/machine-outliner.ll<br>
index 15afdd43d116..13a12f76695d 100644<br>
--- a/llvm/test/CodeGen/AArch64/machine-outliner.ll<br>
+++ b/llvm/test/CodeGen/AArch64/machine-outliner.ll<br>
@@ -91,19 +91,16 @@ define void @dog() #0 {<br>
; ODR: [[OUTLINED]]:<br>
; CHECK: .p2align 2<br>
; CHECK-NEXT: [[OUTLINED]]:<br>
-; CHECK: mov w8, #1<br>
-; CHECK-NEXT: str w8, [sp, #28]<br>
-; CHECK-NEXT: mov w8, #2<br>
-; CHECK-NEXT: str w8, [sp, #24]<br>
-; CHECK-NEXT: mov w8, #3<br>
-; CHECK-NEXT: str w8, [sp, #20]<br>
-; CHECK-NEXT: mov w8, #4<br>
-; CHECK-NEXT: str w8, [sp, #16]<br>
-; CHECK-NEXT: mov w8, #5<br>
-; CHECK-NEXT: str w8, [sp, #12]<br>
-; CHECK-NEXT: mov w8, #6<br>
-; CHECK-NEXT: str w8, [sp, #8]<br>
-; CHECK-NEXT: add sp, sp, #32<br>
-; CHECK-NEXT: ret<br>
+; CHECK: mov w9, #1<br>
+; CHECK-DAG: mov w8, #2<br>
+; CHECK-DAG: stp w8, w9, [sp, #24]<br>
+; CHECK-DAG: mov w9, #3<br>
+; CHECK-DAG: mov w8, #4<br>
+; CHECK-DAG: stp w8, w9, [sp, #16]<br>
+; CHECK-DAG: mov w9, #5<br>
+; CHECK-DAG: mov w8, #6<br>
+; CHECK-DAG: stp w8, w9, [sp, #8]<br>
+; CHECK-DAG: add sp, sp, #32<br>
+; CHECK-DAG: ret<br>
<br>
attributes #0 = { noredzone "target-cpu"="cyclone" "target-features"="+sse" }<br>
<br>
diff --git a/llvm/test/CodeGen/AArch64/stp-opt-with-renaming.mir b/llvm/test/CodeGen/AArch64/stp-opt-with-renaming.mir<br>
new file mode 100644<br>
index 000000000000..018827772da5<br>
--- /dev/null<br>
+++ b/llvm/test/CodeGen/AArch64/stp-opt-with-renaming.mir<br>
@@ -0,0 +1,471 @@<br>
+# RUN: llc -run-pass=aarch64-ldst-opt -mtriple=arm64-apple-iphoneos -verify-machineinstrs -o - %s | FileCheck %s<br>
+<br>
+---<br>
+# CHECK-LABEL: name: test1<br>
+# CHECK: bb.0:<br>
+# CHECK-NEXT: liveins: $x0, $x1<br>
+# CHECK: $x10, renamable $x8 = LDPXi renamable $x0, 0 :: (load 8)<br>
+# CHECK-NEXT: renamable $x9 = LDRXui renamable $x0, 1 :: (load 8)<br>
+# CHECK-NEXT: STRXui renamable $x9, renamable $x0, 100 :: (store 8, align 4)<br>
+# CHECK-NEXT: renamable $x8 = ADDXrr $x8, $x8<br>
+# CHECK-NEXT: STPXi renamable $x8, killed $x10, renamable $x0, 10 :: (store 8, align 4)<br>
+# CHECK-NEXT: RET undef $lr<br>
+<br>
+name: test1<br>
+alignment: 4<br>
+tracksRegLiveness: true<br>
+liveins:<br>
+ - { reg: '$x0' }<br>
+ - { reg: '$x1' }<br>
+ - { reg: '$x8' }<br>
+frameInfo:<br>
+ maxAlignment: 1<br>
+ maxCallFrameSize: 0<br>
+machineFunctionInfo: {}<br>
+body: |<br>
+ bb.0:<br>
+ liveins: $x0, $x1<br>
+ renamable $x9, renamable $x8 = LDPXi renamable $x0, 0 :: (load 8)<br>
+ STRXui renamable killed $x9, renamable $x0, 11 :: (store 8, align 4)<br>
+ renamable $x9 = LDRXui renamable $x0, 1 :: (load 8)<br>
+ STRXui renamable $x9, renamable $x0, 100 :: (store 8, align 4)<br>
+ renamable $x8 = ADDXrr $x8, $x8<br>
+ STRXui renamable $x8, renamable $x0, 10 :: (store 8, align 4)<br>
+ RET undef $lr<br>
+<br>
+...<br>
+---<br>
+# CHECK-LABEL: name: test2<br>
+# CHECK-LABEL: bb.0:<br>
+# CHECK-NEXT: liveins: $x0, $x9, $x1<br>
+<br>
+# CHECK: $x10, renamable $x8 = LDPXi renamable $x9, 0 :: (load 8)<br>
+# CHECK-NEXT: renamable $x9 = LDRXui renamable $x0, 2 :: (load 8)<br>
+# CHECK-NEXT: STRXui renamable $x9, renamable $x0, 100 :: (store 8, align 4)<br>
+# CHECK-NEXT: renamable $x8 = ADDXrr $x8, $x8<br>
+# CHECK-NEXT: STPXi renamable $x8, killed $x10, renamable $x0, 10 :: (store 8, align 4)<br>
+# CHECK-NEXT: RET undef $lr<br>
+<br>
+name: test2<br>
+alignment: 4<br>
+tracksRegLiveness: true<br>
+liveins:<br>
+ - { reg: '$x0' }<br>
+ - { reg: '$x1' }<br>
+ - { reg: '$x9' }<br>
+frameInfo:<br>
+ maxAlignment: 1<br>
+ maxCallFrameSize: 0<br>
+machineFunctionInfo: {}<br>
+body: |<br>
+ bb.0:<br>
+ liveins: $x0, $x9, $x1<br>
+ renamable $x9, renamable $x8 = LDPXi renamable $x9, 0 :: (load 8)<br>
+ STRXui renamable killed $x9, renamable $x0, 11 :: (store 8, align 4)<br>
+ renamable $x9 = LDRXui renamable $x0, 2 :: (load 8)<br>
+ STRXui renamable $x9, renamable $x0, 100 :: (store 8, align 4)<br>
+ renamable $x8 = ADDXrr $x8, $x8<br>
+ STRXui renamable $x8, renamable $x0, 10 :: (store 8, align 4)<br>
+ RET undef $lr<br>
+<br>
+...<br>
+---<br>
+# MOVK has a tied operand and we currently do not rename across tied defs.<br>
+# CHECK-LABEL: bb.0:<br>
+# CHECK-NEXT: liveins: $x0<br>
+#<br>
+# CHECK: renamable $x8 = MRS 58880<br>
+# CHECK-NEXT: renamable $x8 = MOVZXi 15309, 0<br>
+# CHECK-NEXT: renamable $x8 = MOVKXi renamable $x8, 26239, 16<br>
+# CHECK-NEXT: STRXui renamable $x8, renamable $x0, 0, implicit killed $x8 :: (store 8)<br>
+# CHECK-NEXT: renamable $x8 = MRS 55840<br>
+# CHECK-NEXT: STRXui killed renamable $x8, killed renamable $x0, 1, implicit killed $x8 :: (store 8)<br>
+# CHECK-NEXT: RET undef $lr<br>
+#<br>
+name: test3<br>
+alignment: 2<br>
+tracksRegLiveness: true<br>
+liveins:<br>
+ - { reg: '$x0' }<br>
+frameInfo:<br>
+ maxCallFrameSize: 0<br>
+machineFunctionInfo: {}<br>
+body: |<br>
+ bb.0:<br>
+ liveins: $x0<br>
+<br>
+ renamable $x8 = MRS 58880<br>
+ renamable $x8 = MOVZXi 15309, 0<br>
+ renamable $x8 = MOVKXi renamable $x8, 26239, 16<br>
+ STRXui renamable $x8, renamable $x0, 0, implicit killed $x8 :: (store 8)<br>
+ renamable $x8 = MRS 55840<br>
+ STRXui killed renamable $x8, renamable killed $x0, 1, implicit killed $x8 :: (store 8)<br>
+ RET undef $lr<br>
+<br>
+...<br>
+---<br>
+# CHECK-LABEL: name: test4<br>
+# CHECK-LABEL: bb.0:<br>
+# CHECK-NEXT: liveins: $x0, $x1<br>
+<br>
+# CHECK: $x9 = MRS 58880<br>
+# CHECK-NEXT: renamable $x8 = MRS 55840<br>
+# CHECK-NEXT: STPXi $x9, killed renamable $x8, killed renamable $x0, 0 :: (store 4)<br>
+# CHECK-NEXT: RET undef $lr<br>
+<br>
+name: test4<br>
+alignment: 4<br>
+tracksRegLiveness: true<br>
+liveins:<br>
+ - { reg: '$x0' }<br>
+ - { reg: '$x1' }<br>
+ - { reg: '$x8' }<br>
+frameInfo:<br>
+ maxAlignment: 1<br>
+ maxCallFrameSize: 0<br>
+machineFunctionInfo: {}<br>
+body: |<br>
+ bb.0:<br>
+ liveins: $x0, $x1<br>
+<br>
+ renamable $x8 = MRS 58880<br>
+ STRXui renamable $x8, renamable $x0, 0, implicit killed $x8 :: (store 4)<br>
+ renamable $x8 = MRS 55840<br>
+ STRXui killed renamable $x8, renamable killed $x0, 1, implicit killed $x8 :: (store 4)<br>
+ RET undef $lr<br>
+<br>
+...<br>
+---<br>
+# CHECK-LABEL: name: test5<br>
+# CHECK-LABEL: bb.0:<br>
+# CHECK-NEXT: liveins: $x0, $x1<br>
+<br>
+# CHECK: $x9 = MRS 58880<br>
+# CHECK-NEXT: renamable $x8 = MRS 55840<br>
+# CHECK-NEXT: STPWi $w9, killed renamable $w8, killed renamable $x0, 0 :: (store 4)<br>
+# CHECK-NEXT: RET undef $lr<br>
+<br>
+name: test5<br>
+alignment: 4<br>
+tracksRegLiveness: true<br>
+liveins:<br>
+ - { reg: '$x0' }<br>
+ - { reg: '$x1' }<br>
+ - { reg: '$x8' }<br>
+frameInfo:<br>
+ maxAlignment: 1<br>
+ maxCallFrameSize: 0<br>
+machineFunctionInfo: {}<br>
+body: |<br>
+ bb.0:<br>
+ liveins: $x0, $x1<br>
+<br>
+ renamable $x8 = MRS 58880<br>
+ STRWui renamable $w8, renamable $x0, 0, implicit killed $x8 :: (store 4)<br>
+ renamable $x8 = MRS 55840<br>
+ STRWui killed renamable $w8, renamable killed $x0, 1, implicit killed $x8 :: (store 4)<br>
+ RET undef $lr<br>
+<br>
+...<br>
+---<br>
+# CHECK-LABEL: name: test6<br>
+# CHECK-LABEL bb.0:<br>
+# CHECK: liveins: $x0, $x1, $q3<br>
+<br>
+# CHECK: renamable $q9 = LDRQui $x0, 0 :: (load 16)<br>
+# CHECK-NEXT: renamable $q9 = XTNv8i16 renamable $q9, killed renamable $q3<br>
+# CHECK-NEXT: STRQui renamable $q9, renamable $x0, 11 :: (store 16, align 4)<br>
+# CHECK-NEXT: renamable $q9 = FADDv2f64 renamable $q9, renamable $q9<br>
+# CHECK-NEXT: STRQui renamable $q9, renamable $x0, 10 :: (store 16, align 4)<br>
+# CHECK-NEXT: RET undef $lr<br>
+<br>
+# XTN has a tied use-def.<br>
+name: test6<br>
+alignment: 4<br>
+tracksRegLiveness: true<br>
+liveins:<br>
+ - { reg: '$x0' }<br>
+ - { reg: '$x1' }<br>
+ - { reg: '$x8' }<br>
+ - { reg: '$q3' }<br>
+frameInfo:<br>
+ maxAlignment: 1<br>
+ maxCallFrameSize: 0<br>
+machineFunctionInfo: {}<br>
+body: |<br>
+ bb.0:<br>
+ liveins: $x0, $x1, $q3<br>
+ renamable $q9 = LDRQui $x0, 0 :: (load 16)<br>
+ renamable $q9 = XTNv8i16 renamable $q9, killed renamable $q3<br>
+ STRQui renamable $q9, renamable $x0, 11 :: (store 16, align 4)<br>
+ renamable $q9 = FADDv2f64 renamable $q9, renamable $q9<br>
+ STRQui renamable $q9, renamable $x0, 10 :: (store 16, align 4)<br>
+ RET undef $lr<br>
+<br>
+...<br>
+---<br>
+# Currently we do not rename across frame-setup instructions.<br>
+# CHECK-LABEL: name: test7<br>
+# CHECK-LABEL: bb.0:<br>
+# CHECK-NEXT: liveins: $x0, $x1<br>
+<br>
+# CHECK: $sp = frame-setup SUBXri $sp, 64, 0<br>
+# CHECK-NEXT: renamable $x9 = frame-setup LDRXui renamable $x0, 0 :: (load 8)<br>
+# CHECK-NEXT: STRXui renamable $x9, $x0, 10 :: (store 8, align 4)<br>
+# CHECK-NEXT: renamable $x9 = LDRXui renamable $x0, 1 :: (load 8)<br>
+# CHECK-NEXT: STRXui renamable $x9, $x0, 11 :: (store 8, align 4)<br>
+# CHECK-NEXT: RET undef $lr<br>
+#<br>
+name: test7<br>
+alignment: 4<br>
+tracksRegLiveness: true<br>
+liveins:<br>
+ - { reg: '$x0' }<br>
+ - { reg: '$x1' }<br>
+ - { reg: '$x8' }<br>
+frameInfo:<br>
+ stackSize: 64<br>
+ maxAlignment: 16<br>
+ adjustsStack: true<br>
+ hasCalls: true<br>
+ maxCallFrameSize: 0<br>
+stack:<br>
+ - { id: 0, type: spill-slot, offset: -48, size: 16, alignment: 16 }<br>
+ - { id: 1, type: spill-slot, offset: -64, size: 16, alignment: 16 }<br>
+machineFunctionInfo: {}<br>
+body: |<br>
+ bb.0:<br>
+ liveins: $x0, $x1<br>
+ $sp = frame-setup SUBXri $sp, 64, 0<br>
+ renamable $x9 = frame-setup LDRXui renamable $x0, 0 :: (load 8)<br>
+ STRXui renamable $x9, $x0, 10 :: (store 8, align 4)<br>
+ renamable $x9 = LDRXui renamable $x0, 1 :: (load 8)<br>
+ STRXui renamable $x9, $x0, 11 :: (store 8, align 4)<br>
+ RET undef $lr<br>
+...<br>
+---<br>
+# CHECK-LABEL: name: test8<br>
+# CHECK-LABEL: bb.0:<br>
+# CHECK-NEXT: liveins: $x0, $x1<br>
+<br>
+# CHECK: renamable $x8 = MRS 58880<br>
+# CHECK-NEXT: $w9 = ORRWrs $wzr, killed renamable $w8, 0, implicit-def $x9<br>
+# CHECK-NEXT: renamable $x8 = MRS 55840<br>
+# CHECK-NEXT: STPWi $w9, killed renamable $w8, killed renamable $x0, 0 :: (store 4)<br>
+# CHECK-NEXT: RET undef $lr<br>
+<br>
+name: test8<br>
+alignment: 4<br>
+tracksRegLiveness: true<br>
+liveins:<br>
+ - { reg: '$x0' }<br>
+ - { reg: '$x1' }<br>
+ - { reg: '$x8' }<br>
+frameInfo:<br>
+ maxAlignment: 1<br>
+ maxCallFrameSize: 0<br>
+machineFunctionInfo: {}<br>
+body: |<br>
+ bb.0:<br>
+ liveins: $x0, $x1<br>
+<br>
+ renamable $x8 = MRS 58880<br>
+ renamable $w8 = ORRWrs $wzr, killed renamable $w8, 0, implicit-def $x8<br>
+ STRWui renamable $w8, renamable $x0, 0, implicit killed $x8 :: (store 4)<br>
+ renamable $x8 = MRS 55840<br>
+ STRWui killed renamable $w8, renamable killed $x0, 1, implicit killed $x8 :: (store 4)<br>
+ RET undef $lr<br>
+<br>
+...<br>
+---<br>
+# The reg class returned for $q9 contains only the first 16 Q registers.<br>
+# TODO: Can we check that all instructions that require renaming also support<br>
+# the second 16 Q registers?<br>
+# CHECK-LABEL: name: test9<br>
+# CHECK-LABEL bb.0:<br>
+# CHECK: liveins: $x0, $x1, $q0, $q1, $q2, $q3, $q4, $q5, $q6, $q7<br>
+<br>
+# CHECK: renamable $q9 = LDRQui $x0, 0 :: (load 16)<br>
+# CHECK-NEXT: STRQui killed renamable $q9, renamable $x0, 10 :: (store 16, align 4)<br>
+# CHECK: renamable $q9 = LDRQui $x0, 1 :: (load 16)<br>
+# CHECK-NEXT: STRQui renamable $q9, renamable $x0, 11 :: (store 16, align 4)<br>
+# CHECK-NEXT: RET undef $lr<br>
+<br>
+name: test9<br>
+alignment: 4<br>
+tracksRegLiveness: true<br>
+liveins:<br>
+ - { reg: '$x0' }<br>
+ - { reg: '$x1' }<br>
+ - { reg: '$x8' }<br>
+ - { reg: '$q3' }<br>
+frameInfo:<br>
+ maxAlignment: 1<br>
+ maxCallFrameSize: 0<br>
+machineFunctionInfo: {}<br>
+body: |<br>
+ bb.0:<br>
+ liveins: $x0, $x1, $q0, $q1, $q2, $q3, $q4, $q5, $q6, $q7<br>
+ renamable $q9 = LDRQui $x0, 0 :: (load 16)<br>
+ STRQui renamable killed $q9, renamable $x0, 10 :: (store 16, align 4)<br>
+ renamable $q9 = LDRQui $x0, 1 :: (load 16)<br>
+ STRQui renamable $q9, renamable $x0, 11 :: (store 16, align 4)<br>
+ RET undef $lr<br>
+<br>
+...<br>
+---<br>
+# The livein $q7 is killed early, so we can re-use it for renaming.<br>
+# CHECK-LABEL: name: test10<br>
+# CHECK-LABEL bb.0:<br>
+# CHECK: liveins: $x0, $x1, $q0, $q1, $q2, $q3, $q4, $q5, $q6, $q7<br>
+<br>
+# CHECK: renamable $q7 = FADDv2f64 renamable $q7, renamable $q7<br>
+# CHECK-NEXT: STRQui killed renamable $q7, renamable $x0, 100 :: (store 16, align 4)<br>
+# CHECK-NEXT: $q7 = LDRQui $x0, 0 :: (load 16)<br>
+# CHECK-NEXT: renamable $q9 = LDRQui $x0, 1 :: (load 16)<br>
+# CHECK-NEXT: STPQi killed renamable $q9, killed $q7, renamable $x0, 10 :: (store 16, align 4)<br>
+# CHECK-NEXT: RET undef $lr<br>
+<br>
+name: test10<br>
+alignment: 4<br>
+tracksRegLiveness: true<br>
+liveins:<br>
+ - { reg: '$x0' }<br>
+ - { reg: '$x1' }<br>
+ - { reg: '$x8' }<br>
+ - { reg: '$q3' }<br>
+frameInfo:<br>
+ maxAlignment: 1<br>
+ maxCallFrameSize: 0<br>
+machineFunctionInfo: {}<br>
+body: |<br>
+ bb.0:<br>
+ liveins: $x0, $x1, $q0, $q1, $q2, $q3, $q4, $q5, $q6, $q7<br>
+ renamable $q7 = FADDv2f64 renamable $q7, renamable $q7<br>
+ STRQui renamable killed $q7, renamable $x0, 100 :: (store 16, align 4)<br>
+ renamable $q9 = LDRQui $x0, 0 :: (load 16)<br>
+ STRQui renamable killed $q9, renamable $x0, 11 :: (store 16, align 4)<br>
+ renamable $q9 = LDRQui $x0, 1 :: (load 16)<br>
+ STRQui renamable killed $q9, renamable $x0, 10 :: (store 16, align 4)<br>
+ RET undef $lr<br>
+<br>
+...<br>
+---<br>
+# Make sure we do not use any registers that are defined between paired candidates<br>
+# ($x14 in this example)<br>
+# CHECK-LABEL: name: test11<br>
+# CHECK: bb.0:<br>
+# CHECK-NEXT: liveins: $x0, $x1, $x11, $x12, $x13<br>
+<br>
+# CHECK: renamable $w10 = LDRWui renamable $x0, 0 :: (load 8)<br>
+# CHECK-NEXT: $x15, renamable $x8 = LDPXi renamable $x0, 1 :: (load 8)<br>
+# CHECK-NEXT: renamable $x9 = LDRXui renamable $x0, 3 :: (load 8)<br>
+# CHECK-NEXT: renamable $x14 = LDRXui renamable $x0, 5 :: (load 8)<br>
+# CHECK-NEXT: STPXi renamable $x9, killed $x15, renamable $x0, 10 :: (store 8, align 4)<br>
+# CHECK-NEXT: STRXui killed renamable $x14, renamable $x0, 200 :: (store 8, align 4)<br>
+# CHECK-NEXT: renamable $w8 = ADDWrr $w10, $w10<br>
+# CHECK-NEXT: STRWui renamable $w8, renamable $x0, 100 :: (store 8, align 4)<br>
+# CHECK-NEXT: RET undef $lr<br>
+#<br>
+name: test11<br>
+alignment: 4<br>
+tracksRegLiveness: true<br>
+liveins:<br>
+ - { reg: '$x0' }<br>
+ - { reg: '$x1' }<br>
+ - { reg: '$x8' }<br>
+frameInfo:<br>
+ maxAlignment: 1<br>
+ maxCallFrameSize: 0<br>
+machineFunctionInfo: {}<br>
+body: |<br>
+ bb.0:<br>
+ liveins: $x0, $x1, $x11, $x12, $x13<br>
+ renamable $w10 = LDRWui renamable $x0, 0 :: (load 8)<br>
+ renamable $x9, renamable $x8 = LDPXi renamable $x0, 1 :: (load 8)<br>
+ STRXui renamable killed $x9, renamable $x0, 11 :: (store 8, align 4)<br>
+ renamable $x9 = LDRXui renamable $x0, 3 :: (load 8)<br>
+ renamable $x14 = LDRXui renamable $x0, 5 :: (load 8)<br>
+ STRXui renamable $x9, renamable $x0, 10 :: (store 8, align 4)<br>
+ STRXui renamable killed $x14, renamable $x0, 200 :: (store 8, align 4)<br>
+ renamable $w8 = ADDWrr $w10, $w10<br>
+ STRWui renamable $w8, renamable $x0, 100 :: (store 8, align 4)<br>
+ RET undef $lr<br>
+<br>
+...<br>
+---<br>
+# Check that we correctly deal with killed registers in stores that get merged forward,<br>
+# which extends the live range of the first store operand.<br>
+# CHECK-LABEL: name: test12<br>
+# CHECK: bb.0:<br>
+# CHECK-NEXT: liveins: $x0, $x1<br>
+#<br>
+# CHECK: renamable $x10 = LDRXui renamable $x0, 0 :: (load 8)<br>
+# CHECK-NEXT: $x11, renamable $x8 = LDPXi renamable $x0, 3 :: (load 8)<br>
+# CHECK-NEXT: renamable $x9 = LDRXui renamable $x0, 2 :: (load 8)<br>
+# CHECK-NEXT: renamable $x8 = ADDXrr $x8, $x8<br>
+# CHECK-NEXT: STPXi renamable $x8, killed $x11, renamable $x0, 10 :: (store 8, align 4)<br>
+# CHECK-NEXT: STPXi killed renamable $x10, renamable $x9, renamable $x0, 20 :: (store 8, align 4)<br>
+# CHECK-NEXT: RET undef $lr<br>
+<br>
+name: test12<br>
+alignment: 4<br>
+tracksRegLiveness: true<br>
+liveins:<br>
+ - { reg: '$x0' }<br>
+ - { reg: '$x1' }<br>
+ - { reg: '$x8' }<br>
+frameInfo:<br>
+ maxAlignment: 1<br>
+ maxCallFrameSize: 0<br>
+machineFunctionInfo: {}<br>
+body: |<br>
+ bb.0:<br>
+ liveins: $x0, $x1<br>
+ renamable $x10 = LDRXui renamable $x0, 0 :: (load 8)<br>
+ STRXui renamable killed $x10, renamable $x0, 20 :: (store 8, align 4)<br>
+ renamable $x9, renamable $x8 = LDPXi renamable $x0, 3 :: (load 8)<br>
+ STRXui renamable killed $x9, renamable $x0, 11 :: (store 8, align 4)<br>
+ renamable $x9 = LDRXui renamable $x0, 2 :: (load 8)<br>
+ renamable $x8 = ADDXrr $x8, $x8<br>
+ STRXui renamable $x8, renamable $x0, 10 :: (store 8, align 4)<br>
+ STRXui renamable $x9, renamable $x0, 21 :: (store 8, align 4)<br>
+ RET undef $lr<br>
+<br>
+...<br>
+---<br>
+# Make sure we do not use any registers that are defined between def to rename and the first<br>
+# paired store. ($x14 in this example)<br>
+# CHECK-LABEL: name: test13<br>
+# CHECK: bb.0:<br>
+# CHECK-NEXT: liveins: $x0, $x1, $x10, $x11, $x12, $x13<br>
+# CHECK: $x15, renamable $x8 = LDPXi renamable $x0, 0 :: (load 8)<br>
+# CHECK-NEXT: renamable $x14 = LDRXui renamable $x0, 4 :: (load 8)<br>
+# CHECK-NEXT: STRXui killed renamable $x14, renamable $x0, 100 :: (store 8, align 4)<br>
+# CHECK-NEXT: renamable $x9 = LDRXui renamable $x0, 2 :: (load 8)<br>
+# CHECK-NEXT: STPXi renamable $x9, killed $x15, renamable $x0, 10 :: (store 8, align 4)<br>
+# CHECK-NEXT: RET undef $lr<br>
+#<br>
+name: test13<br>
+alignment: 4<br>
+tracksRegLiveness: true<br>
+liveins:<br>
+ - { reg: '$x0' }<br>
+ - { reg: '$x1' }<br>
+ - { reg: '$x8' }<br>
+frameInfo:<br>
+ maxAlignment: 1<br>
+ maxCallFrameSize: 0<br>
+machineFunctionInfo: {}<br>
+body: |<br>
+ bb.0:<br>
+ liveins: $x0, $x1, $x10, $x11, $x12, $x13<br>
+ renamable $x9, renamable $x8 = LDPXi renamable $x0, 0 :: (load 8)<br>
+ renamable $x14 = LDRXui renamable $x0, 4 :: (load 8)<br>
+ STRXui renamable killed $x14, renamable $x0, 100 :: (store 8, align 4)<br>
+ STRXui renamable killed $x9, renamable $x0, 11 :: (store 8, align 4)<br>
+ renamable $x9 = LDRXui renamable $x0, 2 :: (load 8)<br>
+ STRXui renamable $x9, renamable $x0, 10 :: (store 8)<br>
+ RET undef $lr<br>
+<br>
+...<br>
</blockquote></div>