[llvm] r311038 - [MachineCopyPropagation] Extend pass to do COPY source forwarding
Vitaly Buka via llvm-commits
llvm-commits at lists.llvm.org
Thu Aug 17 18:23:00 PDT 2017
I just hangs on "Running the AddressSanitizer tests" for arm
/var/lib/buildbot/sanitizer-buildbot6/sanitizer-x86_64-linux-android/build/compiler_rt_build_android_arm/lib/asan/tests/AsanTest:
1 file pus
hed. 3.7 MB/s (10602792 bytes in 2.742s)
+ echo @@@BUILD_STEP run asan lit tests
'[arm/sailfish-userdebug/OPR1.170621.001]@@@'
@@@BUILD_STEP run asan lit tests [arm/sailfish-userdebug/OPR1.170621.001]@@@
+ cd
/var/lib/buildbot/sanitizer-buildbot6/sanitizer-x86_64-linux-android/build/compiler_rt_build_android_arm
+ ninja check-asan
[1/1] Running the AddressSanitizer tests
But the same device passes aarch64 tests:
[1/1] Running the AddressSanitizer tests
-- Testing: 407 tests, 12 threads --
Testing: 0 .. 10.. 20.. 30.. 40.. 50.. 60.. 70.. 80.. 90..
Testing Time: 87.09s
Expected Passes : 202
Expected Failures : 27
Unsupported Tests : 178
+ echo @@@BUILD_STEP run sanitizer_common tests
'[aarch64/sailfish-userdebug/OPR1.170621.001]@@@'
@@@BUILD_STEP run sanitizer_common tests
[aarch64/sailfish-userdebug/OPR1.170621.001]@@@
On Thu, Aug 17, 2017 at 5:42 PM, Geoff Berry <gberry at codeaurora.org> wrote:
> It seems like this is still happening after fixing a couple of issues with
> this patch. I'll revert again shortly. I'm having a hard time
> understanding what is failing from looking at the buildbot logs though. Do
> you have any way of determining what exactly is timing out?
>
> On 8/16/2017 11:13 PM, Vitaly Buka wrote:
>
>> Looks like after this patch Android tests consistently hang
>> http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-
>> android/builds/1825
>>
>> On Wed, Aug 16, 2017 at 1:50 PM, Geoff Berry via llvm-commits <
>> llvm-commits at lists.llvm.org <mailto:llvm-commits at lists.llvm.org>> wrote:
>>
>> Author: gberry
>> Date: Wed Aug 16 13:50:01 2017
>> New Revision: 311038
>>
>> URL: http://llvm.org/viewvc/llvm-project?rev=311038&view=rev
>> <http://llvm.org/viewvc/llvm-project?rev=311038&view=rev>
>> Log:
>> [MachineCopyPropagation] Extend pass to do COPY source forwarding
>>
>> This change extends MachineCopyPropagation to do COPY source
>> forwarding.
>>
>> This change also extends the MachineCopyPropagation pass to be able to
>> be run during register allocation, after physical registers have been
>> assigned, but before the virtual registers have been re-written, which
>> allows it to remove virtual register COPY LiveIntervals that become
>> dead
>> through the forwarding of all of their uses.
>>
>> Reviewers: qcolombet, javed.absar, MatzeB, jonpa
>>
>> Subscribers: jyknight, nemanjai, llvm-commits, nhaehnle, mcrosier,
>> mgorny
>>
>> Differential Revision: https://reviews.llvm.org/D30751
>> <https://reviews.llvm.org/D30751>
>>
>> Modified:
>> llvm/trunk/include/llvm/CodeGen/Passes.h
>> llvm/trunk/include/llvm/InitializePasses.h
>> llvm/trunk/lib/CodeGen/CodeGen.cpp
>> llvm/trunk/lib/CodeGen/MachineCopyPropagation.cpp
>> llvm/trunk/lib/CodeGen/TargetPassConfig.cpp
>> llvm/trunk/test/CodeGen/AArch64/aarch64-fold-lslfast.ll
>> llvm/trunk/test/CodeGen/AArch64/arm64-AdvSIMD-Scalar.ll
>> llvm/trunk/test/CodeGen/AArch64/arm64-zero-cycle-regmov.ll
>> llvm/trunk/test/CodeGen/AArch64/f16-instructions.ll
>> llvm/trunk/test/CodeGen/AArch64/flags-multiuse.ll
>> llvm/trunk/test/CodeGen/AArch64/merge-store-dependency.ll
>> llvm/trunk/test/CodeGen/AArch64/neg-imm.ll
>> llvm/trunk/test/CodeGen/AMDGPU/byval-frame-setup.ll
>> llvm/trunk/test/CodeGen/AMDGPU/call-argument-types.ll
>> llvm/trunk/test/CodeGen/AMDGPU/call-preserved-registers.ll
>> llvm/trunk/test/CodeGen/AMDGPU/callee-special-input-sgprs.ll
>> llvm/trunk/test/CodeGen/AMDGPU/callee-special-input-vgprs.ll
>> llvm/trunk/test/CodeGen/AMDGPU/mubuf-offset-private.ll
>> llvm/trunk/test/CodeGen/AMDGPU/multilevel-break.ll
>> llvm/trunk/test/CodeGen/AMDGPU/private-access-no-objects.ll
>> llvm/trunk/test/CodeGen/AMDGPU/ret.ll
>> llvm/trunk/test/CodeGen/ARM/atomic-op.ll
>> llvm/trunk/test/CodeGen/ARM/swifterror.ll
>> llvm/trunk/test/CodeGen/Mips/llvm-ir/sub.ll
>> llvm/trunk/test/CodeGen/PowerPC/fma-mutate.ll
>> llvm/trunk/test/CodeGen/PowerPC/inlineasm-i64-reg.ll
>> llvm/trunk/test/CodeGen/PowerPC/tail-dup-layout.ll
>> llvm/trunk/test/CodeGen/SPARC/32abi.ll
>> llvm/trunk/test/CodeGen/SPARC/atomics.ll
>> llvm/trunk/test/CodeGen/Thumb/thumb-shrink-wrapping.ll
>> llvm/trunk/test/CodeGen/X86/2006-03-01-InstrSchedBug.ll
>> llvm/trunk/test/CodeGen/X86/arg-copy-elide.ll
>> llvm/trunk/test/CodeGen/X86/avg.ll
>> llvm/trunk/test/CodeGen/X86/avx-load-store.ll
>> llvm/trunk/test/CodeGen/X86/avx512-bugfix-25270.ll
>> llvm/trunk/test/CodeGen/X86/avx512-calling-conv.ll
>> llvm/trunk/test/CodeGen/X86/avx512-mask-op.ll
>> llvm/trunk/test/CodeGen/X86/avx512bw-intrinsics-upgrade.ll
>> llvm/trunk/test/CodeGen/X86/bitcast-int-to-vector-bool-sext.ll
>> llvm/trunk/test/CodeGen/X86/bitcast-int-to-vector-bool-zext.ll
>> llvm/trunk/test/CodeGen/X86/buildvec-insertvec.ll
>> llvm/trunk/test/CodeGen/X86/combine-fcopysign.ll
>> llvm/trunk/test/CodeGen/X86/complex-fastmath.ll
>> llvm/trunk/test/CodeGen/X86/divide-by-constant.ll
>> llvm/trunk/test/CodeGen/X86/fmaxnum.ll
>> llvm/trunk/test/CodeGen/X86/fminnum.ll
>> llvm/trunk/test/CodeGen/X86/fp128-i128.ll
>> llvm/trunk/test/CodeGen/X86/haddsub-2.ll
>> llvm/trunk/test/CodeGen/X86/haddsub-undef.ll
>> llvm/trunk/test/CodeGen/X86/half.ll
>> llvm/trunk/test/CodeGen/X86/inline-asm-fpstack.ll
>> llvm/trunk/test/CodeGen/X86/ipra-local-linkage.ll
>> llvm/trunk/test/CodeGen/X86/localescape.ll
>> llvm/trunk/test/CodeGen/X86/mul-i1024.ll
>> llvm/trunk/test/CodeGen/X86/mul-i512.ll
>> llvm/trunk/test/CodeGen/X86/mul128.ll
>> llvm/trunk/test/CodeGen/X86/pmul.ll
>> llvm/trunk/test/CodeGen/X86/powi.ll
>> llvm/trunk/test/CodeGen/X86/pr11334.ll
>> llvm/trunk/test/CodeGen/X86/pr29112.ll
>> llvm/trunk/test/CodeGen/X86/psubus.ll
>> llvm/trunk/test/CodeGen/X86/select.ll
>> llvm/trunk/test/CodeGen/X86/shrink-wrap-chkstk.ll
>> llvm/trunk/test/CodeGen/X86/sqrt-fastmath.ll
>> llvm/trunk/test/CodeGen/X86/sse-scalar-fp-arith.ll
>> llvm/trunk/test/CodeGen/X86/sse1.ll
>> llvm/trunk/test/CodeGen/X86/sse3-avx-addsub-2.ll
>> llvm/trunk/test/CodeGen/X86/statepoint-live-in.ll
>> llvm/trunk/test/CodeGen/X86/statepoint-stack-usage.ll
>> llvm/trunk/test/CodeGen/X86/vec_fp_to_int.ll
>> llvm/trunk/test/CodeGen/X86/vec_int_to_fp.ll
>> llvm/trunk/test/CodeGen/X86/vec_minmax_sint.ll
>> llvm/trunk/test/CodeGen/X86/vec_shift4.ll
>> llvm/trunk/test/CodeGen/X86/vector-blend.ll
>> llvm/trunk/test/CodeGen/X86/vector-idiv-sdiv-128.ll
>> llvm/trunk/test/CodeGen/X86/vector-idiv-udiv-128.ll
>> llvm/trunk/test/CodeGen/X86/vector-rotate-128.ll
>> llvm/trunk/test/CodeGen/X86/vector-sext.ll
>> llvm/trunk/test/CodeGen/X86/vector-shift-ashr-128.ll
>> llvm/trunk/test/CodeGen/X86/vector-shift-lshr-128.ll
>> llvm/trunk/test/CodeGen/X86/vector-shift-shl-128.ll
>> llvm/trunk/test/CodeGen/X86/vector-shuffle-combining.ll
>> llvm/trunk/test/CodeGen/X86/vector-trunc-math.ll
>> llvm/trunk/test/CodeGen/X86/vector-zext.ll
>> llvm/trunk/test/CodeGen/X86/vselect-minmax.ll
>> llvm/trunk/test/CodeGen/X86/widen_conv-3.ll
>> llvm/trunk/test/CodeGen/X86/widen_conv-4.ll
>> llvm/trunk/test/CodeGen/X86/x86-shrink-wrap-unwind.ll
>> llvm/trunk/test/CodeGen/X86/x86-shrink-wrapping.ll
>>
>> Modified: llvm/trunk/include/llvm/CodeGen/Passes.h
>> URL:
>> http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/
>> CodeGen/Passes.h?rev=311038&r1=311037&r2=311038&view=diff
>> <http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm
>> /CodeGen/Passes.h?rev=311038&r1=311037&r2=311038&view=diff>
>> ============================================================
>> ==================
>> --- llvm/trunk/include/llvm/CodeGen/Passes.h (original)
>> +++ llvm/trunk/include/llvm/CodeGen/Passes.h Wed Aug 16 13:50:01 2017
>> @@ -278,6 +278,11 @@ namespace llvm {
>> /// MachineSinking - This pass performs sinking on machine
>> instructions.
>> extern char &MachineSinkingID;
>>
>> + /// MachineCopyPropagationPreRegRewrite - This pass performs copy
>> propagation
>> + /// on machine instructions after register allocation but before
>> virtual
>> + /// register re-writing..
>> + extern char &MachineCopyPropagationPreRegRewriteID;
>> +
>> /// MachineCopyPropagation - This pass performs copy propagation
>> on
>> /// machine instructions.
>> extern char &MachineCopyPropagationID;
>>
>> Modified: llvm/trunk/include/llvm/InitializePasses.h
>> URL:
>> http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/
>> InitializePasses.h?rev=311038&r1=311037&r2=311038&view=diff
>> <http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm
>> /InitializePasses.h?rev=311038&r1=311037&r2=311038&view=diff>
>> ============================================================
>> ==================
>> --- llvm/trunk/include/llvm/InitializePasses.h (original)
>> +++ llvm/trunk/include/llvm/InitializePasses.h Wed Aug 16 13:50:01
>> 2017
>> @@ -233,6 +233,7 @@ void initializeMachineBranchProbabilityI
>> void initializeMachineCSEPass(PassRegistry&);
>> void initializeMachineCombinerPass(PassRegistry&);
>> void initializeMachineCopyPropagationPass(PassRegistry&);
>> +void initializeMachineCopyPropagationPreRegRewritePass(PassRegist
>> ry&);
>> void initializeMachineDominanceFrontierPass(PassRegistry&);
>> void initializeMachineDominatorTreePass(PassRegistry&);
>> void initializeMachineFunctionPrinterPassPass(PassRegistry&);
>>
>> Modified: llvm/trunk/lib/CodeGen/CodeGen.cpp
>> URL:
>> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/
>> CodeGen.cpp?rev=311038&r1=311037&r2=311038&view=diff
>> <http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/
>> CodeGen.cpp?rev=311038&r1=311037&r2=311038&view=diff>
>> ============================================================
>> ==================
>> --- llvm/trunk/lib/CodeGen/CodeGen.cpp (original)
>> +++ llvm/trunk/lib/CodeGen/CodeGen.cpp Wed Aug 16 13:50:01 2017
>> @@ -54,6 +54,7 @@ void llvm::initializeCodeGen(PassRegistr
>> initializeMachineCSEPass(Registry);
>> initializeMachineCombinerPass(Registry);
>> initializeMachineCopyPropagationPass(Registry);
>> + initializeMachineCopyPropagationPreRegRewritePass(Registry);
>> initializeMachineDominatorTreePass(Registry);
>> initializeMachineFunctionPrinterPassPass(Registry);
>> initializeMachineLICMPass(Registry);
>>
>> Modified: llvm/trunk/lib/CodeGen/MachineCopyPropagation.cpp
>> URL:
>> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/M
>> achineCopyPropagation.cpp?rev=311038&r1=311037&r2=311038&view=diff
>> <http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/
>> MachineCopyPropagation.cpp?rev=311038&r1=311037&r2=311038&view=diff>
>> ============================================================
>> ==================
>> --- llvm/trunk/lib/CodeGen/MachineCopyPropagation.cpp (original)
>> +++ llvm/trunk/lib/CodeGen/MachineCopyPropagation.cpp Wed Aug 16
>> 13:50:01 2017
>> @@ -7,18 +7,62 @@
>> //
>> //===-------------------------------------------------------
>> ---------------===//
>> //
>> -// This is an extremely simple MachineInstr-level copy propagation
>> pass.
>> +// This is a simple MachineInstr-level copy forwarding pass. It
>> may be run at
>> +// two places in the codegen pipeline:
>> +// - After register allocation but before virtual registers have
>> been remapped
>> +// to physical registers.
>> +// - After physical register remapping.
>> +//
>> +// The optimizations done vary slightly based on whether virtual
>> registers are
>> +// still present. In both cases, this pass forwards the source of
>> COPYs to the
>> +// users of their destinations when doing so is legal. For example:
>> +//
>> +// %vreg1 = COPY %vreg0
>> +// ...
>> +// ... = OP %vreg1
>> +//
>> +// If
>> +// - the physical register assigned to %vreg0 has not been
>> clobbered by the
>> +// time of the use of %vreg1
>> +// - the register class constraints are satisfied
>> +// - the COPY def is the only value that reaches OP
>> +// then this pass replaces the above with:
>> +//
>> +// %vreg1 = COPY %vreg0
>> +// ...
>> +// ... = OP %vreg0
>> +//
>> +// and updates the relevant state required by VirtRegMap (e.g.
>> LiveIntervals).
>> +// COPYs whose LiveIntervals become dead as a result of this
>> forwarding (i.e. if
>> +// all uses of %vreg1 are changed to %vreg0) are removed.
>> +//
>> +// When being run with only physical registers, this pass will also
>> remove some
>> +// redundant COPYs. For example:
>> +//
>> +// %R1 = COPY %R0
>> +// ... // No clobber of %R1
>> +// %R0 = COPY %R1 <<< Removed
>> +//
>> +// or
>> +//
>> +// %R1 = COPY %R0
>> +// ... // No clobber of %R0
>> +// %R1 = COPY %R0 <<< Removed
>> //
>> //===-------------------------------------------------------
>> ---------------===//
>>
>> +#include "LiveDebugVariables.h"
>> #include "llvm/ADT/DenseMap.h"
>> #include "llvm/ADT/SetVector.h"
>> #include "llvm/ADT/SmallVector.h"
>> #include "llvm/ADT/Statistic.h"
>> +#include "llvm/CodeGen/LiveRangeEdit.h"
>> +#include "llvm/CodeGen/LiveStackAnalysis.h"
>> #include "llvm/CodeGen/MachineFunction.h"
>> #include "llvm/CodeGen/MachineFunctionPass.h"
>> #include "llvm/CodeGen/MachineRegisterInfo.h"
>> #include "llvm/CodeGen/Passes.h"
>> +#include "llvm/CodeGen/VirtRegMap.h"
>> #include "llvm/Pass.h"
>> #include "llvm/Support/Debug.h"
>> #include "llvm/Support/raw_ostream.h"
>> @@ -30,24 +74,48 @@ using namespace llvm;
>> #define DEBUG_TYPE "machine-cp"
>>
>> STATISTIC(NumDeletes, "Number of dead copies deleted");
>> +STATISTIC(NumCopyForwards, "Number of copy uses forwarded");
>>
>> namespace {
>> typedef SmallVector<unsigned, 4> RegList;
>> typedef DenseMap<unsigned, RegList> SourceMap;
>> typedef DenseMap<unsigned, MachineInstr*> Reg2MIMap;
>>
>> - class MachineCopyPropagation : public MachineFunctionPass {
>> + class MachineCopyPropagation : public MachineFunctionPass,
>> + private LiveRangeEdit::Delegate {
>> const TargetRegisterInfo *TRI;
>> const TargetInstrInfo *TII;
>> - const MachineRegisterInfo *MRI;
>> + MachineRegisterInfo *MRI;
>> + MachineFunction *MF;
>> + SlotIndexes *Indexes;
>> + LiveIntervals *LIS;
>> + const VirtRegMap *VRM;
>> + // True if this pass being run before virtual registers are
>> remapped to
>> + // physical ones.
>> + bool PreRegRewrite;
>> + bool NoSubRegLiveness;
>> +
>> + protected:
>> + MachineCopyPropagation(char &ID, bool PreRegRewrite)
>> + : MachineFunctionPass(ID), PreRegRewrite(PreRegRewrite) {}
>>
>> public:
>> static char ID; // Pass identification, replacement for typeid
>> - MachineCopyPropagation() : MachineFunctionPass(ID) {
>> + MachineCopyPropagation() : MachineCopyPropagation(ID, false) {
>> initializeMachineCopyPropagati
>> onPass(*PassRegistry::getPassRegistry());
>> }
>>
>> void getAnalysisUsage(AnalysisUsage &AU) const override {
>> + if (PreRegRewrite) {
>> + AU.addRequired<SlotIndexes>();
>> + AU.addPreserved<SlotIndexes>();
>> + AU.addRequired<LiveIntervals>();
>> + AU.addPreserved<LiveIntervals>();
>> + AU.addRequired<VirtRegMap>();
>> + AU.addPreserved<VirtRegMap>();
>> + AU.addPreserved<LiveDebugVariables>();
>> + AU.addPreserved<LiveStacks>();
>> + }
>> AU.setPreservesCFG();
>> MachineFunctionPass::getAnalysisUsage(AU);
>> }
>> @@ -55,6 +123,10 @@ namespace {
>> bool runOnMachineFunction(MachineFunction &MF) override;
>>
>> MachineFunctionProperties getRequiredProperties() const
>> override {
>> + if (PreRegRewrite)
>> + return MachineFunctionProperties()
>> + .set(MachineFunctionProperties::Property::NoPHIs)
>> + .set(MachineFunctionProperties
>> ::Property::TracksLiveness);
>> return MachineFunctionProperties().set(
>> MachineFunctionProperties::Property::NoVRegs);
>> }
>> @@ -64,6 +136,28 @@ namespace {
>> void ReadRegister(unsigned Reg);
>> void CopyPropagateBlock(MachineBasicBlock &MBB);
>> bool eraseIfRedundant(MachineInstr &Copy, unsigned Src,
>> unsigned Def);
>> + unsigned getPhysReg(unsigned Reg, unsigned SubReg);
>> + unsigned getPhysReg(const MachineOperand &Opnd) {
>> + return getPhysReg(Opnd.getReg(), Opnd.getSubReg());
>> + }
>> + unsigned getFullPhysReg(const MachineOperand &Opnd) {
>> + return getPhysReg(Opnd.getReg(), 0);
>> + }
>> + void forwardUses(MachineInstr &MI);
>> + bool isForwardableRegClassCopy(const MachineInstr &Copy,
>> + const MachineInstr &UseI);
>> + std::tuple<unsigned, unsigned, bool>
>> + checkUseSubReg(const MachineOperand &CopySrc, const
>> MachineOperand &MOUse);
>> + bool hasImplicitOverlap(const MachineInstr &MI, const
>> MachineOperand &Use);
>> + void narrowRegClass(const MachineInstr &MI, const
>> MachineOperand &MOUse,
>> + unsigned NewUseReg, unsigned NewUseSubReg);
>> + void updateForwardedCopyLiveInterval(const MachineInstr &Copy,
>> + const MachineInstr &UseMI,
>> + unsigned OrigUseReg,
>> + unsigned NewUseReg,
>> + unsigned NewUseSubReg);
>> + /// LiveRangeEdit callback for eliminateDeadDefs().
>> + void LRE_WillEraseInstruction(MachineInstr *MI) override;
>>
>> /// Candidates for deletion.
>> SmallSetVector<MachineInstr*, 8> MaybeDeadCopies;
>> @@ -75,6 +169,15 @@ namespace {
>> SourceMap SrcMap;
>> bool Changed;
>> };
>> +
>> + class MachineCopyPropagationPreRegRewrite : public
>> MachineCopyPropagation {
>> + public:
>> + static char ID; // Pass identification, replacement for typeid
>> + MachineCopyPropagationPreRegRewrite()
>> + : MachineCopyPropagation(ID, true) {
>> + initializeMachineCopyPropagati
>> onPreRegRewritePass(*PassRegistry::getPassRegistry());
>> + }
>> + };
>> }
>> char MachineCopyPropagation::ID = 0;
>> char &llvm::MachineCopyPropagationID = MachineCopyPropagation::ID;
>> @@ -82,6 +185,29 @@ char &llvm::MachineCopyPropagationID = M
>> INITIALIZE_PASS(MachineCopyPropagation, DEBUG_TYPE,
>> "Machine Copy Propagation Pass", false, false)
>>
>> +/// We have two separate passes that are very similar, the only
>> difference being
>> +/// where they are meant to be run in the pipeline. This is done
>> for several
>> +/// reasons:
>> +/// - the two passes have different dependencies
>> +/// - some targets want to disable the later run of this pass, but
>> not the
>> +/// earlier one (e.g. NVPTX and WebAssembly)
>> +/// - it allows for easier debugging via llc
>> +
>> +char MachineCopyPropagationPreRegRewrite::ID = 0;
>> +char &llvm::MachineCopyPropagationPreRegRewriteID =
>> MachineCopyPropagationPreRegRewrite::ID;
>> +
>> +INITIALIZE_PASS_BEGIN(MachineCopyPropagationPreRegRewrite,
>> + "machine-cp-prerewrite",
>> + "Machine Copy Propagation Pre-Register
>> Rewrite Pass",
>> + false, false)
>> +INITIALIZE_PASS_DEPENDENCY(SlotIndexes)
>> +INITIALIZE_PASS_DEPENDENCY(LiveIntervals)
>> +INITIALIZE_PASS_DEPENDENCY(VirtRegMap)
>> +INITIALIZE_PASS_END(MachineCopyPropagationPreRegRewrite,
>> + "machine-cp-prerewrite",
>> + "Machine Copy Propagation Pre-Register Rewrite
>> Pass", false,
>> + false)
>> +
>> /// Remove any entry in \p Map where the register is a subregister
>> or equal to
>> /// a register contained in \p Regs.
>> static void removeRegsFromMap(Reg2MIMap &Map, const RegList &Regs,
>> @@ -122,6 +248,10 @@ void MachineCopyPropagation::ClobberRegi
>> }
>>
>> void MachineCopyPropagation::ReadRegister(unsigned Reg) {
>> + // We don't track MaybeDeadCopies when running pre-VirtRegRewriter.
>> + if (PreRegRewrite)
>> + return;
>> +
>> // If 'Reg' is defined by a copy, the copy is no longer a
>> candidate
>> // for elimination.
>> for (MCRegAliasIterator AI(Reg, TRI, true); AI.isValid(); ++AI) {
>> @@ -153,6 +283,46 @@ static bool isNopCopy(const MachineInstr
>> return SubIdx == TRI->getSubRegIndex(PreviousDef, Def);
>> }
>>
>> +/// Return the physical register assigned to \p Reg if it is a
>> virtual register,
>> +/// otherwise just return the physical reg from the operand itself.
>> +///
>> +/// If \p SubReg is 0 then return the full physical register
>> assigned to the
>> +/// virtual register ignoring subregs. If we aren't tracking
>> sub-reg liveness
>> +/// then we need to use this to be more conservative with clobbers
>> by killing
>> +/// all super reg and their sub reg COPYs as well. This is to
>> prevent COPY
>> +/// forwarding in cases like the following:
>> +///
>> +/// %vreg2 = COPY %vreg1:sub1
>> +/// %vreg3 = COPY %vreg1:sub0
>> +/// ... = OP1 %vreg2
>> +/// ... = OP2 %vreg3
>> +///
>> +/// After forward %vreg2 (assuming this is the last use of %vreg1)
>> and
>> +/// VirtRegRewriter adding kill markers we have:
>> +///
>> +/// %vreg3 = COPY %vreg1:sub0
>> +/// ... = OP1 %vreg1:sub1<kill>
>> +/// ... = OP2 %vreg3
>> +///
>> +/// If %vreg3 is assigned to a sub-reg of %vreg1, then after
>> rewriting we have:
>> +///
>> +/// ... = OP1 R0:sub1, R0<imp-use,kill>
>> +/// ... = OP2 R0:sub0
>> +///
>> +/// and the use of R0 by OP2 will not have a valid definition.
>> +unsigned MachineCopyPropagation::getPhysReg(unsigned Reg, unsigned
>> SubReg) {
>> +
>> + // Physical registers cannot have subregs.
>> + if (!TargetRegisterInfo::isVirtualRegister(Reg))
>> + return Reg;
>> +
>> + assert(PreRegRewrite && "Unexpected virtual register encountered");
>> + Reg = VRM->getPhys(Reg);
>> + if (SubReg && !NoSubRegLiveness)
>> + Reg = TRI->getSubReg(Reg, SubReg);
>> + return Reg;
>> +}
>> +
>> /// Remove instruction \p Copy if there exists a previous copy
>> that copies the
>> /// register \p Src to the register \p Def; This may happen
>> indirectly by
>> /// copying the super registers.
>> @@ -190,6 +360,325 @@ bool MachineCopyPropagation::eraseIfRedu
>> return true;
>> }
>>
>> +
>> +/// Decide whether we should forward the destination of \param Copy
>> to its use
>> +/// in \param UseI based on the register class of the Copy
>> operands. Same-class
>> +/// COPYs are always accepted by this function, but cross-class
>> COPYs are only
>> +/// accepted if they are forwarded to another COPY with the operand
>> register
>> +/// classes reversed. For example:
>> +///
>> +/// RegClassA = COPY RegClassB // Copy parameter
>> +/// ...
>> +/// RegClassB = COPY RegClassA // UseI parameter
>> +///
>> +/// which after forwarding becomes
>> +///
>> +/// RegClassA = COPY RegClassB
>> +/// ...
>> +/// RegClassB = COPY RegClassB
>> +///
>> +/// so we have reduced the number of cross-class COPYs and
>> potentially
>> +/// introduced a no COPY that can be removed.
>> +bool MachineCopyPropagation::isForwardableRegClassCopy(
>> + const MachineInstr &Copy, const MachineInstr &UseI) {
>> + auto isCross = [&](const MachineOperand &Dst, const
>> MachineOperand &Src) {
>> + unsigned DstReg = Dst.getReg();
>> + unsigned SrcPhysReg = getPhysReg(Src);
>> + const TargetRegisterClass *DstRC;
>> + if (TargetRegisterInfo::isVirtualRegister(DstReg)) {
>> + DstRC = MRI->getRegClass(DstReg);
>> + unsigned DstSubReg = Dst.getSubReg();
>> + if (DstSubReg)
>> + SrcPhysReg = TRI->getMatchingSuperReg(SrcPhysReg,
>> DstSubReg, DstRC);
>> + } else
>> + DstRC = TRI->getMinimalPhysRegClass(DstReg);
>> +
>> + return !DstRC->contains(SrcPhysReg);
>> + };
>> +
>> + const MachineOperand &CopyDst = Copy.getOperand(0);
>> + const MachineOperand &CopySrc = Copy.getOperand(1);
>> +
>> + if (!isCross(CopyDst, CopySrc))
>> + return true;
>> +
>> + if (!UseI.isCopy())
>> + return false;
>> +
>> + assert(getFullPhysReg(UseI.getOperand(1)) ==
>> getFullPhysReg(CopyDst));
>> + return !isCross(UseI.getOperand(0), CopySrc);
>> +}
>> +
>> +/// Check that the subregs on the copy source operand (\p CopySrc)
>> and the use
>> +/// operand to be forwarded to (\p MOUse) are compatible with doing
>> the
>> +/// forwarding. Also computes the new register and subregister to
>> be used in
>> +/// the forwarded-to instruction.
>> +std::tuple<unsigned, unsigned, bool>
>> MachineCopyPropagation::checkUseSubReg(
>> + const MachineOperand &CopySrc, const MachineOperand &MOUse) {
>> + unsigned NewUseReg = CopySrc.getReg();
>> + unsigned NewUseSubReg;
>> +
>> + if (TargetRegisterInfo::isPhysicalRegister(NewUseReg)) {
>> + // If MOUse is a virtual reg, we need to apply it to the new
>> physical reg
>> + // we're going to replace it with.
>> + if (MOUse.getSubReg())
>> + NewUseReg = TRI->getSubReg(NewUseReg, MOUse.getSubReg());
>> + // If the original use subreg isn't valid on the new src reg,
>> we can't
>> + // forward it here.
>> + if (!NewUseReg)
>> + return std::make_tuple(0, 0, false);
>> + NewUseSubReg = 0;
>> + } else {
>> + // %v1 = COPY %v2:sub1
>> + // USE %v1:sub2
>> + // The new use is %v2:sub1:sub2
>> + NewUseSubReg =
>> + TRI->composeSubRegIndices(CopySrc.getSubReg(),
>> MOUse.getSubReg());
>> + // Check that NewUseSubReg is valid on NewUseReg
>> + if (NewUseSubReg &&
>> + !TRI->getSubClassWithSubReg(MRI->getRegClass(NewUseReg),
>> NewUseSubReg))
>> + return std::make_tuple(0, 0, false);
>> + }
>> +
>> + return std::make_tuple(NewUseReg, NewUseSubReg, true);
>> +}
>> +
>> +/// Check that \p MI does not have implicit uses that overlap with
>> it's \p Use
>> +/// operand (the register being replaced), since these can sometimes
>> be
>> +/// implicitly tied to other operands. For example, on AMDGPU:
>> +///
>> +/// V_MOVRELS_B32_e32 %VGPR2, %M0<imp-use>, %EXEC<imp-use>,
>> %VGPR2_VGPR3_VGPR4_VGPR5<imp-use>
>> +///
>> +/// the %VGPR2 is implicitly tied to the larger reg operand, but we
>> have no
>> +/// way of knowing we need to update the latter when updating the
>> former.
>> +bool MachineCopyPropagation::hasImplicitOverlap(const MachineInstr
>> &MI,
>> + const
>> MachineOperand &Use) {
>> + if (!TargetRegisterInfo::isPhysicalRegister(Use.getReg()))
>> + return false;
>> +
>> + for (const MachineOperand &MIUse : MI.uses())
>> + if (&MIUse != &Use && MIUse.isReg() && MIUse.isImplicit() &&
>> + TRI->regsOverlap(Use.getReg(), MIUse.getReg()))
>> + return true;
>> +
>> + return false;
>> +}
>> +
>> +/// Narrow the register class of the forwarded vreg so it matches any
>> +/// instruction constraints. \p MI is the instruction being
>> forwarded to. \p
>> +/// MOUse is the operand being replaced in \p MI (which hasn't yet
>> been updated
>> +/// at the time this function is called). \p NewUseReg and \p
>> NewUseSubReg are
>> +/// what the \p MOUse will be changed to after forwarding.
>> +///
>> +/// If we are forwarding
>> +/// A:RCA = COPY B:RCB
>> +/// into
>> +/// ... = OP A:RCA
>> +///
>> +/// then we need to narrow the register class of B so that it is a
>> subclass
>> +/// of RCA so that it meets the instruction register class
>> constraints.
>> +void MachineCopyPropagation::narrowRegClass(const MachineInstr &MI,
>> + const MachineOperand
>> &MOUse,
>> + unsigned NewUseReg,
>> + unsigned NewUseSubReg) {
>> + if (!TargetRegisterInfo::isVirtualRegister(NewUseReg))
>> + return;
>> +
>> + // Make sure the virtual reg class allows the subreg.
>> + if (NewUseSubReg) {
>> + const TargetRegisterClass *CurUseRC =
>> MRI->getRegClass(NewUseReg);
>> + const TargetRegisterClass *NewUseRC =
>> + TRI->getSubClassWithSubReg(CurUseRC, NewUseSubReg);
>> + if (CurUseRC != NewUseRC) {
>> + DEBUG(dbgs() << "MCP: Setting regclass of " <<
>> PrintReg(NewUseReg, TRI)
>> + << " to " << TRI->getRegClassName(NewUseRC) <<
>> "\n");
>> + MRI->setRegClass(NewUseReg, NewUseRC);
>> + }
>> + }
>> +
>> + unsigned MOUseOpNo = &MOUse - &MI.getOperand(0);
>> + const TargetRegisterClass *InstRC =
>> + TII->getRegClass(MI.getDesc(), MOUseOpNo, TRI, *MF);
>> + if (InstRC) {
>> + const TargetRegisterClass *CurUseRC =
>> MRI->getRegClass(NewUseReg);
>> + if (NewUseSubReg)
>> + InstRC = TRI->getMatchingSuperRegClass(CurUseRC, InstRC,
>> NewUseSubReg);
>> + if (!InstRC->hasSubClassEq(CurUseRC)) {
>> + const TargetRegisterClass *NewUseRC =
>> + TRI->getCommonSubClass(InstRC, CurUseRC);
>> + DEBUG(dbgs() << "MCP: Setting regclass of " <<
>> PrintReg(NewUseReg, TRI)
>> + << " to " << TRI->getRegClassName(NewUseRC) <<
>> "\n");
>> + MRI->setRegClass(NewUseReg, NewUseRC);
>> + }
>> + }
>> +}
>> +
>> +/// Update the LiveInterval information to reflect the destination
>> of \p Copy
>> +/// being forwarded to a use in \p UseMI. \p OrigUseReg is the
>> register being
>> +/// forwarded through. It should be the destination register of \p
>> Copy and has
>> +/// already been replaced in \p UseMI at the point this function is
>> called. \p
>> +/// NewUseReg and \p NewUseSubReg are the register and subregister
>> being
>> +/// forwarded. They should be the source register of the \p Copy
>> and should be
>> +/// the value of the \p UseMI operand being forwarded at the point
>> this function
>> +/// is called.
>> +void MachineCopyPropagation::updateForwardedCopyLiveInterval(
>> + const MachineInstr &Copy, const MachineInstr &UseMI, unsigned
>> OrigUseReg,
>> + unsigned NewUseReg, unsigned NewUseSubReg) {
>> +
>> + assert(TRI->isSubRegisterEq(getPhysReg(OrigUseReg, 0),
>> + getFullPhysReg(Copy.getOperand(0))) &&
>> + "OrigUseReg mismatch");
>> + assert(TRI->isSubRegisterEq(getFullPhysReg(Copy.getOperand(1)),
>> + getPhysReg(NewUseReg, 0)) &&
>> + "NewUseReg mismatch");
>> +
>> + // Extend live range starting from COPY early-clobber slot, since
>> that
>> + // is where the original src live range ends.
>> + SlotIndex CopyUseIdx =
>> + Indexes->getInstructionIndex(Copy).getRegSlot(true
>> /*=EarlyClobber*/);
>> + SlotIndex UseIdx = Indexes->getInstructionIndex(U
>> seMI).getRegSlot();
>> + if (TargetRegisterInfo::isVirtualRegister(NewUseReg)) {
>> + LiveInterval &LI = LIS->getInterval(NewUseReg);
>> + LI.extendInBlock(CopyUseIdx, UseIdx);
>> + LaneBitmask UseMask = TRI->getSubRegIndexLaneMask(NewUseSubReg);
>> + for (auto &S : LI.subranges())
>> + if ((S.LaneMask & UseMask).any() && S.find(CopyUseIdx))
>> + S.extendInBlock(CopyUseIdx, UseIdx);
>> + } else {
>> + assert(NewUseSubReg == 0 && "Unexpected subreg on physical
>> register!");
>> + for (MCRegUnitIterator UI(NewUseReg, TRI); UI.isValid(); ++UI) {
>> + LiveRange &LR = LIS->getRegUnit(*UI);
>> + LR.extendInBlock(CopyUseIdx, UseIdx);
>> + }
>> + }
>> +
>> + if (!TargetRegisterInfo::isVirtualRegister(OrigUseReg))
>> + return;
>> +
>> + LiveInterval &LI = LIS->getInterval(OrigUseReg);
>> +
>> + // Can happen for undef uses.
>> + if (LI.empty())
>> + return;
>> +
>> + SlotIndex UseIndex = Indexes->getInstructionIndex(UseMI);
>> + const LiveRange::Segment *UseSeg = LI.getSegmentContaining(UseInd
>> ex);
>> +
>> + // Only shrink if forwarded use is the end of a segment.
>> + if (UseSeg->end != UseIndex.getRegSlot())
>> + return;
>> +
>> + SmallVector<MachineInstr *, 4> DeadInsts;
>> + LIS->shrinkToUses(&LI, &DeadInsts);
>> + if (!DeadInsts.empty()) {
>> + SmallVector<unsigned, 8> NewRegs;
>> + LiveRangeEdit(nullptr, NewRegs, *MF, *LIS, nullptr, this)
>> + .eliminateDeadDefs(DeadInsts);
>> + }
>> +}
>> +
>> +void MachineCopyPropagation::LRE_WillEraseInstruction(MachineInstr
>> *MI) {
>> + // Remove this COPY from further consideration for forwarding.
>> + ClobberRegister(getFullPhysReg(MI->getOperand(0)));
>> + Changed = true;
>> +}
>> +
>> +/// Look for available copies whose destination register is used by
>> \p MI and
>> +/// replace the use in \p MI with the copy's source register.
>> +void MachineCopyPropagation::forwardUses(MachineInstr &MI) {
>> + if (AvailCopyMap.empty())
>> + return;
>> +
>> + // Look for non-tied explicit vreg uses that have an active COPY
>> + // instruction that defines the physical register allocated to
>> them.
>> + // Replace the vreg with the source of the active COPY.
>> + for (MachineOperand &MOUse : MI.explicit_uses()) {
>> + if (!MOUse.isReg() || MOUse.isTied())
>> + continue;
>> +
>> + unsigned UseReg = MOUse.getReg();
>> + if (!UseReg)
>> + continue;
>> +
>> + if (TargetRegisterInfo::isVirtualRegister(UseReg))
>> + UseReg = VRM->getPhys(UseReg);
>> + else if (MI.isCall() || MI.isReturn() || MI.isInlineAsm() ||
>> + MI.hasUnmodeledSideEffects() || MI.isDebugValue() ||
>> MI.isKill())
>> + // Some instructions seem to have ABI uses e.g. not marked as
>> + // implicit, which can lead to forwarding them when we
>> shouldn't, so
>> + // restrict the types of instructions we forward physical
>> regs into.
>> + continue;
>> +
>> + // Don't forward COPYs via non-allocatable regs since they can
>> have
>> + // non-standard semantics.
>> + if (!MRI->isAllocatable(UseReg))
>> + continue;
>> +
>> + auto CI = AvailCopyMap.find(UseReg);
>> + if (CI == AvailCopyMap.end())
>> + continue;
>> +
>> + MachineInstr &Copy = *CI->second;
>> + MachineOperand &CopyDst = Copy.getOperand(0);
>> + MachineOperand &CopySrc = Copy.getOperand(1);
>> +
>> + // Don't forward COPYs that are already NOPs due to register
>> assignment.
>> + if (getPhysReg(CopyDst) == getPhysReg(CopySrc))
>> + continue;
>> +
>> + // FIXME: Don't handle partial uses of wider COPYs yet.
>> + if (CopyDst.getSubReg() != 0 || UseReg != getPhysReg(CopyDst))
>> + continue;
>> +
>> + // Don't forward COPYs of non-allocatable regs unless they are
>> constant.
>> + unsigned CopySrcReg = CopySrc.getReg();
>> + if (TargetRegisterInfo::isPhysicalRegister(CopySrcReg) &&
>> + !MRI->isAllocatable(CopySrcReg) &&
>> !MRI->isConstantPhysReg(CopySrcReg))
>> + continue;
>> +
>> + if (!isForwardableRegClassCopy(Copy, MI))
>> + continue;
>> +
>> + unsigned NewUseReg, NewUseSubReg;
>> + bool SubRegOK;
>> + std::tie(NewUseReg, NewUseSubReg, SubRegOK) =
>> + checkUseSubReg(CopySrc, MOUse);
>> + if (!SubRegOK)
>> + continue;
>> +
>> + if (hasImplicitOverlap(MI, MOUse))
>> + continue;
>> +
>> + DEBUG(dbgs() << "MCP: Replacing "
>> + << PrintReg(MOUse.getReg(), TRI, MOUse.getSubReg())
>> + << "\n with "
>> + << PrintReg(NewUseReg, TRI, CopySrc.getSubReg())
>> + << "\n in "
>> + << MI
>> + << " from "
>> + << Copy);
>> +
>> + narrowRegClass(MI, MOUse, NewUseReg, NewUseSubReg);
>> +
>> + unsigned OrigUseReg = MOUse.getReg();
>> + MOUse.setReg(NewUseReg);
>> + MOUse.setSubReg(NewUseSubReg);
>> +
>> + DEBUG(dbgs() << "MCP: After replacement: " << MI << "\n");
>> +
>> + if (PreRegRewrite)
>> + updateForwardedCopyLiveInterval(Copy, MI, OrigUseReg,
>> NewUseReg,
>> + NewUseSubReg);
>> + else
>> + for (MachineInstr &KMI :
>> + make_range(Copy.getIterator(),
>> std::next(MI.getIterator())))
>> + KMI.clearRegisterKills(NewUseReg, TRI);
>> +
>> + ++NumCopyForwards;
>> + Changed = true;
>> + }
>> +}
>> +
>> void MachineCopyPropagation::CopyPropagateBlock(MachineBasicBlock
>> &MBB) {
>> DEBUG(dbgs() << "MCP: CopyPropagateBlock " << MBB.getName() <<
>> "\n");
>>
>> @@ -198,12 +687,8 @@ void MachineCopyPropagation::CopyPropaga
>> ++I;
>>
>> if (MI->isCopy()) {
>> - unsigned Def = MI->getOperand(0).getReg();
>> - unsigned Src = MI->getOperand(1).getReg();
>> -
>> - assert(!TargetRegisterInfo::isVirtualRegister(Def) &&
>> - !TargetRegisterInfo::isVirtualRegister(Src) &&
>> - "MachineCopyPropagation should be run after register
>> allocation!");
>> + unsigned Def = getPhysReg(MI->getOperand(0));
>> + unsigned Src = getPhysReg(MI->getOperand(1));
>>
>> // The two copies cancel out and the source of the first copy
>> // hasn't been overridden, eliminate the second one. e.g.
>> @@ -220,8 +705,16 @@ void MachineCopyPropagation::CopyPropaga
>> // %ECX<def> = COPY %EAX
>> // =>
>> // %ECX<def> = COPY %EAX
>> - if (eraseIfRedundant(*MI, Def, Src) || eraseIfRedundant(*MI,
>> Src, Def))
>> - continue;
>> + if (!PreRegRewrite)
>> + if (eraseIfRedundant(*MI, Def, Src) ||
>> eraseIfRedundant(*MI, Src, Def))
>> + continue;
>> +
>> + forwardUses(*MI);
>> +
>> + // Src may have been changed by forwardUses()
>> + Src = getPhysReg(MI->getOperand(1));
>> + unsigned DefClobber = getFullPhysReg(MI->getOperand(0));
>> + unsigned SrcClobber = getFullPhysReg(MI->getOperand(1));
>>
>> // If Src is defined by a previous copy, the previous copy
>> cannot be
>> // eliminated.
>> @@ -238,7 +731,10 @@ void MachineCopyPropagation::CopyPropaga
>> DEBUG(dbgs() << "MCP: Copy is a deletion candidate: ";
>> MI->dump());
>>
>> // Copy is now a candidate for deletion.
>> - if (!MRI->isReserved(Def))
>> + // Only look for dead COPYs if we're not running just before
>> + // VirtRegRewriter, since presumably these COPYs will have
>> already been
>> + // removed.
>> + if (!PreRegRewrite && !MRI->isReserved(Def))
>> MaybeDeadCopies.insert(MI);
>>
>> // If 'Def' is previously source of another copy, then this
>> earlier copy's
>> @@ -248,11 +744,11 @@ void MachineCopyPropagation::CopyPropaga
>> // %xmm2<def> = copy %xmm0
>> // ...
>> // %xmm2<def> = copy %xmm9
>> - ClobberRegister(Def);
>> + ClobberRegister(DefClobber);
>> for (const MachineOperand &MO : MI->implicit_operands()) {
>> if (!MO.isReg() || !MO.isDef())
>> continue;
>> - unsigned Reg = MO.getReg();
>> + unsigned Reg = getFullPhysReg(MO);
>> if (!Reg)
>> continue;
>> ClobberRegister(Reg);
>> @@ -267,13 +763,27 @@ void MachineCopyPropagation::CopyPropaga
>>
>> // Remember source that's copied to Def. Once it's
>> clobbered, then
>> // it's no longer available for copy propagation.
>> - RegList &DestList = SrcMap[Src];
>> - if (!is_contained(DestList, Def))
>> - DestList.push_back(Def);
>> + RegList &DestList = SrcMap[SrcClobber];
>> + if (!is_contained(DestList, DefClobber))
>> + DestList.push_back(DefClobber);
>>
>> continue;
>> }
>>
>> + // Clobber any earlyclobber regs first.
>> + for (const MachineOperand &MO : MI->operands())
>> + if (MO.isReg() && MO.isEarlyClobber()) {
>> + unsigned Reg = getFullPhysReg(MO);
>> + // If we have a tied earlyclobber, that means it is also
>> read by this
>> + // instruction, so we need to make sure we don't remove it
>> as dead
>> + // later.
>> + if (MO.isTied())
>> + ReadRegister(Reg);
>> + ClobberRegister(Reg);
>> + }
>> +
>> + forwardUses(*MI);
>> +
>> // Not a copy.
>> SmallVector<unsigned, 2> Defs;
>> const MachineOperand *RegMask = nullptr;
>> @@ -282,14 +792,11 @@ void MachineCopyPropagation::CopyPropaga
>> RegMask = &MO;
>> if (!MO.isReg())
>> continue;
>> - unsigned Reg = MO.getReg();
>> + unsigned Reg = getFullPhysReg(MO);
>> if (!Reg)
>> continue;
>>
>> - assert(!TargetRegisterInfo::isVirtualRegister(Reg) &&
>> - "MachineCopyPropagation should be run after register
>> allocation!");
>> -
>> - if (MO.isDef()) {
>> + if (MO.isDef() && !MO.isEarlyClobber()) {
>> Defs.push_back(Reg);
>> continue;
>> } else if (MO.readsReg())
>> @@ -346,6 +853,8 @@ void MachineCopyPropagation::CopyPropaga
>> // since we don't want to trust live-in lists.
>> if (MBB.succ_empty()) {
>> for (MachineInstr *MaybeDead : MaybeDeadCopies) {
>> + DEBUG(dbgs() << "MCP: Removing copy due to no live-out succ: ";
>> + MaybeDead->dump());
>> assert(!MRI->isReserved(MaybeDead->getOperand(0).getReg()));
>> MaybeDead->eraseFromParent();
>> Changed = true;
>> @@ -368,10 +877,16 @@ bool MachineCopyPropagation::runOnMachin
>> TRI = MF.getSubtarget().getRegisterInfo();
>> TII = MF.getSubtarget().getInstrInfo();
>> MRI = &MF.getRegInfo();
>> + this->MF = &MF;
>> + if (PreRegRewrite) {
>> + Indexes = &getAnalysis<SlotIndexes>();
>> + LIS = &getAnalysis<LiveIntervals>();
>> + VRM = &getAnalysis<VirtRegMap>();
>> + }
>> + NoSubRegLiveness = !MRI->subRegLivenessEnabled();
>>
>> for (MachineBasicBlock &MBB : MF)
>> CopyPropagateBlock(MBB);
>>
>> return Changed;
>> }
>> -
>>
>> Modified: llvm/trunk/lib/CodeGen/TargetPassConfig.cpp
>> URL:
>> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/
>> TargetPassConfig.cpp?rev=311038&r1=311037&r2=311038&view=diff
>> <http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/
>> TargetPassConfig.cpp?rev=311038&r1=311037&r2=311038&view=diff>
>> ============================================================
>> ==================
>> --- llvm/trunk/lib/CodeGen/TargetPassConfig.cpp (original)
>> +++ llvm/trunk/lib/CodeGen/TargetPassConfig.cpp Wed Aug 16 13:50:01
>> 2017
>> @@ -88,6 +88,8 @@ static cl::opt<bool> DisableCGP("disable
>> cl::desc("Disable Codegen Prepare"));
>> static cl::opt<bool> DisableCopyProp("disable-copyprop",
>> cl::Hidden,
>> cl::desc("Disable Copy Propagation pass"));
>> +static cl::opt<bool>
>> DisableCopyPropPreRegRewrite("disable-copyprop-prerewrite",
>> cl::Hidden,
>> + cl::desc("Disable Copy Propagation Pre-Register Re-write pass"));
>> static cl::opt<bool>
>> DisablePartialLibcallInlining("disable-partial-libcall-inlining",
>> cl::Hidden, cl::desc("Disable Partial Libcall Inlining"));
>> static cl::opt<bool> EnableImplicitNullChecks(
>> @@ -248,6 +250,9 @@ static IdentifyingPassPtr overridePass(A
>> if (StandardID == &MachineCopyPropagationID)
>> return applyDisable(TargetID, DisableCopyProp);
>>
>> + if (StandardID == &MachineCopyPropagationPreRegRewriteID)
>> + return applyDisable(TargetID, DisableCopyPropPreRegRewrite);
>> +
>> return TargetID;
>> }
>>
>> @@ -1059,6 +1064,10 @@ void TargetPassConfig::addOptimizedRegAl
>> // Allow targets to change the register assignments before
>> rewriting.
>> addPreRewrite();
>>
>> + // Copy propagate to forward register uses and try to eliminate
>> COPYs that
>> + // were not coalesced.
>> + addPass(&MachineCopyPropagationPreRegRewriteID);
>> +
>> // Finally rewrite virtual registers.
>> addPass(&VirtRegRewriterID);
>>
>>
>> Modified: llvm/trunk/test/CodeGen/AArch64/aarch64-fold-lslfast.ll
>> URL:
>> http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/
>> AArch64/aarch64-fold-lslfast.ll?rev=311038&r1=311037&r2=311038&view=diff
>> <http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen
>> /AArch64/aarch64-fold-lslfast.ll?rev=311038&r1=311037&r2=311038&view=diff
>> >
>> ============================================================
>> ==================
>> --- llvm/trunk/test/CodeGen/AArch64/aarch64-fold-lslfast.ll
>> (original)
>> +++ llvm/trunk/test/CodeGen/AArch64/aarch64-fold-lslfast.ll Wed Aug
>> 16 13:50:01 2017
>> @@ -9,7 +9,8 @@ define i16 @halfword(%struct.a* %ctx, i3
>> ; CHECK-LABEL: halfword:
>> ; CHECK: ubfx [[REG:x[0-9]+]], x1, #9, #8
>> ; CHECK: ldrh [[REG1:w[0-9]+]], [{{.*}}[[REG2:x[0-9]+]], [[REG]],
>> lsl #1]
>> -; CHECK: strh [[REG1]], [{{.*}}[[REG2]], [[REG]], lsl #1]
>> +; CHECK: mov [[REG3:x[0-9]+]], [[REG2]]
>> +; CHECK: strh [[REG1]], [{{.*}}[[REG3]], [[REG]], lsl #1]
>> %shr81 = lshr i32 %xor72, 9
>> %conv82 = zext i32 %shr81 to i64
>> %idxprom83 = and i64 %conv82, 255
>> @@ -24,7 +25,8 @@ define i32 @word(%struct.b* %ctx, i32 %x
>> ; CHECK-LABEL: word:
>> ; CHECK: ubfx [[REG:x[0-9]+]], x1, #9, #8
>> ; CHECK: ldr [[REG1:w[0-9]+]], [{{.*}}[[REG2:x[0-9]+]], [[REG]],
>> lsl #2]
>> -; CHECK: str [[REG1]], [{{.*}}[[REG2]], [[REG]], lsl #2]
>> +; CHECK: mov [[REG3:x[0-9]+]], [[REG2]]
>> +; CHECK: str [[REG1]], [{{.*}}[[REG3]], [[REG]], lsl #2]
>> %shr81 = lshr i32 %xor72, 9
>> %conv82 = zext i32 %shr81 to i64
>> %idxprom83 = and i64 %conv82, 255
>> @@ -39,7 +41,8 @@ define i64 @doubleword(%struct.c* %ctx,
>> ; CHECK-LABEL: doubleword:
>> ; CHECK: ubfx [[REG:x[0-9]+]], x1, #9, #8
>> ; CHECK: ldr [[REG1:x[0-9]+]], [{{.*}}[[REG2:x[0-9]+]], [[REG]],
>> lsl #3]
>> -; CHECK: str [[REG1]], [{{.*}}[[REG2]], [[REG]], lsl #3]
>> +; CHECK: mov [[REG3:x[0-9]+]], [[REG2]]
>> +; CHECK: str [[REG1]], [{{.*}}[[REG3]], [[REG]], lsl #3]
>> %shr81 = lshr i32 %xor72, 9
>> %conv82 = zext i32 %shr81 to i64
>> %idxprom83 = and i64 %conv82, 255
>>
>> Modified: llvm/trunk/test/CodeGen/AArch64/arm64-AdvSIMD-Scalar.ll
>> URL:
>> http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/
>> AArch64/arm64-AdvSIMD-Scalar.ll?rev=311038&r1=311037&r2=311038&view=diff
>> <http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen
>> /AArch64/arm64-AdvSIMD-Scalar.ll?rev=311038&r1=311037&r2=311038&view=diff
>> >
>> ============================================================
>> ==================
>> --- llvm/trunk/test/CodeGen/AArch64/arm64-AdvSIMD-Scalar.ll
>> (original)
>> +++ llvm/trunk/test/CodeGen/AArch64/arm64-AdvSIMD-Scalar.ll Wed Aug
>> 16 13:50:01 2017
>> @@ -8,15 +8,9 @@ define <2 x i64> @bar(<2 x i64> %a, <2 x
>> ; CHECK: add.2d v[[REG:[0-9]+]], v0, v1
>> ; CHECK: add d[[REG3:[0-9]+]], d[[REG]], d1
>> ; CHECK: sub d[[REG2:[0-9]+]], d[[REG]], d1
>> -; Without advanced copy optimization, we end up with cross register
>> -; banks copies that cannot be coalesced.
>> -; CHECK-NOOPT: fmov [[COPY_REG3:x[0-9]+]], d[[REG3]]
>> -; With advanced copy optimization, we end up with just one copy
>> -; to insert the computed high part into the V register.
>> -; CHECK-OPT-NOT: fmov
>> +; CHECK-NOT: fmov
>> ; CHECK: fmov [[COPY_REG2:x[0-9]+]], d[[REG2]]
>> -; CHECK-NOOPT: fmov d0, [[COPY_REG3]]
>> -; CHECK-OPT-NOT: fmov
>> +; CHECK-NOT: fmov
>> ; CHECK: ins.d v0[1], [[COPY_REG2]]
>> ; CHECK-NEXT: ret
>> ;
>> @@ -24,11 +18,9 @@ define <2 x i64> @bar(<2 x i64> %a, <2 x
>> ; GENERIC: add v[[REG:[0-9]+]].2d, v0.2d, v1.2d
>> ; GENERIC: add d[[REG3:[0-9]+]], d[[REG]], d1
>> ; GENERIC: sub d[[REG2:[0-9]+]], d[[REG]], d1
>> -; GENERIC-NOOPT: fmov [[COPY_REG3:x[0-9]+]], d[[REG3]]
>> -; GENERIC-OPT-NOT: fmov
>> +; GENERIC-NOT: fmov
>> ; GENERIC: fmov [[COPY_REG2:x[0-9]+]], d[[REG2]]
>> -; GENERIC-NOOPT: fmov d0, [[COPY_REG3]]
>> -; GENERIC-OPT-NOT: fmov
>> +; GENERIC-NOT: fmov
>> ; GENERIC: ins v0.d[1], [[COPY_REG2]]
>> ; GENERIC-NEXT: ret
>> %add = add <2 x i64> %a, %b
>>
>> Modified: llvm/trunk/test/CodeGen/AArch64/arm64-zero-cycle-regmov.ll
>> URL:
>> http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/
>> AArch64/arm64-zero-cycle-regmov.ll?rev=311038&r1=311037
>> &r2=311038&view=diff
>> <http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen
>> /AArch64/arm64-zero-cycle-regmov.ll?rev=311038&r1=311037
>> &r2=311038&view=diff>
>> ============================================================
>> ==================
>> --- llvm/trunk/test/CodeGen/AArch64/arm64-zero-cycle-regmov.ll
>> (original)
>> +++ llvm/trunk/test/CodeGen/AArch64/arm64-zero-cycle-regmov.ll Wed
>> Aug 16 13:50:01 2017
>> @@ -4,8 +4,10 @@
>> define i32 @t(i32 %a, i32 %b, i32 %c, i32 %d) nounwind ssp {
>> entry:
>> ; CHECK-LABEL: t:
>> -; CHECK: mov x0, [[REG1:x[0-9]+]]
>> -; CHECK: mov x1, [[REG2:x[0-9]+]]
>> +; CHECK: mov [[REG2:x[0-9]+]], x3
>> +; CHECK: mov [[REG1:x[0-9]+]], x2
>> +; CHECK: mov x0, x2
>> +; CHECK: mov x1, x3
>> ; CHECK: bl _foo
>> ; CHECK: mov x0, [[REG1]]
>> ; CHECK: mov x1, [[REG2]]
>>
>> Modified: llvm/trunk/test/CodeGen/AArch64/f16-instructions.ll
>> URL:
>> http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/
>> AArch64/f16-instructions.ll?rev=311038&r1=311037&r2=311038&view=diff
>> <http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen
>> /AArch64/f16-instructions.ll?rev=311038&r1=311037&r2=311038&view=diff>
>> ============================================================
>> ==================
>> --- llvm/trunk/test/CodeGen/AArch64/f16-instructions.ll (original)
>> +++ llvm/trunk/test/CodeGen/AArch64/f16-instructions.ll Wed Aug 16
>> 13:50:01 2017
>> @@ -350,7 +350,7 @@ else:
>>
>> ; CHECK-LABEL: test_phi:
>> ; CHECK: mov x[[PTR:[0-9]+]], x0
>> -; CHECK: ldr h[[AB:[0-9]+]], [x[[PTR]]]
>> +; CHECK: ldr h[[AB:[0-9]+]], [x0]
>> ; CHECK: [[LOOP:LBB[0-9_]+]]:
>> ; CHECK: mov.16b v[[R:[0-9]+]], v[[AB]]
>> ; CHECK: ldr h[[AB]], [x[[PTR]]]
>>
>> Modified: llvm/trunk/test/CodeGen/AArch64/flags-multiuse.ll
>> URL:
>> http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/
>> AArch64/flags-multiuse.ll?rev=311038&r1=311037&r2=311038&view=diff
>> <http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen
>> /AArch64/flags-multiuse.ll?rev=311038&r1=311037&r2=311038&view=diff>
>> ============================================================
>> ==================
>> --- llvm/trunk/test/CodeGen/AArch64/flags-multiuse.ll (original)
>> +++ llvm/trunk/test/CodeGen/AArch64/flags-multiuse.ll Wed Aug 16
>> 13:50:01 2017
>> @@ -17,6 +17,9 @@ define i32 @test_multiflag(i32 %n, i32 %
>> %val = zext i1 %test to i32
>> ; CHECK: cset {{[xw][0-9]+}}, ne
>>
>> +; CHECK: mov [[RHSCOPY:w[0-9]+]], [[RHS]]
>> +; CHECK: mov [[LHSCOPY:w[0-9]+]], [[LHS]]
>> +
>> store i32 %val, i32* @var
>>
>> call void @bar()
>> @@ -25,7 +28,7 @@ define i32 @test_multiflag(i32 %n, i32 %
>> ; Currently, the comparison is emitted again. An MSR/MRS pair
>> would also be
>> ; acceptable, but assuming the call preserves NZCV is not.
>> br i1 %test, label %iftrue, label %iffalse
>> -; CHECK: cmp [[LHS]], [[RHS]]
>> +; CHECK: cmp [[LHSCOPY]], [[RHSCOPY]]
>> ; CHECK: b.eq
>>
>> iftrue:
>>
>> Modified: llvm/trunk/test/CodeGen/AArch64/merge-store-dependency.ll
>> URL:
>> http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/
>> AArch64/merge-store-dependency.ll?rev=311038&r1=311037&r2=
>> 311038&view=diff
>> <http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen
>> /AArch64/merge-store-dependency.ll?rev=311038&r1=311037&r2=
>> 311038&view=diff>
>> ============================================================
>> ==================
>> --- llvm/trunk/test/CodeGen/AArch64/merge-store-dependency.ll
>> (original)
>> +++ llvm/trunk/test/CodeGen/AArch64/merge-store-dependency.ll Wed
>> Aug 16 13:50:01 2017
>> @@ -8,10 +8,9 @@
>> define void @test(%struct1* %fde, i32 %fd, void (i32, i32, i8*)*
>> %func, i8* %arg) {
>> ;CHECK-LABEL: test
>> entry:
>> -; A53: mov [[DATA:w[0-9]+]], w1
>> ; A53: str q{{[0-9]+}}, {{.*}}
>> ; A53: str q{{[0-9]+}}, {{.*}}
>> -; A53: str [[DATA]], {{.*}}
>> +; A53: str w1, {{.*}}
>>
>> %0 = bitcast %struct1* %fde to i8*
>> tail call void @llvm.memset.p0i8.i64(i8* %0, i8 0, i64 40, i32
>> 8, i1 false)
>>
>> Modified: llvm/trunk/test/CodeGen/AArch64/neg-imm.ll
>> URL:
>> http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/
>> AArch64/neg-imm.ll?rev=311038&r1=311037&r2=311038&view=diff
>> <http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen
>> /AArch64/neg-imm.ll?rev=311038&r1=311037&r2=311038&view=diff>
>> ============================================================
>> ==================
>> --- llvm/trunk/test/CodeGen/AArch64/neg-imm.ll (original)
>> +++ llvm/trunk/test/CodeGen/AArch64/neg-imm.ll Wed Aug 16 13:50:01
>> 2017
>> @@ -7,8 +7,8 @@ declare void @foo(i32)
>> define void @test(i32 %px) {
>> ; CHECK_LABEL: test:
>> ; CHECK_LABEL: %entry
>> -; CHECK: subs
>> -; CHECK-NEXT: csel
>> +; CHECK: subs [[REG0:w[0-9]+]],
>> +; CHECK: csel {{w[0-9]+}}, wzr, [[REG0]]
>> entry:
>> %sub = add nsw i32 %px, -1
>> %cmp = icmp slt i32 %px, 1
>>
>> Modified: llvm/trunk/test/CodeGen/AMDGPU/byval-frame-setup.ll
>> URL:
>> http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/
>> AMDGPU/byval-frame-setup.ll?rev=311038&r1=311037&r2=311038&view=diff
>> <http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen
>> /AMDGPU/byval-frame-setup.ll?rev=311038&r1=311037&r2=311038&view=diff>
>> ============================================================
>> ==================
>> --- llvm/trunk/test/CodeGen/AMDGPU/byval-frame-setup.ll (original)
>> +++ llvm/trunk/test/CodeGen/AMDGPU/byval-frame-setup.ll Wed Aug 16
>> 13:50:01 2017
>> @@ -127,20 +127,21 @@ entry:
>> }
>>
>> ; GCN-LABEL: {{^}}call_void_func_byval_struct_kernel:
>> -; GCN: s_mov_b32 s33, s7
>> -; GCN: s_add_u32 s32, s33, 0xa00{{$}}
>> +; GCN: s_add_u32 s32, s7, 0xa00{{$}}
>>
>> ; GCN-DAG: v_mov_b32_e32 [[NINE:v[0-9]+]], 9
>> ; GCN-DAG: v_mov_b32_e32 [[THIRTEEN:v[0-9]+]], 13
>> -; GCN-DAG: buffer_store_dword [[NINE]], off, s[0:3], s33 offset:8
>> -; GCN: buffer_store_dword [[THIRTEEN]], off, s[0:3], s33 offset:24
>> +; GCN-DAG: buffer_store_dword [[NINE]], off, s[0:3], s7 offset:8
>> +; GCN: buffer_store_dword [[THIRTEEN]], off, s[0:3], s7 offset:24
>> +
>> +; GCN: s_mov_b32 s33, s7
>>
>> ; GCN-DAG: s_add_u32 s32, s32, 0x800{{$}}
>>
>> -; GCN-DAG: buffer_load_dword [[LOAD0:v[0-9]+]], off, s[0:3], s33
>> offset:8
>> -; GCN-DAG: buffer_load_dword [[LOAD1:v[0-9]+]], off, s[0:3], s33
>> offset:12
>> -; GCN-DAG: buffer_load_dword [[LOAD2:v[0-9]+]], off, s[0:3], s33
>> offset:16
>> -; GCN-DAG: buffer_load_dword [[LOAD3:v[0-9]+]], off, s[0:3], s33
>> offset:20
>> +; GCN-DAG: buffer_load_dword [[LOAD0:v[0-9]+]], off, s[0:3],
>> s{{7|33}} offset:8
>> +; GCN-DAG: buffer_load_dword [[LOAD1:v[0-9]+]], off, s[0:3],
>> s{{7|33}} offset:12
>> +; GCN-DAG: buffer_load_dword [[LOAD2:v[0-9]+]], off, s[0:3],
>> s{{7|33}} offset:16
>> +; GCN-DAG: buffer_load_dword [[LOAD3:v[0-9]+]], off, s[0:3],
>> s{{7|33}} offset:20
>>
>> ; GCN-DAG: buffer_store_dword [[LOAD0]], off, s[0:3], s32
>> offset:4{{$}}
>> ; GCN-DAG: buffer_store_dword [[LOAD1]], off, s[0:3], s32 offset:8
>>
>> Modified: llvm/trunk/test/CodeGen/AMDGPU/call-argument-types.ll
>> URL:
>> http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/
>> AMDGPU/call-argument-types.ll?rev=311038&r1=311037&r2=311038&view=diff
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20170817/e827419f/attachment-0001.html>
More information about the llvm-commits
mailing list