[llvm] r311038 - [MachineCopyPropagation] Extend pass to do COPY source forwarding

Geoff Berry via llvm-commits llvm-commits at lists.llvm.org
Tue Aug 22 20:46:04 PDT 2017


I've been trying to reproduce this failure, but I'm getting check-asan 
hangs even if I test the revision with my latest revert:

Is there something I need to do to the device that isn't done by these 
scripts?  Are there additional environment variables I need to set 
before running?

For the record I'm running:
$ BUILDBOT_CLOBBER= BUILDBOT_REVISION=311142 ./buildbot_android.sh

r311142 is my latest reversion of my MachineCopyPropagation changes
and the only changes I've made to the scripts are:
- change the svn repo to be https:// instead of http:// to avoid svn 
timeouts
- add -DLLVM_LIT_ARGS="-v" to the compiler-rt cmake args so running 
check-asan will print the tests names as they run to the log file

It appears that the following tests are hanging:
/data/local/tmp/Output/deep_call_stack.cc.tmp
/data/local/tmp/Output/interception_test.cc.tmp
/data/local/tmp/Output/quarantine_size_mb.cc.tmp
/data/local/tmp/Output/thread_local_quarantine_size_kb.cc.tmp
/data/local/tmp/Output/coverage-module-unloaded.cc.tmp.exe
/data/local/tmp/Output/coverage-reset.cc.tmp
/data/local/tmp/Output/coverage.cc.tmp
/data/local/tmp/Output/start-deactivated.cc.tmp


On 8/17/2017 9:24 PM, Vitaly Buka wrote:
> If you have android with arm 32bit, you probably can reproduce with 
> running 
> https://llvm.org/svn/llvm-project/zorg/trunk/zorg/buildbot/builders/sanitizers/buildbot_android.sh 
> in empty directory.
> 
> On Thu, Aug 17, 2017 at 6:23 PM, Vitaly Buka <vitalybuka at google.com 
> <mailto:vitalybuka at google.com>> wrote:
> 
>     I just hangs on "Running the AddressSanitizer tests" for arm
> 
>     /var/lib/buildbot/sanitizer-buildbot6/sanitizer-x86_64-linux-android/build/compiler_rt_build_android_arm/lib/asan/tests/AsanTest:
>     1 file pus
>     hed. 3.7 MB/s (10602792 bytes in 2.742s)
>     + echo @@@BUILD_STEP run asan lit tests
>     '[arm/sailfish-userdebug/OPR1.170621.001]@@@'
>     @@@BUILD_STEP run asan lit tests
>     [arm/sailfish-userdebug/OPR1.170621.001]@@@
>     + cd
>     /var/lib/buildbot/sanitizer-buildbot6/sanitizer-x86_64-linux-android/build/compiler_rt_build_android_arm
>     + ninja check-asan
>     [1/1] Running the AddressSanitizer tests
> 
> 
>     But the same device passes aarch64 tests:
>     [1/1] Running the AddressSanitizer tests
>     -- Testing: 407 tests, 12 threads --
>     Testing: 0 .. 10.. 20.. 30.. 40.. 50.. 60.. 70.. 80.. 90..
>     Testing Time: 87.09s
>        Expected Passes    : 202
>        Expected Failures  : 27
>        Unsupported Tests  : 178
>     + echo @@@BUILD_STEP run sanitizer_common tests
>     '[aarch64/sailfish-userdebug/OPR1.170621.001]@@@'
>     @@@BUILD_STEP run sanitizer_common tests
>     [aarch64/sailfish-userdebug/OPR1.170621.001]@@@
> 
> 
> 
>     On Thu, Aug 17, 2017 at 5:42 PM, Geoff Berry <gberry at codeaurora.org
>     <mailto:gberry at codeaurora.org>> wrote:
> 
>         It seems like this is still happening after fixing a couple of
>         issues with this patch.  I'll revert again shortly.  I'm having
>         a hard time understanding what is failing from looking at the
>         buildbot logs though. Do you have any way of determining what
>         exactly is timing out?
> 
>         On 8/16/2017 11:13 PM, Vitaly Buka wrote:
> 
>             Looks like after this patch Android tests consistently hang
>             http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-android/builds/1825
>             <http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-android/builds/1825>
> 
>             On Wed, Aug 16, 2017 at 1:50 PM, Geoff Berry via
>             llvm-commits <llvm-commits at lists.llvm.org
>             <mailto:llvm-commits at lists.llvm.org>
>             <mailto:llvm-commits at lists.llvm.org
>             <mailto:llvm-commits at lists.llvm.org>>> wrote:
> 
>                  Author: gberry
>                  Date: Wed Aug 16 13:50:01 2017
>                  New Revision: 311038
> 
>                  URL:
>             http://llvm.org/viewvc/llvm-project?rev=311038&view=rev
>             <http://llvm.org/viewvc/llvm-project?rev=311038&view=rev>
>                 
>             <http://llvm.org/viewvc/llvm-project?rev=311038&view=rev
>             <http://llvm.org/viewvc/llvm-project?rev=311038&view=rev>>
>                  Log:
>                  [MachineCopyPropagation] Extend pass to do COPY source
>             forwarding
> 
>                  This change extends MachineCopyPropagation to do COPY
>             source forwarding.
> 
>                  This change also extends the MachineCopyPropagation
>             pass to be able to
>                  be run during register allocation, after physical
>             registers have been
>                  assigned, but before the virtual registers have been
>             re-written, which
>                  allows it to remove virtual register COPY LiveIntervals
>             that become dead
>                  through the forwarding of all of their uses.
> 
>                  Reviewers: qcolombet, javed.absar, MatzeB, jonpa
> 
>                  Subscribers: jyknight, nemanjai, llvm-commits,
>             nhaehnle, mcrosier,
>                  mgorny
> 
>                  Differential Revision: https://reviews.llvm.org/D30751
>             <https://reviews.llvm.org/D30751>
>                  <https://reviews.llvm.org/D30751
>             <https://reviews.llvm.org/D30751>>
> 
>                  Modified:
>                       llvm/trunk/include/llvm/CodeGen/Passes.h
>                       llvm/trunk/include/llvm/InitializePasses.h
>                       llvm/trunk/lib/CodeGen/CodeGen.cpp
>                       llvm/trunk/lib/CodeGen/MachineCopyPropagation.cpp
>                       llvm/trunk/lib/CodeGen/TargetPassConfig.cpp
>                     
>               llvm/trunk/test/CodeGen/AArch64/aarch64-fold-lslfast.ll
>                     
>               llvm/trunk/test/CodeGen/AArch64/arm64-AdvSIMD-Scalar.ll
>                     
>               llvm/trunk/test/CodeGen/AArch64/arm64-zero-cycle-regmov.ll
>                       llvm/trunk/test/CodeGen/AArch64/f16-instructions.ll
>                       llvm/trunk/test/CodeGen/AArch64/flags-multiuse.ll
>                     
>               llvm/trunk/test/CodeGen/AArch64/merge-store-dependency.ll
>                       llvm/trunk/test/CodeGen/AArch64/neg-imm.ll
>                       llvm/trunk/test/CodeGen/AMDGPU/byval-frame-setup.ll
>                       llvm/trunk/test/CodeGen/AMDGPU/call-argument-types.ll
>                     
>               llvm/trunk/test/CodeGen/AMDGPU/call-preserved-registers.ll
>                     
>               llvm/trunk/test/CodeGen/AMDGPU/callee-special-input-sgprs.ll
>                     
>               llvm/trunk/test/CodeGen/AMDGPU/callee-special-input-vgprs.ll
>                       llvm/trunk/test/CodeGen/AMDGPU/mubuf-offset-private.ll
>                       llvm/trunk/test/CodeGen/AMDGPU/multilevel-break.ll
>                     
>               llvm/trunk/test/CodeGen/AMDGPU/private-access-no-objects.ll
>                       llvm/trunk/test/CodeGen/AMDGPU/ret.ll
>                       llvm/trunk/test/CodeGen/ARM/atomic-op.ll
>                       llvm/trunk/test/CodeGen/ARM/swifterror.ll
>                       llvm/trunk/test/CodeGen/Mips/llvm-ir/sub.ll
>                       llvm/trunk/test/CodeGen/PowerPC/fma-mutate.ll
>                       llvm/trunk/test/CodeGen/PowerPC/inlineasm-i64-reg.ll
>                       llvm/trunk/test/CodeGen/PowerPC/tail-dup-layout.ll
>                       llvm/trunk/test/CodeGen/SPARC/32abi.ll
>                       llvm/trunk/test/CodeGen/SPARC/atomics.ll
>                       llvm/trunk/test/CodeGen/Thumb/thumb-shrink-wrapping.ll
>                     
>               llvm/trunk/test/CodeGen/X86/2006-03-01-InstrSchedBug.ll
>                       llvm/trunk/test/CodeGen/X86/arg-copy-elide.ll
>                       llvm/trunk/test/CodeGen/X86/avg.ll
>                       llvm/trunk/test/CodeGen/X86/avx-load-store.ll
>                       llvm/trunk/test/CodeGen/X86/avx512-bugfix-25270.ll
>                       llvm/trunk/test/CodeGen/X86/avx512-calling-conv.ll
>                       llvm/trunk/test/CodeGen/X86/avx512-mask-op.ll
>                     
>               llvm/trunk/test/CodeGen/X86/avx512bw-intrinsics-upgrade.ll
>                     
>               llvm/trunk/test/CodeGen/X86/bitcast-int-to-vector-bool-sext.ll
>                     
>               llvm/trunk/test/CodeGen/X86/bitcast-int-to-vector-bool-zext.ll
>                       llvm/trunk/test/CodeGen/X86/buildvec-insertvec.ll
>                       llvm/trunk/test/CodeGen/X86/combine-fcopysign.ll
>                       llvm/trunk/test/CodeGen/X86/complex-fastmath.ll
>                       llvm/trunk/test/CodeGen/X86/divide-by-constant.ll
>                       llvm/trunk/test/CodeGen/X86/fmaxnum.ll
>                       llvm/trunk/test/CodeGen/X86/fminnum.ll
>                       llvm/trunk/test/CodeGen/X86/fp128-i128.ll
>                       llvm/trunk/test/CodeGen/X86/haddsub-2.ll
>                       llvm/trunk/test/CodeGen/X86/haddsub-undef.ll
>                       llvm/trunk/test/CodeGen/X86/half.ll
>                       llvm/trunk/test/CodeGen/X86/inline-asm-fpstack.ll
>                       llvm/trunk/test/CodeGen/X86/ipra-local-linkage.ll
>                       llvm/trunk/test/CodeGen/X86/localescape.ll
>                       llvm/trunk/test/CodeGen/X86/mul-i1024.ll
>                       llvm/trunk/test/CodeGen/X86/mul-i512.ll
>                       llvm/trunk/test/CodeGen/X86/mul128.ll
>                       llvm/trunk/test/CodeGen/X86/pmul.ll
>                       llvm/trunk/test/CodeGen/X86/powi.ll
>                       llvm/trunk/test/CodeGen/X86/pr11334.ll
>                       llvm/trunk/test/CodeGen/X86/pr29112.ll
>                       llvm/trunk/test/CodeGen/X86/psubus.ll
>                       llvm/trunk/test/CodeGen/X86/select.ll
>                       llvm/trunk/test/CodeGen/X86/shrink-wrap-chkstk.ll
>                       llvm/trunk/test/CodeGen/X86/sqrt-fastmath.ll
>                       llvm/trunk/test/CodeGen/X86/sse-scalar-fp-arith.ll
>                       llvm/trunk/test/CodeGen/X86/sse1.ll
>                       llvm/trunk/test/CodeGen/X86/sse3-avx-addsub-2.ll
>                       llvm/trunk/test/CodeGen/X86/statepoint-live-in.ll
>                       llvm/trunk/test/CodeGen/X86/statepoint-stack-usage.ll
>                       llvm/trunk/test/CodeGen/X86/vec_fp_to_int.ll
>                       llvm/trunk/test/CodeGen/X86/vec_int_to_fp.ll
>                       llvm/trunk/test/CodeGen/X86/vec_minmax_sint.ll
>                       llvm/trunk/test/CodeGen/X86/vec_shift4.ll
>                       llvm/trunk/test/CodeGen/X86/vector-blend.ll
>                       llvm/trunk/test/CodeGen/X86/vector-idiv-sdiv-128.ll
>                       llvm/trunk/test/CodeGen/X86/vector-idiv-udiv-128.ll
>                       llvm/trunk/test/CodeGen/X86/vector-rotate-128.ll
>                       llvm/trunk/test/CodeGen/X86/vector-sext.ll
>                       llvm/trunk/test/CodeGen/X86/vector-shift-ashr-128.ll
>                       llvm/trunk/test/CodeGen/X86/vector-shift-lshr-128.ll
>                       llvm/trunk/test/CodeGen/X86/vector-shift-shl-128.ll
>                     
>               llvm/trunk/test/CodeGen/X86/vector-shuffle-combining.ll
>                       llvm/trunk/test/CodeGen/X86/vector-trunc-math.ll
>                       llvm/trunk/test/CodeGen/X86/vector-zext.ll
>                       llvm/trunk/test/CodeGen/X86/vselect-minmax.ll
>                       llvm/trunk/test/CodeGen/X86/widen_conv-3.ll
>                       llvm/trunk/test/CodeGen/X86/widen_conv-4.ll
>                       llvm/trunk/test/CodeGen/X86/x86-shrink-wrap-unwind.ll
>                       llvm/trunk/test/CodeGen/X86/x86-shrink-wrapping.ll
> 
>                  Modified: llvm/trunk/include/llvm/CodeGen/Passes.h
>                  URL:
>             http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/CodeGen/Passes.h?rev=311038&r1=311037&r2=311038&view=diff
>             <http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/CodeGen/Passes.h?rev=311038&r1=311037&r2=311038&view=diff>
>                 
>             <http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/CodeGen/Passes.h?rev=311038&r1=311037&r2=311038&view=diff
>             <http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/CodeGen/Passes.h?rev=311038&r1=311037&r2=311038&view=diff>>
>                 
>             ==============================================================================
>                  --- llvm/trunk/include/llvm/CodeGen/Passes.h (original)
>                  +++ llvm/trunk/include/llvm/CodeGen/Passes.h Wed Aug 16
>             13:50:01 2017
>                  @@ -278,6 +278,11 @@ namespace llvm {
>                      /// MachineSinking - This pass performs sinking on
>             machine
>                  instructions.
>                      extern char &MachineSinkingID;
> 
>                  +  /// MachineCopyPropagationPreRegRewrite - This pass
>             performs copy
>                  propagation
>                  +  /// on machine instructions after register
>             allocation but before
>                  virtual
>                  +  /// register re-writing..
>                  +  extern char &MachineCopyPropagationPreRegRewriteID;
>                  +
>                      /// MachineCopyPropagation - This pass performs
>             copy propagation on
>                      /// machine instructions.
>                      extern char &MachineCopyPropagationID;
> 
>                  Modified: llvm/trunk/include/llvm/InitializePasses.h
>                  URL:
>             http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/InitializePasses.h?rev=311038&r1=311037&r2=311038&view=diff
>             <http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/InitializePasses.h?rev=311038&r1=311037&r2=311038&view=diff>
>                 
>             <http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/InitializePasses.h?rev=311038&r1=311037&r2=311038&view=diff
>             <http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/InitializePasses.h?rev=311038&r1=311037&r2=311038&view=diff>>
>                 
>             ==============================================================================
>                  --- llvm/trunk/include/llvm/InitializePasses.h (original)
>                  +++ llvm/trunk/include/llvm/InitializePasses.h Wed Aug
>             16 13:50:01 2017
>                  @@ -233,6 +233,7 @@ void
>             initializeMachineBranchProbabilityI
>                    void initializeMachineCSEPass(PassRegistry&);
>                    void initializeMachineCombinerPass(PassRegistry&);
>                    void initializeMachineCopyPropagationPass(PassRegistry&);
>                  +void
>             initializeMachineCopyPropagationPreRegRewritePass(PassRegistry&);
>                    void
>             initializeMachineDominanceFrontierPass(PassRegistry&);
>                    void initializeMachineDominatorTreePass(PassRegistry&);
>                    void
>             initializeMachineFunctionPrinterPassPass(PassRegistry&);
> 
>                  Modified: llvm/trunk/lib/CodeGen/CodeGen.cpp
>                  URL:
>             http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/CodeGen.cpp?rev=311038&r1=311037&r2=311038&view=diff
>             <http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/CodeGen.cpp?rev=311038&r1=311037&r2=311038&view=diff>
>                 
>             <http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/CodeGen.cpp?rev=311038&r1=311037&r2=311038&view=diff
>             <http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/CodeGen.cpp?rev=311038&r1=311037&r2=311038&view=diff>>
>                 
>             ==============================================================================
>                  --- llvm/trunk/lib/CodeGen/CodeGen.cpp (original)
>                  +++ llvm/trunk/lib/CodeGen/CodeGen.cpp Wed Aug 16
>             13:50:01 2017
>                  @@ -54,6 +54,7 @@ void llvm::initializeCodeGen(PassRegistr
>                      initializeMachineCSEPass(Registry);
>                      initializeMachineCombinerPass(Registry);
>                      initializeMachineCopyPropagationPass(Registry);
>                  + 
>             initializeMachineCopyPropagationPreRegRewritePass(Registry);
>                      initializeMachineDominatorTreePass(Registry);
>                      initializeMachineFunctionPrinterPassPass(Registry);
>                      initializeMachineLICMPass(Registry);
> 
>                  Modified: llvm/trunk/lib/CodeGen/MachineCopyPropagation.cpp
>                  URL:
>             http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/MachineCopyPropagation.cpp?rev=311038&r1=311037&r2=311038&view=diff
>             <http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/MachineCopyPropagation.cpp?rev=311038&r1=311037&r2=311038&view=diff>
>                 
>             <http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/MachineCopyPropagation.cpp?rev=311038&r1=311037&r2=311038&view=diff
>             <http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/MachineCopyPropagation.cpp?rev=311038&r1=311037&r2=311038&view=diff>>
>                 
>             ==============================================================================
>                  --- llvm/trunk/lib/CodeGen/MachineCopyPropagation.cpp
>             (original)
>                  +++ llvm/trunk/lib/CodeGen/MachineCopyPropagation.cpp
>             Wed Aug 16
>                  13:50:01 2017
>                  @@ -7,18 +7,62 @@
>                    //
>                   
>             //===----------------------------------------------------------------------===//
>                    //
>                  -// This is an extremely simple MachineInstr-level copy
>             propagation
>                  pass.
>                  +// This is a simple MachineInstr-level copy forwarding
>             pass.  It
>                  may be run at
>                  +// two places in the codegen pipeline:
>                  +//   - After register allocation but before virtual
>             registers have
>                  been remapped
>                  +//     to physical registers.
>                  +//   - After physical register remapping.
>                  +//
>                  +// The optimizations done vary slightly based on
>             whether virtual
>                  registers are
>                  +// still present.  In both cases, this pass forwards
>             the source of
>                  COPYs to the
>                  +// users of their destinations when doing so is
>             legal.  For example:
>                  +//
>                  +//   %vreg1 = COPY %vreg0
>                  +//   ...
>                  +//   ... = OP %vreg1
>                  +//
>                  +// If
>                  +//   - the physical register assigned to %vreg0 has
>             not been
>                  clobbered by the
>                  +//     time of the use of %vreg1
>                  +//   - the register class constraints are satisfied
>                  +//   - the COPY def is the only value that reaches OP
>                  +// then this pass replaces the above with:
>                  +//
>                  +//   %vreg1 = COPY %vreg0
>                  +//   ...
>                  +//   ... = OP %vreg0
>                  +//
>                  +// and updates the relevant state required by
>             VirtRegMap (e.g.
>                  LiveIntervals).
>                  +// COPYs whose LiveIntervals become dead as a result
>             of this
>                  forwarding (i.e. if
>                  +// all uses of %vreg1 are changed to %vreg0) are removed.
>                  +//
>                  +// When being run with only physical registers, this
>             pass will also
>                  remove some
>                  +// redundant COPYs.  For example:
>                  +//
>                  +//    %R1 = COPY %R0
>                  +//    ... // No clobber of %R1
>                  +//    %R0 = COPY %R1 <<< Removed
>                  +//
>                  +// or
>                  +//
>                  +//    %R1 = COPY %R0
>                  +//    ... // No clobber of %R0
>                  +//    %R1 = COPY %R0 <<< Removed
>                    //
>                   
>             //===----------------------------------------------------------------------===//
> 
>                  +#include "LiveDebugVariables.h"
>                    #include "llvm/ADT/DenseMap.h"
>                    #include "llvm/ADT/SetVector.h"
>                    #include "llvm/ADT/SmallVector.h"
>                    #include "llvm/ADT/Statistic.h"
>                  +#include "llvm/CodeGen/LiveRangeEdit.h"
>                  +#include "llvm/CodeGen/LiveStackAnalysis.h"
>                    #include "llvm/CodeGen/MachineFunction.h"
>                    #include "llvm/CodeGen/MachineFunctionPass.h"
>                    #include "llvm/CodeGen/MachineRegisterInfo.h"
>                    #include "llvm/CodeGen/Passes.h"
>                  +#include "llvm/CodeGen/VirtRegMap.h"
>                    #include "llvm/Pass.h"
>                    #include "llvm/Support/Debug.h"
>                    #include "llvm/Support/raw_ostream.h"
>                  @@ -30,24 +74,48 @@ using namespace llvm;
>                    #define DEBUG_TYPE "machine-cp"
> 
>                    STATISTIC(NumDeletes, "Number of dead copies deleted");
>                  +STATISTIC(NumCopyForwards, "Number of copy uses
>             forwarded");
> 
>                    namespace {
>                      typedef SmallVector<unsigned, 4> RegList;
>                      typedef DenseMap<unsigned, RegList> SourceMap;
>                      typedef DenseMap<unsigned, MachineInstr*> Reg2MIMap;
> 
>                  -  class MachineCopyPropagation : public
>             MachineFunctionPass {
>                  +  class MachineCopyPropagation : public
>             MachineFunctionPass,
>                  +                                 private
>             LiveRangeEdit::Delegate {
>                        const TargetRegisterInfo *TRI;
>                        const TargetInstrInfo *TII;
>                  -    const MachineRegisterInfo *MRI;
>                  +    MachineRegisterInfo *MRI;
>                  +    MachineFunction *MF;
>                  +    SlotIndexes *Indexes;
>                  +    LiveIntervals *LIS;
>                  +    const VirtRegMap *VRM;
>                  +    // True if this pass being run before virtual
>             registers are
>                  remapped to
>                  +    // physical ones.
>                  +    bool PreRegRewrite;
>                  +    bool NoSubRegLiveness;
>                  +
>                  +  protected:
>                  +    MachineCopyPropagation(char &ID, bool PreRegRewrite)
>                  +        : MachineFunctionPass(ID),
>             PreRegRewrite(PreRegRewrite) {}
> 
>                      public:
>                        static char ID; // Pass identification,
>             replacement for typeid
>                  -    MachineCopyPropagation() : MachineFunctionPass(ID) {
>                  +    MachineCopyPropagation() :
>             MachineCopyPropagation(ID, false) {
>                             
>             initializeMachineCopyPropagationPass(*PassRegistry::getPassRegistry());
>                        }
> 
>                        void getAnalysisUsage(AnalysisUsage &AU) const
>             override {
>                  +      if (PreRegRewrite) {
>                  +        AU.addRequired<SlotIndexes>();
>                  +        AU.addPreserved<SlotIndexes>();
>                  +        AU.addRequired<LiveIntervals>();
>                  +        AU.addPreserved<LiveIntervals>();
>                  +        AU.addRequired<VirtRegMap>();
>                  +        AU.addPreserved<VirtRegMap>();
>                  +        AU.addPreserved<LiveDebugVariables>();
>                  +        AU.addPreserved<LiveStacks>();
>                  +      }
>                          AU.setPreservesCFG();
>                          MachineFunctionPass::getAnalysisUsage(AU);
>                        }
>                  @@ -55,6 +123,10 @@ namespace {
>                        bool runOnMachineFunction(MachineFunction &MF)
>             override;
> 
>                        MachineFunctionProperties getRequiredProperties()
>             const override {
>                  +      if (PreRegRewrite)
>                  +        return MachineFunctionProperties()
>                  +           
>             .set(MachineFunctionProperties::Property::NoPHIs)
>                  +           
>             .set(MachineFunctionProperties::Property::TracksLiveness);
>                          return MachineFunctionProperties().set(
>                              MachineFunctionProperties::Property::NoVRegs);
>                        }
>                  @@ -64,6 +136,28 @@ namespace {
>                        void ReadRegister(unsigned Reg);
>                        void CopyPropagateBlock(MachineBasicBlock &MBB);
>                        bool eraseIfRedundant(MachineInstr &Copy,
>             unsigned Src,
>                  unsigned Def);
>                  +    unsigned getPhysReg(unsigned Reg, unsigned SubReg);
>                  +    unsigned getPhysReg(const MachineOperand &Opnd) {
>                  +      return getPhysReg(Opnd.getReg(), Opnd.getSubReg());
>                  +    }
>                  +    unsigned getFullPhysReg(const MachineOperand &Opnd) {
>                  +      return getPhysReg(Opnd.getReg(), 0);
>                  +    }
>                  +    void forwardUses(MachineInstr &MI);
>                  +    bool isForwardableRegClassCopy(const MachineInstr
>             &Copy,
>                  +                                   const MachineInstr
>             &UseI);
>                  +    std::tuple<unsigned, unsigned, bool>
>                  +    checkUseSubReg(const MachineOperand &CopySrc, const
>                  MachineOperand &MOUse);
>                  +    bool hasImplicitOverlap(const MachineInstr &MI, const
>                  MachineOperand &Use);
>                  +    void narrowRegClass(const MachineInstr &MI, const
>                  MachineOperand &MOUse,
>                  +                        unsigned NewUseReg, unsigned
>             NewUseSubReg);
>                  +    void updateForwardedCopyLiveInterval(const
>             MachineInstr &Copy,
>                  +                                         const
>             MachineInstr &UseMI,
>                  +                                         unsigned
>             OrigUseReg,
>                  +                                         unsigned
>             NewUseReg,
>                  +                                         unsigned
>             NewUseSubReg);
>                  +    /// LiveRangeEdit callback for eliminateDeadDefs().
>                  +    void LRE_WillEraseInstruction(MachineInstr *MI)
>             override;
> 
>                        /// Candidates for deletion.
>                        SmallSetVector<MachineInstr*, 8> MaybeDeadCopies;
>                  @@ -75,6 +169,15 @@ namespace {
>                        SourceMap SrcMap;
>                        bool Changed;
>                      };
>                  +
>                  +  class MachineCopyPropagationPreRegRewrite : public
>                  MachineCopyPropagation {
>                  +  public:
>                  +    static char ID; // Pass identification,
>             replacement for typeid
>                  +    MachineCopyPropagationPreRegRewrite()
>                  +        : MachineCopyPropagation(ID, true) {
>                  +       
>               initializeMachineCopyPropagationPreRegRewritePass(*PassRegistry::getPassRegistry());
>                  +    }
>                  +  };
>                    }
>                    char MachineCopyPropagation::ID = 0;
>                    char &llvm::MachineCopyPropagationID =
>             MachineCopyPropagation::ID;
>                  @@ -82,6 +185,29 @@ char
>             &llvm::MachineCopyPropagationID = M
>                    INITIALIZE_PASS(MachineCopyPropagation, DEBUG_TYPE,
>                                    "Machine Copy Propagation Pass",
>             false, false)
> 
>                  +/// We have two separate passes that are very similar,
>             the only
>                  difference being
>                  +/// where they are meant to be run in the pipeline. 
>             This is done
>                  for several
>                  +/// reasons:
>                  +/// - the two passes have different dependencies
>                  +/// - some targets want to disable the later run of
>             this pass, but
>                  not the
>                  +///   earlier one (e.g. NVPTX and WebAssembly)
>                  +/// - it allows for easier debugging via llc
>                  +
>                  +char MachineCopyPropagationPreRegRewrite::ID = 0;
>                  +char &llvm::MachineCopyPropagationPreRegRewriteID =
>                  MachineCopyPropagationPreRegRewrite::ID;
>                  +
>                  +INITIALIZE_PASS_BEGIN(MachineCopyPropagationPreRegRewrite,
>                  +                      "machine-cp-prerewrite",
>                  +                      "Machine Copy Propagation
>             Pre-Register
>                  Rewrite Pass",
>                  +                      false, false)
>                  +INITIALIZE_PASS_DEPENDENCY(SlotIndexes)
>                  +INITIALIZE_PASS_DEPENDENCY(LiveIntervals)
>                  +INITIALIZE_PASS_DEPENDENCY(VirtRegMap)
>                  +INITIALIZE_PASS_END(MachineCopyPropagationPreRegRewrite,
>                  +                    "machine-cp-prerewrite",
>                  +                    "Machine Copy Propagation
>             Pre-Register Rewrite
>                  Pass", false,
>                  +                    false)
>                  +
>                    /// Remove any entry in \p Map where the register is
>             a subregister
>                  or equal to
>                    /// a register contained in \p Regs.
>                    static void removeRegsFromMap(Reg2MIMap &Map, const
>             RegList &Regs,
>                  @@ -122,6 +248,10 @@ void
>             MachineCopyPropagation::ClobberRegi
>                    }
> 
>                    void MachineCopyPropagation::ReadRegister(unsigned Reg) {
>                  +  // We don't track MaybeDeadCopies when running
>             pre-VirtRegRewriter.
>                  +  if (PreRegRewrite)
>                  +    return;
>                  +
>                      // If 'Reg' is defined by a copy, the copy is no
>             longer a candidate
>                      // for elimination.
>                      for (MCRegAliasIterator AI(Reg, TRI, true);
>             AI.isValid(); ++AI) {
>                  @@ -153,6 +283,46 @@ static bool isNopCopy(const
>             MachineInstr
>                      return SubIdx == TRI->getSubRegIndex(PreviousDef, Def);
>                    }
> 
>                  +/// Return the physical register assigned to \p Reg if
>             it is a
>                  virtual register,
>                  +/// otherwise just return the physical reg from the
>             operand itself.
>                  +///
>                  +/// If \p SubReg is 0 then return the full physical
>             register
>                  assigned to the
>                  +/// virtual register ignoring subregs.  If we aren't
>             tracking
>                  sub-reg liveness
>                  +/// then we need to use this to be more conservative
>             with clobbers
>                  by killing
>                  +/// all super reg and their sub reg COPYs as well. 
>             This is to
>                  prevent COPY
>                  +/// forwarding in cases like the following:
>                  +///
>                  +///    %vreg2 = COPY %vreg1:sub1
>                  +///    %vreg3 = COPY %vreg1:sub0
>                  +///    ...    = OP1 %vreg2
>                  +///    ...    = OP2 %vreg3
>                  +///
>                  +/// After forward %vreg2 (assuming this is the last
>             use of %vreg1) and
>                  +/// VirtRegRewriter adding kill markers we have:
>                  +///
>                  +///    %vreg3 = COPY %vreg1:sub0
>                  +///    ...    = OP1 %vreg1:sub1<kill>
>                  +///    ...    = OP2 %vreg3
>                  +///
>                  +/// If %vreg3 is assigned to a sub-reg of %vreg1, then
>             after
>                  rewriting we have:
>                  +///
>                  +///    ...     = OP1 R0:sub1, R0<imp-use,kill>
>                  +///    ...     = OP2 R0:sub0
>                  +///
>                  +/// and the use of R0 by OP2 will not have a valid
>             definition.
>                  +unsigned MachineCopyPropagation::getPhysReg(unsigned
>             Reg, unsigned
>                  SubReg) {
>                  +
>                  +  // Physical registers cannot have subregs.
>                  +  if (!TargetRegisterInfo::isVirtualRegister(Reg))
>                  +    return Reg;
>                  +
>                  +  assert(PreRegRewrite && "Unexpected virtual register
>             encountered");
>                  +  Reg = VRM->getPhys(Reg);
>                  +  if (SubReg && !NoSubRegLiveness)
>                  +    Reg = TRI->getSubReg(Reg, SubReg);
>                  +  return Reg;
>                  +}
>                  +
>                    /// Remove instruction \p Copy if there exists a
>             previous copy
>                  that copies the
>                    /// register \p Src to the register \p Def; This may
>             happen
>                  indirectly by
>                    /// copying the super registers.
>                  @@ -190,6 +360,325 @@ bool
>             MachineCopyPropagation::eraseIfRedu
>                      return true;
>                    }
> 
>                  +
>                  +/// Decide whether we should forward the destination
>             of \param Copy
>                  to its use
>                  +/// in \param UseI based on the register class of the Copy
>                  operands.  Same-class
>                  +/// COPYs are always accepted by this function, but
>             cross-class
>                  COPYs are only
>                  +/// accepted if they are forwarded to another COPY
>             with the operand
>                  register
>                  +/// classes reversed.  For example:
>                  +///
>                  +///   RegClassA = COPY RegClassB  // Copy parameter
>                  +///   ...
>                  +///   RegClassB = COPY RegClassA  // UseI parameter
>                  +///
>                  +/// which after forwarding becomes
>                  +///
>                  +///   RegClassA = COPY RegClassB
>                  +///   ...
>                  +///   RegClassB = COPY RegClassB
>                  +///
>                  +/// so we have reduced the number of cross-class COPYs
>             and potentially
>                  +/// introduced a no COPY that can be removed.
>                  +bool MachineCopyPropagation::isForwardableRegClassCopy(
>                  +    const MachineInstr &Copy, const MachineInstr &UseI) {
>                  +  auto isCross = [&](const MachineOperand &Dst, const
>                  MachineOperand &Src) {
>                  +    unsigned DstReg = Dst.getReg();
>                  +    unsigned SrcPhysReg = getPhysReg(Src);
>                  +    const TargetRegisterClass *DstRC;
>                  +    if (TargetRegisterInfo::isVirtualRegister(DstReg)) {
>                  +      DstRC = MRI->getRegClass(DstReg);
>                  +      unsigned DstSubReg = Dst.getSubReg();
>                  +      if (DstSubReg)
>                  +        SrcPhysReg = TRI->getMatchingSuperReg(SrcPhysReg,
>                  DstSubReg, DstRC);
>                  +    } else
>                  +      DstRC = TRI->getMinimalPhysRegClass(DstReg);
>                  +
>                  +    return !DstRC->contains(SrcPhysReg);
>                  +  };
>                  +
>                  +  const MachineOperand &CopyDst = Copy.getOperand(0);
>                  +  const MachineOperand &CopySrc = Copy.getOperand(1);
>                  +
>                  +  if (!isCross(CopyDst, CopySrc))
>                  +    return true;
>                  +
>                  +  if (!UseI.isCopy())
>                  +    return false;
>                  +
>                  +  assert(getFullPhysReg(UseI.getOperand(1)) ==
>                  getFullPhysReg(CopyDst));
>                  +  return !isCross(UseI.getOperand(0), CopySrc);
>                  +}
>                  +
>                  +/// Check that the subregs on the copy source operand
>             (\p CopySrc)
>                  and the use
>                  +/// operand to be forwarded to (\p MOUse) are
>             compatible with doing the
>                  +/// forwarding.  Also computes the new register and
>             subregister to
>                  be used in
>                  +/// the forwarded-to instruction.
>                  +std::tuple<unsigned, unsigned, bool>
>                  MachineCopyPropagation::checkUseSubReg(
>                  +    const MachineOperand &CopySrc, const
>             MachineOperand &MOUse) {
>                  +  unsigned NewUseReg = CopySrc.getReg();
>                  +  unsigned NewUseSubReg;
>                  +
>                  +  if (TargetRegisterInfo::isPhysicalRegister(NewUseReg)) {
>                  +    // If MOUse is a virtual reg, we need to apply it
>             to the new
>                  physical reg
>                  +    // we're going to replace it with.
>                  +    if (MOUse.getSubReg())
>                  +      NewUseReg = TRI->getSubReg(NewUseReg,
>             MOUse.getSubReg());
>                  +    // If the original use subreg isn't valid on the
>             new src reg,
>                  we can't
>                  +    // forward it here.
>                  +    if (!NewUseReg)
>                  +      return std::make_tuple(0, 0, false);
>                  +    NewUseSubReg = 0;
>                  +  } else {
>                  +    // %v1 = COPY %v2:sub1
>                  +    //    USE %v1:sub2
>                  +    // The new use is %v2:sub1:sub2
>                  +    NewUseSubReg =
>                  +        TRI->composeSubRegIndices(CopySrc.getSubReg(),
>                  MOUse.getSubReg());
>                  +    // Check that NewUseSubReg is valid on NewUseReg
>                  +    if (NewUseSubReg &&
>                  +       
>             !TRI->getSubClassWithSubReg(MRI->getRegClass(NewUseReg),
>                  NewUseSubReg))
>                  +      return std::make_tuple(0, 0, false);
>                  +  }
>                  +
>                  +  return std::make_tuple(NewUseReg, NewUseSubReg, true);
>                  +}
>                  +
>                  +/// Check that \p MI does not have implicit uses that
>             overlap with
>                  it's \p Use
>                  +/// operand (the register being replaced), since these
>             can sometimes be
>                  +/// implicitly tied to other operands.  For example,
>             on AMDGPU:
>                  +///
>                  +/// V_MOVRELS_B32_e32 %VGPR2, %M0<imp-use>,
>             %EXEC<imp-use>,
>                  %VGPR2_VGPR3_VGPR4_VGPR5<imp-use>
>                  +///
>                  +/// the %VGPR2 is implicitly tied to the larger reg
>             operand, but we
>                  have no
>                  +/// way of knowing we need to update the latter when
>             updating the
>                  former.
>                  +bool MachineCopyPropagation::hasImplicitOverlap(const
>             MachineInstr &MI,
>                  +                                                const
>                  MachineOperand &Use) {
>                  +  if
>             (!TargetRegisterInfo::isPhysicalRegister(Use.getReg()))
>                  +    return false;
>                  +
>                  +  for (const MachineOperand &MIUse : MI.uses())
>                  +    if (&MIUse != &Use && MIUse.isReg() &&
>             MIUse.isImplicit() &&
>                  +        TRI->regsOverlap(Use.getReg(), MIUse.getReg()))
>                  +      return true;
>                  +
>                  +  return false;
>                  +}
>                  +
>                  +/// Narrow the register class of the forwarded vreg so
>             it matches any
>                  +/// instruction constraints.  \p MI is the instruction
>             being
>                  forwarded to. \p
>                  +/// MOUse is the operand being replaced in \p MI
>             (which hasn't yet
>                  been updated
>                  +/// at the time this function is called).  \p
>             NewUseReg and \p
>                  NewUseSubReg are
>                  +/// what the \p MOUse will be changed to after forwarding.
>                  +///
>                  +/// If we are forwarding
>                  +///    A:RCA = COPY B:RCB
>                  +/// into
>                  +///    ... = OP A:RCA
>                  +///
>                  +/// then we need to narrow the register class of B so
>             that it is a
>                  subclass
>                  +/// of RCA so that it meets the instruction register
>             class constraints.
>                  +void MachineCopyPropagation::narrowRegClass(const
>             MachineInstr &MI,
>                  +                                            const
>             MachineOperand
>                  &MOUse,
>                  +                                            unsigned
>             NewUseReg,
>                  +                                            unsigned
>             NewUseSubReg) {
>                  +  if (!TargetRegisterInfo::isVirtualRegister(NewUseReg))
>                  +    return;
>                  +
>                  +  // Make sure the virtual reg class allows the subreg.
>                  +  if (NewUseSubReg) {
>                  +    const TargetRegisterClass *CurUseRC =
>             MRI->getRegClass(NewUseReg);
>                  +    const TargetRegisterClass *NewUseRC =
>                  +        TRI->getSubClassWithSubReg(CurUseRC,
>             NewUseSubReg);
>                  +    if (CurUseRC != NewUseRC) {
>                  +      DEBUG(dbgs() << "MCP: Setting regclass of " <<
>                  PrintReg(NewUseReg, TRI)
>                  +                   << " to " <<
>             TRI->getRegClassName(NewUseRC) <<
>                  "\n");
>                  +      MRI->setRegClass(NewUseReg, NewUseRC);
>                  +    }
>                  +  }
>                  +
>                  +  unsigned MOUseOpNo = &MOUse - &MI.getOperand(0);
>                  +  const TargetRegisterClass *InstRC =
>                  +      TII->getRegClass(MI.getDesc(), MOUseOpNo, TRI, *MF);
>                  +  if (InstRC) {
>                  +    const TargetRegisterClass *CurUseRC =
>             MRI->getRegClass(NewUseReg);
>                  +    if (NewUseSubReg)
>                  +      InstRC = TRI->getMatchingSuperRegClass(CurUseRC,
>             InstRC,
>                  NewUseSubReg);
>                  +    if (!InstRC->hasSubClassEq(CurUseRC)) {
>                  +      const TargetRegisterClass *NewUseRC =
>                  +          TRI->getCommonSubClass(InstRC, CurUseRC);
>                  +      DEBUG(dbgs() << "MCP: Setting regclass of " <<
>                  PrintReg(NewUseReg, TRI)
>                  +                   << " to " <<
>             TRI->getRegClassName(NewUseRC) <<
>                  "\n");
>                  +      MRI->setRegClass(NewUseReg, NewUseRC);
>                  +    }
>                  +  }
>                  +}
>                  +
>                  +/// Update the LiveInterval information to reflect the
>             destination
>                  of \p Copy
>                  +/// being forwarded to a use in \p UseMI.  \p
>             OrigUseReg is the
>                  register being
>                  +/// forwarded through. It should be the destination
>             register of \p
>                  Copy and has
>                  +/// already been replaced in \p UseMI at the point
>             this function is
>                  called.  \p
>                  +/// NewUseReg and \p NewUseSubReg are the register and
>             subregister
>                  being
>                  +/// forwarded.  They should be the source register of
>             the \p Copy
>                  and should be
>                  +/// the value of the \p UseMI operand being forwarded
>             at the point
>                  this function
>                  +/// is called.
>                  +void
>             MachineCopyPropagation::updateForwardedCopyLiveInterval(
>                  +    const MachineInstr &Copy, const MachineInstr
>             &UseMI, unsigned
>                  OrigUseReg,
>                  +    unsigned NewUseReg, unsigned NewUseSubReg) {
>                  +
>                  +  assert(TRI->isSubRegisterEq(getPhysReg(OrigUseReg, 0),
>                  +                             
>             getFullPhysReg(Copy.getOperand(0))) &&
>                  +         "OrigUseReg mismatch");
>                  + 
>             assert(TRI->isSubRegisterEq(getFullPhysReg(Copy.getOperand(1)),
>                  +                              getPhysReg(NewUseReg, 0)) &&
>                  +         "NewUseReg mismatch");
>                  +
>                  +  // Extend live range starting from COPY
>             early-clobber slot, since
>                  that
>                  +  // is where the original src live range ends.
>                  +  SlotIndex CopyUseIdx =
>                  +      Indexes->getInstructionIndex(Copy).getRegSlot(true
>                  /*=EarlyClobber*/);
>                  +  SlotIndex UseIdx =
>             Indexes->getInstructionIndex(UseMI).getRegSlot();
>                  +  if (TargetRegisterInfo::isVirtualRegister(NewUseReg)) {
>                  +    LiveInterval &LI = LIS->getInterval(NewUseReg);
>                  +    LI.extendInBlock(CopyUseIdx, UseIdx);
>                  +    LaneBitmask UseMask =
>             TRI->getSubRegIndexLaneMask(NewUseSubReg);
>                  +    for (auto &S : LI.subranges())
>                  +      if ((S.LaneMask & UseMask).any() &&
>             S.find(CopyUseIdx))
>                  +        S.extendInBlock(CopyUseIdx, UseIdx);
>                  +  } else {
>                  +    assert(NewUseSubReg == 0 && "Unexpected subreg on
>             physical
>                  register!");
>                  +    for (MCRegUnitIterator UI(NewUseReg, TRI);
>             UI.isValid(); ++UI) {
>                  +      LiveRange &LR = LIS->getRegUnit(*UI);
>                  +      LR.extendInBlock(CopyUseIdx, UseIdx);
>                  +    }
>                  +  }
>                  +
>                  +  if (!TargetRegisterInfo::isVirtualRegister(OrigUseReg))
>                  +    return;
>                  +
>                  +  LiveInterval &LI = LIS->getInterval(OrigUseReg);
>                  +
>                  +  // Can happen for undef uses.
>                  +  if (LI.empty())
>                  +    return;
>                  +
>                  +  SlotIndex UseIndex =
>             Indexes->getInstructionIndex(UseMI);
>                  +  const LiveRange::Segment *UseSeg =
>             LI.getSegmentContaining(UseIndex);
>                  +
>                  +  // Only shrink if forwarded use is the end of a segment.
>                  +  if (UseSeg->end != UseIndex.getRegSlot())
>                  +    return;
>                  +
>                  +  SmallVector<MachineInstr *, 4> DeadInsts;
>                  +  LIS->shrinkToUses(&LI, &DeadInsts);
>                  +  if (!DeadInsts.empty()) {
>                  +    SmallVector<unsigned, 8> NewRegs;
>                  +    LiveRangeEdit(nullptr, NewRegs, *MF, *LIS,
>             nullptr, this)
>                  +        .eliminateDeadDefs(DeadInsts);
>                  +  }
>                  +}
>                  +
>                  +void
>             MachineCopyPropagation::LRE_WillEraseInstruction(MachineInstr
>                  *MI) {
>                  +  // Remove this COPY from further consideration for
>             forwarding.
>                  +  ClobberRegister(getFullPhysReg(MI->getOperand(0)));
>                  +  Changed = true;
>                  +}
>                  +
>                  +/// Look for available copies whose destination
>             register is used by
>                  \p MI and
>                  +/// replace the use in \p MI with the copy's source
>             register.
>                  +void MachineCopyPropagation::forwardUses(MachineInstr
>             &MI) {
>                  +  if (AvailCopyMap.empty())
>                  +    return;
>                  +
>                  +  // Look for non-tied explicit vreg uses that have an
>             active COPY
>                  +  // instruction that defines the physical register
>             allocated to them.
>                  +  // Replace the vreg with the source of the active COPY.
>                  +  for (MachineOperand &MOUse : MI.explicit_uses()) {
>                  +    if (!MOUse.isReg() || MOUse.isTied())
>                  +      continue;
>                  +
>                  +    unsigned UseReg = MOUse.getReg();
>                  +    if (!UseReg)
>                  +      continue;
>                  +
>                  +    if (TargetRegisterInfo::isVirtualRegister(UseReg))
>                  +      UseReg = VRM->getPhys(UseReg);
>                  +    else if (MI.isCall() || MI.isReturn() ||
>             MI.isInlineAsm() ||
>                  +             MI.hasUnmodeledSideEffects() ||
>             MI.isDebugValue() ||
>                  MI.isKill())
>                  +      // Some instructions seem to have ABI uses e.g.
>             not marked as
>                  +      // implicit, which can lead to forwarding them
>             when we
>                  shouldn't, so
>                  +      // restrict the types of instructions we forward
>             physical
>                  regs into.
>                  +      continue;
>                  +
>                  +    // Don't forward COPYs via non-allocatable regs
>             since they can have
>                  +    // non-standard semantics.
>                  +    if (!MRI->isAllocatable(UseReg))
>                  +      continue;
>                  +
>                  +    auto CI = AvailCopyMap.find(UseReg);
>                  +    if (CI == AvailCopyMap.end())
>                  +      continue;
>                  +
>                  +    MachineInstr &Copy = *CI->second;
>                  +    MachineOperand &CopyDst = Copy.getOperand(0);
>                  +    MachineOperand &CopySrc = Copy.getOperand(1);
>                  +
>                  +    // Don't forward COPYs that are already NOPs due
>             to register
>                  assignment.
>                  +    if (getPhysReg(CopyDst) == getPhysReg(CopySrc))
>                  +      continue;
>                  +
>                  +    // FIXME: Don't handle partial uses of wider COPYs
>             yet.
>                  +    if (CopyDst.getSubReg() != 0 || UseReg !=
>             getPhysReg(CopyDst))
>                  +      continue;
>                  +
>                  +    // Don't forward COPYs of non-allocatable regs
>             unless they are
>                  constant.
>                  +    unsigned CopySrcReg = CopySrc.getReg();
>                  +    if
>             (TargetRegisterInfo::isPhysicalRegister(CopySrcReg) &&
>                  +        !MRI->isAllocatable(CopySrcReg) &&
>                  !MRI->isConstantPhysReg(CopySrcReg))
>                  +      continue;
>                  +
>                  +    if (!isForwardableRegClassCopy(Copy, MI))
>                  +      continue;
>                  +
>                  +    unsigned NewUseReg, NewUseSubReg;
>                  +    bool SubRegOK;
>                  +    std::tie(NewUseReg, NewUseSubReg, SubRegOK) =
>                  +        checkUseSubReg(CopySrc, MOUse);
>                  +    if (!SubRegOK)
>                  +      continue;
>                  +
>                  +    if (hasImplicitOverlap(MI, MOUse))
>                  +      continue;
>                  +
>                  +    DEBUG(dbgs() << "MCP: Replacing "
>                  +          << PrintReg(MOUse.getReg(), TRI,
>             MOUse.getSubReg())
>                  +          << "\n     with "
>                  +          << PrintReg(NewUseReg, TRI, CopySrc.getSubReg())
>                  +          << "\n     in "
>                  +          << MI
>                  +          << "     from "
>                  +          << Copy);
>                  +
>                  +    narrowRegClass(MI, MOUse, NewUseReg, NewUseSubReg);
>                  +
>                  +    unsigned OrigUseReg = MOUse.getReg();
>                  +    MOUse.setReg(NewUseReg);
>                  +    MOUse.setSubReg(NewUseSubReg);
>                  +
>                  +    DEBUG(dbgs() << "MCP: After replacement: " << MI
>             << "\n");
>                  +
>                  +    if (PreRegRewrite)
>                  +      updateForwardedCopyLiveInterval(Copy, MI,
>             OrigUseReg, NewUseReg,
>                  +                                      NewUseSubReg);
>                  +    else
>                  +      for (MachineInstr &KMI :
>                  +             make_range(Copy.getIterator(),
>                  std::next(MI.getIterator())))
>                  +        KMI.clearRegisterKills(NewUseReg, TRI);
>                  +
>                  +    ++NumCopyForwards;
>                  +    Changed = true;
>                  +  }
>                  +}
>                  +
>                    void
>             MachineCopyPropagation::CopyPropagateBlock(MachineBasicBlock
>                  &MBB) {
>                      DEBUG(dbgs() << "MCP: CopyPropagateBlock " <<
>             MBB.getName() <<
>                  "\n");
> 
>                  @@ -198,12 +687,8 @@ void
>             MachineCopyPropagation::CopyPropaga
>                        ++I;
> 
>                        if (MI->isCopy()) {
>                  -      unsigned Def = MI->getOperand(0).getReg();
>                  -      unsigned Src = MI->getOperand(1).getReg();
>                  -
>                  -     
>             assert(!TargetRegisterInfo::isVirtualRegister(Def) &&
>                  -           
>               !TargetRegisterInfo::isVirtualRegister(Src) &&
>                  -             "MachineCopyPropagation should be run
>             after register
>                  allocation!");
>                  +      unsigned Def = getPhysReg(MI->getOperand(0));
>                  +      unsigned Src = getPhysReg(MI->getOperand(1));
> 
>                          // The two copies cancel out and the source of
>             the first copy
>                          // hasn't been overridden, eliminate the second
>             one. e.g.
>                  @@ -220,8 +705,16 @@ void
>             MachineCopyPropagation::CopyPropaga
>                          //  %ECX<def> = COPY %EAX
>                          // =>
>                          //  %ECX<def> = COPY %EAX
>                  -      if (eraseIfRedundant(*MI, Def, Src) ||
>             eraseIfRedundant(*MI,
>                  Src, Def))
>                  -        continue;
>                  +      if (!PreRegRewrite)
>                  +        if (eraseIfRedundant(*MI, Def, Src) ||
>                  eraseIfRedundant(*MI, Src, Def))
>                  +          continue;
>                  +
>                  +      forwardUses(*MI);
>                  +
>                  +      // Src may have been changed by forwardUses()
>                  +      Src = getPhysReg(MI->getOperand(1));
>                  +      unsigned DefClobber =
>             getFullPhysReg(MI->getOperand(0));
>                  +      unsigned SrcClobber =
>             getFullPhysReg(MI->getOperand(1));
> 
>                          // If Src is defined by a previous copy, the
>             previous copy
>                  cannot be
>                          // eliminated.
>                  @@ -238,7 +731,10 @@ void
>             MachineCopyPropagation::CopyPropaga
>                          DEBUG(dbgs() << "MCP: Copy is a deletion
>             candidate: ";
>                  MI->dump());
> 
>                          // Copy is now a candidate for deletion.
>                  -      if (!MRI->isReserved(Def))
>                  +      // Only look for dead COPYs if we're not running
>             just before
>                  +      // VirtRegRewriter, since presumably these COPYs
>             will have
>                  already been
>                  +      // removed.
>                  +      if (!PreRegRewrite && !MRI->isReserved(Def))
>                            MaybeDeadCopies.insert(MI);
> 
>                          // If 'Def' is previously source of another
>             copy, then this
>                  earlier copy's
>                  @@ -248,11 +744,11 @@ void
>             MachineCopyPropagation::CopyPropaga
>                          // %xmm2<def> = copy %xmm0
>                          // ...
>                          // %xmm2<def> = copy %xmm9
>                  -      ClobberRegister(Def);
>                  +      ClobberRegister(DefClobber);
>                          for (const MachineOperand &MO :
>             MI->implicit_operands()) {
>                            if (!MO.isReg() || !MO.isDef())
>                              continue;
>                  -        unsigned Reg = MO.getReg();
>                  +        unsigned Reg = getFullPhysReg(MO);
>                            if (!Reg)
>                              continue;
>                            ClobberRegister(Reg);
>                  @@ -267,13 +763,27 @@ void
>             MachineCopyPropagation::CopyPropaga
> 
>                          // Remember source that's copied to Def. Once it's
>                  clobbered, then
>                          // it's no longer available for copy propagation.
>                  -      RegList &DestList = SrcMap[Src];
>                  -      if (!is_contained(DestList, Def))
>                  -        DestList.push_back(Def);
>                  +      RegList &DestList = SrcMap[SrcClobber];
>                  +      if (!is_contained(DestList, DefClobber))
>                  +        DestList.push_back(DefClobber);
> 
>                          continue;
>                        }
> 
>                  +    // Clobber any earlyclobber regs first.
>                  +    for (const MachineOperand &MO : MI->operands())
>                  +      if (MO.isReg() && MO.isEarlyClobber()) {
>                  +        unsigned Reg = getFullPhysReg(MO);
>                  +        // If we have a tied earlyclobber, that means
>             it is also
>                  read by this
>                  +        // instruction, so we need to make sure we
>             don't remove it
>                  as dead
>                  +        // later.
>                  +        if (MO.isTied())
>                  +          ReadRegister(Reg);
>                  +        ClobberRegister(Reg);
>                  +      }
>                  +
>                  +    forwardUses(*MI);
>                  +
>                        // Not a copy.
>                        SmallVector<unsigned, 2> Defs;
>                        const MachineOperand *RegMask = nullptr;
>                  @@ -282,14 +792,11 @@ void
>             MachineCopyPropagation::CopyPropaga
>                            RegMask = &MO;
>                          if (!MO.isReg())
>                            continue;
>                  -      unsigned Reg = MO.getReg();
>                  +      unsigned Reg = getFullPhysReg(MO);
>                          if (!Reg)
>                            continue;
> 
>                  -     
>             assert(!TargetRegisterInfo::isVirtualRegister(Reg) &&
>                  -             "MachineCopyPropagation should be run
>             after register
>                  allocation!");
>                  -
>                  -      if (MO.isDef()) {
>                  +      if (MO.isDef() && !MO.isEarlyClobber()) {
>                            Defs.push_back(Reg);
>                            continue;
>                          } else if (MO.readsReg())
>                  @@ -346,6 +853,8 @@ void
>             MachineCopyPropagation::CopyPropaga
>                      // since we don't want to trust live-in lists.
>                      if (MBB.succ_empty()) {
>                        for (MachineInstr *MaybeDead : MaybeDeadCopies) {
>                  +      DEBUG(dbgs() << "MCP: Removing copy due to no
>             live-out succ: ";
>                  +            MaybeDead->dump());
>                         
>             assert(!MRI->isReserved(MaybeDead->getOperand(0).getReg()));
>                          MaybeDead->eraseFromParent();
>                          Changed = true;
>                  @@ -368,10 +877,16 @@ bool
>             MachineCopyPropagation::runOnMachin
>                      TRI = MF.getSubtarget().getRegisterInfo();
>                      TII = MF.getSubtarget().getInstrInfo();
>                      MRI = &MF.getRegInfo();
>                  +  this->MF = &MF;
>                  +  if (PreRegRewrite) {
>                  +    Indexes = &getAnalysis<SlotIndexes>();
>                  +    LIS = &getAnalysis<LiveIntervals>();
>                  +    VRM = &getAnalysis<VirtRegMap>();
>                  +  }
>                  +  NoSubRegLiveness = !MRI->subRegLivenessEnabled();
> 
>                      for (MachineBasicBlock &MBB : MF)
>                        CopyPropagateBlock(MBB);
> 
>                      return Changed;
>                    }
>                  -
> 
>                  Modified: llvm/trunk/lib/CodeGen/TargetPassConfig.cpp
>                  URL:
>             http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/TargetPassConfig.cpp?rev=311038&r1=311037&r2=311038&view=diff
>             <http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/TargetPassConfig.cpp?rev=311038&r1=311037&r2=311038&view=diff>
>                 
>             <http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/TargetPassConfig.cpp?rev=311038&r1=311037&r2=311038&view=diff
>             <http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/TargetPassConfig.cpp?rev=311038&r1=311037&r2=311038&view=diff>>
>                 
>             ==============================================================================
>                  --- llvm/trunk/lib/CodeGen/TargetPassConfig.cpp (original)
>                  +++ llvm/trunk/lib/CodeGen/TargetPassConfig.cpp Wed Aug
>             16 13:50:01 2017
>                  @@ -88,6 +88,8 @@ static cl::opt<bool> DisableCGP("disable
>                        cl::desc("Disable Codegen Prepare"));
>                    static cl::opt<bool>
>             DisableCopyProp("disable-copyprop", cl::Hidden,
>                        cl::desc("Disable Copy Propagation pass"));
>                  +static cl::opt<bool>
>                 
>             DisableCopyPropPreRegRewrite("disable-copyprop-prerewrite",
>             cl::Hidden,
>                  +    cl::desc("Disable Copy Propagation Pre-Register
>             Re-write pass"));
>                    static cl::opt<bool>
>                 
>             DisablePartialLibcallInlining("disable-partial-libcall-inlining",
>                        cl::Hidden, cl::desc("Disable Partial Libcall
>             Inlining"));
>                    static cl::opt<bool> EnableImplicitNullChecks(
>                  @@ -248,6 +250,9 @@ static IdentifyingPassPtr
>             overridePass(A
>                      if (StandardID == &MachineCopyPropagationID)
>                        return applyDisable(TargetID, DisableCopyProp);
> 
>                  +  if (StandardID ==
>             &MachineCopyPropagationPreRegRewriteID)
>                  +    return applyDisable(TargetID,
>             DisableCopyPropPreRegRewrite);
>                  +
>                      return TargetID;
>                    }
> 
>                  @@ -1059,6 +1064,10 @@ void
>             TargetPassConfig::addOptimizedRegAl
>                        // Allow targets to change the register
>             assignments before
>                  rewriting.
>                        addPreRewrite();
> 
>                  +    // Copy propagate to forward register uses and try
>             to eliminate
>                  COPYs that
>                  +    // were not coalesced.
>                  +    addPass(&MachineCopyPropagationPreRegRewriteID);
>                  +
>                        // Finally rewrite virtual registers.
>                        addPass(&VirtRegRewriterID);
> 
> 
>                  Modified:
>             llvm/trunk/test/CodeGen/AArch64/aarch64-fold-lslfast.ll
>                  URL:
>             http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AArch64/aarch64-fold-lslfast.ll?rev=311038&r1=311037&r2=311038&view=diff
>             <http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AArch64/aarch64-fold-lslfast.ll?rev=311038&r1=311037&r2=311038&view=diff>
>                 
>             <http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AArch64/aarch64-fold-lslfast.ll?rev=311038&r1=311037&r2=311038&view=diff
>             <http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AArch64/aarch64-fold-lslfast.ll?rev=311038&r1=311037&r2=311038&view=diff>>
>                 
>             ==============================================================================
>                  ---
>             llvm/trunk/test/CodeGen/AArch64/aarch64-fold-lslfast.ll
>             (original)
>                  +++
>             llvm/trunk/test/CodeGen/AArch64/aarch64-fold-lslfast.ll Wed Aug
>                  16 13:50:01 2017
>                  @@ -9,7 +9,8 @@ define i16 @halfword(%struct.a* %ctx, i3
>                    ; CHECK-LABEL: halfword:
>                    ; CHECK: ubfx [[REG:x[0-9]+]], x1, #9, #8
>                    ; CHECK: ldrh [[REG1:w[0-9]+]],
>             [{{.*}}[[REG2:x[0-9]+]], [[REG]],
>                  lsl #1]
>                  -; CHECK: strh [[REG1]], [{{.*}}[[REG2]], [[REG]], lsl #1]
>                  +; CHECK: mov [[REG3:x[0-9]+]], [[REG2]]
>                  +; CHECK: strh [[REG1]], [{{.*}}[[REG3]], [[REG]], lsl #1]
>                      %shr81 = lshr i32 %xor72, 9
>                      %conv82 = zext i32 %shr81 to i64
>                      %idxprom83 = and i64 %conv82, 255
>                  @@ -24,7 +25,8 @@ define i32 @word(%struct.b* %ctx, i32 %x
>                    ; CHECK-LABEL: word:
>                    ; CHECK: ubfx [[REG:x[0-9]+]], x1, #9, #8
>                    ; CHECK: ldr [[REG1:w[0-9]+]],
>             [{{.*}}[[REG2:x[0-9]+]], [[REG]],
>                  lsl #2]
>                  -; CHECK: str [[REG1]], [{{.*}}[[REG2]], [[REG]], lsl #2]
>                  +; CHECK: mov [[REG3:x[0-9]+]], [[REG2]]
>                  +; CHECK: str [[REG1]], [{{.*}}[[REG3]], [[REG]], lsl #2]
>                      %shr81 = lshr i32 %xor72, 9
>                      %conv82 = zext i32 %shr81 to i64
>                      %idxprom83 = and i64 %conv82, 255
>                  @@ -39,7 +41,8 @@ define i64 @doubleword(%struct.c* %ctx,
>                    ; CHECK-LABEL: doubleword:
>                    ; CHECK: ubfx [[REG:x[0-9]+]], x1, #9, #8
>                    ; CHECK: ldr [[REG1:x[0-9]+]],
>             [{{.*}}[[REG2:x[0-9]+]], [[REG]],
>                  lsl #3]
>                  -; CHECK: str [[REG1]], [{{.*}}[[REG2]], [[REG]], lsl #3]
>                  +; CHECK: mov [[REG3:x[0-9]+]], [[REG2]]
>                  +; CHECK: str [[REG1]], [{{.*}}[[REG3]], [[REG]], lsl #3]
>                      %shr81 = lshr i32 %xor72, 9
>                      %conv82 = zext i32 %shr81 to i64
>                      %idxprom83 = and i64 %conv82, 255
> 
>                  Modified:
>             llvm/trunk/test/CodeGen/AArch64/arm64-AdvSIMD-Scalar.ll
>                  URL:
>             http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AArch64/arm64-AdvSIMD-Scalar.ll?rev=311038&r1=311037&r2=311038&view=diff
>             <http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AArch64/arm64-AdvSIMD-Scalar.ll?rev=311038&r1=311037&r2=311038&view=diff>
>                 
>             <http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AArch64/arm64-AdvSIMD-Scalar.ll?rev=311038&r1=311037&r2=311038&view=diff
>             <http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AArch64/arm64-AdvSIMD-Scalar.ll?rev=311038&r1=311037&r2=311038&view=diff>>
>                 
>             ==============================================================================
>                  ---
>             llvm/trunk/test/CodeGen/AArch64/arm64-AdvSIMD-Scalar.ll
>             (original)
>                  +++
>             llvm/trunk/test/CodeGen/AArch64/arm64-AdvSIMD-Scalar.ll Wed Aug
>                  16 13:50:01 2017
>                  @@ -8,15 +8,9 @@ define <2 x i64> @bar(<2 x i64> %a, <2 x
>                    ; CHECK: add.2d        v[[REG:[0-9]+]], v0, v1
>                    ; CHECK: add   d[[REG3:[0-9]+]], d[[REG]], d1
>                    ; CHECK: sub   d[[REG2:[0-9]+]], d[[REG]], d1
>                  -; Without advanced copy optimization, we end up with
>             cross register
>                  -; banks copies that cannot be coalesced.
>                  -; CHECK-NOOPT: fmov [[COPY_REG3:x[0-9]+]], d[[REG3]]
>                  -; With advanced copy optimization, we end up with just
>             one copy
>                  -; to insert the computed high part into the V register.
>                  -; CHECK-OPT-NOT: fmov
>                  +; CHECK-NOT: fmov
>                    ; CHECK: fmov [[COPY_REG2:x[0-9]+]], d[[REG2]]
>                  -; CHECK-NOOPT: fmov d0, [[COPY_REG3]]
>                  -; CHECK-OPT-NOT: fmov
>                  +; CHECK-NOT: fmov
>                    ; CHECK: ins.d v0[1], [[COPY_REG2]]
>                    ; CHECK-NEXT: ret
>                    ;
>                  @@ -24,11 +18,9 @@ define <2 x i64> @bar(<2 x i64> %a, <2 x
>                    ; GENERIC: add v[[REG:[0-9]+]].2d, v0.2d, v1.2d
>                    ; GENERIC: add d[[REG3:[0-9]+]], d[[REG]], d1
>                    ; GENERIC: sub d[[REG2:[0-9]+]], d[[REG]], d1
>                  -; GENERIC-NOOPT: fmov [[COPY_REG3:x[0-9]+]], d[[REG3]]
>                  -; GENERIC-OPT-NOT: fmov
>                  +; GENERIC-NOT: fmov
>                    ; GENERIC: fmov [[COPY_REG2:x[0-9]+]], d[[REG2]]
>                  -; GENERIC-NOOPT: fmov d0, [[COPY_REG3]]
>                  -; GENERIC-OPT-NOT: fmov
>                  +; GENERIC-NOT: fmov
>                    ; GENERIC: ins v0.d[1], [[COPY_REG2]]
>                    ; GENERIC-NEXT: ret
>                      %add = add <2 x i64> %a, %b
> 
>                  Modified:
>             llvm/trunk/test/CodeGen/AArch64/arm64-zero-cycle-regmov.ll
>                  URL:
>             http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AArch64/arm64-zero-cycle-regmov.ll?rev=311038&r1=311037&r2=311038&view=diff
>             <http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AArch64/arm64-zero-cycle-regmov.ll?rev=311038&r1=311037&r2=311038&view=diff>
>                 
>             <http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AArch64/arm64-zero-cycle-regmov.ll?rev=311038&r1=311037&r2=311038&view=diff
>             <http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AArch64/arm64-zero-cycle-regmov.ll?rev=311038&r1=311037&r2=311038&view=diff>>
>                 
>             ==============================================================================
>                  ---
>             llvm/trunk/test/CodeGen/AArch64/arm64-zero-cycle-regmov.ll
>                  (original)
>                  +++
>             llvm/trunk/test/CodeGen/AArch64/arm64-zero-cycle-regmov.ll Wed
>                  Aug 16 13:50:01 2017
>                  @@ -4,8 +4,10 @@
>                    define i32 @t(i32 %a, i32 %b, i32 %c, i32 %d)
>             nounwind ssp {
>                    entry:
>                    ; CHECK-LABEL: t:
>                  -; CHECK: mov x0, [[REG1:x[0-9]+]]
>                  -; CHECK: mov x1, [[REG2:x[0-9]+]]
>                  +; CHECK: mov [[REG2:x[0-9]+]], x3
>                  +; CHECK: mov [[REG1:x[0-9]+]], x2
>                  +; CHECK: mov x0, x2
>                  +; CHECK: mov x1, x3
>                    ; CHECK: bl _foo
>                    ; CHECK: mov x0, [[REG1]]
>                    ; CHECK: mov x1, [[REG2]]
> 
>                  Modified:
>             llvm/trunk/test/CodeGen/AArch64/f16-instructions.ll
>                  URL:
>             http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AArch64/f16-instructions.ll?rev=311038&r1=311037&r2=311038&view=diff
>             <http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AArch64/f16-instructions.ll?rev=311038&r1=311037&r2=311038&view=diff>
>                 
>             <http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AArch64/f16-instructions.ll?rev=311038&r1=311037&r2=311038&view=diff
>             <http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AArch64/f16-instructions.ll?rev=311038&r1=311037&r2=311038&view=diff>>
>                 
>             ==============================================================================
>                  --- llvm/trunk/test/CodeGen/AArch64/f16-instructions.ll
>             (original)
>                  +++ llvm/trunk/test/CodeGen/AArch64/f16-instructions.ll
>             Wed Aug 16
>                  13:50:01 2017
>                  @@ -350,7 +350,7 @@ else:
> 
>                    ; CHECK-LABEL: test_phi:
>                    ; CHECK: mov  x[[PTR:[0-9]+]], x0
>                  -; CHECK: ldr  h[[AB:[0-9]+]], [x[[PTR]]]
>                  +; CHECK: ldr  h[[AB:[0-9]+]], [x0]
>                    ; CHECK: [[LOOP:LBB[0-9_]+]]:
>                    ; CHECK: mov.16b  v[[R:[0-9]+]], v[[AB]]
>                    ; CHECK: ldr  h[[AB]], [x[[PTR]]]
> 
>                  Modified: llvm/trunk/test/CodeGen/AArch64/flags-multiuse.ll
>                  URL:
>             http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AArch64/flags-multiuse.ll?rev=311038&r1=311037&r2=311038&view=diff
>             <http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AArch64/flags-multiuse.ll?rev=311038&r1=311037&r2=311038&view=diff>
>                 
>             <http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AArch64/flags-multiuse.ll?rev=311038&r1=311037&r2=311038&view=diff
>             <http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AArch64/flags-multiuse.ll?rev=311038&r1=311037&r2=311038&view=diff>>
>                 
>             ==============================================================================
>                  --- llvm/trunk/test/CodeGen/AArch64/flags-multiuse.ll
>             (original)
>                  +++ llvm/trunk/test/CodeGen/AArch64/flags-multiuse.ll
>             Wed Aug 16
>                  13:50:01 2017
>                  @@ -17,6 +17,9 @@ define i32 @test_multiflag(i32 %n, i32 %
>                      %val = zext i1 %test to i32
>                    ; CHECK: cset {{[xw][0-9]+}}, ne
> 
>                  +; CHECK: mov [[RHSCOPY:w[0-9]+]], [[RHS]]
>                  +; CHECK: mov [[LHSCOPY:w[0-9]+]], [[LHS]]
>                  +
>                      store i32 %val, i32* @var
> 
>                      call void @bar()
>                  @@ -25,7 +28,7 @@ define i32 @test_multiflag(i32 %n, i32 %
>                      ; Currently, the comparison is emitted again. An
>             MSR/MRS pair
>                  would also be
>                      ; acceptable, but assuming the call preserves NZCV
>             is not.
>                      br i1 %test, label %iftrue, label %iffalse
>                  -; CHECK: cmp [[LHS]], [[RHS]]
>                  +; CHECK: cmp [[LHSCOPY]], [[RHSCOPY]]
>                    ; CHECK: b.eq
> 
>                    iftrue:
> 
>                  Modified:
>             llvm/trunk/test/CodeGen/AArch64/merge-store-dependency.ll
>                  URL:
>             http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AArch64/merge-store-dependency.ll?rev=311038&r1=311037&r2=311038&view=diff
>             <http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AArch64/merge-store-dependency.ll?rev=311038&r1=311037&r2=311038&view=diff>
>                 
>             <http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AArch64/merge-store-dependency.ll?rev=311038&r1=311037&r2=311038&view=diff
>             <http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AArch64/merge-store-dependency.ll?rev=311038&r1=311037&r2=311038&view=diff>>
>                 
>             ==============================================================================
>                  ---
>             llvm/trunk/test/CodeGen/AArch64/merge-store-dependency.ll
>             (original)
>                  +++
>             llvm/trunk/test/CodeGen/AArch64/merge-store-dependency.ll Wed
>                  Aug 16 13:50:01 2017
>                  @@ -8,10 +8,9 @@
>                    define void @test(%struct1* %fde, i32 %fd, void (i32,
>             i32, i8*)*
>                  %func, i8* %arg)  {
>                    ;CHECK-LABEL: test
>                    entry:
>                  -; A53: mov [[DATA:w[0-9]+]], w1
>                    ; A53: str q{{[0-9]+}}, {{.*}}
>                    ; A53: str q{{[0-9]+}}, {{.*}}
>                  -; A53: str [[DATA]], {{.*}}
>                  +; A53: str w1, {{.*}}
> 
>                      %0 = bitcast %struct1* %fde to i8*
>                      tail call void @llvm.memset.p0i8.i64(i8* %0, i8 0,
>             i64 40, i32
>                  8, i1 false)
> 
>                  Modified: llvm/trunk/test/CodeGen/AArch64/neg-imm.ll
>                  URL:
>             http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AArch64/neg-imm.ll?rev=311038&r1=311037&r2=311038&view=diff
>             <http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AArch64/neg-imm.ll?rev=311038&r1=311037&r2=311038&view=diff>
>                 
>             <http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AArch64/neg-imm.ll?rev=311038&r1=311037&r2=311038&view=diff
>             <http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AArch64/neg-imm.ll?rev=311038&r1=311037&r2=311038&view=diff>>
>                 
>             ==============================================================================
>                  --- llvm/trunk/test/CodeGen/AArch64/neg-imm.ll (original)
>                  +++ llvm/trunk/test/CodeGen/AArch64/neg-imm.ll Wed Aug
>             16 13:50:01 2017
>                  @@ -7,8 +7,8 @@ declare void @foo(i32)
>                    define void @test(i32 %px) {
>                    ; CHECK_LABEL: test:
>                    ; CHECK_LABEL: %entry
>                  -; CHECK: subs
>                  -; CHECK-NEXT: csel
>                  +; CHECK: subs [[REG0:w[0-9]+]],
>                  +; CHECK: csel {{w[0-9]+}}, wzr, [[REG0]]
>                    entry:
>                      %sub = add nsw i32 %px, -1
>                      %cmp = icmp slt i32 %px, 1
> 
>                  Modified:
>             llvm/trunk/test/CodeGen/AMDGPU/byval-frame-setup.ll
>                  URL:
>             http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AMDGPU/byval-frame-setup.ll?rev=311038&r1=311037&r2=311038&view=diff
>             <http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AMDGPU/byval-frame-setup.ll?rev=311038&r1=311037&r2=311038&view=diff>
>                 
>             <http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AMDGPU/byval-frame-setup.ll?rev=311038&r1=311037&r2=311038&view=diff
>             <http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AMDGPU/byval-frame-setup.ll?rev=311038&r1=311037&r2=311038&view=diff>>
>                 
>             ==============================================================================
>                  --- llvm/trunk/test/CodeGen/AMDGPU/byval-frame-setup.ll
>             (original)
>                  +++ llvm/trunk/test/CodeGen/AMDGPU/byval-frame-setup.ll
>             Wed Aug 16
>                  13:50:01 2017
>                  @@ -127,20 +127,21 @@ entry:
>                    }
> 
>                    ; GCN-LABEL: {{^}}call_void_func_byval_struct_kernel:
>                  -; GCN: s_mov_b32 s33, s7
>                  -; GCN: s_add_u32 s32, s33, 0xa00{{$}}
>                  +; GCN: s_add_u32 s32, s7, 0xa00{{$}}
> 
>                    ; GCN-DAG: v_mov_b32_e32 [[NINE:v[0-9]+]], 9
>                    ; GCN-DAG: v_mov_b32_e32 [[THIRTEEN:v[0-9]+]], 13
>                  -; GCN-DAG: buffer_store_dword [[NINE]], off, s[0:3],
>             s33 offset:8
>                  -; GCN: buffer_store_dword [[THIRTEEN]], off, s[0:3],
>             s33 offset:24
>                  +; GCN-DAG: buffer_store_dword [[NINE]], off, s[0:3],
>             s7 offset:8
>                  +; GCN: buffer_store_dword [[THIRTEEN]], off, s[0:3],
>             s7 offset:24
>                  +
>                  +; GCN: s_mov_b32 s33, s7
> 
>                    ; GCN-DAG: s_add_u32 s32, s32, 0x800{{$}}
> 
>                  -; GCN-DAG: buffer_load_dword [[LOAD0:v[0-9]+]], off,
>             s[0:3], s33
>                  offset:8
>                  -; GCN-DAG: buffer_load_dword [[LOAD1:v[0-9]+]], off,
>             s[0:3], s33
>                  offset:12
>                  -; GCN-DAG: buffer_load_dword [[LOAD2:v[0-9]+]], off,
>             s[0:3], s33
>                  offset:16
>                  -; GCN-DAG: buffer_load_dword [[LOAD3:v[0-9]+]], off,
>             s[0:3], s33
>                  offset:20
>                  +; GCN-DAG: buffer_load_dword [[LOAD0:v[0-9]+]], off,
>             s[0:3],
>                  s{{7|33}} offset:8
>                  +; GCN-DAG: buffer_load_dword [[LOAD1:v[0-9]+]], off,
>             s[0:3],
>                  s{{7|33}} offset:12
>                  +; GCN-DAG: buffer_load_dword [[LOAD2:v[0-9]+]], off,
>             s[0:3],
>                  s{{7|33}} offset:16
>                  +; GCN-DAG: buffer_load_dword [[LOAD3:v[0-9]+]], off,
>             s[0:3],
>                  s{{7|33}} offset:20
> 
>                    ; GCN-DAG: buffer_store_dword [[LOAD0]], off, s[0:3], s32
>                  offset:4{{$}}
>                    ; GCN-DAG: buffer_store_dword [[LOAD1]], off, s[0:3],
>             s32 offset:8
> 
>                  Modified:
>             llvm/trunk/test/CodeGen/AMDGPU/call-argument-types.ll
>                  URL:
>             http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AMDGPU/call-argument-types.ll?rev=311038&r1=311037&r2=311038&view=diff
>             <http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AMDGPU/call-argument-types.ll?rev=311038&r1=311037&r2=311038&view=diff>
> 
> 
> 

-- 
Geoff Berry
Employee of Qualcomm Datacenter Technologies, Inc.
  Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm 
Technologies, Inc.  Qualcomm Technologies, Inc. is a member of the Code 
Aurora Forum, a Linux Foundation Collaborative Project.


More information about the llvm-commits mailing list