<html>
  <head>
    <meta content="text/html; charset=windows-1252"
      http-equiv="Content-Type">
  </head>
  <body bgcolor="#FFFFFF" text="#000000">
    <div class="moz-cite-prefix">On 11/19/2014 12:05 PM, Tom Stellard
      wrote:<br>
    </div>
    <blockquote cite="mid:20141119200553.GA30277@freedesktop.org"
      type="cite">
      <div class="moz-text-plain" wrap="true" graphical-quote="true"
        style="font-family: -moz-fixed; font-size: 12px;"
        lang="x-western">
        Even though they weren't being clustered, ds_read2 was still
        matching, because the load instructions happened to
        be next to each other by luck.<br>
      </div>
    </blockquote>
    It would be nice to come up with a test where this happens now.<br>
    <br>
    <blockquote cite="mid:20141119200553.GA30277@freedesktop.org"
      type="cite">
      <div class="moz-text-plain" wrap="true" graphical-quote="true"
        style="font-family: -moz-fixed; font-size: 12px;"
        lang="x-western">
        <blockquote type="cite" style="color: #000000;">
          <blockquote type="cite" style="color: #000000;">
            <pre wrap=""><span class="moz-txt-citetags">> ></span>0003-R600-SI-Add-SIFoldOperands-pass.patch
<span class="moz-txt-citetags">> ></span>
<span class="moz-txt-citetags">> ></span>
<span class="moz-txt-citetags">> > </span>From 41409a4ba20a0a74acc58ae07a5130144bff8c39 Mon Sep 17 00:00:00 2001
<span class="moz-txt-citetags">> ></span>From: Tom Stellard<a moz-do-not-send="true" class="moz-txt-link-rfc2396E" href="mailto:thomas.stellard@amd.com"><thomas.stellard@amd.com></a>
<span class="moz-txt-citetags">> ></span>Date: Tue, 18 Nov 2014 11:45:28 -0500
<span class="moz-txt-citetags">> ></span>Subject: [PATCH 3/6] R600/SI: Add SIFoldOperands pass
<span class="moz-txt-citetags">> ></span>
<span class="moz-txt-citetags">> ></span>This pass attempts to fold the source operands of mov and copy
<span class="moz-txt-citetags">> ></span>instructions into their uses.
</pre>
          </blockquote>
          <pre wrap=""><span class="moz-txt-citetags">> </span>
<span class="moz-txt-citetags">> </span>Is this supposed to replace what's left of
<span class="moz-txt-citetags">> </span>SITargetLowering::legalizeOperands? This doesn't delete it so this
<span class="moz-txt-citetags">> </span>probably isn't going to hit many cases. I'm also slightly worried
<span class="moz-txt-citetags">> </span>this might break my f64 inline immediate patches
</pre>
        </blockquote>
        <pre wrap="">I was planning to remove legalizeOperands in a separate patch.  It does
seem to hit a fair number of cases even with legalizeOperands still there.

</pre>
        <blockquote type="cite" style="color: #000000;">
          <blockquote type="cite" style="color: #000000;">
            <pre wrap=""><span class="moz-txt-citetags">> ></span>---
<span class="moz-txt-citetags">> > </span> lib/Target/R600/AMDGPU.h                |   4 +
<span class="moz-txt-citetags">> > </span> lib/Target/R600/AMDGPUTargetMachine.cpp |   2 +
<span class="moz-txt-citetags">> > </span> lib/Target/R600/CMakeLists.txt          |   1 +
<span class="moz-txt-citetags">> > </span> lib/Target/R600/SIFoldOperands.cpp      | 200 ++++++++++++++++++++++++++++++++
<span class="moz-txt-citetags">> > </span> test/CodeGen/R600/extload.ll            |  21 ++--
<span class="moz-txt-citetags">> > </span> test/CodeGen/R600/local-atomics.ll      |  24 ++--
<span class="moz-txt-citetags">> > </span> test/CodeGen/R600/operand-folding.ll    |  40 +++++++
<span class="moz-txt-citetags">> > </span> 7 files changed, 264 insertions(+), 28 deletions(-)
<span class="moz-txt-citetags">> > </span> create mode 100644 lib/Target/R600/SIFoldOperands.cpp
<span class="moz-txt-citetags">> > </span> create mode 100644 test/CodeGen/R600/operand-folding.ll
<span class="moz-txt-citetags">> ></span>
<span class="moz-txt-citetags">> ></span>diff --git a/lib/Target/R600/AMDGPU.h b/lib/Target/R600/AMDGPU.h
<span class="moz-txt-citetags">> ></span>index 261075e..13379e7 100644
<span class="moz-txt-citetags">> ></span>--- a/lib/Target/R600/AMDGPU.h
<span class="moz-txt-citetags">> ></span>+++ b/lib/Target/R600/AMDGPU.h
<span class="moz-txt-citetags">> ></span>@@ -38,6 +38,7 @@ FunctionPass *createAMDGPUCFGStructurizerPass();
<span class="moz-txt-citetags">> > </span> // SI Passes
<span class="moz-txt-citetags">> > </span> FunctionPass *createSITypeRewriter();
<span class="moz-txt-citetags">> > </span> FunctionPass *createSIAnnotateControlFlowPass();
<span class="moz-txt-citetags">> ></span>+FunctionPass *createSIFoldOperandsPass();
<span class="moz-txt-citetags">> > </span> FunctionPass *createSILowerI1CopiesPass();
<span class="moz-txt-citetags">> > </span> FunctionPass *createSIShrinkInstructionsPass();
<span class="moz-txt-citetags">> > </span> FunctionPass *createSILoadStoreOptimizerPass(TargetMachine &tm);
<span class="moz-txt-citetags">> ></span>@@ -47,6 +48,9 @@ FunctionPass *createSIFixSGPRLiveRangesPass();
<span class="moz-txt-citetags">> > </span> FunctionPass *createSICodeEmitterPass(formatted_raw_ostream &OS);
<span class="moz-txt-citetags">> > </span> FunctionPass *createSIInsertWaits(TargetMachine &tm);
<span class="moz-txt-citetags">> ></span>+void initializeSIFoldOperandsPass(PassRegistry &);
<span class="moz-txt-citetags">> ></span>+extern char &SIFoldOperandsID;
<span class="moz-txt-citetags">> ></span>+
<span class="moz-txt-citetags">> > </span> void initializeSILowerI1CopiesPass(PassRegistry &);
<span class="moz-txt-citetags">> > </span> extern char &SILowerI1CopiesID;
<span class="moz-txt-citetags">> ></span>diff --git a/lib/Target/R600/AMDGPUTargetMachine.cpp b/lib/Target/R600/AMDGPUTargetMachine.cpp
<span class="moz-txt-citetags">> ></span>index b2cd988..80142f0 100644
<span class="moz-txt-citetags">> ></span>--- a/lib/Target/R600/AMDGPUTargetMachine.cpp
<span class="moz-txt-citetags">> ></span>+++ b/lib/Target/R600/AMDGPUTargetMachine.cpp
<span class="moz-txt-citetags">> ></span>@@ -159,6 +159,8 @@ bool AMDGPUPassConfig::addInstSelector() {
<span class="moz-txt-citetags">> > </span>     addPass(createSIFixSGPRCopiesPass(*TM));
<span class="moz-txt-citetags">> > </span>   }
<span class="moz-txt-citetags">> ></span>+  addPass(createSILowerI1CopiesPass());
<span class="moz-txt-citetags">> ></span>+  addPass(createSIFoldOperandsPass());
<span class="moz-txt-citetags">> > </span>   return false;
<span class="moz-txt-citetags">> > </span> }
<span class="moz-txt-citetags">> ></span>diff --git a/lib/Target/R600/CMakeLists.txt b/lib/Target/R600/CMakeLists.txt
<span class="moz-txt-citetags">> ></span>index ed0a216..3b703e7 100644
<span class="moz-txt-citetags">> ></span>--- a/lib/Target/R600/CMakeLists.txt
<span class="moz-txt-citetags">> ></span>+++ b/lib/Target/R600/CMakeLists.txt
<span class="moz-txt-citetags">> ></span>@@ -43,6 +43,7 @@ add_llvm_target(R600CodeGen
<span class="moz-txt-citetags">> > </span>   SIAnnotateControlFlow.cpp
<span class="moz-txt-citetags">> > </span>   SIFixSGPRCopies.cpp
<span class="moz-txt-citetags">> > </span>   SIFixSGPRLiveRanges.cpp
<span class="moz-txt-citetags">> ></span>+  SIFoldOperands.cpp
<span class="moz-txt-citetags">> > </span>   SIInsertWaits.cpp
<span class="moz-txt-citetags">> > </span>   SIInstrInfo.cpp
<span class="moz-txt-citetags">> > </span>   SIISelLowering.cpp
<span class="moz-txt-citetags">> ></span>diff --git a/lib/Target/R600/SIFoldOperands.cpp b/lib/Target/R600/SIFoldOperands.cpp
<span class="moz-txt-citetags">> ></span>new file mode 100644
<span class="moz-txt-citetags">> ></span>index 0000000..1b0ba24
<span class="moz-txt-citetags">> ></span>--- /dev/null
<span class="moz-txt-citetags">> ></span>+++ b/lib/Target/R600/SIFoldOperands.cpp
<span class="moz-txt-citetags">> ></span>@@ -0,0 +1,200 @@
<span class="moz-txt-citetags">> ></span>+//===-- SIFoldOperands.cpp - Fold operands --------------------------------===//
<span class="moz-txt-citetags">> ></span>+//
<span class="moz-txt-citetags">> ></span>+//                     The LLVM Compiler Infrastructure
<span class="moz-txt-citetags">> ></span>+//
<span class="moz-txt-citetags">> ></span>+// This file is distributed under the University of Illinois Open Source
<span class="moz-txt-citetags">> ></span>+// License. See LICENSE.TXT for details.
<span class="moz-txt-citetags">> ></span>+//
<span class="moz-txt-citetags">> ></span>+/// \file
<span class="moz-txt-citetags">> ></span>+//===----------------------------------------------------------------------===//
<span class="moz-txt-citetags">> ></span>+//
<span class="moz-txt-citetags">> ></span>+
<span class="moz-txt-citetags">> ></span>+#include "AMDGPU.h"
<span class="moz-txt-citetags">> ></span>+#include "AMDGPUSubtarget.h"
<span class="moz-txt-citetags">> ></span>+#include "SIInstrInfo.h"
<span class="moz-txt-citetags">> ></span>+#include "llvm/CodeGen/LiveIntervalAnalysis.h"
<span class="moz-txt-citetags">> ></span>+#include "llvm/CodeGen/MachineDominators.h"
<span class="moz-txt-citetags">> ></span>+#include "llvm/CodeGen/MachineFunctionPass.h"
<span class="moz-txt-citetags">> ></span>+#include "llvm/CodeGen/MachineInstrBuilder.h"
<span class="moz-txt-citetags">> ></span>+#include "llvm/CodeGen/MachineRegisterInfo.h"
<span class="moz-txt-citetags">> ></span>+#include "llvm/IR/LLVMContext.h"
<span class="moz-txt-citetags">> ></span>+#include "llvm/IR/Function.h"
<span class="moz-txt-citetags">> ></span>+#include "llvm/Support/Debug.h"
<span class="moz-txt-citetags">> ></span>+#include "llvm/Target/TargetMachine.h"
<span class="moz-txt-citetags">> ></span>+
<span class="moz-txt-citetags">> ></span>+#define DEBUG_TYPE "si-fold-operands"
<span class="moz-txt-citetags">> ></span>+using namespace llvm;
<span class="moz-txt-citetags">> ></span>+
<span class="moz-txt-citetags">> ></span>+namespace {
<span class="moz-txt-citetags">> ></span>+
<span class="moz-txt-citetags">> ></span>+class SIFoldOperands : public MachineFunctionPass {
<span class="moz-txt-citetags">> ></span>+public:
<span class="moz-txt-citetags">> ></span>+  static char ID;
<span class="moz-txt-citetags">> ></span>+
<span class="moz-txt-citetags">> ></span>+public:
<span class="moz-txt-citetags">> ></span>+  SIFoldOperands() : MachineFunctionPass(ID) {
<span class="moz-txt-citetags">> ></span>+    initializeSIFoldOperandsPass(*PassRegistry::getPassRegistry());
<span class="moz-txt-citetags">> ></span>+  }
<span class="moz-txt-citetags">> ></span>+
<span class="moz-txt-citetags">> ></span>+  bool runOnMachineFunction(MachineFunction &MF) override;
<span class="moz-txt-citetags">> ></span>+
<span class="moz-txt-citetags">> ></span>+  const char *getPassName() const override {
<span class="moz-txt-citetags">> ></span>+    return "SI Fold Operands";
<span class="moz-txt-citetags">> ></span>+  }
<span class="moz-txt-citetags">> ></span>+
<span class="moz-txt-citetags">> ></span>+  void getAnalysisUsage(AnalysisUsage &AU) const override {
<span class="moz-txt-citetags">> ></span>+    AU.addRequired<MachineDominatorTree>();
<span class="moz-txt-citetags">> ></span>+    AU.setPreservesCFG();
<span class="moz-txt-citetags">> ></span>+    MachineFunctionPass::getAnalysisUsage(AU);
<span class="moz-txt-citetags">> ></span>+  }
<span class="moz-txt-citetags">> ></span>+};
<span class="moz-txt-citetags">> ></span>+
<span class="moz-txt-citetags">> ></span>+} // End anonymous namespace.
<span class="moz-txt-citetags">> ></span>+
<span class="moz-txt-citetags">> ></span>+INITIALIZE_PASS_BEGIN(SIFoldOperands, DEBUG_TYPE,
<span class="moz-txt-citetags">> ></span>+                      "SI Fold Operands", false, false)
<span class="moz-txt-citetags">> ></span>+INITIALIZE_PASS_DEPENDENCY(MachineDominatorTree)
<span class="moz-txt-citetags">> ></span>+INITIALIZE_PASS_END(SIFoldOperands, DEBUG_TYPE,
<span class="moz-txt-citetags">> ></span>+                    "SI Fold Operands", false, false)
<span class="moz-txt-citetags">> ></span>+
<span class="moz-txt-citetags">> ></span>+char SIFoldOperands::ID = 0;
<span class="moz-txt-citetags">> ></span>+
<span class="moz-txt-citetags">> ></span>+char &llvm::SIFoldOperandsID = SIFoldOperands::ID;
<span class="moz-txt-citetags">> ></span>+
<span class="moz-txt-citetags">> ></span>+FunctionPass *llvm::createSIFoldOperandsPass() {
<span class="moz-txt-citetags">> ></span>+  return new SIFoldOperands();
<span class="moz-txt-citetags">> ></span>+}
<span class="moz-txt-citetags">> ></span>+
<span class="moz-txt-citetags">> ></span>+static bool isSafeToFold(unsigned Opcode) {
<span class="moz-txt-citetags">> ></span>+  switch(Opcode) {
<span class="moz-txt-citetags">> ></span>+  case AMDGPU::V_MOV_B32_e32:
<span class="moz-txt-citetags">> ></span>+  case AMDGPU::V_MOV_B32_e64:
<span class="moz-txt-citetags">> ></span>+  case AMDGPU::S_MOV_B32:
<span class="moz-txt-citetags">> ></span>+  case AMDGPU::S_MOV_B64:
<span class="moz-txt-citetags">> ></span>+  case AMDGPU::COPY:
<span class="moz-txt-citetags">> ></span>+    return true;
<span class="moz-txt-citetags">> ></span>+  default:
<span class="moz-txt-citetags">> ></span>+    return false;
<span class="moz-txt-citetags">> ></span>+  }
<span class="moz-txt-citetags">> ></span>+}
<span class="moz-txt-citetags">> ></span>+
<span class="moz-txt-citetags">> ></span>+static bool updateOperand(MachineInstr *MI, unsigned OpNo,
<span class="moz-txt-citetags">> ></span>+                          const MachineOperand &New,
<span class="moz-txt-citetags">> ></span>+                          const TargetRegisterInfo &TRI) {
<span class="moz-txt-citetags">> ></span>+  MachineOperand &Old = MI->getOperand(OpNo);
<span class="moz-txt-citetags">> ></span>+  assert(Old.isReg());
<span class="moz-txt-citetags">> ></span>+
<span class="moz-txt-citetags">> ></span>+  if (New.isImm()) {
<span class="moz-txt-citetags">> ></span>+    Old.ChangeToImmediate(New.getImm());
<span class="moz-txt-citetags">> ></span>+    return true;
<span class="moz-txt-citetags">> ></span>+  }
<span class="moz-txt-citetags">> ></span>+
<span class="moz-txt-citetags">> ></span>+  if (New.isFPImm()) {
<span class="moz-txt-citetags">> ></span>+    Old.ChangeToFPImmediate(New.getFPImm());
<span class="moz-txt-citetags">> ></span>+    return true;
<span class="moz-txt-citetags">> ></span>+  }
</pre>
          </blockquote>
          <pre wrap=""><span class="moz-txt-citetags">> </span>I've been considering replacing all fp immediates with the bitcasted
<span class="moz-txt-citetags">> </span>integers, since handling fp immediates everywhere is a pain and
<span class="moz-txt-citetags">> </span>there isn't much point to distinguishing them. Where do you think
<span class="moz-txt-citetags">> </span>the best place to do this would be?
<span class="moz-txt-citetags">> </span>
</pre>
        </blockquote>
        <pre wrap="">Would it work to convert them in AMDGPUDAGToDAGISel::Select() ?</pre>
      </div>
    </blockquote>
    I think so.<br>
    <br>
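    A minimal sketch of what "bitcasted integers" means here, written as
    standalone C++ rather than LLVM code. It assumes a 32-bit IEEE-754
    float, and the helper names fpImmToBits, bitsToFPImm and
    checkRoundTrip are hypothetical; they are not part of the patch or
    of any LLVM API.<br>
    <pre wrap="">
```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Reinterpret the bits of a 32-bit float as an unsigned integer.
// std::memcpy is used instead of pointer casts to avoid undefined behaviour.
inline uint32_t fpImmToBits(float Value) {
  static_assert(sizeof(float) == sizeof(uint32_t), "expects a 32-bit float");
  uint32_t Bits;
  std::memcpy(&Bits, &Value, sizeof(Bits));
  return Bits;
}

// Inverse of fpImmToBits: recover the float from its bit pattern.
inline float bitsToFPImm(uint32_t Bits) {
  float Value;
  std::memcpy(&Value, &Bits, sizeof(Value));
  return Value;
}

// Self-check for non-NaN inputs: the round trip must preserve the value.
inline void checkRoundTrip(float Value) {
  assert(bitsToFPImm(fpImmToBits(Value)) == Value);
}
```
</pre>
    Under that assumption, fpImmToBits(1.0f) gives 0x3F800000, so an fp
    immediate of 1.0 and an integer immediate of 0x3F800000 would be the
    same operand once everything is stored as integers.<br>
    <br>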
    <blockquote cite="mid:20141119200553.GA30277@freedesktop.org"
      type="cite">
      <div class="moz-text-plain" wrap="true" graphical-quote="true"
        style="font-family: -moz-fixed; font-size: 12px;"
        lang="x-western">
        <pre wrap="">

</pre>
        <blockquote type="cite" style="color: #000000;">
          <blockquote type="cite" style="color: #000000;">
            <pre wrap=""><span class="moz-txt-citetags">> ></span>+
<span class="moz-txt-citetags">> ></span>+  if (New.isReg())  {
<span class="moz-txt-citetags">> ></span>+    if (TargetRegisterInfo::isVirtualRegister(Old.getReg()) &&
<span class="moz-txt-citetags">> ></span>+        TargetRegisterInfo::isVirtualRegister(New.getReg())) {
<span class="moz-txt-citetags">> ></span>+      Old.substVirtReg(New.getReg(), New.getSubReg(), TRI);
<span class="moz-txt-citetags">> ></span>+      return true;
<span class="moz-txt-citetags">> ></span>+    }
<span class="moz-txt-citetags">> ></span>+  }
<span class="moz-txt-citetags">> ></span>+
<span class="moz-txt-citetags">> ></span>+  // FIXME: Handle physical registers.
</pre>
          </blockquote>
          <pre wrap=""><span class="moz-txt-citetags">> </span>
<span class="moz-txt-citetags">> </span>What cases that can be folded will ever have physical registers at
<span class="moz-txt-citetags">> </span>this point?
</pre>
        </blockquote>
        <pre wrap="">If this pass were run after register allocation it would have to deal with
physical registers.</pre>
      </div>
    </blockquote>
    I don't expect we will need to run this again after register
    allocation, but maybe.<br>
    <br>
    <blockquote cite="mid:20141119200553.GA30277@freedesktop.org"
      type="cite">
      <div class="moz-text-plain" wrap="true" graphical-quote="true"
        style="font-family: -moz-fixed; font-size: 12px;"
        lang="x-western">
        <pre wrap="">

</pre>
        <blockquote type="cite" style="color: #000000;">
          <pre wrap=""><span class="moz-txt-citetags">> </span>
</pre>
          <blockquote type="cite" style="color: #000000;">
            <pre wrap=""><span class="moz-txt-citetags">> ></span>+
<span class="moz-txt-citetags">> ></span>+  return false;
<span class="moz-txt-citetags">> ></span>+}
<span class="moz-txt-citetags">> ></span>+
<span class="moz-txt-citetags">> ></span>+bool SIFoldOperands::runOnMachineFunction(MachineFunction &MF) {
<span class="moz-txt-citetags">> ></span>+  MachineRegisterInfo &MRI = MF.getRegInfo();
<span class="moz-txt-citetags">> ></span>+  const SIInstrInfo *TII =
<span class="moz-txt-citetags">> ></span>+      static_cast<const SIInstrInfo *>(MF.getSubtarget().getInstrInfo());
<span class="moz-txt-citetags">> ></span>+  const SIRegisterInfo &TRI = TII->getRegisterInfo();
<span class="moz-txt-citetags">> ></span>+
<span class="moz-txt-citetags">> ></span>+  for (MachineFunction::iterator BI = MF.begin(), BE = MF.end();
<span class="moz-txt-citetags">> ></span>+                                                  BI != BE; ++BI) {
<span class="moz-txt-citetags">> ></span>+
<span class="moz-txt-citetags">> ></span>+    MachineBasicBlock &MBB = *BI;
<span class="moz-txt-citetags">> ></span>+    MachineBasicBlock::iterator I, Next;
<span class="moz-txt-citetags">> ></span>+    for (I = MBB.begin(); I != MBB.end(); I = Next) {
<span class="moz-txt-citetags">> ></span>+      Next = std::next(I);
<span class="moz-txt-citetags">> ></span>+      MachineInstr &MI = *I;
<span class="moz-txt-citetags">> ></span>+
<span class="moz-txt-citetags">> ></span>+      if (!isSafeToFold(MI.getOpcode()))
<span class="moz-txt-citetags">> ></span>+        continue;
<span class="moz-txt-citetags">> ></span>+
<span class="moz-txt-citetags">> ></span>+      const MachineOperand &OpToFold = MI.getOperand(1);
<span class="moz-txt-citetags">> ></span>+
<span class="moz-txt-citetags">> ></span>+      // FIXME: Fold operands with subregs.
<span class="moz-txt-citetags">> ></span>+      if (OpToFold.isReg() &&
<span class="moz-txt-citetags">> ></span>+          (!TargetRegisterInfo::isVirtualRegister(OpToFold.getReg()) ||
<span class="moz-txt-citetags">> ></span>+           OpToFold.getSubReg()))
<span class="moz-txt-citetags">> ></span>+        continue;
<span class="moz-txt-citetags">> ></span>+
<span class="moz-txt-citetags">> ></span>+      std::vector<std::pair<MachineInstr *, unsigned>> FoldList;
<span class="moz-txt-citetags">> ></span>+      for (MachineRegisterInfo::use_iterator
<span class="moz-txt-citetags">> ></span>+           Use = MRI.use_begin(MI.getOperand(0).getReg()), E = MRI.use_end();
<span class="moz-txt-citetags">> ></span>+           Use != E; ++Use) {
</pre>
          </blockquote>
          <pre wrap=""><span class="moz-txt-citetags">> </span>I think you can use a range for here with MRI.use_instructions(Op.getReg())
</pre>
        </blockquote>
        <pre wrap="">I'm using the iterator, because it has a reference to the operand number.
I can't get that using a range.

I've attached updated patches.  Let me know what you think.

-Tom
</pre>
      </div>
      <br>
      <fieldset class="mimeAttachmentHeader"><legend
          class="mimeAttachmentHeaderName">0001-R600-SI-Add-SIFoldOperands-pass.patch</legend></fieldset>
      <br>
      <div class="moz-text-plain" wrap="true" graphical-quote="true"
        style="font-family: -moz-fixed; font-size: 12px;"
        lang="x-western">
        <pre wrap="">From 70f32795a67223e2618fbd37f959138d85c949ca Mon Sep 17 00:00:00 2001
From: Tom Stellard <a moz-do-not-send="true" class="moz-txt-link-rfc2396E" href="mailto:thomas.stellard@amd.com"><thomas.stellard@amd.com></a>
Date: Tue, 18 Nov 2014 11:45:28 -0500
Subject: [PATCH 1/4] R600/SI: Add SIFoldOperands pass

This pass attempts to fold the source operands of mov and copy
instructions into their uses.
---
 lib/Target/R600/AMDGPU.h                |   4 +
 lib/Target/R600/AMDGPUTargetMachine.cpp |   2 +
 lib/Target/R600/CMakeLists.txt          |   1 +
 lib/Target/R600/SIFoldOperands.cpp      | 202 ++++++++++++++++++++++++++++++++
 test/CodeGen/R600/extload.ll            |  21 ++--
 test/CodeGen/R600/local-atomics.ll      |  24 ++--
 test/CodeGen/R600/operand-folding.ll    |  40 +++++++
 7 files changed, 266 insertions(+), 28 deletions(-)
 create mode 100644 lib/Target/R600/SIFoldOperands.cpp
 create mode 100644 test/CodeGen/R600/operand-folding.ll</pre>
      </div>
    </blockquote>
    <br>
    <br>
    LGTM<br>
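    <br>
    The folding idea from the commit message can be sketched outside
    LLVM roughly as follows. This is a toy model, not the pass itself:
    it only folds immediates, assumes each register is defined once, and
    skips the operand legality checks a real target needs. The types
    Operand and Instr and the function foldMovImmediates are
    hypothetical names made up for this sketch.<br>
    <pre wrap="">
```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

// Toy operand: either a virtual register number or an immediate.
struct Operand {
  bool IsImm;
  int64_t Value; // immediate when IsImm, register number otherwise
};

// Toy instruction: Ops[0] is the destination register, the rest are sources.
struct Instr {
  bool IsMovImm; // true for "mov dst, imm"
  std::vector<Operand> Ops;
};

// Replace every source operand that reads a register defined by
// "mov reg, imm" with that immediate. Returns the number of operands folded.
inline unsigned foldMovImmediates(std::vector<Instr> &Program) {
  // First pass: record the immediate written by each mov.
  std::map<int64_t, int64_t> ImmForReg;
  for (const Instr &I : Program) {
    if (!I.IsMovImm)
      continue;
    assert(I.Ops.size() == 2 && !I.Ops[0].IsImm && I.Ops[1].IsImm);
    ImmForReg[I.Ops[0].Value] = I.Ops[1].Value;
  }

  // Second pass: rewrite register uses (operand 0 is the def, so skip it).
  unsigned NumFolded = 0;
  for (Instr &I : Program) {
    for (std::size_t OpNo = 1; OpNo < I.Ops.size(); ++OpNo) {
      Operand &Op = I.Ops[OpNo];
      if (Op.IsImm)
        continue;
      auto It = ImmForReg.find(Op.Value);
      if (It == ImmForReg.end())
        continue;
      Op = Operand{true, It->second};
      ++NumFolded;
    }
  }
  return NumFolded;
}
```
</pre>
    With a program consisting of "mov r1, 42" followed by
    "add r2, r1, r3", the sketch rewrites the add to use 42 in place of
    r1 and leaves r3 alone. The mov is left in place as dead code for a
    later pass to clean up.<br>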
    <blockquote cite="mid:20141119200553.GA30277@freedesktop.org"
      type="cite">
      <fieldset class="mimeAttachmentHeader"><legend
          class="mimeAttachmentHeaderName">0002-R600-SI-Mark-s_mov_b32-and-s_mov_b64-as-rematerializ.patch</legend></fieldset>
      <br>
      <div class="moz-text-plain" wrap="true" graphical-quote="true"
        style="font-family: -moz-fixed; font-size: 12px;"
        lang="x-western">
        <pre wrap="">From fa2551ed3093c2dffab3685ee7e1d2fde7974f30 Mon Sep 17 00:00:00 2001
From: Tom Stellard <a moz-do-not-send="true" class="moz-txt-link-rfc2396E" href="mailto:thomas.stellard@amd.com"><thomas.stellard@amd.com></a>
Date: Fri, 14 Nov 2014 18:10:26 -0500
Subject: [PATCH 2/4] R600/SI: Mark s_mov_b32 and s_mov_b64 as rematerializable

---
 lib/Target/R600/SIInstructions.td | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/lib/Target/R600/SIInstructions.td b/lib/Target/R600/SIInstructions.td
index 90da7a9..bd91577 100644
--- a/lib/Target/R600/SIInstructions.td
+++ b/lib/Target/R600/SIInstructions.td
@@ -96,8 +96,10 @@ defm S_BUFFER_LOAD_DWORDX16 : SMRD_Helper <
 //===----------------------------------------------------------------------===//
 
 let isMoveImm = 1 in {
+let isReMaterializable = 1 in {
 def S_MOV_B32 : SOP1_32 <0x00000003, "s_mov_b32", []>;
 def S_MOV_B64 : SOP1_64 <0x00000004, "s_mov_b64", []>;
+} // End isReMaterializable = 1
 def S_CMOV_B32 : SOP1_32 <0x00000005, "s_cmov_b32", []>;
 def S_CMOV_B64 : SOP1_64 <0x00000006, "s_cmov_b64", []>;
 } // End isMoveImm = 1
<div class="moz-txt-sig">-- 
1.8.5.5
</div></pre>
      </div>
    </blockquote>
    LGTM<br>
    <br>
    <blockquote cite="mid:20141119200553.GA30277@freedesktop.org"
      type="cite">
      <br>
      <fieldset class="mimeAttachmentHeader"><legend
          class="mimeAttachmentHeaderName">0003-R600-SI-Emit-s_mov_b32-m0-1-before-every-DS-instruct.patch</legend></fieldset>
      <br>
      <div class="moz-text-plain" wrap="true" graphical-quote="true"
        style="font-family: -moz-fixed; font-size: 12px;"
        lang="x-western">
        <pre wrap="">From a7b6ce3cb14627e644d0dad4ca5e2d3c6f8cfe78 Mon Sep 17 00:00:00 2001
From: Tom Stellard <a moz-do-not-send="true" class="moz-txt-link-rfc2396E" href="mailto:thomas.stellard@amd.com"><thomas.stellard@amd.com></a>
Date: Fri, 31 Oct 2014 13:10:22 -0400
Subject: [PATCH 3/4] R600/SI: Emit s_mov_b32 m0, -1 before every DS
 instruction

This s_mov_b32 will write to a virtual register from the M0Reg
class and all the ds instructions now take an extra M0Reg explicit
argument.

This change is necessary to prevent issues with the scheduler
mixing together instructions that expect different values in the m0
registers.
---
 lib/Target/R600/SIISelLowering.cpp       |  2 +-
 lib/Target/R600/SIInstrFormats.td        |  1 +
 lib/Target/R600/SIInstrInfo.td           | 17 +++++++++--------
 lib/Target/R600/SIInstructions.td        | 15 ++++++++-------
 lib/Target/R600/SILoadStoreOptimizer.cpp | 10 +++++++++-
 lib/Target/R600/SILowerControlFlow.cpp   | 23 -----------------------
 test/CodeGen/R600/shl_add_ptr.ll         |  3 ++-
 7 files changed, 30 insertions(+), 41 deletions(-)
</pre>
      </div>
    </blockquote>
    LGTM<br>
    <br>
    <blockquote cite="mid:20141119200553.GA30277@freedesktop.org"
      type="cite">
      <br>
      <fieldset class="mimeAttachmentHeader"><legend
          class="mimeAttachmentHeaderName">0004-R600-SI-Add-an-s_mov_b32-to-patterns-which-use-the-M.patch</legend></fieldset>
      <br>
      <div class="moz-text-plain" wrap="true" graphical-quote="true"
        style="font-family: -moz-fixed; font-size: 12px;"
        lang="x-western">
        <pre wrap="">From 1782fd5e32dcfefbd5c4949cfdbca1d6f5581e41 Mon Sep 17 00:00:00 2001
From: Tom Stellard <a moz-do-not-send="true" class="moz-txt-link-rfc2396E" href="mailto:thomas.stellard@amd.com"><thomas.stellard@amd.com></a>
Date: Fri, 31 Oct 2014 13:15:24 -0400
Subject: [PATCH 4/4] R600/SI: Add an s_mov_b32 to patterns which use the
 M0RegClass

We need to use a s_mov_b32 rather than a copy, so that CSE will
eliminate redundant moves to the m0 register.
---
 lib/Target/R600/SIInstrInfo.cpp   | 20 --------------------
 lib/Target/R600/SIInstructions.td | 12 ++++++++----
 2 files changed, 8 insertions(+), 24 deletions(-)</pre>
      </div>
    </blockquote>
    <br>
    LGTM
    <br>
  </body>
</html>