<div dir="ltr"><div>Hello Yonghong,</div><div><br></div><div>This commit broke tests on one of our builders:</div><div><a href="http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-win/builds/11261">http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-win/builds/11261</a></div><div><br></div><div>. . .</div><div>Failing Tests (2):</div><div>    LLVM :: CodeGen/BPF/memcpy-expand-in-order.ll</div><div>    LLVM :: tools/dsymutil/X86/accelerator.test</div><div><br></div><div><br></div><div>Could you please take a look?</div><div><br></div><div>Thanks</div><div><br></div><div>Galina</div></div><div class="gmail_extra"><br><div class="gmail_quote">On Wed, Jul 25, 2018 at 3:40 PM, Yonghong Song via llvm-commits <span dir="ltr"><<a href="mailto:llvm-commits@lists.llvm.org" target="_blank">llvm-commits@lists.llvm.org</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Author: yhs<br>
Date: Wed Jul 25 15:40:02 2018<br>
New Revision: 337977<br>
<br>
URL: <a href="http://llvm.org/viewvc/llvm-project?rev=337977&view=rev" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-<wbr>project?rev=337977&view=rev</a><br>
Log:<br>
bpf: new option -bpf-expand-memcpy-in-order to expand memcpy in order<br>
<br>
Some BPF JIT backends want to optimize memcpy in their own<br>
architecture-specific way.<br>
<br>
However, at the moment, JIT backends have no reliable way to see memcpy<br>
semantics. This is because the LLVM BPF backend expands memcpy into<br>
load/store sequences, which may then be scheduled further apart from each<br>
other. So BPF JIT backends inside the kernel can't reliably recognize memcpy<br>
semantics by peephole-matching the BPF instruction sequence.<br>
<br>
This patch introduces new intrinsic expansion infrastructure for memcpy.<br>
<br>
To get a stable in-order load/store sequence from memcpy, we first lower<br>
memcpy into a BPF::MEMCPY node, which is then expanded into in-order load/store<br>
sequences in the expandPostRAPseudo pass, which runs after instruction<br>
scheduling. This way, kernel JIT backends can reliably recognize<br>
memcpy by scanning the BPF instruction sequence.<br>
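<br>
As a rough illustration of the in-order expansion described above, here is a<br>
minimal standalone sketch. It is not the code from this patch: the CopyStep<br>
type and the planInOrderCopy helper are hypothetical names, and the sketch only<br>
models which load/store pairs are emitted (alignment-sized pairs first, then a<br>
4/2/1-byte tail), assuming an alignment of 1, 2, 4 or 8.<br>

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper type (not part of the patch): one load/store pair that
// copies `width` bytes at byte offset `offset`.
struct CopyStep {
  unsigned width;
  uint64_t offset;
};

// Hypothetical stand-in for the post-RA expansion: returns the load/store
// pairs, in emission order, for copying `len` bytes at the given alignment.
std::vector<CopyStep> planInOrderCopy(uint64_t len, unsigned align) {
  assert(align == 1 || align == 2 || align == 4 || align == 8);
  std::vector<CopyStep> steps;
  uint64_t offset = 0;

  // Main part: as many alignment-sized pairs as fit into `len`.
  for (uint64_t i = 0; i < len / align; ++i, offset += align)
    steps.push_back({align, offset});

  // Tail: the remaining bytes (fewer than `align`) are copied with
  // 4-byte, then 2-byte, then 1-byte pairs, still in increasing offset order.
  uint64_t left = len % align;
  for (unsigned w : {4u, 2u, 1u}) {
    if (left & w) {
      steps.push_back({w, offset});
      offset += w;
    }
  }
  return steps;
}
```

For example, planInOrderCopy(19, 4) yields four 4-byte pairs at offsets 0, 4, 8<br>
and 12, followed by a 2-byte pair at offset 16 and a 1-byte pair at offset 18.<br>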
<br>
This new memcpy expansion infrastructure is gated by a new option:<br>
<br>
  -bpf-expand-memcpy-in-order<br>
<br>
Acked-by: Jakub Kicinski <<a href="mailto:jakub.kicinski@netronome.com">jakub.kicinski@netronome.com</a>><br>
Signed-off-by: Jiong Wang <<a href="mailto:jiong.wang@netronome.com">jiong.wang@netronome.com</a>><br>
Signed-off-by: Yonghong Song <<a href="mailto:yhs@fb.com">yhs@fb.com</a>><br>
<br>
Added:<br>
    llvm/trunk/lib/Target/BPF/<wbr>BPFSelectionDAGInfo.cpp<br>
    llvm/trunk/lib/Target/BPF/<wbr>BPFSelectionDAGInfo.h<br>
    llvm/trunk/test/CodeGen/BPF/<wbr>memcpy-expand-in-order.ll<br>
Modified:<br>
    llvm/trunk/lib/Target/BPF/<wbr>BPFISelLowering.cpp<br>
    llvm/trunk/lib/Target/BPF/<wbr>BPFISelLowering.h<br>
    llvm/trunk/lib/Target/BPF/<wbr>BPFInstrInfo.cpp<br>
    llvm/trunk/lib/Target/BPF/<wbr>BPFInstrInfo.h<br>
    llvm/trunk/lib/Target/BPF/<wbr>BPFInstrInfo.td<br>
    llvm/trunk/lib/Target/BPF/<wbr>BPFSubtarget.h<br>
    llvm/trunk/lib/Target/BPF/<wbr>CMakeLists.txt<br>
<br>
Modified: llvm/trunk/lib/Target/BPF/<wbr>BPFISelLowering.cpp<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/BPF/BPFISelLowering.cpp?rev=337977&r1=337976&r2=337977&view=diff" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-<wbr>project/llvm/trunk/lib/Target/<wbr>BPF/BPFISelLowering.cpp?rev=<wbr>337977&r1=337976&r2=337977&<wbr>view=diff</a><br>
==============================<wbr>==============================<wbr>==================<br>
--- llvm/trunk/lib/Target/BPF/<wbr>BPFISelLowering.cpp (original)<br>
+++ llvm/trunk/lib/Target/BPF/<wbr>BPFISelLowering.cpp Wed Jul 25 15:40:02 2018<br>
@@ -33,6 +33,10 @@ using namespace llvm;<br>
<br>
 #define DEBUG_TYPE "bpf-lower"<br>
<br>
+static cl::opt<bool> BPFExpandMemcpyInOrder("bpf-<wbr>expand-memcpy-in-order",<br>
+  cl::Hidden, cl::init(false),<br>
+  cl::desc("Expand memcpy into load/store pairs in order"));<br>
+<br>
 static void fail(const SDLoc &DL, SelectionDAG &DAG, const Twine &Msg) {<br>
   MachineFunction &MF = DAG.getMachineFunction();<br>
   DAG.getContext()->diagnose(<br>
@@ -132,10 +136,30 @@ BPFTargetLowering::<wbr>BPFTargetLowering(con<br>
   setMinFunctionAlignment(3);<br>
   setPrefFunctionAlignment(3);<br>
<br>
-  // inline memcpy() for kernel to see explicit copy<br>
-  MaxStoresPerMemset = MaxStoresPerMemsetOptSize = 128;<br>
-  MaxStoresPerMemcpy = MaxStoresPerMemcpyOptSize = 128;<br>
-  MaxStoresPerMemmove = MaxStoresPerMemmoveOptSize = 128;<br>
+  if (BPFExpandMemcpyInOrder) {<br>
+    // LLVM generic code will try to expand memcpy into load/store pairs at this<br>
+    // stage which is before quite a few IR optimization passes, therefore the<br>
+    // loads and stores could potentially be moved apart from each other which<br>
+    // will cause trouble to memcpy pattern matcher inside kernel eBPF JIT<br>
+    // compilers.<br>
+    //<br>
+    // When -bpf-expand-memcpy-in-order is specified, we defer the expansion of<br>
+    // memcpy to a later stage in the pipeline so that those load/store<br>
+    // pairs won't be touched and can be kept in order. Hence, we set<br>
+    // MaxStoresPerMem* to zero to disable the generic getMemcpyLoadsAndStores<br>
+    // code path, and ask LLVM to use target expander EmitTargetCodeForMemcpy.<br>
+    MaxStoresPerMemset = MaxStoresPerMemsetOptSize = 0;<br>
+    MaxStoresPerMemcpy = MaxStoresPerMemcpyOptSize = 0;<br>
+    MaxStoresPerMemmove = MaxStoresPerMemmoveOptSize = 0;<br>
+  } else {<br>
+    // inline memcpy() for kernel to see explicit copy<br>
+    unsigned CommonMaxStores =<br>
+      STI.getSelectionDAGInfo()-><wbr>getCommonMaxStoresPerMemFunc()<wbr>;<br>
+<br>
+    MaxStoresPerMemset = MaxStoresPerMemsetOptSize = CommonMaxStores;<br>
+    MaxStoresPerMemcpy = MaxStoresPerMemcpyOptSize = CommonMaxStores;<br>
+    MaxStoresPerMemmove = MaxStoresPerMemmoveOptSize = CommonMaxStores;<br>
+  }<br>
<br>
   // CPU/Feature control<br>
   HasAlu32 = STI.getHasAlu32();<br>
@@ -518,6 +542,8 @@ const char *BPFTargetLowering::getTarget<br>
     return "BPFISD::BR_CC";<br>
   case BPFISD::Wrapper:<br>
     return "BPFISD::Wrapper";<br>
+  case BPFISD::MEMCPY:<br>
+    return "BPFISD::MEMCPY";<br>
   }<br>
   return nullptr;<br>
 }<br>
@@ -557,6 +583,37 @@ BPFTargetLowering::<wbr>EmitSubregExt(Machine<br>
 }<br>
<br>
 MachineBasicBlock *<br>
+BPFTargetLowering::<wbr>EmitInstrWithCustomInserterMem<wbr>cpy(MachineInstr &MI,<br>
+                                                     MachineBasicBlock *BB)<br>
+                                                     const {<br>
+  MachineFunction *MF = MI.getParent()->getParent();<br>
+  MachineRegisterInfo &MRI = MF->getRegInfo();<br>
+  MachineInstrBuilder MIB(*MF, MI);<br>
+  unsigned ScratchReg;<br>
+<br>
+  // This function does custom insertion during lowering BPFISD::MEMCPY which<br>
+  // only has two register operands from memcpy semantics, the copy source<br>
+  // address and the copy destination address.<br>
+  //<br>
+  // Because we will expand BPFISD::MEMCPY into load/store pairs, we will need<br>
+  // a third scratch register to serve as the destination register of load and<br>
+  // source register of store.<br>
+  //<br>
+  // The scratch register here is with the Define | Dead | EarlyClobber flags.<br>
+  // The EarlyClobber flag has the semantic property that the operand it is<br>
+  // attached to is clobbered before the rest of the inputs are read. Hence it<br>
+  // must be unique among the operands to the instruction. The Define flag is<br>
+  // needed to convince the machine verifier that an Undef value isn't a problem,<br>
+  // as we are loading memory into it anyway. The Dead flag is needed as the<br>
+  // value in scratch isn't supposed to be used by any other instruction.<br>
+  ScratchReg = MRI.createVirtualRegister(&<wbr>BPF::GPRRegClass);<br>
+  MIB.addReg(ScratchReg,<br>
+             RegState::Define | RegState::Dead | RegState::EarlyClobber);<br>
+<br>
+  return BB;<br>
+}<br>
+<br>
+MachineBasicBlock *<br>
 BPFTargetLowering::<wbr>EmitInstrWithCustomInserter(<wbr>MachineInstr &MI,<br>
                                                MachineBasicBlock *BB) const {<br>
   const TargetInstrInfo &TII = *BB->getParent()-><wbr>getSubtarget().getInstrInfo();<br>
@@ -567,6 +624,8 @@ BPFTargetLowering::<wbr>EmitInstrWithCustomIn<br>
                        Opc == BPF::Select_32 ||<br>
                        Opc == BPF::Select_32_64);<br>
<br>
+  bool isMemcpyOp = Opc == BPF::MEMCPY;<br>
+<br>
 #ifndef NDEBUG<br>
   bool isSelectRIOp = (Opc == BPF::Select_Ri ||<br>
                        Opc == BPF::Select_Ri_64_32 ||<br>
@@ -574,9 +633,13 @@ BPFTargetLowering::<wbr>EmitInstrWithCustomIn<br>
                        Opc == BPF::Select_Ri_32_64);<br>
<br>
<br>
-  assert((isSelectRROp || isSelectRIOp) && "Unexpected instr type to insert");<br>
+  assert((isSelectRROp || isSelectRIOp || isMemcpyOp) &&<br>
+         "Unexpected instr type to insert");<br>
 #endif<br>
<br>
+  if (isMemcpyOp)<br>
+    return EmitInstrWithCustomInserterMem<wbr>cpy(MI, BB);<br>
+<br>
   bool is32BitCmp = (Opc == BPF::Select_32 ||<br>
                      Opc == BPF::Select_32_64 ||<br>
                      Opc == BPF::Select_Ri_32 ||<br>
<br>
Modified: llvm/trunk/lib/Target/BPF/<wbr>BPFISelLowering.h<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/BPF/BPFISelLowering.h?rev=337977&r1=337976&r2=337977&view=diff" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-<wbr>project/llvm/trunk/lib/Target/<wbr>BPF/BPFISelLowering.h?rev=<wbr>337977&r1=337976&r2=337977&<wbr>view=diff</a><br>
==============================<wbr>==============================<wbr>==================<br>
--- llvm/trunk/lib/Target/BPF/<wbr>BPFISelLowering.h (original)<br>
+++ llvm/trunk/lib/Target/BPF/<wbr>BPFISelLowering.h Wed Jul 25 15:40:02 2018<br>
@@ -28,7 +28,8 @@ enum NodeType : unsigned {<br>
   CALL,<br>
   SELECT_CC,<br>
   BR_CC,<br>
-  Wrapper<br>
+  Wrapper,<br>
+  MEMCPY<br>
 };<br>
 }<br>
<br>
@@ -110,6 +111,11 @@ private:<br>
<br>
   unsigned EmitSubregExt(MachineInstr &MI, MachineBasicBlock *BB, unsigned Reg,<br>
                          bool isSigned) const;<br>
+<br>
+  MachineBasicBlock * EmitInstrWithCustomInserterMem<wbr>cpy(MachineInstr &MI,<br>
+                                                        MachineBasicBlock *BB)<br>
+                                                        const;<br>
+<br>
 };<br>
 }<br>
<br>
<br>
Modified: llvm/trunk/lib/Target/BPF/<wbr>BPFInstrInfo.cpp<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/BPF/BPFInstrInfo.cpp?rev=337977&r1=337976&r2=337977&view=diff" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-<wbr>project/llvm/trunk/lib/Target/<wbr>BPF/BPFInstrInfo.cpp?rev=<wbr>337977&r1=337976&r2=337977&<wbr>view=diff</a><br>
==============================<wbr>==============================<wbr>==================<br>
--- llvm/trunk/lib/Target/BPF/<wbr>BPFInstrInfo.cpp (original)<br>
+++ llvm/trunk/lib/Target/BPF/<wbr>BPFInstrInfo.cpp Wed Jul 25 15:40:02 2018<br>
@@ -43,6 +43,83 @@ void BPFInstrInfo::copyPhysReg(<wbr>MachineBa<br>
     llvm_unreachable("Impossible reg-to-reg copy");<br>
 }<br>
<br>
+void BPFInstrInfo::expandMEMCPY(<wbr>MachineBasicBlock::iterator MI) const {<br>
+  unsigned DstReg = MI->getOperand(0).getReg();<br>
+  unsigned SrcReg = MI->getOperand(1).getReg();<br>
+  uint64_t CopyLen = MI->getOperand(2).getImm();<br>
+  uint64_t Alignment = MI->getOperand(3).getImm();<br>
+  unsigned ScratchReg = MI->getOperand(4).getReg();<br>
+  MachineBasicBlock *BB = MI->getParent();<br>
+  DebugLoc dl = MI->getDebugLoc();<br>
+  unsigned LdOpc, StOpc;<br>
+<br>
+  switch (Alignment) {<br>
+  case 1:<br>
+    LdOpc = BPF::LDB;<br>
+    StOpc = BPF::STB;<br>
+    break;<br>
+  case 2:<br>
+    LdOpc = BPF::LDH;<br>
+    StOpc = BPF::STH;<br>
+    break;<br>
+  case 4:<br>
+    LdOpc = BPF::LDW;<br>
+    StOpc = BPF::STW;<br>
+    break;<br>
+  case 8:<br>
+    LdOpc = BPF::LDD;<br>
+    StOpc = BPF::STD;<br>
+    break;<br>
+  default:<br>
+    llvm_unreachable("unsupported memcpy alignment");<br>
+  }<br>
+<br>
+  unsigned IterationNum = CopyLen >> Log2_64(Alignment);<br>
+  for(unsigned I = 0; I < IterationNum; ++I) {<br>
+    BuildMI(*BB, MI, dl, get(LdOpc))<br>
+            .addReg(ScratchReg).addReg(<wbr>SrcReg).addImm(I * Alignment);<br>
+    BuildMI(*BB, MI, dl, get(StOpc))<br>
+            .addReg(ScratchReg).addReg(<wbr>DstReg).addImm(I * Alignment);<br>
+  }<br>
+<br>
+  unsigned BytesLeft = CopyLen & (Alignment - 1);<br>
+  unsigned Offset = IterationNum * Alignment;<br>
+  bool Hanging4Byte = BytesLeft & 0x4;<br>
+  bool Hanging2Byte = BytesLeft & 0x2;<br>
+  bool Hanging1Byte = BytesLeft & 0x1;<br>
+  if (Hanging4Byte) {<br>
+    BuildMI(*BB, MI, dl, get(BPF::LDW))<br>
+            .addReg(ScratchReg).addReg(<wbr>SrcReg).addImm(Offset);<br>
+    BuildMI(*BB, MI, dl, get(BPF::STW))<br>
+            .addReg(ScratchReg).addReg(<wbr>DstReg).addImm(Offset);<br>
+    Offset += 4;<br>
+  }<br>
+  if (Hanging2Byte) {<br>
+    BuildMI(*BB, MI, dl, get(BPF::LDH))<br>
+            .addReg(ScratchReg).addReg(<wbr>SrcReg).addImm(Offset);<br>
+    BuildMI(*BB, MI, dl, get(BPF::STH))<br>
+            .addReg(ScratchReg).addReg(<wbr>DstReg).addImm(Offset);<br>
+    Offset += 2;<br>
+  }<br>
+  if (Hanging1Byte) {<br>
+    BuildMI(*BB, MI, dl, get(BPF::LDB))<br>
+            .addReg(ScratchReg).addReg(<wbr>SrcReg).addImm(Offset);<br>
+    BuildMI(*BB, MI, dl, get(BPF::STB))<br>
+            .addReg(ScratchReg).addReg(<wbr>DstReg).addImm(Offset);<br>
+  }<br>
+<br>
+  BB->erase(MI);<br>
+}<br>
+<br>
+bool BPFInstrInfo::<wbr>expandPostRAPseudo(<wbr>MachineInstr &MI) const {<br>
+  if (MI.getOpcode() == BPF::MEMCPY) {<br>
+    expandMEMCPY(MI);<br>
+    return true;<br>
+  }<br>
+<br>
+  return false;<br>
+}<br>
+<br>
 void BPFInstrInfo::<wbr>storeRegToStackSlot(<wbr>MachineBasicBlock &MBB,<br>
                                        MachineBasicBlock::iterator I,<br>
                                        unsigned SrcReg, bool IsKill, int FI,<br>
<br>
Modified: llvm/trunk/lib/Target/BPF/<wbr>BPFInstrInfo.h<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/BPF/BPFInstrInfo.h?rev=337977&r1=337976&r2=337977&view=diff" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-<wbr>project/llvm/trunk/lib/Target/<wbr>BPF/BPFInstrInfo.h?rev=337977&<wbr>r1=337976&r2=337977&view=diff</a><br>
==============================<wbr>==============================<wbr>==================<br>
--- llvm/trunk/lib/Target/BPF/<wbr>BPFInstrInfo.h (original)<br>
+++ llvm/trunk/lib/Target/BPF/<wbr>BPFInstrInfo.h Wed Jul 25 15:40:02 2018<br>
@@ -34,6 +34,8 @@ public:<br>
                    const DebugLoc &DL, unsigned DestReg, unsigned SrcReg,<br>
                    bool KillSrc) const override;<br>
<br>
+  bool expandPostRAPseudo(<wbr>MachineInstr &MI) const override;<br>
+<br>
   void storeRegToStackSlot(<wbr>MachineBasicBlock &MBB,<br>
                            MachineBasicBlock::iterator MBBI, unsigned SrcReg,<br>
                            bool isKill, int FrameIndex,<br>
@@ -55,6 +57,9 @@ public:<br>
                         MachineBasicBlock *FBB, ArrayRef<MachineOperand> Cond,<br>
                         const DebugLoc &DL,<br>
                         int *BytesAdded = nullptr) const override;<br>
+private:<br>
+  void expandMEMCPY(<wbr>MachineBasicBlock::iterator) const;<br>
+<br>
 };<br>
 }<br>
<br>
<br>
Modified: llvm/trunk/lib/Target/BPF/<wbr>BPFInstrInfo.td<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/BPF/BPFInstrInfo.td?rev=337977&r1=337976&r2=337977&view=diff" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-<wbr>project/llvm/trunk/lib/Target/<wbr>BPF/BPFInstrInfo.td?rev=<wbr>337977&r1=337976&r2=337977&<wbr>view=diff</a><br>
==============================<wbr>==============================<wbr>==================<br>
--- llvm/trunk/lib/Target/BPF/<wbr>BPFInstrInfo.td (original)<br>
+++ llvm/trunk/lib/Target/BPF/<wbr>BPFInstrInfo.td Wed Jul 25 15:40:02 2018<br>
@@ -28,6 +28,10 @@ def SDT_BPFBrCC         : SDTypeProfile<<br>
                                                SDTCisVT<3, OtherVT>]>;<br>
 def SDT_BPFWrapper      : SDTypeProfile<1, 1, [SDTCisSameAs<0, 1>,<br>
                                                SDTCisPtrTy<0>]>;<br>
+def SDT_BPFMEMCPY       : SDTypeProfile<0, 4, [SDTCisVT<0, i64>,<br>
+                                               SDTCisVT<1, i64>,<br>
+                                               SDTCisVT<2, i64>,<br>
+                                               SDTCisVT<3, i64>]>;<br>
<br>
 def BPFcall         : SDNode<"BPFISD::CALL", SDT_BPFCall,<br>
                              [SDNPHasChain, SDNPOptInGlue, SDNPOutGlue,<br>
@@ -43,6 +47,9 @@ def BPFbrcc         : SDNode<"BPFISD::BR<br>
<br>
 def BPFselectcc     : SDNode<"BPFISD::SELECT_CC", SDT_BPFSelectCC, [SDNPInGlue]>;<br>
 def BPFWrapper      : SDNode<"BPFISD::Wrapper", SDT_BPFWrapper>;<br>
+def BPFmemcpy       : SDNode<"BPFISD::MEMCPY", SDT_BPFMEMCPY,<br>
+                             [SDNPHasChain, SDNPInGlue, SDNPOutGlue,<br>
+                              SDNPMayStore, SDNPMayLoad]>;<br>
 def BPFIsLittleEndian : Predicate<"CurDAG-><wbr>getDataLayout().<wbr>isLittleEndian()">;<br>
 def BPFIsBigEndian    : Predicate<"!CurDAG-><wbr>getDataLayout().<wbr>isLittleEndian()">;<br>
 def BPFHasALU32 : Predicate<"Subtarget-><wbr>getHasAlu32()">;<br>
@@ -714,3 +721,11 @@ let Predicates = [BPFHasALU32] in {<br>
   def : Pat<(i64 (extloadi32 ADDRri:$src)),<br>
             (SUBREG_TO_REG (i64 0), (LDW32 ADDRri:$src), sub_32)>;<br>
 }<br>
+<br>
+let usesCustomInserter = 1, isCodeGenOnly = 1 in {<br>
+    def MEMCPY : Pseudo<<br>
+      (outs),<br>
+      (ins GPR:$dst, GPR:$src, i64imm:$len, i64imm:$align, variable_ops),<br>
+      "#memcpy dst: $dst, src: $src, len: $len, align: $align",<br>
+      [(BPFmemcpy GPR:$dst, GPR:$src, imm:$len, imm:$align)]>;<br>
+}<br>
<br>
Added: llvm/trunk/lib/Target/BPF/<wbr>BPFSelectionDAGInfo.cpp<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/BPF/BPFSelectionDAGInfo.cpp?rev=337977&view=auto" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-<wbr>project/llvm/trunk/lib/Target/<wbr>BPF/BPFSelectionDAGInfo.cpp?<wbr>rev=337977&view=auto</a><br>
==============================<wbr>==============================<wbr>==================<br>
--- llvm/trunk/lib/Target/BPF/<wbr>BPFSelectionDAGInfo.cpp (added)<br>
+++ llvm/trunk/lib/Target/BPF/<wbr>BPFSelectionDAGInfo.cpp Wed Jul 25 15:40:02 2018<br>
@@ -0,0 +1,43 @@<br>
+//===-- BPFSelectionDAGInfo.cpp - BPF SelectionDAG Info -------------------===//<br>
+//<br>
+//                     The LLVM Compiler Infrastructure<br>
+//<br>
+// This file is distributed under the University of Illinois Open Source<br>
+// License. See LICENSE.TXT for details.<br>
+//<br>
+//===------------------------<wbr>------------------------------<wbr>----------------===//<br>
+//<br>
+// This file implements the BPFSelectionDAGInfo class.<br>
+//<br>
+//===------------------------<wbr>------------------------------<wbr>----------------===//<br>
+<br>
+#include "BPFTargetMachine.h"<br>
+#include "llvm/CodeGen/SelectionDAG.h"<br>
+#include "llvm/IR/DerivedTypes.h"<br>
+using namespace llvm;<br>
+<br>
+#define DEBUG_TYPE "bpf-selectiondag-info"<br>
+<br>
+SDValue BPFSelectionDAGInfo::<wbr>EmitTargetCodeForMemcpy(<br>
+    SelectionDAG &DAG, const SDLoc &dl, SDValue Chain, SDValue Dst, SDValue Src,<br>
+    SDValue Size, unsigned Align, bool isVolatile, bool AlwaysInline,<br>
+    MachinePointerInfo DstPtrInfo, MachinePointerInfo SrcPtrInfo) const {<br>
+  // Requires the copy size to be a constant.<br>
+  ConstantSDNode *ConstantSize = dyn_cast<ConstantSDNode>(Size)<wbr>;<br>
+  if (!ConstantSize)<br>
+    return SDValue();<br>
+<br>
+  unsigned CopyLen = ConstantSize->getZExtValue();<br>
+  unsigned StoresNumEstimate = alignTo(CopyLen, Align) >> Log2_32(Align);<br>
+  // Impose the same copy length limit as MaxStoresPerMemcpy.<br>
+  if (StoresNumEstimate > getCommonMaxStoresPerMemFunc()<wbr>)<br>
+    return SDValue();<br>
+<br>
+  SDVTList VTs = DAG.getVTList(MVT::Other, MVT::Glue);<br>
+<br>
+  Dst = DAG.getNode(BPFISD::MEMCPY, dl, VTs, Chain, Dst, Src,<br>
+                    DAG.getConstant(CopyLen, dl, MVT::i64),<br>
+                    DAG.getConstant(Align, dl, MVT::i64));<br>
+<br>
+  return Dst.getValue(0);<br>
+}<br>
<br>
Added: llvm/trunk/lib/Target/BPF/<wbr>BPFSelectionDAGInfo.h<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/BPF/BPFSelectionDAGInfo.h?rev=337977&view=auto" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-<wbr>project/llvm/trunk/lib/Target/<wbr>BPF/BPFSelectionDAGInfo.h?rev=<wbr>337977&view=auto</a><br>
==============================<wbr>==============================<wbr>==================<br>
--- llvm/trunk/lib/Target/BPF/<wbr>BPFSelectionDAGInfo.h (added)<br>
+++ llvm/trunk/lib/Target/BPF/<wbr>BPFSelectionDAGInfo.h Wed Jul 25 15:40:02 2018<br>
@@ -0,0 +1,36 @@<br>
+//===-- BPFSelectionDAGInfo.h - BPF SelectionDAG Info -----------*- C++ -*-===//<br>
+//<br>
+//                     The LLVM Compiler Infrastructure<br>
+//<br>
+// This file is distributed under the University of Illinois Open Source<br>
+// License. See LICENSE.TXT for details.<br>
+//<br>
+//===------------------------<wbr>------------------------------<wbr>----------------===//<br>
+//<br>
+// This file defines the BPF subclass for SelectionDAGTargetInfo.<br>
+//<br>
+//===------------------------<wbr>------------------------------<wbr>----------------===//<br>
+<br>
+#ifndef LLVM_LIB_TARGET_BPF_<wbr>BPFSELECTIONDAGINFO_H<br>
+#define LLVM_LIB_TARGET_BPF_<wbr>BPFSELECTIONDAGINFO_H<br>
+<br>
+#include "llvm/CodeGen/<wbr>SelectionDAGTargetInfo.h"<br>
+<br>
+namespace llvm {<br>
+<br>
+class BPFSelectionDAGInfo : public SelectionDAGTargetInfo {<br>
+public:<br>
+  SDValue EmitTargetCodeForMemcpy(<wbr>SelectionDAG &DAG, const SDLoc &dl,<br>
+                                  SDValue Chain, SDValue Dst, SDValue Src,<br>
+                                  SDValue Size, unsigned Align, bool isVolatile,<br>
+                                  bool AlwaysInline,<br>
+                                  MachinePointerInfo DstPtrInfo,<br>
+                                  MachinePointerInfo SrcPtrInfo) const override;<br>
+<br>
+  unsigned getCommonMaxStoresPerMemFunc() const { return 128; }<br>
+<br>
+};<br>
+<br>
+}<br>
+<br>
+#endif<br>
<br>
Modified: llvm/trunk/lib/Target/BPF/<wbr>BPFSubtarget.h<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/BPF/BPFSubtarget.h?rev=337977&r1=337976&r2=337977&view=diff" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-<wbr>project/llvm/trunk/lib/Target/<wbr>BPF/BPFSubtarget.h?rev=337977&<wbr>r1=337976&r2=337977&view=diff</a><br>
==============================<wbr>==============================<wbr>==================<br>
--- llvm/trunk/lib/Target/BPF/<wbr>BPFSubtarget.h (original)<br>
+++ llvm/trunk/lib/Target/BPF/<wbr>BPFSubtarget.h Wed Jul 25 15:40:02 2018<br>
@@ -17,6 +17,7 @@<br>
 #include "BPFFrameLowering.h"<br>
 #include "BPFISelLowering.h"<br>
 #include "BPFInstrInfo.h"<br>
+#include "BPFSelectionDAGInfo.h"<br>
 #include "llvm/CodeGen/<wbr>SelectionDAGTargetInfo.h"<br>
 #include "llvm/CodeGen/<wbr>TargetSubtargetInfo.h"<br>
 #include "llvm/IR/DataLayout.h"<br>
@@ -33,7 +34,7 @@ class BPFSubtarget : public BPFGenSubtar<br>
   BPFInstrInfo InstrInfo;<br>
   BPFFrameLowering FrameLowering;<br>
   BPFTargetLowering TLInfo;<br>
-  SelectionDAGTargetInfo TSInfo;<br>
+  BPFSelectionDAGInfo TSInfo;<br>
<br>
 private:<br>
   void initializeEnvironment();<br>
@@ -75,7 +76,7 @@ public:<br>
   const BPFTargetLowering *getTargetLowering() const override {<br>
     return &TLInfo;<br>
   }<br>
-  const SelectionDAGTargetInfo *getSelectionDAGInfo() const override {<br>
+  const BPFSelectionDAGInfo *getSelectionDAGInfo() const override {<br>
     return &TSInfo;<br>
   }<br>
   const TargetRegisterInfo *getRegisterInfo() const override {<br>
<br>
Modified: llvm/trunk/lib/Target/BPF/<wbr>CMakeLists.txt<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/BPF/CMakeLists.txt?rev=337977&r1=337976&r2=337977&view=diff" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-<wbr>project/llvm/trunk/lib/Target/<wbr>BPF/CMakeLists.txt?rev=337977&<wbr>r1=337976&r2=337977&view=diff</a><br>
==============================<wbr>==============================<wbr>==================<br>
--- llvm/trunk/lib/Target/BPF/<wbr>CMakeLists.txt (original)<br>
+++ llvm/trunk/lib/Target/BPF/<wbr>CMakeLists.txt Wed Jul 25 15:40:02 2018<br>
@@ -20,6 +20,7 @@ add_llvm_target(BPFCodeGen<br>
   BPFISelLowering.cpp<br>
   BPFMCInstLower.cpp<br>
   BPFRegisterInfo.cpp<br>
+  BPFSelectionDAGInfo.cpp<br>
   BPFSubtarget.cpp<br>
   BPFTargetMachine.cpp<br>
   BPFMIPeephole.cpp<br>
<br>
Added: llvm/trunk/test/CodeGen/BPF/<wbr>memcpy-expand-in-order.ll<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/BPF/memcpy-expand-in-order.ll?rev=337977&view=auto" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-<wbr>project/llvm/trunk/test/<wbr>CodeGen/BPF/memcpy-expand-in-<wbr>order.ll?rev=337977&view=auto</a><br>
==============================<wbr>==============================<wbr>==================<br>
--- llvm/trunk/test/CodeGen/BPF/<wbr>memcpy-expand-in-order.ll (added)<br>
+++ llvm/trunk/test/CodeGen/BPF/<wbr>memcpy-expand-in-order.ll Wed Jul 25 15:40:02 2018<br>
@@ -0,0 +1,116 @@<br>
+; RUN: llc < %s -march=bpfel -bpf-expand-memcpy-in-order | FileCheck %s<br>
+; RUN: llc < %s -march=bpfeb -bpf-expand-memcpy-in-order | FileCheck %s<br>
+;<br>
+; #define COPY_LEN     9<br>
+;<br>
+; void cal_align1(void *a, void *b)<br>
+; {<br>
+;   __builtin_memcpy(a, b, COPY_LEN);<br>
+; }<br>
+;<br>
+; void cal_align2(short *a, short *b)<br>
+; {<br>
+;   __builtin_memcpy(a, b, COPY_LEN);<br>
+; }<br>
+;<br>
+; #undef COPY_LEN<br>
+; #define COPY_LEN     19<br>
+; void cal_align4(int *a, int *b)<br>
+; {<br>
+;   __builtin_memcpy(a, b, COPY_LEN);<br>
+; }<br>
+;<br>
+; #undef COPY_LEN<br>
+; #define COPY_LEN     27<br>
+; void cal_align8(long long *a, long long *b)<br>
+; {<br>
+;   __builtin_memcpy(a, b, COPY_LEN);<br>
+; }<br>
+<br>
+; Function Attrs: nounwind<br>
+define dso_local void @cal_align1(i8* nocapture %a, i8* nocapture readonly %b) local_unnamed_addr #0 {<br>
+entry:<br>
+  tail call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 1 %a, i8* align 1 %b, i64 9, i1 false)<br>
+  ret void<br>
+}<br>
+<br>
+; Function Attrs: argmemonly nounwind<br>
+declare void @llvm.memcpy.p0i8.p0i8.i64(i8* nocapture writeonly, i8* nocapture readonly, i64, i1) #1<br>
+<br>
+; CHECK: [[SCRATCH_REG:r[0-9]]] = *(u8 *)([[SRC_REG:r[0-9]]] + 0)<br>
+; CHECK: *(u8 *)([[DST_REG:r[0-9]]] + 0) = [[SCRATCH_REG]]<br>
+; CHECK: [[SCRATCH_REG]] = *(u8 *)([[SRC_REG]] + 1)<br>
+; CHECK: *(u8 *)([[DST_REG]] + 1) = [[SCRATCH_REG]]<br>
+; CHECK: [[SCRATCH_REG]] = *(u8 *)([[SRC_REG]] + 2)<br>
+; CHECK: *(u8 *)([[DST_REG]] + 2) = [[SCRATCH_REG]]<br>
+; CHECK: [[SCRATCH_REG]] = *(u8 *)([[SRC_REG]] + 3)<br>
+; CHECK: *(u8 *)([[DST_REG]] + 3) = [[SCRATCH_REG]]<br>
+; CHECK: [[SCRATCH_REG]] = *(u8 *)([[SRC_REG]] + 4)<br>
+; CHECK: *(u8 *)([[DST_REG]] + 4) = [[SCRATCH_REG]]<br>
+; CHECK: [[SCRATCH_REG]] = *(u8 *)([[SRC_REG]] + 5)<br>
+; CHECK: *(u8 *)([[DST_REG]] + 5) = [[SCRATCH_REG]]<br>
+; CHECK: [[SCRATCH_REG]] = *(u8 *)([[SRC_REG]] + 6)<br>
+; CHECK: *(u8 *)([[DST_REG]] + 6) = [[SCRATCH_REG]]<br>
+; CHECK: [[SCRATCH_REG]] = *(u8 *)([[SRC_REG]] + 7)<br>
+; CHECK: *(u8 *)([[DST_REG]] + 7) = [[SCRATCH_REG]]<br>
+; CHECK: [[SCRATCH_REG]] = *(u8 *)([[SRC_REG]] + 8)<br>
+; CHECK: *(u8 *)([[DST_REG]] + 8) = [[SCRATCH_REG]]<br>
+<br>
+; Function Attrs: nounwind<br>
+define dso_local void @cal_align2(i16* nocapture %a, i16* nocapture readonly %b) local_unnamed_addr #0 {<br>
+entry:<br>
+  %0 = bitcast i16* %a to i8*<br>
+  %1 = bitcast i16* %b to i8*<br>
+  tail call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 2 %0, i8* align 2 %1, i64 9, i1 false)<br>
+  ret void<br>
+}<br>
+; CHECK: [[SCRATCH_REG:r[0-9]]] = *(u16 *)([[SRC_REG:r[0-9]]] + 0)<br>
+; CHECK: *(u16 *)([[DST_REG:r[0-9]]] + 0) = [[SCRATCH_REG]]<br>
+; CHECK: [[SCRATCH_REG]] = *(u16 *)([[SRC_REG]] + 2)<br>
+; CHECK: *(u16 *)([[DST_REG]] + 2) = [[SCRATCH_REG]]<br>
+; CHECK: [[SCRATCH_REG]] = *(u16 *)([[SRC_REG]] + 4)<br>
+; CHECK: *(u16 *)([[DST_REG]] + 4) = [[SCRATCH_REG]]<br>
+; CHECK: [[SCRATCH_REG]] = *(u16 *)([[SRC_REG]] + 6)<br>
+; CHECK: *(u16 *)([[DST_REG]] + 6) = [[SCRATCH_REG]]<br>
+; CHECK: [[SCRATCH_REG]] = *(u8 *)([[SRC_REG]] + 8)<br>
+; CHECK: *(u8 *)([[DST_REG]] + 8) = [[SCRATCH_REG]]<br>
+<br>
+; Function Attrs: nounwind<br>
+define dso_local void @cal_align4(i32* nocapture %a, i32* nocapture readonly %b) local_unnamed_addr #0 {<br>
+entry:<br>
+  %0 = bitcast i32* %a to i8*<br>
+  %1 = bitcast i32* %b to i8*<br>
+  tail call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 4 %0, i8* align 4 %1, i64 19, i1 false)<br>
+  ret void<br>
+}<br>
+; CHECK: [[SCRATCH_REG:r[0-9]]] = *(u32 *)([[SRC_REG:r[0-9]]] + 0)<br>
+; CHECK: *(u32 *)([[DST_REG:r[0-9]]] + 0) = [[SCRATCH_REG]]<br>
+; CHECK: [[SCRATCH_REG]] = *(u32 *)([[SRC_REG]] + 4)<br>
+; CHECK: *(u32 *)([[DST_REG]] + 4) = [[SCRATCH_REG]]<br>
+; CHECK: [[SCRATCH_REG]] = *(u32 *)([[SRC_REG]] + 8)<br>
+; CHECK: *(u32 *)([[DST_REG]] + 8) = [[SCRATCH_REG]]<br>
+; CHECK: [[SCRATCH_REG]] = *(u32 *)([[SRC_REG]] + 12)<br>
+; CHECK: *(u32 *)([[DST_REG]] + 12) = [[SCRATCH_REG]]<br>
+; CHECK: [[SCRATCH_REG]] = *(u16 *)([[SRC_REG]] + 16)<br>
+; CHECK: *(u16 *)([[DST_REG]] + 16) = [[SCRATCH_REG]]<br>
+; CHECK: [[SCRATCH_REG]] = *(u8 *)([[SRC_REG]] + 18)<br>
+; CHECK: *(u8 *)([[DST_REG]] + 18) = [[SCRATCH_REG]]<br>
+<br>
+; Function Attrs: nounwind<br>
+define dso_local void @cal_align8(i64* nocapture %a, i64* nocapture readonly %b) local_unnamed_addr #0 {<br>
+entry:<br>
+  %0 = bitcast i64* %a to i8*<br>
+  %1 = bitcast i64* %b to i8*<br>
+  tail call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 8 %0, i8* align 8 %1, i64 27, i1 false)<br>
+  ret void<br>
+}<br>
+; CHECK: [[SCRATCH_REG:r[0-9]]] = *(u64 *)([[SRC_REG:r[0-9]]] + 0)<br>
+; CHECK: *(u64 *)([[DST_REG:r[0-9]]] + 0) = [[SCRATCH_REG]]<br>
+; CHECK: [[SCRATCH_REG]] = *(u64 *)([[SRC_REG]] + 8)<br>
+; CHECK: *(u64 *)([[DST_REG]] + 8) = [[SCRATCH_REG]]<br>
+; CHECK: [[SCRATCH_REG]] = *(u64 *)([[SRC_REG]] + 16)<br>
+; CHECK: *(u64 *)([[DST_REG]] + 16) = [[SCRATCH_REG]]<br>
+; CHECK: [[SCRATCH_REG]] = *(u16 *)([[SRC_REG]] + 24)<br>
+; CHECK: *(u16 *)([[DST_REG]] + 24) = [[SCRATCH_REG]]<br>
+; CHECK: [[SCRATCH_REG]] = *(u8 *)([[SRC_REG]] + 26)<br>
+; CHECK: *(u8 *)([[DST_REG]] + 26) = [[SCRATCH_REG]]<br>
<br>
<br>
</blockquote></div><br></div>