[llvm] r337977 - bpf: new option -bpf-expand-memcpy-in-order to expand memcpy in order

Yonghong Song via llvm-commits llvm-commits at lists.llvm.org
Thu Jul 26 13:52:45 PDT 2018



On 7/26/18 12:33 PM, Galina Kistanova wrote:
> Hello Yonghong,
> 
> This commit broke tests on one of our builders:
> http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-win/builds/11261 
> 
> . . .
> Failing Tests (2):
>      LLVM :: CodeGen/BPF/memcpy-expand-in-order.ll
>      LLVM :: tools/dsymutil/X86/accelerator.test
> 
> 
> Please have a look?

Thanks.
I can reproduce the issues with the following:

-bash-4.2$ llc -march=bpfel -bpf-expand-memcpy-in-order -verify-machineinstrs memcpy-expand-in-order.ll

# After Post-RA pseudo instruction expansion pass
# Machine code for function cal_align1: NoPHIs, NoVRegs
Function Live Ins: $r1, $r2

bb.0.entry:
   LDB $r3, $r2, 0
   STB $r3, $r1, 0
   LDB $r3, $r2, 1
   STB $r3, $r1, 1
   LDB $r3, $r2, 2
   STB $r3, $r1, 2
   LDB $r3, $r2, 3
   STB $r3, $r1, 3
   LDB $r3, $r2, 4
   STB $r3, $r1, 4
   LDB $r3, $r2, 5
   STB $r3, $r1, 5
   LDB $r3, $r2, 6
   STB $r3, $r1, 6
   LDB $r3, $r2, 7
   STB $r3, $r1, 7
   LDB $r3, $r2, 8
   STB $r3, $r1, 8
   RET

# End machine code for function cal_align1.
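
For reference, the nine byte-sized load/store pairs in this dump match the
arithmetic in expandMEMCPY from the patch quoted below (length 9, alignment 1).
A minimal standalone sketch of that arithmetic, assuming the alignment is
1, 2, 4 or 8; planMemcpy and CopyStep are illustrative names that do not
exist in LLVM:

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// One load/store pair: access width in bytes and byte offset from the base.
// (Illustrative type, not part of LLVM.)
struct CopyStep {
  unsigned Width;
  uint64_t Offset;
};

// Mirrors the offset arithmetic of expandMEMCPY in the quoted patch.
// It only computes the plan; it does not build any machine instructions.
std::vector<CopyStep> planMemcpy(uint64_t CopyLen, uint64_t Alignment) {
  // Same set of alignments that the switch in the patch accepts.
  assert(Alignment == 1 || Alignment == 2 || Alignment == 4 || Alignment == 8);

  std::vector<CopyStep> Steps;

  // Main loop: one alignment-sized pair per full chunk.
  uint64_t IterationNum = CopyLen / Alignment;
  for (uint64_t I = 0; I < IterationNum; ++I)
    Steps.push_back({static_cast<unsigned>(Alignment), I * Alignment});

  // Tail: the leftover bytes are covered by at most one 4-, 2- and 1-byte pair.
  uint64_t BytesLeft = CopyLen & (Alignment - 1);
  uint64_t Offset = IterationNum * Alignment;
  if (BytesLeft & 4) {
    Steps.push_back({4u, Offset});
    Offset += 4;
  }
  if (BytesLeft & 2) {
    Steps.push_back({2u, Offset});
    Offset += 2;
  }
  if (BytesLeft & 1)
    Steps.push_back({1u, Offset});

  return Steps;
}
```

With these definitions, planMemcpy(9, 1) gives nine 1-byte steps at offsets
0 through 8, and planMemcpy(27, 8) gives three 8-byte steps followed by a
2-byte step at offset 24 and a 1-byte step at offset 26, which lines up with
the CHECK lines in the test.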

*** Bad machine code: Explicit definition marked as use ***
- function:    cal_align1
- basic block: %bb.0 entry (0x47edd98)
- instruction: LDB $r3, $r2, 0
...
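
My reading of this diagnostic, stated as an assumption rather than a confirmed
root cause: the machine verifier expects the leading operands that an
instruction description declares as explicit defs to be marked as defs, and
the loads built in expandMEMCPY add ScratchReg with a plain
addReg(ScratchReg), which creates a use operand. The toy model below only
illustrates that rule; ToyOperand, ToyInstr and checkExplicitDefs are
hypothetical names, not LLVM APIs:

```cpp
#include <string>
#include <vector>

// Toy stand-in for a machine operand (hypothetical, not an LLVM type).
struct ToyOperand {
  bool IsReg; // operand is a register
  bool IsDef; // operand is flagged as a definition
};

// Toy stand-in for an instruction plus its description (hypothetical).
struct ToyInstr {
  unsigned NumDefs;            // explicit defs declared by the description
  std::vector<ToyOperand> Ops; // operands in the order they were added
};

// Returns an empty string when the explicit-def operands look consistent,
// otherwise a diagnostic string for the first offending operand.
std::string checkExplicitDefs(const ToyInstr &MI) {
  for (unsigned I = 0; I < MI.NumDefs && I < MI.Ops.size(); ++I) {
    if (!MI.Ops[I].IsReg)
      return "Explicit definition must be a register";
    if (!MI.Ops[I].IsDef)
      return "Explicit definition marked as use";
  }
  return "";
}
```

In this model, a load whose first operand was added as a use fails the check,
while the same load with that operand flagged as a def passes. If the reading
is right, adding the load destination with RegState::Define in expandMEMCPY
would be the corresponding change.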

Will try to fix the issue soon.

> 
> Thanks
> 
> Galina
> 
> On Wed, Jul 25, 2018 at 3:40 PM, Yonghong Song via llvm-commits 
> <llvm-commits at lists.llvm.org> wrote:
> 
>     Author: yhs
>     Date: Wed Jul 25 15:40:02 2018
>     New Revision: 337977
> 
>     URL: http://llvm.org/viewvc/llvm-project?rev=337977&view=rev
>     Log:
>     bpf: new option -bpf-expand-memcpy-in-order to expand memcpy in order
> 
>     Some BPF JIT backends want to optimize memcpy in their own
>     architecture-specific way.
> 
>     However, at the moment there is no reliable way for JIT backends to
>     see memcpy semantics. This is because the LLVM BPF backend expands
>     memcpy into load/store sequences, which may then be scheduled apart
>     from each other. So BPF JIT backends inside the kernel can't reliably
>     recognize memcpy semantics by peepholing the BPF sequence.
> 
>     This patch introduces new intrinsic expansion infrastructure for memcpy.
> 
>     To get a stable in-order load/store sequence from memcpy, we first lower
>     memcpy into a BPF::MEMCPY node, which is then expanded into in-order
>     load/store sequences in the expandPostRAPseudo pass, which runs after
>     instruction scheduling. This way, kernel JIT backends can reliably
>     recognize memcpy by scanning the BPF sequence.
> 
>     This new memcpy expand infrastructure is gated by a new option:
> 
>        -bpf-expand-memcpy-in-order
> 
>     Acked-by: Jakub Kicinski <jakub.kicinski at netronome.com>
>     Signed-off-by: Jiong Wang <jiong.wang at netronome.com>
>     Signed-off-by: Yonghong Song <yhs at fb.com>
> 
>     Added:
>          llvm/trunk/lib/Target/BPF/BPFSelectionDAGInfo.cpp
>          llvm/trunk/lib/Target/BPF/BPFSelectionDAGInfo.h
>          llvm/trunk/test/CodeGen/BPF/memcpy-expand-in-order.ll
>     Modified:
>          llvm/trunk/lib/Target/BPF/BPFISelLowering.cpp
>          llvm/trunk/lib/Target/BPF/BPFISelLowering.h
>          llvm/trunk/lib/Target/BPF/BPFInstrInfo.cpp
>          llvm/trunk/lib/Target/BPF/BPFInstrInfo.h
>          llvm/trunk/lib/Target/BPF/BPFInstrInfo.td
>          llvm/trunk/lib/Target/BPF/BPFSubtarget.h
>          llvm/trunk/lib/Target/BPF/CMakeLists.txt
> 
>     Modified: llvm/trunk/lib/Target/BPF/BPFISelLowering.cpp
>     URL:
>     http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/BPF/BPFISelLowering.cpp?rev=337977&r1=337976&r2=337977&view=diff
>     ==============================================================================
>     --- llvm/trunk/lib/Target/BPF/BPFISelLowering.cpp (original)
>     +++ llvm/trunk/lib/Target/BPF/BPFISelLowering.cpp Wed Jul 25 15:40:02 2018
>     @@ -33,6 +33,10 @@ using namespace llvm;
> 
>       #define DEBUG_TYPE "bpf-lower"
> 
>     +static cl::opt<bool> BPFExpandMemcpyInOrder("bpf-expand-memcpy-in-order",
>     +  cl::Hidden, cl::init(false),
>     +  cl::desc("Expand memcpy into load/store pairs in order"));
>     +
>       static void fail(const SDLoc &DL, SelectionDAG &DAG, const Twine &Msg) {
>         MachineFunction &MF = DAG.getMachineFunction();
>         DAG.getContext()->diagnose(
>     @@ -132,10 +136,30 @@ BPFTargetLowering::BPFTargetLowering(con
>         setMinFunctionAlignment(3);
>         setPrefFunctionAlignment(3);
> 
>     -  // inline memcpy() for kernel to see explicit copy
>     -  MaxStoresPerMemset = MaxStoresPerMemsetOptSize = 128;
>     -  MaxStoresPerMemcpy = MaxStoresPerMemcpyOptSize = 128;
>     -  MaxStoresPerMemmove = MaxStoresPerMemmoveOptSize = 128;
>     +  if (BPFExpandMemcpyInOrder) {
>     +    // LLVM generic code will try to expand memcpy into load/store pairs at this
>     +    // stage which is before quite a few IR optimization passes, therefore the
>     +    // loads and stores could potentially be moved apart from each other which
>     +    // will cause trouble to memcpy pattern matcher inside kernel eBPF JIT
>     +    // compilers.
>     +    //
>     +    // When -bpf-expand-memcpy-in-order specified, we want to defer the expand
>     +    // of memcpy to later stage in IR optimization pipeline so those load/store
>     +    // pairs won't be touched and could be kept in order. Hence, we set
>     +    // MaxStoresPerMem* to zero to disable the generic getMemcpyLoadsAndStores
>     +    // code path, and ask LLVM to use target expander EmitTargetCodeForMemcpy.
>     +    MaxStoresPerMemset = MaxStoresPerMemsetOptSize = 0;
>     +    MaxStoresPerMemcpy = MaxStoresPerMemcpyOptSize = 0;
>     +    MaxStoresPerMemmove = MaxStoresPerMemmoveOptSize = 0;
>     +  } else {
>     +    // inline memcpy() for kernel to see explicit copy
>     +    unsigned CommonMaxStores =
>     +      STI.getSelectionDAGInfo()->getCommonMaxStoresPerMemFunc();
>     +
>     +    MaxStoresPerMemset = MaxStoresPerMemsetOptSize = CommonMaxStores;
>     +    MaxStoresPerMemcpy = MaxStoresPerMemcpyOptSize = CommonMaxStores;
>     +    MaxStoresPerMemmove = MaxStoresPerMemmoveOptSize = CommonMaxStores;
>     +  }
> 
>         // CPU/Feature control
>         HasAlu32 = STI.getHasAlu32();
>     @@ -518,6 +542,8 @@ const char *BPFTargetLowering::getTarget
>           return "BPFISD::BR_CC";
>         case BPFISD::Wrapper:
>           return "BPFISD::Wrapper";
>     +  case BPFISD::MEMCPY:
>     +    return "BPFISD::MEMCPY";
>         }
>         return nullptr;
>       }
>     @@ -557,6 +583,37 @@ BPFTargetLowering::EmitSubregExt(Machine
>       }
> 
>       MachineBasicBlock *
>     +BPFTargetLowering::EmitInstrWithCustomInserterMemcpy(MachineInstr &MI,
>     +                                                     MachineBasicBlock *BB)
>     +                                                     const {
>     +  MachineFunction *MF = MI.getParent()->getParent();
>     +  MachineRegisterInfo &MRI = MF->getRegInfo();
>     +  MachineInstrBuilder MIB(*MF, MI);
>     +  unsigned ScratchReg;
>     +
>     +  // This function does custom insertion during lowering BPFISD::MEMCPY which
>     +  // only has two register operands from memcpy semantics, the copy source
>     +  // address and the copy destination address.
>     +  //
>     +  // Because we will expand BPFISD::MEMCPY into load/store pairs, we will need
>     +  // a third scratch register to serve as the destination register of load and
>     +  // source register of store.
>     +  //
>     +  // The scratch register here is with the Define | Dead | EarlyClobber flags.
>     +  // The EarlyClobber flag has the semantic property that the operand it is
>     +  // attached to is clobbered before the rest of the inputs are read. Hence it
>     +  // must be unique among the operands to the instruction. The Define flag is
>     +  // needed to coerce the machine verifier that an Undef value isn't a problem
>     +  // as we anyway is loading memory into it. The Dead flag is needed as the
>     +  // value in scratch isn't supposed to be used by any other instruction.
>     +  ScratchReg = MRI.createVirtualRegister(&BPF::GPRRegClass);
>     +  MIB.addReg(ScratchReg,
>     +             RegState::Define | RegState::Dead | RegState::EarlyClobber);
>     +
>     +  return BB;
>     +}
>     +
>     +MachineBasicBlock *
>       BPFTargetLowering::EmitInstrWithCustomInserter(MachineInstr &MI,
>                                                      MachineBasicBlock *BB) const {
>         const TargetInstrInfo &TII = *BB->getParent()->getSubtarget().getInstrInfo();
>     @@ -567,6 +624,8 @@ BPFTargetLowering::EmitInstrWithCustomIn
>                              Opc == BPF::Select_32 ||
>                              Opc == BPF::Select_32_64);
> 
>     +  bool isMemcpyOp = Opc == BPF::MEMCPY;
>     +
>       #ifndef NDEBUG
>         bool isSelectRIOp = (Opc == BPF::Select_Ri ||
>                              Opc == BPF::Select_Ri_64_32 ||
>     @@ -574,9 +633,13 @@ BPFTargetLowering::EmitInstrWithCustomIn
>                              Opc == BPF::Select_Ri_32_64);
> 
> 
>     -  assert((isSelectRROp || isSelectRIOp) && "Unexpected instr type to insert");
>     +  assert((isSelectRROp || isSelectRIOp || isMemcpyOp) &&
>     +         "Unexpected instr type to insert");
>       #endif
> 
>     +  if (isMemcpyOp)
>     +    return EmitInstrWithCustomInserterMemcpy(MI, BB);
>     +
>         bool is32BitCmp = (Opc == BPF::Select_32 ||
>                            Opc == BPF::Select_32_64 ||
>                            Opc == BPF::Select_Ri_32 ||
> 
>     Modified: llvm/trunk/lib/Target/BPF/BPFISelLowering.h
>     URL:
>     http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/BPF/BPFISelLowering.h?rev=337977&r1=337976&r2=337977&view=diff
>     ==============================================================================
>     --- llvm/trunk/lib/Target/BPF/BPFISelLowering.h (original)
>     +++ llvm/trunk/lib/Target/BPF/BPFISelLowering.h Wed Jul 25 15:40:02 2018
>     @@ -28,7 +28,8 @@ enum NodeType : unsigned {
>         CALL,
>         SELECT_CC,
>         BR_CC,
>     -  Wrapper
>     +  Wrapper,
>     +  MEMCPY
>       };
>       }
> 
>     @@ -110,6 +111,11 @@ private:
> 
>         unsigned EmitSubregExt(MachineInstr &MI, MachineBasicBlock *BB, unsigned Reg,
>                                bool isSigned) const;
>     +
>     +  MachineBasicBlock * EmitInstrWithCustomInserterMemcpy(MachineInstr &MI,
>     +                                                        MachineBasicBlock *BB)
>     +                                                        const;
>     +
>       };
>       }
> 
> 
>     Modified: llvm/trunk/lib/Target/BPF/BPFInstrInfo.cpp
>     URL:
>     http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/BPF/BPFInstrInfo.cpp?rev=337977&r1=337976&r2=337977&view=diff
>     ==============================================================================
>     --- llvm/trunk/lib/Target/BPF/BPFInstrInfo.cpp (original)
>     +++ llvm/trunk/lib/Target/BPF/BPFInstrInfo.cpp Wed Jul 25 15:40:02 2018
>     @@ -43,6 +43,83 @@ void BPFInstrInfo::copyPhysReg(MachineBa
>           llvm_unreachable("Impossible reg-to-reg copy");
>       }
> 
>     +void BPFInstrInfo::expandMEMCPY(MachineBasicBlock::iterator MI) const {
>     +  unsigned DstReg = MI->getOperand(0).getReg();
>     +  unsigned SrcReg = MI->getOperand(1).getReg();
>     +  uint64_t CopyLen = MI->getOperand(2).getImm();
>     +  uint64_t Alignment = MI->getOperand(3).getImm();
>     +  unsigned ScratchReg = MI->getOperand(4).getReg();
>     +  MachineBasicBlock *BB = MI->getParent();
>     +  DebugLoc dl = MI->getDebugLoc();
>     +  unsigned LdOpc, StOpc;
>     +
>     +  switch (Alignment) {
>     +  case 1:
>     +    LdOpc = BPF::LDB;
>     +    StOpc = BPF::STB;
>     +    break;
>     +  case 2:
>     +    LdOpc = BPF::LDH;
>     +    StOpc = BPF::STH;
>     +    break;
>     +  case 4:
>     +    LdOpc = BPF::LDW;
>     +    StOpc = BPF::STW;
>     +    break;
>     +  case 8:
>     +    LdOpc = BPF::LDD;
>     +    StOpc = BPF::STD;
>     +    break;
>     +  default:
>     +    llvm_unreachable("unsupported memcpy alignment");
>     +  }
>     +
>     +  unsigned IterationNum = CopyLen >> Log2_64(Alignment);
>     +  for(unsigned I = 0; I < IterationNum; ++I) {
>     +    BuildMI(*BB, MI, dl, get(LdOpc))
>     +            .addReg(ScratchReg).addReg(SrcReg).addImm(I * Alignment);
>     +    BuildMI(*BB, MI, dl, get(StOpc))
>     +            .addReg(ScratchReg).addReg(DstReg).addImm(I * Alignment);
>     +  }
>     +
>     +  unsigned BytesLeft = CopyLen & (Alignment - 1);
>     +  unsigned Offset = IterationNum * Alignment;
>     +  bool Hanging4Byte = BytesLeft & 0x4;
>     +  bool Hanging2Byte = BytesLeft & 0x2;
>     +  bool Hanging1Byte = BytesLeft & 0x1;
>     +  if (Hanging4Byte) {
>     +    BuildMI(*BB, MI, dl, get(BPF::LDW))
>     +            .addReg(ScratchReg).addReg(SrcReg).addImm(Offset);
>     +    BuildMI(*BB, MI, dl, get(BPF::STW))
>     +            .addReg(ScratchReg).addReg(DstReg).addImm(Offset);
>     +    Offset += 4;
>     +  }
>     +  if (Hanging2Byte) {
>     +    BuildMI(*BB, MI, dl, get(BPF::LDH))
>     +            .addReg(ScratchReg).addReg(SrcReg).addImm(Offset);
>     +    BuildMI(*BB, MI, dl, get(BPF::STH))
>     +            .addReg(ScratchReg).addReg(DstReg).addImm(Offset);
>     +    Offset += 2;
>     +  }
>     +  if (Hanging1Byte) {
>     +    BuildMI(*BB, MI, dl, get(BPF::LDB))
>     +            .addReg(ScratchReg).addReg(SrcReg).addImm(Offset);
>     +    BuildMI(*BB, MI, dl, get(BPF::STB))
>     +            .addReg(ScratchReg).addReg(DstReg).addImm(Offset);
>     +  }
>     +
>     +  BB->erase(MI);
>     +}
>     +
>     +bool BPFInstrInfo::expandPostRAPseudo(MachineInstr &MI) const {
>     +  if (MI.getOpcode() == BPF::MEMCPY) {
>     +    expandMEMCPY(MI);
>     +    return true;
>     +  }
>     +
>     +  return false;
>     +}
>     +
>       void BPFInstrInfo::storeRegToStackSlot(MachineBasicBlock &MBB,
>                                              MachineBasicBlock::iterator I,
>                                              unsigned SrcReg, bool IsKill, int FI,
> 
>     Modified: llvm/trunk/lib/Target/BPF/BPFInstrInfo.h
>     URL:
>     http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/BPF/BPFInstrInfo.h?rev=337977&r1=337976&r2=337977&view=diff
>     ==============================================================================
>     --- llvm/trunk/lib/Target/BPF/BPFInstrInfo.h (original)
>     +++ llvm/trunk/lib/Target/BPF/BPFInstrInfo.h Wed Jul 25 15:40:02 2018
>     @@ -34,6 +34,8 @@ public:
>                          const DebugLoc &DL, unsigned DestReg, unsigned SrcReg,
>                          bool KillSrc) const override;
> 
>     +  bool expandPostRAPseudo(MachineInstr &MI) const override;
>     +
>         void storeRegToStackSlot(MachineBasicBlock &MBB,
>                                  MachineBasicBlock::iterator MBBI, unsigned SrcReg,
>                                  bool isKill, int FrameIndex,
>     @@ -55,6 +57,9 @@ public:
>                               MachineBasicBlock *FBB, ArrayRef<MachineOperand> Cond,
>                               const DebugLoc &DL,
>                               int *BytesAdded = nullptr) const override;
>     +private:
>     +  void expandMEMCPY(MachineBasicBlock::iterator) const;
>     +
>       };
>       }
> 
> 
>     Modified: llvm/trunk/lib/Target/BPF/BPFInstrInfo.td
>     URL:
>     http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/BPF/BPFInstrInfo.td?rev=337977&r1=337976&r2=337977&view=diff
>     ==============================================================================
>     --- llvm/trunk/lib/Target/BPF/BPFInstrInfo.td (original)
>     +++ llvm/trunk/lib/Target/BPF/BPFInstrInfo.td Wed Jul 25 15:40:02 2018
>     @@ -28,6 +28,10 @@ def SDT_BPFBrCC         : SDTypeProfile<
>                                                      SDTCisVT<3, OtherVT>]>;
>       def SDT_BPFWrapper      : SDTypeProfile<1, 1, [SDTCisSameAs<0, 1>,
>                                                      SDTCisPtrTy<0>]>;
>     +def SDT_BPFMEMCPY       : SDTypeProfile<0, 4, [SDTCisVT<0, i64>,
>     +                                               SDTCisVT<1, i64>,
>     +                                               SDTCisVT<2, i64>,
>     +                                               SDTCisVT<3, i64>]>;
> 
>       def BPFcall         : SDNode<"BPFISD::CALL", SDT_BPFCall,
>                                    [SDNPHasChain, SDNPOptInGlue, SDNPOutGlue,
>     @@ -43,6 +47,9 @@ def BPFbrcc         : SDNode<"BPFISD::BR
> 
>       def BPFselectcc     : SDNode<"BPFISD::SELECT_CC", SDT_BPFSelectCC, [SDNPInGlue]>;
>       def BPFWrapper      : SDNode<"BPFISD::Wrapper", SDT_BPFWrapper>;
>     +def BPFmemcpy       : SDNode<"BPFISD::MEMCPY", SDT_BPFMEMCPY,
>     +                             [SDNPHasChain, SDNPInGlue, SDNPOutGlue,
>     +                              SDNPMayStore, SDNPMayLoad]>;
>       def BPFIsLittleEndian : Predicate<"CurDAG->getDataLayout().isLittleEndian()">;
>       def BPFIsBigEndian    : Predicate<"!CurDAG->getDataLayout().isLittleEndian()">;
>       def BPFHasALU32 : Predicate<"Subtarget->getHasAlu32()">;
>     @@ -714,3 +721,11 @@ let Predicates = [BPFHasALU32] in {
>         def : Pat<(i64 (extloadi32 ADDRri:$src)),
>                   (SUBREG_TO_REG (i64 0), (LDW32 ADDRri:$src), sub_32)>;
>       }
>     +
>     +let usesCustomInserter = 1, isCodeGenOnly = 1 in {
>     +    def MEMCPY : Pseudo<
>     +      (outs),
>     +      (ins GPR:$dst, GPR:$src, i64imm:$len, i64imm:$align, variable_ops),
>     +      "#memcpy dst: $dst, src: $src, len: $len, align: $align",
>     +      [(BPFmemcpy GPR:$dst, GPR:$src, imm:$len, imm:$align)]>;
>     +}
> 
>     Added: llvm/trunk/lib/Target/BPF/BPFSelectionDAGInfo.cpp
>     URL:
>     http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/BPF/BPFSelectionDAGInfo.cpp?rev=337977&view=auto
>     ==============================================================================
>     --- llvm/trunk/lib/Target/BPF/BPFSelectionDAGInfo.cpp (added)
>     +++ llvm/trunk/lib/Target/BPF/BPFSelectionDAGInfo.cpp Wed Jul 25 15:40:02 2018
>     @@ -0,0 +1,43 @@
>     +//===-- BPFSelectionDAGInfo.cpp - BPF SelectionDAG Info -------------------===//
>     +//
>     +//                     The LLVM Compiler Infrastructure
>     +//
>     +// This file is distributed under the University of Illinois Open Source
>     +// License. See LICENSE.TXT for details.
>     +//
>     +//===----------------------------------------------------------------------===//
>     +//
>     +// This file implements the BPFSelectionDAGInfo class.
>     +//
>     +//===----------------------------------------------------------------------===//
>     +
>     +#include "BPFTargetMachine.h"
>     +#include "llvm/CodeGen/SelectionDAG.h"
>     +#include "llvm/IR/DerivedTypes.h"
>     +using namespace llvm;
>     +
>     +#define DEBUG_TYPE "bpf-selectiondag-info"
>     +
>     +SDValue BPFSelectionDAGInfo::EmitTargetCodeForMemcpy(
>     +    SelectionDAG &DAG, const SDLoc &dl, SDValue Chain, SDValue Dst, SDValue Src,
>     +    SDValue Size, unsigned Align, bool isVolatile, bool AlwaysInline,
>     +    MachinePointerInfo DstPtrInfo, MachinePointerInfo SrcPtrInfo) const {
>     +  // Requires the copy size to be a constant.
>     +  ConstantSDNode *ConstantSize = dyn_cast<ConstantSDNode>(Size);
>     +  if (!ConstantSize)
>     +    return SDValue();
>     +
>     +  unsigned CopyLen = ConstantSize->getZExtValue();
>     +  unsigned StoresNumEstimate = alignTo(CopyLen, Align) >> Log2_32(Align);
>     +  // Impose the same copy length limit as MaxStoresPerMemcpy.
>     +  if (StoresNumEstimate > getCommonMaxStoresPerMemFunc())
>     +    return SDValue();
>     +
>     +  SDVTList VTs = DAG.getVTList(MVT::Other, MVT::Glue);
>     +
>     +  Dst = DAG.getNode(BPFISD::MEMCPY, dl, VTs, Chain, Dst, Src,
>     +                    DAG.getConstant(CopyLen, dl, MVT::i64),
>     +                    DAG.getConstant(Align, dl, MVT::i64));
>     +
>     +  return Dst.getValue(0);
>     +}
> 
>     Added: llvm/trunk/lib/Target/BPF/BPFSelectionDAGInfo.h
>     URL:
>     http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/BPF/BPFSelectionDAGInfo.h?rev=337977&view=auto
>     ==============================================================================
>     --- llvm/trunk/lib/Target/BPF/BPFSelectionDAGInfo.h (added)
>     +++ llvm/trunk/lib/Target/BPF/BPFSelectionDAGInfo.h Wed Jul 25 15:40:02 2018
>     @@ -0,0 +1,36 @@
>     +//===-- BPFSelectionDAGInfo.h - BPF SelectionDAG Info -----------*- C++ -*-===//
>     +//
>     +//                     The LLVM Compiler Infrastructure
>     +//
>     +// This file is distributed under the University of Illinois Open Source
>     +// License. See LICENSE.TXT for details.
>     +//
>     +//===----------------------------------------------------------------------===//
>     +//
>     +// This file defines the BPF subclass for SelectionDAGTargetInfo.
>     +//
>     +//===----------------------------------------------------------------------===//
>     +
>     +#ifndef LLVM_LIB_TARGET_BPF_BPFSELECTIONDAGINFO_H
>     +#define LLVM_LIB_TARGET_BPF_BPFSELECTIONDAGINFO_H
>     +
>     +#include "llvm/CodeGen/SelectionDAGTargetInfo.h"
>     +
>     +namespace llvm {
>     +
>     +class BPFSelectionDAGInfo : public SelectionDAGTargetInfo {
>     +public:
>     +  SDValue EmitTargetCodeForMemcpy(SelectionDAG &DAG, const SDLoc &dl,
>     +                                  SDValue Chain, SDValue Dst, SDValue Src,
>     +                                  SDValue Size, unsigned Align, bool isVolatile,
>     +                                  bool AlwaysInline,
>     +                                  MachinePointerInfo DstPtrInfo,
>     +                                  MachinePointerInfo SrcPtrInfo) const override;
>     +
>     +  unsigned getCommonMaxStoresPerMemFunc() const { return 128; }
>     +
>     +};
>     +
>     +}
>     +
>     +#endif
> 
>     Modified: llvm/trunk/lib/Target/BPF/BPFSubtarget.h
>     URL:
>     http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/BPF/BPFSubtarget.h?rev=337977&r1=337976&r2=337977&view=diff
>     ==============================================================================
>     --- llvm/trunk/lib/Target/BPF/BPFSubtarget.h (original)
>     +++ llvm/trunk/lib/Target/BPF/BPFSubtarget.h Wed Jul 25 15:40:02 2018
>     @@ -17,6 +17,7 @@
>       #include "BPFFrameLowering.h"
>       #include "BPFISelLowering.h"
>       #include "BPFInstrInfo.h"
>     +#include "BPFSelectionDAGInfo.h"
>       #include "llvm/CodeGen/SelectionDAGTargetInfo.h"
>       #include "llvm/CodeGen/TargetSubtargetInfo.h"
>       #include "llvm/IR/DataLayout.h"
>     @@ -33,7 +34,7 @@ class BPFSubtarget : public BPFGenSubtar
>         BPFInstrInfo InstrInfo;
>         BPFFrameLowering FrameLowering;
>         BPFTargetLowering TLInfo;
>     -  SelectionDAGTargetInfo TSInfo;
>     +  BPFSelectionDAGInfo TSInfo;
> 
>       private:
>         void initializeEnvironment();
>     @@ -75,7 +76,7 @@ public:
>         const BPFTargetLowering *getTargetLowering() const override {
>           return &TLInfo;
>         }
>     -  const SelectionDAGTargetInfo *getSelectionDAGInfo() const override {
>     +  const BPFSelectionDAGInfo *getSelectionDAGInfo() const override {
>           return &TSInfo;
>         }
>         const TargetRegisterInfo *getRegisterInfo() const override {
> 
>     Modified: llvm/trunk/lib/Target/BPF/CMakeLists.txt
>     URL:
>     http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/BPF/CMakeLists.txt?rev=337977&r1=337976&r2=337977&view=diff
>     ==============================================================================
>     --- llvm/trunk/lib/Target/BPF/CMakeLists.txt (original)
>     +++ llvm/trunk/lib/Target/BPF/CMakeLists.txt Wed Jul 25 15:40:02 2018
>     @@ -20,6 +20,7 @@ add_llvm_target(BPFCodeGen
>         BPFISelLowering.cpp
>         BPFMCInstLower.cpp
>         BPFRegisterInfo.cpp
>     +  BPFSelectionDAGInfo.cpp
>         BPFSubtarget.cpp
>         BPFTargetMachine.cpp
>         BPFMIPeephole.cpp
> 
>     Added: llvm/trunk/test/CodeGen/BPF/memcpy-expand-in-order.ll
>     URL:
>     http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/BPF/memcpy-expand-in-order.ll?rev=337977&view=auto
>     ==============================================================================
>     --- llvm/trunk/test/CodeGen/BPF/memcpy-expand-in-order.ll (added)
>     +++ llvm/trunk/test/CodeGen/BPF/memcpy-expand-in-order.ll Wed Jul 25 15:40:02 2018
>     @@ -0,0 +1,116 @@
>     +; RUN: llc < %s -march=bpfel -bpf-expand-memcpy-in-order | FileCheck %s
>     +; RUN: llc < %s -march=bpfeb -bpf-expand-memcpy-in-order | FileCheck %s
>     +;
>     +; #define COPY_LEN     9
>     +;
>     +; void cal_align1(void *a, void *b)
>     +; {
>     +;   __builtin_memcpy(a, b, COPY_LEN);
>     +; }
>     +;
>     +; void cal_align2(short *a, short *b)
>     +; {
>     +;   __builtin_memcpy(a, b, COPY_LEN);
>     +; }
>     +;
>     +; #undef COPY_LEN
>     +; #define COPY_LEN     19
>     +; void cal_align4(int *a, int *b)
>     +; {
>     +;   __builtin_memcpy(a, b, COPY_LEN);
>     +; }
>     +;
>     +; #undef COPY_LEN
>     +; #define COPY_LEN     27
>     +; void cal_align8(long long *a, long long *b)
>     +; {
>     +;   __builtin_memcpy(a, b, COPY_LEN);
>     +; }
>     +
>     +; Function Attrs: nounwind
>     +define dso_local void @cal_align1(i8* nocapture %a, i8* nocapture readonly %b) local_unnamed_addr #0 {
>     +entry:
>     +  tail call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 1 %a, i8* align 1 %b, i64 9, i1 false)
>     +  ret void
>     +}
>     +
>     +; Function Attrs: argmemonly nounwind
>     +declare void @llvm.memcpy.p0i8.p0i8.i64(i8* nocapture writeonly, i8* nocapture readonly, i64, i1) #1
>     +
>     +; CHECK: [[SCRATCH_REG:r[0-9]]] = *(u8 *)([[SRC_REG:r[0-9]]] + 0)
>     +; CHECK: *(u8 *)([[DST_REG:r[0-9]]] + 0) = [[SCRATCH_REG]]
>     +; CHECK: [[SCRATCH_REG]] = *(u8 *)([[SRC_REG]] + 1)
>     +; CHECK: *(u8 *)([[DST_REG]] + 1) = [[SCRATCH_REG]]
>     +; CHECK: [[SCRATCH_REG]] = *(u8 *)([[SRC_REG]] + 2)
>     +; CHECK: *(u8 *)([[DST_REG]] + 2) = [[SCRATCH_REG]]
>     +; CHECK: [[SCRATCH_REG]] = *(u8 *)([[SRC_REG]] + 3)
>     +; CHECK: *(u8 *)([[DST_REG]] + 3) = [[SCRATCH_REG]]
>     +; CHECK: [[SCRATCH_REG]] = *(u8 *)([[SRC_REG]] + 4)
>     +; CHECK: *(u8 *)([[DST_REG]] + 4) = [[SCRATCH_REG]]
>     +; CHECK: [[SCRATCH_REG]] = *(u8 *)([[SRC_REG]] + 5)
>     +; CHECK: *(u8 *)([[DST_REG]] + 5) = [[SCRATCH_REG]]
>     +; CHECK: [[SCRATCH_REG]] = *(u8 *)([[SRC_REG]] + 6)
>     +; CHECK: *(u8 *)([[DST_REG]] + 6) = [[SCRATCH_REG]]
>     +; CHECK: [[SCRATCH_REG]] = *(u8 *)([[SRC_REG]] + 7)
>     +; CHECK: *(u8 *)([[DST_REG]] + 7) = [[SCRATCH_REG]]
>     +; CHECK: [[SCRATCH_REG]] = *(u8 *)([[SRC_REG]] + 8)
>     +; CHECK: *(u8 *)([[DST_REG]] + 8) = [[SCRATCH_REG]]
>     +
>     +; Function Attrs: nounwind
>     +define dso_local void @cal_align2(i16* nocapture %a, i16* nocapture readonly %b) local_unnamed_addr #0 {
>     +entry:
>     +  %0 = bitcast i16* %a to i8*
>     +  %1 = bitcast i16* %b to i8*
>     +  tail call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 2 %0, i8* align 2 %1, i64 9, i1 false)
>     +  ret void
>     +}
>     +; CHECK: [[SCRATCH_REG:r[0-9]]] = *(u16 *)([[SRC_REG:r[0-9]]] + 0)
>     +; CHECK: *(u16 *)([[DST_REG:r[0-9]]] + 0) = [[SCRATCH_REG]]
>     +; CHECK: [[SCRATCH_REG]] = *(u16 *)([[SRC_REG]] + 2)
>     +; CHECK: *(u16 *)([[DST_REG]] + 2) = [[SCRATCH_REG]]
>     +; CHECK: [[SCRATCH_REG]] = *(u16 *)([[SRC_REG]] + 4)
>     +; CHECK: *(u16 *)([[DST_REG]] + 4) = [[SCRATCH_REG]]
>     +; CHECK: [[SCRATCH_REG]] = *(u16 *)([[SRC_REG]] + 6)
>     +; CHECK: *(u16 *)([[DST_REG]] + 6) = [[SCRATCH_REG]]
>     +; CHECK: [[SCRATCH_REG]] = *(u8 *)([[SRC_REG]] + 8)
>     +; CHECK: *(u8 *)([[DST_REG]] + 8) = [[SCRATCH_REG]]
>     +
>     +; Function Attrs: nounwind
>     +define dso_local void @cal_align4(i32* nocapture %a, i32* nocapture readonly %b) local_unnamed_addr #0 {
>     +entry:
>     +  %0 = bitcast i32* %a to i8*
>     +  %1 = bitcast i32* %b to i8*
>     +  tail call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 4 %0, i8* align 4 %1, i64 19, i1 false)
>     +  ret void
>     +}
>     +; CHECK: [[SCRATCH_REG:r[0-9]]] = *(u32 *)([[SRC_REG:r[0-9]]] + 0)
>     +; CHECK: *(u32 *)([[DST_REG:r[0-9]]] + 0) = [[SCRATCH_REG]]
>     +; CHECK: [[SCRATCH_REG]] = *(u32 *)([[SRC_REG]] + 4)
>     +; CHECK: *(u32 *)([[DST_REG]] + 4) = [[SCRATCH_REG]]
>     +; CHECK: [[SCRATCH_REG]] = *(u32 *)([[SRC_REG]] + 8)
>     +; CHECK: *(u32 *)([[DST_REG]] + 8) = [[SCRATCH_REG]]
>     +; CHECK: [[SCRATCH_REG]] = *(u32 *)([[SRC_REG]] + 12)
>     +; CHECK: *(u32 *)([[DST_REG]] + 12) = [[SCRATCH_REG]]
>     +; CHECK: [[SCRATCH_REG]] = *(u16 *)([[SRC_REG]] + 16)
>     +; CHECK: *(u16 *)([[DST_REG]] + 16) = [[SCRATCH_REG]]
>     +; CHECK: [[SCRATCH_REG]] = *(u8 *)([[SRC_REG]] + 18)
>     +; CHECK: *(u8 *)([[DST_REG]] + 18) = [[SCRATCH_REG]]
>     +
>     +; Function Attrs: nounwind
>     +define dso_local void @cal_align8(i64* nocapture %a, i64* nocapture readonly %b) local_unnamed_addr #0 {
>     +entry:
>     +  %0 = bitcast i64* %a to i8*
>     +  %1 = bitcast i64* %b to i8*
>     +  tail call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 8 %0, i8* align 8 %1, i64 27, i1 false)
>     +  ret void
>     +}
>     +; CHECK: [[SCRATCH_REG:r[0-9]]] = *(u64 *)([[SRC_REG:r[0-9]]] + 0)
>     +; CHECK: *(u64 *)([[DST_REG:r[0-9]]] + 0) = [[SCRATCH_REG]]
>     +; CHECK: [[SCRATCH_REG]] = *(u64 *)([[SRC_REG]] + 8)
>     +; CHECK: *(u64 *)([[DST_REG]] + 8) = [[SCRATCH_REG]]
>     +; CHECK: [[SCRATCH_REG]] = *(u64 *)([[SRC_REG]] + 16)
>     +; CHECK: *(u64 *)([[DST_REG]] + 16) = [[SCRATCH_REG]]
>     +; CHECK: [[SCRATCH_REG]] = *(u16 *)([[SRC_REG]] + 24)
>     +; CHECK: *(u16 *)([[DST_REG]] + 24) = [[SCRATCH_REG]]
>     +; CHECK: [[SCRATCH_REG]] = *(u8 *)([[SRC_REG]] + 26)
>     +; CHECK: *(u8 *)([[DST_REG]] + 26) = [[SCRATCH_REG]]
> 
> 
>     _______________________________________________
>     llvm-commits mailing list
>     llvm-commits at lists.llvm.org
>     http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits
> 
> 

