[llvm] r217371 - [AArch64] Improve AA to remove unneeded edges in the AA MI scheduling graph.
Hal Finkel
hfinkel at anl.gov
Tue Sep 30 15:36:38 PDT 2014
Andy,
Can you take a quick look at this? I think that what Sanjin says makes sense; do you agree? Is there a way to add an assert in case something goes wrong?
Sanjin, if nothing else, please add a comment in the code to this effect.
Thanks again,
Hal
----- Original Message -----
> From: "Sanjin Sijaric" <ssijaric at codeaurora.org>
> To: mcrosier at codeaurora.org
> Cc: "Hal Finkel" <hfinkel at anl.gov>, llvm-commits at cs.uiuc.edu, "Matt Arsenault" <arsenm2 at gmail.com>
> Sent: Monday, September 29, 2014 5:22:47 PM
> Subject: RE: [llvm] r217371 - [AArch64] Improve AA to remove unneeded edges in the AA MI scheduling graph.
>
> Hi Hal,
>
> The original purpose of the function was for discarding trivial chain
> edges when constructing the dependence graph for scheduling. If
> BaseReg is a physical register that gets modified, then we may have
> something like:
>
> LD R1 = A(BaseReg, Offset1) (1)
>
> BaseReg = .... (2)
>
> ST R2, B(BaseReg, Offset2) (3)
>
> Given the inputs (1) and (3) to areMemAccessesTriviallyDisjoint, it
> may be determined that there is no need for an edge between them.
> There will be no dependence between (1) and (3), but there will be
> dependences between "(1) and (2)" and "(2) and (3)". I think this
> should be enough to prevent any bad scheduling from happening.
>
> Thanks,
> Sanjin
>
> -----Original Message-----
> From: Chad Rosier [mailto:mcrosier at codeaurora.org]
> Sent: Monday, September 29, 2014 6:47 AM
> To: ssijaric at codeaurora.org
> Cc: Hal Finkel; Chad Rosier; llvm-commits at cs.uiuc.edu; Matt Arsenault
> Subject: Re: [llvm] r217371 - [AArch64] Improve AA to remove unneeded edges in the AA MI scheduling graph.
>
> +Sanjin
>
> Adding Sanjin since this is his work.
>
> Chad
>
> > ----- Original Message -----
> >> From: "Chad Rosier" <mcrosier at codeaurora.org>
> >> To: llvm-commits at cs.uiuc.edu
> >> Sent: Monday, September 8, 2014 9:43:48 AM
> >> Subject: [llvm] r217371 - [AArch64] Improve AA to remove unneeded edges in the AA MI scheduling graph.
> >>
> >> Author: mcrosier
> >> Date: Mon Sep 8 09:43:48 2014
> >> New Revision: 217371
> >>
> >> URL: http://llvm.org/viewvc/llvm-project?rev=217371&view=rev
> >> Log:
> >> [AArch64] Improve AA to remove unneeded edges in the AA MI
> >> scheduling
> >> graph.
> >>
> >> Patch by Sanjin Sijaric <ssijaric at codeaurora.org>!
> >> Phabricator Review: http://reviews.llvm.org/D5103
> >>
> >> Added:
> >> llvm/trunk/test/CodeGen/AArch64/arm64-triv-disjoint-mem-access.ll
> >> Modified:
> >> llvm/trunk/include/llvm/Target/TargetInstrInfo.h
> >> llvm/trunk/lib/CodeGen/ScheduleDAGInstrs.cpp
> >> llvm/trunk/lib/Target/AArch64/AArch64InstrInfo.cpp
> >> llvm/trunk/lib/Target/AArch64/AArch64InstrInfo.h
> >>
> >> Modified: llvm/trunk/include/llvm/Target/TargetInstrInfo.h
> >> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Target/TargetInstrInfo.h?rev=217371&r1=217370&r2=217371&view=diff
> >> ==============================================================================
> >> --- llvm/trunk/include/llvm/Target/TargetInstrInfo.h (original)
> >> +++ llvm/trunk/include/llvm/Target/TargetInstrInfo.h Mon Sep 8 09:43:48 2014
> >> @@ -1192,6 +1192,20 @@ public:
> >>      return nullptr;
> >>    }
> >>
> >> +  // areMemAccessesTriviallyDisjoint - Sometimes, it is possible for the target
> >> +  // to tell, even without aliasing information, that two MIs access different
> >> +  // memory addresses. This function returns true if two MIs access different
> >> +  // memory addresses, and false otherwise.
> >> +  virtual bool
> >> +  areMemAccessesTriviallyDisjoint(MachineInstr *MIa, MachineInstr *MIb,
> >> +                                  AliasAnalysis *AA = nullptr) const {
> >> +    assert(MIa && (MIa->mayLoad() || MIa->mayStore()) &&
> >> +           "MIa must load from or modify a memory location");
> >> +    assert(MIb && (MIb->mayLoad() || MIb->mayStore()) &&
> >> +           "MIb must load from or modify a memory location");
> >> +    return false;
> >> +  }
> >> +
> >>  private:
> >>    int CallFrameSetupOpcode, CallFrameDestroyOpcode;
> >> };
> >>
> >> Modified: llvm/trunk/lib/CodeGen/ScheduleDAGInstrs.cpp
> >> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/ScheduleDAGInstrs.cpp?rev=217371&r1=217370&r2=217371&view=diff
> >> ==============================================================================
> >> --- llvm/trunk/lib/CodeGen/ScheduleDAGInstrs.cpp (original)
> >> +++ llvm/trunk/lib/CodeGen/ScheduleDAGInstrs.cpp Mon Sep 8 09:43:48 2014
> >> @@ -511,9 +511,18 @@ static inline bool isUnsafeMemoryObject(
> >>  static bool MIsNeedChainEdge(AliasAnalysis *AA, const MachineFrameInfo *MFI,
> >>                               MachineInstr *MIa, MachineInstr *MIb) {
> >> +  const MachineFunction *MF = MIa->getParent()->getParent();
> >> +  const TargetInstrInfo *TII = MF->getSubtarget().getInstrInfo();
> >> +
> >>    // Cover a trivial case - no edge is need to itself.
> >>    if (MIa == MIb)
> >>      return false;
> >> +
> >> +  // Let the target decide if memory accesses cannot possibly overlap.
> >> +  if ((MIa->mayLoad() || MIa->mayStore()) &&
> >> +      (MIb->mayLoad() || MIb->mayStore()))
> >> +    if (TII->areMemAccessesTriviallyDisjoint(MIa, MIb, AA))
> >> +      return false;
> >>
> >>    // FIXME: Need to handle multiple memory operands to support all targets.
> >>    if (!MIa->hasOneMemOperand() || !MIb->hasOneMemOperand())
> >>
> >> Modified: llvm/trunk/lib/Target/AArch64/AArch64InstrInfo.cpp
> >> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AArch64/AArch64InstrInfo.cpp?rev=217371&r1=217370&r2=217371&view=diff
> >> ==============================================================================
> >> --- llvm/trunk/lib/Target/AArch64/AArch64InstrInfo.cpp (original)
> >> +++ llvm/trunk/lib/Target/AArch64/AArch64InstrInfo.cpp Mon Sep 8 09:43:48 2014
> >> @@ -607,6 +607,42 @@ bool AArch64InstrInfo::isCoalescableExtI
> >> }
> >> }
> >>
> >> +bool
> >> +AArch64InstrInfo::areMemAccessesTriviallyDisjoint(MachineInstr *MIa,
> >> +                                                  MachineInstr *MIb,
> >> +                                                  AliasAnalysis *AA) const {
> >> +  const TargetRegisterInfo *TRI = &getRegisterInfo();
> >> +  unsigned BaseRegA = 0, BaseRegB = 0;
> >> +  int OffsetA = 0, OffsetB = 0;
> >> +  int WidthA = 0, WidthB = 0;
> >> +
> >> +  assert(MIa && (MIa->mayLoad() || MIa->mayStore()) &&
> >> +         "MIa must be a store or a load");
> >> +  assert(MIb && (MIb->mayLoad() || MIb->mayStore()) &&
> >> +         "MIb must be a store or a load");
> >> +
> >> +  if (MIa->hasUnmodeledSideEffects() || MIb->hasUnmodeledSideEffects() ||
> >> +      MIa->hasOrderedMemoryRef() || MIb->hasOrderedMemoryRef())
> >> +    return false;
> >> +
> >> +  // Retrieve the base register, offset from the base register and width. Width
> >> +  // is the size of memory that is being loaded/stored (e.g. 1, 2, 4, 8). If
> >> +  // base registers are identical, and the offset of a lower memory access +
> >> +  // the width doesn't overlap the offset of a higher memory access,
> >> +  // then the memory accesses are different.
> >> +  if (getLdStBaseRegImmOfsWidth(MIa, BaseRegA, OffsetA, WidthA, TRI) &&
> >> +      getLdStBaseRegImmOfsWidth(MIb, BaseRegB, OffsetB, WidthB, TRI)) {
> >> +    if (BaseRegA == BaseRegB) {
> >
> > I think that this makes sense only for SSA virtual registers, but
> > not
> > for physical ones (because the value in a physical register might
> > have
> > changed in between the two instructions). If I'm right, then you'll
> > want to exclude physical registers here (at least any that might be
> > modified during the function's execution).
> >
> > -Hal
> >
> >> +      int LowOffset = OffsetA < OffsetB ? OffsetA : OffsetB;
> >> +      int HighOffset = OffsetA < OffsetB ? OffsetB : OffsetA;
> >> +      int LowWidth = (LowOffset == OffsetA) ? WidthA : WidthB;
> >> +      if (LowOffset + LowWidth <= HighOffset)
> >> +        return true;
> >> +    }
> >> +  }
> >> +  return false;
> >> +}
> >> +
> >> /// analyzeCompare - For a comparison instruction, return the source registers
> >> /// in SrcReg and SrcReg2, and the value it compares against in CmpValue.
> >> /// Return true if the comparison instruction can be analyzed.
> >> @@ -1270,6 +1306,102 @@ AArch64InstrInfo::getLdStBaseRegImmOfs(M
> >> };
> >> }
> >>
> >> +bool AArch64InstrInfo::getLdStBaseRegImmOfsWidth(
> >> +    MachineInstr *LdSt, unsigned &BaseReg, int &Offset, int &Width,
> >> +    const TargetRegisterInfo *TRI) const {
> >> +  // Handle only loads/stores with base register followed by immediate offset.
> >> +  if (LdSt->getNumOperands() != 3)
> >> +    return false;
> >> +  if (!LdSt->getOperand(1).isReg() || !LdSt->getOperand(2).isImm())
> >> +    return false;
> >> +
> >> +  // Offset is calculated as the immediate operand multiplied by the scaling
> >> +  // factor. Unscaled instructions have scaling factor set to 1.
> >> +  int Scale = 0;
> >> + switch (LdSt->getOpcode()) {
> >> + default:
> >> + return false;
> >> + case AArch64::LDURQi:
> >> + case AArch64::STURQi:
> >> + Width = 16;
> >> + Scale = 1;
> >> + break;
> >> + case AArch64::LDURXi:
> >> + case AArch64::LDURDi:
> >> + case AArch64::STURXi:
> >> + case AArch64::STURDi:
> >> + Width = 8;
> >> + Scale = 1;
> >> + break;
> >> + case AArch64::LDURWi:
> >> + case AArch64::LDURSi:
> >> + case AArch64::LDURSWi:
> >> + case AArch64::STURWi:
> >> + case AArch64::STURSi:
> >> + Width = 4;
> >> + Scale = 1;
> >> + break;
> >> + case AArch64::LDURHi:
> >> + case AArch64::LDURHHi:
> >> + case AArch64::LDURSHXi:
> >> + case AArch64::LDURSHWi:
> >> + case AArch64::STURHi:
> >> + case AArch64::STURHHi:
> >> + Width = 2;
> >> + Scale = 1;
> >> + break;
> >> + case AArch64::LDURBi:
> >> + case AArch64::LDURBBi:
> >> + case AArch64::LDURSBXi:
> >> + case AArch64::LDURSBWi:
> >> + case AArch64::STURBi:
> >> + case AArch64::STURBBi:
> >> + Width = 1;
> >> + Scale = 1;
> >> + break;
> >> + case AArch64::LDRXui:
> >> + case AArch64::STRXui:
> >> + Scale = Width = 8;
> >> + break;
> >> + case AArch64::LDRWui:
> >> + case AArch64::STRWui:
> >> + Scale = Width = 4;
> >> + break;
> >> + case AArch64::LDRBui:
> >> + case AArch64::STRBui:
> >> + Scale = Width = 1;
> >> + break;
> >> + case AArch64::LDRHui:
> >> + case AArch64::STRHui:
> >> + Scale = Width = 2;
> >> + break;
> >> + case AArch64::LDRSui:
> >> + case AArch64::STRSui:
> >> + Scale = Width = 4;
> >> + break;
> >> + case AArch64::LDRDui:
> >> + case AArch64::STRDui:
> >> + Scale = Width = 8;
> >> + break;
> >> + case AArch64::LDRQui:
> >> + case AArch64::STRQui:
> >> + Scale = Width = 16;
> >> + break;
> >> + case AArch64::LDRBBui:
> >> + case AArch64::STRBBui:
> >> + Scale = Width = 1;
> >> + break;
> >> + case AArch64::LDRHHui:
> >> + case AArch64::STRHHui:
> >> + Scale = Width = 2;
> >> + break;
> >> + };
> >> +
> >> + BaseReg = LdSt->getOperand(1).getReg();
> >> + Offset = LdSt->getOperand(2).getImm() * Scale;
> >> + return true;
> >> +}
> >> +
> >> /// Detect opportunities for ldp/stp formation.
> >> ///
> >> /// Only called for LdSt for which getLdStBaseRegImmOfs returns true.
> >>
> >> Modified: llvm/trunk/lib/Target/AArch64/AArch64InstrInfo.h
> >> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AArch64/AArch64InstrInfo.h?rev=217371&r1=217370&r2=217371&view=diff
> >> ==============================================================================
> >> --- llvm/trunk/lib/Target/AArch64/AArch64InstrInfo.h (original)
> >> +++ llvm/trunk/lib/Target/AArch64/AArch64InstrInfo.h Mon Sep 8 09:43:48 2014
> >> @@ -52,6 +52,10 @@ public:
> >>    bool isCoalescableExtInstr(const MachineInstr &MI, unsigned &SrcReg,
> >>                               unsigned &DstReg, unsigned &SubIdx) const override;
> >>
> >> +  bool
> >> +  areMemAccessesTriviallyDisjoint(MachineInstr *MIa, MachineInstr *MIb,
> >> +                                  AliasAnalysis *AA = nullptr) const override;
> >> +
> >>    unsigned isLoadFromStackSlot(const MachineInstr *MI,
> >>                                 int &FrameIndex) const override;
> >>    unsigned isStoreToStackSlot(const MachineInstr *MI,
> >> @@ -90,6 +94,10 @@ public:
> >>                                unsigned &Offset,
> >>                                const TargetRegisterInfo *TRI) const override;
> >>
> >> +  bool getLdStBaseRegImmOfsWidth(MachineInstr *LdSt, unsigned &BaseReg,
> >> +                                 int &Offset, int &Width,
> >> +                                 const TargetRegisterInfo *TRI) const;
> >> +
> >>    bool enableClusterLoads() const override { return true; }
> >>
> >>    bool shouldClusterLoads(MachineInstr *FirstLdSt, MachineInstr *SecondLdSt,
> >>
> >> Added: llvm/trunk/test/CodeGen/AArch64/arm64-triv-disjoint-mem-access.ll
> >> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AArch64/arm64-triv-disjoint-mem-access.ll?rev=217371&view=auto
> >> ==============================================================================
> >> --- llvm/trunk/test/CodeGen/AArch64/arm64-triv-disjoint-mem-access.ll (added)
> >> +++ llvm/trunk/test/CodeGen/AArch64/arm64-triv-disjoint-mem-access.ll Mon Sep 8 09:43:48 2014
> >> @@ -0,0 +1,31 @@
> >> +; RUN: llc < %s -mtriple=arm64-linux-gnu -mcpu=cortex-a53 -enable-aa-sched-mi | FileCheck %s
> >> +; Check that the scheduler moves the load from a[1] past the store into a[2].
> >> +@a = common global i32* null, align 8
> >> +@m = common global i32 0, align 4
> >> +
> >> +; Function Attrs: nounwind
> >> +define i32 @func(i32 %i, i32 %j, i32 %k) #0 {
> >> +entry:
> >> +; CHECK: ldr {{w[0-9]+}}, [x[[REG:[0-9]+]], #4]
> >> +; CHECK: str {{w[0-9]+}}, [x[[REG]], #8]
> >> + %0 = load i32** @a, align 8, !tbaa !1
> >> + %arrayidx = getelementptr inbounds i32* %0, i64 2
> >> + store i32 %i, i32* %arrayidx, align 4, !tbaa !5
> >> + %arrayidx1 = getelementptr inbounds i32* %0, i64 1
> >> + %1 = load i32* %arrayidx1, align 4, !tbaa !5
> >> + %add = add nsw i32 %k, %i
> >> + store i32 %add, i32* @m, align 4, !tbaa !5
> >> + ret i32 %1
> >> +}
> >> +
> >> +attributes #0 = { nounwind "less-precise-fpmad"="false" "no-frame-pointer-elim"="true" "no-frame-pointer-elim-non-leaf" "no-infs-fp-math"="true" "no-nans-fp-math"="true" "stack-protector-buffer-size"="8" "unsafe-fp-math"="true" "use-soft-float"="false" }
> >> +
> >> +!llvm.ident = !{!0}
> >> +
> >> +!0 = metadata !{metadata !"clang version 3.6.0 "}
> >> +!1 = metadata !{metadata !2, metadata !2, i64 0}
> >> +!2 = metadata !{metadata !"any pointer", metadata !3, i64 0}
> >> +!3 = metadata !{metadata !"omnipotent char", metadata !4, i64 0}
> >> +!4 = metadata !{metadata !"Simple C/C++ TBAA"}
> >> +!5 = metadata !{metadata !6, metadata !6, i64 0}
> >> +!6 = metadata !{metadata !"int", metadata !3, i64 0}
> >>
> >>
> >> _______________________________________________
> >> llvm-commits mailing list
> >> llvm-commits at cs.uiuc.edu
> >> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
> >>
> >
> > --
> > Hal Finkel
> > Assistant Computational Scientist
> > Leadership Computing Facility
> > Argonne National Laboratory
> >
>
>
>
>
--
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory