[llvm] r217371 - [AArch64] Improve AA to remove unneeded edges in the AA MI scheduling graph.
Hal Finkel
hfinkel at anl.gov
Tue Sep 30 15:36:38 PDT 2014
Andy,
Can you take a quick look at this? I think that what Sanjin says makes sense; do you agree? Is there a way to add an assert in case something goes wrong?
Sanjin, if nothing else, please add a comment in the code to this effect.
Thanks again,
Hal
----- Original Message -----
> From: "Sanjin Sijaric" <ssijaric at codeaurora.org>
> To: mcrosier at codeaurora.org
> Cc: "Hal Finkel" <hfinkel at anl.gov>, llvm-commits at cs.uiuc.edu, "Matt Arsenault" <arsenm2 at gmail.com>
> Sent: Monday, September 29, 2014 5:22:47 PM
> Subject: RE: [llvm] r217371 - [AArch64] Improve AA to remove unneeded edges in the AA MI scheduling graph.
>
> Hi Hal,
>
> The original purpose of the function was for discarding trivial chain
> edges when constructing the dependence graph for scheduling. If
> BaseReg is a physical register that gets modified, then we may have
> something like:
>
> LD R1 = A(BaseReg, Offset1) (1)
>
> BaseReg = .... (2)
>
> ST R2, B(BaseReg, Offset2) (3)
>
> Given the inputs (1) and (3) to areMemAccessesTriviallyDisjoint, it
> may be determined that there is no need for an edge between them.
> There will be no dependence between (1) and (3), but there will be
> dependences between "(1) and (2)" and "(2) and (3)". I think this
> should be enough to prevent any bad scheduling from happening.
>
> Thanks,
> Sanjin
>
> -----Original Message-----
> From: Chad Rosier [mailto:mcrosier at codeaurora.org]
> Sent: Monday, September 29, 2014 6:47 AM
> To: ssijaric at codeaurora.org
> Cc: Hal Finkel; Chad Rosier; llvm-commits at cs.uiuc.edu; Matt Arsenault
> Subject: Re: [llvm] r217371 - [AArch64] Improve AA to remove unneeded edges in the AA MI scheduling graph.
>
> +Sanjin
>
> Adding Sanjin since this is his work.
>
> Chad
>
> > ----- Original Message -----
> >> From: "Chad Rosier" <mcrosier at codeaurora.org>
> >> To: llvm-commits at cs.uiuc.edu
> >> Sent: Monday, September 8, 2014 9:43:48 AM
> >> Subject: [llvm] r217371 - [AArch64] Improve AA to remove unneeded edges in the AA MI scheduling graph.
> >>
> >> Author: mcrosier
> >> Date: Mon Sep 8 09:43:48 2014
> >> New Revision: 217371
> >>
> >> URL: http://llvm.org/viewvc/llvm-project?rev=217371&view=rev
> >> Log:
> >> [AArch64] Improve AA to remove unneeded edges in the AA MI
> >> scheduling
> >> graph.
> >>
> >> Patch by Sanjin Sijaric <ssijaric at codeaurora.org>!
> >> Phabricator Review: http://reviews.llvm.org/D5103
> >>
> >> Added:
> >> llvm/trunk/test/CodeGen/AArch64/arm64-triv-disjoint-mem-access.ll
> >> Modified:
> >> llvm/trunk/include/llvm/Target/TargetInstrInfo.h
> >> llvm/trunk/lib/CodeGen/ScheduleDAGInstrs.cpp
> >> llvm/trunk/lib/Target/AArch64/AArch64InstrInfo.cpp
> >> llvm/trunk/lib/Target/AArch64/AArch64InstrInfo.h
> >>
> >> Modified: llvm/trunk/include/llvm/Target/TargetInstrInfo.h
> >> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Target/TargetInstrInfo.h?rev=217371&r1=217370&r2=217371&view=diff
> >> ==============================================================================
> >> --- llvm/trunk/include/llvm/Target/TargetInstrInfo.h (original)
> >> +++ llvm/trunk/include/llvm/Target/TargetInstrInfo.h Mon Sep 8 09:43:48 2014
> >> @@ -1192,6 +1192,20 @@ public:
> >>      return nullptr;
> >>    }
> >>
> >> +  // areMemAccessesTriviallyDisjoint - Sometimes, it is possible for the target
> >> +  // to tell, even without aliasing information, that two MIs access different
> >> +  // memory addresses. This function returns true if two MIs access different
> >> +  // memory addresses, and false otherwise.
> >> +  virtual bool
> >> +  areMemAccessesTriviallyDisjoint(MachineInstr *MIa, MachineInstr *MIb,
> >> +                                  AliasAnalysis *AA = nullptr) const {
> >> +    assert(MIa && (MIa->mayLoad() || MIa->mayStore()) &&
> >> +           "MIa must load from or modify a memory location");
> >> +    assert(MIb && (MIb->mayLoad() || MIb->mayStore()) &&
> >> +           "MIb must load from or modify a memory location");
> >> +    return false;
> >> +  }
> >> +
> >>  private:
> >>    int CallFrameSetupOpcode, CallFrameDestroyOpcode;
> >> };
> >>
> >> Modified: llvm/trunk/lib/CodeGen/ScheduleDAGInstrs.cpp
> >> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/ScheduleDAGInstrs.cpp?rev=217371&r1=217370&r2=217371&view=diff
> >> ==============================================================================
> >> --- llvm/trunk/lib/CodeGen/ScheduleDAGInstrs.cpp (original)
> >> +++ llvm/trunk/lib/CodeGen/ScheduleDAGInstrs.cpp Mon Sep 8 09:43:48 2014
> >> @@ -511,9 +511,18 @@ static inline bool isUnsafeMemoryObject(
> >>  static bool MIsNeedChainEdge(AliasAnalysis *AA, const MachineFrameInfo *MFI,
> >>                               MachineInstr *MIa, MachineInstr *MIb) {
> >> +  const MachineFunction *MF = MIa->getParent()->getParent();
> >> +  const TargetInstrInfo *TII = MF->getSubtarget().getInstrInfo();
> >> +
> >>    // Cover a trivial case - no edge is need to itself.
> >>    if (MIa == MIb)
> >>      return false;
> >> +
> >> +  // Let the target decide if memory accesses cannot possibly overlap.
> >> +  if ((MIa->mayLoad() || MIa->mayStore()) &&
> >> +      (MIb->mayLoad() || MIb->mayStore()))
> >> +    if (TII->areMemAccessesTriviallyDisjoint(MIa, MIb, AA))
> >> +      return false;
> >>
> >>    // FIXME: Need to handle multiple memory operands to support all targets.
> >>    if (!MIa->hasOneMemOperand() || !MIb->hasOneMemOperand())
> >>
> >> Modified: llvm/trunk/lib/Target/AArch64/AArch64InstrInfo.cpp
> >> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AArch64/AArch64InstrInfo.cpp?rev=217371&r1=217370&r2=217371&view=diff
> >> ==============================================================================
> >> --- llvm/trunk/lib/Target/AArch64/AArch64InstrInfo.cpp (original)
> >> +++ llvm/trunk/lib/Target/AArch64/AArch64InstrInfo.cpp Mon Sep 8 09:43:48 2014
> >> @@ -607,6 +607,42 @@ bool AArch64InstrInfo::isCoalescableExtI
> >> }
> >> }
> >>
> >> +bool
> >> +AArch64InstrInfo::areMemAccessesTriviallyDisjoint(MachineInstr *MIa,
> >> +                                                  MachineInstr *MIb,
> >> +                                                  AliasAnalysis *AA) const {
> >> +  const TargetRegisterInfo *TRI = &getRegisterInfo();
> >> +  unsigned BaseRegA = 0, BaseRegB = 0;
> >> +  int OffsetA = 0, OffsetB = 0;
> >> +  int WidthA = 0, WidthB = 0;
> >> +
> >> +  assert(MIa && (MIa->mayLoad() || MIa->mayStore()) &&
> >> +         "MIa must be a store or a load");
> >> +  assert(MIb && (MIb->mayLoad() || MIb->mayStore()) &&
> >> +         "MIb must be a store or a load");
> >> +
> >> +  if (MIa->hasUnmodeledSideEffects() || MIb->hasUnmodeledSideEffects() ||
> >> +      MIa->hasOrderedMemoryRef() || MIb->hasOrderedMemoryRef())
> >> +    return false;
> >> +
> >> +  // Retrieve the base register, offset from the base register and width. Width
> >> +  // is the size of memory that is being loaded/stored (e.g. 1, 2, 4, 8). If
> >> +  // base registers are identical, and the offset of a lower memory access +
> >> +  // the width doesn't overlap the offset of a higher memory access,
> >> +  // then the memory accesses are different.
> >> +  if (getLdStBaseRegImmOfsWidth(MIa, BaseRegA, OffsetA, WidthA, TRI) &&
> >> +      getLdStBaseRegImmOfsWidth(MIb, BaseRegB, OffsetB, WidthB, TRI)) {
> >> +    if (BaseRegA == BaseRegB) {
> >
> > I think that this makes sense only for SSA virtual registers, but
> > not
> > for physical ones (because the value in a physical register might
> > have
> > changed in between the two instructions). If I'm right, then you'll
> > want to exclude physical registers here (at least any that might be
> > modified during the function's execution).
> >
> > -Hal
> >
> >> +      int LowOffset = OffsetA < OffsetB ? OffsetA : OffsetB;
> >> +      int HighOffset = OffsetA < OffsetB ? OffsetB : OffsetA;
> >> +      int LowWidth = (LowOffset == OffsetA) ? WidthA : WidthB;
> >> +      if (LowOffset + LowWidth <= HighOffset)
> >> +        return true;
> >> +    }
> >> +  }
> >> +  return false;
> >> +}
> >> +
> >> /// analyzeCompare - For a comparison instruction, return the source registers
> >> /// in SrcReg and SrcReg2, and the value it compares against in CmpValue.
> >> /// Return true if the comparison instruction can be analyzed.
> >> @@ -1270,6 +1306,102 @@ AArch64InstrInfo::getLdStBaseRegImmOfs(M
> >> };
> >> }
> >>
> >> +bool AArch64InstrInfo::getLdStBaseRegImmOfsWidth(
> >> +    MachineInstr *LdSt, unsigned &BaseReg, int &Offset, int &Width,
> >> +    const TargetRegisterInfo *TRI) const {
> >> +  // Handle only loads/stores with base register followed by immediate offset.
> >> +  if (LdSt->getNumOperands() != 3)
> >> +    return false;
> >> +  if (!LdSt->getOperand(1).isReg() || !LdSt->getOperand(2).isImm())
> >> +    return false;
> >> +
> >> +  // Offset is calculated as the immediate operand multiplied by the scaling
> >> +  // factor. Unscaled instructions have scaling factor set to 1.
> >> +  int Scale = 0;
> >> + switch (LdSt->getOpcode()) {
> >> + default:
> >> + return false;
> >> + case AArch64::LDURQi:
> >> + case AArch64::STURQi:
> >> + Width = 16;
> >> + Scale = 1;
> >> + break;
> >> + case AArch64::LDURXi:
> >> + case AArch64::LDURDi:
> >> + case AArch64::STURXi:
> >> + case AArch64::STURDi:
> >> + Width = 8;
> >> + Scale = 1;
> >> + break;
> >> + case AArch64::LDURWi:
> >> + case AArch64::LDURSi:
> >> + case AArch64::LDURSWi:
> >> + case AArch64::STURWi:
> >> + case AArch64::STURSi:
> >> + Width = 4;
> >> + Scale = 1;
> >> + break;
> >> + case AArch64::LDURHi:
> >> + case AArch64::LDURHHi:
> >> + case AArch64::LDURSHXi:
> >> + case AArch64::LDURSHWi:
> >> + case AArch64::STURHi:
> >> + case AArch64::STURHHi:
> >> + Width = 2;
> >> + Scale = 1;
> >> + break;
> >> + case AArch64::LDURBi:
> >> + case AArch64::LDURBBi:
> >> + case AArch64::LDURSBXi:
> >> + case AArch64::LDURSBWi:
> >> + case AArch64::STURBi:
> >> + case AArch64::STURBBi:
> >> + Width = 1;
> >> + Scale = 1;
> >> + break;
> >> + case AArch64::LDRXui:
> >> + case AArch64::STRXui:
> >> + Scale = Width = 8;
> >> + break;
> >> + case AArch64::LDRWui:
> >> + case AArch64::STRWui:
> >> + Scale = Width = 4;
> >> + break;
> >> + case AArch64::LDRBui:
> >> + case AArch64::STRBui:
> >> + Scale = Width = 1;
> >> + break;
> >> + case AArch64::LDRHui:
> >> + case AArch64::STRHui:
> >> + Scale = Width = 2;
> >> + break;
> >> + case AArch64::LDRSui:
> >> + case AArch64::STRSui:
> >> + Scale = Width = 4;
> >> + break;
> >> + case AArch64::LDRDui:
> >> + case AArch64::STRDui:
> >> + Scale = Width = 8;
> >> + break;
> >> + case AArch64::LDRQui:
> >> + case AArch64::STRQui:
> >> + Scale = Width = 16;
> >> + break;
> >> + case AArch64::LDRBBui:
> >> + case AArch64::STRBBui:
> >> + Scale = Width = 1;
> >> + break;
> >> + case AArch64::LDRHHui:
> >> + case AArch64::STRHHui:
> >> + Scale = Width = 2;
> >> + break;
> >> + };
> >> +
> >> + BaseReg = LdSt->getOperand(1).getReg();
> >> + Offset = LdSt->getOperand(2).getImm() * Scale;
> >> + return true;
> >> +}
> >> +
> >> /// Detect opportunities for ldp/stp formation.
> >> ///
> >> /// Only called for LdSt for which getLdStBaseRegImmOfs returns true.
> >>
> >> Modified: llvm/trunk/lib/Target/AArch64/AArch64InstrInfo.h
> >> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AArch64/AArch64InstrInfo.h?rev=217371&r1=217370&r2=217371&view=diff
> >> ==============================================================================
> >> --- llvm/trunk/lib/Target/AArch64/AArch64InstrInfo.h (original)
> >> +++ llvm/trunk/lib/Target/AArch64/AArch64InstrInfo.h Mon Sep 8 09:43:48 2014
> >> @@ -52,6 +52,10 @@ public:
> >>    bool isCoalescableExtInstr(const MachineInstr &MI, unsigned &SrcReg,
> >>                               unsigned &DstReg, unsigned &SubIdx) const override;
> >>
> >> +  bool
> >> +  areMemAccessesTriviallyDisjoint(MachineInstr *MIa, MachineInstr *MIb,
> >> +                                  AliasAnalysis *AA = nullptr) const override;
> >> +
> >>    unsigned isLoadFromStackSlot(const MachineInstr *MI,
> >>                                 int &FrameIndex) const override;
> >>    unsigned isStoreToStackSlot(const MachineInstr *MI,
> >> @@ -90,6 +94,10 @@ public:
> >>                                unsigned &Offset,
> >>                                const TargetRegisterInfo *TRI) const override;
> >>
> >> +  bool getLdStBaseRegImmOfsWidth(MachineInstr *LdSt, unsigned &BaseReg,
> >> +                                 int &Offset, int &Width,
> >> +                                 const TargetRegisterInfo *TRI) const;
> >> +
> >>    bool enableClusterLoads() const override { return true; }
> >>
> >>    bool shouldClusterLoads(MachineInstr *FirstLdSt, MachineInstr *SecondLdSt,
> >>
> >> Added: llvm/trunk/test/CodeGen/AArch64/arm64-triv-disjoint-mem-access.ll
> >> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AArch64/arm64-triv-disjoint-mem-access.ll?rev=217371&view=auto
> >> ==============================================================================
> >> --- llvm/trunk/test/CodeGen/AArch64/arm64-triv-disjoint-mem-access.ll (added)
> >> +++ llvm/trunk/test/CodeGen/AArch64/arm64-triv-disjoint-mem-access.ll Mon Sep 8 09:43:48 2014
> >> @@ -0,0 +1,31 @@
> >> +; RUN: llc < %s -mtriple=arm64-linux-gnu -mcpu=cortex-a53 -enable-aa-sched-mi | FileCheck %s
> >> +; Check that the scheduler moves the load from a[1] past the store into a[2].
> >> +@a = common global i32* null, align 8
> >> +@m = common global i32 0, align 4
> >> +
> >> +; Function Attrs: nounwind
> >> +define i32 @func(i32 %i, i32 %j, i32 %k) #0 {
> >> +entry:
> >> +; CHECK: ldr {{w[0-9]+}}, [x[[REG:[0-9]+]], #4]
> >> +; CHECK: str {{w[0-9]+}}, [x[[REG]], #8]
> >> + %0 = load i32** @a, align 8, !tbaa !1
> >> + %arrayidx = getelementptr inbounds i32* %0, i64 2
> >> + store i32 %i, i32* %arrayidx, align 4, !tbaa !5
> >> + %arrayidx1 = getelementptr inbounds i32* %0, i64 1
> >> + %1 = load i32* %arrayidx1, align 4, !tbaa !5
> >> + %add = add nsw i32 %k, %i
> >> + store i32 %add, i32* @m, align 4, !tbaa !5
> >> + ret i32 %1
> >> +}
> >> +
> >> +attributes #0 = { nounwind "less-precise-fpmad"="false" "no-frame-pointer-elim"="true" "no-frame-pointer-elim-non-leaf" "no-infs-fp-math"="true" "no-nans-fp-math"="true" "stack-protector-buffer-size"="8" "unsafe-fp-math"="true" "use-soft-float"="false" }
> >> +
> >> +!llvm.ident = !{!0}
> >> +
> >> +!0 = metadata !{metadata !"clang version 3.6.0 "}
> >> +!1 = metadata !{metadata !2, metadata !2, i64 0}
> >> +!2 = metadata !{metadata !"any pointer", metadata !3, i64 0}
> >> +!3 = metadata !{metadata !"omnipotent char", metadata !4, i64 0}
> >> +!4 = metadata !{metadata !"Simple C/C++ TBAA"}
> >> +!5 = metadata !{metadata !6, metadata !6, i64 0}
> >> +!6 = metadata !{metadata !"int", metadata !3, i64 0}
> >>
> >>
> >> _______________________________________________
> >> llvm-commits mailing list
> >> llvm-commits at cs.uiuc.edu
> >> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
> >>
> >
> > --
> > Hal Finkel
> > Assistant Computational Scientist
> > Leadership Computing Facility
> > Argonne National Laboratory
> >
>
>
>
>
--
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory