<html><head><meta http-equiv="Content-Type" content="text/html; charset=windows-1252"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;"><br><div><div>On Sep 9, 2014, at 6:03 PM, Matt Arsenault <<a href="mailto:arsenm2@gmail.com">arsenm2@gmail.com</a>> wrote:</div><br class="Apple-interchange-newline"><blockquote type="cite"><div style="font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;"><br>On Sep 8, 2014, at 10:43 AM, Chad Rosier <<a href="mailto:mcrosier@codeaurora.org">mcrosier@codeaurora.org</a>> wrote:<br><br><blockquote type="cite">Author: mcrosier<br>Date: Mon Sep 8 09:43:48 2014<br>New Revision: 217371<br><br>URL: <a href="http://llvm.org/viewvc/llvm-project?rev=217371&view=rev">http://llvm.org/viewvc/llvm-project?rev=217371&view=rev</a><br>Log:<br>[AArch64] Improve AA to remove unneeded edges in the AA MI scheduling graph.<br><br>Patch by Sanjin Sijaric <<a href="mailto:ssijaric@codeaurora.org">ssijaric@codeaurora.org</a>>!<br>Phabricator Review: <a href="http://reviews.llvm.org/D5103">http://reviews.llvm.org/D5103</a><br><br>Added:<br> llvm/trunk/test/CodeGen/AArch64/arm64-triv-disjoint-mem-access.ll<br>Modified:<br> llvm/trunk/include/llvm/Target/TargetInstrInfo.h<br> llvm/trunk/lib/CodeGen/ScheduleDAGInstrs.cpp<br> llvm/trunk/lib/Target/AArch64/AArch64InstrInfo.cpp<br> llvm/trunk/lib/Target/AArch64/AArch64InstrInfo.h<br><br>Modified: llvm/trunk/include/llvm/Target/TargetInstrInfo.h<br>URL: <a 
href="http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Target/TargetInstrInfo.h?rev=217371&r1=217370&r2=217371&view=diff">http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Target/TargetInstrInfo.h?rev=217371&r1=217370&r2=217371&view=diff</a><br>==============================================================================<br>--- llvm/trunk/include/llvm/Target/TargetInstrInfo.h (original)<br>+++ llvm/trunk/include/llvm/Target/TargetInstrInfo.h Mon Sep 8 09:43:48 2014<br>@@ -1192,6 +1192,20 @@ public:<br> return nullptr;<br> }<br><br>+ // areMemAccessesTriviallyDisjoint - Sometimes, it is possible for the target<br>+ // to tell, even without aliasing information, that two MIs access different<br>+ // memory addresses. This function returns true if two MIs access different<br>+ // memory addresses, and false otherwise.<br>+ virtual bool<br>+ areMemAccessesTriviallyDisjoint(MachineInstr *MIa, MachineInstr *MIb,<br>+ AliasAnalysis *AA = nullptr) const {<br>+ assert(MIa && (MIa->mayLoad() || MIa->mayStore()) &&<br>+ "MIa must load from or modify a memory location");<br>+ assert(MIb && (MIb->mayLoad() || MIb->mayStore()) &&<br>+ "MIb must load from or modify a memory location");<br>+ return false;<br>+ }<br>+<br>private:<br> int CallFrameSetupOpcode, CallFrameDestroyOpcode;<br>};<br><br>Modified: llvm/trunk/lib/CodeGen/ScheduleDAGInstrs.cpp<br>URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/ScheduleDAGInstrs.cpp?rev=217371&r1=217370&r2=217371&view=diff">http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/ScheduleDAGInstrs.cpp?rev=217371&r1=217370&r2=217371&view=diff</a><br>==============================================================================<br>--- llvm/trunk/lib/CodeGen/ScheduleDAGInstrs.cpp (original)<br>+++ llvm/trunk/lib/CodeGen/ScheduleDAGInstrs.cpp Mon Sep 8 09:43:48 2014<br>@@ -511,9 +511,18 @@ static inline bool isUnsafeMemoryObject(<br>static bool MIsNeedChainEdge(AliasAnalysis *AA, const 
MachineFrameInfo *MFI,<br> MachineInstr *MIa,<br> MachineInstr *MIb) {<br>+ const MachineFunction *MF = MIa->getParent()->getParent();<br>+ const TargetInstrInfo *TII = MF->getSubtarget().getInstrInfo();<br>+<br> // Cover a trivial case - no edge is need to itself.<br> if (MIa == MIb)<br> return false;<br>+<span class="Apple-converted-space"> </span><br>+ // Let the target decide if memory accesses cannot possibly overlap.<br>+ if ((MIa->mayLoad() || MIa->mayStore()) &&<br>+ (MIb->mayLoad() || MIb->mayStore()))<br>+ if (TII->areMemAccessesTriviallyDisjoint(MIa, MIb, AA))<br>+ return false;<br><br> // FIXME: Need to handle multiple memory operands to support all targets.<br> if (!MIa->hasOneMemOperand() || !MIb->hasOneMemOperand())<br><br>Modified: llvm/trunk/lib/Target/AArch64/AArch64InstrInfo.cpp<br>URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AArch64/AArch64InstrInfo.cpp?rev=217371&r1=217370&r2=217371&view=diff">http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AArch64/AArch64InstrInfo.cpp?rev=217371&r1=217370&r2=217371&view=diff</a><br>==============================================================================<br>--- llvm/trunk/lib/Target/AArch64/AArch64InstrInfo.cpp (original)<br>+++ llvm/trunk/lib/Target/AArch64/AArch64InstrInfo.cpp Mon Sep 8 09:43:48 2014<br>@@ -607,6 +607,42 @@ bool AArch64InstrInfo::isCoalescableExtI<br> }<br>}<br><br>+bool<br>+AArch64InstrInfo::areMemAccessesTriviallyDisjoint(MachineInstr *MIa,<br>+ MachineInstr *MIb,<br>+ AliasAnalysis *AA) const {<br>+ const TargetRegisterInfo *TRI = &getRegisterInfo();<br>+ unsigned BaseRegA = 0, BaseRegB = 0;<br>+ int OffsetA = 0, OffsetB = 0;<br>+ int WidthA = 0, WidthB = 0;<br>+<br>+ assert(MIa && (MIa->mayLoad() || MIa->mayStore()) &&<br>+ "MIa must be a store or a load");<br>+ assert(MIb && (MIb->mayLoad() || MIb->mayStore()) &&<br>+ "MIb must be a store or a load");<br>+<br>+ if (MIa->hasUnmodeledSideEffects() || MIb->hasUnmodeledSideEffects() ||<br>+ 
MIa->hasOrderedMemoryRef() || MIb->hasOrderedMemoryRef())<br>+ return false;<br>+<br>+ // Retrieve the base register, offset from the base register and width. Width<br>+ // is the size of memory that is being loaded/stored (e.g. 1, 2, 4, 8). If<br>+ // base registers are identical, and the offset of a lower memory access +<br>+ // the width doesn't overlap the offset of a higher memory access,<br>+ // then the memory accesses are different.<br>+ if (getLdStBaseRegImmOfsWidth(MIa, BaseRegA, OffsetA, WidthA, TRI) &&<br>+ getLdStBaseRegImmOfsWidth(MIb, BaseRegB, OffsetB, WidthB, TRI)) {<br>+ if (BaseRegA == BaseRegB) {<br>+ int LowOffset = OffsetA < OffsetB ? OffsetA : OffsetB;<br>+ int HighOffset = OffsetA < OffsetB ? OffsetB : OffsetA;<br>+ int LowWidth = (LowOffset == OffsetA) ? WidthA : WidthB;<br>+ if (LowOffset + LowWidth <= HighOffset)<br>+ return true;<br>+ }<br>+ }<br>+ return false;<br>+}<br>+<br>/// analyzeCompare - For a comparison instruction, return the source registers<br>/// in SrcReg and SrcReg2, and the value it compares against in CmpValue.<br>/// Return true if the comparison instruction can be analyzed.<br>@@ -1270,6 +1306,102 @@ AArch64InstrInfo::getLdStBaseRegImmOfs(M<br> };<br>}<br><br></blockquote><br><br>Why not add the width argument to the standard getLdStBaseRegImmOfs? Other targets that will want to implement areMemAccessesTriviallyDisjoint are going to want the same information. The logic of figuring out overlapping offsets is also likely going to be the same, so maybe that part should be pulled into the user of this.<br></div></blockquote><div><br></div><div>Actually, I don’t think the width argument is necessary. 
I’ll be posting a patch later today that implements this for R600; there I just used the MachineMemOperand to get the size, since that was easier than parsing it out of the instruction.</div><div><br></div><br><blockquote type="cite"><div style="font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;"><br><br><blockquote type="cite">+bool AArch64InstrInfo::getLdStBaseRegImmOfsWidth(<br>+ MachineInstr *LdSt, unsigned &BaseReg, int &Offset, int &Width,<br>+ const TargetRegisterInfo *TRI) const {<br>+ // Handle only loads/stores with base register followed by immediate offset.<br>+ if (LdSt->getNumOperands() != 3)<br>+ return false;<br>+ if (!LdSt->getOperand(1).isReg() || !LdSt->getOperand(2).isImm())<br>+ return false;<br>+<br>+ // Offset is calculated as the immediate operand multiplied by the scaling factor.<br>+ // Unscaled instructions have scaling factor set to 1.<br>+ int Scale = 0;<br>+ switch (LdSt->getOpcode()) {<br>+ default:<br>+ return false;<br>+ case AArch64::LDURQi:<br>+ case AArch64::STURQi:<br>+ Width = 16;<br>+ Scale = 1;<br>+ break;<br>+ case AArch64::LDURXi:<br>+ case AArch64::LDURDi:<br>+ case AArch64::STURXi:<br>+ case AArch64::STURDi:<br>+ Width = 8;<br>+ Scale = 1;<br>+ break;<br>+ case AArch64::LDURWi:<br>+ case AArch64::LDURSi:<br>+ case AArch64::LDURSWi:<br>+ case AArch64::STURWi:<br>+ case AArch64::STURSi:<br>+ Width = 4;<br>+ Scale = 1;<br>+ break;<br>+ case AArch64::LDURHi:<br>+ case AArch64::LDURHHi:<br>+ case AArch64::LDURSHXi:<br>+ case AArch64::LDURSHWi:<br>+ case AArch64::STURHi:<br>+ case AArch64::STURHHi:<br>+ Width = 2;<br>+ Scale = 1;<br>+ break;<br>+ case AArch64::LDURBi:<br>+ case AArch64::LDURBBi:<br>+ case AArch64::LDURSBXi:<br>+ case AArch64::LDURSBWi:<br>+ case AArch64::STURBi:<br>+ case 
AArch64::STURBBi:<br>+ Width = 1;<br>+ Scale = 1;<br>+ break;<br>+ case AArch64::LDRXui:<br>+ case AArch64::STRXui:<br>+ Scale = Width = 8;<br>+ break;<br>+ case AArch64::LDRWui:<br>+ case AArch64::STRWui:<br>+ Scale = Width = 4;<br>+ break;<br>+ case AArch64::LDRBui:<br>+ case AArch64::STRBui:<br>+ Scale = Width = 1;<br>+ break;<br>+ case AArch64::LDRHui:<br>+ case AArch64::STRHui:<br>+ Scale = Width = 2;<br>+ break;<br>+ case AArch64::LDRSui:<br>+ case AArch64::STRSui:<br>+ Scale = Width = 4;<br>+ break;<br>+ case AArch64::LDRDui:<br>+ case AArch64::STRDui:<br>+ Scale = Width = 8;<br>+ break;<br>+ case AArch64::LDRQui:<br>+ case AArch64::STRQui:<br>+ Scale = Width = 16;<br>+ break;<br>+ case AArch64::LDRBBui:<br>+ case AArch64::STRBBui:<br>+ Scale = Width = 1;<br>+ break;<br>+ case AArch64::LDRHHui:<br>+ case AArch64::STRHHui:<br>+ Scale = Width = 2;<br>+ break;<br>+ };<br>+<br>+ BaseReg = LdSt->getOperand(1).getReg();<br>+ Offset = LdSt->getOperand(2).getImm() * Scale;<br>+ return true;<br>+}<br>+<br>/// Detect opportunities for ldp/stp formation.<br>///<br>/// Only called for LdSt for which getLdStBaseRegImmOfs returns true.<br><br>Modified: llvm/trunk/lib/Target/AArch64/AArch64InstrInfo.h<br>URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AArch64/AArch64InstrInfo.h?rev=217371&r1=217370&r2=217371&view=diff">http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AArch64/AArch64InstrInfo.h?rev=217371&r1=217370&r2=217371&view=diff</a><br>==============================================================================<br>--- llvm/trunk/lib/Target/AArch64/AArch64InstrInfo.h (original)<br>+++ llvm/trunk/lib/Target/AArch64/AArch64InstrInfo.h Mon Sep 8 09:43:48 2014<br>@@ -52,6 +52,10 @@ public:<br> bool isCoalescableExtInstr(const MachineInstr &MI, unsigned &SrcReg,<br> unsigned &DstReg, unsigned &SubIdx) const override;<br><br>+ bool<br>+ areMemAccessesTriviallyDisjoint(MachineInstr *MIa, MachineInstr *MIb,<br>+ AliasAnalysis *AA = nullptr) 
const override;<br>+<br> unsigned isLoadFromStackSlot(const MachineInstr *MI,<br> int &FrameIndex) const override;<br> unsigned isStoreToStackSlot(const MachineInstr *MI,<br>@@ -90,6 +94,10 @@ public:<br> unsigned &Offset,<br> const TargetRegisterInfo *TRI) const override;<br><br>+ bool getLdStBaseRegImmOfsWidth(MachineInstr *LdSt, unsigned &BaseReg,<br>+ int &Offset, int &Width,<br>+ const TargetRegisterInfo *TRI) const;<br>+<br> bool enableClusterLoads() const override { return true; }<br><br> bool shouldClusterLoads(MachineInstr *FirstLdSt, MachineInstr *SecondLdSt,<br><br>Added: llvm/trunk/test/CodeGen/AArch64/arm64-triv-disjoint-mem-access.ll<br>URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AArch64/arm64-triv-disjoint-mem-access.ll?rev=217371&view=auto">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AArch64/arm64-triv-disjoint-mem-access.ll?rev=217371&view=auto</a><br>==============================================================================<br>--- llvm/trunk/test/CodeGen/AArch64/arm64-triv-disjoint-mem-access.ll (added)<br>+++ llvm/trunk/test/CodeGen/AArch64/arm64-triv-disjoint-mem-access.ll Mon Sep 8 09:43:48 2014<br>@@ -0,0 +1,31 @@<br>+; RUN: llc < %s -mtriple=arm64-linux-gnu -mcpu=cortex-a53 -enable-aa-sched-mi | FileCheck %s<br>+; Check that the scheduler moves the load from a[1] past the store into a[2].<br>+@a = common global i32* null, align 8<br>+@m = common global i32 0, align 4<br>+<br>+; Function Attrs: nounwind<br>+define i32 @func(i32 %i, i32 %j, i32 %k) #0 {<br>+entry:<br>+; CHECK: ldr {{w[0-9]+}}, [x[[REG:[0-9]+]], #4]<br>+; CHECK: str {{w[0-9]+}}, [x[[REG]], #8]<br>+ %0 = load i32** @a, align 8, !tbaa !1<br>+ %arrayidx = getelementptr inbounds i32* %0, i64 2<br>+ store i32 %i, i32* %arrayidx, align 4, !tbaa !5<br>+ %arrayidx1 = getelementptr inbounds i32* %0, i64 1<br>+ %1 = load i32* %arrayidx1, align 4, !tbaa !5<br>+ %add = add nsw i32 %k, %i<br>+ store i32 %add, i32* @m, align 4, !tbaa !5<br>+ 
ret i32 %1<br>+}<br>+<br>+attributes #0 = { nounwind "less-precise-fpmad"="false" "no-frame-pointer-elim"="true" "no-frame-pointer-elim-non-leaf" "no-infs-fp-math"="true" "no-nans-fp-math"="true" "stack-protector-buffer-size"="8" "unsafe-fp-math"="true" "use-soft-float"="false" }<br>+<br>+!llvm.ident = !{!0}<br>+<br>+!0 = metadata !{metadata !"clang version 3.6.0 "}<br>+!1 = metadata !{metadata !2, metadata !2, i64 0}<br>+!2 = metadata !{metadata !"any pointer", metadata !3, i64 0}<br>+!3 = metadata !{metadata !"omnipotent char", metadata !4, i64 0}<br>+!4 = metadata !{metadata !"Simple C/C++ TBAA"}<br>+!5 = metadata !{metadata !6, metadata !6, i64 0}<br>+!6 = metadata !{metadata !"int", metadata !3, i64 0}<br><br><br>_______________________________________________<br>llvm-commits mailing list<br><a href="mailto:llvm-commits@cs.uiuc.edu">llvm-commits@cs.uiuc.edu</a><br>http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits</blockquote></div></blockquote></div><br></body></html>