[llvm] r217371 - [AArch64] Improve AA to remove unneeded edges in the AA MI scheduling graph.

Sanjin Sijaric ssijaric at codeaurora.org
Wed Oct 1 17:46:42 PDT 2014


Hi Andy,

 

Thanks for clarifying.  I'll add a check to do this only when dealing with
virtual registers.  The reason some chain edges are not being discarded by
the BasicAA query inside MIsNeedChainEdge is that the check for
isIdentifiedObject (called by isUnsafeMemoryObject) is done before the AA
part has a chance to determine that there is no aliasing.
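
Roughly what I have in mind, as a sketch only (the exact placement inside
areMemAccessesTriviallyDisjoint may still change; BaseRegA and BaseRegB are
the names already used in the patch):

  // Only trust the trivial-disjointness reasoning for virtual registers.
  // A physical base register may be redefined between the two accesses,
  // so seeing the same register number does not imply the same address.
  if (!TargetRegisterInfo::isVirtualRegister(BaseRegA) ||
      !TargetRegisterInfo::isVirtualRegister(BaseRegB))
    return false;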

 

Thanks,

Sanjin

 

From: Andrew Trick [mailto:atrick at apple.com] 
Sent: Tuesday, September 30, 2014 7:44 PM
To: Hal Finkel
Cc: llvm commits; Matt Arsenault; mcrosier at codeaurora.org; Sanjin Sijaric
Subject: Re: [llvm] r217371 - [AArch64] Improve AA to remove unneeded edges
in the AA MI scheduling graph.

 

 

On Sep 30, 2014, at 3:36 PM, Hal Finkel <hfinkel at anl.gov> wrote:

 

Andy,

Can you take a quick look at this? I think that what Sanjin says makes
sense; do you agree? Is there a way to add an assert in case something goes
wrong?

Sanjin, if nothing else, please add a comment in the code to this effect.

 

Thanks Hal, sometimes I need to be copied on commits to notice them.

 

Is BasicAA unable to pick up on this because it can't answer queries about
aliasing within a single loop iteration? It seems like long term that may be
something to revisit. There's no fundamental reason we need to decode
machine instructions to figure this case out.

 

But given that's a pretty major design issue, I'm OK with checking disjoint
access at the MI level.

 

As you said, areMemAccessesTriviallyDisjoint is lying if it reports true for
physical registers. I don't see the harm in explicitly checking the register
type. Otherwise, there does need to be a very bold comment in the function
header.

 

-Andy

 


Thanks again,
Hal

----- Original Message -----



From: "Sanjin Sijaric" <ssijaric at codeaurora.org>
To: mcrosier at codeaurora.org
Cc: "Hal Finkel" <hfinkel at anl.gov>, llvm-commits at cs.uiuc.edu, "Matt
Arsenault" <arsenm2 at gmail.com>
Sent: Monday, September 29, 2014 5:22:47 PM
Subject: RE: [llvm] r217371 - [AArch64] Improve AA to remove unneeded edges
in the AA MI scheduling graph.

Hi Hal,

The original purpose of the function was for discarding trivial chain edges
when constructing the dependence graph for scheduling.  If BaseReg is a
physical register that gets modified, then we may have something like:

LD R1 = A(BaseReg, Offset1)    (1)

BaseReg = ....                 (2)

ST R2, B(BaseReg, Offset2)     (3)

Given the inputs (1) and (3) to areMemAccessesTriviallyDisjoint, it may be
determined that there is no need for an edge between them.  There will be no
dependence between (1) and (3), but there will be dependences between "(1)
and (2)" and "(2) and (3)".  I think this should be enough to prevent any
bad scheduling from happening.

Thanks,
Sanjin

-----Original Message-----
From: Chad Rosier [mailto:mcrosier at codeaurora.org]
Sent: Monday, September 29, 2014 6:47 AM
To: ssijaric at codeaurora.org
Cc: Hal Finkel; Chad Rosier; llvm-commits at cs.uiuc.edu; Matt Arsenault
Subject: Re: [llvm] r217371 - [AArch64] Improve AA to remove unneeded
edges
in the AA MI scheduling graph.

+Sanjin

Adding Sanjin since this is his work.

Chad




----- Original Message -----



From: "Chad Rosier" <mcrosier at codeaurora.org>
To: llvm-commits at cs.uiuc.edu
Sent: Monday, September 8, 2014 9:43:48 AM
Subject: [llvm] r217371 - [AArch64] Improve AA to remove unneeded edges
in the AA MI scheduling graph.

Author: mcrosier
Date: Mon Sep  8 09:43:48 2014
New Revision: 217371

URL: http://llvm.org/viewvc/llvm-project?rev=217371&view=rev
Log:
[AArch64] Improve AA to remove unneeded edges in the AA MI
scheduling
graph.

Patch by Sanjin Sijaric <ssijaric at codeaurora.org>!
Phabricator Review: http://reviews.llvm.org/D5103

Added:
   llvm/trunk/test/CodeGen/AArch64/arm64-triv-disjoint-mem-access.ll
Modified:
   llvm/trunk/include/llvm/Target/TargetInstrInfo.h
   llvm/trunk/lib/CodeGen/ScheduleDAGInstrs.cpp
   llvm/trunk/lib/Target/AArch64/AArch64InstrInfo.cpp
   llvm/trunk/lib/Target/AArch64/AArch64InstrInfo.h

Modified: llvm/trunk/include/llvm/Target/TargetInstrInfo.h
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Target/TargetInstrInfo.h?rev=217371&r1=217370&r2=217371&view=diff
==============================================================================
--- llvm/trunk/include/llvm/Target/TargetInstrInfo.h (original)
+++ llvm/trunk/include/llvm/Target/TargetInstrInfo.h Mon Sep  8 09:43:48 2014
@@ -1192,6 +1192,20 @@ public:
    return nullptr;
  }

+  // areMemAccessesTriviallyDisjoint - Sometimes, it is possible for the target
+  // to tell, even without aliasing information, that two MIs access different
+  // memory addresses. This function returns true if two MIs access different
+  // memory addresses, and false otherwise.
+  virtual bool
+  areMemAccessesTriviallyDisjoint(MachineInstr *MIa, MachineInstr *MIb,
+                                  AliasAnalysis *AA = nullptr) const {
+    assert(MIa && (MIa->mayLoad() || MIa->mayStore()) &&
+           "MIa must load from or modify a memory location");
+    assert(MIb && (MIb->mayLoad() || MIb->mayStore()) &&
+           "MIb must load from or modify a memory location");
+    return false;
+  }
+
private:
  int CallFrameSetupOpcode, CallFrameDestroyOpcode;
};

Modified: llvm/trunk/lib/CodeGen/ScheduleDAGInstrs.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/ScheduleDAGInstrs.cpp?rev=217371&r1=217370&r2=217371&view=diff
==============================================================================
--- llvm/trunk/lib/CodeGen/ScheduleDAGInstrs.cpp (original)
+++ llvm/trunk/lib/CodeGen/ScheduleDAGInstrs.cpp Mon Sep  8 09:43:48 2014
@@ -511,9 +511,18 @@ static inline bool isUnsafeMemoryObject(
 static bool MIsNeedChainEdge(AliasAnalysis *AA, const MachineFrameInfo *MFI,
                              MachineInstr *MIa,
                              MachineInstr *MIb) {
+  const MachineFunction *MF = MIa->getParent()->getParent();
+  const TargetInstrInfo *TII = MF->getSubtarget().getInstrInfo();
+
   // Cover a trivial case - no edge is needed to itself.
   if (MIa == MIb)
     return false;
+
+  // Let the target decide if memory accesses cannot possibly overlap.
+  if ((MIa->mayLoad() || MIa->mayStore()) &&
+      (MIb->mayLoad() || MIb->mayStore()))
+    if (TII->areMemAccessesTriviallyDisjoint(MIa, MIb, AA))
+      return false;
 
   // FIXME: Need to handle multiple memory operands to support all targets.
   if (!MIa->hasOneMemOperand() || !MIb->hasOneMemOperand())

Modified: llvm/trunk/lib/Target/AArch64/AArch64InstrInfo.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AArch64/AArch64InstrInfo.cpp?rev=217371&r1=217370&r2=217371&view=diff
==============================================================================
--- llvm/trunk/lib/Target/AArch64/AArch64InstrInfo.cpp (original)
+++ llvm/trunk/lib/Target/AArch64/AArch64InstrInfo.cpp Mon Sep  8 09:43:48 2014
@@ -607,6 +607,42 @@ bool AArch64InstrInfo::isCoalescableExtI
  }
}

+bool
+AArch64InstrInfo::areMemAccessesTriviallyDisjoint(MachineInstr *MIa,
+                                                  MachineInstr *MIb,
+                                                  AliasAnalysis *AA) const {
+  const TargetRegisterInfo *TRI = &getRegisterInfo();
+  unsigned BaseRegA = 0, BaseRegB = 0;
+  int OffsetA = 0, OffsetB = 0;
+  int WidthA = 0, WidthB = 0;
+
+  assert(MIa && (MIa->mayLoad() || MIa->mayStore()) &&
+         "MIa must be a store or a load");
+  assert(MIb && (MIb->mayLoad() || MIb->mayStore()) &&
+         "MIb must be a store or a load");
+
+  if (MIa->hasUnmodeledSideEffects() || MIb->hasUnmodeledSideEffects() ||
+      MIa->hasOrderedMemoryRef() || MIb->hasOrderedMemoryRef())
+    return false;
+
+  // Retrieve the base register, offset from the base register and width.
+  // Width is the size of memory that is being loaded/stored (e.g. 1, 2, 4,
+  // 8).  If the base registers are identical, and the offset of a lower
+  // memory access + the width doesn't overlap the offset of a higher
+  // memory access, then the memory accesses are different.
+  if (getLdStBaseRegImmOfsWidth(MIa, BaseRegA, OffsetA, WidthA, TRI) &&
+      getLdStBaseRegImmOfsWidth(MIb, BaseRegB, OffsetB, WidthB, TRI)) {
+    if (BaseRegA == BaseRegB) {


I think that this makes sense only for SSA virtual registers, but not for
physical ones (because the value in a physical register might have changed
in between the two instructions). If I'm right, then you'll want to exclude
physical registers here (at least any that might be modified during the
function's execution).

-Hal




+      int LowOffset = OffsetA < OffsetB ? OffsetA : OffsetB;
+      int HighOffset = OffsetA < OffsetB ? OffsetB : OffsetA;
+      int LowWidth = (LowOffset == OffsetA) ? WidthA : WidthB;
+      if (LowOffset + LowWidth <= HighOffset)
+        return true;
+    }
+  }
+  return false;
+}
+
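
To make the overlap check above concrete, here is a worked instance with
made-up operands (illustrative only; these values are not from the patch or
its test):

  // MIa: str w0, [x8, #8]  -> OffsetA = 8, WidthA = 4
  // MIb: ldr w1, [x8, #4]  -> OffsetB = 4, WidthB = 4
  // LowOffset = 4, LowWidth = 4, HighOffset = 8
  // 4 + 4 <= 8 holds, so bytes [4,8) and [8,12) cannot overlap and the
  // chain edge between the two instructions can be dropped.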
/// analyzeCompare - For a comparison instruction, return the source registers
/// in SrcReg and SrcReg2, and the value it compares against in CmpValue.
/// Return true if the comparison instruction can be analyzed.
@@ -1270,6 +1306,102 @@ AArch64InstrInfo::getLdStBaseRegImmOfs(M
  };
}

+bool AArch64InstrInfo::getLdStBaseRegImmOfsWidth(
+    MachineInstr *LdSt, unsigned &BaseReg, int &Offset, int &Width,
+    const TargetRegisterInfo *TRI) const {
+  // Handle only loads/stores with base register followed by immediate offset.
+  if (LdSt->getNumOperands() != 3)
+    return false;
+  if (!LdSt->getOperand(1).isReg() || !LdSt->getOperand(2).isImm())
+    return false;
+
+  // Offset is calculated as the immediate operand multiplied by the scaling
+  // factor.  Unscaled instructions have scaling factor set to 1.
+  int Scale = 0;
+  switch (LdSt->getOpcode()) {
+  default:
+    return false;
+  case AArch64::LDURQi:
+  case AArch64::STURQi:
+    Width = 16;
+    Scale = 1;
+    break;
+  case AArch64::LDURXi:
+  case AArch64::LDURDi:
+  case AArch64::STURXi:
+  case AArch64::STURDi:
+    Width = 8;
+    Scale = 1;
+    break;
+  case AArch64::LDURWi:
+  case AArch64::LDURSi:
+  case AArch64::LDURSWi:
+  case AArch64::STURWi:
+  case AArch64::STURSi:
+    Width = 4;
+    Scale = 1;
+    break;
+  case AArch64::LDURHi:
+  case AArch64::LDURHHi:
+  case AArch64::LDURSHXi:
+  case AArch64::LDURSHWi:
+  case AArch64::STURHi:
+  case AArch64::STURHHi:
+    Width = 2;
+    Scale = 1;
+    break;
+  case AArch64::LDURBi:
+  case AArch64::LDURBBi:
+  case AArch64::LDURSBXi:
+  case AArch64::LDURSBWi:
+  case AArch64::STURBi:
+  case AArch64::STURBBi:
+    Width = 1;
+    Scale = 1;
+    break;
+  case AArch64::LDRXui:
+  case AArch64::STRXui:
+    Scale = Width = 8;
+    break;
+  case AArch64::LDRWui:
+  case AArch64::STRWui:
+    Scale = Width = 4;
+    break;
+  case AArch64::LDRBui:
+  case AArch64::STRBui:
+    Scale = Width = 1;
+    break;
+  case AArch64::LDRHui:
+  case AArch64::STRHui:
+    Scale = Width = 2;
+    break;
+  case AArch64::LDRSui:
+  case AArch64::STRSui:
+    Scale = Width = 4;
+    break;
+  case AArch64::LDRDui:
+  case AArch64::STRDui:
+    Scale = Width = 8;
+    break;
+  case AArch64::LDRQui:
+  case AArch64::STRQui:
+    Scale = Width = 16;
+    break;
+  case AArch64::LDRBBui:
+  case AArch64::STRBBui:
+    Scale = Width = 1;
+    break;
+  case AArch64::LDRHHui:
+  case AArch64::STRHHui:
+    Scale = Width = 2;
+    break;
+  };
+
+  BaseReg = LdSt->getOperand(1).getReg();
+  Offset = LdSt->getOperand(2).getImm() * Scale;
+  return true;
+}
+
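
A note on the scaling above, again with illustrative operands (not from the
patch): for the scaled form LDRWui the MI immediate is an index in units of
the access size, while the unscaled LDURWi immediate is already in bytes:

  // ldr w0, [x1, #12]   ; LDRWui: immediate operand = 3,
  //                     ; Scale = Width = 4 -> Offset = 3 * 4 = 12 bytes
  // ldur w0, [x1, #12]  ; LDURWi: immediate operand = 12,
  //                     ; Scale = 1, Width = 4 -> Offset = 12 * 1 = 12 bytes

Both forms describe the same 4-byte access 12 bytes above BaseReg (x1 here).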
/// Detect opportunities for ldp/stp formation.
///
/// Only called for LdSt for which getLdStBaseRegImmOfs returns true.

Modified: llvm/trunk/lib/Target/AArch64/AArch64InstrInfo.h
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AArch64/AArch64InstrInfo.h?rev=217371&r1=217370&r2=217371&view=diff
==============================================================================
--- llvm/trunk/lib/Target/AArch64/AArch64InstrInfo.h (original)
+++ llvm/trunk/lib/Target/AArch64/AArch64InstrInfo.h Mon Sep  8 09:43:48 2014
@@ -52,6 +52,10 @@ public:
   bool isCoalescableExtInstr(const MachineInstr &MI, unsigned &SrcReg,
                              unsigned &DstReg, unsigned &SubIdx) const override;
 
+  bool
+  areMemAccessesTriviallyDisjoint(MachineInstr *MIa, MachineInstr *MIb,
+                                  AliasAnalysis *AA = nullptr) const override;
+
   unsigned isLoadFromStackSlot(const MachineInstr *MI,
                                int &FrameIndex) const override;
   unsigned isStoreToStackSlot(const MachineInstr *MI,
@@ -90,6 +94,10 @@ public:
                             unsigned &Offset,
                             const TargetRegisterInfo *TRI) const override;
 
+  bool getLdStBaseRegImmOfsWidth(MachineInstr *LdSt, unsigned &BaseReg,
+                                 int &Offset, int &Width,
+                                 const TargetRegisterInfo *TRI) const;
+
   bool enableClusterLoads() const override { return true; }
 
   bool shouldClusterLoads(MachineInstr *FirstLdSt, MachineInstr *SecondLdSt,

Added: llvm/trunk/test/CodeGen/AArch64/arm64-triv-disjoint-mem-access.ll
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AArch64/arm64-triv-disjoint-mem-access.ll?rev=217371&view=auto
==============================================================================
--- llvm/trunk/test/CodeGen/AArch64/arm64-triv-disjoint-mem-access.ll (added)
+++ llvm/trunk/test/CodeGen/AArch64/arm64-triv-disjoint-mem-access.ll Mon Sep  8 09:43:48 2014
@@ -0,0 +1,31 @@
+; RUN: llc < %s -mtriple=arm64-linux-gnu -mcpu=cortex-a53 -enable-aa-sched-mi | FileCheck %s
+; Check that the scheduler moves the load from a[1] past the store into a[2].
+@a = common global i32* null, align 8
+@m = common global i32 0, align 4
+
+; Function Attrs: nounwind
+define i32 @func(i32 %i, i32 %j, i32 %k) #0 {
+entry:
+; CHECK: ldr {{w[0-9]+}}, [x[[REG:[0-9]+]], #4]
+; CHECK: str {{w[0-9]+}}, [x[[REG]], #8]
+  %0 = load i32** @a, align 8, !tbaa !1
+  %arrayidx = getelementptr inbounds i32* %0, i64 2
+  store i32 %i, i32* %arrayidx, align 4, !tbaa !5
+  %arrayidx1 = getelementptr inbounds i32* %0, i64 1
+  %1 = load i32* %arrayidx1, align 4, !tbaa !5
+  %add = add nsw i32 %k, %i
+  store i32 %add, i32* @m, align 4, !tbaa !5
+  ret i32 %1
+}
+
+attributes #0 = { nounwind "less-precise-fpmad"="false" "no-frame-pointer-elim"="true" "no-frame-pointer-elim-non-leaf" "no-infs-fp-math"="true" "no-nans-fp-math"="true" "stack-protector-buffer-size"="8" "unsafe-fp-math"="true" "use-soft-float"="false" }
+
+!llvm.ident = !{!0}
+
+!0 = metadata !{metadata !"clang version 3.6.0 "}
+!1 = metadata !{metadata !2, metadata !2, i64 0}
+!2 = metadata !{metadata !"any pointer", metadata !3, i64 0}
+!3 = metadata !{metadata !"omnipotent char", metadata !4, i64 0}
+!4 = metadata !{metadata !"Simple C/C++ TBAA"}
+!5 = metadata !{metadata !6, metadata !6, i64 0}
+!6 = metadata !{metadata !"int", metadata !3, i64 0}




--
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory

