[llvm-dev] [buildSchedGraph] memory dependencies

Wed Feb 3 17:51:06 PST 2016

----- Original Message -----
> From: "Jonas Paulsson via llvm-dev" <llvm-dev at lists.llvm.org>
> To: "llvm-dev" <llvm-dev at lists.llvm.org>
> Sent: Wednesday, February 3, 2016 4:41:33 AM
> Subject: [llvm-dev] [buildSchedGraph] memory dependencies
> 
> Hi,
> 
> (This only concerns MISNeedChainEdge(), and is separate from D8705)
> 
> I found out that the MIScheduler (pre-ra) could not handle a simple
> test
> case (test/CodeGen/SystemZ/alias-01.ll), with 16 independent load /
> add
> / stores.
> 
> The buildSchedGraph() put too many edges between memory accesses,
> because
> 1) There was no implementation of areMemAccessesTriviallyDisjoint()
> for
> SystemZ.
> 2) Type Based Alias Analysis (TBAA) was not used.
> 
> I have gotten rid of the edges on an experimental level, and would
> now
> like some help and feedback:
> 
> 1): It seems common for targets to check for the same virtual address
> register and then figure out via offsets/sizes if they overlap. When
> faced with the task of doing this, I realized that the
> MachineMemOperands might do the job. I tried the following and wonder
> why this is not done already, perhaps as a default implementation of
> areMemAccessesTriviallyDisjoint(). Is it because memory operands may
> be
> missing, or is it because MachineCombiner may give the register based
> analysis an advantage? This is a check in the case of the *same
> Value*.
> In this case the Value is an argument, which is unsafe against
> others,
> but I am thinking it should at least be safe against itself...
> 
> diff --git a/lib/CodeGen/ScheduleDAGInstrs.cpp
> b/lib/CodeGen/ScheduleDAGInstrs.cpp
> index 00a0b0f..cd48f51 100644
> --- a/lib/CodeGen/ScheduleDAGInstrs.cpp
> +++ b/lib/CodeGen/ScheduleDAGInstrs.cpp
> @@ -584,6 +584,25 @@ static bool MIsNeedChainEdge(AliasAnalysis *AA,
> const MachineFrameInfo *MFI,
>     if (!MIa->hasOneMemOperand() || !MIb->hasOneMemOperand())
>       return true;
> 
> +  // If mem-operands show that the same address Value is used by
> both
> +  // ("normal") instructions, simply check offsets and sizes of the
> +  // accesses.
> +  MachineMemOperand *MMOa = *MIa->memoperands_begin();
> +  MachineMemOperand *MMOb = *MIb->memoperands_begin();
> +  const Value *VALa = MMOa->getValue();
> +  const Value *VALb = MMOb->getValue();
> +  if (VALa == VALb &&
> +      !MIa->hasUnmodeledSideEffects() &&
> !MIb->hasUnmodeledSideEffects() &&
> +      !MIa->hasOrderedMemoryRef() && !MIb->hasOrderedMemoryRef()) {
> +    int OffsetA = MMOa->getOffset(), OffsetB = MMOb->getOffset();
> +    int WidthA = MMOa->getSize(), WidthB = MMOb->getSize();
> +    int LowOffset = OffsetA < OffsetB ? OffsetA : OffsetB;
> +    int HighOffset = OffsetA < OffsetB ? OffsetB : OffsetA;
> +    int LowWidth = (LowOffset == OffsetA) ? WidthA : WidthB;
> +    if (LowOffset + LowWidth <= HighOffset)
> +      return false;
> +  }
> +
>     if (isUnsafeMemoryObject(MIa, MFI, DL) ||
>     isUnsafeMemoryObject(MIb,
> MFI, DL))
>       return true;
> 

This looks reasonable to me.

> 
> 2) The TBAA tags should separate the loads from the stores. In the MF
> I see
> ...
> %vreg32<def> = L %vreg0, 56, %noreg;
> mem:LD4[%src1(align=64)+56](align=8)(tbaa=!1) GR32Bit:%vreg32
> ADDR64Bit:%vreg0
> ...
> ST %vreg33, %vreg1, 60, %noreg;
> mem:ST4[%dest(align=64)+60](align=4)(tbaa=!4) GR32Bit:%vreg33
> ADDR64Bit:%vreg1
> ...
> 
> Since the tbba tags are part of the MachineMemOperands, it is easy to
> just continue the patch (above isUnsafeMemoryObject() checks) with
> 
> +  AAMDNodes AATagsA = MMOa->getAAInfo(), AATagsB =
> MMOb->getAAInfo();
> +  if (AATagsA->TBAA && AATagsB->TBAA && AATagsA != AATagsB)
> +    return false;
> 
> I see that there is a TBAA method that is intended to give this type
> of
> result. So I wonder why this is not used?
> 
> Would it be correct to implement SystemZs
> areMemAccessesTriviallyDisjoint() with the above code? (I am
> suspecting
> that perhaps my AAMDNodes check is incomplete)

I don't understand this. The whole point of allowing backends use AA is to let the existing AA infrastructure take care of this. As I recall, SystemZ enabled AA in the backend, so why do you need anything special here?

 -Hal

> 
> /Jonas
> 
> test case:
> 
> define void @f1(<16 x i32> *%src1, <16 x float> *%dest) {
> ; CHECK-LABEL: f1:
> ; CHECK-NOT: %r15
> ; CHECK: br %r14
>    %val = load <16 x i32> , <16 x i32> *%src1, !tbaa !1
>    %add = add <16 x i32> %val, %val
>    %res = bitcast <16 x i32> %add to <16 x float>
>    store <16 x float> %res, <16 x float> *%dest, !tbaa !2
>    ret void
> }
> 
> !0 = !{ !"root" }
> !1 = !{ !"set1", !0 }
> !2 = !{ !"set2", !0 }
> 
> 
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
> 

-- 
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory