[LLVMdev] Scheduling question (memory dependency)

William J. Schmidt wschmidt at linux.vnet.ibm.com
Fri Sep 21 07:07:21 PDT 2012


Here's another data point that may be useful.  [Scheduling experts,
please help! :) ]

If the bitfield is replaced by an ordinary two-byte member (replace
"short i:8" with "short i", etc.), the scheduler properly generates a
dependency between the store and the load.  For this case, a GEP is used
instead of a bitcast:

------------------------------------------------------------------
define void @_Z5check3fooj(%struct.foo* nocapture byval %f, i32 %i)
noinline {
entry:
  %i1 = getelementptr inbounds %struct.foo* %f, i64 0, i32 0
  %0 = load i16* %i1, align 2, !tbaa !0
------------------------------------------------------------------
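
For anyone who wants to reproduce this, here is a minimal sketch of what
the variant source might look like.  Two things in it are assumptions
rather than facts from the real test: the unsigned second parameter is
inferred from the mangled name (_Z5check3fooj) and the i32 argument in
the IR above, and the function name "mismatch" and its bool return are
made up so the check can be tested without printf/exit.

```cpp
// Variant of the test case with a plain member instead of a bitfield.
struct foo {
  short i;  // was "short i:8" in the failing version
};

// Hypothetical helper (not part of the original test case).
// Assumption: the second parameter is unsigned, inferred from the
// mangled name _Z5check3fooj and the i32 parameter in the IR.
// Returns true when the by-value struct and the scalar disagree.
// __attribute__((noinline)) is a GCC/Clang extension, used here so the
// struct really is passed by value across a call boundary.
__attribute__((noinline)) bool mismatch(struct foo f, unsigned i) {
  return static_cast<unsigned>(f.i) != i;
}
```

If the store of the incoming argument is ordered before the load, this
returns false whenever the caller passes matching values.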

One notable difference is the "!tbaa !0" decoration on the load.  I
don't know whether this helps or not.  Later the lowered instructions
look like:

------------------------------------------------------------------
16B		%vreg2<def> = COPY %X4; G8RC_with_sub_32:%vreg2
32B		%vreg1<def> = COPY %X3; G8RC:%vreg1
48B		STH8 %vreg1<kill>, 0, <fi#-1>; mem:ST2[FixedStack-1] G8RC:%vreg1
64B		%vreg0<def> = LHZ 0, <fi#-1>; mem:LD2[%i11] GPRC:%vreg0
                ...
------------------------------------------------------------------

Another difference is that the memory operand on the LHZ refers to %i11
rather than %0.  With this version the scheduler generates a dependency
between the store and the load, and everything works properly.
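
To make the effect of that one dependency edge concrete, here is a toy
model of height-driven list scheduling.  It is only a sketch: ToyNode,
toySchedule, orderWithoutEdge and orderWithEdge are hypothetical names,
not LLVM classes or functions, and the heights (4 for the store, 9 for
the load) are simply copied from the "Examining Available" dump quoted
below and kept fixed for simplicity.

```cpp
#include <cstddef>
#include <vector>

// Toy model only: hypothetical types, not LLVM's scheduling units.
struct ToyNode {
  int height;                      // critical-path height from the dump
  std::vector<std::size_t> preds;  // nodes that must be scheduled first
};

// Repeatedly pick, among the nodes whose predecessors have all been
// scheduled, the one with the greatest height.
std::vector<std::size_t> toySchedule(const std::vector<ToyNode> &nodes) {
  std::vector<bool> done(nodes.size(), false);
  std::vector<std::size_t> order;
  while (order.size() < nodes.size()) {
    bool found = false;
    std::size_t best = 0;
    for (std::size_t i = 0; i < nodes.size(); ++i) {
      if (done[i]) continue;
      bool ready = true;
      for (std::size_t p : nodes[i].preds)
        if (!done[p]) ready = false;
      if (ready && (!found || nodes[i].height > nodes[best].height)) {
        best = i;
        found = true;
      }
    }
    if (!found) break;  // cyclic input: stop rather than loop forever
    done[best] = true;
    order.push_back(best);
  }
  return order;
}

// Node 0 models the store, node 1 the load from the same stack slot.
std::vector<std::size_t> orderWithoutEdge() {
  std::vector<ToyNode> nodes;
  nodes.push_back(ToyNode{4, {}});
  nodes.push_back(ToyNode{9, {}});
  return toySchedule(nodes);
}

std::vector<std::size_t> orderWithEdge() {
  std::vector<ToyNode> nodes;
  nodes.push_back(ToyNode{4, {}});
  nodes.push_back(ToyNode{9, std::vector<std::size_t>{0}});
  return toySchedule(nodes);
}
```

Without the edge both nodes are ready at once and the taller load is
picked first, which is the failing order; with the edge the load is not
ready until the store has been scheduled.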

Does this help tickle any memories?

Thanks,
Bill


On Thu, 2012-09-20 at 18:02 -0500, William J. Schmidt wrote:
> Greetings,
> 
> I'm investigating a bug in the PowerPC back end in which a load from a
> storage address is being reordered prior to a store to the same storage
> address.  I'm quite new to LLVM, so I would appreciate some help
> understanding what I'm seeing from the dumps.  I assume that some
> information is missing that would represent the memory dependency, but I
> don't know what form that should take.
> 
> Example source code is as follows:
> 
> ----------------------------------------------------------------
> extern "C" { int printf(const char *, ...); void exit(int);}
> struct foo {
>   short i:8;
> };
> 
> void check(struct foo f, short i) __attribute__((noinline)) {
>   if (f.i != i) {
>     short fi = f.i;
>     printf("problem with %u != %u\n", fi, i);
>     exit(0);
>   }
> }
> ---------------------------------------------------------------
> 
> The initial portion of the Clang output is:
> 
> ---------------------------------------------------------------
> define void @_Z5check3foos(%struct.foo* nocapture byval %f, i16 signext %i) noinline {
> entry:
>   %0 = bitcast %struct.foo* %f to i16*
>   %1 = load i16* %0, align 2
>   ...
> ---------------------------------------------------------------
> 
> The code works OK at -O0.  At -O1, the first part of the generated code
> is:
> 
> ---------------------------------------------------------------
> .L._Z5check3foos:
> 	.cfi_startproc
> # BB#0:                                 # %entry
> 	mflr 0
> 	std 0, 16(1)
> 	stdu 1, -112(1)
> .Ltmp1:
> 	.cfi_def_cfa_offset 112
> .Ltmp2:
> 	.cfi_offset lr, 16
> 	lha 5, 162(1)
> 	sth 3, 162(1)
>         ...
> ---------------------------------------------------------------
> 
> The problem here is that the incoming parameter in register 3 is stored
> too late, after an attempt to load the value into register 5.
> 
> Looking at dumps with -debug, I see the following:
> 
> ---------------------------------------------------------------
> ********** MACHINEINSTRS **********
> # Machine code for function _Z5check3foos: Post SSA
> Frame Objects:
>   fi#-1: size=2, align=2, fixed, at location [SP+50]
> Function Live Ins: %X3 in %vreg1, %X4 in %vreg2
> 
> 0B	BB#0: derived from LLVM BB %entry
> 	    Live Ins: %X3 %X4
> 16B		%vreg2<def> = COPY %X4; G8RC_with_sub_32:%vreg2
> 32B		%vreg1<def> = COPY %X3; G8RC:%vreg1
> 48B		STH8 %vreg1<kill>, 0, <fi#-1>; mem:ST2[FixedStack-1] G8RC:%vreg1
> 64B		%vreg4<def> = LHA 0, <fi#-1>; mem:LD2[%0] GPRC:%vreg4
>                 ...
> ---------------------------------------------------------------
> 
> So far, so good.  When we get to list scheduling, not quite so good:
> 
> ---------------------------------------------------------------
> ********** List Scheduling **********
> SU(0):   STH8 %X3<kill>, 162, %X1; mem:ST2[FixedStack-1]
>   # preds left       : 0
>   # succs left       : 4
>   # rdefs left       : 0
>   Latency            : 3
>   Depth              : 0
>   Height             : 0
>   Successors:
>    antiSU(2): Latency=0
>    antiSU(2): Latency=0
>    ch  SU(5): Latency=0
>    ch  SU(4294967295) *: Latency=0
> 
> SU(1):   %R5<def> = LHA 162, %X1; mem:LD2[%0]
>   # preds left       : 0
>   # succs left       : 3
>   # rdefs left       : 0
>   Latency            : 5
>   Depth              : 0
>   Height             : 0
>   Successors:
>    out SU(3): Latency=1
>    val SU(2): Latency=5
>    ch  SU(5): Latency=0
> ...
> ---------------------------------------------------------------
> 
> There is no dependency expressed between these two memory operations,
> although they both access the stack address 162(X1).  The scheduler then
> sees both instructions as ready, and chooses the load based on critical
> path height:
> 
> ---------------------------------------------------------------
> *** Examining Available
> Height 9: SU(1):   %R5<def> = LHA 162, %X1; mem:LD2[%0]
> Height 4: SU(0):   STH8 %X3<kill>, 162, %X1; mem:ST2[FixedStack-1]
> *** Scheduling [0]: SU(1):   %R5<def> = LHA 162, %X1; mem:LD2[%0]
> ---------------------------------------------------------------
> 
> The obvious questions are:  Why is there no dependence between these two
> instructions?  And what needs to be done to ensure there is one?  My
> guess is that we somehow need to unify FixedStack-1 with %0, but it's
> not clear to me how this would be accomplished.
> 
> (The store is generated as part of SelectionDAGISel::LowerArguments in
> lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp, using the
> PowerPC-specific code in lib/Target/PowerPC/PPCISelLowering.cpp.  The
> load is generated directly from the "load" in the LLVM IR at some other
> time.)
> 
> Thanks very much for any help!
> 
> Bill
> 




