[LLVMbugs] [Bug 13891] New: [ppc64] Incorrect code when passing small bitfield parameters at -O1 and above

bugzilla-daemon at llvm.org
Thu Sep 20 16:09:52 PDT 2012


http://llvm.org/bugs/show_bug.cgi?id=13891

             Bug #: 13891
           Summary: [ppc64] Incorrect code when passing small bitfield
                    parameters at -O1 and above
           Product: libraries
           Version: trunk
          Platform: Other
        OS/Version: Linux
            Status: NEW
          Severity: normal
          Priority: P
         Component: Backend: PowerPC
        AssignedTo: unassignedbugs at nondot.org
        ReportedBy: wschmidt at linux.vnet.ibm.com
                CC: llvmbugs at cs.uiuc.edu
    Classification: Unclassified


When lowering a formal parameter representing a 1-, 2-, or 4-byte bitfield, the
PowerPC back end generates a store to a stack address.  Subsequent accesses to
the parameter load from that address.  At -O1 and above, the scheduler may
reorder the load and store, resulting in incorrect code.

Example source code is as follows:

----------------------------------------------------------------
extern "C" { int printf(const char *, ...); void exit(int);}
struct foo {
  short i:8;
};

void check(struct foo f, short i) __attribute__((noinline)) {
  if (f.i != i) {
    short fi = f.i;
    printf("problem with %u != %u\n", fi, i);
    exit(0);
  }
}
---------------------------------------------------------------
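
The test case above has no caller.  A minimal sketch of a driver follows.  It
is an assumption, not part of the original report: bitfield_matches(),
make_foo() and count_mismatches() are hypothetical names, and the comparison
returns a result instead of printing and exiting so that it can be checked.
Whether it actually exposes the misscheduling depends on the compiler
revision and optimization level.

```cpp
#include <cassert>

struct foo {
  short i:8;
};

// Same comparison as check() in the report, but returns the result
// instead of calling printf/exit (hypothetical helper).
__attribute__((noinline)) bool bitfield_matches(struct foo f, short i) {
  return f.i == i;
}

// Builds a struct foo whose bitfield holds v.  Values are restricted to
// 0..127 so they fit the 8-bit field whether or not it is signed.
struct foo make_foo(short v) {
  assert(v >= 0 && v <= 127);
  struct foo f;
  f.i = v;
  return f;
}

// Passes a few values through bitfield_matches() and returns how many
// comparisons failed.  Correctly generated code yields 0.
int count_mismatches() {
  const short values[] = {0, 1, 42, 127};
  int mismatches = 0;
  for (unsigned n = 0; n < sizeof(values) / sizeof(values[0]); ++n) {
    if (!bitfield_matches(make_foo(values[n]), values[n]))
      ++mismatches;
  }
  return mismatches;
}
```

Calling count_mismatches() from main() and returning its result as the exit
status gives a pass/fail check that can be compared between -O0 and -O1.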

The initial portion of the LLVM IR emitted by Clang is:

---------------------------------------------------------------
define void @_Z5check3foos(%struct.foo* nocapture byval %f, i16 signext %i) noinline {
entry:
  %0 = bitcast %struct.foo* %f to i16*
  %1 = load i16* %0, align 2
  ...
---------------------------------------------------------------

The code works OK at -O0.  At -O1, the first part of the generated code
is:

---------------------------------------------------------------
.L._Z5check3foos:
        .cfi_startproc
# BB#0:                                 # %entry
        mflr 0
        std 0, 16(1)
        stdu 1, -112(1)
.Ltmp1:
        .cfi_def_cfa_offset 112
.Ltmp2:
        .cfi_offset lr, 16
        lha 5, 162(1)
        sth 3, 162(1)
        ...
---------------------------------------------------------------

The problem here is that the incoming parameter in register 3 is stored
to 162(1) too late: the sth follows the lha that has already loaded from
that address into register 5, so register 5 does not receive the parameter
value.

Looking at dumps with -debug, we can see the following:

---------------------------------------------------------------
********** MACHINEINSTRS **********
# Machine code for function _Z5check3foos: Post SSA
Frame Objects:
  fi#-1: size=2, align=2, fixed, at location [SP+50]
Function Live Ins: %X3 in %vreg1, %X4 in %vreg2

0B      BB#0: derived from LLVM BB %entry
            Live Ins: %X3 %X4
16B             %vreg2<def> = COPY %X4; G8RC_with_sub_32:%vreg2
32B             %vreg1<def> = COPY %X3; G8RC:%vreg1
48B             STH8 %vreg1<kill>, 0, <fi#-1>; mem:ST2[FixedStack-1] G8RC:%vreg1
64B             %vreg4<def> = LHA 0, <fi#-1>; mem:LD2[%0] GPRC:%vreg4
                ...
---------------------------------------------------------------

So far, so good: the store (STH8) still precedes the load (LHA).  When we
get to list scheduling, not quite so good:

---------------------------------------------------------------
********** List Scheduling **********
SU(0):   STH8 %X3<kill>, 162, %X1; mem:ST2[FixedStack-1]
  # preds left       : 0
  # succs left       : 4
  # rdefs left       : 0
  Latency            : 3
  Depth              : 0
  Height             : 0
  Successors:
   antiSU(2): Latency=0
   antiSU(2): Latency=0
   ch  SU(5): Latency=0
   ch  SU(4294967295) *: Latency=0

SU(1):   %R5<def> = LHA 162, %X1; mem:LD2[%0]
  # preds left       : 0
  # succs left       : 3
  # rdefs left       : 0
  Latency            : 5
  Depth              : 0
  Height             : 0
  Successors:
   out SU(3): Latency=1
   val SU(2): Latency=5
   ch  SU(5): Latency=0
...
---------------------------------------------------------------

No dependency is expressed between these two memory operations, although
both access the stack address 162(X1): each unit shows "# preds left : 0",
and SU(1) does not appear among the successors of SU(0).  The scheduler
therefore sees both instructions as ready and chooses the load based on
critical path height:

---------------------------------------------------------------
*** Examining Available
Height 9: SU(1):   %R5<def> = LHA 162, %X1; mem:LD2[%0]
Height 4: SU(0):   STH8 %X3<kill>, 162, %X1; mem:ST2[FixedStack-1]
*** Scheduling [0]: SU(1):   %R5<def> = LHA 162, %X1; mem:LD2[%0]
---------------------------------------------------------------
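
The selection step can be illustrated with a toy model.  This is only a
sketch under stated assumptions: ToyUnit, pick_next() and make_example() are
hypothetical names and are not LLVM's scheduler data structures.  The model
treats a unit as ready when it has no unscheduled predecessors and picks the
ready unit with the greatest height, using the heights from the dump above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy scheduling unit holding only what this illustration needs.
struct ToyUnit {
  const char *name;
  unsigned height;      // critical-path height, as printed in the dump
  unsigned preds_left;  // unscheduled predecessors; 0 means ready
};

// Returns the index of the ready unit with the greatest height,
// or -1 if no unit is ready.
int pick_next(const std::vector<ToyUnit> &units) {
  int best = -1;
  for (std::size_t n = 0; n < units.size(); ++n) {
    if (units[n].preds_left != 0)
      continue;
    if (best < 0 || units[n].height > units[best].height)
      best = static_cast<int>(n);
  }
  return best;
}

// Builds the two-unit example: index 0 is the store, index 1 is the load.
// With with_chain_edge set, the load is given the store as a predecessor.
// The heights are kept at the dump's values for simplicity; a real
// scheduler would recompute them once the edge exists.
std::vector<ToyUnit> make_example(bool with_chain_edge) {
  ToyUnit store = {"STH8 (store)", 4, 0};
  ToyUnit load = {"LHA (load)", 9, with_chain_edge ? 1u : 0u};
  std::vector<ToyUnit> units;
  units.push_back(store);
  units.push_back(load);
  assert(units.size() == 2);
  return units;
}
```

In this model pick_next(make_example(false)) selects the load, matching the
dump, while pick_next(make_example(true)) selects the store because the load
is no longer ready.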

We need to determine why there is no scheduling dependency between these two
instructions, and how to ensure there is one.
