[PowerPC, RFC] Wrong-code bug in pre-inc creation

Wed May 15 05:40:29 PDT 2013

Hi Hal,

running the test-suite in 32-bit mode I ran into a crash which turns out to
be caused by wrong code generated by the pre-inc matcher.  In theory this
ought to be triggerable in 64-bit mode as well, but I haven't found a test
case yet.   The 32-bit test case is generated (via the vectorizer) from a
loop along the lines of:

void test(int lo, int hi, double p[])
{
  for (int j = hi; j > lo; j--)
    p[j] = 0;
}

The minimal 32-bit test case is (extracted from the vectorized and then
unrolled loop):

target datalayout =
"E-p:32:32:32-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v128:128:128-n32"
target triple = "powerpc-unknown-linux-gnu"

define void @test(double* nocapture %p) {
entry:
  %p1 = bitcast double* %p to <2 x double>*
  store <2 x double> zeroinitializer, <2 x double>* %p1, align 8
  %p2 = getelementptr <2 x double>* %p1, i32 -1
  store <2 x double> zeroinitializer, <2 x double>* %p2, align 8
  ret void
}

which gets compiled to:

test:
        li 4, 0
        mr 5, 3
        stwu 4, 8(5)
        stw 4, 4(3)
        stw 4, 0(3)
        stw 4, 4(5)
        lis 5, -1
        stwu 4, -16(3)
        ori 6, 5, 12
        ori 5, 5, 8
        stwx 4, 3, 6
        stwx 4, 3, 5
        stw 4, 4(3)
        blr

Note the obviously bogus offsets in registers 5 and 6.

It turns out the problem originates with
PPCTargetLowering::getPreIndexedAddressParts.  This routine calls
SelectAddressRegImm to extract the base and displacement parts.  However,
the displacement is translated into a TargetConstant by code like:

    if (isIntS16Immediate(N.getOperand(1), imm)) {
      Disp = DAG.getTargetConstant((int)imm & 0xFFFF, MVT::i32);

Note how e.g. a displacement of -8 results in a displacement TargetConstant
of type i32 and value 65528.   This doesn't much matter for code
generation, as it is implicitly treated as a 16-bit signed value there.
However, on return from getPreIndexedAddressParts, the code in
DAGCombiner::CombineToPreIndexedLoadStore looks at the displacement value
returned by target code and interprets it as the actual displacement, so it
thinks after the pre-inc instruction, the base register will actually have
been *incremented* by 65528 instead of decremented by 8.  It therefore
changes the subsequent users of the base register to account for that
supposed increment ...

By inspection, it would appear that for 64-bit the problem might be even
worse, since the "displacement" returned by SelectAddressRegImmShift is
actually the displacement divided by 4, but again common DAGCombiner code
would treat this value as the displacement itself.   I've not yet attempted
to construct a test case showing this issue.

Now the question is how to best fix this.  And this in turn leads me to
wonder why the code does it this way in the first place.  It would appear
more natural to me to have the displacement MI operand represent the actual
displacement in all cases, i.e. not force it to an unsigned 16-bit value
for 32-bit, and neither divide it by 4 for 64-bit (but simply ensure proper
alignment).  However, this code basically did it that way since the very
beginning, so I was wondering whether there was a reason for this that I
may not be seeing ...

Any thoughts on this?   I'll try to come up with a patch that implements
changes along those lines and see what falls out.

Bye,
Ulrich