[PowerPC, RFC] Wrong-code bug in pre-inc creation

Wed May 15 06:48:14 PDT 2013

----- Original Message -----
> 
> Hi Hal,
> 
> running the test-suite in 32-bit mode I ran into a crash which turns
> out to
> be caused by wrong code generated by the pre-inc matcher.  In theory
> this
> ought to be triggerable in 64-bit mode as well, but I haven't found a
> test
> case yet.   The 32-bit test case is generated (via the vectorizer)
> from a
> loop along the lines of:
> 
> void test(int lo, int hi, double p[])
> {
>   for (int j = hi; j > lo; j--)
>     p[j] = 0;
> }
> 
> The minimal 32-bit test case is (extracted from the vectorized and
> then
> unrolled loop):
> 
> target datalayout =
> "E-p:32:32:32-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v128:128:128-n32"
> target triple = "powerpc-unknown-linux-gnu"
> 
> define void @test(double* nocapture %p) {
> entry:
>   %p1 = bitcast double* %p to <2 x double>*
>   store <2 x double> zeroinitializer, <2 x double>* %p1, align 8
>   %p2 = getelementptr <2 x double>* %p1, i32 -1
>   store <2 x double> zeroinitializer, <2 x double>* %p2, align 8
>   ret void
> }
> 
> which gets compiled to:
> 
> test:
>         li 4, 0
>         mr 5, 3
>         stwu 4, 8(5)
>         stw 4, 4(3)
>         stw 4, 0(3)
>         stw 4, 4(5)
>         lis 5, -1
>         stwu 4, -16(3)
>         ori 6, 5, 12
>         ori 5, 5, 8
>         stwx 4, 3, 6
>         stwx 4, 3, 5
>         stw 4, 4(3)
>         blr
> 
> Note the obviously bogus offsets in registers 5 and 6.
> 
> 
> It turns out the problem originates with
> PPCTargetLowering::getPreIndexedAddressParts.  This routine calls
> SelectAddressRegImm to extract the base and displacement parts.
>  However,
> the displacement is translated into a TargetConstant by code like:
> 
>     if (isIntS16Immediate(N.getOperand(1), imm)) {
>       Disp = DAG.getTargetConstant((int)imm & 0xFFFF, MVT::i32);
> 
> Note how e.g. a displacement of -8 results in a displacement
> TargetConstant
> of type i32 and value 65528.   This doesn't much matter for code
> generation, as it is implicitly treated as a 16-bit signed value
> there.
> However, on return from getPreIndexedAddressParts, the code in
> DAGCombiner::CombineToPreIndexedLoadStore looks at the displacement
> value
> returned by target code and interprets it as the actual displacement,
> so it
> thinks after the pre-inc instruction, the base register will actually
> have
> been *incremented* by 65528 instead of decremented by 8.  It
> therefore
> changes the subsequent users of the base register to account for that
> supposed increment ...
> 
> By inspection, it would appear that for 64-bit the problem might be
> even
> worse, since the "displacement" returned by SelectAddressRegImmShift
> is
> actually the displacement divided by 4, but again common DAGCombiner
> code
> would treat this value as the displacement itself.   I've not yet
> attempted
> to construct a test case showing this issue.
> 
> 
> Now the question is how to best fix this.  And this in turn leads me
> to
> wonder why the code does it this way in the first place.  It would
> appear
> more natural to me to have the displacement MI operand represent the
> actual
> displacement in all cases, i.e. not force it to an unsigned 16-bit
> value
> for 32-bit, and neither divide it by 4 for 64-bit (but simply ensure
> proper
> alignment).  However, this code basically did it that way since the
> very
> beginning, so I was wondering whether there was a reason for this
> that I
> may not be seeing ...
> 
> Any thoughts on this? 

I think that this is the right thing to do. Thanks!

I had some impression that things are as they are because that way the constants are setup for the built-in assembler (but given that they values would just be sign extended, I don't know if that matters anyway).

 -Hal

>  I'll try to come up with a patch that
> implements
> changes along those lines and see what falls out.
> 
> 
> Bye,
> Ulrich
> 
>