[llvm] r181927 - Implement PPC counter loops as a late IR-level pass
Hal Finkel
hfinkel at anl.gov
Thu May 16 07:55:10 PDT 2013
----- Original Message -----
> Hal Finkel <hfinkel at anl.gov> wrote:
>
> > Implement PPC counter loops as a late IR-level pass
>
> This seems to have introduced two new test-suite failures in 32-bit
> mode:
> SingleSource/Benchmarks/Adobe-C++/simple_types_constant_folding
> SingleSource/Benchmarks/Adobe-C++/simple_types_loop_invariant
>
> Looking at the first one, it crashes on the line:
>
> ::fill(data64, data64+SIZE, int64_t(init_value));
>
> where the variables are:
>
> const int SIZE = 8000;
> double init_value = 1.0;
> int64_t data64[SIZE];
>
> The generated assembly for this line is:
>
> 0x10002f14 <+8536>: lis r3,4100
> 0x10002f18 <+8540>: li r4,2000
> 0x10002f1c <+8544>: li r30,0
> 0x10002f20 <+8548>: lfd f1,-6872(r3)
> 0x10002f24 <+8552>: mtctr r4
> 0x10002f28 <+8556>: bl 0x1002cdc0 <__fixdfdi at plt>
> 0x10002f2c <+8560>: lis r5,4102
> 0x10002f30 <+8564>: addi r5,r5,22096
> 0x10002f34 <+8568>: mr r6,r5
> 0x10002f38 <+8572>: stwux r3,r6,r30
> 0x10002f3c <+8576>: addi r30,r30,32
> 0x10002f40 <+8580>: stw r4,12(r6)
> 0x10002f44 <+8584>: stw r3,8(r6)
> 0x10002f48 <+8588>: stw r4,4(r6)
> 0x10002f4c <+8592>: stw r4,28(r6)
> 0x10002f50 <+8596>: stw r3,24(r6)
> 0x10002f54 <+8600>: stw r4,20(r6)
> 0x10002f58 <+8604>: stw r3,16(r6)
> 0x10002f5c <+8608>: bdnz 0x10002f34 <main(int, char**)+8568>
>
> Note how 0x10002f24 sets up CTR to hold the loop counter, but this is
> then clobbered by the function call to __fixdfdi.
>
> > The most fragile part of this new implementation is that
> > interfering uses of the counter register must be detected on the
> > IR level (and, on PPC, this also includes any indirect branches in
> > addition to function calls).
>
> Does this maybe need anything special to detect libgcc helper
> function calls?
In PPCCTRLoops.cpp:

    } else if (isa<UIToFPInst>(J) || isa<SIToFPInst>(J) ||
               isa<FPToUIInst>(J) || isa<FPToSIInst>(J)) {
      CastInst *CI = cast<CastInst>(J);
      if (CI->getSrcTy()->getScalarType()->isPPC_FP128Ty() ||
          CI->getDestTy()->getScalarType()->isPPC_FP128Ty())
        return MadeChange;

It seems like this needs to be expanded to also pick up f64 cases when in 32-bit mode. Thanks!
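
A rough, self-contained sketch of what such an expanded check might look like. It deliberately does not use the real LLVM API: `ScalarKind`, `Cast`, and `castMayBecomeCall` are hypothetical stand-ins invented for illustration. It also assumes the relevant 32-bit case is a conversion involving a 64-bit integer, as in the reported f64 to int64_t conversion that ended up calling __fixdfdi; the condition that actually belongs in PPCCTRLoops.cpp may differ.

```cpp
// Hypothetical stand-ins for the scalar types involved; not LLVM's Type API.
enum class ScalarKind { I32, I64, F32, F64, PPCF128 };

// A simplified model of an FP<->integer cast instruction.
struct Cast {
  ScalarKind Src;
  ScalarKind Dst;
};

// Returns true if the cast is assumed to lower to a runtime library call,
// which would clobber CTR and so rule out a counter loop.
bool castMayBecomeCall(const Cast &C, bool Is64Bit) {
  // Existing rule from the snippet above: ppc_fp128 conversions are calls.
  if (C.Src == ScalarKind::PPCF128 || C.Dst == ScalarKind::PPCF128)
    return true;

  // Assumed addition: in 32-bit mode, conversions to or from a 64-bit
  // integer go through helpers such as __fixdfdi.
  if (!Is64Bit && (C.Src == ScalarKind::I64 || C.Dst == ScalarKind::I64))
    return true;

  return false;
}
```

With this model, the f64 to i64 conversion from the failing test is flagged in 32-bit mode but not in 64-bit mode, while an f64 to i32 conversion is never flagged.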
-Hal
>
>
> Bye,
> Ulrich
>
>