[llvm] r182023 - PPC32 cannot form counter loops around i64 FP conversions

Hal Finkel hfinkel at anl.gov
Thu May 16 11:54:35 PDT 2013


----- Original Message -----
> 
> On May 16, 2013, at 9:52 AM, Hal Finkel <hfinkel at anl.gov> wrote:
> 
> > Author: hfinkel
> > Date: Thu May 16 11:52:41 2013
> > New Revision: 182023
> > 
> > URL: http://llvm.org/viewvc/llvm-project?rev=182023&view=rev
> > Log:
> > PPC32 cannot form counter loops around i64 FP conversions
> > 
> > On PPC32, i64 FP conversions are implemented using runtime calls
> > (which clobber
> > the counter register). These must be excluded.
> 
> As you're finding, this is a really really fragile way to do this
> sort of thing. 

Indeed, I even said this in the original commit message :(

> You're building all kinds of assumptions of how
> lowering will happen into the pass in a very non-obvious way. I fear
> that this will be a continuous source of bugs as the compiler
> evolves.
> 
> Is it really that impractical to do counted loop formation after
> isel?

There are essentially two problems:

 1. Generating an expression for the count.
 2. Using the counter-based loop instructions.

As we've discovered, (1) should really be done at the IR level. Doing this at the MI level essentially means reimplementing large parts of SE at the MI level, and the resulting expressions often contain min/maxs (which turn into selects), etc. and an entire custom code generator for these expressions is also difficult to maintain and produces suboptimal code.

As for (2), this should really be done at the MI level. That's where we can really detect interfering uses of the counter register (and avoid problem such as this).

So to solve problems with generating good (and correct) code for the counts, I tried moving everything from the MI level into the IR level. Maybe it is possible to do the count-expression generation at the IR level and the actual loop conversion at the MI level. I'll try to make it work this way.

Thanks again,
Hal

> 
> -Chris
> 
> > 
> > Added:
> >    llvm/trunk/test/CodeGen/PowerPC/ctrloop-fp64.ll
> > Modified:
> >    llvm/trunk/lib/Target/PowerPC/PPCCTRLoops.cpp
> > 
> > Modified: llvm/trunk/lib/Target/PowerPC/PPCCTRLoops.cpp
> > URL:
> > http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/PowerPC/PPCCTRLoops.cpp?rev=182023&r1=182022&r2=182023&view=diff
> > ==============================================================================
> > --- llvm/trunk/lib/Target/PowerPC/PPCCTRLoops.cpp (original)
> > +++ llvm/trunk/lib/Target/PowerPC/PPCCTRLoops.cpp Thu May 16
> > 11:52:41 2013
> > @@ -305,7 +305,11 @@ bool PPCCTRLoops::convertToCTRLoop(Loop
> >                  isa<FPToUIInst>(J) || isa<FPToSIInst>(J)) {
> >         CastInst *CI = cast<CastInst>(J);
> >         if (CI->getSrcTy()->getScalarType()->isPPC_FP128Ty() ||
> > -            CI->getDestTy()->getScalarType()->isPPC_FP128Ty())
> > +            CI->getDestTy()->getScalarType()->isPPC_FP128Ty() ||
> > +            (TT.isArch32Bit() &&
> > +             (CI->getSrcTy()->getScalarType()->isIntegerTy(64) ||
> > +              CI->getDestTy()->getScalarType()->isIntegerTy(64))
> > +            ))
> >           return MadeChange;
> >       } else if (isa<IndirectBrInst>(J) || isa<InvokeInst>(J)) {
> >         // On PowerPC, indirect jumps use the counter register.
> > 
> > Added: llvm/trunk/test/CodeGen/PowerPC/ctrloop-fp64.ll
> > URL:
> > http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/PowerPC/ctrloop-fp64.ll?rev=182023&view=auto
> > ==============================================================================
> > --- llvm/trunk/test/CodeGen/PowerPC/ctrloop-fp64.ll (added)
> > +++ llvm/trunk/test/CodeGen/PowerPC/ctrloop-fp64.ll Thu May 16
> > 11:52:41 2013
> > @@ -0,0 +1,28 @@
> > +; RUN: llc < %s -mcpu=generic | FileCheck %s
> > +
> > +target datalayout =
> > "E-p:32:32:32-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v128:128:128-n32"
> > +target triple = "powerpc-unknown-linux-gnu"
> > +
> > +define i64 @foo(double* nocapture %n) nounwind readonly {
> > +entry:
> > +  br label %for.body
> > +
> > +for.body:                                         ; preds =
> > %for.body, %entry
> > +  %i.06 = phi i32 [ 0, %entry ], [ %inc, %for.body ]
> > +  %x.05 = phi i64 [ 0, %entry ], [ %conv1, %for.body ]
> > +  %arrayidx = getelementptr inbounds double* %n, i32 %i.06
> > +  %0 = load double* %arrayidx, align 8
> > +  %conv = sitofp i64 %x.05 to double
> > +  %add = fadd double %conv, %0
> > +  %conv1 = fptosi double %add to i64
> > +  %inc = add nsw i32 %i.06, 1
> > +  %exitcond = icmp eq i32 %inc, 2048
> > +  br i1 %exitcond, label %for.end, label %for.body
> > +
> > +for.end:                                          ; preds =
> > %for.body
> > +  ret i64 %conv1
> > +}
> > +
> > +; CHECK: @foo
> > +; CHECK-NOT: mtctr
> > +
> > 
> > 
> > _______________________________________________
> > llvm-commits mailing list
> > llvm-commits at cs.uiuc.edu
> > http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
> 
> 



More information about the llvm-commits mailing list