[llvm-commits] [llvm] r58625 - in /llvm/trunk: lib/Transforms/Scalar/IndVarSimplify.cpp test/Transforms/IndVarsSimplify/2008-11-03-Floating.ll

Sat Nov 15 23:30:32 PST 2008

On Nov 3, 2008, at 10:32 AM, Devang Patel wrote:

> Author: dpatel
> Date: Mon Nov  3 12:32:19 2008
> New Revision: 58625
>
> URL: http://llvm.org/viewvc/llvm-project?rev=58625&view=rev
> Log:
> Turn floating point IVs into integer IVs where possible.
> This allows SCEV users to effectively calculate trip count.
> LSR later on transforms back integer IVs to floating point IVs
> later on to avoid int-to-float casts inside the loop.

Very nice!  This does great things to simple loops.

> @@ -466,6 +467,7 @@
>   // auxillary induction variables.
>   std::vector<std::pair<PHINode*, SCEVHandle> > IndVars;
>
> +  HandleFloatingPointIV(L);
>   for (BasicBlock::iterator I = Header->begin(); isa<PHINode>(I); + 
> +I) {
>     PHINode *PN = cast<PHINode>(I);
>     if (PN->getType()->isInteger()) { // FIXME: when we have fast- 
> math, enable!

This does an extra pass over all phi nodes in a loop.  Since there can  
be many many of them, this is suboptimal.  Would it be reasonable to  
make this happen at the same time when EliminatePointerRecurrence is  
called?

> +/// HandleFloatingPointIV - If the loop has floating induction  
> variable
> +/// then insert corresponding integer induction variable if possible.
> +void IndVarSimplify::HandleFloatingPointIV(Loop *L) {

Please add a comment giving an example of the sort of loop this  
benefits.  Something like this is sufficient:

   for (double i = 0; i < 10000; ++i)
     bar((int)i);

>
> +  BasicBlock *Header = L->getHeader();
> +  SmallVector <PHINode *, 4> FPHIs;
> +  Instruction *NonPHIInsn = NULL;
> +
> +  // Collect all floating point IVs first.
> +  BasicBlock::iterator I = Header->begin();
> +  while(true) {
> +    if (!isa<PHINode>(I)) {
> +      NonPHIInsn = I;
> +      break;
> +    }
> +    PHINode *PH = cast<PHINode>(I);
> +    if (PH->getType()->isFloatingPoint())
> +      FPHIs.push_back(PH);
> +    ++I;
> +  }

Why collect into a list and then handle all at once?  It seems that  
you can just handle them on the fly like EliminatePointerRecurrence  
does.

> +  for (SmallVector<PHINode *, 4>::iterator I = FPHIs.begin(), E =  
> FPHIs.end();
> +       I != E; ++I) {
> +    PHINode *PH = *I;
> +    unsigned IncomingEdge = L->contains(PH->getIncomingBlock(0));
> +    unsigned BackEdge     = IncomingEdge^1;
> +
> +    // Check incoming value.
> +    ConstantFP *CZ = dyn_cast<ConstantFP>(PH- 
> >getIncomingValue(IncomingEdge));
> +    if (!CZ) continue;
> +    APFloat PHInit = CZ->getValueAPF();
> +    if (!PHInit.isPosZero()) continue;

Please comment about what your checking and why.  It is unclear to me  
why the initial value has to be 0.0 for example.  Why not -0.0, why  
not 5.0?

> +    // Check IV increment.
> +    BinaryOperator *Incr =
> +      dyn_cast<BinaryOperator>(PH->getIncomingValue(BackEdge));
> +    if (!Incr) continue;
> +    if (Incr->getOpcode() != Instruction::Add) continue;
> +    ConstantFP *IncrValue = NULL;
> +    unsigned IncrVIndex = 1;
> +    if (Incr->getOperand(1) == PH)
> +      IncrVIndex = 0;
> +    IncrValue = dyn_cast<ConstantFP>(Incr->getOperand(IncrVIndex));
> +    if (!IncrValue) continue;
> +    APFloat IVAPF = IncrValue->getValueAPF();
> +    APFloat One = APFloat(IVAPF.getSemantics(), 1);
> +    if (!IVAPF.bitwiseIsEqual(One)) continue;

Again, please describe what and why you're requiring these various  
things.  I think that requiring an increment by a FP constant is fine,  
but it allow should be any FP constant with a non-fractional value.   
Also, I don't think you need to copy IVAPF here.

>
> +    // Check Incr uses.
> +    Value::use_iterator IncrUse = Incr->use_begin();
> +    Instruction *U1 = cast<Instruction>(IncrUse++);

Comment that we know there is at least one use, the PHI.

> +    if (IncrUse == Incr->use_end()) continue;
> +    Instruction *U2 = cast<Instruction>(IncrUse++);
> +    if (IncrUse != Incr->use_end()) continue;
> +
> +    // Find exict condition.

typo "exict"

>
> +    FCmpInst *EC = dyn_cast<FCmpInst>(U1);
> +    if (!EC)
> +      EC = dyn_cast<FCmpInst>(U2);
> +    if (!EC) continue;
> +    bool skip = false;
> +    Instruction *Terminator = EC->getParent()->getTerminator();
> +    for(Value::use_iterator ECUI = EC->use_begin(), ECUE = EC- 
> >use_end();
> +        ECUI != ECUE; ++ECUI) {
> +      Instruction *U = cast<Instruction>(ECUI);
> +      if (U != Terminator) {
> +        skip = true;
> +        break;
> +      }
> +    }
> +    if (skip) continue;

Instead of using a 'skip' bool, please factor this inner loop out into  
a static helper function.  However, why do you even do this at all?   
Why not just see if Terminator is a cond branch whose condition is EC?

> +    // Find exit value.
> +    ConstantFP *EV = NULL;
> +    unsigned EVIndex = 1;
> +    if (EC->getOperand(1) == Incr)
> +      EVIndex = 0;
> +    EV = dyn_cast<ConstantFP>(EC->getOperand(EVIndex));
> +    if (!EV) continue;
> +    APFloat EVAPF = EV->getValueAPF();

No need to copy EVAPF.

>
> +    if (EVAPF.isNegative()) continue;
> +
> +    // Find corresponding integer exit value.
> +    uint64_t integerVal = Type::Int32Ty->getPrimitiveSizeInBits();
> +    bool isExact = false;
> +    if (EVAPF.convertToInteger(&integerVal, 32, false,  
> APFloat::rmTowardZero, &isExact)
> +        != APFloat::opOK)
> +      continue;
> +    if (!isExact) continue;

It would be really really nice to allow an exit condition that is an  
sitofp, to handle this case:

void foo(int N) {
   double i;
   for (i = 0; i < N; ++i)
     bar((int)i);
}

Any conversion from an integer whose size is smaller than the mantissa  
of the FP value should be ok.

> +    // Insert new integer induction variable.
> +    SCEVExpander Rewriter(*SE, *LI);

You don't need to use SCEVExpander for this, just do what  
EliminatePointerRecurrence does.

> +    // Replace floating induction variable.
> +    UIToFPInst *Conv = new UIToFPInst(NewIV, PH->getType(),  
> "indvar.conv",
> +                                      NonPHIInsn);

UIToFP is generally more expensive than SIToFP on common targets (e.g.  
X86).  Is it safe to use SIToFP?  Can you do range/size analysis to  
use sitofp when safe if not?

-Chris