[LLVMbugs] [Bug 21914] New: [InstCombine] prematurely promoting floating point negation

bugzilla-daemon at llvm.org bugzilla-daemon at llvm.org
Sun Dec 14 22:28:43 PST 2014


            Bug ID: 21914
           Summary: [InstCombine] prematurely promoting floating point
                    negation
           Product: libraries
           Version: trunk
          Hardware: PC
                OS: All
            Status: NEW
          Severity: normal
          Priority: P
         Component: Scalar Optimizations
          Assignee: unassignedbugs at nondot.org
          Reporter: wujingyue at gmail.com
                CC: llvmbugs at cs.uiuc.edu
    Classification: Unclassified

The InstCombine optimization that transforms (-x * y) into -(x * y) for
floating-point numbers seems premature.


if (Opnd0->hasOneUse()) {
  // -X * Y => -(X*Y) (Promote negation as high as possible)
  Value *T = Builder->CreateFMul(N0, Opnd1);
  Value *Neg = Builder->CreateFNeg(T);
  Neg->takeName(&I);
  return ReplaceInstUsesWith(I, Neg);
}
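
As a side note on why the rewrite is value-preserving: a minimal sketch,
assuming IEEE 754 single precision in the default rounding mode and non-NaN
inputs, that compares the bit patterns of the two forms. The helper names
(bitsOf, negThenMul, mulThenNeg, formsAgree) are hypothetical and exist only
for this illustration; they are not part of InstCombine.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper: raw bit pattern of a float, for exact comparison.
std::uint32_t bitsOf(float v) {
  std::uint32_t u;
  std::memcpy(&u, &v, sizeof u);
  return u;
}

// The form as written in the source: negate first, then multiply.
float negThenMul(float x, float y) { return (-x) * y; }

// The form InstCombine produces: multiply first, then negate.
float mulThenNeg(float x, float y) { return -(x * y); }

// True when both forms yield the same bit pattern for (x, y).
bool formsAgree(float x, float y) {
  return bitsOf(negThenMul(x, y)) == bitsOf(mulThenNeg(x, y));
}
```

The two forms agree on value, so the question raised here is only where the
negation ends up, not whether the result changes.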

While I understand the motivation for this transformation, it blocks LICM when
X is loop-invariant and Y is loop-variant: -X could be hoisted out of the loop,
but -(X*Y) cannot. For instance,

int bar(int, int);

void foo(int n, float *output, float beta) {
  float sum = 0;
  for (int i = 0; i < n; ++i)
    sum += bar(i, i * (-beta));
  *output = sum;
}

With i*(-beta) transformed to -(i*beta), the LICM pass that runs after
instcombine is unable to hoist -beta out of the loop. As a result, "opt -O3"
leaves an fsub instruction inside the loop:

define void @_Z3fooiPff(i32 %n, float* nocapture %output, float %beta) #0 {
entry:
  %cmp7 = icmp sgt i32 %n, 0
  br i1 %cmp7, label %for.body.lr.ph, label %for.end

for.body.lr.ph:                                   ; preds = %entry
  %0 = add i32 %n, -1
  br label %for.body

for.body:                                         ; preds = %for.body, %for.body.lr.ph
  %i.09 = phi i32 [ 0, %for.body.lr.ph ], [ %inc, %for.body ]
  %sum.08 = phi float [ 0.000000e+00, %for.body.lr.ph ], [ %add, %for.body ]
  %conv = sitofp i32 %i.09 to float
  %1 = fmul float %conv, %beta
  %mul = fsub float -0.000000e+00, %1
  %conv1 = fptosi float %mul to i32
  %call = tail call i32 @_Z3barii(i32 %i.09, i32 %conv1)
  %conv2 = sitofp i32 %call to float
  %add = fadd float %sum.08, %conv2
  %inc = add nsw i32 %i.09, 1
  %exitcond = icmp eq i32 %i.09, %0
  br i1 %exitcond, label %for.end.loopexit, label %for.body

for.end.loopexit:                                 ; preds = %for.body
  %add.lcssa = phi float [ %add, %for.body ]
  br label %for.end

for.end:                                          ; preds = %for.end.loopexit, %entry
  %sum.0.lcssa = phi float [ 0.000000e+00, %entry ], [ %add.lcssa, %for.end.loopexit ]
  store float %sum.0.lcssa, float* %output, align 4, !tbaa !1
  ret void
}
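
For comparison, here is a source-level sketch of the loop as written next to a
hand-hoisted variant, which is roughly what LICM could produce if the negation
stayed attached to beta. The names fooAsWritten and fooHoisted are
hypothetical, and bar is given a stand-in body (an assumption; the report only
declares it) so that the snippet is self-contained.

```cpp
#include <cassert>

// Stand-in for the external bar() in the report; the body is an assumption.
int bar(int a, int b) { return a + b; }

// Loop as written in the report: the multiply by -beta is in the loop body.
void fooAsWritten(int n, float *output, float beta) {
  float sum = 0;
  for (int i = 0; i < n; ++i)
    sum += bar(i, i * (-beta));
  *output = sum;
}

// Hand-hoisted variant: -beta is computed once, before the loop.
void fooHoisted(int n, float *output, float beta) {
  const float negBeta = -beta;  // loop-invariant negation
  float sum = 0;
  for (int i = 0; i < n; ++i)
    sum += bar(i, i * negBeta);
  *output = sum;
}
```

In the hoisted variant the loop body needs only the multiply; the fsub shown
in the IR above would execute once instead of once per iteration.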


