[LLVMbugs] [Bug 19394] New: Failure to compute loop count (and, thus, vectorize)

Thu Apr 10 13:51:56 PDT 2014

http://llvm.org/bugs/show_bug.cgi?id=19394

            Bug ID: 19394
           Summary: Failure to compute loop count (and, thus, vectorize)
           Product: libraries
           Version: trunk
          Hardware: PC
                OS: Linux
            Status: NEW
          Severity: normal
          Priority: P
         Component: Loop Optimizer
          Assignee: unassignedbugs at nondot.org
          Reporter: hfinkel at anl.gov
                CC: llvmbugs at cs.uiuc.edu
    Classification: Unclassified

Created attachment 12363
  --> http://llvm.org/bugs/attachment.cgi?id=12363&action=edit
test case

We currently are unable to vectorize (or do much of anything else) with the
following loop:

omp.lb.le.global_ub..lr.ph.split.us:              ; preds = %entry
  %12 = sext i32 %7 to i64
  br label %omp.lb.le.global_ub..us

omp.lb.le.global_ub..us:                          ; preds = %omp.lb_ub.check_pass.us, %omp.lb.le.global_ub..lr.ph.split.us
  %indvars.iv14 = phi i64 [ %indvars.iv.next15, %omp.lb_ub.check_pass.us ], [ %12, %omp.lb.le.global_ub..lr.ph.split.us ]
  %13 = trunc i64 %indvars.iv14 to i32
  %omp.idx.le.ub.us = icmp sgt i32 %13, %10
  br i1 %omp.idx.le.ub.us, label %omp.loop.end.loopexit, label %omp.lb_ub.check_pass.us

omp.lb_ub.check_pass.us:                          ; preds = %omp.lb.le.global_ub..us
  %14 = load double* %ref3, align 8, !tbaa !4
  %arrayidx.us = getelementptr inbounds [10000000 x double]* @c, i64 0, i64 %indvars.iv14
  %15 = load double* %arrayidx.us, align 8, !tbaa !4
  %mul5.us = fmul double %14, %15
  %arrayidx6.us = getelementptr inbounds [10000000 x double]* @b, i64 0, i64 %indvars.iv14
  store double %mul5.us, double* %arrayidx6.us, align 8, !tbaa !4
  %indvars.iv.next15 = add nsw i64 %indvars.iv14, 1
  br label %omp.lb.le.global_ub..us

omp.loop.end.loopexit:                            ; preds = %omp.lb.le.global_ub..us
  br label %omp.loop.end
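
Read back into C, the loop in the IR above looks roughly like the sketch below. This is only an illustrative rendering, not the frontend's actual output: the function name scale_chunk and the parameter names lb, ub, scale, src and dst are all hypothetical, with lb standing for %7, ub for %10, scale for %ref3, src for @c and dst for @b. The two asserts are assumptions added to keep the sketch well defined; nothing in the IR states them.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical C rendering of the IR loop; every name here is illustrative. */
static void scale_chunk(int32_t lb, int32_t ub, const double *scale,
                        const double *src, double *dst)
{
    assert(lb >= 0);         /* assumption: keeps the array index non-negative */
    assert(ub < INT32_MAX);  /* assumption: otherwise the exit test never fires */

    /* %12 = sext i32 %7 to i64: the induction variable is 64 bits wide. */
    for (int64_t iv = (int64_t)lb; ; ++iv) {
        /* %13 = trunc i64 %indvars.iv14 to i32, then icmp sgt i32 %13, %10. */
        if ((int32_t)iv > ub)
            break;
        /* The scalar is reloaded on every iteration, as in the IR. */
        dst[iv] = *scale * src[iv];
    }
}
```

The point to notice is that the induction variable is 64 bits wide while the exit comparison is made on its 32-bit truncation.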

This is important because it is what Intel's OpenMP branch generates (after
optimization) for the outlined portion of this basic loop:

        ssize_t j;
#pragma omp parallel for
        for (j=0; j<STREAM_ARRAY_SIZE; j++)
            c[j] = a[j];

Running opt -analyze -scalar-evolution -S < /tmp/stream_omt35.ll demonstrates
the problem. SE seems to understand the loop induction variable well enough:

  %indvars.iv.next15 = add nsw i64 %indvars.iv14, 1
  -->  {(1 + (sext i32 %7 to i64)),+,1}<nsw><%omp.lb.le.global_ub..us>  Exits: <<Unknown>>

but it cannot compute an expression for the backedge count:

Determining loop execution counts for: @.omp_microtask.35
Loop %omp.lb.le.global_ub..us: Unpredictable backedge-taken count. 
Loop %omp.lb.le.global_ub..us: Unpredictable max backedge-taken count. 
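
For reference, here is a small C sketch of the count one would hope SE could report for this loop. It rests on assumptions that are not established in this report: that %7 is the lower bound, that %10 is the upper bound, and that the upper bound is below INT32_MAX. The last assumption matters because a 32-bit value can never compare signed-greater-than INT32_MAX, so without it the truncated exit test would never be true. Whether that is the actual reason SE gives up is a guess. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Walks the loop the way the IR does and counts how often the backedge runs. */
static int64_t simulated_backedges(int32_t lb, int32_t ub)
{
    assert(ub < INT32_MAX);  /* assumption: the i32 exit test can become true */
    int64_t taken = 0;
    /* 64-bit induction variable, compared after truncation to 32 bits. */
    for (int64_t iv = (int64_t)lb; (int32_t)iv <= ub; ++iv)
        ++taken;
    return taken;
}

/* Closed form for the same count under the same assumption. */
static int64_t expected_backedges(int32_t lb, int32_t ub)
{
    assert(ub < INT32_MAX);
    return ub >= lb ? (int64_t)ub - (int64_t)lb + 1 : 0;
}
```

Under the stated assumption the two functions agree, for example on the bounds (0, 9), (5, 4) and (-3, 2).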

We should either enhance SE, or maybe there is a different way to formulate
the loop in the frontend that will be friendlier to the existing SE
implementation.
