[PATCH] D12765: [LV] Allow vectorization of loops with induction post-inc expressions

Jakub Kuderski via llvm-commits llvm-commits at lists.llvm.org
Mon Sep 14 04:18:02 PDT 2015


kuhar added a comment.

Hi Michael,

I followed your comments and investigated the difference between the IR with `int n = y < BUFF_SIZE ? (y - 1) : (BUFF_SIZE - 1)` and with `int n = (y*9)%17`. As you mentioned, they start to differ after the IndVarSimplify pass: with the second expression, `(y*9)%17`, it replaces a phi node with a gep generated by SCEVExpander. This is not expensive, because it's essentially a reuse of a previous value (`%rem`).

With the first expression, these are the basic blocks IndVarSimplify sees on entry:

  entry:
    %cmp = icmp slt i32 %y, 4096
    %sub = add nsw i32 %y, -1
    %cond = select i1 %cmp, i32 %sub, i32 4095
    %cmp1.4 = icmp sgt i32 %cond, 0
    br i1 %cmp1.4, label %for.body.lr.ph, label %for.end
  
  ...
  
  for.end:                                          ; preds = %for.cond.for.end_crit_edge, %entry
    %ptr.0.lcssa = phi i8* [ %incdec.ptr2, %for.cond.for.end_crit_edge ], [ %dest, %entry ]
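
For readers without the test case at hand, here is a minimal C sketch of a loop that would plausibly lower to IR of this shape. It is a reconstruction for illustration only, not the actual test source: the function name `copy_clamped`, the `src` parameter and the loop body are assumptions. Only the expression for `n`, the `i8*` pointer starting at `dest`, and the use of the pointer after the loop are taken from the IR above.

```c
#define BUFF_SIZE 4096

/* Hypothetical reconstruction, NOT the original test case.
 * Only the computation of n and the post-loop use of the pointer
 * are taken from the IR quoted in this message. */
char *copy_clamped(char *dest, const char *src, int y) {
    /* Lowers to the icmp/add/select sequence in the entry block. */
    int n = y < BUFF_SIZE ? (y - 1) : (BUFF_SIZE - 1);
    char *ptr = dest;

    /* The guard "n > 0" corresponds to %cmp1.4 in the entry block. */
    for (int i = 0; i < n; ++i)
        *ptr++ = src[i];   /* pointer induction with a post-increment */

    /* Using ptr after the loop is what creates the LCSSA phi
     * (%ptr.0.lcssa) in for.end. */
    return ptr;
}
```

With `y = 4` the sketch copies three bytes and returns `dest + 3`; with `y <= 1` the loop is skipped and `dest` is returned unchanged, which matches the `[ %dest, %entry ]` incoming value of the phi.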

The problem is that the SCEV for the backedge-taken count looks like this: `(1 + (zext i32 (-3 + (-1 * (-4097 smax (-1 + (-1 * %y))))<nsw>) to i64) + %dest)`. IndVarSimplify checks that the expression is cheap to expand with SCEVExpander, and this is where the real problems start: SCEVExpander treats `smax` as a costly operation, so the whole expression is considered costly as well.
I don't know how hard it would be to make IndVarSimplify or SCEVExpander look around the function for an already generated equivalent expression, but I don't think it would be trivial. IndVarSimplify certainly doesn't know anything about loop vectorization, so it cannot determine that, even though some SCEV is expensive to expand, expanding it would still be beneficial because of the speed-up from later vectorization.
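
To make the "already generated equivalent expression" point concrete, the small C sketch below evaluates the integer part of the SCEV quoted above (everything except `+ %dest`) and compares it with the `select` result from the entry block. The helper names `smax32`, `n_from_select` and `offset_from_scev` are made up for this illustration, and the arithmetic is only exercised for `y >= 2`, i.e. when the loop guard `%cond > 0` holds.

```c
#include <stdint.h>

#define BUFF_SIZE 4096

/* Signed maximum, standing in for SCEV's smax (illustrative helper). */
static int32_t smax32(int32_t a, int32_t b) { return a > b ? a : b; }

/* n as computed by the icmp/add/select in the entry block (%cond). */
static int32_t n_from_select(int32_t y) {
    return y < BUFF_SIZE ? (y - 1) : (BUFF_SIZE - 1);
}

/* Integer part of the quoted SCEV, without the "+ %dest" term:
 *   1 + zext(-3 + (-1 * smax(-4097, -1 + (-1 * y))))
 * Assumes y >= 2, so the i32 value is non-negative and the zext
 * does not change it. */
static int64_t offset_from_scev(int32_t y) {
    int32_t inner = -3 + (-1 * smax32(-4097, -1 + (-1 * y)));
    return 1 + (int64_t)(uint32_t)inner;
}
```

For example, `y = 2` gives 1 from both functions and `y = 5000` gives 4095 from both. Under the stated assumption the costly-looking `smax` expression evaluates to the same number as `%cond`, which the entry block has already computed, so the whole SCEV amounts to `%dest + n`.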

I'm not yet convinced that it would be better to follow the path of fixing IndVarSimplify - performing this optimization in LoopVectorizer is quite easy and doesn't seem hacky to me.


Repository:
  rL LLVM

http://reviews.llvm.org/D12765




