[llvm-dev] Enabling IRCE pass or Adding something similar in the pipeline of new pass manager

Jie He via llvm-dev llvm-dev at lists.llvm.org
Tue May 11 07:17:15 PDT 2021


As far as I know, the current IRCE implementation relies on some
preconditions. It is intended for language runtimes with garbage collection,
not for loop vectorization.
The same is true for loop predication, which also helps eliminate
condition checks within a loop.
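
To make the loop predication point concrete, here is a minimal C sketch of
the idea. It is not LLVM code: the function names and the -1 error
convention are made up for illustration, and the fallback to the checked
loop stands in for the runtime support that the real pass relies on.

```c
#include <stddef.h>

/* Checked form: the range check runs on every iteration. Returning -1
   stands in for a runtime's exception or deoptimization path. */
int sum_checked(const long *a, size_t len, size_t n, long *out) {
    long sum = 0;
    for (size_t i = 0; i < n; i++) {
        if (i >= len)
            return -1;
        sum += a[i];
    }
    *out = sum;
    return 0;
}

/* Predicated form: one loop-invariant check up front. If it fails, fall
   back to the checked loop so the observable behaviour is unchanged. */
int sum_predicated(const long *a, size_t len, size_t n, long *out) {
    if (n > len)
        return sum_checked(a, len, n, out);
    long sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += a[i];
    *out = sum;
    return 0;
}
```

When n <= len the fast loop has no branch on i in its body, which is the
property a vectorizer wants. Without some slow path to fall back to, the
hoisted check could not be made safely.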

Jie He
B.R

On Tue, 11 May 2021 at 20:50, Jingu Kang via llvm-dev <
llvm-dev at lists.llvm.org> wrote:

> Hi Philip,
>
>
>
> I have extended your suggestion slightly more as below.
>
>
>
>                                  newbound = min(n, c)
>      while (iv < n) {            while (iv < newbound) {
>        A                           A
>        if (iv < c)                 B
>          B                         C
>        C                         }
>      }                           while (iv < n) {
>                                    A
>                                    C
>                                  }
>
>
>
> I have implemented a simple pass that splits the bound of a loop containing
> a conditional branch on the IV, as in the example above
> (https://reviews.llvm.org/D102234). This is an initial version. If possible,
> please review it.
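
The split in the example above can be checked with a small C sketch. This is
only an illustration, not the code in the patch: the Trace type and the
emit, run_original, run_split and same_trace helpers are hypothetical, and
the IV is assumed to step by 1 with signed comparisons. The second loop here
is bounded by n itself, so that when c > n (and min(n, c) = n) no extra
iterations run.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical trace buffer: records which of A, B, C ran, in order. */
typedef struct {
    char buf[256];
    size_t len;
} Trace;

static void emit(Trace *t, char ch) {
    if (t->len < sizeof t->buf)
        t->buf[t->len++] = ch;
}

/* Original form: the IV-dependent branch is tested on every iteration. */
void run_original(long iv, long n, long c, Trace *t) {
    for (; iv < n; iv++) {
        emit(t, 'A');
        if (iv < c)
            emit(t, 'B');
        emit(t, 'C');
    }
}

/* Split form: iterations below min(n, c) run A B C with no branch,
   the remaining iterations up to n run A C. */
void run_split(long iv, long n, long c, Trace *t) {
    long newbound = n < c ? n : c;
    for (; iv < newbound; iv++) {
        emit(t, 'A');
        emit(t, 'B');
        emit(t, 'C');
    }
    for (; iv < n; iv++) {
        emit(t, 'A');
        emit(t, 'C');
    }
}

/* Returns 1 when both traces recorded the same sequence. */
int same_trace(const Trace *x, const Trace *y) {
    return x->len == y->len && memcmp(x->buf, y->buf, x->len) == 0;
}
```

Running both forms with the same (iv, n, c) and comparing traces covers the
three interesting cases: c inside the range, c > n, and a starting iv that
is already past c.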
>
>
>
> Thanks
>
> JinGu Kang
>
>
>
> *From:* Jingu Kang <Jingu.Kang at arm.com>
> *Sent:* 04 May 2021 12:45
> *To:* Philip Reames <listmail at philipreames.com>; Jingu Kang <
> Jingu.Kang at arm.com>
> *Cc:* llvm-dev at lists.llvm.org
> *Subject:* RE: [llvm-dev] Enabling IRCE pass or Adding something similar
> in the pipeline of new pass manager
>
>
>
> Philip, I appreciate your kind comments.
>
> >In this example, forming the full pre/main/post loop structure of IRCE is
> >overkill.  Instead, we could simply restrict the loop bounds in the
> >following manner:
>
> >loop.ph:
> >  ;; Warning: pseudo code, might have edge conditions wrong
> >  %c = icmp sgt %iv, %n
> >  %min = umin(%n, %a)
> >  br i1 %c, label %exit, label %loop.ph.split
> >
> >loop.ph.split:
> >  br label %loop
> >
> >loop:
> >  %iv = phi i64 [ %inc, %loop ], [ 1, %loop.ph.split ]
> >  %src.arrayidx = getelementptr inbounds i64, i64* %src, i64 %iv
> >  %val = load i64, i64* %src.arrayidx
> >  %dst.arrayidx = getelementptr inbounds i64, i64* %dst, i64 %iv
> >  store i64 %val, i64* %dst.arrayidx
> >  %inc = add nuw nsw i64 %iv, 1
> >  %cond = icmp eq i64 %inc, %min
> >  br i1 %cond, label %exit, label %loop
> >
> >exit:
> >  ret void
> >}
>
> >
>
> >I'm not quite sure what to call this transform, but it's not IRCE.  If
> >this example is actually general enough to cover your use cases, it's
> >going to be a lot easier to judge profitability on than the general form
> >of iteration set splitting.
>
>
>
> I agree with you. If the LLVM community is willing to accept the above
> approach as a pass, or as part of an existing pass, I would be happy to
> implement it, because I am aiming to handle this case in upstream LLVM.
>
>
>
> >Another way to frame this special case might be to recognize the
> >conditional block can be inverted into an early exit.  (Reasoning: %iv is
> >strictly increasing, condition is monotonic, path if not taken has no
> >observable effect)  Consider:
>
> >loop.ph:
> >  br label %loop
> >
> >loop:
> >  %iv = phi i64 [ %inc, %for.inc ], [ 1, %loop.ph ]
> >  %cmp = icmp sge i64 %iv, %a
> >  br i1 %cmp, label %exit, label %for.inc
> >
> >for.inc:
> >  %src.arrayidx = getelementptr inbounds i64, i64* %src, i64 %iv
> >  %val = load i64, i64* %src.arrayidx
> >  %dst.arrayidx = getelementptr inbounds i64, i64* %dst, i64 %iv
> >  store i64 %val, i64* %dst.arrayidx
> >  %inc = add nuw nsw i64 %iv, 1
> >  %cond = icmp eq i64 %inc, %n
> >  br i1 %cond, label %exit, label %loop
> >
> >exit:
> >  ret void
> >}
>
> >Once that's done, the multiple exit vectorization work should vectorize
> >this loop. Thinking about it, I really like this variant.
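
The inversion into an early exit can be written out as a C sketch. Two
assumptions here are not from the thread: the source loop is taken to be a
guarded copy of the shape the quoted IR suggests, and the function names are
hypothetical. The quoted IR tests `%inc == %n` at the latch, which
presupposes n > 1; the sketch uses `iv < n` so it also handles smaller n.

```c
#include <stdint.h>

/* Conditional form: the IV-dependent branch stays inside the loop body. */
void copy_conditional(int64_t *dst, const int64_t *src, int64_t n, int64_t a) {
    for (int64_t iv = 1; iv < n; iv++) {
        if (iv < a)
            dst[iv] = src[iv];
    }
}

/* Early-exit form: iv only increases, so once iv >= a the condition never
   holds again, and the skipped path had no observable effect. Leaving the
   loop at that point is therefore equivalent. */
void copy_early_exit(int64_t *dst, const int64_t *src, int64_t n, int64_t a) {
    for (int64_t iv = 1; iv < n; iv++) {
        if (iv >= a)
            break;
        dst[iv] = src[iv];
    }
}
```

The equivalence depends on all three points in the quoted reasoning. If the
not-taken path did any work, or the condition could become true again later,
the break would drop iterations.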
>
> I have not looked at the multiple exit vectorization work yet, but it
> looks like we could treat the inverted condition as the early exit's
> condition.
>
> >The costing here seems quite off.  I have not looked at how the vectorizer
> >costs predicated loads on hardware without predication, but needing to
> >scalarize a conditional VF-times and form a vector again does not have a
> >cost of 3 million.  This could definitely be improved.
>
> I agree with you.
>
>
>
> Additionally, if possible, I would like to suggest enabling or adding
> transformations that help vectorization. For example, just as we can remove
> a conditional branch inside a loop, we could split a loop whose dependency
> blocks vectorization into a vectorizable loop and a non-vectorizable one,
> using transformations like loop distribution. I am not sure why these
> features are not enabled by default in the pass manager, but enabling them
> would make more loops vectorizable.
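
As an illustration of the loop distribution idea, here is a minimal C sketch.
The function names are hypothetical, and the sketch assumes that the three
arrays do not overlap. Without that assumption the two forms could differ.

```c
#include <stddef.h>

/* Fused form: the recurrence on a[] is a loop-carried dependence, and it
   keeps the whole loop from being vectorized, including the independent
   statement that writes c[]. */
void fused(long *a, const long *b, long *c, size_t n) {
    for (size_t i = 1; i < n; i++) {
        a[i] = a[i - 1] + b[i];
        c[i] = b[i] * 2;
    }
}

/* Distributed form: the recurrence is isolated in its own loop, leaving a
   second loop with no cross-iteration dependence. */
void distributed(long *a, const long *b, long *c, size_t n) {
    for (size_t i = 1; i < n; i++)
        a[i] = a[i - 1] + b[i];
    for (size_t i = 1; i < n; i++)
        c[i] = b[i] * 2;
}
```

Only the second loop of the distributed form is a vectorization candidate.
The first still runs sequentially, so the benefit depends on how much work
moves into the independent loop.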
>
>
>
> Thanks
>
> JinGu Kang


-- 
Best Regards
He Jie 何杰