[llvm-dev] Enabling IRCE pass or Adding something similar in the pipeline of new pass manager
Jie He via llvm-dev
llvm-dev at lists.llvm.org
Tue May 11 19:41:29 PDT 2021
Yes, but the current deopt lowering generates statepoint IR, which is
currently supported only on X86-64, as mentioned in LLVM's GC
documentation.
IRCE doesn't rely on a GCed language; I remembered that wrong. But it isn't
very smart right now: it can't handle bounds checks as well as Java's RCE does.
On Tue, 11 May 2021 at 23:04, Philip Reames <listmail at philipreames.com>
wrote:
> This is incorrect.
>
> IRCE's current sole known user happens to be a compiler for a GCed
> language, but there is no (intentional) dependence on that fact. It should
> work on arbitrary IR.
>
> Loop predication (the form in IndVars) triggers for arbitrary IR. The
> separate pass depends on semantics of guards which is related to deopt
> semantics, but *not* GC.
>
> Philip
> On 5/11/21 7:17 AM, Jie He wrote:
>
> As far as I know, the current IRCE implementation relies on some
> preconditions. It is intended for language runtimes with garbage
> collection, not for loop vectorization.
> The same is true for loop predication, which also helps eliminate
> condition checks within a loop.
>
> Jie He
> B.R
>
> On Tue, 11 May 2021 at 20:50, Jingu Kang via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
>> Hi Philip,
>>
>>
>>
>> I have extended your suggestion slightly more as below.
>>
>>
>>
>>                           newbound1 = min(n, c)
>>                           newbound2 = max(n, c)
>> while (iv < n) {          while (iv < newbound1) {
>>   A                         A
>>   if (iv < c)               B
>>     B                       C
>>   C                       }
>> }                         iv = newbound1
>>                           while (iv < newbound2) {
>>                             A
>>                             C
>>                           }
>>
>>
>>
>> I have implemented a simple pass that splits the bound of a loop
>> containing a conditional branch on the IV, as in the example above:
>> https://reviews.llvm.org/D102234. It is an initial version. If possible,
>> please review it.
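
A minimal sketch of the split above, written in Python for brevity. Everything in it is an assumption made for illustration: the function names, the `trace` list standing in for the effects of A, B and C, and `iv` starting at 0. One deliberate difference from the pseudo-code: the second loop is bounded by `n` rather than `max(n, c)`, because with `c > n` a bound of `max(n, c)` would appear to run A and C for iterations the original loop never executes.

```python
def loop_with_branch(n, c):
    """Original shape: one loop with an IV-dependent branch around B."""
    trace = []
    iv = 0
    while iv < n:
        trace.append(("A", iv))
        if iv < c:
            trace.append(("B", iv))
        trace.append(("C", iv))
        iv += 1
    return trace


def loop_split(n, c):
    """Split shape: two branch-free loops covering the same iterations."""
    trace = []
    iv = 0
    newbound1 = min(n, c)
    # First loop: every iteration here satisfies iv < c, so B always runs.
    while iv < newbound1:
        trace.append(("A", iv))
        trace.append(("B", iv))
        trace.append(("C", iv))
        iv += 1
    # Second loop: every remaining iteration has iv >= c, so B never runs.
    # Bounded by n (assumption, see the note above), not max(n, c).
    while iv < n:
        trace.append(("A", iv))
        trace.append(("C", iv))
        iv += 1
    return trace
```

Both functions should produce the same trace for any `n` and `c`, including the cases `c <= 0` and `c >= n`, where one of the two loops runs zero times.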
>>
>>
>>
>> Thanks
>>
>> JinGu Kang
>>
>>
>>
>> *From:* Jingu Kang <Jingu.Kang at arm.com>
>> *Sent:* 04 May 2021 12:45
>> *To:* Philip Reames <listmail at philipreames.com>; Jingu Kang <
>> Jingu.Kang at arm.com>
>> *Cc:* llvm-dev at lists.llvm.org
>> *Subject:* RE: [llvm-dev] Enabling IRCE pass or Adding something similar
>> in the pipeline of new pass manager
>>
>>
>>
>> Philip, I appreciate your kind comments.
>>
>> >In this example, forming the full pre/main/post loop structure of IRCE
>> >is overkill. Instead, we could simply restrict the loop bounds in the
>> >following manner:
>>
>> >loop.ph:
>> >  ;; Warning: pseudo code, might have edge conditions wrong
>> >  %c = icmp sgt %iv, %n
>> >  %min = umin(%n, %a)
>> >  br i1 %c, label %exit, label %loop.ph.split
>> >
>> >loop.ph.split:
>> >  br label %loop
>> >
>> >loop:
>> >  %iv = phi i64 [ %inc, %loop ], [ 1, %loop.ph.split ]
>> >  %src.arrayidx = getelementptr inbounds i64, i64* %src, i64 %iv
>> >  %val = load i64, i64* %src.arrayidx
>> >  %dst.arrayidx = getelementptr inbounds i64, i64* %dst, i64 %iv
>> >  store i64 %val, i64* %dst.arrayidx
>> >  %inc = add nuw nsw i64 %iv, 1
>> >  %cond = icmp eq i64 %inc, %min
>> >  br i1 %cond, label %exit, label %loop
>> >
>> >exit:
>> >  ret void
>> >}
>>
>> >
>>
>> >I'm not quite sure what to call this transform, but it's not IRCE. If
>> >this example is actually general enough to cover your use cases, it's
>> >going to be a lot easier to judge profitability on than the general form
>> >of iteration set splitting.
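
At source level, the bound restriction could be sketched roughly as below. This is only an illustration under stated assumptions: the original loop is taken to be `for (iv = 1; iv < n; iv++) if (iv < a) dst[iv] = src[iv];` (the loop itself is not shown in this message), `a` and `n` are compared as signed values, and the function names are hypothetical.

```python
def copy_guarded(dst, src, a, n):
    """Assumed original loop: the store is guarded by iv < a."""
    for iv in range(1, n):
        if iv < a:
            dst[iv] = src[iv]


def copy_clamped(dst, src, a, n):
    """Restricted bound: the guard is folded into the trip count."""
    # Iterations with iv >= a have no observable effect in the guarded
    # loop, so stopping at min(n, a) should leave dst unchanged.
    bound = min(n, a)
    for iv in range(1, bound):
        dst[iv] = src[iv]
```

The clamped loop has a single exit and no branch in its body, which is the property that makes it a simpler candidate for vectorization.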
>>
>>
>>
>> I agree with you. If the LLVM community is OK with accepting the above
>> approach as a pass, or as part of an existing pass, I would be happy to
>> implement it, because I am aiming to handle this case in upstream LLVM.
>>
>>
>>
>> >Another way to frame this special case might be to recognize the
>> >conditional block can be inverted into an early exit. (Reasoning: %iv is
>> >strictly increasing, condition is monotonic, path if not taken has no
>> >observable effect) Consider:
>>
>> >loop.ph:
>> >  br label %loop
>> >
>> >loop:
>> >  %iv = phi i64 [ %inc, %for.inc ], [ 1, %loop.ph ]
>> >  %cmp = icmp sge i64 %iv, %a
>> >  br i1 %cmp, label %exit, label %for.inc
>> >
>> >for.inc:
>> >  %src.arrayidx = getelementptr inbounds i64, i64* %src, i64 %iv
>> >  %val = load i64, i64* %src.arrayidx
>> >  %dst.arrayidx = getelementptr inbounds i64, i64* %dst, i64 %iv
>> >  store i64 %val, i64* %dst.arrayidx
>> >  %inc = add nuw nsw i64 %iv, 1
>> >  %cond = icmp eq i64 %inc, %n
>> >  br i1 %cond, label %exit, label %loop
>> >
>> >exit:
>> >  ret void
>> >}
>>
>> >Once that's done, the multiple exit vectorization work should vectorize
>> >this loop. Thinking about it, I really like this variant.
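
The early-exit framing can be sketched the same way. As before, this assumes the original loop is `for (iv = 1; iv < n; iv++) if (iv < a) dst[iv] = src[iv];`, which is not shown in this message, and the function names are hypothetical.

```python
def copy_guarded(dst, src, a, n):
    """Assumed original loop: the store is guarded by iv < a."""
    for iv in range(1, n):
        if iv < a:
            dst[iv] = src[iv]


def copy_early_exit(dst, src, a, n):
    """Inverted guard: leave the loop the first time iv >= a holds."""
    for iv in range(1, n):
        # iv only increases, so once iv >= a is true it stays true, and
        # the skipped iterations would not have stored anything anyway.
        if iv >= a:
            break
        dst[iv] = src[iv]
```

The inversion relies on all three points in the reasoning above: a strictly increasing IV, a monotonic condition, and a not-taken path with no observable effect. If any of them fails, breaking out early could skip work the original loop performs.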
>>
>> I have not looked at the multiple exit vectorization work yet, but it
>> looks like we could treat the inverted condition as the early exit's
>> condition.
>>
>> >The costing here seems quite off. I have not looked at how the
>> >vectorizer costs predicated loads on hardware without predication, but
>> >needing to scalarize a conditional VF-times and form a vector again does
>> >not have a cost of 3 million. This could definitely be improved.
>>
>> I agree with you.
>>
>>
>>
>> Additionally, if possible, I would like to suggest enabling or adding
>> transformations that help vectorization. For example, just as we remove a
>> conditional branch inside a loop, we could use a transformation like loop
>> distribution to split a loop with a dependency that blocks vectorization
>> into a vectorizable loop and a non-vectorizable one. I am not sure why
>> these features are not enabled by default in the pass manager, but
>> enabling them would make more loops vectorizable.
>>
>>
>>
>> Thanks
>>
>> JinGu Kang
>>
>
>
> --
> Best Regards
> He Jie 何杰
>
>
--
Best Regards
He Jie 何杰