[llvm-dev] Question about Loop with multiple latches and LICM
Jingu Kang via llvm-dev
llvm-dev at lists.llvm.org
Fri Jul 9 08:02:39 PDT 2021
Hi Eli,
I have checked the separateNestedLoop function and it seems we can extend the function to handle more cases.
I have created a review for it. https://reviews.llvm.org/D105700
Please feel free to review it.
Thanks
JinGu Kang
From: llvm-dev <llvm-dev-bounces at lists.llvm.org> On Behalf Of Jingu Kang via llvm-dev
Sent: 06 July 2021 11:02
To: Eli Friedman <efriedma at quicinc.com>; llvm-dev at lists.llvm.org
Subject: Re: [llvm-dev] Question about Loop with multiple latches and LICM
Eli, I appreciate your kind comments.
1. Why does LoopInfo partition loops the way it does? It's the most straightforward way to define a "loop". See https://llvm.org/docs/LoopTerminology.html .
Yep, I agree.
1. Why doesn't LoopSimplify split the loop into a nested loop? See separateNestedLoop in llvm/lib/Transforms/Utils/LoopSimplify.cpp; essentially, separating out a nested loop is based on a heuristic, and your example doesn't trigger that heuristic. I don't think anyone has looked at this in a long time.
Thank you very much for letting me know. It looks like that is what I want to know. Let me have a look.
1. Why does LICM hoist out conditionally executed instructions? Fewer instructions inside the loop means it's easier to analyze/optimize. LoopSink can reverse the transform later if it makes sense.
OK, I agree. During reducing the test case, I made a mistake. There was originally nested loop from goto statement and the gep was executed at every iteration of outer loop. I am sorry for wrong explanation.
If I can transform the loop with multiple latches to a nested Loop Simplify Form, it looks like the issues could be gone.
Thanks for good comments again,
JinGu Kang
From: Eli Friedman <efriedma at quicinc.com<mailto:efriedma at quicinc.com>>
Sent: 06 July 2021 01:42
To: Jingu Kang <Jingu.Kang at arm.com<mailto:Jingu.Kang at arm.com>>; llvm-dev at lists.llvm.org<mailto:llvm-dev at lists.llvm.org>
Subject: RE: [llvm-dev] Question about Loop with multiple latches and LICM
I see three questions here:
1. Why does LoopInfo partition loops the way it does? It's the most straightforward way to define a "loop". See https://llvm.org/docs/LoopTerminology.html .
2. Why doesn't LoopSimplify split the loop into a nested loop? See separateNestedLoop in llvm/lib/Transforms/Utils/LoopSimplify.cpp; essentially, separating out a nested loop is based on a heuristic, and your example doesn't trigger that heuristic. I don't think anyone has looked at this in a long time.
3. Why does LICM hoist out conditionally executed instructions? Fewer instructions inside the loop means it's easier to analyze/optimize. LoopSink can reverse the transform later if it makes sense.
-Eli
From: llvm-dev <llvm-dev-bounces at lists.llvm.org<mailto:llvm-dev-bounces at lists.llvm.org>> On Behalf Of Jingu Kang via llvm-dev
Sent: Monday, July 5, 2021 5:53 AM
To: llvm-dev at lists.llvm.org<mailto:llvm-dev at lists.llvm.org>
Subject: [EXT] [llvm-dev] Question about Loop with multiple latches and LICM
Hi All,
I have a couple of questions about loop multiple latches and LICM. Let see a simple LLVM IR code snippet.
define void @test(i1 %a, i32 %b, i8* noalias %src, i8* noalias %dst) {
entry:
br label %while.cond
while.cond: ; preds = %sw.bb4001, %while.body, %while.body, %entry
br i1 %a, label %while.end6895, label %while.body
while.body: ; preds = %while.cond
switch i32 %b, label %sw.default6833 [
i32 82, label %no_ret
i32 30, label %sw.bb4001
i32 40, label %while.cond
i32 41, label %while.cond
]
sw.bb4001: ; preds = %while.body
%addr = getelementptr i8, i8* %src, i32 31
%res = load i8, i8* %addr
store i8 %res, i8* %dst
br label %while.cond
sw.default6833: ; preds = %while.body
unreachable
while.end6895: ; preds = %while.cond
unreachable
no_ret: ; preds = %while.body
ret void
}
LLVM detects loop as below.
Loop at depth 1 containing: %while.cond<header><exiting>,%while.body<latch><exiting>,%sw.bb4001<latch>
Loop at depth 1 containing:
<header><exiting>
while.cond: ; preds = %sw.bb4001, %while.body, %while.body, %entry
br i1 %a, label %while.end6895, label %while.body
<latch><exiting>
while.body: ; preds = %while.cond
switch i32 %b, label %sw.default6833 [
i32 82, label %no_ret
i32 30, label %sw.bb4001
i32 40, label %while.cond
i32 41, label %while.cond
]
<latch>
sw.bb4001: ; preds = %while.body
br label %while.cond
I can see llvm checks header and its backedges and goes through the predecessors of the latches. At this point, I wonder why llvm allows loops to have multiple latches. There is something good from it? Can we choose only one latch from multiple latches like closest one to header in domtree please?
After detecting the loop from above IR code, If we run LoopSimplify and LICM pass, we can see below output.
define void @test(i1 %a, i32 %b, i8* noalias %src, i8* noalias %dst) {
entry:
%addr = getelementptr i8, i8* %src, i32 31 ---> Hoisted
br label %while.cond
while.cond: ; preds = %while.cond.backedge, %entry
br i1 %a, label %while.end6895, label %while.body
while.body: ; preds = %while.cond
switch i32 %b, label %sw.default6833 [
i32 82, label %no_ret
i32 30, label %sw.bb4001
i32 40, label %while.cond.backedge
i32 41, label %while.cond.backedge
]
while.cond.backedge: ; preds = %while.body, %while.body, %sw.bb4001
br label %while.cond
sw.bb4001: ; preds = %while.body
%res = load i8, i8* %addr, align 1
store i8 %res, i8* %dst, align 1
br label %while.cond.backedge
sw.default6833: ; preds = %while.body
unreachable
while.end6895: ; preds = %while.cond
unreachable
no_ret: ; preds = %while.body
ret void
}
As you can see, the getelementptr instruction is hoisted to the preheader. If the control flow just reaches to the while.body, the gep is just redundant but it is executed at every iteration of the loop. I can see LICM pass checks the instructions with isSafeToSpeculativelyExecute but it looks like it is not good enough. At this point, I have other question. LICM pass need to consider something for the instructions which are conditionally executed? Rather than just checking safety of unconditional execution of the instruction.
Additionally, goto statement causes above CFG.
If there is already something in llvm to handle above case correctly or I missed something, please let me know.
Thanks
JinGu Kang
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20210709/20cd4b95/attachment.html>
More information about the llvm-dev
mailing list