[LLVMdev] Improving loop vectorizer support for loops with a volatile iteration variable

Chandler Carruth chandlerc at google.com
Wed Jul 15 17:34:54 PDT 2015


On Wed, Jul 15, 2015 at 12:55 PM Hyojin Sung <hsung at us.ibm.com> wrote:

> Hi all,
>
> I would like to propose an improvement of the “almost dead” block
> elimination in Transforms/Utils/Local.cpp so that it will preserve the
> canonical loop form for loops with a volatile iteration variable.
>
> *** Problem statement
> Nested loops in LCALS Subset B (https://codesign.llnl.gov/LCALS.php)
> are not vectorized with LLVM -O3 because they fail the loop vectorizer's
> test of whether the loop latch and the exiting block of a loop are the
> same block. The loops are vectorizable, and get vectorized with LLVM -O2
>
I would be interested to know why -O2 succeeds here.


> and also with other commercial compilers (icc, xlc).
>
> *** Details
> These loops end up with a loop latch that differs from the exiting block
> after a series of optimizations including loop unswitching, jump threading,
> simplify-the-CFG, and loop simplify. The fundamental problem here is that
> the above optimizations cannot recognize a loop with a volatile iteration
> variable and do not preserve its canonical loop structure.
>
Ok, meta-level question first:

Why do we care about performance of loops with a volatile iteration
variable? That seems both counter-intuitive and unlikely to be a useful
goal. We simply don't optimize volatile operations well in *any* part of
the optimizer, and I'm not sure why we need to start trying to fix that.
This seems like an irreparably broken benchmark, but perhaps there is a
motivation I don't yet see.


Assuming that sufficient motivation arises to try to fix this, see my
comments below:


>
>
> (1) Loop unswitching generates several placeholder BBs that contain only
> PHI nodes after separating out a shorter path with no inner loop execution
> from a standard path.
>
> (2) The jump threading and simplify-the-CFG passes each independently call
> TryToSimplifyUnconditionalBranchFromEmptyBlock() in
> Transforms/Utils/Local.cpp to get rid of almost empty BBs.
>
> (3) TryToSimplifyUnconditionalBranchFromEmptyBlock() eliminates the
> placeholder BBs left by loop unswitching and merges them into subsequent
> blocks, including the header of the inner loop. Before eliminating a
> block, the function checks whether the block is a loop header by looking at
> its PHI nodes so that it can be saved, but the test fails for loops with a
> volatile iteration variable.
>
Why does this fail for a volatile iteration variable but not for a
non-volatile one? I think understanding that will be key to understanding
how it should be fixed.
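
For concreteness, here is a minimal sketch of the kind of loop nest under
discussion. It is not taken from LCALS: the function name
scaled_add_repeated, its parameters, and the choice of making the outer
repetition counter the volatile variable are all assumptions made for
illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical kernel, not taken from LCALS.
// The outer repetition counter is declared volatile, so every read and
// write of `rep` has to be performed as a real memory access.
void scaled_add_repeated(std::vector<double>& y,
                         const std::vector<double>& x,
                         double a, int reps) {
    assert(x.size() >= y.size());  // the inner loop reads x[i] for every y[i]

    for (volatile int rep = 0; rep < reps; rep = rep + 1) {
        // The inner loop has an ordinary, non-volatile induction variable
        // and is the part one would expect the vectorizer to handle.
        for (std::size_t i = 0; i < y.size(); ++i) {
            y[i] += a * x[i];
        }
    }
}
```

Because the counter is volatile, one would expect it to stay in memory as
volatile loads and stores rather than being promoted to an SSA value, so the
outer loop header would likely have no PHI node for it. Whether that is what
defeats the PHI-based loop header check in step (3) is a guess that would
need to be confirmed against the actual IR.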