[LLVMdev] [PATCH] parallel loop metadata
pekka.jaaskelainen at tut.fi
Tue Jan 29 09:52:03 PST 2013
The reason for this was pointed out here:
That is, there might be optimizations that *should* invalidate the
parallel loop-assumption as they convert the loop to a sequential
one (without a possibility for the programmer to interfere).
Such a conversion is detected by the presence of additional, non-annotated
memory instructions. The reg2mem pass is one such example, and there might
be others not yet known.
So, it's just an additional safety measure due to most of the passes
not being aware of the "parallel loop concept" (yet).
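
To make the check concrete, here is a minimal, self-contained sketch of the
logic. It is a toy model, not the code in the patch: the types Instr and
LoopModel and the helper addUnannotatedMemoryInstr are hypothetical
stand-ins for the real LLVM classes and for a pass such as reg2mem.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for an LLVM instruction (assumption, not the real API).
struct Instr {
    bool touchesMemory;       // true for load/store-like instructions
    bool hasParallelAccessMD; // models llvm.mem.parallel_loop_access
};

// Hypothetical stand-in for a loop (assumption, not the real API).
struct LoopModel {
    bool latchHasIgnoreAssumedDeps; // models llvm.loop.ignore_assumed_deps
                                    // on the latch's branch instruction
    std::vector<Instr> body;
};

// Returns true only if the loop is marked parallel AND every memory
// instruction in its body still carries the per-access annotation.
bool isParallel(const LoopModel &L) {
    if (!L.latchHasIgnoreAssumedDeps)
        return false;
    for (const Instr &I : L.body)
        if (I.touchesMemory && !I.hasParallelAccessMD)
            return false; // an unaware pass added or changed a memory access
    return true;
}

// Models a pass that is unaware of the parallel loop concept and adds a
// new memory instruction without any metadata.
void addUnannotatedMemoryInstr(LoopModel &L) {
    L.body.push_back({true, false});
}
```

In this model a loop with only annotated accesses is reported as parallel,
and calling addUnannotatedMemoryInstr on it makes isParallel return false,
which is the conservative behaviour described above.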
Yes, it's "fragile" in the sense that the parallelism information is
dropped too easily in some cases. E.g. inlining calls inside the loop
body will at the moment make the loop a non-parallel one if the inlined
function contains at least one (unannotated) memory instruction.
It should be safe to do it this way around: add support for
propagating/keeping the parallelism data to passes gradually instead
of boldly assuming that all passes do the safe thing.
E.g. the inliner should eventually propagate the parallelism data to the
memory instructions of the function calls it inlines to the parallel loop
body. These features can be added later on in an incremental fashion.
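
As a rough illustration of what such propagation could look like, here is a
toy sketch. Nothing here is part of the patch or of the real inliner: Instr,
inlineIntoLoopBody and allMemoryAccessesAnnotated are hypothetical names,
and the sketch simply assumes it is valid to annotate the inlined accesses
when the enclosing loop is parallel.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for an LLVM instruction (assumption, not the real API).
struct Instr {
    bool touchesMemory;       // true for load/store-like instructions
    bool hasParallelAccessMD; // models llvm.mem.parallel_loop_access
};

// Copies the callee's instructions into the loop body. When `propagate` is
// set and the loop is parallel, the inlined memory instructions receive the
// per-access annotation; otherwise they are copied unchanged.
void inlineIntoLoopBody(std::vector<Instr> &loopBody,
                        const std::vector<Instr> &callee,
                        bool loopIsParallel, bool propagate) {
    for (Instr I : callee) { // by value: the callee itself is not modified
        if (propagate && loopIsParallel && I.touchesMemory)
            I.hasParallelAccessMD = true;
        loopBody.push_back(I);
    }
}

// The per-access half of the parallel loop check: every memory instruction
// must carry the annotation.
bool allMemoryAccessesAnnotated(const std::vector<Instr> &body) {
    for (const Instr &I : body)
        if (I.touchesMemory && !I.hasParallelAccessMD)
            return false;
    return true;
}
```

Without propagation the inlined (unannotated) memory instruction makes the
check fail, which matches the current behaviour; with propagation the loop
body stays fully annotated.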
I added cc to llvmdev as this might be worth wider attention.
On 01/29/2013 07:29 PM, Nadav Rotem wrote:
> Hi Pekka,
> I am okay with the first part (of adding llvm.loop.ignore_assumed_deps), but I am not sure why we need the second one.
> Adding metadata to every single (memory) instruction sounds fragile to me, and I am not sure that I understand the motivation.
> On Jan 29, 2013, at 8:38 AM, Pekka Jääskeläinen <pekka.jaaskelainen at tut.fi> wrote:
>> The attached patch implements a simple mechanism to mark parallel loops.
>> It uses two types of metadata:
>> llvm.loop.ignore_assumed_deps attached to the loop latch's
>> branch instruction and llvm.mem.parallel_loop_access attached to
>> all of the parallel loop's memory accesses.
>> Loop::isParallel() checks these. If llvm.loop.ignore_assumed_deps
>> is found, it ensures all the memory instructions inside the
>> loop body have the llvm.mem.parallel_loop_access attached before
>> returning true.
>> Test included for the LoopVectorizer that uses this info to parallelize
>> a strange looking loop which it otherwise skips completely.
>> Also included is a test that the parallel loop is converted to a
>> non-vectorizable one after reg2mem adds new memory instructions (without
>> the llvm.mem.parallel_loop_access metadata).