[LLVMdev] Speculative Loop Parallelization on LLVM IR

Mon Jun 21 01:29:33 PDT 2010

On 21 June 2010 06:12, Javed Absar <javed.absar at gmail.com> wrote:
> OK, so whats my point? To be able to do at least some loop-transformations
> at run-time to expose parallelism etc, perhaps some kind of LLVM IR --> Poly
> ---> LLVM IR
> support at run-time may be required. Definitely a scaled down version, since
> polyhedral transformations need a lot of processing in my opinion.

Hi Javed,

This is quite interesting, but there are many dangers that can be
easily forgotten. First and most obvious, you'll have to take all
computation into account when judging whether there was a net gain,
not only the vectorized execution. It might not be trivial to calculate
the gain if you don't have control of the cycles or if the pipeline is
too deep. In big loops it makes little difference, but there will be
a threshold (depending on machine configuration etc.) below which
there will be a big hit on performance, especially at program
start-up, which is the most critical to the user.
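
To make that threshold concrete, here is a minimal sketch in C of a
break-even estimate. The cost model, the struct and the function name
are all hypothetical, and the cycle counts are assumed to come from
measurement or estimation elsewhere; getting trustworthy numbers is
exactly the hard part mentioned above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical cost model; every field is an assumed estimate in cycles. */
typedef struct {
    uint64_t setup_cycles;      /* one-off: analysis, codegen, run-time checks */
    uint64_t seq_cycles_per_it; /* original loop body, per iteration */
    uint64_t par_cycles_per_it; /* transformed loop body, amortised per iteration */
} loop_cost;

/* Smallest trip count N for which
 *     setup + N * par  <  N * seq
 * i.e. the transformed loop is strictly cheaper overall.
 * Returns UINT64_MAX when the transformation can never pay off. */
uint64_t break_even_trip_count(loop_cost c)
{
    if (c.par_cycles_per_it >= c.seq_cycles_per_it)
        return UINT64_MAX;

    uint64_t saving = c.seq_cycles_per_it - c.par_cycles_per_it;
    assert(saving > 0); /* guaranteed by the check above */

    /* N * saving > setup  <=>  N >= floor(setup / saving) + 1 */
    return c.setup_cycles / saving + 1;
}
```

With a made-up set-up cost of 10000 cycles and a saving of 6 cycles per
iteration, any loop shorter than 1667 iterations loses time overall.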

Another phantom is that, when dealing with pointers (not arrays, as
you suggest), the locality is relative and depends on *each* execution.
You might tag the loop as "safe" if at first it seems so, but you
still have to check *every* time for disparities so as not to get
caught in that trap. That also adds to the "extra cost" this mechanism
brings.
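
A rough sketch of what that per-execution check looks like, in C. The
function names are hypothetical and this is not how any particular
LLVM pass does it; it only illustrates that the guard runs on every
call, because the pointers can differ on every call.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if [a, a+len) and [b, b+len) do not overlap.
 * Assumes a flat address space and no wrap-around in pa + len;
 * comparing pointers via uintptr_t is implementation-defined. */
static int ranges_disjoint(const void *a, const void *b, size_t len)
{
    uintptr_t pa = (uintptr_t)a, pb = (uintptr_t)b;
    return pa + len <= pb || pb + len <= pa;
}

/* dst[i] = src[i] + 1 for i in [0, n).
 * Returns 1 if the "safe" version was taken, 0 for the fallback. */
int add_one(int *dst, const int *src, size_t n)
{
    assert(n == 0 || (dst != NULL && src != NULL));

    /* The check is repeated on EVERY call: a loop that was safe
     * last time may be handed overlapping pointers this time. */
    if (ranges_disjoint(dst, src, n * sizeof *dst)) {
        /* Iterations are independent here, so this is the copy a
         * compiler could vectorise or run in parallel. */
        for (size_t i = 0; i < n; i++)
            dst[i] = src[i] + 1;
        return 1;
    }

    /* Possible overlap: keep the original sequential order. */
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i] + 1;
    return 0;
}
```

The guard itself is cheap for two pointers, but it is paid on every
entry to the loop, and the number of pairs to compare grows quickly
with the number of pointers involved.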

Last, as you said, the poly is very expensive. Stripped down versions
might not be suitable to analyse all cases, so you might end up with a
handful of sub-versions, and that adds time and space complexity to
the program. This might also be running in a JITed environment, which
also adds to the invisible cost, with unpredictable cache hits, memory
usage etc.

My tuppence.

cheers,
--renato