[LLVMdev] [DragonEgg] [Polly] Should we expect DragonEgg to produce identical LLVM IR for identical GIMPLE?

Wed Jan 2 03:02:28 PST 2013

On 01/01/2013 02:45 PM, Duncan Sands wrote:
> Hi Dmitry,
>
>>
>> In our compiler we use a modified version of LLVM Polly, which is very
>> sensitive to proper code generation. Among a number of limitations, the
>> loop region (enclosed by the phi node on the induction variable and the
>> branch) is required to be free of additional memory-dependent branches.
>> In other words, there must be no conditional "br" instructions below phi
>> nodes. The problem we are facing is that from *identical* GIMPLE for a 3d
>> loop used in different contexts DragonEgg may generate LLVM IR either
>> conforming to the described limitation or violating it.
>
> the gimple isn't the same at all (see below).  The differences are directly
> reflected in the unoptimized LLVM IR, turning up as additional memory loads
> in the "bad" version.  In addition, the Fortran isn't really the same either:
> Fortran semantics allows the compiler to assume that the parameters of your
> new function "compute" (which are all passed in by reference, i.e. as
> pointers) do not alias each other or anything else in sight (i.e. they get
> the "restrict" qualifier in the gimple, noalias in the LLVM IR).  Thus by
> factorizing the loop into "compute" you are actually giving the compiler
> more information.
>
> Summary:
>    (1) as far as I can see the unoptimized LLVM IR is a direct reflection of
> the gimple: the differences for the loop part come directly from differences
> in the gimple;
>    (2) the optimizers do a better job when you have "compute" partly because
> you provided them with additional aliasing information; this better optimized
> version then gets inlined into MAIN__.
>    (3) this leaves the question of whether in the bad version it is logically
> possible for the optimizers to deduce the same aliasing information as is
> handed to them for free in the good version.  To work this out it would be
> nice to have a smaller testcase.

I would also be interested in a minimal test case. If, e.g., only the
alias check is missing, we could introduce run-time alias checks such
that Polly would be able to optimize both versions. It is probably not
that simple, but a reduced test case would make it easier to figure out
the exact problems.
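
To make the run-time alias check idea concrete, here is a minimal sketch
in C. It is only an illustration of loop versioning under a disjointness
test, not what Polly or DragonEgg actually emit. All function names are
hypothetical, and the pointer comparison assumes a flat address space
where pointers can be compared after conversion to uintptr_t.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical kernel. The restrict qualifiers promise the compiler that
   dst and src never overlap, comparable to what Fortran dummy arguments
   provide (noalias in LLVM IR). */
static void add_noalias(double *restrict dst, const double *restrict src,
                        size_t n)
{
    for (size_t i = 0; i < n; ++i)
        dst[i] += 2.0 * src[i];
}

/* The same loop without the promise: the compiler must assume that a
   store to dst[i] may change a later src[j]. */
static void add_may_alias(double *dst, const double *src, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        dst[i] += 2.0 * src[i];
}

/* Run-time alias check: returns 1 if [a, a+n) and [b, b+n) are disjoint.
   Assumption: a flat address space, so integer comparison is meaningful. */
static int ranges_disjoint(const double *a, const double *b, size_t n)
{
    uintptr_t pa = (uintptr_t)a;
    uintptr_t pb = (uintptr_t)b;
    uintptr_t bytes = (uintptr_t)(n * sizeof(double));
    return pa + bytes <= pb || pb + bytes <= pa;
}

/* Versioned loop: take the optimizable version when the check passes,
   otherwise fall back to the conservative one.
   Returns 1 if the no-alias version was taken, 0 otherwise. */
int add_versioned(double *dst, const double *src, size_t n)
{
    if (ranges_disjoint(dst, src, n)) {
        add_noalias(dst, src, n);
        return 1;
    }
    add_may_alias(dst, src, n);
    return 0;
}
```

With such a guard, the "bad" version would still reach the optimizable
loop whenever the arrays do not overlap at run time, at the cost of one
extra comparison and a duplicated loop body.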

Thanks
Tobi