[LLVMdev] Instructions that cannot be duplicated

Jeffrey Yasskin jyasskin at google.com
Thu Oct 8 11:08:32 PDT 2009


On Thu, Oct 8, 2009 at 10:49 AM, Villmow, Micah <Micah.Villmow at amd.com> wrote:
>
>
>> -----Original Message-----
>> From: Eli Friedman [mailto:eli.friedman at gmail.com]
>> Sent: Wednesday, October 07, 2009 5:50 PM
>> To: Villmow, Micah
>> Cc: LLVM Developers Mailing List
>> Subject: Re: [LLVMdev] Instructions that cannot be duplicated
>>
>> On Wed, Oct 7, 2009 at 11:20 AM, Villmow, Micah
>> <Micah.Villmow at amd.com> wrote:
>> > Is there a current way to specify that an instruction or function
>> > call cannot be duplicated and thus any optimizations that might want
>> > to duplicate this instruction would fail?
>>
>> No.  Anything can be duplicated.  That could change, but you would
>> need to make a strong case for why other solutions won't work.
> [Villmow, Micah] Well, the problem is that the function in question
> cannot be duplicated because it has side effects, and duplicating them
> causes undefined behavior on vector hardware. Also, moving the
> instruction inside of flow control when it is originally outside of flow
> control produces undefined behavior. There currently is no way to
> specify this in LLVM that I know of. We've tried lowering it to an
> intrinsic and setting MayWriteMem, and this does not solve the problem.
> After looking at the LLVM IR, there is no equivalent method of
> representing an instruction that is an execution barrier (not a memory
> barrier, which llvm.barrier.[ss|ll|ls|sl] is). If you have any ideas,
> we would be willing to give them a try.

Is the effect similar to pthread_barrier_wait(barrier_for($pc))
[http://linux.die.net/man/3/pthread_barrier_wait]  where the
implementation automatically generates the barrier_for() function and
automatically calculates the number of threads to wait for?

If the barrier lowers to any sort of function call, it sounds like
you're currently looking up the PC of the caller and finding the
barrier that way. Instead, you could specify the barrier as an explicit
argument to the function when your frontend generates the call
instruction, which would free you from worrying about whether the call
winds up in multiple places in the optimized IR.
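
A rough sketch of that idea in C, as a single-threaded simulation rather
than a real runtime. Everything here is hypothetical: barrier_arrive(),
the arrivals table, and the two kernel_* functions are made-up names, and
the loop over tid only stands in for threads. It assumes the barrier is
implemented in software as a function call; it says nothing about what
vector hardware does with a duplicated barrier instruction.

```c
#include <assert.h>

enum { NUM_BARRIERS = 4 };

/* arrivals[id] counts how many simulated threads reached barrier `id`. */
static int arrivals[NUM_BARRIERS];

/* Hypothetical runtime entry point: the barrier's identity is an
 * explicit argument, not something derived from the caller's PC. */
void barrier_arrive(int barrier_id) {
    assert(barrier_id >= 0 && barrier_id < NUM_BARRIERS);
    arrivals[barrier_id]++;
}

/* What the frontend might emit: a single call site for barrier 0. */
void kernel_original(int tid) {
    (void)tid;
    barrier_arrive(0);
}

/* What an optimizer might turn it into: the call duplicated into both
 * arms of a branch. Both copies still name barrier 0. */
void kernel_duplicated(int tid) {
    if (tid % 2 == 0) {
        barrier_arrive(0);
    } else {
        barrier_arrive(0);
    }
}

/* Run `kernel` once per simulated thread and report how many of them
 * arrived at barrier 0. */
int run_all(void (*kernel)(int), int num_threads) {
    arrivals[0] = 0;
    for (int tid = 0; tid < num_threads; tid++) {
        kernel(tid);
    }
    return arrivals[0];
}
```

In this model both kernels deliver every simulated thread to the same
barrier, because the ID travels with the call instead of with the call's
address.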

If the barrier lowers to a magic instruction on your chip, and that
instruction doesn't take an ID of any sort besides its address, you
could generate a one-instruction function for each barrier() in the
source language and allow calls to that function to be duplicated.
There may be optimizations that merge "identical" functions, but
they'll be easier to turn off than optimizations that assume they can
rearrange control flow.
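
Sketched in C, again as a single-threaded simulation with made-up names
(barrier_site_a, barrier_site_b, kernel, all_threads_meet). Each wrapper
stands in for a function whose body would be the one barrier instruction;
the counters are only there so the simulation has something to observe.
The noinline attribute is a GCC/Clang extension, and the sketch assumes
the target supports calls and that nothing inlines or merges the
wrappers.

```c
#include <assert.h>

/* One counter per source-level barrier, for the simulation only. */
static int hits_a;
static int hits_b;

/* One tiny function per barrier() in the source. On real hardware the
 * body would be the single barrier instruction, so its address (the
 * only ID the instruction has) is fixed no matter how many call sites
 * exist. */
__attribute__((noinline)) void barrier_site_a(void) { hits_a++; }
__attribute__((noinline)) void barrier_site_b(void) { hits_b++; }

/* The call to barrier_site_a has been duplicated across a branch, as
 * an optimizer might do; both copies reach the same function body. */
void kernel(int tid) {
    if (tid % 2 == 0) {
        barrier_site_a();
    } else {
        barrier_site_a();
    }
    barrier_site_b();
}

/* Returns 1 if every simulated thread passed through each barrier's
 * single function body exactly once, 0 otherwise. */
int all_threads_meet(int num_threads) {
    hits_a = 0;
    hits_b = 0;
    for (int tid = 0; tid < num_threads; tid++) {
        kernel(tid);
    }
    return hits_a == num_threads && hits_b == num_threads;
}
```

The call instructions can be copied freely here; what must stay unique
is the function body, which is why inlining and identical-function
merging are the passes to keep away from it.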

If your chip doesn't support function calls, that might constitute the
strong case Eli's asking for.

> On the unique barrier issue, even if the barrier is given a unique
> global identifier, it is the function duplication that causes the
> problem. A unique global identifier lets us identify that invalid
> optimizations have occurred, but it does not guarantee correctness since
> the barrier function is unique per function call. So any sort of
> duplication is invalid.
> Micah



