[LLVMdev] llvm.memory.barrier does not work
Junjie Gu
jgu222 at gmail.com
Thu Sep 29 09:16:35 PDT 2011
On Wed, Sep 28, 2011 at 5:47 PM, Eli Friedman <eli.friedman at gmail.com> wrote:
> On Wed, Sep 28, 2011 at 3:27 PM, Junjie Gu <jgu222 at gmail.com> wrote:
>> The intrinsic llvm.memory.barrier does not work as expected. Is it a bug,
>> or has it not been implemented yet?
>
> It's going away in favor of the new fence instruction (and I'll remove
> it as soon as dragonegg catches up). It should still work at the
> moment, though.
>
>> (1) false arguments do not work
>>
>> // pseudo code
>> void foo(int *x) {
>>   x[2] = 10;
>>   llvm.memory.barrier(0, 0, 0, 0, 0);
>>   x[2] = 20;
>>   return;
>> }
>>
>>
>> The barrier is actually a no-op, but it prevents "x[2] = 10" from being deleted.
>
> Don't do that. :) Really, why are you using a noop barrier?
Just to show that it affects optimization when it shouldn't.
>
>> (2) True arguments do not work.
>>
>> // pseudo code
>> void foo(int * restrict x) {
>>   x[2] = 10;
>>   llvm.memory.barrier(1, 1, 1, 1, 1);
>>   x[2] = 20;
>>   return;
>> }
>>
>> "x[2] = 10" should not be deleted because the barrier is present, but it
>> is deleted anyway.
>
> The pointer is "restrict", therefore the compiler assumes nothing else
> can touch it while the function runs.
>
> Actually, the transformation in question is probably valid even if the
> pointer isn't restrict (although LLVM won't actually do that); your
> use of a barrier here doesn't really make sense.
If you think of multithreaded code, it does make sense. Again, this is
simplified code, used just to show the point.
A memory barrier requires that all memory operations before the
barrier point complete before any memory operation after the barrier
starts. I think this requires that compilers not optimize across
the barrier point (in addition to generating memory barrier
instructions where needed; I think LLVM only does the latter). Do you
agree?
For the following gcc code, the asm basically behaves like a barrier
that prevents optimization across it. You can see that both
writes to p[2] are in the gcc output.
void foo (int * __restrict__ p)
{
  p[2] = 10;
  __asm__ __volatile__ ("":::"memory");
  p[2] = 20;
}
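
For comparison, here is a sketch of the same test next to a C11 fence.
It assumes a C11 compiler with GCC-style inline asm (GCC or Clang).
COMPILER_BARRIER, store_twice_asm, store_twice_fence and self_check are
names made up for illustration. The asm version mirrors foo above. The
C11 version uses atomic_thread_fence, which orders atomic operations
between threads; whether a compiler keeps the first plain store across
it is implementation-dependent, which is the question in this thread.

```c
#include <assert.h>
#include <stdatomic.h>

/* Compiler-only barrier (GCC/Clang extension): emits no instruction, but
   the "memory" clobber tells the compiler that memory may be read or
   written at this point. */
#define COMPILER_BARRIER() __asm__ __volatile__("" ::: "memory")

/* Same shape as foo() above. */
void store_twice_asm(int * __restrict__ p)
{
  p[2] = 10;
  COMPILER_BARRIER();
  p[2] = 20;
}

/* C11 variant. The fence may also emit a hardware fence instruction.
   It is NOT guaranteed to keep the first (non-atomic) store alive. */
void store_twice_fence(int * __restrict__ p)
{
  p[2] = 10;
  atomic_thread_fence(memory_order_seq_cst);
  p[2] = 20;
}

/* Single-threaded sanity check: only the final value is observable here. */
int self_check(void)
{
  int a[4] = {0, 0, 0, 0};
  int b[4] = {0, 0, 0, 0};
  store_twice_asm(a);
  store_twice_fence(b);
  assert(a[2] == 20 && b[2] == 20);
  return a[2] + b[2];
}
```

A single-threaded caller can only ever observe the final value 20. To
see whether both stores survive, compare the generated assembly, for
example with gcc -O2 -S.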
Junjie
>
> -Eli
>