[PATCH] D11382: x86 atomic: optimize a.store(reg op a.load(acquire), release)
Dmitry Vyukov
dvyukov at google.com
Wed Jul 22 01:57:42 PDT 2015
dvyukov added a comment.
In http://reviews.llvm.org/D11382#209066, @jfb wrote:
> In http://reviews.llvm.org/D11382#208803, @dvyukov wrote:
>
> > Will this optimization transform:
> >
> > int foo() {
> >   int r = __atomic_load_n(&x, __ATOMIC_RELAXED);
> >   __atomic_store_n(&x, r+1, __ATOMIC_RELAXED);
> >   return r;
> > }
> >
> >
> > ? If yes, how?
>
>
> Good point, I added test `add_32r_self` to ensure that this doesn't happen, and that the pattern matching figures out dependencies properly.
I am glad the comment was useful, but I was actually asking about something different :)
My example does not contain a self-add. It contains two uses of the load result, and one of those uses could potentially be folded into the store. My concern was that the code could be compiled to:
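  MOV [addr], r     ; load x into r (first load)
  ADD [addr], 1     ; folded read-modify-write, reads x again (second load)
  MOV r, rax        ; return value comes from the first load
  RET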
or to:
  ADD [addr], 1     ; folded read-modify-write (first load)
  MOV [addr], rax   ; return value re-read from memory (second load)
  RET
Both would be incorrect transformations: each performs two loads of x where the source has only one, so another thread's store can land between them.
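To make the problem concrete, here is a minimal single-threaded sketch of the first sequence. It is only a model, not the actual codegen: the `between` hook and the names `foo_source`, `foo_folded`, `no_store` and `store_100` are made up for illustration, and the hook stands in for another thread's store landing at that point.

```c
/* Single-threaded model: x is the shared location, and the `between`
   hook stands in for another thread's store landing at that point. */
int x;

typedef void (*hook_fn)(void);

void no_store(void) {}
void store_100(void) { x = 100; }

/* Source semantics: exactly one load of x; the stored value is r + 1. */
int foo_source(hook_fn between) {
    int r = x;        /* the only load */
    between();
    x = r + 1;        /* the store depends on that one load */
    return r;
}

/* Model of the first bad sequence: the store is folded into a
   read-modify-write, but the original load is kept for the return value. */
int foo_folded(hook_fn between) {
    int r = x;        /* MOV: first load */
    between();
    x = x + 1;        /* ADD [addr], 1: re-reads x, i.e. a second load */
    return r;
}
```

Starting from x == 0, `foo_source(store_100)` returns 0 and leaves x == 1, while `foo_folded(store_100)` returns 0 and leaves x == 101. No placement of the other thread's store relative to the source's single load and store can produce "returned 0, x == 101": the possible outcomes are (100, 101), (0, 1) and (0, 100).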
I guess this transformation should require that the folded store is the only use of the load result.
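Roughly, the condition I have in mind looks like the sketch below. This is a toy IR with a use counter, not LLVM's actual data structures; `struct node`, `add_use` and `can_fold` are hypothetical names used only for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy IR node, not LLVM's real types. */
struct node {
    const char *name;
    struct node *operand;   /* single value operand, or NULL */
    int num_uses;           /* how many nodes consume this node's result */
};

/* Record that `user` consumes the result of `def`. */
void add_use(struct node *user, struct node *def) {
    user->operand = def;
    def->num_uses++;
}

/* Folding load + op + store into one read-modify-write instruction is
   only allowed when `op` is the sole consumer of the load. Any other
   use would force the value to be loaded a second time. */
bool can_fold(const struct node *load, const struct node *op) {
    return op->operand == load && load->num_uses == 1;
}
```

In the example from my earlier comment the load has two users (the add feeding the store, and the return), so `can_fold` would reject it. With the return removed the load has a single user and folding is fine.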
http://reviews.llvm.org/D11382