[PATCH] D11382: x86 atomic: optimize a.store(reg op a.load(acquire), release)
JF Bastien
jfb at chromium.org
Wed Jul 22 11:23:54 PDT 2015
jfb added a comment.
In http://reviews.llvm.org/D11382#209576, @dvyukov wrote:
> In http://reviews.llvm.org/D11382#209066, @jfb wrote:
>
> > In http://reviews.llvm.org/D11382#208803, @dvyukov wrote:
> >
> > > Will this optimization transform:
> > >
> > > int foo() {
> > > int r = atomic_load_n(&x, __ATOMIC_RELAXED);
> > > atomic_store_n(&x, r+1, __ATOMIC_RELAXED);
> > > return r;
> > > }
> > >
> > >
> > > ? If yes, how?
> >
> >
> > Good point, I added test `add_32r_self` to ensure that this doesn't happen, and that the pattern matching figures out dependencies properly.
>
>
> I am glad that the comment was useful, but I actually asked a different thing :)
> My example does not contain self-add. It contains two usages of a load result, and one of the usages can be potentially folded. My concern was that the code can be compiled as:
>
> MOV [addr], r
> ADD [addr], 1
> MOV r, rax
> RET
>
>
> or to:
>
> ADD [addr], 1
> MOV [addr], rax
> RET
>
>
> Both of which would be incorrect transformations -- two loads instead of one.
> I guess this transformation should require that the folded store is the only usage of the load result.
Oh sorry, I totally misunderstood you! I added a test for this. IIUC it can't happen because the entire matched pattern is replaced with a pseudo instruction, so an intermediate result that escapes the pattern wouldn't have a def anymore.
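For reference, here is a minimal C sketch of the two cases under discussion. It assumes the GCC/Clang `__atomic_*` builtins and a global `x`; the names `bump` and `demo` are made up for illustration and are not from the patch or its tests. In `foo` the load result has a second use (the return value), so folding the load into a read-modify-write would require a second load. In `bump` the store is the only use. Whether the fold actually fires for either function depends on the patch; this only illustrates the distinction.

```c
#include <assert.h>

static int x; /* shared location, as in the quoted example */

/* Two uses of the load result: the stored r + 1 and the return value.
   Folding the load into `ADD [addr], 1` would leave `return r` needing
   a second load, which is the incorrect transformation described above. */
int foo(void) {
    int r = __atomic_load_n(&x, __ATOMIC_RELAXED);
    __atomic_store_n(&x, r + 1, __ATOMIC_RELAXED);
    return r;
}

/* Hypothetical single-use variant: the store is the only user of r,
   so load, add and store form one self-contained pattern. */
void bump(void) {
    int r = __atomic_load_n(&x, __ATOMIC_RELAXED);
    __atomic_store_n(&x, r + 1, __ATOMIC_RELAXED);
}

/* Single-threaded check of the source-level semantics that any
   lowering has to preserve. */
void demo(void) {
    x = 41;
    assert(foo() == 41); /* foo returns the value it loaded */
    assert(x == 42);     /* and stored that value plus one */
    bump();
    assert(x == 43);
}
```

Comparing the x86 output for `foo` and `bump` (for example with `clang -O2 -S`) is one way to see whether the load in `foo` stays a separate instruction.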
http://reviews.llvm.org/D11382