[PATCH] D11434: Allow value forwarding past release fences in EarlyCSE
jfb at chromium.org
Fri Jul 24 08:58:27 PDT 2015
jfb added inline comments.
Comment at: test/Transforms/EarlyCSE/fence.ll:43
@@ +42,3 @@
+ fence acquire
+ %a2 = load i32, i32* %addr.i, align 4
+ %res = sub i32 %a, %a2
> reames wrote:
> > dvyukov wrote:
> > > jfb wrote:
> > > > You can't forward `%a` and eliminate `%a2`, but you could eliminate `%a`, and replace it by `%a2` if there are no intervening uses, no?
> > > In this particular case we can do whatever we want with accesses to %addr.i. No other thread can access it without a race.
> > @jfb - Yes, but that's not what the current optimization tries to do. Nor does EarlyCSE have a mechanism to do that in general even for unordered loads.
> > @dvyukov - Yes, but we need to actually prove that and my change to EarlyCSE is no where near that aggressive. And frankly, I'm really not interested in exploiting the undefined behaviour part of this.
> > Do you have suggestions for how to restructure the test in a way which still tests EarlyCSE but avoids disallowing the optimization you pointed out? Would simply removing the first load from the check sequence be sufficient?
> I can't answer your question. I don't understand what EarlyCSE does, change description talks only about release fences but this test contains an acquire fence. I am trying to figure out what this change does by reading the tests. And the tests says that we cannot do something that we actually can do...
I think the main problem Dmitry and I have is the comment: it says the transformation can't happen but it in fact is allowed to happen and simply ins't implemented here. You're testing the implementation knowing what it does, but your comment hints to future contributors that doing this optimization isn't valid.
I'm assuming that the test can't be restructured because it would require a loop, and that's something that the current implementation won't see through. You're testing a very minimal thing instead (an overly conservative on, since it's a race!), and ensuring that the current code can't optimize it.
More information about the llvm-commits