[PATCH] D64405: [DependenceAnalysis] Dependecies for loads marked with "ivnariant.load" should not be shared with general accesses. Fix for https://bugs.llvm.org/show_bug.cgi?id=42151
Evgeniy via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Jul 15 04:27:28 PDT 2019
ebrevnov added a comment.
In D64405#1582896 <https://reviews.llvm.org/D64405#1582896>, @jdoerfert wrote:
> In D64405#1582297 <https://reviews.llvm.org/D64405#1582297>, @ebrevnov wrote:
>
> > In D64405#1578347 <https://reviews.llvm.org/D64405#1578347>, @jdoerfert wrote:
> >
> > > In D64405#1577916 <https://reviews.llvm.org/D64405#1577916>, @ebrevnov wrote:
> > >
> > > > In D64405#1577644 <https://reviews.llvm.org/D64405#1577644>, @jdoerfert wrote:
> > > >
> > > > > In D64405#1577614 <https://reviews.llvm.org/D64405#1577614>, @ebrevnov wrote:
> > > > >
> > > > > > In D64405#1576608 <https://reviews.llvm.org/D64405#1576608>, @jdoerfert wrote:
> > > > > >
> > > > > > > Why do we need both changes, reading the cache and writing it, assuming I understand the code correctly?
> > > > > > > I would have assumed not caching "inv-load" results would fix the problem. And I can see we do not want to reuse
> > > > > > > non-inv-load results for inv-loads. However, I would argue one could always check for a cached result and use
> > > > > > > it if the result is "good". Do I miss something here?
> > > > > >
> > > > > >
> > > > > > Yes, you are reading the change correctly. The thing is that both results (one for inv-load case another for non inv-load) are "good". They are just different. Each result should be used for corresponding case only and not for the other. In other words using result from non-inv case for inv-case is not correct as well. Does it make sense?
> > > > >
> > > > >
> > > > > No, but I'm tired. I would have assumed that I can drop the `inv-load` metadata without semantic change. If that is the case, I should be able to use the `non-inv` result if it is cached and already good enough. That would mean we can keep the lookup but do only return the cached result if it is "the best". Maybe my problem is that I do not know the lattice of potential values.
> > > > >
> > > > > Either way, is it possible and useful to test for the number of (un)cached queries?
> > > >
> > > >
> > > > I believe you assumption regarding 'invariant.load' metadata not changing the semantics is not accurate. According to spec, If such instruction is executed there are number of assumptions the optimizer may do regarding the program. For example it's safe to assume that there are no stores before and no stores changing value after such instruction. Otherwise behavior is undefined. In general case there is no way to know at compile time if invariant load will be exectuted or not. That means you can't blindly propagate results of dependence analysis for invariant load to "normal" loads. And visa-verse results of dependence analysis for "normal" load doesn't take semantic imposed by "invariant.load" into account and would be sub-optimal if used for invariant load.
> > >
> > >
> > > First, I already agreed, you cannot propagate results from inv. loads to non-inv. loads.
> > >
> > > I still think you can drop inv. metadata, you did only say it has more restrictions but that is not a problem. If you think dropping the inv. metdata is a problem, please give me an example.
> > >
> > > Now the "sub-optimal" part of propagating non-inv. load results to inv. loads. I also agree that this is in general not what you want. However, I was trying to make the point that hat there is one exception: the best possible result. If we have the best possible result already cached for a non-inv. load, we could reuse it for inv. loads as well because they only have less, not more dependences. If you disagree, an example would be good.
> >
> >
> > Looks we are totally aligned here. Honestly I don't understand why we can't track this thread down to the end. I agree potentially we may reuse result of non-inv load for inv load if we know it's the best. But supporting this case would be about the same complexity as extending current caching for non-inv as well as inv loads. If I simply drop the first guard around the code which does cache lookup we will end up with inv load not removed on the following example.
> >
> > target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128-ni:1"
> > target triple = "x86_64-unknown-linux-gnu"
> >
> > declare void @llvm.memset.p0i8.i8(i8*, i8, i32, i1)
> > declare void @foo(i8*)
> >
> > define i8 @test(i1 %cmp, i8 *%p) {
> > entry:
> >
> > %res1 = load i8, i8* %p
> > call void @foo(i8 *%p)
> > br i1 %cmp, label %b2, label %b1
> >
> > b1:
> >
> > %res2 = load i8, i8* %p
> > %res3 = add i8 %res1, %res2
> > br label %alive
> >
> > b2:
> >
> > %v = load i8, i8* %p, !invariant.load !1
> > %res.dead = add i8 %v, %res1
> > br label %alive
> >
> > alive:
> >
> > %res.phi = phi i8 [%res3, %b1], [%res.dead, %b2]
> > ret i8 %res.phi
> >
> > }
> >
> > !1 = !{}
> >
> > That means we should either follow current approach and completely exclude non inv loads from caching or implement full support for caching inv.load.
>
>
> I don't want to block this further, so LGTM.
>
> ---
>
> For the record, I did not want to simply remove the first guard but I thought we could do something like this:
>
> Before:
>
> // If we have a cached entry, and it is non-dirty, use it as the value for
> // this dependency.
> if (ExistingResult && !ExistingResult->getResult().isDirty()) {
> ++NumCacheNonLocalPtr;
> return ExistingResult->getResult();
> }
>
>
> After:
>
> // If we have a cached entry, and it is non-dirty, use it as the value for
> // this dependency. If this is an invariant load, use the existing result
> // only if it is the best possible, otherwise we forget we checked for a
> // cached value.
> if (ExistingResult && !ExistingResult->getResult().isDirty()) {
> if (!isInvariantLoad || isBestPossible(ExistingResult)) {
> ++NumCacheNonLocalPtr;
> return ExistingResult->getResult();
> }
> ExistingResult = nullptr;
> }
>
>
> Out of curiosity, would that still be broken with your example?
It should work. Main challenge I see here is how to perform "isBestPossible" check effectively so it doesn't introduce compile time slow down. Because when this check fails we will have to perform full analysis in addition.
Repository:
rL LLVM
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D64405/new/
https://reviews.llvm.org/D64405
More information about the llvm-commits
mailing list