[PATCH] D26348: Allow convergent attribute for function arguments

Tue Mar 7 05:58:04 PST 2017

hfinkel added a comment.

In https://reviews.llvm.org/D26348#693574, @mehdi_amini wrote:

> In https://reviews.llvm.org/D26348#693554, @dark_sylinc wrote:
>
> > In https://reviews.llvm.org/D26348#693537, @mehdi_amini wrote:
> >
> > > In https://reviews.llvm.org/D26348#693528, @hakzsam wrote:
> > >
> > > > This patch should also fix https://bugs.freedesktop.org/show_bug.cgi?id=99484
> > >
> > >
> > > Please read above my comment about this bug and how it can be fixed without changing LLVM.
> >
> >
> > While true, making texture calls have unknown behavior forces the compiler to prevent reordering at all... on an architecture that heavily relies on latency hiding (both horizontal i.e. wave occupancy and vertical i.e. instruction reordering).
>
>
> Right now, frontends are given a choice between performance and correctness... 
>  My point is that one can't choose performance and sacrifices knowingly correctness, and then complains about a LLVM bug that need to be fixed. The bug is in the shader compiler in the first place!

I missed the rationale for this statement, but this does not seem obviously correct to me. Even marking the intrinsics as having unknown side effects does not seem strong enough to ensure the semantics required here - we don't really have a strong definition for "unknown side effects", AFAIK, but I'd not have thought that even unknown side effects on Foo would prevent the folding of:

  if (cond) {
    Tmp1 = Foo(v [convergent])
  } else {
    Tmp2 = Foo(v [convergent])
  }
  Tmp3 = phi [Tmp1, Tmp2]

and so, no, I don't think we currently have a solution for this of any kind (even a conservative one) - at least in theory. The closest thing I can think of might be the sideeffect marker on inline-asm statements: That might prevent the folding if we allow inline asm statements to define labels visible to other inline asm statements?

> I perfectly understand how having a solution to enable these optimizations is very important for shader compilers.
>  It is likely that if I was in charge of such a shader compiler, I'd try to explore how to express my intrinsics differently such that I can get "most" optimizations  (i.e. not having to say to LLVM that this call can do anything).
> 
> Example to think about: can token help? Could the intrinsic be modeled by taking a pointer to a memory location instead of a SSA value? (there are finer grain way to express the effect on a pointer argument, and it wouldn't break SSA).
> 
>> Btw. I've been running Mesa radeonsi with this patch on top of LLVM 5.0 for my daily work and leisure time for almost a month now, and haven't encountered any issue.
> 
> Great. But be aware that anecdotal data are unlikely to expedite a review here. What is needed is someone to spend time to find a solution to and nail the issues discussed in LangRef above.

https://reviews.llvm.org/D26348