[PATCH] D71241: [OpenMP][WIP] Use overload centric declare variants

Alexey Bataev via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Thu Dec 12 12:13:52 PST 2019


ABataev added a comment.

In D71241#1782317 <https://reviews.llvm.org/D71241#1782317>, @jdoerfert wrote:

> In D71241#1782173 <https://reviews.llvm.org/D71241#1782173>, @ABataev wrote:
>
> > In D71241#1782157 <https://reviews.llvm.org/D71241#1782157>, @jdoerfert wrote:
> >
> > > In D71241#1781955 <https://reviews.llvm.org/D71241#1781955>, @ABataev wrote:
> > >
> > > > In D71241#1780715 <https://reviews.llvm.org/D71241#1780715>, @jdoerfert wrote:
> > > >
> > > > > In D71241#1779097 <https://reviews.llvm.org/D71241#1779097>, @ABataev wrote:
> > > > >
> > > > > > In D71241#1778963 <https://reviews.llvm.org/D71241#1778963>, @jdoerfert wrote:
> > > > > >
> > > > > > > In D71241#1778736 <https://reviews.llvm.org/D71241#1778736>, @ABataev wrote:
> > > > > > >
> > > > > > > > In D71241#1778717 <https://reviews.llvm.org/D71241#1778717>, @jdoerfert wrote:
> > > > > > > >
> > > > > > > > > >> There is no evidence that this is more complicated. By all measures, this is less complicated (see also below). It is also actually doing the right thing when it comes to code emission. Take https://godbolt.org/z/sJiP3B for example. The calls are wrong and the definition of base is missing.
> > > > > > > > > > 
> > > > > > > > > > How did you measure it? I have a completely different opinion. Also, tried to reproduce the problem locally, could not reproduce. It seems to me, the output of the test misses several important things. You can check it yourself. `tgt_target_teams()` call uses `@.offload_maptypes` global var but it is not defined.
> > > > > > > > >
> > > > > > > > > Here is the link with the globals not hidden: https://godbolt.org/z/5etB5S
> > > > > > > > >  The base function is called both times but should not be called at all. What is your local output and why does it differ?
> > > > > > > >
> > > > > > > >
> > > > > > > > On the host `base` is an alias for the `hst` function. On the device `base` has the body of `dev` function because NVPTX does nit support function aliases (10+ suppprts it, but LLVM does not support it yet). Try to change the bodies of dev and hst and you will see.
> > > > > > > >
> > > > > > > > I tried to keep original function names to improve debugging and make users less wonder why instead of `base` something else is called.
> > > > > > >
> > > > > > >
> > > > > > > How does that work with linking? Another translation unit can call both the base and hst/dev function, right? I mean they both need to be present.
> > > > > >
> > > > > >
> > > > > > If hst or dev are needed, they are emitted too, independently. On the host, the variamt function is emitted and base function is set to be an alias of the variant function. On the device we just inherit the body of the variant function, but this variant function also can be emitted, if used independently.
> > > > >
> > > > >
> > > > > This is confusing:
> > > > >
> > > > > 1. Especially for debugging we should do the same on host and device.
> > > >
> > > >
> > > > We do the same on the host and on the device. The user will see that he is the context of originally called function `base`. It is the same behavior just like the behavior of the GNU gcc `alias` attribute.
> > >
> > >
> > > No we do not. You said it yourself just in the last comment: one time you emit an alias (host), one time you "emit code into the base function" (device).
> > >  That is not the same thing, especially if you look at debug information, stack frames, ...
> >
> >
> > Yes, formally it is not the same. But only because NVPTX does not support function aliasing.
>
>
> Long story short, code generation (and debugging) is different in the existing approach as a workaround.
>
> >> 
> >> 
> >>>> 2. The device function now exists twice, that is bad.
> >>> 
> >>> It is not so bad, actually. In your solution, you have almost the same situation. Plus, it is not a big problem taking into account that most of the function calls will be inlined and function will be eliminated. As soon as we have support for global aliases in LLVM/NVPTX, we can improve it.
> >> 
> >> It is bad to duplicate code for no reason. The SemaOverload solution has not "almost the same situation", as it does not replace the base function body with the variant function body.
> > 
> > Again, just a workaround for NVPTX problem with function aliases.
>
> Long story short, we duplicate code in the exisiting approach as a workaround.
>
> >> 
> >> 
> >>>> 3. How is this supposed to work with type corrections? I mean the variant needs to be compatible but not necessarily of the same type, right? https://godbolt.org/z/QAXCuv we just produce a cryptic error (locally it crashes for me afterwards).
> >>> 
> >>> Actually, these functions are not compatible. We're missing some extra checks in Sema, will add them later.
> >> 
> >> That is not necessarily clear to me. The standard does not say the types need to be equal but compatible.
> > 
> > Compatible in terms of the base language. In C, these functions are not compatible. Try to declare the function `base` as `int base(float)` and then try to redeclare it as `int base(int)`. You will get the error because the types are not compatible. For example, `int ()` and `int(int)` are compatible though are not equal.
>
> The problem exists for C++ with short vs int as well: https://godbolt.org/z/wYB_pX
>
> >> 
> >> 
> >>>> 4. On the host the expression `&base == &hst` will evaluate to true with the alias.
> >>> 
> >>> And what is the problem with this? I believe, with your solution the result will be the same because you will just replace `base` with the `hst` upon overloading resolution.
> >> 
> >> But I can do it for calls only. Aliases are not as fine granular. That is also why the current code cannot handle the `construct` trait but an overload solution can.
> > 
> > I see no problem here.
>
> The current approach *cannot* be used to implement `construct` without adding a *third* way to handle declare variants. So we have aliases (no 1), if supported by the architecture and if the trait is not `construct`. We have "function hijacking" (no 2), if aliases are not supported and if the trait is not `construct`. We will need something else if the trait is `construct` (no 3). The alternative is to handle *all of it* during overload resolution. If you still believe it is better to modify the IR in various ways *after* we created it, I'll move this discussion on the openmp-dev list to unblock it.
>
> >>>> In D71241#1779779 <https://reviews.llvm.org/D71241#1779779>, @ABataev wrote:
> >>>> 
> >>>>> Here is the example that does not work with the proposed solution but works with the existing one:
> >>>>>
> >>>>>   static void cpu() { asm("nop"); }
> >>>>>  
> >>>>>   #pragma omp declare variant(cpu) match(device = {kind(cpu)})
> >>>>>   static __attribute__((used)) void wrong_asm() {
> >>>>>     asm ("xxx");
> >>>>>   }
> >>>>>
> >>>>>
> >>>>> The existing solution has some problems with the delayed error messages too, but they are very easy to fix.
> >>>> 
> >>>> 
> >>>> I don't understand this example. What is the expected outcome here (I get the error below from ToT clang). Why is that not achievable by a SemaOverload solution?
> >>>>  `asm.c:4:35: error: function with '#pragma omp declare variant' must have a prototype`
> >>> 
> >>> Try to compile as C++:
> >>> 
> >>>   clang++ -с -fopenmp repro.cpp
> >> 
> >> You did not answer any of the questions here. If you ignore my comments this is not working.
> >>  What is the expected outcome here?  Why is that not achievable by a SemaOverload solution?
> > 
> > Expected result - the code is compiled successfully. Your solution will produce the error message for incorrect asm. SemaOverload won't help you with this. Assume, you have a base function with the target-specific code for the host and a variant with the target-specific code on the device.
>
> I compiled it as cpp with both approaches just fine (ignoring the math errors due to the wrong order of things because I had the begin/end declare variant patch build as well).
>  No asm error on my side, did you run it or just assume it woulnd't work?
>  In fact, the current solution will disregard the `used` attribute here and not emit the function, which is bad. The Sema solution will dispatch the right calls and honor the `used` attribute properly.


One note:

In D71241#1782365 <https://reviews.llvm.org/D71241#1782365>, @jdoerfert wrote:

> >> In fact, the current solution will disregard the `used` attribute here and not emit the function, which is bad. The Sema solution will dispatch the right calls and honor the `used` attribute properly.
> > 
> > Nope, the function is emitted but as an alias to the function with the correct assembler.
>
> The function is marked as `used` but the current solution does not emit it. That is plain wrong. Even if it was not marked as used but externally visible you cannot *not emit it*. The alias solution is simply not working.


Please, add Richard Smith and John McCall as reviewers. Explain that you're replacing the function written by the user on the fly by another one. If they accept it, go ahead.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D71241/new/

https://reviews.llvm.org/D71241





More information about the cfe-commits mailing list