[PATCH] D71241: [OpenMP][WIP] Use overload centric declare variants
Johannes Doerfert via Phabricator via cfe-commits
cfe-commits at lists.llvm.org
Thu Dec 12 09:56:06 PST 2019
jdoerfert added a comment.
In D71241#1781955 <https://reviews.llvm.org/D71241#1781955>, @ABataev wrote:
> In D71241#1780715 <https://reviews.llvm.org/D71241#1780715>, @jdoerfert wrote:
>
> > In D71241#1779097 <https://reviews.llvm.org/D71241#1779097>, @ABataev wrote:
> >
> > > In D71241#1778963 <https://reviews.llvm.org/D71241#1778963>, @jdoerfert wrote:
> > >
> > > > In D71241#1778736 <https://reviews.llvm.org/D71241#1778736>, @ABataev wrote:
> > > >
> > > > > In D71241#1778717 <https://reviews.llvm.org/D71241#1778717>, @jdoerfert wrote:
> > > > >
> > > > > > >> There is no evidence that this is more complicated. By all measures, this is less complicated (see also below). It is also actually doing the right thing when it comes to code emission. Take https://godbolt.org/z/sJiP3B for example. The calls are wrong and the definition of base is missing.
> > > > > > >
> > > > > > > How did you measure it? I have a completely different opinion. Also, tried to reproduce the problem locally, could not reproduce. It seems to me, the output of the test misses several important things. You can check it yourself. `tgt_target_teams()` call uses `@.offload_maptypes` global var but it is not defined.
> > > > > >
> > > > > > Here is the link with the globals not hidden: https://godbolt.org/z/5etB5S
> > > > > > The base function is called both times but should not be called at all. What is your local output and why does it differ?
> > > > >
> > > > >
> > > > > On the host `base` is an alias for the `hst` function. On the device `base` has the body of `dev` function because NVPTX does nit support function aliases (10+ suppprts it, but LLVM does not support it yet). Try to change the bodies of dev and hst and you will see.
> > > > >
> > > > > I tried to keep original function names to improve debugging and make users less wonder why instead of `base` something else is called.
> > > >
> > > >
> > > > How does that work with linking? Another translation unit can call both the base and hst/dev function, right? I mean they both need to be present.
> > >
> > >
> > > If hst or dev are needed, they are emitted too, independently. On the host, the variamt function is emitted and base function is set to be an alias of the variant function. On the device we just inherit the body of the variant function, but this variant function also can be emitted, if used independently.
> >
> >
> > This is confusing:
> >
> > 1. Especially for debugging we should do the same on host and device.
>
>
> We do the same on the host and on the device. The user will see that he is the context of originally called function `base`. It is the same behavior just like the behavior of the GNU gcc `alias` attribute.
No we do not. You said it yourself just in the last comment: one time you emit an alias (host), one time you "emit code into the base function" (device).
That is not the same thing, especially if you look at debug information, stack frames, ...
>> 2. The device function now exists twice, that is bad.
>
> It is not so bad, actually. In your solution, you have almost the same situation. Plus, it is not a big problem taking into account that most of the function calls will be inlined and function will be eliminated. As soon as we have support for global aliases in LLVM/NVPTX, we can improve it.
It is bad to duplicate code for no reason. The SemaOverload solution has not "almost the same situation", as it does not replace the base function body with the variant function body.
>> 3. How is this supposed to work with type corrections? I mean the variant needs to be compatible but not necessarily of the same type, right? https://godbolt.org/z/QAXCuv we just produce a cryptic error (locally it crashes for me afterwards).
>
> Actually, these functions are not compatible. We're missing some extra checks in Sema, will add them later.
That is not necessarily clear to me. The standard does not say the types need to be equal but compatible.
>> 4. On the host the expression `&base == &hst` will evaluate to true with the alias.
>
> And what is the problem with this? I believe, with your solution the result will be the same because you will just replace `base` with the `hst` upon overloading resolution.
But I can do it for calls only. Aliases are not as fine granular. That is also why the current code cannot handle the `construct` trait but an overload solution can.
>> In D71241#1779779 <https://reviews.llvm.org/D71241#1779779>, @ABataev wrote:
>>
>>> Here is the example that does not work with the proposed solution but works with the existing one:
>>>
>>> static void cpu() { asm("nop"); }
>>>
>>> #pragma omp declare variant(cpu) match(device = {kind(cpu)})
>>> static __attribute__((used)) void wrong_asm() {
>>> asm ("xxx");
>>> }
>>>
>>>
>>> The existing solution has some problems with the delayed error messages too, but they are very easy to fix.
>>
>>
>> I don't understand this example. What is the expected outcome here (I get the error below from ToT clang). Why is that not achievable by a SemaOverload solution?
>> `asm.c:4:35: error: function with '#pragma omp declare variant' must have a prototype`
>
> Try to compile as C++:
>
> clang++ -с -fopenmp repro.cpp
You did not answer any of the questions here. If you ignore my comments this is not working.
What is the expected outcome here? Why is that not achievable by a SemaOverload solution?
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D71241/new/
https://reviews.llvm.org/D71241
More information about the cfe-commits
mailing list