[PATCH] D18110: [OpenMP] Fix SEMA bug in the capture of global variables in template functions.

Alexey Bataev via cfe-commits cfe-commits at lists.llvm.org
Sun Apr 3 23:54:20 PDT 2016


Carlo,
Again, you're talking about performance loss. I don't think that a 
performance loss violates any parsing, semantic, or operational rules. 
If we don't pass the argument by value, will the code fail to work 
properly? OpenMP 4.5 does not specify whether firstprivate arguments 
are passed by value or by reference.
I'm not against optimization. What I'm saying is: let's start with 
basic support, and then, once the basic part of the code is committed 
and the design/architecture of the solution is stable enough, we will 
work on optimizations. Early optimization prevents us from simplifying 
the design of the whole OpenMP implementation and leads to code that 
is very hard to modify, read, and maintain. We already have a lot of 
trouble with the Sema part: it is very complex and not very well 
structured. I don't want to have the same trouble with the codegen 
part as well.
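
To make the by-value versus by-reference question concrete, here is a 
minimal C++ sketch. It is not code from the patch or from Clang's 
codegen; the names region_by_ref, region_by_value, and 
check_equivalence are hypothetical and exist only for illustration. It 
assumes a firstprivate variable only needs a private copy initialized 
from the original, and shows that both calling conventions can give 
the same observable result.

```cpp
#include <cassert>

// Hypothetical outlined region that receives the firstprivate
// variable by reference and makes its own private copy.
int region_by_ref(const int &n_ref) {
  int n = n_ref;  // private copy, initialized from the original
  n += 1;         // modifies only the private copy
  return n;
}

// Hypothetical outlined region that receives the firstprivate
// variable by value; the parameter itself is the private copy.
int region_by_value(int n) {
  n += 1;         // modifies only the parameter copy
  return n;
}

// Checks that both conventions compute the same result and that
// the original variable is left unchanged.
void check_equivalence(int n) {
  const int before = n;
  assert(region_by_ref(n) == region_by_value(n));
  assert(n == before);
}
```

Calling check_equivalence(41) passes both assertions. The sketch says 
nothing about register usage or performance on a GPU target, which is 
the separate concern raised in the quoted comment.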

Best regards,
Alexey Bataev
=============
Software Engineer
Intel Compiler Team

01.04.2016 19:10, Carlo Bertolli wrote:
> carlo.bertolli added a comment.
>
> Hi
>
> If I understand the problem correctly, I would like to add something on top of Samuel's comment.
> My understanding is that Alexey is suggesting that we pass a reference type to kernels for every pointer input parameter, regardless of the actual type and data-sharing attribute of the related variable. Ignore the remainder of this comment if this is not the case.
>
> In my view, this violates the basic logic for which target firstprivate was introduced in the OpenMP specification: to avoid having to pass to GPU kernels a reference to every pointer variable. In fact, having to pass a reference results in ptxas generating an additional register for each input parameter. We saw significant performance differences in this respect, and reported them in a paper at SC15. Various members of the OpenMP committee have cited this multiple times as the reason why the target firstprivate logic is defined the way it is.
>
> In conclusion, I do not see this patch as an optimization, but as a way of correctly implementing what the OpenMP specification clearly states. Any other implementation is wrong, not a simpler, non-optimized version.
>
>
> http://reviews.llvm.org/D18110
>
>
>


