[PATCH] D18286: [OPENMP] private and firstprivate clauses of teams code generation for nvptx
Carlo Bertolli via cfe-commits
cfe-commits at lists.llvm.org
Tue Mar 22 07:05:24 PDT 2016
carlo.bertolli added a comment.
Thanks for your comment. The suggested change will not work as I intended in my patch when using the host as a device. This happens when you select the following options:
In this case we generate device code and the target is ppc64. On ppc64 we need to generate a call to __kmpc_fork_teams. Your proposed change treats all devices uniformly, so we would not generate the call to __kmpc_fork_teams.
There are various reasons why we should not do that, the clearest ones to me being:
- We would generate different code depending on whether the host is used as the host or as a target device. This would complicate performance profiling.
- On a host it is still important to have teams, as that may be where coarse-grained parallelism comes from.
If you still want no specialization in CGOpenMPRuntimeNVPTX, we will need to check whether we are targeting a device and whether that device is an nvptx one.
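To make the two-part check concrete, here is a minimal standalone sketch. It is not Clang code: CompileTarget, Arch, TeamsLowering and selectTeamsLowering are hypothetical names invented for illustration. They only model the decision described above, where the nvptx-specific path is taken only for a device compilation whose target is nvptx, and everything else (including ppc64 used as a device) keeps the fork_teams call.

```cpp
// Hypothetical model of the check; none of these names are Clang's actual API.
enum class Arch { PPC64, X86_64, NVPTX, NVPTX64 };

// Assumed description of what is being compiled.
struct CompileTarget {
  bool IsDeviceCompilation;  // true when generating code for an offload target
  Arch TargetArch;           // architecture code is being generated for
};

// The two code generation strategies for a teams region discussed in the thread.
enum class TeamsLowering { ForkTeamsCall, NVPTXSpecific };

inline bool isNVPTX(Arch A) {
  return A == Arch::NVPTX || A == Arch::NVPTX64;
}

// Both conditions must hold for the nvptx path: we are compiling for a device,
// and that device is an nvptx one. A ppc64 host used as a device fails the
// second test, so it still gets the fork_teams call.
inline TeamsLowering selectTeamsLowering(const CompileTarget &T) {
  if (T.IsDeviceCompilation && isNVPTX(T.TargetArch))
    return TeamsLowering::NVPTXSpecific;
  return TeamsLowering::ForkTeamsCall;
}
```

With this shape, the host-as-device case and the plain host case produce the same lowering, which is the property argued for in the two points above.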
I know the problem is that two CodeGen objects are created in different places, depending on whether we target nvptx or the host. However, given the way the interface is currently structured, I do not see any way out of this duplication.