[Openmp-commits] [PATCH] D71384: [libomptarget] Implement hsa plugin
Johannes Doerfert via Phabricator via Openmp-commits
openmp-commits at lists.llvm.org
Wed Dec 18 09:17:11 PST 2019
jdoerfert added a comment.
In D71384#1789094 <https://reviews.llvm.org/D71384#1789094>, @JonChesterfield wrote:
> A fprintf simplification is at https://github.com/ROCm-Developer-Tools/llvm-project/pull/63, I'll update this diff once that lands.
In case it helps (performance) we can wrap the body of the function in an ifdef #OMP_TARGET_DEBUG (I forgot the actual macro).
================
Comment at: openmp/libomptarget/plugins/hsa/src/rtl.cpp:1180
+ num_groups > DeviceInfo.EnvMaxTeamsDefault)
+ num_groups = DeviceInfo.EnvMaxTeamsDefault;
+ }
----------------
JonChesterfield wrote:
> JonChesterfield wrote:
> > jdoerfert wrote:
> > > I'm fairly certain we have to get rid of this `EnvMaxTeamsDefault`. Btw. I cannot find it in my llvm version.
> > Ah, nice catch. The aomp branch has an abstraction over nvptx and amdgcn constants which gets used by clang and this plugin. This is one of the pieces I've missed by not actually compiling against trunk. Will fix.
> OK, so this one isn't a constant - it's another hook for pulling a value from the environment. Cuda/rtl.cpp does a lot of this too.
>
> Comparing the two, both use a class named RTLDeviceInfoTy, but with different fields and different implementations. The cuda and hsa implementations look broadly similar in general - same functions, doing roughly the same things - but there's a lot of differences in the detail. I don't think the code size savings from pulling out common sequences would be worth coupling the two together.
Fair. We need to make sure this builds though, so the members are there.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D71384/new/
https://reviews.llvm.org/D71384
More information about the Openmp-commits
mailing list