[PATCH] D142393: [OpenMP] Add 'amdgpu-flat-work-group-size' to OpenMP kernels
Yaxun Liu via Phabricator via cfe-commits
cfe-commits at lists.llvm.org
Mon Jan 23 13:47:06 PST 2023
yaxunl added inline comments.
================
Comment at: clang/lib/CodeGen/TargetInfo.cpp:9552
F->addFnAttr("uniform-work-group-size", "true");
+ if (IsOpenMPkernel)
+ F->addFnAttr("amdgpu-flat-work-group-size",
----------------
jhuber6 wrote:
> arsenm wrote:
> > jhuber6 wrote:
> > > arsenm wrote:
> > > > Probably shouldn’t check the language, just it’s a kernel. Also shouldn’t emit this if it’s the default 1024. I’ve been trying to cut down on the superfluous attribute spam
> > > There's a section for HIP above that does the same. We could probably consolidate here for all "AMDGPU" kernels and get rid of the redundant attribute. Maybe in a separate patch?
> > All the isCUDA || HIP || OpenMP checks scattered around are driving me crazy. A bunch of the out of tree divergent patches are just adding to them. We should just purge everything checking languages to the actual features and stop putting language names in things
> OpenCL is the odd one out as far as I know, HIP and OpenMP are mostly equivalent as far as attributes go.
OpenCL uses 256 as default max block size. This is to avoid performance regressions for existing apps. HIP uses 1024 by default.
================
Comment at: clang/lib/Driver/ToolChains/AMDGPUOpenMP.cpp:61
+ DriverArgs.getLastArgValue(options::OPT_gpu_max_threads_per_block_EQ,
+ "1024")));
+
----------------
we should keep the default value in Options.td instead of having multiple copies at different places. save as below.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D142393/new/
https://reviews.llvm.org/D142393
More information about the cfe-commits
mailing list