[clang] [llvm] [AMDGPU] Emit a waitcnt instruction after each memory instruction (PR #79236)
Jun Wang via llvm-commits
llvm-commits at lists.llvm.org
Tue Feb 20 16:41:47 PST 2024
================
@@ -167,6 +167,10 @@ def FeatureCuMode : SubtargetFeature<"cumode",
"Enable CU wavefront execution mode"
>;
+def FeaturePreciseMemory
----------------
jwanggit86 wrote:
@Pierre-vh With the suggested change, the function `getAMDGPUTargetFeatures` looks something like the following:
```
void amdgpu::getAMDGPUTargetFeatures(...) {
  ...
  if (Args.hasFlag(options::OPT_mwavefrontsize64,
                   options::OPT_mno_wavefrontsize64, false))
    Features.push_back("+wavefrontsize64");

  if (Args.hasFlag(options::OPT_mamdgpu_precise_memory_op,
                   options::OPT_mno_amdgpu_precise_memory_op, false)) {
    Features.push_back("+precise-memory");
  }

  handleTargetFeaturesGroup(D, Triple, Args, Features,
                            options::OPT_m_amdgpu_Features_Group);
}
```
However, `handleTargetFeaturesGroup` does not seem to check whether an Arg has already been claimed. It processes every Arg in the group, so the option is translated a second time and we end up with the following:
`"-target-feature" "+precise-memory" "-target-feature" "+amdgpu-precise-memory-op"`
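To make the double emission concrete, here is a small standalone model of the flow. It is only a sketch under assumptions: `ToyArg`, `toyHasFlag`, `toyHandleGroup`, and `toyCollectFeatures` are hypothetical stand-ins written for illustration, not clang's driver classes, and the group handler is assumed to derive the feature name from the option spelling (drop the leading `m`, map a `no-` prefix to `-`) without looking at the claimed bit.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for a driver argument (not clang's Arg class).
struct ToyArg {
  std::string Name; // option spelling without the leading '-'
  bool Claimed;     // set once some code has consumed the option
};

// Toy flag query: the last matching occurrence wins, and every match is claimed.
bool toyHasFlag(std::vector<ToyArg> &Args, const std::string &Pos,
                const std::string &Neg, bool Default) {
  bool Result = Default;
  for (ToyArg &A : Args) {
    if (A.Name == Pos) {
      A.Claimed = true;
      Result = true;
    } else if (A.Name == Neg) {
      A.Claimed = true;
      Result = false;
    }
  }
  return Result;
}

// Toy group handler: turns every option into a feature string and never
// consults Claimed, which is the behavior described in the comment above.
void toyHandleGroup(std::vector<ToyArg> &Args,
                    std::vector<std::string> &Features) {
  for (ToyArg &A : Args) {
    A.Claimed = true;
    std::string Name = A.Name.substr(1); // skip the leading 'm'
    bool IsNegative = Name.compare(0, 3, "no-") == 0;
    if (IsNegative)
      Name = Name.substr(3);
    Features.push_back(std::string(IsNegative ? "-" : "+") + Name);
  }
}

// Mirrors the shape of the snippet above: explicit flag check first,
// then the generic group handler over the same arguments.
std::vector<std::string>
toyCollectFeatures(const std::vector<std::string> &OptionNames) {
  std::vector<ToyArg> Args;
  for (const std::string &N : OptionNames)
    Args.push_back(ToyArg{N, false});

  std::vector<std::string> Features;
  if (toyHasFlag(Args, "mamdgpu-precise-memory-op",
                 "mno-amdgpu-precise-memory-op", false))
    Features.push_back("+precise-memory");
  toyHandleGroup(Args, Features);
  return Features;
}
```

In this model, calling `toyCollectFeatures({"mamdgpu-precise-memory-op"})` yields both `+precise-memory` and `+amdgpu-precise-memory-op`, matching the pair of `-target-feature` values shown above. Under the same assumptions, avoiding the duplicate would mean either keeping the option out of the features group or having the group handler skip it.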
https://github.com/llvm/llvm-project/pull/79236