[PATCH] D106909: [clang] Add clang builtins support for gfx90a

Wed Aug 4 10:11:28 PDT 2021

rampitec added inline comments.

================
Comment at: clang/include/clang/Basic/BuiltinsAMDGPU.def:201
+TARGET_BUILTIN(__builtin_amdgcn_global_atomic_fadd_f32, "ff*1f", "t", "gfx90a-insts")
+TARGET_BUILTIN(__builtin_amdgcn_global_atomic_fadd_v2f16, "V2hV2h*1V2h", "t", "gfx90a-insts")
+TARGET_BUILTIN(__builtin_amdgcn_global_atomic_fmin_f64, "dd*1d", "t", "gfx90a-insts")
----------------
gandhi21299 wrote:
> I tried add target feature gfx908-insts for this builtin but the frontend complains that it should have target feature gfx90a-insts.
That was for global_atomic_fadd_f32, but as per discussion we are going to use builtin only starting from gfx90a because of the noret problem. Comments in the review are off their positions after multiple patch updates.

================
Comment at: clang/include/clang/Basic/BuiltinsAMDGPU.def:210
+TARGET_BUILTIN(__builtin_amdgcn_ds_atomic_fadd_f64, "dd*3d", "t", "gfx90a-insts")
+TARGET_BUILTIN(__builtin_amdgcn_ds_atomic_fadd_f32, "ff*3f", "t", "gfx8-insts")
+
----------------
This needs tests with a gfx8 target and a negative test with gfx7.

================
Comment at: clang/test/CodeGenOpenCL/builtins-fp-atomics-gfx90a.cl:13
+// GFX90A:  global_atomic_add_f64
+void test_global_add(__global double *addr, double x) {
+  double *rtn;
----------------
Use _f64 or _double in the test name.

================
Comment at: clang/test/CodeGenOpenCL/builtins-fp-atomics-gfx90a.cl:31
+// GFX90A:  global_atomic_min_f64
+void test_global_global_min(__global double *addr, double x){
+  double *rtn;
----------------
Same here and in other double tests, use a suffix f64 or double.

================
Comment at: clang/test/CodeGenOpenCL/builtins-fp-atomics-gfx90a.cl:67
+// GFX90A:  flat_atomic_min_f64
+void test_flat_min_constant(__generic double *addr, double x){
+  double *rtn;
----------------
'constant' is wrong. It is flat. Here and everywhere.

================
Comment at: clang/test/CodeGenOpenCL/builtins-fp-atomics-unsupported-gfx908.cl:7
+  double *rtn;
+  *rtn = __builtin_amdgcn_global_atomic_fadd_f64(addr, x); // expected-error{{'__builtin_amdgcn_global_atomic_fadd_f64' needs target feature gfx90a-insts}}
+}
----------------
Need to check all other buintins too.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D106909/new/

https://reviews.llvm.org/D106909