[libc-commits] [PATCH] D158320: [libc] Initial support for microbenchmarking GPU code

Matt Arsenault via Phabricator via libc-commits libc-commits at lists.llvm.org
Fri Aug 18 15:30:23 PDT 2023


arsenm added inline comments.


================
Comment at: libc/utils/gpu/timing/amdgpu/timing.h:41
+[[gnu::noinline]] static LIBC_INLINE uint64_t latency(F f, T t) {
+  // We need to store the input somewhere to guarntee that the compiler will not
+  // constant propagate it and remove the profiling region.
----------------
Typo: "guarntee" should be "guarantee".


================
Comment at: libc/utils/gpu/timing/amdgpu/timing.h:45
+  float arg = storage;
+  asm volatile("" ::"r"(arg));
+  // The AMDGPU architecture needs to wait on pending results.
----------------
Don't use the "r" constraint.
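
A minimal sketch of one way to follow this comment, assuming the goal is only to hide the value from the optimizer. The comment does not say which constraint to use instead, so the choice of "v" (a VGPR, as listed for AMDGPU in the LLVM inline assembly documentation) is an assumption. The helper name `opaque` is hypothetical and is not part of the patch. The "m" fallback exists only so the sketch builds on non-AMDGPU targets.

```cpp
// Hypothetical helper, not part of D158320: returns `value` unchanged but
// makes it opaque to the optimizer, so code that consumes it cannot be
// constant-folded away.
template <typename T> static inline T opaque(T value) {
#if defined(__AMDGPU__)
  // Assumption: name a vector register class explicitly instead of the
  // generic "r" constraint.
  asm volatile("" : "+v"(value));
#else
  // Fallback for other targets: route the value through memory, which works
  // for any type without naming a register class.
  asm volatile("" : "+m"(value));
#endif
  return value;
}
```

With something like this, `float arg = opaque(t);` would stand in for the separate store and empty asm statement in the hunk above, under the same assumptions.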


================
Comment at: libc/utils/gpu/timing/amdgpu/timing.h:50
+  uint64_t start = gpu::processor_clock();
+  __builtin_amdgcn_s_waitcnt(0);
+
----------------
The wait for the result afterwards should already be handled for you.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D158320/new/

https://reviews.llvm.org/D158320


