[libc-commits] [PATCH] D158320: [libc] Initial support for microbenchmarking GPU code
Matt Arsenault via Phabricator via libc-commits
libc-commits at lists.llvm.org
Fri Aug 18 15:30:23 PDT 2023
arsenm added inline comments.
================
Comment at: libc/utils/gpu/timing/amdgpu/timing.h:41
+[[gnu::noinline]] static LIBC_INLINE uint64_t latency(F f, T t) {
+ // We need to store the input somewhere to guarntee that the compiler will not
+ // constant propagate it and remove the profiling region.
----------------
Typo: "guarntee" should be "guarantee".
================
Comment at: libc/utils/gpu/timing/amdgpu/timing.h:45
+ float arg = storage;
+ asm volatile("" ::"r"(arg));
+ // The AMDGPU architecture needs to wait on pending results.
----------------
Don't use the "r" constraint.
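
A minimal sketch of one way to avoid "r", assuming the goal is to name a concrete register class instead. On AMDGPU, LLVM's inline asm accepts "v" for a VGPR; whether that is the constraint intended here is an assumption, since the comment does not name a replacement. The helper name `opaque` and the host fallback branch are hypothetical and not part of the patch.

```cpp
// Hypothetical helper, not taken from the patch: returns `value` unchanged
// while hiding it from the optimizer so it cannot be constant propagated.
static inline float opaque(float value) {
#if defined(__AMDGPU__)
  // Assumption: "v" requests a vector register (VGPR) on AMDGPU, which is
  // where a per-lane float would normally live.
  asm volatile("" : "+v"(value));
#else
  // Host fallback so the sketch also builds off-GPU: force the value
  // through memory rather than naming a register class.
  asm volatile("" : "+m"(value) : : "memory");
#endif
  return value;
}
```

The empty asm statement emits no instructions; the read-write operand only tells the compiler that the value may have changed, so calling `opaque(storage)` before the timed region keeps the argument from being folded away.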
================
Comment at: libc/utils/gpu/timing/amdgpu/timing.h:50
+ uint64_t start = gpu::processor_clock();
+ __builtin_amdgcn_s_waitcnt(0);
+
----------------
The post wait-for-result should be handled for you.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D158320/new/
https://reviews.llvm.org/D158320