[libc-commits] [PATCH] D158320: [libc] Initial support for microbenchmarking GPU code
Jon Chesterfield via Phabricator via libc-commits
libc-commits at lists.llvm.org
Fri Aug 18 15:20:25 PDT 2023
JonChesterfield added a comment.
You want memory fences to keep the operations inside the profiled region; the asm won't do that unless it has a memory clobber. Inline asm is also likely to mess up codegen.
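
A rough host-side sketch of what I mean, not the patch's code: `time_region_ns` is a made-up name, and I'm using `std::chrono::steady_clock` as a stand-in for the GPU clock read. The fences on either side of the call are what stop the compiler from moving memory operations out of the timed region. The GPU version would want the target's own fence and clock builtins instead.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <cstdint>

// Hypothetical helper (not part of the patch): times a callable and fences
// both sides of it so memory operations are not moved across the clock reads.
template <typename F> std::uint64_t time_region_ns(F &&f) {
  // Assumption: steady_clock stands in for the device cycle counter.
  auto start = std::chrono::steady_clock::now();
  // Keep the profiled memory operations from moving above the first read.
  std::atomic_thread_fence(std::memory_order_seq_cst);

  f();

  // Keep the profiled memory operations from moving below the second read.
  std::atomic_thread_fence(std::memory_order_seq_cst);
  auto stop = std::chrono::steady_clock::now();

  // steady_clock is monotonic, so the interval cannot be negative.
  assert(stop >= start);
  return static_cast<std::uint64_t>(
      std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start)
          .count());
}
```

Call it as `time_region_ns([&] { /* work that touches memory */ })`. Note the fences only order memory operations; pure register arithmetic with an unused result can still be optimized away or moved, so the work needs an observable side effect.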
================
Comment at: libc/utils/gpu/timing/amdgpu/timing.h:36
+
+// Stimulate a simple function and obtain its latency in clock cycles on the
+// system. This function cannot be inlined or else it will disturb the very
----------------
Simulate? Delicate?
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D158320/new/
https://reviews.llvm.org/D158320