[libc-commits] [PATCH] D158320: [libc] Initial support for microbenchmarking GPU code

Jon Chesterfield via Phabricator via libc-commits libc-commits at lists.llvm.org
Fri Aug 18 15:20:25 PDT 2023


JonChesterfield added a comment.

You want memory fences to keep the operations inside the profiled region; the asm won't do that unless it has a memory clobber. Inline asm is likely to mess up codegen too.
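
To illustrate the point about the clobber, here is a minimal host-side sketch, not the code from the patch. It assumes GCC or Clang (GNU-style inline asm), and the names `workload`, `compiler_barrier` and `time_workload` are hypothetical. An empty `asm volatile` with a `"memory"` clobber tells the compiler that the statement may read or write arbitrary memory, so loads and stores are not moved across it; without the clobber, the asm does not constrain ordinary memory accesses. The fence shown is the standard `std::atomic_thread_fence`; what the GPU targets need in practice may differ.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <cstdint>

// Hypothetical workload standing in for the function being profiled.
// Kept out of line so the call itself is what gets timed.
[[gnu::noinline]] static std::uint64_t workload(std::uint64_t x) {
  return x * 2654435761ull + 1;
}

// Compiler-only barrier: the "memory" clobber stops the compiler from
// moving memory accesses across this statement. It emits no instructions.
static inline void compiler_barrier() { asm volatile("" ::: "memory"); }

// Hypothetical helper: times one call to workload() and returns the elapsed
// nanoseconds. The result is stored through a pointer so the store to
// memory sits between the two barriers.
static std::uint64_t time_workload(std::uint64_t input, std::uint64_t *result) {
  assert(result != nullptr);

  auto start = std::chrono::steady_clock::now();
  compiler_barrier();
  std::atomic_thread_fence(std::memory_order_seq_cst);

  *result = workload(input);

  std::atomic_thread_fence(std::memory_order_seq_cst);
  compiler_barrier();
  auto stop = std::chrono::steady_clock::now();

  return static_cast<std::uint64_t>(
      std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start)
          .count());
}
```

Calling `time_workload(3, &r)` stores the workload's result in `r` and returns the elapsed time. The measured value on a host says nothing about GPU latency; the sketch only shows where the clobber and fences sit relative to the timed region.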



================
Comment at: libc/utils/gpu/timing/amdgpu/timing.h:36
+
+// Stimulate a simple function and obtain its latency in clock cycles on the
+// system. This function cannot be inlined or else it will disturb the very
----------------
Simulate? Delicate?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D158320/new/

https://reviews.llvm.org/D158320


