[libc-commits] [libc] [libc] Polish GPU benchmarking (PR #153900)
Leandro Lacerda via libc-commits
libc-commits at lists.llvm.org
Fri Aug 15 16:43:24 PDT 2025
================
@@ -66,7 +64,7 @@ template <typename F, typename T>
uint64_t stop = gpu::processor_clock();
cpp::atomic_thread_fence(cpp::MemoryOrder::ACQ_REL);
asm("" ::"r"(stop));
- volatile T output = result;
+ volatile auto output = result;
----------------
leandrolcampos wrote:
I had to touch the NVPTX version (and not the AMDGPU one) simply because only the NVPTX `latency()` ends with this statement:
```h
volatile T output = result;
```
In the ctype benches, we instantiate `latency<int (*)(int), char>`. The function returns `int`, but the template parameter `T` (the input type) is `char`. That line therefore initializes a `volatile char` from an `int`, which produces:
```bash
[4/15] Building CXX object libc/benchmarks/gpu/src/ctype/CMakeFiles/libc.benchmarks.gpu.src.ctype.isalnum_benchmark.__build__.dir/isalnum_benchmark.cpp.o
In file included from /home/leandro/llvm-project/libc/benchmarks/gpu/src/ctype/isalnum_benchmark.cpp:1:
In file included from /home/leandro/llvm-project/libc/benchmarks/gpu/LibcGpuBenchmark.h:4:
In file included from /home/leandro/llvm-project/libc/benchmarks/gpu/timing/timing.h:17:
/home/leandro/llvm-project/libc/benchmarks/gpu/timing/nvptx/timing.h:67:23: warning: implicit conversion loses integer precision: 'int' to 'volatile char' [-Wimplicit-int-conversion]
67 | volatile T output = result;
| ~~~~~~ ^~~~~~
/home/leandro/llvm-project/libc/benchmarks/gpu/src/ctype/isalnum_benchmark.cpp:7:26: note: in instantiation of function template specialization '__llvm_libc_22_0_0_git::latency<int (*)(int), char>' requested here
7 | return LIBC_NAMESPACE::latency(LIBC_NAMESPACE::isalnum, x);
| ^
1 warning generated.
[6/15] Building CXX object libc/benchmarks/gpu/src/ctype/CMakeFiles/libc.benchmarks.gpu.src.ctype.isalpha_benchmark.__build__.dir/isalpha_benchmark.cpp.o
In file included from /home/leandro/llvm-project/libc/benchmarks/gpu/src/ctype/isalpha_benchmark.cpp:1:
In file included from /home/leandro/llvm-project/libc/benchmarks/gpu/LibcGpuBenchmark.h:4:
In file included from /home/leandro/llvm-project/libc/benchmarks/gpu/timing/timing.h:17:
/home/leandro/llvm-project/libc/benchmarks/gpu/timing/nvptx/timing.h:67:23: warning: implicit conversion loses integer precision: 'int' to 'volatile char' [-Wimplicit-int-conversion]
67 | volatile T output = result;
| ~~~~~~ ^~~~~~
/home/leandro/llvm-project/libc/benchmarks/gpu/src/ctype/isalpha_benchmark.cpp:7:26: note: in instantiation of function template specialization '__llvm_libc_22_0_0_git::latency<int (*)(int), char>' requested here
7 | return LIBC_NAMESPACE::latency(LIBC_NAMESPACE::isalpha, x);
| ^
1 warning generated.
```
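To illustrate why `volatile auto` avoids the narrowing, here is a minimal host-side sketch. It is not the real `latency()` from `timing.h`: `run_once` and `is_digit_like` are hypothetical names, and the clock reads and fences are omitted. It only shows that the sink's type follows the callee's return type rather than the input type `T`.

```cpp
#include <type_traits>

// Hypothetical stand-in for a ctype function: takes int, returns int.
int is_digit_like(int c) { return c >= '0' && c <= '9'; }

// Simplified, hypothetical analogue of latency<F, T>: T is the *input* type,
// while the result type comes from calling f.
template <typename F, typename T>
int run_once(F f, T t) {
  auto result = f(t);            // int when F is int (*)(int), even if T is char
  volatile auto output = result; // deduced as volatile int: no narrowing
  static_assert(
      std::is_same<decltype(output), volatile decltype(result)>::value,
      "the sink has the same underlying type as the result");
  return output;                 // read the volatile sink back
}
```

Calling `run_once(is_digit_like, '7')` deduces `T = char`, yet the sink is still a `volatile int`, so `-Wimplicit-int-conversion` has nothing to report. Writing `volatile T output = result;` in the same spot would reintroduce the `int` to `volatile char` conversion.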
On AMDGPU, there is no such statement in `latency()` (it seems to rely on the `v_or_b32` asm to use the value), and it also special-cases small types (char/bool) for the asm operand, so there is no narrowing assignment there.
I didn’t dive deeper here because `latency()` isn’t used by the math benchmarks; I only wanted to make the ctype benches warning-free.
https://github.com/llvm/llvm-project/pull/153900