[llvm] [AMDGPU] narrow only on store to pow of 2 mem location (PR #150093)
Tiger Ding via llvm-commits
llvm-commits at lists.llvm.org
Fri Jul 25 06:49:26 PDT 2025
zerogtiger wrote:
(Credit: @ro-i)
An `i65` can be produced by the Clang front end through the `_BitInt` feature. Such exact-width integer types are commonly used in FPGA work and other fields.
Consider the following C source code:
```C
// geni65.c
void foo(unsigned _BitInt(65) a, _BitInt(65) *ptr) {
    *ptr = a;
}
```
With Ubuntu clang 18.1.3, the compile command
```bash
clang --target=amdgcn -mcpu=gfx90a -std=c23 -S geni65.c -emit-llvm -o -
```
produces something along the lines of
```llvm
define dso_local void @_Z3fooDU65_PDB65_(i65 %0, ptr noundef %1) #0 {
  %3 = alloca i65, align 8, addrspace(5)
  %4 = alloca ptr, align 8, addrspace(5)
  %5 = addrspacecast ptr addrspace(5) %3 to ptr
  %6 = addrspacecast ptr addrspace(5) %4 to ptr
  store i65 %0, ptr %5, align 8
  store ptr %1, ptr %6, align 8
  %7 = load i65, ptr %5, align 8
  %8 = load ptr, ptr %6, align 8
  store i65 %7, ptr %8, align 8
  ret void
}
```
Source: https://blog.tal.bi/posts/c23-bitint/
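To make the "pow of 2" wording in the PR title concrete, here is a minimal sketch of the arithmetic involved. It assumes the relevant quantity is the number of whole bytes a value occupies when stored; that is a reading of the title, not something taken from the patch. The helper names `store_size_bytes` and `is_pow2` are hypothetical and do not come from LLVM.

```C
#include <stdbool.h>

/* Hypothetical helper: bytes needed to store an integer of `bits` bits,
   rounded up to a whole number of bytes. */
static unsigned store_size_bytes(unsigned bits) {
    return (bits + 7u) / 8u;
}

/* Hypothetical helper: true when n is a nonzero power of two. */
static bool is_pow2(unsigned n) {
    return n != 0u && (n & (n - 1u)) == 0u;
}
```

Under that assumption, `store_size_bytes(65)` is 9, which is not a power of two, while `store_size_bytes(64)` is 8, which is. So the `store i65` in the IR above targets a memory size that is not a power of two.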
https://github.com/llvm/llvm-project/pull/150093