[cfe-dev] Struct passing in CUDA (targeting NVPTX)
Jeroen Ketema
j.ketema at imperial.ac.uk
Wed Oct 8 05:52:43 PDT 2014
Hello,
When I use clang to compile a CUDA kernel for the NVPTX target, every struct argument of a CUDA function is lowered to separate arguments of the resulting LLVM function, one per field of the struct. For example,
struct s {
  int a;
  int b;
};
__attribute__((global)) void foo(struct s arg) {
}
compiled with:
clang -emit-llvm -c -target nvptx-- -x cuda -Xclang -fcuda-is-device
results in:
%struct.s = type { i32, i32 }
; Function Attrs: nounwind
define void @_Z3foo1s(i32 %arg.coerce0, i32 %arg.coerce1) #0 {
entry:
  %arg = alloca %struct.s, align 4
  %0 = getelementptr %struct.s* %arg, i32 0, i32 0
  store i32 %arg.coerce0, i32* %0
  %1 = getelementptr %struct.s* %arg, i32 0, i32 1
  store i32 %arg.coerce1, i32* %1
  ret void
}
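In source-level terms, the lowered function behaves roughly like the C sketch below. This is only an illustration of the IR above, not something clang emits: the name foo_expanded is hypothetical, and the struct is returned solely so that the rebuilt value can be inspected (the kernel itself returns void).

```c
struct s {
  int a;
  int b;
};

/* Hypothetical source-level model of the lowered signature: one scalar
   parameter per struct field, mirroring %arg.coerce0 and %arg.coerce1. */
struct s foo_expanded(int coerce0, int coerce1) {
  struct s arg;      /* corresponds to the alloca of %struct.s */
  arg.a = coerce0;   /* store into field 0 */
  arg.b = coerce1;   /* store into field 1 */
  return arg;        /* illustration only; the real kernel returns void */
}
```

Calling foo_expanded(1, 2) rebuilds a struct with a == 1 and b == 2, which is the reassembly that the two stores in the IR perform.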
Without going into details, this code is suboptimal for a static analysis I’m developing, which is meant to operate on the LLVM bitcode. Instead, I would like to obtain:
%struct.s = type { i32, i32 }
; Function Attrs: nounwind
define void @_Z3foo1s(%struct.s %s) #0 {
entry:
  ret void
}
Is this possible, perhaps by changing clang?
From what I have read about struct handling in LLVM, the expansion seems to be a property of the NVPTX ABI. However, looking at the NVPTX parts of clang/lib/CodeGen/TargetInfo.cpp and clang/lib/Basic/Targets.cpp, I don’t see anything that obviously causes multiple arguments to be generated.
I don’t mind breaking the ABI: I’m only interested in obtaining LLVM bitcode, not in code generation for the target.
Thanks,
Jeroen