[llvm] [NVPTX] Improve NVVMReflect Efficiency (PR #134416)
Yonah Goldberg via llvm-commits
llvm-commits at lists.llvm.org
Wed Apr 9 15:10:49 PDT 2025
YonahGoldberg wrote:
@Artem-B I'm currently failing a Clang test because I switched NVVMReflect from a function pass to a module pass. The test is `clang/test/CodeGen/builtins-nvptx.c`:
```
__device__ bool reflect() {
  // CHECK: call i32 @llvm.nvvm.reflect(ptr {{.*}})
  unsigned x = __nvvm_reflect("__CUDA_ARCH");
  return x >= 700;
}
```
The test now fails because the NVVMReflect pass runs and removes the call. The difference that causes it to run is in `NVPTXTargetMachine.cpp`:
```
PB.registerPipelineStartEPCallback(
    [this](ModulePassManager &PM, OptimizationLevel Level) {
      // We do not want to fold out calls to nvvm.reflect early if the user
      // has not provided a target architecture just yet.
      if (Subtarget.hasTargetName())
        PM.addPass(NVVMReflectPass(Subtarget.getSmVersion()));
      FunctionPassManager FPM;
      // Note: NVVMIntrRangePass was causing numerical discrepancies at one
      // point, if issues crop up, consider disabling.
      FPM.addPass(NVVMIntrRangePass());
      if (EarlyByValArgsCopy)
        FPM.addPass(NVPTXCopyByValArgsPass());
      PM.addPass(createModuleToFunctionPassAdaptor(std::move(FPM)));
    });
```
I changed this in my PR so that the reflect pass is added to the module pass manager instead of the function pass manager. For some reason, the function passes added in this callback are NOT run in the lit test, but the module pass is.
For reference, the clang cmd run is: `clang -cc1 -internal-isystem /mnt/data/ygoldberg/mainline/llvm-project/build/lib/clang/21/include -nostdsysteminc -ffp-contract=off -triple nvptx64-unknown-unknown -target-cpu sm_70 -target-feature +ptx63 -fcuda-is-device -emit-llvm -o - -x cuda /mnt/data/ygoldberg/mainline/llvm-project/clang/test/CodeGen/builtins-nvptx.c`
Do you know if this is an issue or why this is happening?
https://github.com/llvm/llvm-project/pull/134416