[llvm] [NVPTX] Do not run the NVVMReflect pass as part of the normal pipeline (PR #121834)

Mon Jan 6 13:25:24 PST 2025

jhuber6 wrote:

> The problem is that libdevice depends on this patch and it does carry a fair amount of code that will no longer benefit from removal of unused conditional branches. The way libdevice is used in CUDA, the intent was to process conditional bitcode early. If OpenMP wants to do it differently, I would prefer to make it a special case, and keep the early reflect pass for CUDA.

I don't think this will make a considerable difference, since the reflect checks usually guard very shallow code paths, and we still get full optimizations when the backend runs. If you think this is a major issue, I could acquiesce to making the non-backend version skip lowering if `SmVersion` is not set, but I think the approach in this patch is cleaner.

```
$ clang foo.c --target=nvptx64-nvidia-cuda -flto -c -O2 # Used to run here
$ clang foo.bc --target=nvptx64-nvidia-cuda -O2 # Now only runs here
```
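To illustrate what I mean by a shallow guarded path, here is a rough host-side sketch. `reflect_stub` and `pick_block_size` are hypothetical names made up for this example. The stub only stands in for `__nvvm_reflect`, and the 700 assumes an sm_70 target (the pass computes `__CUDA_ARCH` from `SmVersion`). It is not taken from libdevice.

```c
#include <string.h>

/* Hypothetical host-side stand-in for __nvvm_reflect. On NVPTX the
   NVVMReflect pass replaces each such call with a constant; this stub
   assumes an sm_70 target, where "__CUDA_ARCH" would become 700. */
int reflect_stub(const char *key) {
    if (strcmp(key, "__CUDA_ARCH") == 0)
        return 700;
    return 0; /* this stub returns 0 for any other key */
}

/* A shallow guarded path: one compare and one branch. Once the call
   is a constant, ordinary optimizations can fold the branch away. */
int pick_block_size(void) {
    if (reflect_stub("__CUDA_ARCH") >= 700)
        return 256;
    return 128;
}
```

With a guard this small, deferring the fold to the backend run should leave little unoptimized code behind in the meantime.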

https://github.com/llvm/llvm-project/pull/121834