[llvm] [NVPTX] Only run LowerUnreachable when necessary (PR #109868)
Tim Besard via llvm-commits
llvm-commits at lists.llvm.org
Wed Sep 25 01:13:06 PDT 2024
maleadt wrote:
> The way I read it, it says that 11.4 and older ptxas with older GPUs may have more bugs that are not fixed by this patch
Yeah, that was what I meant.
Although we only know about issues on Pascal and earlier because `bar.sync` is not allowed to be executed divergently there, making sure that `ptxas` has an accurate view of the CFG as intended by LLVM seems important on later hardware generations too. Citing from the relevant NVIDIA bug report (4078847):
> If LLVM creates unstructured control flows then it makes many downstream compiler optimizations less effective.
We could stop lowering `unreachable` to `exit; trap;` starting with the `ptxas` version that is known to model control flow accurately for a bare `trap` instruction. It did not do so when I proposed this change, but I was told that this would be fixed in a future version of the compiler. I'm not sure whether or how we can guard on that, though, since the PTX generated by the NVPTX back end may be handed to any version of the back-end compiler (e.g. when compiling to PTX and using the driver APIs, as opposed to having `clang` drive the `ptxas` invocation).
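To make the guarding problem concrete, here is a rough sketch of what such a check might look like. Everything in it is hypothetical: `PtxasVersion`, `needsUnreachableLowering`, and the `FirstFixed` parameter are illustrative names rather than existing LLVM APIs, and the first fixed `ptxas` release is left as a parameter because I don't know which one it is.

```cpp
#include <optional>
#include <tuple>

// Hypothetical (major, minor) ptxas version; not an LLVM type.
struct PtxasVersion {
  int Major;
  int Minor;
};

inline bool operator<(const PtxasVersion &A, const PtxasVersion &B) {
  return std::tie(A.Major, A.Minor) < std::tie(B.Major, B.Minor);
}

// Returns true when `unreachable` should still get the conservative lowering.
//
// Consumer   - the ptxas version that will compile the PTX, or std::nullopt
//              when it is unknown (e.g. the PTX is handed to the driver APIs).
// FirstFixed - placeholder for the first ptxas release that models control
//              flow accurately for a bare `trap`; an assumption, not a known
//              version number.
inline bool needsUnreachableLowering(std::optional<PtxasVersion> Consumer,
                                     PtxasVersion FirstFixed) {
  if (!Consumer)
    return true; // Unknown consumer: we have to assume the old behavior.
  return *Consumer < FirstFixed;
}
```

The `std::nullopt` case is the crux: whenever LLVM only emits PTX and does not know which compiler will consume it, a check like this has to fall back to the conservative lowering, so the guard would only help when `clang` itself drives `ptxas`.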
https://github.com/llvm/llvm-project/pull/109868