[PATCH] D45061: [NVPTX, CUDA] Use custom feature detection to handle NVPTX target builtins.

Tue Apr 3 10:13:27 PDT 2018

tra added a comment.

In https://reviews.llvm.org/D45061#1053795, @echristo wrote:

> Let's talk about the rest of it more. I'm not sure I'm seeing the need here rather than the annotations that are already here. Can you elaborate more here on why we need an additional method when you've already got subtarget features for each of the ptx versions anyhow?

The patch intends to address two issues:

- Mismatch in constraints between LLVM and clang. E.g. hasPTX60() on the LLVM side means "ptx60 or newer", while "ptx60" in TARGET_BUILTIN on the clang side means "ptx60" *only*. It is possible to address this within the existing implementation by enumerating all PTX versions that are newer. That works OK for ptx60, as we only need to write "ptx60|ptx61". It gets more interesting for older GPUs, e.g. "ptx31+" would have to become "ptx31|ptx32|ptx40|ptx41|ptx42|ptx43|ptx50|ptx60|ptx61". A similar enumeration will need to happen for GPU versions, which brings us to the next point.
- NVIDIA keeps adding PTX versions and GPU variants. Recently they've changed the CUDA release frequency to ~1/quarter, and they tend to add minor variants of PTX and GPU versions fairly frequently. That is going to be an unnecessary maintenance headache: after every newly introduced variant I'll need to go and update all NVPTX builtins, even those that have nothing to do with the CUDA changes. Granted, it's not a showstopper, but it is an annoyance that is guaranteed to stay.

With this patch, TARGET_BUILTIN constraints become semantically identical to LLVM and we no longer need to chase every bump in PTX version or GPU variant.
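
To make the difference between the two semantics concrete, here is a minimal standalone sketch. It is not the code in the patch: the helper names hasAnyExactFeature and hasPTXAtLeast are made up for illustration, and the sketch assumes feature names of the form "ptxNN" and that only the selected PTX version appears in the enabled feature set.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>

// Hypothetical helper: exact-match semantics. The builtin is available only
// if one of the '|'-separated names in `required` is literally present in
// the set of enabled features.
bool hasAnyExactFeature(const std::string &required,
                        const std::set<std::string> &enabled) {
  std::size_t start = 0;
  while (start <= required.size()) {
    std::size_t end = required.find('|', start);
    if (end == std::string::npos)
      end = required.size();
    if (enabled.count(required.substr(start, end - start)))
      return true;
    start = end + 1;
  }
  return false;
}

// Hypothetical helper: "this version or newer" semantics. `required` is a
// single name such as "ptx60"; `enabledVersion` is the numeric PTX version
// selected for the compilation (e.g. 61).
bool hasPTXAtLeast(const std::string &required, unsigned enabledVersion) {
  assert(required.size() > 3 && required.compare(0, 3, "ptx") == 0 &&
         "expected a name of the form ptxNN");
  return enabledVersion >= std::stoul(required.substr(3));
}
```

With ptx61 selected, hasAnyExactFeature("ptx60", {"ptx61"}) is false unless the constraint is spelled "ptx60|ptx61", whereas hasPTXAtLeast("ptx60", 61) is true and stays true when a newer PTX version is added.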

https://reviews.llvm.org/D45061