[PATCH] D47691: [NVPTX] Ignore target-cpu and -features for inling

Tue Jun 12 09:57:40 PDT 2018

tra added a comment.

In https://reviews.llvm.org/D47691#1123513, @hfinkel wrote:

> > Bottom line -- the situation is far from perfect, but IMO the patch does sensible thing if we're compiling IR with mixed target-cpu and target-features attributes using NVPTX.
>
> In summary, I think that's okay so long as there aren't intrinsics that depend on target features.

There are intrinsics that have *minimum* requirements for GPU variant (SM_XX) and PTX version.
So it is possible to generate instructions for a wrong GPU variant which will cause PTXAS to fail.
AFAICT, we have no good way to implement intended semantics of target-gpu and target-feature in NVPTX because it just does not allow mixing instructions from different GPU variants in the same TU. Well, you can have *older* SM/PTX entities in the IR, but all of them will be compiled for the most recent variant. IMO we can either ignore the attributes (and leave PTXAS deal with potentially invalid instructions) or error out during compilation (which LLVM does not handle particularly well, either).

Whether inline functions with mismatched attributes is actually orthogonal to that. It's the presence of the mismatched attributes that's the problem. Whether we inline those functions or not does not change the situation much. I.e. if mismatched function requires older SM/PTX, it will compile and work just fine, whether it's inlined or not. If the function requires newer SM/PTX, it will fail PTXAS compilation if not inlined and may or may not fail is it is inlined as it may end up not generating anything that would need new functionality. In the end inlining mismatched functions in NVPTX does not make the situation worse than it would be if inlining was not allowed.

The issue with handling target-cpu and target-features in NVPTX in principle does remain open. This patch should let things work the way they did before my clang patch ended up generating the attributes and exposed this issue.

Repository:
  rL LLVM

https://reviews.llvm.org/D47691