[llvm] [Offload][AMDGPU] accept generic target (PR #118919)

Sat Dec 7 11:13:35 PST 2024

================
@@ -190,8 +190,9 @@ Error asyncMemCopy(bool UseMultipleSdmaEngines, void *Dst, hsa_agent_t DstAgent,
 #endif
 }
 
-Expected<std::string> getTargetTripleAndFeatures(hsa_agent_t Agent) {
-  std::string Target;
+Expected<StringRef>
+getTargetTripleAndFeatures(hsa_agent_t Agent, SmallVector<StringRef> &Targets) {
+  StringRef SpecificTarget;
----------------
shiltian wrote:

> If you really hate to see it and make it gate landing the patch, I can remove it.

I don't hate it. I just don't see how this can help. It'd be helpful if you can stack the PR (via Graphite or something similar) that depends on this to show how your proposed solution works.

> Here, I'm assuming that specific ISA is a superset of generic ISAs. Let me know if that's not the case.

The essential issue here is, it is not OpenMP runtime's call to determine if an ISA is compatible with another. For example, on a gfx1030 GPU, if the device image has `gfx10-3-generic`, but HSA returns `gfx1030` only, OpenMP runtime should still reject this ISA instead of making the decision saying, oh I know they are compatible, so I'm gonna go ahead letting it in. This is because the HSA runtime has to support generic ISA as well. If it doesn't return a generic ISA, it means it doesn't support it. With this being said, OpenMP runtime should really just, get the ISAs from HSA, and call another function to determine if the ISA is compatible, that's all.

As for handling of XNACK or other ISA features, IIUC, they are encoded in the ISA of device image. Even if for generic ISA, it is still `gfx10-3-generic:+xnack`, which makes no difference from `gfx1030:+xnack`.

https://github.com/llvm/llvm-project/pull/118919