[clang] [clang][SPIR-V] Add support for AMDGCN flavoured SPIRV (PR #89796)
via cfe-commits
cfe-commits at lists.llvm.org
Tue Apr 23 10:12:00 PDT 2024
llvmbot wrote:
<!--LLVM PR SUMMARY COMMENT-->
@llvm/pr-subscribers-backend-amdgpu
Author: Alex Voicu (AlexVlx)
<details>
<summary>Changes</summary>
This change seeks to add support for vendor flavoured SPIRV - more specifically, AMDGCN flavoured SPIRV. The aim is to generate SPIRV that carries some extra bits of information that are only usable by AMDGCN targets, forfeiting absolute genericity to obtain greater expressiveness for target features:
- AMDGCN inline ASM is allowed/supported, under the assumption that the [SPV_INTEL_inline_assembly](https://github.com/intel/llvm/blob/sycl/sycl/doc/design/spirv-extensions/SPV_INTEL_inline_assembly.asciidoc) extension is enabled/used
- AMDGCN target specific builtins are allowed/supported, under the assumption that e.g. the `--spirv-allow-unknown-intrinsics` option is enabled when using the downstream translator
- the featureset matches the union of AMDGCN targets' features
- the datalayout string is overspecified to affix both the program address space and the alloca address space, the latter under the assumption that the [SPV_INTEL_function_pointers](https://github.com/intel/llvm/blob/sycl/sycl/doc/design/spirv-extensions/SPV_INTEL_function_pointers.asciidoc) extension is enabled/used, case in which the extant SPIRV datalayout string would lead to pointers to function pointing to the private address space, which would be wrong.
Existing AMDGCN tests are extended to cover this new target. It is currently dormant / will require some additional changes, but I thought I'd rather put it up for review to get feedback as early as possible. I will note that an alternative option is to place this under AMDGPU, but that seems slightly less natural, since this is still SPIRV, albeit relaxed in terms of preconditions & constrained in terms of postconditions, and only guaranteed to be usable on AMDGCN targets (it is still possible to obtain pristine portable SPIRV through usage of the flavoured target, though).
---
Patch is 73.55 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/89796.diff
22 Files Affected:
- (modified) clang/lib/Basic/Targets.cpp (+5-1)
- (modified) clang/lib/Basic/Targets/SPIR.cpp (+288)
- (modified) clang/lib/Basic/Targets/SPIR.h (+51)
- (modified) clang/lib/CodeGen/CGBuiltin.cpp (+7)
- (modified) clang/test/CodeGen/target-data.c (+4)
- (added) clang/test/CodeGenCUDA/builtins-spirv-amdgcn.cu (+294)
- (added) clang/test/CodeGenCUDA/builtins-unsafe-atomics-spirv-amdgcn-gfx90a.cu (+31)
- (modified) clang/test/CodeGenCUDA/long-double.cu (+4)
- (added) clang/test/CodeGenCUDA/spirv-amdgcn-bf16.cu (+129)
- (added) clang/test/CodeGenCXX/spirv-amdgcn-float16.cpp (+38)
- (added) clang/test/CodeGenHIP/spirv-amdgcn-ballot.cpp (+27)
- (added) clang/test/CodeGenHIP/spirv-amdgcn-dpp-const-fold.hip (+46)
- (added) clang/test/CodeGenHIP/spirv-amdgcn-half.hip (+15)
- (modified) clang/test/Preprocessor/predefined-macros-no-warnings.c (+1)
- (modified) clang/test/Preprocessor/predefined-macros.c (+10)
- (added) clang/test/Sema/builtin-spirv-amdgcn-atomic-inc-dec-failure.cpp (+25)
- (added) clang/test/Sema/inline-asm-validate-spirv-amdgcn.cl (+111)
- (modified) clang/test/SemaCUDA/allow-int128.cu (+3)
- (modified) clang/test/SemaCUDA/amdgpu-f128.cu (+1)
- (modified) clang/test/SemaCUDA/float16.cu (+1)
- (modified) clang/test/SemaCUDA/fp16-arg-return.cu (+1)
- (added) clang/test/SemaCUDA/spirv-amdgcn-atomic-ops.cu (+86)
``````````diff
diff --git a/clang/lib/Basic/Targets.cpp b/clang/lib/Basic/Targets.cpp
index e3283510c6aac7..04a13e3385d1f6 100644
--- a/clang/lib/Basic/Targets.cpp
+++ b/clang/lib/Basic/Targets.cpp
@@ -673,8 +673,12 @@ std::unique_ptr<TargetInfo> AllocateTarget(const llvm::Triple &Triple,
}
case llvm::Triple::spirv64: {
if (os != llvm::Triple::UnknownOS ||
- Triple.getEnvironment() != llvm::Triple::UnknownEnvironment)
+ Triple.getEnvironment() != llvm::Triple::UnknownEnvironment) {
+ if (os == llvm::Triple::OSType::AMDHSA)
+ return std::make_unique<SPIRV64AMDGCNTargetInfo>(Triple, Opts);
+
return nullptr;
+ }
return std::make_unique<SPIRV64TargetInfo>(Triple, Opts);
}
case llvm::Triple::wasm32:
diff --git a/clang/lib/Basic/Targets/SPIR.cpp b/clang/lib/Basic/Targets/SPIR.cpp
index dc920177d3a910..d7d232ac9484f8 100644
--- a/clang/lib/Basic/Targets/SPIR.cpp
+++ b/clang/lib/Basic/Targets/SPIR.cpp
@@ -12,6 +12,8 @@
#include "SPIR.h"
#include "Targets.h"
+#include "clang/Basic/Builtins.h"
+#include "clang/Basic/TargetBuiltins.h"
using namespace clang;
using namespace clang::targets;
@@ -54,3 +56,289 @@ void SPIRV64TargetInfo::getTargetDefines(const LangOptions &Opts,
BaseSPIRVTargetInfo::getTargetDefines(Opts, Builder);
DefineStd(Builder, "SPIRV64", Opts);
}
+
+static constexpr Builtin::Info BuiltinInfo[] = {
+#define BUILTIN(ID, TYPE, ATTRS) \
+ {#ID, TYPE, ATTRS, nullptr, HeaderDesc::NO_HEADER, ALL_LANGUAGES},
+#define TARGET_BUILTIN(ID, TYPE, ATTRS, FEATURE) \
+ {#ID, TYPE, ATTRS, FEATURE, HeaderDesc::NO_HEADER, ALL_LANGUAGES},
+#include "clang/Basic/BuiltinsAMDGPU.def"
+};
+
+namespace {
+const char *AMDGPUGCCRegNames[] = {
+ "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8",
+ "v9", "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17",
+ "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", "v26",
+ "v27", "v28", "v29", "v30", "v31", "v32", "v33", "v34", "v35",
+ "v36", "v37", "v38", "v39", "v40", "v41", "v42", "v43", "v44",
+ "v45", "v46", "v47", "v48", "v49", "v50", "v51", "v52", "v53",
+ "v54", "v55", "v56", "v57", "v58", "v59", "v60", "v61", "v62",
+ "v63", "v64", "v65", "v66", "v67", "v68", "v69", "v70", "v71",
+ "v72", "v73", "v74", "v75", "v76", "v77", "v78", "v79", "v80",
+ "v81", "v82", "v83", "v84", "v85", "v86", "v87", "v88", "v89",
+ "v90", "v91", "v92", "v93", "v94", "v95", "v96", "v97", "v98",
+ "v99", "v100", "v101", "v102", "v103", "v104", "v105", "v106", "v107",
+ "v108", "v109", "v110", "v111", "v112", "v113", "v114", "v115", "v116",
+ "v117", "v118", "v119", "v120", "v121", "v122", "v123", "v124", "v125",
+ "v126", "v127", "v128", "v129", "v130", "v131", "v132", "v133", "v134",
+ "v135", "v136", "v137", "v138", "v139", "v140", "v141", "v142", "v143",
+ "v144", "v145", "v146", "v147", "v148", "v149", "v150", "v151", "v152",
+ "v153", "v154", "v155", "v156", "v157", "v158", "v159", "v160", "v161",
+ "v162", "v163", "v164", "v165", "v166", "v167", "v168", "v169", "v170",
+ "v171", "v172", "v173", "v174", "v175", "v176", "v177", "v178", "v179",
+ "v180", "v181", "v182", "v183", "v184", "v185", "v186", "v187", "v188",
+ "v189", "v190", "v191", "v192", "v193", "v194", "v195", "v196", "v197",
+ "v198", "v199", "v200", "v201", "v202", "v203", "v204", "v205", "v206",
+ "v207", "v208", "v209", "v210", "v211", "v212", "v213", "v214", "v215",
+ "v216", "v217", "v218", "v219", "v220", "v221", "v222", "v223", "v224",
+ "v225", "v226", "v227", "v228", "v229", "v230", "v231", "v232", "v233",
+ "v234", "v235", "v236", "v237", "v238", "v239", "v240", "v241", "v242",
+ "v243", "v244", "v245", "v246", "v247", "v248", "v249", "v250", "v251",
+ "v252", "v253", "v254", "v255", "s0", "s1", "s2", "s3", "s4",
+ "s5", "s6", "s7", "s8", "s9", "s10", "s11", "s12", "s13",
+ "s14", "s15", "s16", "s17", "s18", "s19", "s20", "s21", "s22",
+ "s23", "s24", "s25", "s26", "s27", "s28", "s29", "s30", "s31",
+ "s32", "s33", "s34", "s35", "s36", "s37", "s38", "s39", "s40",
+ "s41", "s42", "s43", "s44", "s45", "s46", "s47", "s48", "s49",
+ "s50", "s51", "s52", "s53", "s54", "s55", "s56", "s57", "s58",
+ "s59", "s60", "s61", "s62", "s63", "s64", "s65", "s66", "s67",
+ "s68", "s69", "s70", "s71", "s72", "s73", "s74", "s75", "s76",
+ "s77", "s78", "s79", "s80", "s81", "s82", "s83", "s84", "s85",
+ "s86", "s87", "s88", "s89", "s90", "s91", "s92", "s93", "s94",
+ "s95", "s96", "s97", "s98", "s99", "s100", "s101", "s102", "s103",
+ "s104", "s105", "s106", "s107", "s108", "s109", "s110", "s111", "s112",
+ "s113", "s114", "s115", "s116", "s117", "s118", "s119", "s120", "s121",
+ "s122", "s123", "s124", "s125", "s126", "s127", "exec", "vcc", "scc",
+ "m0", "flat_scratch", "exec_lo", "exec_hi", "vcc_lo", "vcc_hi",
+ "flat_scratch_lo", "flat_scratch_hi",
+ "a0", "a1", "a2", "a3", "a4", "a5", "a6", "a7", "a8",
+ "a9", "a10", "a11", "a12", "a13", "a14", "a15", "a16", "a17",
+ "a18", "a19", "a20", "a21", "a22", "a23", "a24", "a25", "a26",
+ "a27", "a28", "a29", "a30", "a31", "a32", "a33", "a34", "a35",
+ "a36", "a37", "a38", "a39", "a40", "a41", "a42", "a43", "a44",
+ "a45", "a46", "a47", "a48", "a49", "a50", "a51", "a52", "a53",
+ "a54", "a55", "a56", "a57", "a58", "a59", "a60", "a61", "a62",
+ "a63", "a64", "a65", "a66", "a67", "a68", "a69", "a70", "a71",
+ "a72", "a73", "a74", "a75", "a76", "a77", "a78", "a79", "a80",
+ "a81", "a82", "a83", "a84", "a85", "a86", "a87", "a88", "a89",
+ "a90", "a91", "a92", "a93", "a94", "a95", "a96", "a97", "a98",
+ "a99", "a100", "a101", "a102", "a103", "a104", "a105", "a106", "a107",
+ "a108", "a109", "a110", "a111", "a112", "a113", "a114", "a115", "a116",
+ "a117", "a118", "a119", "a120", "a121", "a122", "a123", "a124", "a125",
+ "a126", "a127", "a128", "a129", "a130", "a131", "a132", "a133", "a134",
+ "a135", "a136", "a137", "a138", "a139", "a140", "a141", "a142", "a143",
+ "a144", "a145", "a146", "a147", "a148", "a149", "a150", "a151", "a152",
+ "a153", "a154", "a155", "a156", "a157", "a158", "a159", "a160", "a161",
+ "a162", "a163", "a164", "a165", "a166", "a167", "a168", "a169", "a170",
+ "a171", "a172", "a173", "a174", "a175", "a176", "a177", "a178", "a179",
+ "a180", "a181", "a182", "a183", "a184", "a185", "a186", "a187", "a188",
+ "a189", "a190", "a191", "a192", "a193", "a194", "a195", "a196", "a197",
+ "a198", "a199", "a200", "a201", "a202", "a203", "a204", "a205", "a206",
+ "a207", "a208", "a209", "a210", "a211", "a212", "a213", "a214", "a215",
+ "a216", "a217", "a218", "a219", "a220", "a221", "a222", "a223", "a224",
+ "a225", "a226", "a227", "a228", "a229", "a230", "a231", "a232", "a233",
+ "a234", "a235", "a236", "a237", "a238", "a239", "a240", "a241", "a242",
+ "a243", "a244", "a245", "a246", "a247", "a248", "a249", "a250", "a251",
+ "a252", "a253", "a254", "a255"
+};
+
+} // anonymous namespace
+
+ArrayRef<const char *> SPIRV64AMDGCNTargetInfo::getGCCRegNames() const {
+ return llvm::ArrayRef(AMDGPUGCCRegNames);
+}
+
+bool SPIRV64AMDGCNTargetInfo::initFeatureMap(
+ llvm::StringMap<bool> &Features, DiagnosticsEngine &Diags, StringRef,
+ const std::vector<std::string> &FeatureVec) const {
+ // This represents the union of all AMDGCN features.
+ Features["atomic-ds-pk-add-16-insts"] = true;
+ Features["atomic-flat-pk-add-16-insts"] = true;
+ Features["atomic-buffer-global-pk-add-f16-insts"] = true;
+ Features["atomic-global-pk-add-bf16-inst"] = true;
+ Features["atomic-fadd-rtn-insts"] = true;
+ Features["ci-insts"] = true;
+ Features["dot1-insts"] = true;
+ Features["dot2-insts"] = true;
+ Features["dot3-insts"] = true;
+ Features["dot4-insts"] = true;
+ Features["dot5-insts"] = true;
+ Features["dot7-insts"] = true;
+ Features["dot8-insts"] = true;
+ Features["dot9-insts"] = true;
+ Features["dot10-insts"] = true;
+ Features["dot11-insts"] = true;
+ Features["dl-insts"] = true;
+ Features["16-bit-insts"] = true;
+ Features["dpp"] = true;
+ Features["gfx8-insts"] = true;
+ Features["gfx9-insts"] = true;
+ Features["gfx90a-insts"] = true;
+ Features["gfx940-insts"] = true;
+ Features["gfx10-insts"] = true;
+ Features["gfx10-3-insts"] = true;
+ Features["gfx11-insts"] = true;
+ Features["gfx12-insts"] = true;
+ Features["image-insts"] = true;
+ Features["fp8-conversion-insts"] = true;
+ Features["s-memrealtime"] = true;
+ Features["s-memtime-inst"] = true;
+ Features["gws"] = true;
+ Features["fp8-insts"] = true;
+ Features["fp8-conversion-insts"] = true;
+ Features["atomic-ds-pk-add-16-insts"] = true;
+ Features["mai-insts"] = true;
+
+ return TargetInfo::initFeatureMap(Features, Diags, {}, FeatureVec);
+}
+
+bool SPIRV64AMDGCNTargetInfo::validateAsmConstraint(
+ const char *&Name, TargetInfo::ConstraintInfo &Info) const {
+ // This is a 1:1 copy of AMDGPUTargetInfo::validateAsmConstraint()
+ static const ::llvm::StringSet<> SpecialRegs({
+ "exec", "vcc", "flat_scratch", "m0", "scc", "tba", "tma",
+ "flat_scratch_lo", "flat_scratch_hi", "vcc_lo", "vcc_hi", "exec_lo",
+ "exec_hi", "tma_lo", "tma_hi", "tba_lo", "tba_hi",
+ });
+
+ switch (*Name) {
+ case 'I':
+ Info.setRequiresImmediate(-16, 64);
+ return true;
+ case 'J':
+ Info.setRequiresImmediate(-32768, 32767);
+ return true;
+ case 'A':
+ case 'B':
+ case 'C':
+ Info.setRequiresImmediate();
+ return true;
+ default:
+ break;
+ }
+
+ StringRef S(Name);
+
+ if (S == "DA" || S == "DB") {
+ Name++;
+ Info.setRequiresImmediate();
+ return true;
+ }
+
+ bool HasLeftParen = S.consume_front("{");
+ if (S.empty())
+ return false;
+ if (S.front() != 'v' && S.front() != 's' && S.front() != 'a') {
+ if (!HasLeftParen)
+ return false;
+ auto E = S.find('}');
+ if (!SpecialRegs.count(S.substr(0, E)))
+ return false;
+ S = S.drop_front(E + 1);
+ if (!S.empty())
+ return false;
+ // Found {S} where S is a special register.
+ Info.setAllowsRegister();
+ Name = S.data() - 1;
+ return true;
+ }
+ S = S.drop_front();
+ if (!HasLeftParen) {
+ if (!S.empty())
+ return false;
+ // Found s, v or a.
+ Info.setAllowsRegister();
+ Name = S.data() - 1;
+ return true;
+ }
+ bool HasLeftBracket = S.consume_front("[");
+ unsigned long long N;
+ if (S.empty() || consumeUnsignedInteger(S, 10, N))
+ return false;
+ if (S.consume_front(":")) {
+ if (!HasLeftBracket)
+ return false;
+ unsigned long long M;
+ if (consumeUnsignedInteger(S, 10, M) || N >= M)
+ return false;
+ }
+ if (HasLeftBracket) {
+ if (!S.consume_front("]"))
+ return false;
+ }
+ if (!S.consume_front("}"))
+ return false;
+ if (!S.empty())
+ return false;
+ // Found {vn}, {sn}, {an}, {v[n]}, {s[n]}, {a[n]}, {v[n:m]}, {s[n:m]}
+ // or {a[n:m]}.
+ Info.setAllowsRegister();
+ Name = S.data() - 1;
+ return true;
+}
+
+std::string
+SPIRV64AMDGCNTargetInfo::convertConstraint(const char *&Constraint) const {
+ // This is a 1:1 copy of AMDGPUTargetInfo::convertConstraint()
+ StringRef S(Constraint);
+ if (S == "DA" || S == "DB") {
+ return std::string("^") + std::string(Constraint++, 2);
+ }
+
+ const char *Begin = Constraint;
+ TargetInfo::ConstraintInfo Info("", "");
+ if (validateAsmConstraint(Constraint, Info))
+ return std::string(Begin).substr(0, Constraint - Begin + 1);
+
+ Constraint = Begin;
+ return std::string(1, *Constraint);
+}
+
+ArrayRef<Builtin::Info> SPIRV64AMDGCNTargetInfo::getTargetBuiltins() const {
+ return llvm::ArrayRef(BuiltinInfo,
+ clang::AMDGPU::LastTSBuiltin - Builtin::FirstTSBuiltin);
+}
+
+void SPIRV64AMDGCNTargetInfo::getTargetDefines(const LangOptions &Opts,
+ MacroBuilder &Builder) const {
+ BaseSPIRVTargetInfo::getTargetDefines(Opts, Builder);
+ DefineStd(Builder, "SPIRV64", Opts);
+
+ Builder.defineMacro("__AMD__");
+ Builder.defineMacro("__AMDGPU__");
+ Builder.defineMacro("__AMDGCN__");
+}
+
+void SPIRV64AMDGCNTargetInfo::setAuxTarget(const TargetInfo *Aux) {
+ // This is a 1:1 copy of AMDGPUTargetInfo::setAuxTarget()
+ assert(HalfFormat == Aux->HalfFormat);
+ assert(FloatFormat == Aux->FloatFormat);
+ assert(DoubleFormat == Aux->DoubleFormat);
+
+ // On x86_64 long double is 80-bit extended precision format, which is
+ // not supported by AMDGPU. 128-bit floating point format is also not
+ // supported by AMDGPU. Therefore keep its own format for these two types.
+ auto SaveLongDoubleFormat = LongDoubleFormat;
+ auto SaveFloat128Format = Float128Format;
+ auto SaveLongDoubleWidth = LongDoubleWidth;
+ auto SaveLongDoubleAlign = LongDoubleAlign;
+ copyAuxTarget(Aux);
+ LongDoubleFormat = SaveLongDoubleFormat;
+ Float128Format = SaveFloat128Format;
+ LongDoubleWidth = SaveLongDoubleWidth;
+ LongDoubleAlign = SaveLongDoubleAlign;
+ // For certain builtin types support on the host target, claim they are
+ // supported to pass the compilation of the host code during the device-side
+ // compilation.
+ // FIXME: As the side effect, we also accept `__float128` uses in the device
+ // code. To reject these builtin types supported in the host target but not in
+ // the device target, one approach would support `device_builtin` attribute
+ // so that we could tell the device builtin types from the host ones. This
+ // also solves the different representations of the same builtin type, such
+ // as `size_t` in the MSVC environment.
+ if (Aux->hasFloat128Type()) {
+ HasFloat128 = true;
+ Float128Format = DoubleFormat;
+ }
+}
diff --git a/clang/lib/Basic/Targets/SPIR.h b/clang/lib/Basic/Targets/SPIR.h
index 44265445ff004b..6b605979c9ab1d 100644
--- a/clang/lib/Basic/Targets/SPIR.h
+++ b/clang/lib/Basic/Targets/SPIR.h
@@ -364,6 +364,57 @@ class LLVM_LIBRARY_VISIBILITY SPIRV64TargetInfo : public BaseSPIRVTargetInfo {
MacroBuilder &Builder) const override;
};
+class LLVM_LIBRARY_VISIBILITY SPIRV64AMDGCNTargetInfo
+ : public BaseSPIRVTargetInfo {
+public:
+ SPIRV64AMDGCNTargetInfo(const llvm::Triple &Triple, const TargetOptions &Opts)
+ : BaseSPIRVTargetInfo(Triple, Opts) {
+ assert(Triple.getArch() == llvm::Triple::spirv64 &&
+ "Invalid architecture for 64-bit AMDGCN SPIR-V.");
+ assert(Triple.getVendor() == llvm::Triple::VendorType::AMD &&
+ "64-bit AMDGCN SPIR-V target must use AMD vendor");
+ assert(getTriple().getOS() == llvm::Triple::OSType::AMDHSA &&
+ "64-bit AMDGCN SPIR-V target must use AMDHSA OS");
+ assert(getTriple().getEnvironment() == llvm::Triple::UnknownEnvironment &&
+ "64-bit SPIR-V target must use unknown environment type");
+ PointerWidth = PointerAlign = 64;
+ SizeType = TargetInfo::UnsignedLong;
+ PtrDiffType = IntPtrType = TargetInfo::SignedLong;
+
+ resetDataLayout("e-i64:64-v16:16-v24:32-v32:32-v48:64-"
+ "v96:128-v192:256-v256:256-v512:512-v1024:1024-G1-P4-A0");
+
+ BFloat16Width = BFloat16Align = 16;
+ BFloat16Format = &llvm::APFloat::BFloat();
+
+ HasLegalHalfType = true;
+ HasFloat16 = true;
+ HalfArgsAndReturns = true;
+ }
+
+ bool hasBFloat16Type() const override { return true; }
+
+ ArrayRef<const char *> getGCCRegNames() const override;
+
+ bool initFeatureMap(llvm::StringMap<bool> &Features, DiagnosticsEngine &Diags,
+ StringRef,
+ const std::vector<std::string> &) const override;
+
+ bool validateAsmConstraint(const char *&Name,
+ TargetInfo::ConstraintInfo &Info) const override;
+
+ std::string convertConstraint(const char *&Constraint) const override;
+
+ ArrayRef<Builtin::Info> getTargetBuiltins() const override;
+
+ void getTargetDefines(const LangOptions &Opts,
+ MacroBuilder &Builder) const override;
+
+ void setAuxTarget(const TargetInfo *Aux) override;
+
+ bool hasInt128Type() const override { return TargetInfo::hasInt128Type(); }
+};
+
} // namespace targets
} // namespace clang
#endif // LLVM_CLANG_LIB_BASIC_TARGETS_SPIR_H
diff --git a/clang/lib/CodeGen/CGBuiltin.cpp b/clang/lib/CodeGen/CGBuiltin.cpp
index 7e5f2edfc732cc..db64b0a436a095 100644
--- a/clang/lib/CodeGen/CGBuiltin.cpp
+++ b/clang/lib/CodeGen/CGBuiltin.cpp
@@ -6083,6 +6083,9 @@ RValue CodeGenFunction::EmitBuiltinExpr(const GlobalDecl GD, unsigned BuiltinID,
StringRef Prefix =
llvm::Triple::getArchTypePrefix(getTarget().getTriple().getArch());
if (!Prefix.empty()) {
+ if (Prefix == "spv" &&
+ getTarget().getTriple().getOS() == llvm::Triple::OSType::AMDHSA)
+ Prefix = "amdgcn";
IntrinsicID = Intrinsic::getIntrinsicForClangBuiltin(Prefix.data(), Name);
// NOTE we don't need to perform a compatibility flag check here since the
// intrinsics are declared in Builtins*.def via LANGBUILTIN which filter the
@@ -6254,6 +6257,10 @@ static Value *EmitTargetArchBuiltinExpr(CodeGenFunction *CGF,
case llvm::Triple::riscv32:
case llvm::Triple::riscv64:
return CGF->EmitRISCVBuiltinExpr(BuiltinID, E, ReturnValue);
+ case llvm::Triple::spirv64:
+ if (CGF->getTarget().getTriple().getOS() != llvm::Triple::OSType::AMDHSA)
+ return nullptr;
+ return CGF->EmitAMDGPUBuiltinExpr(BuiltinID, E);
default:
return nullptr;
}
diff --git a/clang/test/CodeGen/target-data.c b/clang/test/CodeGen/target-data.c
index c184f314f68f80..1d40b8fe46063d 100644
--- a/clang/test/CodeGen/target-data.c
+++ b/clang/test/CodeGen/target-data.c
@@ -268,3 +268,7 @@
// RUN: %clang_cc1 -triple ve -o - -emit-llvm %s | \
// RUN: FileCheck %s -check-prefix=VE
// VE: target datalayout = "e-m:e-i64:64-n32:64-S128-v64:64:64-v128:64:64-v256:64:64-v512:64:64-v1024:64:64-v2048:64:64-v4096:64:64-v8192:64:64-v16384:64:64"
+
+// RUN: %clang_cc1 -triple spirv64-amd -o - -emit-llvm %s | \
+// RUN: FileCheck %s -check-prefix=SPIR64
+// AMDGPUSPIRV64: target datalayout = "e-i64:64-v16:16-v24:32-v32:32-v48:64-v96:128-v192:256-v256:256-v512:512-v1024:1024-G1-P4-A0"
diff --git a/clang/test/CodeGenCUDA/builtins-spirv-amdgcn.cu b/clang/test/CodeGenCUDA/builtins-spirv-amdgcn.cu
new file mode 100644
index 00000000000000..8dbb8c538ddc16
--- /dev/null
+++ b/clang/test/CodeGenCUDA/builtins-spirv-amdgcn.cu
@@ -0,0 +1,294 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py
+// RUN: %clang_cc1 -triple spirv64-amd-amdhsa -x hip \
+// RUN: -aux-triple x86_64-unknown-linux-gnu -fcuda-is-device -emit-llvm %s \
+// RUN: -o - | FileCheck %s
+
+// RUN: %clang_cc1 -triple spirv64-amd-amdhsa -x hip \
+// RUN: -aux-triple x86_64-pc-windows-msvc -fcuda-is-device -emit-llvm %s \
+// RUN: -o - | FileCheck %s
+
+#include "Inputs/cuda.h"
+
+// CHECK-LABEL: @_Z16use_dispatch_ptrPi(
+// CHECK-NEXT: entry:
+// CHECK-NEXT: [[OUT:%.*]] = alloca ptr addrspace(4), align 8
+// CHECK-NEXT: [[OUT_ADDR:%.*]] = alloca ptr addrspace(4), align 8
+// CHECK-NEXT: [[DISPATCH_PTR:%.*]] = alloca ptr addrspace(4), align 8
+// CHECK-NEXT: [[OUT_ASCAST:%.*]] = addrspacecast ptr [[OUT]] to ptr addrspace(4)
+// CHECK-NEXT: [[OUT_ADDR_ASCAST:%.*]] = addrspacecast ptr [[OUT_ADDR]] to ptr addrspace(4)
+// CHECK-NEXT: [[DISPATCH_PTR_ASCAST:%.*]] = addrspacecast ptr [[DISPATCH_PTR]] to ptr addrspace(4)
+// CHECK-NEXT: [[TMP0:%.*]] = addrspacecast ptr addrspace(1) [[OUT_COERCE:%.*]] to ptr addrspace(4)
+// CHECK-NEXT: store ptr addrspace(4) [[TMP0]], ptr addrspace(4) [[OUT_ASCAST]], align 8
+// CHECK-NEXT: [[OUT1:%.*]] = load ptr addrspace(4), ptr addrspace(4) [[OUT_ASCAST]], align 8
+// CHECK-NEXT: store ptr addrspace(4) [[OUT1]], ptr addrspace(4) [[OUT_ADDR_ASCAST]], align 8
+// CHECK-NEXT: [[TMP1:%.*]] = call align 4 dereferenceable(64) addrspace(4) ptr addrspace(4) @llvm.amdgcn.dispatch.ptr()
+// CHECK-NEXT: store ptr addrspace(4) [[TMP1]], ptr addrspace(4) [[DISPATCH_PTR_ASCAST]], align 8
+// CHECK-NEXT: [[TMP2:%.*]] = load ptr addrspace(4), ptr addrspace(4) [[DISPATCH_PTR_ASCAST]], align 8
+// CHECK-NEXT: [[TMP3:%.*]] = load i32, ptr addrspace(4) [[TMP2]], align 4
+// CHECK-NEXT: [[TMP4:%.*]] = load ptr addrspace(4), ptr addrspace(4) [[OUT_ADDR_ASCAST]], align 8
+// CHECK-NEXT: store i32 [[TMP3]], ptr addrspace(4) [[TMP4]], align 4
+// CHECK-NEXT: ret void
+//
+__global__ void use_dispatch_ptr(int* out) {
+ const int* dispatch_ptr = (co...
[truncated]
``````````
</details>
https://github.com/llvm/llvm-project/pull/89796
More information about the cfe-commits
mailing list