[clang] [llvm] [NVPTX] Change the alloca address space in NVPTXLowerAlloca (PR #154814)

Thu Oct 16 08:46:56 PDT 2025

================
@@ -1444,15 +1444,16 @@ void clang::emitBackendOutput(CompilerInstance &CI, CodeGenOptions &CGOpts,
 
   // Verify clang's TargetInfo DataLayout against the LLVM TargetMachine's
   // DataLayout.
-  if (AsmHelper.TM) {
-    std::string DLDesc = M->getDataLayout().getStringRepresentation();
-    if (DLDesc != TDesc) {
+  if (AsmHelper.TM)
+    if (!AsmHelper.TM->isCompatibleDataLayout(M->getDataLayout()) ||
+        !AsmHelper.TM->isCompatibleDataLayout(DataLayout(TDesc))) {
+      std::string DLDesc = M->getDataLayout().getStringRepresentation();
----------------
AlexMaclean wrote:

> Hypothetically, I'd expect to see cases where existing code will bail out when it sees non-default AS, and the code that should bail out, but does not, because it does not bother to check whether the AS is non-default. Some of those will be caught by assertions, but some will happen to work and will remain silent.

I doubt there will be many places where existing LLVM passes are mishandling allocas in a specific AS. The fact that AMDGPU is already using a specific AS for allocas makes me think that the support for this feature is already reasonably good. There might be some edge cases somewhere which need to be fixed, but overall think these hypothetical bugs should not be a major factor in considering which approach to choose. 

> Is it really worth doing/beneficial? What's the best case outcome we can expect from the patch?

I'd still lean towards switching to local allocas. This seems to me like it provides a more accurate representation of the machine in the LLVM IR. While the IR might be bigger when initially emitted from clang or unoptimized, once `InferAddressSpace` is run, it will be smaller with specific allocas since we'll no longer need to wrap every generic alloca in a cast to it's true address space. We should probably consider moving `InferAddressSpace` earlier to eliminate the size issue and this would have additional benefits such as improving subsequent alias-analysis compile-time. 

In general, this seems like this change allows us to eliminate a lot of hacks and work-arounds from the backend, in some cases improving the quality of emitted IR (I agree these quality improvements seem very marginal but I think it's still a win and there may be cases where they do make a difference). There are definitely some switching costs, and I'm not sure how best to handle the transition, but the final destination seems preferable even if it's not a game-changer.  

https://github.com/llvm/llvm-project/pull/154814