[clang] [llvm] [NVPTX] Change the alloca address space in NVPTXLowerAlloca (PR #154814)

Mon Oct 27 12:20:52 PDT 2025

================
@@ -1444,15 +1444,16 @@ void clang::emitBackendOutput(CompilerInstance &CI, CodeGenOptions &CGOpts,
 
   // Verify clang's TargetInfo DataLayout against the LLVM TargetMachine's
   // DataLayout.
-  if (AsmHelper.TM) {
-    std::string DLDesc = M->getDataLayout().getStringRepresentation();
-    if (DLDesc != TDesc) {
+  if (AsmHelper.TM)
+    if (!AsmHelper.TM->isCompatibleDataLayout(M->getDataLayout()) ||
+        !AsmHelper.TM->isCompatibleDataLayout(DataLayout(TDesc))) {
+      std::string DLDesc = M->getDataLayout().getStringRepresentation();
----------------
Artem-B wrote:

> AMDGPU is already using a specific AS for allocas makes me think that the support for this feature is already reasonably good. There might be some edge cases somewhere which need to be fixed, but overall think these hypothetical bugs should not be a major factor in considering which approach to choose.

Fair enough. Agreed.

> I'd still lean towards switching to local allocas. This seems to me like it provides a more accurate representation of the machine in the LLVM IR.

If we explicitly specify AS for local variables, then we should be doing that explicitly for other variables as well. AS knowledge does not benefit generic LLVM passes and is required for the NVPTX purposes only. I'm still not convinced that the proposed change buys us anything useful, other than DL being nominally closer to what the back-end needs.

> While the IR might be bigger when initially emitted from clang or unoptimized, once InferAddressSpace is run, it will be smaller with specific allocas since we'll no longer need to wrap every generic alloca in a cast to it's true address space.

In that case, InferAddressSpace should also be able to convert existing allocas to the correct AS without changing the default AS. We can infer AS for the newly materialized allocas with a late pass of InferAddressSpace.

So, it looks like the usefulness of the patch boils down to what to do about the allocas materialized throughout the compilation pipeline. We have the following scenarios:

- a) Current implementation, before the patch: All allocas start in AS0, and inferred to be local by a late InferAddressSpace pass.
- b) Proposed implementation: Makes the default AS for allocas to be local. Allows all allocas to be materialized with the correct AS. DL changes require all users to update the IR they generate (we'll presumably auto-upgrade IR with the pre-patch DL), and after the early run of InferAddressSpace will eliminate the redundant ASCs back to generic AS.
- c) Half-way proposal. Run InferAddressSpace early to give existing allocas correct AS, run another InferAddressSpace late in the pipeline to catch newly materialized generic allocas. It gives us some of the alias analysis benefits of the approach above, but without the disruption of changing DL. Effectiveness of this approach will be better than the status quo, but less than the change of the default alloca AS to local. By new, I mean the allocas that are not the result of splitting/trimming the existing alloca, as in such cases I would assume the AS to be inherited from the old alloca, which would preserve the local AS. I do not have a good idea of what's a typical ratio of pre-existing allocas vs the new allocas materialized by compiler. If new allocas are rare, then the AA effectiveness will asymptotically approach that of the (b).

My issues with (b) are mainly the invasiveness of DL change, and the fact that if the bring the idea of setting DL in a way that reflects the back-end AS where the data would live, then the same argument should apply to other data, not just local variables. It would benefit AA, but it does not look like something we should be forcing on the users. I think it's something that belongs under the compiler hood, IMO. No need to force users to do something compiler is quite capable of doing itself.

Perhaps we can have the cake and eat it here. 
The fact that Data layout allows us to specify the default alloca AS does not mean that we *have* to do it that way. In practice, we'll still need to allow user IR to use default AS for allocas, and we will need to run `InferAddressSpace` at some point. I'm fine doing it early. It leaves us with the question of the AS for the new allocas. I wonder whether we could have a parallel hint for the default AS. If the DL specifies it, use DL-specified one. If DL says nothing, check with the target. We'll still need to run InferAddressSpace once more before lowering, but we should be able to reap most of the AA benefits with no churn for the user.
This would also using (b) for experiments via explicit DL (and we can change the DL later if it proves to be the best way to handle it all), but it also avoids disrupting the existing users, and it gives us flexibility for how we handle allocas under the hood while we're sorting it out.

Does this make sense?


https://github.com/llvm/llvm-project/pull/154814