[clang] [llvm] [NVPTX] Change the alloca address space in NVPTXLowerAlloca (PR #154814)

Thu Nov 6 06:56:34 PST 2025

================
@@ -1444,15 +1444,16 @@ void clang::emitBackendOutput(CompilerInstance &CI, CodeGenOptions &CGOpts,
 
   // Verify clang's TargetInfo DataLayout against the LLVM TargetMachine's
   // DataLayout.
-  if (AsmHelper.TM) {
-    std::string DLDesc = M->getDataLayout().getStringRepresentation();
-    if (DLDesc != TDesc) {
+  if (AsmHelper.TM)
+    if (!AsmHelper.TM->isCompatibleDataLayout(M->getDataLayout()) ||
+        !AsmHelper.TM->isCompatibleDataLayout(DataLayout(TDesc))) {
+      std::string DLDesc = M->getDataLayout().getStringRepresentation();
----------------
thetheodor wrote:

> My issues with (b) are mainly the invasiveness of DL change, and the fact that if the bring the idea of setting DL in a way that reflects the back-end AS where the data would live, **then the same argument should apply to other data, not just local variables**.

I agree, but LLVM already has support for allocas, whereas (afaik) for other types (e.g., shared variables) there is currently no support. For the former we simply need to change the DL and do a very simple autoupgrade of the input IR, whereas for globals/shared/etc the front-ends would have to emit different code (or we'd have to do a much more complicated autoupgrade?). But is this a good reason not to go ahead with setting the proper address space for allocas?

> No need to force users to do something compiler is quite capable of doing itself.

Maybe I'm misunderstanding something. But shouldn't autoupgrade take care of this? Users are welcome to specify the alloca address space, but they won't have to.

> The fact that Data layout allows us to specify the default alloca AS does not mean that we have to do it that way. In practice, we'll still need to allow user IR to use default AS for allocas, and we will need to run InferAddressSpace at some point.

Would there be any benefit in not autoupgrading allocas to the local address space? Especially if we run InferAddressSpace early, wouldn't we more or less end up in the same spot?

> It leaves us with the question of the AS for the new allocas. I wonder whether we could have a parallel hint for the default AS. If the DL specifies it, use DL-specified one. If DL says nothing, check with the target.

One option would be to do this in `IRBuilder::CreateAlloca` and check what the target wants if the alloca AS is unspecified (i.e., is zero). But I have some concerns with this:

- If we want to support both DLs (with or without specified alloca AS) then we need some way to construct two different layouts. Maybe this can be handled similarly to `--nvptx-short-ptr`.
- Would this not introduce a larger surface for bugs or divergent behavior? If the NVPTX target must work with both `-A5` and `-A0` DLs, then we'll have to identify, re-implement/fix, and test anything that relies on the DL's alloca AS.
- Maybe I'm misunderstanding the proposal, but the main idea is that we support two scenarios:
   - If the DL specifies `-A5`, all allocas are expected to be in the local address space.
   - If the DL does not specify the alloca address space (or specifies `-A0`) we rely on `InferAddressSpaces` (potentially run early) to fix the existing allocas and either we rely on another late `InferAddressSpaces` or, if possible, we modify `CreateAlloca` (or something similar) to consult the Target when creating allocas. My question is how are these two scenarios different from the users' perspective? For example, clang will automatically set the correct DL to compile a .cu file and for existing IR we can simply autoupgrade it. Am I missing something? 🤔


https://github.com/llvm/llvm-project/pull/154814