[llvm] [DirectX] Add `DXILRefineTypesPass` to prevent type promotion (PR #169384)

Finn Plummer via llvm-commits llvm-commits at lists.llvm.org
Mon Nov 24 10:29:48 PST 2025


https://github.com/inbelic created https://github.com/llvm/llvm-project/pull/169384

#### Problem Statement

Consider the following pattern:
```
%temp = load Type, ptr %src
store Type %temp, ptr %dest
```
where `Type` is an `i32/i64` and the `store` is the only user of `%temp`.

This pattern is introduced by the `SimplifyAnyMemTransfer` function during the `InstCombine` and converts `llvm.memcpy`s of size 1/2/4/8 bytes to their load/store equivalent with the corresponding integer type. The `llvm.memcpy`s are frequently introduced when handling resource buffer accesses in HLSL.

The introduction of a load/store with an i32/i64 is problematic when lowering to DXIL because it can incidentally promote operations intended for a half/float or i16/i32. See [this linked issue](https://github.com/llvm/llvm-project/issues/165753) for an example of i32 operations being promoted to use an i64.

While the promoted operations are functionally correct, it can introduce additional driver dependencies (eg: support for Int64) such that a shader may be valid for a driver when compiled with DXC but not clang, which is not acceptable.

#### Proposed Solution

This commit introduces the `DXILRefineTypesPass` that will match on the pattern above and infer the type based on one of the `src/dest` pointer arguments, and then upgrade the types into their higher fidelity types. By doing so right after they are introduced, it will prevent any incidental usage of the i32/64.

Currently, the pass only infers the type of a parent `alloca` or `dxil.resource.getpointer` instruction for the minimal change to resolve the linked issue. However it could be easily extended to infer types from more parent operations if an opportunity arises.

This pass is added to the `registerPeepholeEPCallback` which will appropriately run directly after the `InstCombine` introducing the pattern.

In addition to listed below, this approach has the additional benefit for us to keep simplified form of the `llvm.memcpy`  as a load/store.

#### Alternatives Considered

1. Aggressively inline function calls. DXC does not run into this issue because it aggressively inlines function calls and then runs an SROA pass on the function parameters. This results in generated `llvm.memcpy`s to be removed by component accesses prior to invoking the `InstCombine` pass. This would functionally resolve the linked issue, however, inlining much earlier would (most likely) result in a noticeable increase for compile time.
2. Hooking into `InstCombine`. We could alternatively provide a new parameter (or otherwise) to the `InstCombine` pass that would prevent the promotion of certain types. This would be a general option that other languages can enable when applicable. This would also resolve the issue by ensuring the `llvm.memcpy` is not modified during the pass and allow for greater certainty that this would not slip by in the future. However, this is a much more obtrusive change than the proposed solution.
3. Introduce a sub-target for the DirectX target machine where i64's are not legal. Adding functionality to correctly choose the right sub-target is very complex and this would not solve the case of promoting 16-bit operations.

>From d68261fd4de350c2ba7f78e35d2a60bc3aaf748b Mon Sep 17 00:00:00 2001
From: Finn Plummer <mail at inbelic.dev>
Date: Fri, 21 Nov 2025 12:58:59 -0800
Subject: [PATCH 1/2] mvp

---
 llvm/lib/Target/DirectX/CMakeLists.txt        |   1 +
 llvm/lib/Target/DirectX/DXILRefineTypes.cpp   | 122 +++++++++++++++
 llvm/lib/Target/DirectX/DXILRefineTypes.h     |  40 +++++
 .../Target/DirectX/DirectXPassRegistry.def    |   1 +
 .../Target/DirectX/DirectXTargetMachine.cpp   |   8 +
 .../DirectX/refine-access-load-store.ll       | 147 ++++++++++++++++++
 6 files changed, 319 insertions(+)
 create mode 100644 llvm/lib/Target/DirectX/DXILRefineTypes.cpp
 create mode 100644 llvm/lib/Target/DirectX/DXILRefineTypes.h
 create mode 100644 llvm/test/CodeGen/DirectX/refine-access-load-store.ll

diff --git a/llvm/lib/Target/DirectX/CMakeLists.txt b/llvm/lib/Target/DirectX/CMakeLists.txt
index 6c079517e22d6..da61406eb88b8 100644
--- a/llvm/lib/Target/DirectX/CMakeLists.txt
+++ b/llvm/lib/Target/DirectX/CMakeLists.txt
@@ -31,6 +31,7 @@ add_llvm_target(DirectXCodeGen
   DXILPostOptimizationValidation.cpp
   DXILPrepare.cpp
   DXILPrettyPrinter.cpp
+  DXILRefineTypes.cpp
   DXILResourceAccess.cpp
   DXILResourceImplicitBinding.cpp
   DXILShaderFlags.cpp
diff --git a/llvm/lib/Target/DirectX/DXILRefineTypes.cpp b/llvm/lib/Target/DirectX/DXILRefineTypes.cpp
new file mode 100644
index 0000000000000..5b838c620810d
--- /dev/null
+++ b/llvm/lib/Target/DirectX/DXILRefineTypes.cpp
@@ -0,0 +1,122 @@
+//===- DXILRefineTypes.cpp ----------------------------------------------===////
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#include "DXILRefineTypes.h"
+#include "llvm/IR/DerivedTypes.h"
+#include "llvm/IR/GlobalVariable.h"
+#include "llvm/IR/IRBuilder.h"
+#include "llvm/IR/Instructions.h"
+#include "llvm/IR/Intrinsics.h"
+#include "llvm/IR/Module.h"
+
+using namespace llvm;
+
+#define DEBUG_TYPE "dxil-refine-types"
+
+static Type *inferType(Value *Operand) {
+  if (auto *CI = dyn_cast<CallInst>(Operand))
+    if (CI->getCalledFunction()->getName().starts_with(
+            "llvm.dx.resource.getpointer"))
+      if (auto *ExtType = dyn_cast<TargetExtType>(CI->getOperand(0)->getType()))
+        return ExtType->getTypeParameter(0); // Valid for all dx.Types
+
+  if (auto *AI = dyn_cast<AllocaInst>(Operand))
+    return AI->getAllocatedType();
+
+  // TODO: Extend to other useful/applicable cases
+  return nullptr;
+}
+
+// Attempt to merge two inferred types.
+//
+// Returns nullptr, if an inferred can't be concluded
+static Type *mergeInferredTypes(Type *A, Type *B) {
+  if (!A)
+    return B;
+
+  if (!B)
+    return A;
+
+  if (A == B)
+    return A;
+
+  // Otherwise, neither was inferred, or inferred differently
+  return nullptr;
+}
+
+bool DXILRefineTypesPass::runImpl(Function &F) {
+  // First detect the pattern: (generated from SimplifyAnyMemTransfer)
+  //   %temp = load Type, ptr %src, ...
+  //   store Type %temp, ptr %dest, ...
+  // where,
+  //   store is the only user of %temp
+  //   Type is either an i32 or i64
+  //
+  // We are currently only concerned with i32 and i64 as these can incidently
+  // promote 16/32 bit types to 32/64 bit arthimetic.
+  SmallVector<std::pair<LoadInst *, StoreInst *>, 4> ToVisit;
+  for (BasicBlock &BB : F)
+    for (Instruction &I : BB)
+      if (auto *LI = dyn_cast<LoadInst>(&I))
+        if (LI->hasOneUse())
+          if (auto *SI = dyn_cast<StoreInst>(LI->user_back()))
+            if (LI->getAccessType() == SI->getAccessType())
+              if (LI->getAccessType()->isIntegerTy(32) ||
+                  LI->getAccessType()->isIntegerTy(64))
+                ToVisit.push_back({LI, SI});
+
+  bool Modified = false;
+  for (auto [LI, SI] : ToVisit) {
+    Type *LoadTy = inferType(LI->getPointerOperand());
+    Type *StoreTy = inferType(SI->getPointerOperand());
+
+    Type *const InferredTy = mergeInferredTypes(LoadTy, StoreTy);
+    if (!InferredTy || InferredTy == LI->getType())
+      continue; // Nothing to be done. Skip.
+
+    // Replace the type of the load/store
+    IRBuilder<> LoadBuilder(SI);
+    LoadInst *TypedLoad =
+        LoadBuilder.CreateLoad(InferredTy, LI->getPointerOperand());
+
+    TypedLoad->setAlignment(LI->getAlign());
+    TypedLoad->setVolatile(LI->isVolatile());
+    TypedLoad->setOrdering(LI->getOrdering());
+    TypedLoad->setAAMetadata(LI->getAAMetadata());
+    TypedLoad->copyMetadata(*LI, LLVMContext::MD_mem_parallel_loop_access);
+    TypedLoad->copyMetadata(*LI, LLVMContext::MD_access_group);
+
+    IRBuilder<> StoreBuilder(SI);
+    StoreInst *TypedStore = StoreBuilder.CreateStore(
+        TypedLoad, SI->getPointerOperand(), SI->isVolatile());
+
+    TypedStore->setAlignment(SI->getAlign());
+    TypedStore->setVolatile(SI->isVolatile());
+    TypedStore->setOrdering(SI->getOrdering());
+    TypedStore->setAAMetadata(SI->getAAMetadata());
+    TypedStore->copyMetadata(*SI, LLVMContext::MD_mem_parallel_loop_access);
+    TypedStore->copyMetadata(*SI, LLVMContext::MD_access_group);
+    TypedStore->copyMetadata(*SI, LLVMContext::MD_DIAssignID);
+
+    SI->eraseFromParent();
+    LI->eraseFromParent();
+
+    Modified = true;
+  }
+
+  return Modified;
+}
+
+PreservedAnalyses DXILRefineTypesPass::run(Function &F,
+                                           FunctionAnalysisManager &AM) {
+  if (!runImpl(F))
+    return PreservedAnalyses::all();
+
+  // TODO: Can probably preserve some CFG analyses
+  return PreservedAnalyses::none();
+}
diff --git a/llvm/lib/Target/DirectX/DXILRefineTypes.h b/llvm/lib/Target/DirectX/DXILRefineTypes.h
new file mode 100644
index 0000000000000..043e4eb2875f0
--- /dev/null
+++ b/llvm/lib/Target/DirectX/DXILRefineTypes.h
@@ -0,0 +1,40 @@
+//===- DXILRefineTypes.h - Infer additional type information ----*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+//
+// This pass infers increased type information fidelity of memory operations and
+// replaces their uses with it. Primarily load/store.
+//
+// This is used to prevent the propogation of introduced problematic types. For
+// instance: the InstCombine pass can promote aggregates of 16/32-bit types to
+// be i32/64 loads and stores.
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef LLVM_TRANSFORMS_SCALAR_DXILREFINETYPES_H
+#define LLVM_TRANSFORMS_SCALAR_DXILREFINETYPES_H
+
+#include "llvm/IR/BasicBlock.h"
+#include "llvm/IR/PassManager.h"
+
+namespace llvm {
+
+class Function;
+
+class DXILRefineTypesPass : public PassInfoMixin<DXILRefineTypesPass> {
+private:
+  bool runImpl(Function &F);
+
+public:
+  DXILRefineTypesPass() = default;
+
+  PreservedAnalyses run(Function &F, FunctionAnalysisManager &AM);
+};
+
+} // end namespace llvm
+
+#endif // LLVM_TRANSFORMS_SCALAR_DXILREFINETYPES_H
diff --git a/llvm/lib/Target/DirectX/DirectXPassRegistry.def b/llvm/lib/Target/DirectX/DirectXPassRegistry.def
index b4b48a166800e..8f034e4ba10ff 100644
--- a/llvm/lib/Target/DirectX/DirectXPassRegistry.def
+++ b/llvm/lib/Target/DirectX/DirectXPassRegistry.def
@@ -42,6 +42,7 @@ MODULE_PASS("print<dxil-root-signature>", dxil::RootSignatureAnalysisPrinter(dbg
 #define FUNCTION_PASS(NAME, CREATE_PASS)
 #endif
 FUNCTION_PASS("dxil-forward-handle-accesses", DXILForwardHandleAccesses())
+FUNCTION_PASS("dxil-refine-types", DXILRefineTypesPass())
 FUNCTION_PASS("dxil-resource-access", DXILResourceAccess())
 FUNCTION_PASS("dxil-legalize", DXILLegalizePass())
 #undef FUNCTION_PASS
diff --git a/llvm/lib/Target/DirectX/DirectXTargetMachine.cpp b/llvm/lib/Target/DirectX/DirectXTargetMachine.cpp
index 84b1a313df2ea..785428962fe04 100644
--- a/llvm/lib/Target/DirectX/DirectXTargetMachine.cpp
+++ b/llvm/lib/Target/DirectX/DirectXTargetMachine.cpp
@@ -22,6 +22,7 @@
 #include "DXILOpLowering.h"
 #include "DXILPostOptimizationValidation.h"
 #include "DXILPrettyPrinter.h"
+#include "DXILRefineTypes.h"
 #include "DXILResourceAccess.h"
 #include "DXILResourceImplicitBinding.h"
 #include "DXILRootSignature.h"
@@ -37,10 +38,12 @@
 #include "llvm/CodeGen/TargetPassConfig.h"
 #include "llvm/IR/IRPrintingPasses.h"
 #include "llvm/IR/LegacyPassManager.h"
+#include "llvm/IR/PassManager.h"
 #include "llvm/InitializePasses.h"
 #include "llvm/MC/MCSectionDXContainer.h"
 #include "llvm/MC/SectionKind.h"
 #include "llvm/MC/TargetRegistry.h"
+#include "llvm/Passes/OptimizationLevel.h"
 #include "llvm/Passes/PassBuilder.h"
 #include "llvm/Support/CodeGen.h"
 #include "llvm/Support/Compiler.h"
@@ -147,6 +150,11 @@ DirectXTargetMachine::~DirectXTargetMachine() {}
 void DirectXTargetMachine::registerPassBuilderCallbacks(PassBuilder &PB) {
 #define GET_PASS_REGISTRY "DirectXPassRegistry.def"
 #include "llvm/Passes/TargetPassRegistry.inc"
+
+  PB.registerPeepholeEPCallback(
+      [](FunctionPassManager &FPM, OptimizationLevel) {
+        FPM.addPass(DXILRefineTypesPass());
+      });
 }
 
 bool DirectXTargetMachine::addPassesToEmitFile(
diff --git a/llvm/test/CodeGen/DirectX/refine-access-load-store.ll b/llvm/test/CodeGen/DirectX/refine-access-load-store.ll
new file mode 100644
index 0000000000000..a6169369a7f48
--- /dev/null
+++ b/llvm/test/CodeGen/DirectX/refine-access-load-store.ll
@@ -0,0 +1,147 @@
+; RUN: split-file %s %t
+; RUN: opt -passes=dxil-refine-types -S -mtriple=dxil-pc-shadermodel6.0-library < %t/explicit-64.ll | FileCheck %s
+; RUN: opt -passes='instcombine,dxil-refine-types' -S -mtriple=dxil-pc-shadermodel6.0-library < %t/generated-64.ll | FileCheck %s
+; RUN: opt -passes=dxil-refine-types -S -mtriple=dxil-pc-shadermodel6.0-library < %t/explicit-32.ll | FileCheck %s
+; RUN: opt -passes='instcombine,dxil-refine-types' -S -mtriple=dxil-pc-shadermodel6.0-library < %t/generated-32.ll | FileCheck %s
+
+; RUN: opt -passes='instcombine,dxil-refine-types' -S -mtriple=dxil-pc-shadermodel6.0-library < %t/folded.ll | FileCheck %s --check-prefix=FOLDED
+
+; Tests that dxil-refine-types will catch the access pattern generated by inst-combine
+
+; CHECK-LABEL: @test(
+; CHECK: %[[#FROM:]] = load %struct.PromotedStruct, ptr %get_access
+; CHECK: store %struct.PromotedStruct %[[#FROM]], ptr %param
+; CHECK: call void @external_barrier(ptr{{.*}}%param)
+; CHECK: %[[#TO:]] = load %struct.PromotedStruct, ptr %param
+; CHECK: store %struct.PromotedStruct %[[#TO]], ptr %set_access
+; CHECK:    ret void
+
+;--- explicit-64.ll
+
+%"StructuredBuffer<struct.PromotedStruct>" = type { %struct.PromotedStruct }
+%"RWStructuredBuffer<struct.PromotedStruct>" = type { %struct.PromotedStruct }
+%struct.PromotedStruct = type { i32, float }
+
+ at src = external constant %"StructuredBuffer<struct.PromotedStruct>"
+ at dest = external constant %"RWStructuredBuffer<struct.PromotedStruct>"
+
+define void @test(i32 %idx) {
+  %param = alloca %struct.PromotedStruct, align 1
+  %src = load target("dx.RawBuffer", %struct.PromotedStruct, 1, 0), ptr @src, align 4
+  %get_access = call noundef nonnull align 1 dereferenceable(8) ptr @llvm.dx.resource.getpointer.p0.tdx.RawBuffer_s_struct.PromotedStructs_1_0t(target("dx.RawBuffer", %struct.PromotedStruct, 1, 0) %src, i32 %idx)
+  %1 = load i64, ptr %get_access, align 1
+  store i64 %1, ptr %param, align 1
+
+  call void @external_barrier(ptr %param)
+
+  %dest = load target("dx.RawBuffer", %struct.PromotedStruct, 1, 0), ptr @dest, align 4
+  %set_access = call noundef nonnull align 1 dereferenceable(8) ptr @llvm.dx.resource.getpointer.p0.tdx.RawBuffer_s_struct.PromotedStructs_1_0t(target("dx.RawBuffer", %struct.PromotedStruct, 1, 0) %dest, i32 %idx)
+  %2 = load i64, ptr %param, align 1
+  store i64 %2, ptr %set_access, align 1
+  ret void
+}
+
+declare void @external_barrier(ptr)
+
+;--- generated-64.ll
+
+%"StructuredBuffer<struct.PromotedStruct>" = type { %struct.PromotedStruct }
+%"RWStructuredBuffer<struct.PromotedStruct>" = type { %struct.PromotedStruct }
+%struct.PromotedStruct = type { i32, float }
+
+ at src = external constant %"StructuredBuffer<struct.PromotedStruct>"
+ at dest = external constant %"RWStructuredBuffer<struct.PromotedStruct>"
+
+define void @test(i32 %idx) {
+  %param = alloca %struct.PromotedStruct, align 1
+  %src = load target("dx.RawBuffer", %struct.PromotedStruct, 1, 0), ptr @src, align 4
+  %get_access = call noundef nonnull align 1 dereferenceable(8) ptr @llvm.dx.resource.getpointer.p0.tdx.RawBuffer_s_struct.PromotedStructs_1_0t(target("dx.RawBuffer", %struct.PromotedStruct, 1, 0) %src, i32 %idx)
+  call void @llvm.memcpy.p0.p0.i32(ptr align 1 %param, ptr align 1 %get_access, i32 8, i1 false)
+
+  call void @external_barrier(ptr %param)
+
+  %dest = load target("dx.RawBuffer", %struct.PromotedStruct, 1, 0), ptr @dest, align 4
+  %set_access = call noundef nonnull align 1 dereferenceable(8) ptr @llvm.dx.resource.getpointer.p0.tdx.RawBuffer_s_struct.PromotedStructs_1_0t(target("dx.RawBuffer", %struct.PromotedStruct, 1, 0) %dest, i32 %idx)
+  call void @llvm.memcpy.p0.p0.i32(ptr align 1 %set_access, ptr align 1 %param, i32 8, i1 false)
+  ret void
+}
+
+declare void @external_barrier(ptr)
+
+;--- explicit-32.ll
+
+%"StructuredBuffer<struct.PromotedStruct>" = type { %struct.PromotedStruct }
+%"RWStructuredBuffer<struct.PromotedStruct>" = type { %struct.PromotedStruct }
+%struct.PromotedStruct = type { i16, half }
+
+ at src = external constant %"StructuredBuffer<struct.PromotedStruct>"
+ at dest = external constant %"RWStructuredBuffer<struct.PromotedStruct>"
+
+define void @test(i32 %idx) {
+  %param = alloca %struct.PromotedStruct, align 1
+  %src = load target("dx.RawBuffer", %struct.PromotedStruct, 1, 0), ptr @src, align 4
+  %get_access = call noundef nonnull align 1 dereferenceable(8) ptr @llvm.dx.resource.getpointer.p0.tdx.RawBuffer_s_struct.PromotedStructs_1_0t(target("dx.RawBuffer", %struct.PromotedStruct, 1, 0) %src, i32 %idx)
+  %1 = load i32, ptr %get_access, align 1
+  store i32 %1, ptr %param, align 1
+
+  call void @external_barrier(ptr %param)
+
+  %dest = load target("dx.RawBuffer", %struct.PromotedStruct, 1, 0), ptr @dest, align 4
+  %set_access = call noundef nonnull align 1 dereferenceable(8) ptr @llvm.dx.resource.getpointer.p0.tdx.RawBuffer_s_struct.PromotedStructs_1_0t(target("dx.RawBuffer", %struct.PromotedStruct, 1, 0) %dest, i32 %idx)
+  %2 = load i32, ptr %param, align 1
+  store i32 %2, ptr %set_access, align 1
+  ret void
+}
+
+declare void @external_barrier(ptr)
+
+;--- generated-32.ll
+
+%"StructuredBuffer<struct.PromotedStruct>" = type { %struct.PromotedStruct }
+%"RWStructuredBuffer<struct.PromotedStruct>" = type { %struct.PromotedStruct }
+%struct.PromotedStruct = type { i32, i32 }
+
+ at src = external constant %"StructuredBuffer<struct.PromotedStruct>"
+ at dest = external constant %"RWStructuredBuffer<struct.PromotedStruct>"
+
+define void @test(i32 %idx) {
+  %param = alloca %struct.PromotedStruct, align 1
+  %src = load target("dx.RawBuffer", %struct.PromotedStruct, 1, 0), ptr @src, align 4
+  %get_access = call noundef nonnull align 1 dereferenceable(8) ptr @llvm.dx.resource.getpointer.p0.tdx.RawBuffer_s_struct.PromotedStructs_1_0t(target("dx.RawBuffer", %struct.PromotedStruct, 1, 0) %src, i32 %idx)
+  call void @llvm.memcpy.p0.p0.i32(ptr align 1 %param, ptr align 1 %get_access, i32 8, i1 false)
+
+  call void @external_barrier(ptr %param)
+
+  %dest = load target("dx.RawBuffer", %struct.PromotedStruct, 1, 0), ptr @dest, align 4
+  %set_access = call noundef nonnull align 1 dereferenceable(8) ptr @llvm.dx.resource.getpointer.p0.tdx.RawBuffer_s_struct.PromotedStructs_1_0t(target("dx.RawBuffer", %struct.PromotedStruct, 1, 0) %dest, i32 %idx)
+  call void @llvm.memcpy.p0.p0.i32(ptr align 1 %set_access, ptr align 1 %param, i32 8, i1 false)
+  ret void
+}
+
+declare void @external_barrier(ptr)
+
+;--- folded.ll
+
+; FOLDED-LABEL: @test_folded(
+; FOLDED: %[[#TO:]] = load %struct.PromotedStruct, ptr %get_access
+; FOLDED: store %struct.PromotedStruct %[[#TO]], ptr %set_access
+; FOLDED:    ret void
+
+%"StructuredBuffer<struct.PromotedStruct>" = type { %struct.PromotedStruct }
+%"RWStructuredBuffer<struct.PromotedStruct>" = type { %struct.PromotedStruct }
+%struct.PromotedStruct = type { i32, float }
+
+ at src = external constant %"StructuredBuffer<struct.PromotedStruct>"
+ at dest = external constant %"RWStructuredBuffer<struct.PromotedStruct>"
+
+define void @test_folded(i32 %idx) {
+  %param = alloca %struct.PromotedStruct, align 1
+  %src = load target("dx.RawBuffer", %struct.PromotedStruct, 1, 0), ptr @src, align 4
+  %get_access = call noundef nonnull align 1 dereferenceable(8) ptr @llvm.dx.resource.getpointer.p0.tdx.RawBuffer_s_struct.PromotedStructs_1_0t(target("dx.RawBuffer", %struct.PromotedStruct, 1, 0) %src, i32 %idx)
+  call void @llvm.memcpy.p0.p0.i32(ptr align 1 %param, ptr align 1 %get_access, i32 8, i1 false)
+
+  %dest = load target("dx.RawBuffer", %struct.PromotedStruct, 1, 0), ptr @dest, align 4
+  %set_access = call noundef nonnull align 1 dereferenceable(8) ptr @llvm.dx.resource.getpointer.p0.tdx.RawBuffer_s_struct.PromotedStructs_1_0t(target("dx.RawBuffer", %struct.PromotedStruct, 1, 0) %dest, i32 %idx)
+  call void @llvm.memcpy.p0.p0.i32(ptr align 1 %set_access, ptr align 1 %param, i32 8, i1 false)
+  ret void
+}

>From 0bb23fd20fef0fda1d327f72baeb334c1b230f3f Mon Sep 17 00:00:00 2001
From: Finn Plummer <mail at inbelic.dev>
Date: Fri, 21 Nov 2025 13:06:13 -0800
Subject: [PATCH 2/2] clean up

---
 llvm/lib/Target/DirectX/DXILRefineTypes.cpp           | 4 ----
 llvm/test/CodeGen/DirectX/refine-access-load-store.ll | 3 +++
 2 files changed, 3 insertions(+), 4 deletions(-)

diff --git a/llvm/lib/Target/DirectX/DXILRefineTypes.cpp b/llvm/lib/Target/DirectX/DXILRefineTypes.cpp
index 5b838c620810d..e5065768046d0 100644
--- a/llvm/lib/Target/DirectX/DXILRefineTypes.cpp
+++ b/llvm/lib/Target/DirectX/DXILRefineTypes.cpp
@@ -8,7 +8,6 @@
 
 #include "DXILRefineTypes.h"
 #include "llvm/IR/DerivedTypes.h"
-#include "llvm/IR/GlobalVariable.h"
 #include "llvm/IR/IRBuilder.h"
 #include "llvm/IR/Instructions.h"
 #include "llvm/IR/Intrinsics.h"
@@ -32,9 +31,6 @@ static Type *inferType(Value *Operand) {
   return nullptr;
 }
 
-// Attempt to merge two inferred types.
-//
-// Returns nullptr, if an inferred can't be concluded
 static Type *mergeInferredTypes(Type *A, Type *B) {
   if (!A)
     return B;
diff --git a/llvm/test/CodeGen/DirectX/refine-access-load-store.ll b/llvm/test/CodeGen/DirectX/refine-access-load-store.ll
index a6169369a7f48..b3f9b9c13e338 100644
--- a/llvm/test/CodeGen/DirectX/refine-access-load-store.ll
+++ b/llvm/test/CodeGen/DirectX/refine-access-load-store.ll
@@ -122,6 +122,9 @@ declare void @external_barrier(ptr)
 
 ;--- folded.ll
 
+; This tests that when there is no function call to prevent folding, the type
+; will still successfully be replaced.
+
 ; FOLDED-LABEL: @test_folded(
 ; FOLDED: %[[#TO:]] = load %struct.PromotedStruct, ptr %get_access
 ; FOLDED: store %struct.PromotedStruct %[[#TO]], ptr %set_access



More information about the llvm-commits mailing list