[llvm] [DirectX] add support for i64 buffer load/stores (PR #145047)
Justin Bogner via llvm-commits
llvm-commits at lists.llvm.org
Fri Jun 20 12:18:50 PDT 2025
================
@@ -570,22 +575,54 @@ static bool expandTypedBufferLoadIntrinsic(CallInst *Orig) {
ExtractElements.push_back(
Builder.CreateExtractElement(Extract, Builder.getInt32(I)));
- // combine into double(s)
+ // combine into double(s) or int64(s)
Value *Result = PoisonValue::get(BufferTy);
for (unsigned I = 0; I < ExtractNum; I += 2) {
- Value *Dbl =
- Builder.CreateIntrinsic(Builder.getDoubleTy(), Intrinsic::dx_asdouble,
- {ExtractElements[I], ExtractElements[I + 1]});
+ Value *Combined = nullptr;
+ if (IsDouble) {
+ // For doubles, use dx_asdouble intrinsic
+ Combined =
+ Builder.CreateIntrinsic(Builder.getDoubleTy(), Intrinsic::dx_asdouble,
+ {ExtractElements[I], ExtractElements[I + 1]});
+ } else {
+ // For int64, manually combine two int32s
+ // First, zero-extend both values to i64
+ Value *Lo = Builder.CreateZExt(ExtractElements[I], Builder.getInt64Ty());
+ Value *Hi =
+ Builder.CreateZExt(ExtractElements[I + 1], Builder.getInt64Ty());
+ // Shift the high bits left by 32 bits
+ Value *ShiftedHi = Builder.CreateShl(Hi, Builder.getInt64(32));
+ // OR the high and low bits together
+ Combined = Builder.CreateOr(Lo, ShiftedHi);
+ }
+
if (ExtractNum == 4)
- Result =
- Builder.CreateInsertElement(Result, Dbl, Builder.getInt32(I / 2));
+ Result = Builder.CreateInsertElement(Result, Combined,
+ Builder.getInt32(I / 2));
else
- Result = Dbl;
+ Result = Combined;
}
Value *CheckBit = nullptr;
for (User *U : make_early_inc_range(Orig->users())) {
- auto *EVI = cast<ExtractValueInst>(U);
+ if (auto *Ret = dyn_cast<ReturnInst>(U)) {
+ // For return instructions, we need to handle the case where the function
+ // is directly returning the result of the call
+ Type *RetTy = Ret->getFunction()->getReturnType();
+ Value *StructRet = PoisonValue::get(RetTy);
+ StructRet = Builder.CreateInsertValue(StructRet, Result, {0});
+ Value *CheckBitForRet = Builder.CreateExtractValue(Load, {1});
----------------
bogner wrote:
Yeah, it's kind of a moot point because this is entirely unreachable from HLSL (or C++) source:
1. There is no `i1` type at the source level: `bool` is 32 bits in HLSL and 8 bits in C++, so the struct we would return from a function is `{i64, i32}`, not `{i64, i1}`
2. Because of how clang handles the struct return ABI, the signature we actually get from such functions is `define void @loadi64({i64, i1} *)` (see https://hlsl.godbolt.org/z/nEeeM933K)
3. On top of all of that, the way we get here from HLSL is [T Buffer<T>::Load(int, out uint)](https://learn.microsoft.com/en-us/windows/win32/direct3dhlsl/sm5-object-buffer-load), so we wouldn't be using the result or the intrinsic directly even if we did manage to pack it back into the right shape
I think it makes sense to leave the return handling out.
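
For reference, the i64 path in the hunk above (zero-extend both halves, shift the high half left by 32, OR them together) can be sketched in plain C++ without `IRBuilder`. This is only an illustration: `combineHalves`, `lowHalf`, `highHalf` and `checkRoundTrip` are hypothetical names, and the inverse split is an assumption included to show the round trip, not something taken from the patch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper mirroring the zext/shl/or sequence in the hunk:
// element I supplies the low 32 bits, element I + 1 the high 32 bits.
constexpr uint64_t combineHalves(uint32_t Lo, uint32_t Hi) {
  // Widen before shifting; shifting a 32-bit value by 32 is undefined in C++.
  return static_cast<uint64_t>(Lo) | (static_cast<uint64_t>(Hi) << 32);
}

// Assumed inverse, for illustration only: split a 64-bit value into halves.
constexpr uint32_t lowHalf(uint64_t V) { return static_cast<uint32_t>(V); }
constexpr uint32_t highHalf(uint64_t V) {
  return static_cast<uint32_t>(V >> 32);
}

// Compile-time check of the bit layout described in the diff comments.
static_assert(combineHalves(0xDEADBEEFu, 0x12345678u) ==
                  0x12345678DEADBEEFull,
              "low half occupies bits 0-31, high half bits 32-63");

// Run-time round-trip check: splitting and recombining yields the input.
inline void checkRoundTrip(uint64_t V) {
  assert(combineHalves(lowHalf(V), highHalf(V)) == V);
}
```

Zero extension matters here: sign-extending the low half would smear its top bit across bits 32-63 and corrupt the high half after the OR.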
https://github.com/llvm/llvm-project/pull/145047