[llvm] [DirectX][DXIL] Distinguish return type for overload type resolution. (PR #85646)

S. Bharadwaj Yadavalli via llvm-commits llvm-commits at lists.llvm.org
Tue Mar 19 07:42:01 PDT 2024


================
@@ -398,10 +398,20 @@ static void emitDXILOperationTable(std::vector<DXILOperationDesc> &Ops,
 
   OS << "  static const OpCodeProperty OpCodeProps[] = {\n";
   for (auto &Op : Ops) {
+    // Consider Op.OverloadParamIndex as the overload parameter index, by
+    // default
+    auto OLParamIdx = Op.OverloadParamIndex;
+    // If no overload parameter index is set, treat first parameter type as
+    // overload type - unless the Op has no parameters, in which case treat the
+    // return type - as overload parameter to emit the appropriate overload kind
+    // enum.
+    if (OLParamIdx < 0) {
+      OLParamIdx = (Op.OpTypes.size() > 1) ? 1 : 0;
+    }
----------------
bharadwajy wrote:

> I think this still might not be right. Take for example a barrier intrinsic. It takes 0 parameters, returns void. It has no overload type at all:
> 
> ```llvm
> define void @"\01?barrier_wrapper@@YAXXZ"() #0 {
>   call void @dx.op.barrier(i32 80, i32 9), !dbg !20 ; line:2 col:3  ; Barrier(barrierMode)
>   ret void, !dbg !21 ; line:3 col:1
> }
> ```
> 

As a test, I applied the following patch that adds an HLSL intrinsic record `int_dx_barrier` (inspired by [barrier record of AMDGPU](https://github.com/llvm/llvm-project/blob/c1328db9d8e41bb96e031eb3dfa884e56b07244b/llvm/include/llvm/IR/IntrinsicsAMDGPU.td#L52)) to `llvm/include/llvm/IR/IntrinsicsDirectX.td` and a DXIL Op record `Barrirer` to `llvm/lib/Target/DirectX/DXIL.td`

```
diff --git a/llvm/include/llvm/IR/IntrinsicsDirectX.td b/llvm/include/llvm/IR/IntrinsicsDirectX.td
index 5c72f06f96ed..9972162f60e6 100644
--- a/llvm/include/llvm/IR/IntrinsicsDirectX.td
+++ b/llvm/include/llvm/IR/IntrinsicsDirectX.td
@@ -16,6 +16,7 @@ def int_dx_thread_id : Intrinsic<[llvm_i32_ty], [llvm_i32_ty], [IntrNoMem, IntrW
 def int_dx_group_id : Intrinsic<[llvm_i32_ty], [llvm_i32_ty], [IntrNoMem, IntrWillReturn]>;
 def int_dx_thread_id_in_group : Intrinsic<[llvm_i32_ty], [llvm_i32_ty], [IntrNoMem, IntrWillReturn]>;
 def int_dx_flattened_thread_id_in_group : Intrinsic<[llvm_i32_ty], [], [IntrNoMem, IntrWillReturn]>;
+def int_dx_barrier  : Intrinsic<[], [], [IntrConvergent, IntrWillReturn]>;
 
 def int_dx_create_handle : ClangBuiltin<"__builtin_hlsl_create_handle">,
     Intrinsic<[ llvm_ptr_ty ], [llvm_i8_ty], [IntrWillReturn]>;
diff --git a/llvm/lib/Target/DirectX/DXIL.td b/llvm/lib/Target/DirectX/DXIL.td
index 36eb29d53766..b139285a9695 100644
--- a/llvm/lib/Target/DirectX/DXIL.td
+++ b/llvm/lib/Target/DirectX/DXIL.td
@@ -295,6 +295,9 @@ def IMad : DXILOpMapping<48, tertiary, int_dx_imad,
                          "Signed integer arithmetic multiply/add operation. imad(m,a,b) = m * a + b.">;
 def UMad : DXILOpMapping<49, tertiary, int_dx_umad,
                          "Unsigned integer arithmetic multiply/add operation. umad(m,a,b) = m * a + b.">;
+def Barrier : DXILOpMapping<80, barrier, int_dx_barrier,
+                          "Inserts a memory barrier in the shader",
+                          [llvm_void_ty]>;
 def ThreadId : DXILOpMapping<93, threadId, int_dx_thread_id,
                              "Reads the thread ID">;
 def GroupId  : DXILOpMapping<94, groupId, int_dx_group_id,
```

Following shows correct DXIL lowering of an LLVM IR source with `barrier` intrinsic:
```
$ cat ~/playground/barrier.ll

define void @test_barrier() #0 {
entry:
  call void @llvm.dx.barrier()
  ret void
}
$ ../llvm-build/hlsl/Debug/bin/opt -S -dxil-op-lower ~/playground/barrier.ll
; ModuleID = '/home/bharadwaj/playground/barrier.ll'
source_filename = "/home/bharadwaj/playground/barrier.ll"

define void @test_barrier() {
entry:
  call void @dx.op.barrier(i32 80)
  ret void
}

declare void @dx.op.barrier(i32)
```
The return type `void` is represented in `Barrier` record as `llvm_void_ty`. Hence the existing proposed change in this PR handles even this case correctly. 

> https://godbolt.org/z/49fW8hKh5

It appears that there is a second argument that needs to be added to the DXIL Op call viz., `call void @dx.op.barrier(i32 80, i32 9)` per the link provided. That addition will be made as part of `DXILOpLowering` pass in a subsequent PR along with addition of `barrier` intrinsics and tests.

> Are there any cases where the overload isn't the first parameter and we have an overload type?

The [code that walks](https://github.com/llvm/llvm-project/blob/ef395a492aa931f428e99e1c0a93d4ad2fb0fcfa/llvm/utils/TableGen/DXILEmitter.cpp#L127) the `OpTypes` to discover overload parameter index (a) records the index of the first (not necessarily parameter number 1) parameter of overload type, if one exists and (b) ensures that if more than one parameter has overload type specified, all such parameters have the same overload type. So, even if the first parameter (i.e., parameter number 1) does not have an overload type, the overload parameter index is correctly identified.
 


https://github.com/llvm/llvm-project/pull/85646


More information about the llvm-commits mailing list