<table border="1" cellspacing="0" cellpadding="8">
    <tr>
        <th>Issue</th>
        <td>
            <a href=https://github.com/llvm/llvm-project/issues/99219>99219</a>
        </td>
    </tr>

    <tr>
        <th>Summary</th>
        <td>
            Implement the `dot4add_u8packed` HLSL Function
        </td>
    </tr>

    <tr>
      <th>Labels</th>
      <td>
            metabug,
            backend:DirectX,
            HLSL,
            backend:SPIR-V,
            bot:HLSL
      </td>
    </tr>

    <tr>
      <th>Assignees</th>
      <td>
      </td>
    </tr>

    <tr>
      <th>Reporter</th>
      <td>
          farzonl
      </td>
    </tr>
</table>

<pre>
- [ ] Implement `dot4add_u8packed` clang builtin
- [ ] Link `dot4add_u8packed` clang builtin with `hlsl_intrinsics.h`
- [ ] Add sema checks for `dot4add_u8packed` to `CheckHLSLBuiltinFunctionCall` in `SemaChecking.cpp`
- [ ] Add codegen for `dot4add_u8packed` to `EmitHLSLBuiltinExpr` in `CGBuiltin.cpp`
- [ ] Add codegen tests to `clang/test/CodeGenHLSL/builtins/dot4add_u8packed.hlsl`
- [ ] Add sema tests to `clang/test/SemaHLSL/BuiltIns/dot4add_u8packed-errors.hlsl`
- [ ] Create the `int_dx_dot4add_u8packed` intrinsic in `IntrinsicsDirectX.td`
- [ ] Create the `DXILOpMapping` of `int_dx_dot4add_u8packed` to `164` in `DXIL.td`
- [ ] Create the `dot4add_u8packed.ll` and `dot4add_u8packed_errors.ll` tests in `llvm/test/CodeGen/DirectX/`
- [ ] Create the `int_spv_dot4add_u8packed` intrinsic in `IntrinsicsSPIRV.td`
- [ ] In `SPIRVInstructionSelector.cpp`, create the `dot4add_u8packed` lowering and map it to `int_spv_dot4add_u8packed` in `SPIRVInstructionSelector::selectIntrinsic`
- [ ] Create SPIR-V backend test case in `llvm/test/CodeGen/SPIRV/hlsl-intrinsics/dot4add_u8packed.ll`

## DirectX

| DXIL Opcode | DXIL OpName | Shader Model | Shader Stages |
| ----------- | ----------- | ------------ | ------------- |
| 164 | Dot4AddU8Packed | 6.4 | () |

## SPIR-V

### [OpUDot](https://registry.khronos.org/SPIR-V/specs/unified1/SPIRV.html#OpUDot):

#### Description:
  
Unsigned integer dot product of *Vector 1* and *Vector 2*.  
  
*Result Type* must be an integer type with *Signedness* of 0 whose
*Width* must be greater than or equal to that of the components of
*Vector 1* and *Vector 2*.  
  
*Vector 1* and *Vector 2* must have the same type.  
  
*Vector 1* and *Vector 2* must be either 32-bit integers (enabled by the
**DotProductInput4x8BitPacked** [capability](https://registry.khronos.org/SPIR-V/specs/unified1/SPIRV.html#Capability)) or vectors of
integer type with *Signedness* of 0 (enabled by the
**DotProductInput4x8Bit** or **DotProductInputAll**
[capability](https://registry.khronos.org/SPIR-V/specs/unified1/SPIRV.html#Capability)).  
  
When *Vector 1* and *Vector 2* are scalar integer types, *Packed Vector
Format* must be specified to select how the integers are to be
interpreted as vectors.  
  
All components of the input vectors are zero-extended to the bit width
of the result’s type. The zero-extended input vectors are then
multiplied component-wise and all components of the vector resulting
from the component-wise multiplication are added together. The resulting
value will equal the low-order N bits of the correct result R, where N
is the result width and R is computed with enough precision to avoid
overflow and underflow.

[Capability](https://registry.khronos.org/SPIR-V/specs/unified1/SPIRV.html#Capability):  
**DotProduct**  
  
[Missing before](https://registry.khronos.org/SPIR-V/specs/unified1/SPIRV.html#Unified) **version 1.6**.

<table style="width:100%;">
<colgroup>
<col style="width: 14%" />
<col style="width: 14%" />
<col style="width: 14%" />
<col style="width: 14%" />
<col style="width: 14%" />
<col style="width: 14%" />
<col style="width: 14%" />
</colgroup>
<thead>
<tr>
<th>Word Count</th>
<th>Opcode</th>
<th>Result Type</th>
<th>Result</th>
<th>Vector 1</th>
<th>Vector 2</th>
<th>Packed Vector Format</th>
</tr>
</thead>
<tbody>
<tr>
<td class="tableblock halign-left valign-top"><p>5 + variable</p></td>
<td class="tableblock halign-left valign-top"><p>4451</p></td>
<td
class="tableblock halign-left valign-top"><p><em>&lt;id&gt;</em><br />
<em>Result Type</em></p></td>
<td class="tableblock halign-left valign-top"><p><a
href="https://registry.khronos.org/SPIR-V/specs/unified1/SPIRV.html#ResultId"><em>Result &lt;id&gt;</em></a></p></td>
<td
class="tableblock halign-left valign-top"><p><em>&lt;id&gt;</em><br />
<em>Vector 1</em></p></td>
<td
class="tableblock halign-left valign-top"><p><em>&lt;id&gt;</em><br />
<em>Vector 2</em></p></td>
<td class="tableblock halign-left valign-top"><p>Optional<br />
<a href="https://registry.khronos.org/SPIR-V/specs/unified1/SPIRV.html#Packed_Vector_Format"><em>Packed Vector Format</em></a><br />
<em>Packed Vector Format</em></p></td>
</tr>
</tbody>
</table>
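
The packed form of `OpUDot` described above can be modeled on the host as a short C++ function. This is only a reference sketch for checking expected values in tests, not code from the SPIR-V backend. The name `udot_packed_4x8` is hypothetical, and the sketch assumes a 32-bit result type and that lane `i` occupies bits `8*i` through `8*i + 7` of each input.

```cpp
#include <cassert>
#include <cstdint>

// Reference model only (hypothetical helper, not LLVM code).
// Treats each 32-bit input as four unsigned 8-bit lanes, zero-extends every
// lane to 32 bits, multiplies lane-wise, and sums the four products.
// Assumption: lane i occupies bits [8*i, 8*i + 7]. The sum is the same for
// any lane order as long as both inputs use the same one.
inline uint32_t udot_packed_4x8(uint32_t vector1, uint32_t vector2) {
  uint32_t sum = 0;
  for (unsigned lane = 0; lane < 4; ++lane) {
    uint32_t x = (vector1 >> (8 * lane)) & 0xFFu; // zero-extended lane of vector1
    uint32_t y = (vector2 >> (8 * lane)) & 0xFFu; // zero-extended lane of vector2
    sum += x * y; // each product is at most 255 * 255 = 65025
  }
  return sum; // at most 4 * 65025 = 260100, so a 32-bit result cannot overflow
}

// Worked values for the model above.
inline void udot_packed_4x8_examples() {
  assert(udot_packed_4x8(0x04030201u, 0x01010101u) == 10u);     // 1 + 2 + 3 + 4
  assert(udot_packed_4x8(0xFFFFFFFFu, 0xFFFFFFFFu) == 260100u); // 4 * 255 * 255
}
```

Because the largest possible sum is 260100, the "low-order N bits" rule never truncates when the result type is 32 bits wide; it would only matter for a narrower result type.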



## Test Case(s)

### Example 1
```hlsl
//dxc dot4add_u8packed_test.hlsl -T lib_6_8 -enable-16bit-types -O0

export uint fn(uint p1, uint p2, uint p3) {
    return dot4add_u8packed(p1, p2, p3);
}
```
## HLSL:

## Syntax


```syntax
uint dot4add_u8packed(uint a, uint b, uint c);
```


## Type Description

| Name  | [**Template Type**](https://github.com/MicrosoftDocs/win32/blob/docs/desktop-src//direct3dhlsl/../direct3dhlsl/dx-graphics-hlsl-data-types.md)| [**Component Type**](https://github.com/MicrosoftDocs/win32/blob/docs/desktop-src//direct3dhlsl/../direct3dhlsl/dx-graphics-hlsl-data-types.md) | Size |
|-------|--------------------------------------------------------------------|----------------------------------------------------------------------|------|
| *ret* | [**scalar**](https://github.com/MicrosoftDocs/win32/blob/docs/desktop-src//direct3dhlsl/../direct3dhlsl/dx-graphics-hlsl-scalar.md) | [**uint**](../WinProg/windows-data-types) | 1 |
| *a* | [**scalar**](https://github.com/MicrosoftDocs/win32/blob/docs/desktop-src//direct3dhlsl/../direct3dhlsl/dx-graphics-hlsl-scalar.md) | [**uint**](../WinProg/windows-data-types) | 1 |
| *b* | [**scalar**](https://github.com/MicrosoftDocs/win32/blob/docs/desktop-src//direct3dhlsl/../direct3dhlsl/dx-graphics-hlsl-scalar.md) | [**uint**](../WinProg/windows-data-types) | 1 |
| *c* | [**scalar**](https://github.com/MicrosoftDocs/win32/blob/docs/desktop-src//direct3dhlsl/../direct3dhlsl/dx-graphics-hlsl-scalar.md) | [**uint**](../WinProg/windows-data-types) | 1 |
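
For writing expected values in codegen and backend tests, the intended result can be sketched as a host-side C++ reference model. This is a sketch under stated assumptions rather than the compiler's implementation: the name `dot4add_u8packed_ref` is hypothetical, *c* is assumed to be the accumulator (as in the Shader Model 6.4 feature description linked under See also), and the accumulation is assumed to wrap modulo 2^32 rather than saturate.

```cpp
#include <cassert>
#include <cstdint>

// Reference model only (hypothetical helper, not clang or DXC code).
// Assumptions: a and b each pack four unsigned 8-bit values, c is the
// accumulator, and the 32-bit sum wraps on overflow (no saturation).
inline uint32_t dot4add_u8packed_ref(uint32_t a, uint32_t b, uint32_t c) {
  uint32_t acc = c;
  for (unsigned i = 0; i < 4; ++i) {
    uint32_t ai = (a >> (8 * i)) & 0xFFu; // i-th byte of a, zero-extended
    uint32_t bi = (b >> (8 * i)) & 0xFFu; // i-th byte of b, zero-extended
    acc += ai * bi;                       // unsigned add, wraps modulo 2^32
  }
  return acc;
}

// Worked values for the model above.
inline void dot4add_u8packed_ref_examples() {
  // Bytes of a: 1, 2, 3, 4. Bytes of b: 5, 6, 7, 8.
  // 1*5 + 2*6 + 3*7 + 4*8 = 70, plus the accumulator 100.
  assert(dot4add_u8packed_ref(0x04030201u, 0x08070605u, 100u) == 170u);
  // Wrap-around under the modulo-2^32 assumption: 0xFFFFFFFF + 1*1 == 0.
  assert(dot4add_u8packed_ref(0x00000001u, 0x00000001u, 0xFFFFFFFFu) == 0u);
}
```

Under the same assumptions, the result equals the packed unsigned dot product of *a* and *b* followed by an integer add of *c*.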

## Minimum Shader Model

This function is supported in the following shader models.

| Shader Model | Supported |
|-------------|----------|
|[Shader Model 6.4](https://github.com/MicrosoftDocs/win32/blob/docs/desktop-src//direct3dhlsl/../direct3dhlsl/hlsl-shader-model-6-4-features-for-direct3d-12.md) and higher shader models | yes |

## Shader Stages



## See also


- [**Intrinsic Functions (DirectX HLSL)**](https://github.com/MicrosoftDocs/win32/blob/docs/desktop-src//direct3dhlsl/../direct3dhlsl/dx-graphics-hlsl-intrinsic-functions.md)
- **See [Unsigned Integer Dot-Product of 4 Elements and Accumulate](../direct3dhlsl/hlsl-shader-model-6-4-features-for-direct3d-12.md#unsigned-integer-dot-product-of-4-elements-and-accumulate)**
</pre>
dxs0uz1WeOP4OKkCcUfxTGicM_qr1p8F4OkWdbIpuYGwj1x2p-b4O_MGjMUbagCRhXGFgJ_0532gcESXD7RuBODEzc7j8OvbX3F35LOHAj9pA1kw1nfJcHSDBc9v57cLHIXviCiZ59xGvo7G0fuuwQAPjdIWt1xavJeILPxTk7jyKzyS4TENX-9rX5tgrMG2Wj7RApFFWB-W-lUuCuMMvdmO7Rkw8S3AL77Hrx-lpQ8DesdV5jjudToj3I_TXuu8fyoGRcYqnDjHfZWNOwF9x8W3VkIXY7YONdEN1I2gFo6f8-73THFWclu1-aRQNSK7d7zQyqi93Spfkd1zmRJEdrlQue8FhZYQmDurmsjoovO2bwClzIcA2U0mT8fYQ1Rq2lS8MJFvMjFqaXD3pHal3Ynum2N1_3-ifGhi8T9gaDl1Pajh6U_9_EVsRoyG3hgimQb_fXrig_CF-8_DPug1wr3X2KXRWF_P7jOXH7Qqg05M3ZuR844cEnyCBv2JRY9F_hOLHoviJxbj0-gdl7xu65NufiC4qbjB--6KEXODTdu4g9w3iHyrZK-EUPdcltiE1bVbbSYB7C_vB1C8vD5y-HKDPbM_9iRotj5hNZ9Mf6CXgme8PpE3N5pH02gP1LYaTLRXOjquiBLSOY9KhiteVqBPkfI-eRwuS4bCZHyVcqbAuwbAVBjVj0dDcPS3S_h4P-z72d39TqiDyPKfF-79hVV0DLpjUeHN8_o6u9Fs3d-YXHWd262y0YfhxmSKL8NVv_HQZ0XR1q0ro0bp8uf9mradGlHX7o2YslF3cROpfTSNoFMjopJFdFDjiP8FW6VsmS7pBaySNyRJ5kmcJBfVapHHb2hS0Hk6ZTlN8uW-WBTAigSKJV0kxQVfkZhM4zfJnMTJfPpmktM53RfJfFosp8WMpWgaQ025mAhxqCdKlxfcmBZWyyVJlheC5iCM_8MJQmqwNG9L_1G0QYR0944ozfor3G4mBM8TsmMT8DihLEqzI-1se6FXTokob0uDprHgxppBLcutgNXw1xkv3Mw6ln1UX7RarF4I3e6W1EtutPo3FBaRnQfBxWzA4bAi_w0AAP__CC5hIg">