<table border="1" cellspacing="0" cellpadding="8">
    <tr>
        <th>Issue</th>
        <td>
            <a href="https://github.com/llvm/llvm-project/issues/99137">99137</a>
        </td>
    </tr>

    <tr>
        <th>Summary</th>
        <td>
            Implement the `msad4` HLSL Function
        </td>
    </tr>

    <tr>
      <th>Labels</th>
      <td>
            metabug,
            backend:DirectX,
            HLSL,
            backend:SPIR-V,
            bot:HLSL
      </td>
    </tr>

    <tr>
      <th>Assignees</th>
      <td>
      </td>
    </tr>

    <tr>
      <th>Reporter</th>
      <td>
          farzonl
      </td>
    </tr>
</table>

<pre>
- [ ] Implement `msad4` clang builtin
- [ ] Link `msad4` clang builtin with `hlsl_intrinsics.h`
- [ ] Add sema checks for `msad4` to `CheckHLSLBuiltinFunctionCall` in `SemaChecking.cpp`
- [ ] Add codegen for `msad4` to `EmitHLSLBuiltinExpr` in `CGBuiltin.cpp`
- [ ] Add codegen tests to `clang/test/CodeGenHLSL/builtins/msad4.hlsl`
- [ ] Add sema tests to `clang/test/SemaHLSL/BuiltIns/msad4-errors.hlsl`
- [ ] Create the `int_dx_msad4` intrinsic in `IntrinsicsDirectX.td`
- [ ] Create the `DXILOpMapping` of `int_dx_msad4` to `53` in `DXIL.td`
- [ ] Create the `msad4.ll` and `msad4_errors.ll` tests in `llvm/test/CodeGen/DirectX/`
- [ ] Create the `int_spv_msad4` intrinsic in `IntrinsicsSPIRV.td`
- [ ] In `SPIRVInstructionSelector.cpp`, create the `msad4` lowering and map it to `int_spv_msad4` in `SPIRVInstructionSelector::selectIntrinsic`
- [ ] Create SPIR-V backend test case in `llvm/test/CodeGen/SPIRV/hlsl-intrinsics/msad4.ll`

## DirectX

| DXIL Opcode | DXIL OpName | Shader Model | Shader Stages |
| ----------- | ----------- | ------------ | ------------- |
| 53 | Bfi | 6.0 | () |

## SPIR-V

### SAbs:

#### Description:
**SAbs**  
  
Result is *x* if *x* ≥ 0; otherwise result is -*x*, where *x* is
interpreted as a signed integer.  
  
*Result Type* and the type of *x* must both be integer scalar or integer
vector types. *Result Type* and operand types must have the same number
of components with the same component width. Results are computed per
component.

<table>
<colgroup>
<col style="width: 20%" />
<col style="width: 20%" />
<col style="width: 20%" />
<col style="width: 20%" />
<col style="width: 20%" />
</colgroup>
<thead>
<tr>
<th>Number</th>
<th>Operand 1</th>
<th>Operand 2</th>
<th>Operand 3</th>
<th>Operand 4</th>
</tr>
</thead>
<tbody>
<tr>
<td class="tableblock halign-left valign-top"><p>5</p></td>
<td
class="tableblock halign-left valign-top"><p><em>&lt;id&gt;</em><br />
<em>x</em></p></td>
<td></td>
<td></td>
<td></td>
</tr>
</tbody>
</table>



## Test Case(s)

 
### Example 1
```hlsl
//dxc msad4_test.hlsl -T lib_6_8 -enable-16bit-types -O0

export uint4 fn(uint p1, uint2 p2, uint4 p3) {
    return msad4(p1, p2, p3);
}
```
## HLSL:

Compares a 4-byte reference value and an 8-byte source value and accumulates a vector of 4 sums. Each sum corresponds to the masked sum of absolute differences of a different byte alignment between the reference value and the source value.



| uint4 result = msad4(uint reference, uint2 source, uint4 accum); |
|------------------------------------------------------------------|



 

## Parameters

<dl> <dt>

<span id="reference"></span><span id="REFERENCE"></span>*reference*
</dt> <dd>

\[in\] The reference array of 4 bytes in one **uint** value.

</dd> <dt>

<span id="source"></span><span id="SOURCE"></span>*source*
</dt> <dd>

\[in\] The source array of 8 bytes in one **uint2** value (two **uint** components).

</dd> <dt>

<span id="accum"></span><span id="ACCUM"></span>*accum*
</dt> <dd>

\[in\] A vector of 4 values. **msad4** adds this vector to the masked sum of absolute differences of the different byte alignments between the reference value and the source value.

</dd> </dl>

## Return Value

A vector of 4 sums. Each sum corresponds to the masked sum of absolute differences of different byte alignments between the reference value and the source value. **msad4** doesn't include a difference in the sum if that difference is masked (that is, the reference byte is 0).

## Remarks

To use the **msad4** intrinsic in your shader code, call the [**ID3D11Device::CheckFeatureSupport**](/windows/desktop/api/d3d11/nf-d3d11-id3d11device-checkfeaturesupport) method with [**D3D11\_FEATURE\_D3D11\_OPTIONS**](/windows/desktop/api/d3d11/ne-d3d11-d3d11_feature) to verify that the Direct3D device supports the [**SAD4ShaderInstructions**](/windows/desktop/api/d3d11/ns-d3d11-d3d11_feature_data_d3d11_options) feature option. The **msad4** intrinsic requires a WDDM 1.2 display driver, and all WDDM 1.2 display drivers must support **msad4**. If your app creates a rendering device with [feature level](/windows/desktop/direct3d11/overviews-direct3d-11-devices-downlevel-intro) 11.0 or 11.1 and the compilation target is shader model 5 or later, the HLSL source code can use the **msad4** intrinsic.

Return values are only accurate up to 65535. If you call the **msad4** intrinsic with inputs that might result in return values greater than 65535, **msad4** produces undefined results.
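
The 65535 limit above bounds how many `msad4` calls can be chained through one accumulator. The following C++ snippet is a rough worst-case estimate, not part of any API: it assumes the limit applies to each component separately, and that a single call adds at most 4 × 255 to a component (four unmasked byte lanes, each differing by the maximum amount). The names `kMaxPerCall`, `kAccuracyLimit`, and `safeCallCount` are hypothetical.

```cpp
#include <cstdint>

// Worst case added to one component by a single msad4 call:
// four byte lanes, each with an absolute difference of at most 255.
constexpr uint32_t kMaxPerCall = 4 * 255;

// Documented accuracy limit, assumed here to apply per component.
constexpr uint32_t kAccuracyLimit = 65535;

// Number of chained msad4 calls guaranteed to stay within the limit
// when every accumulator component starts at or below `initial`.
constexpr uint32_t safeCallCount(uint32_t initial) {
  return initial > kAccuracyLimit ? 0
                                  : (kAccuracyLimit - initial) / kMaxPerCall;
}
```

Under these assumptions, an accumulator that starts at zero stays within the limit for 64 chained calls regardless of input; longer chains can still be fine when the data is known to differ by less than the worst case.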

### Minimum Shader Model

This function is supported in the following shader models.



| Shader Model                                                | Supported |
|-------------------------------------------------------------|-----------|
| [Shader model 5 or later](https://github.com/MicrosoftDocs/win32/blob/docs/desktop-src//direct3dhlsl/d3d11-graphics-reference-sm5.md) | yes       |



 

## Examples

Here is an example result calculation for **msad4**:


```
reference = 0xA100B2C3
source.x = 0xD7B0C372
source.y = 0x4F57C2A3
accum = {1,2,3,4}
result.x alignment source: 0xD7B0C372
result.x = accum.x + |0xD7 - 0xA1| + 0 (masked) + |0xC3 - 0xB2| + |0x72 - 0xC3| = 1 + 54 + 0 + 17 + 81 = 153
result.y alignment source: 0xA3D7B0C3
result.y = accum.y + |0xA3 - 0xA1| + 0 (masked) + |0xB0 - 0xB2| + |0xC3 - 0xC3| = 2 + 2 + 0 + 2 + 0 = 6
result.z alignment source: 0xC2A3D7B0
result.z = accum.z + |0xC2 - 0xA1| + 0 (masked) + |0xD7 - 0xB2| + |0xB0 - 0xC3| = 3 + 33 + 0 + 37 + 19 = 92
result.w alignment source: 0x57C2A3D7
result.w = accum.w + |0x57 - 0xA1| + 0 (masked) + |0xA3 - 0xB2| + |0xD7 - 0xC3| = 4 + 74 + 0 + 15 + 20 = 113
result = {153,6,92,113}
```
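
The calculation above can be cross-checked with a small CPU-side model, which may also be handy when writing expected values for the codegen tests. This is a minimal C++ sketch inferred from the worked example, not the DXC or Clang implementation: `uint4`, `maskedSad`, and `msad4Ref` are hypothetical names, and the model assumes alignment *k* is the 32-bit window obtained by shifting the 64-bit value `source.y:source.x` right by 8·*k* bits.

```cpp
#include <array>
#include <cstdint>
#include <cstdlib>

using uint4 = std::array<uint32_t, 4>;

// Masked sum of absolute differences over the four byte lanes of `ref` and
// `src`. A lane is skipped (masked) when its reference byte is 0.
uint32_t maskedSad(uint32_t ref, uint32_t src) {
  uint32_t sum = 0;
  for (int lane = 0; lane < 4; ++lane) {
    const int r = (ref >> (8 * lane)) & 0xFF;
    const int s = (src >> (8 * lane)) & 0xFF;
    if (r != 0)
      sum += static_cast<uint32_t>(std::abs(r - s));
  }
  return sum;
}

// Hypothetical reference model of msad4(reference, uint2(srcX, srcY), accum).
uint4 msad4Ref(uint32_t reference, uint32_t srcX, uint32_t srcY, uint4 accum) {
  // Treat the source as one 64-bit value with srcX in the low half.
  const uint64_t wide = (static_cast<uint64_t>(srcY) << 32) | srcX;
  for (int k = 0; k < 4; ++k) {
    // Alignment k: the 32-bit window starting k bytes into the source.
    const uint32_t window = static_cast<uint32_t>(wide >> (8 * k));
    accum[k] += maskedSad(reference, window);
  }
  return accum;
}
```

Calling `msad4Ref(0xA100B2C3, 0xD7B0C372, 0x4F57C2A3, {1, 2, 3, 4})` reproduces the `{153,6,92,113}` result shown above. The model does not attempt to mimic the undefined behavior for sums above 65535.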



Here is an example of how you can use **msad4** to search for a reference pattern within a buffer:


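```hlsl
// Each thread accumulates the masked SAD of the reference pattern against
// the source words starting at its own index, for all four byte alignments.
uint4 accum = {0,0,0,0};
for (uint i = 0; i < REF_SIZE; i++)
    accum = msad4(
        buf_ref[i],
        uint2(buf_src[DTid.x+i], buf_src[DTid.x+i+1]),
        accum);
// A component equal to 0 means every unmasked byte matched at that alignment.
buf_accum[DTid.x] = accum;
```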
```
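
To see what one thread of the search loop produces, the loop can be emulated on the CPU. The C++ sketch below is illustrative only: `msad4Model` and `searchAt` are hypothetical helpers, the byte-window interpretation is inferred from the worked example earlier in this section, and the caller is assumed to guarantee that `src` holds at least `tid + ref.size() + 1` words.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

using uint4 = std::array<uint32_t, 4>;

// Hypothetical CPU-side model of msad4 (not a real library function).
uint4 msad4Model(uint32_t ref, uint32_t srcX, uint32_t srcY, uint4 accum) {
  const uint64_t wide = (static_cast<uint64_t>(srcY) << 32) | srcX;
  for (int k = 0; k < 4; ++k) {
    // Window for alignment k: 32 bits starting k bytes into the source.
    const uint32_t window = static_cast<uint32_t>(wide >> (8 * k));
    for (int lane = 0; lane < 4; ++lane) {
      const int r = (ref >> (8 * lane)) & 0xFF;
      const int s = (window >> (8 * lane)) & 0xFF;
      if (r != 0)  // a zero reference byte masks this lane
        accum[k] += static_cast<uint32_t>(std::abs(r - s));
    }
  }
  return accum;
}

// Emulates the loop body for the thread whose DTid.x equals `tid`.
// Precondition (assumed): src.size() >= tid + ref.size() + 1.
uint4 searchAt(const std::vector<uint32_t>& ref,
               const std::vector<uint32_t>& src, std::size_t tid) {
  uint4 accum = {0, 0, 0, 0};
  for (std::size_t i = 0; i < ref.size(); ++i)
    accum = msad4Model(ref[i], src[tid + i], src[tid + i + 1], accum);
  return accum;
}
```

For example, with `ref = {0x44332211}` and `src = {0x33221100, 0x00000044}`, `searchAt(ref, src, 0)` returns 0 in component 1, which indicates that the pattern matches one byte into the source word.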



## Requirements



| Requirement | Value |
|--------------------------|--------------------------------------------------|
| Minimum supported client | Windows 8 \[desktop apps \| UWP apps\] |
| Minimum supported server | Windows Server 2012 \[desktop apps \| UWP apps\] |



## See also

<dl> <dt>

[Intrinsic Functions](https://github.com/MicrosoftDocs/win32/blob/docs/desktop-src//direct3dhlsl/dx-graphics-hlsl-intrinsic-functions.md)
</dt> </dl>
</pre>
e-R2BDWtDoRdI1AayHtjgVBsaWhA2HL6nQlLrr_5FPWgaGN8F_pKzFHT-6IaTY3rvaBZA5YpLwr3bLx2z9xuok0YbsSTTIVrQHFSsxPcNlrcNzUcsA6VxBtsqXYnWWXqBE1jJsw3LHx2vJZwH2VhSNiuyid4OZH4kyH5CY6SckfZtJSXtBT2qDI_mmrlQrFInDzutquHr3dbuOzWPn95uP386f79YgkvFv7_6EUBGayiT0LL_MXtHpjINb7RhjrpqZfYjO13v9rMXIM76OjN-yUz5yR7zLjlj25J1Z7yknoodUtTzJJvOYMW_2qkK01-32w-0nDKaCZNXfAXmmn5JDT4BtYgRXEJxfd23gqv-U3pbe58jnfDEuCoRZW5AYm3Y7vPrRaFeBLFm6bK3E44S6knoZ-kOJlJuzwBsyFtM8nUqUKCONxQYK0wnAbQ1obhNOwCHnpIWXCwH7VcHwT24j5aShxdxPAQlGO6DVao99pUgeOPlFc_E40-nH0qdCkeG1tVFS9Y-GhuBW1qcMN5HEdxa8xBwL6xv2hRWdUNOie3tJSHo-1mDFVb53rOB9wbDaiVYwcKvmZQa5U1kF6bKhO5rETmSZpRfoIU9VFWsmzK0ezH5ys4sXI_FkYbOwfCEQZqlquiUCdwkKH5zasKczRXeucHn-8Y_zPl5fj5fhpF4vX9BT9CJz9aW-PACbufg7THZj9NVUnY7qNMtTIqtxuFM7mTrCJG2G5fqD3EgVv1gTExOvUdlA8EbKt8TpkcNK-PMjWT7oiZmDKelpmfidEXYXrrXK6RfQPnj59fhcZTildU-M7Ou1nKi7TxEYVD_O_9qZ-xjZup_giEtiF4XoVBsGZJ5JovF27TZw_cLNZBEi1YD3nxkNkuXiRsFZHAlWG4TBZr6Omgo4sIS2aulXMST58H_Y-vTqPVmEeHCcSQLFwz7FAAkVIU2E0b1zSAE96d9WjlFjGJEHHNWkRcXTBcTSJcjTY0RFg862itabjAn-vQIcRRL9TLBfFXkVNgiNmL_9LzX0U_Kf46OCe-V6oXnyGMDaTvrqMNnffy_HlBctg-kH2I2Uv-54A1-0nJ_Q59J7nXp5c8QlgUDUSPnOHDJSIsB85wuiC9c7_NYojZS3_q2cc_6zars27jdeqld_6yGLlN7OzvTB-GvTN0YRFDQMwJS5YQHYDyeszRfs9EvcrpUZ38GeVOwdcniFXUCK7TI2YEPqh2a26t0O59pKwop_sG6uVLOWIwImgVCAhL-n-LjUsXudLtzEGSaBOQaC1JlNxtd4_3t_-3xVu2xu_SD4F6qu3IwgPgs2_yRy1yaNgweyd0APRN5zUgQTKO15sHmU2fCeuwz4LYOkTw8jt6gxEICVbwqFvpHo43vUf52dS57er6Biz7sAv6_jAdwPAwwP7snQfjP3CAemna8qGvDdJCisrieNRNR1HK311xSK-pa6L9SQgVp8GlRUK__v4F7117PaQwKgcucjZCP-Fg_Czne4RSFoTsL8nw3VHbvpIS0LEa9fYUKl53bwVp-6Ld_Ldqiue-oBi_O5y0xZ1xtcXr2Uc7F7jKbqJsGS35lbgJFywM50EYBlfHGyEW-XUYRnnAWchnocjDPGeCL2fxPhLB7EresIDNgkU4Z0EQxsF0Np-J9JrNeBru8ziak1kgSi6LaVE8lVOlD1fSmEbcLJdhtLgq-F4UBv_kgrFSWL5vDoRB2iOM-TeoJFp175o9xL24f4Xm3x92AGVJtGpx482VvgEhJvvmYMgsKKSxphfLSluIm_7vOr5_m4wtRru5V40ubt7YWv-SF9nVWv2_SC1hO9Qc9tQp_3TD_h0AAP__SL0Vqg">