<table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Issue</th>
<td>
<a href=https://github.com/llvm/llvm-project/issues/99171>99171</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>
Implement the `WavePrefixCountBits` HLSL Function
</td>
</tr>
<tr>
<th>Labels</th>
<td>
metabug,
backend:DirectX,
HLSL,
backend:SPIR-V,
bot:HLSL
</td>
</tr>
<tr>
<th>Assignees</th>
<td>
</td>
</tr>
<tr>
<th>Reporter</th>
<td>
farzonl
</td>
</tr>
</table>
<pre>
- [ ] Implement `WavePrefixCountBits` clang builtin
- [ ] Link `WavePrefixCountBits` clang builtin with `hlsl_intrinsics.h`
- [ ] Add sema checks for `WavePrefixCountBits` to `CheckHLSLBuiltinFunctionCall` in `SemaChecking.cpp`
- [ ] Add codegen for `WavePrefixCountBits` to `EmitHLSLBuiltinExpr` in `CGBuiltin.cpp`
- [ ] Add codegen tests to `clang/test/CodeGenHLSL/builtins/WavePrefixCountBits.hlsl`
- [ ] Add sema tests to `clang/test/SemaHLSL/BuiltIns/WavePrefixCountBits-errors.hlsl`
- [ ] Create the `int_dx_WavePrefixCountBits` intrinsic in `IntrinsicsDirectX.td`
- [ ] Create the `DXILOpMapping` of `int_dx_WavePrefixCountBits` to `136` in `DXIL.td`
- [ ] Create the `WavePrefixCountBits.ll` and `WavePrefixCountBits_errors.ll` tests in `llvm/test/CodeGen/DirectX/`
- [ ] Create the `int_spv_WavePrefixCountBits` intrinsic in `IntrinsicsSPIRV.td`
- [ ] In `SPIRVInstructionSelector.cpp`, create the `WavePrefixCountBits` lowering and map it to `int_spv_WavePrefixCountBits` in `SPIRVInstructionSelector::selectIntrinsic`
- [ ] Create SPIR-V backend test case in `llvm/test/CodeGen/SPIRV/hlsl-intrinsics/WavePrefixCountBits.ll`
## DirectX
| DXIL Opcode | DXIL OpName | Shader Model | Shader Stages |
| ----------- | ----------- | ------------ | ------------- |
| 136 | WavePrefixBitCount | 6.0 | ('library', 'compute', 'amplification', 'mesh', 'pixel', 'vertex', 'hull', 'domain', 'geometry', 'raygeneration', 'intersection', 'anyhit', 'closesthit', 'miss', 'callable', 'node') |
## SPIR-V
# [OpGroupNonUniformBallotBitCount](https://registry.khronos.org/SPIR-V/specs/unified1/SPIRV.html#OpGroupNonUniformBallotBitCount):
## Description:
Result is the number of bits that are set to 1 in *Value*, considering
only the bits in *Value* required to represent all bits of the
[group](https://registry.khronos.org/SPIR-V/specs/unified1/SPIRV.html#Group)'s invocations.
*Result Type* must be a scalar of [*integer type*](https://registry.khronos.org/SPIR-V/specs/unified1/SPIRV.html#Integer), whose
*Signedness* operand is 0.
*Execution* is a [*Scope*](https://registry.khronos.org/SPIR-V/specs/unified1/SPIRV.html#Scope_-id-) that identifies the group of
invocations affected by this command. It must be **Subgroup**.
The identity *I* for *Operation* is 0.
*Value* must be a vector of four components of [*integer
type*](https://registry.khronos.org/SPIR-V/specs/unified1/SPIRV.html#Integer) scalar, whose *Width* operand is 32 and whose
*Signedness* operand is 0.
*Value* is a set of bitfields where the first invocation is represented
in the lowest bit of the first vector component and the last (up to the
size of the group) is the higher bit number of the last bitmask needed
to represent all bits of the group invocations.
[Capability](https://registry.khronos.org/SPIR-V/specs/unified1/SPIRV.html#Capability):
**GroupNonUniformBallot**
[Missing before](https://registry.khronos.org/SPIR-V/specs/unified1/SPIRV.html#Unified) **version 1.3**.
<table style="width:100%;">
<colgroup>
<col style="width: 14%" />
<col style="width: 14%" />
<col style="width: 14%" />
<col style="width: 14%" />
<col style="width: 14%" />
<col style="width: 14%" />
<col style="width: 14%" />
</colgroup>
<thead>
<tr>
<th>Word Count</th>
<th>Opcode</th>
<th colspan="2">Results</th>
<th colspan="3">Operands</th>
</tr>
</thead>
<tbody>
<tr>
<td class="tableblock halign-left valign-top"><p>6</p></td>
<td class="tableblock halign-left valign-top"><p>342</p></td>
<td
class="tableblock halign-left valign-top"><p><em>&lt;id&gt;</em><br />
<em>Result Type</em></p></td>
<td class="tableblock halign-left valign-top"><p><a
href="#ResultId"><em>Result &lt;id&gt;</em></a></p></td>
<td class="tableblock halign-left valign-top"><p><a
href="#Scope_-id-"><em>Scope &lt;id&gt;</em></a><br />
<em>Execution</em></p></td>
<td class="tableblock halign-left valign-top"><p><a
href="#Group_Operation"><em>Group Operation</em></a><br />
<em>Operation</em></p></td>
<td
class="tableblock halign-left valign-top"><p><em>&lt;id&gt;</em><br />
<em>Value</em></p></td>
</tr>
</tbody>
</table>
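The bit-counting rule in the description above can be modelled on the CPU. The following is a minimal sketch, not backend code: the function names are hypothetical, and it assumes that the `ExclusiveScan` group operation is the one that corresponds to a prefix count, so that only the ballot bits below the current invocation's index are counted.

```cpp
#include <array>
#include <cstdint>

// Hypothetical model of OpGroupNonUniformBallotBitCount on a 4 x 32-bit ballot.
// Invocation i is stored in bit (i % 32) of component (i / 32).

// Reduce semantics: count every set bit that belongs to the group.
// groupSize is the number of invocations in the subgroup (at most 128).
uint32_t ballotBitCountReduce(const std::array<uint32_t, 4>& value,
                              uint32_t groupSize) {
    uint32_t count = 0;
    for (uint32_t i = 0; i < groupSize && i < 128; ++i)
        count += (value[i / 32] >> (i % 32)) & 1u;
    return count;
}

// ExclusiveScan semantics (assumed here to match a prefix count):
// count only the bits that lie below this invocation's index.
uint32_t ballotBitCountExclusive(const std::array<uint32_t, 4>& value,
                                 uint32_t invocation) {
    return ballotBitCountReduce(value, invocation);
}
```

Bits at or above the group size are ignored, which mirrors the requirement to consider only the bits needed to represent the group's invocations.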
## Test Case(s)
### Example 1
```hlsl
// dxc WavePrefixCountBits_test.hlsl -T lib_6_8 -enable-16bit-types -O0
export uint fn(bool p1) {
    return WavePrefixCountBits(p1);
}
```
## HLSL:
Returns the number of active lanes with indices smaller than the current lane's for which the specified boolean value is true.
## Syntax
``` syntax
uint WavePrefixCountBits(
bool bBit
);
```
## Parameters
<dl> <dt>
*bBit*
</dt> <dd>
The boolean value to count on each lane.
</dd> </dl>
## Return value
The number of active lanes with indices smaller than the current lane's for which *bBit* is true.
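
As an illustration of this return value, the sketch below models one wave on the CPU. It is not part of any implementation: the function name and the `active`/`bit` vectors are hypothetical, and both vectors are assumed to have one entry per lane.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical CPU model of one wave: active[i] says whether lane i is
// executing, and bit[i] is the value lane i passes as bBit.
// Assumes active.size() == bit.size().
std::vector<uint32_t> wavePrefixCountBits(const std::vector<bool>& active,
                                          const std::vector<bool>& bit) {
    std::vector<uint32_t> result(active.size(), 0);
    uint32_t running = 0;  // true bits seen on lower-indexed active lanes
    for (std::size_t lane = 0; lane < active.size(); ++lane) {
        if (!active[lane]) continue;  // inactive lanes do not contribute
        result[lane] = running;       // exclusive: this lane's own bit is left out
        if (bit[lane]) ++running;
    }
    return result;
}
```

The entries for inactive lanes are left at 0 only as a placeholder, since such lanes do not execute the call at all.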
## Remarks
This function is supported from shader model 6.0 in all shader stages.
## Examples
The following code shows how to implement a compacted write to an ordered stream where each lane writes either 0 or 1 elements.
``` syntax
bool bDoesThisLaneHaveAnAppendItem = <expr>;
// compute this lane's offset within the wave and the item count for the whole wave
uint laneAppendOffset = WavePrefixCountBits(bDoesThisLaneHaveAnAppendItem);
uint appendCount = WaveActiveCountBits(bDoesThisLaneHaveAnAppendItem);
// update the output location for this whole wave
uint appendOffset;
if (WaveIsFirstLane())
{
    // this way, we only issue one atomic for the entire wave, which reduces contention
    // and keeps the output data for each lane in this wave together in the output buffer
    InterlockedAdd(bufferSize, appendCount, appendOffset);
}
appendOffset = WaveReadLaneFirst(appendOffset); // broadcast value
appendOffset += laneAppendOffset; // and add in the offset for this lane
if (bDoesThisLaneHaveAnAppendItem)
{
    buffer[appendOffset] = myData; // write to the offset location for this lane
}
```
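The compaction pattern above can also be checked on the CPU. The sketch below is a hypothetical model of a single wave in which every lane is active: `hasItem` and `data` stand in for each lane's flag and payload, while `out` and `outSize` stand in for the shader's `buffer` and `bufferSize`. None of these names come from a real API, and the sketch assumes that lanes without an item write nothing.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical CPU model of the wave-level compacted append.
// Assumes hasItem.size() == data.size() and that all lanes are active.
void waveCompactAppend(const std::vector<bool>& hasItem,
                       const std::vector<int>& data,
                       std::vector<int>& out, uint32_t& outSize) {
    // Models WaveActiveCountBits: total number of lanes with an item.
    uint32_t appendCount = 0;
    for (bool b : hasItem) appendCount += b ? 1u : 0u;

    // Models the single InterlockedAdd: reserve one contiguous range.
    const uint32_t appendOffset = outSize;
    outSize += appendCount;
    if (out.size() < outSize) out.resize(outSize);

    // Models WavePrefixCountBits: each lane's slot inside the reserved range.
    uint32_t laneAppendOffset = 0;
    for (std::size_t lane = 0; lane < hasItem.size(); ++lane) {
        if (!hasItem[lane]) continue;  // lanes without an item write nothing
        out[appendOffset + laneAppendOffset] = data[lane];
        ++laneAppendOffset;
    }
}
```

Because the per-lane offsets are an exclusive prefix count, the items land in lane order with no gaps, which is what makes the output stream ordered.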
## See also
<dl> <dt>
[Overview of Shader Model 6](https://github.com/MicrosoftDocs/win32/blob/docs/desktop-src//direct3dhlsl/hlsl-shader-model-6-0-features-for-direct3d-12.md)
</dt> <dt>
[Shader Model 6](https://github.com/MicrosoftDocs/win32/blob/docs/desktop-src//direct3dhlsl/shader-model-6-0.md)
</dt> </dl>
</pre>