[libc-commits] [libc] [amdgpu][openmp] Avoiding writing to packet header twice (PR #70695)
Joseph Huber via libc-commits
libc-commits at lists.llvm.org
Mon Oct 30 11:11:07 PDT 2023
================
@@ -222,34 +222,36 @@ hsa_status_t launch_kernel(hsa_agent_t dev_agent, hsa_executable_t executable,
// Set up the packet for execution on the device. We currently only launch
// with one thread on the device, forcing the rest of the wavefront to be
// masked off.
- std::memset(packet, 0, sizeof(hsa_kernel_dispatch_packet_t));
- packet->setup = (1 + (params.num_blocks_y * params.num_threads_y != 1) +
- (params.num_blocks_z * params.num_threads_z != 1))
- << HSA_KERNEL_DISPATCH_PACKET_SETUP_DIMENSIONS;
+ uint16_t setup = (1 + (params.num_blocks_y * params.num_threads_y != 1) +
+ (params.num_blocks_z * params.num_threads_z != 1))
+ << HSA_KERNEL_DISPATCH_PACKET_SETUP_DIMENSIONS;
packet->workgroup_size_x = params.num_threads_x;
----------------
jhuber6 wrote:
I think we should keep the `memset` but start it 4 bytes into the packet, past the `header` and `setup` fields, so we don't need to worry about zeroing the packet's reserved fields by hand. Unless it's very important that we write to these fields only once.
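A minimal sketch of what that could look like, assuming a little-endian host and a GCC/Clang-style compiler. `DispatchPacket` is a hypothetical stand-in that mirrors the field layout of `hsa_kernel_dispatch_packet_t` so the example builds without the HSA headers, and `clear_packet_body` and `publish_packet` are illustrative names, not functions from this PR:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for hsa_kernel_dispatch_packet_t (64 bytes).
struct DispatchPacket {
  uint16_t header;
  uint16_t setup;
  uint16_t workgroup_size_x;
  uint16_t workgroup_size_y;
  uint16_t workgroup_size_z;
  uint16_t reserved0;
  uint32_t grid_size_x;
  uint32_t grid_size_y;
  uint32_t grid_size_z;
  uint32_t private_segment_size;
  uint32_t group_segment_size;
  uint64_t kernel_object;
  uint64_t kernarg_address;   // a void * in the real struct
  uint64_t reserved2;
  uint64_t completion_signal; // an hsa_signal_t handle in the real struct
};
static_assert(sizeof(DispatchPacket) == 64, "unexpected packet layout");

// The header and setup fields occupy the first 4 bytes of the packet.
constexpr std::size_t kHeaderBytes = 2 * sizeof(uint16_t);

// Zero everything after header/setup, including the reserved fields,
// without touching the first 4 bytes.
void clear_packet_body(DispatchPacket *packet) {
  assert(packet != nullptr);
  std::memset(reinterpret_cast<unsigned char *>(packet) + kHeaderBytes, 0,
              sizeof(DispatchPacket) - kHeaderBytes);
}

// Write header and setup together with one 32-bit release store, so the
// first 4 bytes are written exactly once, after the body is filled in.
// Assumes little-endian: header lands in the low 16 bits.
void publish_packet(DispatchPacket *packet, uint16_t header, uint16_t setup) {
  uint32_t header_word = header | (static_cast<uint32_t>(setup) << 16);
  __atomic_store_n(reinterpret_cast<uint32_t *>(packet), header_word,
                   __ATOMIC_RELEASE); // GCC/Clang builtin
}
```

The idea is to call `clear_packet_body` first, fill in the remaining fields, and call `publish_packet` last, so the offset `memset` still takes care of the reserved fields.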
https://github.com/llvm/llvm-project/pull/70695