[llvm] [AMDGPU][GFX12] Default component broadcast store (PR #76212)
Jay Foad via llvm-commits
llvm-commits at lists.llvm.org
Wed Jan 3 07:28:49 PST 2024
================
@@ -23,7 +23,8 @@ define amdgpu_ps void @image_store_1d_store_insert_zeros_at_end(<8 x i32> inreg
; GCN-NEXT: ret void
;
; GFX12-LABEL: @image_store_1d_store_insert_zeros_at_end(
-; GFX12-NEXT: call void @llvm.amdgcn.image.store.1d.f32.i32(float [[VDATA1:%.*]], i32 1, i32 [[S:%.*]], <8 x i32> [[RSRC:%.*]], i32 0, i32 0)
+; GFX12-NEXT: [[NEWVDATA4:%.*]] = insertelement <4 x float> <float poison, float 0.000000e+00, float 0.000000e+00, float 0.000000e+00>, float [[VDATA1:%.*]], i64 0
+; GFX12-NEXT: call void @llvm.amdgcn.image.store.1d.v4f32.i32(<4 x float> [[NEWVDATA4]], i32 15, i32 [[S:%.*]], <8 x i32> [[RSRC:%.*]], i32 0, i32 0)
----------------
jayfoad wrote:
> We can't let optimizations run on IR without a set target-cpu break a program that ultimately executes on gfx12.
So is it OK for this optimization to do different things for GFX12 and pre-GFX12? Do we need a third option where it does nothing if the target CPU is not known?
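
To make the "third option" concrete, here is a minimal sketch of a three-way decision. It is not taken from the PR; every name in it (`StoreRewrite`, `chooseStoreRewrite`, the `std::optional<bool>` encoding of "target CPU known or not") is hypothetical and not an LLVM API. It also assumes, from the test diff above, that the pre-GFX12 form is the narrowed store and the GFX12 form keeps all four components.

```cpp
#include <optional>

// Hypothetical illustration only; none of these names exist in LLVM.
enum class StoreRewrite {
  NarrowStore,       // assumed pre-GFX12 behaviour: shrink vdata and dmask
  KeepAllComponents, // assumed GFX12 behaviour: keep the full v4 store
  LeaveUnchanged     // third option: do not touch the call at all
};

// IsGfx12 is std::nullopt when no target-cpu is set on the function.
StoreRewrite chooseStoreRewrite(std::optional<bool> IsGfx12) {
  if (!IsGfx12.has_value())
    return StoreRewrite::LeaveUnchanged; // unknown target: assume nothing
  return *IsGfx12 ? StoreRewrite::KeepAllComponents
                  : StoreRewrite::NarrowStore;
}
```

With this shape, IR that has no target-cpu is passed through untouched, so a later compile for gfx12 still sees the original store.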
https://github.com/llvm/llvm-project/pull/76212