[llvm] [AMDGPU][GFX12] Default component broadcast store (PR #76212)

Wed Dec 27 10:14:38 PST 2023

================
@@ -23,7 +23,8 @@ define amdgpu_ps void @image_store_1d_store_insert_zeros_at_end(<8 x i32> inreg
 ; GCN-NEXT:    ret void
 ;
 ; GFX12-LABEL: @image_store_1d_store_insert_zeros_at_end(
-; GFX12-NEXT:    call void @llvm.amdgcn.image.store.1d.f32.i32(float [[VDATA1:%.*]], i32 1, i32 [[S:%.*]], <8 x i32> [[RSRC:%.*]], i32 0, i32 0)
+; GFX12-NEXT:    [[NEWVDATA4:%.*]] = insertelement <4 x float> <float poison, float 0.000000e+00, float 0.000000e+00, float 0.000000e+00>, float [[VDATA1:%.*]], i64 0
+; GFX12-NEXT:    call void @llvm.amdgcn.image.store.1d.v4f32.i32(<4 x float> [[NEWVDATA4]], i32 15, i32 [[S:%.*]], <8 x i32> [[RSRC:%.*]], i32 0, i32 0)
----------------
jayfoad wrote:

No, the old optimization is not wrong. GFX12 has an incompatible change in the behaviour of the instruction. The old optimization takes advantage of the pre-GFX12 behaviour, and the new optimization takes advantage of the GFX12 behaviour. This relies on the subtarget being set correctly so we know whether we're compiling for GFX12 or not - is that a problem?
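
To illustrate the subtarget dependence described above, here is a minimal standalone sketch. It is not the code from the PR. It assumes, going by the PR title and the test diff, that pre-GFX12 targets fill components left out of the dmask with zero while GFX12 fills them with a copy of the first component. The names `DefaultComponent` and `demandedComponents` are hypothetical and do not correspond to LLVM's API.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a subtarget query (assumption, not LLVM's API):
// what the hardware supplies for components that are not in the dmask.
enum class DefaultComponent { Zero, Broadcast };

// Returns how many leading components of a 4-wide vdata still have to be
// stored. A trailing component can be dropped only when the hardware default
// for the selected mode would reproduce the same value.
std::size_t demandedComponents(const std::array<float, 4> &vdata,
                               DefaultComponent mode) {
  std::size_t n = vdata.size();
  while (n > 1) {
    // Value the hardware is assumed to supply for a dropped component.
    const float expected =
        (mode == DefaultComponent::Zero) ? 0.0f : vdata[0];
    if (vdata[n - 1] != expected)
      break; // This component carries real data; keep it and all before it.
    --n;
  }
  return n;
}
```

Under these assumptions, a value padded with trailing zeros shrinks to a single component in `Zero` mode but stays four components wide in `Broadcast` mode, which is consistent with the GFX12 check lines in the diff keeping the `<4 x float>` store with dmask 15. Picking the wrong mode would silently change the stored data, which is why the subtarget has to be set correctly.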

https://github.com/llvm/llvm-project/pull/76212