[llvm] [AMDGPU][GFX12] Default component broadcast store (PR #76212)

Jay Foad via llvm-commits llvm-commits at lists.llvm.org
Wed Jan 3 07:28:49 PST 2024


================
@@ -23,7 +23,8 @@ define amdgpu_ps void @image_store_1d_store_insert_zeros_at_end(<8 x i32> inreg
 ; GCN-NEXT:    ret void
 ;
 ; GFX12-LABEL: @image_store_1d_store_insert_zeros_at_end(
-; GFX12-NEXT:    call void @llvm.amdgcn.image.store.1d.f32.i32(float [[VDATA1:%.*]], i32 1, i32 [[S:%.*]], <8 x i32> [[RSRC:%.*]], i32 0, i32 0)
+; GFX12-NEXT:    [[NEWVDATA4:%.*]] = insertelement <4 x float> <float poison, float 0.000000e+00, float 0.000000e+00, float 0.000000e+00>, float [[VDATA1:%.*]], i64 0
+; GFX12-NEXT:    call void @llvm.amdgcn.image.store.1d.v4f32.i32(<4 x float> [[NEWVDATA4]], i32 15, i32 [[S:%.*]], <8 x i32> [[RSRC:%.*]], i32 0, i32 0)
----------------
jayfoad wrote:

> We can't let optimizations run on IR without a set target-cpu break a program that ultimately executes on gfx12.

So is it OK for this optimization to do different things for GFX12 and pre-GFX12? Do we need a third option where it does nothing if the target CPU is not known?
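
To make the question concrete, here is a minimal sketch of what a three-way policy could look like. All names here (`TargetCpu`, `StoreRewrite`, `chooseStoreRewrite`) are hypothetical and not taken from the PR. It assumes, as the updated GFX12 check lines above suggest, that trailing zero components are trimmed only before GFX12.

```cpp
// Hypothetical sketch, not code from the PR: every name here is invented
// for illustration.
enum class TargetCpu { Unknown, PreGFX12, GFX12 };

enum class StoreRewrite {
  TrimTrailingZeros, // shrink vdata and dmask, as in the pre-GFX12 checks
  LeaveUnchanged     // keep the full vector and the original dmask
};

// Assumption: the trim is only treated as safe when the target CPU is known
// to be pre-GFX12. The "third option" is the Unknown case, which does nothing.
inline StoreRewrite chooseStoreRewrite(TargetCpu Cpu) {
  switch (Cpu) {
  case TargetCpu::PreGFX12:
    return StoreRewrite::TrimTrailingZeros;
  case TargetCpu::GFX12:
  case TargetCpu::Unknown:
    return StoreRewrite::LeaveUnchanged;
  }
  return StoreRewrite::LeaveUnchanged; // unreachable; keeps compilers quiet
}
```

With this shape, IR that has no target-cpu set is never rewritten in a way that could later be wrong on gfx12, at the cost of missing the optimization for such IR.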

https://github.com/llvm/llvm-project/pull/76212

