[llvm] [AMDGPU] Use reverse iteration in CodeGenPrepare (PR #145484)

Matt Arsenault via llvm-commits llvm-commits at lists.llvm.org
Fri Oct 10 02:00:38 PDT 2025


================
@@ -2160,7 +2160,22 @@ define amdgpu_kernel void @rsq_f32_vector_fpmath(ptr addrspace(1) %out, <2 x flo
 ; IEEE-GOODFREXP-NEXT:    [[TMP38:%.*]] = insertelement <2 x float> poison, float [[TMP27]], i64 0
 ; IEEE-GOODFREXP-NEXT:    [[MD_1ULP_UNDEF:%.*]] = insertelement <2 x float> [[TMP38]], float [[TMP37]], i64 1
 ; IEEE-GOODFREXP-NEXT:    store volatile <2 x float> [[MD_1ULP_UNDEF]], ptr addrspace(1) [[OUT]], align 4
-; IEEE-GOODFREXP-NEXT:    [[SQRT_X_3ULP:%.*]] = call contract <2 x float> @llvm.sqrt.v2f32(<2 x float> [[X]]), !fpmath [[META3:![0-9]+]]
+; IEEE-GOODFREXP-NEXT:    [[TMP56:%.*]] = extractelement <2 x float> [[X]], i64 0
----------------
arsenm wrote:

This is an improvement: we're now getting the targeted rsq expansion that was missed before.

https://github.com/llvm/llvm-project/pull/145484

