[llvm] [AMDGPU] Use reverse iteration in CodeGenPrepare (PR #145484)
Matt Arsenault via llvm-commits
llvm-commits at lists.llvm.org
Fri Oct 10 02:00:38 PDT 2025
================
@@ -2160,7 +2160,22 @@ define amdgpu_kernel void @rsq_f32_vector_fpmath(ptr addrspace(1) %out, <2 x flo
; IEEE-GOODFREXP-NEXT: [[TMP38:%.*]] = insertelement <2 x float> poison, float [[TMP27]], i64 0
; IEEE-GOODFREXP-NEXT: [[MD_1ULP_UNDEF:%.*]] = insertelement <2 x float> [[TMP38]], float [[TMP37]], i64 1
; IEEE-GOODFREXP-NEXT: store volatile <2 x float> [[MD_1ULP_UNDEF]], ptr addrspace(1) [[OUT]], align 4
-; IEEE-GOODFREXP-NEXT: [[SQRT_X_3ULP:%.*]] = call contract <2 x float> @llvm.sqrt.v2f32(<2 x float> [[X]]), !fpmath [[META3:![0-9]+]]
+; IEEE-GOODFREXP-NEXT: [[TMP56:%.*]] = extractelement <2 x float> [[X]], i64 0
----------------
arsenm wrote:
This is an improvement, we're getting the targeted rsq expansion that was missed before
https://github.com/llvm/llvm-project/pull/145484
More information about the llvm-commits
mailing list