[llvm-branch-commits] [llvm] AMDGPU: Stop requiring afn for f32 rsq formation (PR #172082)

Yingwei Zheng via llvm-branch-commits llvm-branch-commits at lists.llvm.org
Fri Dec 19 11:08:56 PST 2025


================
@@ -3582,9 +3585,13 @@ define <4 x float> @rsq_f32_vector_mixed_constant_numerator_correct_sqrt(<4 x fl
 ; DAZ-NEXT:    [[TMP2:%.*]] = extractelement <4 x float> [[DENOM]], i64 1
 ; DAZ-NEXT:    [[TMP3:%.*]] = extractelement <4 x float> [[DENOM]], i64 2
 ; DAZ-NEXT:    [[TMP4:%.*]] = extractelement <4 x float> [[DENOM]], i64 3
-; DAZ-NEXT:    [[TMP5:%.*]] = call contract float @llvm.amdgcn.rcp.f32(float [[TMP1]])
-; DAZ-NEXT:    [[TMP6:%.*]] = fneg contract float [[TMP2]]
-; DAZ-NEXT:    [[TMP7:%.*]] = call contract float @llvm.amdgcn.rcp.f32(float [[TMP6]])
+; DAZ-NEXT:    [[TMP31:%.*]] = extractelement <4 x float> [[ARG]], i64 0
+; DAZ-NEXT:    [[TMP6:%.*]] = extractelement <4 x float> [[ARG]], i64 1
+; DAZ-NEXT:    [[TMP32:%.*]] = extractelement <4 x float> [[ARG]], i64 2
+; DAZ-NEXT:    [[TMP33:%.*]] = extractelement <4 x float> [[ARG]], i64 3
+; DAZ-NEXT:    [[TMP5:%.*]] = call contract float @llvm.amdgcn.rsq.f32(float [[TMP31]])
----------------
dtcxzyw wrote:

Do rsq and rcp share the same latency?

https://github.com/llvm/llvm-project/pull/172082


More information about the llvm-branch-commits mailing list