[PATCH] D41599: [X86] Lowering X86 avx512 sqrt intrinsics to IR - LLVM

Uriel Korach via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Jan 1 01:17:10 PST 2018

uriel.k added a comment.

In https://reviews.llvm.org/D41599#964739, @RKSimon wrote:

> Won't this mean that explicit calls to the SSE sqrt intrinsics may be converted to the rsqrt+NR estimates in some cases?

Yes, this is expected. That is the aim of lowering the intrinsics to IR code: we want the compiler to be able to make a better decision and get better performance.
Correct me if I am missing something special about this intrinsic.
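
For readers unfamiliar with the "rsqrt+NR" sequence mentioned in the question, here is a minimal scalar sketch of the idea, not the code LLVM actually emits. The helper names are hypothetical, and `rough_rsqrt` only simulates a hardware estimate such as RSQRTPS by injecting an assumed relative error of 2^-12.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-in for a hardware reciprocal-sqrt estimate.
// Assumption: we model the estimate as the exact value scaled by a
// relative error of 2^-12, chosen only for illustration.
inline float rough_rsqrt(float x) {
    assert(x > 0.0f);  // the sketch ignores zero, negative and NaN inputs
    return (1.0f / std::sqrt(x)) * (1.0f + 1.0f / 4096.0f);
}

// One Newton-Raphson step for y ~= 1/sqrt(x):
//   y' = y * (1.5 - 0.5 * x * y * y)
// Each step roughly squares the relative error of the estimate.
inline float refine_rsqrt(float x, float y) {
    return y * (1.5f - 0.5f * x * y * y);
}

// sqrt(x) recovered as x * (1/sqrt(x)), using the refined estimate.
inline float sqrt_via_rsqrt(float x) {
    return x * refine_rsqrt(x, rough_rsqrt(x));
}
```

The refined result is close to `std::sqrt(x)` but is generally not correctly rounded, and the sketch does not handle `x == 0`. That difference from a real sqrt instruction is what the question above is about.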

Comment at: test/CodeGen/X86/sse-intrinsics-x86.ll:476
-declare <4 x float> @llvm.x86.sse.sqrt.ps(<4 x float>) nounwind readnone
+declare void @llvm.x86.sse.stmxcsr(i8*) nounwind
RKSimon wrote:
> Why did you move this test?
You are right, my mistake. Fixed.

Comment at: test/CodeGen/X86/sse2-intrinsics-fast-isel.ll:2954
-declare <2 x double> @llvm.x86.sse2.sqrt.pd(<2 x double>) nounwind readnone
+declare <2 x double> @llvm.sqrt.v2f32(<2 x double>) nounwind readnone
RKSimon wrote:
> Shouldn't that be llvm.sqrt.v2f64?

