<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">On Tue, Apr 5, 2016 at 8:23 PM, Hal Finkel <span dir="ltr"><<a href="mailto:hfinkel@anl.gov" target="_blank">hfinkel@anl.gov</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><span class="">----- Original Message -----<br>
> From: "David Majnemer via llvm-commits" <<a href="mailto:llvm-commits@lists.llvm.org">llvm-commits@lists.llvm.org</a>><br>
> To: <a href="mailto:llvm-commits@lists.llvm.org">llvm-commits@lists.llvm.org</a><br>
> Sent: Tuesday, April 5, 2016 7:15:01 PM<br>
> Subject: [llvm] r265493 - [SLPVectorizer] Vectorize libcalls of sqrt<br>
><br>
> Author: majnemer<br>
> Date: Tue Apr 5 19:14:59 2016<br>
> New Revision: 265493<br>
><br>
> URL: <a href="http://llvm.org/viewvc/llvm-project?rev=265493&view=rev" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-project?rev=265493&view=rev</a><br>
> Log:<br>
> [SLPVectorizer] Vectorize libcalls of sqrt<br>
><br>
> We didn't realize that we could transform the libcall into a vectorized<br>
> intrinsic.<br>
<br>
</span>But, as I recall, we can't. The problem is that our sqrt intrinsic is a special case: its undefined-behavior semantics differ from the libcall's. Specifically, the LangRef says, "Unlike sqrt in libm, however, llvm.sqrt has undefined behavior for negative numbers other than -0.0 (which allows for better optimization, because there is no need to worry about errno being set). llvm.sqrt(-0.0) is defined to return -0.0 like IEEE sqrt."<br>
<br>
We can do this only if we're in no-NaNs mode, where the NaN that a negative input would produce is assumed not to occur.<br></blockquote><div><br></div><div>Ah, good call. Fixed in r265521.</div>
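<div>For readers skimming the thread, here is a minimal sketch of the guard Hal suggests: map the sqrt libcalls to the sqrt intrinsic only when the call carries the no-NaNs flag. It is a toy model under stated assumptions, not the code from r265521; <code>CallSite</code>, <code>NoNaNs</code>, <code>IntrinsicID</code>, and <code>getVectorIntrinsicFor</code> are hypothetical stand-ins for the real LLVM types and helpers.</div>
<pre>
```cpp
// Toy model, not LLVM's real API: every name below is hypothetical.
enum class LibFunc { sqrt, sqrtf, sqrtl, pow, other };
enum class IntrinsicID { not_intrinsic, sqrt, pow };

// Stand-in for a call instruction: which library function it calls and
// whether the call is marked with the no-NaNs fast-math flag.
struct CallSite {
  LibFunc Callee;
  bool NoNaNs;
};

// Decide which intrinsic (if any) a libcall may be rewritten to.
IntrinsicID getVectorIntrinsicFor(CallSite CS) {
  switch (CS.Callee) {
  case LibFunc::sqrt:
  case LibFunc::sqrtf:
  case LibFunc::sqrtl:
    // The libcall is defined for negative inputs (it returns NaN and may
    // set errno); the intrinsic, per the LangRef text quoted above, is not.
    // Only rewrite when NaN results are assumed not to occur.
    return CS.NoNaNs ? IntrinsicID::sqrt : IntrinsicID::not_intrinsic;
  case LibFunc::pow:
    // pow has no such restriction in this model.
    return IntrinsicID::pow;
  case LibFunc::other:
    break;
  }
  return IntrinsicID::not_intrinsic;
}
```
</pre>
<div>In this model a plain sqrt call yields <code>not_intrinsic</code> and stays scalar, while the same call marked no-NaNs maps to the sqrt intrinsic and becomes a candidate for vectorization.</div>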
<div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
<br>
-Hal<br>
<div class=""><div class="h5"><br>
><br>
> Modified:<br>
> llvm/trunk/lib/Analysis/VectorUtils.cpp<br>
> llvm/trunk/test/Transforms/LoopVectorize/X86/veclib-calls.ll<br>
> llvm/trunk/test/Transforms/SLPVectorizer/X86/call.ll<br>
><br>
> Modified: llvm/trunk/lib/Analysis/VectorUtils.cpp<br>
> URL:<br>
> <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Analysis/VectorUtils.cpp?rev=265493&r1=265492&r2=265493&view=diff" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Analysis/VectorUtils.cpp?rev=265493&r1=265492&r2=265493&view=diff</a><br>
> ==============================================================================<br>
> --- llvm/trunk/lib/Analysis/VectorUtils.cpp (original)<br>
> +++ llvm/trunk/lib/Analysis/VectorUtils.cpp Tue Apr 5 19:14:59 2016<br>
> @@ -220,6 +220,10 @@ Intrinsic::ID llvm::getIntrinsicIDForCal<br>
> case LibFunc::powf:<br>
> case LibFunc::powl:<br>
> return checkBinaryFloatSignature(*CI, Intrinsic::pow);<br>
> + case LibFunc::sqrt:<br>
> + case LibFunc::sqrtf:<br>
> + case LibFunc::sqrtl:<br>
> + return checkUnaryFloatSignature(*CI, Intrinsic::sqrt);<br>
> }<br>
><br>
> return Intrinsic::not_intrinsic;<br>
><br>
> Modified:<br>
> llvm/trunk/test/Transforms/LoopVectorize/X86/veclib-calls.ll<br>
> URL:<br>
> <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/LoopVectorize/X86/veclib-calls.ll?rev=265493&r1=265492&r2=265493&view=diff" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/LoopVectorize/X86/veclib-calls.ll?rev=265493&r1=265492&r2=265493&view=diff</a><br>
> ==============================================================================<br>
> --- llvm/trunk/test/Transforms/LoopVectorize/X86/veclib-calls.ll (original)<br>
> +++ llvm/trunk/test/Transforms/LoopVectorize/X86/veclib-calls.ll Tue Apr 5 19:14:59 2016<br>
> @@ -3,31 +3,6 @@<br>
> target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128-n8:16:32:64-S128"<br>
> target triple = "x86_64-unknown-linux-gnu"<br>
><br>
> -;CHECK-LABEL: @sqrt_f32(<br>
> -;CHECK: vsqrtf{{.*}}<4 x float><br>
> -;CHECK: ret void<br>
> -declare float @sqrtf(float) nounwind readnone<br>
> -define void @sqrt_f32(i32 %n, float* noalias %y, float* noalias %x) nounwind uwtable {<br>
> -entry:<br>
> - %cmp6 = icmp sgt i32 %n, 0<br>
> - br i1 %cmp6, label %for.body, label %for.end<br>
> -<br>
> -for.body: ; preds = %entry, %for.body<br>
> - %indvars.iv = phi i64 [ %indvars.iv.next, %for.body ], [ 0, %entry ]<br>
> - %arrayidx = getelementptr inbounds float, float* %y, i64 %indvars.iv<br>
> - %0 = load float, float* %arrayidx, align 4<br>
> - %call = tail call float @sqrtf(float %0) nounwind readnone<br>
> - %arrayidx2 = getelementptr inbounds float, float* %x, i64 %indvars.iv<br>
> - store float %call, float* %arrayidx2, align 4<br>
> - %indvars.iv.next = add i64 %indvars.iv, 1<br>
> - %lftr.wideiv = trunc i64 %indvars.iv.next to i32<br>
> - %exitcond = icmp eq i32 %lftr.wideiv, %n<br>
> - br i1 %exitcond, label %for.end, label %for.body<br>
> -<br>
> -for.end: ; preds = %for.body, %entry<br>
> - ret void<br>
> -}<br>
> -<br>
> ;CHECK-LABEL: @exp_f32(<br>
> ;CHECK: vexpf{{.*}}<4 x float><br>
> ;CHECK: ret void<br>
> @@ -160,6 +135,7 @@ for.end:<br>
> ;CHECK-LABEL: @sqrt_f32_nobuiltin(<br>
> ;CHECK-NOT: vsqrtf{{.*}}<4 x float><br>
> ;CHECK: ret void<br>
> +declare float @sqrtf(float) nounwind readnone<br>
> define void @sqrt_f32_nobuiltin(i32 %n, float* noalias %y, float* noalias %x) nounwind uwtable {<br>
> entry:<br>
> %cmp6 = icmp sgt i32 %n, 0<br>
><br>
> Modified: llvm/trunk/test/Transforms/SLPVectorizer/X86/call.ll<br>
> URL:<br>
> <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/SLPVectorizer/X86/call.ll?rev=265493&r1=265492&r2=265493&view=diff" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/SLPVectorizer/X86/call.ll?rev=265493&r1=265492&r2=265493&view=diff</a><br>
> ==============================================================================<br>
> --- llvm/trunk/test/Transforms/SLPVectorizer/X86/call.ll (original)<br>
> +++ llvm/trunk/test/Transforms/SLPVectorizer/X86/call.ll Tue Apr 5 19:14:59 2016<br>
> @@ -7,6 +7,7 @@ declare double @sin(double)<br>
> declare double @cos(double)<br>
> declare double @pow(double, double)<br>
> declare double @exp2(double)<br>
> +declare double @sqrt(double)<br>
> declare i64 @round(i64)<br>
><br>
><br>
> @@ -92,6 +93,28 @@ entry:<br>
> store double %call, double* %c, align 8<br>
> %arrayidx5 = getelementptr inbounds double, double* %c, i64 1<br>
> store double %call5, double* %arrayidx5, align 8<br>
> + ret void<br>
> +}<br>
> +<br>
> +<br>
> +; CHECK: sqrt_libm<br>
> +; CHECK: call <2 x double> @llvm.sqrt.v2f64<br>
> +; CHECK: ret void<br>
> +define void @sqrt_libm(double* %a, double* %b, double* %c) {<br>
> +entry:<br>
> + %i0 = load double, double* %a, align 8<br>
> + %i1 = load double, double* %b, align 8<br>
> + %mul = fmul double %i0, %i1<br>
> + %call = tail call double @sqrt(double %mul) nounwind readnone<br>
> + %arrayidx3 = getelementptr inbounds double, double* %a, i64 1<br>
> + %i3 = load double, double* %arrayidx3, align 8<br>
> + %arrayidx4 = getelementptr inbounds double, double* %b, i64 1<br>
> + %i4 = load double, double* %arrayidx4, align 8<br>
> + %mul5 = fmul double %i3, %i4<br>
> + %call5 = tail call double @sqrt(double %mul5) nounwind readnone<br>
> + store double %call, double* %c, align 8<br>
> + %arrayidx5 = getelementptr inbounds double, double* %c, i64 1<br>
> + store double %call5, double* %arrayidx5, align 8<br>
> ret void<br>
> }<br>
><br>
><br>
><br>
<br>
</div></div><span class=""><font color="#888888">--<br>
Hal Finkel<br>
Assistant Computational Scientist<br>
Leadership Computing Facility<br>
Argonne National Laboratory<br>
</font></span></blockquote></div><br></div></div>