<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">On Tue, Apr 5, 2016 at 8:23 PM, Hal Finkel <span dir="ltr"><<a href="mailto:hfinkel@anl.gov" target="_blank">hfinkel@anl.gov</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><span class="">----- Original Message -----<br>

> From: "David Majnemer via llvm-commits" <<a href="mailto:llvm-commits@lists.llvm.org">llvm-commits@lists.llvm.org</a>><br>

> To: <a href="mailto:llvm-commits@lists.llvm.org">llvm-commits@lists.llvm.org</a><br>

> Sent: Tuesday, April 5, 2016 7:15:01 PM<br>

> Subject: [llvm] r265493 - [SLPVectorizer] Vectorize libcalls of sqrt<br>

><br>

> Author: majnemer<br>

> Date: Tue Apr  5 19:14:59 2016<br>

> New Revision: 265493<br>

><br>

> URL: <a href="http://llvm.org/viewvc/llvm-project?rev=265493&view=rev" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-project?rev=265493&view=rev</a><br>

> Log:<br>

> [SLPVectorizer] Vectorize libcalls of sqrt<br>

><br>

> We didn't realize that we could transform the libcall into a<br>

> vectorized<br>

> intrinsic.<br>

<br>

</span>But, as I recall, we can't. The problem is that our sqrt intrinsic is a special case: it has different UB-related semantics than the libcall. Specifically, the LangRef says, "Unlike sqrt in libm, however, llvm.sqrt has undefined behavior for negative numbers other than -0.0 (which allows for better optimization, because there is no need to worry about errno being set). llvm.sqrt(-0.0) is defined to return -0.0 like IEEE sqrt."<br>

<br>

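We can do this only if we're in no-NaNs mode: the libcall returns NaN for a negative input, so under no-NaNs such inputs can be assumed not to occur.<br></blockquote><div><br></div><div>Ah, good call.  Fixed in r265521.</div>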
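<div>For anyone following along, here is a minimal standalone sketch of the libm side of the difference described above. It is not LLVM code: the helper names (libm_sqrt, is_problem_input, check_libm_sqrt_semantics) are made up for illustration, and it assumes an IEEE-754 platform built without fast-math. libm's sqrt is fully defined for negative inputs (it returns NaN, and may also set errno depending on the platform's math_errhandling), whereas the quoted LangRef text leaves llvm.sqrt undefined for exactly those inputs.</div>
<pre>
```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper (not part of LLVM): the value the libm call produces.
// For negative x other than -0.0 the result is NaN, which is a defined value.
double libm_sqrt(double x) { return std::sqrt(x); }

// Hypothetical helper: true for the inputs on which the quoted LangRef text
// gives llvm.sqrt undefined behavior, i.e. negative numbers other than -0.0.
// Note that -0.0 < 0.0 is false, so -0.0 is correctly excluded.
bool is_problem_input(double x) { return x < 0.0; }

// Checks the libm behavior that a libcall-to-intrinsic rewrite must preserve.
void check_libm_sqrt_semantics() {
  // Negative input: libm yields NaN rather than undefined behavior.
  assert(is_problem_input(-1.0));
  assert(std::isnan(libm_sqrt(-1.0)));

  // -0.0 is the one "negative" input where both agree: the result is -0.0.
  assert(!is_problem_input(-0.0));
  assert(libm_sqrt(-0.0) == 0.0);
  assert(std::signbit(libm_sqrt(-0.0)));

  // Ordinary non-negative input behaves identically either way.
  assert(libm_sqrt(4.0) == 2.0);
}
```
</pre>
<div>Every input flagged by is_problem_input is one where the libcall would produce NaN, which is why restricting the rewrite to no-NaNs mode sidesteps the mismatch.</div>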


<div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">

<br>

 -Hal<br>

<div class=""><div class="h5"><br>

><br>

> Modified:<br>

>     llvm/trunk/lib/Analysis/VectorUtils.cpp<br>

>     llvm/trunk/test/Transforms/LoopVectorize/X86/veclib-calls.ll<br>

>     llvm/trunk/test/Transforms/SLPVectorizer/X86/call.ll<br>

><br>

> Modified: llvm/trunk/lib/Analysis/VectorUtils.cpp<br>

> URL:<br>

> <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Analysis/VectorUtils.cpp?rev=265493&r1=265492&r2=265493&view=diff" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Analysis/VectorUtils.cpp?rev=265493&r1=265492&r2=265493&view=diff</a><br>

> ==============================================================================<br>

> --- llvm/trunk/lib/Analysis/VectorUtils.cpp (original)<br>

> +++ llvm/trunk/lib/Analysis/VectorUtils.cpp Tue Apr  5 19:14:59 2016<br>

> @@ -220,6 +220,10 @@ Intrinsic::ID llvm::getIntrinsicIDForCal<br>

>    case LibFunc::powf:<br>

>    case LibFunc::powl:<br>

>      return checkBinaryFloatSignature(*CI, Intrinsic::pow);<br>

> +  case LibFunc::sqrt:<br>

> +  case LibFunc::sqrtf:<br>

> +  case LibFunc::sqrtl:<br>

> +    return checkUnaryFloatSignature(*CI, Intrinsic::sqrt);<br>

>    }<br>

><br>

>    return Intrinsic::not_intrinsic;<br>

><br>

> Modified:<br>

> llvm/trunk/test/Transforms/LoopVectorize/X86/veclib-calls.ll<br>

> URL:<br>

> <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/LoopVectorize/X86/veclib-calls.ll?rev=265493&r1=265492&r2=265493&view=diff" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/LoopVectorize/X86/veclib-calls.ll?rev=265493&r1=265492&r2=265493&view=diff</a><br>

> ==============================================================================<br>

> --- llvm/trunk/test/Transforms/LoopVectorize/X86/veclib-calls.ll<br>

> (original)<br>

> +++ llvm/trunk/test/Transforms/LoopVectorize/X86/veclib-calls.ll Tue<br>

> Apr  5 19:14:59 2016<br>

> @@ -3,31 +3,6 @@<br>

>  target datalayout =<br>

>  "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128-n8:16:32:64-S128"<br>

>  target triple = "x86_64-unknown-linux-gnu"<br>

><br>

> -;CHECK-LABEL: @sqrt_f32(<br>

> -;CHECK: vsqrtf{{.*}}<4 x float><br>

> -;CHECK: ret void<br>

> -declare float @sqrtf(float) nounwind readnone<br>

> -define void @sqrt_f32(i32 %n, float* noalias %y, float* noalias %x)<br>

> nounwind uwtable {<br>

> -entry:<br>

> -  %cmp6 = icmp sgt i32 %n, 0<br>

> -  br i1 %cmp6, label %for.body, label %for.end<br>

> -<br>

> -for.body:                                         ; preds = %entry,<br>

> %for.body<br>

> -  %indvars.iv = phi i64 [ %indvars.iv.next, %for.body ], [ 0, %entry<br>

> ]<br>

> -  %arrayidx = getelementptr inbounds float, float* %y, i64<br>

> %indvars.iv<br>

> -  %0 = load float, float* %arrayidx, align 4<br>

> -  %call = tail call float @sqrtf(float %0) nounwind readnone<br>

> -  %arrayidx2 = getelementptr inbounds float, float* %x, i64<br>

> %indvars.iv<br>

> -  store float %call, float* %arrayidx2, align 4<br>

> -  %indvars.iv.next = add i64 %indvars.iv, 1<br>

> -  %lftr.wideiv = trunc i64 %indvars.iv.next to i32<br>

> -  %exitcond = icmp eq i32 %lftr.wideiv, %n<br>

> -  br i1 %exitcond, label %for.end, label %for.body<br>

> -<br>

> -for.end:                                          ; preds =<br>

> %for.body, %entry<br>

> -  ret void<br>

> -}<br>

> -<br>

>  ;CHECK-LABEL: @exp_f32(<br>

>  ;CHECK: vexpf{{.*}}<4 x float><br>

>  ;CHECK: ret void<br>

> @@ -160,6 +135,7 @@ for.end:<br>

>  ;CHECK-LABEL: @sqrt_f32_nobuiltin(<br>

>  ;CHECK-NOT: vsqrtf{{.*}}<4 x float><br>

>  ;CHECK: ret void<br>

> +declare float @sqrtf(float) nounwind readnone<br>

>  define void @sqrt_f32_nobuiltin(i32 %n, float* noalias %y, float*<br>

>  noalias %x) nounwind uwtable {<br>

>  entry:<br>

>    %cmp6 = icmp sgt i32 %n, 0<br>

><br>

> Modified: llvm/trunk/test/Transforms/SLPVectorizer/X86/call.ll<br>

> URL:<br>

> <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/SLPVectorizer/X86/call.ll?rev=265493&r1=265492&r2=265493&view=diff" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/SLPVectorizer/X86/call.ll?rev=265493&r1=265492&r2=265493&view=diff</a><br>

> ==============================================================================<br>

> --- llvm/trunk/test/Transforms/SLPVectorizer/X86/call.ll (original)<br>

> +++ llvm/trunk/test/Transforms/SLPVectorizer/X86/call.ll Tue Apr  5<br>

> 19:14:59 2016<br>

> @@ -7,6 +7,7 @@ declare double @sin(double)<br>

>  declare double @cos(double)<br>

>  declare double @pow(double, double)<br>

>  declare double @exp2(double)<br>

> +declare double @sqrt(double)<br>

>  declare i64 @round(i64)<br>

><br>

><br>

> @@ -92,6 +93,28 @@ entry:<br>

>    store double %call, double* %c, align 8<br>

>    %arrayidx5 = getelementptr inbounds double, double* %c, i64 1<br>

>    store double %call5, double* %arrayidx5, align 8<br>

> +  ret void<br>

> +}<br>

> +<br>

> +<br>

> +; CHECK: sqrt_libm<br>

> +; CHECK: call <2 x double> @llvm.sqrt.v2f64<br>

> +; CHECK: ret void<br>

> +define void @sqrt_libm(double* %a, double* %b, double* %c) {<br>

> +entry:<br>

> +  %i0 = load double, double* %a, align 8<br>

> +  %i1 = load double, double* %b, align 8<br>

> +  %mul = fmul double %i0, %i1<br>

> +  %call = tail call double @sqrt(double %mul) nounwind readnone<br>

> +  %arrayidx3 = getelementptr inbounds double, double* %a, i64 1<br>

> +  %i3 = load double, double* %arrayidx3, align 8<br>

> +  %arrayidx4 = getelementptr inbounds double, double* %b, i64 1<br>

> +  %i4 = load double, double* %arrayidx4, align 8<br>

> +  %mul5 = fmul double %i3, %i4<br>

> +  %call5 = tail call double @sqrt(double %mul5) nounwind readnone<br>

> +  store double %call, double* %c, align 8<br>

> +  %arrayidx5 = getelementptr inbounds double, double* %c, i64 1<br>

> +  store double %call5, double* %arrayidx5, align 8<br>

>    ret void<br>

>  }<br>

><br>

><br>

><br>


><br>

<br>

</div></div><span class=""><font color="#888888">--<br>

Hal Finkel<br>

Assistant Computational Scientist<br>

Leadership Computing Facility<br>

Argonne National Laboratory<br>

</font></span></blockquote></div><br></div></div>