<div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote">On Mon, Jun 19, 2017 at 11:58 AM, Tom Stellard <span dir="ltr"><<a href="mailto:tstellar@redhat.com" target="_blank">tstellar@redhat.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><span class="gmail-">On 06/19/2017 11:45 AM, Andrew Kelley via llvm-dev wrote:<br>
> Greetings,<br>
><br>
> I have a Zig implementation of ceil which is emitted into LLVM IR like this:<br>
><br>
> ; Function Attrs: nobuiltin nounwind<br>
> define internal fastcc float @ceil(float) unnamed_addr #3 !dbg !644 {<br>
> Entry:<br>
> %x = alloca float, align 4<br>
> store float %0, float* %x<br>
> call void @llvm.dbg.declare(metadata float* %x, metadata !649, metadata !494), !dbg !651<br>
> %1 = load float, float* %x, !dbg !652<br>
> %2 = call fastcc float @ceil32(float %1) #8, !dbg !656<br>
> ret float %2, !dbg !657<br>
> }<br>
><br>
<br>
</span>What does the declaration of @ceil32() look like ?<br></blockquote><div><br></div><div>LLVM IR follows; source follows after that.</div><div><br></div><div>; Function Attrs: nobuiltin nounwind</div><div>define internal fastcc float @ceil32(float) unnamed_addr #3 !dbg !658 {</div><div>Entry:</div><div> %x = alloca float, align 4</div><div> %u = alloca i32, align 4</div><div> %e = alloca i32, align 4</div><div> %m = alloca i32, align 4</div><div> store float %0, float* %x</div><div> call void @llvm.dbg.declare(metadata float* %x, metadata !660, metadata !494), !dbg !670</div><div> %1 = load float, float* %x, !dbg !671</div><div> %2 = bitcast float %1 to i32, !dbg !672</div><div> store i32 %2, i32* %u, !dbg !673</div><div> call void @llvm.dbg.declare(metadata i32* %u, metadata !661, metadata !494), !dbg !673</div><div> %3 = load i32, i32* %u, !dbg !674</div><div> %4 = lshr i32 %3, 23, !dbg !675</div><div> %5 = and i32 %4, 255, !dbg !676</div><div> %6 = sub nsw i32 %5, 127, !dbg !677</div><div> store i32 %6, i32* %e, !dbg !678</div><div> call void @llvm.dbg.declare(metadata i32* %e, metadata !665, metadata !494), !dbg !678</div><div> call void @llvm.dbg.declare(metadata i32* %m, metadata !668, metadata !494), !dbg !679</div><div> %7 = load i32, i32* %e, !dbg !680</div><div> %8 = icmp sge i32 %7, 23, !dbg !682</div><div> br i1 %8, label %Then, label %Else, !dbg !682</div><div><br></div><div>Then: ; preds = %Entry</div><div> %9 = load float, float* %x, !dbg !683</div><div> ret float %9, !dbg !685</div><div><br></div><div>Else: ; preds = %Entry</div><div> %10 = load i32, i32* %e, !dbg !686</div><div> %11 = icmp sge i32 %10, 0, !dbg !687</div><div> br i1 %11, label %Then1, label %Else2, !dbg !687</div><div><br></div><div>Then1: ; preds = %Else</div><div> %12 = load i32, i32* %e, !dbg !688</div><div> %13 = lshr i32 8388607, %12, !dbg !690</div><div> store i32 %13, i32* %m, !dbg !691</div><div> %14 = load i32, i32* %u, !dbg !692</div><div> %15 = load i32, i32* %m, 
!dbg !693</div><div> %16 = and i32 %14, %15, !dbg !694</div><div> %17 = icmp eq i32 %16, 0, !dbg !695</div><div> br i1 %17, label %Then3, label %Else4, !dbg !695</div><div><br></div><div>Else2: ; preds = %Else</div><div> %18 = load float, float* %x, !dbg !696</div><div> %19 = fadd fast float %18, 0x4770000000000000, !dbg !698</div><div> call fastcc void @forceEval(float %19), !dbg !699</div><div> %20 = load i32, i32* %u, !dbg !700</div><div> %21 = lshr i32 %20, 31, !dbg !701</div><div> %22 = icmp ne i32 %21, 0, !dbg !702</div><div> br i1 %22, label %Then5, label %Else6, !dbg !702</div><div><br></div><div>Then3: ; preds = %Then1</div><div> %23 = load float, float* %x, !dbg !703</div><div> ret float %23, !dbg !705</div><div><br></div><div>Else4: ; preds = %Then1</div><div> br label %EndIf, !dbg !706</div><div><br></div><div>Then5: ; preds = %Else2</div><div> ret float -0.000000e+00, !dbg !707</div><div><br></div><div>Else6: ; preds = %Else2</div><div> br label %EndIf7, !dbg !709</div><div><br></div><div>EndIf: ; preds = %Else4</div><div> %24 = load float, float* %x, !dbg !710</div><div> %25 = fadd fast float %24, 0x4770000000000000, !dbg !711</div><div> call fastcc void @forceEval(float %25), !dbg !712</div><div> %26 = load i32, i32* %u, !dbg !713</div><div> %27 = lshr i32 %26, 31, !dbg !714</div><div> %28 = icmp eq i32 %27, 0, !dbg !715</div><div> br i1 %28, label %Then8, label %Else9, !dbg !715</div><div><br></div><div>EndIf7: ; preds = %Else6</div><div> br label %EndIf11, !dbg !716</div><div><br></div><div>Then8: ; preds = %EndIf</div><div> %29 = load i32, i32* %u, !dbg !717</div><div> %30 = load i32, i32* %m, !dbg !719</div><div> %31 = add nuw i32 %29, %30, !dbg !720</div><div> store i32 %31, i32* %u, !dbg !720</div><div> br label %EndIf10, !dbg !721</div><div><br></div><div>Else9: ; preds = %EndIf</div><div> br label %EndIf10, !dbg !721</div><div><br></div><div>EndIf10: ; preds = %Else9, %Then8</div><div> %32 = load i32, i32* %u, !dbg !722</div><div> %33 = load 
i32, i32* %m, !dbg !723</div><div> %34 = xor i32 %33, -1, !dbg !724</div><div> %35 = and i32 %32, %34, !dbg !725</div><div> store i32 %35, i32* %u, !dbg !725</div><div> %36 = load i32, i32* %u, !dbg !726</div><div> %37 = bitcast i32 %36 to float, !dbg !727</div><div> br label %EndIf11, !dbg !716</div><div><br></div><div>EndIf11: ; preds = %EndIf10, %EndIf7</div><div> %38 = phi float [ %37, %EndIf10 ], [ 1.000000e+00, %EndIf7 ], !dbg !716</div><div> ret float %38, !dbg !728</div><div>}</div><div><div>; Function Attrs: nobuiltin nounwind</div><div>define internal fastcc void @forceEval(float) unnamed_addr #3 !dbg !840 {</div><div>Entry:</div><div> %value = alloca float, align 4</div><div> %x = alloca float, align 4</div><div> %p = alloca float*, align 8</div><div> store float %0, float* %value</div><div> call void @llvm.dbg.declare(metadata float* %value, metadata !844, metadata !494), !dbg !854</div><div> call void @llvm.dbg.declare(metadata float* %x, metadata !846, metadata !494), !dbg !855</div><div> store float* %x, float** %p, !dbg !856</div><div> call void @llvm.dbg.declare(metadata float** %p, metadata !851, metadata !494), !dbg !856</div><div> %1 = load float*, float** %p, !dbg !857</div><div> %2 = load float, float* %x, !dbg !859</div><div> store volatile float %2, float* %1, !dbg !860</div><div> ret void, !dbg !861</div><div>}</div><div><br></div></div><div><br></div><div>Source:</div><div>fn ceil32(x: f32) -> f32 {</div><div> var u = @bitCast(u32, x);</div><div> var e = i32((u >> 23) & 0xFF) - 0x7F;</div><div> var m: u32 = undefined;</div><div><br></div><div> if (e >= 23) {</div><div> return x;</div><div> }</div><div> else if (e >= 0) {</div><div> m = 0x007FFFFF >> u32(e);</div><div> if (u & m == 0) {</div><div> return x;</div><div> }</div><div> math.forceEval(x + 0x1.0p120);</div><div> if (u >> 31 == 0) {</div><div> u += m;</div><div> }</div><div> u &= ~m;</div><div> @bitCast(f32, u)</div><div> } else {</div><div> math.forceEval(x + 
0x1.0p120);</div><div> if (u >> 31 != 0) {</div><div> return -0.0;</div><div> } else {</div><div> 1.0</div><div> }</div><div> }</div><div>}</div><div><div>pub fn forceEval(value: var) {</div><div> const T = @typeOf(value);</div><div> switch (T) {</div><div> f32 => {</div><div> var x: f32 = undefined;</div><div> const p = @ptrCast(&volatile f32, &x);</div><div> *p = x;</div><div> },</div><div> f64 => {</div><div> var x: f64 = undefined;</div><div> const p = @ptrCast(&volatile f64, &x);</div><div> *p = x;</div><div> },</div><div> else => {</div><div> @compileError("forceEval not implemented for " ++ @typeName(T));</div><div> },</div><div> }</div><div>}</div></div><div><br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<span class="gmail-"><br>
<br>
> Test case:<br>
><br>
> test "math.ceil" {<br>
> assert(ceil(f32(0.0)) == ceil32(0.0));<br>
> assert(ceil(f64(0.0)) == ceil64(0.0));<br>
> }<br>
><br>
><br>
> When I compile with optimizations on, this test case fails. The optimized code for the test case is reduced to a call to panic (assertion failure), which means that LLVM decided at compile time that the assertion fails.<br>
><br>
> What's strange about this is that if I change the function name from @ceil to @ceil_asdf (and change the callers) then the test passes.<br>
><br>
> So I suspect LLVM is matching on the symbol name, detecting that it is "ceil", and then applying different, undesired behavior.<br>
><br>
> I tried putting `nobuiltin` in the function attributes and at the callsite, but that did not change anything.<br>
><br>
> Any ideas what's going on?<br>
><br>
> Downstream issue: <a href="https://github.com/zig-lang/zig/issues/393" rel="noreferrer" target="_blank">https://github.com/zig-lang/<wbr>zig/issues/393</a><br>
><br>
> Regards,<br>
> Andrew<br>
><br>
><br>
</span>
<br>
</blockquote></div><br></div></div>
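For readers who want to step through the algorithm outside of Zig, here is a rough C sketch of the ceil32 source quoted above. It is an illustration, not the author's code: the name ceil32_sketch is made up, the forceEval calls are left out on the assumption that they only exist to raise floating-point exceptions, and float is assumed to be 32-bit IEEE 754.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical C port of the quoted Zig ceil32; forceEval is omitted. */
float ceil32_sketch(float x) {
    uint32_t u;
    memcpy(&u, &x, sizeof u);               /* same role as @bitCast(u32, x) */

    /* Unbiased exponent: 8 bits starting at bit 23, minus the bias 127. */
    int e = (int)((u >> 23) & 0xFF) - 0x7F;

    if (e >= 23) {
        return x;                           /* no fractional bits left (or inf/NaN) */
    }
    if (e >= 0) {
        uint32_t m = 0x007FFFFFu >> e;      /* mask of the fractional mantissa bits */
        if ((u & m) == 0) {
            return x;                       /* already an integer */
        }
        if ((u >> 31) == 0) {
            u += m;                         /* positive: bump past the next integer */
        }
        u &= ~m;                            /* clear the fractional bits */
        memcpy(&x, &u, sizeof x);           /* same role as @bitCast(f32, u) */
        return x;
    }
    /* |x| < 1: the quoted source only looks at the sign bit here. */
    if ((u >> 31) != 0) {
        return -0.0f;
    }
    return 1.0f;
}
```

Tracing the quoted source by hand, the e < 0 branch yields 1.0 for an input of +0.0, and the sketch mirrors that. The C library's ceilf returns 0.0 for the same input, so this difference may be worth ruling out when the assertion is evaluated at compile time.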