<html>

    <head>

      <base href="https://llvm.org/bugs/" />

    </head>

    <body><table border="1" cellspacing="0" cellpadding="8">

        <tr>

          <th>Bug ID</th>

          <td><a class="bz_bug_link 

          bz_status_NEW "

   title="NEW --- - [x86-64] Inefficient code generated for &quot;if (Operator *O = dyn_cast&lt;Operator&gt;(V)) { foo(O); }&quot;"

   href="https://llvm.org/bugs/show_bug.cgi?id=28430">28430</a>

          </td>

        </tr>

        <tr>

          <th>Summary</th>

          <td>[x86-64] Inefficient code generated for "if (Operator *O = dyn_cast&lt;Operator&gt;(V)) { foo(O); }"

          </td>

        </tr>

        <tr>

          <th>Product</th>

          <td>libraries

          </td>

        </tr>

        <tr>

          <th>Version</th>

          <td>trunk

          </td>

        </tr>

        <tr>

          <th>Hardware</th>

          <td>PC

          </td>

        </tr>

        <tr>

          <th>OS</th>

          <td>All

          </td>

        </tr>

        <tr>

          <th>Status</th>

          <td>NEW

          </td>

        </tr>

        <tr>

          <th>Severity</th>

          <td>normal

          </td>

        </tr>

        <tr>

          <th>Priority</th>

          <td>P

          </td>

        </tr>

        <tr>

          <th>Component</th>

          <td>Backend: X86

          </td>

        </tr>

        <tr>

          <th>Assignee</th>

          <td>unassignedbugs@nondot.org

          </td>

        </tr>

        <tr>

          <th>Reporter</th>

          <td>justin.lebar@gmail.com

          </td>

        </tr>

        <tr>

          <th>CC</th>

          <td>llvm-bugs@lists.llvm.org

          </td>

        </tr>

        <tr>

          <th>Classification</th>

          <td>Unclassified

          </td>

        </tr></table>

      <p>

        <div>

        <pre>I would expect the following two functions to compile to (roughly) the same

x86-64 machine code.

  void foo(Operator *O);

  void test1(Value *V) {

    if (Operator *O = dyn_cast&lt;Operator&gt;(V))

      foo(O);

  }   

  void test2(Value *V) {

    if (isa&lt;Operator&gt;(V))

      foo(cast&lt;Operator&gt;(V));

  }

But they don't:

_Z5test1PN4llvm5ValueE:

        mov     cl, byte ptr [rdi + 24]

        cmp     cl, 23

        seta    al

        cmp     cl, 10

        setne   cl

        test    rdi, rdi

        je      .LBB0_2

        xor     al, cl

        jne     .LBB0_2

        jmp     _Z3fooPN4llvm8OperatorE@PLT # TAILCALL

.LBB0_2:

        ret

_Z5test2PN4llvm5ValueE:

        mov     al, byte ptr [rdi + 24]

        cmp     al, 23

        ja      .LBB0_3

        cmp     al, 10

        je      .LBB0_3

        ret

.LBB0_3:

        jmp     _Z3fooPN4llvm8OperatorE@PLT # TAILCALL

It seems to me that there are two problems with the code generated for test1:

First, the null check on rdi shouldn't be necessary. The isa call inside

dyn_cast dereferences the pointer, so (I think?) we should be able to assume

that it's not null.  (Adding an explicit __builtin_assume to the implementation

of dyn_cast gets rid of this check.)
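
To make the __builtin_assume idea concrete, here is a minimal sketch.  It is
not the real Casting.h code: Value, Operator, isaOperator, dynCastOperator,
SKETCH_ASSUME and the call counter are hypothetical stand-ins, and the
predicate only mirrors the comparisons against 23 and 10 seen in the assembly
above.  __builtin_assume is Clang-specific, so the macro falls back to
__builtin_unreachable on other GNU-compatible compilers.

```cpp
// Hypothetical stand-ins for llvm::Value / llvm::Operator (assumption).
struct Value {
  unsigned char Kind; // stand-in for the byte loaded from [rdi + 24]
};
typedef Value Operator; // an alias keeps the sketch free of casts

// Tell the optimizer that Cond holds at this point.
#if defined(__clang__)
#define SKETCH_ASSUME(Cond) __builtin_assume(Cond)
#elif defined(__GNUC__)
#define SKETCH_ASSUME(Cond)                                                    \
  do {                                                                         \
    if (!(Cond))                                                               \
      __builtin_unreachable();                                                 \
  } while (0)
#else
#define SKETCH_ASSUME(Cond) ((void)0)
#endif

// Same predicate as the assembly: Kind above 23, or Kind equal to 10.
inline bool isaOperator(const Value *V) {
  return V->Kind > 23 || V->Kind == 10;
}

// dyn_cast-like helper: returns V if it is an "Operator", otherwise null.
// The assume states what the dereference in isaOperator already implies.
inline Operator *dynCastOperator(Value *V) {
  SKETCH_ASSUME(V != nullptr);
  return isaOperator(V) ? V : nullptr;
}

static int FooCalls = 0; // counts calls so the behavior can be observed
void foo(Operator *) { ++FooCalls; }

void test1(Value *V) {
  if (Operator *O = dynCastOperator(V))
    foo(O);
}
```

The sketch only shows where the hint would go.  Whether a given compiler then
drops the null test in this reduced form is not something it demonstrates.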

Second, although I'm not enough of an x86 expert to say whether the two-jump

version (as in test2) or one-jump version (as in test1) is preferable, since

they do the same thing, presumably we should prefer one over the other, if only

due to code size differences.  (I also wouldn't be surprised if there were

performance differences.)
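
As a sanity check on "they do the same thing", the two branch conditions can
be written out and compared over every possible byte value.  This is only a
sketch: the function names are made up for illustration, the conditions are
transcribed by hand from the assembly above, and the null check in test1 is
left out.

```cpp
// test1: seta al / setne cl / xor al, cl / jne skip.
// foo is reached when (Kind > 23) XOR (Kind != 10) is zero.
inline bool callsFooTest1(unsigned Kind) {
  bool Above = Kind > 23;
  bool NotTen = Kind != 10;
  return !(Above ^ NotTen);
}

// test2: ja call / je call / otherwise ret.
// foo is reached when Kind > 23, or when Kind == 10.
inline bool callsFooTest2(unsigned Kind) {
  return Kind > 23 || Kind == 10;
}

// Number of byte values (0..255) on which the two conditions disagree.
inline int countMismatches() {
  int Mismatches = 0;
  for (unsigned K = 0; K != 256; ++K)
    if (callsFooTest1(K) != callsFooTest2(K))
      ++Mismatches;
  return Mismatches;
}
```

The XOR form works because Kind > 23 already implies Kind != 10, so the two
flags differ only when Kind is at most 23 and not equal to 10.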

gcc 4.8.4 emits the same code for both functions, and it's basically identical

to the code clang generates for test2:

        movzx   eax, BYTE PTR 24[rdi]

        cmp     al, 23

        ja      .L9

        cmp     al, 10

        je      .L9

        rep ret

.L9:

        jmp     _Z3fooPN4llvm8OperatorE@PLT

Here are the -print-after-all -debug logs for the two functions:

test1: <a href="https://gist.github.com/fbbabc5299aca5a275b2b38a6d9d217f">https://gist.github.com/fbbabc5299aca5a275b2b38a6d9d217f</a>

test2: <a href="https://gist.github.com/24dee5133a003c12b0c127e5e5f2a3c9">https://gist.github.com/24dee5133a003c12b0c127e5e5f2a3c9</a>

Since we use this idiom everywhere in clang and llvm, it seems like a good bet

that we'd get a performance improvement from improving the codegen here.</pre>

        </div>

      </p>


    </body>

</html>