<html>
    <head>
      <base href="http://llvm.org/bugs/" />
    </head>
    <body><table border="1" cellspacing="0" cellpadding="8">
        <tr>
          <th>Bug ID</th>
          <td><a class="bz_bug_link 
          bz_status_NEW "
   title="NEW --- - clang should optimize common patterns to portably read big/little-endian data"
   href="http://llvm.org/bugs/show_bug.cgi?id=20605">20605</a>
          </td>
        </tr>

        <tr>
          <th>Summary</th>
          <td>clang should optimize common patterns to portably read big/little-endian data
          </td>
        </tr>

        <tr>
          <th>Product</th>
          <td>new-bugs
          </td>
        </tr>

        <tr>
          <th>Version</th>
          <td>3.4
          </td>
        </tr>

        <tr>
          <th>Hardware</th>
          <td>PC
          </td>
        </tr>

        <tr>
          <th>OS</th>
          <td>Linux
          </td>
        </tr>

        <tr>
          <th>Status</th>
          <td>NEW
          </td>
        </tr>

        <tr>
          <th>Severity</th>
          <td>enhancement
          </td>
        </tr>

        <tr>
          <th>Priority</th>
          <td>P
          </td>
        </tr>

        <tr>
          <th>Component</th>
          <td>new bugs
          </td>
        </tr>

        <tr>
          <th>Assignee</th>
          <td>unassignedbugs@nondot.org
          </td>
        </tr>

        <tr>
          <th>Reporter</th>
          <td>fuzxxl@gmail.com
          </td>
        </tr>

        <tr>
          <th>CC</th>
          <td>llvmbugs@cs.uiuc.edu
          </td>
        </tr>

        <tr>
          <th>Classification</th>
          <td>Unclassified
          </td>
        </tr></table>
      <p>
        <div>
<pre>A common idiom to portably read integers with well-defined endianness from a
buffer is code like this:

    uint32_t le32read(uint8_t buf[static 4]) {
            return ((uint32_t)buf[0]
                | (uint32_t)buf[1] << 8
                | (uint32_t)buf[2] << 16
                | (uint32_t)buf[3] << 24);
    }

    uint32_t be32read(uint8_t buf[static 4]) {
            return ((uint32_t)buf[0] << 24
                | (uint32_t)buf[1] << 16
                | (uint32_t)buf[2] << 8
                | (uint32_t)buf[3]);
    }

On architectures where unaligned reads are legal (e.g. amd64), the above code
could be compiled into something like this:

    le32read:
            mov (%rdi), %eax
            ret

    be32read:
            mov (%rdi), %eax
            bswap %eax
            ret
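
In C terms, the single-load forms above correspond roughly to the sketch below.
It is only an illustration: the names le32read_load and be32read_load are
hypothetical, the code assumes a little-endian host such as amd64, and
__builtin_bswap32 is a GCC/clang builtin rather than standard C.

```c
#include <stdint.h>
#include <string.h>

/* Assumption: little-endian host. Refuse to build elsewhere. */
#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ != __ORDER_LITTLE_ENDIAN__
#error "this sketch assumes a little-endian host"
#endif

/* Hypothetical helper: one 4-byte load, no byte swap needed. */
uint32_t le32read_load(const uint8_t *buf) {
        uint32_t v;
        memcpy(&v, buf, sizeof v);      /* alignment-safe load */
        return v;
}

/* Hypothetical helper: one 4-byte load followed by a byte swap. */
uint32_t be32read_load(const uint8_t *buf) {
        uint32_t v;
        memcpy(&v, buf, sizeof v);
        return __builtin_bswap32(v);    /* GCC/clang builtin */
}
```

The byte-wise le32read and be32read above remain the portable form; ideally
they would compile to the same instructions.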

Yet clang does not seem to optimize this kind of code well. Here is the
assembly clang generates (cleaned up):

    le32read:
            movzbl  (%rdi), %eax
            movzbl  1(%rdi), %ecx
            shll    $8, %ecx
            orl     %eax, %ecx
            movzbl  2(%rdi), %edx
            shll    $16, %edx
            orl     %ecx, %edx
            movzbl  3(%rdi), %eax
            shll    $24, %eax
            orl     %edx, %eax
            ret

    be32read:
            movzbl  (%rdi), %eax
            shll    $24, %eax
            movzbl  1(%rdi), %ecx
            shll    $16, %ecx
            orl     %eax, %ecx
            movzbl  2(%rdi), %edx
            shll    $8, %edx
            orl     %ecx, %edx
            movzbl  3(%rdi), %eax
            orl     %edx, %eax
            ret

It would be great if clang were capable of optimizing this kind of pattern, as
it comes up frequently and is one of the few (the only?) ways to (de)serialise
data with known endianness in a portable fashion.</pre>
        </div>
      </p>
    </body>
</html>