[LLVMbugs] [Bug 17603] New: x86: optimize byte+shift load to unaligned load

bugzilla-daemon at llvm.org bugzilla-daemon at llvm.org
Wed Oct 16 13:57:16 PDT 2013


http://llvm.org/bugs/show_bug.cgi?id=17603

            Bug ID: 17603
           Summary: x86: optimize byte+shift load to unaligned load
           Product: libraries
           Version: trunk
          Hardware: PC
                OS: All
            Status: NEW
          Severity: enhancement
          Priority: P
         Component: Backend: X86
          Assignee: unassignedbugs at nondot.org
          Reporter: darkjames-ws at darkjames.pl
                CC: llvmbugs at cs.uiuc.edu
    Classification: Unclassified

Created attachment 11382
  --> http://llvm.org/bugs/attachment.cgi?id=11382&action=edit
Sample C file with the safe (byte+shift) load version and the unaligned one,
plus disassembly

Hello,

In Wireshark we have a bunch of safe (both alignment- and endianness-safe)
macros to read uint16_t/uint32_t/uint64_t values from a byte array.

They look like this:

#define pntohs(p)   ((guint16)                       \
                     ((guint16)*((const guint8 *)(p)+0)<<8|  \
                      (guint16)*((const guint8 *)(p)+1)<<0))

#define pletohs(p)  ((guint16)                       \
                     ((guint16)*((const guint8 *)(p)+1)<<8|  \
                      (guint16)*((const guint8 *)(p)+0)<<0))

#define pntohl(p)   ((guint32)*((const guint8 *)(p)+0)<<24|  \
                     (guint32)*((const guint8 *)(p)+1)<<16|  \
                     (guint32)*((const guint8 *)(p)+2)<<8|   \
                     (guint32)*((const guint8 *)(p)+3)<<0)
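
For reference, a small test function one might compile to inspect the codegen
(a sketch, not part of the attached file; read_be32 is a made-up name):

#include <stdint.h>

/* Byte+shift big-endian 32-bit load, same pattern as the pntohl macro. */
uint32_t read_be32(const uint8_t *p)
{
    return (uint32_t)p[0] << 24 | (uint32_t)p[1] << 16 |
           (uint32_t)p[2] <<  8 | (uint32_t)p[3];
}

/* The codegen this report asks for on x86/AMD64 is a single unaligned
   load plus a byte swap, e.g. (Intel syntax):
       mov   eax, dword ptr [rdi]
       bswap eax
       ret                                                            */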


It'd be great if clang could detect such operations and, on x86/AMD64, lower
them to the equivalent of:

#define pntohs(p)  ((uint16_t) __builtin_bswap16(*(const uint16_t *)(p)))
#define pletohs(p) ((uint16_t) (*(const uint16_t *)(p)))
#define pntohl(p)  ((uint32_t) __builtin_bswap32(*(const uint32_t *)(p)))
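
Note that casting to uint16_t*/uint32_t* as above technically violates the
alignment and strict-aliasing rules; a memcpy-based formulation is safe, and
compilers with fast unaligned access typically lower it to a single load.
A minimal sketch (not from this report; assumes a little-endian host for
the bswap):

#include <stdint.h>
#include <string.h>

static inline uint16_t pntohs_memcpy(const void *p)
{
    uint16_t v;
    memcpy(&v, p, sizeof v);                /* one 16-bit load on x86 */
    return (uint16_t) __builtin_bswap16(v); /* big-endian data, LE host */
}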

Generally the idea is the same as in the Linux kernel (config symbol
HAVE_EFFICIENT_UNALIGNED_ACCESS, selected on x86/ppc) or ffmpeg (#ifdef
HAVE_FAST_UNALIGNED): do unaligned accesses on CPUs that can perform them
fast (i.e. faster than a byte+shift load).
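
A hypothetical sketch of that conditional pattern (the macro and config
symbol names here are illustrative, not the actual Wireshark/kernel/ffmpeg
ones; the fast path assumes a little-endian host):

#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
# define HAVE_FAST_UNALIGNED_ACCESS 1
#endif

#ifdef HAVE_FAST_UNALIGNED_ACCESS
# define pntohl_fast(p) ((uint32_t) __builtin_bswap32(*(const uint32_t *)(p)))
#else
# define pntohl_fast(p) ((uint32_t)((const uint8_t *)(p))[0] << 24 | \
                         (uint32_t)((const uint8_t *)(p))[1] << 16 | \
                         (uint32_t)((const uint8_t *)(p))[2] <<  8 | \
                         (uint32_t)((const uint8_t *)(p))[3])
#endif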

We could also do this in Wireshark itself, but I prefer the idea of writing
portable code, where low-level optimizations like the above are done at the
compiler level, not in the source.

-- 
You are receiving this mail because:
You are on the CC list for the bug.