<html>
<head>
<base href="http://llvm.org/bugs/" />
</head>
<body><table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Bug ID</th>
<td><a class="bz_bug_link
bz_status_NEW "
title="NEW --- - x86: optimize byte+shift load to unaligned load"
href="http://llvm.org/bugs/show_bug.cgi?id=17603">17603</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>x86: optimize byte+shift load to unaligned load
</td>
</tr>
<tr>
<th>Product</th>
<td>libraries
</td>
</tr>
<tr>
<th>Version</th>
<td>trunk
</td>
</tr>
<tr>
<th>Hardware</th>
<td>PC
</td>
</tr>
<tr>
<th>OS</th>
<td>All
</td>
</tr>
<tr>
<th>Status</th>
<td>NEW
</td>
</tr>
<tr>
<th>Severity</th>
<td>enhancement
</td>
</tr>
<tr>
<th>Priority</th>
<td>P
</td>
</tr>
<tr>
<th>Component</th>
<td>Backend: X86
</td>
</tr>
<tr>
<th>Assignee</th>
<td>unassignedbugs@nondot.org
</td>
</tr>
<tr>
<th>Reporter</th>
<td>darkjames-ws@darkjames.pl
</td>
</tr>
<tr>
<th>CC</th>
<td>llvmbugs@cs.uiuc.edu
</td>
</tr>
<tr>
<th>Classification</th>
<td>Unclassified
</td>
</tr></table>
<p>
<div>
<pre>Created <span class=""><a href="attachment.cgi?id=11382" name="attach_11382" title="Sample C file with safe-load-version/ unaligned-one + disasm">attachment 11382</a> <a href="attachment.cgi?id=11382&action=edit" title="Sample C file with safe-load-version/ unaligned-one + disasm">[details]</a></span>
Sample C file with safe-load-version/ unaligned-one + disasm
Hello,
In Wireshark we have a bunch of safe (both alignment-safe and endianness-safe)
macros to read uint16_t/uint32_t/uint64_t values from a byte array.
They look like this:
#define pntohs(p) ((guint16) \
((guint16)*((const guint8 *)(p)+0)<<8| \
(guint16)*((const guint8 *)(p)+1)<<0))
#define pletohs(p) ((guint16) \
((guint16)*((const guint8 *)(p)+1)<<8| \
(guint16)*((const guint8 *)(p)+0)<<0))
#define pntohl(p) ((guint32)*((const guint8 *)(p)+0)<<24| \
(guint32)*((const guint8 *)(p)+1)<<16| \
(guint32)*((const guint8 *)(p)+2)<<8| \
(guint32)*((const guint8 *)(p)+3)<<0)
It'd be great if clang could detect such operations and change them on
x86/AMD64 to:
#define pntohs(p) ((uint16_t) __builtin_bswap16(*(uint16_t *) (p)))
#define pletohs(p) ((uint16_t) (*(uint16_t *) (p)))
#define pntohl(p) ((uint32_t) __builtin_bswap32(*(uint32_t *) (p)))
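For illustration, here is a minimal sketch of the two forms for the 32-bit
big-endian case. The helper names are hypothetical, and the sketch assumes
clang or GCC builtins on a little-endian host such as x86/AMD64, where
unsigned int is 32 bits. The second function uses memcpy instead of the
pointer cast from the macros above, so that no misaligned pointer is
dereferenced:

```c
/* Hypothetical helper names. Assumes clang or GCC builtins and a
   little-endian host such as x86/AMD64, where unsigned int is 32 bits. */
typedef unsigned char u8;
typedef unsigned int  u32;

/* Portable form: four byte loads combined with shifts (big-endian order). */
static u32 load_be32_bytes(const u8 *p)
{
    return (u32)p[0] << 24 | (u32)p[1] << 16 | (u32)p[2] << 8 | (u32)p[3];
}

/* Hand-optimized form: one unaligned 32-bit load plus a byte swap.
   Copying through memcpy avoids dereferencing a misaligned pointer. */
static u32 load_be32_bswap(const u8 *p)
{
    u32 v;
    __builtin_memcpy(&v, p, sizeof v);
    return __builtin_bswap32(v);
}
```

On such a host both functions should return the same value for any input
pointer, aligned or not; the request is for the first form to compile to
the same code as the second.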
Generally the idea is the same as in the Linux kernel (config symbol
HAVE_EFFICIENT_UNALIGNED_ACCESS, selected on x86/ppc) or ffmpeg (#ifdef
HAVE_FAST_UNALIGNED): do an unaligned access on CPUs which can do it fast
(i.e. faster than a byte+shift load).
We could also do it in Wireshark, but I prefer the idea of writing portable
code, where low-level optimizations like the above are done at the compiler
level, not in the code.</pre>
</div>
</p>
</body>
</html>