[llvm-bugs] [Bug 36426] New: Trick for inlining more small constant sized memcmp() and memcpy()

Sun Feb 18 08:13:10 PST 2018

https://bugs.llvm.org/show_bug.cgi?id=36426

            Bug ID: 36426
           Summary: Trick for inlining more small constant sized memcmp()
                    and memcpy()
           Product: libraries
           Version: trunk
          Hardware: PC
                OS: All
            Status: NEW
          Severity: enhancement
          Priority: P
         Component: Backend: X86
          Assignee: unassignedbugs at nondot.org
          Reporter: dave at znu.io
                CC: llvm-bugs at lists.llvm.org

A pair of overlapping loads can avoid a call into the C runtime in more
cases where the size is constant and awkwardly small. For example,
"memcmp(a, b, 15) == 0" could be lowered to:

/* Sketch of the intended lowering; a and b are taken to be const char *. */
size_t offset = 15 - sizeof(int64_t);
bool same = 0 == ((*(int64_t*)a ^ *(int64_t*)b) |
                  (*(int64_t*)(a + offset) ^ *(int64_t*)(b + offset)));

This should scale up to a pair of vector loads too.
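
For anyone who wants to try the 15-byte comparison outside the compiler, here
is a minimal, self-contained sketch in portable C. The names load64() and
equal15() are made up for illustration and are not part of LLVM or any
library. It expresses the unaligned loads with fixed-size memcpy() calls,
which optimizing compilers typically turn into plain loads, so the sketch does
not depend on the casts in the snippet above being well defined.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: load 8 bytes from a possibly unaligned address. */
static inline uint64_t load64(const void *p)
{
    uint64_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Hypothetical name: true if the first 15 bytes of a and b are equal. */
static inline bool equal15(const void *a, const void *b)
{
    const unsigned char *pa = a;
    const unsigned char *pb = b;
    const size_t offset = 15 - sizeof(uint64_t);  /* 7 */

    /* First load covers bytes 0..7, second covers bytes 7..14. */
    uint64_t head = load64(pa) ^ load64(pb);
    uint64_t tail = load64(pa + offset) ^ load64(pb + offset);

    /* Byte 7 is compared twice, which is harmless for an equality test. */
    return (head | tail) == 0;
}
```

This only answers "equal or not". An ordered memcmp() result (less than,
greater than) would need extra work to find the first differing byte.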

I haven't benchmarked this with memcpy() yet, but I'd expect avoiding the call
to the runtime to be worth it there too, and for the same reasons: fewer
register spills, no function call overhead, and no dynamic algorithm selection
inside the runtime, which no longer knows that the size is a constant.
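
The same idea applied to a 15-byte copy might look like the sketch below. This
is an unbenchmarked illustration, not what LLVM emits, and copy15() is a
hypothetical name. The fixed-size 8-byte memcpy() calls stand in for single
unaligned loads and stores, which is how compilers usually lower them.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical name: copy exactly 15 bytes using two overlapping 8-byte moves. */
static inline void copy15(void *dst, const void *src)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    const size_t offset = 15 - sizeof(uint64_t);  /* 7 */
    uint64_t head, tail;

    /* Load bytes 0..7 and 7..14 before storing anything. */
    memcpy(&head, s, sizeof head);
    memcpy(&tail, s + offset, sizeof tail);

    /* Byte 7 is written twice with the same value. */
    memcpy(d, &head, sizeof head);
    memcpy(d + offset, &tail, sizeof tail);
}
```

Only bytes 0..14 of the destination are written, so the byte after the
15-byte region is left untouched.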

NOTE: memcmp() and memcpy() don't guarantee that the pointers are aligned, so
don't assume that the first load is "aligned" and the second load is "not
aligned". The opposite could be true at run time.
