[llvm-bugs] [Bug 49210] New: Fuse v128.load64_zero + iXX.widen_low into Load-and-Extend

via llvm-bugs llvm-bugs at lists.llvm.org
Tue Feb 16 11:33:24 PST 2021


https://bugs.llvm.org/show_bug.cgi?id=49210

            Bug ID: 49210
           Summary: Fuse v128.load64_zero + iXX.widen_low into
                    Load-and-Extend
           Product: libraries
           Version: trunk
          Hardware: All
                OS: All
            Status: NEW
          Severity: enhancement
          Priority: P
         Component: Backend: WebAssembly
          Assignee: tlively at google.com
          Reporter: maratek at gmail.com
                CC: llvm-bugs at lists.llvm.org

Even though x86 SSE4 provides load-and-extend instructions (PMOVSXxx/PMOVZXxx),
it doesn't provide corresponding intrinsics. Commonly, these instruction forms
are expressed as a combination of an 8-byte load (which implicitly zeroes the
upper 8 bytes) and a conversion intrinsic, e.g.
_mm_cvtepi16_epi32(_mm_loadl_epi64(ptr)). When such code is cross-compiled to
WebAssembly using Emscripten's nmmintrin.h header, it generates two WebAssembly
SIMD instructions, v128.load64_zero and i32x4.widen_low_i16x8_s, which a Wasm
engine would subsequently lower into two x86 instructions. LLVM should learn to
fuse v128.load64_zero + i32x4.widen_low_i16x8_s into v128.load16x4_s, so that
code cross-compiled from x86 SSE4 intrinsics generates the same instructions in
a Wasm engine as it does in native compilation.
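
To illustrate why the requested fusion would preserve semantics, here is a
minimal scalar sketch in portable C. It is not LLVM code and not the
Emscripten header; the v128_model type and the model_* helper names are
hypothetical, introduced only for this example. It models the 8-byte
zero-extending load followed by the signed low-half widen, and compares the
result with a direct load of four 16-bit values sign-extended to 32 bits.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical scalar model of a 128-bit vector as 16 raw bytes.
   Lanes are read in host byte order (Wasm itself is little-endian). */
typedef struct { uint8_t bytes[16]; } v128_model;

/* Models v128.load64_zero: load 8 bytes, zero the upper 8 bytes. */
static v128_model model_load64_zero(const void *ptr) {
    v128_model v;
    memset(v.bytes, 0, sizeof v.bytes);
    memcpy(v.bytes, ptr, 8);
    return v;
}

/* Models i32x4.widen_low_i16x8_s: sign-extend the four low 16-bit lanes. */
static void model_widen_low_i16x8_s(const v128_model *v, int32_t out[4]) {
    for (int i = 0; i < 4; i++) {
        int16_t lane;
        memcpy(&lane, v->bytes + 2 * i, sizeof lane);
        out[i] = (int32_t)lane;
    }
}

/* Models v128.load16x4_s: load four 16-bit values and sign-extend each. */
static void model_load16x4_s(const void *ptr, int32_t out[4]) {
    for (int i = 0; i < 4; i++) {
        int16_t lane;
        memcpy(&lane, (const uint8_t *)ptr + 2 * i, sizeof lane);
        out[i] = (int32_t)lane;
    }
}

/* Returns 1 when the two-instruction sequence and the fused load agree. */
static int fused_matches_unfused(const int16_t src[4]) {
    int32_t unfused[4], fused[4];
    assert(src != NULL);
    v128_model v = model_load64_zero(src);
    model_widen_low_i16x8_s(&v, unfused);
    model_load16x4_s(src, fused);
    return memcmp(unfused, fused, sizeof unfused) == 0;
}
```

The upper 8 bytes produced by the zeroing load are never read by the low-half
widen, which is why the pair can be replaced by a single load-and-extend in
this model. The same reasoning would presumably apply to the unsigned and
other lane-width variants, but those are not modeled here.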
