[LLVMdev] SIMD instructions and memory alignment on X86

Peter Newman peter at uformia.com
Thu Jul 18 22:53:44 PDT 2013


Is there something specifically required to enable SSE? If it's not 
detected as available (based on the target triple?) then I don't think 
we enable it explicitly.

Also it seems that it should handle converting to/from the vector types, 
although I can see it getting confused about needing to do that if it 
thinks SSE isn't available at all.
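
A minimal sketch of how the features could be requested explicitly rather
than relying on detection. It only builds the "+feature" strings that LLVM's
-mattr option uses; the helper names are hypothetical, and passing the result
to EngineBuilder::setMAttrs is an assumption about how the JIT is created in
this program, shown only in a comment.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Turns feature names such as "sse2" into the "+sse2" form used by LLVM's
// -mattr option. Helper name is hypothetical.
std::vector<std::string> makeFeatureAttrs(const std::vector<std::string>& features) {
    std::vector<std::string> attrs;
    attrs.reserve(features.size());
    for (const std::string& feature : features) {
        assert(!feature.empty() && "feature names must be non-empty");
        attrs.push_back("+" + feature);
    }
    return attrs;
}

// Joins the attributes with commas, e.g. for `llc -mattr=+sse,+sse2`.
std::string joinAttrs(const std::vector<std::string>& attrs) {
    std::string joined;
    for (std::size_t i = 0; i < attrs.size(); ++i) {
        if (i != 0) joined += ",";
        joined += attrs[i];
    }
    return joined;
}

// Assumed usage with the JIT (not compiled here, needs the LLVM headers):
//   llvm::EngineBuilder builder(module);
//   builder.setMAttrs(makeFeatureAttrs({"sse", "sse2"}));
```

If the generated code changes once the features are forced on, that would
suggest SSE was not being detected from the triple.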

On 19/07/2013 3:47 PM, Craig Topper wrote:
> Hmm, maybe sse isn't being enabled so it's falling back to emulating sqrt?
>
>
> On Thu, Jul 18, 2013 at 10:45 PM, Peter Newman <peter at uformia.com> wrote:
>
>     In the disassembly, I'm seeing three cases of
>     call        76719BA1
>
>     I am assuming this is the sqrt function as this is the only
>     function called in the LLVM IR.
>
>     The code at 76719BA1 is:
>
>     76719BA1  push        ebp
>     76719BA2  mov         ebp,esp
>     76719BA4  sub         esp,20h
>     76719BA7  and         esp,0FFFFFFF0h
>     76719BAA  fld         st(0)
>     76719BAC  fst         dword ptr [esp+18h]
>     76719BB0  fistp       qword ptr [esp+10h]
>     76719BB4  fild        qword ptr [esp+10h]
>     76719BB8  mov         edx,dword ptr [esp+18h]
>     76719BBC  mov         eax,dword ptr [esp+10h]
>     76719BC0  test        eax,eax
>     76719BC2  je          76719DCF
>     76719BC8  fsubp       st(1),st
>     76719BCA  test        edx,edx
>     76719BCC  js          7671F9DB
>     76719BD2  fstp        dword ptr [esp]
>     76719BD5  mov         ecx,dword ptr [esp]
>     76719BD8  add         ecx,7FFFFFFFh
>     76719BDE  sbb         eax,0
>     76719BE1  mov         edx,dword ptr [esp+14h]
>     76719BE5  sbb         edx,0
>     76719BE8  leave
>     76719BE9  ret
>
>
>     As you can see at 76719BD5, it modifies ECX.
>
>     I don't know that this is the sqrtpd function (for example, I'm
>     not seeing any SSE instructions here?) but whatever it is, it's
>     being called from the IR I attached earlier, and is modifying ECX
>     under some circumstances.
>
>
>     On 19/07/2013 3:29 PM, Craig Topper wrote:
>>     That should map directly to sqrtpd which can't modify ecx.
>>
>>
>>     On Thu, Jul 18, 2013 at 10:27 PM, Peter Newman <peter at uformia.com> wrote:
>>
>>         Sorry, that should have been llvm.x86.sse2.sqrt.pd
>>
>>
>>         On 19/07/2013 3:25 PM, Craig Topper wrote:
>>>         What is "frep.x86.sse2.sqrt.pd"? I'm only familiar with
>>>         things prefixed with "llvm.x86".
>>>
>>>
>>>         On Thu, Jul 18, 2013 at 10:12 PM, Peter Newman
>>>         <peter at uformia.com> wrote:
>>>
>>>             After stepping through the produced assembly, I believe
>>>             I have a culprit.
>>>
>>>             One of the calls to @frep.x86.sse2.sqrt.pd is modifying
>>>             the value of ECX - while the produced code is expecting
>>>             it to still contain its previous value.
>>>
>>>             Peter N
>>>
>>>
>>>             On 19/07/2013 2:09 PM, Peter Newman wrote:
>>>>             I've attached the module->dump() that our code is
>>>>             producing. Unfortunately this is the smallest test case
>>>>             I have available.
>>>>
>>>>             This is before any optimization passes are applied.
>>>>             There are two separate modules in existence at the
>>>>             time, and there are no guarantees about the order the
>>>>             surrounding code calls those functions, so there may be
>>>>             some interaction between them? There shouldn't be, they
>>>>             don't refer to any common memory etc. There is no
>>>>             multi-threading occurring.
>>>>
>>>>             The function in module-dump.ll (called crashfunc in
>>>>             this file) is called with
>>>>             - func_params 0x0018f3b0    double [3]
>>>>                     [0x0] -11.339976634695301 double
>>>>                     [0x1] -9.7504239056205506 double
>>>>                     [0x2] -5.2900856817382804 double
>>>>             at the time of the exception.
>>>>
>>>>             This is compiled on a "i686-pc-win32" triple. All of
>>>>             the non-intrinsic functions referred to in these
>>>>             modules are the standard equivalents from the MSVC
>>>>             library (e.g. @asin is the standard C lib    double
>>>>             asin( double ) ).
>>>>
>>>>             Hopefully this is reproducible for you.
>>>>
>>>>             --
>>>>             PeterN
>>>>
>>>>             On 18/07/2013 4:37 PM, Craig Topper wrote:
>>>>>             Are you able to send any IR for others to reproduce
>>>>>             this issue?
>>>>>
>>>>>
>>>>>             On Wed, Jul 17, 2013 at 11:23 PM, Peter Newman
>>>>>             <peter at uformia.com> wrote:
>>>>>
>>>>>                 Unfortunately, this doesn't appear to be the bug
>>>>>                 I'm hitting. I applied the fix to my source and it
>>>>>                 didn't make a difference.
>>>>>
>>>>>                 Also, further testing showed the same behavior
>>>>>                 with other SIMD instructions. The common factor
>>>>>                 is that in each case ECX is set to 0x7fffffff,
>>>>>                 and the faulting operation uses
>>>>>                 xmmword ptr [ecx+offset].
>>>>>
>>>>>                 Additionally, turning the optimization level
>>>>>                 passed to createJIT down appears to avoid it, so
>>>>>                 I'm now leaning towards a bug in one of the
>>>>>                 optimization passes.
>>>>>
>>>>>                 I'm going to dig through the passes controlled by
>>>>>                 that parameter and see if I can narrow down which
>>>>>                 optimization is causing it.
>>>>>
>>>>>                 Peter N
>>>>>
>>>>>
>>>>>                 On 17/07/2013 1:58 PM, Solomon Boulos wrote:
>>>>>
>>>>>                     As someone off list just told me, perhaps my
>>>>>                     new bug is the same issue:
>>>>>
>>>>>                     http://llvm.org/bugs/show_bug.cgi?id=16640
>>>>>
>>>>>                     Do you happen to be using FastISel?
>>>>>
>>>>>                     Solomon
>>>>>
>>>>>                     On Jul 16, 2013, at 6:39 PM, Peter Newman
>>>>>                     <peter at uformia.com> wrote:
>>>>>
>>>>>                         Hello all,
>>>>>
>>>>>                         I'm currently in the process of debugging
>>>>>                         a crash occurring in our program. In LLVM
>>>>>                         3.2 and 3.3 it appears that JIT-generated
>>>>>                         code is attempting to access unaligned
>>>>>                         memory with an SSE2 instruction. However,
>>>>>                         this only happens under certain
>>>>>                         conditions that seem (but may not be)
>>>>>                         related to the stack's state on calling
>>>>>                         the function.
>>>>>
>>>>>                         Our program acts as a front-end, using the
>>>>>                         LLVM C++ API to generate a JIT generated
>>>>>                         function. This function is primarily
>>>>>                         mathematical, so we use the Vector types
>>>>>                         to take advantage of SIMD instructions (as
>>>>>                         well as a few SSE2 intrinsics).
>>>>>
>>>>>                         This worked in LLVM 2.8 but started
>>>>>                         failing in 3.2 and has continued to fail
>>>>>                         in 3.3. It fails with no optimizations
>>>>>                         applied to the LLVM Function/Module. It
>>>>>                         crashes with what is reported as a memory
>>>>>                         access error (accessing 0xffffffff),
>>>>>                         however it's suggested that this is how
>>>>>                         the SSE fault raising mechanism appears.
>>>>>
>>>>>                         The generated instruction varies, but it
>>>>>                         seems to often be similar to (I don't have
>>>>>                         it in front of me, sorry):
>>>>>                         movapd xmm0, xmm[ecx+0x???????]
>>>>>                         Where the xmm register changes, and the
>>>>>                         second parameter is a memory access.
>>>>>                         ECX is always set to 0x7fffffff - however I
>>>>>                         don't know if this is part of the SSE
>>>>>                         error reporting process or is part of the
>>>>>                         situation causing the error.
>>>>>
>>>>>                         I haven't worked out exactly what code
>>>>>                         path etc is causing this crash. I'm hoping
>>>>>                         that someone can tell me if there were any
>>>>>                         changed requirements for working with SIMD
>>>>>                         in LLVM 3.2 (or earlier, we haven't tried
>>>>>                         3.0 or 3.1). I currently suspect the use
>>>>>                         of GlobalVariable (we first discovered the
>>>>>                         crash when using a feature that uses
>>>>>                         them), however I have attempted using
>>>>>                         setAlignment on the GlobalVariables
>>>>>                         without any change.
>>>>>
>>>>>                         --
>>>>>                         Peter N
>>>>>                         _______________________________________________
>>>>>                         LLVM Developers mailing list
>>>>>                         LLVMdev at cs.uiuc.edu
>>>>>                         http://llvm.cs.uiuc.edu
>>>>>                         http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>             -- 
>>>>>             ~Craig
>>>>
>>>
>>>
>>>
>>>
>>>         -- 
>>>         ~Craig
>>
>>
>>
>>
>>     -- 
>>     ~Craig
>
>
>
>
> -- 
> ~Craig
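
For reference, the alignment condition behind the fault described in the
quoted thread can be sketched in C++. movapd requires its memory operand to be
16-byte aligned, so with ECX = 0x7fffffff an address of the form ecx+offset is
aligned only when the offset is 1 modulo 16. The function names and the 0x20
displacement are hypothetical; the real displacement was not recorded in the
thread.

```cpp
#include <cassert>
#include <cstdint>

// True when addr is a multiple of 16, the alignment movapd requires for a
// memory operand.
inline bool isAligned16(std::uint32_t addr) {
    return (addr & 0xFu) == 0;
}

// Rounds addr down to a 16-byte boundary, the same masking as the
// `and esp,0FFFFFFF0h` instruction in the quoted disassembly.
inline std::uint32_t alignDown16(std::uint32_t addr) {
    const std::uint32_t aligned = addr & 0xFFFFFFF0u;
    assert(isAligned16(aligned));
    return aligned;
}

// Effective address of [base + displacement] in 32-bit mode; unsigned
// arithmetic wraps modulo 2^32 just as the address calculation does.
inline std::uint32_t effectiveAddress(std::uint32_t base, std::uint32_t displacement) {
    return base + displacement;
}
```

With the reported ECX value, effectiveAddress(0x7FFFFFFFu, 0x20u) yields
0x8000001F, which fails isAligned16, so an aligned load from it would fault
regardless of how the underlying data was aligned.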


