[LLVMdev] SIMD instructions and memory alignment on X86
Peter Newman
peter at uformia.com
Thu Jul 18 22:53:44 PDT 2013
Is there something specifically required to enable SSE? If it's not
detected as available (based on the target triple?) then I don't think
we enable it explicitly.
Also it seems that it should handle converting to/from the vector types,
although I can see it getting confused about needing to do that if it
thinks SSE isn't available at all.
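
For reference, the alignment reasoning in the quoted messages below can be sketched in a few lines of Python. This is only an illustration, not code from our project. movapd requires its memory operand to be 16-byte aligned, so an operand of the form [ecx+offset] with ECX = 0x7fffffff is misaligned for any offset that is itself a multiple of 16. The helper names and the 0x20 displacement are made up for the example; the ECX value and the func_params address are the ones reported in this thread.

```python
# Sketch: check whether a 32-bit effective address satisfies movapd's
# 16-byte alignment requirement. Helper names are hypothetical.

MOVAPD_ALIGNMENT = 16  # movapd faults on a memory operand that is not 16-byte aligned


def effective_address(base: int, offset: int) -> int:
    """Return the 32-bit effective address of [base + offset], wrapping at 2**32."""
    return (base + offset) & 0xFFFFFFFF


def is_aligned(address: int, alignment: int = MOVAPD_ALIGNMENT) -> bool:
    """Return True if address is a multiple of alignment."""
    return address % alignment == 0


if __name__ == "__main__":
    clobbered_ecx = 0x7FFFFFFF   # ECX value reported in this thread
    hypothetical_offset = 0x20   # assumption: some 16-byte-aligned displacement
    func_params = 0x0018F3B0     # address of func_params reported in this thread

    addr = effective_address(clobbered_ecx, hypothetical_offset)
    print(hex(addr), "aligned" if is_aligned(addr) else "misaligned")
    print(hex(func_params), "aligned" if is_aligned(func_params) else "misaligned")
```

If this reading is right, the unaligned access would be a symptom of ECX being overwritten rather than of the data itself being misaligned, which would be consistent with setAlignment on the GlobalVariables making no difference.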
On 19/07/2013 3:47 PM, Craig Topper wrote:
> Hmm, maybe SSE isn't being enabled so it's falling back to emulating sqrt?
>
>
> On Thu, Jul 18, 2013 at 10:45 PM, Peter Newman <peter at uformia.com> wrote:
>
> In the disassembly, I'm seeing three cases of
> call 76719BA1
>
> I am assuming this is the sqrt function as this is the only
> function called in the LLVM IR.
>
> The code at 76719BA1 is:
>
> 76719BA1 push ebp
> 76719BA2 mov ebp,esp
> 76719BA4 sub esp,20h
> 76719BA7 and esp,0FFFFFFF0h
> 76719BAA fld st(0)
> 76719BAC fst dword ptr [esp+18h]
> 76719BB0 fistp qword ptr [esp+10h]
> 76719BB4 fild qword ptr [esp+10h]
> 76719BB8 mov edx,dword ptr [esp+18h]
> 76719BBC mov eax,dword ptr [esp+10h]
> 76719BC0 test eax,eax
> 76719BC2 je 76719DCF
> 76719BC8 fsubp st(1),st
> 76719BCA test edx,edx
> 76719BCC js 7671F9DB
> 76719BD2 fstp dword ptr [esp]
> 76719BD5 mov ecx,dword ptr [esp]
> 76719BD8 add ecx,7FFFFFFFh
> 76719BDE sbb eax,0
> 76719BE1 mov edx,dword ptr [esp+14h]
> 76719BE5 sbb edx,0
> 76719BE8 leave
> 76719BE9 ret
>
>
> As you can see at 76719BD5, it modifies ECX.
>
> I don't know that this is the sqrtpd function (for example, I'm
> not seeing any SSE instructions here?) but whatever it is, it's
> being called from the IR I attached earlier, and is modifying ECX
> under some circumstances.
>
>
> On 19/07/2013 3:29 PM, Craig Topper wrote:
>> That should map directly to sqrtpd which can't modify ecx.
>>
>>
>> On Thu, Jul 18, 2013 at 10:27 PM, Peter Newman <peter at uformia.com> wrote:
>>
>> Sorry, that should have been llvm.x86.sse2.sqrt.pd
>>
>>
>> On 19/07/2013 3:25 PM, Craig Topper wrote:
>>> What is "frep.x86.sse2.sqrt.pd"? I'm only familiar with
>>> things prefixed with "llvm.x86".
>>>
>>>
>>> On Thu, Jul 18, 2013 at 10:12 PM, Peter Newman
>>> <peter at uformia.com> wrote:
>>>
>>> After stepping through the produced assembly, I believe
>>> I have a culprit.
>>>
>>> One of the calls to @frep.x86.sse2.sqrt.pd is modifying
>>> the value of ECX - while the produced code is expecting
>>> it to still contain its previous value.
>>>
>>> Peter N
>>>
>>>
>>> On 19/07/2013 2:09 PM, Peter Newman wrote:
>>>> I've attached the module->dump() that our code is
>>>> producing. Unfortunately this is the smallest test case
>>>> I have available.
>>>>
>>>> This is before any optimization passes are applied.
>>>> There are two separate modules in existence at the
>>>> time, and there are no guarantees about the order the
>>>> surrounding code calls those functions, so there may be
>>>> some interaction between them? There shouldn't be, they
>>>> don't refer to any common memory etc. There is no
>>>> multi-threading occurring.
>>>>
>>>> The function in module-dump.ll (called crashfunc in
>>>> this file) is called with
>>>> - func_params 0x0018f3b0 double [3]
>>>> [0x0] -11.339976634695301 double
>>>> [0x1] -9.7504239056205506 double
>>>> [0x2] -5.2900856817382804 double
>>>> at the time of the exception.
>>>>
>>>> This is compiled on a "i686-pc-win32" triple. All of
>>>> the non-intrinsic functions referred to in these
>>>> modules are the standard equivalents from the MSVC
>>>> library (e.g. @asin is the standard C lib double
>>>> asin( double ) ).
>>>>
>>>> Hopefully this is reproducible for you.
>>>>
>>>> --
>>>> PeterN
>>>>
>>>> On 18/07/2013 4:37 PM, Craig Topper wrote:
>>>>> Are you able to send any IR for others to reproduce
>>>>> this issue?
>>>>>
>>>>>
>>>>> On Wed, Jul 17, 2013 at 11:23 PM, Peter Newman
>>>>> <peter at uformia.com> wrote:
>>>>>
>>>>> Unfortunately, this doesn't appear to be the bug
>>>>> I'm hitting. I applied the fix to my source and it
>>>>> didn't make a difference.
>>>>>
>>>>> Further testing also showed the same
>>>>> behavior with other SIMD instructions. The common
>>>>> factor is that in each case ECX is set to 0x7fffffff,
>>>>> and it's an operation using xmm ptr ecx+offset.
>>>>>
>>>>> Additionally, turning the optimization level
>>>>> passed to createJIT down appears to avoid it, so
>>>>> I'm now leaning towards a bug in one of the
>>>>> optimization passes.
>>>>>
>>>>> I'm going to dig through the passes controlled by
>>>>> that parameter and see if I can narrow down which
>>>>> optimization is causing it.
>>>>>
>>>>> Peter N
>>>>>
>>>>>
>>>>> On 17/07/2013 1:58 PM, Solomon Boulos wrote:
>>>>>
>>>>> As someone off list just told me, perhaps my
>>>>> new bug is the same issue:
>>>>>
>>>>> http://llvm.org/bugs/show_bug.cgi?id=16640
>>>>>
>>>>> Do you happen to be using FastISel?
>>>>>
>>>>> Solomon
>>>>>
>>>>> On Jul 16, 2013, at 6:39 PM, Peter Newman
>>>>> <peter at uformia.com> wrote:
>>>>>
>>>>> Hello all,
>>>>>
>>>>> I'm currently in the process of debugging
>>>>> a crash occurring in our program. In LLVM
>>>>> 3.2 and 3.3 it appears that JIT generated
>>>>> code is attempting to access unaligned
>>>>> memory with an SSE2 instruction.
>>>>> However this only happens under certain
>>>>> conditions that seem (but may not be)
>>>>> related to the stack's state on calling the
>>>>> function.
>>>>>
>>>>> Our program acts as a front-end, using the
>>>>> LLVM C++ API to generate a JIT generated
>>>>> function. This function is primarily
>>>>> mathematical, so we use the Vector types
>>>>> to take advantage of SIMD instructions (as
>>>>> well as a few SSE2 intrinsics).
>>>>>
>>>>> This worked in LLVM 2.8 but started
>>>>> failing in 3.2 and has continued to fail
>>>>> in 3.3. It fails with no optimizations
>>>>> applied to the LLVM Function/Module. It
>>>>> crashes with what is reported as a memory
>>>>> access error (accessing 0xffffffff),
>>>>> however it's suggested that this is how
>>>>> the SSE fault raising mechanism appears.
>>>>>
>>>>> The generated instruction varies, but it
>>>>> seems to often be similar to (I don't have
>>>>> it in front of me, sorry):
>>>>> movapd xmm0, xmm[ecx+0x???????]
>>>>> Where the xmm register changes, and the
>>>>> second parameter is a memory access.
>>>>> ECX is always set to 0x7fffffff - however I
>>>>> don't know if this is part of the SSE
>>>>> error reporting process or is part of the
>>>>> situation causing the error.
>>>>>
>>>>> I haven't worked out exactly what code
>>>>> path etc is causing this crash. I'm hoping
>>>>> that someone can tell me if there were any
>>>>> changed requirements for working with SIMD
>>>>> in LLVM 3.2 (or earlier, we haven't tried
>>>>> 3.0 or 3.1). I currently suspect the use
>>>>> of GlobalVariable (we first discovered the
>>>>> crash when using a feature that uses
>>>>> them), however I have attempted using
>>>>> setAlignment on the GlobalVariables
>>>>> without any change.
>>>>>
>>>>> --
>>>>> Peter N
>>>>> _______________________________________________
>>>>> LLVM Developers mailing list
>>>>> LLVMdev at cs.uiuc.edu
>>>>> http://llvm.cs.uiuc.edu
>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> ~Craig
>>>>
>>>
>>>
>>>
>>>
>>> --
>>> ~Craig
>>
>>
>>
>>
>> --
>> ~Craig
>
>
>
>
> --
> ~Craig