[LLVMdev] SIMD instructions and memory alignment on X86

Craig Topper craig.topper at gmail.com
Thu Jul 18 22:47:53 PDT 2013


Hmm, maybe SSE isn't being enabled, so it's falling back to emulating sqrt?
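
One quick sanity check is whether the host CPU reports SSE2 at all. The sketch below is only that: it queries CPUID leaf 1 and tests EDX bit 26, the SSE2 feature flag. The names has_sse2_bit and host_has_sse2 are made up for illustration. It does not show whether the JIT's code generator has SSE2 enabled, since that depends on the target attributes the JIT was configured with.

```cpp
// Pick the CPUID helper that the current compiler provides (x86 only).
#if defined(_MSC_VER) && (defined(_M_IX86) || defined(_M_X64))
#include <intrin.h>
#elif (defined(__GNUC__) || defined(__clang__)) && \
      (defined(__i386__) || defined(__x86_64__))
#include <cpuid.h>
#endif

// SSE2 is reported in bit 26 of EDX for CPUID leaf 1.
constexpr bool has_sse2_bit(unsigned int edx) {
    return ((edx >> 26) & 1u) != 0u;
}

// Hypothetical helper: true if the host CPU reports SSE2.
// Returns false on non-x86 hosts or when CPUID leaf 1 is unavailable.
bool host_has_sse2() {
#if defined(_MSC_VER) && (defined(_M_IX86) || defined(_M_X64))
    int regs[4] = {0, 0, 0, 0};
    __cpuid(regs, 1);  // regs = {EAX, EBX, ECX, EDX}
    return has_sse2_bit(static_cast<unsigned int>(regs[3]));
#elif (defined(__GNUC__) || defined(__clang__)) && \
      (defined(__i386__) || defined(__x86_64__))
    unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;
    if (__get_cpuid(1, &eax, &ebx, &ecx, &edx) == 0) {
        return false;  // leaf 1 not supported
    }
    return has_sse2_bit(edx);
#else
    return false;
#endif
}
```

If host_has_sse2() returns true but the generated code still calls out to a helper, the next thing to look at would be the feature set the JIT was actually given.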


On Thu, Jul 18, 2013 at 10:45 PM, Peter Newman <peter at uformia.com> wrote:

>  In the disassembly, I'm seeing three cases of
> call        76719BA1
>
> I am assuming this is the sqrt function as this is the only function
> called in the LLVM IR.
>
> The code at 76719BA1 is:
>
> 76719BA1  push        ebp
> 76719BA2  mov         ebp,esp
> 76719BA4  sub         esp,20h
> 76719BA7  and         esp,0FFFFFFF0h
> 76719BAA  fld         st(0)
> 76719BAC  fst         dword ptr [esp+18h]
> 76719BB0  fistp       qword ptr [esp+10h]
> 76719BB4  fild        qword ptr [esp+10h]
> 76719BB8  mov         edx,dword ptr [esp+18h]
> 76719BBC  mov         eax,dword ptr [esp+10h]
> 76719BC0  test        eax,eax
> 76719BC2  je          76719DCF
> 76719BC8  fsubp       st(1),st
> 76719BCA  test        edx,edx
> 76719BCC  js          7671F9DB
> 76719BD2  fstp        dword ptr [esp]
> 76719BD5  mov         ecx,dword ptr [esp]
> 76719BD8  add         ecx,7FFFFFFFh
> 76719BDE  sbb         eax,0
> 76719BE1  mov         edx,dword ptr [esp+14h]
> 76719BE5  sbb         edx,0
> 76719BE8  leave
> 76719BE9  ret
>
>
> As you can see at 76719BD5, it modifies ECX.
>
> I don't know that this is the sqrtpd function (for example, I'm not seeing
> any SSE instructions here?) but whatever it is, it's being called from the
> IR I attached earlier, and is modifying ECX under some circumstances.
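
To make the effect of the instruction at 76719BD8 concrete, here is a small sketch that emulates a 32-bit x86 ADD, returning the wrapped sum and the carry flag that the following `sbb eax,0` would consume. AddResult and add32 are names invented for this illustration. It shows only the arithmetic of that one instruction. What value is loaded from [esp] in the failing run is not established in this thread.

```cpp
#include <cstdint>

// Result of a 32-bit x86 ADD: the wrapped sum and the carry flag (CF).
struct AddResult {
    std::uint32_t value;
    bool carry;
};

// Emulates `add r32, imm32`: the sum wraps modulo 2^32 and CF is set
// when the unsigned addition overflows 32 bits.
constexpr AddResult add32(std::uint32_t reg, std::uint32_t imm) {
    return AddResult{
        static_cast<std::uint32_t>(reg + imm),
        static_cast<std::uint64_t>(reg) + imm > 0xFFFFFFFFull
    };
}

// `add ecx,7FFFFFFFh` with ECX == 0 leaves ECX == 0x7FFFFFFF and CF clear.
static_assert(add32(0u, 0x7FFFFFFFu).value == 0x7FFFFFFFu, "wrapped sum");
static_assert(!add32(0u, 0x7FFFFFFFu).carry, "no carry out");
```

Under these assumptions, an input of zero leaves ECX at 0x7FFFFFFF, which happens to be the ECX value reported elsewhere in this thread. That is an observation about the arithmetic, not a confirmed diagnosis.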
>
>
> On 19/07/2013 3:29 PM, Craig Topper wrote:
>
> That should map directly to sqrtpd which can't modify ecx.
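
For comparison, the C/C++ counterpart of that operation is the `_mm_sqrt_pd` intrinsic from `<emmintrin.h>`, which compilers normally emit as a single sqrtpd instruction with no function call. The sketch below assumes a build with SSE2 enabled and otherwise falls back to two scalar std::sqrt calls. sqrt2 and SKETCH_HAVE_SSE2 are hypothetical names, and none of this is the LLVM JIT path discussed in the thread.

```cpp
#include <cassert>
#include <cmath>

// Use the SSE2 intrinsic only when the compiler targets SSE2.
#if defined(__SSE2__) || defined(_M_X64) || \
    (defined(_M_IX86_FP) && _M_IX86_FP >= 2)
#include <emmintrin.h>
#define SKETCH_HAVE_SSE2 1
#else
#define SKETCH_HAVE_SSE2 0
#endif

// Hypothetical helper: square root of two packed doubles.
// `in` and `out` each point to two doubles; no alignment is required
// because unaligned load/store intrinsics are used.
void sqrt2(const double* in, double* out) {
    assert(in != nullptr && out != nullptr);
#if SKETCH_HAVE_SSE2
    __m128d v = _mm_loadu_pd(in);        // load two doubles, unaligned
    _mm_storeu_pd(out, _mm_sqrt_pd(v));  // packed double-precision sqrt
#else
    out[0] = std::sqrt(in[0]);           // scalar fallback
    out[1] = std::sqrt(in[1]);
#endif
}
```

Disassembling sqrt2 from an SSE2 build should show sqrtpd inline. A `call` in its place would suggest SSE2 was not enabled for that compilation.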
>
>
> On Thu, Jul 18, 2013 at 10:27 PM, Peter Newman <peter at uformia.com> wrote:
>
>>  Sorry, that should have been llvm.x86.sse2.sqrt.pd
>>
>>
>> On 19/07/2013 3:25 PM, Craig Topper wrote:
>>
>> What is "frep.x86.sse2.sqrt.pd"? I'm only familiar with things prefixed
>> with "llvm.x86".
>>
>>
>> On Thu, Jul 18, 2013 at 10:12 PM, Peter Newman <peter at uformia.com> wrote:
>>
>>>  After stepping through the produced assembly, I believe I have a
>>> culprit.
>>>
>>> One of the calls to @frep.x86.sse2.sqrt.pd is modifying the value of ECX
>>> - while the produced code is expecting it to still contain its previous
>>> value.
>>>
>>> Peter N
>>>
>>>
>>> On 19/07/2013 2:09 PM, Peter Newman wrote:
>>>
>>> I've attached the module->dump() that our code is producing.
>>> Unfortunately this is the smallest test case I have available.
>>>
>>> This is before any optimization passes are applied. There are two
>>> separate modules in existence at the time, and there are no guarantees
>>> about the order the surrounding code calls those functions, so there may be
>>> some interaction between them? There shouldn't be, they don't refer to any
>>> common memory etc. There is no multi-threading occurring.
>>>
>>> The function in module-dump.ll (called crashfunc in this file) is called
>>> with
>>> -        func_params    0x0018f3b0    double [3]
>>>         [0x0]    -11.339976634695301    double
>>>         [0x1]    -9.7504239056205506    double
>>>         [0x2]    -5.2900856817382804    double
>>> at the time of the exception.
>>>
>>> This is compiled on a "i686-pc-win32" triple. All of the non-intrinsic
>>> functions referred to in these modules are the standard equivalents from
>>> the MSVC library (e.g. @asin is the standard C lib    double asin( double )
>>> ).
>>>
>>> Hopefully this is reproducible for you.
>>>
>>> --
>>> PeterN
>>>
>>> On 18/07/2013 4:37 PM, Craig Topper wrote:
>>>
>>> Are you able to send any IR for others to reproduce this issue?
>>>
>>>
>>> On Wed, Jul 17, 2013 at 11:23 PM, Peter Newman <peter at uformia.com> wrote:
>>>
>>>> Unfortunately, this doesn't appear to be the bug I'm hitting. I applied
>>>> the fix to my source and it didn't make a difference.
>>>>
>>>> Further testing also showed the same behavior with other SIMD
>>>> instructions. The common factor is that, in each case, ECX is set to
>>>> 0x7fffffff and the operation uses xmm ptr ecx+offset.
>>>>
>>>> Additionally, turning the optimization level passed to createJIT down
>>>> appears to avoid it, so I'm now leaning towards a bug in one of the
>>>> optimization passes.
>>>>
>>>> I'm going to dig through the passes controlled by that parameter and
>>>> see if I can narrow down which optimization is causing it.
>>>>
>>>> Peter N
>>>>
>>>>
>>>> On 17/07/2013 1:58 PM, Solomon Boulos wrote:
>>>>
>>>>> As someone off list just told me, perhaps my new bug is the same issue:
>>>>>
>>>>>    http://llvm.org/bugs/show_bug.cgi?id=16640
>>>>>
>>>>> Do you happen to be using FastISel?
>>>>>
>>>>> Solomon
>>>>>
>>>>> On Jul 16, 2013, at 6:39 PM, Peter Newman <peter at uformia.com> wrote:
>>>>>
>>>>>  Hello all,
>>>>>>
>>>>>> I'm currently in the process of debugging a crash occurring in our
>>>>>> program. In LLVM 3.2 and 3.3 it appears that JIT generated code is
>>>>>> attempting to access unaligned memory with an SSE2 instruction.
>>>>>> However, this only happens under certain conditions that seem (but may not
>>>>>> be) related to the stack's state on calling the function.
>>>>>>
>>>>>> Our program acts as a front-end, using the LLVM C++ API to build a
>>>>>> function that is then JIT compiled. This function is primarily mathematical, so we use
>>>>>> the Vector types to take advantage of SIMD instructions (as well as a few
>>>>>> SSE2 intrinsics).
>>>>>>
>>>>>> This worked in LLVM 2.8 but started failing in 3.2 and has continued
>>>>>> to fail in 3.3. It fails with no optimizations applied to the LLVM
>>>>>> Function/Module. It crashes with what is reported as a memory access error
>>>>>> (accessing 0xffffffff), however it's suggested that this is how the SSE
>>>>>> fault raising mechanism appears.
>>>>>>
>>>>>> The generated instruction varies, but it seems to often be similar to
>>>>>> (I don't have it in front of me, sorry):
>>>>>> movapd xmm0, xmm[ecx+0x???????]
>>>>>> Where the xmm register changes, and the second parameter is a memory
>>>>>> access.
>>>>>> ECX is always set to 0x7fffffff - however I don't know if this is part
>>>>>> of the SSE error reporting process or is part of the situation causing the
>>>>>> error.
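
Some arithmetic on the alignment rule may help here. movapd requires its memory operand to be 16-byte aligned and faults otherwise. The sketch below computes a 32-bit effective address from a base register and a displacement and checks that alignment. effective_address and is_aligned are illustrative names, and 0x7FFFFFFF is the ECX value reported elsewhere in this thread.

```cpp
#include <cstdint>

// 32-bit effective address for [base + disp]; wraps modulo 2^32.
constexpr std::uint32_t effective_address(std::uint32_t base,
                                          std::uint32_t disp) {
    return static_cast<std::uint32_t>(base + disp);
}

// True if `address` is a multiple of `alignment`.
// `alignment` must be a power of two (16 for movapd).
constexpr bool is_aligned(std::uint32_t address, std::uint32_t alignment) {
    return (address & (alignment - 1u)) == 0u;
}

// With ECX == 0x7FFFFFFF, a 16-byte-aligned displacement gives an
// unaligned address, so an aligned load from it would fault.
static_assert(!is_aligned(effective_address(0x7FFFFFFFu, 0x10u), 16u),
              "ecx + 0x10 is not 16-byte aligned");
```

Since 0x7FFFFFFF leaves a remainder of 15 modulo 16, ecx+offset is 16-byte aligned only when the offset itself is 1 modulo 16. Offsets produced for aligned vector data would not have that form, so a fault is the expected outcome if ECX really holds that value at the load.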
>>>>>>
>>>>>> I haven't worked out exactly what code path etc is causing this
>>>>>> crash. I'm hoping that someone can tell me if there were any changed
>>>>>> requirements for working with SIMD in LLVM 3.2 (or earlier, we haven't
>>>>>> tried 3.0 or 3.1). I currently suspect the use of GlobalVariable (we first
>>>>>> discovered the crash when using a feature that uses them), however I have
>>>>>> attempted using setAlignment on the GlobalVariables without any change.
>>>>>>
>>>>>> --
>>>>>> Peter N
>>>>>> _______________________________________________
>>>>>> LLVM Developers mailing list
>>>>>> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
>>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>>>>>>
>>>>>


-- 
~Craig