[LLVMdev] SIMD instructions and memory alignment on X86

Craig Topper craig.topper at gmail.com
Thu Jul 18 23:00:41 PDT 2013


Hmm, I'm not able to get those .ll files to compile if I disable SSE and I
end up with SSE instructions (including sqrtpd) if I don't disable it.
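
On the alignment side of this thread: movapd faults unless its memory
operand is 16-byte aligned, so an access through ECX = 0x7fffffff can only
be aligned when the displacement is 1 modulo 16. Below is a minimal sketch
of that check in C++. The helper names are hypothetical (they are not part
of LLVM or of the code under discussion), and the addresses are the ones
quoted further down.

```cpp
#include <cassert>
#include <cstdint>

// movapd faults unless its memory operand is 16-byte aligned,
// i.e. the low four bits of the effective address are zero.
constexpr bool isAligned16(std::uint32_t addr) {
    return (addr & 0xFu) == 0;
}

// Same rounding as the `and esp,0FFFFFFF0h` in the quoted disassembly.
constexpr std::uint32_t alignDown16(std::uint32_t addr) {
    return addr & 0xFFFFFFF0u;
}

// Hypothetical helper: effective address of [base + disp] on a 32-bit
// target. Unsigned arithmetic wraps modulo 2^32, as the hardware does.
constexpr std::uint32_t effectiveAddress(std::uint32_t base, std::uint32_t disp) {
    return base + disp;
}

// The func_params address from the report is 16-byte aligned.
static_assert(isAligned16(0x0018f3b0u), "func_params is aligned");
// The ECX value from the report is not.
static_assert(!isAligned16(0x7fffffffu), "0x7fffffff is not aligned");
// Rounding down clears the low four bits.
static_assert(alignDown16(0x7fffffffu) == 0x7ffffff0u, "round down to 16");
```

With a base of 0x7fffffff, any displacement that is itself a multiple of 16
gives a misaligned address, which would be consistent with the reported
fault if ECX really was clobbered before the access.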


On Thu, Jul 18, 2013 at 10:53 PM, Peter Newman <peter at uformia.com> wrote:

>  Is there something specifically required to enable SSE? If it's not
> detected as available (based on the target triple?) then I don't think we
> enable it specifically.
>
> Also it seems that it should handle converting to/from the vector types,
> although I can see it getting confused about needing to do that if it
> thinks SSE isn't available at all.
>
>
> On 19/07/2013 3:47 PM, Craig Topper wrote:
>
> Hmm, maybe SSE isn't being enabled, so it's falling back to emulating sqrt?
>
>
> On Thu, Jul 18, 2013 at 10:45 PM, Peter Newman <peter at uformia.com> wrote:
>
>>  In the disassembly, I'm seeing three cases of
>> call        76719BA1
>>
>> I am assuming this is the sqrt function as this is the only function
>> called in the LLVM IR.
>>
>> The code at 76719BA1 is:
>>
>> 76719BA1  push        ebp
>> 76719BA2  mov         ebp,esp
>> 76719BA4  sub         esp,20h
>> 76719BA7  and         esp,0FFFFFFF0h
>> 76719BAA  fld         st(0)
>> 76719BAC  fst         dword ptr [esp+18h]
>> 76719BB0  fistp       qword ptr [esp+10h]
>> 76719BB4  fild        qword ptr [esp+10h]
>> 76719BB8  mov         edx,dword ptr [esp+18h]
>> 76719BBC  mov         eax,dword ptr [esp+10h]
>> 76719BC0  test        eax,eax
>> 76719BC2  je          76719DCF
>> 76719BC8  fsubp       st(1),st
>> 76719BCA  test        edx,edx
>> 76719BCC  js          7671F9DB
>> 76719BD2  fstp        dword ptr [esp]
>> 76719BD5  mov         ecx,dword ptr [esp]
>> 76719BD8  add         ecx,7FFFFFFFh
>> 76719BDE  sbb         eax,0
>> 76719BE1  mov         edx,dword ptr [esp+14h]
>> 76719BE5  sbb         edx,0
>> 76719BE8  leave
>> 76719BE9  ret
>>
>>
>> As you can see at 76719BD5, it modifies ECX.
>>
>> I don't know that this is the sqrtpd function (for example, I'm not
>> seeing any SSE instructions here?) but whatever it is, it's being called
>> from the IR I attached earlier, and is modifying ECX under some
>> circumstances.
>>
>>
>> On 19/07/2013 3:29 PM, Craig Topper wrote:
>>
>> That should map directly to sqrtpd which can't modify ecx.
>>
>>
>> On Thu, Jul 18, 2013 at 10:27 PM, Peter Newman <peter at uformia.com> wrote:
>>
>>>  Sorry, that should have been llvm.x86.sse2.sqrt.pd
>>>
>>>
>>> On 19/07/2013 3:25 PM, Craig Topper wrote:
>>>
>>> What is "frep.x86.sse2.sqrt.pd"? I'm only familiar with things prefixed
>>> with "llvm.x86".
>>>
>>>
>>> On Thu, Jul 18, 2013 at 10:12 PM, Peter Newman <peter at uformia.com> wrote:
>>>
>>>>  After stepping through the produced assembly, I believe I have a
>>>> culprit.
>>>>
>>>> One of the calls to @frep.x86.sse2.sqrt.pd is modifying the value of
>>>> ECX - while the produced code is expecting it to still contain its previous
>>>> value.
>>>>
>>>> Peter N
>>>>
>>>>
>>>> On 19/07/2013 2:09 PM, Peter Newman wrote:
>>>>
>>>> I've attached the module->dump() that our code is producing.
>>>> Unfortunately this is the smallest test case I have available.
>>>>
>>>> This is before any optimization passes are applied. There are two
>>>> separate modules in existence at the time, and there are no guarantees
>>>> about the order the surrounding code calls those functions, so there may be
>>>> some interaction between them? There shouldn't be, they don't refer to any
>>>> common memory etc. There is no multi-threading occurring.
>>>>
>>>> The function in module-dump.ll (called crashfunc in this file) is
>>>> called with
>>>> -        func_params    0x0018f3b0    double [3]
>>>>         [0x0]    -11.339976634695301    double
>>>>         [0x1]    -9.7504239056205506    double
>>>>         [0x2]    -5.2900856817382804    double
>>>> at the time of the exception.
>>>>
>>>> This is compiled on a "i686-pc-win32" triple. All of the non-intrinsic
>>>> functions referred to in these modules are the standard equivalents from
>>>> the MSVC library (e.g. @asin is the standard C library's
>>>> double asin(double)).
>>>>
>>>> Hopefully this is reproducible for you.
>>>>
>>>> --
>>>> PeterN
>>>>
>>>> On 18/07/2013 4:37 PM, Craig Topper wrote:
>>>>
>>>> Are you able to send any IR for others to reproduce this issue?
>>>>
>>>>
>>>> On Wed, Jul 17, 2013 at 11:23 PM, Peter Newman <peter at uformia.com> wrote:
>>>>
>>>>> Unfortunately, this doesn't appear to be the bug I'm hitting. I
>>>>> applied the fix to my source and it didn't make a difference.
>>>>>
>>>>> Further testing also showed the same behavior with other SIMD
>>>>> instructions. The common factor is that in each case ECX is set to
>>>>> 0x7fffffff, and the faulting operation uses xmm ptr ecx+offset.
>>>>>
>>>>> Additionally, turning the optimization level passed to createJIT down
>>>>> appears to avoid it, so I'm now leaning towards a bug in one of the
>>>>> optimization passes.
>>>>>
>>>>> I'm going to dig through the passes controlled by that parameter and
>>>>> see if I can narrow down which optimization is causing it.
>>>>>
>>>>> Peter N
>>>>>
>>>>>
>>>>> On 17/07/2013 1:58 PM, Solomon Boulos wrote:
>>>>>
>>>>>> As someone off list just told me, perhaps my new bug is the same
>>>>>> issue:
>>>>>>
>>>>>>    http://llvm.org/bugs/show_bug.cgi?id=16640
>>>>>>
>>>>>> Do you happen to be using FastISel?
>>>>>>
>>>>>> Solomon
>>>>>>
>>>>>> On Jul 16, 2013, at 6:39 PM, Peter Newman <peter at uformia.com> wrote:
>>>>>>
>>>>>>> Hello all,
>>>>>>>
>>>>>>> I'm currently in the process of debugging a crash occurring in our
>>>>>>> program. In LLVM 3.2 and 3.3 it appears that JIT generated code is
>>>>>>> attempting to access unaligned memory with an SSE2 instruction.
>>>>>>> However this only happens under certain conditions that seem (but may not
>>>>>>> be) related to the stack's state on calling the function.
>>>>>>>
>>>>>>> Our program acts as a front-end, using the LLVM C++ API to generate
>>>>>>> a JIT generated function. This function is primarily mathematical, so we
>>>>>>> use the Vector types to take advantage of SIMD instructions (as well as a
>>>>>>> few SSE2 intrinsics).
>>>>>>>
>>>>>>> This worked in LLVM 2.8 but started failing in 3.2 and has continued
>>>>>>> to fail in 3.3. It fails with no optimizations applied to the LLVM
>>>>>>> Function/Module. It crashes with what is reported as a memory access error
>>>>>>> (accessing 0xffffffff), however it's suggested that this is how the SSE
>>>>>>> fault raising mechanism appears.
>>>>>>>
>>>>>>> The generated instruction varies, but it seems to often be similar
>>>>>>> to (I don't have it in front of me, sorry):
>>>>>>> movapd xmm0, xmm[ecx+0x???????]
>>>>>>> Where the xmm register changes, and the second parameter is a memory
>>>>>>> access.
>>>>>>> ECX is always set to 0x7fffffff - however I don't know if this is
>>>>>>> part of the SSE error reporting process or is part of the situation causing
>>>>>>> the error.
>>>>>>>
>>>>>>> I haven't worked out exactly what code path etc is causing this
>>>>>>> crash. I'm hoping that someone can tell me if there were any changed
>>>>>>> requirements for working with SIMD in LLVM 3.2 (or earlier, we haven't
>>>>>>> tried 3.0 or 3.1). I currently suspect the use of GlobalVariable (we first
>>>>>>> discovered the crash when using a feature that uses them), however I have
>>>>>>> attempted using setAlignment on the GlobalVariables without any change.
>>>>>>>
>>>>>>> --
>>>>>>> Peter N
>>>>>>> _______________________________________________
>>>>>>> LLVM Developers mailing list
>>>>>>> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
>>>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>>  --
>>>> ~Craig
>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>>  --
>>> ~Craig
>>>
>>>
>>>
>>
>>
>>  --
>> ~Craig
>>
>>
>>
>
>
>  --
> ~Craig
>
>
>


-- 
~Craig