[LLVMdev] llvm.x86.sse2.sqrt.pd not using sqrtpd, calling a function that modifies ECX
Craig Topper
craig.topper at gmail.com
Thu Jul 18 23:59:13 PDT 2013
I think those calls are to the MSVC _ftol2 function.
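
For illustration only, here is a minimal C++ sketch of the two pieces involved. It assumes the helper has the truncating semantics of a C-style floating-point-to-integer cast, and it uses the 16-byte alignment that movapd requires of a memory operand. The function names truncate_to_i64 and is_aligned_16 are hypothetical and are not part of MSVC or LLVM.

```cpp
#include <cstdint>

// Hypothetical stand-in for what a float-to-integer helper computes:
// a C++ cast truncates toward zero, dropping the fractional part.
std::int64_t truncate_to_i64(double value) {
    return static_cast<std::int64_t>(value);
}

// movapd faults unless its memory operand is 16-byte aligned,
// i.e. the low four bits of the address are zero.
bool is_aligned_16(std::uint32_t address) {
    return (address & 0xFu) == 0;
}
```

With ECX holding 0x7fffffff, any 16-byte-aligned offset added to it gives an address whose low four bits are 0xF, so is_aligned_16 returns false and an aligned SSE load from that address would fault. That would be consistent with the crash described below if the register is clobbered across the call.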
On Thu, Jul 18, 2013 at 11:34 PM, Peter Newman <peter at uformia.com> wrote:
> (Changing subject line as diagnosis has changed)
>
> I'm attaching the compiled code that I've been getting, both with
> CodeGenOpt::Default and CodeGenOpt::None. The crash isn't occurring with
> CodeGenOpt::None, but that seems to be because ECX isn't being used - it
> still gets set to 0x7fffffff by one of the calls to 76719BA1.
>
> I notice that X86::SQRTPD[m|r] appear in X86InstrInfo::isHighLatencyDef. I
> was thinking an optimization might be removing it, but I don't get the
> sqrtpd instruction even with the createJIT optimization level turned off.
>
> I am trying this with the Release 3.3 code - I'll try it with trunk and
> see if I get a different result there. Maybe there was a recent commit for
> this.
>
> --
> Peter N
>
> On 19/07/2013 4:00 PM, Craig Topper wrote:
>
> Hmm, I'm not able to get those .ll files to compile if I disable SSE and I
> end up with SSE instructions (including sqrtpd) if I don't disable it.
>
>
> On Thu, Jul 18, 2013 at 10:53 PM, Peter Newman <peter at uformia.com> wrote:
>
>> Is there something specifically required to enable SSE? If it's not
>> detected as available (based on the target triple?) then I don't think we
>> enable it specifically.
>>
>> Also it seems that it should handle converting to/from the vector types,
>> although I can see it getting confused about needing to do that if it
>> thinks SSE isn't available at all.
>>
>>
>> On 19/07/2013 3:47 PM, Craig Topper wrote:
>>
>> Hmm, maybe SSE isn't being enabled so it's falling back to emulating sqrt?
>>
>>
>> On Thu, Jul 18, 2013 at 10:45 PM, Peter Newman <peter at uformia.com> wrote:
>>
>>> In the disassembly, I'm seeing three cases of
>>> call 76719BA1
>>>
>>> I am assuming this is the sqrt function as this is the only function
>>> called in the LLVM IR.
>>>
>>> The code at 76719BA1 is:
>>>
>>> 76719BA1 push ebp
>>> 76719BA2 mov ebp,esp
>>> 76719BA4 sub esp,20h
>>> 76719BA7 and esp,0FFFFFFF0h
>>> 76719BAA fld st(0)
>>> 76719BAC fst dword ptr [esp+18h]
>>> 76719BB0 fistp qword ptr [esp+10h]
>>> 76719BB4 fild qword ptr [esp+10h]
>>> 76719BB8 mov edx,dword ptr [esp+18h]
>>> 76719BBC mov eax,dword ptr [esp+10h]
>>> 76719BC0 test eax,eax
>>> 76719BC2 je 76719DCF
>>> 76719BC8 fsubp st(1),st
>>> 76719BCA test edx,edx
>>> 76719BCC js 7671F9DB
>>> 76719BD2 fstp dword ptr [esp]
>>> 76719BD5 mov ecx,dword ptr [esp]
>>> 76719BD8 add ecx,7FFFFFFFh
>>> 76719BDE sbb eax,0
>>> 76719BE1 mov edx,dword ptr [esp+14h]
>>> 76719BE5 sbb edx,0
>>> 76719BE8 leave
>>> 76719BE9 ret
>>>
>>>
>>> As you can see at 76719BD5, it modifies ECX.
>>>
>>> I don't know that this is the sqrtpd function (for example, I'm not
>>> seeing any SSE instructions here?) but whatever it is, it's being called
>>> from the IR I attached earlier, and is modifying ECX under some
>>> circumstances.
>>>
>>>
>>> On 19/07/2013 3:29 PM, Craig Topper wrote:
>>>
>>> That should map directly to sqrtpd which can't modify ecx.
>>>
>>>
>>> On Thu, Jul 18, 2013 at 10:27 PM, Peter Newman <peter at uformia.com> wrote:
>>>
>>>> Sorry, that should have been llvm.x86.sse2.sqrt.pd
>>>>
>>>>
>>>> On 19/07/2013 3:25 PM, Craig Topper wrote:
>>>>
>>>> What is "frep.x86.sse2.sqrt.pd"? I'm only familiar with things prefixed
>>>> with "llvm.x86".
>>>>
>>>>
>>>> On Thu, Jul 18, 2013 at 10:12 PM, Peter Newman <peter at uformia.com> wrote:
>>>>
>>>>> After stepping through the produced assembly, I believe I have a
>>>>> culprit.
>>>>>
>>>>> One of the calls to @frep.x86.sse2.sqrt.pd is modifying the value of
>>>>> ECX - while the produced code is expecting it to still contain its previous
>>>>> value.
>>>>>
>>>>> Peter N
>>>>>
>>>>>
>>>>> On 19/07/2013 2:09 PM, Peter Newman wrote:
>>>>>
>>>>> I've attached the module->dump() that our code is producing.
>>>>> Unfortunately this is the smallest test case I have available.
>>>>>
>>>>> This is before any optimization passes are applied. There are two
>>>>> separate modules in existence at the time, and there are no guarantees
>>>>> about the order the surrounding code calls those functions, so there may be
>>>>> some interaction between them? There shouldn't be, they don't refer to any
>>>>> common memory etc. There is no multi-threading occurring.
>>>>>
>>>>> The function in module-dump.ll (called crashfunc in this file) is
>>>>> called with
>>>>> - func_params 0x0018f3b0 double [3]
>>>>> [0x0] -11.339976634695301 double
>>>>> [0x1] -9.7504239056205506 double
>>>>> [0x2] -5.2900856817382804 double
>>>>> at the time of the exception.
>>>>>
>>>>> This is compiled on a "i686-pc-win32" triple. All of the non-intrinsic
>>>>> functions referred to in these modules are the standard equivalents from
>>>>> the MSVC library (e.g. @asin is the standard C lib double asin( double )
>>>>> ).
>>>>>
>>>>> Hopefully this is reproducible for you.
>>>>>
>>>>> --
>>>>> PeterN
>>>>>
>>>>> On 18/07/2013 4:37 PM, Craig Topper wrote:
>>>>>
>>>>> Are you able to send any IR for others to reproduce this issue?
>>>>>
>>>>>
>>>>> On Wed, Jul 17, 2013 at 11:23 PM, Peter Newman <peter at uformia.com> wrote:
>>>>>
>>>>>> Unfortunately, this doesn't appear to be the bug I'm hitting. I
>>>>>> applied the fix to my source and it didn't make a difference.
>>>>>>
>>>>>> Also further testing found me getting the same behavior with other
>>>>>> SIMD instructions. The common factor is in each case, ECX is set to
>>>>>> 0x7fffffff, and it's an operation using xmm ptr ecx+offset .
>>>>>>
>>>>>> Additionally, turning the optimization level passed to createJIT down
>>>>>> appears to avoid it, so I'm now leaning towards a bug in one of the
>>>>>> optimization passes.
>>>>>>
>>>>>> I'm going to dig through the passes controlled by that parameter and
>>>>>> see if I can narrow down which optimization is causing it.
>>>>>>
>>>>>> Peter N
>>>>>>
>>>>>>
>>>>>> On 17/07/2013 1:58 PM, Solomon Boulos wrote:
>>>>>>
>>>>>>> As someone off list just told me, perhaps my new bug is the same
>>>>>>> issue:
>>>>>>>
>>>>>>> http://llvm.org/bugs/show_bug.cgi?id=16640
>>>>>>>
>>>>>>> Do you happen to be using FastISel?
>>>>>>>
>>>>>>> Solomon
>>>>>>>
>>>>>>> On Jul 16, 2013, at 6:39 PM, Peter Newman <peter at uformia.com> wrote:
>>>>>>>
>>>>>>>> Hello all,
>>>>>>>>
>>>>>>>> I'm currently in the process of debugging a crash occurring in our
>>>>>>>> program. In LLVM 3.2 and 3.3 it appears that JIT generated code is
>>>>>>>> attempting to access unaligned memory with an SSE2 instruction.
>>>>>>>> However this only happens under certain conditions that seem (but may not
>>>>>>>> be) related to the stack's state on calling the function.
>>>>>>>>
>>>>>>>> Our program acts as a front-end, using the LLVM C++ API to generate
>>>>>>>> a JIT generated function. This function is primarily mathematical, so we
>>>>>>>> use the Vector types to take advantage of SIMD instructions (as well as a
>>>>>>>> few SSE2 intrinsics).
>>>>>>>>
>>>>>>>> This worked in LLVM 2.8 but started failing in 3.2 and has
>>>>>>>> continued to fail in 3.3. It fails with no optimizations applied to the
>>>>>>>> LLVM Function/Module. It crashes with what is reported as a memory access
>>>>>>>> error (accessing 0xffffffff), however it's suggested that this is how the
>>>>>>>> SSE fault raising mechanism appears.
>>>>>>>>
>>>>>>>> The generated instruction varies, but it seems to often be similar
>>>>>>>> to (I don't have it in front of me, sorry):
>>>>>>>> movapd xmm0, xmm[ecx+0x???????]
>>>>>>>> Where the xmm register changes, and the second parameter is a
>>>>>>>> memory access.
>>>>>>>> ECX is always set to 0x7fffffff - however I don't know if this is
>>>>>>>> part of the SSE error reporting process or is part of the situation causing
>>>>>>>> the error.
>>>>>>>>
>>>>>>>> I haven't worked out exactly what code path etc is causing this
>>>>>>>> crash. I'm hoping that someone can tell me if there were any changed
>>>>>>>> requirements for working with SIMD in LLVM 3.2 (or earlier, we haven't
>>>>>>>> tried 3.0 or 3.1). I currently suspect the use of GlobalVariable (we first
>>>>>>>> discovered the crash when using a feature that uses them), however I have
>>>>>>>> attempted using setAlignment on the GlobalVariables without any change.
>>>>>>>>
>>>>>>>> --
>>>>>>>> Peter N
>>>>>>>> _______________________________________________
>>>>>>>> LLVM Developers mailing list
>>>>>>>> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
>>>>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>>>>>>>>
>>>>>>>
--
~Craig