[LLVMdev] fptoui calling a function that modifies ECX
Peter Newman
peter at uformia.com
Thu Jul 18 23:59:28 PDT 2013
Oh, excellent point, I agree. My bad. Now that I'm not assuming those
are the sqrt, I see the sqrtpd's in the output. Also there are three
fptoui's and there are 3 call instances.
(Changing subject line again.)
Now it looks like it's bug #13862
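For anyone following along, here is a minimal C++ sketch of the IR shape Craig points at in the quoted message below: an fptoui from double to i32 followed by a multiply by 12. The name scaled_index is hypothetical, and reading the multiply as index scaling is only my assumption; the thread only establishes that an imul with 12 follows the conversion. By the diagnosis above, it is this conversion, not the sqrt intrinsic, that corresponds to the call at 76719BA1.

```cpp
#include <cstdint>

// Hypothetical helper mirroring the IR shape from the thread:
//   %110 = fptoui double %109 to i32
//   ...followed by a multiply by 12.
std::uint32_t scaled_index(double value) {
    // Truncates toward zero; only defined for values in [0, 2^32).
    std::uint32_t index = static_cast<std::uint32_t>(value);
    return index * 12u;
}
```

On the i686-pc-win32 build discussed here, the conversion step appears to be the part that ends up as a call rather than an inline instruction.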
On 19/07/2013 4:51 PM, Craig Topper wrote:
> I think those calls correspond to this
>
> %110 = fptoui double %109 to i32
>
> The calls are followed by an imul with 12 which matches up with what
> occurs right after the fptoui in the IR.
>
>
> On Thu, Jul 18, 2013 at 11:48 PM, Peter Newman <peter at uformia.com> wrote:
>
> Yes, that is the result of module-dump.ll
>
>
> On 19/07/2013 4:46 PM, Craig Topper wrote:
>> Does this correspond to one of the .ll files you sent earlier?
>>
>>
>> On Thu, Jul 18, 2013 at 11:34 PM, Peter Newman <peter at uformia.com> wrote:
>>
>> (Changing subject line as diagnosis has changed)
>>
>> I'm attaching the compiled code that I've been getting, both
>> with CodeGenOpt::Default and CodeGenOpt::None. The crash
>> isn't occurring with CodeGenOpt::None, but that seems to be
>> because ECX isn't being used - it still gets set to
>> 0x7fffffff by one of the calls to 76719BA1.
>>
>> I notice that X86::SQRTPD[m|r] appear in
>> X86InstrInfo::isHighLatencyDef. I was thinking an
>> optimization might be removing it, but I don't get the sqrtpd
>> instruction even with the createJIT optimization level turned off.
>>
>> I am trying this with the Release 3.3 code - I'll try it with
>> trunk and see if I get a different result there. Maybe there
>> was a recent commit for this.
>>
>> --
>> Peter N
>>
>> On 19/07/2013 4:00 PM, Craig Topper wrote:
>>> Hmm, I'm not able to get those .ll files to compile if I
>>> disable SSE and I end up with SSE instructions(including
>>> sqrtpd) if I don't disable it.
>>>
>>>
>>> On Thu, Jul 18, 2013 at 10:53 PM, Peter Newman <peter at uformia.com> wrote:
>>>
>>> Is there something specifically required to enable SSE?
>>> If it's not detected as available (based on the target
>>> triple?) then I don't think we enable it specifically.
>>>
>>> Also it seems that it should handle converting to/from
>>> the vector types, although I can see it getting confused
>>> about needing to do that if it thinks SSE isn't
>>> available at all.
>>>
>>>
>>> On 19/07/2013 3:47 PM, Craig Topper wrote:
>>>> Hmm, maybe SSE isn't being enabled so it's falling back
>>>> to emulating sqrt?
>>>>
>>>>
>>>> On Thu, Jul 18, 2013 at 10:45 PM, Peter Newman <peter at uformia.com> wrote:
>>>>
>>>> In the disassembly, I'm seeing three cases of
>>>> call 76719BA1
>>>>
>>>> I am assuming this is the sqrt function as this is
>>>> the only function called in the LLVM IR.
>>>>
>>>> The code at 76719BA1 is:
>>>>
>>>> 76719BA1 push ebp
>>>> 76719BA2 mov ebp,esp
>>>> 76719BA4 sub esp,20h
>>>> 76719BA7 and esp,0FFFFFFF0h
>>>> 76719BAA fld st(0)
>>>> 76719BAC fst dword ptr [esp+18h]
>>>> 76719BB0 fistp qword ptr [esp+10h]
>>>> 76719BB4 fild qword ptr [esp+10h]
>>>> 76719BB8 mov edx,dword ptr [esp+18h]
>>>> 76719BBC mov eax,dword ptr [esp+10h]
>>>> 76719BC0 test eax,eax
>>>> 76719BC2 je 76719DCF
>>>> 76719BC8 fsubp st(1),st
>>>> 76719BCA test edx,edx
>>>> 76719BCC js 7671F9DB
>>>> 76719BD2 fstp dword ptr [esp]
>>>> 76719BD5 mov ecx,dword ptr [esp]
>>>> 76719BD8 add ecx,7FFFFFFFh
>>>> 76719BDE sbb eax,0
>>>> 76719BE1 mov edx,dword ptr [esp+14h]
>>>> 76719BE5 sbb edx,0
>>>> 76719BE8 leave
>>>> 76719BE9 ret
>>>>
>>>>
>>>> As you can see at 76719BD5, it modifies ECX.
>>>>
>>>> I don't know that this is the sqrtpd function (for
>>>> example, I'm not seeing any SSE instructions here?)
>>>> but whatever it is, it's being called from the IR I
>>>> attached earlier, and is modifying ECX under some
>>>> circumstances.
>>>>
>>>>
>>>> On 19/07/2013 3:29 PM, Craig Topper wrote:
>>>>> That should map directly to sqrtpd which can't
>>>>> modify ecx.
>>>>>
>>>>>
>>>>> On Thu, Jul 18, 2013 at 10:27 PM, Peter Newman <peter at uformia.com> wrote:
>>>>>
>>>>> Sorry, that should have been
>>>>> llvm.x86.sse2.sqrt.pd
>>>>>
>>>>>
>>>>> On 19/07/2013 3:25 PM, Craig Topper wrote:
>>>>>> What is "frep.x86.sse2.sqrt.pd"? I'm only
>>>>>> familiar with things prefixed with "llvm.x86".
>>>>>>
>>>>>>
>>>>>> On Thu, Jul 18, 2013 at 10:12 PM, Peter Newman <peter at uformia.com> wrote:
>>>>>>
>>>>>> After stepping through the produced
>>>>>> assembly, I believe I have a culprit.
>>>>>>
>>>>>> One of the calls to
>>>>>> @frep.x86.sse2.sqrt.pd is modifying the
>>>>>> value of ECX - while the produced code is
>>>>>> expecting it to still contain its
>>>>>> previous value.
>>>>>>
>>>>>> Peter N
>>>>>>
>>>>>>
>>>>>> On 19/07/2013 2:09 PM, Peter Newman wrote:
>>>>>>> I've attached the module->dump() that
>>>>>>> our code is producing. Unfortunately
>>>>>>> this is the smallest test case I have
>>>>>>> available.
>>>>>>>
>>>>>>> This is before any optimization passes
>>>>>>> are applied. There are two separate
>>>>>>> modules in existence at the time, and
>>>>>>> there are no guarantees about the order
>>>>>>> the surrounding code calls those
>>>>>>> functions, so there may be some
>>>>>>> interaction between them? There
>>>>>>> shouldn't be, they don't refer to any
>>>>>>> common memory etc. There is no
>>>>>>> multi-threading occurring.
>>>>>>>
>>>>>>> The function in module-dump.ll (called
>>>>>>> crashfunc in this file) is called with
>>>>>>> - func_params 0x0018f3b0 double [3]
>>>>>>> [0x0] -11.339976634695301 double
>>>>>>> [0x1] -9.7504239056205506 double
>>>>>>> [0x2] -5.2900856817382804 double
>>>>>>> at the time of the exception.
>>>>>>>
>>>>>>> This is compiled on a "i686-pc-win32"
>>>>>>> triple. All of the non-intrinsic
>>>>>>> functions referred to in these modules
>>>>>>> are the standard equivalents from the
>>>>>>> MSVC library (e.g. @asin is the standard
>>>>>>> C lib double asin( double ) ).
>>>>>>>
>>>>>>> Hopefully this is reproducible for you.
>>>>>>>
>>>>>>> --
>>>>>>> PeterN
>>>>>>>
>>>>>>> On 18/07/2013 4:37 PM, Craig Topper wrote:
>>>>>>>> Are you able to send any IR for others
>>>>>>>> to reproduce this issue?
>>>>>>>>
>>>>>>>>
>>>>>>>> On Wed, Jul 17, 2013 at 11:23 PM, Peter Newman <peter at uformia.com> wrote:
>>>>>>>>
>>>>>>>> Unfortunately, this doesn't appear
>>>>>>>> to be the bug I'm hitting. I
>>>>>>>> applied the fix to my source and it
>>>>>>>> didn't make a difference.
>>>>>>>>
>>>>>>>> Also, further testing found me
>>>>>>>> getting the same behavior with
>>>>>>>> other SIMD instructions. The common
>>>>>>>> factor is that in each case ECX is
>>>>>>>> set to 0x7fffffff, and the operation
>>>>>>>> uses xmmword ptr [ecx+offset].
>>>>>>>>
>>>>>>>> Additionally, turning the
>>>>>>>> optimization level passed to
>>>>>>>> createJIT down appears to avoid it,
>>>>>>>> so I'm now leaning towards a bug in
>>>>>>>> one of the optimization passes.
>>>>>>>>
>>>>>>>> I'm going to dig through the passes
>>>>>>>> controlled by that parameter and
>>>>>>>> see if I can narrow down which
>>>>>>>> optimization is causing it.
>>>>>>>>
>>>>>>>> Peter N
>>>>>>>>
>>>>>>>>
>>>>>>>> On 17/07/2013 1:58 PM, Solomon
>>>>>>>> Boulos wrote:
>>>>>>>>
>>>>>>>> As someone off list just told
>>>>>>>> me, perhaps my new bug is the
>>>>>>>> same issue:
>>>>>>>>
>>>>>>>> http://llvm.org/bugs/show_bug.cgi?id=16640
>>>>>>>>
>>>>>>>> Do you happen to be using FastISel?
>>>>>>>>
>>>>>>>> Solomon
>>>>>>>>
>>>>>>>> On Jul 16, 2013, at 6:39 PM, Peter Newman <peter at uformia.com> wrote:
>>>>>>>>
>>>>>>>> Hello all,
>>>>>>>>
>>>>>>>> I'm currently in the
>>>>>>>> process of debugging a
>>>>>>>> crash occurring in our
>>>>>>>> program. In LLVM 3.2 and
>>>>>>>> 3.3 it appears that JIT
>>>>>>>> generated code is
>>>>>>>> attempting to access
>>>>>>>> unaligned memory with an
>>>>>>>> SSE2 instruction. However
>>>>>>>> this only happens under
>>>>>>>> certain conditions that
>>>>>>>> seem (but may not be)
>>>>>>>> related to the stack's
>>>>>>>> state on calling the function.
>>>>>>>>
>>>>>>>> Our program acts as a
>>>>>>>> front-end, using the LLVM
>>>>>>>> C++ API to generate a JIT
>>>>>>>> generated function. This
>>>>>>>> function is primarily
>>>>>>>> mathematical, so we use the
>>>>>>>> Vector types to take
>>>>>>>> advantage of SIMD
>>>>>>>> instructions (as well as a
>>>>>>>> few SSE2 intrinsics).
>>>>>>>>
>>>>>>>> This worked in LLVM 2.8 but
>>>>>>>> started failing in 3.2 and
>>>>>>>> has continued to fail in
>>>>>>>> 3.3. It fails with no
>>>>>>>> optimizations applied to
>>>>>>>> the LLVM Function/Module.
>>>>>>>> It crashes with what is
>>>>>>>> reported as a memory access
>>>>>>>> error (accessing
>>>>>>>> 0xffffffff), however it's
>>>>>>>> suggested that this is how
>>>>>>>> the SSE fault raising
>>>>>>>> mechanism appears.
>>>>>>>>
>>>>>>>> The generated instruction
>>>>>>>> varies, but it seems to
>>>>>>>> often be similar to (I
>>>>>>>> don't have it in front of
>>>>>>>> me, sorry):
>>>>>>>> movapd xmm0, xmm[ecx+0x???????]
>>>>>>>> Where the xmm register
>>>>>>>> changes, and the second
>>>>>>>> parameter is a memory access.
>>>>>>>> ECX is always set to
>>>>>>>> 0x7fffffff - however I don't
>>>>>>>> know if this is part of the
>>>>>>>> SSE error reporting process
>>>>>>>> or is part of the situation
>>>>>>>> causing the error.
>>>>>>>>
>>>>>>>> I haven't worked out
>>>>>>>> exactly what code path etc
>>>>>>>> is causing this crash. I'm
>>>>>>>> hoping that someone can
>>>>>>>> tell me if there were any
>>>>>>>> changed requirements for
>>>>>>>> working with SIMD in LLVM
>>>>>>>> 3.2 (or earlier, we haven't
>>>>>>>> tried 3.0 or 3.1). I
>>>>>>>> currently suspect the use
>>>>>>>> of GlobalVariable (we first
>>>>>>>> discovered the crash when
>>>>>>>> using a feature that uses
>>>>>>>> them), however I have
>>>>>>>> attempted using
>>>>>>>> setAlignment on the
>>>>>>>> GlobalVariables without any
>>>>>>>> change.
>>>>>>>>
>>>>>>>> --
>>>>>>>> Peter N
>>>>>>>> _______________________________________________
>>>>>>>> LLVM Developers mailing list
>>>>>>>> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
>>>>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> ~Craig