[LLVMdev] fptoui calling a function that modifies ECX

Peter Newman peter at uformia.com
Thu Jul 18 23:59:28 PDT 2013


Oh, excellent point, I agree. My bad. Now that I'm no longer assuming
those calls are the sqrt, I can see the sqrtpd's in the output. There
are also three fptoui's in the IR and three call instances, which matches.
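
For reference, a minimal sketch of what the helper at 76719BA1 (listing
quoted further down) appears to do on its straight-line path, which is
consistent with it being a float-to-integer conversion rather than sqrt.
This is only a reading of the disassembly, not the runtime's actual source:
emulate_helper and HelperResult are hypothetical names, the default
round-to-nearest mode is assumed, and the two branches whose targets are not
shown (je 76719DCF, js 7671F9DB) are not modelled.

```cpp
#include <cmath>
#include <cstdint>
#include <cstring>

// Hypothetical container for the register state on return.
struct HelperResult {
    std::uint64_t value;  // EDX:EAX, the truncated integer
    std::uint32_t ecx;    // whatever is left behind in ECX
};

// Emulates only the fall-through path of the quoted listing.
// Returns false for inputs that would take a branch not shown there.
bool emulate_helper(double x, HelperResult& out) {
    // js path (negative input), NaN, and out-of-range values: not modelled.
    if (!(x >= 0.0) || x >= 9.0e18) return false;

    // fistp/fild: round to integer using the current rounding mode.
    long long rounded = std::llrint(x);

    // test eax,eax / je: low dword of the rounded value is zero, not modelled.
    if (static_cast<std::uint32_t>(rounded) == 0) return false;

    // fsubp + fstp dword: x - rounded, stored as a 32-bit float.
    float diff = static_cast<float>(x - static_cast<double>(rounded));
    std::uint32_t bits;
    std::memcpy(&bits, &diff, sizeof bits);

    // mov ecx,[esp] / add ecx,7FFFFFFFh: carry is set only when the
    // difference is negative and non-zero, i.e. the value was rounded up.
    std::uint32_t ecx = bits + 0x7FFFFFFFu;
    std::uint64_t borrow = (ecx < bits) ? 1u : 0u;

    // sbb eax,0 / sbb edx,0: undo the round-up to get truncation.
    out.value = static_cast<std::uint64_t>(rounded) - borrow;
    out.ecx = ecx;
    return true;
}
```

Under this reading, an input that is already a whole number (for example
3.0) gives a zero difference, so ECX is left holding exactly 0x7fffffff.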

(Changing subject line again.)

Now it looks like it's bug #13862
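
To spell out why a clobbered ECX then faults in the SSE code: movapd with a
memory operand requires a 16-byte-aligned address, so once ECX holds
0x7fffffff almost any ecx+offset access is misaligned. A small sketch of
that arithmetic follows. The helper names and the example displacements are
made up for illustration; only the two base values come from this thread.

```cpp
#include <cstdint>

// True when a 32-bit address is a multiple of 16, which is what movapd
// requires of its memory operand.
constexpr bool is_aligned16(std::uint32_t addr) {
    return (addr & 0xFu) == 0;
}

// base + displacement with 32-bit wraparound, as in [ecx+disp].
constexpr std::uint32_t effective_address(std::uint32_t base,
                                          std::int32_t disp) {
    return base + static_cast<std::uint32_t>(disp);
}

// A sane, 16-byte-aligned base (the func_params address from the dump)
// stays aligned for displacements that are multiples of 16.
static_assert(is_aligned16(effective_address(0x0018F3B0u, 0x20)),
              "aligned base + aligned displacement");

// With ECX clobbered to 0x7fffffff the same displacement is misaligned.
static_assert(!is_aligned16(effective_address(0x7FFFFFFFu, 0x20)),
              "clobbered base is misaligned");
```

With a base of 0x7fffffff, only displacements congruent to 1 modulo 16 would
happen to produce an aligned address.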

On 19/07/2013 4:51 PM, Craig Topper wrote:
> I think those calls correspond to this
>
>   %110 = fptoui double %109 to i32
>
> The calls are followed by an imul with 12 which matches up with what 
> occurs right after the fptoui in the IR.
>
>
> On Thu, Jul 18, 2013 at 11:48 PM, Peter Newman <peter at uformia.com> wrote:
>
>     Yes, that is the result of module-dump.ll
>
>
>     On 19/07/2013 4:46 PM, Craig Topper wrote:
>>     Does this correspond to one of the .ll files you sent earlier?
>>
>>
>>     On Thu, Jul 18, 2013 at 11:34 PM, Peter Newman
>>     <peter at uformia.com> wrote:
>>
>>         (Changing subject line as diagnosis has changed)
>>
>>         I'm attaching the compiled code that I've been getting, both
>>         with CodeGenOpt::Default and CodeGenOpt::None. The crash
>>         isn't occurring with CodeGenOpt::None, but that seems to be
>>         because ECX isn't being used - it still gets set to
>>         0x7fffffff by one of the calls to 76719BA1.
>>
>>         I notice that X86::SQRTPD[m|r] appear in
>>         X86InstrInfo::isHighLatencyDef. I was thinking an
>>         optimization might be removing it, but I don't get the sqrtpd
>>         instruction even with the createJIT optimization level turned off.
>>
>>         I am trying this with the Release 3.3 code - I'll try it with
>>         trunk and see if I get a different result there. Maybe there
>>         was a recent commit for this.
>>
>>         --
>>         Peter N
>>
>>         On 19/07/2013 4:00 PM, Craig Topper wrote:
>>>         Hmm, I'm not able to get those .ll files to compile if I
>>>         disable SSE and I end up with SSE instructions(including
>>>         sqrtpd) if I don't disable it.
>>>
>>>
>>>         On Thu, Jul 18, 2013 at 10:53 PM, Peter Newman
>>>         <peter at uformia.com> wrote:
>>>
>>>             Is there something specifically required to enable SSE?
>>>             If it's not detected as available (based on the target
>>>             triple?) then I don't think we enable it specifically.
>>>
>>>             Also it seems that it should handle converting to/from
>>>             the vector types, although I can see it getting confused
>>>             about needing to do that if it thinks SSE isn't
>>>             available at all.
>>>
>>>
>>>             On 19/07/2013 3:47 PM, Craig Topper wrote:
>>>>             Hmm, maybe SSE isn't being enabled so it's falling back
>>>>             to emulating sqrt?
>>>>
>>>>
>>>>             On Thu, Jul 18, 2013 at 10:45 PM, Peter Newman
>>>>             <peter at uformia.com> wrote:
>>>>
>>>>                 In the disassembly, I'm seeing three cases of
>>>>                 call 76719BA1
>>>>
>>>>                 I am assuming this is the sqrt function as this is
>>>>                 the only function called in the LLVM IR.
>>>>
>>>>                 The code at 76719BA1 is:
>>>>
>>>>                 76719BA1  push   ebp
>>>>                 76719BA2  mov    ebp,esp
>>>>                 76719BA4  sub    esp,20h
>>>>                 76719BA7  and    esp,0FFFFFFF0h
>>>>                 76719BAA  fld    st(0)
>>>>                 76719BAC  fst    dword ptr [esp+18h]
>>>>                 76719BB0  fistp  qword ptr [esp+10h]
>>>>                 76719BB4  fild   qword ptr [esp+10h]
>>>>                 76719BB8  mov    edx,dword ptr [esp+18h]
>>>>                 76719BBC  mov    eax,dword ptr [esp+10h]
>>>>                 76719BC0  test   eax,eax
>>>>                 76719BC2  je     76719DCF
>>>>                 76719BC8  fsubp  st(1),st
>>>>                 76719BCA  test   edx,edx
>>>>                 76719BCC  js     7671F9DB
>>>>                 76719BD2  fstp   dword ptr [esp]
>>>>                 76719BD5  mov    ecx,dword ptr [esp]
>>>>                 76719BD8  add    ecx,7FFFFFFFh
>>>>                 76719BDE  sbb    eax,0
>>>>                 76719BE1  mov    edx,dword ptr [esp+14h]
>>>>                 76719BE5  sbb    edx,0
>>>>                 76719BE8  leave
>>>>                 76719BE9  ret
>>>>
>>>>
>>>>                 As you can see at 76719BD5, it modifies ECX.
>>>>
>>>>                 I don't know that this is the sqrtpd function (for
>>>>                 example, I'm not seeing any SSE instructions here?)
>>>>                 but whatever it is, it's being called from the IR I
>>>>                 attached earlier, and is modifying ECX under some
>>>>                 circumstances.
>>>>
>>>>
>>>>                 On 19/07/2013 3:29 PM, Craig Topper wrote:
>>>>>                 That should map directly to sqrtpd which can't
>>>>>                 modify ecx.
>>>>>
>>>>>
>>>>>                 On Thu, Jul 18, 2013 at 10:27 PM, Peter Newman
>>>>>                 <peter at uformia.com> wrote:
>>>>>
>>>>>                     Sorry, that should have been
>>>>>                     llvm.x86.sse2.sqrt.pd
>>>>>
>>>>>
>>>>>                     On 19/07/2013 3:25 PM, Craig Topper wrote:
>>>>>>                     What is "frep.x86.sse2.sqrt.pd"? I'm only
>>>>>>                     familiar with things prefixed with "llvm.x86".
>>>>>>
>>>>>>
>>>>>>                     On Thu, Jul 18, 2013 at 10:12 PM, Peter
>>>>>>                     Newman <peter at uformia.com> wrote:
>>>>>>
>>>>>>                         After stepping through the produced
>>>>>>                         assembly, I believe I have a culprit.
>>>>>>
>>>>>>                         One of the calls to
>>>>>>                         @frep.x86.sse2.sqrt.pd is modifying the
>>>>>>                         value of ECX - while the produced code is
>>>>>>                         expecting it to still contain its
>>>>>>                         previous value.
>>>>>>
>>>>>>                         Peter N
>>>>>>
>>>>>>
>>>>>>                         On 19/07/2013 2:09 PM, Peter Newman wrote:
>>>>>>>                         I've attached the module->dump() that
>>>>>>>                         our code is producing. Unfortunately
>>>>>>>                         this is the smallest test case I have
>>>>>>>                         available.
>>>>>>>
>>>>>>>                         This is before any optimization passes
>>>>>>>                         are applied. There are two separate
>>>>>>>                         modules in existence at the time, and
>>>>>>>                         there are no guarantees about the order
>>>>>>>                         the surrounding code calls those
>>>>>>>                         functions, so there may be some
>>>>>>>                         interaction between them? There
>>>>>>>                         shouldn't be, they don't refer to any
>>>>>>>                         common memory etc. There is no
>>>>>>>                         multi-threading occurring.
>>>>>>>
>>>>>>>                         The function in module-dump.ll (called
>>>>>>>                         crashfunc in this file) is called with
>>>>>>>                         - func_params 0x0018f3b0 double [3]
>>>>>>>                         [0x0] -11.339976634695301 double
>>>>>>>                         [0x1] -9.7504239056205506 double
>>>>>>>                         [0x2] -5.2900856817382804 double
>>>>>>>                         at the time of the exception.
>>>>>>>
>>>>>>>                         This is compiled on a "i686-pc-win32"
>>>>>>>                         triple. All of the non-intrinsic
>>>>>>>                         functions referred to in these modules
>>>>>>>                         are the standard equivalents from the
>>>>>>>                         MSVC library (e.g. @asin is the standard
>>>>>>>                         C lib double asin( double ) ).
>>>>>>>
>>>>>>>                         Hopefully this is reproducible for you.
>>>>>>>
>>>>>>>                         --
>>>>>>>                         PeterN
>>>>>>>
>>>>>>>                         On 18/07/2013 4:37 PM, Craig Topper wrote:
>>>>>>>>                         Are you able to send any IR for others
>>>>>>>>                         to reproduce this issue?
>>>>>>>>
>>>>>>>>
>>>>>>>>                         On Wed, Jul 17, 2013 at 11:23 PM, Peter
>>>>>>>>                         Newman <peter at uformia.com> wrote:
>>>>>>>>
>>>>>>>>                             Unfortunately, this doesn't appear
>>>>>>>>                             to be the bug I'm hitting. I
>>>>>>>>                             applied the fix to my source and it
>>>>>>>>                             didn't make a difference.
>>>>>>>>
>>>>>>>>                             Also, further testing showed the
>>>>>>>>                             same behavior with other SIMD
>>>>>>>>                             instructions. The common factor is
>>>>>>>>                             that in each case ECX is set to
>>>>>>>>                             0x7fffffff, and it's an operation
>>>>>>>>                             using xmm ptr ecx+offset.
>>>>>>>>
>>>>>>>>                             Additionally, turning the
>>>>>>>>                             optimization level passed to
>>>>>>>>                             createJIT down appears to avoid it,
>>>>>>>>                             so I'm now leaning towards a bug in
>>>>>>>>                             one of the optimization passes.
>>>>>>>>
>>>>>>>>                             I'm going to dig through the passes
>>>>>>>>                             controlled by that parameter and
>>>>>>>>                             see if I can narrow down which
>>>>>>>>                             optimization is causing it.
>>>>>>>>
>>>>>>>>                             Peter N
>>>>>>>>
>>>>>>>>
>>>>>>>>                             On 17/07/2013 1:58 PM, Solomon
>>>>>>>>                             Boulos wrote:
>>>>>>>>
>>>>>>>>                                 As someone off list just told
>>>>>>>>                                 me, perhaps my new bug is the
>>>>>>>>                                 same issue:
>>>>>>>>
>>>>>>>>                                 http://llvm.org/bugs/show_bug.cgi?id=16640
>>>>>>>>
>>>>>>>>                                 Do you happen to be using FastISel?
>>>>>>>>
>>>>>>>>                                 Solomon
>>>>>>>>
>>>>>>>>                                 On Jul 16, 2013, at 6:39 PM,
>>>>>>>>                                 Peter Newman <peter at uformia.com> wrote:
>>>>>>>>
>>>>>>>>                                     Hello all,
>>>>>>>>
>>>>>>>>                                     I'm currently in the
>>>>>>>>                                     process of debugging a
>>>>>>>>                                     crash occurring in our
>>>>>>>>                                     program. In LLVM 3.2 and
>>>>>>>>                                     3.3 it appears that JIT
>>>>>>>>                                     generated code is
>>>>>>>>                                     attempting to access
>>>>>>>>                                     unaligned memory with an
>>>>>>>>                                     SSE2 instruction.
>>>>>>>>                                     However, this only happens
>>>>>>>>                                     under certain conditions
>>>>>>>>                                     that seem (but may not be)
>>>>>>>>                                     related to the stack's state
>>>>>>>>                                     on calling the function.
>>>>>>>>
>>>>>>>>                                     Our program acts as a
>>>>>>>>                                     front-end, using the LLVM
>>>>>>>>                                     C++ API to generate a JIT
>>>>>>>>                                     generated function. This
>>>>>>>>                                     function is primarily
>>>>>>>>                                     mathematical, so we use the
>>>>>>>>                                     Vector types to take
>>>>>>>>                                     advantage of SIMD
>>>>>>>>                                     instructions (as well as a
>>>>>>>>                                     few SSE2 intrinsics).
>>>>>>>>
>>>>>>>>                                     This worked in LLVM 2.8 but
>>>>>>>>                                     started failing in 3.2 and
>>>>>>>>                                     has continued to fail in
>>>>>>>>                                     3.3. It fails with no
>>>>>>>>                                     optimizations applied to
>>>>>>>>                                     the LLVM Function/Module.
>>>>>>>>                                     It crashes with what is
>>>>>>>>                                     reported as a memory access
>>>>>>>>                                     error (accessing
>>>>>>>>                                     0xffffffff), however it's
>>>>>>>>                                     suggested that this is how
>>>>>>>>                                     the SSE fault raising
>>>>>>>>                                     mechanism appears.
>>>>>>>>
>>>>>>>>                                     The generated instruction
>>>>>>>>                                     varies, but it seems to
>>>>>>>>                                     often be similar to (I
>>>>>>>>                                     don't have it in front of
>>>>>>>>                                     me, sorry):
>>>>>>>>                                     movapd xmm0, xmm[ecx+0x???????]
>>>>>>>>                                     Where the xmm register
>>>>>>>>                                     changes, and the second
>>>>>>>>                                     parameter is a memory access.
>>>>>>>>                                     ECX is always set to
>>>>>>>>                                     0x7fffffff - however I don't
>>>>>>>>                                     know if this is part of the
>>>>>>>>                                     SSE error reporting process
>>>>>>>>                                     or is part of the situation
>>>>>>>>                                     causing the error.
>>>>>>>>
>>>>>>>>                                     I haven't worked out
>>>>>>>>                                     exactly what code path etc
>>>>>>>>                                     is causing this crash. I'm
>>>>>>>>                                     hoping that someone can
>>>>>>>>                                     tell me if there were any
>>>>>>>>                                     changed requirements for
>>>>>>>>                                     working with SIMD in LLVM
>>>>>>>>                                     3.2 (or earlier, we haven't
>>>>>>>>                                     tried 3.0 or 3.1). I
>>>>>>>>                                     currently suspect the use
>>>>>>>>                                     of GlobalVariable (we first
>>>>>>>>                                     discovered the crash when
>>>>>>>>                                     using a feature that uses
>>>>>>>>                                     them), however I have
>>>>>>>>                                     attempted using
>>>>>>>>                                     setAlignment on the
>>>>>>>>                                     GlobalVariables without any
>>>>>>>>                                     change.
>>>>>>>>
>>>>>>>>                                     --
>>>>>>>>                                     Peter N
>>>>>>>>                                     _______________________________________________
>>>>>>>>                                     LLVM Developers mailing list
>>>>>>>>                                     LLVMdev at cs.uiuc.edu
>>>>>>>>                                     http://llvm.cs.uiuc.edu
>>>>>>>>                                     http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>                         -- 
>>>>>>>>                         ~Craig
