[LLVMdev] fptoui calling a function that modifies ECX

Peter Newman peter at uformia.com
Fri Jul 19 00:24:04 PDT 2013


Thank you, I'm trying this now.

On 19/07/2013 5:23 PM, Craig Topper wrote:
> Try adding ECX to the Defs of this part of 
> lib/Target/X86/X86InstrCompiler.td like I've done below. I don't have 
> a Windows machine to test myself.
>
> let Defs = [EAX, EDX, ECX, EFLAGS], FPForm = SpecialFP in {
>   def WIN_FTOL_32 : I<0, Pseudo, (outs), (ins RFP32:$src),
>                       "# win32 fptoui",
>                       [(X86WinFTOL RFP32:$src)]>,
>                     Requires<[In32BitMode]>;
>
>   def WIN_FTOL_64 : I<0, Pseudo, (outs), (ins RFP64:$src),
>                       "# win32 fptoui",
>                       [(X86WinFTOL RFP64:$src)]>,
>                     Requires<[In32BitMode]>;
> }
>
>
> On Thu, Jul 18, 2013 at 11:59 PM, Peter Newman <peter at uformia.com 
> <mailto:peter at uformia.com>> wrote:
>
>     Oh, excellent point, I agree. My bad. Now that I'm not assuming
>     those are the sqrt, I see the sqrtpd's in the output. Also there
>     are three fptoui's and there are 3 call instances.
>
>     (Changing subject line again.)
>
>     Now it looks like it's bug #13862
>
>     On 19/07/2013 4:51 PM, Craig Topper wrote:
>>     I think those calls correspond to this
>>
>>       %110 = fptoui double %109 to i32
>>
>>     The calls are followed by an imul with 12 which matches up with
>>     what occurs right after the fptoui in the IR.
>>
>>
>>     On Thu, Jul 18, 2013 at 11:48 PM, Peter Newman <peter at uformia.com
>>     <mailto:peter at uformia.com>> wrote:
>>
>>         Yes, that is the result of module-dump.ll
>>
>>
>>         On 19/07/2013 4:46 PM, Craig Topper wrote:
>>>         Does this correspond to one of the .ll files you sent earlier?
>>>
>>>
>>>         On Thu, Jul 18, 2013 at 11:34 PM, Peter Newman
>>>         <peter at uformia.com <mailto:peter at uformia.com>> wrote:
>>>
>>>             (Changing subject line as diagnosis has changed)
>>>
>>>             I'm attaching the compiled code that I've been getting,
>>>             both with CodeGenOpt::Default and CodeGenOpt::None . The
>>>             crash isn't occurring with CodeGenOpt::None, but that
>>>             seems to be because ECX isn't being used - it still gets
>>>             set to 0x7fffffff by one of the calls to 76719BA1
>>>
>>>             I notice that X86::SQRTPD[m|r] appear in
>>>             X86InstrInfo::isHighLatencyDef. I was thinking an
>>>             optimization might be removing it, but I don't get the
>>>             sqrtpd instruction even if the createJIT optimization
>>>             level is turned off.
>>>
>>>             I am trying this with the Release 3.3 code - I'll try it
>>>             with trunk and see if I get a different result there.
>>>             Maybe there was a recent commit for this.
>>>
>>>             --
>>>             Peter N
>>>
>>>             On 19/07/2013 4:00 PM, Craig Topper wrote:
>>>>             Hmm, I'm not able to get those .ll files to compile if
>>>>             I disable SSE and I end up with SSE
>>>>             instructions(including sqrtpd) if I don't disable it.
>>>>
>>>>
>>>>             On Thu, Jul 18, 2013 at 10:53 PM, Peter Newman
>>>>             <peter at uformia.com <mailto:peter at uformia.com>> wrote:
>>>>
>>>>                 Is there something specifically required to enable
>>>>                 SSE? If it's not detected as available (based on
>>>>                 the target triple?) then I don't think we enable it
>>>>                 specifically.
>>>>
>>>>                 Also it seems that it should handle converting
>>>>                 to/from the vector types, although I can see it
>>>>                 getting confused about needing to do that if it
>>>>                 thinks SSE isn't available at all.
>>>>
>>>>
>>>>                 On 19/07/2013 3:47 PM, Craig Topper wrote:
>>>>>                 Hmm, maybe SSE isn't being enabled so it's falling
>>>>>                 back to emulating sqrt?
>>>>>
>>>>>
>>>>>                 On Thu, Jul 18, 2013 at 10:45 PM, Peter Newman
>>>>>                 <peter at uformia.com <mailto:peter at uformia.com>> wrote:
>>>>>
>>>>>                     In the disassembly, I'm seeing three cases of
>>>>>                     call 76719BA1
>>>>>
>>>>>                     I am assuming this is the sqrt function as
>>>>>                     this is the only function called in the LLVM IR.
>>>>>
>>>>>                     The code at 76719BA1 is:
>>>>>
>>>>>                     76719BA1 push ebp
>>>>>                     76719BA2 mov ebp,esp
>>>>>                     76719BA4 sub esp,20h
>>>>>                     76719BA7 and esp,0FFFFFFF0h
>>>>>                     76719BAA fld st(0)
>>>>>                     76719BAC fst dword ptr [esp+18h]
>>>>>                     76719BB0 fistp qword ptr [esp+10h]
>>>>>                     76719BB4 fild qword ptr [esp+10h]
>>>>>                     76719BB8 mov edx,dword ptr [esp+18h]
>>>>>                     76719BBC mov eax,dword ptr [esp+10h]
>>>>>                     76719BC0 test eax,eax
>>>>>                     76719BC2 je 76719DCF
>>>>>                     76719BC8 fsubp st(1),st
>>>>>                     76719BCA test edx,edx
>>>>>                     76719BCC js 7671F9DB
>>>>>                     76719BD2 fstp dword ptr [esp]
>>>>>                     76719BD5 mov ecx,dword ptr [esp]
>>>>>                     76719BD8 add ecx,7FFFFFFFh
>>>>>                     76719BDE sbb eax,0
>>>>>                     76719BE1 mov edx,dword ptr [esp+14h]
>>>>>                     76719BE5 sbb edx,0
>>>>>                     76719BE8 leave
>>>>>                     76719BE9 ret
>>>>>
>>>>>
>>>>>                     As you can see at 76719BD5, it modifies ECX .
>>>>>
>>>>>                     I don't know that this is the sqrtpd function
>>>>>                     (for example, I'm not seeing any SSE
>>>>>                     instructions here?) but whatever it is, it's
>>>>>                     being called from the IR I attached earlier,
>>>>>                     and is modifying ECX under some circumstances.
>>>>>
>>>>>
>>>>>                     On 19/07/2013 3:29 PM, Craig Topper wrote:
>>>>>>                     That should map directly to sqrtpd which
>>>>>>                     can't modify ecx.
>>>>>>
>>>>>>
>>>>>>                     On Thu, Jul 18, 2013 at 10:27 PM, Peter
>>>>>>                     Newman <peter at uformia.com
>>>>>>                     <mailto:peter at uformia.com>> wrote:
>>>>>>
>>>>>>                         Sorry, that should have been
>>>>>>                         llvm.x86.sse2.sqrt.pd
>>>>>>
>>>>>>
>>>>>>                         On 19/07/2013 3:25 PM, Craig Topper wrote:
>>>>>>>                         What is "frep.x86.sse2.sqrt.pd"? I'm
>>>>>>>                         only familiar with things prefixed with
>>>>>>>                         "llvm.x86".
>>>>>>>
>>>>>>>
>>>>>>>                         On Thu, Jul 18, 2013 at 10:12 PM, Peter
>>>>>>>                         Newman <peter at uformia.com
>>>>>>>                         <mailto:peter at uformia.com>> wrote:
>>>>>>>
>>>>>>>                             After stepping through the produced
>>>>>>>                             assembly, I believe I have a culprit.
>>>>>>>
>>>>>>>                             One of the calls to
>>>>>>>                             @frep.x86.sse2.sqrt.pd is modifying
>>>>>>>                             the value of ECX - while the
>>>>>>>                             produced code is expecting it to
>>>>>>>                             still contain its previous value.
>>>>>>>
>>>>>>>                             Peter N
>>>>>>>
>>>>>>>
>>>>>>>                             On 19/07/2013 2:09 PM, Peter Newman
>>>>>>>                             wrote:
>>>>>>>>                             I've attached the module->dump()
>>>>>>>>                             that our code is producing.
>>>>>>>>                             Unfortunately this is the smallest
>>>>>>>>                             test case I have available.
>>>>>>>>
>>>>>>>>                             This is before any optimization
>>>>>>>>                             passes are applied. There are two
>>>>>>>>                             separate modules in existence at
>>>>>>>>                             the time, and there are no
>>>>>>>>                             guarantees about the order the
>>>>>>>>                             surrounding code calls those
>>>>>>>>                             functions, so there may be some
>>>>>>>>                             interaction between them? There
>>>>>>>>                             shouldn't be, they don't refer to
>>>>>>>>                             any common memory etc. There is no
>>>>>>>>                             multi-threading occurring.
>>>>>>>>
>>>>>>>>                             The function in module-dump.ll
>>>>>>>>                             (called crashfunc in this file) is
>>>>>>>>                             called with
>>>>>>>>                             - func_params 0x0018f3b0 double [3]
>>>>>>>>                             [0x0] -11.339976634695301 double
>>>>>>>>                             [0x1] -9.7504239056205506 double
>>>>>>>>                             [0x2] -5.2900856817382804 double
>>>>>>>>                             at the time of the exception.
>>>>>>>>
>>>>>>>>                             This is compiled on a
>>>>>>>>                             "i686-pc-win32" triple. All of the
>>>>>>>>                             non-intrinsic functions referred to
>>>>>>>>                             in these modules are the standard
>>>>>>>>                             equivalents from the MSVC library
>>>>>>>>                             (e.g. @asin is the standard C lib
>>>>>>>>                                double asin( double ) ).
>>>>>>>>
>>>>>>>>                             Hopefully this is reproducible for you.
>>>>>>>>
>>>>>>>>                             --
>>>>>>>>                             PeterN
>>>>>>>>
>>>>>>>>                             On 18/07/2013 4:37 PM, Craig Topper
>>>>>>>>                             wrote:
>>>>>>>>>                             Are you able to send any IR for
>>>>>>>>>                             others to reproduce this issue?
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>                             On Wed, Jul 17, 2013 at 11:23 PM,
>>>>>>>>>                             Peter Newman <peter at uformia.com
>>>>>>>>>                             <mailto:peter at uformia.com>> wrote:
>>>>>>>>>
>>>>>>>>>                                 Unfortunately, this doesn't
>>>>>>>>>                                 appear to be the bug I'm
>>>>>>>>>                                 hitting. I applied the fix to
>>>>>>>>>                                 my source and it didn't make a
>>>>>>>>>                                 difference.
>>>>>>>>>
>>>>>>>>>                                 Also further testing found me
>>>>>>>>>                                 getting the same behavior with
>>>>>>>>>                                 other SIMD instructions. The
>>>>>>>>>                                 common factor is in each case,
>>>>>>>>>                                 ECX is set to 0x7fffffff, and
>>>>>>>>>                                 it's an operation using xmm
>>>>>>>>>                                 ptr ecx+offset .
>>>>>>>>>
>>>>>>>>>                                 Additionally, turning the
>>>>>>>>>                                 optimization level passed to
>>>>>>>>>                                 createJIT down appears to
>>>>>>>>>                                 avoid it, so I'm now leaning
>>>>>>>>>                                 towards a bug in one of the
>>>>>>>>>                                 optimization passes.
>>>>>>>>>
>>>>>>>>>                                 I'm going to dig through the
>>>>>>>>>                                 passes controlled by that
>>>>>>>>>                                 parameter and see if I can
>>>>>>>>>                                 narrow down which optimization
>>>>>>>>>                                 is causing it.
>>>>>>>>>
>>>>>>>>>                                 Peter N
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>                                 On 17/07/2013 1:58 PM, Solomon
>>>>>>>>>                                 Boulos wrote:
>>>>>>>>>
>>>>>>>>>                                     As someone off list just
>>>>>>>>>                                     told me, perhaps my new
>>>>>>>>>                                     bug is the same issue:
>>>>>>>>>
>>>>>>>>>                                     http://llvm.org/bugs/show_bug.cgi?id=16640
>>>>>>>>>
>>>>>>>>>                                     Do you happen to be using
>>>>>>>>>                                     FastISel?
>>>>>>>>>
>>>>>>>>>                                     Solomon
>>>>>>>>>
>>>>>>>>>                                     On Jul 16, 2013, at 6:39
>>>>>>>>>                                     PM, Peter Newman
>>>>>>>>>                                     <peter at uformia.com
>>>>>>>>>                                     <mailto:peter at uformia.com>> wrote:
>>>>>>>>>
>>>>>>>>>                                         Hello all,
>>>>>>>>>
>>>>>>>>>                                         I'm currently in the
>>>>>>>>>                                         process of debugging a
>>>>>>>>>                                         crash occurring in our
>>>>>>>>>                                         program. In LLVM 3.2
>>>>>>>>>                                         and 3.3 it appears
>>>>>>>>>                                         that JIT-generated
>>>>>>>>>                                         code is attempting to
>>>>>>>>>                                         access unaligned
>>>>>>>>>                                         memory with an SSE2
>>>>>>>>>                                         instruction.
>>>>>>>>>                                         However this only
>>>>>>>>>                                         happens under certain
>>>>>>>>>                                         conditions that seem
>>>>>>>>>                                         (but may not be)
>>>>>>>>>                                         related to the stack's
>>>>>>>>>                                         state on calling the
>>>>>>>>>                                         function.
>>>>>>>>>
>>>>>>>>>                                         Our program acts as a
>>>>>>>>>                                         front-end, using the
>>>>>>>>>                                         LLVM C++ API to
>>>>>>>>>                                         generate a JIT
>>>>>>>>>                                         generated function.
>>>>>>>>>                                         This function is
>>>>>>>>>                                         primarily
>>>>>>>>>                                         mathematical, so we
>>>>>>>>>                                         use the Vector types
>>>>>>>>>                                         to take advantage of
>>>>>>>>>                                         SIMD instructions (as
>>>>>>>>>                                         well as a few SSE2
>>>>>>>>>                                         intrinsics).
>>>>>>>>>
>>>>>>>>>                                         This worked in LLVM
>>>>>>>>>                                         2.8 but started
>>>>>>>>>                                         failing in 3.2 and has
>>>>>>>>>                                         continued to fail in
>>>>>>>>>                                         3.3. It fails with no
>>>>>>>>>                                         optimizations applied
>>>>>>>>>                                         to the LLVM
>>>>>>>>>                                         Function/Module. It
>>>>>>>>>                                         crashes with what is
>>>>>>>>>                                         reported as a memory
>>>>>>>>>                                         access error
>>>>>>>>>                                         (accessing
>>>>>>>>>                                         0xffffffff), however
>>>>>>>>>                                         it's suggested that
>>>>>>>>>                                         this is how the SSE
>>>>>>>>>                                         fault raising
>>>>>>>>>                                         mechanism appears.
>>>>>>>>>
>>>>>>>>>                                         The generated
>>>>>>>>>                                         instruction varies,
>>>>>>>>>                                         but it seems to often
>>>>>>>>>                                         be similar to (I don't
>>>>>>>>>                                         have it in front of
>>>>>>>>>                                         me, sorry):
>>>>>>>>>                                         movapd xmm0,
>>>>>>>>>                                         xmm[ecx+0x???????]
>>>>>>>>>                                         Where the xmm register
>>>>>>>>>                                         changes, and the
>>>>>>>>>                                         second parameter is a
>>>>>>>>>                                         memory access.
>>>>>>>>>                                         ECX is always set to
>>>>>>>>>                                         0x7fffffff - however I
>>>>>>>>>                                         don't know if this is
>>>>>>>>>                                         part of the SSE error
>>>>>>>>>                                         reporting process or
>>>>>>>>>                                         is part of the
>>>>>>>>>                                         situation causing the
>>>>>>>>>                                         error.
>>>>>>>>>
>>>>>>>>>                                         I haven't worked out
>>>>>>>>>                                         exactly what code path
>>>>>>>>>                                         etc is causing this
>>>>>>>>>                                         crash. I'm hoping that
>>>>>>>>>                                         someone can tell me if
>>>>>>>>>                                         there were any changed
>>>>>>>>>                                         requirements for
>>>>>>>>>                                         working with SIMD in
>>>>>>>>>                                         LLVM 3.2 (or earlier,
>>>>>>>>>                                         we haven't tried 3.0
>>>>>>>>>                                         or 3.1). I currently
>>>>>>>>>                                         suspect the use of
>>>>>>>>>                                         GlobalVariable (we
>>>>>>>>>                                         first discovered the
>>>>>>>>>                                         crash when using a
>>>>>>>>>                                         feature that uses
>>>>>>>>>                                         them), however I have
>>>>>>>>>                                         attempted using
>>>>>>>>>                                         setAlignment on the
>>>>>>>>>                                         GlobalVariables
>>>>>>>>>                                         without any change.
>>>>>>>>>
>>>>>>>>>                                         --
>>>>>>>>>                                         Peter N
>>>>>>>>>                                         _______________________________________________
>>>>>>>>>                                         LLVM Developers
>>>>>>>>>                                         mailing list
>>>>>>>>>                                         LLVMdev at cs.uiuc.edu
>>>>>>>>>                                         <mailto:LLVMdev at cs.uiuc.edu>
>>>>>>>>>                                         http://llvm.cs.uiuc.edu
>>>>>>>>>                                         http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>                             -- 
>>>>>>>>>                             ~Craig
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>                         -- 
>>>>>>>                         ~Craig
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>                     -- 
>>>>>>                     ~Craig
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>                 -- 
>>>>>                 ~Craig
>>>>
>>>>
>>>>
>>>>
>>>>             -- 
>>>>             ~Craig
>>>
>>>
>>>
>>>
>>>         -- 
>>>         ~Craig
>>
>>
>>
>>
>>     -- 
>>     ~Craig
>
>
>
>
> -- 
> ~Craig
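
Editorial sketch, not part of the original messages: the disassembly quoted
above suggests why ECX ends up holding 0x7fffffff. On the path shown, the
routine at 76719BA1 rounds the input, stores the difference between the input
and the rounded value as a 32-bit float, loads those bits into ECX, and adds
7FFFFFFFh. The C++ below models only that path. The function name is
hypothetical, and the mapping from instructions to statements is one reading
of the listing, assuming the default round-to-nearest mode.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

// Hypothetical helper (not part of LLVM or the C runtime): models the value
// left in ECX by the positive-input path of the routine at 76719BA1.
std::uint32_t ecx_after_helper(double x) {
    // Stay on the path shown in the listing: positive input whose rounded
    // value has a nonzero low dword.
    assert(x >= 1.0 && x < 2147483648.0);

    // fistp / fild: round to an integer in the current rounding mode.
    double rounded = std::nearbyint(x);

    // fsubp st(1),st ; fstp dword ptr [esp]: difference stored as a float.
    float diff = static_cast<float>(x - rounded);

    // mov ecx,dword ptr [esp]: reinterpret the float's bits as an integer.
    std::uint32_t bits;
    std::memcpy(&bits, &diff, sizeof bits);

    // add ecx,7FFFFFFFh: unsigned addition wraps modulo 2^32.
    return bits + 0x7FFFFFFFu;
}
```

If this reading is right, any input that is already a whole number (for
example 5.0) gives a difference of +0.0 and so leaves 0x7fffffff in ECX,
which matches the value reported throughout the thread.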

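Editorial sketch, not part of the original messages: the thread ties the three
calls to "fptoui double ... to i32", and the helper listing returns its result
in EDX:EAX. One way to picture that lowering, as an assumption rather than a
description of LLVM's actual code, is a signed 64-bit truncating conversion
followed by keeping the low 32 bits. The function name is hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical name; a model of the arithmetic only, not LLVM's lowering code.
std::uint32_t fptoui_via_i64(double x) {
    // Outside this range the result of fptoui to i32 is not defined in LLVM IR.
    assert(x >= 0.0 && x < 4294967296.0);

    // Truncate toward zero into 64 bits, like the helper's EDX:EAX result.
    std::int64_t wide = static_cast<std::int64_t>(x);

    // Keep only the low 32 bits (the EAX half).
    return static_cast<std::uint32_t>(wide);
}
```

Going through 64 bits lets values above INT32_MAX (for example 4000000000.0)
convert correctly, which a plain signed 32-bit conversion could not do.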