[LLVMdev] fptoui calling a function that modifies ECX

Peter Newman peter at uformia.com
Fri Jul 19 02:34:30 PDT 2013


That does appear to have worked. All my tests are passing now.

I'll hand this out to our other devs & testers and make sure it's 
working for them as well (not just on my machine).

Thank you, again.

--
Peter N
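
For anyone reading this thread later, here is a toy sketch of why the missing ECX entry in Defs mattered. It is not LLVM code: pick_register, run_helper, value_survives and the ALLOCATABLE list are hypothetical names made up for illustration, and the "old" Defs list is an assumption inferred from the patch quoted below (the same list without ECX). The idea is only that a register allocator trusts the declared Defs when deciding where a value can stay live across the pseudo-instruction.

```python
# Toy model only: none of these names exist in LLVM.
ALLOCATABLE = ["EAX", "ECX", "EDX", "ESI", "EDI"]

# Assumed "before" list (the quoted patch minus ECX) and the patched list.
OLD_DEFS = {"EAX", "EDX", "EFLAGS"}
NEW_DEFS = {"EAX", "EDX", "ECX", "EFLAGS"}


def pick_register(defs):
    """Return the first allocatable register the pseudo-instruction does
    not declare as clobbered, i.e. a place the allocator believes is safe
    for a value that must stay live across it."""
    for reg in ALLOCATABLE:
        if reg not in defs:
            return reg
    return None  # nothing free: the value would have to be spilled


def run_helper(regs):
    """Mimic the side effects seen in the disassembly: the helper
    overwrites EAX and EDX (its result) and also ECX."""
    out = dict(regs)
    out["EAX"] = 0           # placeholder result, low half
    out["EDX"] = 0           # placeholder result, high half
    out["ECX"] = 0x7FFFFFFF  # the value observed at the crash
    return out


def value_survives(defs, value=0x0018F3B0):
    """Keep `value` in the register chosen from `defs`, run the helper,
    and report whether the value is still intact afterwards."""
    reg = pick_register(defs)
    regs = {r: 0 for r in ALLOCATABLE}
    regs[reg] = value
    return run_helper(regs)[reg] == value


print(pick_register(OLD_DEFS), value_survives(OLD_DEFS))
print(pick_register(NEW_DEFS), value_survives(NEW_DEFS))
```

With the assumed old list the toy allocator keeps the live value in ECX and loses it across the call; with ECX listed it picks a different register and the value survives, which matches the behaviour I saw before and after the change.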

On 19/07/2013 5:45 PM, Craig Topper wrote:
> I don't think that's going to work.
>
>
> On Fri, Jul 19, 2013 at 12:24 AM, Peter Newman <peter at uformia.com 
> <mailto:peter at uformia.com>> wrote:
>
>     Thank you, I'm trying this now.
>
>
>     On 19/07/2013 5:23 PM, Craig Topper wrote:
>>     Try adding ECX to the Defs of this part of
>>     lib/Target/X86/X86InstrCompiler.td like I've done below. I don't
>>     have a Windows machine to test myself.
>>
>>     let Defs = [EAX, EDX, ECX, EFLAGS], FPForm = SpecialFP in {
>>       def WIN_FTOL_32 : I<0, Pseudo, (outs), (ins RFP32:$src),
>>                           "# win32 fptoui",
>>                           [(X86WinFTOL RFP32:$src)]>,
>>     Requires<[In32BitMode]>;
>>
>>       def WIN_FTOL_64 : I<0, Pseudo, (outs), (ins RFP64:$src),
>>                           "# win32 fptoui",
>>                           [(X86WinFTOL RFP64:$src)]>,
>>     Requires<[In32BitMode]>;
>>     }
>>
>>
>>     On Thu, Jul 18, 2013 at 11:59 PM, Peter Newman <peter at uformia.com
>>     <mailto:peter at uformia.com>> wrote:
>>
>>         Oh, excellent point, I agree. My bad. Now that I'm not
>>         assuming those are the sqrt, I see the sqrtpd's in the
>>         output. Also there are three fptoui's and there are 3 call
>>         instances.
>>
>>         (Changing subject line again.)
>>
>>         Now it looks like it's bug #13862
>>
>>         On 19/07/2013 4:51 PM, Craig Topper wrote:
>>>         I think those calls correspond to this
>>>
>>>           %110 = fptoui double %109 to i32
>>>
>>>         The calls are followed by an imul with 12 which matches up
>>>         with what occurs right after the fptoui in the IR.
>>>
>>>
>>>         On Thu, Jul 18, 2013 at 11:48 PM, Peter Newman
>>>         <peter at uformia.com <mailto:peter at uformia.com>> wrote:
>>>
>>>             Yes, that is the result of module-dump.ll
>>>
>>>
>>>             On 19/07/2013 4:46 PM, Craig Topper wrote:
>>>>             Does this correspond to one of the .ll files you sent
>>>>             earlier?
>>>>
>>>>
>>>>             On Thu, Jul 18, 2013 at 11:34 PM, Peter Newman
>>>>             <peter at uformia.com <mailto:peter at uformia.com>> wrote:
>>>>
>>>>                 (Changing subject line as diagnosis has changed)
>>>>
>>>>                 I'm attaching the compiled code that I've been
>>>>                 getting, both with CodeGenOpt::Default and
>>>>                 CodeGenOpt::None . The crash isn't occurring with
>>>>                 CodeGenOpt::None, but that seems to be because ECX
>>>>                 isn't being used - it still gets set to 0x7fffffff
>>>>                 by one of the calls to 76719BA1
>>>>
>>>>                 I notice that X86::SQRTPD[m|r] appear in
>>>>                 X86InstrInfo::isHighLatencyDef. I was thinking an
>>>>                 optimization might be removing it, but I don't get
>>>>                 the sqrtpd instruction even with the createJIT
>>>>                 optimization level turned off.
>>>>
>>>>                 I am trying this with the Release 3.3 code - I'll
>>>>                 try it with trunk and see if I get a different
>>>>                 result there. Maybe there was a recent commit for this.
>>>>
>>>>                 --
>>>>                 Peter N
>>>>
>>>>                 On 19/07/2013 4:00 PM, Craig Topper wrote:
>>>>>                 Hmm, I'm not able to get those .ll files to
>>>>>                 compile if I disable SSE and I end up with SSE
>>>>>                 instructions (including sqrtpd) if I don't disable it.
>>>>>
>>>>>
>>>>>                 On Thu, Jul 18, 2013 at 10:53 PM, Peter Newman
>>>>>                 <peter at uformia.com <mailto:peter at uformia.com>> wrote:
>>>>>
>>>>>                     Is there something specifically required to
>>>>>                     enable SSE? If it's not detected as available
>>>>>                     (based on the target triple?) then I don't
>>>>>                     think we enable it specifically.
>>>>>
>>>>>                     Also it seems that it should handle converting
>>>>>                     to/from the vector types, although I can see
>>>>>                     it getting confused about needing to do that
>>>>>                     if it thinks SSE isn't available at all.
>>>>>
>>>>>
>>>>>                     On 19/07/2013 3:47 PM, Craig Topper wrote:
>>>>>>                     Hmm, maybe SSE isn't being enabled so it's
>>>>>>                     falling back to emulating sqrt?
>>>>>>
>>>>>>
>>>>>>                     On Thu, Jul 18, 2013 at 10:45 PM, Peter
>>>>>>                     Newman <peter at uformia.com
>>>>>>                     <mailto:peter at uformia.com>> wrote:
>>>>>>
>>>>>>                         In the disassembly, I'm seeing three cases of
>>>>>>                         call 76719BA1
>>>>>>
>>>>>>                         I am assuming this is the sqrt function
>>>>>>                         as this is the only function called in
>>>>>>                         the LLVM IR.
>>>>>>
>>>>>>                         The code at 76719BA1 is:
>>>>>>
>>>>>>                         76719BA1 push ebp
>>>>>>                         76719BA2 mov ebp,esp
>>>>>>                         76719BA4 sub esp,20h
>>>>>>                         76719BA7 and esp,0FFFFFFF0h
>>>>>>                         76719BAA fld st(0)
>>>>>>                         76719BAC fst dword ptr [esp+18h]
>>>>>>                         76719BB0 fistp qword ptr [esp+10h]
>>>>>>                         76719BB4 fild qword ptr [esp+10h]
>>>>>>                         76719BB8 mov edx,dword ptr [esp+18h]
>>>>>>                         76719BBC mov eax,dword ptr [esp+10h]
>>>>>>                         76719BC0 test eax,eax
>>>>>>                         76719BC2 je 76719DCF
>>>>>>                         76719BC8 fsubp st(1),st
>>>>>>                         76719BCA test edx,edx
>>>>>>                         76719BCC js 7671F9DB
>>>>>>                         76719BD2 fstp dword ptr [esp]
>>>>>>                         76719BD5 mov ecx,dword ptr [esp]
>>>>>>                         76719BD8 add ecx,7FFFFFFFh
>>>>>>                         76719BDE sbb eax,0
>>>>>>                         76719BE1 mov edx,dword ptr [esp+14h]
>>>>>>                         76719BE5 sbb edx,0
>>>>>>                         76719BE8 leave
>>>>>>                         76719BE9 ret
>>>>>>
>>>>>>
>>>>>>                         As you can see at 76719BD5, it modifies ECX.
>>>>>>
>>>>>>                         I don't know that this is the sqrtpd
>>>>>>                         function (for example, I'm not seeing any
>>>>>>                         SSE instructions here?) but whatever it
>>>>>>                         is, it's being called from the IR I
>>>>>>                         attached earlier, and is modifying ECX
>>>>>>                         under some circumstances.
>>>>>>
>>>>>>
>>>>>>                         On 19/07/2013 3:29 PM, Craig Topper wrote:
>>>>>>>                         That should map directly to sqrtpd which
>>>>>>>                         can't modify ecx.
>>>>>>>
>>>>>>>
>>>>>>>                         On Thu, Jul 18, 2013 at 10:27 PM, Peter
>>>>>>>                         Newman <peter at uformia.com
>>>>>>>                         <mailto:peter at uformia.com>> wrote:
>>>>>>>
>>>>>>>                             Sorry, that should have been
>>>>>>>                             llvm.x86.sse2.sqrt.pd
>>>>>>>
>>>>>>>
>>>>>>>                             On 19/07/2013 3:25 PM, Craig Topper
>>>>>>>                             wrote:
>>>>>>>>                             What is "frep.x86.sse2.sqrt.pd"?
>>>>>>>>                             I'm only familiar with things
>>>>>>>>                             prefixed with "llvm.x86".
>>>>>>>>
>>>>>>>>
>>>>>>>>                             On Thu, Jul 18, 2013 at 10:12 PM,
>>>>>>>>                             Peter Newman <peter at uformia.com
>>>>>>>>                             <mailto:peter at uformia.com>> wrote:
>>>>>>>>
>>>>>>>>                                 After stepping through the
>>>>>>>>                                 produced assembly, I believe I
>>>>>>>>                                 have a culprit.
>>>>>>>>
>>>>>>>>                                 One of the calls to
>>>>>>>>                                 @frep.x86.sse2.sqrt.pd is
>>>>>>>>                                 modifying the value of ECX -
>>>>>>>>                                 while the produced code is
>>>>>>>>                                 expecting it to still contain
>>>>>>>>                                 its previous value.
>>>>>>>>
>>>>>>>>                                 Peter N
>>>>>>>>
>>>>>>>>
>>>>>>>>                                 On 19/07/2013 2:09 PM, Peter
>>>>>>>>                                 Newman wrote:
>>>>>>>>>                                 I've attached the
>>>>>>>>>                                 module->dump() that our code
>>>>>>>>>                                 is producing. Unfortunately
>>>>>>>>>                                 this is the smallest test case
>>>>>>>>>                                 I have available.
>>>>>>>>>
>>>>>>>>>                                 This is before any
>>>>>>>>>                                 optimization passes are
>>>>>>>>>                                 applied. There are two
>>>>>>>>>                                 separate modules in existence
>>>>>>>>>                                 at the time, and there are no
>>>>>>>>>                                 guarantees about the order the
>>>>>>>>>                                 surrounding code calls those
>>>>>>>>>                                 functions, so there may be
>>>>>>>>>                                 some interaction between them?
>>>>>>>>>                                 There shouldn't be, they don't
>>>>>>>>>                                 refer to any common memory
>>>>>>>>>                                 etc. There is no
>>>>>>>>>                                 multi-threading occurring.
>>>>>>>>>
>>>>>>>>>                                 The function in module-dump.ll
>>>>>>>>>                                 (called crashfunc in this
>>>>>>>>>                                 file) is called with
>>>>>>>>>                                 - func_params 0x0018f3b0
>>>>>>>>>                                 double [3]
>>>>>>>>>                                 [0x0] -11.339976634695301 double
>>>>>>>>>                                 [0x1] -9.7504239056205506 double
>>>>>>>>>                                 [0x2] -5.2900856817382804 double
>>>>>>>>>                                 at the time of the exception.
>>>>>>>>>
>>>>>>>>>                                 This is compiled on a
>>>>>>>>>                                 "i686-pc-win32" triple. All of
>>>>>>>>>                                 the non-intrinsic functions
>>>>>>>>>                                 referred to in these modules
>>>>>>>>>                                 are the standard equivalents
>>>>>>>>>                                 from the MSVC library (e.g.
>>>>>>>>>                                 @asin is the standard C lib   
>>>>>>>>>                                 double asin( double ) ).
>>>>>>>>>
>>>>>>>>>                                 Hopefully this is reproducible
>>>>>>>>>                                 for you.
>>>>>>>>>
>>>>>>>>>                                 --
>>>>>>>>>                                 PeterN
>>>>>>>>>
>>>>>>>>>                                 On 18/07/2013 4:37 PM, Craig
>>>>>>>>>                                 Topper wrote:
>>>>>>>>>>                                 Are you able to send any IR
>>>>>>>>>>                                 for others to reproduce this
>>>>>>>>>>                                 issue?
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>                                 On Wed, Jul 17, 2013 at 11:23
>>>>>>>>>>                                 PM, Peter Newman
>>>>>>>>>>                                 <peter at uformia.com
>>>>>>>>>>                                 <mailto:peter at uformia.com>>
>>>>>>>>>>                                 wrote:
>>>>>>>>>>
>>>>>>>>>>                                     Unfortunately, this
>>>>>>>>>>                                     doesn't appear to be the
>>>>>>>>>>                                     bug I'm hitting. I
>>>>>>>>>>                                     applied the fix to my
>>>>>>>>>>                                     source and it didn't make
>>>>>>>>>>                                     a difference.
>>>>>>>>>>
>>>>>>>>>>                                     Also further testing
>>>>>>>>>>                                     found me getting the same
>>>>>>>>>>                                     behavior with other SIMD
>>>>>>>>>>                                     instructions. The common
>>>>>>>>>>                                     factor is that in each case,
>>>>>>>>>>                                     ECX is set to 0x7fffffff,
>>>>>>>>>>                                     and it's an operation
>>>>>>>>>>                                     using xmm ptr ecx+offset .
>>>>>>>>>>
>>>>>>>>>>                                     Additionally, turning the
>>>>>>>>>>                                     optimization level passed
>>>>>>>>>>                                     to createJIT down appears
>>>>>>>>>>                                     to avoid it, so I'm now
>>>>>>>>>>                                     leaning towards a bug in
>>>>>>>>>>                                     one of the optimization
>>>>>>>>>>                                     passes.
>>>>>>>>>>
>>>>>>>>>>                                     I'm going to dig through
>>>>>>>>>>                                     the passes controlled by
>>>>>>>>>>                                     that parameter and see if
>>>>>>>>>>                                     I can narrow down which
>>>>>>>>>>                                     optimization is causing it.
>>>>>>>>>>
>>>>>>>>>>                                     Peter N
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>                                     On 17/07/2013 1:58 PM,
>>>>>>>>>>                                     Solomon Boulos wrote:
>>>>>>>>>>
>>>>>>>>>>                                         As someone off list
>>>>>>>>>>                                         just told me, perhaps
>>>>>>>>>>                                         my new bug is the
>>>>>>>>>>                                         same issue:
>>>>>>>>>>
>>>>>>>>>>                                         http://llvm.org/bugs/show_bug.cgi?id=16640
>>>>>>>>>>
>>>>>>>>>>                                         Do you happen to be
>>>>>>>>>>                                         using FastISel?
>>>>>>>>>>
>>>>>>>>>>                                         Solomon
>>>>>>>>>>
>>>>>>>>>>                                         On Jul 16, 2013, at
>>>>>>>>>>                                         6:39 PM, Peter Newman
>>>>>>>>>>                                         <peter at uformia.com
>>>>>>>>>>                                         <mailto:peter at uformia.com>>
>>>>>>>>>>                                         wrote:
>>>>>>>>>>
>>>>>>>>>>                                             Hello all,
>>>>>>>>>>
>>>>>>>>>>                                             I'm currently in
>>>>>>>>>>                                             the process of
>>>>>>>>>>                                             debugging a crash
>>>>>>>>>>                                             occurring in our
>>>>>>>>>>                                             program. In LLVM
>>>>>>>>>>                                             3.2 and 3.3 it
>>>>>>>>>>                                             appears that JIT
>>>>>>>>>>                                             generated code is
>>>>>>>>>>                                             attempting to access
>>>>>>>>>>                                             unaligned memory
>>>>>>>>>>                                             with an SSE2
>>>>>>>>>>                                             instruction.
>>>>>>>>>>                                             However this only
>>>>>>>>>>                                             happens under
>>>>>>>>>>                                             certain
>>>>>>>>>>                                             conditions that
>>>>>>>>>>                                             seem (but may not
>>>>>>>>>>                                             be) related to
>>>>>>>>>>                                             the stack's state
>>>>>>>>>>                                             on calling the
>>>>>>>>>>                                             function.
>>>>>>>>>>
>>>>>>>>>>                                             Our program acts
>>>>>>>>>>                                             as a front-end,
>>>>>>>>>>                                             using the LLVM
>>>>>>>>>>                                             C++ API to
>>>>>>>>>>                                             generate a JIT
>>>>>>>>>>                                             generated
>>>>>>>>>>                                             function. This
>>>>>>>>>>                                             function is
>>>>>>>>>>                                             primarily
>>>>>>>>>>                                             mathematical, so
>>>>>>>>>>                                             we use the Vector
>>>>>>>>>>                                             types to take
>>>>>>>>>>                                             advantage of SIMD
>>>>>>>>>>                                             instructions (as
>>>>>>>>>>                                             well as a few
>>>>>>>>>>                                             SSE2 intrinsics).
>>>>>>>>>>
>>>>>>>>>>                                             This worked in
>>>>>>>>>>                                             LLVM 2.8 but
>>>>>>>>>>                                             started failing
>>>>>>>>>>                                             in 3.2 and has
>>>>>>>>>>                                             continued to fail
>>>>>>>>>>                                             in 3.3. It fails
>>>>>>>>>>                                             with no
>>>>>>>>>>                                             optimizations
>>>>>>>>>>                                             applied to the
>>>>>>>>>>                                             LLVM
>>>>>>>>>>                                             Function/Module.
>>>>>>>>>>                                             It crashes with
>>>>>>>>>>                                             what is reported
>>>>>>>>>>                                             as a memory
>>>>>>>>>>                                             access error
>>>>>>>>>>                                             (accessing
>>>>>>>>>>                                             0xffffffff),
>>>>>>>>>>                                             however it's
>>>>>>>>>>                                             suggested that
>>>>>>>>>>                                             this is how the
>>>>>>>>>>                                             SSE fault raising
>>>>>>>>>>                                             mechanism appears.
>>>>>>>>>>
>>>>>>>>>>                                             The generated
>>>>>>>>>>                                             instruction
>>>>>>>>>>                                             varies, but it
>>>>>>>>>>                                             seems to often be
>>>>>>>>>>                                             similar to (I
>>>>>>>>>>                                             don't have it in
>>>>>>>>>>                                             front of me, sorry):
>>>>>>>>>>                                             movapd xmm0,
>>>>>>>>>>                                             xmm[ecx+0x???????]
>>>>>>>>>>                                             Where the xmm
>>>>>>>>>>                                             register changes,
>>>>>>>>>>                                             and the second
>>>>>>>>>>                                             parameter is a
>>>>>>>>>>                                             memory access.
>>>>>>>>>>                                             ECX is always set
>>>>>>>>>>                                             to 0x7fffffff -
>>>>>>>>>>                                             however I don't
>>>>>>>>>>                                             know if this is
>>>>>>>>>>                                             part of the SSE
>>>>>>>>>>                                             error reporting
>>>>>>>>>>                                             process or is
>>>>>>>>>>                                             part of the
>>>>>>>>>>                                             situation causing
>>>>>>>>>>                                             the error.
>>>>>>>>>>
>>>>>>>>>>                                             I haven't worked
>>>>>>>>>>                                             out exactly what
>>>>>>>>>>                                             code path etc. is
>>>>>>>>>>                                             causing this
>>>>>>>>>>                                             crash. I'm hoping
>>>>>>>>>>                                             that someone can
>>>>>>>>>>                                             tell me if there
>>>>>>>>>>                                             were any changed
>>>>>>>>>>                                             requirements for
>>>>>>>>>>                                             working with SIMD
>>>>>>>>>>                                             in LLVM 3.2 (or
>>>>>>>>>>                                             earlier, we
>>>>>>>>>>                                             haven't tried 3.0
>>>>>>>>>>                                             or 3.1). I
>>>>>>>>>>                                             currently suspect
>>>>>>>>>>                                             the use of
>>>>>>>>>>                                             GlobalVariable
>>>>>>>>>>                                             (we first
>>>>>>>>>>                                             discovered the
>>>>>>>>>>                                             crash when using
>>>>>>>>>>                                             a feature that
>>>>>>>>>>                                             uses them),
>>>>>>>>>>                                             however I have
>>>>>>>>>>                                             attempted using
>>>>>>>>>>                                             setAlignment on
>>>>>>>>>>                                             the
>>>>>>>>>>                                             GlobalVariables
>>>>>>>>>>                                             without any change.
>>>>>>>>>>
>>>>>>>>>>                                             --
>>>>>>>>>>                                             Peter N
>>>>>>>>>>                                             _______________________________________________
>>>>>>>>>>                                             LLVM Developers
>>>>>>>>>>                                             mailing list
>>>>>>>>>>                                             LLVMdev at cs.uiuc.edu
>>>>>>>>>>                                             <mailto:LLVMdev at cs.uiuc.edu>
>>>>>>>>>>                                             http://llvm.cs.uiuc.edu
>>>>>>>>>>                                             http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>                                 -- 
>>>>>>>>>>                                 ~Craig
