[LLVMdev] fptoui calling a function that modifies ECX

Craig Topper craig.topper at gmail.com
Sun Jul 21 00:40:44 PDT 2013


Committed in r186787


On Sat, Jul 20, 2013 at 12:44 AM, Peter Newman <peter at uformia.com> wrote:

>  I've applied this and the test cases I have here continue to work, so it
> looks good to me.
>
> I've run into another (seemingly unrelated) issue, which I'll describe in a
> separate email to the dev list.
>
> --
> Peter N
>
>
> On 20/07/2013 5:30 AM, Craig Topper wrote:
>
> Here's my attempt at a fix. Adding Jakob to make sure I did this right.
>
>
> On Fri, Jul 19, 2013 at 2:34 AM, Peter Newman <peter at uformia.com> wrote:
>
>>  That does appear to have worked. All my tests are passing now.
>>
>> I'll hand this out to our other devs & testers and make sure it's working
>> for them as well (not just on my machine).
>>
>> Thank you, again.
>>
>> --
>> Peter N
>>
>>
>> On 19/07/2013 5:45 PM, Craig Topper wrote:
>>
>> I don't think that's going to work.
>>
>>
>> On Fri, Jul 19, 2013 at 12:24 AM, Peter Newman <peter at uformia.com> wrote:
>>
>>>  Thank you, I'm trying this now.
>>>
>>>
>>> On 19/07/2013 5:23 PM, Craig Topper wrote:
>>>
>>> Try adding ECX to the Defs of this part of
>>> lib/Target/X86/X86InstrCompiler.td like I've done below. I don't have a
>>> Windows machine to test myself.
>>>
>>> // ECX added to Defs so the pseudo is treated as clobbering it.
>>> let Defs = [EAX, EDX, ECX, EFLAGS], FPForm = SpecialFP in {
>>>   def WIN_FTOL_32 : I<0, Pseudo, (outs), (ins RFP32:$src),
>>>                       "# win32 fptoui",
>>>                       [(X86WinFTOL RFP32:$src)]>,
>>>                     Requires<[In32BitMode]>;
>>>
>>>   def WIN_FTOL_64 : I<0, Pseudo, (outs), (ins RFP64:$src),
>>>                       "# win32 fptoui",
>>>                       [(X86WinFTOL RFP64:$src)]>,
>>>                     Requires<[In32BitMode]>;
>>> }
>>>
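The effect of a missing ECX entry in the Defs list can be illustrated with a toy model. This is a minimal sketch, not LLVM's register allocator: the register order, the function names, and the "old" Defs list (the list above minus ECX) are assumptions made purely for illustration.

```python
# Toy model (not LLVM code): an allocator keeps a value in the first register
# that the instruction's declared Defs list says is left untouched.
GPRS = ["EAX", "ECX", "EDX", "EBX", "ESI", "EDI"]  # hypothetical allocation order

# What the helper routine really overwrites, per the disassembly in this thread.
ACTUAL_CLOBBERS = {"EAX", "ECX", "EDX", "EFLAGS"}

OLD_DEFS = {"EAX", "EDX", "EFLAGS"}         # assumed: the list before ECX was added
NEW_DEFS = {"EAX", "EDX", "ECX", "EFLAGS"}  # the list proposed above


def pick_register_live_across(declared_defs):
    """Pick a register for a value that must survive the pseudo-instruction."""
    for reg in GPRS:
        if reg not in declared_defs:
            return reg
    return None  # nothing free: a real allocator would spill instead


def value_survives(declared_defs):
    """True if the chosen register is not actually overwritten by the call."""
    reg = pick_register_live_across(declared_defs)
    return reg is not None and reg not in ACTUAL_CLOBBERS


if __name__ == "__main__":
    for name, defs in (("old", OLD_DEFS), ("new", NEW_DEFS)):
        print(name, pick_register_live_across(defs), value_survives(defs))
```

In this model the old list lets the value sit in ECX, which the routine then overwrites; once ECX is declared, the model picks a register that survives the call.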
>>>
>>> On Thu, Jul 18, 2013 at 11:59 PM, Peter Newman <peter at uformia.com> wrote:
>>>
>>>> Oh, excellent point, I agree. My bad. Now that I'm not assuming those
>>>> calls are the sqrt, I see the sqrtpd instructions in the output. Also, there
>>>> are three fptoui instructions and three call instances.
>>>>
>>>> (Changing subject line again.)
>>>>
>>>> Now it looks like it's bug #13862
>>>>
>>>> On 19/07/2013 4:51 PM, Craig Topper wrote:
>>>>
>>>> I think those calls correspond to this
>>>>
>>>>    %110 = fptoui double %109 to i32
>>>>
>>>>  The calls are followed by an imul with 12 which matches up with what
>>>> occurs right after the fptoui in the IR.
>>>>
>>>>
>>>> On Thu, Jul 18, 2013 at 11:48 PM, Peter Newman <peter at uformia.com> wrote:
>>>>
>>>>>  Yes, that is the result of module-dump.ll
>>>>>
>>>>>
>>>>> On 19/07/2013 4:46 PM, Craig Topper wrote:
>>>>>
>>>>> Does this correspond to one of the .ll files you sent earlier?
>>>>>
>>>>>
>>>>> On Thu, Jul 18, 2013 at 11:34 PM, Peter Newman <peter at uformia.com> wrote:
>>>>>
>>>>>>  (Changing subject line as diagnosis has changed)
>>>>>>
>>>>>> I'm attaching the compiled code that I've been getting, both with
>>>>>> CodeGenOpt::Default and CodeGenOpt::None. The crash isn't occurring with
>>>>>> CodeGenOpt::None, but that seems to be because ECX isn't being used; it
>>>>>> still gets set to 0x7fffffff by one of the calls to 76719BA1.
>>>>>>
>>>>>> I notice that X86::SQRTPD[m|r] appear in
>>>>>> X86InstrInfo::isHighLatencyDef. I was thinking an optimization might be
>>>>>> removing it, but I don't get the sqrtpd instruction even with the createJIT
>>>>>> optimization level turned off.
>>>>>>
>>>>>> I am trying this with the Release 3.3 code - I'll try it with trunk
>>>>>> and see if I get a different result there. Maybe there was a recent commit
>>>>>> for this.
>>>>>>
>>>>>> --
>>>>>> Peter N
>>>>>>
>>>>>> On 19/07/2013 4:00 PM, Craig Topper wrote:
>>>>>>
>>>>>> Hmm, I'm not able to get those .ll files to compile if I disable SSE,
>>>>>> and I end up with SSE instructions (including sqrtpd) if I don't disable it.
>>>>>>
>>>>>>
>>>>>> On Thu, Jul 18, 2013 at 10:53 PM, Peter Newman <peter at uformia.com> wrote:
>>>>>>
>>>>>>> Is there something specifically required to enable SSE? If it's
>>>>>>> not detected as available (based on the target triple?) then I don't
>>>>>>> think we enable it specifically.
>>>>>>>
>>>>>>> Also it seems that it should handle converting to/from the vector
>>>>>>> types, although I can see it getting confused about needing to do that if
>>>>>>> it thinks SSE isn't available at all.
>>>>>>>
>>>>>>>
>>>>>>> On 19/07/2013 3:47 PM, Craig Topper wrote:
>>>>>>>
>>>>>>> Hmm, maybe SSE isn't being enabled, so it's falling back to emulating
>>>>>>> sqrt?
>>>>>>>
>>>>>>>
>>>>>>> On Thu, Jul 18, 2013 at 10:45 PM, Peter Newman <peter at uformia.com> wrote:
>>>>>>>
>>>>>>>>  In the disassembly, I'm seeing three cases of
>>>>>>>> call        76719BA1
>>>>>>>>
>>>>>>>> I am assuming this is the sqrt function as this is the only
>>>>>>>> function called in the LLVM IR.
>>>>>>>>
>>>>>>>> The code at 76719BA1 is:
>>>>>>>>
>>>>>>>> 76719BA1  push        ebp
>>>>>>>> 76719BA2  mov         ebp,esp
>>>>>>>> 76719BA4  sub         esp,20h
>>>>>>>> 76719BA7  and         esp,0FFFFFFF0h
>>>>>>>> 76719BAA  fld         st(0)
>>>>>>>> 76719BAC  fst         dword ptr [esp+18h]
>>>>>>>> 76719BB0  fistp       qword ptr [esp+10h]
>>>>>>>> 76719BB4  fild        qword ptr [esp+10h]
>>>>>>>> 76719BB8  mov         edx,dword ptr [esp+18h]
>>>>>>>> 76719BBC  mov         eax,dword ptr [esp+10h]
>>>>>>>> 76719BC0  test        eax,eax
>>>>>>>> 76719BC2  je          76719DCF
>>>>>>>> 76719BC8  fsubp       st(1),st
>>>>>>>> 76719BCA  test        edx,edx
>>>>>>>> 76719BCC  js          7671F9DB
>>>>>>>> 76719BD2  fstp        dword ptr [esp]
>>>>>>>> 76719BD5  mov         ecx,dword ptr [esp]
>>>>>>>> 76719BD8  add         ecx,7FFFFFFFh
>>>>>>>> 76719BDE  sbb         eax,0
>>>>>>>> 76719BE1  mov         edx,dword ptr [esp+14h]
>>>>>>>> 76719BE5  sbb         edx,0
>>>>>>>> 76719BE8  leave
>>>>>>>> 76719BE9  ret
>>>>>>>>
>>>>>>>>
>>>>>>>> As you can see at 76719BD5, it modifies ECX.
>>>>>>>>
>>>>>>>> I don't know that this is the sqrtpd function (for example, I'm not
>>>>>>>> seeing any SSE instructions here?) but whatever it is, it's being called
>>>>>>>> from the IR I attached earlier, and is modifying ECX under some
>>>>>>>> circumstances.
>>>>>>>>
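The tail of the listing (76719BB0 through 76719BE5) can be modelled in a few lines. This is a hedged sketch based on one reading of the disassembly, not the library's source. It covers only the fall-through path, that is, a positive input whose rounded value has a non-zero low dword, so neither the je nor the js branch is taken. The function names are hypothetical, and the x87 rounding mode is assumed to be the default round-to-nearest-even.

```python
import struct


def float_bits(x):
    """Bit pattern of x after rounding to a 32-bit float (models fstp dword)."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def ftol_positive_path(x):
    """Model the fall-through path for a positive x; returns (result, final ECX)."""
    rounded = round(x)            # fistp qword: round to nearest, ties to even
    diff = x - rounded            # fsubp st(1),st: original minus rounded
    ecx = float_bits(diff)        # fstp dword ptr [esp]; mov ecx,[esp]
    total = ecx + 0x7FFFFFFF      # add ecx,7FFFFFFFh
    carry = total >> 32           # carry is set only when diff is negative
    ecx = total & 0xFFFFFFFF      # ECX keeps the 32-bit sum on return
    result = rounded - carry      # sbb eax,0 / sbb edx,0 undo a round-up
    return result, ecx


if __name__ == "__main__":
    for value in (12.0, 12.3, 12.7):
        result, ecx = ftol_positive_path(value)
        print(value, result, hex(ecx))
```

In this model the add/sbb sequence turns round-to-nearest into truncation, and ECX is left holding the scratch sum. For an exactly integral input the difference is zero, so the model leaves 0x7FFFFFFF in ECX, which is consistent with the value reported in this thread.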
>>>>>>>>
>>>>>>>> On 19/07/2013 3:29 PM, Craig Topper wrote:
>>>>>>>>
>>>>>>>> That should map directly to sqrtpd which can't modify ecx.
>>>>>>>>
>>>>>>>>
>>>>>>>> On Thu, Jul 18, 2013 at 10:27 PM, Peter Newman <peter at uformia.com> wrote:
>>>>>>>>
>>>>>>>>>  Sorry, that should have been llvm.x86.sse2.sqrt.pd
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 19/07/2013 3:25 PM, Craig Topper wrote:
>>>>>>>>>
>>>>>>>>> What is "frep.x86.sse2.sqrt.pd"? I'm only familiar with things
>>>>>>>>> prefixed with "llvm.x86".
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Thu, Jul 18, 2013 at 10:12 PM, Peter Newman <peter at uformia.com> wrote:
>>>>>>>>>
>>>>>>>>>>  After stepping through the produced assembly, I believe I have
>>>>>>>>>> a culprit.
>>>>>>>>>>
>>>>>>>>>> One of the calls to @frep.x86.sse2.sqrt.pd is modifying the value
>>>>>>>>>> of ECX - while the produced code is expecting it to still contain its
>>>>>>>>>> previous value.
>>>>>>>>>>
>>>>>>>>>> Peter N
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 19/07/2013 2:09 PM, Peter Newman wrote:
>>>>>>>>>>
>>>>>>>>>> I've attached the module->dump() that our code is producing.
>>>>>>>>>> Unfortunately this is the smallest test case I have available.
>>>>>>>>>>
>>>>>>>>>> This is before any optimization passes are applied. There are two
>>>>>>>>>> separate modules in existence at the time, and there are no guarantees
>>>>>>>>>> about the order in which the surrounding code calls those functions, so there
>>>>>>>>>> may be some interaction between them. There shouldn't be, as they don't refer
>>>>>>>>>> to any common memory, etc. There is no multi-threading occurring.
>>>>>>>>>>
>>>>>>>>>> The function in module-dump.ll (called crashfunc in this file) is
>>>>>>>>>> called with
>>>>>>>>>> -        func_params    0x0018f3b0    double [3]
>>>>>>>>>>         [0x0]    -11.339976634695301    double
>>>>>>>>>>         [0x1]    -9.7504239056205506    double
>>>>>>>>>>         [0x2]    -5.2900856817382804    double
>>>>>>>>>> at the time of the exception.
>>>>>>>>>>
>>>>>>>>>> This is compiled on a "i686-pc-win32" triple. All of the
>>>>>>>>>> non-intrinsic functions referred to in these modules are the standard
>>>>>>>>>> equivalents from the MSVC library (e.g. @asin is the standard C lib
>>>>>>>>>> double asin( double ) ).
>>>>>>>>>>
>>>>>>>>>> Hopefully this is reproducible for you.
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> PeterN
>>>>>>>>>>
>>>>>>>>>> On 18/07/2013 4:37 PM, Craig Topper wrote:
>>>>>>>>>>
>>>>>>>>>> Are you able to send any IR for others to reproduce this issue?
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Wed, Jul 17, 2013 at 11:23 PM, Peter Newman <peter at uformia.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Unfortunately, this doesn't appear to be the bug I'm hitting. I
>>>>>>>>>>> applied the fix to my source and it didn't make a difference.
>>>>>>>>>>>
>>>>>>>>>>> Also, further testing showed the same behavior with
>>>>>>>>>>> other SIMD instructions. The common factor is that in each case ECX is set to
>>>>>>>>>>> 0x7fffffff, and it's an operation using xmm ptr ecx+offset.
>>>>>>>>>>>
>>>>>>>>>>> Additionally, turning the optimization level passed to createJIT
>>>>>>>>>>> down appears to avoid it, so I'm now leaning towards a bug in one of the
>>>>>>>>>>> optimization passes.
>>>>>>>>>>>
>>>>>>>>>>> I'm going to dig through the passes controlled by that parameter
>>>>>>>>>>> and see if I can narrow down which optimization is causing it.
>>>>>>>>>>>
>>>>>>>>>>> Peter N
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 17/07/2013 1:58 PM, Solomon Boulos wrote:
>>>>>>>>>>>
>>>>>>>>>>>> As someone off list just told me, perhaps my new bug is the
>>>>>>>>>>>> same issue:
>>>>>>>>>>>>
>>>>>>>>>>>>    http://llvm.org/bugs/show_bug.cgi?id=16640
>>>>>>>>>>>>
>>>>>>>>>>>> Do you happen to be using FastISel?
>>>>>>>>>>>>
>>>>>>>>>>>> Solomon
>>>>>>>>>>>>
>>>>>>>>>>>> On Jul 16, 2013, at 6:39 PM, Peter Newman <peter at uformia.com>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hello all,
>>>>>>>>>>>>>
>>>>>>>>>>>>> I'm currently in the process of debugging a crash occurring in
>>>>>>>>>>>>> our program. In LLVM 3.2 and 3.3 it appears that JIT-generated code is
>>>>>>>>>>>>> attempting to access unaligned memory with an SSE2 instruction.
>>>>>>>>>>>>> However, this only happens under certain conditions that seem (but may not
>>>>>>>>>>>>> be) related to the stack's state on calling the function.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Our program acts as a front-end, using the LLVM C++ API to
>>>>>>>>>>>>> JIT-compile a function. This function is primarily mathematical,
>>>>>>>>>>>>> so we use the vector types to take advantage of SIMD instructions (as well
>>>>>>>>>>>>> as a few SSE2 intrinsics).
>>>>>>>>>>>>>
>>>>>>>>>>>>> This worked in LLVM 2.8 but started failing in 3.2 and has
>>>>>>>>>>>>> continued to fail in 3.3. It fails with no optimizations applied to the
>>>>>>>>>>>>> LLVM Function/Module. It crashes with what is reported as a memory access
>>>>>>>>>>>>> error (accessing 0xffffffff); however, it has been suggested that this is
>>>>>>>>>>>>> how an SSE fault is reported.
>>>>>>>>>>>>>
>>>>>>>>>>>>> The generated instruction varies, but it seems to often be
>>>>>>>>>>>>> similar to (I don't have it in front of me, sorry):
>>>>>>>>>>>>> movapd xmm0, xmm[ecx+0x???????]
>>>>>>>>>>>>> Where the xmm register changes, and the second parameter is a
>>>>>>>>>>>>> memory access.
>>>>>>>>>>>>> ECX is always set to 0x7fffffff; however, I don't know if this
>>>>>>>>>>>>> is part of the SSE error reporting process or is part of the situation
>>>>>>>>>>>>> causing the error.
>>>>>>>>>>>>>
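Why an ECX value of 0x7fffffff is suspicious for this kind of instruction can be checked with simple arithmetic. movapd requires its memory operand to sit on a 16-byte boundary. The sketch below uses hypothetical helper names and made-up displacements, and it ignores whether the address is even mapped; it only tests alignment of [ecx + displacement].

```python
ECX_OBSERVED = 0x7FFFFFFF  # value of ECX reported in this thread


def effective_address(base, disp):
    """32-bit effective address of [base + disp], wrapping like the hardware."""
    return (base + disp) & 0xFFFFFFFF


def is_movapd_aligned(addr):
    """movapd needs its memory operand on a 16-byte boundary."""
    return addr % 16 == 0


def aligned_displacements(base, disps):
    """Displacements for which [base + disp] would satisfy movapd."""
    return [d for d in disps if is_movapd_aligned(effective_address(base, d))]


if __name__ == "__main__":
    # Displacements that are multiples of 16, as used with an aligned base.
    print(aligned_displacements(ECX_OBSERVED, range(0, 256, 16)))
```

Since 0x7fffffff is 15 modulo 16, only displacements that are 1 modulo 16 would produce an aligned address, so any displacement chosen for a properly aligned base fails the check.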
>>>>>>>>>>>>> I haven't worked out exactly what code path etc is causing
>>>>>>>>>>>>> this crash. I'm hoping that someone can tell me if there were any changed
>>>>>>>>>>>>> requirements for working with SIMD in LLVM 3.2 (or earlier, we haven't
>>>>>>>>>>>>> tried 3.0 or 3.1). I currently suspect the use of GlobalVariable (we first
>>>>>>>>>>>>> discovered the crash when using a feature that uses them), however I have
>>>>>>>>>>>>> attempted using setAlignment on the GlobalVariables without any change.
>>>>>>>>>>>>>
>>>>>>>>>>>>> --
>>>>>>>>>>>>> Peter N
>>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>>> LLVM Developers mailing list
>>>>>>>>>>>>> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
>>>>>>>>>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>


-- 
~Craig