[LLVMdev] fptoui calling a function that modifies ECX
Peter Newman
peter at uformia.com
Fri Jul 19 00:24:04 PDT 2013
Thank you, I'm trying this now.
On 19/07/2013 5:23 PM, Craig Topper wrote:
> Try adding ECX to the Defs of this part of
> lib/Target/X86/X86InstrCompiler.td like I've done below. I don't have
> a Windows machine to test myself.
>
> let Defs = [EAX, EDX, ECX, EFLAGS], FPForm = SpecialFP in {
>   def WIN_FTOL_32 : I<0, Pseudo, (outs), (ins RFP32:$src),
>                       "# win32 fptoui",
>                       [(X86WinFTOL RFP32:$src)]>,
>                     Requires<[In32BitMode]>;
>
>   def WIN_FTOL_64 : I<0, Pseudo, (outs), (ins RFP64:$src),
>                       "# win32 fptoui",
>                       [(X86WinFTOL RFP64:$src)]>,
>                     Requires<[In32BitMode]>;
> }
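
To illustrate why the Defs list matters, here is a toy C++ model. It is not LLVM's real data structure or API; PseudoInstr, survivesAcross, ftolBefore and ftolAfter are hypothetical names, and the "before" list is only inferred from the suggestion to add ECX. The idea is that a value may be kept in a register across an instruction only if that instruction does not declare the register as clobbered, so an incomplete list lets a live ECX be silently overwritten.

```cpp
#include <cassert>
#include <set>
#include <string>

// Toy model, not LLVM's real classes: an instruction is described only by
// the set of registers it declares as clobbered (its "Defs").
struct PseudoInstr {
    std::string name;
    std::set<std::string> defs;
};

// Returns true when a value held in `liveReg` may stay there across `instr`,
// i.e. the instruction does not declare that register as clobbered.
// The answer is only trustworthy if `defs` is complete.
bool survivesAcross(const PseudoInstr& instr, const std::string& liveReg) {
    assert(!liveReg.empty());
    return instr.defs.count(liveReg) == 0;
}

// Defs as they presumably were before the suggested change (assumption).
PseudoInstr ftolBefore() {
    return {"WIN_FTOL", {"EAX", "EDX", "EFLAGS"}};
}

// Defs with ECX added, as in the quoted snippet.
PseudoInstr ftolAfter() {
    return {"WIN_FTOL", {"EAX", "EDX", "ECX", "EFLAGS"}};
}
```

With the shorter list the model claims ECX is preserved across the pseudo-instruction, which the disassembly quoted further down contradicts; with ECX listed, the model reports that the value has to be kept somewhere else.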
>
>
> On Thu, Jul 18, 2013 at 11:59 PM, Peter Newman <peter at uformia.com
> <mailto:peter at uformia.com>> wrote:
>
> Oh, excellent point, I agree. My bad. Now that I'm not assuming
> those are the sqrt, I see the sqrtpd's in the output. Also there
> are three fptoui's and there are 3 call instances.
>
> (Changing subject line again.)
>
> Now it looks like it's bug #13862
>
> On 19/07/2013 4:51 PM, Craig Topper wrote:
>> I think those calls correspond to this
>>
>> %110 = fptoui double %109 to i32
>>
>> The calls are followed by an imul with 12 which matches up with
>> what occurs right after the fptoui in the IR.
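
As a rough source-level picture of that pattern, the following C++ sketch performs a double to unsigned 32-bit conversion followed by a multiply by 12. The function name scaledIndex is hypothetical and the real front-end code is not shown in this thread; the only claim is that a conversion of this kind is what typically lowers to an fptoui, with the multiply right after it.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: convert a double to an unsigned 32-bit value,
// then scale the result by 12.
std::uint32_t scaledIndex(double v) {
    // The conversion is only defined for values representable in 32 bits.
    assert(v >= 0.0 && v < 4294967296.0);
    std::uint32_t i = static_cast<std::uint32_t>(v);  // truncates toward zero
    return i * 12u;
}
```

For example, scaledIndex(3.7) truncates 3.7 to 3 and returns 36.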
>>
>>
>> On Thu, Jul 18, 2013 at 11:48 PM, Peter Newman <peter at uformia.com
>> <mailto:peter at uformia.com>> wrote:
>>
>> Yes, that is the result of module-dump.ll
>>
>>
>> On 19/07/2013 4:46 PM, Craig Topper wrote:
>>> Does this correspond to one of the .ll files you sent earlier?
>>>
>>>
>>> On Thu, Jul 18, 2013 at 11:34 PM, Peter Newman
>>> <peter at uformia.com <mailto:peter at uformia.com>> wrote:
>>>
>>> (Changing subject line as diagnosis has changed)
>>>
>>> I'm attaching the compiled code that I've been getting,
>>> both with CodeGenOpt::Default and CodeGenOpt::None. The
>>> crash isn't occurring with CodeGenOpt::None, but that
>>> seems to be because ECX isn't being used there; it still
>>> gets set to 0x7fffffff by one of the calls to 76719BA1.
>>>
>>> I notice that X86::SQRTPD[m|r] appear in
>>> X86InstrInfo::isHighLatencyDef. I was thinking an
>>> optimization might be removing it, but I don't get the
>>> sqrtpd instruction even with the createJIT optimization
>>> level turned off.
>>>
>>> I am trying this with the Release 3.3 code - I'll try it
>>> with trunk and see if I get a different result there.
>>> Maybe there was a recent commit for this.
>>>
>>> --
>>> Peter N
>>>
>>> On 19/07/2013 4:00 PM, Craig Topper wrote:
>>>> Hmm, I'm not able to get those .ll files to compile if
>>>> I disable SSE, and I end up with SSE
>>>> instructions (including sqrtpd) if I don't disable it.
>>>>
>>>>
>>>> On Thu, Jul 18, 2013 at 10:53 PM, Peter Newman
>>>> <peter at uformia.com <mailto:peter at uformia.com>> wrote:
>>>>
>>>> Is there something specifically required to enable
>>>> SSE? If it's not detected as available (based on
>>>> the target triple?) then I don't think we enable it
>>>> specifically.
>>>>
>>>> Also it seems that it should handle converting
>>>> to/from the vector types, although I can see it
>>>> getting confused about needing to do that if it
>>>> thinks SSE isn't available at all.
>>>>
>>>>
>>>> On 19/07/2013 3:47 PM, Craig Topper wrote:
>>>>> Hmm, maybe SSE isn't being enabled so it's falling
>>>>> back to emulating sqrt?
>>>>>
>>>>>
>>>>> On Thu, Jul 18, 2013 at 10:45 PM, Peter Newman
>>>>> <peter at uformia.com <mailto:peter at uformia.com>> wrote:
>>>>>
>>>>> In the disassembly, I'm seeing three cases of
>>>>> call 76719BA1
>>>>>
>>>>> I am assuming this is the sqrt function as
>>>>> this is the only function called in the LLVM IR.
>>>>>
>>>>> The code at 76719BA1 is:
>>>>>
>>>>> 76719BA1 push ebp
>>>>> 76719BA2 mov ebp,esp
>>>>> 76719BA4 sub esp,20h
>>>>> 76719BA7 and esp,0FFFFFFF0h
>>>>> 76719BAA fld st(0)
>>>>> 76719BAC fst dword ptr [esp+18h]
>>>>> 76719BB0 fistp qword ptr [esp+10h]
>>>>> 76719BB4 fild qword ptr [esp+10h]
>>>>> 76719BB8 mov edx,dword ptr [esp+18h]
>>>>> 76719BBC mov eax,dword ptr [esp+10h]
>>>>> 76719BC0 test eax,eax
>>>>> 76719BC2 je 76719DCF
>>>>> 76719BC8 fsubp st(1),st
>>>>> 76719BCA test edx,edx
>>>>> 76719BCC js 7671F9DB
>>>>> 76719BD2 fstp dword ptr [esp]
>>>>> 76719BD5 mov ecx,dword ptr [esp]
>>>>> 76719BD8 add ecx,7FFFFFFFh
>>>>> 76719BDE sbb eax,0
>>>>> 76719BE1 mov edx,dword ptr [esp+14h]
>>>>> 76719BE5 sbb edx,0
>>>>> 76719BE8 leave
>>>>> 76719BE9 ret
>>>>>
>>>>>
>>>>> As you can see at 76719BD5, it modifies ECX.
>>>>>
>>>>> I don't know that this is the sqrtpd function
>>>>> (for example, I'm not seeing any SSE
>>>>> instructions here?) but whatever it is, it's
>>>>> being called from the IR I attached earlier,
>>>>> and is modifying ECX under some circumstances.
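
The arithmetic at 76719BD5 and 76719BD8 can be checked in isolation. The C++ sketch below models only those two instructions (load the raw bits of a 32-bit float into ECX, then add 7FFFFFFFh); AddResult and addStep are hypothetical names, and nothing here reproduces the rest of the routine. One thing it shows: if the float stored at [esp] is 0.0f, ECX ends up as exactly 0x7FFFFFFF. Whether that is the case in the failing run is an assumption, but it would be consistent with the ECX value reported elsewhere in this thread.

```cpp
#include <cstdint>
#include <cstring>

struct AddResult {
    std::uint32_t ecx;  // value left in ECX after the add
    bool carry;         // carry flag consumed by the following sbb
};

// Models `mov ecx,dword ptr [esp]` followed by `add ecx,7FFFFFFFh`,
// where [esp] holds the raw bits of a 32-bit float.
AddResult addStep(float stored) {
    static_assert(sizeof(float) == sizeof(std::uint32_t),
                  "this sketch expects a 32-bit float");
    std::uint32_t bits;
    std::memcpy(&bits, &stored, sizeof bits);  // reinterpret the float's bits
    // Do the add in 64 bits so the carry out of bit 31 is visible.
    std::uint64_t wide = static_cast<std::uint64_t>(bits) + 0x7FFFFFFFu;
    return {static_cast<std::uint32_t>(wide), (wide >> 32) != 0};
}
```

In this model the carry is set only when the stored float is negative and non-zero, which is what the two sbb instructions after the add would then subtract from EDX:EAX.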
>>>>>
>>>>>
>>>>> On 19/07/2013 3:29 PM, Craig Topper wrote:
>>>>>> That should map directly to sqrtpd which
>>>>>> can't modify ecx.
>>>>>>
>>>>>>
>>>>>> On Thu, Jul 18, 2013 at 10:27 PM, Peter
>>>>>> Newman <peter at uformia.com
>>>>>> <mailto:peter at uformia.com>> wrote:
>>>>>>
>>>>>> Sorry, that should have been
>>>>>> llvm.x86.sse2.sqrt.pd
>>>>>>
>>>>>>
>>>>>> On 19/07/2013 3:25 PM, Craig Topper wrote:
>>>>>>> What is "frep.x86.sse2.sqrt.pd"? I'm
>>>>>>> only familiar with things prefixed with
>>>>>>> "llvm.x86".
>>>>>>>
>>>>>>>
>>>>>>> On Thu, Jul 18, 2013 at 10:12 PM, Peter
>>>>>>> Newman <peter at uformia.com
>>>>>>> <mailto:peter at uformia.com>> wrote:
>>>>>>>
>>>>>>> After stepping through the produced
>>>>>>> assembly, I believe I have a culprit.
>>>>>>>
>>>>>>> One of the calls to
>>>>>>> @frep.x86.sse2.sqrt.pd is modifying
>>>>>>> the value of ECX - while the
>>>>>>> produced code is expecting it to
>>>>>>> still contain its previous value.
>>>>>>>
>>>>>>> Peter N
>>>>>>>
>>>>>>>
>>>>>>> On 19/07/2013 2:09 PM, Peter Newman
>>>>>>> wrote:
>>>>>>>> I've attached the module->dump()
>>>>>>>> that our code is producing.
>>>>>>>> Unfortunately this is the smallest
>>>>>>>> test case I have available.
>>>>>>>>
>>>>>>>> This is before any optimization
>>>>>>>> passes are applied. There are two
>>>>>>>> separate modules in existence at
>>>>>>>> the time, and there are no
>>>>>>>> guarantees about the order the
>>>>>>>> surrounding code calls those
>>>>>>>> functions, so there may be some
>>>>>>>> interaction between them? There
>>>>>>>> shouldn't be, they don't refer to
>>>>>>>> any common memory etc. There is no
>>>>>>>> multi-threading occurring.
>>>>>>>>
>>>>>>>> The function in module-dump.ll
>>>>>>>> (called crashfunc in this file) is
>>>>>>>> called with
>>>>>>>> - func_params  0x0018f3b0  double [3]
>>>>>>>>     [0x0]  -11.339976634695301  double
>>>>>>>>     [0x1]  -9.7504239056205506  double
>>>>>>>>     [0x2]  -5.2900856817382804  double
>>>>>>>> at the time of the exception.
>>>>>>>>
>>>>>>>> This is compiled on a
>>>>>>>> "i686-pc-win32" triple. All of the
>>>>>>>> non-intrinsic functions referred to
>>>>>>>> in these modules are the standard
>>>>>>>> equivalents from the MSVC library
>>>>>>>> (e.g. @asin is the standard C lib
>>>>>>>> double asin( double ) ).
>>>>>>>>
>>>>>>>> Hopefully this is reproducible for you.
>>>>>>>>
>>>>>>>> --
>>>>>>>> PeterN
>>>>>>>>
>>>>>>>> On 18/07/2013 4:37 PM, Craig Topper
>>>>>>>> wrote:
>>>>>>>>> Are you able to send any IR for
>>>>>>>>> others to reproduce this issue?
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Wed, Jul 17, 2013 at 11:23 PM,
>>>>>>>>> Peter Newman <peter at uformia.com
>>>>>>>>> <mailto:peter at uformia.com>> wrote:
>>>>>>>>>
>>>>>>>>> Unfortunately, this doesn't
>>>>>>>>> appear to be the bug I'm
>>>>>>>>> hitting. I applied the fix to
>>>>>>>>> my source and it didn't make a
>>>>>>>>> difference.
>>>>>>>>>
>>>>>>>>> Also, further testing found me
>>>>>>>>> getting the same behavior with
>>>>>>>>> other SIMD instructions. The
>>>>>>>>> common factor is that in each case
>>>>>>>>> ECX is set to 0x7fffffff, and
>>>>>>>>> the operation uses
>>>>>>>>> xmm ptr ecx+offset.
>>>>>>>>>
>>>>>>>>> Additionally, turning the
>>>>>>>>> optimization level passed to
>>>>>>>>> createJIT down appears to
>>>>>>>>> avoid it, so I'm now leaning
>>>>>>>>> towards a bug in one of the
>>>>>>>>> optimization passes.
>>>>>>>>>
>>>>>>>>> I'm going to dig through the
>>>>>>>>> passes controlled by that
>>>>>>>>> parameter and see if I can
>>>>>>>>> narrow down which optimization
>>>>>>>>> is causing it.
>>>>>>>>>
>>>>>>>>> Peter N
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 17/07/2013 1:58 PM, Solomon
>>>>>>>>> Boulos wrote:
>>>>>>>>>
>>>>>>>>> As someone off list just
>>>>>>>>> told me, perhaps my new
>>>>>>>>> bug is the same issue:
>>>>>>>>>
>>>>>>>>> http://llvm.org/bugs/show_bug.cgi?id=16640
>>>>>>>>>
>>>>>>>>> Do you happen to be using
>>>>>>>>> FastISel?
>>>>>>>>>
>>>>>>>>> Solomon
>>>>>>>>>
>>>>>>>>> On Jul 16, 2013, at 6:39
>>>>>>>>> PM, Peter Newman
>>>>>>>>> <peter at uformia.com
>>>>>>>>> <mailto:peter at uformia.com>> wrote:
>>>>>>>>>
>>>>>>>>> Hello all,
>>>>>>>>>
>>>>>>>>> I'm currently in the
>>>>>>>>> process of debugging a
>>>>>>>>> crash occurring in our
>>>>>>>>> program. In LLVM 3.2
>>>>>>>>> and 3.3 it appears
>>>>>>>>> that JIT generated
>>>>>>>>> code is attempting to access
>>>>>>>>> unaligned memory with
>>>>>>>>> an SSE2 instruction.
>>>>>>>>> However, this only
>>>>>>>>> happens under certain
>>>>>>>>> conditions that seem
>>>>>>>>> (but may not be)
>>>>>>>>> related to the stack's
>>>>>>>>> state on calling the
>>>>>>>>> function.
>>>>>>>>>
>>>>>>>>> Our program acts as a
>>>>>>>>> front-end, using the
>>>>>>>>> LLVM C++ API to
>>>>>>>>> generate a JIT
>>>>>>>>> generated function.
>>>>>>>>> This function is
>>>>>>>>> primarily
>>>>>>>>> mathematical, so we
>>>>>>>>> use the Vector types
>>>>>>>>> to take advantage of
>>>>>>>>> SIMD instructions (as
>>>>>>>>> well as a few SSE2
>>>>>>>>> intrinsics).
>>>>>>>>>
>>>>>>>>> This worked in LLVM
>>>>>>>>> 2.8 but started
>>>>>>>>> failing in 3.2 and has
>>>>>>>>> continued to fail in
>>>>>>>>> 3.3. It fails with no
>>>>>>>>> optimizations applied
>>>>>>>>> to the LLVM
>>>>>>>>> Function/Module. It
>>>>>>>>> crashes with what is
>>>>>>>>> reported as a memory
>>>>>>>>> access error
>>>>>>>>> (accessing
>>>>>>>>> 0xffffffff), however
>>>>>>>>> it's suggested that
>>>>>>>>> this is how the SSE
>>>>>>>>> fault raising
>>>>>>>>> mechanism appears.
>>>>>>>>>
>>>>>>>>> The generated
>>>>>>>>> instruction varies,
>>>>>>>>> but it seems to often
>>>>>>>>> be similar to (I don't
>>>>>>>>> have it in front of
>>>>>>>>> me, sorry):
>>>>>>>>> movapd xmm0,
>>>>>>>>> xmm[ecx+0x???????]
>>>>>>>>> Where the xmm register
>>>>>>>>> changes, and the
>>>>>>>>> second parameter is a
>>>>>>>>> memory access.
>>>>>>>>> ECX is always set to
>>>>>>>>> 0x7fffffff - however I
>>>>>>>>> don't know if this is
>>>>>>>>> part of the SSE error
>>>>>>>>> reporting process or
>>>>>>>>> is part of the
>>>>>>>>> situation causing the
>>>>>>>>> error.
>>>>>>>>>
>>>>>>>>> I haven't worked out
>>>>>>>>> exactly what code path
>>>>>>>>> etc is causing this
>>>>>>>>> crash. I'm hoping that
>>>>>>>>> someone can tell me if
>>>>>>>>> there were any changed
>>>>>>>>> requirements for
>>>>>>>>> working with SIMD in
>>>>>>>>> LLVM 3.2 (or earlier,
>>>>>>>>> we haven't tried 3.0
>>>>>>>>> or 3.1). I currently
>>>>>>>>> suspect the use of
>>>>>>>>> GlobalVariable (we
>>>>>>>>> first discovered the
>>>>>>>>> crash when using a
>>>>>>>>> feature that uses
>>>>>>>>> them), however I have
>>>>>>>>> attempted using
>>>>>>>>> setAlignment on the
>>>>>>>>> GlobalVariables
>>>>>>>>> without any change.
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Peter N
>>>>>>>>> _______________________________________________
>>>>>>>>> LLVM Developers
>>>>>>>>> mailing list
>>>>>>>>> LLVMdev at cs.uiuc.edu
>>>>>>>>> <mailto:LLVMdev at cs.uiuc.edu>
>>>>>>>>> http://llvm.cs.uiuc.edu
>>>>>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> ~Craig