[LLVMdev] fptoui calling a function that modifies ECX
Peter Newman
peter at uformia.com
Fri Jul 19 02:34:30 PDT 2013
That does appear to have worked. All my tests are passing now.
I'll hand this out to our other devs & testers and make sure it's
working for them as well (not just on my machine).
Thank you, again.
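For anyone reading the archive later, here is a rough sketch of why the missing ECX entry in the Defs list quoted below mattered. It is a toy C++ model, not LLVM code: the names Reg, helper_call and value_after_call are made up for illustration, EFLAGS is not modelled, and a real register allocator would normally pick a different register rather than save and reload. The helper stands in for the routine at 76719BA1, which writes ECX as well as EAX and EDX.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <initializer_list>

// Hypothetical register set for this sketch only.
enum Reg { EAX, ECX, EDX, EBX };
constexpr std::size_t kNumRegs = 4;
using RegFile = std::array<std::uint32_t, kNumRegs>;

// Stand-in for the helper at 76719BA1: overwrites EAX, EDX and ECX.
// 0x7fffffff is the ECX value reported in this thread; the rest are dummies.
void helper_call(RegFile& regs) {
    regs[EAX] = 0;
    regs[EDX] = 0;
    regs[ECX] = 0x7fffffffu;
}

// Toy code generator: a value lives in `live` across the call. It is saved
// and reloaded only if the call is *declared* to write that register.
std::uint32_t value_after_call(Reg live, std::uint32_t value,
                               std::initializer_list<Reg> declared_defs) {
    RegFile regs{};
    regs[live] = value;

    bool believed_clobbered = false;
    for (Reg def : declared_defs) {
        if (def == live) believed_clobbered = true;
    }

    const std::uint32_t saved = value;  // spill slot
    helper_call(regs);
    if (believed_clobbered) regs[live] = saved;  // reload after the call
    return regs[live];
}
```

With declared defs of {EAX, EDX}, a value kept in ECX comes back as 0x7fffffff; with ECX added to the list, the original value survives the call.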
--
Peter N
On 19/07/2013 5:45 PM, Craig Topper wrote:
> I don't think that's going to work.
>
>
> On Fri, Jul 19, 2013 at 12:24 AM, Peter Newman <peter at uformia.com> wrote:
>
> Thank you, I'm trying this now.
>
>
> On 19/07/2013 5:23 PM, Craig Topper wrote:
>> Try adding ECX to the Defs of this part of
>> lib/Target/X86/X86InstrCompiler.td like I've done below. I don't
>> have a Windows machine to test myself.
>>
>> let Defs = [EAX, EDX, ECX, EFLAGS], FPForm = SpecialFP in {
>>   def WIN_FTOL_32 : I<0, Pseudo, (outs), (ins RFP32:$src),
>>                       "# win32 fptoui",
>>                       [(X86WinFTOL RFP32:$src)]>,
>>                     Requires<[In32BitMode]>;
>>
>>   def WIN_FTOL_64 : I<0, Pseudo, (outs), (ins RFP64:$src),
>>                       "# win32 fptoui",
>>                       [(X86WinFTOL RFP64:$src)]>,
>>                     Requires<[In32BitMode]>;
>> }
>>
>>
>> On Thu, Jul 18, 2013 at 11:59 PM, Peter Newman <peter at uformia.com> wrote:
>>
>> Oh, excellent point, I agree. My bad. Now that I'm not
>> assuming those are the sqrt, I see the sqrtpd's in the
>> output. Also there are three fptoui's and there are 3 call
>> instances.
>>
>> (Changing subject line again.)
>>
>> Now it looks like it's bug #13862
>>
>> On 19/07/2013 4:51 PM, Craig Topper wrote:
>>> I think those calls correspond to this
>>>
>>> %110 = fptoui double %109 to i32
>>>
>>> The calls are followed by an imul with 12 which matches up
>>> with what occurs right after the fptoui in the IR.
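A rough C++ sketch of what that fptoui has to compute, assuming the helper goes through a 64-bit signed intermediate (the fistp qword in the listing further down suggests this, but I have not confirmed the helper's exact rounding or out-of-range behaviour). The function name is made up for illustration.

```cpp
#include <cstdint>

// Sketch only: convert a double to an unsigned 32-bit value by going through
// a signed 64-bit integer and then keeping the low 32 bits.
// Assumes x is finite and fits in the int64 range; otherwise the first cast
// is undefined behaviour in C++.
std::uint32_t fptoui_via_i64(double x) {
    const std::int64_t wide = static_cast<std::int64_t>(x);  // truncates toward zero
    return static_cast<std::uint32_t>(wide);                 // keep the low 32 bits
}
```

Going through the wider signed type is what lets values above INT32_MAX (for example 3000000000.0) convert correctly.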
>>>
>>>
>>> On Thu, Jul 18, 2013 at 11:48 PM, Peter Newman
>>> <peter at uformia.com> wrote:
>>>
>>> Yes, that is the result of module-dump.ll
>>>
>>>
>>> On 19/07/2013 4:46 PM, Craig Topper wrote:
>>>> Does this correspond to one of the .ll files you sent
>>>> earlier?
>>>>
>>>>
>>>> On Thu, Jul 18, 2013 at 11:34 PM, Peter Newman
>>>> <peter at uformia.com> wrote:
>>>>
>>>> (Changing subject line as diagnosis has changed)
>>>>
>>>> I'm attaching the compiled code that I've been
>>>> getting, both with CodeGenOpt::Default and
>>>> CodeGenOpt::None. The crash isn't occurring with
>>>> CodeGenOpt::None, but that seems to be because ECX
>>>> isn't being used - it still gets set to 0x7fffffff
>>>> by one of the calls to 76719BA1.
>>>>
>>>> I notice that X86::SQRTPD[m|r] appear in
>>>> X86InstrInfo::isHighLatencyDef. I was thinking an
>>>> optimization might be removing it, but I don't get
>>>> the sqrtpd instruction even with the createJIT
>>>> optimization level turned off.
>>>>
>>>> I am trying this with the Release 3.3 code - I'll
>>>> try it with trunk and see if I get a different
>>>> result there. Maybe there was a recent commit for this.
>>>>
>>>> --
>>>> Peter N
>>>>
>>>> On 19/07/2013 4:00 PM, Craig Topper wrote:
>>>>> Hmm, I'm not able to get those .ll files to
>>>>> compile if I disable SSE and I end up with SSE
>>>>> instructions (including sqrtpd) if I don't disable it.
>>>>>
>>>>>
>>>>> On Thu, Jul 18, 2013 at 10:53 PM, Peter Newman
>>>>> <peter at uformia.com> wrote:
>>>>>
>>>>> Is there something specifically required to
>>>>> enable SSE? If it's not detected as available
>>>>> (based on the target triple?) then I don't
>>>>> think we enable it specifically.
>>>>>
>>>>> Also it seems that it should handle converting
>>>>> to/from the vector types, although I can see
>>>>> it getting confused about needing to do that
>>>>> if it thinks SSE isn't available at all.
>>>>>
>>>>>
>>>>> On 19/07/2013 3:47 PM, Craig Topper wrote:
>>>>>> Hmm, maybe SSE isn't being enabled so it's
>>>>>> falling back to emulating sqrt?
>>>>>>
>>>>>>
>>>>>> On Thu, Jul 18, 2013 at 10:45 PM, Peter
>>>>>> Newman <peter at uformia.com> wrote:
>>>>>>
>>>>>> In the disassembly, I'm seeing three cases of
>>>>>> call 76719BA1
>>>>>>
>>>>>> I am assuming this is the sqrt function
>>>>>> as this is the only function called in
>>>>>> the LLVM IR.
>>>>>>
>>>>>> The code at 76719BA1 is:
>>>>>>
>>>>>> 76719BA1 push ebp
>>>>>> 76719BA2 mov ebp,esp
>>>>>> 76719BA4 sub esp,20h
>>>>>> 76719BA7 and esp,0FFFFFFF0h
>>>>>> 76719BAA fld st(0)
>>>>>> 76719BAC fst dword ptr [esp+18h]
>>>>>> 76719BB0 fistp qword ptr [esp+10h]
>>>>>> 76719BB4 fild qword ptr [esp+10h]
>>>>>> 76719BB8 mov edx,dword ptr [esp+18h]
>>>>>> 76719BBC mov eax,dword ptr [esp+10h]
>>>>>> 76719BC0 test eax,eax
>>>>>> 76719BC2 je 76719DCF
>>>>>> 76719BC8 fsubp st(1),st
>>>>>> 76719BCA test edx,edx
>>>>>> 76719BCC js 7671F9DB
>>>>>> 76719BD2 fstp dword ptr [esp]
>>>>>> 76719BD5 mov ecx,dword ptr [esp]
>>>>>> 76719BD8 add ecx,7FFFFFFFh
>>>>>> 76719BDE sbb eax,0
>>>>>> 76719BE1 mov edx,dword ptr [esp+14h]
>>>>>> 76719BE5 sbb edx,0
>>>>>> 76719BE8 leave
>>>>>> 76719BE9 ret
>>>>>>
>>>>>>
>>>>>> As you can see at 76719BD5, it modifies ECX.
>>>>>>
>>>>>> I don't know that this is the sqrtpd
>>>>>> function (for example, I'm not seeing any
>>>>>> SSE instructions here?) but whatever it
>>>>>> is, it's being called from the IR I
>>>>>> attached earlier, and is modifying ECX
>>>>>> under some circumstances.
>>>>>>
>>>>>>
>>>>>> On 19/07/2013 3:29 PM, Craig Topper wrote:
>>>>>>> That should map directly to sqrtpd which
>>>>>>> can't modify ecx.
>>>>>>>
>>>>>>>
>>>>>>> On Thu, Jul 18, 2013 at 10:27 PM, Peter
>>>>>>> Newman <peter at uformia.com> wrote:
>>>>>>>
>>>>>>> Sorry, that should have been
>>>>>>> llvm.x86.sse2.sqrt.pd
>>>>>>>
>>>>>>>
>>>>>>> On 19/07/2013 3:25 PM, Craig Topper
>>>>>>> wrote:
>>>>>>>> What is "frep.x86.sse2.sqrt.pd"?
>>>>>>>> I'm only familiar with things
>>>>>>>> prefixed with "llvm.x86".
>>>>>>>>
>>>>>>>>
>>>>>>>> On Thu, Jul 18, 2013 at 10:12 PM,
>>>>>>>> Peter Newman <peter at uformia.com> wrote:
>>>>>>>>
>>>>>>>> After stepping through the
>>>>>>>> produced assembly, I believe I
>>>>>>>> have a culprit.
>>>>>>>>
>>>>>>>> One of the calls to
>>>>>>>> @frep.x86.sse2.sqrt.pd is
>>>>>>>> modifying the value of ECX -
>>>>>>>> while the produced code is
>>>>>>>> expecting it to still contain
>>>>>>>> its previous value.
>>>>>>>>
>>>>>>>> Peter N
>>>>>>>>
>>>>>>>>
>>>>>>>> On 19/07/2013 2:09 PM, Peter
>>>>>>>> Newman wrote:
>>>>>>>>> I've attached the
>>>>>>>>> module->dump() that our code
>>>>>>>>> is producing. Unfortunately
>>>>>>>>> this is the smallest test case
>>>>>>>>> I have available.
>>>>>>>>>
>>>>>>>>> This is before any
>>>>>>>>> optimization passes are
>>>>>>>>> applied. There are two
>>>>>>>>> separate modules in existence
>>>>>>>>> at the time, and there are no
>>>>>>>>> guarantees about the order the
>>>>>>>>> surrounding code calls those
>>>>>>>>> functions, so there may be
>>>>>>>>> some interaction between them?
>>>>>>>>> There shouldn't be, they don't
>>>>>>>>> refer to any common memory
>>>>>>>>> etc. There is no
>>>>>>>>> multi-threading occurring.
>>>>>>>>>
>>>>>>>>> The function in module-dump.ll
>>>>>>>>> (called crashfunc in this
>>>>>>>>> file) is called with
>>>>>>>>> - func_params 0x0018f3b0 double [3]
>>>>>>>>>     [0x0] -11.339976634695301 double
>>>>>>>>>     [0x1] -9.7504239056205506 double
>>>>>>>>>     [0x2] -5.2900856817382804 double
>>>>>>>>> at the time of the exception.
>>>>>>>>>
>>>>>>>>> This is compiled on a
>>>>>>>>> "i686-pc-win32" triple. All of
>>>>>>>>> the non-intrinsic functions
>>>>>>>>> referred to in these modules
>>>>>>>>> are the standard equivalents
>>>>>>>>> from the MSVC library (e.g.
>>>>>>>>> @asin is the standard C lib
>>>>>>>>> double asin( double ) ).
>>>>>>>>>
>>>>>>>>> Hopefully this is reproducible
>>>>>>>>> for you.
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> PeterN
>>>>>>>>>
>>>>>>>>> On 18/07/2013 4:37 PM, Craig
>>>>>>>>> Topper wrote:
>>>>>>>>>> Are you able to send any IR
>>>>>>>>>> for others to reproduce this
>>>>>>>>>> issue?
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Wed, Jul 17, 2013 at 11:23
>>>>>>>>>> PM, Peter Newman
>>>>>>>>>> <peter at uformia.com>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>> Unfortunately, this
>>>>>>>>>> doesn't appear to be the
>>>>>>>>>> bug I'm hitting. I
>>>>>>>>>> applied the fix to my
>>>>>>>>>> source and it didn't make
>>>>>>>>>> a difference.
>>>>>>>>>>
>>>>>>>>>> Also further testing
>>>>>>>>>> found me getting the same
>>>>>>>>>> behavior with other SIMD
>>>>>>>>>> instructions. The common
>>>>>>>>>> factor is that, in each case,
>>>>>>>>>> ECX is set to 0x7fffffff,
>>>>>>>>>> and it's an operation
>>>>>>>>>> using xmm ptr ecx+offset.
>>>>>>>>>>
>>>>>>>>>> Additionally, turning the
>>>>>>>>>> optimization level passed
>>>>>>>>>> to createJIT down appears
>>>>>>>>>> to avoid it, so I'm now
>>>>>>>>>> leaning towards a bug in
>>>>>>>>>> one of the optimization
>>>>>>>>>> passes.
>>>>>>>>>>
>>>>>>>>>> I'm going to dig through
>>>>>>>>>> the passes controlled by
>>>>>>>>>> that parameter and see if
>>>>>>>>>> I can narrow down which
>>>>>>>>>> optimization is causing it.
>>>>>>>>>>
>>>>>>>>>> Peter N
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 17/07/2013 1:58 PM,
>>>>>>>>>> Solomon Boulos wrote:
>>>>>>>>>>
>>>>>>>>>> As someone off list
>>>>>>>>>> just told me, perhaps
>>>>>>>>>> my new bug is the
>>>>>>>>>> same issue:
>>>>>>>>>>
>>>>>>>>>> http://llvm.org/bugs/show_bug.cgi?id=16640
>>>>>>>>>>
>>>>>>>>>> Do you happen to be
>>>>>>>>>> using FastISel?
>>>>>>>>>>
>>>>>>>>>> Solomon
>>>>>>>>>>
>>>>>>>>>> On Jul 16, 2013, at
>>>>>>>>>> 6:39 PM, Peter Newman
>>>>>>>>>> <peter at uformia.com>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>> Hello all,
>>>>>>>>>>
>>>>>>>>>> I'm currently in
>>>>>>>>>> the process of
>>>>>>>>>> debugging a crash
>>>>>>>>>> occurring in our
>>>>>>>>>> program. In LLVM
>>>>>>>>>> 3.2 and 3.3 it
>>>>>>>>>> appears that JIT
>>>>>>>>>> generated code is
>>>>>>>>>> attempting to
>>>>>>>>>> access
>>>>>>>>>> unaligned memory
>>>>>>>>>> with a SSE2
>>>>>>>>>> instruction.
>>>>>>>>>> However this only
>>>>>>>>>> happens under
>>>>>>>>>> certain
>>>>>>>>>> conditions that
>>>>>>>>>> seem (but may not
>>>>>>>>>> be) related to
>>>>>>>>>> the stack's state
>>>>>>>>>> on calling the
>>>>>>>>>> function.
>>>>>>>>>>
>>>>>>>>>> Our program acts
>>>>>>>>>> as a front-end,
>>>>>>>>>> using the LLVM
>>>>>>>>>> C++ API to
>>>>>>>>>> generate a JIT
>>>>>>>>>> generated
>>>>>>>>>> function. This
>>>>>>>>>> function is
>>>>>>>>>> primarily
>>>>>>>>>> mathematical, so
>>>>>>>>>> we use the Vector
>>>>>>>>>> types to take
>>>>>>>>>> advantage of SIMD
>>>>>>>>>> instructions (as
>>>>>>>>>> well as a few
>>>>>>>>>> SSE2 intrinsics).
>>>>>>>>>>
>>>>>>>>>> This worked in
>>>>>>>>>> LLVM 2.8 but
>>>>>>>>>> started failing
>>>>>>>>>> in 3.2 and has
>>>>>>>>>> continued to fail
>>>>>>>>>> in 3.3. It fails
>>>>>>>>>> with no
>>>>>>>>>> optimizations
>>>>>>>>>> applied to the
>>>>>>>>>> LLVM
>>>>>>>>>> Function/Module.
>>>>>>>>>> It crashes with
>>>>>>>>>> what is reported
>>>>>>>>>> as a memory
>>>>>>>>>> access error
>>>>>>>>>> (accessing
>>>>>>>>>> 0xffffffff),
>>>>>>>>>> however it's
>>>>>>>>>> suggested that
>>>>>>>>>> this is how the
>>>>>>>>>> SSE fault raising
>>>>>>>>>> mechanism appears.
>>>>>>>>>>
>>>>>>>>>> The generated
>>>>>>>>>> instruction
>>>>>>>>>> varies, but it
>>>>>>>>>> seems to often be
>>>>>>>>>> similar to (I
>>>>>>>>>> don't have it in
>>>>>>>>>> front of me, sorry):
>>>>>>>>>> movapd xmm0,
>>>>>>>>>> xmm[ecx+0x???????]
>>>>>>>>>> Where the xmm
>>>>>>>>>> register changes,
>>>>>>>>>> and the second
>>>>>>>>>> parameter is a
>>>>>>>>>> memory access.
>>>>>>>>>> ECX is always set
>>>>>>>>>> to 0x7fffffff -
>>>>>>>>>> however I don't
>>>>>>>>>> know if this is
>>>>>>>>>> part of the SSE
>>>>>>>>>> error reporting
>>>>>>>>>> process or is
>>>>>>>>>> part of the
>>>>>>>>>> situation causing
>>>>>>>>>> the error.
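A small sketch of the arithmetic here, assuming the usual rule that movapd needs a 16-byte aligned memory operand and that the offsets the JIT emits are themselves multiples of 16 (I have not checked every offset). The function name is made up for illustration. With ECX at 0x7fffffff no such offset can produce an aligned address, so a clobbered ECX alone would be enough to explain the fault.

```cpp
#include <cstdint>

// Returns true when `address` is a multiple of 16, i.e. when it would be an
// acceptable memory operand for an aligned SSE2 load such as movapd.
constexpr bool is_aligned16(std::uint64_t address) {
    return (address & 0xFu) == 0;
}
```

For example, 0x7fffffff + 0x20 has low four bits 0xF, so the check fails, while a base such as 0x1000 passes.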
>>>>>>>>>>
>>>>>>>>>> I haven't worked
>>>>>>>>>> out exactly what
>>>>>>>>>> code path etc. is
>>>>>>>>>> causing this
>>>>>>>>>> crash. I'm hoping
>>>>>>>>>> that someone can
>>>>>>>>>> tell me if there
>>>>>>>>>> were any changed
>>>>>>>>>> requirements for
>>>>>>>>>> working with SIMD
>>>>>>>>>> in LLVM 3.2 (or
>>>>>>>>>> earlier, we
>>>>>>>>>> haven't tried 3.0
>>>>>>>>>> or 3.1). I
>>>>>>>>>> currently suspect
>>>>>>>>>> the use of
>>>>>>>>>> GlobalVariable
>>>>>>>>>> (we first
>>>>>>>>>> discovered the
>>>>>>>>>> crash when using
>>>>>>>>>> a feature that
>>>>>>>>>> uses them),
>>>>>>>>>> however I have
>>>>>>>>>> attempted using
>>>>>>>>>> setAlignment on
>>>>>>>>>> the
>>>>>>>>>> GlobalVariables
>>>>>>>>>> without any change.
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> Peter N
>>>>>>>>>> _______________________________________________
>>>>>>>>>> LLVM Developers
>>>>>>>>>> mailing list
>>>>>>>>>> LLVMdev at cs.uiuc.edu
>>>>>>>>>> http://llvm.cs.uiuc.edu
>>>>>>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> ~Craig