<div dir="ltr">Hmm, maybe SSE isn't being enabled, so it's falling back to emulating sqrt?</div><div class="gmail_extra"><br><br><div class="gmail_quote">On Thu, Jul 18, 2013 at 10:45 PM, Peter Newman <span dir="ltr"><<a href="mailto:peter@uformia.com" target="_blank">peter@uformia.com</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
  
    
  
  <div text="#000000" bgcolor="#FFFFFF">
    <div>In the disassembly, I'm seeing three
      cases of<br>
      call        76719BA1<br>
      <br>
      I am assuming this is the sqrt function as this is the only
      function called in the LLVM IR.<br>
      <br>
      The code at 76719BA1 is:<br>
      <br>
      76719BA1  push        ebp  <br>
      76719BA2  mov         ebp,esp <br>
      76719BA4  sub         esp,20h <br>
      76719BA7  and         esp,0FFFFFFF0h <br>
      76719BAA  fld         st(0) <br>
      76719BAC  fst         dword ptr [esp+18h] <br>
      76719BB0  fistp       qword ptr [esp+10h] <br>
      76719BB4  fild        qword ptr [esp+10h] <br>
      76719BB8  mov         edx,dword ptr [esp+18h] <br>
      76719BBC  mov         eax,dword ptr [esp+10h] <br>
      76719BC0  test        eax,eax <br>
      76719BC2  je          76719DCF <br>
      76719BC8  fsubp       st(1),st <br>
      76719BCA  test        edx,edx <br>
      76719BCC  js          7671F9DB <br>
      76719BD2  fstp        dword ptr [esp] <br>
      76719BD5  mov         ecx,dword ptr [esp] <br>
      76719BD8  add         ecx,7FFFFFFFh <br>
      76719BDE  sbb         eax,0 <br>
      76719BE1  mov         edx,dword ptr [esp+14h] <br>
      76719BE5  sbb         edx,0 <br>
      76719BE8  leave            <br>
      76719BE9  ret              <br>
      <br>
      <br>
      As you can see at 76719BD5, it modifies ECX.<br>
      <br>
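      A rough sketch of what the fall-through path at 76719BD2 to 76719BD8 appears to do with ECX, written as portable C++ rather than x87 code. The names AddResult and simulate_ecx are made up for illustration, and it assumes the value stored by fstp is the input minus its rounded integer value (which is how the fsubp before it reads). Only the path shown in the listing is modelled, not the two branch targets:<br>

```cpp
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == 4, "sketch assumes a 32-bit IEEE-754 float");

// ECX after the add at 76719BD8, plus the carry flag that the
// following `sbb eax,0` consumes. Hypothetical helper type.
struct AddResult {
    std::uint32_t ecx;
    bool carry;
};

// Models these three instructions from the listing:
//   fstp dword ptr [esp]      ; store the remainder as a 32-bit float
//   mov  ecx,dword ptr [esp]  ; load its raw bit pattern into ECX
//   add  ecx,7FFFFFFFh        ; 32-bit add, sets the carry flag on overflow
AddResult simulate_ecx(float remainder) {
    std::uint32_t bits = 0;
    std::memcpy(&bits, &remainder, sizeof bits);  // reinterpret the float's bits
    const std::uint64_t sum = static_cast<std::uint64_t>(bits) + 0x7FFFFFFFull;
    return AddResult{static_cast<std::uint32_t>(sum), (sum >> 32) != 0};
}
```

      For example, simulate_ecx(0.0f).ecx is 0x7FFFFFFF with no carry, which would match the ECX value seen at the crash whenever the input is already a whole number.<br>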
      I can't confirm that this is the sqrtpd function (for example, I'm
      not seeing any SSE instructions here), but whatever it is, it's
      being called from the IR I attached earlier, and it modifies ECX
      under some circumstances.<div><div class="h5"><br>
      <br>
      On 19/07/2013 3:29 PM, Craig Topper wrote:<br>
    </div></div></div><div><div class="h5">
    <blockquote type="cite">
      <div dir="ltr">That should map directly to sqrtpd which can't
        modify ecx.</div>
      <div class="gmail_extra"><br>
        <br>
        <div class="gmail_quote">On Thu, Jul 18, 2013 at 10:27 PM, Peter
          Newman <span dir="ltr"><<a href="mailto:peter@uformia.com" target="_blank">peter@uformia.com</a>></span>
          wrote:<br>
          <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
            <div text="#000000" bgcolor="#FFFFFF">
              <div>Sorry, that should have been llvm.x86.sse2.sqrt.pd
                <div>
                  <div><br>
                    <br>
                    On 19/07/2013 3:25 PM, Craig Topper wrote:<br>
                  </div>
                </div>
              </div>
              <div>
                <div>
                  <blockquote type="cite">
                    <div dir="ltr">What is "frep.x86.sse2.sqrt.pd"? I'm
                      only familiar with things prefixed with
                      "llvm.x86".</div>
                    <div class="gmail_extra"><br>
                      <br>
                      <div class="gmail_quote">On Thu, Jul 18, 2013 at
                        10:12 PM, Peter Newman <span dir="ltr"><<a href="mailto:peter@uformia.com" target="_blank">peter@uformia.com</a>></span>
                        wrote:<br>
                        <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
                          <div text="#000000" bgcolor="#FFFFFF">
                            <div>After stepping through the produced
                              assembly, I believe I have a culprit.<br>
                              <br>
                              One of the calls to @frep.x86.sse2.sqrt.pd
                              is modifying the value of ECX - while the
                              produced code is expecting it to still
                              contain its previous value.<br>
                              <br>
                              Peter N
                              <div>
                                <div><br>
                                  <br>
                                  On 19/07/2013 2:09 PM, Peter Newman
                                  wrote:<br>
                                </div>
                              </div>
                            </div>
                            <div>
                              <div>
                                <blockquote type="cite">
                                  <div>I've attached the
                                    module->dump() that our code is
                                    producing. Unfortunately this is the
                                    smallest test case I have available.<br>
                                    <br>
                                    This is before any optimization
                                    passes are applied. There are two
                                    separate modules in existence at the
                                    time, and there are no guarantees
                                    about the order in which the
                                    surrounding code calls those
                                    functions, so there may be some
                                    interaction between them. There
                                    shouldn't be, since they don't refer
                                    to any common memory. There is
                                    no multi-threading occurring.<br>
                                    <br>
                                    The function in module-dump.ll
                                    (called crashfunc in this file) is
                                    called with<br>
                                    func_params    0x0018f3b0    double [3]<br>
                                    [0x0]    -11.339976634695301    double<br>
                                    [0x1]    -9.7504239056205506    double<br>
                                    [0x2]    -5.2900856817382804    double<br>
                                    at the time of the exception.<br>
                                    <br>
                                    This is compiled for the
                                    "i686-pc-win32" triple. All of the
                                    non-intrinsic functions referred to
                                    in these modules are the standard
                                    equivalents from the MSVC library
                                    (e.g. @asin is the standard C library
                                    double asin(double)).<br>
                                    <br>
                                    Hopefully this is reproducible for
                                    you.<br>
                                    <br>
                                    --<br>
                                    PeterN<br>
                                    <br>
                                    On 18/07/2013 4:37 PM, Craig Topper
                                    wrote:<br>
                                  </div>
                                  <blockquote type="cite">
                                    <div dir="ltr">Are you able to send
                                      any IR for others to reproduce
                                      this issue?</div>
                                    <div class="gmail_extra"><br>
                                      <br>
                                      <div class="gmail_quote">On Wed,
                                        Jul 17, 2013 at 11:23 PM, Peter
                                        Newman <span dir="ltr"><<a href="mailto:peter@uformia.com" target="_blank">peter@uformia.com</a>></span>
                                        wrote:<br>
                                        <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Unfortunately,

                                          this doesn't appear to be the
                                          bug I'm hitting. I applied the
                                          fix to my source and it didn't
                                          make a difference.<br>
                                          <br>
                                          Also, further testing shows
                                          the same behavior with
                                          other SIMD instructions. The
                                          common factor is that in each
                                          case ECX is set to 0x7fffffff,
                                          and it's an operation using xmm
                                          ptr ecx+offset.<br>
                                          <br>
                                          Additionally, turning the
                                          optimization level passed to
                                          createJIT down appears to
                                          avoid it, so I'm now leaning
                                          towards a bug in one of the
                                          optimization passes.<br>
                                          <br>
                                          I'm going to dig through the
                                          passes controlled by that
                                          parameter and see if I can
                                          narrow down which optimization
                                          is causing it.<br>
                                          <br>
                                          Peter N
                                          <div>
                                            <div><br>
                                              <br>
                                              On 17/07/2013 1:58 PM,
                                              Solomon Boulos wrote:<br>
                                              <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
                                                As someone off list just
                                                told me, perhaps my new
                                                bug is the same issue:<br>
                                                <br>
                                                   <a href="http://llvm.org/bugs/show_bug.cgi?id=16640" target="_blank">http://llvm.org/bugs/show_bug.cgi?id=16640</a><br>
                                                <br>
                                                Do you happen to be
                                                using FastISel?<br>
                                                <br>
                                                Solomon<br>
                                                <br>
                                                On Jul 16, 2013, at 6:39
                                                PM, Peter Newman <<a href="mailto:peter@uformia.com" target="_blank">peter@uformia.com</a>>



                                                wrote:<br>
                                                <br>
                                                <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
                                                  Hello all,<br>
                                                  <br>
                                                  I'm currently in the
                                                  process of debugging a
                                                  crash occurring in our
                                                  program. In LLVM 3.2
                                                  and 3.3 it appears
                                                  that JIT generated
                                                  code is attempting to
                                                  access unaligned
                                                  memory with an SSE2
                                                  instruction.
                                                  However, this only
                                                  happens under certain
                                                  conditions that seem
                                                  (but may not be)
                                                  related to the stack's
                                                  state on calling the
                                                  function.<br>
                                                  <br>
                                                  Our program acts as a
                                                  front-end, using the
                                                  LLVM C++ API to
                                                  generate a JIT
                                                  generated function.
                                                  This function is
                                                  primarily
                                                  mathematical, so we
                                                  use the Vector types
                                                  to take advantage of
                                                  SIMD instructions (as
                                                  well as a few SSE2
                                                  intrinsics).<br>
                                                  <br>
                                                  This worked in LLVM
                                                  2.8 but started
                                                  failing in 3.2 and has
                                                  continued to fail in
                                                  3.3. It fails with no
                                                  optimizations applied
                                                  to the LLVM
                                                  Function/Module. It
                                                  crashes with what is
                                                  reported as a memory
                                                  access error
                                                  (accessing
                                                  0xffffffff); however,
                                                  it has been suggested
                                                  that this is how an
                                                  SSE fault shows up
                                                  when it is raised.<br>
                                                  <br>
                                                  The generated
                                                  instruction varies,
                                                  but it seems to often
                                                  be similar to (I don't
                                                  have it in front of
                                                  me, sorry):<br>
                                                  movapd xmm0,
                                                  xmm[ecx+0x???????]<br>
                                                  Where the xmm register
                                                  changes, and the
                                                  second parameter is a
                                                  memory access.<br>
                                                  ECX is always set to
                                                  0x7fffffff; however, I
                                                  don't know if this is
                                                  part of the SSE error
                                                  reporting process or
                                                  is part of the
                                                  situation causing the
                                                  error.<br>
                                                  <br>
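                                                  For reference, movapd faults when its memory operand is not 16-byte aligned. A minimal sketch of that check, assuming a 32-bit flat address formed as ECX plus a displacement (is_aligned16 and effective_address are illustrative names, not LLVM APIs):<br>

```cpp
#include <cstdint>

// True when the address satisfies the 16-byte alignment that movapd
// requires for a memory operand.
bool is_aligned16(std::uint32_t address) {
    return (address & 0xFu) == 0;
}

// Effective address of [ecx+disp] in 32-bit mode; unsigned arithmetic
// wraps modulo 2^32, as the hardware address calculation does.
std::uint32_t effective_address(std::uint32_t ecx, std::uint32_t disp) {
    return ecx + disp;
}
```

                                                  With an ECX value ending in 0xF, such as 0x7fffffff, and a displacement that is itself a multiple of 16, the effective address ends in 0xF, so is_aligned16 returns false.<br>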
                                                  I haven't worked out
                                                  exactly what code path
                                                  etc is causing this
                                                  crash. I'm hoping that
                                                  someone can tell me if
                                                  there were any changed
                                                  requirements for
                                                  working with SIMD in
                                                  LLVM 3.2 (or earlier,
                                                  we haven't tried 3.0
                                                  or 3.1). I currently
                                                  suspect the use of
                                                  GlobalVariable (we
                                                  first discovered the
                                                  crash when using a
                                                  feature that uses
                                                  them), however I have
                                                  attempted using
                                                  setAlignment on the
                                                  GlobalVariables
                                                  without any change.<br>
                                                  <br>
                                                  --<br>
                                                  Peter N<br>
_______________________________________________<br>
                                                  LLVM Developers
                                                  mailing list<br>
                                                  <a href="mailto:LLVMdev@cs.uiuc.edu" target="_blank">LLVMdev@cs.uiuc.edu</a>
                                                          <a href="http://llvm.cs.uiuc.edu" target="_blank">http://llvm.cs.uiuc.edu</a><br>
                                                  <a href="http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev" target="_blank">http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev</a><br>
                                                </blockquote>
                                              </blockquote>
                                              <br>
                                            </div>
                                          </div>
                                        </blockquote>
                                      </div>
                                      <br>
                                      <br clear="all">
                                      <div><br>
                                      </div>
                                      -- <br>
                                      ~Craig </div>
                                  </blockquote>
                                  <br>
                                </blockquote>
                                <br>
                              </div>
                            </div>
                          </div>
                        </blockquote>
                      </div>
                      <br>
                      <br clear="all">
                      <div><br>
                      </div>
                      -- <br>
                      ~Craig </div>
                  </blockquote>
                  <br>
                </div>
              </div>
            </div>
          </blockquote>
        </div>
        <br>
        <br clear="all">
        <div><br>
        </div>
        -- <br>
        ~Craig
      </div>
    </blockquote>
    <br>
  </div></div></div>

</blockquote></div><br><br clear="all"><div><br></div>-- <br>~Craig
</div>