<div dir="ltr">I think the calls are to the MSVC _ftol2 function.</div><div class="gmail_extra"><br><br><div class="gmail_quote">On Thu, Jul 18, 2013 at 11:34 PM, Peter Newman <span dir="ltr"><<a href="mailto:peter@uformia.com" target="_blank">peter@uformia.com</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
  
    
  
  <div text="#000000" bgcolor="#FFFFFF">
    <div>(Changing subject line as diagnosis has
      changed)<br>
      <br>
      I'm attaching the compiled code that I've been getting, both with
      CodeGenOpt::Default and CodeGenOpt::None. The crash isn't
      occurring with CodeGenOpt::None, but that seems to be because ECX
      isn't being used; it still gets set to 0x7fffffff by one of the
      calls to 76719BA1.<br>
      <br>
      I notice that X86::SQRTPD[m|r] appear in
      X86InstrInfo::isHighLatencyDef. I was thinking an optimization
      might be removing it, but I don't get the sqrtpd instruction even
      with the createJIT optimization level turned off.<br>
      <br>
      I am trying this with the Release 3.3 code - I'll try it with
      trunk and see if I get a different result there. Maybe there was a
      recent commit for this.<br>
      <br>
      --<br>
      Peter N<br>
      <br>
      On 19/07/2013 4:00 PM, Craig Topper wrote:<br>
    </div>
    <blockquote type="cite">
      <div dir="ltr">Hmm, I'm not able to get those .ll files to compile
        if I disable SSE, and I end up with SSE instructions (including
        sqrtpd) if I don't disable it.</div>
      <div class="gmail_extra"><br>
        <br>
        <div class="gmail_quote">
          On Thu, Jul 18, 2013 at 10:53 PM, Peter Newman <span dir="ltr"><<a href="mailto:peter@uformia.com" target="_blank">peter@uformia.com</a>></span>
          wrote:<br>
          <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
            <div text="#000000" bgcolor="#FFFFFF">
              <div>Is there something specifically required to enable
                SSE? If it's not detected as available (based on the
                target triple?) then I don't think we enable it
                specifically.<br>
                <br>
                Also it seems that it should handle converting to/from
                the vector types, although I can see it getting confused
                about needing to do that if it thinks SSE isn't
                available at all.
                <div>
                  <div><br>
                    <br>
                    On 19/07/2013 3:47 PM, Craig Topper wrote:<br>
                  </div>
                </div>
              </div>
              <div>
                <div>
                  <blockquote type="cite">
                    <div dir="ltr">Hmm, maybe SSE isn't being enabled, so
                      it's falling back to emulating sqrt?</div>
                    <div class="gmail_extra"><br>
                      <br>
                      <div class="gmail_quote">On Thu, Jul 18, 2013 at
                        10:45 PM, Peter Newman <span dir="ltr"><<a href="mailto:peter@uformia.com" target="_blank">peter@uformia.com</a>></span>
                        wrote:<br>
                        <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
                          <div text="#000000" bgcolor="#FFFFFF">
                            <div>In the disassembly, I'm seeing three
                              cases of<br>
                              call        76719BA1<br>
                              <br>
                              I am assuming this is the sqrt function as
                              this is the only function called in the
                              LLVM IR.<br>
                              <br>
                              The code at 76719BA1 is:<br>
                              <br>
                              76719BA1  push        ebp  <br>
                              76719BA2  mov         ebp,esp <br>
                              76719BA4  sub         esp,20h <br>
                              76719BA7  and         esp,0FFFFFFF0h <br>
                              76719BAA  fld         st(0) <br>
                              76719BAC  fst         dword ptr [esp+18h]
                              <br>
                              76719BB0  fistp       qword ptr [esp+10h]
                              <br>
                              76719BB4  fild        qword ptr [esp+10h]
                              <br>
                              76719BB8  mov         edx,dword ptr
                              [esp+18h] <br>
                              76719BBC  mov         eax,dword ptr
                              [esp+10h] <br>
                              76719BC0  test        eax,eax <br>
                              76719BC2  je          76719DCF <br>
                              76719BC8  fsubp       st(1),st <br>
                              76719BCA  test        edx,edx <br>
                              76719BCC  js          7671F9DB <br>
                              76719BD2  fstp        dword ptr [esp] <br>
                              76719BD5  mov         ecx,dword ptr [esp]
                              <br>
                              76719BD8  add         ecx,7FFFFFFFh <br>
                              76719BDE  sbb         eax,0 <br>
                              76719BE1  mov         edx,dword ptr
                              [esp+14h] <br>
                              76719BE5  sbb         edx,0 <br>
                              76719BE8  leave            <br>
                              76719BE9  ret              <br>
                              <br>
                              <br>
                              As you can see at 76719BD5, it modifies
                              ECX.<br>
                              <br>
                              I don't know that this is the sqrtpd
                              function (for example, I'm not seeing any
                              SSE instructions here?) but whatever it
                              is, it's being called from the IR I
                              attached earlier, and is modifying ECX
                              under some circumstances.
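                              <br>
                              <br>
                              A rough Python sketch of what the tail of that routine (76719BD2 to 76719BE5) appears to do on the non-negative path, based only on the listing above. The names float_bits and ftol_tail are made up for illustration, and the reading of the x87 steps (fsubp leaving the input minus its rounded value) is an assumption, not checked against the CRT source:<br>
<pre>
```python
import struct


def float_bits(x):
    # IEEE-754 single-precision bit pattern of x as an unsigned 32-bit integer.
    return int.from_bytes(struct.pack(">f", x), "big")


def ftol_tail(diff, lo, hi):
    # diff: value assumed to be left by fsubp (input minus its rounded value)
    # lo, hi: low and high dwords of the rounded integer (EAX and [esp+14h])
    total = float_bits(diff) + 0x7FFFFFFF   # add ecx,7FFFFFFFh
    ecx = total % 2**32                     # value left in ECX on return
    carry = total // 2**32                  # carry flag after the add
    borrow = 1 if (lo == 0 and carry == 1) else 0
    eax = (lo - carry) % 2**32              # sbb eax,0
    edx = (hi - borrow) % 2**32             # sbb edx,0
    return eax, ecx, edx
```
</pre>
                              Under those assumptions, ftol_tail(0.0, 4, 0) leaves ECX at 0x7fffffff, which would match the value seen in the debugger whenever the converted number is already a whole number. ftol_tail(-0.5, 4, 0) returns EAX = 3, i.e. the add/sbb pair corrects a round-up back to truncation.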
                              <div>
                                <div><br>
                                  <br>
                                  On 19/07/2013 3:29 PM, Craig Topper
                                  wrote:<br>
                                </div>
                              </div>
                            </div>
                            <div>
                              <div>
                                <blockquote type="cite">
                                  <div dir="ltr">That should map
                                    directly to sqrtpd which can't
                                    modify ecx.</div>
                                  <div class="gmail_extra"><br>
                                    <br>
                                    <div class="gmail_quote">On Thu, Jul
                                      18, 2013 at 10:27 PM, Peter Newman
                                      <span dir="ltr"><<a href="mailto:peter@uformia.com" target="_blank">peter@uformia.com</a>></span>
                                      wrote:<br>
                                      <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
                                        <div text="#000000" bgcolor="#FFFFFF">
                                          <div>Sorry, that should have
                                            been llvm.x86.sse2.sqrt.pd
                                            <div>
                                              <div><br>
                                                <br>
                                                On 19/07/2013 3:25 PM,
                                                Craig Topper wrote:<br>
                                              </div>
                                            </div>
                                          </div>
                                          <div>
                                            <div>
                                              <blockquote type="cite">
                                                <div dir="ltr">What is
                                                  "frep.x86.sse2.sqrt.pd"?
                                                  I'm only familiar with
                                                  things prefixed with
                                                  "llvm.x86".</div>
                                                <div class="gmail_extra"><br>
                                                  <br>
                                                  <div class="gmail_quote">On
                                                    Thu, Jul 18, 2013 at
                                                    10:12 PM, Peter
                                                    Newman <span dir="ltr"><<a href="mailto:peter@uformia.com" target="_blank">peter@uformia.com</a>></span>
                                                    wrote:<br>
                                                    <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
                                                      <div text="#000000" bgcolor="#FFFFFF">
                                                        <div>After
                                                          stepping
                                                          through the
                                                          produced
                                                          assembly, I
                                                          believe I have
                                                          a culprit.<br>
                                                          <br>
                                                          One of the
                                                          calls to
                                                          @frep.x86.sse2.sqrt.pd
                                                          is modifying
                                                          the value of
                                                          ECX - while
                                                          the produced
                                                          code is
                                                          expecting it
                                                          to still
                                                          contain its
                                                          previous
                                                          value.<br>
                                                          <br>
                                                          Peter N
                                                          <div>
                                                          <div><br>
                                                          <br>
                                                          On 19/07/2013
                                                          2:09 PM, Peter
                                                          Newman wrote:<br>
                                                          </div>
                                                          </div>
                                                        </div>
                                                        <div>
                                                          <div>
                                                          <blockquote type="cite">
                                                          <div>I've
                                                          attached the
                                                          module->dump()
                                                          that our code
                                                          is producing.
                                                          Unfortunately
                                                          this is the
                                                          smallest test
                                                          case I have
                                                          available.<br>
                                                          <br>
                                                          This is before
                                                          any
                                                          optimization
                                                          passes are
                                                          applied. There
                                                          are two
                                                          separate
                                                          modules in
                                                          existence at
                                                          the time, and
                                                          there are no
                                                          guarantees
                                                          about the
                                                          order the
                                                          surrounding
                                                          code calls
                                                          those
                                                          functions, so
                                                          there may be
                                                          some
                                                          interaction
                                                          between them?
                                                          There
                                                          shouldn't be, as
                                                          they don't
                                                          refer to any
                                                          common memory
                                                          etc. There is
                                                          no
                                                          multi-threading
                                                          occurring.<br>
                                                          <br>
                                                          The function
                                                          in
                                                          module-dump.ll
                                                          (called
                                                          crashfunc in
                                                          this file) is
                                                          called with<br>
                                                          -       
                                                          func_params   
                                                          0x0018f3b0   
                                                          double [3]<br>
                                                                 
                                                          [0x0]   
                                                          -11.339976634695301   
                                                          double<br>
                                                                 
                                                          [0x1]   
                                                          -9.7504239056205506   
                                                          double<br>
                                                                 
                                                          [0x2]   
                                                          -5.2900856817382804   
                                                          double<br>
                                                          at the time of
                                                          the exception.<br>
                                                          <br>
                                                          This is
                                                          compiled on a
                                                          "i686-pc-win32"

                                                          triple. All of
                                                          the
                                                          non-intrinsic
                                                          functions
                                                          referred to in
                                                          these modules
                                                          are the
                                                          standard
                                                          equivalents
                                                          from the MSVC
                                                          library (e.g.
                                                          @asin is the
                                                          standard C lib
                                                             double
                                                          asin( double )
                                                          ).<br>
                                                          <br>
                                                          Hopefully this
                                                          is
                                                          reproducible
                                                          for you.<br>
                                                          <br>
                                                          --<br>
                                                          PeterN<br>
                                                          <br>
                                                          On 18/07/2013
                                                          4:37 PM, Craig
                                                          Topper wrote:<br>
                                                          </div>
                                                          <blockquote type="cite">
                                                          <div dir="ltr">Are
                                                          you able to
                                                          send any IR
                                                          for others to
                                                          reproduce this
                                                          issue?</div>
                                                          <div class="gmail_extra"><br>
                                                          <br>
                                                          <div class="gmail_quote">On

                                                          Wed, Jul 17,
                                                          2013 at 11:23
                                                          PM, Peter
                                                          Newman <span dir="ltr"><<a href="mailto:peter@uformia.com" target="_blank">peter@uformia.com</a>></span>
                                                          wrote:<br>
                                                          <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Unfortunately,



                                                          this doesn't
                                                          appear to be
                                                          the bug I'm
                                                          hitting. I
                                                          applied the
                                                          fix to my
                                                          source and it
                                                          didn't make a
                                                          difference.<br>
                                                          <br>
                                                          Further testing also
                                                          showed the same behavior
                                                          with other SIMD
                                                          instructions.
                                                          The common
                                                          factor is that,
                                                          in each case, ECX
                                                          is set to
                                                          0x7fffffff,
                                                          and it's an
                                                          operation
                                                          using xmm ptr
                                                          ecx+offset .<br>
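                                                          <br>
                                                          movapd requires a 16-byte-aligned memory operand, so with ECX at 0x7fffffff any 16-byte-aligned displacement gives a misaligned address. A small Python sketch of that arithmetic; is_aligned, effective_address and the displacement 0x40 are hypothetical, for illustration only:<br>
<pre>
```python
def is_aligned(address, alignment=16):
    # True when address is a multiple of alignment (16 bytes for movapd).
    return address % alignment == 0


def effective_address(base, offset):
    # 32-bit effective address of a [base+offset] operand (wraps modulo 2**32).
    return (base + offset) % 2**32


# Hypothetical displacement; ECX value as observed in the crash.
bad = effective_address(0x7FFFFFFF, 0x40)
print(hex(bad), is_aligned(bad))
```
</pre>
                                                          This prints 0x8000003f False, so an aligned load through that address would fault regardless of what the displacement is, as long as the displacement itself is a multiple of 16.<br>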
                                                          <br>
                                                          Additionally,
                                                          turning the
                                                          optimization
                                                          level passed
                                                          to createJIT
                                                          down appears
                                                          to avoid it,
                                                          so I'm now
                                                          leaning
                                                          towards a bug
                                                          in one of the
                                                          optimization
                                                          passes.<br>
                                                          <br>
                                                          I'm going to
                                                          dig through
                                                          the passes
                                                          controlled by
                                                          that parameter
                                                          and see if I
                                                          can narrow
                                                          down which
                                                          optimization
                                                          is causing it.<br>
                                                          <br>
                                                          Peter N
                                                          <div>
                                                          <div><br>
                                                          <br>
                                                          On 17/07/2013
                                                          1:58 PM,
                                                          Solomon Boulos
                                                          wrote:<br>
                                                          <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
                                                          As someone off
                                                          list just told
                                                          me, perhaps my
                                                          new bug is the
                                                          same issue:<br>
                                                          <br>
                                                             <a href="http://llvm.org/bugs/show_bug.cgi?id=16640" target="_blank">http://llvm.org/bugs/show_bug.cgi?id=16640</a><br>
                                                          <br>
                                                          Do you happen
                                                          to be using
                                                          FastISel?<br>
                                                          <br>
                                                          Solomon<br>
                                                          <br>
                                                          On Jul 16,
                                                          2013, at 6:39
                                                          PM, Peter
                                                          Newman <<a href="mailto:peter@uformia.com" target="_blank">peter@uformia.com</a>>
                                                          wrote:<br>
                                                          <br>
                                                          <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
                                                          Hello all,<br>
                                                          <br>
                                                          I'm currently
                                                          in the process
                                                          of debugging a
                                                          crash
                                                          occurring in
                                                          our program.
                                                          In LLVM 3.2
                                                          and 3.3 it
                                                          appears that
                                                          JIT-generated
                                                          code is
                                                          attempting to
                                                          access
                                                          unaligned
                                                          memory with an
                                                          SSE2
                                                          instruction.
                                                          However this
                                                          only happens
                                                          under certain
                                                          conditions
                                                          that seem (but
                                                          may not be)
                                                          related to the
                                                          stack's state
                                                          on calling the
                                                          function.<br>
                                                          <br>
                                                          Our program
                                                          acts as a
                                                          front-end,
                                                          using the LLVM
                                                          C++ API to
                                                          generate a
                                                          JIT-compiled
                                                          function. This
                                                          function is
                                                          primarily
                                                          mathematical,
                                                          so we use the
                                                          Vector types
                                                          to take
                                                          advantage of
                                                          SIMD
                                                          instructions
                                                          (as well as a
                                                          few SSE2
                                                          intrinsics).<br>
                                                          <br>
                                                          This worked in
                                                          LLVM 2.8 but
                                                          started
                                                          failing in 3.2
                                                          and has
                                                          continued to
                                                          fail in 3.3.
                                                          It fails with
                                                          no
                                                          optimizations
                                                          applied to the
                                                          LLVM
                                                          Function/Module.
                                                          It crashes
                                                          with what is
                                                          reported as a
                                                          memory access
                                                          error
                                                          (accessing
                                                          0xffffffff);
                                                          however, it has
                                                          been suggested that
                                                          this is how
                                                          the SSE
                                                          fault-raising
                                                          mechanism
                                                          reports the error.<br>
                                                          <br>
                                                          The generated instruction
                                                          varies, but it often looks
                                                          something like this
                                                          (I don't have it in front of
                                                          me, sorry):<br>
                                                          movapd xmm0,
                                                          xmm[ecx+0x???????]<br>
                                                          where the xmm register
                                                          varies and the second
                                                          operand is a memory access.<br>
                                                          ECX is always set to
                                                          0x7fffffff; however, I
                                                          don't know whether this is
                                                          part of the SSE error
                                                          reporting process or part of
                                                          what causes the error.<br>
                                                          <br>
                                                          I haven't worked out exactly
                                                          which code path is causing
                                                          this crash. I'm hoping
                                                          someone can tell me whether
                                                          the requirements for working
                                                          with SIMD changed in
                                                          LLVM 3.2 (or earlier; we
                                                          haven't tried 3.0 or 3.1).
                                                          I currently suspect the use
                                                          of GlobalVariable (we first
                                                          noticed the crash when using
                                                          a feature that relies on
                                                          them); however, calling
                                                          setAlignment on the
                                                          GlobalVariables made no
                                                          difference.<br>
                                                          <br>
                                                          --<br>
                                                          Peter N<br>
_______________________________________________<br>
                                                          LLVM
                                                          Developers
                                                          mailing list<br>
                                                          <a href="mailto:LLVMdev@cs.uiuc.edu" target="_blank">LLVMdev@cs.uiuc.edu</a>
                                                                  <a href="http://llvm.cs.uiuc.edu" target="_blank">http://llvm.cs.uiuc.edu</a><br>
                                                          <a href="http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev" target="_blank">http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev</a><br>
                                                          </blockquote>
                                                          </blockquote>
                                                          <br>
                                                          </div>
                                                          </div>
                                                          </blockquote>
                                                          </div>
                                                          <br>
                                                          <br clear="all">
                                                          <div><br>
                                                          </div>
                                                          -- <br>
                                                          ~Craig </div>
                                                          </blockquote>
                                                          <br>
                                                          </blockquote>
                                                          <br>
                                                          </div>
                                                        </div>
                                                      </div>
                                                    </blockquote>
                                                  </div>
                                                  <br>
                                                  <br clear="all">
                                                  <div><br>
                                                  </div>
                                                  -- <br>
                                                  ~Craig </div>
                                              </blockquote>
                                              <br>
                                            </div>
                                          </div>
                                        </div>
                                      </blockquote>
                                    </div>
                                    <br>
                                    <br clear="all">
                                    <div><br>
                                    </div>
                                    -- <br>
                                    ~Craig </div>
                                </blockquote>
                                <br>
                              </div>
                            </div>
                          </div>
                        </blockquote>
                      </div>
                      <br>
                      <br clear="all"><span class="HOEnZb"><font color="#888888">
                      <div><br>
                      </div>
                      -- <br>
                      ~Craig </font></span></div><span class="HOEnZb"><font color="#888888">
                  </font></span></blockquote><span class="HOEnZb"><font color="#888888">
                  <br>
                </font></span></div><span class="HOEnZb"><font color="#888888">
              </font></span></div><span class="HOEnZb"><font color="#888888">
            </font></span></div><span class="HOEnZb"><font color="#888888">
          </font></span></blockquote><span class="HOEnZb"><font color="#888888">
        </font></span></div><span class="HOEnZb"><font color="#888888">
        <br>
        <br clear="all">
        <div><br>
        </div>
        -- <br>
        ~Craig
      </font></span></div>
    </blockquote>
    <br>
  </div>

</blockquote></div><br><br clear="all"><div><br></div>-- <br>~Craig
</div>