<html>
  <head>
    <meta content="text/html; charset=ISO-8859-1"
      http-equiv="Content-Type">
  </head>
  <body text="#000000" bgcolor="#FFFFFF">
    <div class="moz-cite-prefix">Oh, excellent point, I agree. My bad.
      Now that I'm not assuming those calls are the sqrt, I can see the
      sqrtpd instructions in the output. There are also three fptoui
      instructions and three call instances, which matches.<br>
      <br>
      (Changing subject line again.)<br>
      <br>
      It now looks like this is bug #13862.<br>
      <br>
      On 19/07/2013 4:51 PM, Craig Topper wrote:<br>
    </div>
    <blockquote
cite="mid:CAF7ks-Nwh+2BJ5n8WwbpD0r9jd42chXRn53GGgaU2rbHh397OQ@mail.gmail.com"
      type="cite">
      <div dir="ltr">I think those calls correspond to this:
        <div><br>
        </div>
        <div>
          <div>  %110 = fptoui double %109 to i32</div>
        </div>
        <div><br>
        </div>
        <div>Each call is followed by an imul by 12, which matches
          what occurs right after the fptoui in the IR.</div>
      </div>
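      <div>The pairing described above (an fptoui followed by a multiply
        by 12) can be modelled in a few lines. This is only a minimal
        sketch in Python, not the generated code: the names fptoui_i32
        and scaled_index are hypothetical, and wrapping the product to
        32 bits is an assumption based on the imul working on a 32-bit
        register.</div>
<pre>
```python
def fptoui_i32(x):
    """Model of 'fptoui double to i32': drop the fractional part.

    LLVM gives no defined result when the value does not fit in the
    destination type, so this sketch raises instead of guessing.
    """
    n = int(x)  # int() truncates toward zero
    if n not in range(2**32):
        raise ValueError("value does not fit in an unsigned 32-bit integer")
    return n


def scaled_index(x, stride=12):
    """Convert, then multiply by the stride, wrapping to 32 bits.

    The stride of 12 is the constant mentioned for the imul; the wrap
    is an assumption about a 32-bit multiply.
    """
    return (fptoui_i32(x) * stride) % 2**32
```
</pre>
      <div>For example, scaled_index(2.5) truncates 2.5 to 2 and returns
        24.</div>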
      <div class="gmail_extra"><br>
        <br>
        <div class="gmail_quote">On Thu, Jul 18, 2013 at 11:48 PM, Peter
          Newman <span dir="ltr">&lt;<a moz-do-not-send="true"
              href="mailto:peter@uformia.com" target="_blank">peter@uformia.com</a>&gt;</span>
          wrote:<br>
          <blockquote class="gmail_quote" style="margin:0 0 0
            .8ex;border-left:1px #ccc solid;padding-left:1ex">
            <div text="#000000" bgcolor="#FFFFFF">
              <div>Yes, that is the result of module-dump.ll.
                <div>
                  <div class="h5"><br>
                    <br>
                    On 19/07/2013 4:46 PM, Craig Topper wrote:<br>
                  </div>
                </div>
              </div>
              <div>
                <div class="h5">
                  <blockquote type="cite">
                    <div dir="ltr">Does this correspond to one of the
                      .ll files you sent earlier?</div>
                    <div class="gmail_extra"><br>
                      <br>
                      <div class="gmail_quote">On Thu, Jul 18, 2013 at
                        11:34 PM, Peter Newman <span dir="ltr">&lt;<a
                            moz-do-not-send="true"
                            href="mailto:peter@uformia.com"
                            target="_blank">peter@uformia.com</a>&gt;</span>
                        wrote:<br>
                        <blockquote class="gmail_quote" style="margin:0
                          0 0 .8ex;border-left:1px #ccc
                          solid;padding-left:1ex">
                          <div text="#000000" bgcolor="#FFFFFF">
                            <div>(Changing subject line as diagnosis has
                              changed)<br>
                              <br>
                              I'm attaching the compiled code that I've
                              been getting, with both
                              CodeGenOpt::Default and CodeGenOpt::None.
                              The crash doesn't occur with
                              CodeGenOpt::None, but that seems to be only
                              because ECX isn't being used: it still
                              gets set to 0x7fffffff by one of the calls
                              to 76719BA1.<br>
                              <br>
                              I notice that X86::SQRTPD[m|r] appear in
                              X86InstrInfo::isHighLatencyDef. I was
                              thinking an optimization might be removing
                              it, but I don't get the sqrtpd instruction
                              even with the createJIT optimization level
                              turned off.<br>
                              <br>
                              I am trying this with the Release 3.3 code
                              - I'll try it with trunk and see if I get
                              a different result there. Maybe there was
                              a recent commit for this.<br>
                              <br>
                              --<br>
                              Peter N<br>
                              <br>
                              On 19/07/2013 4:00 PM, Craig Topper wrote:<br>
                            </div>
                            <blockquote type="cite">
                              <div dir="ltr">Hmm, I'm not able to get
                                those .ll files to compile if I disable
                                SSE, and I end up with SSE
                                instructions (including sqrtpd) if I
                                don't disable it.</div>
                              <div class="gmail_extra"><br>
                                <br>
                                <div class="gmail_quote"> On Thu, Jul
                                  18, 2013 at 10:53 PM, Peter Newman <span
                                    dir="ltr">&lt;<a
                                      moz-do-not-send="true"
                                      href="mailto:peter@uformia.com"
                                      target="_blank">peter@uformia.com</a>&gt;</span>
                                  wrote:<br>
                                  <blockquote class="gmail_quote"
                                    style="margin:0 0 0
                                    .8ex;border-left:1px #ccc
                                    solid;padding-left:1ex">
                                    <div text="#000000"
                                      bgcolor="#FFFFFF">
                                      <div>Is there something
                                        specifically required to enable
                                        SSE? If it's not detected as
                                        available (based on the target
                                        triple?) then I don't think we
                                        enable it explicitly.<br>
                                        <br>
                                        Also it seems that it should
                                        handle converting to/from the
                                        vector types, although I can see
                                        it getting confused about
                                        needing to do that if it thinks
                                        SSE isn't available at all.
                                        <div>
                                          <div><br>
                                            <br>
                                            On 19/07/2013 3:47 PM, Craig
                                            Topper wrote:<br>
                                          </div>
                                        </div>
                                      </div>
                                      <div>
                                        <div>
                                          <blockquote type="cite">
                                            <div dir="ltr">Hmm, maybe
                                              SSE isn't being enabled, so
                                              it's falling back to
                                              emulating sqrt?</div>
                                            <div class="gmail_extra"><br>
                                              <br>
                                              <div class="gmail_quote">On
                                                Thu, Jul 18, 2013 at
                                                10:45 PM, Peter Newman <span
                                                  dir="ltr">&lt;<a
                                                    moz-do-not-send="true"
href="mailto:peter@uformia.com" target="_blank">peter@uformia.com</a>&gt;</span>
                                                wrote:<br>
                                                <blockquote
                                                  class="gmail_quote"
                                                  style="margin:0 0 0
                                                  .8ex;border-left:1px
                                                  #ccc
                                                  solid;padding-left:1ex">
                                                  <div text="#000000"
                                                    bgcolor="#FFFFFF">
                                                    <div>In the
                                                      disassembly, I'm
                                                      seeing three cases
                                                      of<br>
                                                      call       
                                                      76719BA1<br>
                                                      <br>
                                                      I am assuming this
                                                      is the sqrt
                                                      function as this
                                                      is the only
                                                      function called in
                                                      the LLVM IR.<br>
                                                      <br>
                                                      The code at
                                                      76719BA1 is:<br>
                                                      <br>
                                                      76719BA1 
                                                      push        ebp  <br>
                                                      76719BA2 
                                                      mov        
                                                      ebp,esp <br>
                                                      76719BA4 
                                                      sub        
                                                      esp,20h <br>
                                                      76719BA7 
                                                      and        
                                                      esp,0FFFFFFF0h <br>
                                                      76719BAA 
                                                      fld         st(0)
                                                      <br>
                                                      76719BAC 
                                                      fst         dword
                                                      ptr [esp+18h] <br>
                                                      76719BB0 
                                                      fistp       qword
                                                      ptr [esp+10h] <br>
                                                      76719BB4 
                                                      fild        qword
                                                      ptr [esp+10h] <br>
                                                      76719BB8 
                                                      mov        
                                                      edx,dword ptr
                                                      [esp+18h] <br>
                                                      76719BBC 
                                                      mov        
                                                      eax,dword ptr
                                                      [esp+10h] <br>
                                                      76719BC0 
                                                      test       
                                                      eax,eax <br>
                                                      76719BC2 
                                                      je         
                                                      76719DCF <br>
                                                      76719BC8 
                                                      fsubp      
                                                      st(1),st <br>
                                                      76719BCA 
                                                      test       
                                                      edx,edx <br>
                                                      76719BCC 
                                                      js         
                                                      7671F9DB <br>
                                                      76719BD2 
                                                      fstp        dword
                                                      ptr [esp] <br>
                                                      76719BD5 
                                                      mov        
                                                      ecx,dword ptr
                                                      [esp] <br>
                                                      76719BD8 
                                                      add        
                                                      ecx,7FFFFFFFh <br>
                                                      76719BDE 
                                                      sbb         eax,0
                                                      <br>
                                                      76719BE1 
                                                      mov        
                                                      edx,dword ptr
                                                      [esp+14h] <br>
                                                      76719BE5 
                                                      sbb         edx,0
                                                      <br>
                                                      76719BE8 
                                                      leave            <br>
                                                      76719BE9 
                                                      ret              <br>
                                                      <br>
                                                      <br>
                                                      As you can see at
                                                      76719BD5, it
                                                      modifies ECX.<br>
                                                      <br>
                                                      I don't know that
                                                      this is the sqrtpd
                                                      function (for
                                                      example, I'm not
                                                      seeing any SSE
                                                      instructions
                                                      here?) but
                                                      whatever it is,
                                                      it's being called
                                                      from the IR I
                                                      attached earlier,
                                                      and is modifying
                                                      ECX under some
                                                      circumstances.
                                                      <div>
                                                        <div><br>
                                                          <br>
                                                          On 19/07/2013
                                                          3:29 PM, Craig
                                                          Topper wrote:<br>
                                                        </div>
                                                      </div>
                                                    </div>
                                                    <div>
                                                      <div>
                                                        <blockquote
                                                          type="cite">
                                                          <div dir="ltr">That
                                                          should map
                                                          directly to
                                                          sqrtpd which
                                                          can't modify
                                                          ecx.</div>
                                                          <div
                                                          class="gmail_extra"><br>
                                                          <br>
                                                          <div
                                                          class="gmail_quote">On

                                                          Thu, Jul 18,
                                                          2013 at 10:27
                                                          PM, Peter
                                                          Newman <span
                                                          dir="ltr">&lt;<a
moz-do-not-send="true" href="mailto:peter@uformia.com" target="_blank">peter@uformia.com</a>&gt;</span>
                                                          wrote:<br>
                                                          <blockquote
                                                          class="gmail_quote"
                                                          style="margin:0
                                                          0 0
                                                          .8ex;border-left:1px
                                                          #ccc
                                                          solid;padding-left:1ex">
                                                          <div
                                                          text="#000000"
bgcolor="#FFFFFF">
                                                          <div>Sorry,
                                                          that should
                                                          have been
                                                          llvm.x86.sse2.sqrt.pd
                                                          <div>
                                                          <div><br>
                                                          <br>
                                                          On 19/07/2013
                                                          3:25 PM, Craig
                                                          Topper wrote:<br>
                                                          </div>
                                                          </div>
                                                          </div>
                                                          <div>
                                                          <div>
                                                          <blockquote
                                                          type="cite">
                                                          <div dir="ltr">What
                                                          is
                                                          "frep.x86.sse2.sqrt.pd"?
                                                          I'm only
                                                          familiar with
                                                          things
                                                          prefixed with
                                                          "llvm.x86".</div>
                                                          <div
                                                          class="gmail_extra"><br>
                                                          <br>
                                                          <div
                                                          class="gmail_quote">On


                                                          Thu, Jul 18,
                                                          2013 at 10:12
                                                          PM, Peter
                                                          Newman <span
                                                          dir="ltr">&lt;<a
moz-do-not-send="true" href="mailto:peter@uformia.com" target="_blank">peter@uformia.com</a>&gt;</span>
                                                          wrote:<br>
                                                          <blockquote
                                                          class="gmail_quote"
                                                          style="margin:0
                                                          0 0
                                                          .8ex;border-left:1px
                                                          #ccc
                                                          solid;padding-left:1ex">
                                                          <div
                                                          text="#000000"
bgcolor="#FFFFFF">
                                                          <div>After
                                                          stepping
                                                          through the
                                                          produced
                                                          assembly, I
                                                          believe I have
                                                          a culprit.<br>
                                                          <br>
                                                          One of the
                                                          calls to
                                                          @frep.x86.sse2.sqrt.pd
                                                          is modifying
                                                          the value of
                                                          ECX - while
                                                          the produced
                                                          code is
                                                          expecting it
                                                          to still
                                                          contain its
                                                          previous
                                                          value.<br>
                                                          <br>
                                                          Peter N
                                                          <div>
                                                          <div><br>
                                                          <br>
                                                          On 19/07/2013
                                                          2:09 PM, Peter
                                                          Newman wrote:<br>
                                                          </div>
                                                          </div>
                                                          </div>
                                                          <div>
                                                          <div>
                                                          <blockquote
                                                          type="cite">
                                                          <div>I've
                                                          attached the
                                                          module-&gt;dump()
                                                          output that our
                                                          code is producing.
                                                          Unfortunately
                                                          this is the
                                                          smallest test
                                                          case I have
                                                          available.<br>
                                                          <br>
                                                          This is before
                                                          any
                                                          optimization
                                                          passes are
                                                          applied. There
                                                          are two
                                                          separate
                                                          modules in
                                                          existence at
                                                          the time, and
                                                          there are no
                                                          guarantees
                                                          about the
                                                          order the
                                                          surrounding
                                                          code calls
                                                          those
                                                          functions, so
                                                          there may be
                                                          some
                                                          interaction
                                                          between them?
                                                          There
                                                          shouldn't be;
                                                          they don't
                                                          refer to any
                                                          common memory,
                                                          etc. There is
                                                          no
                                                          multi-threading
                                                          occurring.<br>
                                                          <br>
                                                          The function
                                                          in
                                                          module-dump.ll
                                                          (called
                                                          crashfunc in
                                                          this file) is
                                                          called with<br>
                                                          -       
                                                          func_params   
                                                          0x0018f3b0   
                                                          double [3]<br>
                                                                 
                                                          [0x0]   
                                                          -11.339976634695301   
                                                          double<br>
                                                                 
                                                          [0x1]   
                                                          -9.7504239056205506   
                                                          double<br>
                                                                 
                                                          [0x2]   
                                                          -5.2900856817382804   
                                                          double<br>
                                                          at the time of
                                                          the exception.<br>
                                                          <br>
                                                          This is
                                                          compiled for an
                                                          "i686-pc-win32"
                                                          triple. All of
                                                          the
                                                          non-intrinsic
                                                          functions
                                                          referred to in
                                                          these modules
                                                          are the
                                                          standard
                                                          equivalents
                                                          from the MSVC
                                                          library (e.g.
                                                          @asin is the
                                                          standard C lib
                                                             double
                                                          asin( double )
                                                          ).<br>
                                                          <br>
                                                          Hopefully this
                                                          is
                                                          reproducible
                                                          for you.<br>
                                                          <br>
                                                          --<br>
                                                          PeterN<br>
                                                          <br>
                                                          On 18/07/2013
                                                          4:37 PM, Craig
                                                          Topper wrote:<br>
                                                          </div>
                                                          <blockquote
                                                          type="cite">
                                                          <div dir="ltr">Are


                                                          you able to
                                                          send any IR
                                                          for others to
                                                          reproduce this
                                                          issue?</div>
                                                          <div
                                                          class="gmail_extra"><br>
                                                          <br>
                                                          <div class="gmail_quote">On Wed, Jul 17, 2013 at 11:23 PM,
                                                          Peter Newman <span dir="ltr">&lt;<a moz-do-not-send="true" href="mailto:peter@uformia.com" target="_blank">peter@uformia.com</a>&gt;</span>
                                                          wrote:<br>
                                                          <blockquote
                                                          class="gmail_quote"
                                                          style="margin:0
                                                          0 0
                                                          .8ex;border-left:1px
                                                          #ccc
                                                          solid;padding-left:1ex">Unfortunately, this doesn't appear
                                                          to be the bug I'm hitting. I applied the fix to my source
                                                          and it didn't make a difference.<br>
                                                          <br>
                                                          Also, further testing showed the same behavior with other
                                                          SIMD instructions. The common factor is that in each case
                                                          ECX is set to 0x7fffffff and the operation uses
                                                          xmm ptr ecx+offset.<br>
                                                          <br>
                                                          Additionally, lowering the optimization level passed to
                                                          createJIT appears to avoid it, so I'm now leaning towards
                                                          a bug in one of the optimization passes.<br>
                                                          <br>
                                                          I'm going to
                                                          dig through
                                                          the passes
                                                          controlled by
                                                          that parameter
                                                          and see if I
                                                          can narrow
                                                          down which
                                                          optimization
                                                          is causing it.<br>
                                                          <br>
                                                          Peter N
                                                          <div>
                                                          <div><br>
                                                          <br>
                                                          On 17/07/2013
                                                          1:58 PM,
                                                          Solomon Boulos
                                                          wrote:<br>
                                                          <blockquote
                                                          class="gmail_quote"
                                                          style="margin:0
                                                          0 0
                                                          .8ex;border-left:1px
                                                          #ccc
                                                          solid;padding-left:1ex">
                                                          As someone off
                                                          list just told
                                                          me, perhaps my
                                                          new bug is the
                                                          same issue:<br>
                                                          <br>
                                                             <a
                                                          moz-do-not-send="true"
href="http://llvm.org/bugs/show_bug.cgi?id=16640" target="_blank">http://llvm.org/bugs/show_bug.cgi?id=16640</a><br>
                                                          <br>
                                                          Do you happen
                                                          to be using
                                                          FastISel?<br>
                                                          <br>
                                                          Solomon<br>
                                                          <br>
                                                          On Jul 16, 2013, at 6:39 PM, Peter Newman &lt;<a moz-do-not-send="true" href="mailto:peter@uformia.com" target="_blank">peter@uformia.com</a>&gt;
                                                          wrote:<br>
                                                          <br>
                                                          <blockquote
                                                          class="gmail_quote"
                                                          style="margin:0
                                                          0 0
                                                          .8ex;border-left:1px
                                                          #ccc
                                                          solid;padding-left:1ex">
                                                          Hello all,<br>
                                                          <br>
                                                          I'm currently debugging a crash in our program. In LLVM 3.2
                                                          and 3.3 it appears that JIT-generated code is attempting to
                                                          access unaligned memory with an SSE2 instruction. However,
                                                          this only happens under certain conditions that seem to be
                                                          (but may not be) related to the stack's state on calling
                                                          the function.<br>
                                                          <br>
                                                          Our program acts as a front-end, using the LLVM C++ API to
                                                          generate a JIT-compiled function. This function is primarily
                                                          mathematical, so we use the vector types to take advantage
                                                          of SIMD instructions (as well as a few SSE2 intrinsics).<br>
                                                          <br>
                                                          This worked in LLVM 2.8 but started failing in 3.2 and has
                                                          continued to fail in 3.3. It fails with no optimizations
                                                          applied to the LLVM Function/Module. It crashes with what is
                                                          reported as a memory access error (accessing 0xffffffff);
                                                          however, it has been suggested that this is how an SSE fault
                                                          is reported.<br>
                                                          <br>
                                                          The generated
                                                          instruction
                                                          varies, but it
                                                          seems to often
                                                          be similar to
                                                          (I don't have
                                                          it in front of
                                                          me, sorry):<br>
                                                          movapd xmm0,
                                                          xmm[ecx+0x???????]<br>
                                                          Where the xmm
                                                          register
                                                          changes, and
                                                          the second
                                                          parameter is a
                                                          memory access.<br>
                                                          ECX is always set to 0x7fffffff; however, I don't know
                                                          whether this is part of the SSE error reporting process or
                                                          part of the situation causing the error.<br>
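                                                          <br>
                                                          For reference, movapd requires its memory operand to be
                                                          16-byte aligned and faults otherwise, whereas movupd has no
                                                          such requirement. The following is only a minimal sketch of
                                                          that alignment rule in plain C++; the helper names
                                                          (isAligned, effectiveAddress, movapdOperandOk) are
                                                          hypothetical and are not part of LLVM or of the program
                                                          discussed here.<br>
                                                          <br>

```cpp
#include <cassert>
#include <cstdint>

// True when addr is a multiple of alignment.
// Assumes alignment is a power of two.
constexpr bool isAligned(std::uint64_t addr, std::uint64_t alignment) {
    return (addr & (alignment - 1)) == 0;
}

// Hypothetical helper: effective address of a [base + offset] operand
// on 32-bit x86. Unsigned arithmetic wraps modulo 2^32.
constexpr std::uint32_t effectiveAddress(std::uint32_t base, std::uint32_t offset) {
    return base + offset;
}

// movapd needs a 16-byte-aligned memory operand.
constexpr bool movapdOperandOk(std::uint32_t base, std::uint32_t offset) {
    return isAligned(effectiveAddress(base, offset), 16);
}
```

                                                          <br>
                                                          With an odd base such as 0x7fffffff, only offsets congruent
                                                          to 1 modulo 16 would give an aligned address, so
                                                          movapdOperandOk(0x7fffffff, 0x10) is false under this
                                                          sketch.<br>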
                                                          <br>
                                                          I haven't worked out exactly which code path is causing
                                                          this crash. I'm hoping that someone can tell me if the
                                                          requirements for working with SIMD changed in LLVM 3.2 (or
                                                          earlier; we haven't tried 3.0 or 3.1). I currently suspect
                                                          the use of GlobalVariable (we first discovered the crash
                                                          when using a feature that uses them); however, I have tried
                                                          calling setAlignment on the GlobalVariables without any
                                                          change.<br>
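                                                          <br>
                                                          As a rough analogue of what a 16-byte alignment on a global
                                                          is meant to guarantee, the sketch below uses standard C++
                                                          alignas rather than the LLVM API. The names gPair and
                                                          isAligned16 are hypothetical and chosen only for
                                                          illustration.<br>
                                                          <br>

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a global holding two doubles,
// placed on a 16-byte boundary by the compiler.
alignas(16) static double gPair[2] = {1.0, 2.0};

// True when ptr sits on a 16-byte boundary.
inline bool isAligned16(const void* ptr) {
    return (reinterpret_cast<std::uintptr_t>(ptr) & 15u) == 0;
}
```

                                                          <br>
                                                          Here isAligned16(gPair) holds, while the address of the
                                                          second element (8 bytes further on) does not satisfy the
                                                          check, which is the kind of address an aligned SSE2 load
                                                          would reject.<br>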
                                                          <br>
                                                          --<br>
                                                          Peter N<br>
_______________________________________________<br>
                                                          LLVM
                                                          Developers
                                                          mailing list<br>
                                                          <a
                                                          moz-do-not-send="true"
href="mailto:LLVMdev@cs.uiuc.edu" target="_blank">LLVMdev@cs.uiuc.edu</a>
                                                                  <a
                                                          moz-do-not-send="true"
href="http://llvm.cs.uiuc.edu" target="_blank">http://llvm.cs.uiuc.edu</a><br>
                                                          <a
                                                          moz-do-not-send="true"
href="http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev" target="_blank">http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev</a><br>
                                                          </blockquote>
                                                          </blockquote>
                                                          <br>
                                                          </div>
                                                          </div>
                                                          </blockquote>
                                                          </div>
                                                          <br>
                                                          <br
                                                          clear="all">
                                                          <div><br>
                                                          </div>
                                                          -- <br>
                                                          ~Craig </div>
                                                          </blockquote>
                                                          <br>
                                                          </blockquote>
                                                          <br>
                                                          </div>
                                                          </div>
                                                          </div>
                                                          </blockquote>
                                                          </div>
                                                          <br>
                                                          <br
                                                          clear="all">
                                                          <div><br>
                                                          </div>
                                                          -- <br>
                                                          ~Craig </div>
                                                          </blockquote>
                                                          <br>
                                                          </div>
                                                          </div>
                                                          </div>
                                                          </blockquote>
                                                          </div>
                                                          <br>
                                                          <br
                                                          clear="all">
                                                          <div><br>
                                                          </div>
                                                          -- <br>
                                                          ~Craig </div>
                                                        </blockquote>
                                                        <br>
                                                      </div>
                                                    </div>
                                                  </div>
                                                </blockquote>
                                              </div>
                                              <br>
                                              <br clear="all">
                                              <span><font
                                                  color="#888888">
                                                  <div><br>
                                                  </div>
                                                  -- <br>
                                                  ~Craig </font></span></div>
                                            <span><font color="#888888">
                                              </font></span></blockquote>
                                          <span><font color="#888888"> <br>
                                            </font></span></div>
                                        <span><font color="#888888"> </font></span></div>
                                      <span><font color="#888888"> </font></span></div>
                                    <span><font color="#888888"> </font></span></blockquote>
                                  <span><font color="#888888"> </font></span></div>
                                <span><font color="#888888"> <br>
                                    <br clear="all">
                                    <div><br>
                                    </div>
                                    -- <br>
                                    ~Craig </font></span></div>
                            </blockquote>
                            <br>
                          </div>
                        </blockquote>
                      </div>
                      <br>
                      <br clear="all">
                      <div><br>
                      </div>
                      -- <br>
                      ~Craig </div>
                  </blockquote>
                  <br>
                </div>
              </div>
            </div>
          </blockquote>
        </div>
        <br>
        <br clear="all">
        <div><br>
        </div>
        -- <br>
        ~Craig
      </div>
    </blockquote>
    <br>
  </body>
</html>