<html>

  <head>

    <meta content="text/html; charset=utf-8" http-equiv="Content-Type">

  </head>

  <body text="#000000" bgcolor="#FFFFFF">

    <br>

    <br>

    <div class="moz-cite-prefix">On 01/13/2016 12:20 PM, James Molloy

      wrote:<br>

    </div>

    <blockquote

cite="mid:CALCTSA0ZK5if0pCRHrYwjz80C5rv_Rog6ZhRuXpzTe0RQzewDg@mail.gmail.com"

      type="cite">

      <div dir="ltr">>  (Right?)

        <div><br>

        </div>

        <div>Uh no, the register content explicitly does change :( We

          insert REV instructions (byteswap) on each bitcast. Bitcasts

          can be merged and elided etc, but conceptually there's a

          register content change on every bitcast.<br>

        </div>

      </div>

    </blockquote>

    Ok.  Then we need to change the LangRef as suggested.  Given this is

    a rather important semantic change, I think you need to send a

    top-level RFC to the list.  <br>

    <br>
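    To make the register-content change concrete, here is a minimal
    Python sketch (not LLVM code) of the current LangRef wording, which
    describes a bitcast as a store of the old type followed by a load of
    the new type.  It assumes, purely for illustration, that lane 0 of a
    vector sits in the least significant bits of the register; the helper
    names store, load, bitcast_via_memory and register_image are
    hypothetical.  Under those assumptions a &lt;2 x i64&gt; to
    &lt;4 x i32&gt; bitcast leaves the register image alone on
    little-endian but permutes it on big-endian, which is the work the
    REV instructions do.<br>
    <pre>
```python
def store(lanes, lane_bytes, byteorder):
    """Lay the lanes out in memory, lane 0 at the lowest address."""
    return b"".join(lane.to_bytes(lane_bytes, byteorder) for lane in lanes)


def load(memory, lane_bytes, byteorder):
    """Read consecutive lanes of lane_bytes bytes each from memory."""
    return [
        int.from_bytes(memory[i:i + lane_bytes], byteorder)
        for i in range(0, len(memory), lane_bytes)
    ]


def bitcast_via_memory(lanes, src_bytes, dst_bytes, byteorder):
    """LangRef-style model: store as the source type, load as the new type."""
    return load(store(lanes, src_bytes, byteorder), dst_bytes, byteorder)


def register_image(lanes, lane_bytes):
    """ASSUMPTION: lane 0 occupies the least significant register bits."""
    return sum(lane * 256 ** (i * lane_bytes) for i, lane in enumerate(lanes))


if __name__ == "__main__":
    src = [0x0011223344556677, 0x8899AABBCCDDEEFF]  # two 64-bit lanes
    for order in ("little", "big"):
        dst = bitcast_via_memory(src, 8, 4, order)   # four 32-bit lanes
        unchanged = register_image(src, 8) == register_image(dst, 4)
        print(order, [hex(x) for x in dst], "register unchanged:", unchanged)
```
    </pre>
    Running it prints the resulting lanes for each byte order and whether
    the assumed register image survived the cast.<br>
    <br>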

    A couple of points that will need to be clarified:<br>

    - Does this only apply to vector types?  It definitely doesn't apply

    between pointer types today.  What about integer, floating point,

    and first-class aggregate (FCA) types?<br>

    - Is combining two casts into one a legal operation?  I think it is

    so far, but we need to explicitly state that. <br>

    - Do we have a predicate for identifying no-op casts that can be

    freely removed/combined?<br>

    - Is coercing a load to the type it's immediately bitcast to legal

    under this model?  <br>
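    <br>
    Some of these questions can be probed with a toy model before the
    RFC.  The following is a hedged Python sketch, assuming that a
    bitcast means "store as the old type, load as the new type" and that
    lane 0 lives in the least significant register bits; is_noop_cast is
    a hypothetical predicate, not an existing LLVM API.  Within this
    model only, combining two casts and coercing a load to the type it is
    bitcast to both hold for every lane width tried, and a cast is a
    register no-op only on little-endian or when the lane width is
    unchanged.  It covers integer lanes only and says nothing about
    pointers or FCAs.<br>
    <pre>
```python
from itertools import product


def store(lanes, lane_bytes, byteorder):
    """Lay the lanes out in memory, lane 0 at the lowest address."""
    return b"".join(lane.to_bytes(lane_bytes, byteorder) for lane in lanes)


def load(memory, lane_bytes, byteorder):
    """Read consecutive lanes of lane_bytes bytes each from memory."""
    return [
        int.from_bytes(memory[i:i + lane_bytes], byteorder)
        for i in range(0, len(memory), lane_bytes)
    ]


def bitcast(lanes, src_bytes, dst_bytes, byteorder):
    """Model under test: store as the source type, load as the new type."""
    return load(store(lanes, src_bytes, byteorder), dst_bytes, byteorder)


def register_image(lanes, lane_bytes):
    """ASSUMPTION: lane 0 occupies the least significant register bits."""
    return sum(lane * 256 ** (i * lane_bytes) for i, lane in enumerate(lanes))


def is_noop_cast(src_bytes, dst_bytes, byteorder):
    """Hypothetical predicate: the cast leaves the register image alone."""
    return byteorder == "little" or src_bytes == dst_bytes


def check_model(byteorder):
    memory = bytes(range(16))      # 128 bits of distinct byte values
    widths = (1, 2, 4, 8)          # lane sizes in bytes
    for a, b, c in product(widths, repeat=3):
        value = load(memory, a, byteorder)
        direct = bitcast(value, a, c, byteorder)
        # Combining two casts into one gives the same lanes.
        assert bitcast(bitcast(value, a, b, byteorder), b, c, byteorder) == direct
        # Loading as type a then casting to c equals loading as c directly.
        assert direct == load(memory, c, byteorder)
    for a, b in product(widths, repeat=2):
        value = load(memory, a, byteorder)
        cast = bitcast(value, a, b, byteorder)
        same = register_image(value, a) == register_image(cast, b)
        assert same == is_noop_cast(a, b, byteorder)
    return True


if __name__ == "__main__":
    for order in ("little", "big"):
        print(order, "model checks pass:", check_model(order))
```
    </pre>
    The checks only exercise one 128-bit pattern with distinct bytes, so
    they illustrate the model rather than prove anything about the
    backend.<br>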

    <blockquote

cite="mid:CALCTSA0ZK5if0pCRHrYwjz80C5rv_Rog6ZhRuXpzTe0RQzewDg@mail.gmail.com"

      type="cite">

      <div dir="ltr">

        <div><br>

        </div>

        <div>James</div>

      </div>

      <br>

      <div class="gmail_quote">

        <div dir="ltr">On Wed, 13 Jan 2016 at 18:09 Philip Reames <<a

            moz-do-not-send="true"

            href="mailto:listmail@philipreames.com">listmail@philipreames.com</a>>

          wrote:<br>

        </div>

        <blockquote class="gmail_quote" style="margin:0 0 0

          .8ex;border-left:1px #ccc solid;padding-left:1ex"><br>

          <br>

          On 01/13/2016 08:01 AM, Hal Finkel via llvm-dev wrote:<br>

          > ----- Original Message -----<br>

          >> From: "James Molloy" <<a moz-do-not-send="true"

            href="mailto:james@jamesmolloy.co.uk" target="_blank">james@jamesmolloy.co.uk</a>><br>

          >> To: "Hal Finkel" <<a moz-do-not-send="true"

            href="mailto:hfinkel@anl.gov" target="_blank">hfinkel@anl.gov</a>><br>

          >> Cc: "llvm-dev" <<a moz-do-not-send="true"

            href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a>>,

          "Quentin Colombet" <<a moz-do-not-send="true"

            href="mailto:qcolombet@apple.com" target="_blank">qcolombet@apple.com</a>><br>

          >> Sent: Wednesday, January 13, 2016 9:54:26 AM<br>

          >> Subject: Re: [llvm-dev] [GlobalISel] A Proposal for

          global instruction selection<br>

          >><br>

          >><br>

          >>> I think that teaching the optimizer about

          big-Endian lane ordering<br>

          >>> would have been better.<br>

          >><br>

          >> It's certainly arguable. Even in hindsight I'm glad

          we didn't -<br>

          >> that's the approach GCC took and they've been fixing

          subtle bugs in<br>

          >> their vectorizer ever since.<br>

          >><br>

          >><br>

          >>> Inserting the REV after every LDR<br>

          >><br>

          >> We only do this conceptually. In most cases REVs

          cancel out, and we<br>

          >> have the LD1 instruction which is LDR+REV. With

          enough peepholes<br>

          >> there's really no need for code to run slower.<br>

          >><br>

          >><br>

          >>> Given what's been done, should we update the

          LangRef?<br>

          >><br>

          >> Potentially, yes. I hadn't realised quite how

          strongly worded it was<br>

          >> with respect to this.<br>

          >><br>

          > Please do ;)<br>

          I'm not sure changing bitcast is the right place.  Since the

          bitcast is<br>

          representing the in-register value (which doesn't change),

          maybe we<br>

          should define it as part of the load/store instead?  That's

          essentially<br>

          what's going on; we're converting from a canonical register

          form to a<br>

          variety of memory forms.  (Right?)<br>

          ><br>

          >   -Hal<br>

          ><br>

          >> James<br>

          >><br>

          >><br>

          >> On Wed, 13 Jan 2016 at 14:39 Hal Finkel < <a

            moz-do-not-send="true" href="mailto:hfinkel@anl.gov"

            target="_blank">hfinkel@anl.gov</a> > wrote:<br>

          >><br>

          >><br>

          >><br>

          >><br>

          >> [resending so the message is smaller]<br>

          >><br>

          >><br>

          >><br>

          >><br>

          >><br>

          >><br>

          >> From: "James Molloy via llvm-dev" < <a

            moz-do-not-send="true" href="mailto:llvm-dev@lists.llvm.org"

            target="_blank">llvm-dev@lists.llvm.org</a> ><br>

          >> To: "Quentin Colombet" < <a

            moz-do-not-send="true" href="mailto:qcolombet@apple.com"

            target="_blank">qcolombet@apple.com</a> ><br>

          >> Cc: "llvm-dev" < <a moz-do-not-send="true"

            href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a>

          ><br>

          >> Sent: Wednesday, January 13, 2016 2:35:32 AM<br>

          >> Subject: Re: [llvm-dev] [GlobalISel] A Proposal for

          global<br>

          >> instruction selection<br>

          >><br>

          >> Hi Philip,<br>

          >><br>

          >><br>

          >><br>

          >><br>

          >><br>

          >> store &lt;2 x i64&gt; %1, &lt;2 x i64&gt;* %y<br>

          >><br>

          >> Yes. The memory pattern differs. This is the first

          diagram on the<br>

          >> right at: <a moz-do-not-send="true"

            href="http://llvm.org/docs/BigEndianNEON.html#bitconverts"

            rel="noreferrer" target="_blank">http://llvm.org/docs/BigEndianNEON.html#bitconverts</a>

          )<br>

          >><br>

          >><br>

          >> I think that teaching the optimizer about big-Endian

          lane ordering<br>

          >> would have been better. Inserting the REV after every

          LDR sounds<br>

          >> very similar to what we do for VSX on little-Endian

          PowerPC systems<br>

          >> (PowerPC may have a slight advantage here in that we

          don't need to<br>

          >> do insertelement / extractelement / shufflevector

          through memory on<br>

          >> systems where little-Endian mode is relevant, see<br>

          >> <a moz-do-not-send="true"

href="http://llvm.org/devmtg/2014-10/Slides/Schmidt-SupportingVectorProgramming.pdf"

            rel="noreferrer" target="_blank">http://llvm.org/devmtg/2014-10/Slides/Schmidt-SupportingVectorProgramming.pdf</a><br>

          >> ).<br>

          >><br>

          >> Given what's been done, should we update the LangRef?

          It currently<br>

          >> reads, "The ‘bitcast’ instruction converts value

          to type ty2. It<br>

          >> is always a no-op cast because no bits change with

          this conversion.<br>

          >> The conversion is done as if the value had been

          stored to memory and<br>

          >> read back as type ty2." But this is now, at the

          least, misleading,<br>

          >> because this process of storing the value as one type

          and reading it<br>

          >> back in as another does, in fact, change the bits. We

          need to make<br>

          >> clear that this might change the bits (perhaps

          specifically by<br>

          >> calling out this case of vector bitcasts on

          big-Endian systems?).<br>

          >><br>

          >><br>

          >><br>

          >> Also, regarding this, "Most operating systems

          however do not run<br>

          >> with alignment faults enabled, so this is often not

          an issue." Are<br>

          >> you saying that the processor does the correct thing

          in this case<br>

          >> (if alignment faults are not enabled, then it

          performs a proper<br>

          >> unaligned load), or that the operating-system trap

          handler emulates<br>

          >> the unaligned load should one occur?<br>

          >><br>

          >> Thanks again,<br>

          >> Hal<br>

          >><br>

          >><br>

          >> _______________________________________________<br>

          >><br>

          >><br>

          >> LLVM Developers mailing list<br>

          >> <a moz-do-not-send="true"

            href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a><br>

          >> <a moz-do-not-send="true"

            href="http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev"

            rel="noreferrer" target="_blank">http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev</a><br>

          >><br>

          >><br>

          >> --<br>

          >> Hal Finkel<br>

          >> Assistant Computational Scientist<br>

          >> Leadership Computing Facility<br>

          >> Argonne National Laboratory<br>

          >><br>

          <br>

        </blockquote>

      </div>

    </blockquote>

    <br>

  </body>

</html>