<div dir="ltr">>  (Right?)<div><br></div><div>Uh no, the register content explicitly does change :( We insert REV instructions (byteswap) on each bitcast. Bitcasts can be merged and elided etc, but conceptually there's a register content change on every bitcast.</div><div><br></div><div>James</div></div><br><div class="gmail_quote"><div dir="ltr">On Wed, 13 Jan 2016 at 18:09 Philip Reames <<a href="mailto:listmail@philipreames.com">listmail@philipreames.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><br>

<br>

On 01/13/2016 08:01 AM, Hal Finkel via llvm-dev wrote:<br>

> ----- Original Message -----<br>

>> From: "James Molloy" <<a href="mailto:james@jamesmolloy.co.uk" target="_blank">james@jamesmolloy.co.uk</a>><br>

>> To: "Hal Finkel" <<a href="mailto:hfinkel@anl.gov" target="_blank">hfinkel@anl.gov</a>><br>

>> Cc: "llvm-dev" <<a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a>>, "Quentin Colombet" <<a href="mailto:qcolombet@apple.com" target="_blank">qcolombet@apple.com</a>><br>

>> Sent: Wednesday, January 13, 2016 9:54:26 AM<br>

>> Subject: Re: [llvm-dev] [GlobalISel] A Proposal for global instruction selection<br>

>><br>

>><br>

>>> I think that teaching the optimizer about big-Endian lane ordering<br>

>>> would have been better.<br>

>><br>

>> It's certainly arguable. Even in hindsight I'm glad we didn't -<br>

>> that's the approach GCC took and they've been fixing subtle bugs in<br>

>> their vectorizer ever since.<br>

>><br>

>><br>

>>> Inserting the REV after every LDR<br>

>><br>

>> We only do this conceptually. In most cases REVs cancel out, and we<br>

>> have the LD1 instruction which is LDR+REV. With enough peepholes<br>

>> there's really no need for code to run slower.<br>

>><br>

>><br>

>>> Given what's been done, should we update the LangRef?<br>

>><br>

>> Potentially, yes. I hadn't realised quite how strongly worded it was<br>

>> with respect to this.<br>

>><br>

> Please do ;)<br>

I'm not sure changing bitcast is the right place.  Since the bitcast is<br>

representing the in-register value (which doesn't change), maybe we<br>

should define it as part of the load/store instead?  That's essentially<br>

what's going on; we're converting from a canonical register form to a<br>

variety of memory forms.  (Right?)<br>

><br>

>   -Hal<br>

><br>

>> James<br>

>><br>

>><br>

>> On Wed, 13 Jan 2016 at 14:39 Hal Finkel < <a href="mailto:hfinkel@anl.gov" target="_blank">hfinkel@anl.gov</a> > wrote:<br>

>><br>

>><br>

>><br>

>><br>

>> [resending so the message is smaller]<br>

>><br>

>><br>

>><br>

>><br>

>><br>

>><br>

>> From: "James Molloy via llvm-dev" < <a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a> ><br>

>> To: "Quentin Colombet" < <a href="mailto:qcolombet@apple.com" target="_blank">qcolombet@apple.com</a> ><br>

>> Cc: "llvm-dev" < <a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a> ><br>

>> Sent: Wednesday, January 13, 2016 2:35:32 AM<br>

>> Subject: Re: [llvm-dev] [GlobalISel] A Proposal for global<br>

>> instruction selection<br>

>><br>

>> Hi Philip,<br>

>><br>

>><br>

>><br>

>><br>

>><br>

>> store <2 x i64> %1, <2 x i64>* %y<br>

>><br>

>> Yes. The memory pattern differs. This is the first diagram on the<br>

>> right at: <a href="http://llvm.org/docs/BigEndianNEON.html#bitconverts" rel="noreferrer" target="_blank">http://llvm.org/docs/BigEndianNEON.html#bitconverts</a><br>

>><br>

>><br>

>> I think that teaching the optimizer about big-Endian lane ordering<br>

>> would have been better. Inserting the REV after every LDR sounds<br>

>> very similar to what we do for VSX on little-Endian PowerPC systems<br>

>> (PowerPC may have a slight advantage here in that we don't need to<br>

>> do insertelement / extractelement / shufflevector through memory on<br>

>> systems where little-Endian mode is relevant, see<br>

>> <a href="http://llvm.org/devmtg/2014-10/Slides/Schmidt-SupportingVectorProgramming.pdf" rel="noreferrer" target="_blank">http://llvm.org/devmtg/2014-10/Slides/Schmidt-SupportingVectorProgramming.pdf</a><br>

>> ).<br>

>><br>

>> Given what's been done, should we update the LangRef? It currently<br>

>> reads, "The ‘bitcast’ instruction converts value to type ty2. It<br>

>> is always a no-op cast because no bits change with this conversion.<br>

>> The conversion is done as if the value had been stored to memory and<br>

>> read back as type ty2." But this is now, at the least, misleading,<br>

>> because this process of storing the value as one type and reading it<br>

>> back in as another does, in fact, change the bits. We need to make<br>

>> clear that this might change the bits (perhaps specifically by<br>

>> calling out this case of vector bitcasts on big-Endian systems?).<br>

>><br>

>><br>

>><br>

>> Also, regarding this, "Most operating systems however do not run<br>

>> with alignment faults enabled, so this is often not an issue." Are<br>

>> you saying that the processor does the correct thing in this case<br>

>> (if alignment faults are not enabled, then it performs a proper<br>

>> unaligned load), or that the operating-system trap handler emulates<br>

>> the unaligned load should one occur?<br>

>><br>

>> Thanks again,<br>

>> Hal<br>

>><br>

>><br>

>><br>

>><br>

>> --<br>

>> Hal Finkel<br>

>> Assistant Computational Scientist<br>

>> Leadership Computing Facility<br>

>> Argonne National Laboratory<br>

>><br>

<br>

</blockquote></div>
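<div><br></div><div>To make the point about register contents concrete, here is a minimal sketch in Python, not LLVM code. It assumes a simplified model of a big-endian LD1-style load: element i of memory goes to lane i, and each element is read in the target's byte order. The helper names (ld1, st1, bitcast, rev64_32) are made up for illustration. Under that model, a bitcast defined as "store as one vector type, load as another" changes the 128-bit register value on big-endian, and for &lt;2 x i64&gt; to &lt;4 x i32&gt; the change is the same as swapping the 32-bit halves of each 64-bit doubleword.</div>

```python
def ld1(mem: bytes, lane_bytes: int, byteorder: str = "big") -> int:
    """Model an LD1-style load: memory element i lands in register lane i."""
    reg = 0
    for i in range(len(mem) // lane_bytes):
        elem = int.from_bytes(mem[i * lane_bytes:(i + 1) * lane_bytes], byteorder)
        reg |= elem << (i * lane_bytes * 8)  # lane i occupies bits [i*w, (i+1)*w)
    return reg


def st1(reg: int, lane_bytes: int, total_bytes: int = 16, byteorder: str = "big") -> bytes:
    """Inverse of ld1: write register lane i back as memory element i."""
    mask = (1 << (lane_bytes * 8)) - 1
    out = bytearray()
    for i in range(total_bytes // lane_bytes):
        elem = (reg >> (i * lane_bytes * 8)) & mask
        out += elem.to_bytes(lane_bytes, byteorder)
    return bytes(out)


def bitcast(reg: int, from_lane_bytes: int, to_lane_bytes: int, byteorder: str = "big") -> int:
    """Bitcast modelled as a store with the old lane size, then a load with the new one."""
    return ld1(st1(reg, from_lane_bytes, byteorder=byteorder), to_lane_bytes, byteorder)


def rev64_32(reg: int) -> int:
    """Swap the two 32-bit halves inside each 64-bit doubleword of a 128-bit value."""
    out = 0
    for d in range(2):
        dword = (reg >> (64 * d)) & ((1 << 64) - 1)
        swapped = ((dword & 0xFFFFFFFF) << 32) | (dword >> 32)
        out |= swapped << (64 * d)
    return out


mem = bytes(range(16))           # 16 bytes of memory: 00 01 02 ... 0f
v2i64 = ld1(mem, 8)              # register contents when loaded as <2 x i64>
v4i32 = bitcast(v2i64, 8, 4)     # same value bitcast to <4 x i32>

# Little-endian contrast: the same round trip leaves the register untouched.
le_reg = ld1(mem, 8, "little")
le_cast = bitcast(le_reg, 8, 4, "little")
```

<div>In this model v4i32 differs from v2i64 and equals rev64_32(v2i64), while le_cast equals le_reg. That matches the description above: conceptually every big-endian vector bitcast is a register permutation, even though adjacent ones can cancel or be merged.</div>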