[llvm-dev] [GlobalISel] A Proposal for global instruction selection

James Molloy via llvm-dev llvm-dev at lists.llvm.org
Wed Jan 13 07:54:26 PST 2016


> I think that teaching the optimizer about big-Endian lane ordering would
> have been better.

It's certainly arguable. Even in hindsight, I'm glad we didn't: that's the
approach GCC took, and they've been fixing subtle bugs in their vectorizer
ever since.

> Inserting the REV after every LDR

We only do this conceptually. In most cases REVs cancel out, and we have
the LD1 instruction which is LDR+REV. With enough peepholes there's really
no need for code to run slower.
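
To illustrate why the conceptual REVs can cancel, here is a toy Python model
(a sketch, not LLVM code). It assumes a 16-byte register on a big-endian
target and treats `rev` as a hypothetical full lane reversal, which on real
hardware may take more than one instruction. The names `ldr`, `ld1` and `rev`
are only labels for this model.

```python
def ldr(mem):
    """Whole-register load on a big-endian target: byte 0 of memory
    becomes the most significant byte of the register."""
    return int.from_bytes(mem, "big")


def ld1(mem, elt_bytes):
    """Element-wise load: memory element i lands in lane i, with lane 0
    in the lowest bits; each element is read as a big-endian integer."""
    reg = 0
    for i in range(len(mem) // elt_bytes):
        elt = int.from_bytes(mem[i * elt_bytes:(i + 1) * elt_bytes], "big")
        reg |= elt << (i * elt_bytes * 8)
    return reg


def rev(reg, elt_bytes, reg_bytes=16):
    """Hypothetical lane reversal: reverse the order of the lanes of the
    given width across the whole register."""
    bits = elt_bytes * 8
    mask = (1 << bits) - 1
    lanes = [(reg >> (i * bits)) & mask for i in range(reg_bytes // elt_bytes)]
    out = 0
    for i, lane in enumerate(reversed(lanes)):
        out |= lane << (i * bits)
    return out


mem = bytes(range(16))
for elt_bytes in (1, 2, 4, 8):
    # LDR followed by REV matches LD1 in this model.
    assert rev(ldr(mem), elt_bytes) == ld1(mem, elt_bytes)
    # Two REVs of the same width cancel, so a peephole can drop both.
    assert rev(rev(ldr(mem), elt_bytes), elt_bytes) == ldr(mem)
```

In this model a back-to-back pair of REVs is the identity, and a surviving
LDR+REV pair is equivalent to a single LD1.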

> Given what's been done, should we update the LangRef?

Potentially, yes. I hadn't realised quite how strongly worded it was with
respect to this.
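
As a rough illustration of the mismatch, the sketch below models a
<2 x i64> to <4 x i32> bitcast two ways: through memory (store as one type,
load as the other) and as a no-op reinterpretation of a register in which
lane 0 occupies the lowest bits. The helper names are hypothetical, and the
register layout is an assumption of this model rather than a statement about
any particular backend.

```python
import struct


def bitcast_via_memory(lanes, src_fmt, dst_fmt, byte_order):
    """Model 'store as one vector type, load back as another'.
    byte_order is '<' (little-endian) or '>' (big-endian)."""
    raw = struct.pack(byte_order + src_fmt, *lanes)
    return list(struct.unpack(byte_order + dst_fmt, raw))


def reinterpret_register(lanes, src_bits, dst_bits):
    """Model a no-op reinterpretation of a register whose lane 0 sits
    in the lowest bits (assumed layout)."""
    reg = 0
    for i, lane in enumerate(lanes):
        reg |= lane << (i * src_bits)
    mask = (1 << dst_bits) - 1
    count = len(lanes) * src_bits // dst_bits
    return [(reg >> (i * dst_bits)) & mask for i in range(count)]


v = [0x0000000100000002, 0x0000000300000004]  # <2 x i64>

# "2Q" is two unsigned 64-bit values, "4I" is four unsigned 32-bit values.
little = bitcast_via_memory(v, "2Q", "4I", "<")  # [2, 1, 4, 3]
big = bitcast_via_memory(v, "2Q", "4I", ">")     # [1, 2, 3, 4]
in_reg = reinterpret_register(v, 64, 32)         # [2, 1, 4, 3]

assert in_reg == little  # little-endian: store/load agrees with a no-op
assert in_reg != big     # big-endian: the lanes must be rearranged
```

Under these assumptions the store/load definition only coincides with "no
bits change" on the little-endian side; on the big-endian side the 32-bit
lanes within each 64-bit element come out swapped.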

James

On Wed, 13 Jan 2016 at 14:39 Hal Finkel <hfinkel at anl.gov> wrote:

> [resending so the message is smaller]
>
> From: "James Molloy via llvm-dev" <llvm-dev at lists.llvm.org>
> To: "Quentin Colombet" <qcolombet at apple.com>
> Cc: "llvm-dev" <llvm-dev at lists.llvm.org>
> Sent: Wednesday, January 13, 2016 2:35:32 AM
> Subject: Re: [llvm-dev] [GlobalISel] A Proposal for global instruction
> selection
>
> Hi Philip,
>
>       store <2 x i64> %1, <2 x i64>* %y
>
> Yes. The memory pattern differs. (This is the first diagram on the right
> at: http://llvm.org/docs/BigEndianNEON.html#bitconverts)
>
>
> I think that teaching the optimizer about big-Endian lane ordering would
> have been better. Inserting the REV after every LDR sounds very similar to
> what we do for VSX on little-Endian PowerPC systems (PowerPC may have a
> slight advantage here in that we don't need to do insertelement /
> extractelement / shufflevector through memory on systems where
> little-Endian mode is relevant, see
> http://llvm.org/devmtg/2014-10/Slides/Schmidt-SupportingVectorProgramming.pdf).
>
>
> Given what's been done, should we update the LangRef? It currently reads,
> "The 'bitcast' instruction converts value to type ty2. It is always a
> no-op cast because no bits change with this conversion. The conversion is
> done as if the value had been stored to memory and read back as type ty2."
> But this is now, at the least, misleading, because this process of storing
> the value as one type and reading it back in as another does, in fact,
> change the bits. We need to make clear that this might change the bits
> (perhaps specifically by calling out this case of vector bitcasts on
> big-Endian systems?).
>
> Also, regarding this: "Most operating systems however do not run with
> alignment faults enabled, so this is often not an issue." Are you saying
> that the processor does the correct thing in this case (if alignment faults
> are not enabled, then it performs a proper unaligned load), or that the
> operating-system trap handler emulates the unaligned load should one occur?
>
> Thanks again,
> Hal
>
> --
> Hal Finkel
> Assistant Computational Scientist
> Leadership Computing Facility
> Argonne National Laboratory
>