[llvm-dev] [GlobalISel] A Proposal for global instruction selection

Wed Jan 13 08:01:36 PST 2016

----- Original Message -----
> From: "James Molloy" <james at jamesmolloy.co.uk>
> To: "Hal Finkel" <hfinkel at anl.gov>
> Cc: "llvm-dev" <llvm-dev at lists.llvm.org>, "Quentin Colombet" <qcolombet at apple.com>
> Sent: Wednesday, January 13, 2016 9:54:26 AM
> Subject: Re: [llvm-dev] [GlobalISel] A Proposal for global instruction selection
> 
> 
> > I think that teaching the optimizer about big-Endian lane ordering
> > would have been better.
> 
> 
> It's certainly arguable. Even in hindsight I'm glad we didn't -
> that's the approach GCC took and they've been fixing subtle bugs in
> their vectorizer ever since.
> 
> 
> > Inserting the REV after every LDR
> 
> 
> We only do this conceptually. In most cases REVs cancel out, and we
> have the LD1 instruction which is LDR+REV. With enough peepholes
> there's really no need for code to run slower.
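
A rough sketch of the kind of peephole James describes, for readers following along. Everything here is an assumption made for illustration: the instruction list is a hypothetical toy form of (opcode, register) pairs, not LLVM's MachineInstr representation, and a single "REV" stands in for the element-size-specific variants. It only shows that adjacent REV pairs cancel and that LDR followed by REV can be treated as LD1.

```python
def cancel_revs(instrs):
    """Drop adjacent REV pairs on the same register (REV; REV is the identity)."""
    out = []
    for instr in instrs:
        op, reg = instr
        if op == "REV" and out and out[-1] == ("REV", reg):
            out.pop()  # the second REV undoes the first
        else:
            out.append(instr)
    return out


def fuse_loads(instrs):
    """Rewrite LDR followed by REV on the same register as a single LD1."""
    out = []
    for instr in instrs:
        op, reg = instr
        if op == "REV" and out and out[-1] == ("LDR", reg):
            out[-1] = ("LD1", reg)  # LD1 behaves like LDR + REV
        else:
            out.append(instr)
    return out


def peephole(instrs):
    """Cancel REV pairs first, then fold any remaining LDR + REV into LD1."""
    return fuse_loads(cancel_revs(instrs))
```

This toy version only looks at directly adjacent instructions; a real pass would also have to move REVs across lane-wise operations before they can meet and cancel.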
> 
> 
> > Given what's been done, should we update the LangRef?
> 
> 
> Potentially, yes. I hadn't realised quite how strongly worded it was
> with respect to this.
> 

Please do ;)

 -Hal

> 
> James
> 
> 
> On Wed, 13 Jan 2016 at 14:39 Hal Finkel <hfinkel at anl.gov> wrote:
> 
> [resending so the message is smaller]
> 
> From: "James Molloy via llvm-dev" <llvm-dev at lists.llvm.org>
> To: "Quentin Colombet" <qcolombet at apple.com>
> Cc: "llvm-dev" <llvm-dev at lists.llvm.org>
> Sent: Wednesday, January 13, 2016 2:35:32 AM
> Subject: Re: [llvm-dev] [GlobalISel] A Proposal for global instruction selection
> 
> Hi Philip,
> 
> store <2 x i64> %1, <2 x i64>* %y
> 
> Yes. The memory pattern differs. (This is the first diagram on the
> right at: http://llvm.org/docs/BigEndianNEON.html#bitconverts )
> 
> 
> I think that teaching the optimizer about big-Endian lane ordering
> would have been better. Inserting the REV after every LDR sounds
> very similar to what we do for VSX on little-Endian PowerPC systems
> (PowerPC may have a slight advantage here in that we don't need to
> do insertelement / extractelement / shufflevector through memory on
> systems where little-Endian mode is relevant, see
> http://llvm.org/devmtg/2014-10/Slides/Schmidt-SupportingVectorProgramming.pdf
> ).
> 
> Given what's been done, should we update the LangRef? It currently
> reads, "The 'bitcast' instruction converts value to type ty2. It is
> always a no-op cast because no bits change with this conversion.
> The conversion is done as if the value had been stored to memory and
> read back as type ty2." But this is now, at the least, misleading,
> because this process of storing the value as one type and reading it
> back in as another does, in fact, change the bits. We need to make
> clear that this might change the bits (perhaps specifically by
> calling out this case of vector bitcasts on big-Endian systems?).
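
To make the point above concrete, here is a minimal sketch of the "store as one type, read back as another" model. It is a toy model under stated assumptions, not LLVM code: the helper names are hypothetical, lanes are stored to memory in array order with each lane in the target's byte order (the LD1/ST1-style layout), and lane 0 is assumed to occupy the least significant bits of the register.

```python
def store_lanes(lanes, width_bytes, big_endian):
    """Write lanes to memory in array order, each lane in the target byte order."""
    order = "big" if big_endian else "little"
    return b"".join(lane.to_bytes(width_bytes, order) for lane in lanes)


def load_lanes(mem, width_bytes, big_endian):
    """Read lanes back from memory in array order, each lane in the target byte order."""
    order = "big" if big_endian else "little"
    return [int.from_bytes(mem[i:i + width_bytes], order)
            for i in range(0, len(mem), width_bytes)]


def register_bits(lanes, width_bytes):
    """Pack lanes into one integer, lane 0 in the least significant bits (assumed layout)."""
    value = 0
    for index, lane in enumerate(lanes):
        value |= lane << (8 * width_bytes * index)
    return value


def bitcast_via_memory(lanes, src_width, dst_width, big_endian):
    """The LangRef model: store with the source type, reload with the destination type."""
    return load_lanes(store_lanes(lanes, src_width, big_endian), dst_width, big_endian)


# A <2 x i64> value reinterpreted as <4 x i32>.
v2i64 = [0x0011223344556677, 0x8899AABBCCDDEEFF]
little = bitcast_via_memory(v2i64, 8, 4, big_endian=False)
big = bitcast_via_memory(v2i64, 8, 4, big_endian=True)

# Compare the register contents before and after the round trip through memory.
unchanged_on_little = register_bits(little, 4) == register_bits(v2i64, 8)
unchanged_on_big = register_bits(big, 4) == register_bits(v2i64, 8)
```

Under these assumptions the register contents survive the round trip in the little-Endian case but not in the big-Endian one, which is why the "no bits change" wording no longer holds there.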
> 
> 
> 
> Also, regarding this, "Most operating systems however do not run
> with alignment faults enabled, so this is often not an issue." Are
> you saying that the processor does the correct thing in this case
> (if alignment faults are not enabled, then it performs a proper
> unaligned load), or that the operating-system trap handler emulates
> the unaligned load should one occur?
> 
> Thanks again,
> Hal

-- 
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory