[llvm-dev] [GlobalISel] A Proposal for global instruction selection
Hal Finkel via llvm-dev
llvm-dev at lists.llvm.org
Fri Jan 15 05:46:43 PST 2016
----- Original Message -----
> From: "Daniel Sanders" <Daniel.Sanders at imgtec.com>
> To: "James Molloy" <james at jamesmolloy.co.uk>, "Hal Finkel"
> <hfinkel at anl.gov>, "Philip Reames" <listmail at philipreames.com>
> Cc: llvm-dev at lists.llvm.org
> Sent: Friday, January 15, 2016 4:29:33 AM
> Subject: RE: [llvm-dev] [GlobalISel] A Proposal for global
> instruction selection
> Hi,
> I think we just need to draw attention to the fact that other IRs
> may vary.
> I'm thinking we should add something like this to the ISD::BITCAST
> doxygen comment:
> This is subtly different from the bitcast instruction from LLVM-IR
> since this node may change the bits
> in the register. For example, this occurs on big-endian NEON and
> big-endian MSA where the layout
> of the bits in the register depends on the vector type and this node
> acts as a shuffle operation for
> some vector type combinations.
I agree; improving the ISD::BITCAST documentation is a good idea.
> And have LangRef say something like:
> The conversion is done as if the value had been stored to memory and
> read back as type ty2. This is equivalent to a no-op cast where no
> bits change with this conversion.
> .. caution::
> This equivalence does not necessarily apply to other IRs in LLVM.
> See ISD::BITCAST for an example.
> The '.. caution::' should render in the same way as the 'Rationale'
> box in http://llvm.org/docs/LangRef.html#volatile-memory-accesses.
We should not do this. We try to keep the LangRef as implementation-independent as possible, and thus, we don't explicitly discuss things like ISD nodes there.
-Hal
> From: James Molloy [mailto:james at jamesmolloy.co.uk]
> Sent: 15 January 2016 08:46
> To: Hal Finkel; Philip Reames
> Cc: llvm-dev at lists.llvm.org; Daniel Sanders
> Subject: Re: [llvm-dev] [GlobalISel] A Proposal for global
> instruction selection
> Hi,
> > "It is always a no-op cast because no bits change with this
> > conversion. The conversion is done as if the value had been stored
> > to memory and read back as type ty2."
> I think a simple "as-if" in there should be sufficient;
> "It is always a no-op cast because it acts as if no bits change with
> this conversion. The conversion is done as if the value had been
> stored to memory and read back as type ty2."
> What do you think?
> James
> On Thu, 14 Jan 2016 at 22:35 Hal Finkel < hfinkel at anl.gov > wrote:
> > ----- Original Message -----
> > > From: "Philip Reames" < listmail at philipreames.com >
> > > To: "James Molloy" < james at jamesmolloy.co.uk >, "Daniel Sanders"
> > > < Daniel.Sanders at imgtec.com >, "Hal Finkel" < hfinkel at anl.gov >
> > > Cc: llvm-dev at lists.llvm.org
> > > Sent: Thursday, January 14, 2016 3:48:37 PM
> > > Subject: Re: [llvm-dev] [GlobalISel] A Proposal for global
> > > instruction selection
> > >
> > > This explanation makes a lot more sense to me. I think it would
> > > make sense to document this mental model, but I agree that this
> > > interpretation does not seem to require changes to the IR
> > > semantics.
>
> > The semantics, no. But we still may want to update the language
> > reference. It says, "It is always a no-op cast because no bits
> > change with this conversion. The conversion is done as if the value
> > had been stored to memory and read back as type ty2." And, what
> > we've learned, is that this second sentence does not always imply
> > the first (the bits might, in fact, change).
>
> > -Hal
>
> > >
>
> > > Just to check, this implies that DSE *is* legal right?
>
> > >
>
> > > Philip
>
> > >
>
> > >
>
> > > On 01/14/2016 05:48 AM, James Molloy wrote:
>
> > >
>
> > >
>
> > >
>
> > > Hi,
>
> > >
>
> > >
>
> > > I've given a bit of misinformation here and have caused some
>
> > > confusion. After talking with Tim and Mehdi last night on IRC, I
>
> > > need to correct what I said above to fall more in line with what
>
> > > Daniel is saying. If any of the below contradicts what I've said
>
> > > already, please accept my apologies. This version should be
> > > right.
>
> > >
>
> > >
>
> > > The behaviour of the code generator for big-endian NEON and MIPS
> > > is derived from the fact that we did not want to change IR
> > > semantics at all. A fundamental property that we do not want to
> > > break is memory round-tripping:
> > >
> > > %1 = load <4 x i32>, <4 x i32>* %p32
> > > %2 = bitcast <4 x i32> %1 to <2 x i64>
> > > ; %p64 is the same address as %p32, viewed as <2 x i64>*
> > > %p64 = bitcast <4 x i32>* %p32 to <2 x i64>*
> > > store <2 x i64> %2, <2 x i64>* %p64
> > >
> > > The value of memory before and after the store MUST NOT change
> > > (contrary to what I said in an earlier post, I know).
>
> > >
>
> > >
>
> > > So in fact everything you can do in IR is valid. There are no
> > > changes
>
> > > to IR semantics in the slightest. However, when it comes to
>
> > > generating code from the IR, there are new rules:
>
> > > 1) Loads and stores are selected to be special loads and stores
> > > that do some transform from a canonical form in memory to a
> > > type-specific form in register.
> > > 2) Because bitcasts are semantically load/store pairs, they must
> > > behave as if a store then load was done. Specifically (bitcast
> > > TyA to TyB) must transform TyA -> canonical form -> TyB, as a
> > > store then load would. Therefore bitcasts are not no-ops during
> > > code generation (*but behave as if they are from an IR
> > > perspective!*).
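A minimal sketch of the two rules, as a toy Python model rather than anything in the LLVM tree. It assumes a 128-bit register with lane 0 in the least significant bits and elements stored big-endian in memory. The names load, store and bitcast_in_register are hypothetical.

```python
REG_BYTES = 16  # one 128-bit vector register (assumption of this model)

def load(mem: bytes, lane_bits: int) -> int:
    """Rule 1: canonical memory bytes -> type-specific register bits."""
    step = lane_bits // 8
    reg = 0
    for i in range(0, REG_BYTES, step):
        lane = int.from_bytes(mem[i:i + step], "big")
        reg |= lane << (i * 8)  # the lane at byte offset i lands at bit i*8
    return reg

def store(reg: int, lane_bits: int) -> bytes:
    """Rule 1, other direction: register bits -> canonical memory bytes."""
    step = lane_bits // 8
    mask = (1 << lane_bits) - 1
    out = bytearray()
    for i in range(0, REG_BYTES, step):
        lane = (reg >> (i * 8)) & mask
        out += lane.to_bytes(step, "big")
    return bytes(out)

def bitcast_in_register(reg: int, from_bits: int, to_bits: int) -> int:
    """Rule 2: behave exactly like a store as TyA followed by a load as TyB."""
    return load(store(reg, from_bits), to_bits)

mem = bytes(range(16))                  # 00 01 02 ... 0f
r32 = load(mem, 32)                     # register holding <4 x i32>
r64 = bitcast_in_register(r32, 32, 64)  # register holding <2 x i64>
```

In this model store(r64, 64) gives back the original bytes, so memory round-trips, while r32 and r64 are different bit patterns, so the register-level bitcast is not a no-op.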
>
> > >
>
> > >
>
> > > The reason this works neatly in IR is due to the IR's type
> > > system. In order to change type, a cast or a memory round trip
> > > must be inserted. There is no other way. However in SDAG, things
> > > break down a bit. SDAG is more weakly typed, and bitconverts are
> > > often simply removed. We need that not to happen. Bitconverts are
> > > not no-ops.
>
> > >
>
> > >
>
> > > Daniel's explanation of physical register mapping was excellent
> > > so
>
> > > I'm not going to repeat that.
>
> > >
>
> > >
>
> > > I apologise for the confusion and misinformation. This is quite a
>
> > > complex topic and takes a bit of mind bending for me to
> > > understand,
>
> > > and it was a long time ago.
>
> > >
>
> > >
>
> > > James
>
> > >
>
> > >
>
> > > On Thu, 14 Jan 2016 at 13:17 Daniel Sanders <
>
> > > Daniel.Sanders at imgtec.com > wrote:
>
> > > > Ok. Then we need to change the LangRef as suggested. Given this is
> > > > a rather important semantic change, I think you need to send a top
> > > > level RFC to the list.
>
> > > FWIW, I don't think this is a semantic change to LLVM-IR itself. I
> > > think it's more clearing up the misconception that LLVM-IR
> > > semantics also apply to SelectionDAG's operations. That said, I do
> > > think it's important to mention this in LangRef since it's very
> > > easy to make this mistake and very few targets need to worry about
> > > the distinction.
> > >
> > > To explain why I don't think this is a semantic change to LLVM-IR,
> > > let's consider this example from earlier:
>
> > > %0 = load <4 x i32>, <4 x i32>* %x
> > > %1 = bitcast <4 x i32> %0 to <2 x i64>
> > > store <2 x i64> %1, <2 x i64>* %y
>
> > > In LLVM-IR terms, if the value of %0 is:
> > > %0 = 0x00112233_44556677_8899aabb_ccddeeff
> > > then the value of %1 is:
> > > %1 = 0x0011223344556677_8899aabbccddeeff
> > > which agrees with the store/load and the 'no bits change'
> > > statements in LangRef.
> > >
> > > However, the mapping of these bits to physical register bits is
> > > not consistent between types:
> > > Physreg(%0) = 0xccddeeff_8899aabb_44556677_00112233
> > > Physreg(%1) = 0x8899aabbccddeeff_0011223344556677
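The two Physreg values can be reproduced with a small Python sketch. It assumes the mapping is simply "reverse the order of the lanes", which matches the numbers quoted here but is only a model of the behaviour, not backend code. The names physreg and isd_bitcast are made up for illustration.

```python
def physreg(value: int, lane_bits: int, total_bits: int = 128) -> int:
    """Map an abstract vector value to register bits by reversing lane order."""
    mask = (1 << lane_bits) - 1
    reg = 0
    for shift in range(0, total_bits, lane_bits):
        lane = (value >> shift) & mask
        reg |= lane << (total_bits - lane_bits - shift)
    return reg

def isd_bitcast(reg: int, from_bits: int, to_bits: int) -> int:
    """Register-level cast: undo the source layout, apply the destination one."""
    value = physreg(reg, from_bits)  # lane reversal is its own inverse
    return physreg(value, to_bits)

value = 0x00112233_44556677_8899aabb_ccddeeff
reg_v4i32 = physreg(value, 32)
reg_v2i64 = physreg(value, 64)
```

Under these assumptions the abstract value is the same for both types, while isd_bitcast(reg_v4i32, 32, 64) has to swap the 32-bit halves of each 64-bit lane to produce reg_v2i64.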
>
> > >
>
> > >
>
> > >
>
> > > Essentially, I'm saying that BitCastInst and ISD::BITCAST have
> > > slightly different semantics because of their different domains.
> > > The former is working on an abstract representation of the values
> > > where both statements in LangRef are true, but the latter is closer
> > > to the target where the 'no bits change' statement ceases to be
> > > true in some cases.
>
> > >
>
> > >
>
> > >
>
> > > > A couple of points that will need to be clarified:
> > > > - Does this only apply to vector types? It definitely doesn't
> > > > apply between pointer types today. What about integer, floating
> > > > point, and FCAs?
>
> > > I've only seen it for vector types so far but in theory it could
> > > happen for other types. I'd expect FCAs to encounter it since the
> > > physical registers may contain padding that isn't present in the
> > > LLVM-IR representation and the placement and amount of padding
> > > will depend on the exact FCA.
> > >
> > > I can think of cases where address space casts can encounter the
> > > same problem but that's already been covered in LangRef ("It can be
> > > a no-op cast or a complex value modification, depending on the
> > > target and the address space pair.").
>
> > >
>
> > >
>
> > >
>
> > > Does anyone use FCAs directly? Most targets seem to convert them
> > > to
>
> > > same-sized integers or bitcast an FCA* to i8*.
>
> > >
>
> > >
>
> > >
>
> > >
>
> > > > - Is combining two casts into one a legal operation? I think it is
> > > > so far, but we need to explicitly state that.
>
> > > Yes, A->B->C and A->C are equivalent.
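As a sanity check that combining casts also holds at the register level, here is a hedged Python property test. It assumes a register-level cast is "undo the source lane reversal, then apply the destination one" on a 128-bit register. The helpers reverse_lanes, cast and casts_compose are hypothetical names for this sketch.

```python
import itertools
import random

def reverse_lanes(bits: int, lane_bits: int, total_bits: int = 128) -> int:
    """Reverse the order of lane_bits-wide lanes inside a register value."""
    mask = (1 << lane_bits) - 1
    out = 0
    for shift in range(0, total_bits, lane_bits):
        out = (out << lane_bits) | ((bits >> shift) & mask)
    return out

def cast(reg: int, src_bits: int, dst_bits: int) -> int:
    """Register-level cast between two lane widths in this toy model."""
    return reverse_lanes(reverse_lanes(reg, src_bits), dst_bits)

def casts_compose(widths=(8, 16, 32, 64), trials=100, seed=0) -> bool:
    """Check A->B->C == A->C on random register contents."""
    rng = random.Random(seed)
    for a, b, c in itertools.product(widths, repeat=3):
        for _ in range(trials):
            reg = rng.getrandbits(128)
            if cast(cast(reg, a, b), b, c) != cast(reg, a, c):
                return False
    return True
```

The property holds in this model because the intermediate layout is applied and then undone, so only the first and last types matter.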
>
> > >
>
> > >
>
> > >
>
> > >
>
> > > > - Do we have a predicate for identifying no-op casts that can
> > > > be
>
> > > > freely removed/combined?
>
> > > James mentioned one in CGP but I haven't been able to find it. I
> > > don't think it's necessary to have one at the LLVM-IR level but we
> > > do need one in the backends. I remember adding one to the backend
> > > but I can't find that either so I think I'm remembering one of my
> > > patches from before I split MSA's registers into type-specific
> > > classes.
>
> > >
>
> > >
>
> > >
>
> > >
>
> > > > - Is coercing a load to the type it's immediately bitcast to
> > > > legal
>
> > > > under this model?
>
> > > Yes.
>
> > > From: llvm-dev [mailto: llvm-dev-bounces at lists.llvm.org ] On
> > > Behalf
>
> > > Of Philip Reames via llvm-dev
>
> > > Sent: 13 January 2016 20:31
>
> > > To: James Molloy; Hal Finkel
>
> > > Cc: llvm-dev
> > > Subject: Re: [llvm-dev] [GlobalISel] A Proposal for global
>
> > > instruction selection
>
> > > On 01/13/2016 12:20 PM, James Molloy wrote:
>
> > > > (Right?)
>
> > > Uh no, the register content explicitly does change :( We insert
> > > REV instructions (byteswap) on each bitcast. Bitcasts can be merged
> > > and elided etc, but conceptually there's a register content change
> > > on every bitcast.
> > >
> > > Ok. Then we need to change the LangRef as suggested. Given this is
> > > a rather important semantic change, I think you need to send a top
> > > level RFC to the list.
> > >
> > > A couple of points that will need to be clarified:
> > > - Does this only apply to vector types? It definitely doesn't apply
> > > between pointer types today. What about integer, floating point,
> > > and FCAs?
> > > - Is combining two casts into one a legal operation? I think it is
> > > so far, but we need to explicitly state that.
> > > - Do we have a predicate for identifying no-op casts that can be
> > > freely removed/combined?
> > > - Is coercing a load to the type it's immediately bitcast to legal
> > > under this model?
>
> > > James
>
> > > On Wed, 13 Jan 2016 at 18:09 Philip Reames <
>
> > > listmail at philipreames.com > wrote:
>
> > > On 01/13/2016 08:01 AM, Hal Finkel via llvm-dev wrote:
>
> > > > ----- Original Message -----
>
> > > >> From: "James Molloy" < james at jamesmolloy.co.uk >
>
> > > >> To: "Hal Finkel" < hfinkel at anl.gov >
>
> > > >> Cc: "llvm-dev" < llvm-dev at lists.llvm.org >, "Quentin Colombet"
> > > >> <
>
> > > >> qcolombet at apple.com >
>
> > > >> Sent: Wednesday, January 13, 2016 9:54:26 AM
>
> > > >> Subject: Re: [llvm-dev] [GlobalISel] A Proposal for global
>
> > > >> instruction selection
>
> > > >>
>
> > > >>
>
> > > >>> I think that teaching the optimizer about big-Endian lane
>
> > > >>> ordering
>
> > > >>> would have been better.
>
> > > >>
>
> > > >> It's certainly arguable. Even in hindsight I'm glad we didn't -
> > > >> that's the approach GCC took and they've been fixing subtle bugs
> > > >> in their vectorizer ever since.
>
> > > >>
>
> > > >>
>
> > > >>> Inserting the REV after every LDR
>
> > > >>
>
> > > >> We only do this conceptually. In most cases REVs cancel out, and
> > > >> we have the LD1 instruction which is LDR+REV. With enough
> > > >> peepholes there's really no need for code to run slower.
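To illustrate the kind of peephole being described, here is a toy Python sketch over a straight-line list of operations. The tuple encoding, ("LDR", reg, addr) and ("REV", lane_bits, reg), is invented for this example, and the real backend works through its own patterns and combines rather than anything like this.

```python
def peephole(ops):
    """Cancel back-to-back identical REVs and fuse LDR+REV into LD1 (toy model)."""
    out = []
    for op in ops:
        prev = out[-1] if out else None
        if prev is not None and op[0] == "REV":
            # The same REV applied twice to the same register is the identity.
            if prev == op:
                out.pop()
                continue
            # LDR reg, addr followed by REV on reg becomes a lane-aware LD1.
            if prev[0] == "LDR" and prev[1] == op[2]:
                out[-1] = ("LD1", op[1], prev[1], prev[2])
                continue
        out.append(op)
    return out

ops = [
    ("LDR", "q0", "x0"),
    ("REV", 32, "q0"),
    ("REV", 64, "q0"),
    ("REV", 64, "q0"),
    ("STR", "q0", "x1"),
]
optimized = peephole(ops)
```

Because the output list is used like a stack, nested pairs such as REV a, REV b, REV b, REV a also collapse completely.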
>
> > > >>
>
> > > >>
>
> > > >>> Given what's been done, should we update the LangRef?
> > > >>
> > > >> Potentially, yes. I hadn't realised quite how strongly worded it
> > > >> was with respect to this.
>
> > > >>
>
> > > > Please do ;)
>
> > > I'm not sure changing bitcast is the right place. Since the bitcast
> > > is representing the in-register value (which doesn't change), maybe
> > > we should define it as part of the load/store instead? That's
> > > essentially what's going on; we're converting from a canonical
> > > register form to a variety of memory forms. (Right?)
>
> > > >
>
> > > > -Hal
>
> > > >
>
> > > >> James
>
> > > >>
>
> > > >>
>
> > > >> On Wed, 13 Jan 2016 at 14:39 Hal Finkel < hfinkel at anl.gov >
> > > >> wrote:
>
> > > >> [resending so the message is smaller]
>
> > > >> From: "James Molloy via llvm-dev" < llvm-dev at lists.llvm.org >
>
> > > >> To: "Quentin Colombet" < qcolombet at apple.com >
>
> > > >> Cc: "llvm-dev" < llvm-dev at lists.llvm.org >
>
> > > >> Sent: Wednesday, January 13, 2016 2:35:32 AM
>
> > > >> Subject: Re: [llvm-dev] [GlobalISel] A Proposal for global
>
> > > >> instruction selection
>
> > > >>
>
> > > >> Hi Philip,
>
> > > >> store <2 x i64> %1, <2 x i64>* %y
>
> > > >>
>
> > > >> Yes. The memory pattern differs. This is the first diagram on the
> > > >> right at: http://llvm.org/docs/BigEndianNEON.html#bitconverts )
> > > >>
> > > >> I think that teaching the optimizer about big-Endian lane ordering
> > > >> would have been better. Inserting the REV after every LDR sounds
> > > >> very similar to what we do for VSX on little-Endian PowerPC
> > > >> systems (PowerPC may have a slight advantage here in that we don't
> > > >> need to do insertelement / extractelement / shufflevector through
> > > >> memory on systems where little-Endian mode is relevant, see
> > > >> http://llvm.org/devmtg/2014-10/Slides/Schmidt-SupportingVectorProgramming.pdf
> > > >> ).
>
> > > >>
>
> > > >> Given what's been done, should we update the LangRef? It
> > > >> currently reads, "The ‘bitcast’ instruction converts value to
> > > >> type ty2. It is always a no-op cast because no bits change with
> > > >> this conversion. The conversion is done as if the value had been
> > > >> stored to memory and read back as type ty2." But this is now, at
> > > >> the least, misleading, because this process of storing the value
> > > >> as one type and reading it back in as another does, in fact,
> > > >> change the bits. We need to make clear that this might change
> > > >> the bits (perhaps specifically by calling out this case of
> > > >> vector bitcasts on big-Endian systems?).
> > > >>
> > > >> Also, regarding this, "Most operating systems however do not run
> > > >> with alignment faults enabled, so this is often not an issue." Are
> > > >> you saying that the processor does the correct thing in this case
> > > >> (if alignment faults are not enabled, then it performs a proper
> > > >> unaligned load), or that the operating-system trap handler
> > > >> emulates the unaligned load should one occur?
>
> > > >>
>
> > > >> Thanks again,
>
> > > >> Hal
>
> > > >>
>
> > > >>
>
> > > >> _______________________________________________
>
> > > >>
>
> > > >>
>
> > > >> LLVM Developers mailing list
>
> > > >> llvm-dev at lists.llvm.org
>
> > > >> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>
> > > >>
>
> > > >>
>
> > > >> --
>
> > > >> Hal Finkel
>
> > > >> Assistant Computational Scientist
>
> > > >> Leadership Computing Facility
>
> > > >> Argonne National Laboratory
>
--
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory