[llvm-dev] [GlobalISel] A Proposal for global instruction selection

Philip Reames via llvm-dev llvm-dev at lists.llvm.org
Thu Jan 14 13:48:37 PST 2016


This explanation makes a lot more sense to me.  I think it would make 
sense to document this mental model, but I agree that this 
interpretation does not seem to require changes to the IR semantics.

Just to check: this implies that DSE *is* legal, right?

Philip

On 01/14/2016 05:48 AM, James Molloy wrote:
> Hi,
>
> I've given a bit of misinformation here and have caused some 
> confusion. After talking with Tim and Mehdi last night on IRC, I need 
> to correct what I said above to fall more in line with what Daniel is 
> saying. If any of the below contradicts what I've said already, please 
> accept my apologies. This version should be right.
>
> The behaviour of the code generator for big-endian NEON and MIPS is 
> derived from the fact that we did not want to change IR semantics at 
> all. A fundamental property that we do not want to break is memory 
> round-tripping:
>
> %1 = load <4 x i32>, <4 x i32>* %p32
> %2 = bitcast <4 x i32> %1 to <2 x i64>
> %p64 = bitcast <4 x i32>* %p32 to <2 x i64>*
> store <2 x i64> %2, <2 x i64>* %p64
>
> The value of memory before and after the store MUST NOT change 
> (contrary to what I said in an earlier post, I know).
>
> So in fact everything you can do in IR is valid. There are no changes 
> to IR semantics in the slightest. However, when it comes to generating 
> code from the IR, there are new rules:
>   1) Loads and stores are selected as special loads and stores that
> transform between a canonical form in memory and a type-specific
> form in register.
>   2) Because bitcasts are semantically store/load pairs, they must
> behave as if a store then a load was done. Specifically, (bitcast TyA to
> TyB) must transform TyA -> canonical form -> TyB, as a store then load
> would. Therefore bitcasts are not no-ops during code generation (*but
> behave as if they are from an IR perspective!*).
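
A minimal sketch of the two rules above, written in Python rather than taken from any LLVM code. The helper names (load_vec, store_vec, bitcast_reg), the 16-byte register, and the choice to put lane 0 in the low bits of the register are assumptions made for illustration; they are picked to match the Physreg values quoted further down the thread.

```python
def load_vec(mem: bytes, lane_bytes: int) -> int:
    """Rule 1, load side: canonical memory bytes -> type-specific register.

    Assumption: lane 0 sits at the lowest address, is read as a big-endian
    integer, and lands in the low bits of the register.
    """
    reg = 0
    for i in range(len(mem) // lane_bytes):
        lane = int.from_bytes(mem[i * lane_bytes:(i + 1) * lane_bytes], "big")
        reg |= lane << (8 * lane_bytes * i)
    return reg


def store_vec(reg: int, lane_bytes: int, total_bytes: int = 16) -> bytes:
    """Rule 1, store side: type-specific register -> canonical memory bytes."""
    mask = (1 << (8 * lane_bytes)) - 1
    out = bytearray()
    for i in range(total_bytes // lane_bytes):
        lane = (reg >> (8 * lane_bytes * i)) & mask
        out += lane.to_bytes(lane_bytes, "big")
    return bytes(out)


def bitcast_reg(reg: int, from_lane_bytes: int, to_lane_bytes: int) -> int:
    """Rule 2: a register-level bitcast behaves as a store followed by a load."""
    return load_vec(store_vec(reg, from_lane_bytes), to_lane_bytes)


memory = bytes.fromhex("00112233445566778899aabbccddeeff")
r32 = load_vec(memory, 4)      # register holding a <4 x i32>
r64 = bitcast_reg(r32, 4, 8)   # register holding a <2 x i64>

# The register contents differ between the two types...
assert r32 != r64
# ...but memory round-trips unchanged, which is the property that must hold.
assert store_vec(r64, 8) == memory
```

In this model the bitcast is real work at the register level, while the bytes that reach memory are the same whichever type performs the store.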
>
> The reason this works neatly in IR is the IR's type system. To change
> a value's type, you must insert either a cast or a memory round trip.
> There is no other way. However, in SDAG things break down a bit. SDAG
> is more weakly typed, and bitconverts are often simply removed. We
> need that not to happen. Bitconverts are not no-ops.
>
> Daniel's explanation of physical register mapping was excellent so I'm 
> not going to repeat that.
>
> I apologise for the confusion and misinformation. This is quite a 
> complex topic and takes a bit of mind bending for me to understand, 
> and it was a long time ago.
>
> James
>
> On Thu, 14 Jan 2016 at 13:17 Daniel Sanders <Daniel.Sanders at imgtec.com> wrote:
>
>     > Ok.  Then we need to change the LangRef as suggested.  Given this
>     > is a rather important semantic change, I think you need to send a
>     > top level RFC to the list.
>
>     FWIW, I don't think this is a semantic change to LLVM-IR itself. I
>     think it's more clearing up the misconception that LLVM-IR
>     semantics also apply to SelectionDAG's operations. That said, I do
>     think it's important to mention this in LangRef since it's very
>     easy to make this mistake and very few targets need to worry about
>     the distinction.
>
>     To explain why I don't think this is a semantic change to LLVM-IR,
>     let's consider this example from earlier:
>
>         %0 = load <4 x i32>, <4 x i32>* %x
>         %1 = bitcast <4 x i32> %0 to <2 x i64>
>         store <2 x i64> %1, <2 x i64>* %y
>
>     In LLVM-IR terms, if the value of %0 is:
>
>         %0 = 0x00112233_44556677_8899aabb_ccddeeff
>
>     then the value of %1 is:
>
>         %1 = 0x0011223344556677_8899aabbccddeeff
>
>     which agrees with the store/load and the 'no bits change'
>     statements in LangRef.
>
>     However, the mapping of these bits to physical register bits is
>     not consistent between types:
>
>         Physreg(%0) = 0xccddeeff_8899aabb_44556677_00112233
>
>         Physreg(%1) = 0x8899aabbccddeeff_0011223344556677
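
The mapping in this example can be reproduced with a short Python sketch. The function name physreg and the reading of the mapping as "reverse the order of the lanes, keep each lane's bits" are assumptions inferred from the four hex values above, not something taken from the MIPS or AArch64 backends.

```python
def physreg(value: int, lane_bits: int, total_bits: int = 128) -> int:
    """Map an IR-level vector value to the assumed physical register layout.

    The IR-level value is written with lane 0 in the most significant
    position; the register is assumed to hold lane 0 in its low bits, so the
    mapping reverses lane order without touching the bits inside a lane.
    """
    mask = (1 << lane_bits) - 1
    lanes = total_bits // lane_bits
    out = 0
    for i in range(lanes):
        lane = (value >> (lane_bits * i)) & mask       # i-th lane from the low end
        out |= lane << (lane_bits * (lanes - 1 - i))   # mirrored position
    return out


ir_value = 0x00112233_44556677_8899aabb_ccddeeff

# Same IR-level bits, viewed as <4 x i32> and as <2 x i64>.
assert physreg(ir_value, 32) == 0xccddeeff_8899aabb_44556677_00112233
assert physreg(ir_value, 64) == 0x8899aabbccddeeff_0011223344556677
```

The IR-level value is identical in both calls; only the lane width, and therefore the register image, differs.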
>
>     Essentially, I'm saying that BitCastInst and ISD::BITCAST have
>     slightly different semantics because of their different domains.
>     The former is working on an abstract representation of the values
>     where both statements in LangRef are true, but the latter is
>     closer to the target where the 'no bits change' statement ceases
>     to be true in some cases.
>
>     > A couple of points that will need clarified:
>     > - Does this only apply to vector types?  It definitely doesn't
>     > apply between pointer types today. What about integer, floating
>     > point, and FCAs?
>
>     I've only seen it for vector types so far but in theory it could
>     happen for other types. I'd expect FCAs to encounter it since the
>     physical registers may contain padding that isn't present in the
>     LLVM-IR representation and the placement and amount of padding
>     will depend on the exact FCA.
>
>     I can think of cases where address space casts can encounter the
>     same problem but that's already been covered in LangRef ("It can
>     be a no-op cast or a complex value modification, depending on the
>     target and the address space pair.").
>
>     Does anyone use FCAs directly? Most targets seem to convert them
>     to same-sized integers or bitcast an FCA* to i8*.
>
>
>     > - Is combining two casts into one a legal operation?  I think it
>     > is so far, but we need to explicitly state that.
>
>     Yes, A->B->C and A->C are equivalent.
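
A small Python check of the A->B->C versus A->C claim, under the same assumed model as the Physreg example earlier in this message (a register image is the IR-level value with its lane order reversed). The names reverse_lanes and reg_bitcast are hypothetical.

```python
def reverse_lanes(value: int, lane_bits: int, total_bits: int = 128) -> int:
    """Reverse lane order; applying it twice with the same width is the identity."""
    mask = (1 << lane_bits) - 1
    lanes = total_bits // lane_bits
    out = 0
    for i in range(lanes):
        lane = (value >> (lane_bits * i)) & mask
        out |= lane << (lane_bits * (lanes - 1 - i))
    return out


def reg_bitcast(reg: int, from_bits: int, to_bits: int) -> int:
    """Register-level cast: undo the source layout, then apply the destination one."""
    return reverse_lanes(reverse_lanes(reg, from_bits), to_bits)


reg_a = 0xccddeeff_8899aabb_44556677_00112233   # register image of a <4 x i32>

via_b = reg_bitcast(reg_bitcast(reg_a, 32, 64), 64, 16)   # A -> B -> C
direct = reg_bitcast(reg_a, 32, 16)                       # A -> C

assert via_b == direct
```

The intermediate layout cancels out because each cast passes through the same type-independent value, which is why merging the two casts is safe in this model.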
>
>
>     > - Do we have a predicate for identifying no-op casts that can be
>     > freely removed/combined?
>
>     James mentioned one in CGP but I haven't been able to find it. I
>     don't think it's necessary to have one at the LLVM-IR level but we
>     do need one in the backends. I remember adding one to the backend
>     but I can't find that either so I think I'm remembering one of my
>     patches from before I split MSA's registers into type-specific
>     classes.
>
>
>     > - Is coercing a load to the type it's immediately bitcast to
>     > legal under this model?
>
>     Yes.
>
>     From: llvm-dev [mailto:llvm-dev-bounces at lists.llvm.org] On Behalf Of Philip Reames via llvm-dev
>     Sent: 13 January 2016 20:31
>     To: James Molloy; Hal Finkel
>     Cc: llvm-dev
>     Subject: Re: [llvm-dev] [GlobalISel] A Proposal for global instruction selection
>
>     On 01/13/2016 12:20 PM, James Molloy wrote:
>
>         >  (Right?)
>
>         Uh no, the register content explicitly does change :( We
>         insert REV instructions (byteswap) on each bitcast. Bitcasts
>         can be merged and elided etc, but conceptually there's a
>         register content change on every bitcast.
>
>     Ok.  Then we need to change the LangRef as suggested.  Given this
>     is a rather important semantic change, I think you need to send a
>     top level RFC to the list.
>
>     A couple of points that will need clarified:
>     - Does this only apply to vector types?  It definitely doesn't
>     apply between pointer types today.  What about integer, floating
>     point, and FCAs?
>     - Is combining two casts into one a legal operation? I think it is
>     so far, but we need to explicitly state that.
>     - Do we have a predicate for identifying no-op casts that can be
>     freely removed/combined?
>     - Is coercing a load to the type it's immediately bitcast to legal
>     under this model?
>
>     James
>
>     On Wed, 13 Jan 2016 at 18:09 Philip Reames <listmail at philipreames.com> wrote:
>
>
>
>         On 01/13/2016 08:01 AM, Hal Finkel via llvm-dev wrote:
>         > ----- Original Message -----
>         >> From: "James Molloy" <james at jamesmolloy.co.uk>
>         >> To: "Hal Finkel" <hfinkel at anl.gov>
>         >> Cc: "llvm-dev" <llvm-dev at lists.llvm.org>, "Quentin Colombet" <qcolombet at apple.com>
>         >> Sent: Wednesday, January 13, 2016 9:54:26 AM
>         >> Subject: Re: [llvm-dev] [GlobalISel] A Proposal for global instruction selection
>         >>
>         >>
>         >>> I think that teaching the optimizer about big-Endian lane ordering
>         >>> would have been better.
>         >>
>         >> It's certainly arguable. Even in hindsight I'm glad we didn't -
>         >> that's the approach GCC took and they've been fixing subtle bugs in
>         >> their vectorizer ever since.
>         >>
>         >>
>         >>> Inserting the REV after every LDR
>         >>
>         >> We only do this conceptually. In most cases REVs cancel out, and we
>         >> have the LD1 instruction which is LDR+REV. With enough peepholes
>         >> there's really no need for code to run slower.
>         >>
>         >>
>         >>> Given what's been done, should we update the LangRef?
>         >>
>         >> Potentially, yes. I hadn't realised quite how strongly worded it was
>         >> with respect to this.
>         >>
>         > Please do ;)
>         I'm not sure changing bitcast is the right place. Since the bitcast is
>         representing the in-register value (which doesn't change), maybe we
>         should define it as part of the load/store instead?  That's essentially
>         what's going on; we're converting from a canonical register form to a
>         variety of memory forms.  (Right?)
>         >
>         >   -Hal
>         >
>         >> James
>         >>
>         >>
>         >> On Wed, 13 Jan 2016 at 14:39 Hal Finkel <hfinkel at anl.gov> wrote:
>         >>
>         >> [resending so the message is smaller]
>         >>
>         >> From: "James Molloy via llvm-dev" <llvm-dev at lists.llvm.org>
>         >> To: "Quentin Colombet" <qcolombet at apple.com>
>         >> Cc: "llvm-dev" <llvm-dev at lists.llvm.org>
>         >> Sent: Wednesday, January 13, 2016 2:35:32 AM
>         >> Subject: Re: [llvm-dev] [GlobalISel] A Proposal for global instruction selection
>         >>
>         >> Hi Philip,
>         >>
>         >>
>         >>
>         >>
>         >>
>         >> store <2 x i64> %1, <2 x i64>* %y
>         >>
>         >> Yes. The memory pattern differs. This is the first diagram on the
>         >> right at: http://llvm.org/docs/BigEndianNEON.html#bitconverts )
>         >>
>         >>
>         >> I think that teaching the optimizer about big-Endian lane ordering
>         >> would have been better. Inserting the REV after every LDR sounds
>         >> very similar to what we do for VSX on little-Endian PowerPC systems
>         >> (PowerPC may have a slight advantage here in that we don't need to
>         >> do insertelement / extractelement / shufflevector through memory on
>         >> systems where little-Endian mode is relevant, see
>         >> http://llvm.org/devmtg/2014-10/Slides/Schmidt-SupportingVectorProgramming.pdf
>         >> ).
>         >>
>         >> Given what's been done, should we update the LangRef? It currently
>         >> reads, "The 'bitcast' instruction converts value to type ty2. It
>         >> is always a no-op cast because no bits change with this conversion.
>         >> The conversion is done as if the value had been stored to memory and
>         >> read back as type ty2." But this is now, at the least, misleading,
>         >> because this process of storing the value as one type and reading it
>         >> back in as another does, in fact, change the bits. We need to make
>         >> clear that this might change the bits (perhaps specifically by
>         >> calling out this case of vector bitcasts on big-Endian systems?).
>         >>
>         >>
>         >>
>         >> Also, regarding this, "Most operating systems however do not run
>         >> with alignment faults enabled, so this is often not an issue." Are
>         >> you saying that the processor does the correct thing in this case
>         >> (if alignment faults are not enabled, then it performs a proper
>         >> unaligned load), or that the operating-system trap handler emulates
>         >> the unaligned load should one occur?
>         >>
>         >> Thanks again,
>         >> Hal
>         >>
>         >>
>         >> _______________________________________________
>         >>
>         >>
>         >> LLVM Developers mailing list
>         >> llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>
>         >> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>         >>
>         >>
>         >> --
>         >> Hal Finkel
>         >> Assistant Computational Scientist
>         >> Leadership Computing Facility
>         >> Argonne National Laboratory
>         >>
>
