[llvm-dev] [GlobalISel] A Proposal for global instruction selection
Philip Reames via llvm-dev
llvm-dev at lists.llvm.org
Thu Jan 14 13:48:37 PST 2016
This explanation makes a lot more sense to me. I think it would make
sense to document this mental model, but I agree that this
interpretation does not seem to require changes to the IR semantics.
Just to check, this implies that DSE *is* legal right?
Philip
On 01/14/2016 05:48 AM, James Molloy wrote:
> Hi,
>
> I've given a bit of misinformation here and have caused some
> confusion. After talking with Tim and Mehdi last night on IRC, I need
> to correct what I said above to fall more in line with what Daniel is
> saying. If any of the below contradicts what I've said already, please
> accept my apologies. This version should be right.
>
> The behaviour of the code generator for big-endian NEON and MIPS is
> derived from the fact that we did not want to change IR semantics at
> all. A fundamental property that we do not want to break is memory
> round-tripping:
>
> %1 = load <4 x i32>, %p32
> %2 = bitcast <4 x i32> %1 to <2 x i64>
> store <2 x i64> %2, (bitcast %p32 to <2 x i64>*)
>
> The value of memory before and after the store MUST NOT change
> (contrary to what I said in an earlier post, I know).
>
> So in fact everything you can do in IR is valid. There are no changes
> to IR semantics in the slightest. However, when it comes to generating
> code from the IR, there are new rules:
> 1) Loads and stores are selected to be special loads and stores that
> do some transform from a canonical form in memory to a type-specific
> form in register.
> 2) Because bitcasts are load/store pairs in semantic, they must
> behave as if a store then load was done. Specifically (bitcast TyA to
> TyB) must transform TyA -> canonical form -> TyB, as a store then load
> would. Therefore bitcasts are not no-ops during code generation (*but
> behave as if they are from an IR perspective!*).
>
> The reason this works neatly in IR is due to the IR's type system. In
> order to change type, a cast must be inserted or a memory round trip.
> There is no other way. However in SDAG, things break down a bit. SDAG
> is more weakly typed, and bitconverts are often simply removed. We
> need that not to happen. Bitconverts are not no-ops.
>
> Daniel's explanation of physical register mapping was excellent so I'm
> not going to repeat that.
>
> I apologise for the confusion and misinformation. This is quite a
> complex topic and takes a bit of mind bending for me to understand,
> and it was a long time ago.
>
> James
>
> On Thu, 14 Jan 2016 at 13:17 Daniel Sanders <Daniel.Sanders at imgtec.com
> <mailto:Daniel.Sanders at imgtec.com>> wrote:
>
> > Ok. Then we need to change the LangRef as suggested. Given this
> is a rather important semantic change, I think you need to send a
> top level RFC to the list.
>
> FWIW, I don't think this is a semantic change to LLVM-IR itself. I
> think it's more clearing up the misconception that LLVM-IR
> semantics also apply to SelectionDAG's operations. That said, I do
> think it's important to mention this in LangRef since it's very
> easy to make this mistake and very few targets need to worry about
> the distinction.
>
> To explain why I don't think this is a semantic change to LLVM-IR,
> let's consider this example from earlier:
>
> %0 = load <4 x i32> %x
> %1 = bitcast <4 x i32> %0 to <2 x i64>
>
> store <2 x i64> %1, <2 x i64>* %y
>
> In LLVM-IR terms, if the value of %0 is:
>
> %0 = 0x00112233_44556677_8899aabb_ccddeeff
>
> then the value of %1 is:
>
> %1 = 0x0011223344556677_8899aabbccddeeff
>
> which agrees with the store/load and the 'no bits change'
> statements in LangRef.
>
> However, the mapping of these bits to physical register bits is
> not consistent between types:
>
> Physreg(%0) = 0xccddeeff_8899aabb_44556677_00112233
>
> Physreg(%1) = 0x8899aabbccddeeff_0011223344556677
>
> Essentially, I'm saying that BitCastInst and ISD::BITCAST have
> slightly different semantics because of their different domains.
> The former is working on an abstract representation of the values
> where both statements in LangRef are true, but the latter is
> closer to the target where the 'no bits change' statement ceases
> to be true in some cases.
>
> > A couple of points that will need clarified:
> > - Does this only apply to vector types? It definitely doesn't
> apply between pointer types today. What about integer, floating
> point, and FCAs?
>
> I've only seen it for vector types so far but in theory it could
> happen for other types. I'd expect FCAs to encounter it since the
> physical registers may contain padding that isn't present in the
> LLVM-IR representation and the placement and amount of padding
> will depend on the exact FCA.
>
> I can think of cases where address space casts can encounter the
> same problem but that's already been covered in LangRef ("It can
> be a no-op cast or a complex value modification, depending on the
> target and the address space pair.").
>
> Does anyone use FCAs directly? Most targets seem to convert them
> to same-sized integers or bitcast an FCA* to i8*.
>
>
> > - Is combining two casts into one a legal operation? I think it
> is so far, but we need to explicitly state that.
>
> Yes, A->B->C and A->C are equivalent.
>
>
> > - Do we have a predicate for identifying no-op casts that can be
> freely removed/combined?
>
> James mentioned one in CGP but I haven't been able to find it. I
> don't think it's necessary to have one at the LLVM-IR level but we
> do need one in the backends. I remember adding one to the backend
> but I can't find that either so I think I'm remembering one of my
> patches from before I split MSA's registers into type-specific
> classes.
>
>
> > - Is coercing a load to the type it's immediately bitcast to
> legal under this model?
>
> Yes.
>
> *From:*llvm-dev [mailto:llvm-dev-bounces at lists.llvm.org
> <mailto:llvm-dev-bounces at lists.llvm.org>] *On Behalf Of *Philip
> Reames via llvm-dev
> *Sent:* 13 January 2016 20:31
> *To:* James Molloy; Hal Finkel
> *Cc:* llvm-dev
>
>
> *Subject:* Re: [llvm-dev] [GlobalISel] A Proposal for global
> instruction selection
>
> On 01/13/2016 12:20 PM, James Molloy wrote:
>
> > (Right?)
>
> Uh no, the register content explicitly does change :( We
> insert REV instructions (byteswap) on each bitcast. Bitcasts
> can be merged and elided etc, but conceptually there's a
> register content change on every bitcast.
>
> Ok. Then we need to change the LangRef as suggested. Given this
> is a rather important semantic change, I think you need to send a
> top level RFC to the list.
>
> A couple of points that will need clarified:
> - Does this only apply to vector types? It definitely doesn't
> apply between pointer types today. What about integer, floating
> point, and FCAs?
> - Is combining two casts into one a legal operation? I think it is
> so far, but we need to explicitly state that.
> - Do we have a predicate for identifying no-op casts that can be
> freely removed/combined?
> - Is coercing a load to the type it's immediately bitcast to legal
> under this model?
>
> James
>
> On Wed, 13 Jan 2016 at 18:09 Philip Reames
> <listmail at philipreames.com <mailto:listmail at philipreames.com>> wrote:
>
>
>
> On 01/13/2016 08:01 AM, Hal Finkel via llvm-dev wrote:
> > ----- Original Message -----
> >> From: "James Molloy" <james at jamesmolloy.co.uk
> <mailto:james at jamesmolloy.co.uk>>
> >> To: "Hal Finkel" <hfinkel at anl.gov <mailto:hfinkel at anl.gov>>
> >> Cc: "llvm-dev" <llvm-dev at lists.llvm.org
> <mailto:llvm-dev at lists.llvm.org>>, "Quentin Colombet"
> <qcolombet at apple.com <mailto:qcolombet at apple.com>>
> >> Sent: Wednesday, January 13, 2016 9:54:26 AM
> >> Subject: Re: [llvm-dev] [GlobalISel] A Proposal for global
> instruction selection
> >>
> >>
> >>> I think that teaching the optimizer about big-Endian lane
> ordering
> >>> would have been better.
> >>
> >> It's certainly arguable. Even in hindsight I'm glad we didn't -
> >> that's the approach GCC took and they've been fixing subtle
> bugs in
> >> their vectorizer ever since.
> >>
> >>
> >>> Inserting the REV after every LDR
> >>
> >> We only do this conceptually. In most cases REVs cancel
> out, and we
> >> have the LD1 instruction which is LDR+REV. With enough
> peepholes
> >> there's really no need for code to run slower.
> >>
> >>
> >>> Given what's been done, should we update the LangRef.
> >>
> >> Potentially, yes. I hadn't realised quite how strongly
> worded it was
> >> with respect to this.
> >>
> > Please do ;)
> I'm not sure changing bitcast is the right place. Since the
> bitcast is
> representing the in-register value (which doesn't change),
> maybe we
> should define it as part of the load/store instead? That's
> essentially
> what's going on; we're converting from a canonical register
> form to a
> variety of memory forms. (Right?)
> >
> > -Hal
> >
> >> James
> >>
> >>
> >> On Wed, 13 Jan 2016 at 14:39 Hal Finkel < hfinkel at anl.gov
> <mailto:hfinkel at anl.gov> > wrote:
> >>
> >>
> >>
> >>
> >> [resending so the message is smaller]
> >>
> >>
> >>
> >>
> >>
> >>
> >> From: "James Molloy via llvm-dev" < llvm-dev at lists.llvm.org
> <mailto:llvm-dev at lists.llvm.org> >
> >> To: "Quentin Colombet" < qcolombet at apple.com
> <mailto:qcolombet at apple.com> >
> >> Cc: "llvm-dev" < llvm-dev at lists.llvm.org
> <mailto:llvm-dev at lists.llvm.org> >
> >> Sent: Wednesday, January 13, 2016 2:35:32 AM
> >> Subject: Re: [llvm-dev] [GlobalISel] A Proposal for global
> >> instruction selection
> >>
> >> Hi Philip,
> >>
> >>
> >>
> >>
> >>
> >> store <2 x i64> %1, <2 x i64>* %y
> >>
> >> Yes. The memory pattern differs. This is the first diagram
> on the
> >> right at: http://llvm.org/docs/BigEndianNEON.html#bitconverts )
> >>
> >>
> >> I think that teaching the optimizer about big-Endian lane
> ordering
> >> would have been better. Inserting the REV after every LDR
> sounds
> >> very similar to what we do for VSX on little-Endian PowerPC
> systems
> >> (PowerPC may have a slight advantage here in that we don't
> need to
> >> do insertelement / extractelement / shufflevector through
> memory on
> >> systems where little-Endian mode is relevant, see
> >>
> http://llvm.org/devmtg/2014-10/Slides/Schmidt-SupportingVectorProgramming.pdf
> >> ).
> >>
> >> Given what's been done, should we update the LangRef. It
> currently
> >> reads, " The ‘ bitcast ‘ instruction converts value to type
> ty2 . It
> >> is always a no-op cast because no bits change with this
> conversion.
> >> The conversion is done as if the value had been stored to
> memory and
> >> read back as type ty2 ." But this is now, at the least,
> misleading,
> >> because this process of storing the value as one type and
> reading it
> >> back in as another does, in fact, change the bits. We need
> to make
> >> clear that this might change the bits (perhaps specifically by
> >> calling out this case of vector bitcasts on big-Endian
> systems?).
> >>
> >>
> >>
> >> Also, regarding this, " Most operating systems however do
> not run
> >> with alignment faults enabled, so this is often not an
> issue." Are
> >> you saying that the processor does the correct thing in
> this case
> >> (if alignment faults are not enabled, then it performs a proper
> >> unaligned load), or that the operating-system trap handler
> emulates
> >> the unaligned load should one occur?
> >>
> >> Thanks again,
> >> Hal
> >>
> >>
> >> _______________________________________________
> >>
> >>
> >> LLVM Developers mailing list
> >> llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>
> >> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
> >>
> >>
> >> --
> >> Hal Finkel
> >> Assistant Computational Scientist
> >> Leadership Computing Facility
> >> Argonne National Laboratory
> >>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20160114/9e5887b6/attachment-0001.html>
More information about the llvm-dev
mailing list