[llvm-dev] Reaching definitions on Machine IR post register allocation

Raghavan, Venugopal via llvm-dev llvm-dev at lists.llvm.org
Tue Oct 31 21:24:00 PDT 2017


Hi Geoff/Krzyssztof,

Wouldn't the isRenamable() change be required even for the RDF based copy propagation? Maybe Hexagon does not impose ABI/ISA restrictions which require specific registers to be used in specific contexts.

Also, if Geoff's copy propagation pass is invoked post-RA wouldn't it need to handle the x86 ISA feature which allows 8 bit/16 bit values to be moved into a register without over-writing the other bits in the register being defined? I am assuming this needs to be handled for this pass to work for x86. Is this part of Geoff's patch?

I have seen cases where we need copy propagation across basic blocks. Given that RDF works on the whole function, would it not be a good idea to just use that with the X86 partial-definition issue being handled?

Considering everything, would the following not be a good approach?

1) Use the isRenamable() change for MachineOperand which allows us to adhere to ABI/ISA constraints
2) Fix the partial register definition issues as seen in x86
3) Try to get the RDF based analysis and copy propagation and other passes built on RDF to work for all targets including x86 (we get the advantage of passes which can work on the whole machine function assuming that there are no scalability issues by doing so).

I myself implemented a rudimentary copy propagation pass which tries to copy propagate on the whole machine function but was put off from doing further work on this because of the X86 partial definition issue mentioned above to which Krzysztof drew my attention.

Regards,
Venu.

-----Original Message-----
From: Geoff Berry [mailto:gberry at codeaurora.org] 
Sent: Tuesday, October 31, 2017 8:33 PM
To: llvm-dev at lists.llvm.org; Raghavan, Venugopal <Venugopal.Raghavan at amd.com>
Subject: Re: [llvm-dev] Reaching definitions on Machine IR post register allocation

Hi Venu,

FWIW, I have a pass that does copy propagation after RA [1] (currently only within a basic block) that should be enabled some time in the not-too-distant future.  It has been reviewed and accepted, but I'm currently working on getting a slight change to the MachineOperand representation [2] that should make the copy propagation change much simpler.  I believe this change to MachineOperand (or something like it) would also be needed for any pass that wants to rename registers after RA (unless the renaming is done right after RA when virtual registers are still present, which is what my current patch does, and is the source of complexity that I'm trying to eliminate).

[1] https://reviews.llvm.org/D30751
[2] https://reviews.llvm.org/D39400 D39400 WIP: [MachineOperand][MIR] Add isRenamable to MachineOperand.
	
On 10/31/2017 5:49 AM, Raghavan, Venugopal via llvm-dev wrote:
> Hi Krzysztof,
> 
> Thanks a lot for taking the time to write a detailed explanation. I 
> think I understand things better now.
> 
> I am trying to see if I can use RDF for X86 assuming I can add more 
> register units for X86 so that the partial re-definition issue you 
> pointed earlier is fixed. I am not yet sure whether it is easy or not, 
> but I have identified a portion of code in TableGen which I can modify 
> to do this.
> 
> Once I have done this I plan to use the RDG graph construction and RDF 
> copy propagation code to try to propagate copies in the code after RA.
> 
> I am hoping that this plan would work.
> 
> Regards,
> 
> Venu.
> 
> From: *Krzysztof Parzyszek via llvm-dev* <llvm-dev at lists.llvm.org 
> <mailto:llvm-dev at lists.llvm.org>>
> Date: Mon, Oct 30, 2017 at 7:25 PM
> Subject: Re: [llvm-dev] Reaching definitions on Machine IR post 
> register allocation
> To: llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>
> 
> 
> Hi Raghavan,
> Thanks for asking the questions, it's not a problem at all.
> 
> The RDF graph simulates SSA. It does so by having "artificial" PHI 
> nodes that account for multiple reaching defs.
> 
> Consider this example (for Hexagon). The CFG is a simple triangle 
> which has a split block (BB#0), which either goes to the side block 
> (BB#1), or to the join block (BB#2). The register R0 is defined in 
> BB#0 and then redefined in BB#1. Both definitions can reach the use of R0 in BB#2:
> 
> 
> ***
> Before Hexagon RDF optimizations
> # Machine code for function fred: IsSSA, NoPHIs, TracksLiveness, 
> NoVRegs
> 
> BB#0:
>      Live Ins: %P0
>          %R0<def> = IMPLICIT_DEF
>          J2_jumpt %P0, <BB#2>, %PC<imp-def>  ; Conditional branch to 
> BB#2
>      Successors according to CFG: BB#1 BB#2
> 
> BB#1:
>      Predecessors according to CFG: BB#0
>          %R0<def> = IMPLICIT_DEF
>      Successors according to CFG: BB#2
> 
> BB#2:
>      Live Ins: %R0
>      Predecessors according to CFG: BB#0 BB#1
>          %R1<def> = COPY %R0
> ***
> 
> The constructed graph is below. Note the phi node (p17) in BB#2:
> 
> Starting copy propagation on: fred
> DFG dump:[
> f1: Function: fred
> b2: --- BB#0 --- preds(0):   succs(2): BB#1, BB#2
> p15: phi [+d16<P0>(,,u7):]
> s3: IMPLICIT_DEF [d4<R0>(,d10,u19):]
> s5: J2_jumpt BB#2 [/+d6<PC>!(,,):, u7<P0>(+d16):]
> 
> b8: --- BB#1 --- preds(1): BB#0  succs(1): BB#2
> s9: IMPLICIT_DEF [d10<R0>(d4,,u20):]
> 
> b11: --- BB#2 --- preds(2): BB#0, BB#1  succs(0):
> p17: phi [+d18<R0>(,,u14):, u19<R0>(d4,b2):, u20<R0>(d10,b8):]  ; 
> *PHI*
> s12: COPY [d13<R1>(,,):, u14<R0>(+d18):]
> 
> ]
> 
> 
> The phi node has uses which are linked to the reaching defs, and 
> paired with the predecessor block to which the def corresponds. At 
> this point, the graph looks a lot like the standard SSA graph.
> 
> 
> There are cases where there are multiple reaching defs emerging in a 
> straight-line code. Imagine a register BIG_REG that has two 
> non-overlapping subregisters, SMALL_REG_1 and SMALL_REG_2. Now, 
> consider
> this:
> 
> SMALL_REG_1 = ...
> SMALL_REG_2 = ...
> ... = BIG_REG
> 
> Both of the preceding definitions of of the subregisters of BIG_REG 
> reach the use of BIG_REG, and both of them have to be connected to the 
> use somehow. Phi nodes are not the solution, since they are specific 
> to join blocks in the control flow graph. What RDF does is that it 
> adds additional uses of BIG_REG, and each of them has a different 
> reaching def. These uses are called "shadows" in the RDF terminology.
> 
> A concrete example (from Hexagon again, D0 is a pair of registers R0 
> and R1). Note the two uses u9 and u10 in the COPY instruction, and 
> that each of them has one of the preceding defs as its reaching def:
> 
> Before Hexagon RDF optimizations
> # Machine code for function fred: IsSSA, NoPHIs, TracksLiveness, 
> NoVRegs
> 
> BB#0:
>          %R0<def> = IMPLICIT_DEF
>          %R1<def> = IMPLICIT_DEF
>          %D1<def> = COPY %D0
> 
> # End machine code for function fred.
> 
> Starting copy propagation on: fred
> DFG dump:[
> f1: Function: fred
> b2: --- BB#0 --- preds(0):   succs(0):
> s3: IMPLICIT_DEF [d4<R0>(,,u10"):]
> s5: IMPLICIT_DEF [d6<R1>(,,u9"):]
> s7: COPY [d8<D1>(,,):, u9"<D0>(d6):, u10"<D0>(d4):] ;; Two uses: u9, 
> u10
> 
> ]
> 
> 
> 
> In the example above, the fact that the two subregisters were 
> non-overlapping was important. If they are non-overlapping, there 
> won't be any data-flow links between them, since they are essentially 
> independent registers. If they were overlapping, such links would be 
> present. Here's a modified version of the example above, where we 
> first modify D0, then one of its subregisters, and then read the whole D0:
> 
> Before Hexagon RDF optimizations
> # Machine code for function fred: IsSSA, NoPHIs, TracksLiveness, 
> NoVRegs
> 
> BB#0:
>          %D0<def> = IMPLICIT_DEF
>          %R1<def> = IMPLICIT_DEF
>          %D1<def> = COPY %D0
> 
> # End machine code for function fred.
> 
> Starting copy propagation on: fred
> DFG dump:[
> f1: Function: fred
> b2: --- BB#0 --- preds(0):   succs(0):
> s3: IMPLICIT_DEF [d4<D0>(,d6,):]
> s5: IMPLICIT_DEF [d6<R1>(d4,,u9):]
> s7: COPY [d8<D1>(,,):, u9<D0>(d6):]
> 
> ]
> 
> 
> This time, the use of D0 only has the immediately preceding def of R1, 
> and that def is then linked to the preceding def of D0.
> 
> 
> Before I describe how to get the reaching defs from the graph, let me 
> point out that there is a function "getAllReachingDefs" in 
> RDFLiveness.cpp that for a given RefNode returns the vector of all 
> reaching defs from the same block in the order in which they would be 
> encountered going backwards from the RefNode to the beginning of the 
> basic block. That function will only return defs from the same basic 
> block as the RefNode is in. This function was meant to hide the 
> complexity of finding reaching defs "by hand", and is the preferred 
> way of getting reaching defs.
> 
> Now, to get all reaching defs for a given use or def, you need to 
> follow the "reaching-def" link up in the graph, possibly multiple 
> times, until the set of registers associated with these reaching defs 
> covers the register from the use/def that you started from. In the 
> last example, there is a use of D0, and the reaching def is d6, which 
> is associated with R1. The set {R1} does not cover D0, so you'd need 
> to go further up, to the reaching def of d6. That def (d4) defines D0, 
> so now the set of registers is {R1,D0}, which completely covers D0, so 
> the algorithm
> stops: the list of reaching defs is d6, d4.
> All of that work is done by getAllReachingDefs, together with handling 
> shadows and all other necessary cases.
> 
> 
> The "isPreserving" flag has more to do with making sure that all 
> necessary links are present in the graph. The idea is that the entire 
> data flow is completely represented via the links, and so the exact 
> semantics of the machine instructions does not need to be known.
> If you have two definitions of the same register following one 
> another, and the first one does not reach any uses, it may be considered "dead"
> by looking at the data-flow links alone. If the second one is 
> predicated, it may or may not happen, and so the preceding def is not 
> guaranteed to be dead (and cannot be removed). In other words, it can 
> "preserve" the original value (or, in general, one or more bits of the 
> original value), hence the name. I don't think that it's necessary to 
> have that flag anymore (i.e. RDF can be modified to work without it), 
> but there are some historical reasons why it was introduced.
> 
> -Krzysztof
> 
> 
> 
> 
> On 10/30/2017 4:13 AM, Raghavan, Venugopal via llvm-dev wrote:
> 
>     Hi Krzysztof,
> 
>     I am returning to this thread after a gap in time. I took some time
>     to study the code in RDFGraph.cpp in the Hexagon directory.
> 
>     I have a few questions. I hope you will not find it too much of a
>     bother to answer them.
> 
>     1) Does use node only support a single reaching def? If the RDF
>     graph is constructed post-RA, the code is no longer in SSA form and
>     is it not possible that multiple reaching defs reach a use node? For
>     example, the same physical register RAX may be defined in the if
>     branch and the else branch and can reach a common use node resulting
>     in multiple reaching defs for a use node. How is this handled?
>     2) I am unable to locate the code that handles re-definition of the
>     same register. I though "clobber" refers to such situations but the
>     isClobbering() function which controls the setting of the
>     "Clobbering" attribute seems to mainly handle function calls and not
>     the re-definitions of registers through other types of instructions.
>     How are such re-definitions handled?
>     3) My mental picture of reaching definitions analysis is the
>     classical iterative data flow analysis approach which involves using
>     the union operator as the join operator for the data flow
>     information, but I do not see anything in the code that corresponds
>     to such a notion. I am obviously way off the mark here, so, can you
>     briefly explain the idea behind the reaching definitions analysis in
>     the RDF code?
>     4) Assuming I have somehow fixed the register units issue for x86
>     (which you mentioned in a previous reply)  in the TableGen code by
>     adding extra register units for x86, I am not sure how the
>     preservation of the previous defined value through a partial
>     over-write by a sub-register definition is handled in the RDF code.
>     I looked at the isPreserving() function but that function seems to
>     check whether the instruction is predicated or not which is a little
>     confusing to me since the x86 case can create preservation of bits
>     through non-predicated partial over-writes. So, how is the x86
>     situation handled?
> 
>     I realize that I have asked too many questions, but, I hope you can
>     briefly provide me some clarifications.
> 
>     Thanks.
> 
>     Regards,
>     Venugopal Raghavan.
> 
>     -----Original Message-----
>     From: qcolombet at apple.com <mailto:qcolombet at apple.com>
>     [mailto:qcolombet at apple.com <mailto:qcolombet at apple.com>]
>     Sent: Tuesday, September 12, 2017 11:31 PM
>     To: Raghavan, Venugopal <Venugopal.Raghavan at amd.com
>     <mailto:Venugopal.Raghavan at amd.com>>
>     Cc: llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>
>     Subject: Re: [llvm-dev] Reaching definitions on Machine IR post
>     register allocation
> 
>     Hi Venu,
> 
>         On Sep 11, 2017, at 11:00 PM, Raghavan, Venugopal via llvm-dev
>         <llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>> wrote:
> 
>         Hi Krzysztof,
> 
>         Thanks for your reply.
> 
>         I agree that adding extra register units for x86 would be the
>         right way to fix this. Do you know if there is a plan to fix this?
> 
> 
>     No concrete plan, no. We've been thinking about for quite some time
>     now, but never got at it.
> 
>     Cheers,
>     -Quentin
> 
> 
>         The case that you have pointed out involving partial writes to a
>         register together completely killing a wider register is
>         interesting and did not occur to me. But, for reaching
>         definitions, even if you do not detect this case, shouldn't this
>         be safe (although conservative) since you would only end up
>         over-estimating the set of reaching definitions at a point?
> 
>         But for other scenarios such as copy propagation, you may need
>         to detect this scenario because otherwise you could assume that
>         a copy reaches a use point (when it actually does not) and you
>         may end up propagating the copy when it is incorrect to do so.
> 
>         Regards,
>         Venu.
> 
> 
>         -----Original Message-----
>         From: llvm-dev [mailto:llvm-dev-bounces at lists.llvm.org
>         <mailto:llvm-dev-bounces at lists.llvm.org>] On Behalf Of
>         Krzysztof Parzyszek via llvm-dev
>         Sent: Friday, September 8, 2017 7:10 PM
>         To: llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>
>         Subject: Re: [llvm-dev] Reaching definitions on Machine IR post
>         register allocation
> 
>         It would be much easier to add extra reg units to EAX, EBX, etc.
>         to represent the upper half of the register. That would fix the
>         issue the right way (and would actually make sense without the
>         context of RDF).
> 
>         Regarding the sub-register check---consider this case:
> 
>             AX = ...
>             ...
>             AL = ...  // AX partially overwritten
>             AH = ...  // AX completely overwritten
> 
>         Each of the assignments to AL/AH does not overwrite AX by
>         itself, but together they do. This would be very difficult to
>         detect if we added such special treatment of sub-registers that
>         you propose.
> 
>         I could try to make a case for the extra register units to get
>         them added.
> 
>         -Krzysztof
> 
> 
>         On 9/8/2017 1:55 AM, Raghavan, Venugopal via llvm-dev wrote:
> 
>             Hi Krzysztof,
> 
>             Thanks for your explanation - I think I understand the issue
>             now.
> 
>             I do know if this is a hack, but, in reaching definitions,
>             when we check if a definition of AX kills a previous
>             definition of EAX, can we not check the actual physical
>             register? In this case since EAX and AX are different
>             (although I understand AX is a sub-register of EAX), can we
>             not conclude that the definition of AX does not kill EAX?
>             Since, in reaching definitions, it is conservative if we
>             over-estimate the set of reaching definitions (but not
>             unsafe), should not this strategy be adequate to avoid
>             incorrect optimizations?
> 
>             Or, is there an optimization that can behave incorrectly
>             with an overestimated set of reaching definitions?
> 
>             Regards,
>             Venu.
> 
>             -----Original Message-----
>             From: Nema, Ashutosh
>             Sent: Friday, September 08, 2017 12:00 PM
>             To: Raghavan, Venugopal <Venugopal.Raghavan at amd.com
>             <mailto:Venugopal.Raghavan at amd.com>>
>             Subject: FW: [llvm-dev] Reaching definitions on Machine IR post
>             register allocation
> 
> 
> 
>             -----Original Message-----
>             From: llvm-dev [mailto:llvm-dev-bounces at lists.llvm.org
>             <mailto:llvm-dev-bounces at lists.llvm.org>] On Behalf Of
>             Krzysztof Parzyszek via llvm-dev
>             Sent: Wednesday, September 6, 2017 6:10 PM
>             To: llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>
>             Subject: Re: [llvm-dev] Reaching definitions on Machine IR post
>             register allocation
> 
>             RDF needs to know when an assignment to a register is
>             overwritten by another assignment, or by a sequence of
>             assignments.  This is needed to determine whether the
>             original assignment is still live or not. RDF uses register
>             units to represent the building blocks of registers, and
>             assumes that if all units of a register are overwritten,
>             then the original value of that register is completely
>             overwritten. This assumption is true for all targets (that I
>             have tested it with) except X86. On X86, registers AX and
>             EAX both have only one register unit, but when you assign a
>             value to AX, the upper half of EAX is preserved (that is,
>             the original value of EAX is not completely overwritten).
> 
>             -Krzysztof
> 
>             On 9/6/2017 6:42 AM, Raghavan, Venugopal via llvm-dev wrote:
> 
>                 Hi Krzysztof,
> 
>                 I did look at the other link you have mentioned in your
>                 reply but did not quite understand the register units
>                 issue. If it is not too difficult, can you briefly
>                 summarize what the issue was?
> 
>                 Thanks.
> 
>                 Regards,
>                 Venu.
> 
> 
>                 -----Original Message-----
>                 From: llvm-dev [mailto:llvm-dev-bounces at lists.llvm.org
>                 <mailto:llvm-dev-bounces at lists.llvm.org>] On Behalf Of
>                 Krzysztof Parzyszek via llvm-dev
>                 Sent: Tuesday, September 5, 2017 7:44 PM
>                 To: llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>
>                 Subject: Re: [llvm-dev] Reaching definitions on Machine
>                 IR post
>                 register allocation
> 
>                 Hexagon has RDF that does exactly that.  At the moment
>                 it's under lib/Target/Hexagon, but it meant to be
>                 target-independent.  It won't work with X86 due to a
>                 known issue related to register units, but it should
>                 work fine for other targets.  See
>                 https://reviews.llvm.org/D29295 about moving it to
>                 lib/CodeGen.
> 
>                 -Krzysztof
> 
>                 On 9/4/2017 9:00 AM, Raghavan, Venugopal via llvm-dev wrote:
> 
>                     Hi,
> 
>                     Just to clarify I am looking for a whole machine
>                     function analysis
>                     not just something restricted to within a machine
>                     basic block.
> 
>                     Thanks.
> 
>                     Regards,
> 
>                     Venu.
> 
>                     *From:* Raghavan, Venugopal
>                     *Sent:* Saturday, September 02, 2017 12:56 PM
>                     *To:* llvm-dev at lists.llvm.org
>                     <mailto:llvm-dev at lists.llvm.org>
>                     *Subject:* Reaching definitions on Machine IR post
>                     register
>                     allocation
> 
>                     Hi,
> 
>                     Given a definition of a register by a machine
>                     instruction in the
>                     Machine IR post register allocation, I would like to
>                     compute the
>                     set of uses of this register reached by this definition.
> 
>                     Does LLVM already have this kind of analysis I can
>                     use? Otherwise,
>                     I will have to implement a reaching definitions
>                     analysis which
>                     would be a little involved since it would need to
>                     work on a non-SSA IR form.
> 
>                     If something already exists that would be very
>                     helpful for me.
> 
>                     Thanks.
> 
>                     Regards,
> 
>                     Venugopal Raghavan.
> 
> 
> 
>                     _______________________________________________
>                     LLVM Developers mailing list
>                     llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>
>                     
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
> 
> 
>                 --
>                 Qualcomm Innovation Center, Inc. is a member of Code
>                 Aurora Forum,
>                 hosted by The Linux Foundation
>                 _______________________________________________
>                 LLVM Developers mailing list
>                 llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>
>                 http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>                 _______________________________________________
>                 LLVM Developers mailing list
>                 llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>
>                 
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
> 
> 
>             --
>             Qualcomm Innovation Center, Inc. is a member of Code Aurora
>             Forum,
>             hosted by The Linux Foundation
>             _______________________________________________
>             LLVM Developers mailing list
>             llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>
>             http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>             _______________________________________________
>             LLVM Developers mailing list
>             llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>
>             http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
> 
> 
>         --
>         Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
>         hosted by The Linux Foundation
>         _______________________________________________
>         LLVM Developers mailing list
>         llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>
>         http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>         _______________________________________________
>         LLVM Developers mailing list
>         llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>
>         http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
> 
> 
>     _______________________________________________
>     LLVM Developers mailing list
>     llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>
>     http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
> 
> 
> --
> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, 
> hosted by The Linux Foundation 
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org> 
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
> 
> 
> 
> --
> 
> Thanks & Regards
> 
> Deepali Rai
> 
> 
> 
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
> 

--
Geoff Berry
Employee of Qualcomm Datacenter Technologies, Inc.
  Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.  Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.


More information about the llvm-dev mailing list