[llvm-dev] Reaching definitions on Machine IR post register allocation

Mon Nov 27 11:22:33 PST 2017

Hi Venu,
This is what I used: https://reviews.llvm.org/D40509. It fails 
check-llvm because extra register classes are generated, which changes 
what the scheduling algorithm does. I added a non-allocatable register 
class for the new registers, but maybe removing it altogether would be 
better. I never explored this patch any further, so I'm not sure how 
well it works if you ignore the lit mismatches.

-Krzysztof

On 11/24/2017 3:21 AM, Venugopal Raghavan wrote:
> Hi  Krzysztof,
> 
> In one of your earlier emails in this thread you mentioned that you had 
> some changes which add extra aliases for subregisters. Did you mean for 
> X86? And is it extra register units that you added or aliases?
> 
> I tried adding extra register units for X86 through some changes in 
> CodeGenRegisters.cpp in TableGen but I am seeing a runtime error in one 
> of my test cases possibly due to the fact that my change is incorrect or 
> incomplete. Apart from adding extra register units, do you need to do 
> anything else?
> 
> If you have the changes you made, I would be grateful if you can share it.
> 
> Thanks.
> 
> Regards,
> Venu.
> 
> On Tue, Nov 14, 2017 at 8:54 AM, Venugopal Raghavan <venur2005 at gmail.com 
> <mailto:venur2005 at gmail.com>> wrote:
> 
>     Hi Krzysztof,
> 
>     Thanks for the reply.
> 
>     Since a data flow edge is missing, I thought that it could be a
>     correctness issue and not just a precision issue. But, on second
>     thoughts, maybe it isn't because the edge from the implicit def to
>     the use is present and the register associated with this edge covers
>     the register associated with the missing edge.
> 
>     I will try to revert the patch you mentioned and check if the
>     precision issue is fixed. I am seeing some other failure in a test
>     case after RDF copy propagation for X86 code and I was attributing
>     it to this missing edge, but, I guess there is some other issue
>     which I will try to debug.
> 
>     Thanks.
> 
>     Regards,
>     Venu.
> 
>     On Mon, Nov 13, 2017 at 9:39 PM, Krzysztof Parzyszek via llvm-dev
>     <llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>> wrote:
> 
>         Hi Venu,
> 
>         This is happening because there is an implicit def of ECX on the
>         COPY instruction. This was an issue on Hexagon as well. Let me
>         give you some background. There are two kinds of implicit defs
>         (and implicit uses, but I'll refer only to defs for brevity):
>         (1) Those that indicate that some physical register (that is not
>         an operand) is modified by a given instruction (EFLAGS is a good
>         example on X86).
>         (2) Those that are there for the purpose of tracking liveness,
>         such as the one on the COPY instruction here.
> 
>         The first kind is strictly necessary (or otherwise, some
>         physical register modifications would not be reflected in the
>         code), but the second kind is only there to assist with liveness
>         tracking: if register liveness was no longer tracked, these
>         implicit defs/uses could be removed and everything would be
>         fine. The first kind of implicit defs would have to stay
>         regardless of whether we're still interested in liveness or not.
> 
>         Having the second kind of implicit defs around will pessimize
>         many things (like scheduling, for example), because the general
>         algorithms that detect dependency between instructions will take
>         the implicit defs into account. For example, instructions
>         modifying AL and AH will appear to be dependent if they have
>         implicit defs of AX/EAX, even though AL and AH do not overlap.
>         This was a big problem on Hexagon, and the original RDF
>         implementation ignored the second kind of implicit defs/uses,
>         only paying attention to the first. Later on, subregister
>         liveness tracking was enabled on Hexagon, which got rid of the
>         second kind of implicit defs/uses altogether. After that, RDF
>         was changed to no longer ignore any implicit operands: this
>         happened in r300369.
> 
>         Since X86 does not use subregister liveness tracking, it still
>         has the extra implicit operands. I don't know whether it has any
>         special treatment of these to avoid the dependency issues, but
>         the presence of these will reduce the precision of RDF. You can
>         try to revert r300369 and see if it helps. It won't hurt
>         Hexagon, so if helps X86, we could consider bringing that code back.
> 
>         -Krzysztof
> 
>         On 11/10/2017 6:52 AM, Raghavan, Venugopal via llvm-dev wrote:
> 
>             Hi,
> 
>             For a test case that I ran I am seeing something in the RDF
>             graph that I do not quite understand. I think there is an
>             data flow edge that is missing but most likely I am wrong.
> 
>             The relevant portion of IR looks like this:
> 
>             BB#0:
> 
>             %R10<def> = MOVSX64rr32 %EDX<kill>; dbg:FastBoard.cpp:186:26
>             @[ FastBoard.cpp:1938:21 ]
> 
>                                   .
> 
>                                   .
> 
>                                   .
> 
>             TEST32rr %ESI<kill>, %R8D<kill>, %EFLAGS<imp-def>;
>             dbg:FastBoard.cpp:1940:10
> 
>             JNE_1 <BB#1>, %EFLAGS<imp-use,kill>; dbg:FastBoard.cpp:1940:9
> 
>             BB#1:
> 
>             %CL<def> = COPY %R9B<kill>, %ECX<imp-use,kill>,
>             %ECX<imp-def>; dbg:FastBoard.cpp:1938:38
> 
>                                    .
> 
>                                    .
> 
>                                    .
> 
>             CMP32mr %RDI, 4, %RSI, 1776, %noreg, %ECX, %EFLAGS<imp-def>;
>             mem:LD4[%arrayidx.i49](tbaa=!2767) dbg:FastBoard.cpp:1947:26
> 
>             The relevant portion of the RDF graph that is constructed is
>             shown below:
> 
>             BB#0:
> 
>             s3: MOV32r0 [d4<ECX>(,d50,u245):, d5<EFLAGS>!(,d7,):]
> 
>                                   .
> 
>                                   .
> 
>                                   .
> 
>             BB#1:
> 
>             s48: COPY [d49<CL>(d4,,):, d50<ECX>(d4,d184,u246):d49,
>             u51<R9B>(d11):u23, u52<ECX>(d4):]
> 
>                                   .
> 
>                                   .
> 
>                                   .
> 
>             s61: CMP32mr [d62<EFLAGS>!(d58,d80,u205):,
>             u63<RDI>(+d194):u55, u64<RSI>(d57):, u65<ECX>(d50):]
> 
>             This is a test case I was interested in because it shows the
>             partial register redefinition scenario in X86 for which more
>             register units needed to be added. I have a hacky fix for
>             this in TableGen code which adds more units for X86
>             registers. I am not completely sure that my fix is correct
>             and I was trying to test it.
> 
>             You can see that register ECX is defined by s3 above and is
>             used in the compare instruction which is s61. In between
>             there is a partial re-definition of the lower 8 bits of ECX
>             through the definition of register CL in s48. This
>             definition of CL should leave the high order bits defined by
>             s3 untouched.
> 
>             Now consider the reaching definition for the use ref node
>             u65 in s61. It has the d50 def ref node as a reaching
>             definition. My question is: shouldn’t we also be able to
>             infer that the d49 ref node is a reaching def node for u65?
>             Starting from u65 there does not seem to be a way to reach d49.
> 
>             I debugged the code in X86RDFGraph.cpp. It appears that when
>             we start from u65, we hit the def ref node d50 which covers
>             u65 since both involve ECX. Hence there is no need to look
>             for other reaching defs.
> 
>             The node d50 itself is created as a def node because ECX
>             appears as an implicit def in the COPY instruction in the
>             machine IR and the code in the function buildStmt()
>             processes such implicit defs and creates def nodes for them
>             in the RDF graph.
> 
>             I understand that the virtual SSA nature of the RDF graph
>             does not allow multiple reaching defs to be specified at a
>             use ref node, but then how does one handle situations such
>             as this where we need to identify all reaching defs for a
>             use node? Or, is this situation being created because my
>             hacky fix for the X86 partial redefinition case is not doing
>             the right thing?
> 
>             Thanks.
> 
>             Regards,
> 
>             Venu.
> 
>             From: *Krzysztof Parzyszek via llvm-dev*
>             <llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>
>             <mailto:llvm-dev at lists.llvm.org
>             <mailto:llvm-dev at lists.llvm.org>>>
>             Date: Thu, Nov 2, 2017 at 1:34 AM
>             Subject: Re: [llvm-dev] Reaching definitions on Machine IR
>             post register allocation
>             To: llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>
>             <mailto:llvm-dev at lists.llvm.org
>             <mailto:llvm-dev at lists.llvm.org>>
> 
> 
>             RDF has its own tests to detect the ability to rename any
>             given register. RefNodes that have "Fixed" flag are those
>             where the register cannot be changed. Having the isRenamable
>             flag in MachineOperand would help remove these tests.
> 
>             -Krzysztof
> 
> 
> 
> 
>             On 10/31/2017 11:24 PM, Raghavan, Venugopal via llvm-dev wrote:
> 
>                  Hi Geoff/Krzyssztof,
> 
>                  Wouldn't the isRenamable() change be required even for
>             the RDF based
>                  copy propagation? Maybe Hexagon does not impose ABI/ISA
>             restrictions
>                  which require specific registers to be used in specific
>             contexts.
> 
>                  Also, if Geoff's copy propagation pass is invoked
>             post-RA wouldn't
>                  it need to handle the x86 ISA feature which allows 8
>             bit/16 bit
>                  values to be moved into a register without over-writing
>             the other
>                  bits in the register being defined? I am assuming this
>             needs to be
>                  handled for this pass to work for x86. Is this part of
>             Geoff's patch?
> 
>                  I have seen cases where we need copy propagation across
>             basic
>                  blocks. Given that RDF works on the whole function,
>             would it not be
>                  a good idea to just use that with the X86
>             partial-definition issue
>                  being handled?
> 
>                  Considering everything, would the following not be a
>             good approach?
> 
>                  1) Use the isRenamable() change for MachineOperand
>             which allows us
>                  to adhere to ABI/ISA constraints
>                  2) Fix the partial register definition issues as seen
>             in x86
>                  3) Try to get the RDF based analysis and copy
>             propagation and other
>                  passes built on RDF to work for all targets including
>             x86 (we get
>                  the advantage of passes which can work on the whole
>             machine function
>                  assuming that there are no scalability issues by doing so).
> 
>                  I myself implemented a rudimentary copy propagation
>             pass which tries
>                  to copy propagate on the whole machine function but was
>             put off from
>                  doing further work on this because of the X86 partial
>             definition
>                  issue mentioned above to which Krzysztof drew my attention.
> 
>                  Regards,
>                  Venu.
> 
>                  -----Original Message-----
>                  From: Geoff Berry [mailto:gberry at codeaurora.org
>             <mailto:gberry at codeaurora.org>
>                  <mailto:gberry at codeaurora.org
>             <mailto:gberry at codeaurora.org>>]
>                  Sent: Tuesday, October 31, 2017 8:33 PM
>                  To: llvm-dev at lists.llvm.org
>             <mailto:llvm-dev at lists.llvm.org>
>             <mailto:llvm-dev at lists.llvm.org
>             <mailto:llvm-dev at lists.llvm.org>>;
>                  Raghavan, Venugopal <Venugopal.Raghavan at amd.com
>             <mailto:Venugopal.Raghavan at amd.com>
>                  <mailto:Venugopal.Raghavan at amd.com
>             <mailto:Venugopal.Raghavan at amd.com>>>
>                  Subject: Re: [llvm-dev] Reaching definitions on Machine
>             IR post
>                  register allocation
> 
>                  Hi Venu,
> 
>                  FWIW, I have a pass that does copy propagation after RA [1]
>                  (currently only within a basic block) that should be
>             enabled some
>                  time in the not-too-distant future.  It has been
>             reviewed and
>                  accepted, but I'm currently working on getting a slight
>             change to
>                  the MachineOperand representation [2] that should make
>             the copy
>                  propagation change much simpler.  I believe this change to
>                  MachineOperand (or something like it) would also be
>             needed for any
>                  pass that wants to rename registers after RA (unless
>             the renaming is
>                  done right after RA when virtual registers are still
>             present, which
>                  is what my current patch does, and is the source of
>             complexity that
>                  I'm trying to eliminate).
> 
>                  [1] https://reviews.llvm.org/D30751
>             <https://reviews.llvm.org/D30751>
>                  [2] https://reviews.llvm.org/D39400
>             <https://reviews.llvm.org/D39400> D39400 WIP:
>                  [MachineOperand][MIR] Add isRenamable to MachineOperand.
> 
>                  On 10/31/2017 5:49 AM, Raghavan, Venugopal via llvm-dev
>             wrote:
> 
>                      Hi Krzysztof,
> 
>                      Thanks a lot for taking the time to write a
>             detailed explanation. I
>                      think I understand things better now.
> 
>                      I am trying to see if I can use RDF for X86
>             assuming I can add more
>                      register units for X86 so that the partial
>             re-definition issue you
>                      pointed earlier is fixed. I am not yet sure whether
>             it is easy
>                      or not,
>                      but I have identified a portion of code in TableGen
>             which I can
>                      modify
>                      to do this.
> 
>                      Once I have done this I plan to use the RDG graph
>             construction
>                      and RDF
>                      copy propagation code to try to propagate copies in
>             the code
>                      after RA.
> 
>                      I am hoping that this plan would work.
> 
>                      Regards,
> 
>                      Venu.
> 
>                      From: *Krzysztof Parzyszek via llvm-dev*
>                      <llvm-dev at lists.llvm.org
>             <mailto:llvm-dev at lists.llvm.org>
>             <mailto:llvm-dev at lists.llvm.org
>             <mailto:llvm-dev at lists.llvm.org>>
>                      <mailto:llvm-dev at lists.llvm.org
>             <mailto:llvm-dev at lists.llvm.org>
>             <mailto:llvm-dev at lists.llvm.org
>             <mailto:llvm-dev at lists.llvm.org>>>>
>                      Date: Mon, Oct 30, 2017 at 7:25 PM
>                      Subject: Re: [llvm-dev] Reaching definitions on
>             Machine IR post
>                      register allocation
>                      To: llvm-dev at lists.llvm.org
>             <mailto:llvm-dev at lists.llvm.org>
>             <mailto:llvm-dev at lists.llvm.org
>             <mailto:llvm-dev at lists.llvm.org>>
>                      <mailto:llvm-dev at lists.llvm.org
>             <mailto:llvm-dev at lists.llvm.org>
>             <mailto:llvm-dev at lists.llvm.org
>             <mailto:llvm-dev at lists.llvm.org>>>
> 
> 
>                      Hi Raghavan,
>                      Thanks for asking the questions, it's not a problem
>             at all.
> 
>                      The RDF graph simulates SSA. It does so by having
>             "artificial" PHI
>                      nodes that account for multiple reaching defs.
> 
>                      Consider this example (for Hexagon). The CFG is a
>             simple triangle
>                      which has a split block (BB#0), which either goes
>             to the side block
>                      (BB#1), or to the join block (BB#2). The register
>             R0 is defined in
>                      BB#0 and then redefined in BB#1. Both definitions
>             can reach the
>                      use of R0 in BB#2:
> 
> 
>                      ***
>                      Before Hexagon RDF optimizations
>                      # Machine code for function fred: IsSSA, NoPHIs,
>             TracksLiveness,
>                      NoVRegs
> 
>                      BB#0:
>                             Live Ins: %P0
>                                 %R0<def> = IMPLICIT_DEF
>                                 J2_jumpt %P0, <BB#2>, %PC<imp-def>  ;
>             Conditional
>                      branch to
>                      BB#2
>                             Successors according to CFG: BB#1 BB#2
> 
>                      BB#1:
>                             Predecessors according to CFG: BB#0
>                                 %R0<def> = IMPLICIT_DEF
>                             Successors according to CFG: BB#2
> 
>                      BB#2:
>                             Live Ins: %R0
>                             Predecessors according to CFG: BB#0 BB#1
>                                 %R1<def> = COPY %R0
>                      ***
> 
>                      The constructed graph is below. Note the phi node
>             (p17) in BB#2:
> 
>                      Starting copy propagation on: fred
>                      DFG dump:[
>                      f1: Function: fred
>                      b2: --- BB#0 --- preds(0):   succs(2): BB#1, BB#2
>                      p15: phi [+d16<P0>(,,u7):]
>                      s3: IMPLICIT_DEF [d4<R0>(,d10,u19):]
>                      s5: J2_jumpt BB#2 [/+d6<PC>!(,,):, u7<P0>(+d16):]
> 
>                      b8: --- BB#1 --- preds(1): BB#0  succs(1): BB#2
>                      s9: IMPLICIT_DEF [d10<R0>(d4,,u20):]
> 
>                      b11: --- BB#2 --- preds(2): BB#0, BB#1  succs(0):
>                      p17: phi [+d18<R0>(,,u14):, u19<R0>(d4,b2):,
>             u20<R0>(d10,b8):]  ;
>                      *PHI*
>                      s12: COPY [d13<R1>(,,):, u14<R0>(+d18):]
> 
>                      ]
> 
> 
>                      The phi node has uses which are linked to the
>             reaching defs, and
>                      paired with the predecessor block to which the def
>             corresponds. At
>                      this point, the graph looks a lot like the standard
>             SSA graph.
> 
> 
>                      There are cases where there are multiple reaching
>             defs emerging in a
>                      straight-line code. Imagine a register BIG_REG that
>             has two
>                      non-overlapping subregisters, SMALL_REG_1 and
>             SMALL_REG_2. Now,
>                      consider
>                      this:
> 
>                      SMALL_REG_1 = ...
>                      SMALL_REG_2 = ...
>                      ... = BIG_REG
> 
>                      Both of the preceding definitions of of the
>             subregisters of BIG_REG
>                      reach the use of BIG_REG, and both of them have to
>             be connected
>                      to the
>                      use somehow. Phi nodes are not the solution, since
>             they are specific
>                      to join blocks in the control flow graph. What RDF
>             does is that it
>                      adds additional uses of BIG_REG, and each of them
>             has a different
>                      reaching def. These uses are called "shadows" in
>             the RDF
>                      terminology.
> 
>                      A concrete example (from Hexagon again, D0 is a
>             pair of registers R0
>                      and R1). Note the two uses u9 and u10 in the COPY
>             instruction, and
>                      that each of them has one of the preceding defs as
>             its reaching def:
> 
>                      Before Hexagon RDF optimizations
>                      # Machine code for function fred: IsSSA, NoPHIs,
>             TracksLiveness,
>                      NoVRegs
> 
>                      BB#0:
>                                 %R0<def> = IMPLICIT_DEF
>                                 %R1<def> = IMPLICIT_DEF
>                                 %D1<def> = COPY %D0
> 
>                      # End machine code for function fred.
> 
>                      Starting copy propagation on: fred
>                      DFG dump:[
>                      f1: Function: fred
>                      b2: --- BB#0 --- preds(0):   succs(0):
>                      s3: IMPLICIT_DEF [d4<R0>(,,u10"):]
>                      s5: IMPLICIT_DEF [d6<R1>(,,u9"):]
>                      s7: COPY [d8<D1>(,,):, u9"<D0>(d6):, u10"<D0>(d4):]
>             ;; Two uses: u9,
>                      u10
> 
>                      ]
> 
> 
> 
>                      In the example above, the fact that the two
>             subregisters were
>                      non-overlapping was important. If they are
>             non-overlapping, there
>                      won't be any data-flow links between them, since
>             they are
>                      essentially
>                      independent registers. If they were overlapping,
>             such links would be
>                      present. Here's a modified version of the example
>             above, where we
>                      first modify D0, then one of its subregisters, and
>             then read the
>                      whole D0:
> 
>                      Before Hexagon RDF optimizations
>                      # Machine code for function fred: IsSSA, NoPHIs,
>             TracksLiveness,
>                      NoVRegs
> 
>                      BB#0:
>                                 %D0<def> = IMPLICIT_DEF
>                                 %R1<def> = IMPLICIT_DEF
>                                 %D1<def> = COPY %D0
> 
>                      # End machine code for function fred.
> 
>                      Starting copy propagation on: fred
>                      DFG dump:[
>                      f1: Function: fred
>                      b2: --- BB#0 --- preds(0):   succs(0):
>                      s3: IMPLICIT_DEF [d4<D0>(,d6,):]
>                      s5: IMPLICIT_DEF [d6<R1>(d4,,u9):]
>                      s7: COPY [d8<D1>(,,):, u9<D0>(d6):]
> 
>                      ]
> 
> 
>                      This time, the use of D0 only has the immediately
>             preceding def
>                      of R1,
>                      and that def is then linked to the preceding def of D0.
> 
> 
>                      Before I describe how to get the reaching defs from
>             the graph,
>                      let me
>                      point out that there is a function
>             "getAllReachingDefs" in
>                      RDFLiveness.cpp that for a given RefNode returns
>             the vector of all
>                      reaching defs from the same block in the order in
>             which they
>                      would be
>                      encountered going backwards from the RefNode to the
>             beginning of the
>                      basic block. That function will only return defs
>             from the same basic
>                      block as the RefNode is in. This function was meant
>             to hide the
>                      complexity of finding reaching defs "by hand", and
>             is the preferred
>                      way of getting reaching defs.
> 
>                      Now, to get all reaching defs for a given use or
>             def, you need to
>                      follow the "reaching-def" link up in the graph,
>             possibly multiple
>                      times, until the set of registers associated with
>             these reaching
>                      defs
>                      covers the register from the use/def that you
>             started from. In the
>                      last example, there is a use of D0, and the
>             reaching def is d6,
>                      which
>                      is associated with R1. The set {R1} does not cover
>             D0, so you'd need
>                      to go further up, to the reaching def of d6. That
>             def (d4)
>                      defines D0,
>                      so now the set of registers is {R1,D0}, which
>             completely covers
>                      D0, so
>                      the algorithm
>                      stops: the list of reaching defs is d6, d4.
>                      All of that work is done by getAllReachingDefs,
>             together with
>                      handling
>                      shadows and all other necessary cases.
> 
> 
>                      The "isPreserving" flag has more to do with making
>             sure that all
>                      necessary links are present in the graph. The idea
>             is that the
>                      entire
>                      data flow is completely represented via the links,
>             and so the exact
>                      semantics of the machine instructions does not need
>             to be known.
>                      If you have two definitions of the same register
>             following one
>                      another, and the first one does not reach any uses,
>             it may be
>                      considered "dead"
>                      by looking at the data-flow links alone. If the
>             second one is
>                      predicated, it may or may not happen, and so the
>             preceding def
>                      is not
>                      guaranteed to be dead (and cannot be removed). In
>             other words,
>                      it can
>                      "preserve" the original value (or, in general, one
>             or more bits
>                      of the
>                      original value), hence the name. I don't think that
>             it's
>                      necessary to
>                      have that flag anymore (i.e. RDF can be modified to
>             work without
>                      it),
>                      but there are some historical reasons why it was
>             introduced.
> 
>                      -Krzysztof
> 
> 
> 
> 
>                      On 10/30/2017 4:13 AM, Raghavan, Venugopal via
>             llvm-dev wrote:
> 
>                            Hi Krzysztof,
> 
>                            I am returning to this thread after a gap in
>             time. I took
>                      some time
>                            to study the code in RDFGraph.cpp in the
>             Hexagon directory.
> 
>                            I have a few questions. I hope you will not
>             find it too
>                      much of a
>                            bother to answer them.
> 
>                            1) Does use node only support a single
>             reaching def? If
>                      the RDF
>                            graph is constructed post-RA, the code is no
>             longer in SSA
>                      form and
>                            is it not possible that multiple reaching
>             defs reach a use
>                      node? For
>                            example, the same physical register RAX may
>             be defined in
>                      the if
>                            branch and the else branch and can reach a
>             common use node
>                      resulting
>                            in multiple reaching defs for a use node. How
>             is this handled?
>                            2) I am unable to locate the code that handles
>                      re-definition of the
>                            same register. I though "clobber" refers to such
>                      situations but the
>                            isClobbering() function which controls the
>             setting of the
>                            "Clobbering" attribute seems to mainly handle
>             function
>                      calls and not
>                            the re-definitions of registers through other
>             types of
>                      instructions.
>                            How are such re-definitions handled?
>                            3) My mental picture of reaching definitions
>             analysis is the
>                            classical iterative data flow analysis
>             approach which
>                      involves using
>                            the union operator as the join operator for
>             the data flow
>                            information, but I do not see anything in the
>             code that
>                      corresponds
>                            to such a notion. I am obviously way off the
>             mark here,
>                      so, can you
>                            briefly explain the idea behind the reaching
>             definitions
>                      analysis in
>                            the RDF code?
>                            4) Assuming I have somehow fixed the register
>             units issue
>                      for x86
>                            (which you mentioned in a previous reply)  in
>             the TableGen
>                      code by
>                            adding extra register units for x86, I am not
>             sure how the
>                            preservation of the previous defined value
>             through a partial
>                            over-write by a sub-register definition is
>             handled in the
>                      RDF code.
>                            I looked at the isPreserving() function but
>             that function
>                      seems to
>                            check whether the instruction is predicated
>             or not which
>                      is a little
>                            confusing to me since the x86 case can create
>             preservation
>                      of bits
>                            through non-predicated partial over-writes.
>             So, how is the x86
>                            situation handled?
> 
>                            I realize that I have asked too many
>             questions, but, I
>                      hope you can
>                            briefly provide me some clarifications.
> 
>                            Thanks.
> 
>                            Regards,
>                            Venugopal Raghavan.
> 
>                            -----Original Message-----
>                            From: qcolombet at apple.com
>             <mailto:qcolombet at apple.com> <mailto:qcolombet at apple.com
>             <mailto:qcolombet at apple.com>>
>                      <mailto:qcolombet at apple.com
>             <mailto:qcolombet at apple.com> <mailto:qcolombet at apple.com
>             <mailto:qcolombet at apple.com>>>
>                            [mailto:qcolombet at apple.com
>             <mailto:qcolombet at apple.com> <mailto:qcolombet at apple.com
>             <mailto:qcolombet at apple.com>>
>                      <mailto:qcolombet at apple.com
>             <mailto:qcolombet at apple.com> <mailto:qcolombet at apple.com
>             <mailto:qcolombet at apple.com>>>]
>                            Sent: Tuesday, September 12, 2017 11:31 PM
>                            To: Raghavan, Venugopal
>             <Venugopal.Raghavan at amd.com <mailto:Venugopal.Raghavan at amd.com>
>                      <mailto:Venugopal.Raghavan at amd.com
>             <mailto:Venugopal.Raghavan at amd.com>>
>                            <mailto:Venugopal.Raghavan at amd.com
>             <mailto:Venugopal.Raghavan at amd.com>
>                      <mailto:Venugopal.Raghavan at amd.com
>             <mailto:Venugopal.Raghavan at amd.com>>>>
>                            Cc: llvm-dev at lists.llvm.org
>             <mailto:llvm-dev at lists.llvm.org>
>                      <mailto:llvm-dev at lists.llvm.org
>             <mailto:llvm-dev at lists.llvm.org>>
>             <mailto:llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>
>                      <mailto:llvm-dev at lists.llvm.org
>             <mailto:llvm-dev at lists.llvm.org>>>
>                            Subject: Re: [llvm-dev] Reaching definitions
>             on Machine IR
>                      post
>                            register allocation
> 
>                            Hi Venu,
> 
>                                On Sep 11, 2017, at 11:00 PM, Raghavan,
>             Venugopal via
>                      llvm-dev
>                                <llvm-dev at lists.llvm.org
>             <mailto:llvm-dev at lists.llvm.org>
>                      <mailto:llvm-dev at lists.llvm.org
>             <mailto:llvm-dev at lists.llvm.org>>
>             <mailto:llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>
>                      <mailto:llvm-dev at lists.llvm.org
>             <mailto:llvm-dev at lists.llvm.org>>>> wrote:
> 
>                                Hi Krzysztof,
> 
>                                Thanks for your reply.
> 
>                                I agree that adding extra register units
>             for x86 would
>                      be the
>                                right way to fix this. Do you know if
>             there is a plan
>                      to fix this?
> 
> 
>                            No concrete plan, no. We've been thinking
>             about for quite
>                      some time
>                            now, but never got at it.
> 
>                            Cheers,
>                            -Quentin
> 
> 
>                                The case that you have pointed out
>             involving partial
>                      writes to a
>                                register together completely killing a
>             wider register is
>                                interesting and did not occur to me. But,
>             for reaching
>                                definitions, even if you do not detect
>             this case,
>                      shouldn't this
>                                be safe (although conservative) since you
>             would only
>                      end up
>                                over-estimating the set of reaching
>             definitions at a
>                      point?
> 
>                                But for other scenarios such as copy
>             propagation, you
>                      may need
>                                to detect this scenario because otherwise
>             you could
>                      assume that
>                                a copy reaches a use point (when it
>             actually does not)
>                      and you
>                                may end up propagating the copy when it
>             is incorrect
>                      to do so.
> 
>                                Regards,
>                                Venu.
> 
> 
>                                -----Original Message-----
>                                From: llvm-dev
>             [mailto:llvm-dev-bounces at lists.llvm.org
>             <mailto:llvm-dev-bounces at lists.llvm.org>
>                      <mailto:llvm-dev-bounces at lists.llvm.org
>             <mailto:llvm-dev-bounces at lists.llvm.org>>
>                                <mailto:llvm-dev-bounces at lists.llvm.org
>             <mailto:llvm-dev-bounces at lists.llvm.org>
>                      <mailto:llvm-dev-bounces at lists.llvm.org
>             <mailto:llvm-dev-bounces at lists.llvm.org>>>] On Behalf Of
>                                Krzysztof Parzyszek via llvm-dev
>                                Sent: Friday, September 8, 2017 7:10 PM
>                                To: llvm-dev at lists.llvm.org
>             <mailto:llvm-dev at lists.llvm.org>
>                      <mailto:llvm-dev at lists.llvm.org
>             <mailto:llvm-dev at lists.llvm.org>>
>             <mailto:llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>
>                      <mailto:llvm-dev at lists.llvm.org
>             <mailto:llvm-dev at lists.llvm.org>>>
>                                Subject: Re: [llvm-dev] Reaching
>             definitions on
>                      Machine IR post
>                                register allocation
> 
>                                It would be much easier to add extra reg
>             units to EAX,
>                      EBX, etc.
>                                to represent the upper half of the
>             register. That
>                      would fix the
>                                issue the right way (and would actually
>             make sense
>                      without the
>                                context of RDF).
> 
>                                Regarding the sub-register
>             check---consider this case:
> 
>                                    AX = ...
>                                    ...
>                                    AL = ...  // AX partially overwritten
>                                    AH = ...  // AX completely overwritten
> 
>                                Each of the assignments to AL/AH does not
>             overwrite AX by
>                                itself, but together they do. This would
>             be very
>                      difficult to
>                                detect if we added such special treatment of
>                      sub-registers that
>                                you propose.
> 
>                                I could try to make a case for the extra
>             register
>                      units to get
>                                them added.
> 
>                                -Krzysztof
> 
> 
>                                On 9/8/2017 1:55 AM, Raghavan, Venugopal
>             via llvm-dev
>                      wrote:
> 
>                                    Hi Krzysztof,
> 
>                                    Thanks for your explanation - I think
>             I understand
>                      the issue
>                                    now.
> 
>                                    I do know if this is a hack, but, in
>             reaching
>                      definitions,
>                                    when we check if a definition of AX
>             kills a previous
>                                    definition of EAX, can we not check
>             the actual
>                      physical
>                                    register? In this case since EAX and
>             AX are different
>                                    (although I understand AX is a
>             sub-register of
>                      EAX), can we
>                                    not conclude that the definition of
>             AX does not
>                      kill EAX?
>                                    Since, in reaching definitions, it is
>             conservative
>                      if we
>                                    over-estimate the set of reaching
>             definitions (but not
>                                    unsafe), should not this strategy be
>             adequate to avoid
>                                    incorrect optimizations?
> 
>                                    Or, is there an optimization that can
>             behave
>                      incorrectly
>                                    with an overestimated set of reaching
>             definitions?
> 
>                                    Regards,
>                                    Venu.
> 
>                                    -----Original Message-----
>                                    From: Nema, Ashutosh
>                                    Sent: Friday, September 08, 2017 12:00 PM
>                                    To: Raghavan, Venugopal
>                      <Venugopal.Raghavan at amd.com
>             <mailto:Venugopal.Raghavan at amd.com>
>             <mailto:Venugopal.Raghavan at amd.com
>             <mailto:Venugopal.Raghavan at amd.com>>
>                                    <mailto:Venugopal.Raghavan at amd.com
>             <mailto:Venugopal.Raghavan at amd.com>
>                      <mailto:Venugopal.Raghavan at amd.com
>             <mailto:Venugopal.Raghavan at amd.com>>>>
>                                    Subject: FW: [llvm-dev] Reaching
>             definitions on
>                      Machine IR post
>                                    register allocation
> 
> 
> 
>                                    -----Original Message-----
>                                    From: llvm-dev
>                      [mailto:llvm-dev-bounces at lists.llvm.org
>             <mailto:llvm-dev-bounces at lists.llvm.org>
>                      <mailto:llvm-dev-bounces at lists.llvm.org
>             <mailto:llvm-dev-bounces at lists.llvm.org>>
>                                  
>               <mailto:llvm-dev-bounces at lists.llvm.org
>             <mailto:llvm-dev-bounces at lists.llvm.org>
>                      <mailto:llvm-dev-bounces at lists.llvm.org
>             <mailto:llvm-dev-bounces at lists.llvm.org>>>] On Behalf Of
>                                    Krzysztof Parzyszek via llvm-dev
>                                    Sent: Wednesday, September 6, 2017
>             6:10 PM
>                                    To: llvm-dev at lists.llvm.org
>             <mailto:llvm-dev at lists.llvm.org>
>                      <mailto:llvm-dev at lists.llvm.org
>             <mailto:llvm-dev at lists.llvm.org>>
>             <mailto:llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>
>                      <mailto:llvm-dev at lists.llvm.org
>             <mailto:llvm-dev at lists.llvm.org>>>
>                                    Subject: Re: [llvm-dev] Reaching
>             definitions on
>                      Machine IR post
>                                    register allocation
> 
>                                    RDF needs to know when an assignment
>             to a register is
>                                    overwritten by another assignment, or
>             by a sequence of
>                                    assignments.  This is needed to
>             determine whether the
>                                    original assignment is still live or
>             not. RDF uses
>                      register
>                                    units to represent the building blocks of
>                      registers, and
>                                    assumes that if all units of a
>             register are
>                      overwritten,
>                                    then the original value of that
>             register is completely
>                                    overwritten. This assumption is true
>             for all
>                      targets (that I
>                                    have tested it with) except X86. On
>             X86, registers
>                      AX and
>                                    EAX both have only one register unit,
>             but when you
>                      assign a
>                                    value to AX, the upper half of EAX is
>             preserved
>                      (that is,
>                                    the original value of EAX is not
>             completely
>                      overwritten).
> 
>                                    -Krzysztof
> 
>                                    On 9/6/2017 6:42 AM, Raghavan,
>             Venugopal via
>                      llvm-dev wrote:
> 
>                                        Hi Krzysztof,
> 
>                                        I did look at the other link you have
>                      mentioned in your
>                                        reply but did not quite
>             understand the
>                      register units
>                                        issue. If it is not too
>             difficult, can you briefly
>                                        summarize what the issue was?
> 
>                                        Thanks.
> 
>                                        Regards,
>                                        Venu.
> 
> 
>                                        -----Original Message-----
>                                        From: llvm-dev
>                      [mailto:llvm-dev-bounces at lists.llvm.org
>             <mailto:llvm-dev-bounces at lists.llvm.org>
>                      <mailto:llvm-dev-bounces at lists.llvm.org
>             <mailto:llvm-dev-bounces at lists.llvm.org>>
>                                      
>               <mailto:llvm-dev-bounces at lists.llvm.org
>             <mailto:llvm-dev-bounces at lists.llvm.org>
>                      <mailto:llvm-dev-bounces at lists.llvm.org
>             <mailto:llvm-dev-bounces at lists.llvm.org>>>] On Behalf Of
>                                        Krzysztof Parzyszek via llvm-dev
>                                        Sent: Tuesday, September 5, 2017
>             7:44 PM
>                                        To: llvm-dev at lists.llvm.org
>             <mailto:llvm-dev at lists.llvm.org>
>                      <mailto:llvm-dev at lists.llvm.org
>             <mailto:llvm-dev at lists.llvm.org>>
>             <mailto:llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>
>                      <mailto:llvm-dev at lists.llvm.org
>             <mailto:llvm-dev at lists.llvm.org>>>
>                                        Subject: Re: [llvm-dev] Reaching
>             definitions
>                      on Machine
>                                        IR post
>                                        register allocation
> 
>                                        Hexagon has RDF that does exactly
>             that.  At
>                      the moment
>                                        it's under lib/Target/Hexagon,
>             but it meant to be
>                                        target-independent.  It won't
>             work with X86
>                      due to a
>                                        known issue related to register
>             units, but it
>                      should
>                                        work fine for other targets.  See
>             https://reviews.llvm.org/D29295
>             <https://reviews.llvm.org/D29295> about moving it to
>                                        lib/CodeGen.
> 
>                                        -Krzysztof
> 
>                                        On 9/4/2017 9:00 AM, Raghavan,
>             Venugopal via
>                      llvm-dev wrote:
> 
>                                            Hi,
> 
>                                            Just to clarify I am looking
>             for a whole
>                      machine
>                                            function analysis
>                                            not just something restricted
>             to within a
>                      machine
>                                            basic block.
> 
>                                            Thanks.
> 
>                                            Regards,
> 
>                                            Venu.
> 
>                                            *From:* Raghavan, Venugopal
>                                            *Sent:* Saturday, September
>             02, 2017 12:56 PM
>                                            *To:* llvm-dev at lists.llvm.org
>             <mailto:llvm-dev at lists.llvm.org>
>                      <mailto:llvm-dev at lists.llvm.org
>             <mailto:llvm-dev at lists.llvm.org>>
>                                          
>               <mailto:llvm-dev at lists.llvm.org
>             <mailto:llvm-dev at lists.llvm.org>
>                      <mailto:llvm-dev at lists.llvm.org
>             <mailto:llvm-dev at lists.llvm.org>>>
>                                            *Subject:* Reaching
>             definitions on Machine
>                      IR post
>                                            register
>                                            allocation
> 
>                                            Hi,
> 
>                                            Given a definition of a
>             register by a machine
>                                            instruction in the
>                                            Machine IR post register
>             allocation, I
>                      would like to
>                                            compute the
>                                            set of uses of this register
>             reached by
>                      this definition.
> 
>                                            Does LLVM already have this
>             kind of
>                      analysis I can
>                                            use? Otherwise,
>                                            I will have to implement a
>             reaching
>                      definitions
>                                            analysis which
>                                            would be a little involved
>             since it would
>                      need to
>                                            work on a non-SSA IR form.
> 
>                                            If something already exists
>             that would be very
>                                            helpful for me.
> 
>                                            Thanks.
> 
>                                            Regards,
> 
>                                            Venugopal Raghavan.
> 
> 
> 
>                                                 
>                _______________________________________________
>                                            LLVM Developers mailing list
>             llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>
>             <mailto:llvm-dev at lists.llvm.org
>             <mailto:llvm-dev at lists.llvm.org>>
>                      <mailto:llvm-dev at lists.llvm.org
>             <mailto:llvm-dev at lists.llvm.org>
>             <mailto:llvm-dev at lists.llvm.org
>             <mailto:llvm-dev at lists.llvm.org>>>
>             http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>             <http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev>
> 
> 
>                                        --
>                                        Qualcomm Innovation Center, Inc.
>             is a member
>                      of Code
>                                        Aurora Forum,
>                                        hosted by The Linux Foundation
>                                      
>               _______________________________________________
>                                        LLVM Developers mailing list
>             llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>
>             <mailto:llvm-dev at lists.llvm.org
>             <mailto:llvm-dev at lists.llvm.org>>
>                      <mailto:llvm-dev at lists.llvm.org
>             <mailto:llvm-dev at lists.llvm.org>
>             <mailto:llvm-dev at lists.llvm.org
>             <mailto:llvm-dev at lists.llvm.org>>>
>             http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>             <http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev>
>                                      
>               _______________________________________________
>                                        LLVM Developers mailing list
>             llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>
>             <mailto:llvm-dev at lists.llvm.org
>             <mailto:llvm-dev at lists.llvm.org>>
>                      <mailto:llvm-dev at lists.llvm.org
>             <mailto:llvm-dev at lists.llvm.org>
>             <mailto:llvm-dev at lists.llvm.org
>             <mailto:llvm-dev at lists.llvm.org>>>
>             http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>             <http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev>
> 
> 
>                                    --
>                                    Qualcomm Innovation Center, Inc. is a
>             member of
>                      Code Aurora
>                                    Forum,
>                                    hosted by The Linux Foundation
>                                  
>               _______________________________________________
>                                    LLVM Developers mailing list
>             llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>
>             <mailto:llvm-dev at lists.llvm.org
>             <mailto:llvm-dev at lists.llvm.org>>
>                      <mailto:llvm-dev at lists.llvm.org
>             <mailto:llvm-dev at lists.llvm.org>
>             <mailto:llvm-dev at lists.llvm.org
>             <mailto:llvm-dev at lists.llvm.org>>>
>             http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>             <http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev>
>                                  
>               _______________________________________________
>                                    LLVM Developers mailing list
>             llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>
>             <mailto:llvm-dev at lists.llvm.org
>             <mailto:llvm-dev at lists.llvm.org>>
>                      <mailto:llvm-dev at lists.llvm.org
>             <mailto:llvm-dev at lists.llvm.org>
>             <mailto:llvm-dev at lists.llvm.org
>             <mailto:llvm-dev at lists.llvm.org>>>
>             http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>             <http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev>
> 
> 
>                                --
>                                Qualcomm Innovation Center, Inc. is a
>             member of Code
>                      Aurora Forum,
>                                hosted by The Linux Foundation
>                              
>               _______________________________________________
>                                LLVM Developers mailing list
>             llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>
>             <mailto:llvm-dev at lists.llvm.org
>             <mailto:llvm-dev at lists.llvm.org>>
>                      <mailto:llvm-dev at lists.llvm.org
>             <mailto:llvm-dev at lists.llvm.org>
>             <mailto:llvm-dev at lists.llvm.org
>             <mailto:llvm-dev at lists.llvm.org>>>
>             http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>             <http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev>
>                              
>               _______________________________________________
>                                LLVM Developers mailing list
>             llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>
>             <mailto:llvm-dev at lists.llvm.org
>             <mailto:llvm-dev at lists.llvm.org>>
>                      <mailto:llvm-dev at lists.llvm.org
>             <mailto:llvm-dev at lists.llvm.org>
>             <mailto:llvm-dev at lists.llvm.org
>             <mailto:llvm-dev at lists.llvm.org>>>
>             http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>             <http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev>
> 
> 
>                            _______________________________________________
>                            LLVM Developers mailing list
>             llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>
>             <mailto:llvm-dev at lists.llvm.org
>             <mailto:llvm-dev at lists.llvm.org>>
>                      <mailto:llvm-dev at lists.llvm.org
>             <mailto:llvm-dev at lists.llvm.org>
>             <mailto:llvm-dev at lists.llvm.org
>             <mailto:llvm-dev at lists.llvm.org>>>
>             http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>             <http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev>
> 
> 
>                      --
>                      Qualcomm Innovation Center, Inc. is a member of
>             Code Aurora Forum,
>                      hosted by The Linux Foundation
>                      _______________________________________________
>                      LLVM Developers mailing list
>             llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>
>             <mailto:llvm-dev at lists.llvm.org
>             <mailto:llvm-dev at lists.llvm.org>>
>                      <mailto:llvm-dev at lists.llvm.org
>             <mailto:llvm-dev at lists.llvm.org>
>             <mailto:llvm-dev at lists.llvm.org
>             <mailto:llvm-dev at lists.llvm.org>>>
>             http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>             <http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev>
> 
> 
>                      _______________________________________________
>                      LLVM Developers mailing list
>             llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>
>             <mailto:llvm-dev at lists.llvm.org
>             <mailto:llvm-dev at lists.llvm.org>>
>             http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>             <http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev>
> 
> 
>                  --
>                  Geoff Berry
>                  Employee of Qualcomm Datacenter Technologies, Inc.
>                      Qualcomm Datacenter Technologies, Inc. as an
>             affiliate of
>                  Qualcomm Technologies, Inc.  Qualcomm Technologies,
>             Inc. is a member
>                  of the Code Aurora Forum, a Linux Foundation
>             Collaborative Project.
>                  _______________________________________________
>                  LLVM Developers mailing list
>             llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>
>             <mailto:llvm-dev at lists.llvm.org
>             <mailto:llvm-dev at lists.llvm.org>>
>             http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>             <http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev>
> 
> 
>             -- 
>             Qualcomm Innovation Center, Inc. is a member of Code Aurora
>             Forum, hosted by The Linux Foundation
>             _______________________________________________
>             LLVM Developers mailing list
>             llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>
>             <mailto:llvm-dev at lists.llvm.org
>             <mailto:llvm-dev at lists.llvm.org>>
> 
>             http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>             <http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev>
> 
> 
> 
>             _______________________________________________
>             LLVM Developers mailing list
>             llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>
>             http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>             <http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev>
> 
> 
>         -- 
>         Qualcomm Innovation Center, Inc. is a member of Code Aurora
>         Forum, hosted by The Linux Foundation
>         _______________________________________________
>         LLVM Developers mailing list
>         llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>
>         http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>         <http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev>
> 
> 
> 

-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, 
hosted by The Linux Foundation