[llvm-dev] Reaching definitions on Machine IR post register allocation

Tue Nov 28 03:16:12 PST 2017

Hi Krzysztof,

Thanks for the pointer to the patch. I will try it out.

My runtime error appears even when I build the RDF graph and just do the
RDF liveness analysis without invoking RDF copy propagation. It seems that
some later pass downstream is doing something wrong after the liveness
information is recomputed. This error appears even when I exclude my patch
for the X86 partial register definition issue. I am trying to narrow it
down. It is taking time to figure out which pass is introducing the problem.

Once again, many thanks for your pointer.

Regards,
Venu.

On Tue, Nov 28, 2017 at 12:52 AM, Krzysztof Parzyszek <
kparzysz at codeaurora.org> wrote:

> Hi Venu,
> This is what I used: https://reviews.llvm.org/D40509. It fails check-llvm
> because extra register classes are generated, which changes what the
> scheduling algorithm does. I added a non-allocatable register class for the
> new registers, but maybe removing it altogether would be better. I never
> explored this patch any further, so I'm not sure how well it works if you
> ignore the lit mismatches.
>
> -Krzysztof
>
> On 11/24/2017 3:21 AM, Venugopal Raghavan wrote:
>
>> Hi  Krzysztof,
>>
>> In one of your earlier emails in this thread you mentioned that you had
>> some changes which add extra aliases for subregisters. Did you mean for
>> X86? And is it extra register units that you added or aliases?
>>
>> I tried adding extra register units for X86 through some changes in
>> CodeGenRegisters.cpp in TableGen but I am seeing a runtime error in one of
>> my test cases possibly due to the fact that my change is incorrect or
>> incomplete. Apart from adding extra register units, do you need to do
>> anything else?
>>
>> If you have the changes you made, I would be grateful if you can share it.
>>
>> Thanks.
>>
>> Regards,
>> Venu.
>>
>> On Tue, Nov 14, 2017 at 8:54 AM, Venugopal Raghavan <venur2005 at gmail.com
>> <mailto:venur2005 at gmail.com>> wrote:
>>
>>     Hi Krzysztof,
>>
>>     Thanks for the reply.
>>
>>     Since a data flow edge is missing, I thought that it could be a
>>     correctness issue and not just a precision issue. But, on second
>>     thoughts, maybe it isn't because the edge from the implicit def to
>>     the use is present and the register associated with this edge covers
>>     the register associated with the missing edge.
>>
>>     I will try to revert the patch you mentioned and check if the
>>     precision issue is fixed. I am seeing some other failure in a test
>>     case after RDF copy propagation for X86 code and I was attributing
>>     it to this missing edge, but, I guess there is some other issue
>>     which I will try to debug.
>>
>>     Thanks.
>>
>>     Regards,
>>     Venu.
>>
>>     On Mon, Nov 13, 2017 at 9:39 PM, Krzysztof Parzyszek via llvm-dev
>>     <llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>> wrote:
>>
>>         Hi Venu,
>>
>>         This is happening because there is an implicit def of ECX on the
>>         COPY instruction. This was an issue on Hexagon as well. Let me
>>         give you some background. There are two kinds of implicit defs
>>         (and implicit uses, but I'll refer only to defs for brevity):
>>         (1) Those that indicate that some physical register (that is not
>>         an operand) is modified by a given instruction (EFLAGS is a good
>>         example on X86).
>>         (2) Those that are there for the purpose of tracking liveness,
>>         such as the one on the COPY instruction here.
>>
>>         The first kind is strictly necessary (or otherwise, some
>>         physical register modifications would not be reflected in the
>>         code), but the second kind is only there to assist with liveness
>>         tracking: if register liveness was no longer tracked, these
>>         implicit defs/uses could be removed and everything would be
>>         fine. The first kind of implicit defs would have to stay
>>         regardless of whether we're still interested in liveness or not.
>>
>>         Having the second kind of implicit defs around will pessimize
>>         many things (like scheduling, for example), because the general
>>         algorithms that detect dependency between instructions will take
>>         the implicit defs into account. For example, instructions
>>         modifying AL and AH will appear to be dependent if they have
>>         implicit defs of AX/EAX, even though AL and AH do not overlap.
>>         This was a big problem on Hexagon, and the original RDF
>>         implementation ignored the second kind of implicit defs/uses,
>>         only paying attention to the first. Later on, subregister
>>         liveness tracking was enabled on Hexagon, which got rid of the
>>         second kind of implicit defs/uses altogether. After that, RDF
>>         was changed to no longer ignore any implicit operands: this
>>         happened in r300369.
>>
>>         Since X86 does not use subregister liveness tracking, it still
>>         has the extra implicit operands. I don't know whether it has any
>>         special treatment of these to avoid the dependency issues, but
>>         the presence of these will reduce the precision of RDF. You can
>>         try to revert r300369 and see if it helps. It won't hurt
>>         Hexagon, so if helps X86, we could consider bringing that code
>> back.
>>
>>         -Krzysztof
>>
>>         On 11/10/2017 6:52 AM, Raghavan, Venugopal via llvm-dev wrote:
>>
>>             Hi,
>>
>>             For a test case that I ran I am seeing something in the RDF
>>             graph that I do not quite understand. I think there is an
>>             data flow edge that is missing but most likely I am wrong.
>>
>>             The relevant portion of IR looks like this:
>>
>>             BB#0:
>>
>>             %R10<def> = MOVSX64rr32 %EDX<kill>; dbg:FastBoard.cpp:186:26
>>             @[ FastBoard.cpp:1938:21 ]
>>
>>                                   .
>>
>>                                   .
>>
>>                                   .
>>
>>             TEST32rr %ESI<kill>, %R8D<kill>, %EFLAGS<imp-def>;
>>             dbg:FastBoard.cpp:1940:10
>>
>>             JNE_1 <BB#1>, %EFLAGS<imp-use,kill>; dbg:FastBoard.cpp:1940:9
>>
>>             BB#1:
>>
>>             %CL<def> = COPY %R9B<kill>, %ECX<imp-use,kill>,
>>             %ECX<imp-def>; dbg:FastBoard.cpp:1938:38
>>
>>                                    .
>>
>>                                    .
>>
>>                                    .
>>
>>             CMP32mr %RDI, 4, %RSI, 1776, %noreg, %ECX, %EFLAGS<imp-def>;
>>             mem:LD4[%arrayidx.i49](tbaa=!2767) dbg:FastBoard.cpp:1947:26
>>
>>             The relevant portion of the RDF graph that is constructed is
>>             shown below:
>>
>>             BB#0:
>>
>>             s3: MOV32r0 [d4<ECX>(,d50,u245):, d5<EFLAGS>!(,d7,):]
>>
>>                                   .
>>
>>                                   .
>>
>>                                   .
>>
>>             BB#1:
>>
>>             s48: COPY [d49<CL>(d4,,):, d50<ECX>(d4,d184,u246):d49,
>>             u51<R9B>(d11):u23, u52<ECX>(d4):]
>>
>>                                   .
>>
>>                                   .
>>
>>                                   .
>>
>>             s61: CMP32mr [d62<EFLAGS>!(d58,d80,u205):,
>>             u63<RDI>(+d194):u55, u64<RSI>(d57):, u65<ECX>(d50):]
>>
>>             This is a test case I was interested in because it shows the
>>             partial register redefinition scenario in X86 for which more
>>             register units needed to be added. I have a hacky fix for
>>             this in TableGen code which adds more units for X86
>>             registers. I am not completely sure that my fix is correct
>>             and I was trying to test it.
>>
>>             You can see that register ECX is defined by s3 above and is
>>             used in the compare instruction which is s61. In between
>>             there is a partial re-definition of the lower 8 bits of ECX
>>             through the definition of register CL in s48. This
>>             definition of CL should leave the high order bits defined by
>>             s3 untouched.
>>
>>             Now consider the reaching definition for the use ref node
>>             u65 in s61. It has the d50 def ref node as a reaching
>>             definition. My question is: shouldn’t we also be able to
>>             infer that the d49 ref node is a reaching def node for u65?
>>             Starting from u65 there does not seem to be a way to reach
>> d49.
>>
>>             I debugged the code in X86RDFGraph.cpp. It appears that when
>>             we start from u65, we hit the def ref node d50 which covers
>>             u65 since both involve ECX. Hence there is no need to look
>>             for other reaching defs.
>>
>>             The node d50 itself is created as a def node because ECX
>>             appears as an implicit def in the COPY instruction in the
>>             machine IR and the code in the function buildStmt()
>>             processes such implicit defs and creates def nodes for them
>>             in the RDF graph.
>>
>>             I understand that the virtual SSA nature of the RDF graph
>>             does not allow multiple reaching defs to be specified at a
>>             use ref node, but then how does one handle situations such
>>             as this where we need to identify all reaching defs for a
>>             use node? Or, is this situation being created because my
>>             hacky fix for the X86 partial redefinition case is not doing
>>             the right thing?
>>
>>             Thanks.
>>
>>             Regards,
>>
>>             Venu.
>>
>>             From: *Krzysztof Parzyszek via llvm-dev*
>>             <llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>
>>             <mailto:llvm-dev at lists.llvm.org
>>             <mailto:llvm-dev at lists.llvm.org>>>
>>             Date: Thu, Nov 2, 2017 at 1:34 AM
>>             Subject: Re: [llvm-dev] Reaching definitions on Machine IR
>>             post register allocation
>>             To: llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>
>>             <mailto:llvm-dev at lists.llvm.org
>>             <mailto:llvm-dev at lists.llvm.org>>
>>
>>
>>             RDF has its own tests to detect the ability to rename any
>>             given register. RefNodes that have "Fixed" flag are those
>>             where the register cannot be changed. Having the isRenamable
>>             flag in MachineOperand would help remove these tests.
>>
>>             -Krzysztof
>>
>>
>>
>>
>>             On 10/31/2017 11:24 PM, Raghavan, Venugopal via llvm-dev
>> wrote:
>>
>>                  Hi Geoff/Krzyssztof,
>>
>>                  Wouldn't the isRenamable() change be required even for
>>             the RDF based
>>                  copy propagation? Maybe Hexagon does not impose ABI/ISA
>>             restrictions
>>                  which require specific registers to be used in specific
>>             contexts.
>>
>>                  Also, if Geoff's copy propagation pass is invoked
>>             post-RA wouldn't
>>                  it need to handle the x86 ISA feature which allows 8
>>             bit/16 bit
>>                  values to be moved into a register without over-writing
>>             the other
>>                  bits in the register being defined? I am assuming this
>>             needs to be
>>                  handled for this pass to work for x86. Is this part of
>>             Geoff's patch?
>>
>>                  I have seen cases where we need copy propagation across
>>             basic
>>                  blocks. Given that RDF works on the whole function,
>>             would it not be
>>                  a good idea to just use that with the X86
>>             partial-definition issue
>>                  being handled?
>>
>>                  Considering everything, would the following not be a
>>             good approach?
>>
>>                  1) Use the isRenamable() change for MachineOperand
>>             which allows us
>>                  to adhere to ABI/ISA constraints
>>                  2) Fix the partial register definition issues as seen
>>             in x86
>>                  3) Try to get the RDF based analysis and copy
>>             propagation and other
>>                  passes built on RDF to work for all targets including
>>             x86 (we get
>>                  the advantage of passes which can work on the whole
>>             machine function
>>                  assuming that there are no scalability issues by doing
>> so).
>>
>>                  I myself implemented a rudimentary copy propagation
>>             pass which tries
>>                  to copy propagate on the whole machine function but was
>>             put off from
>>                  doing further work on this because of the X86 partial
>>             definition
>>                  issue mentioned above to which Krzysztof drew my
>> attention.
>>
>>                  Regards,
>>                  Venu.
>>
>>                  -----Original Message-----
>>                  From: Geoff Berry [mailto:gberry at codeaurora.org
>>             <mailto:gberry at codeaurora.org>
>>                  <mailto:gberry at codeaurora.org
>>             <mailto:gberry at codeaurora.org>>]
>>                  Sent: Tuesday, October 31, 2017 8:33 PM
>>                  To: llvm-dev at lists.llvm.org
>>             <mailto:llvm-dev at lists.llvm.org>
>>             <mailto:llvm-dev at lists.llvm.org
>>             <mailto:llvm-dev at lists.llvm.org>>;
>>                  Raghavan, Venugopal <Venugopal.Raghavan at amd.com
>>             <mailto:Venugopal.Raghavan at amd.com>
>>                  <mailto:Venugopal.Raghavan at amd.com
>>             <mailto:Venugopal.Raghavan at amd.com>>>
>>                  Subject: Re: [llvm-dev] Reaching definitions on Machine
>>             IR post
>>                  register allocation
>>
>>                  Hi Venu,
>>
>>                  FWIW, I have a pass that does copy propagation after RA
>> [1]
>>                  (currently only within a basic block) that should be
>>             enabled some
>>                  time in the not-too-distant future.  It has been
>>             reviewed and
>>                  accepted, but I'm currently working on getting a slight
>>             change to
>>                  the MachineOperand representation [2] that should make
>>             the copy
>>                  propagation change much simpler.  I believe this change
>> to
>>                  MachineOperand (or something like it) would also be
>>             needed for any
>>                  pass that wants to rename registers after RA (unless
>>             the renaming is
>>                  done right after RA when virtual registers are still
>>             present, which
>>                  is what my current patch does, and is the source of
>>             complexity that
>>                  I'm trying to eliminate).
>>
>>                  [1] https://reviews.llvm.org/D30751
>>             <https://reviews.llvm.org/D30751>
>>                  [2] https://reviews.llvm.org/D39400
>>             <https://reviews.llvm.org/D39400> D39400 WIP:
>>                  [MachineOperand][MIR] Add isRenamable to MachineOperand.
>>
>>                  On 10/31/2017 5:49 AM, Raghavan, Venugopal via llvm-dev
>>             wrote:
>>
>>                      Hi Krzysztof,
>>
>>                      Thanks a lot for taking the time to write a
>>             detailed explanation. I
>>                      think I understand things better now.
>>
>>                      I am trying to see if I can use RDF for X86
>>             assuming I can add more
>>                      register units for X86 so that the partial
>>             re-definition issue you
>>                      pointed earlier is fixed. I am not yet sure whether
>>             it is easy
>>                      or not,
>>                      but I have identified a portion of code in TableGen
>>             which I can
>>                      modify
>>                      to do this.
>>
>>                      Once I have done this I plan to use the RDG graph
>>             construction
>>                      and RDF
>>                      copy propagation code to try to propagate copies in
>>             the code
>>                      after RA.
>>
>>                      I am hoping that this plan would work.
>>
>>                      Regards,
>>
>>                      Venu.
>>
>>                      From: *Krzysztof Parzyszek via llvm-dev*
>>                      <llvm-dev at lists.llvm.org
>>             <mailto:llvm-dev at lists.llvm.org>
>>             <mailto:llvm-dev at lists.llvm.org
>>             <mailto:llvm-dev at lists.llvm.org>>
>>                      <mailto:llvm-dev at lists.llvm.org
>>             <mailto:llvm-dev at lists.llvm.org>
>>             <mailto:llvm-dev at lists.llvm.org
>>             <mailto:llvm-dev at lists.llvm.org>>>>
>>                      Date: Mon, Oct 30, 2017 at 7:25 PM
>>                      Subject: Re: [llvm-dev] Reaching definitions on
>>             Machine IR post
>>                      register allocation
>>                      To: llvm-dev at lists.llvm.org
>>             <mailto:llvm-dev at lists.llvm.org>
>>             <mailto:llvm-dev at lists.llvm.org
>>             <mailto:llvm-dev at lists.llvm.org>>
>>                      <mailto:llvm-dev at lists.llvm.org
>>             <mailto:llvm-dev at lists.llvm.org>
>>             <mailto:llvm-dev at lists.llvm.org
>>             <mailto:llvm-dev at lists.llvm.org>>>
>>
>>
>>                      Hi Raghavan,
>>                      Thanks for asking the questions, it's not a problem
>>             at all.
>>
>>                      The RDF graph simulates SSA. It does so by having
>>             "artificial" PHI
>>                      nodes that account for multiple reaching defs.
>>
>>                      Consider this example (for Hexagon). The CFG is a
>>             simple triangle
>>                      which has a split block (BB#0), which either goes
>>             to the side block
>>                      (BB#1), or to the join block (BB#2). The register
>>             R0 is defined in
>>                      BB#0 and then redefined in BB#1. Both definitions
>>             can reach the
>>                      use of R0 in BB#2:
>>
>>
>>                      ***
>>                      Before Hexagon RDF optimizations
>>                      # Machine code for function fred: IsSSA, NoPHIs,
>>             TracksLiveness,
>>                      NoVRegs
>>
>>                      BB#0:
>>                             Live Ins: %P0
>>                                 %R0<def> = IMPLICIT_DEF
>>                                 J2_jumpt %P0, <BB#2>, %PC<imp-def>  ;
>>             Conditional
>>                      branch to
>>                      BB#2
>>                             Successors according to CFG: BB#1 BB#2
>>
>>                      BB#1:
>>                             Predecessors according to CFG: BB#0
>>                                 %R0<def> = IMPLICIT_DEF
>>                             Successors according to CFG: BB#2
>>
>>                      BB#2:
>>                             Live Ins: %R0
>>                             Predecessors according to CFG: BB#0 BB#1
>>                                 %R1<def> = COPY %R0
>>                      ***
>>
>>                      The constructed graph is below. Note the phi node
>>             (p17) in BB#2:
>>
>>                      Starting copy propagation on: fred
>>                      DFG dump:[
>>                      f1: Function: fred
>>                      b2: --- BB#0 --- preds(0):   succs(2): BB#1, BB#2
>>                      p15: phi [+d16<P0>(,,u7):]
>>                      s3: IMPLICIT_DEF [d4<R0>(,d10,u19):]
>>                      s5: J2_jumpt BB#2 [/+d6<PC>!(,,):, u7<P0>(+d16):]
>>
>>                      b8: --- BB#1 --- preds(1): BB#0  succs(1): BB#2
>>                      s9: IMPLICIT_DEF [d10<R0>(d4,,u20):]
>>
>>                      b11: --- BB#2 --- preds(2): BB#0, BB#1  succs(0):
>>                      p17: phi [+d18<R0>(,,u14):, u19<R0>(d4,b2):,
>>             u20<R0>(d10,b8):]  ;
>>                      *PHI*
>>                      s12: COPY [d13<R1>(,,):, u14<R0>(+d18):]
>>
>>                      ]
>>
>>
>>                      The phi node has uses which are linked to the
>>             reaching defs, and
>>                      paired with the predecessor block to which the def
>>             corresponds. At
>>                      this point, the graph looks a lot like the standard
>>             SSA graph.
>>
>>
>>                      There are cases where there are multiple reaching
>>             defs emerging in a
>>                      straight-line code. Imagine a register BIG_REG that
>>             has two
>>                      non-overlapping subregisters, SMALL_REG_1 and
>>             SMALL_REG_2. Now,
>>                      consider
>>                      this:
>>
>>                      SMALL_REG_1 = ...
>>                      SMALL_REG_2 = ...
>>                      ... = BIG_REG
>>
>>                      Both of the preceding definitions of of the
>>             subregisters of BIG_REG
>>                      reach the use of BIG_REG, and both of them have to
>>             be connected
>>                      to the
>>                      use somehow. Phi nodes are not the solution, since
>>             they are specific
>>                      to join blocks in the control flow graph. What RDF
>>             does is that it
>>                      adds additional uses of BIG_REG, and each of them
>>             has a different
>>                      reaching def. These uses are called "shadows" in
>>             the RDF
>>                      terminology.
>>
>>                      A concrete example (from Hexagon again, D0 is a
>>             pair of registers R0
>>                      and R1). Note the two uses u9 and u10 in the COPY
>>             instruction, and
>>                      that each of them has one of the preceding defs as
>>             its reaching def:
>>
>>                      Before Hexagon RDF optimizations
>>                      # Machine code for function fred: IsSSA, NoPHIs,
>>             TracksLiveness,
>>                      NoVRegs
>>
>>                      BB#0:
>>                                 %R0<def> = IMPLICIT_DEF
>>                                 %R1<def> = IMPLICIT_DEF
>>                                 %D1<def> = COPY %D0
>>
>>                      # End machine code for function fred.
>>
>>                      Starting copy propagation on: fred
>>                      DFG dump:[
>>                      f1: Function: fred
>>                      b2: --- BB#0 --- preds(0):   succs(0):
>>                      s3: IMPLICIT_DEF [d4<R0>(,,u10"):]
>>                      s5: IMPLICIT_DEF [d6<R1>(,,u9"):]
>>                      s7: COPY [d8<D1>(,,):, u9"<D0>(d6):, u10"<D0>(d4):]
>>             ;; Two uses: u9,
>>                      u10
>>
>>                      ]
>>
>>
>>
>>                      In the example above, the fact that the two
>>             subregisters were
>>                      non-overlapping was important. If they are
>>             non-overlapping, there
>>                      won't be any data-flow links between them, since
>>             they are
>>                      essentially
>>                      independent registers. If they were overlapping,
>>             such links would be
>>                      present. Here's a modified version of the example
>>             above, where we
>>                      first modify D0, then one of its subregisters, and
>>             then read the
>>                      whole D0:
>>
>>                      Before Hexagon RDF optimizations
>>                      # Machine code for function fred: IsSSA, NoPHIs,
>>             TracksLiveness,
>>                      NoVRegs
>>
>>                      BB#0:
>>                                 %D0<def> = IMPLICIT_DEF
>>                                 %R1<def> = IMPLICIT_DEF
>>                                 %D1<def> = COPY %D0
>>
>>                      # End machine code for function fred.
>>
>>                      Starting copy propagation on: fred
>>                      DFG dump:[
>>                      f1: Function: fred
>>                      b2: --- BB#0 --- preds(0):   succs(0):
>>                      s3: IMPLICIT_DEF [d4<D0>(,d6,):]
>>                      s5: IMPLICIT_DEF [d6<R1>(d4,,u9):]
>>                      s7: COPY [d8<D1>(,,):, u9<D0>(d6):]
>>
>>                      ]
>>
>>
>>                      This time, the use of D0 only has the immediately
>>             preceding def
>>                      of R1,
>>                      and that def is then linked to the preceding def of
>> D0.
>>
>>
>>                      Before I describe how to get the reaching defs from
>>             the graph,
>>                      let me
>>                      point out that there is a function
>>             "getAllReachingDefs" in
>>                      RDFLiveness.cpp that for a given RefNode returns
>>             the vector of all
>>                      reaching defs from the same block in the order in
>>             which they
>>                      would be
>>                      encountered going backwards from the RefNode to the
>>             beginning of the
>>                      basic block. That function will only return defs
>>             from the same basic
>>                      block as the RefNode is in. This function was meant
>>             to hide the
>>                      complexity of finding reaching defs "by hand", and
>>             is the preferred
>>                      way of getting reaching defs.
>>
>>                      Now, to get all reaching defs for a given use or
>>             def, you need to
>>                      follow the "reaching-def" link up in the graph,
>>             possibly multiple
>>                      times, until the set of registers associated with
>>             these reaching
>>                      defs
>>                      covers the register from the use/def that you
>>             started from. In the
>>                      last example, there is a use of D0, and the
>>             reaching def is d6,
>>                      which
>>                      is associated with R1. The set {R1} does not cover
>>             D0, so you'd need
>>                      to go further up, to the reaching def of d6. That
>>             def (d4)
>>                      defines D0,
>>                      so now the set of registers is {R1,D0}, which
>>             completely covers
>>                      D0, so
>>                      the algorithm
>>                      stops: the list of reaching defs is d6, d4.
>>                      All of that work is done by getAllReachingDefs,
>>             together with
>>                      handling
>>                      shadows and all other necessary cases.
>>
>>
>>                      The "isPreserving" flag has more to do with making
>>             sure that all
>>                      necessary links are present in the graph. The idea
>>             is that the
>>                      entire
>>                      data flow is completely represented via the links,
>>             and so the exact
>>                      semantics of the machine instructions does not need
>>             to be known.
>>                      If you have two definitions of the same register
>>             following one
>>                      another, and the first one does not reach any uses,
>>             it may be
>>                      considered "dead"
>>                      by looking at the data-flow links alone. If the
>>             second one is
>>                      predicated, it may or may not happen, and so the
>>             preceding def
>>                      is not
>>                      guaranteed to be dead (and cannot be removed). In
>>             other words,
>>                      it can
>>                      "preserve" the original value (or, in general, one
>>             or more bits
>>                      of the
>>                      original value), hence the name. I don't think that
>>             it's
>>                      necessary to
>>                      have that flag anymore (i.e. RDF can be modified to
>>             work without
>>                      it),
>>                      but there are some historical reasons why it was
>>             introduced.
>>
>>                      -Krzysztof
>>
>>
>>
>>
>>                      On 10/30/2017 4:13 AM, Raghavan, Venugopal via
>>             llvm-dev wrote:
>>
>>                            Hi Krzysztof,
>>
>>                            I am returning to this thread after a gap in
>>             time. I took
>>                      some time
>>                            to study the code in RDFGraph.cpp in the
>>             Hexagon directory.
>>
>>                            I have a few questions. I hope you will not
>>             find it too
>>                      much of a
>>                            bother to answer them.
>>
>>                            1) Does use node only support a single
>>             reaching def? If
>>                      the RDF
>>                            graph is constructed post-RA, the code is no
>>             longer in SSA
>>                      form and
>>                            is it not possible that multiple reaching
>>             defs reach a use
>>                      node? For
>>                            example, the same physical register RAX may
>>             be defined in
>>                      the if
>>                            branch and the else branch and can reach a
>>             common use node
>>                      resulting
>>                            in multiple reaching defs for a use node. How
>>             is this handled?
>>                            2) I am unable to locate the code that handles
>>                      re-definition of the
>>                            same register. I though "clobber" refers to
>> such
>>                      situations but the
>>                            isClobbering() function which controls the
>>             setting of the
>>                            "Clobbering" attribute seems to mainly handle
>>             function
>>                      calls and not
>>                            the re-definitions of registers through other
>>             types of
>>                      instructions.
>>                            How are such re-definitions handled?
>>                            3) My mental picture of reaching definitions
>>             analysis is the
>>                            classical iterative data flow analysis
>>             approach which
>>                      involves using
>>                            the union operator as the join operator for
>>             the data flow
>>                            information, but I do not see anything in the
>>             code that
>>                      corresponds
>>                            to such a notion. I am obviously way off the
>>             mark here,
>>                      so, can you
>>                            briefly explain the idea behind the reaching
>>             definitions
>>                      analysis in
>>                            the RDF code?
>>                            4) Assuming I have somehow fixed the register
>>             units issue
>>                      for x86
>>                            (which you mentioned in a previous reply)  in
>>             the TableGen
>>                      code by
>>                            adding extra register units for x86, I am not
>>             sure how the
>>                            preservation of the previous defined value
>>             through a partial
>>                            over-write by a sub-register definition is
>>             handled in the
>>                      RDF code.
>>                            I looked at the isPreserving() function but
>>             that function
>>                      seems to
>>                            check whether the instruction is predicated
>>             or not which
>>                      is a little
>>                            confusing to me since the x86 case can create
>>             preservation
>>                      of bits
>>                            through non-predicated partial over-writes.
>>             So, how is the x86
>>                            situation handled?
>>
>>                            I realize that I have asked too many
>>             questions, but, I
>>                      hope you can
>>                            briefly provide me some clarifications.
>>
>>                            Thanks.
>>
>>                            Regards,
>>                            Venugopal Raghavan.
>>
>>                            -----Original Message-----
>>                            From: qcolombet at apple.com
>>             <mailto:qcolombet at apple.com> <mailto:qcolombet at apple.com
>>             <mailto:qcolombet at apple.com>>
>>                      <mailto:qcolombet at apple.com
>>             <mailto:qcolombet at apple.com> <mailto:qcolombet at apple.com
>>             <mailto:qcolombet at apple.com>>>
>>                            [mailto:qcolombet at apple.com
>>             <mailto:qcolombet at apple.com> <mailto:qcolombet at apple.com
>>             <mailto:qcolombet at apple.com>>
>>                      <mailto:qcolombet at apple.com
>>             <mailto:qcolombet at apple.com> <mailto:qcolombet at apple.com
>>             <mailto:qcolombet at apple.com>>>]
>>                            Sent: Tuesday, September 12, 2017 11:31 PM
>>                            To: Raghavan, Venugopal
>>             <Venugopal.Raghavan at amd.com <mailto:Venugopal.Raghavan at amd
>> .com>
>>                      <mailto:Venugopal.Raghavan at amd.com
>>             <mailto:Venugopal.Raghavan at amd.com>>
>>                            <mailto:Venugopal.Raghavan at amd.com
>>             <mailto:Venugopal.Raghavan at amd.com>
>>                      <mailto:Venugopal.Raghavan at amd.com
>>             <mailto:Venugopal.Raghavan at amd.com>>>>
>>                            Cc: llvm-dev at lists.llvm.org
>>             <mailto:llvm-dev at lists.llvm.org>
>>                      <mailto:llvm-dev at lists.llvm.org
>>             <mailto:llvm-dev at lists.llvm.org>>
>>             <mailto:llvm-dev at lists.llvm.org <mailto:
>> llvm-dev at lists.llvm.org>
>>                      <mailto:llvm-dev at lists.llvm.org
>>             <mailto:llvm-dev at lists.llvm.org>>>
>>                            Subject: Re: [llvm-dev] Reaching definitions
>>             on Machine IR
>>                      post
>>                            register allocation
>>
>>                            Hi Venu,
>>
>>                                On Sep 11, 2017, at 11:00 PM, Raghavan,
>>             Venugopal via
>>                      llvm-dev
>>                                <llvm-dev at lists.llvm.org
>>             <mailto:llvm-dev at lists.llvm.org>
>>                      <mailto:llvm-dev at lists.llvm.org
>>             <mailto:llvm-dev at lists.llvm.org>>
>>             <mailto:llvm-dev at lists.llvm.org <mailto:
>> llvm-dev at lists.llvm.org>
>>                      <mailto:llvm-dev at lists.llvm.org
>>             <mailto:llvm-dev at lists.llvm.org>>>> wrote:
>>
>>                                Hi Krzysztof,
>>
>>                                Thanks for your reply.
>>
>>                                I agree that adding extra register units
>>             for x86 would
>>                      be the
>>                                right way to fix this. Do you know if
>>             there is a plan
>>                      to fix this?
>>
>>
>>                            No concrete plan, no. We've been thinking
>>             about for quite
>>                      some time
>>                            now, but never got at it.
>>
>>                            Cheers,
>>                            -Quentin
>>
>>
>>                                The case that you have pointed out
>>             involving partial
>>                      writes to a
>>                                register together completely killing a
>>             wider register is
>>                                interesting and did not occur to me. But,
>>             for reaching
>>                                definitions, even if you do not detect
>>             this case,
>>                      shouldn't this
>>                                be safe (although conservative) since you
>>             would only
>>                      end up
>>                                over-estimating the set of reaching
>>             definitions at a
>>                      point?
>>
>>                                But for other scenarios such as copy
>>             propagation, you
>>                      may need
>>                                to detect this scenario because otherwise
>>             you could
>>                      assume that
>>                                a copy reaches a use point (when it
>>             actually does not)
>>                      and you
>>                                may end up propagating the copy when it
>>             is incorrect
>>                      to do so.
>>
>>                                Regards,
>>                                Venu.
>>
>>
>>                                -----Original Message-----
>>                                From: llvm-dev
>>             [mailto:llvm-dev-bounces at lists.llvm.org
>>             <mailto:llvm-dev-bounces at lists.llvm.org>
>>                      <mailto:llvm-dev-bounces at lists.llvm.org
>>             <mailto:llvm-dev-bounces at lists.llvm.org>>
>>                                <mailto:llvm-dev-bounces at lists.llvm.org
>>             <mailto:llvm-dev-bounces at lists.llvm.org>
>>                      <mailto:llvm-dev-bounces at lists.llvm.org
>>             <mailto:llvm-dev-bounces at lists.llvm.org>>>] On Behalf Of
>>                                Krzysztof Parzyszek via llvm-dev
>>                                Sent: Friday, September 8, 2017 7:10 PM
>>                                To: llvm-dev at lists.llvm.org
>>             <mailto:llvm-dev at lists.llvm.org>
>>                      <mailto:llvm-dev at lists.llvm.org
>>             <mailto:llvm-dev at lists.llvm.org>>
>>             <mailto:llvm-dev at lists.llvm.org <mailto:
>> llvm-dev at lists.llvm.org>
>>                      <mailto:llvm-dev at lists.llvm.org
>>             <mailto:llvm-dev at lists.llvm.org>>>
>>                                Subject: Re: [llvm-dev] Reaching
>>             definitions on
>>                      Machine IR post
>>                                register allocation
>>
>>                                It would be much easier to add extra reg
>>             units to EAX,
>>                      EBX, etc.
>>                                to represent the upper half of the
>>             register. That
>>                      would fix the
>>                                issue the right way (and would actually
>>             make sense
>>                      without the
>>                                context of RDF).
>>
>>                                Regarding the sub-register
>>             check---consider this case:
>>
>>                                    AX = ...
>>                                    ...
>>                                    AL = ...  // AX partially overwritten
>>                                    AH = ...  // AX completely overwritten
>>
>>                                Each of the assignments to AL/AH does not
>>             overwrite AX by
>>                                itself, but together they do. This would
>>             be very
>>                      difficult to
>>                                detect if we added such special treatment
>> of
>>                      sub-registers that
>>                                you propose.
>>
>>                                I could try to make a case for the extra
>>             register
>>                      units to get
>>                                them added.
>>
>>                                -Krzysztof
>>
>>
>>                                On 9/8/2017 1:55 AM, Raghavan, Venugopal
>>             via llvm-dev
>>                      wrote:
>>
>>                                    Hi Krzysztof,
>>
>>                                    Thanks for your explanation - I think
>>             I understand
>>                      the issue
>>                                    now.
>>
>>                                    I do know if this is a hack, but, in
>>             reaching
>>                      definitions,
>>                                    when we check if a definition of AX
>>             kills a previous
>>                                    definition of EAX, can we not check
>>             the actual
>>                      physical
>>                                    register? In this case since EAX and
>>             AX are different
>>                                    (although I understand AX is a
>>             sub-register of
>>                      EAX), can we
>>                                    not conclude that the definition of
>>             AX does not
>>                      kill EAX?
>>                                    Since, in reaching definitions, it is
>>             conservative
>>                      if we
>>                                    over-estimate the set of reaching
>>             definitions (but not
>>                                    unsafe), should not this strategy be
>>             adequate to avoid
>>                                    incorrect optimizations?
>>
>>                                    Or, is there an optimization that can
>>             behave
>>                      incorrectly
>>                                    with an overestimated set of reaching
>>             definitions?
>>
>>                                    Regards,
>>                                    Venu.
>>
>>                                    -----Original Message-----
>>                                    From: Nema, Ashutosh
>>                                    Sent: Friday, September 08, 2017 12:00
>> PM
>>                                    To: Raghavan, Venugopal
>>                      <Venugopal.Raghavan at amd.com
>>             <mailto:Venugopal.Raghavan at amd.com>
>>             <mailto:Venugopal.Raghavan at amd.com
>>             <mailto:Venugopal.Raghavan at amd.com>>
>>                                    <mailto:Venugopal.Raghavan at amd.com
>>             <mailto:Venugopal.Raghavan at amd.com>
>>                      <mailto:Venugopal.Raghavan at amd.com
>>             <mailto:Venugopal.Raghavan at amd.com>>>>
>>                                    Subject: FW: [llvm-dev] Reaching
>>             definitions on
>>                      Machine IR post
>>                                    register allocation
>>
>>
>>
>>                                    -----Original Message-----
>>                                    From: llvm-dev
>>                      [mailto:llvm-dev-bounces at lists.llvm.org
>>             <mailto:llvm-dev-bounces at lists.llvm.org>
>>                      <mailto:llvm-dev-bounces at lists.llvm.org
>>             <mailto:llvm-dev-bounces at lists.llvm.org>>
>>                                                <mailto:
>> llvm-dev-bounces at lists.llvm.org
>>             <mailto:llvm-dev-bounces at lists.llvm.org>
>>                      <mailto:llvm-dev-bounces at lists.llvm.org
>>             <mailto:llvm-dev-bounces at lists.llvm.org>>>] On Behalf Of
>>                                    Krzysztof Parzyszek via llvm-dev
>>                                    Sent: Wednesday, September 6, 2017
>>             6:10 PM
>>                                    To: llvm-dev at lists.llvm.org
>>             <mailto:llvm-dev at lists.llvm.org>
>>                      <mailto:llvm-dev at lists.llvm.org
>>             <mailto:llvm-dev at lists.llvm.org>>
>>             <mailto:llvm-dev at lists.llvm.org <mailto:
>> llvm-dev at lists.llvm.org>
>>                      <mailto:llvm-dev at lists.llvm.org
>>             <mailto:llvm-dev at lists.llvm.org>>>
>>                                    Subject: Re: [llvm-dev] Reaching
>>             definitions on
>>                      Machine IR post
>>                                    register allocation
>>
>>                                    RDF needs to know when an assignment
>>             to a register is
>>                                    overwritten by another assignment, or
>>             by a sequence of
>>                                    assignments.  This is needed to
>>             determine whether the
>>                                    original assignment is still live or
>>             not. RDF uses
>>                      register
>>                                    units to represent the building blocks
>> of
>>                      registers, and
>>                                    assumes that if all units of a
>>             register are
>>                      overwritten,
>>                                    then the original value of that
>>             register is completely
>>                                    overwritten. This assumption is true
>>             for all
>>                      targets (that I
>>                                    have tested it with) except X86. On
>>             X86, registers
>>                      AX and
>>                                    EAX both have only one register unit,
>>             but when you
>>                      assign a
>>                                    value to AX, the upper half of EAX is
>>             preserved
>>                      (that is,
>>                                    the original value of EAX is not
>>             completely
>>                      overwritten).
>>
>>                                    -Krzysztof
>>
>>                                    On 9/6/2017 6:42 AM, Raghavan,
>>             Venugopal via
>>                      llvm-dev wrote:
>>
>>                                        Hi Krzysztof,
>>
>>                                        I did look at the other link you
>> have
>>                      mentioned in your
>>                                        reply but did not quite
>>             understand the
>>                      register units
>>                                        issue. If it is not too
>>             difficult, can you briefly
>>                                        summarize what the issue was?
>>
>>                                        Thanks.
>>
>>                                        Regards,
>>                                        Venu.
>>
>>
>>                                        -----Original Message-----
>>                                        From: llvm-dev
>>                      [mailto:llvm-dev-bounces at lists.llvm.org
>>             <mailto:llvm-dev-bounces at lists.llvm.org>
>>                      <mailto:llvm-dev-bounces at lists.llvm.org
>>             <mailto:llvm-dev-bounces at lists.llvm.org>>
>>                                                    <mailto:
>> llvm-dev-bounces at lists.llvm.org
>>             <mailto:llvm-dev-bounces at lists.llvm.org>
>>                      <mailto:llvm-dev-bounces at lists.llvm.org
>>             <mailto:llvm-dev-bounces at lists.llvm.org>>>] On Behalf Of
>>                                        Krzysztof Parzyszek via llvm-dev
>>                                        Sent: Tuesday, September 5, 2017
>>             7:44 PM
>>                                        To: llvm-dev at lists.llvm.org
>>             <mailto:llvm-dev at lists.llvm.org>
>>                      <mailto:llvm-dev at lists.llvm.org
>>             <mailto:llvm-dev at lists.llvm.org>>
>>             <mailto:llvm-dev at lists.llvm.org <mailto:
>> llvm-dev at lists.llvm.org>
>>                      <mailto:llvm-dev at lists.llvm.org
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20171128/e20e4727/attachment-0001.html>