<html>
  <head>
    <meta content="text/html; charset=utf-8" http-equiv="Content-Type">
  </head>
  <body bgcolor="#FFFFFF" text="#000000">
    <p><br>
    </p>
    <div class="moz-cite-prefix">On 05/14/2017 04:39 PM, Daniel Berlin
      wrote:<br>
    </div>
    <blockquote
cite="mid:CAF4BwTVyWUQHfKbM2uh67Hp6fy1b70kWoxi9G6gcqgRdCvGF3w@mail.gmail.com"
      type="cite">
      <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
      <div dir="ltr"><br>
        <div class="gmail_extra"><br>
          <div class="gmail_quote">On Sun, May 14, 2017 at 11:01 AM, Hal
            Finkel <span dir="ltr"><<a moz-do-not-send="true"
                href="mailto:hfinkel@anl.gov" target="_blank">hfinkel@anl.gov</a>></span>
            wrote:<br>
            <blockquote class="gmail_quote" style="margin:0 0 0
              .8ex;border-left:1px #ccc solid;padding-left:1ex">
              <div bgcolor="#FFFFFF" text="#000000">
                <div>
                  <div class="h5">
                    <p><br>
                    </p>
                    <div class="m_-6131269866192422863moz-cite-prefix">On
                      05/14/2017 12:49 PM, Daniel Berlin wrote:<br>
                    </div>
                    <blockquote type="cite">
                      <div dir="ltr"><br>
                        <div class="gmail_extra"><br>
                          <div class="gmail_quote">On Sun, May 14, 2017
                            at 10:20 AM, Hal Finkel <span dir="ltr"><<a
                                moz-do-not-send="true"
                                href="mailto:hfinkel@anl.gov"
                                target="_blank">hfinkel@anl.gov</a>></span>
                            wrote:<br>
                            <blockquote class="gmail_quote"
                              style="margin:0 0 0 .8ex;border-left:1px
                              #ccc solid;padding-left:1ex">
                              <div bgcolor="#FFFFFF" text="#000000">
                                <div>
                                  <div class="m_-6131269866192422863h5">
                                    <p><br>
                                    </p>
                                    <div
                                      class="m_-6131269866192422863m_-7109946281551088390moz-cite-prefix">On
                                      05/14/2017 11:06 AM, Daniel Berlin
                                      wrote:<br>
                                    </div>
                                    <blockquote type="cite">
                                      <div dir="ltr"><br>
                                        <div class="gmail_extra"><br>
                                          <div class="gmail_quote">On
                                            Sun, May 14, 2017 at 8:37
                                            AM, Hal Finkel <span
                                              dir="ltr"><<a
                                                moz-do-not-send="true"
                                                href="mailto:hfinkel@anl.gov"
                                                target="_blank">hfinkel@anl.gov</a>></span>
                                            wrote:<br>
                                            <blockquote
                                              class="gmail_quote"
                                              style="margin:0 0 0
                                              .8ex;border-left:1px #ccc
                                              solid;padding-left:1ex">
                                              <div bgcolor="#FFFFFF"
                                                text="#000000"><span>
                                                  <p><br>
                                                  </p>
                                                  <div
class="m_-6131269866192422863m_-7109946281551088390m_7871019985833609421moz-cite-prefix">On
                                                    03/01/2017 05:30 PM,
                                                    Daniel Berlin via
                                                    llvm-dev wrote:<br>
                                                  </div>
                                                  <blockquote
                                                    type="cite">
                                                    <div dir="ltr">So, <a
moz-do-not-send="true"
                                                        href="https://bugs.llvm.org/show_bug.cgi?id=32056"
                                                        target="_blank">https://bugs.llvm.org/show_bug<wbr>.cgi?id=32056</a>
                                                      is an example
                                                      showing our
                                                      current TBAA tree
                                                      for union
                                                      generation is
                                                      definitely
                                                      irretrievably
                                                      broken.
                                                      <div>I'll be
                                                        honest here. I'm
                                                        pretty sure your
                                                        proposal doesn't
                                                        go far enough.</div>
                                                      <div>But
                                                        truthfully,  I
                                                        would rather see
                                                        us come closer
                                                        to a
                                                        representation
                                                        we know works,
                                                        which is GCC's.</div>
                                                      <div>Let me try to
                                                        simplify what
                                                        you are
                                                        suggesting, and
                                                        what we have.</div>
                                                      <div>Our current
                                                        representation
                                                        is basically
                                                        inverted from
                                                        GCC, but we
                                                        don't allow
                                                        things that
                                                        would enable it
                                                        to work.<br>
                                                      </div>
                                                      <div><br>
                                                      </div>
                                                      <div>Given</div>
                                                      <div>union {int a,
                                                        short b};</div>
                                                      <div><br>
                                                      </div>
                                                      <div>GCC's will
                                                        be:</div>
                                                      <div><br>
                                                      </div>
                                                      <div> union</div>
                                                      <div>  /   \</div>
                                                      <div>short int</div>
                                                      <div><br>
                                                      </div>
                                                      <div><br>
                                                      </div>
                                                      <div>Everything is
                                                        implicitly a
                                                        subset of alias
                                                        set 0 (which for
                                                        C/C++ is char)
                                                        just to avoid
                                                        representing it.</div>
                                                      <div><br>
                                                      </div>
                                                      <div>Our metadata
                                                        has no child
                                                        links, and only
                                                        a single parent
                                                        link.</div>
                                                      <div><br>
                                                      </div>
                                                      <div>You can't
                                                        represent the
                                                        above form
                                                        because you
                                                        can't make a
                                                        single short
                                                        node a child  of
                                                        every
                                                        union/struct it
                                                        needs to be
                                                        (lack of
                                                        multiple parents
                                                        means you can't
                                                        just frob them
                                                        all to offset
                                                        zero).</div>
                                                      <div><br>
                                                      </div>
                                                      <div>Your
                                                        suggestion is to
                                                        invert this as a
                                                        struct</div>
                                                      <div><br>
                                                      </div>
                                                      <div>short  int</div>
                                                      <div>   |    / </div>
                                                      <div>union</div>
                                                      <div><br>
                                                      </div>
                                                      <div>We don't
                                                        allow multiple
                                                        parents,
                                                        however.</div>
                                                      <div>Because of
                                                        that, you
                                                        suggest we
                                                        follow all nodes
                                                        for unions,
                                                        special casing
                                                        union-type nodes
                                                        somehow</div>
                                                    </div>
                                                  </blockquote>
                                                  <br>
                                                </span> Now that I've
                                                spent a bunch of time
                                                looking at this, I'd
                                                like to voice support
                                                for Steven's original
                                                proposal. In the context
                                                of what we have, it
                                                makes sense, seems
                                                sound, and should fix
                                                the representational
                                                mapping problem we
                                                currently have. </div>
                                            </blockquote>
                                            <div><br>
                                            </div>
                                            <div>Except you can't easily
                                              differentiate it from the
                                              current one, and if we are
                                              going to have to
                                              upgrade/break
                                              compatibility, why not
                                              just do it once right, a
                                              way we know works, instead
                                              of risk screwing it up
                                              again, and playing with a
                                              representation we aren't
                                              actually sure we can make
                                              efficient for this case?<br>
                                            </div>
                                          </div>
                                        </div>
                                      </div>
                                    </blockquote>
                                    <br>
                                  </div>
                                </div>
                                I don't see why need to break backward
                                compatibility. Do you have something
                                specific in mind? Once we extend the
                                TBAA implementation to treat repeated
                                offsets as disjunction, then we'll
                                extend Clang to emit metadata for unions
                                in this form. Old IR will work exactly
                                as it does now.</div>
                            </blockquote>
                            <div> </div>
                            <div>Except the Old IR is irretrievably
                              broken. and in this method, you can't tell
                              whether you are dealing with correct TBAA
                              or not.</div>
                            <div><br>
                            </div>
                            <div>Earlier, the discussion was basically
                              "detect old IR, disable TBAA since it's
                              broken, make new IR work".</div>
                            <div><br>
                            </div>
                            <div>If we do that, we need a way to
                              distinguish new vs old IR.</div>
                          </div>
                        </div>
                      </div>
                    </blockquote>
                    <br>
                    <br>
                  </div>
                </div>
                Ah, okay. I don't think that's desirable in general.
                There are frontends emitting perfectly self-consistent
                TBAA metadata, and there's no reason they should change.
                Clang's metadata is broken because the mapping between
                language rules and TBAA is broken. If we'd like, we can
                blacklist TBAA metadata coming from root nodes named
                "Simple C++ TBAA" and "Simple C/C++ TBAA" (or provide an
                option to do that because I'm not convinced it is worth
                trying to retroactively "fix" old IR by default given
                the associated performance penalties). After the fix,
                Clang can emit root nodes with different names (they're
                arbitrary). Changing the root-node names will also give
                the conservatively-correct behavior when linking old IR
                with new IR.<span class=""><br>
                  <br>
                </span></div>
            </blockquote>
            <div><br>
            </div>
            <div>I still continue to believe trying to patch the
              existing mechanism this way is not a great plan, and
              likely to end us in a place where we still have either
              correctness or performance issues.</div>
          </div>
        </div>
      </div>
    </blockquote>
    <br>
    I don't agree, but this is because I fail to see how the two
    representations (the GCC-like one you've outlined and the current
    one with the proposed extension) aren't completely isomorphic. Your
    proposal is:<br>
    <br>
    <blockquote type="cite">
      <div>scalar type node -> {name, children nodes}<br>
      </div>
      <div>struct node -> {name, array of {offset, child node} }</div>
      <div><br>
      </div>
      <div>Paths are separate from the tbaa tree itself, and are just:</div>
      <div>path node -> {struct node, offset} or whatever.</div>
      <div><br>
      </div>
      <div>unions are just scalar type nodes with multiple children, not
        struct nodes with special-cased offset zero.</div>
    </blockquote>
    <br>
    the evolutionary scheme can be described in similar terms:<br>
    <br>
    aggregate-type node -> { name, array of { offset, member-type } }<br>
    <br>
    For a root node, the array is empty (i.e. it is only a name).<br>
    <br>
    For a scalar type, the array only has one entry: the base/parent
    type. <br>
    <br>
    For struct types, the array has all of the members.<br>
    <br>
    For union types, the array has all of the members with duplicate
    offsets (likely zero, although there may be cases, such as anonymous
    unions in structs, where the duplicate offsets are not zero).<br>
    <br>
    Paths in the current scheme, as in your proposal, are separate from
    the TBAA DAG itself, and are just:<br>
    path node -> {access-type node, aggregate-type node, offset}<br>
    <br>
    For scalar accesses, access type == aggregate type and offset == 0.
    This could be more representationally efficient, but it is done this
    way to provide a uniform way to handle both scalars and aggregates
    and provide a uniform way to differentiate these node from the
    old-style scalar-only TBAA (to be fair, however, likely more the
    latter than the former). In any case, aside from the fact that we
    explicitly include the access type, this is the same.<br>
    <br>
    In both cases the edges of the hierarchy run between structs and
    their members, so regardless of whether you draw the graph going up
    or down (i.e. call the edges parents or children) that aspect of the
    representation seems equivalent. The differences seem to be:<br>
    <br>
     1. Our current scheme represents scalar types like it represents
    structs with a single member (i.e. a single base type at offset 0).
    In your proposed scheme we have separate scalar nodes.<br>
    <br>
     2. The proposed evolved scheme we represent unions as aggregates
    with duplicated offsets (or, more generally, duplicate offsets
    within some aggregate). In your proposed scheme we represent unions
    as multiple children of scalar nodes. Transforming between the
    schemes seems only to require taking all consecutive members with
    duplicate offsets in the proposed evolved scheme and placing them
    into a union (scalar) node in your scheme.<br>
    <br>
    Given that the union representation seems equivalent by rewriting,
    the question is whether the representation of scalars is equivalent
    - that is, is representing a scalar as a single-member struct (or
    union, as the case may be) equivalent to the representation you've
    proposed? I don't see why this wouldn't be equivalent as well (i.e.
    when traversing the tree in your representation we'd not adjust the
    offset when visiting scalar nodes, and then we'd visit the child,
    which is what we currently do as well).<br>
    <br>
    Thanks again,<br>
    Hal<br>
    <br>
    <blockquote
cite="mid:CAF4BwTVyWUQHfKbM2uh67Hp6fy1b70kWoxi9G6gcqgRdCvGF3w@mail.gmail.com"
      type="cite">
      <div dir="ltr">
        <div class="gmail_extra">
          <div class="gmail_quote">
            <div>But i'm not doing the work, so if that's what you guys
              want to do, go for it.<br>
            </div>
            <div><br>
            </div>
            <div>--Dan</div>
          </div>
        </div>
      </div>
    </blockquote>
    <br>
    <pre class="moz-signature" cols="72">-- 
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory</pre>
  </body>
</html>