<html>
<head>
<meta content="text/html; charset=utf-8" http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000000">
<p><br>
</p>
<div class="moz-cite-prefix">On 05/14/2017 04:39 PM, Daniel Berlin
wrote:<br>
</div>
<blockquote
cite="mid:CAF4BwTVyWUQHfKbM2uh67Hp6fy1b70kWoxi9G6gcqgRdCvGF3w@mail.gmail.com"
type="cite">
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<div dir="ltr"><br>
<div class="gmail_extra"><br>
<div class="gmail_quote">On Sun, May 14, 2017 at 11:01 AM, Hal
Finkel <span dir="ltr"><<a moz-do-not-send="true"
href="mailto:hfinkel@anl.gov" target="_blank">hfinkel@anl.gov</a>></span>
wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0
.8ex;border-left:1px #ccc solid;padding-left:1ex">
<div bgcolor="#FFFFFF" text="#000000">
<div>
<div class="h5">
<p><br>
</p>
<div class="m_-6131269866192422863moz-cite-prefix">On
05/14/2017 12:49 PM, Daniel Berlin wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr"><br>
<div class="gmail_extra"><br>
<div class="gmail_quote">On Sun, May 14, 2017
at 10:20 AM, Hal Finkel <span dir="ltr"><<a
moz-do-not-send="true"
href="mailto:hfinkel@anl.gov"
target="_blank">hfinkel@anl.gov</a>></span>
wrote:<br>
<blockquote class="gmail_quote"
style="margin:0 0 0 .8ex;border-left:1px
#ccc solid;padding-left:1ex">
<div bgcolor="#FFFFFF" text="#000000">
<div>
<div class="m_-6131269866192422863h5">
<p><br>
</p>
<div
class="m_-6131269866192422863m_-7109946281551088390moz-cite-prefix">On
05/14/2017 11:06 AM, Daniel Berlin
wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr"><br>
<div class="gmail_extra"><br>
<div class="gmail_quote">On
Sun, May 14, 2017 at 8:37
AM, Hal Finkel <span
dir="ltr"><<a
moz-do-not-send="true"
href="mailto:hfinkel@anl.gov"
target="_blank">hfinkel@anl.gov</a>></span>
wrote:<br>
<blockquote
class="gmail_quote"
style="margin:0 0 0
.8ex;border-left:1px #ccc
solid;padding-left:1ex">
<div bgcolor="#FFFFFF"
text="#000000"><span>
<p><br>
</p>
<div
class="m_-6131269866192422863m_-7109946281551088390m_7871019985833609421moz-cite-prefix">On
03/01/2017 05:30 PM,
Daniel Berlin via
llvm-dev wrote:<br>
</div>
<blockquote
type="cite">
<div dir="ltr">So, <a
moz-do-not-send="true"
href="https://bugs.llvm.org/show_bug.cgi?id=32056"
target="_blank">https://bugs.llvm.org/show_bug<wbr>.cgi?id=32056</a>
is an example
showing our
current TBAA tree
for union
generation is
definitely
irretrievably
broken.
<div>I'll be
honest here. I'm
pretty sure your
proposal doesn't
go far enough.</div>
<div>But
truthfully, I
would rather see
us come closer
to a
representation
we know works,
which is GCC's.</div>
<div>Let me try to
simplify what
you are
suggesting, and
what we have.</div>
<div>Our current
representation
is basically
inverted from
GCC, but we
don't allow
things that
would enable it
to work.<br>
</div>
<div><br>
</div>
<div>Given</div>
<div>union {int a,
short b};</div>
<div><br>
</div>
<div>GCC's will
be:</div>
<div><br>
</div>
<div> union</div>
<div> / \</div>
<div>short int</div>
<div><br>
</div>
<div><br>
</div>
<div>Everything is
implicitly a
subset of alias
set 0 (which for
C/C++ is char)
just to avoid
representing it.</div>
<div><br>
</div>
<div>Our metadata
has no child
links, and only
a single parent
link.</div>
<div><br>
</div>
<div>You can't
represent the
above form
because you
can't make a
single short
node a child of
every
union/struct it
needs to be
(lack of
multiple parents
means you can't
just frob them
all to offset
zero).</div>
<div><br>
</div>
<div>Your
suggestion is to
invert this as a
struct</div>
<div><br>
</div>
<div>short int</div>
<div> | / </div>
<div>union</div>
<div><br>
</div>
<div>We don't
allow multiple
parents,
however.</div>
<div>Because of
that, you
suggest we
follow all nodes
for unions,
special casing
union-type nodes
somehow</div>
</div>
</blockquote>
<br>
</span> Now that I've
spent a bunch of time
looking at this, I'd
like to voice support
for Steven's original
proposal. In the context
of what we have, it
makes sense, seems
sound, and should fix
the representational
mapping problem we
currently have. </div>
</blockquote>
<div><br>
</div>
<div>Except you can't easily
differentiate it from the
current one, and if we are
going to have to
upgrade/break
compatibility, why not
just do it once right, a
way we know works, instead
of risk screwing it up
again, and playing with a
representation we aren't
actually sure we can make
efficient for this case?<br>
</div>
</div>
</div>
</div>
</blockquote>
<br>
</div>
</div>
I don't see why need to break backward
compatibility. Do you have something
specific in mind? Once we extend the
TBAA implementation to treat repeated
offsets as disjunction, then we'll
extend Clang to emit metadata for unions
in this form. Old IR will work exactly
as it does now.</div>
</blockquote>
<div> </div>
<div>Except the Old IR is irretrievably
broken. and in this method, you can't tell
whether you are dealing with correct TBAA
or not.</div>
<div><br>
</div>
<div>Earlier, the discussion was basically
"detect old IR, disable TBAA since it's
broken, make new IR work".</div>
<div><br>
</div>
<div>If we do that, we need a way to
distinguish new vs old IR.</div>
</div>
</div>
</div>
</blockquote>
<br>
<br>
</div>
</div>
Ah, okay. I don't think that's desirable in general.
There are frontends emitting perfectly self-consistent
TBAA metadata, and there's no reason they should change.
Clang's metadata is broken because the mapping between
language rules and TBAA is broken. If we'd like, we can
blacklist TBAA metadata coming from root nodes named
"Simple C++ TBAA" and "Simple C/C++ TBAA" (or provide an
option to do that because I'm not convinced it is worth
trying to retroactively "fix" old IR by default given
the associated performance penalties). After the fix,
Clang can emit root nodes with different names (they're
arbitrary). Changing the root-node names will also give
the conservatively-correct behavior when linking old IR
with new IR.<span class=""><br>
<br>
</span></div>
</blockquote>
<div><br>
</div>
<div>I still continue to believe trying to patch the
existing mechanism this way is not a great plan, and
likely to end us in a place where we still have either
correctness or performance issues.</div>
</div>
</div>
</div>
</blockquote>
<br>
I don't agree, but this is because I fail to see how the two
representations (the GCC-like one you've outlined and the current
one with the proposed extension) aren't completely isomorphic. Your
proposal is:<br>
<br>
<blockquote type="cite">
<div>scalar type node -> {name, children nodes}<br>
</div>
<div>struct node -> {name, array of {offset, child node} }</div>
<div><br>
</div>
<div>Paths are separate from the tbaa tree itself, and are just:</div>
<div>path node -> {struct node, offset} or whatever.</div>
<div><br>
</div>
<div>unions are just scalar type nodes with multiple children, not
struct nodes with special-cased offset zero.</div>
</blockquote>
<br>
the evolutionary scheme can be described in similar terms:<br>
<br>
aggregate-type node -> { name, array of { offset, member-type } }<br>
<br>
For a root node, the array is empty (i.e. it is only a name).<br>
<br>
For a scalar type, the array only has one entry: the base/parent
type. <br>
<br>
For struct types, the array has all of the members.<br>
<br>
For union types, the array has all of the members with duplicate
offsets (likely zero, although there may be cases, such as anonymous
unions in structs, where the duplicate offsets are not zero).<br>
<br>
Paths in the current scheme, as in your proposal, are separate from
the TBAA DAG itself, and are just:<br>
path node -> {access-type node, aggregate-type node, offset}<br>
<br>
For scalar accesses, access type == aggregate type and offset == 0.
This could be more representationally efficient, but it is done this
way to provide a uniform way to handle both scalars and aggregates
and provide a uniform way to differentiate these node from the
old-style scalar-only TBAA (to be fair, however, likely more the
latter than the former). In any case, aside from the fact that we
explicitly include the access type, this is the same.<br>
<br>
In both cases the edges of the hierarchy run between structs and
their members, so regardless of whether you draw the graph going up
or down (i.e. call the edges parents or children) that aspect of the
representation seems equivalent. The differences seem to be:<br>
<br>
1. Our current scheme represents scalar types like it represents
structs with a single member (i.e. a single base type at offset 0).
In your proposed scheme we have separate scalar nodes.<br>
<br>
2. The proposed evolved scheme we represent unions as aggregates
with duplicated offsets (or, more generally, duplicate offsets
within some aggregate). In your proposed scheme we represent unions
as multiple children of scalar nodes. Transforming between the
schemes seems only to require taking all consecutive members with
duplicate offsets in the proposed evolved scheme and placing them
into a union (scalar) node in your scheme.<br>
<br>
Given that the union representation seems equivalent by rewriting,
the question is whether the representation of scalars is equivalent
- that is, is representing a scalar as a single-member struct (or
union, as the case may be) equivalent to the representation you've
proposed? I don't see why this wouldn't be equivalent as well (i.e.
when traversing the tree in your representation we'd not adjust the
offset when visiting scalar nodes, and then we'd visit the child,
which is what we currently do as well).<br>
<br>
Thanks again,<br>
Hal<br>
<br>
<blockquote
cite="mid:CAF4BwTVyWUQHfKbM2uh67Hp6fy1b70kWoxi9G6gcqgRdCvGF3w@mail.gmail.com"
type="cite">
<div dir="ltr">
<div class="gmail_extra">
<div class="gmail_quote">
<div>But i'm not doing the work, so if that's what you guys
want to do, go for it.<br>
</div>
<div><br>
</div>
<div>--Dan</div>
</div>
</div>
</div>
</blockquote>
<br>
<pre class="moz-signature" cols="72">--
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory</pre>
</body>
</html>