<html>
<head>
<meta content="text/html; charset=utf-8" http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000000">
<div class="moz-cite-prefix">On 08/10/2017 12:01 AM, Chandler
Carruth wrote:<br>
</div>
<blockquote
cite="mid:CAAwGriEt=S4cOXUZY6d3XCqi9dD4cAdXkrQUe4nnHhhRwAdEuA@mail.gmail.com"
type="cite">
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<div dir="ltr">
<div class="gmail_quote">
<div dir="ltr">On Wed, Aug 9, 2017 at 9:51 PM Hal Finkel &lt;<a
moz-do-not-send="true" href="mailto:hfinkel@anl.gov">hfinkel@anl.gov</a>&gt;
wrote:<br>
</div>
<blockquote class="gmail_quote" style="margin:0 0 0
.8ex;border-left:1px #ccc solid;padding-left:1ex">
<div bgcolor="#FFFFFF" text="#000000">
<p><br>
</p>
<div class="m_5578066034655955596moz-cite-prefix">On
08/09/2017 11:03 PM, Chandler Carruth wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">Hal already answered much of this, just
continuing this part of the discussion...
<div><br>
<div class="gmail_quote">
<div dir="ltr">On Wed, Aug 9, 2017 at 8:56 PM
Xinliang David Li via llvm-commits &lt;<a
moz-do-not-send="true"
href="mailto:llvm-commits@lists.llvm.org"
target="_blank">llvm-commits@lists.llvm.org</a>&gt;
wrote:<br>
</div>
<blockquote class="gmail_quote" style="margin:0 0
0 .8ex;border-left:1px #ccc
solid;padding-left:1ex">
<div dir="ltr">
<div class="gmail_extra">
<div class="gmail_quote">On Wed, Aug 9, 2017
at 8:37 PM, Hal Finkel <span dir="ltr">&lt;<a
moz-do-not-send="true"
href="mailto:hfinkel@anl.gov"
target="_blank">hfinkel@anl.gov</a>&gt;</span>
wrote:<br>
<blockquote class="gmail_quote"
style="margin:0 0 0 .8ex;border-left:1px
#ccc solid;padding-left:1ex">
<div bgcolor="#FFFFFF" text="#000000"><span>
<p><br>
</p>
<div
class="m_5578066034655955596m_-2021915036012865282m_-1753555538924036895moz-cite-prefix">On
08/09/2017 10:14 PM, Xinliang
David Li via llvm-commits wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">
<div class="gmail_extra">
<div class="gmail_quote">
<div> Can you elaborate here
too? If there were missed
optimizations that were later
fixed, there should be
regression tests for them,
right? And what
information is missing?</div>
</div>
</div>
</div>
</blockquote>
<br>
</span> To make a general statement,
if we load (a, i8) and (a+2, i16), for
example, and these came from some
structure, we've lost the information
that the load (a+1, i8) would have
been legal (i.e. is known to be
dereferenceable). This is not specific
to bit fields, but the fact that we
lose information on the
dereferenceable byte ranges around
memory accesses turns into a problem
when we later can't legally widen.
There may be a better way to keep this
information other than producing wide
loads (which is an imperfect
mechanism, especially the way we do it
by restricting to legal integer
types),</div>
</blockquote>
</div>
</div>
</div>
</blockquote>
<div><br>
</div>
<div>I don't think we have such a restriction?
Maybe I'm missing something. When I originally
added this logic, it definitely was not
restricted to legal integer types.</div>
</div>
</div>
</div>
</blockquote>
<br>
</div>
<div bgcolor="#FFFFFF" text="#000000"> I believe you're
right for bitfields. For general structures, however, we
certainly load individual fields instead of loading the
whole structure with some wide integer in order to
preserve dereferenceability information.</div>
</blockquote>
<div><br>
</div>
<div>I don't believe structures provide that information. See
below.</div>
<div> </div>
<blockquote class="gmail_quote" style="margin:0 0 0
.8ex;border-left:1px #ccc solid;padding-left:1ex">
<div bgcolor="#FFFFFF" text="#000000"><br>
<br>
<blockquote type="cite">
<div dir="ltr">
<div>
<div class="gmail_quote">
<div><br>
</div>
<blockquote class="gmail_quote" style="margin:0 0
0 .8ex;border-left:1px #ccc
solid;padding-left:1ex">
<div dir="ltr">
<div class="gmail_extra">
<div class="gmail_quote">
<blockquote class="gmail_quote"
style="margin:0 0 0 .8ex;border-left:1px
#ccc solid;padding-left:1ex">
<div bgcolor="#FFFFFF" text="#000000">
but at the moment, we don't have
anything better.<br>
</div>
</blockquote>
<div><br>
</div>
</div>
</div>
</div>
<div dir="ltr">
<div class="gmail_extra">
<div class="gmail_quote">
<div>Ok, as you mentioned, widening looks
like a workaround to paper over the
weakness in IR to annotate the
information. More importantly, my
question is whether this is just a
theoretical concern.</div>
</div>
</div>
</div>
</blockquote>
<div><br>
</div>
<div>I really disagree with this being a
workaround.</div>
<div><br>
</div>
<div>I think it is very fundamentally the correct
model -- the semantics are that this is a
single, wide memory operation that a narrow data
type is extracted from.</div>
</div>
</div>
</div>
</blockquote>
<br>
</div>
<div bgcolor="#FFFFFF" text="#000000"> That is one option.
We do need to preserve this information (maybe we can do
this with TBAA, or similar, or maybe using some other
mechanism entirely). However, we do try harder to do this
with bitfields than with other aggregates. If I have
struct { int a, b, c, d; } S; and I load S.d, we don't do
this by loading a 128-bit integer and then extracting some
part of it. Should we? Probably not.</div>
</blockquote>
<div><br>
</div>
<div>We cannot, it isn't allowed (I'm pretty sure...)</div>
<div><br>
</div>
<div>1) It violates the C++ (and C) memory model -- another thread
could be writing to the other variables.</div>
</div>
</div>
</blockquote>
<br>
Ah, you're correct. That does indeed motivate bitfields
being a special case. Do the comments explain that somewhere?<br>
<br>
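For concreteness, a minimal C++ sketch of the distinction as I
understand it: ordinary members are each their own memory location,
while a run of adjacent non-zero-width bit-fields forms a single one,
so only the latter may be read as one wide unit and then narrowed. All
names here (Plain, Packed, load_d, load_b, extract_b) are hypothetical,
and the bit positions in extract_b are an assumption made for the
example, since real bit-field layout is ABI-specific.<br>

<pre>
```cpp
// Sketch only: every name here is hypothetical, not taken from Clang or LLVM.

// Four ordinary members: each one is its own memory location, so a
// read of `d` must not touch the bytes of `a`, `b`, or `c`.
struct Plain {
  int a, b, c, d;
};

int load_d(const Plain *s) {
  return s->d; // one 4-byte load, not a 128-bit load plus extract
}

// Adjacent non-zero-width bit-fields: together they form a single
// memory location, so the whole storage unit may be accessed at once.
struct Packed {
  unsigned a : 3;
  unsigned b : 5;
  unsigned c : 8;
};

unsigned load_b(const Packed *p) {
  return p->b; // the compiler may load the full unit, then narrow
}

// A "hand coded" bit field with the same widths as Packed. The bit
// positions (a in bits 0-2, b in bits 3-7, c in bits 8-15) are assumed
// for this example only.
constexpr unsigned extract_b(unsigned word) {
  return (word >> 3) & 0x1Fu; // wide value first, then shift and mask
}

// 0x2A8D encodes a = 5, b = 17, c = 0x2A under the assumed layout.
static_assert(extract_b(0x2A8Du) == 17u, "b occupies bits 3-7");
```
</pre>
The point of the sketch is only that load_b and extract_b describe the
same kind of access (wide load, late narrowing), whereas load_d has no
such license.<br>
<br>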
I'll need to add this to my mental list of sometimes-unfortunate
semantics.<br>
<br>
-Hal<br>
<br>
<blockquote
cite="mid:CAAwGriEt=S4cOXUZY6d3XCqi9dD4cAdXkrQUe4nnHhhRwAdEuA@mail.gmail.com"
type="cite">
<div dir="ltr">
<div class="gmail_quote">
<div><br>
</div>
<div>2) Related to #1, there are applications that rely on
this memory model, for example structures where entire
regions of the structure live in protected pages and cannot
be correctly accessed.</div>
<div><br>
</div>
<div>3) Again related to #1, there are applications that rely
on the memory model when doing memory-mapped IO to avoid
reading or writing regions that are being updated by the OS
or other processes.</div>
<div><br>
</div>
<div>Bitfields are the only place where we have specific
license to widen access in the C++ memory model (that I'm
aware of)....</div>
<div> </div>
<blockquote class="gmail_quote" style="margin:0 0 0
.8ex;border-left:1px #ccc solid;padding-left:1ex">
<div bgcolor="#FFFFFF" text="#000000"> I suspect having
better support for aggregate memory access would be a
better solution. Or, as noted, using metadata or some
other secondary mechanism.<br>
</div>
</blockquote>
<div><br>
</div>
<div>FWIW, I actually agree that if we want to do more of
this, we would be better served by a different IR, but I
strongly suspect it would look more like first class
aggregates rather than metadata so that we could reason
about it more fundamentally in terms of SSA.</div>
<div><br>
</div>
<div>But bitfields are (IMO) an importantly different problem
in that they are mergeable in interesting and important ways
due to being integers and often sub-byte integers.
This is why a single large integer combined with late
narrowing seems like a particularly desirable way to
represent the fundamental information of the semantic
constraints of the program.</div>
<div><br>
</div>
<blockquote class="gmail_quote" style="margin:0 0 0
.8ex;border-left:1px #ccc solid;padding-left:1ex">
<div bgcolor="#FFFFFF" text="#000000">Maybe more
aggressively preserving this information for bit fields is
the right answer, empirically. I can believe that's true.
The more-general problem still exists, however.</div>
</blockquote>
<div><br>
</div>
<div>For other languages / semantics, yes. Increasingly I
think a (better designed / integrated / spec'ed, etc) system
like FCAs would work particularly well at making this easy
to express and reason about. But it would be a pretty
significant change.</div>
<div> </div>
<blockquote class="gmail_quote" style="margin:0 0 0
.8ex;border-left:1px #ccc solid;padding-left:1ex">
<div bgcolor="#FFFFFF" text="#000000"><br>
The thing that appeals to me about the IR-transformation
approach is the ability to handle "hand coded" bit fields
as effectively as language-level bit fields. I've
certainly seen my share of these, and they're definitely
important. Moreover, this is true regardless of what we
think about the underlying optimal model for preserving
aggregate dereferenceability in general.<br>
</div>
</blockquote>
<div><br>
</div>
<div>Completely agree. Teaching LLVM to handle wide integer
accesses will be beneficial no matter what decisions are
made here.</div>
</div>
</div>
</blockquote>
<br>
<pre class="moz-signature" cols="72">--
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory</pre>
</body>
</html>