<html>
<head>
<meta content="text/html; charset=utf-8" http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000000">
<div class="moz-cite-prefix">This thread is deep enough and the
start of it confrontational enough, that I doubt enough people are
reading this deep. Please rephrase this as a separate RFC to
ensure visibility. <br>
<br>
For the record, the overall direction you're sketching seems
entirely reasonable to me. <br>
<br>
Philip<br>
<br>
On 08/18/2015 10:31 PM, deadal nix via llvm-dev wrote:<br>
</div>
<blockquote
cite="mid:CANGV3T1DeTU0Fj=Zu27Npw2qCG9-LB=SmZKSstdjgFWUSmefFQ@mail.gmail.com"
type="cite">
<div dir="ltr">
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>It is pretty clear people need this. Let's
get this moving.<br>
<br>
</div>
I'll try to sum up the points that have been made
and address them carefully.<br>
</div>
<br>
1/ There is no good solution for large aggregates.<br>
</div>
That is true. However, I don't think this is a
reason not to address smaller aggregates, as they
appear to be needed. Realistically, the proportion
of aggregates that are very large is small, and
there is no expectation that such a thing would map
nicely to the hardware anyway (the hardware won't
have enough registers to load it all). I do
think it is reasonable to expect sensible
handling of relatively small aggregates like fat
pointers while accepting that large ones will be
inefficient.<br>
<br>
</div>
<div>This limitation is not unique to the current
discussion, as SROA suffers from the same limitation.<br>
</div>
<div>It is possible to disable the transformation for
aggregates that are too large if this is too big of
a concern. It should maybe also be done for SROA.<br>
</div>
<div><br>
</div>
2/ Slicing the aggregate breaks the semantics of
atomic/volatile.<br>
</div>
That is true. It means slicing the aggregate should not
be done for atomic/volatile. It doesn't mean this should
not be done for regular ones, as it is reasonable to
handle atomic/volatile differently. After all, they have
different semantics.<br>
<br>
</div>
3/ Not slicing can create scalars that aren't supported by
the target. This is undesirable.<br>
</div>
Indeed. But as always, the important question is: compared to
what?<br>
<br>
</div>
The hardware has no notion of aggregate, so an aggregate and a
large scalar both end up requiring legalization. Doing the
transformation is still beneficial:<br>
</div>
- Some aggregates will generate valid scalars. For such
aggregates, this is a 100% win.<br>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div> - For aggregates that won't, the situation is
still better, as various optimization passes will
be able to handle the load in a sensible manner.<br>
</div>
<div> - The transformation never makes the
situation worse than it is to begin with.<br>
<br>
</div>
<div>In the previous discussion, Hal Finkel seemed to
think that the scalar solution is preferable to
the slicing one.<br>
<br>
</div>
<div>Is that a fair assessment of the situation?
Considering all of this, I think the right path
forward is:<br>
</div>
<div> - Go for the scalar solution in the general
case.<br>
</div>
<div> - If that is a problem, the slicing approach
can be used for non-atomic, non-volatile accesses.<br>
</div>
<div> - If necessary, disable the transformation
for very large aggregates (and consider doing so
for SROA as well).<br>
<br>
</div>
<div>Do we have a plan?<br>
</div>
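<div>As a rough illustration of the three-step plan above, the decision could be sketched like this. It is written in Python purely to show the logic and is not LLVM code: the names choose_lowering, AggregateAccess and Lowering, the scalar_is_problematic flag, and the 64-byte cutoff are all hypothetical and chosen only for the example.<br>
</div>

```python
from dataclasses import dataclass
from enum import Enum


class Lowering(Enum):
    SCALAR = "load/store the aggregate as one wide integer"
    SLICE = "split the access into several smaller loads/stores"
    LEAVE = "leave the aggregate access untouched"


@dataclass
class AggregateAccess:
    size_bytes: int
    is_atomic: bool = False
    is_volatile: bool = False


# Hypothetical cutoff: the thread does not propose a specific number.
MAX_BYTES = 64


def choose_lowering(access, scalar_is_problematic=False, max_bytes=MAX_BYTES):
    # Step 3 of the plan: very large aggregates are left alone.
    if access.size_bytes > max_bytes:
        return Lowering.LEAVE
    # Step 2: slicing is only allowed when it cannot break
    # atomic/volatile semantics.
    if scalar_is_problematic and not (access.is_atomic or access.is_volatile):
        return Lowering.SLICE
    # Step 1: the scalar solution is the general case.
    return Lowering.SCALAR


if __name__ == "__main__":
    fat_pointer = AggregateAccess(size_bytes=16)
    print(choose_lowering(fat_pointer).name)
    print(choose_lowering(AggregateAccess(size_bytes=1024 * 1024)).name)
```

<div>The size check comes first so that the cutoff applies regardless of how the access would otherwise be lowered; whether that ordering is right for atomics is one of the open questions.<br>
</div>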
<div><br>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
<div class="gmail_extra"><br>
<div class="gmail_quote">2015-08-18 18:36 GMT-07:00 Nicholas
Chapman via llvm-dev <span dir="ltr"><<a
moz-do-not-send="true"
href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a>></span>:<br>
<blockquote class="gmail_quote" style="margin:0 0 0
.8ex;border-left:1px #ccc solid;padding-left:1ex">
<div bgcolor="#FFFFFF" text="#000000"> Oh,<br>
and another potential reason for handling aggregate loads
and stores directly is that it expresses the semantics of
the program more clearly, which I think should allow LLVM
to optimise more aggressively.<br>
Here's a bug report showing a missed optimisation, which I
think is due to the use of memcpy, which in turn is
required to work around slow structure loads and stores:<br>
<a moz-do-not-send="true"
href="https://llvm.org/bugs/show_bug.cgi?id=23226"
target="_blank">https://llvm.org/bugs/show_bug.cgi?id=23226</a><br>
<br>
Cheers,<br>
Nick<span class=""><br>
<div>On 17/08/2015 22:02, mats petersson via llvm-dev
wrote:<br>
</div>
</span>
<div>
<div class="h5">
<blockquote type="cite">
<div dir="ltr">
<div>
<div>
<div>
<div>
<div>
<div>I've definitely "run into this
problem", and I would very much love
to remove my kludges [that are
incomplete, because I keep finding
places where I need to modify the
code-gen to "fix" the same problem -
this is probably par for the course
from a complete amateur compiler
writer and someone that has only spent
the last 14 months working (as a
hobby) with LLVM]. <br>
<br>
</div>
So whilst I can't contribute much on the
"what is the right solution" and "how do
we solve this", I would very much like
to see something that allows the user of
LLVM to use load/store without things
like "is my thing that I'm storing big,
if so don't generate a load, use a
memcpy instead". Not only does this make
the usage of LLVM harder, it also causes
slow compilation [perhaps this is a
separate problem, but I have a simple
program that copies a large struct a few
times, and if I turn off my "use memcpy
for large things", the compile time gets
quite a lot longer - approx 1000x, and
48 seconds is a long time to compile 37
lines of relatively straightforward
code - even the Pascal compiler on
PDP-11/70 that I used at my school in
the 1980s was capable of doing more than 1
line per second, and it didn't run
anywhere near 2.5GHz and had 20-30 users
anytime I could use it...]<br>
<br>
../lacsap -no-memcpy -tt longcompile.pas
<br>
Time for Parse 0.657 ms<br>
Time for Analyse 0.018 ms<br>
Time for Compile 1.248 ms<br>
Time for CreateObject 48803.263 ms<br>
Time for CreateBinary 48847.631 ms<br>
Time for Compile 48854.064 ms<br>
<br>
</div>
compared with:<br>
../lacsap -tt longcompile.pas <br>
Time for Parse 0.455 ms<br>
Time for Analyse 0.013 ms<br>
Time for Compile 1.138 ms<br>
Time for CreateObject 44.627 ms<br>
Time for CreateBinary 82.758 ms<br>
Time for Compile 95.797 ms<br>
<br>
</div>
wc longcompile.pas <br>
37 84 410 longcompile.pas<br>
<br>
</div>
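<div>To put numbers on the two runs above, here is a small Python calculation that uses only the timings and the line count quoted in this message. Treating the final "Time for Compile" line of each run as the overall total is an assumption about lacsap's output.<br>
</div>

```python
# Timings copied from the two lacsap runs above, in milliseconds.
# "Compile" is the last "Time for Compile" line of each run, assumed
# here to be the overall total.
no_memcpy = {"CreateObject": 48803.263, "Compile": 48854.064}
with_memcpy = {"CreateObject": 44.627, "Compile": 95.797}
source_lines = 37  # from `wc longcompile.pas`

# Slowdown factor per phase when the memcpy workaround is disabled.
slowdown = {phase: no_memcpy[phase] / with_memcpy[phase] for phase in no_memcpy}

# Throughput of the slow run in source lines per second.
lines_per_second = source_lines / (no_memcpy["Compile"] / 1000.0)

for phase, ratio in slowdown.items():
    print(f"{phase}: {ratio:.0f}x slower without memcpy")
print(f"{lines_per_second:.2f} lines per second without memcpy")
```

<div>That works out to roughly 1000x for the CreateObject phase and well under one source line per second overall, in line with the figures given above.<br>
</div>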
Source here:<br>
<a moz-do-not-send="true"
href="https://github.com/Leporacanthicus/lacsap/blob/master/test/longcompile.pas"
target="_blank">https://github.com/Leporacanthicus/lacsap/blob/master/test/longcompile.pas</a><br>
<br>
</div>
<br>
--<br>
</div>
Mats<br>
</div>
<div class="gmail_extra"><br>
<div class="gmail_quote">On 17 August 2015 at
21:18, deadal nix via llvm-dev <span dir="ltr"><<a
moz-do-not-send="true"
href="mailto:llvm-dev@lists.llvm.org"
target="_blank">llvm-dev@lists.llvm.org</a>></span>
wrote:<br>
<blockquote class="gmail_quote" style="margin:0
0 0 .8ex;border-left:1px #ccc
solid;padding-left:1ex">
<div dir="ltr">
<div>
<div>
<div>OK, what about that plan :<br>
<br>
</div>
Slice the aggregate into a series of
valid loads/stores for non-atomic ones.<br>
</div>
Use big scalar for atomic/volatile ones.<br>
</div>
Try to generate memcpy or memmove when
possible ?<br>
<div><br>
</div>
</div>
<div>
<div>
<div class="gmail_extra"><br>
<div class="gmail_quote">2015-08-17
12:16 GMT-07:00 deadal nix <span
dir="ltr"><<a
moz-do-not-send="true"
href="mailto:deadalnix@gmail.com"
target="_blank">deadalnix@gmail.com</a>></span>:<br>
<blockquote class="gmail_quote"
style="margin:0 0 0
.8ex;border-left:1px #ccc
solid;padding-left:1ex">
<div dir="ltr"><br>
<div class="gmail_extra"><br>
<div class="gmail_quote"><span>2015-08-17
11:26 GMT-07:00 Mehdi Amini
<span dir="ltr"><<a
moz-do-not-send="true"
href="mailto:mehdi.amini@apple.com"
target="_blank">mehdi.amini@apple.com</a>></span>:<br>
<blockquote
class="gmail_quote"
style="margin:0 0 0
.8ex;border-left:1px #ccc
solid;padding-left:1ex">
<div
style="word-wrap:break-word">Hi,
<div><br>
<div><span>
<blockquote
type="cite">
<div>On Aug 17,
2015, at 12:13
AM, deadal nix
via llvm-dev
<<a
moz-do-not-send="true"
href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a>>
wrote:</div>
<br>
<div>
<div dir="ltr"><br>
<div
class="gmail_extra"><br>
<div
class="gmail_quote">2015-08-16
23:21
GMT-07:00
David Majnemer
<span
dir="ltr"><<a
moz-do-not-send="true" href="mailto:david.majnemer@gmail.com"
target="_blank">david.majnemer@gmail.com</a>></span>:<br>
<blockquote
class="gmail_quote"
style="margin:0
0 0
.8ex;border-left:1px
#ccc
solid;padding-left:1ex">
<div dir="ltr"><br>
<div
class="gmail_extra"><br>
<div
class="gmail_quote"><span></span>
<div>Because a
solution which
doesn't
generalize is
not a very
powerful
solution.
What happens
when somebody
says that they
want to use
atomics +
large
aggregate
loads and
stores? Give
them yet
another,
different
answer? That
would mean our
earlier, less
general
answer,
approach was
either a
bandaid (bad)
or the new
answer
requires a
parallel code
path in their
frontend
(worse).</div>
</div>
</div>
</div>
</blockquote>
</div>
</div>
</div>
</div>
</blockquote>
<div><br>
</div>
<div><br>
</div>
</span>
<div>+1 with David’s
approach: making
things
incrementally
better is fine *as
long as* the long
term direction is
identified. Small
incremental
changes that make
things slightly
better in the
short term but
drive us away from
the long term
direction are not
good.</div>
<div><br>
</div>
<div>Don’t get me
wrong, I’m not
saying that the
current patch is
not good, just
that it does not
seem clear to me
that the long term
direction has been
identified, which
explains why some
can be nervous
about adding stuff
prematurely. </div>
<div>And I’m not for
the status quo:
while I can’t
judge it
definitively
myself, I even
bugged David last
month to look at
this revision and
try to identify
what the
long term
direction really is and how
to make your (and
other) frontends’
life easier. </div>
<span>
<div><br>
</div>
<div><br>
</div>
</span></div>
</div>
</div>
</blockquote>
<div><br>
</div>
</span>
<div>As long as there is
something to be done.
Concern has been raised about
very large aggregates (64K,
1Mb), but there is no way
good codegen can come out of
these anyway. I don't know
of any machine that has 1Mb
of registers available to
tank the load. Even if we had
a good way to handle it in
InstCombine, the backend
would have no capability to
generate something nice for
it anyway. Most aggregates
are small and there is no
good excuse to not do
anything to handle them
because someone could
generate gigantic ones that
won't map nicely to the
hardware anyway.<br>
<br>
</div>
<div>By that logic, SROA
should not exist, as one
could generate gigantic
aggregates as well (in fact,
SROA fails pretty badly on
large aggregates).<br>
<br>
</div>
<div>The second concern raised
is for atomic/volatile,
which needs to be handled by
the optimizer differently
anyway, so is mostly
irrelevant here.<br>
</div>
</div>
<span>
<div class="gmail_quote">
<div> </div>
<blockquote
class="gmail_quote"
style="margin:0 0 0
.8ex;border-left:1px #ccc
solid;padding-left:1ex">
<div
style="word-wrap:break-word">
<div>
<div><span>
<blockquote
type="cite">
<div>
<div dir="ltr">
<div
class="gmail_extra">
<div
class="gmail_quote">
<blockquote
class="gmail_quote"
style="margin:0
0 0
.8ex;border-left:1px
#ccc
solid;padding-left:1ex">
<div dir="ltr">
<div
class="gmail_extra">
<div
class="gmail_quote"><span>
<div> </div>
</span></div>
</div>
</div>
</blockquote>
<br>
</div>
<br>
</div>
<div
class="gmail_extra">clang
has many
developers
behind it,
some of them
paid to work
on it. That's
simply not the
case for many
others.<br>
<br>
</div>
<div
class="gmail_extra">But
to answer your
questions :<br>
</div>
<div
class="gmail_extra"> -
Per-field
load/store
generates more
loads/stores
than necessary
in many cases.
These can't be
aggregated
back because
of padding.<br>
</div>
<div
class="gmail_extra"> -
memcpy only
works memory to
memory. It is
certainly
usable in some
cases, but
certainly does
not cover all
uses.<br>
</div>
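<div>To illustrate the padding point in the first bullet, here is a small Python sketch that uses ctypes to inspect a C-style layout. The struct (a 1-byte tag followed by a 4-byte value) is a hypothetical example, and the exact amount of padding depends on the platform ABI.<br>
</div>

```python
import ctypes


class Tagged(ctypes.Structure):
    # Hypothetical layout: a 1-byte tag followed by a 4-byte value.
    _fields_ = [("tag", ctypes.c_uint8), ("value", ctypes.c_uint32)]


# Bytes actually covered by per-field accesses.
field_bytes = sum(ctypes.sizeof(field_type) for _, field_type in Tagged._fields_)
# Bytes occupied by the whole aggregate, padding included.
total_bytes = ctypes.sizeof(Tagged)
padding_bytes = total_bytes - field_bytes
per_field_accesses = len(Tagged._fields_)

print(f"{per_field_accesses} per-field loads cover {field_bytes} of {total_bytes} bytes")
print(f"value starts at offset {Tagged.value.offset}")
print(f"{padding_bytes} padding bytes are not covered by any field")
```

<div>On common ABIs this reports padding between the two fields, so the two per-field accesses are not contiguous and a later pass cannot simply fuse them into one wide access without also touching the padding bytes.<br>
</div>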
<div
class="gmail_extra"><br>
</div>
<div
class="gmail_extra">I'm
willing to do
the memcpy
optimization
in InstCombine
(in fact, if
things had not
degenerated
into so much
bikeshedding,
that would
already be
done).<br>
</div>
</div>
</div>
</blockquote>
<div><br>
</div>
</span></div>
</div>
<div>Calling
“bikeshedding” what
other devs think is
what keeps the quality
of the project high is
unlikely to help your
patch go through; it’s
probably quite the
opposite actually.</div>
<div><br>
</div>
<div><br>
</div>
</div>
</blockquote>
<br>
</div>
</span>I understand the desire
to keep quality high. That is
not where the problem is. The
problem lies in discussing
actual proposals against
hypothetical perfect ones that
do not exist.<br>
</div>
<div class="gmail_extra"><br>
</div>
</div>
</blockquote>
</div>
<br>
</div>
</div>
</div>
<br>
_______________________________________________<br>
LLVM Developers mailing list<br>
<a moz-do-not-send="true"
href="mailto:llvm-dev@lists.llvm.org"
target="_blank">llvm-dev@lists.llvm.org</a>
<a moz-do-not-send="true"
href="http://llvm.cs.uiuc.edu"
rel="noreferrer" target="_blank">http://llvm.cs.uiuc.edu</a><br>
<a moz-do-not-send="true"
href="http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev"
rel="noreferrer" target="_blank">http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev</a><br>
<br>
</blockquote>
</div>
<br>
</div>
<br>
</blockquote>
<br>
</div>
</div>
</div>
<br>
</blockquote>
</div>
<br>
</div>
<br>
</blockquote>
<br>
</body>
</html>