<html>
<head>
<meta content="text/html; charset=utf-8" http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000000">
<div class="moz-cite-prefix">On 3/7/2017 1:57 PM, Wei Mi wrote:<br>
</div>
<blockquote
cite="mid:CA+4CFy5-z2O3taJkobV9FqZ2A1Jqme_vps7oi9Wzm1jm89mZgg@mail.gmail.com"
type="cite">
<div dir="ltr"><br>
<div class="gmail_extra"><br>
<div class="gmail_quote">On Tue, Mar 7, 2017 at 10:51 AM, Wei
Mi <span dir="ltr"><<a moz-do-not-send="true"
href="mailto:wmi@google.com" target="_blank">wmi@google.com</a>></span>
wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0
.8ex;border-left:1px #ccc solid;padding-left:1ex">
<div dir="ltr"><br>
<div class="gmail_extra"><br>
<div class="gmail_quote">
<div>
<div class="h5">On Tue, Mar 7, 2017 at 9:36 AM,
Friedman, Eli <span dir="ltr"><<a
moz-do-not-send="true"
href="mailto:efriedma@codeaurora.org"
target="_blank">efriedma@codeaurora.org</a>></span>
wrote:<br>
<blockquote class="gmail_quote"
style="margin:0px 0px 0px
0.8ex;border-left:1px solid
rgb(204,204,204);padding-left:1ex">
<div bgcolor="#FFFFFF">
<div>
<div class="m_-182419626503815813gmail-h5">
<div
class="m_-182419626503815813gmail-m_-940156540931862899moz-cite-prefix">On
3/7/2017 9:15 AM, Wei Mi wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr"><br>
<div class="gmail_extra"><br>
<div class="gmail_quote">On Mon,
Mar 6, 2017 at 8:50 PM, Wei Mi <span
dir="ltr"><<a
moz-do-not-send="true"
href="mailto:wmi@google.com"
target="_blank">wmi@google.com</a>></span>
wrote:<br>
<blockquote class="gmail_quote"
style="margin:0px 0px 0px
0.8ex;border-left:1px solid
rgb(204,204,204);padding-left:1ex">
<div dir="ltr"><br>
<div class="gmail_extra"><br>
<div class="gmail_quote"><span>On
Mon, Mar 6, 2017 at
4:36 PM, Friedman, Eli
<span dir="ltr"><<a
moz-do-not-send="true" href="mailto:efriedma@codeaurora.org"
target="_blank">efriedma@codeaurora.org</a>></span>
wrote:<br>
</span><span>
<blockquote
class="gmail_quote"
style="margin:0px
0px 0px
0.8ex;border-left:1px
solid
rgb(204,204,204);padding-left:1ex"><span
class="m_-182419626503815813gmail-m_-940156540931862899m_-8464126786741311821gmail-">On
3/6/2017 11:06 AM,
Wei Mi wrote:<br>
<blockquote
class="gmail_quote"
style="margin:0px 0px 0px 0.8ex;border-left:1px solid
rgb(204,204,204);padding-left:1ex">
After looking at
the benchmark
motivating the
test, I find it
is much harder
for
codegenprepare
to catch all the
shrinking
opportunities
compared with
instcombine,
because of two
things:<br>
<br>
* loadpre may
merge the
original load in
a shrinking
pattern with
other load in a
very early
position, and
there maybe
other stores in
between. The
safety check for
the
codegenprepare
now simply scan
mayWriteToMemory
insns between
load and store
in the candidate
shrinking
pattern, and it
will fail
because of those
stores in
between.<br>
</blockquote>
<br>
</span> It would be
easy to make the
check a lot more
powerful using
MemorySSA; I guess
you don't have
access to that in
the passes where
you're doing the
transform?</blockquote>
<div><br>
</div>
</span>
<div>Yes, if MemorySSA +
alias query can be
used, it should solve
the problem. I didn't
find existing usage of
MemorySSA in
instcombine or CGP so
I am not sure whether
it can be used. </div>
<span>
<div> </div>
<blockquote
class="gmail_quote"
style="margin:0px
0px 0px
0.8ex;border-left:1px
solid
rgb(204,204,204);padding-left:1ex"><span
class="m_-182419626503815813gmail-m_-940156540931862899m_-8464126786741311821gmail-"><br>
<br>
<blockquote
class="gmail_quote"
style="margin:0px 0px 0px 0.8ex;border-left:1px solid
rgb(204,204,204);padding-left:1ex">
* we may see the
following
pattern. a.f1 is
a bitfield.
Ideally, we
should do the
shrinking for
all the three
stores. But
codegenprepare
works in
top-down order.
After the first
store is
shrinked, it is
hard for the
second and the
third to know
%bf.set is the
same as the
value loaded
from %0. If we
do this in
instcombine, it
is much easier
because before
load/store
elimination,
%bf.set will be
loaded from %0
respecitively in
if.then9 and in
if.end41.<br>
</blockquote>
<br>
</span> I'm not
following how it
helps to do this in
instcombine?
Normally, EarlyCSE
runs first. (A
complete C testcase
might be useful
here.)</blockquote>
<div><br>
</div>
</span>
<div>It is not earlycse.
It is also loadpre to
cause the load to be
replaced by %bf.set. </div>
<div><br>
</div>
<div>
<div
style="font-size:12.8px">Before
global value
numbering:</div>
<span>
<div
style="font-size:12.8px"><br
class="m_-182419626503815813gmail-m_-940156540931862899m_-8464126786741311821gmail-Apple-interchange-newline">
entry:</div>
<div
style="font-size:12.8px">
%0 = bitcast
%class.B* %this to
i64*</div>
<div
style="font-size:12.8px">
%bf.load = load
i64, i64* %0,
align 8</div>
<div
style="font-size:12.8px">
<div> %dec = add
i64 %bf.load,
65535</div>
<div> %bf.value =
and i64 %dec,
65535</div>
<div> %bf.clear5
= and i64
%bf.load, -65536</div>
<div> %bf.set =
or i64
%bf.value,
%bf.clear5</div>
<div> store i64
%bf.set, i64*
%0, align 8</div>
</div>
<div
style="font-size:12.8px">
...</div>
<div
style="font-size:12.8px">if.then9:</div>
</span>
<div
style="font-size:12.8px"> <span
style="font-size:12.8px">%bf.load1 = load i64, i64* %0, align 8
//
%bf.load1 is
helpful for the
patch to recognize
the
load-and-or-store
pattern.</span></div>
<div
style="font-size:12.8px">
<div> %inc79 = add
i64 %bf.load1,
281474976710656
</div>
<span>
<div> %bf.shl =
and i64 %inc79,
71776119061217280</div>
</span>
<div> %bf.clear18 =
and i64 %bf.load1,
-71776119061217281</div>
<span>
<div> %bf.set19 =
or i64 %bf.shl,
%bf.clear18</div>
<div> store i64
%bf.set19, i64*
%0, align 8</div>
</span></div>
<div
style="font-size:12.8px">
...</div>
<div
style="font-size:12.8px">if.end41:</div>
<div
style="font-size:12.8px"> <span
style="font-size:12.8px">%bf.load2 = load i64, i64* %0, align 8
</span><span
style="font-size:12.8px">// %bf.load2 is helpful for the patch to
recognize the
load-and-or-store
pattern.</span></div>
<div
style="font-size:12.8px">
<div> %inc4578 =
add i64 %bf.load2,
65536
</div>
<span>
<div> %bf.shl48 =
and i64
%inc4578,
4294901760</div>
</span>
<div> %bf.clear49 =
and i64 %bf.load2,
-4294901761</div>
<span>
<div> %bf.set50 =
or i64
%bf.shl48,
%bf.clear49</div>
<div> store i64
%bf.set50, i64*
%0, align 8</div>
</span></div>
</div>
<div> </div>
<div>After global value
numbering:</div>
<span>
<div><br>
</div>
<div>
<div
style="font-size:12.8px">entry:</div>
<div
style="font-size:12.8px">
%0 = bitcast
%class.B* %this to
i64*</div>
<div
style="font-size:12.8px">
%bf.load = load
i64, i64* %0,
align 8</div>
<div
style="font-size:12.8px">
<div> %dec = add
i64 %bf.load,
65535</div>
<div> %bf.value =
and i64 %dec,
65535</div>
<div> %bf.clear5
= and i64
%bf.load, -65536</div>
<div> %bf.set =
or i64
%bf.value,
%bf.clear5</div>
<div> store i64
%bf.set, i64*
%0, align 8</div>
</div>
<div
style="font-size:12.8px">
...</div>
<div
style="font-size:12.8px">if.then9:</div>
<div
style="font-size:12.8px">
<div> %inc79 =
add i64 %bf.set,
281474976710656
// we hope
to know %bf.set
is the same as
load i64, i64*
%0, align 8.</div>
<div> %bf.shl =
and i64 %inc79,
71776119061217280</div>
<div> %bf.clear18
= and i64
%bf.set,
-71776119061217281</div>
<div> %bf.set19 =
or i64 %bf.shl,
%bf.clear18</div>
<div> store i64
%bf.set19, i64*
%0, align 8</div>
</div>
<div
style="font-size:12.8px">
...</div>
<div
style="font-size:12.8px">if.end41:</div>
<div
style="font-size:12.8px">
<div> %inc4578 =
add i64 %bf.set,
65536
//
we hope to know
%bf.set is the
same as load
i64, i64* %0,
align 8.</div>
<div> %bf.shl48 =
and i64
%inc4578,
4294901760</div>
<div> %bf.clear49
= and i64
%bf.set,
-4294901761</div>
<div> %bf.set50 =
or i64
%bf.shl48,
%bf.clear49</div>
<div> store i64
%bf.set50, i64*
%0, align 8</div>
</div>
</div>
</span></div>
</div>
</div>
</blockquote>
</div>
</div>
</div>
</blockquote>
<br>
</div>
</div>
I don't like depending on pass ordering like
this... I mean, I can't really say without
the original C testcase, but at first
glance, EarlyCSE should be catching this,
and you're just getting lucky.</div>
</blockquote>
<div><br>
</div>
</div>
</div>
<div>For the testcase above, EarlyCSE does't replace
the load with previous store value because there
are calls and other stores in between (in the <span
style="color:rgb(84,84,84);font-family:roboto,arial,sans-serif">ellipsis</span>).
You are right, generally, unless we add another
pass of instcombine before earlycse, doing the
shrinking in instcombine after earlycse can face
the same problem. I havn't got an idea to solve it
generally except that to add another pass of
instcombine. Still thinking...</div>
</div>
</div>
</div>
</blockquote>
<div><br>
</div>
<div>To catch all the bitfield shrinking opportunities,
seems the optimization should be placed before the first
pass of earlycse. But I don't think adding another pass of
instcombine before earlycse is a good idea, because I
vaguely remember we saw case requiring earlycse to happen
before instcombine. </div>
<div><br>
</div>
<div>Is it ok to add a separate pass before earlycse for
bitfield related shrinking? This is the best way I can
think of for now. Suggestions are welcomed.</div>
<br>
</div>
</div>
</div>
</blockquote>
<br>
You might want to skim the thread
<a class="moz-txt-link-freetext" href="http://lists.llvm.org/pipermail/cfe-commits/Week-of-Mon-20120827/063200.html">http://lists.llvm.org/pipermail/cfe-commits/Week-of-Mon-20120827/063200.html</a>
for some background on our current treatment of bitfields. <br>
<br>
I'm not sure what the right answer here looks like, given the way it
interacts with other memory transforms like GVN. Could you send an
email to llvm-dev with a description of what we're looking at, to
see if anyone else has suggestions?<br>
<p>-Eli<br>
</p>
<pre class="moz-signature" cols="72">--
Employee of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project</pre>
</body>
</html>