<div dir="ltr"><div>Given that this is compiled with -O0, would there be a way to skip the optimization of the Type-legalized selection DAG? Everything is fine until the Type-legalized selection DAG is optimized into the Optimized Type-legalized selection DAG.<br><br></div>Phil<br></div><div class="gmail_extra"><br><div class="gmail_quote">On Thu, Dec 22, 2016 at 10:29 AM, Friedman, Eli <span dir="ltr"><<a href="mailto:efriedma@codeaurora.org" target="_blank">efriedma@codeaurora.org</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div bgcolor="#FFFFFF" text="#000000"><div><div class="h5">
<div class="m_3596639334803254456moz-cite-prefix">On 12/21/2016 4:45 PM, Phil Tomson via
llvm-dev wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">
<div>
<div>
<div>
<div>
<div>
<div>
<div>Here's our testcase:<br>
<span style="font-family:monospace,monospace"><br>
#include <stdio.h><br>
<br>
struct flags {<br>
unsigned frog: 1;<br>
unsigned foo : 1;<br>
unsigned bar : 1;<br>
unsigned bat : 1;<br>
unsigned baz : 1;<br>
unsigned bam : 1;<br>
};<br>
<br>
int main() {<br>
struct flags flags;<br>
flags.bar = 1;<br>
flags.foo = 1;<br>
if (flags.foo == 1) {<br>
printf("Pass\n");<br>
return 0;<br>
} else {<br>
printf("FAIL\n");<br>
return 1;<br>
} <br>
}</span><br>
<br>
</div>
When we compile this using LLVM 3.9 we get the
"FAIL" message. However, when we compile with LLVM 3.6
it passes. (This is only an issue with -O0; higher
levels of optimization work fine.)<br>
<br>
</div>
After some investigation we discovered the problem.
Here's the relevant part of our assembly generated by
LLVM 3.9:<br>
<span style="font-family:monospace,monospace"><br>
load r0, r510, 24, 8<br>
slr r0, r0, 1, 8<br>
cmpimm r0, r0, 1, 0, 8, SNE<br>
bitop1 r0, r0, 1<<0, AND, 64<br>
jct .LBB0_2, r0, 0, N<br>
jrel .LBB0_1</span><br>
<br>
</div>
Notice that the slr (shift logical right) instruction there
shifts right by 1 position in order to get
flags.foo into bit 0 of r0. The problem is that the
compare (cmpimm) compares not just that single bit but
the whole value in r0 (an 8-bit value) against 1. If we
insert a logical AND with '1' to mask r0 just prior to
the compare, it works fine.<br>
<br>
</div>
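The effect of the missing mask can be reproduced in plain C. This is only a minimal sketch: the helper names are hypothetical, and it assumes the bit-field layout implied by the assembly above (frog in bit 0, foo in bit 1, bar in bit 2), which is implementation-defined in C.

```c
#include <stdint.h>

/* Hypothetical helper: shift only, mirroring the slr + cmpimm sequence.
 * The compare sees all 8 bits of the shifted byte. */
static int foo_is_set_unmasked(uint8_t flags_byte) {
    uint8_t shifted = (uint8_t)(flags_byte >> 1); /* slr r0, r0, 1 */
    return shifted == 1;                          /* cmpimm against 1 */
}

/* Hypothetical helper: shift, then mask, mirroring lshr + and in the IR.
 * Only bit 0 of the shifted byte takes part in the compare. */
static int foo_is_set_masked(uint8_t flags_byte) {
    uint8_t shifted = (uint8_t)(flags_byte >> 1);
    uint8_t bit = (uint8_t)(shifted & 1u);
    return bit == 1;
}
```

With foo and bar both set (byte value 0x06, under the layout assumed above), the unmasked version compares 0x03 against 1 and reports foo as clear, while the masked version reports it as set. With only foo set (0x02) the two agree, which is why the testcase needs flags.bar = 1 to expose the problem.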
And as it turns out, the LLVM IR generated using -O0 and
-emit-llvm does include that <b><span style="font-family:monospace,monospace">and</span></b>:<br>
...<br>
<span style="font-family:monospace,monospace"> %bf.lshr =
lshr i8 %bf.load4, 1<br>
<b> %bf.clear5 = and i8 %bf.lshr, 1</b><br>
%bf.cast = zext i8 %bf.clear5 to i32<br>
%cmp = icmp eq i32 %bf.cast, 1<br>
br i1 %cmp, label %if.then, label %if.else</span><br>
<br>
</div>
(compiled with: clang -O0 -emit-llvm -S failing.c -o
failing.ll )<br>
<br>
</div>
I reran with -debug passed through to the backend (via -mllvm) to
see what's happening at the various stages of DAG optimization:<br>
<br>
<span style="font-family:monospace,monospace">clang -O0 -mllvm
-debug -S failing.c -o failing.s</span> <br>
<br>
</div>
The initial selection DAG has the AND op node:<br>
<span style="font-family:monospace,monospace"><br>
t22: i8 = srl t19, Constant:i64<1><br>
<b> t23: i8 = and t22, Constant:i8<1></b><br>
t24: i32 = zero_extend t23<br>
t27: i1 = setcc t24, Constant:i32<1>, seteq:ch<br>
t29: i1 = xor t27, Constant:i1<-1><br>
t31: ch = brcond t18, t29, BasicBlock:ch<if.else
0xa5f8d48><br>
t33: ch = br t31, BasicBlock:ch<if.then 0xa5f8c98></span><br>
<br>
<div>
<div>
<div>
<div>The Optimized lowered selection DAG does not contain
the <b>AND</b> node, but it does have a truncate, which
would seem to stand in for it, given that the result is only
1 bit wide and the xor following it operates on 1-bit-wide
values:<br>
<br>
<span style="font-family:monospace,monospace">
t22: i8 = srl t19, Constant:i64<1><br>
t35: i1 = truncate t22<br>
t29: i1 = xor t35, Constant:i1<-1><br>
t31: ch = brcond t18, t29,
BasicBlock:ch<if.else 0xa5f8d48><br>
t33: ch = br t31, BasicBlock:ch<if.then
0xa5f8c98><br>
<br>
</span></div>
<div><span style="font-family:monospace,monospace"><font face="arial,helvetica,sans-serif">Next we get to the
Type-legalized selection DAG:<br>
<br>
</font> t22: i8 = srl t19,
Constant:i64<1><br>
t40: i8 = xor t22, Constant:i8<1><br>
t31: ch = brcond t18, t40,
BasicBlock:ch<if.else 0xa5f8d48><br>
t33: ch = br t31, BasicBlock:ch<if.then
0xa5f8c98><font face="arial,helvetica,sans-serif"><br>
<br>
</font></span></div>
<div><span style="font-family:monospace,monospace"><font face="arial,helvetica,sans-serif"> The</font>
truncate<font face="arial,helvetica,sans-serif"> is
now gone.<br>
<br>
</font></span></div>
<div><span style="font-family:monospace,monospace"><font face="arial,helvetica,sans-serif">Next we have the
Optimized type-legalized DAG:<br>
<br>
</font> t22: i8 = srl t19,
Constant:i64<1><br>
t43: i8 = setcc t22, Constant:i8<1>,
setne:ch<br>
t31: ch = brcond t18, t43,
BasicBlock:ch<if.else 0xa5f8d48><br>
t33: ch = br t31, BasicBlock:ch<if.then
0xa5f8c98><font face="arial,helvetica,sans-serif"><br>
<br>
</font></span></div>
<div><span style="font-family:monospace,monospace"><font face="arial,helvetica,sans-serif">The<b> </b></font><b>xor</b><font face="arial,helvetica,sans-serif"> has been replaced
with a </font><b>setcc</b><font face="arial,helvetica,sans-serif">. The legalized
selection DAG is essentially the same, as is the
optimized legalized selection DAG. <br>
<br>
</font></span></div>
<div><span style="font-family:monospace,monospace"><font face="arial,helvetica,sans-serif">So if t19 contains</font>
0b00000110<font face="arial,helvetica,sans-serif">
then<br>
</font></span></div>
<div><span style="font-family:monospace,monospace"><font face="arial,helvetica,sans-serif">t22 contains </font>0b00000011<font face="arial,helvetica,sans-serif"> <br>
</font>setcc<font face="arial,helvetica,sans-serif">
then compares </font>t22<font face="arial,helvetica,sans-serif"> with a constant 1
and since they're not equal (</font>setne<font face="arial,helvetica,sans-serif">) it sets bit 0 of
t43. <br>
</font></span></div>
<div><span style="font-family:monospace,monospace"><font face="arial,helvetica,sans-serif">brcond will then
test bit 0 of t43 and since it's set it branches to
the else branch (prints FAIL in this case)<br>
<br>
</font></span></div>
<div><span style="font-family:monospace,monospace"><font face="arial,helvetica,sans-serif">If instead t22
contained 0b00000001 (as would be the case if the
mask was still there) the </font>setcc <font face="arial,helvetica,sans-serif">would find both
values to compare equal and since </font>setne <font face="arial,helvetica,sans-serif">is specified the
branch in </font>brcond<font face="arial,helvetica,sans-serif"> will not be taken
(the correct behavior)<br>
<br>
</font></span></div>
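The four DAG dumps above can be modelled in C to see where the results diverge. This is a rough sketch, not LLVM code: the function names are hypothetical, each function returns 1 when the branch to if.else would be taken, and it assumes (as described above) that brcond on this target tests only bit 0 of its condition operand.

```c
#include <stdint.h>

/* Initial DAG: srl, and, zero_extend, seteq, xor with -1 (i1 not). */
static int else_taken_initial(uint8_t t19) {
    uint8_t t22 = (uint8_t)(t19 >> 1);
    uint8_t t23 = (uint8_t)(t22 & 1u);
    uint32_t t24 = t23;
    int t27 = (t24 == 1u);
    return !t27; /* t29 */
}

/* Optimized lowered DAG: srl, truncate to i1, xor with -1. */
static int else_taken_truncate(uint8_t t19) {
    uint8_t t22 = (uint8_t)(t19 >> 1);
    int t35 = t22 & 1; /* truncate to i1 keeps only bit 0 */
    return t35 ^ 1;    /* t29 */
}

/* Type-legalized DAG: srl, i8 xor with 1.
 * Assumption: brcond looks only at bit 0 of t40. */
static int else_taken_type_legalized(uint8_t t19) {
    uint8_t t22 = (uint8_t)(t19 >> 1);
    uint8_t t40 = (uint8_t)(t22 ^ 1u);
    return t40 & 1;
}

/* Optimized type-legalized DAG: setne on the full i8 value.
 * Bits above bit 0 of t22 now influence the result. */
static int else_taken_setne(uint8_t t19) {
    uint8_t t22 = (uint8_t)(t19 >> 1);
    uint8_t t43 = (uint8_t)(t22 != 1u);
    return t43 & 1;
}
```

For t19 = 0b00000110 the first three functions return 0 (the if.then path), and only the setne form returns 1 (the if.else path). Under the bit-0 assumption, the i8 xor is still correct because the upper bits of t40 are ignored, whereas the setne compares all eight bits and is only equivalent if the upper bits of t22 are known to be zero.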
<div><span style="font-family:monospace,monospace"><font face="arial,helvetica,sans-serif">Things seem to
have gone wrong when the </font></span><span style="font-family:monospace,monospace"><font face="arial,helvetica,sans-serif">Type-legalized
selection DAG was optimized and the </font><b>xor </b><font face="arial,helvetica,sans-serif">node was changed
to a </font><b>setcc </b><font face="arial,helvetica,sans-serif">(and actually, the
</font><b>xor</b><font face="arial,helvetica,sans-serif"> seems like it was
more optimal than the </font><b>setcc </b><font face="arial,helvetica,sans-serif">anyway)</font><b>.
</b><font face="arial,helvetica,sans-serif"><br>
<br>
Any ideas about why this is happening?<br>
</font></span></div>
</div>
</div>
</div>
</div>
</blockquote>
<br></div></div>
I would suggest starting with DAGTypeLegalizer::<wbr>PromoteIntOp_BRCOND,
I think...<span class="HOEnZb"><font color="#888888"><br>
<br>
-Eli<br>
<pre class="m_3596639334803254456moz-signature" cols="72">--
Employee of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project</pre>
</font></span></div>
</blockquote></div><br></div>