[llvm-dev] struct bitfield regression between 3.6 and 3.9 (using -O0)
Friedman, Eli via llvm-dev
llvm-dev at lists.llvm.org
Thu Dec 22 10:29:48 PST 2016
On 12/21/2016 4:45 PM, Phil Tomson via llvm-dev wrote:
> Here's our testcase:
>
> #include <stdio.h>
>
> struct flags {
>     unsigned frog : 1;
>     unsigned foo  : 1;
>     unsigned bar  : 1;
>     unsigned bat  : 1;
>     unsigned baz  : 1;
>     unsigned bam  : 1;
> };
>
> int main() {
>     struct flags flags;
>     flags.bar = 1;
>     flags.foo = 1;
>     if (flags.foo == 1) {
>         printf("Pass\n");
>         return 0;
>     } else {
>         printf("FAIL\n");
>         return 1;
>     }
> }
>
> When we compile this with LLVM 3.9 we get the "FAIL" message.
> However, when we compile with LLVM 3.6 it passes. (This is only an
> issue with -O0; higher levels of optimization work fine.)
>
> After some investigation we discovered the problem. Here's the
> relevant part of our assembly generated by LLVM 3.9:
>
> load r0, r510, 24, 8
> slr r0, r0, 1, 8
> cmpimm r0, r0, 1, 0, 8, SNE
> bitop1 r0, r0, 1<<0, AND, 64
> jct .LBB0_2, r0, 0, N
> jrel .LBB0_1
>
> Notice the slr (shift logical right) instruction there is shifting to
> the right 1 position in order to get flags.foo into bit 0 of r0. But
> the problem is that the compare (cmpimm) is comparing not just the
> single bit but the whole value in r0 (an 8-bit value) against 1. If we
> insert a logical AND with '1' to mask r0 just prior to the compare it
> works fine.
>
> And as it turns out, the LLVM IR generated using -O0 and -emit-llvm
> does have the AND included:
> ...
> %bf.lshr = lshr i8 %bf.load4, 1
> %bf.clear5 = and i8 %bf.lshr, 1
> %bf.cast = zext i8 %bf.clear5 to i32
> %cmp = icmp eq i32 %bf.cast, 1
> br i1 %cmp, label %if.then, label %if.else
>
> (compiled with: clang -O0 -emit-llvm -S failing.c -o failing.ll )
>
> I reran passing -debug to llc to see what's happening at various
> stages of DAG optimization:
>
> clang -O0 -mllvm -debug -S failing.c -o failing.s
>
> The initial selection DAG has the AND op node:
>
> t22: i8 = srl t19, Constant:i64<1>
> t23: i8 = and t22, Constant:i8<1>
> t24: i32 = zero_extend t23
> t27: i1 = setcc t24, Constant:i32<1>, seteq:ch
> t29: i1 = xor t27, Constant:i1<-1>
> t31: ch = brcond t18, t29, BasicBlock:ch<if.else 0xa5f8d48>
> t33: ch = br t31, BasicBlock:ch<if.then 0xa5f8c98>
>
> The Optimized lowered selection DAG does not contain the AND node,
> but it does have a truncate which would seem to stand in for it given
> the result is only 1 bit wide and the xor following it is operating on
> 1-bit wide values:
>
> t22: i8 = srl t19, Constant:i64<1>
> t35: i1 = truncate t22
> t29: i1 = xor t35, Constant:i1<-1>
> t31: ch = brcond t18, t29, BasicBlock:ch<if.else 0xa5f8d48>
> t33: ch = br t31, BasicBlock:ch<if.then 0xa5f8c98>
>
> Next we get to the Type-legalized selection DAG:
>
> t22: i8 = srl t19, Constant:i64<1>
> t40: i8 = xor t22, Constant:i8<1>
> t31: ch = brcond t18, t40, BasicBlock:ch<if.else 0xa5f8d48>
> t33: ch = br t31, BasicBlock:ch<if.then 0xa5f8c98>
>
> The truncate is now gone.
>
> Next we have the Optimized type-legalized DAG:
>
> t22: i8 = srl t19, Constant:i64<1>
> t43: i8 = setcc t22, Constant:i8<1>, setne:ch
> t31: ch = brcond t18, t43, BasicBlock:ch<if.else 0xa5f8d48>
> t33: ch = br t31, BasicBlock:ch<if.then 0xa5f8c98>
>
> The xor has been replaced with a setcc. The legalized selection
> DAG is essentially the same, as is the optimized legalized selection DAG.
>
> So if t19 contains 0b00000110 then
> t22 contains 0b00000011.
> setcc then compares t22 with a constant 1 and since they're not equal
> (setne) it sets bit 0 of t43.
> brcond will then test bit 0 of t43 and since it's set it branches to
> the else branch (prints FAIL in this case).
>
> If instead t22 contained 0b00000001 (as would be the case if the mask
> was still there) the setcc would find both values to compare equal and,
> since setne is specified, the branch in brcond will not be taken (the
> correct behavior).
>
> Things seem to have gone wrong when the Type-legalized selection DAG
> was optimized and the xor node was changed to a setcc (and
> actually, the xor seems like it was more optimal than the setcc
> anyway).
>
> Any ideas about why this is happening?
I would suggest starting with DAGTypeLegalizer::PromoteIntOp_BRCOND, I
think...
-Eli
--
Employee of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project