[llvm-dev] struct bitfield regression between 3.6 and 3.9 (using -O0)

Friedman, Eli via llvm-dev llvm-dev at lists.llvm.org
Thu Dec 22 10:29:48 PST 2016


On 12/21/2016 4:45 PM, Phil Tomson via llvm-dev wrote:
> Here's our testcase:
>
> #include <stdio.h>
>
> struct flags {
>     unsigned frog: 1;
>     unsigned foo : 1;
>     unsigned bar : 1;
>     unsigned bat : 1;
>     unsigned baz : 1;
>     unsigned bam : 1;
> };
>
> int main() {
>     struct flags flags;
>     flags.bar = 1;
>     flags.foo = 1;
>     if (flags.foo == 1) {
>         printf("Pass\n");
>         return 0;
>     } else {
>         printf("FAIL\n");
>         return 1;
>     }
> }
>
> When we compile this using LLVM 3.9 we get the "FAIL" message. 
> However, when we compile in LLVM 3.6 it passes. (this is only an issue 
> with -O0, higher levels of optimization work fine)
>
> After some investigation we discovered the problem. Here's the 
> relevant part of our assembly generated by LLVM 3.9:
>
>     load          r0, r510, 24, 8
>     slr           r0, r0, 1, 8
>     cmpimm        r0, r0, 1, 0, 8, SNE
>     bitop1        r0, r0, 1<<0, AND, 64
>     jct          .LBB0_2, r0, 0, N
>     jrel         .LBB0_1
>
> Notice the slr (shift logical right) instruction there is shifting to 
> the right 1 position in order to get flags.foo into bit 0 of r0.  But 
> the problem is that the compare (cmpimm) is comparing not just the 
> single bit but the whole value in r0 (an 8-bit value) against 1. If we 
> insert a logical AND with '1' to mask r0 just prior to the compare it 
> works fine.
>
> And as it turns out, the LLVM IR generated using -O0 and -emit-llvm 
> does include the *and*:
>  ...
>   %bf.lshr = lshr i8 %bf.load4, 1
>   %bf.clear5 = and i8 %bf.lshr, 1
>   %bf.cast = zext i8 %bf.clear5 to i32
>   %cmp = icmp eq i32 %bf.cast, 1
>   br i1 %cmp, label %if.then, label %if.else
>
> (compiled with:  clang -O0 -emit-llvm -S failing.c -o failing.ll )
>
> I reran passing -debug to llc to see what's happening at various 
> stages of DAG optimization:
>
> clang -O0 -mllvm -debug -S failing.c -o failing.s
>
> The initial selection DAG has the AND op node:
>
>             t22: i8 = srl t19, Constant:i64<1>
>             t23: i8 = and t22, Constant:i8<1>
>           t24: i32 = zero_extend t23
>         t27: i1 = setcc t24, Constant:i32<1>, seteq:ch
>       t29: i1 = xor t27, Constant:i1<-1>
>     t31: ch = brcond t18, t29, BasicBlock:ch<if.else 0xa5f8d48>
>   t33: ch = br t31, BasicBlock:ch<if.then 0xa5f8c98>
>
> The Optimized lowered selection DAG does not contain the *AND* node, 
> but it does have a truncate which would seem to stand in for it given 
> the result is only 1bit wide and the xor following it is operating on 
> 1-bit wide values:
>
>           t22: i8 = srl t19, Constant:i64<1>
>         t35: i1 = truncate t22
>       t29: i1 = xor t35, Constant:i1<-1>
>     t31: ch = brcond t18, t29, BasicBlock:ch<if.else 0xa5f8d48>
>   t33: ch = br t31, BasicBlock:ch<if.then 0xa5f8c98>
>
> Next we get to the Type-legalized selection DAG:
>
>         t22: i8 = srl t19, Constant:i64<1>
>       t40: i8 = xor t22, Constant:i8<1>
>     t31: ch = brcond t18, t40, BasicBlock:ch<if.else 0xa5f8d48>
>   t33: ch = br t31, BasicBlock:ch<if.then 0xa5f8c98>
>
> The truncate is now gone.
>
> Next we have the Optimized type-legalized DAG:
>
>         t22: i8 = srl t19, Constant:i64<1>
>       t43: i8 = setcc t22, Constant:i8<1>, setne:ch
>     t31: ch = brcond t18, t43, BasicBlock:ch<if.else 0xa5f8d48>
>   t33: ch = br t31, BasicBlock:ch<if.then 0xa5f8c98>
>
> The *xor* has been replaced with a *setcc*. The legalized selection 
> DAG is essentially the same, as is the optimized legalized selection DAG.
>
> So if t19 contains 0b00000110, then t22 contains 0b00000011.
> setcc then compares t22 with the constant 1, and since they're not 
> equal (setne) it sets bit 0 of t43.
> brcond will then test bit 0 of t43, and since it's set it branches to 
> the else branch (prints FAIL in this case).
>
> If instead t22 contained 0b00000001 (as would be the case if the mask 
> was still there), the setcc would find both values equal, and 
> since setne is specified the branch in brcond would not be taken (the 
> correct behavior).
>
> Things seem to have gone wrong when the Type-legalized selection DAG 
> was optimized and the *xor* node was changed to a *setcc* (and 
> actually, the *xor* seems like it was more optimal than the *setcc* 
> anyway).
>
> Any ideas about why this is happening?

I would suggest starting with DAGTypeLegalizer::PromoteIntOp_BRCOND, I 
think...

-Eli

-- 
Employee of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project
