[llvm-bugs] [Bug 31138] New: [AArch64] Miscompile - zeroing of high order bits omitted during bfi

Wed Nov 23 08:47:10 PST 2016

https://llvm.org/bugs/show_bug.cgi?id=31138

            Bug ID: 31138
           Summary: [AArch64] Miscompile - zeroing of high order bits
                    omitted during bfi
           Product: new-bugs
           Version: unspecified
          Hardware: PC
                OS: Linux
            Status: NEW
          Severity: normal
          Priority: P
         Component: new bugs
          Assignee: unassignedbugs at nondot.org
          Reporter: pirama at google.com
                CC: arnaud.degrandmaison at arm.com, james.molloy at arm.com,
                    kristof.beyls at arm.com, llvm-bugs at lists.llvm.org,
                    silviu.baranga at arm.com, srhines at google.com
    Classification: Unclassified

Created attachment 17638
  --> https://llvm.org/bugs/attachment.cgi?id=17638&action=edit
Files to reproduce issue

Clang seems to miscompile the function decompressETC2Block in
tcuCompressedTexture.cpp [1] (the regression appears to have been introduced
between r256229 and r271374).  The crux of the issue is the miscompilation of
the following sequence (starting at line 597):

const deUint8   B1a             = (deUint8)getBit(src, 51);
const deUint8   B1b             = (deUint8)getBits(src, 47, 49);
...
baseB[0]                = extend4To8((deUint8)((B1a << 3) | B1b));
...
paintB[1]               = (deUint8)deClamp32((int)baseB[0] - dist, 0, 255);

This gets compiled to the following (from the attached broken_annotated.asm):

2171:  // x10 = src >> 51
2172:    5c:    d373fc2a        lsr     x10, x1, #51

2436:  // x13 = src >> 47
2437:   3ac:    d36ffc2d        lsr     x13, x1, #47

2461:  // before: w13 = (src >> 47) (line 2437)
2462:  // before: w10 = (src >> 51) (line 2172)
2463:  // w13[3:+1] = w10[0:1]
2464:  // w13 = (w13 & ~(1<<3)) | ((w10 & 1) << 3)
2465:  // w13 = (B1a << 3) | B1b
2466:  // PROBLEM: bits 4..31 are garbage - see later at line 2502
2467:   3c8:    331d014d        bfi     w13, w10, #3, #1

2500:  // w13 = extend4To8(w13)
2501:  // w13 = extend4To8((B1a << 3) | B1b) (baseB[0])
2502:   3ec:    331c0dad        bfi     w13, w13, #4, #4
2503:  // PROBLEM: bits 8..31 are garbage - see later at line 2581

2580:  // w13 = baseB[0] - dist
2581:   448:    4b0a01ad        sub     w13, w13, w10
2582:  // PROBLEM: operating as 32-bit value but bits 8..31 of w13 are garbage

The above asm was generated from the complete source file.  I've attached the
.ii from a reduced version of the same file (so the addresses and registers are
a bit different).
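
The effect of the annotated sequence can be modelled in plain C++.  This is
only a sketch: bfi32 is a hypothetical helper that mimics the 32-bit bfi, the
semantics of getBit, getBits, extend4To8 and deClamp32 are assumed from their
names (single-bit extract, inclusive bit-range extract, nibble replication and
clamping to a range), and dist is passed in as a parameter because its
computation is not shown here.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical model of the 32-bit AArch64 bfi: copy the low `width` bits of
// src into dst starting at bit `lsb`; all other bits of dst are left as-is.
// Valid for 1 <= width <= 31, which covers both uses below.
uint32_t bfi32(uint32_t dst, uint32_t src, unsigned lsb, unsigned width)
{
    const uint32_t mask = ((1u << width) - 1u) << lsb;
    return (dst & ~mask) | ((src << lsb) & mask);
}

// What the source asks for (helper semantics are assumptions, see above).
int expectedPaintB1(uint64_t src, int dist)
{
    const uint8_t B1a    = (uint8_t)((src >> 51) & 1u);   // getBit(src, 51)
    const uint8_t B1b    = (uint8_t)((src >> 47) & 7u);   // getBits(src, 47, 49)
    const uint8_t nibble = (uint8_t)((B1a << 3) | B1b);
    const uint8_t baseB0 = (uint8_t)((nibble << 4) | nibble); // extend4To8
    return std::min(std::max((int)baseB0 - dist, 0), 255);    // deClamp32
}

// What the annotated instruction sequence computes: the shifted values are
// never masked, so bits 8..31 of w13 still hold bits 55 and up of src.
int observedPaintB1(uint64_t src, int dist)
{
    const uint32_t w10 = (uint32_t)(src >> 51);   // lsr x10, x1, #51
    uint32_t       w13 = (uint32_t)(src >> 47);   // lsr x13, x1, #47
    w13 = bfi32(w13, w10, 3, 1);                  // bfi w13, w10, #3, #1
    w13 = bfi32(w13, w13, 4, 4);                  // bfi w13, w13, #4, #4
    return std::min(std::max((int)(w13 - (uint32_t)dist), 0), 255);
}
```

In this model the two functions agree whenever bits 55 and up of src are zero.
With src = 1ull << 55 and an arbitrary dist of 10, expectedPaintB1 returns 0
while observedPaintB1 returns 246, because the stray bit survives both bfi
instructions and feeds into the 32-bit subtraction.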

To compile it, run:
clang++ -c -O2 -mcpu=cortex-a53 -target aarch64-linux-android -fPIC -o deqp_reduced.o deqp_reduced.ii -save-temps

[1]
https://android.googlesource.com/platform/external/deqp/+/master/framework/common/tcuCompressedTexture.cpp
