<html>
    <head>
      <base href="https://bugs.llvm.org/">
    </head>
    <body><table border="1" cellspacing="0" cellpadding="8">
        <tr>
          <th>Bug ID</th>
          <td><a class="bz_bug_link 
          bz_status_NEW "
   title="NEW - [ARM] [v7m] Missed optimization - condition check of `do-while` case"
   href="https://bugs.llvm.org/show_bug.cgi?id=46655">46655</a>
          </td>
        </tr>

        <tr>
          <th>Summary</th>
          <td>[ARM] [v7m] Missed optimization - condition check of `do-while` case
          </td>
        </tr>

        <tr>
          <th>Product</th>
          <td>clang
          </td>
        </tr>

        <tr>
          <th>Version</th>
          <td>trunk
          </td>
        </tr>

        <tr>
          <th>Hardware</th>
          <td>All
          </td>
        </tr>

        <tr>
          <th>OS</th>
          <td>All
          </td>
        </tr>

        <tr>
          <th>Status</th>
          <td>NEW
          </td>
        </tr>

        <tr>
          <th>Severity</th>
          <td>enhancement
          </td>
        </tr>

        <tr>
          <th>Priority</th>
          <td>P
          </td>
        </tr>

        <tr>
          <th>Component</th>
          <td>C
          </td>
        </tr>

        <tr>
          <th>Assignee</th>
          <td>unassignedclangbugs@nondot.org
          </td>
        </tr>

        <tr>
          <th>Reporter</th>
          <td>alex_lop@walla.com
          </td>
        </tr>

        <tr>
          <th>CC</th>
          <td>blitzrakete@gmail.com, dgregor@apple.com, erik.pilkington@gmail.com, llvm-bugs@lists.llvm.org, richard-llvm@metafoo.co.uk
          </td>
        </tr></table>
      <p>
        <div>
        <pre>I've noticed that when compiling the code below with clang for the armv7
architecture at -Os or -Oz, the generated assembly contains an unneeded `cmp`
instruction:

    #include <stdint.h>
    #include <limits.h>

    #define NUM_BITS(type)                  (CHAR_BIT * sizeof(type))

    #define MOST_SIGNIFICANT_ONE(x)         (NUM_BITS(x) - 1 - (uint32_t)_Generic((x), \
                                             unsigned int       : __builtin_clz,   \
                                             unsigned long      : __builtin_clzl,  \
                                             unsigned long long : __builtin_clzll)(x))

    typedef void (*fn_ptr)(void);

    void boo(uint32_t);

    void foo(uint32_t mask)
    {
        uint32_t idx;

        // it is guaranteed that mask != 0 and mask < 0x40
        do {
            idx = MOST_SIGNIFICANT_ONE(mask);
            boo(idx);
            mask &= ~(1u << idx);
        } while(mask);
    }
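
To make the intended behavior of the loop concrete, here is a minimal host-side
sketch. It is not part of the original test case: `most_significant_one` and
`collect_indices` are hypothetical names, the index is recorded in an array
instead of being passed to `boo`, and `unsigned int` is assumed to be 32 bits
wide.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: same arithmetic as MOST_SIGNIFICANT_ONE for a
   32-bit operand. __builtin_clz is undefined for 0, hence the assert. */
static uint32_t most_significant_one(uint32_t x)
{
    assert(x != 0);
    return (uint32_t)(CHAR_BIT * sizeof(x)) - 1u - (uint32_t)__builtin_clz(x);
}

/* Same loop as foo(), but stores each index in out[] instead of calling
   boo(). Returns the number of indices written (at most 32). */
static size_t collect_indices(uint32_t mask, uint32_t *out)
{
    size_t n = 0;

    do {
        uint32_t idx = most_significant_one(mask);
        out[n++] = idx;
        mask &= ~(1u << idx);   /* clear the bit that was just reported */
    } while (mask);

    return n;
}
```

For example, with mask = 0x29 (binary 101001) the loop visits the set bits from
the highest to the lowest, so the recorded indices are 5, 3 and 0.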

The assembly produced for `foo` is:

        .section        __TEXT,__text,regular,pure_instructions
        .syntax unified
        .globl  _foo                    @ -- Begin function foo
        .p2align        1
        .code   16                      @ @foo
        .thumb_func     _foo
    _foo:
        .cfi_startproc
    @ %bb.0:
        push    {r4, r5, r6, r7, lr}
        .cfi_def_cfa_offset 20
        .cfi_offset lr, -4
        .cfi_offset r7, -8
        .cfi_offset r6, -12
        .cfi_offset r5, -16
        .cfi_offset r4, -20
        add     r7, sp, #12
        .cfi_def_cfa r7, 8
        str     r11, [sp, #-4]!
        .cfi_offset r11, -24
        mov     r4, r0
        mov.w   r5, #-2147483648
    LBB0_1:                                 @ =>This Inner Loop Header: Depth=1
        clz     r6, r4
        rsb.w   r0, r6, #31
        bl      _boo
        lsr.w   r0, r5, r6
        bics    r4, r0
        cmp     r4, #0           @ <========= THIS `cmp` instruction is not needed since `bics` updates the zero flag
        bne     LBB0_1
    @ %bb.2:
        ldr     r11, [sp], #4
        pop     {r4, r5, r6, r7, pc}
        .cfi_endproc
                                            @ -- End function
    .subsections_via_symbols

In this function, the `while(mask)` condition is checked with a `cmp`
instruction, even though the preceding `bics` instruction has already set the
zero flag according to the new value of r4.
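
The redundancy can be illustrated with a small C model of the zero flag. This
is only a sketch under stated assumptions: `flags_model`, `model_bics`,
`model_cmp_zero_z` and `z_flags_agree` are hypothetical names, only the Z flag
is modeled (the only flag `bne` tests), and the other flags, which the two
instructions may set differently, are ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model: the result register and the Z flag only. */
struct flags_model {
    uint32_t result;
    bool     z;
};

/* Models `bics rd, rm`: rd = rd & ~rm, Z set when the result is zero. */
static struct flags_model model_bics(uint32_t rd, uint32_t rm)
{
    uint32_t r = rd & ~rm;
    return (struct flags_model){ r, r == 0 };
}

/* Models the Z flag after `cmp rn, #0`: set when rn - 0 is zero. */
static bool model_cmp_zero_z(uint32_t rn)
{
    return (uint32_t)(rn - 0u) == 0;
}

/* Checks one loop step for every mask in 1..0x3f (the range stated in the
   source comment): Z after `bics` must equal Z after the extra `cmp`. */
static bool z_flags_agree(void)
{
    for (uint32_t mask = 1; mask < 0x40; mask++) {
        uint32_t clz = (uint32_t)__builtin_clz(mask);   /* clz   r6, r4     */
        uint32_t bit = UINT32_C(0x80000000) >> clz;     /* lsr.w r0, r5, r6 */
        struct flags_model f = model_bics(mask, bit);   /* bics  r4, r0     */

        if (f.z != model_cmp_zero_z(f.result))          /* cmp   r4, #0     */
            return false;
    }
    return true;
}
```

In this model `z_flags_agree()` returns true, which matches the observation
that the `bne` could branch directly on the flag left by `bics`.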

When I tried to reproduce it for armv7a, `bic` was used instead of `bics`:
<a href="https://godbolt.org/z/kmMis2">https://godbolt.org/z/kmMis2</a>

Clang version:
  Apple clang version 12.0.0 (clang-1200.0.22.21)
  Target: x86_64-apple-darwin19.5.0
  Thread model: posix

Compilation command: 
clang -arch armv7m -munaligned-access -Wall -Wextra -Wshadow -Wpointer-arith
-Wconversion -Werror -fno-strict-aliasing -Oz -std=c11 -S test.c -o test.s</pre>
        </div>
      </p>


    </body>
</html>