[llvm-bugs] [Bug 51732] New: InstCombine incorrectly optimizes bit mask operations

Fri Sep 3 05:41:33 PDT 2021

https://bugs.llvm.org/show_bug.cgi?id=51732

            Bug ID: 51732
           Summary: InstCombine incorrectly optimizes bit mask operations
           Product: new-bugs
           Version: 10.0
          Hardware: PC
                OS: Linux
            Status: NEW
          Severity: normal
          Priority: P
         Component: new bugs
          Assignee: unassignedbugs at nondot.org
          Reporter: pfcittolin at gmail.com
                CC: htmldeveloper at gmail.com, llvm-bugs at lists.llvm.org

Hello everyone,

I noticed a pattern where bit mask operations on integers are optimized
incorrectly, specifically when the tested bit is the last bit of a "byte
chunk" (bit 7, 15, 31, ...).

For example, given the following:

define dso_local i32 @main() {
    %rlo.1 = alloca i1
    store i1 1, i1* %rlo.1

    %1 = load i64, i64* @byte ; @byte = dso_local global i64 30
    %2 = shl i64 1, 30
    %3 = and i64 %1, %2
    %4 = icmp ne i64 %3, 0
    %5 = load i1, i1* %rlo.1
    %6 = and i1 %4, %5
    store i1 %6, i1* %rlo.1
    %7 = load i1, i1* %rlo.1

    %8 = call i32 (i8*, ...) @printf(i8* getelementptr inbounds ([4 x i8], [4 x i8]* @.str.newline, i64 0, i64 0), i1 %7)

    ret i32 0
}

It optimizes to:

define dso_local i32 @main() local_unnamed_addr #0 {
  %1 = load i64, i64* @byte, align 8
  %2 = and i64 %1, 1073741824
  %3 = icmp ne i64 %2, 0
  %4 = tail call i32 (i8*, ...) @printf(i8* nonnull dereferenceable(1) getelementptr inbounds ([4 x i8], [4 x i8]* @.str.newline, i64 0, i64 0), i1 %3)
  ret i32 0
}
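
To make the bit-30 case easier to follow, here is a minimal Python sketch
that models the two listings above step by step. It is only a model of the
IR semantics, not anything LLVM runs; the function names original_main and
optimized_main are made up for illustration, and i64 is modeled by masking
to 64 bits.

```python
MASK64 = (1 << 64) - 1  # model i64 as an unsigned 64-bit value


def original_main(byte_value):
    """Model of the unoptimized IR: returns the i1 passed to printf."""
    rlo = True                      # store i1 1, i1* %rlo.1
    loaded = byte_value & MASK64    # %1 = load i64, i64* @byte
    mask = (1 << 30) & MASK64       # %2 = shl i64 1, 30
    masked = loaded & mask          # %3 = and i64 %1, %2
    bit_set = masked != 0           # %4 = icmp ne i64 %3, 0
    rlo = bit_set and rlo           # %5 / %6 / store back to %rlo.1
    return rlo                      # %7 = load i1, i1* %rlo.1


def optimized_main(byte_value):
    """Model of the optimized IR: shl folded to a constant, alloca removed."""
    loaded = byte_value & MASK64        # %1 = load i64, i64* @byte
    return (loaded & 1073741824) != 0   # %2 / %3


if __name__ == "__main__":
    # 30 is the initializer of @byte given in the report.
    for value in (30, 1 << 30, (1 << 30) - 1, MASK64):
        print(value, original_main(value), optimized_main(value))
```

The constant 1073741824 is simply 1 << 30, and "and i1 %4, true" leaves %4
unchanged, which is why the alloca, store and reload disappear.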

However, when bitmasking bit 31, for example:

define dso_local i32 @main() {
    %rlo.1 = alloca i1
    store i1 1, i1* %rlo.1

    %1 = load i64, i64* @byte
    %2 = shl i64 1, 31
    %3 = and i64 %1, %2
    %4 = icmp ne i64 %3, 0
    %5 = load i1, i1* %rlo.1
    %6 = and i1 %4, %5
    store i1 %6, i1* %rlo.1
    %7 = load i1, i1* %rlo.1

    %8 = call i32 (i8*, ...) @printf(i8* getelementptr inbounds ([4 x i8], [4 x i8]* @.str.newline, i64 0, i64 0), i1 %7)

    ret i32 0
}

It gives me:

define dso_local i32 @main() local_unnamed_addr #0 {
  %1 = load i64, i64* @byte, align 8
  %2 = trunc i64 %1 to i32
  %3 = icmp slt i32 %2, 0
  %4 = tail call i32 (i8*, ...) @printf(i8* nonnull dereferenceable(1) getelementptr inbounds ([4 x i8], [4 x i8]* @.str.newline, i64 0, i64 0), i1 %3)
  ret i32 0
}

The same happens for any of the bits mentioned above:

%2 = shl i64 1, 7
%3 = and i64 %1, %2
%4 = icmp ne i64 %3, 0

Optimizes to:

%1 = load i64, i64* @byte, align 8
%2 = trunc i64 %1 to i8
%3 = icmp slt i8 %2, 0

And so on...
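
To compare the two forms on concrete inputs, the following Python sketch
models the mask test ("and" followed by "icmp ne 0") and the rewritten test
("trunc" followed by "icmp slt 0") for bits 7, 15 and 31. It is a model
under stated assumptions, not LLVM's implementation: trunc is modeled as
keeping the low bits, slt as a two's-complement signed compare, and all
helper names (to_signed, mask_test, trunc_sign_test, sample_inputs,
find_mismatches) are hypothetical.

```python
def to_signed(value, bits):
    """Keep the low `bits` bits and read them as a two's-complement integer."""
    low = value & ((1 << bits) - 1)
    if low >> (bits - 1):
        return low - (1 << bits)
    return low


def mask_test(x, bit):
    """Model of: and x, (1 << bit); icmp ne 0."""
    return (x & (1 << bit)) != 0


def trunc_sign_test(x, bit):
    """Model of: trunc x to i(bit+1); icmp slt 0."""
    return to_signed(x, bit + 1) < 0


def sample_inputs(bit):
    """A few values around the tested bit, plus 30 (the @byte initializer)."""
    return [0, 30, (1 << bit) - 1, 1 << bit, (1 << bit) + 1,
            1 << (bit + 1), 1 << 63, (1 << 64) - 1]


def find_mismatches(bit, samples):
    """Inputs on which the two modeled forms give different answers."""
    return [x for x in samples if mask_test(x, bit) != trunc_sign_test(x, bit)]


if __name__ == "__main__":
    for bit in (7, 15, 31):
        print(bit, find_mismatches(bit, sample_inputs(bit)))
```

Bit k is the sign bit of a (k+1)-bit two's-complement integer, so on these
sample inputs the mismatch list is empty for each of the three bits. A
difference in observed program output would therefore have to come from
somewhere other than the value of the i1 itself.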

Using "-print-before-all -print-after-all" with "opt -O3", I narrowed it down
to the following pass (shown here on another, equivalent example):

*** IR Dump Before Combine redundant instructions ***
define dso_local i32 @main() local_unnamed_addr {
  %1 = load i32, i32* @byte
  %2 = and i32 %1, 32768
  %3 = icmp ne i32 %2, 0
  %4 = call i32 (i8*, ...) @printf(i8* getelementptr inbounds ([4 x i8], [4 x i8]* @.str.newline, i64 0, i64 0), i1 %3)
  ret i32 0
}
*** IR Dump After Combine redundant instructions ***
define dso_local i32 @main() local_unnamed_addr {
  %1 = load i32, i32* @byte, align 4
  %2 = trunc i32 %1 to i16
  %3 = icmp slt i16 %2, 0
  %4 = call i32 (i8*, ...) @printf(i8* nonnull dereferenceable(1) getelementptr inbounds ([4 x i8], [4 x i8]* @.str.newline, i64 0, i64 0), i1 %3)
  ret i32 0
}
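
The narrowing step described above can be automated. The sketch below scans
a saved dump for the first pass whose "After" IR contains a given string
that its "Before" IR did not. It assumes the banner lines look exactly like
the ones quoted in this report ("*** IR Dump Before <pass> ***"); other LLVM
versions may print them differently. The names split_dumps and
first_pass_introducing are hypothetical, and SAMPLE_LOG is an abbreviated
excerpt of the dump above.

```python
import re

# Banner format as it appears in the dump quoted in this report.
BANNER = re.compile(r"^\*\*\* IR Dump (Before|After) (.+?) \*\*\*$")


def split_dumps(log_text):
    """Split a dump log into (when, pass_name, ir_text) tuples, in order."""
    dumps = []
    current = None
    for line in log_text.splitlines():
        match = BANNER.match(line.strip())
        if match:
            current = [match.group(1), match.group(2), []]
            dumps.append(current)
        elif current is not None:
            current[2].append(line)
    return [(when, name, "\n".join(body)) for when, name, body in dumps]


def first_pass_introducing(log_text, needle):
    """First pass whose After dump has `needle` while its Before dump lacks it."""
    before = {}
    for when, name, ir in split_dumps(log_text):
        if when == "Before":
            before[name] = ir
        elif name in before and needle in ir and needle not in before[name]:
            return name
    return None


SAMPLE_LOG = """\
*** IR Dump Before Combine redundant instructions ***
  %2 = and i32 %1, 32768
  %3 = icmp ne i32 %2, 0
*** IR Dump After Combine redundant instructions ***
  %2 = trunc i32 %1 to i16
  %3 = icmp slt i16 %2, 0
"""

if __name__ == "__main__":
    print(first_pass_introducing(SAMPLE_LOG, "trunc"))
```

On the excerpt above this reports "Combine redundant instructions", the
pass named in the dump banners.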
