[llvm-bugs] [Bug 44704] New: Miscompile by DSE due to bug in BasicAA

via llvm-bugs llvm-bugs at lists.llvm.org
Wed Jan 29 03:43:46 PST 2020


            Bug ID: 44704
           Summary: Miscompile by DSE due to bug in BasicAA
           Product: libraries
           Version: trunk
          Hardware: PC
                OS: Linux
            Status: NEW
          Severity: normal
          Priority: P
         Component: Global Analyses
          Assignee: unassignedbugs at nondot.org
          Reporter: suc-daniil at yandex.ru
                CC: llvm-bugs at lists.llvm.org

On this IR:
define i8* @foo() {
entry:
  %var1 = call i8* @calloc(i64 10000, i64 1)
  br label %loop

loop:                                        ; preds = %entry, %loop
  %iv = phi i64 [ 0, %entry ], [ %iv.next, %loop ]
  %iv.next = add i64 %iv, 1
  %addr11 = getelementptr i8, i8* %var1, i64 %iv.next
  store i8 1, i8* %addr11
  %addr18 = getelementptr i8, i8* %var1, i64 %iv
  store i8 0, i8* %addr18
  %var9 = icmp ugt i64 %iv, 1000
  br i1 %var9, label %exit, label %loop

exit:                                        ; preds = %loop
  ret i8* %var1
}

declare noalias i8* @calloc(i64, i64)

This command:
opt -aa-pipeline=basic-aa -passes=aa-eval -print-all-alias-modref-info -S input.ll

Says that %addr11 and %addr18 don't alias, although they do: %addr11 in one
iteration is the same address as %addr18 in the next iteration.
Because of this, DSE eliminates (store i8 0, i8* %addr18): a store of 0 to
memory allocated by calloc is redundant if nothing that may alias it is
written between the allocation and the store.

So this command:
opt -aa-pipeline=basic-aa -passes=dse -S -o /dev/null -debug-only=dse input.ll

Produces this output:
DSE: Remove null store to the calloc'ed object:
  DEAD:   store i8 0, i8* %addr18
  OBJECT:   %var1 = call i8* @calloc(i64 10000, i64 1)

If anyone is interested, a C++ reproducer looks like this (reproducible with
-O2 since clang 3.8):
#include <iostream>
#include <cstdlib>

int *foo(int arr_size) {
    int *arr = (int*)calloc(arr_size, sizeof(int));
    for (int i = 0; i < arr_size - 1; ++i) {
        arr[i + 1] = 1;
        arr[i] = 0;  // this store is the one DSE removes
    }
    return arr;
}

int main() {
    int arr_size = 10;
    int *arr = foo(arr_size);
    for (int i = 0; i < arr_size; ++i)
        std::cout << arr[i] << ' ';
    free(arr);
}

It prints
0 1 1 1 1 1 1 1 1 1 

instead of
0 0 0 0 0 0 0 0 0 1
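
The two outputs above can be reproduced without involving the optimizer by
running the same loop with and without the zero store. This is only a sketch:
runLoop and its keepZeroStore flag are hypothetical names introduced here, and
a zero-initialized std::vector stands in for the calloc'ed buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Runs the reproducer's loop on a zero-initialized buffer (a stand-in for
// the calloc'ed array). keepZeroStore == false models the loop after DSE
// has deleted `arr[i] = 0`.
std::vector<int> runLoop(int arrSize, bool keepZeroStore) {
    assert(arrSize > 0);
    std::vector<int> arr(static_cast<std::size_t>(arrSize), 0);
    for (int i = 0; i < arrSize - 1; ++i) {
        arr[i + 1] = 1;
        if (keepZeroStore)
            arr[i] = 0;  // overwrites the 1 written by the previous iteration
    }
    return arr;
}
```

With arrSize = 10, runLoop(10, true) gives the expected result and
runLoop(10, false) gives the miscompiled one, which shows that the store is
not dead: it overwrites the 1 written one iteration earlier.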

Here's what I found:
For two GEPs with the same base pointer, the following logic in BasicAA leads
to this bug:
1. Decompose both GEPs into (Base + (var-offsets + const-offset))
2. Find the difference in offsets. In this case the GEPs are (%var1 + %iv) and
(%var1 + %iv + 1), so the difference is 1.
3. If the difference is >= the size of one element, the pointers don't alias.
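
The three steps above can be modeled outside LLVM with a toy program. The
sketch below is not BasicAA code: DecomposedOffset, naiveNoAlias and
overlapsAcrossIterations are hypothetical names, and the model assumes a
single variable index (%iv) that takes the values 0 .. tripCount-1.

```cpp
#include <cassert>
#include <cstdint>

// Toy model of a decomposed GEP: Base + scale * Var + constOffset.
// Hypothetical type for illustration; not LLVM's data structure.
struct DecomposedOffset {
    int64_t scale;        // multiplier of the variable index (%iv)
    int64_t constOffset;  // constant byte offset
};

// Steps 2 and 3: if the variable parts cancel and the constant difference
// is at least the access size, report "no alias". This silently assumes
// both GEPs are evaluated with the same value of the variable.
bool naiveNoAlias(DecomposedOffset a, DecomposedOffset b, int64_t accessSize) {
    assert(accessSize > 0);
    if (a.scale != b.scale)
        return false;  // variable parts do not cancel; stay conservative
    int64_t diff = a.constOffset - b.constOffset;
    if (diff < 0)
        diff = -diff;
    return diff >= accessSize;
}

// Ground truth for a loop: is there a byte accessed through `a` in some
// iteration i and through `b` in some iteration j?
bool overlapsAcrossIterations(DecomposedOffset a, DecomposedOffset b,
                              int64_t accessSize, int64_t tripCount) {
    for (int64_t i = 0; i < tripCount; ++i) {
        for (int64_t j = 0; j < tripCount; ++j) {
            int64_t addrA = a.scale * i + a.constOffset;
            int64_t addrB = b.scale * j + b.constOffset;
            if (addrA < addrB + accessSize && addrB < addrA + accessSize)
                return true;
        }
    }
    return false;
}
```

For %addr11 modeled as {1, 1} and %addr18 as {1, 0} with an access size of 1,
naiveNoAlias returns true while overlapsAcrossIterations also returns true for
any trip count of at least 2, which is the contradiction behind this bug.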

This reasoning breaks when a loop PHI is part of the offset: the subtraction
treats %iv as the same value in both GEPs, but %addr11 in one iteration and
%addr18 in the next iteration use different values of %iv and point to the
same byte.
Function aliasSameBasePointerGEPs tries to analyze such cases, but in this case
its answer is MayAlias, so the analysis continues until it reaches the
incorrect check mentioned above.
I'm not really familiar with BasicAA, so any ideas on the best way to fix it
are welcome. Based on what I know, I can only conservatively disable this buggy
analysis for the case when PHIs are involved.
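
As a rough illustration of that conservative fallback, here is a toy model. It
is not a patch against BasicAA: ToyAliasResult, constantDiffCheck and the
indexDependsOnLoopPhi flag are hypothetical names, and how BasicAA would
actually detect that an index depends on a loop PHI is left open.

```cpp
#include <cassert>
#include <cstdint>

// Toy result type for this sketch only; not LLVM's AliasResult.
enum class ToyAliasResult { NoAlias, MayAlias };

// Constant-difference check with a conservative guard: when the variable
// index depends on a loop PHI, the two pointers may be evaluated in
// different iterations, so the difference proves nothing.
ToyAliasResult constantDiffCheck(int64_t constDiff, int64_t accessSize,
                                 bool indexDependsOnLoopPhi) {
    assert(accessSize > 0);
    if (indexDependsOnLoopPhi)
        return ToyAliasResult::MayAlias;  // give up instead of guessing
    int64_t dist = constDiff < 0 ? -constDiff : constDiff;
    return dist >= accessSize ? ToyAliasResult::NoAlias
                              : ToyAliasResult::MayAlias;
}
```

With the guard in place, the case from this report (difference 1, access size
1, index derived from %iv) yields MayAlias, at the cost of losing NoAlias
answers for some loops where the accesses really are disjoint.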
