[llvm-dev] Question about instcombine pass.

Tue Feb 27 02:13:19 PST 2018

Hello, Everyone.

I have a question about LLVM's "Combine redundant instructions" (instcombine) pass.

I tested the instcombine pass with the following three test cases.
CASE1 and CASE2 are optimized as I expected, but CASE3 is not.
Is this behavior intended?

The LLVM version is:
  clang version 5.0.1 (tags/RELEASE_501/final 325232)

The clang command line is:
  clang -O1 a.c -S -emit-llvm -o -

TEST Programs:

(CASE1)
This case is optimized as I expected.
----------------------------------
#define LEN 10
int a[LEN], b[LEN], c[LEN], X[LEN];

void foo() {

  int i;

  for (i=1; i<LEN; i++) {
    a[i] += b[i] * c[i];
    a[i] -= b[i] * c[i];
  }
}
----------------------------------
IR (excerpt):
----------------------------------
; Function Attrs: norecurse nounwind uwtable
define void @foo() local_unnamed_addr #0 {
for.end:
  ret void
}
----------------------------------

(CASE2)
This case is also optimized as I expected.
----------------------------------
#define LEN 10
int a[LEN], b[LEN], c[LEN], X[LEN];

void foo() {

  int i;

  for (i=1; i<LEN; i++) {
    X[i] = X[i-1] * X[i-1];
    a[i] += b[i] * c[i];
    a[i] -= b[i] * c[i];
  }
}
----------------------------------
IR (excerpt):
----------------------------------
for.body:                                         ; preds = %for.body, %entry
  %store_forwarded = phi i32 [ %load_initial, %entry ], [ %mul, %for.body ]
  %indvars.iv = phi i64 [ 1, %entry ], [ %indvars.iv.next, %for.body ]
  %mul = mul nsw i32 %store_forwarded, %store_forwarded
  %arrayidx5 = getelementptr inbounds [10 x i32], [10 x i32]* @X, i64 0, i64 %indvars.iv
  store i32 %mul, i32* %arrayidx5, align 4, !tbaa !2
  %indvars.iv.next = add nuw nsw i64 %indvars.iv, 1
  %exitcond = icmp eq i64 %indvars.iv.next, 10
  br i1 %exitcond, label %for.end, label %for.body
----------------------------------

(CASE3)
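This case is not optimized as I expected.
I expected the instructions that access the array 'a' to be removed, as in CASE2.
Instead, the add and subtract are folded away, but the load from a[i] and the
store of the same value back to a[i] remain in the loop.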
----------------------------------
#define LEN 10
int a[LEN], b[LEN], c[LEN], X[LEN];

void foo() {

  int i;

  for (i=1; i<LEN; i++) {
    a[i] += b[i] * c[i];
    X[i] = X[i-1] * X[i-1];
    a[i] -= b[i] * c[i];
  }
}

----------------------------------
IR (excerpt):
----------------------------------
for.body:                                         ; preds = %for.body, %entry
  %store_forwarded = phi i32 [ %load_initial, %entry ], [ %mul10, %for.body ]
  %indvars.iv = phi i64 [ 1, %entry ], [ %indvars.iv.next, %for.body ]
  %arrayidx4 = getelementptr inbounds [10 x i32], [10 x i32]* @a, i64 0, i64 %indvars.iv
  %0 = load i32, i32* %arrayidx4, align 4, !tbaa !2
  %mul10 = mul nsw i32 %store_forwarded, %store_forwarded
  %arrayidx12 = getelementptr inbounds [10 x i32], [10 x i32]* @X, i64 0, i64 %indvars.iv
  store i32 %mul10, i32* %arrayidx12, align 4, !tbaa !2
  store i32 %0, i32* %arrayidx4, align 4, !tbaa !2
  %indvars.iv.next = add nuw nsw i64 %indvars.iv, 1
  %exitcond = icmp eq i64 %indvars.iv.next, 10
  br i1 %exitcond, label %for.end, label %for.body
----------------------------------
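
To make my expectation concrete, here is a small sketch that compares CASE3
with the loop I expected it to be reduced to. The function names (foo_case3,
foo_expected, init, case3_matches_expected) are only for illustration and are
not part of the test cases above. It assumes input values small enough that
b[i] * c[i] does not overflow, and it only checks the results at run time; it
says nothing about which pass should perform the transformation.

```c
#include <string.h>

#define LEN 10
int a[LEN], b[LEN], c[LEN], X[LEN];

/* CASE3 as written: the update of X sits between the two updates of a. */
void foo_case3(void) {
  int i;
  for (i = 1; i < LEN; i++) {
    a[i] += b[i] * c[i];
    X[i] = X[i-1] * X[i-1];
    a[i] -= b[i] * c[i];
  }
}

/* The loop I expected CASE3 to be reduced to: only X is updated. */
void foo_expected(void) {
  int i;
  for (i = 1; i < LEN; i++)
    X[i] = X[i-1] * X[i-1];
}

/* Fill the arrays with small values (assumption: no signed overflow). */
void init(void) {
  int i;
  for (i = 0; i < LEN; i++) {
    a[i] = i;
    b[i] = i + 1;
    c[i] = 2 * i;
    X[i] = 0;
  }
  X[0] = 1; /* keeps every X[i] equal to 1 after the loop */
}

/* Returns 1 if both loops leave a and X in the same state, 0 otherwise. */
int case3_matches_expected(void) {
  int a_case3[LEN], X_case3[LEN];

  init();
  foo_case3();
  memcpy(a_case3, a, sizeof a);
  memcpy(X_case3, X, sizeof X);

  init();
  foo_expected();

  return memcmp(a_case3, a, sizeof a) == 0 &&
         memcmp(X_case3, X, sizeof X) == 0;
}
```

Calling case3_matches_expected() from a main function should return 1 for
these inputs, which is why I expected the accesses to 'a' to disappear.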

Best Regards,
Hiroshi