<table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Issue</th>
<td>
<a href=https://github.com/llvm/llvm-project/issues/137168>137168</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>
Missed GEP Optimization for Constant Index
</td>
</tr>
<tr>
<th>Labels</th>
<td>
new issue
</td>
</tr>
<tr>
<th>Assignees</th>
<td>
</td>
</tr>
<tr>
<th>Reporter</th>
<td>
GINN-Imp
</td>
</tr>
</table>
<pre>
The following reduced IR is derived from https://github.com/dtcxzyw/llvm-opt-benchmark/blob/314b5d859bb1d19cb93c259e57edafaf11d4fc80/bench/abseil-cpp/original/bounded_utf8_length_sequence_test.ll#L8889
The reduced code is still a bit long, but we have pruned as much extraneous code as possible. It may be possible to reduce it further once some of the root causes are known.
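Missed optimization: `%66 = getelementptr inbounds nuw [2 x i64], ptr %0, i64 0, i64 %65` -> `%66 = getelementptr inbounds nuw [2 x i64], ptr %0, i64 0, i64 0`
On the path that reaches the GEP, `%32` is at most 63, so `%33 = udiv i32 %32, 32` is either 0 or 1. The branch on `%59` is only taken to the GEP when that value is greater than 0, so it must be 1, and `%65` is therefore always 0.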
https://godbolt.org/z/G3xf9rh8W
reduced code:
```llvm
define i64 @src(ptr noundef nonnull align 8 dereferenceable(16) %0, i32 noundef %1) {
%5 = alloca i32, align 4
%16 = alloca i32, align 4
store i32 %1, ptr %5, align 4
%21 = load i32, ptr %5, align 4
%22 = icmp uge i32 %21, 64
br i1 %22, label %23, label %24
23: ; preds = %3
store i32 63, ptr %5, align 4
br label %24
24: ; preds = %23, %3
br label %31
31: ; preds = %30, %27
%32 = load i32, ptr %5, align 4
%33 = udiv i32 %32, 32
store i32 %33, ptr %16, align 4
br label %57
57: ; preds = %31
%58 = load i32, ptr %16, align 4
%59 = icmp ugt i32 %58, 0
br i1 %59, label %61, label %60
60: ; preds = %57
ret i64 1
61: ; preds = %57
%63 = load i32, ptr %16, align 4
%64 = sub i32 %63, 1
%65 = zext i32 %64 to i64
%66 = getelementptr inbounds nuw [2 x i64], ptr %0, i64 0, i64 %65
%67 = load i64, ptr %66, align 8
ret i64 %67
}
```
clang-trunk:
```llvm
define i64 @src(ptr noundef nonnull readonly align 8 captures(none) dereferenceable(16) %0, i32 noundef %1) local_unnamed_addr #0 {
%.not = icmp ult i32 %1, 32
br i1 %.not, label %common.ret, label %3
common.ret: ; preds = %2, %3
%common.ret.op = phi i64 [ %8, %3 ], [ 1, %2 ]
ret i64 %common.ret.op
3: ; preds = %2
%spec.store.select = tail call i32 @llvm.umin.i32(i32 %1, i32 63)
%4 = lshr i32 %spec.store.select, 5
%5 = add nsw i32 %4, -1
%6 = zext nneg i32 %5 to i64
%7 = getelementptr inbounds nuw [2 x i64], ptr %0, i64 0, i64 %6
%8 = load i64, ptr %7, align 8
br label %common.ret
}
```
expected code:
```llvm
define i64 @src_optimized(ptr noundef nonnull readonly align 8 captures(none) dereferenceable(16) %0, i32 noundef %1) local_unnamed_addr #0 {
common.ret:
%.not = icmp ult i32 %1, 32
%2 = load i64, ptr %0, align 8
%spec.select = select i1 %.not, i64 1, i64 %2
ret i64 %spec.select
}
```
Alive2 timed out. However, `opt -O3` produces the same IR for `@src_optimized` and `@tgt`, which suggests (but does not formally prove) that the code before and after the desired optimization is equivalent. https://godbolt.org/z/G3xf9rh8W
(`@src_optimized` is obtained after `@src` is optimized by clang;
`@tgt` is obtained by rewriting `%66 = getelementptr inbounds nuw [2 x i64], ptr %0, i64 0, i64 %65` to `%66 = getelementptr inbounds nuw [2 x i64], ptr %0, i64 0, i64 0`.)
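As an independent sanity check on the claim that the GEP index is always 0, the i32 arithmetic of `@src` can be modelled outside LLVM. The Python sketch below is not part of the original reproducer; `gep_index`, `reachable_indices`, and the sample set are hypothetical names chosen for illustration. It assumes the model matches the IR semantics (unsigned compare, clamp to 63, `udiv` by 32, `sub` 1, `zext`). Every input of 64 or more is clamped to 63, so a small range plus a few large values covers all distinct behaviours.

```python
from typing import Iterable, Optional, Set

MASK32 = 0xFFFFFFFF  # model i32 wrap-around


def gep_index(x: int) -> Optional[int]:
    """Return the index fed to the GEP in @src, or None on the `ret i64 1` path."""
    x &= MASK32
    if x >= 64:            # icmp uge i32 %21, 64 -> store i32 63
        x = 63
    word = x // 32         # udiv i32 %32, 32
    if word == 0:          # icmp ugt i32 %58, 0 is false -> ret i64 1
        return None
    return (word - 1) & MASK32  # sub i32 %63, 1 followed by zext


def reachable_indices(samples: Iterable[int]) -> Set[int]:
    """Collect every index value that actually reaches the GEP."""
    return {idx for idx in map(gep_index, samples) if idx is not None}


# Small inputs exhaustively, plus boundary values of the i32 range.
samples = list(range(0, 2**12)) + [2**31 - 1, 2**31, 2**32 - 1]
print(reachable_indices(samples))
```

The printed set contains only 0, which is consistent with the proposed rewrite of `%65` to the constant 0. This is a check of the model, not a replacement for an alive2 proof.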
</pre>