[llvm] [InstCombine] Make indexed compare fold GEP source type independent (PR #71663)

Yingwei Zheng via llvm-commits llvm-commits at lists.llvm.org
Sun Nov 26 18:00:41 PST 2023


dtcxzyw wrote:

> I saw performance regressions with this patch. Please give me some time to check the artifacts. PR Link: [dtcxzyw/llvm-ci#784](https://github.com/dtcxzyw/llvm-ci/pull/784) Artifacts: https://github.com/dtcxzyw/llvm-ci/actions/runs/6943611865

In `MultiSource/Benchmarks/MallocBench/cfrac/cfrac`:
```
Files ./artifacts/binaries/1e915a03d89253f1f99ca32f670d4058_bc/seg11.ll and ./artifacts/binaries/fe454fa883b692ec582f94c980b07f2b_bc/seg11.ll differ
function @llvm.memset.p0.i64 exists only in right module
function @llvm.usub.sat.i64 exists only in right module
in function pmul:
  in block %191 / %191:
    >   %192 = tail call ptr @errorp(i32 noundef signext 1, ptr noundef nonnull @.str.llvm.13542161874275324736, ptr noundef nonnull @.str.1.llvm.13542161874275324736) #6
    <   %192 = tail call ptr @errorp(i32 noundef signext 1, ptr noundef nonnull @.str.llvm.2182144003073642098, ptr noundef nonnull @.str.1.llvm.2182144003073642098) #4
  in block %219 / %219:
    >   %230 = getelementptr inbounds %struct.precisionType.0, ptr %202, i64 0, i32 4
    <   %230 = getelementptr inbounds %struct.precisionType, ptr %215, i64 0, i32 4
        %231 = zext i16 %205 to i64
    >   %232 = shl nuw nsw i64 %231, 1
    >   %233 = add nuw nsw i64 %232, 6
    >   %234 = tail call i64 @llvm.usub.sat.i64(i64 %232, i64 2)
    >   %235 = sub nsw i64 %233, %234
    >   %236 = getelementptr i8, ptr %215, i64 %235
    >   %237 = add nuw nsw i64 %234, 2
    >   tail call void @llvm.memset.p0.i64(ptr noundef nonnull align 2 dereferenceable(1) %236, i8 0, i64 %237, i1 false), !tbaa !7
    >   %238 = getelementptr inbounds %struct.precisionType.0, ptr %203, i64 0, i32 4
    >   %239 = getelementptr inbounds i8, ptr %215, i64 8
    >   %240 = getelementptr inbounds i16, ptr %238, i64 %231
    >   %241 = ptrtoint ptr %230 to i64
    >   %242 = zext i16 %208 to i64
    >   %243 = getelementptr inbounds i16, ptr %230, i64 %242
    <   %232 = getelementptr inbounds i16, ptr %230, i64 %231
  in block %217 / %217:
    >   %218 = tail call ptr @errorp(i32 noundef signext 1, ptr noundef nonnull @.str.llvm.13542161874275324736, ptr noundef nonnull @.str.1.llvm.13542161874275324736) #6
    <   %218 = tail call ptr @errorp(i32 noundef signext 1, ptr noundef nonnull @.str.llvm.2182144003073642098, ptr noundef nonnull @.str.1.llvm.2182144003073642098) #4
  in block %387 / %366:
    >   %367 = icmp ult i32 %360, 65536
    <   %388 = icmp ult i32 %381, 65536

```
It seems like converting the GEP into an explicit shl+add may cause the regression: `BasicAA` loses precision once the scaled offset is computed outside the GEP :(
We should improve `BasicAA` to handle more of these patterns in the future.
But this shouldn't block the patch, as we saw performance/code-size improvements in some other benchmarks.
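For reference, the shape of the transform in question is roughly the following (a hand-reduced sketch with made-up value names, not taken verbatim from the diff above): a typed GEP, whose scale `BasicAA` can read directly off the source element type, becomes a separate `shl` feeding a byte-wise GEP, so the alias analysis has to reconstruct the linear expression from the `shl` instead.

```llvm
; Before: scale 2 (i16) and index %i are visible inside the GEP itself.
%dst.old = getelementptr inbounds i16, ptr %p, i64 %i

; After: the scaling is an explicit shl feeding an i8 GEP. BasicAA must
; look through the shl to recover "2 * %i", which can fail for more
; complex offset expressions and weaken alias/dereferenceability results.
%off     = shl nuw nsw i64 %i, 1
%dst.new = getelementptr inbounds i8, ptr %p, i64 %off
```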



https://github.com/llvm/llvm-project/pull/71663


More information about the llvm-commits mailing list