[llvm] [InstCombine] Make indexed compare fold GEP source type independent (PR #71663)
Yingwei Zheng via llvm-commits
llvm-commits at lists.llvm.org
Sun Nov 26 18:00:41 PST 2023
dtcxzyw wrote:
> I saw performance regressions with this patch. Please give me some time to check the artifacts. PR Link: [dtcxzyw/llvm-ci#784](https://github.com/dtcxzyw/llvm-ci/pull/784) Artifacts: https://github.com/dtcxzyw/llvm-ci/actions/runs/6943611865
In `MultiSource/Benchmarks/MallocBench/cfrac/cfrac`:
```
Files ./artifacts/binaries/1e915a03d89253f1f99ca32f670d4058_bc/seg11.ll and ./artifacts/binaries/fe454fa883b692ec582f94c980b07f2b_bc/seg11.ll differ
function @llvm.memset.p0.i64 exists only in right module
function @llvm.usub.sat.i64 exists only in right module
in function pmul:
in block %191 / %191:
> %192 = tail call ptr @errorp(i32 noundef signext 1, ptr noundef nonnull @.str.llvm.13542161874275324736, ptr noundef nonnull @.str.1.llvm.13542161874275324736) #6
< %192 = tail call ptr @errorp(i32 noundef signext 1, ptr noundef nonnull @.str.llvm.2182144003073642098, ptr noundef nonnull @.str.1.llvm.2182144003073642098) #4
in block %219 / %219:
> %230 = getelementptr inbounds %struct.precisionType.0, ptr %202, i64 0, i32 4
< %230 = getelementptr inbounds %struct.precisionType, ptr %215, i64 0, i32 4
%231 = zext i16 %205 to i64
> %232 = shl nuw nsw i64 %231, 1
> %233 = add nuw nsw i64 %232, 6
> %234 = tail call i64 @llvm.usub.sat.i64(i64 %232, i64 2)
> %235 = sub nsw i64 %233, %234
> %236 = getelementptr i8, ptr %215, i64 %235
> %237 = add nuw nsw i64 %234, 2
> tail call void @llvm.memset.p0.i64(ptr noundef nonnull align 2 dereferenceable(1) %236, i8 0, i64 %237, i1 false), !tbaa !7
> %238 = getelementptr inbounds %struct.precisionType.0, ptr %203, i64 0, i32 4
> %239 = getelementptr inbounds i8, ptr %215, i64 8
> %240 = getelementptr inbounds i16, ptr %238, i64 %231
> %241 = ptrtoint ptr %230 to i64
> %242 = zext i16 %208 to i64
> %243 = getelementptr inbounds i16, ptr %230, i64 %242
< %232 = getelementptr inbounds i16, ptr %230, i64 %231
in block %217 / %217:
> %218 = tail call ptr @errorp(i32 noundef signext 1, ptr noundef nonnull @.str.llvm.13542161874275324736, ptr noundef nonnull @.str.1.llvm.13542161874275324736) #6
< %218 = tail call ptr @errorp(i32 noundef signext 1, ptr noundef nonnull @.str.llvm.2182144003073642098, ptr noundef nonnull @.str.1.llvm.2182144003073642098) #4
in block %387 / %366:
> %367 = icmp ult i32 %360, 65536
< %388 = icmp ult i32 %381, 65536
```
It seems like converting the GEP into shl+add may be causing this regression :(
We should improve `BasicAA` to handle more of these patterns in the future.
But it shouldn't block this patch, since we also saw performance/code-size improvements on some benchmarks.
https://github.com/llvm/llvm-project/pull/71663