[PATCH] D153972: [AArch64] Fold tree of offset loads combine
Dave Green via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Jun 29 00:45:24 PDT 2023
dmgreen added inline comments.
================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:18406
+ B.getOperand(1).getOpcode() != ISD::CONCAT_VECTORS ||
+ B.getOperand(1).getNumOperands() != 4)
+ return false;
----------------
SjoerdMeijer wrote:
> Not sure if it deserve a comment, but why the 4 here?
This code is just trying to match the instructions in the comment above, which is the concat of 4 loads through the shuffles. It is a bit of a shame that this code is needed: we don't reliably visit operands before the root node, and we will not revisit the root after optimizing the leaf operands. See D152928, which should fix this so that we can rely only on the BUILD_VECTOR code above. That looks like it will take a long time to work through all the regressions though, and in the meantime I didn't think it was best to add something very complex just for it to be removed later.
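As a rough illustration of where the 4 comes from, here is a minimal standalone sketch of the shape being matched. It uses toy types rather than LLVM's `SDValue`; `Node`, `Opcode`, `secondOperandIsConcatOfFourLoads` and `makeShuffleOverLoads` are hypothetical names, and the per-operand load check is an assumption taken from the description above rather than from the quoted lines.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-ins for illustration only; these are NOT LLVM's SDValue/ISD types.
enum class Opcode { Load, ConcatVectors, VectorShuffle, Other };

struct Node {
  Opcode opcode;
  std::vector<Node> operands;
};

// Mirrors the shape of the quoted check: operand 1 must be a concat of
// exactly four values. Requiring each value to be a load is an added
// assumption based on the description, not on the quoted lines.
bool secondOperandIsConcatOfFourLoads(const Node &b) {
  if (b.operands.size() < 2)
    return false;
  const Node &concat = b.operands[1];
  if (concat.opcode != Opcode::ConcatVectors || concat.operands.size() != 4)
    return false;
  for (const Node &op : concat.operands)
    if (op.opcode != Opcode::Load)
      return false;
  return true;
}

// Hypothetical helper: a shuffle whose second operand concatenates n loads.
Node makeShuffleOverLoads(std::size_t n) {
  Node concat{Opcode::ConcatVectors,
              std::vector<Node>(n, Node{Opcode::Load, {}})};
  return Node{Opcode::VectorShuffle, {Node{Opcode::Other, {}}, concat}};
}

// Usage: four loads match the pattern, three do not.
void checkExamples() {
  assert(secondOperandIsConcatOfFourLoads(makeShuffleOverLoads(4)));
  assert(!secondOperandIsConcatOfFourLoads(makeShuffleOverLoads(3)));
}
```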
================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:18462
+
+ if (Op0.getOpcode() != Op1.getOpcode() || !Op0.hasOneUse() ||
+ !Op1.hasOneUse())
----------------
SjoerdMeijer wrote:
> Nit: this can be checked first in the function as an early exit before doing more work in `isLoadOrMultipleLoads`?
The opcodes of the two operands may differ for loads, so the `Op0.getOpcode() != Op1.getOpcode()` part cannot be checked first. This is due to the issue with not reliably simplifying operands before the nodes that use them. The full tree we are folding is this, where the LHS and RHS are not equally simplified:
```
t24: i64 = add nuw t22, Constant:i64<4>
t160: f32,ch = load<(load (s32) from %ir.19, align 1, !tbaa !5)> t0, t24, undef:i64
t20: i64 = add nuw t18, Constant:i64<4>
t161: f32,ch = load<(load (s32) from %ir.15, align 1, !tbaa !5)> t0, t20, undef:i64
t16: i64 = add nuw t14, Constant:i64<4>
t162: f32,ch = load<(load (s32) from %ir.11, align 1, !tbaa !5)> t0, t16, undef:i64
t12: i64 = add nuw t2, Constant:i64<4>
t151: f32,ch = load<(load (s32) from %ir.7, align 1, !tbaa !5)> t0, t12, undef:i64
t168: v4f32 = BUILD_VECTOR t160, t161, t162, t151
t167: v16i8 = bitcast t168
t122: v16i16 = zero_extend t167
t25: i64 = add nuw t23, Constant:i64<4>
t139: f32,ch = load<(load (s32) from %ir.20, align 1, !tbaa !5)> t0, t25, undef:i64
t21: i64 = add nuw t19, Constant:i64<4>
t140: f32,ch = load<(load (s32) from %ir.16, align 1, !tbaa !5)> t0, t21, undef:i64
t17: i64 = add nuw t15, Constant:i64<4>
t141: f32,ch = load<(load (s32) from %ir.12, align 1, !tbaa !5)> t0, t17, undef:i64
t13: i64 = add nuw t6, Constant:i64<4>
t130: f32,ch = load<(load (s32) from %ir.8, align 1, !tbaa !5)> t0, t13, undef:i64
t147: v4f32 = BUILD_VECTOR t139, t140, t141, t130
t146: v16i8 = bitcast t147
t123: v16i16 = zero_extend t146
t124: v16i16 = sub t122, t123
t126: v16i32 = any_extend t124
t72: v16i32 = BUILD_VECTOR Constant:i32<16>, Constant:i32<16>, ...
t73: v16i32 = shl nsw t126, t72
t206: f32,ch = load<(load (s32) from %ir.17, align 1, !tbaa !5)> t0, t22, undef:i64
t207: f32,ch = load<(load (s32) from %ir.13, align 1, !tbaa !5)> t0, t18, undef:i64
t208: f32,ch = load<(load (s32) from %ir.9, align 1, !tbaa !5)> t0, t14, undef:i64
t197: f32,ch = load<(load (s32) from %ir.0, align 1, !tbaa !5)> t0, t2, undef:i64
t214: v4f32 = BUILD_VECTOR t206, t207, t208, t197
t213: v16i8 = bitcast t214
t169: v16i16 = zero_extend t213
t185: f32,ch = load<(load (s32) from %ir.18, align 1, !tbaa !5)> t0, t23, undef:i64
t186: f32,ch = load<(load (s32) from %ir.14, align 1, !tbaa !5)> t0, t19, undef:i64
t187: f32,ch = load<(load (s32) from %ir.10, align 1, !tbaa !5)> t0, t15, undef:i64
t176: f32,ch = load<(load (s32) from %ir.2, align 1, !tbaa !5)> t0, t6, undef:i64
t193: v4f32 = BUILD_VECTOR t185, t186, t187, t176
t192: v16i8 = bitcast t193
t170: v16i16 = zero_extend t192
t171: v16i16 = sub t169, t170
t172: v16i32 = sign_extend t171
t74: v16i32 = add nsw t73, t172
```
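One way to read the reply about check ordering is sketched below with a toy model. This is not the patch's code: `Operand`, `Kind`, `Match` and both `classify*` functions are hypothetical, and `isLoadGroup` merely stands in for a successful `isLoadOrMultipleLoads` result. The assumption is that two operands can both be recognised as load groups while still having different top-level opcodes.

```cpp
#include <cassert>

// Toy model for illustration only; not LLVM's SDValue API.
enum class Kind { BuildVector, VectorShuffle, Add, Sub };

struct Operand {
  Kind kind;        // stand-in for getOpcode()
  bool isLoadGroup; // stand-in for "isLoadOrMultipleLoads succeeded"
};

enum class Match { Loads, SameOpcode, None };

// Load recognition runs first, so two load groups match even when their
// top-level opcodes differ.
Match classifyLoadsFirst(const Operand &a, const Operand &b) {
  if (a.isLoadGroup && b.isLoadGroup)
    return Match::Loads;
  if (a.kind != b.kind)
    return Match::None;
  return Match::SameOpcode;
}

// Opcode comparison hoisted to the top as an early exit.
Match classifyOpcodeFirst(const Operand &a, const Operand &b) {
  if (a.kind != b.kind)
    return Match::None;
  if (a.isLoadGroup && b.isLoadGroup)
    return Match::Loads;
  return Match::SameOpcode;
}

// Usage: one side already simplified, the other not yet.
void checkOrdering() {
  Operand simplified{Kind::BuildVector, true};
  Operand unsimplified{Kind::VectorShuffle, true};
  assert(classifyLoadsFirst(simplified, unsimplified) == Match::Loads);
  assert(classifyOpcodeFirst(simplified, unsimplified) == Match::None);
}
```

In this toy model the hoisted early exit rejects a pair of load groups that the loads-first ordering accepts, which is the trade-off the reply describes.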
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D153972/new/
https://reviews.llvm.org/D153972