[llvm-dev] Question about supporting zext on IVUsers and LSR
Philip Reames via llvm-dev
llvm-dev at lists.llvm.org
Mon Nov 29 09:57:37 PST 2021
First, there are no "simple" question about LSR. :)
Second, I wouldn't view your example as an LSR problem, but a failed IR
canonicalization. In the example, we'd try to widen the IV in IndVars,
and LSR would expect the widening to have already been done. I'd take a
look into why we're not widening the IV as your next step.
Philip
On 11/25/21 6:37 AM, Jingu Kang via llvm-dev wrote:
>
> Hi All,
>
> I am looking at a simple example as below.
>
> target datalayout = "e-m:e-i8:8:32-i16:16:32-i64:64-i128:128-n32:64-S128"
>
> target triple = "aarch64-unknown-linux-gnu"
>
> %struct.base_s = type { %struct.range, i64, i64, i64*, i32, [4 x i32],
> [274 x %struct.match], i32, i32, i8, i8, i8, i32, i32, i32, [16 x [768
> x i16]], [12 x [16 x i16]], [12 x i16], [12 x i16], [12 x i16], [12 x
> i16], [12 x [16 x i16]], [4 x [64 x i16]], [114 x i16], [16 x i16],
> %struct.length, %struct.length, [4 x [64 x i32]], [4 x [128 x i32]],
> i32, i32, [16 x i32], i32, i32, i32, [4096 x %struct.opt] }
>
> %struct.range = type { i64, i64, i32, i8, i64, i32, i32, [53 x i32],
> [53 x i16*] }
>
> %struct.match = type { i32, i32 }
>
> %struct.length = type { i16, i16, [16 x [8 x i16]], [16 x [8 x i16]],
> [256 x i16], [16 x [272 x i32]], i32, [16 x i32] }
>
> %struct.opt = type { i32, i8, i8, i32, i32, i32, i32, i32, [4 x i32] }
>
> define i32 @test(i32 %len, %struct.base_s* nocapture readonly %obj) {
>
> entry:
>
> br label %while.cond
>
> while.cond: ; preds = %while.cond, %entry
>
> %i.0 = phi i32 [ 0, %entry ], [ %inc, %while.cond ]
>
> %idxprom = zext i32 %i.0 to i64
>
> %len1 = getelementptr inbounds %struct.base_s, %struct.base_s* %obj,
> i64 0, i32 6, i64 %idxprom, i32 0
>
> %0 = load i32, i32* %len1, align 4
>
> %cmp = icmp ult i32 %0, %len
>
> %inc = add i32 %i.0, 1
>
> br i1 %cmp, label %while.cond, label %while.end
>
> while.end: ; preds = %while.cond
>
> ret i32 %i.0
>
> }
>
> I expected the LSR pass extracts the loop invariant part from `%len1 =
> getelementptr` and hoists it to preheader. It could cause a new IV for
> the loop dependent part from gep inside loop and `%0 = load` could use
> it. However, it looks the `IVUsers` does process the `%idxprom =
> zext`. I can see the `SCEVAddRecExpr` and `SCEVAddExpr` are handled in
> `isInteresting` function. It seems LSR pass does not also handle the
> `zext` for `IVChain`. If I remove the `%idxprom = zext` manually on
> above example, I can see LSR works as the expectation. Does anyone
> know why the `zext` is not supported on IVUsers and LSR? Does it make
> LSR difficult to construct formulas and compare them? If I missed
> something, please let me know.
>
> For reference, the assembly output of above example with `-O3` is as
> below.
>
> test:
>
> mov w8, w0
>
> mov w0, #-1
>
> .LBB0_1:
>
> add w0, w0, #1
>
> add x9, x1, w0, uxtw #3
>
> ldr w9, [x9, #724]
>
> cmp w9, w8
>
> b.lo .LBB0_1
>
> Ret
>
> If I remove the `zext`, the output is as below and the loop has one
> less instruction against above output.
>
> test:
>
> add x9, x1, #724
>
> mov x8, #-1
>
> .LBB0_1:
>
> ldr w10, [x9], #8
>
> add x8, x8, #1
>
> cmp w10, w0
>
> b.lo .LBB0_1
>
> mov x0, x8
>
> ret
>
> The IR code, in which the `zext` is removed, is as below.
>
> target datalayout = "e-m:e-i8:8:32-i16:16:32-i64:64-i128:128-n32:64-S128"
>
> target triple = "aarch64-unknown-linux-gnu"
>
> %struct.base_s = type { %struct.range, i64, i64, i64*, i32, [4 x i32],
> [274 x %struct.match], i32, i32, i8, i8, i8, i32, i32, i32, [16 x [768
> x i16]], [12 x [16 x i16]], [12 x i16], [12 x i16], [12 x i16], [12 x
> i16], [12 x [16 x i16]], [4 x [64 x i16]], [114 x i16], [16 x i16],
> %struct.length, %struct.length, [4 x [64 x i32]], [4 x [128 x i32]],
> i32, i32, [16 x i32], i32, i32, i32, [4096 x %struct.opt] }
>
> %struct.range = type { i64, i64, i32, i8, i64, i32, i32, [53 x i32],
> [53 x i16*] }
>
> %struct.match = type { i32, i32 }
>
> %struct.length = type { i16, i16, [16 x [8 x i16]], [16 x [8 x i16]],
> [256 x i16], [16 x [272 x i32]], i32, [16 x i32] }
>
> %struct.opt = type { i32, i8, i8, i32, i32, i32, i32, i32, [4 x i32] }
>
> ;define i32 @test(i32 %len, %struct.base_s* nocapture readonly %obj) {
>
> define i64 @test(i32 %len, %struct.base_s* nocapture readonly %obj) {
>
> entry:
>
> br label %while.cond
>
> while.cond: ; preds = %while.cond, %entry
>
> ; %i.0 = phi i32 [ 0, %entry ], [ %inc, %while.cond ]
>
> %i.0 = phi i64 [ 0, %entry ], [ %inc, %while.cond ]
>
> ; %idxprom = zext i32 %i.0 to i64
>
> ; %len1 = getelementptr inbounds %struct.base_s, %struct.base_s*
> %obj, i64 0, i32 6, i64 %idxprom, i32 0
>
> %len1 = getelementptr inbounds %struct.base_s, %struct.base_s* %obj,
> i64 0, i32 6, i64 %i.0, i32 0
>
> %0 = load i32, i32* %len1, align 4
>
> %cmp = icmp ult i32 %0, %len
>
> ; %inc = add i32 %i.0, 1
>
> %inc = add i64 %i.0, 1
>
> br i1 %cmp, label %while.cond, label %while.end
>
> while.end: ; preds = %while.cond
>
> ; ret i32 %i.0
>
> ret i64 %i.0
>
> }
>
> Thanks
>
> JinGu Kang
>
>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20211129/f0581341/attachment.html>
More information about the llvm-dev
mailing list