[PATCH] D114832: [SROA] Improve SROA to prevent generating redundant coalescing operations.
Mingming Liu via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Wed Dec 8 16:06:52 PST 2021
luna added a comment.
thanks for the feedback!
In D114832#3181314 <https://reviews.llvm.org/D114832#3181314>, @aeubanks wrote:
> after running it through -O2:
>
> define i64 @test_struct_of_two_int(i1 zeroext %test, i64 ()* nocapture %p) local_unnamed_addr {
> entry:
> br i1 %test, label %return, label %if.end
>
> if.end: ; preds = %entry
> %call = tail call i64 %p()
> %retval.sroa.3.0.extract.shift = and i64 %call, -4294967296
> %phi.cast = and i64 %call, 4294967295
> br label %return
>
> return: ; preds = %entry, %if.end
> %retval.sroa.3.0 = phi i64 [ %retval.sroa.3.0.extract.shift, %if.end ], [ 0, %entry ]
> %retval.sroa.0.0 = phi i64 [ %phi.cast, %if.end ], [ 0, %entry ]
> %retval.sroa.0.0.insert.insert = or i64 %retval.sroa.0.0, %retval.sroa.3.0
> ret i64 %retval.sroa.0.0.insert.insert
> }
>
> yeah I'm not sure if this is something that only happens with SROA, but it does look very weird
>
> perhaps trying to implement something like
>
> b:
> phi1 = phi [b1, v1] [b2, v2]
> phi2 = phi [b1, v3] [b2, v4]
> r = op phi1, phi2
>
> into
>
> b1:
> n1 = op v1, v2
> b2:
> n2 = op v3, v4
> b:
> r = phi [b1, n1] [b2, n2]
>
> and maybe only if at least one of the ops simplifies away to prevent increasing the number of op instructions
To confirm my understanding, we want the transformation above when one or more `op` could be simplified away because the result of `op` already exists as a variable, so `op` (along with the instructions that generates op parameters) could be eliminated away. Is that roughly the idea?
In particular for the `test_struct_of_two_int` example, `op` is `or`, and
%retval.sroa.3.0.extract.shift = and i64 %call, -4294967296
%phi.cast = and i64 %call, 4294967295
are used to generate `or` parameters. As a result, both could be optimized away.
> when `test_struct_of_two_int` is properly fixed we should add a phase ordering test for it to make sure it fully optimizes correctly if we do go down this route
Also, it sounds the transformation could happen in a pass orthogonal with SROA (possibly after SROA, because PHI are inserted after SROA in this case).
So shall I prototype this transformation in a pass (wondering if inst combine is a good one), and insert the modified pass between SROA and the pass after SROA?
I guess I'm trying to get more instructions where this transformation should happen, around phase ordering.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D114832/new/
https://reviews.llvm.org/D114832
More information about the llvm-commits
mailing list