[PATCH] Reassociate GEP operands for loop invariant code motion
Sanjoy Das
sanjoy at playingwithpointers.com
Mon Apr 20 22:39:11 PDT 2015
On Mon, Apr 20, 2015 at 10:13 PM, Jingyue Wu <jingyue at google.com> wrote:
> I ran `opt -scalar-evolution -analyze` on `simple_licm` and got
>
> Printing analysis 'Scalar Evolution Analysis' for function 'simple_licm':
> Classifying expressions for: @simple_licm
> %i = phi i32 [ 0, %entry ], [ %i.next, %loop ]
> --> {0,+,1}<nuw><nsw><%loop> U: [0,1000000) S: [0,1000000) Exits: 999999
> %idx = add nsw i32 %a, %i
> --> {%a,+,1}<nw><%loop> U: full-set S: full-set Exits: (999999 + %a)
> %idx.sext = sext i32 %idx to i64
> --> (sext i32 {%a,+,1}<nw><%loop> to i64) U: [-2147483648,2147483648) S: [-2147483648,2147483648) Exits: (sext i32 (999999 + %a) to i64)
> %arrayidx = getelementptr i32, i32* %input, i64 %idx.sext
> --> ((4 * (sext i32 {%a,+,1}<nw><%loop> to i64)) + %input) U: full-set S: full-set Exits: ((4 * (sext i32 (999999 + %a) to i64)) + %input)
> %0 = load i32, i32* %arrayidx
> --> %0 U: full-set S: full-set Exits: <<Unknown>>
> %i.next = add nuw nsw i32 %i, 1
> --> {1,+,1}<nuw><nsw><%loop> U: [1,1000001) S: [1,1000001) Exits: 1000000
> Determining loop execution counts for: @simple_licm
> Loop %loop: backedge-taken count is 999999
> Loop %loop: max backedge-taken count is 999999
>
> As far as I can see,
>
> %idx = add nsw i32 %a, %i
> --> {%a,+,1}<nw><%loop> U: full-set S: full-set Exits: (999999 + %a)
>
> `%idx` has only the `<nw>` (no-self-wrap) flag, not `<nsw>`.
As far as I can tell, this is a missing case in SCEV. You were right
in saying that LSR (through SCEV) does not exploit nsw here.
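To illustrate why the missing `<nsw>` matters here: the sign extension in `%idx.sext` can only be distributed over the add if the 32-bit add cannot overflow. The following is a minimal Python sketch of that arithmetic fact, not code from the patch or from SCEV; the helper names `wrap_i32`, `sext_of_sum`, and `sum_of_sext` are made up for illustration.

```python
def wrap_i32(x):
    """Wrap a Python int to a signed 32-bit value, modelling i32 overflow."""
    return (x + 2**31) % 2**32 - 2**31


def sext_of_sum(a, i):
    """Model `sext i32 (a + i) to i64`: add in 32 bits, then widen."""
    return wrap_i32(a + i)


def sum_of_sext(a, i):
    """Model `(sext a) + (sext i)`: widen first, then add in 64 bits.

    For i32 inputs the 64-bit sum cannot overflow, so a plain add suffices.
    """
    return a + i


# No signed overflow in 32 bits: both forms agree.
print(sext_of_sum(5, 7) == sum_of_sext(5, 7))

# Signed overflow in 32 bits: the two forms differ, so the rewrite
# is only valid when the add is known not to wrap (nsw).
print(sext_of_sum(2**31 - 1, 1), sum_of_sext(2**31 - 1, 1))
```

With only `<nw>` on `{%a,+,1}`, the second case cannot be ruled out from the SCEV expression alone, which is why the `sext` stays wrapped around the add recurrence in the output above.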
> I noticed you recently worked on strengthening `ScalarEvolution`'s handling of nsw, which is pretty awesome! Does it address this issue?
As you can tell by empirical evidence, the answer is no. :)
The key issue here is that nsw/nuw/poison-values are on shaky ground
to begin with. SCEV pretends that an nuw/nsw addition that feeds into
a PHI can be assumed not to overflow, but depending on how you
interpret the semantics of nuw/nsw in the LangRef, this may or may
not hold. This is one reason why I've generally tried to make SCEV
more aggressive in inferring no-overflow bits directly from control
flow (i.e. even if nuw/nsw bits are not present in the IR).
The second reason I have had to make SCEV smart about FlagNSW/FlagNUW
in the absence of nsw/nuw in the IR is that the programming language
whose performance I'm interested in does not have UB on integer
overflow, so our frontend does not emit any nuw/nsw operations. :)
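As a rough illustration of inferring no-overflow without relying on nsw/nuw in the IR, the sketch below checks whether an add recurrence stays inside the signed i32 range given a bound on its start value and a known backedge-taken count. This is a simplified model written for this example, not how SCEV actually implements the inference; the function name `addrec_stays_in_i32` and its parameters are hypothetical.

```python
INT32_MIN, INT32_MAX = -2**31, 2**31 - 1


def addrec_stays_in_i32(start_lo, start_hi, step, backedge_taken_count):
    """Return True if {start,+,step} provably never leaves the signed i32 range.

    start_lo and start_hi bound the (possibly unknown) start value, inclusive.
    The recurrence is evaluated on iterations 0..backedge_taken_count.
    """
    last = step * backedge_taken_count
    # The recurrence is monotonic, so the extremes occur on the first
    # or the last iteration.
    lowest = start_lo + min(0, last)
    highest = start_hi + max(0, last)
    return INT32_MIN <= lowest and highest <= INT32_MAX


# %i = {0,+,1} with backedge-taken count 999999: provably in range.
print(addrec_stays_in_i32(0, 0, 1, 999_999))

# %idx = {%a,+,1} with %a unknown (full i32 range): cannot be proven.
print(addrec_stays_in_i32(INT32_MIN, INT32_MAX, 1, 999_999))
```

The first call models `%i`, whose range follows from the loop bounds alone; the second models `%idx`, where nothing is known about `%a`, so a range argument of this kind does not help and the nsw flag in the IR is the only available evidence.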
-- Sanjoy