[PATCH] Reassociate GEP operands for loop invariant code motion

Mon Apr 20 21:44:52 PDT 2015

> One reason we couldn't simply leverage LSR is that LSR forgets nsw and would miss cases such as `gep input, sext(a +nsw i)` in `simple_licm`.

I'm not sure this is correct.  LSR represents "registers" using
normal SCEV expressions.  If SCEV can see that an add recurrence does
not overflow, LSR should see that too.  I've only recently started
looking at LSR so maybe I'm missing some insight here.
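
To make the role of nsw concrete, here is a minimal sketch (plain C++,
not LLVM code; the function names are hypothetical) of why
`sext(a + i)` can only be rewritten as `sext(a) + sext(i)` when the
32-bit add is known not to overflow:

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// sext(a + i): the add happens in 32 bits and may wrap, then the result is
// sign-extended. Unsigned arithmetic is used so the wrap is well defined;
// the conversion back to int32_t is modular on mainstream compilers
// (and guaranteed to be since C++20).
int64_t sext_of_add(int32_t a, int32_t i) {
  uint32_t wrapped = static_cast<uint32_t>(a) + static_cast<uint32_t>(i);
  return static_cast<int64_t>(static_cast<int32_t>(wrapped));
}

// sext(a) + sext(i): both operands are widened first, so nothing wraps.
int64_t add_of_sext(int32_t a, int32_t i) {
  return static_cast<int64_t>(a) + static_cast<int64_t>(i);
}

// Hypothetical check showing where the two forms agree and disagree.
void check_examples() {
  const int32_t max32 = std::numeric_limits<int32_t>::max();
  assert(sext_of_add(1, 2) == add_of_sext(1, 2));          // no overflow
  assert(sext_of_add(max32, 1) != add_of_sext(max32, 1));  // overflow
}
```

The two forms agree exactly when the narrow add does not wrap, which is
the fact that nsw (or a no-overflow add recurrence in SCEV) records.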

> Indvar widening alleviates this nsw issue for lots of architectures. However, it's not a good option for GPU programs again because most GPUs support only i32 natively. If LSR fails to simplify the loop, then indvar widening can negatively affect performance (https://llvm.org/bugs/show_bug.cgi?id=21148) because 64-bit arithmetic is much more expensive than 32-bit. P.S. maybe we can narrow an induction variable back to its original size on LSR failure?

I ran into a similar (if not exactly the same) issue recently:
http://lists.cs.uiuc.edu/pipermail/llvmdev/2015-April/084465.html
(this is why I've been looking at LSR).

I concluded that the fundamental reason LSR is regressing performance
is that (as mentioned in the email I've linked to) it does not
consider formulae of the form (sext X) or (zext X).  Another way of
saying "narrow an induction variable back to its original size on LSR
failure" is "consider a direct zext (or sext) of the narrowed
induction variable as one possible way to satisfy a use and then do
your normal profit / cost calculation as usual".
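
As a rough source-level illustration of the two candidates (a sketch
only; the function names are hypothetical, and a real compiler is free
to transform either loop), the widened form carries a 64-bit induction
variable, while the narrow form keeps it at 32 bits and sign-extends
only at the use:

```cpp
#include <cstdint>

// Option A: induction variable widened to 64 bits; every increment and
// compare is 64-bit arithmetic.
int64_t sum_widened(const int32_t *input, int32_t n) {
  int64_t sum = 0;
  for (int64_t i = 0; i < static_cast<int64_t>(n); ++i)
    sum += input[i];
  return sum;
}

// Option B: induction variable kept at 32 bits; it is sign-extended only
// where it is used as an index.
int64_t sum_narrow(const int32_t *input, int32_t n) {
  int64_t sum = 0;
  for (int32_t i = 0; i < n; ++i)
    sum += input[static_cast<int64_t>(i)];
  return sum;
}
```

Both loops compute the same result; the question for the cost model is
which one is cheaper on a target where 64-bit arithmetic is expensive.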

-- Sanjoy