[llvm-dev] RFC for a design change in LoopStrengthReduce / ScalarEvolution

Sun Aug 16 18:11:17 PDT 2015

----- Original Message -----
> From: "Sanjoy Das via llvm-dev" <llvm-dev at lists.llvm.org>
> To: "llvm-dev" <llvm-dev at lists.llvm.org>, "Andrew Trick" <atrick at apple.com>, "Quentin Colombet"
> <qcolombet at apple.com>, "Dan Gohman" <dan433584 at gmail.com>
> Sent: Sunday, August 16, 2015 7:42:16 PM
> Subject: [llvm-dev] RFC for a design change in LoopStrengthReduce /	ScalarEvolution
> 
> This is related to an issue in loop strength reduction [1] that I've
> been trying to fix on and off for a while.  [1] has a more detailed
> description of the issue and an example, but briefly put, I want LSR
> to consider formulae that have "Zext T" as base and/or scale
> registers, and to appropriately rate such formulae.
> 
> My first attempt[2] at fixing this was buggy and had to be reverted.
> While it is certainly possible to "fix" the bug in [2], the fixed
> version is ugly and fragile, and I'd prefer having something cleaner.
> 
> The key issue here is that currently there is no way to represent
> "Zext T" as a *distinct computation* within SCEV -- a SCEV expression
> of the form "Zext T" (any SCEV expression, actually) is always
> maximally simplified on construction.  This means I cannot "factor
> out" a Zext from a base register (say) and force SCEV to maintain it
> as a "SCEVZeroExtend T"; SCEV is free to re-simplify "SCEVZeroExtend
> T" to something equivalent on construction.  This is not just an
> issue
> in theory -- factoring out a 'SCEVZeroExtend' from an expression 'E'
> requires us to prove that 'E' == 'SCEVZeroExtend T'.  SCEV can use
> that exact same fact to "simplify" 'SCEVZeroExtend T' back to 'E' on
> construction.  There is an analogous issue w.r.t. sign extension
> expressions.

I don't understand why you want to factor out the information, exactly. It seems like what you need is a function like:

  unsigned getMinLeadingZeros(const SCEV *);

then, if you want to get the non-extended expression, you can just apply an appropriate truncation. I assume, however, that I'm missing something.

Adding simplification barriers into SCEV expressions seems like a dangerous idea, especially since LSR is not the only user of SCEV.

 -Hal

> 
> There are two possibilities I see here -- the problem can be solved
> within
> LoopStrengthReduce or within ScalarEvolution.
> 
> # Within LoopStrengthReduce
> 
> Solving it within LoopStrengthReduce is tricky: due to the
> representation issue in SCEV mentioned above, the fact that a scale
> or
> base register is actually a narrower than needed and has to be zero /
> sign extended before use needs to be represented separately from the
> SCEV*.  Every place in LoopStrengthReduce that analyzes and
> manipulates registers needs to be updated to understand this bit of
> side information ([2] was buggy because not every place in LSR had
> been updated this way in the change).
> 
> The cleanest approach I could come up with is to introduce a `class
> Register` that wraps a SCEV and some associated metadata; and change
> LSR to reason about registers using this `Register` class (and not
> raw
> SCEV pointers).  All of the "this i64 register is actually a zero
> extended i32 register" can then be made to live within this
> `Register`
> class, making the structure of the code more obvious.
> 
> # Within ScalarEvolution
> 
> Another way to fix the problem is to introduce a new kind of SCEV
> node
> -- a `SCEVOpaqueExpr`.  `SCEVOpaqueExpr` wraps another SCEV
> expression, and prevents optimizations from optimizing across the
> `SCEVOpaqueExpr` boundary.  This will allow us to keep representing
> registers as SCEV pointers as they are today while solving the zext
> representation problem.  After factoring a SCEV expression 'E' into
> 'SCEVZeroExtend T', 'SCEVZeroExtend T' can be "frozen" by
> representing
> it as 'SCEVZeroExtend (SCEVOpaqueExpr T)'.  Since 'SCEVZeroExtend
> (SCEVOpaqueExpr T)' computes the same value as 'SCEVZeroExtend T' ==
> 'E', there is no need to update the parts of LSR that analyze /
> modify
> registers to preserve correctness.
> 
> Note: this type of "opaque" node also can be used to force the SCEV
> expander to generate a particular sequence of expressions; something
> that is achieved by the "expand(getSCEVUnknown(expand(Op)))" idiom in
> LSR currently.
> 
> 
> 
> So far I'm leaning towards the second approach, but given that it
> would
> be a substantial change (though potentially useful elsewhere), I'd
> like to
> hear what others think before diving in.
> 
> -- Sanjoy
> 
> 
> [1]: http://lists.llvm.org/pipermail/llvm-dev/2015-April/084434.html
> [2]: http://reviews.llvm.org/rL243939
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org         http://llvm.cs.uiuc.edu
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
> 

-- 
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory