[PATCH] D78220: IR/ConstantRange: Add support for FP ranges

Wed Apr 15 14:55:23 PDT 2020

jvesely added a comment.

In D78220#1984311 <https://reviews.llvm.org/D78220#1984311>, @nikic wrote:

> I would strongly recommend to implement this as a separate type (FloatRange for example), rather than combining it into ConstantRange. There's a couple reasons for that, the main ones would be a) float ranges have significantly different semantics, in particular the fact that they are inclusive and b) ConstantRange is a size critical structure, because it is used in value lattice elements. As implement, this change is going to skyrocket memory consumption during SCCP.

The point of having one combined range is to simplify the logic for users. You don't have to check the type; you just perform the same operations `union/intersect/binop/contains/..`.
I'm not sure how splitting it into a separate class would address the memory concerns. Tracking FP ranges will require two APFloats irrespective of the structure.
If keeping both `APFloat` and `APInt` bounds is a problem, I can put them in a union, since they are used exclusively and an instance never switches between float and int.
The difference in semantics shouldn't impact users much. They should use `contains/isEmpty/isFull/isSingleElement/getABCrange` instead of checking the bounds manually.
The application to ValueLattice/LVI/SCCP is in D78224 <https://reviews.llvm.org/D78224>. To keep the diff simple, it doesn't yet implement the cleanups that having one range class allows.
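
To make the storage argument concrete, here is a minimal, self-contained sketch of one range class that holds either integer or FP bounds but never both. It is not the code from the patch: `SketchRange`, `IntBounds` and `FPBounds` are hypothetical names, `int64_t`/`double` stand in for `APInt`/`APFloat`, `std::variant` stands in for the union, and wrapped integer ranges are ignored.

```cpp
#include <cassert>
#include <cstdint>
#include <variant>

// Hypothetical stand-ins for the real bounds types (APInt / APFloat).
struct IntBounds { int64_t Lower, Upper; }; // half-open [Lower, Upper), no wrapping here
struct FPBounds  { double Lower, Upper; };  // inclusive [Lower, Upper]

class SketchRange {
  // Only one alternative is ever live, so the object pays for the larger
  // of the two bound pairs (plus a tag), not for both.
  std::variant<IntBounds, FPBounds> Bounds;

  explicit SketchRange(std::variant<IntBounds, FPBounds> B) : Bounds(B) {}

public:
  static SketchRange getInt(int64_t L, int64_t U) {
    assert(L <= U && "sketch does not model wrapped ranges");
    return SketchRange(IntBounds{L, U});
  }
  static SketchRange getFP(double L, double U) {
    assert(L <= U && "bounds must be ordered and not NaN");
    return SketchRange(FPBounds{L, U});
  }

  bool isFloat() const { return std::holds_alternative<FPBounds>(Bounds); }

  // Integer query: the upper bound is excluded. Asking an FP range about an
  // integer simply returns false in this sketch.
  bool contains(int64_t V) const {
    const auto *I = std::get_if<IntBounds>(&Bounds);
    return I && I->Lower <= V && V < I->Upper;
  }

  // FP query: the upper bound is included.
  bool contains(double V) const {
    const auto *F = std::get_if<FPBounds>(&Bounds);
    return F && F->Lower <= V && V <= F->Upper;
  }

  // The inclusive/exclusive difference stays hidden behind the query.
  bool isSingleElement() const {
    if (const auto *I = std::get_if<IntBounds>(&Bounds))
      return I->Lower < I->Upper && I->Lower + 1 == I->Upper;
    const auto &F = std::get<FPBounds>(Bounds);
    return F.Lower == F.Upper;
  }
};
```

A caller that sticks to `contains` and `isSingleElement` never has to know whether the upper bound is inclusive, which is the argument made above for using the query methods instead of inspecting the bounds.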

> It would also be nice to have some general context on where you plan to use this.

Other than the usual uses of range propagation, floating-point ranges can be used to safely apply fast-math optimizations when operands are known not to require special handling.
There's already rudimentary support for NaN tracking; using ranges makes it possible to safely apply `nosignedzeros`, `noinfs`, `nonans`, and `denorms are zero` optimizations.
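
As a rough illustration of how a known range could justify those assumptions, the sketch below answers each question conservatively from an inclusive `double` range plus a may-be-NaN bit. `FPRangeSketch` and its method names are hypothetical and are not part of the patch or of LLVM; any range that touches zero is treated as possibly containing both `-0.0` and denormals.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical inclusive FP range with a separate NaN bit.
struct FPRangeSketch {
  double Lower, Upper; // inclusive bounds, never NaN themselves
  bool MayBeNaN;

  static FPRangeSketch make(double L, double U, bool NaN) {
    assert(L <= U && "bounds must be ordered and not NaN");
    return FPRangeSketch{L, U, NaN};
  }

  // The value is known not to be NaN.
  bool canAssumeNoNaNs() const { return !MayBeNaN; }

  // Both bounds are finite, so no in-range value is an infinity.
  bool canAssumeNoInfs() const {
    return std::isfinite(Lower) && std::isfinite(Upper);
  }

  // Conservative: the sign of zero cannot matter if zero is not in range.
  // (-0.0 == 0.0 compares equal, so both zeros are excluded together.)
  bool canIgnoreSignedZero() const { return Lower > 0.0 || Upper < 0.0; }

  // Conservative: every in-range value has magnitude of at least the
  // smallest normal double, so flushing denormals cannot change it.
  bool excludesDenormals() const {
    const double MinNormal = std::numeric_limits<double>::min();
    return Lower >= MinNormal || Upper <= -MinNormal;
  }
};
```

For an operand known to lie in [1.0, 100.0] and not be NaN, all four queries return true; for [-1.0, 1.0] only the no-infinities query does.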

Moreover, GPU backends can generate special instructions if they have advance knowledge of FP operands.
For example, the PTX backend can use a different fdiv implementation based on the range of inputs [0]. AMDGPU can eliminate `canonicalize` instructions if it knows the range of inputs [1].

[0] https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#floating-point-instructions-div
[1] https://reviews.llvm.org/D35335

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D78220/new/

https://reviews.llvm.org/D78220