[PATCH] D91174: [Analysis] Introduce a new InstructionCost class

Mon Nov 30 05:40:30 PST 2020

ctetreau added a comment.

I personally feel very strongly that we should define a total ordering here as is currently being done. @nhaehnle says that this behaves like nan, and that this is bad, and that it runs the risk of users relying on this total ordering. I disagree with all three of these concerns.

1. It behaves like nan - This is false because we have a //documented //total ordering, and floats only have a partial ordering with respect to nan.

  float a = [something that returns nan];
  float b = [something that returns a different nan];

  // ignoring epsilon issues for the purposes of demonstration

  if (a < b) {
    // unreachable
  } else if (a > b) {
    // also unreachable
  } else if (a == b) {
    // also unreachable
  } else if (a != b) {
    // also unreachable
  } else {
    llvm_unreachable("we shouldn't be able to get here, but we do");
  }

  auto a = InstructionCost::getInvalid(1);
  auto b = InstructionCost::getInvalid(2);

  if (a < b) {
    // we will always hit this branch
  } else if (a > b) {
    // also unreachable
  } else if (a == b) {
    // also unreachable
  } else if (a != b) {
    // also unreachable
  } else {
    llvm_unreachable("actually unreachable");
  }

2. "behaving like nan" is undesirable - Float with nan and this InstructionCost are basically the same as `llvm::Optional<float or InstructionCost>`. Unfortunately, working with optionals in C++ is kind of annoying due to a lack of nice language features like pattern matching or anything resembling `do` blocks in Haskell. To work around this, we add arithmetic operators and relations that let the user just treat two `InstructionCost ` objects as plain numbers. The user will be forced to handle the invalid case when they call `getValue()`, so there's no risk of "NaN getting in your matrix and causing the screen to go black."

3. having a total ordering runs the risk of users accidentally relying on it - This isn't a "risk", it's a feature. I requested a huge comment describing how the total ordering works, this makes it a guarantee provided by the class that it will continue to work this way. Unit tests have been added to ensure that it works this way. It is explicitly intended that users will use it this way.

To address some other issues raised in the thread:

- InstructionCost's relations don't behave correctly with respect to infinities. While this is true, I don't think that this is an issue. An invalid cost is not an "infinite cost". If you think about what an "infinite cost" even means, it means "a thing that is always the most costly thing to do". Why would you want such a thing? As mentioned in the commit message, it's typically used to prevent some optimization from happening. The hope is that if you assign a cost that's high enough, the system will never choose to do this thing. Really, the user isn't doing math with infinity, they are trying to set a cost to invalid. Additionally, this thing is basically a pair of bool and int. int doesn't actually have an inifinity, it has INT_MAX, and int doesn't have correct arithmetic with respect to "infinity". If we want to change the arithmetic operators to detect integer overflow, and to saturate at INT_MIN and INT_MAX, I would be ok with that. `InstructionCost(INT_MAX) + InstructionCost(INT_MAX) == InstructionCost(INT_MAX) // true`. I think this is reasonable but it also needs to be documented.

- @sdesmalen mentioned similar issues encountered in `TypeSize`. I don't think this is really the same sort of scenario. `TypeSize` represents a polynomial, but `InstructionCost` is basically just a scalar value. We want to provide the usual arithmetic operators and relations, and there exists a sensible total ordering. `TypeSize` does not have a sensible total ordering because it contains an unknown variable.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D91174/new/

https://reviews.llvm.org/D91174