[PATCH] Simplify switch table lookup if a linear mapping is possible

Fri Nov 14 11:12:46 PST 2014

On Fri, Nov 14, 2014 at 2:50 AM, Erik Eckstein <eeckstein at apple.com> wrote:
> Sorry, I sent the previous mail too early :-)
>
> Here is the new version:
>
>
>
>
>
> The way I would do this is I would first look at two values to figure
> out the offset and multiplier, and then iterate through the rest of
> the values and see if they fit.
>
>
> Basically a good idea, but it would require to duplicate the checks, so IMO
> it would make things more complicated.

I'm not sure what you mean by "duplicate the checks", but I guess my
approach would be more inefficient since it would be doing
multiplication to check that each table element fits the linear
relation. Just measuring the distance, as your patch does, seems more
efficient.

> even in the case of overflows
>
>
> Overflows are not a problem. I added a test case for it.

Great.

>  It will also make handling undefs easier, i.e. we can just ignore them.
>
>
> I don't agree. This is only true if you assume that the first two values are
> not undefs.
>
> I decided to not handle undefs because according to my findings, they are
> very seldom in switch tables (even in non-linear ones).
> In the whole test-suite there is only a single case. IMO it's not worth the
> additional complexity it would add to the code.

OK, you've convinced me :)

> @@ -3765,6 +3812,16 @@
>    switch (Kind) {
>      case SingleValueKind:
>        return SingleValue;
> +    case LinearMapKind: {
> +      // Derive the result value from the input value.
> +      Value *Result = Builder.CreateIntCast(Index, LinearMultiplier->getType(),
> +                                            false, "switch.idx.cast");
> +      if (!LinearMultiplier->isOne())
> +        Result = Builder.CreateMul(Result, LinearMultiplier, "switch.idx.mult");
> +      if (!LinearOffset->isZero())
> +        Result = Builder.CreateAdd(Result, LinearOffset, "switch.offset");

Thanks! Can you please add tests that cover the cases that don't
require multiplier and/or offset? Besides that, the patch looks good
to me. Feel free to commit with added tests.

 - Hans

> On 13 Nov 2014, at 22:29, Hans Wennborg <hans at chromium.org> wrote:
>
> On Thu, Nov 13, 2014 at 12:20 PM, Erik Eckstein <eeckstein at apple.com> wrote:
>
> Thanks for reviewing!
>
> Do you have any numbers on how often this fires in a Clang bootstrap
> or other benchmark?
>
>
> 29 in the test-suite (including External).
>
>
> Nice.
>
> This enum is sorted in a "cheapest first" order, which matches the
> order that the SwitchLookupTable tries things. LinearMapKind should go
> before ArrayKind.
>
>
> In this case I would put it even before BitMapKind.
>
>
> Sounds good.
>
> I think undef is probably the only possible case here, right?
>
>
> I'm not sure.
>
> For undef, it shouldn't matter what result we return (someone please
> correct me if I'm wrong here), so it would be nice if we could ignore
> undef values when trying to find the linear relationship. If it makes
> things more complicated we could do it in a follow-up patch.
>
>
> Let's do it in a follow-up patch. I'll add a TODO.
>
> Should we handle the cases where the multiplier is 1 or offset is 0
> and generate simpler code for that?
>
>
> I didn't do it to keep the code simpler. But if there is a good reason to
> simplify it here and not in follow-up optimizations, it's not a problem.
>
>
> Yeah, I think we should do it here.
>
>
> A few more thoughts about the patch:
>
> +  // Check if we can derive the value with a linear transformation from the
> +  // table index.
> +  IntegerType *ValueIntType = dyn_cast<IntegerType>(ValueType);
> +  if (ValueIntType) {
> +    bool LinearMappingPossible = true;
> +    APInt PrevVal;
> +    APInt DistToPrev;
> +    // Check if there is the same distance between two consecutive values.
> +    for (uint64_t I = 0; I < TableSize; ++I) {
> +      ConstantInt *ResConst = dyn_cast<ConstantInt>(TableContents[I]);
> +      if (!ResConst) {
> +        // E.g. if it is an undef.
> +        LinearMappingPossible = false;
> +        break;
> +      }
> +      APInt Val = ResConst->getValue();
> +      if (I > 0) {
>
>
> Not a big deal, but checking "I != 0" seems like a more clear way to
> check that it's not the first iteration.
>
>
> +        APInt Dist = Val - PrevVal;
> +        if (I == 1) {
> +          DistToPrev = Dist;
> +        } else if (Dist != DistToPrev) {
> +          LinearMappingPossible = false;
> +          break;
> +        }
> +      }
> +      PrevVal = Val;
> +    }
>
>
> The way I would do this is I would first look at two values to figure
> out the offset and multiplier, and then iterate through the rest of
> the values and see if they fit.
>
> I think that might be a little easier to read, and it would also make
> it more obvious that all the results are correct even in the case of
> overflows etc., because we've checked the same computation that will
> happen at run-time. It will also make handling undefs easier, i.e. we
> can just ignore them.
>
>
> Thanks,
> Hans
>
> On 13 Nov 2014, at 19:40, Hans Wennborg <hans at chromium.org> wrote:
>
> On Thu, Nov 13, 2014 at 7:26 AM, Erik Eckstein <eeckstein at apple.com> wrote:
>
> Hi Hans,
>
> this patch adds a simple optimization for switch table lookup: It computes
> the output value directly with an (optional) mul and add if there is a
> linear mapping between index and output. Example:
>
>
> Thanks! This has been on my mental todo list for switch transformation
> stuff for a while, so I'm very happy to review it.
>
> Do you have any numbers on how often this fires in a Clang bootstrap
> or other benchmark?
>
> Index: lib/Transforms/Utils/SimplifyCFG.cpp
> ===================================================================
> --- lib/Transforms/Utils/SimplifyCFG.cpp (revision 221882)
> +++ lib/Transforms/Utils/SimplifyCFG.cpp (working copy)
> @@ -70,6 +70,7 @@
>    cl::desc("Hoist conditional stores if an unconditional store precedes"));
>
> STATISTIC(NumBitMaps, "Number of switch instructions turned into bitmaps");
> +STATISTIC(NumLinearMaps, "Number of switch instructions turned into linear
> mapping");
> STATISTIC(NumLookupTables, "Number of switch instructions turned into lookup
> tables");
> STATISTIC(NumLookupTablesHoles, "Number of switch instructions turned into
> lookup tables (holes checked)");
> STATISTIC(NumSinkCommons, "Number of common instructions sunk down to the
> end block");
> @@ -3663,7 +3664,12 @@
>
>      // The table is stored as an array of values. Values are retrieved by
> load
>      // instructions from the table.
> -      ArrayKind
> +      ArrayKind,
> +
> +      // For tables where there is a linear relationship between table
> index
> +      // and values. We calculate the result with a simple multiplication
> +      // and addition instead of a table lookup.
> +      LinearMapKind
>
>
> This enum is sorted in a "cheapest first" order, which matches the
> order that the SwitchLookupTable tries things. LinearMapKind should go
> before ArrayKind.
>
>
>    } Kind;
>
>    // For SingleValueKind, this is the single value.
> @@ -3673,6 +3679,10 @@
>    ConstantInt *BitMap;
>    IntegerType *BitMapElementTy;
>
> +    // For LinearMapKind, these are the constants used to derive the value.
> +    Constant *ValueOffset;
> +    Constant *Multiplier;
>
>
> Maybe LinearOffset and LinearMultiplier, to tie the names even tighter
> to the LinearMapKind?
>
> +
>    // For ArrayKind, this is the array.
>    GlobalVariable *Array;
>  };
> @@ -3685,7 +3695,7 @@
>                                     Constant *DefaultValue,
>                                     const DataLayout *DL)
>    : SingleValue(nullptr), BitMap(nullptr), BitMapElementTy(nullptr),
> -      Array(nullptr) {
> +      ValueOffset(nullptr), Multiplier(nullptr), Array(nullptr) {
>  assert(Values.size() && "Can't build lookup table without values!");
>  assert(TableSize >= Values.size() && "Can't fit values in table!");
>
> @@ -3730,6 +3740,42 @@
>    return;
>  }
>
> +  // Check if we can derive the value with a linear transformation from the
> +  // table index.
> +  IntegerType *ValueIntType = dyn_cast<IntegerType>(ValueType);
> +  if (ValueIntType) {
>
>
> In LLVM style it's more common to do:
>
> if (IntegerType *ValueIntType = dyn_cast<...
>
> This saves a line and reduces the scope of ValueIntType to only the if
> statement, which is where it's used.
>
> +    bool LinearMappingPossible = true;
> +    APInt PrevVal;
> +    APInt DistToPrev;
> +    // Check if there is the same distance between two consecutive values.
> +    for (uint64_t I = 0; I < TableSize; ++I) {
> +      ConstantInt *ResConst = dyn_cast<ConstantInt>(TableContents[I]);
>
>
> ResConst feels a little backwards to me. Why not ConstRes, as in
> "constant result"?
>
> +      if (!ResConst) {
> +        // E.g. if it is an undef.
>
>
> I think undef is probably the only possible case here, right?
>
> For undef, it shouldn't matter what result we return (someone please
> correct me if I'm wrong here), so it would be nice if we could ignore
> undef values when trying to find the linear relationship. If it makes
> things more complicated we could do it in a follow-up patch.
>
> +        LinearMappingPossible = false;
> +        break;
> +      }
> +      APInt Val = ResConst->getValue();
> +      if (I > 0) {
> +        APInt Dist = Val - PrevVal;
> +        if (I == 1) {
> +          DistToPrev = Dist;
> +        } else if (Dist != DistToPrev) {
> +          LinearMappingPossible = false;
> +          break;
> +        }
> +      }
> +      PrevVal = Val;
> +    }
> +    if (LinearMappingPossible) {
> +      ValueOffset = TableContents[0];
> +      Multiplier = ConstantInt::get(ValueIntType, DistToPrev);
> +      Kind = LinearMapKind;
> +      ++NumLinearMaps;
> +      return;
> +    }
> +  }
> +
>  // If the type is integer and the table fits in a register, build a bitmap.
>  if (WouldFitInRegister(DL, TableSize, ValueType)) {
>    IntegerType *IT = cast<IntegerType>(ValueType);
> @@ -3802,6 +3848,14 @@
>                                             "switch.gep");
>      return Builder.CreateLoad(GEP, "switch.load");
>    }
> +    case LinearMapKind: {
> +      // Derive the result value from the input value.
> +      Value *Result = Builder.CreateIntCast(Index, Multiplier->getType(),
> false,
> +                                            "switch.idx.cast");
> +      Result = Builder.CreateMul(Result, Multiplier, "switch.idx.mult");
> +      Result = Builder.CreateAdd(Result, ValueOffset, "switch.offset");
>
>
> Should we handle the cases where the multiplier is 1 or offset is 0
> and generate simpler code for that? It seems like it would be easy to
> handle here, so we might as well do it and save some churn.
>
>
>