[PATCH] Simplify switch table lookup if a linear mapping is possible

Thu Nov 13 12:20:13 PST 2014

Thanks for reviewing!

> Do you have any numbers on how often this fires in a Clang bootstrap
> or other benchmark?

29 in the test-suite (including External).

> This enum is sorted in a "cheapest first" order, which matches the
> order that the SwitchLookupTable tries things. LinearMapKind should go
> before ArrayKind.

In this case I would put it even before BitMapKind.

> I think undef is probably the only possible case here, right?

I'm not sure.

> For undef, it shouldn't matter what result we return (someone please
> correct me if I'm wrong here), so it would be nice if we could ignore
> undef values when trying to find the linear relationship. If it makes
> things more complicated we could do it in a follow-up patch.

Let's do it in a follow-up patch. I'll add a TODO.

> Should we handle the cases where the multiplier is 1 or offset is 0
> and generate simpler code for that?

I didn't do it to keep the code simpler. But if there is a good reason to simplify it here and not in follow-up optimizations, it's not a problem.

I will send an updated patch shortly.

Erik

> On 13 Nov 2014, at 19:40, Hans Wennborg <hans at chromium.org> wrote:
> 
> On Thu, Nov 13, 2014 at 7:26 AM, Erik Eckstein <eeckstein at apple.com> wrote:
>> Hi Hans,
>> 
>> this patch adds a simple optimization for switch table lookup: It computes the output value directly with an (optional) mul and add if there is a linear mapping between index and output. Example:
> 
> Thanks! This has been on my mental todo list for switch transformation
> stuff for a while, so I'm very happy to review it.
> 
> Do you have any numbers on how often this fires in a Clang bootstrap
> or other benchmark?
> 
>> Index: lib/Transforms/Utils/SimplifyCFG.cpp
>> ===================================================================
>> --- lib/Transforms/Utils/SimplifyCFG.cpp (revision 221882)
>> +++ lib/Transforms/Utils/SimplifyCFG.cpp (working copy)
>> @@ -70,6 +70,7 @@
>>     cl::desc("Hoist conditional stores if an unconditional store precedes"));
>> 
>> STATISTIC(NumBitMaps, "Number of switch instructions turned into bitmaps");
>> +STATISTIC(NumLinearMaps, "Number of switch instructions turned into linear mapping");
>> STATISTIC(NumLookupTables, "Number of switch instructions turned into lookup tables");
>> STATISTIC(NumLookupTablesHoles, "Number of switch instructions turned into lookup tables (holes checked)");
>> STATISTIC(NumSinkCommons, "Number of common instructions sunk down to the end block");
>> @@ -3663,7 +3664,12 @@
>> 
>>       // The table is stored as an array of values. Values are retrieved by load
>>       // instructions from the table.
>> -      ArrayKind
>> +      ArrayKind,
>> +
>> +      // For tables where there is a linear relationship between table index
>> +      // and values. We calculate the result with a simple multiplication
>> +      // and addition instead of a table lookup.
>> +      LinearMapKind
> 
> This enum is sorted in a "cheapest first" order, which matches the
> order that the SwitchLookupTable tries things. LinearMapKind should go
> before ArrayKind.
> 
> 
>>     } Kind;
>> 
>>     // For SingleValueKind, this is the single value.
>> @@ -3673,6 +3679,10 @@
>>     ConstantInt *BitMap;
>>     IntegerType *BitMapElementTy;
>> 
>> +    // For LinearMapKind, these are the constants used to derive the value.
>> +    Constant *ValueOffset;
>> +    Constant *Multiplier;
> 
> Maybe LinearOffset and LinearMultiplier, to tie the names even tighter
> to the LinearMapKind?
> 
>> +
>>     // For ArrayKind, this is the array.
>>     GlobalVariable *Array;
>>   };
>> @@ -3685,7 +3695,7 @@
>>                                      Constant *DefaultValue,
>>                                      const DataLayout *DL)
>>     : SingleValue(nullptr), BitMap(nullptr), BitMapElementTy(nullptr),
>> -      Array(nullptr) {
>> +      ValueOffset(nullptr), Multiplier(nullptr), Array(nullptr) {
>>   assert(Values.size() && "Can't build lookup table without values!");
>>   assert(TableSize >= Values.size() && "Can't fit values in table!");
>> 
>> @@ -3730,6 +3740,42 @@
>>     return;
>>   }
>> 
>> +  // Check if we can derive the value with a linear transformation from the
>> +  // table index.
>> +  IntegerType *ValueIntType = dyn_cast<IntegerType>(ValueType);
>> +  if (ValueIntType) {
> 
> In LLVM style it's more common to do:
> 
>  if (IntegerType *ValueIntType = dyn_cast<...
> 
> This saves a line and reduces the scope of ValueIntType to only the if
> statement, which is where it's used.
> 
>> +    bool LinearMappingPossible = true;
>> +    APInt PrevVal;
>> +    APInt DistToPrev;
>> +    // Check if there is the same distance between two consecutive values.
>> +    for (uint64_t I = 0; I < TableSize; ++I) {
>> +      ConstantInt *ResConst = dyn_cast<ConstantInt>(TableContents[I]);
> 
> ResConst feels a little backwards to me. Why not ConstRes, as in
> "constant result"?
> 
>> +      if (!ResConst) {
>> +        // E.g. if it is an undef.
> 
> I think undef is probably the only possible case here, right?
> 
> For undef, it shouldn't matter what result we return (someone please
> correct me if I'm wrong here), so it would be nice if we could ignore
> undef values when trying to find the linear relationship. If it makes
> things more complicated we could do it in a follow-up patch.
> 
>> +        LinearMappingPossible = false;
>> +        break;
>> +      }
>> +      APInt Val = ResConst->getValue();
>> +      if (I > 0) {
>> +        APInt Dist = Val - PrevVal;
>> +        if (I == 1) {
>> +          DistToPrev = Dist;
>> +        } else if (Dist != DistToPrev) {
>> +          LinearMappingPossible = false;
>> +          break;
>> +        }
>> +      }
>> +      PrevVal = Val;
>> +    }
>> +    if (LinearMappingPossible) {
>> +      ValueOffset = TableContents[0];
>> +      Multiplier = ConstantInt::get(ValueIntType, DistToPrev);
>> +      Kind = LinearMapKind;
>> +      ++NumLinearMaps;
>> +      return;
>> +    }
>> +  }
>> +
>>   // If the type is integer and the table fits in a register, build a bitmap.
>>   if (WouldFitInRegister(DL, TableSize, ValueType)) {
>>     IntegerType *IT = cast<IntegerType>(ValueType);
>> @@ -3802,6 +3848,14 @@
>>                                              "switch.gep");
>>       return Builder.CreateLoad(GEP, "switch.load");
>>     }
>> +    case LinearMapKind: {
>> +      // Derive the result value from the input value.
>> +      Value *Result = Builder.CreateIntCast(Index, Multiplier->getType(), false,
>> +                                            "switch.idx.cast");
>> +      Result = Builder.CreateMul(Result, Multiplier, "switch.idx.mult");
>> +      Result = Builder.CreateAdd(Result, ValueOffset, "switch.offset");
> 
> Should we handle the cases where the multiplier is 1 or offset is 0
> and generate simpler code for that? It seems like it would be easy to
> handle here, so we might as well do it and save some churn.