[PATCH] Simplify switch table lookup if a linear mapping is possible

Fri Nov 14 02:50:31 PST 2014

Sorry, I sent the previous mail too early :-)

Here is the new version:

> The way I would do this is I would first look at two values to figure
> out the offset and multiplier, and then iterate through the rest of
> the values and see if they fit.

Basically a good idea, but it would require to duplicate the checks, so IMO it would make things more complicated.

> even in the case of overflows 

Overflows are not a problem. I added a test case for it.

>  It will also make handling undefs easier, i.e. we can just ignore them.

I don't agree. This is only true if you assume that the first two values are not undefs.

I decided to not handle undefs because according to my findings, they are very seldom in switch tables (even in non-linear ones).
In the whole test-suite there is only a single case. IMO it's not worth the additional complexity it would add to the code.

Thanks,
Erik

> On 13 Nov 2014, at 22:29, Hans Wennborg <hans at chromium.org> wrote:
> 
> On Thu, Nov 13, 2014 at 12:20 PM, Erik Eckstein <eeckstein at apple.com <mailto:eeckstein at apple.com>> wrote:
>> Thanks for reviewing!
>> 
>>> Do you have any numbers on how often this fires in a Clang bootstrap
>>> or other benchmark?
>> 
>> 29 in the test-suite (including External).
> 
> Nice.
> 
>>> This enum is sorted in a "cheapest first" order, which matches the
>>> order that the SwitchLookupTable tries things. LinearMapKind should go
>>> before ArrayKind.
>> 
>> In this case I would put it even before BitMapKind.
> 
> Sounds good.
> 
>>> I think undef is probably the only possible case here, right?
>> 
>> I'm not sure.
>> 
>>> For undef, it shouldn't matter what result we return (someone please
>>> correct me if I'm wrong here), so it would be nice if we could ignore
>>> undef values when trying to find the linear relationship. If it makes
>>> things more complicated we could do it in a follow-up patch.
>> 
>> Let's do it in a follow-up patch. I'll add a TODO.
>> 
>>> Should we handle the cases where the multiplier is 1 or offset is 0
>>> and generate simpler code for that?
>> 
>> I didn't do it to keep the code simpler. But if there is a good reason to simplify it here and not in follow-up optimizations, it's not a problem.
> 
> Yeah, I think we should do it here.
> 
> 
> A few more thoughts about the patch:
> 
>> +  // Check if we can derive the value with a linear transformation from the
>> +  // table index.
>> +  IntegerType *ValueIntType = dyn_cast<IntegerType>(ValueType);
>> +  if (ValueIntType) {
>> +    bool LinearMappingPossible = true;
>> +    APInt PrevVal;
>> +    APInt DistToPrev;
>> +    // Check if there is the same distance between two consecutive values.
>> +    for (uint64_t I = 0; I < TableSize; ++I) {
>> +      ConstantInt *ResConst = dyn_cast<ConstantInt>(TableContents[I]);
>> +      if (!ResConst) {
>> +        // E.g. if it is an undef.
>> +        LinearMappingPossible = false;
>> +        break;
>> +      }
>> +      APInt Val = ResConst->getValue();
>> +      if (I > 0) {
> 
> Not a big deal, but checking "I != 0" seems like a more clear way to
> check that it's not the first iteration.
> 
> 
>> +        APInt Dist = Val - PrevVal;
>> +        if (I == 1) {
>> +          DistToPrev = Dist;
>> +        } else if (Dist != DistToPrev) {
>> +          LinearMappingPossible = false;
>> +          break;
>> +        }
>> +      }
>> +      PrevVal = Val;
>> +    }
> 
> The way I would do this is I would first look at two values to figure
> out the offset and multiplier, and then iterate through the rest of
> the values and see if they fit.
> 
> I think that might be a little easier to read, and it would also make
> it more obvious that all the results are correct even in the case of
> overflows etc., because we've checked the same computation that will
> happen at run-time. It will also make handling undefs easier, i.e. we
> can just ignore them.
> 
> 
> Thanks,
> Hans
> 
>>> On 13 Nov 2014, at 19:40, Hans Wennborg <hans at chromium.org <mailto:hans at chromium.org>> wrote:
>>> 
>>> On Thu, Nov 13, 2014 at 7:26 AM, Erik Eckstein <eeckstein at apple.com <mailto:eeckstein at apple.com>> wrote:
>>>> Hi Hans,
>>>> 
>>>> this patch adds a simple optimization for switch table lookup: It computes the output value directly with an (optional) mul and add if there is a linear mapping between index and output. Example:
>>> 
>>> Thanks! This has been on my mental todo list for switch transformation
>>> stuff for a while, so I'm very happy to review it.
>>> 
>>> Do you have any numbers on how often this fires in a Clang bootstrap
>>> or other benchmark?
>>> 
>>>> Index: lib/Transforms/Utils/SimplifyCFG.cpp
>>>> ===================================================================
>>>> --- lib/Transforms/Utils/SimplifyCFG.cpp (revision 221882)
>>>> +++ lib/Transforms/Utils/SimplifyCFG.cpp (working copy)
>>>> @@ -70,6 +70,7 @@
>>>>    cl::desc("Hoist conditional stores if an unconditional store precedes"));
>>>> 
>>>> STATISTIC(NumBitMaps, "Number of switch instructions turned into bitmaps");
>>>> +STATISTIC(NumLinearMaps, "Number of switch instructions turned into linear mapping");
>>>> STATISTIC(NumLookupTables, "Number of switch instructions turned into lookup tables");
>>>> STATISTIC(NumLookupTablesHoles, "Number of switch instructions turned into lookup tables (holes checked)");
>>>> STATISTIC(NumSinkCommons, "Number of common instructions sunk down to the end block");
>>>> @@ -3663,7 +3664,12 @@
>>>> 
>>>>      // The table is stored as an array of values. Values are retrieved by load
>>>>      // instructions from the table.
>>>> -      ArrayKind
>>>> +      ArrayKind,
>>>> +
>>>> +      // For tables where there is a linear relationship between table index
>>>> +      // and values. We calculate the result with a simple multiplication
>>>> +      // and addition instead of a table lookup.
>>>> +      LinearMapKind
>>> 
>>> This enum is sorted in a "cheapest first" order, which matches the
>>> order that the SwitchLookupTable tries things. LinearMapKind should go
>>> before ArrayKind.
>>> 
>>> 
>>>>    } Kind;
>>>> 
>>>>    // For SingleValueKind, this is the single value.
>>>> @@ -3673,6 +3679,10 @@
>>>>    ConstantInt *BitMap;
>>>>    IntegerType *BitMapElementTy;
>>>> 
>>>> +    // For LinearMapKind, these are the constants used to derive the value.
>>>> +    Constant *ValueOffset;
>>>> +    Constant *Multiplier;
>>> 
>>> Maybe LinearOffset and LinearMultiplier, to tie the names even tighter
>>> to the LinearMapKind?
>>> 
>>>> +
>>>>    // For ArrayKind, this is the array.
>>>>    GlobalVariable *Array;
>>>>  };
>>>> @@ -3685,7 +3695,7 @@
>>>>                                     Constant *DefaultValue,
>>>>                                     const DataLayout *DL)
>>>>    : SingleValue(nullptr), BitMap(nullptr), BitMapElementTy(nullptr),
>>>> -      Array(nullptr) {
>>>> +      ValueOffset(nullptr), Multiplier(nullptr), Array(nullptr) {
>>>>  assert(Values.size() && "Can't build lookup table without values!");
>>>>  assert(TableSize >= Values.size() && "Can't fit values in table!");
>>>> 
>>>> @@ -3730,6 +3740,42 @@
>>>>    return;
>>>>  }
>>>> 
>>>> +  // Check if we can derive the value with a linear transformation from the
>>>> +  // table index.
>>>> +  IntegerType *ValueIntType = dyn_cast<IntegerType>(ValueType);
>>>> +  if (ValueIntType) {
>>> 
>>> In LLVM style it's more common to do:
>>> 
>>> if (IntegerType *ValueIntType = dyn_cast<...
>>> 
>>> This saves a line and reduces the scope of ValueIntType to only the if
>>> statement, which is where it's used.
>>> 
>>>> +    bool LinearMappingPossible = true;
>>>> +    APInt PrevVal;
>>>> +    APInt DistToPrev;
>>>> +    // Check if there is the same distance between two consecutive values.
>>>> +    for (uint64_t I = 0; I < TableSize; ++I) {
>>>> +      ConstantInt *ResConst = dyn_cast<ConstantInt>(TableContents[I]);
>>> 
>>> ResConst feels a little backwards to me. Why not ConstRes, as in
>>> "constant result"?
>>> 
>>>> +      if (!ResConst) {
>>>> +        // E.g. if it is an undef.
>>> 
>>> I think undef is probably the only possible case here, right?
>>> 
>>> For undef, it shouldn't matter what result we return (someone please
>>> correct me if I'm wrong here), so it would be nice if we could ignore
>>> undef values when trying to find the linear relationship. If it makes
>>> things more complicated we could do it in a follow-up patch.
>>> 
>>>> +        LinearMappingPossible = false;
>>>> +        break;
>>>> +      }
>>>> +      APInt Val = ResConst->getValue();
>>>> +      if (I > 0) {
>>>> +        APInt Dist = Val - PrevVal;
>>>> +        if (I == 1) {
>>>> +          DistToPrev = Dist;
>>>> +        } else if (Dist != DistToPrev) {
>>>> +          LinearMappingPossible = false;
>>>> +          break;
>>>> +        }
>>>> +      }
>>>> +      PrevVal = Val;
>>>> +    }
>>>> +    if (LinearMappingPossible) {
>>>> +      ValueOffset = TableContents[0];
>>>> +      Multiplier = ConstantInt::get(ValueIntType, DistToPrev);
>>>> +      Kind = LinearMapKind;
>>>> +      ++NumLinearMaps;
>>>> +      return;
>>>> +    }
>>>> +  }
>>>> +
>>>>  // If the type is integer and the table fits in a register, build a bitmap.
>>>>  if (WouldFitInRegister(DL, TableSize, ValueType)) {
>>>>    IntegerType *IT = cast<IntegerType>(ValueType);
>>>> @@ -3802,6 +3848,14 @@
>>>>                                             "switch.gep");
>>>>      return Builder.CreateLoad(GEP, "switch.load");
>>>>    }
>>>> +    case LinearMapKind: {
>>>> +      // Derive the result value from the input value.
>>>> +      Value *Result = Builder.CreateIntCast(Index, Multiplier->getType(), false,
>>>> +                                            "switch.idx.cast");
>>>> +      Result = Builder.CreateMul(Result, Multiplier, "switch.idx.mult");
>>>> +      Result = Builder.CreateAdd(Result, ValueOffset, "switch.offset");
>>> 
>>> Should we handle the cases where the multiplier is 1 or offset is 0
>>> and generate simpler code for that? It seems like it would be easy to
>>> handle here, so we might as well do it and save some churn.

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20141114/fb07c6d4/attachment.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: llvm-switch-linearmap2.patch
Type: application/octet-stream
Size: 7287 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20141114/fb07c6d4/attachment.obj>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20141114/fb07c6d4/attachment-0001.html>