[llvm-commits] [Patch] Use bitmaps for switch lookup tables that fit in registers

Sean Silva silvas at purdue.edu
Thu Sep 20 10:01:32 PDT 2012


+  switch (Kind) {
+    case SingleValueKind:
+      return SingleValue;
+    case ArrayKind: {
+      Value *GEPIndices[] = { Builder.getInt32(0), Index };
+      Value *GEP = Builder.CreateInBoundsGEP(Array, GEPIndices,
+                                             "switch.gep");
+      return Builder.CreateLoad(GEP, "switch.load");
+    }
+  }
+  llvm_unreachable("Unknown lookup table kind!");

I may be wrong, but my impression is that for fully-determined
switches the unreachable typically gets put in a default case.

   // If all values in the table are equal, this is that value.
-  Constant *SameResult = Results.begin()->second;
+  SingleValue = Values.begin()->second;

I know this is unrelated to this patch, which is just a bit of
refactoring, but is it really necessary to single out the 1-case
switch like this? Will a 1-case lookup table not get optimized away
somewhere else? It just seems weird to single it out like this:
wouldn't it be equally reasonable to single out when there are 2 cases
(convert to `select`)? IIRC in the original thread for the inception
of this switch-to-lookup-table, someone said that maybe we need to
explicitly model this with a "switchselect", the same way that we have
"if" and "select". I think that should be investigated further, so
that these constructs get optimized well.


+      else if (IndexTy->getBitWidth() > MapTy->getBitWidth())
+        ShiftAmt = Builder.CreateTrunc(ShiftAmt, MapTy, "switch.trunc");

Please comment why it is safe to truncate the index here.

+      // Multiply the shift amount by the element width.
+      ShiftAmt = Builder.CreateMul(ShiftAmt,
+                      ConstantInt::get(MapTy, BitMapElementTy->getBitWidth()),
+                                   "switch.shiftamt");

If the bitmap type has "room to spare" for widening the element types,
it will probably be profitable to pad them to a power-of-2 size so
that this multiply will become a shift. Just something to think about
for a future patch. I doubt that it's a big deal, but maybe

+bool SwitchLookupTable::WouldFitInRegister(const TargetData *TD,
+                                           uint64_t TableSize,
+                                           const Type *ElementType) {
+  if (!TD)
+    return false;
+  const IntegerType *IT = dyn_cast<IntegerType>(ElementType);
+  if (!IT)
+    return false;
+  return TD->fitsInLegalInteger(TableSize * IT->getBitWidth());
+}

You have a FIXME about possibly narrowing the types. I think that that
FIXME should be here. Basically you can take the minimum of the bits
needed to represent each value. I'm pretty sure that in practically
all cases, that minimum is going to be smaller than the actual integer
type of the elements, possibly by a lot.

In fact, computing that minimum is sufficiently simple that I think it
could be included in this patch as well (it might require some code
reorganization though, but I think that computing this minimum is a
sufficiently important thing to do that you might as well reorganize
now).

--Sean Silva

On Thu, Sep 20, 2012 at 5:31 AM, Hans Wennborg <hans at chromium.org> wrote:
> Thanks for the comments! New version attached, split into two patches.
>
>  - Hans
>
> On Thu, Sep 20, 2012 at 2:08 AM, Sean Silva <silvas at purdue.edu> wrote:
>>> The patch also refactors the code to put the table-building
>>> functionality in a separate class. I'm happy to break that out into a
>>> separate patch if preferred.
>>
>> Please. It's extremely hard to pick out just the new bitmask-table
>> building code or just the refactoring to a class. It's OK to send two
>> patches in an email (just make sure they have numbered prefixes so
>> that it is unambiguous in what order to apply them).
>
> Done.
>
>> A couple comments:
>>
>> +            const SmallVector<std::pair<ConstantInt*, Constant*>, 4>& Results,
>>
>> With a name like "Results" this could easily be confused (read: it
>> confused me) with being an out parameter.
>
> RIght, let's call it Values instead.
>
>> +/// SwitchLookupTable - This class represents a lookup table that can
>> be used to
>> +/// replace a switch.
>> +class SwitchLookupTable {
>>
>> Please wrap this class in an anonymous namespace as per the coding
>> standards <http://llvm.org/docs/CodingStandards.html#anonymous-namespaces>.
>
> Done.
>
>> +    std::vector<Constant*> TableContents(TableSize);
>>
>> Why vector and not SmallVector?
>
> You're right, SmallVector will work fine here.
>
>> +    SwitchLookupTable Table;
>> +    Table.Init(Mod, TableSize, MinCaseVal, ResultLists[PHI],
>> +               DefaultResults[PHI], TD);
>> [...]
>> +    Value *Result = Table.BuildLookup(TableIndex, Builder);
>>
>> What's the point of having Init? It just seems like an error-prone
>> constructor that might not be called. Maybe better to factor it as one
>> class that "analyzes" the switch, and another that does the building;
>> that way, the type system enforces things that the current approach
>> leaves up to the programmer.
>
> Right, the Init function looks pretty unnecessary here. Removed.
>
>
>
>> On Wed, Sep 19, 2012 at 1:32 PM, Hans Wennborg <hans at chromium.org> wrote:
>>> Hi all,
>>>
>>> The attached batch updates the switch-to-lookup table transformation
>>> in SimplifyCFG to put the lookup tables in a bitmap when they would
>>> fit in a register on the target. This saves some space, and it also
>>> allows for building tables that would otherwise be too sparse.
>>>
>>> One interesting case that this hits is example 7 from
>>> http://blog.regehr.org/archives/320. We currently generate good code
>>> for this when lowering the switch to the selection DAG: we build a
>>> bitmask to decide whether to jump to one block or the other. My patch
>>> will result in the same bitmask, but it removes the need for the jump,
>>> as the return value can just be retrieved from the mask.
>>>
>>> This was part of my original lookup table patch [1], but it was
>>> suggested in review that we should do it separately.
>>>
>>> I tried this on a bootstrap build of Clang, and it reduced the binary
>>> size with 8328 bytes (0.03%).
>>>
>>> The patch also refactors the code to put the table-building
>>> functionality in a separate class. I'm happy to break that out into a
>>> separate patch if preferred.
>>>
>>> Please take a look.
>>>
>>> Thanks,
>>> Hans
>>>
>>>  [1]. http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20120903/149605.html



More information about the llvm-commits mailing list