[llvm-commits] [llvm] r163302 - in /llvm/trunk: lib/Transforms/Utils/SimplifyCFG.cpp test/Transforms/SimplifyCFG/switch_create.ll test/Transforms/SimplifyCFG/switch_to_lookup_table.ll

Thu Sep 13 02:22:54 PDT 2012

Hi Hans, some further thoughts.

>> +// FIXME: Maybe ConvertLoadToSwitch?
>> +bool CodeGenPrepare::LoadToSwitch(LoadInst *LI) {
>> +  GetElementPtrInst *GEP = dyn_cast<GetElementPtrInst>(LI->getPointerOperand());
>> +  if (!GEP || !GEP->isInBounds() || GEP->getPointerAddressSpace())
>> +    return false;
>
> rather than trying to decode GEPs like this, there's another possible approach.
> First, determine what the load is from using, say, stripInBoundsOffsets.  As
> below you need to check that this is a constant global G with initial value
> something you can work with.  You should check hasDefinitiveInitializer and
> not hasInitializer by the way.
>
> Let A be the address being loaded from, i.e. LI->getOperand(0).
>
> Let's suppose the initial value C is an array [3 x i32], say { 1, 2, 3 }.
> Basically the idea is to output something like this:
>
> if (A == &C[0])
>    val = 1;
> else if (A == &C[1])
>    val = 2;
> else if (A == &C[2])
>    val = 3;
> else
>    val = undef;
>
> I.e. the idea is that you don't need to care how A is computed from C; all
> that matters is which element is being accessed.
>
> For the above code to be correct though there are some things to check:
>
> (0) the load should be loading an i32 (the element type of C).
>
> (1) the load should be 4 byte (i32) aligned, otherwise it might be loading
> part of one array element and part of the next.  The global G had better
> also be aligned at least this much otherwise the alignment of the load
> isn't telling you anything useful!

You can actually do the transform even if things aren't aligned.  Suppose you
only know that the addresses involved are 1 byte aligned.  Then you can do:
   if (A == G)
     val = constant you get when you load 4 bytes starting from byte 0;
   else if (A == G + 1)
     val = constant you get when you load 4 bytes starting from byte 1;
   else if ...
      ...
   else if (A == G + 8)
     val = constant you get when you load the last 4 bytes;
   else
     val = undef;

If you know that everything is two byte aligned then you can skip every second
case.  If you know that they are four byte aligned, then you only need to keep
every fourth case, and so on.
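
To make the case enumeration concrete, here is a minimal sketch in plain C++
(no LLVM APIs).  The names enumerateLoadCases and bytesOfExampleArray are
hypothetical, and the sketch assumes the host byte order matches the target's.
It lists the (byte offset, constant) pairs that a 4-byte load could produce,
stepping by whatever alignment is known.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <utility>
#include <vector>

// Hypothetical helper: for a 4-byte load from a constant initializer whose
// raw bytes are InitBytes, list every (byte offset, loaded constant) pair
// that is possible when the address is only known to be a multiple of
// KnownAlign.  Assumption: host and target use the same byte order.
std::vector<std::pair<std::size_t, std::uint32_t>>
enumerateLoadCases(const std::vector<std::uint8_t> &InitBytes,
                   std::size_t KnownAlign) {
  assert(KnownAlign != 0 && "alignment must be nonzero");
  const std::size_t LoadSize = sizeof(std::uint32_t);
  std::vector<std::pair<std::size_t, std::uint32_t>> Cases;
  // Only offsets where the whole load stays inside the initializer matter;
  // anything else is an out-of-bounds load, i.e. the "val = undef" branch.
  for (std::size_t Off = 0; Off + LoadSize <= InitBytes.size();
       Off += KnownAlign) {
    std::uint32_t Val;
    std::memcpy(&Val, InitBytes.data() + Off, LoadSize);
    Cases.emplace_back(Off, Val);
  }
  return Cases;
}

// Raw bytes of the example array { 1, 2, 3 } of 32-bit integers.
std::vector<std::uint8_t> bytesOfExampleArray() {
  const std::uint32_t C[3] = {1, 2, 3};
  std::vector<std::uint8_t> Bytes(sizeof(C));
  std::memcpy(Bytes.data(), C, sizeof(C));
  return Bytes;
}
```

For the 12-byte example array this gives nine cases with 1-byte alignment
(offsets 0 through 8), five with 2-byte alignment, and three with 4-byte
alignment.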

>
> At this point you know that the load must be just reading one of the elements of
> the array C, or it is outside of C and thus the value of the load is undefined.
>
> Then you can compute the addresses of the elements of C using a ConstantExpr
> GEP on the global G, and output a series of compares with the address A as
> above.
>
> The downside of this approach is that you can't output a switch because the
> addresses &C[0] etc aren't explicit numbers.

Actually, you can output a switch.  You can compute Idx = (A - G) / 4, the index
into the array, and then switch on Idx.  The division by 4 (size of i32) can be
a right shift because the addresses are aligned.  In the usual case in which A
= G + 4 * I (coming from a gep on G with index I) the DAG combiner should
simplify the computation of Idx and realize that it is the same as I.
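
A rough source-level picture of the switch form, as a sketch only: the real
transform would build IR, and loadViaSwitch is a hypothetical name.  It assumes
pointers convert to integers in the usual flat way, and it returns 0 on the
default path where the IR would simply use undef.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for the constant global G with initializer { 1, 2, 3 }.
static const std::uint32_t G[3] = {1, 2, 3};

// Hypothetical source-level equivalent of the transformed load: recover the
// element index from the address, then switch on it instead of reading memory.
std::uint32_t loadViaSwitch(const std::uint32_t *A) {
  // Idx = (A - G) / 4.  Both addresses are 4-byte aligned, so the division
  // is exact and can be done as a right shift by 2.
  std::uintptr_t Delta = reinterpret_cast<std::uintptr_t>(A) -
                         reinterpret_cast<std::uintptr_t>(G);
  std::uintptr_t Idx = Delta >> 2;
  switch (Idx) {
  case 0:
    return 1;
  case 1:
    return 2;
  case 2:
    return 3;
  default:
    // Out of bounds: the IR version would yield undef, so any value is fine.
    return 0;
  }
}
```

Calling loadViaSwitch(&G[I]) for I in 0..2 returns the same value as reading
G[I] directly.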

In the more general case I mentioned above, where things are not known to be
aligned as well as you would like, you can still use a switch (with more
cases).  You have to form Idx = (A - G) / known_alignment, and you get more
cases because you are indexing into sub-parts of the array (those separated by
known_alignment bytes).
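
How many cases that gives can be worked out directly.  The sketch below uses
hypothetical helper names and assumes that G itself is aligned to
known_alignment, so that A - G is always a multiple of it.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: number of in-bounds switch cases for a load of
// LoadBytes bytes from an array of ArrayBytes bytes, when addresses are only
// known to be multiples of KnownAlign.
std::size_t numSwitchCases(std::size_t ArrayBytes, std::size_t LoadBytes,
                           std::size_t KnownAlign) {
  assert(KnownAlign != 0 && LoadBytes <= ArrayBytes);
  // Valid offsets are 0, KnownAlign, 2 * KnownAlign, ... up to the last one
  // at which the whole load still fits inside the array.
  return (ArrayBytes - LoadBytes) / KnownAlign + 1;
}

// Byte offset into the array that switch case Idx corresponds to.
std::size_t byteOffsetOfCase(std::size_t Idx, std::size_t KnownAlign) {
  return Idx * KnownAlign;
}
```

For the [3 x i32] example (12 bytes, 4-byte loads) this gives 9, 5 and 3 cases
for known alignments of 1, 2 and 4 bytes respectively.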

>
> The upside is that it should work in great generality.

And now in even more generality!

Ciao, Duncan.

>
> Ciao, Duncan.
>
>> +
>> +  if (GEP->getNumIndices() != 2)
>> +    return false;
>> +  Value *FirstIndex = GEP->idx_begin()[0];
>> +  ConstantInt *FirstIndexInt = dyn_cast<ConstantInt>(FirstIndex);
>> +  if (!FirstIndexInt || !FirstIndexInt->isZero())
>> +    return false;
>> +
>> +  Value *SecondIndex = GEP->idx_begin()[1];
>> +  assert(isa<IntegerType>(SecondIndex->getType()));
>> +  IntegerType *SecondIndexIntTy = cast<IntegerType>(SecondIndex->getType());
>> +
>> +  GlobalVariable *GV = dyn_cast<GlobalVariable>(GEP->getPointerOperand());
>> +  if (!GV || !GV->isConstant() || !GV->hasInitializer())
>> +    return false;
>> +
>> +  // FIXME: This could also be a ConstantArray.
>> +  ConstantDataArray *Arr = dyn_cast<ConstantDataArray>(GV->getInitializer());
>> +  if (!Arr)
>> +    return false;
>> +
>> +  assert(Arr->getElementType() == LI->getType());
>> +
>> +  BasicBlock *OriginalBB = LI->getParent();
>> +  BasicBlock *PostSwitchBB = OriginalBB->splitBasicBlock(LI);
>> +
>> +  IRBuilder<> Builder(PostSwitchBB->begin());
>> +  PHINode *PHI = Builder.CreatePHI(LI->getType(), Arr->getNumElements());
>> +
>> +  Builder.SetInsertPoint(OriginalBB->getTerminator());
>> +  SwitchInst *Switch = Builder.CreateSwitch(SecondIndex, PostSwitchBB,
>> +                                            Arr->getNumElements() - 1);
>> +  OriginalBB->getTerminator()->eraseFromParent();
>> +
>> +  // FIXME: We should be more clever in choosing the default case.
>> +  PHI->addIncoming(Arr->getAggregateElement(0U), OriginalBB);
>> +
>> +  for (unsigned I = 1; I < Arr->getNumElements(); ++I) {
>> +    BasicBlock *BB = BasicBlock::Create(PostSwitchBB->getContext(),
>> +                                        "lookup.bb",
>> +                                        PostSwitchBB->getParent(),
>> +                                        PostSwitchBB);
>> +    Switch->addCase(ConstantInt::get(SecondIndexIntTy, I), BB);
>> +    Builder.SetInsertPoint(BB);
>> +    Builder.CreateBr(PostSwitchBB);
>> +    PHI->addIncoming(Arr->getAggregateElement(I), BB);
>> +  }
>> +
>> +  LI->replaceAllUsesWith(PHI);
>> +  LI->eraseFromParent();
>> +
>> +  if (!GEP->hasNUsesOrMore(1))
>> +    GEP->eraseFromParent();
>> +
>> +  CurInstIterator = Switch;
>> +  return true;
>> +}
>
>