[llvm-commits] PR12385 PATCH: Missed constant propagation in loops

Stepan Dyatkovskiy STPWORLD at narod.ru
Thu May 3 00:21:26 PDT 2012


Hello Evan.
>
> What do you mean by this?
>
>>>>  By now whis optimization is integrated into SCCP. But may be it is
>>>>   better to make separated pass. What do you think?
>
> Do you mean sccp already does this?

Sorry. It was a copypaste typo. Now all works in instcombine. Forget about SCCP.

> Other questions:
> 1. What's the compile time impact?

Let me few hours. I'll present compile time impact and tests for you.

> 2. How frequently does this fire when you build and run the llvm test suite?
Hm.. Now I'm not sure what you mean. At least it allows to optimize cases like this:
int i;

void foo (void);
void bar (void)
{
  int j;
  i = 0;
  for (j = 0; j < 10000; j++)
    if (i)
      foo ();
}

>
> I'm somewhat skeptical that instcombine is the right home for this. However, I don't have suggestions as for where it belong.
Currently I found absolutely the same optimization in instcombine, but current version is restricted by a single BB. It doesn't combine instructions from different BBs.
>
> Dan, can you review this? Comments?
>
> Evan
>
> On Apr 28, 2012, at 12:38 PM, Stepan Dyatkovskiy <STPWORLD at narod.ru> wrote:
>
>>  ping.
>>
>>  -Stepan
>>
>>  26.04.2012, 11:37, "Stepan Dyatkovskiy" <stpworld at narod.ru>:
>>>  ping
>>>  Stepan Dyatkovskiy wrote:
>>>>   Hi all. I solved PR12385 issue: Missed constant propagation in loops.
>>>>   LLVM and clang can't optimize next case:
>>>>   int i;
>>>>   void bar();
>>>>   int main(){
>>>>   i = 0;
>>>>   int j;
>>>>   for(j = 0; j < 10000; ++j) {
>>>>   if (i)
>>>>   bar();
>>>>   i = 0;
>>>>   }
>>>>   return 0;
>>>>   }
>>>>
>>>>   Ideally it should be folded to "int main() { return 0; }"
>>>>
>>>>   I improved Instruction Combining pass, so now it is possible to combine
>>>>   store + load instructions from different basic blocks.
>>>>
>>>>   The concept is next.
>>>>   We need to scan BB graph. And if we found "X = load Addr" instruction
>>>>   predecessed with the same "store C, Addr" instructions in all its
>>>>   predecessors, we can replace the "load" with the stored value ("C").
>>>>
>>>>   [store C, Addr] [store C, Addr] [store C, Addr]
>>>>   \ | /
>>>>   ------- | -------
>>>>   \ | /
>>>>   [X = load Addr] ==> X := C
>>>>
>>>>   There are some restrictions:
>>>>   1. Store/Load address should be non-volatile and it should be constant.
>>>>   2. The parent of "load" may be predecessed with some BB's that are still
>>>>   was not scanned (loops and BB's that marked as non-executable). In that
>>>>   case we should mark this load's as waiting until all predecessors will
>>>>   resolved; and after all executable edges and requested loops will
>>>>   scanned we can determine possibility of replacement.
>>>>   3. If some predecessor chain doesn't contains any "store" or it contains
>>>>   "store" with the same address and different value, the "load" inst could
>>>>   NOT be converted to constant then:
>>>>
>>>>   [store A, Addr] [store C, Addr]
>>>>   \ /
>>>>   ------- -------
>>>>   \ /
>>>>   [X = load Addr] ==> could NOT be replaced with a constant.
>>>>
>>>>   Data structure:
>>>>   For tracking we will use TrackedStoreInsts map, that has format:
>>>>   map<BB, StoreInstsInfo>
>>>>   where
>>>>   StoreInstsInfo is struct {
>>>>   set<BB> PendingBB;
>>>>   map<Address, StoreInst> Stores;
>>>>   }
>>>>
>>>>   Algorithm:
>>>>
>>>>   Go over all executable edges, collect stores info and propogate it
>>>>   through the graph:
>>>>
>>>>   1. Go over executable edges
>>>>   and for each BB do the next:
>>>>   2. If BB contains predecessors that was not scanned before, we can't
>>>>   determine possibility of replacement "load" with "constant" in current
>>>>   BB, since we need to scan all predecessors before. So, add this
>>>>   predecessor to the TrackedStoreInsts[BB].PendingBB set.
>>>>   3. Scan the BB instructions.
>>>>   3.1. If we found "store" instruction and it uses constant non-volatile
>>>>   address, and it stores constant value, save it in
>>>>   TrackedStoreInsts[BB].Stores.
>>>>   3.2. If we found "X = load Addr".
>>>>   3.2.1. If TrackedStoreInsts[BB].Stores doesn't contains "store C, Addr"
>>>>   ignore it.
>>>>   3.2.2. If TrackedStoreInsts[BB].PendingBBs is not empty(), scan each
>>>>   PendingBB, and if it was already tracked - remove it
>>>>   from PendingBBs.
>>>>   3.2.3. If TrackedStoreInsts[BB].Stores contains "store Addr, C"
>>>>   3.2.3.1. If TrackedStoreInsts[BB].PendingBBs is empty, then replace
>>>>   "load" with "C".
>>>>   3.2.3.2. If rackedStoreInsts[BB].PendingBBs is not empty, mark load as
>>>>   still unresolved.
>>>>   4. If we reached TerminatorInst, propogate stores info for each its
>>>>   successor then:
>>>>   4.1. Add all TrackedStoreInsts[BB].PendingBBs to the
>>>>   TrackedStoreInsts[Successor].PendingBBs.
>>>>   4.2. Perform set intersection of TrackedStoreInsts[BB].Stores and
>>>>   TrackedStoreInsts[Successor].Stores and save it in the last one.
>>>>   5. After all executable edges was scanned, resolve all remained "load"s:
>>>>   get load's BB, and if we found corresponent "store" in
>>>>   TrackedStoreInsts[BB].Stores, then it is possible to replace load with
>>>>   constant, otherwise zap it.
>>>>
>>>>   P.S.: By now whis optimization is integrated into SCCP. But may be it is
>>>>   better to make separated pass. What do you think?
>>>>
>>>>   -Stepan
>>  _______________________________________________
>>  llvm-commits mailing list
>>  llvm-commits at cs.uiuc.edu
>>  http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits



More information about the llvm-commits mailing list