[PATCH] [SeparateConstOffsetFromGEP] Fix bugs and improve the current SeparateConstOffsetFromGEP

Mon Oct 27 00:06:43 PDT 2014

Hi Jingyue,
> First round. 
> 
> I'll fix the two correctness bugs separately. After that, please rebase your diff against the new version. 
Thanks very much for the code review and the commits for bugs.
> I am very curious what CFGSimplificationPass does to generate a = 1 + 2. Does a = 1 + 2 appear at the end of -O3? I don't think leaving that clear misoptimization at the end of -O3 is reasonable. 
{F224254}
I've attached a simple test case "simple.c" can reproduce it as following:
    clang -O3 -S -emit-llvm simple.c
    llc -march=aarch64 < simple.ll -print-after-all (or llc -march=nvptx < simple.ll -print-after-all)
For AArch64 backend We can find such IRs after CFGSimplification pass:
    %inc.1.1 = add nsw i32 2, 1
    ...
    %inc.2.1 = add nsw i32 %inc.1.1, 1
    ...
    %inc.1.2 = add nsw i32 %inc.1.1, 2
Similar IRs can be found in NVPTX backend even though it doesn't call CFGSimplification pass.

Besides that, the test case for OR also requires us to find constants in both operands. InstCombine will never optimize test case like "((a<<2) + 12) | 1".

>Why do you prefer "unsigned &ChainIndex" to "unsigned ChainIndex"? The latter seems to use more space because unsigned& takes the size of a pointer while unsigned only takes 32 bits. Also, the original logic seems clearer.
This is because I need to know the index of the next node. Previously, it find one constant throw a single path, so when it find one constant, the index is 0. But to find constants in both operands of a binary instruction, it needs to traverse throw a tree (As my example in the comments), which is a little different from previous single path. When we find a constant in an operand of a binary instruction, the current index may be not 0, and current single path is end. Then we can go back to the other operand and traverse another single path to find another constant. See the code
    if (BO->getOperand(1) == UserChain[ChainIndex - 1])
    if (ChainIndex != 0 && BO->getOperand(0) == UserChain[ChainIndex - 1])
Both operands are checked and visited. To make sure all nodes have been visited, an assertion for "ChainIndex == 0" is added.

A new patch has been added after modification according to the comments and rebase. Review please.

Thanks,
-Hao

================
Comment at: lib/Transforms/Scalar/SeparateConstOffsetFromGEP.cpp:129
@@ -130,2 +128,3 @@
   /// Looks for a constant offset without extracting it. The meaning of the
   /// arguments and the return value are the same as Extract.
+  static int64_t Find(Value *Idx, const DataLayout *DL, GetElementPtrInst *GEP,
----------------
jingyue wrote:
> Not the same as Extract any more. 
I just think we don't need to return an int64_t, which is only used to check whether the extraction is successful. As we always need to return a NewIdx, we can check whether the NewIdx is nullptr.

================
Comment at: lib/Transforms/Scalar/SeparateConstOffsetFromGEP.cpp:739
@@ -685,1 +738,3 @@
   GEP->setIsInBounds(false);
+  // No need to create another GEP as the accumulative byte offset is 0.
+  if (AccumulativeByteOffset == 0)
----------------
jingyue wrote:
> Not quite true. The sum of offsets being 0 doesn't mean every offset is zero. As long as there's a non-zero offset, it's worth creating a new GEP. 
Sorry I still not quite understand. This is after setting new operands stage, which means the old GEP has already been replaced by new indices. The following code is only to create the second GEP with one index whose value is 0. So I still think a GEP with index 0 is unnecessary.

http://reviews.llvm.org/D5863