[llvm-dev] [IndVarSimplify] Narrow IV's are not eliminated resulting in inefficient code
Oleg Ranevskyy via llvm-dev
llvm-dev at lists.llvm.org
Wed Apr 20 16:10:00 PDT 2016
Hi,
Could you please help with an IndVarSimplify / SCEV problem I ran into
with the latest LLVM?
IndVarSimplify sometimes fails to eliminate narrow IVs even when
elimination is possible. This can affect other LLVM passes and result in
inefficient code. The reproducer, 'indvar_test.cpp', is attached.
The problem is with the second 'for' loop that accesses array elements with
different indexes on each iteration.
The latest LLVM fails to reuse array element values from previous
iterations and generates an unnecessary GEP. The generated IR is shown in
the attached file 'bad.ir'.
This happens because IndVarSimplify fails to eliminate '%idxprom7' and
'%idxprom10'.
The clang command line we use:
clang++ -mllvm -debug -S -emit-llvm -O3 --target=aarch64-linux-elf indvar_test.cpp -o bad.ir
I found that 'WidenIV::widenIVUse' (IndVarSimplify.cpp) may fail to widen
narrow IV uses.
When the function gets a NarrowUse such as '{(-2 +
%inc.lcssa),+,1}<nsw><%for.body3>', it first tries to get a wide recurrence
for it via the 'getWideRecurrence' call.
'getWideRecurrence' returns a recurrence like this: '{(sext i32 (-2 +
%inc.lcssa) to i64),+,1}<nsw><%for.body3>', which is fine by itself.
Then 'cloneIVUser' generates a wide use operation. SCEV evaluates the
generated wide use to '{(-2 + (sext i32 %inc.lcssa to
i64))<nsw>,+,1}<nsw><%for.body3>', which differs from the
'getWideRecurrence' result (note the position of -2). Because the two
expressions do not compare equal, the widening attempt is abandoned and
nullptr is returned.
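To illustrate why the position of -2 matters, here is a minimal standalone
C++ sketch. It is not LLVM code: 'sext', 'addWrap', 'extendAfterAdd' and
'extendBeforeAdd' are hypothetical helpers that model the i32 and i64
arithmetic with fixed-width integers. The two forms agree whenever the
narrow add does not overflow, and differ when it wraps.

```cpp
#include <cstdint>

// Models 'sext i32 %x to i64'.
constexpr int64_t sext(int32_t X) { return static_cast<int64_t>(X); }

// Models a wrapping i32 'add' (no nsw flag). The sum is computed in unsigned
// arithmetic to avoid signed-overflow UB; converting back to int32_t keeps
// the low 32 bits on mainstream compilers (guaranteed since C++20).
constexpr int32_t addWrap(int32_t A, int32_t B) {
  return static_cast<int32_t>(static_cast<uint32_t>(A) +
                              static_cast<uint32_t>(B));
}

// Shape of the 'getWideRecurrence' start value: sext(-2 + x), add done in i32.
constexpr int64_t extendAfterAdd(int32_t X) { return sext(addWrap(-2, X)); }

// Shape of the cloned wide use: -2 + sext(x), add done in i64.
constexpr int64_t extendBeforeAdd(int32_t X) { return int64_t{-2} + sext(X); }

// No narrow overflow: both forms produce the same value.
static_assert(extendAfterAdd(10) == extendBeforeAdd(10), "forms agree");

// Narrow overflow: the i32 add wraps, so the two forms diverge.
static_assert(extendAfterAdd(INT32_MIN) != extendBeforeAdd(INT32_MIN),
              "forms differ when the i32 add wraps");
```

So rewriting one form into the other is only valid when the narrow add is
known not to overflow.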
I attached a test patch, 'indvar.patch'. It is not correct in all cases,
but it fixes the specific 'indvar_test.cpp' scenario and demonstrates the
efficient code that could have been generated ('good.ir').
It transforms expressions like '(sext i32 (-2 + %inc.lcssa) to i64)' into
'-2 + (sext i32 %inc.lcssa to i64)' so that the expression comparison
succeeds. The narrow IVs are then eliminated, as the '-mllvm -debug' output
shows.
The problem with the patch is that it uses the wrong extension logic for
the '-2' operand. It must be sign- or zero-extended depending on the context.
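As a small illustration of why the choice of extension matters for the
constant, the following standalone C++ sketch (again not LLVM code; 'sext'
and 'zext' are hypothetical helpers modelling the i32-to-i64 casts) shows
that the two extensions of -2 give different i64 values, and that mixing
them changes the result.

```cpp
#include <cstdint>

// Models 'sext i32 %x to i64': the sign bit is replicated.
constexpr int64_t sext(int32_t X) { return static_cast<int64_t>(X); }

// Models 'zext i32 %x to i64': the upper 32 bits are filled with zeros.
constexpr int64_t zext(int32_t X) {
  return static_cast<int64_t>(static_cast<uint32_t>(X));
}

// The same i32 constant widens to two different i64 values.
static_assert(sext(-2) == -2, "sign extension keeps the value -2");
static_assert(zext(-2) == 4294967294LL, "zero extension yields 2^32 - 2");

// Matching extensions: hoisting the sext out of '-2 + 10' preserves the value.
static_assert(sext(-2) + sext(10) == sext(-2 + 10), "matching extension");

// Mismatched extensions: a zero-extended -2 combined with a sign-extended
// operand no longer equals the sign-extended narrow sum.
static_assert(zext(-2) + sext(10) != sext(-2 + 10), "mismatched extension");
```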
Could you please confirm the problem and suggest how it might be fixed
properly?
Thank you.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: bad.ir
Type: application/octet-stream
Size: 2791 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20160421/c1d1ee9f/attachment.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: good.ir
Type: application/octet-stream
Size: 1929 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20160421/c1d1ee9f/attachment-0001.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: indvar.patch
Type: application/octet-stream
Size: 989 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20160421/c1d1ee9f/attachment-0002.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: indvar_test.cpp
Type: text/x-c++src
Size: 229 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20160421/c1d1ee9f/attachment.cpp>