[LLVMdev] IndVarSimplify too aggressive ?

Mon Mar 14 11:27:27 PDT 2011

Thanks Eli,

After digging through the mail archives and Bugzilla, it seems that properly fixing this issue would require a major change in the SelectionDAG code: it would have to operate on a per-function basis instead of per basic block.

This, however, does not seem to be the only issue. The following C code does not produce an efficient assembly sequence either.

extern void f(unsigned long long v);

void test2()
{
  for (unsigned i=0; i<512; i++)
    f(i);
}

The resulting .ll out of clang looks reasonable (with and without the patch), but the ARM assembly output looks ugly, though marginally better with my patch: the induction variable should be counting up, and it could be zero-extended just before the call to f. This again points to ISel, but to a different area, as everything takes place in the same basic block.
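
As a rough source-level sketch of the shape described above (hand-written C, not compiler output), the two loops below contrast a 64-bit induction variable with a 32-bit counter that counts up and is zero-extended only at the call. The names test2_widened, test2_narrow and shapes_agree are hypothetical, and f is replaced by a recording stub so that the snippet is self-contained; none of this comes from the attached patch.

```c
#include <assert.h>

/* Hypothetical stand-in for the external f() of the test case: it only
   records its arguments so that the two loop shapes can be compared. */
unsigned long long f_sum;
unsigned f_calls;

void f(unsigned long long v)
{
  f_sum += v;
  f_calls++;
}

/* Shape A: the induction variable itself is 64-bit, roughly what a widened
   induction variable looks like at the source level. */
void test2_widened(void)
{
  for (unsigned long long i = 0; i < 512; i++)
    f(i);
}

/* Shape B: the induction variable stays a 32-bit counter that counts up,
   and is zero-extended only at the call site. */
void test2_narrow(void)
{
  for (unsigned i = 0; i < 512; i++)
    f((unsigned long long)i);
}

/* Runs both shapes and checks that f() observes the same calls. */
int shapes_agree(void)
{
  f_sum = 0;
  f_calls = 0;
  test2_widened();
  unsigned long long sum_a = f_sum;
  unsigned calls_a = f_calls;

  f_sum = 0;
  f_calls = 0;
  test2_narrow();

  assert(calls_a == f_calls);
  return sum_a == f_sum && f_calls == 512;
}
```

Both shapes make the same 512 calls to f; the only difference is where the widening to 64 bits happens, which is the part that matters on a target without native 64-bit registers.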

Is this a known issue? I could not find a matching bug report.

--
Arnaud de Grandmaison

-----Original Message-----
From: Eli Friedman [mailto:eli.friedman at gmail.com]
Sent: Sunday, March 13, 2011 11:08 PM
To: Arnaud Allard de Grandmaison
Cc: llvmdev at cs.uiuc.edu
Subject: Re: [LLVMdev] IndVarSimplify too aggressive ?

On Sun, Mar 13, 2011 at 5:01 PM, Arnaud Allard de Grandmaison
<Arnaud.AllardDeGrandMaison at dibcom.com> wrote:
> Hi all,
>
> The IndVarSimplify pass seems to be too aggressive when it enlarges the induction variable type; this can pessimize the generated code when the new induction variable size is not natively supported by the target. This is probably not an issue for x86_64, which natively supports all types, but it is a real one for several embedded targets with very few native types.
>
> I attached a patch to address this issue; if TargetData is available, the patch attempts to keep the induction variable in a native type when going through the induction variable users.
>
> I also attached my test case in C, as well as the resulting assembly output, with and without the patch applied, for ARM and x86_32 targets. You will note that the loop instruction count can be reduced by 30% in several cases.
>
> The patch could probably be made smarter; I welcome all suggestions.

It's worth pointing out that LoopStrengthReduce is doing essentially
the same transformation.  The only reason the generated code is
improved at all with your change is that ISel has a longstanding issue
where it can't conclude that the upper half of zext i32 %x to i64 is
zero if the zext is in a different block from the user of the zext.
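
A minimal C sketch of the situation described above, under the assumption that the widening and its use end up in different basic blocks (whether they actually do depends on the optimizer). The function names are hypothetical. The point is only the arithmetic fact involved: the upper half of a zero-extended 32-bit value is always zero, wherever it is used.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical example: x is widened before the branch, and the widened
   value is used inside the branch. */
uint64_t upper_half_after_branch(uint32_t x, int take_branch)
{
  uint64_t wide = x;      /* zero-extension from 32 to 64 bits */
  if (take_branch)
    return wide >> 32;    /* upper half of the widened value: always 0 */
  return wide;
}

/* Small self-check of the property, using the largest 32-bit input. */
void check_upper_half(void)
{
  assert(upper_half_after_branch(0xFFFFFFFFu, 1) == 0);
  assert(upper_half_after_branch(0xFFFFFFFFu, 0) == 0xFFFFFFFFull);
}
```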

-Eli
-------------- next part --------------
A non-text attachment was scrubbed...
Name: test2.s.wo_patch.arm
Type: application/octet-stream
Size: 447 bytes
Desc: test2.s.wo_patch.arm
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20110314/3e93ccc4/attachment.obj>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: test2.c
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20110314/3e93ccc4/attachment.c>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: test2.ll.w_patch.arm
Type: application/octet-stream
Size: 744 bytes
Desc: test2.ll.w_patch.arm
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20110314/3e93ccc4/attachment-0001.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: test2.ll.wo_patch.arm
Type: application/octet-stream
Size: 676 bytes
Desc: test2.ll.wo_patch.arm
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20110314/3e93ccc4/attachment-0002.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: test2.s.w_patch.arm
Type: application/octet-stream
Size: 440 bytes
Desc: test2.s.w_patch.arm
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20110314/3e93ccc4/attachment-0003.obj>