[llvm-commits] LLVM patch to optimize integer ABS idiom for ARM target

Ana Pazos apazos at codeaurora.org
Thu Sep 29 10:01:25 PDT 2011


Hi folks,

 

Find attached the updated patch that optimizes integer ABS idiom for the ARM
target.

 

Let me know what you think.

 

Thank you!

Ana.

 

From: Evan Cheng [mailto:evan.cheng at apple.com] 
Sent: Wednesday, September 28, 2011 5:20 PM
To: Ana Pazos
Cc: 'Eli Friedman'; rajav at codeaurora.org
Subject: Re: [llvm-commits] LLVM patch to optimize integer ABS idiom for ARM
target

 

Hi Ana,

 

The patch looks much cleaner. There are some minor stylistic issues though:

 

+     BuildMI(*SinkBB, SinkBB->begin(), dl,

+          TII->get(ARM::PHI), ABSDstReg)

+    .addReg(NewRsbDstReg).addMBB(RSBBB)

+    .addReg(NewMovDstReg).addMBB(BB);

 

Indentation.

 

+   if (DisableARMIntABS){

+      return NULL;

+   }

 

Please make sure there is a space before '{'. Also note in general LLVM code
doesn't use { } for single statement if - else clauses.

 

Please send the patch to llvm-commits for larger audience. Thanks.

 

Evan

 

On Sep 27, 2011, at 6:01 PM, Ana Pazos wrote:





Hello Evan,

 

I have updated the patch following your comments 1-4 and suggestions 1-2.

 

I have defined separate ARM ABS and t2ABS nodes and have moved the lowering
of these nodes (to movs + rsbmi machine instructions) to pre-RA scheduling
time.

The updated patch only modifies ARM target-dependent files.

 

The "bonuses" features imply changes to LLVM target-independent code which I
am not able to address right away.

 

I have also rerun the llvm/test and llvm/projects/test-suite tests and the
results are attached.

 

Please let me know what you think and if further changes are needed.

 

Note:

- The DAGCombiner lowers SELECT_CC as ASR/ADD/XOR instruction BEFORE each
target has a chance to choose its preferred idiom.

The code change in ARMISelDAGToDAG.cpp is to convert ASR/ADD/XOR
representing integer ABS into an ARM ABS/t2ABS node.

This was done to avoid changing LLVM target-independent code.

 

- ABS nodes take a source operand and a destination operand (V1 = ABS V0).

The comparison against 0 and the predicated computation of the absolute
value are implicit in the node.

That is why the ABS nodes are declared as defining CPSR.

 

Thanks a lot!

Ana.

 

From: Evan Cheng [mailto:evan.cheng at apple.com] 
Sent: Sunday, September 18, 2011 11:22 AM
To: Ana Pazos
Cc: rajav at codeaurora.org
Subject: Re: [llvm-commits] LLVM patch to optimize integer ABS idiom for ARM
target

 

 

On Sep 16, 2011, at 6:04 PM, Ana Pazos wrote:






Hi Evan,

 

Thanks for the great feedback.

 

I think I can immediately improve the patch by addressing  your points 1-4
and suggestions 1-2 and deliver these changes soon.

 

The "bonuses" suggestions imply changes to LLVM target-independent code
which I might not be able to address right away.

 

Would this plan be ok with you?

 

Yes. Thanks.






 

Please see my  comments below.

 

Talk to you again soon!

 

Thanks a lot,

Ana.

 

From: Evan Cheng [mailto:evan.cheng at apple.com] 
Sent: Friday, September 16, 2011 11:32 AM
To: Ana Pazos
Cc: Commit Messages and Patches for LLVM
Subject: Re: [llvm-commits] LLVM patch to optimize integer ABS idiom for ARM
target

 

Hi Ana,

 

Thanks for working on this. It does look like an important peephole
optimization. Unfortunately I think this patch has some problems.

 

1. The instruction selection pattern is looking for a very specific case.
Does it still work if the source is not a function argument? For example, if
it's the return value of a function call, or a result of a computation, I
don't think it will work with this patch. 

[apazos] Yes, it still works if the source is not a function argument. Maybe
the test case confused you.

The ASR/ADD/XOR pattern that implements integer ABS  is searched for in the
DAG and can happen in any place in the function code.

Once it is found, it is replaced with the pseudoABS node using as source
operand the ADD instruction first source operand (which is the same as SRA
instruction first source operand).

Later when pseudoABS node is lowered, it becomes movs/conditional rsbmi
using as source and destination register the pseudoABS instruction
destination register.

So there is nothing in the patch that restricts the operand to be "r0".

 

Ok. I'm just concerned about looking for specific patterns could be fragile
which is why I suggest adding a ABS isel opcode.

 






 

2. MOVrs and RSBccri are not needed. You can expand into MOVr with the
optional def set to CPSR and a RSBri with the predicate operand filled in.

[apazos] Probably my inexperience with *.td file definitions. I will find
out how to use the existing MOVr and RSBri instruction definitions.

 

Ok.






 

3. We want to avoid pseudo instructions that expands into multiple
instructions. As you have noticed, this messes up scheduling. Probably the
right solution is to expand the instructions at pre-RA scheduling time.

[apazos] I understand why you want to avoid pseudo instructions that expand
into multiple instructions. It was my concern too.

I have a version of the patch that implements the optimization at  pre-RA
scheduling time.

For that, I had to add a PHI node in the DAG (to be able to add the
conditional execution path that computes RSB).

The problem with this alternative solution is that LLVM decided to spill
both input operands to the PHI nodes, and the final code was very
inefficient.

I was not using the latest LLVM source code. I will try this solution again
with the LLVM svn tip. Will let you know if I see the problem again.

 

Ok.

 

 

4. It's a bad idea to have a single instruction that's used for both ARM and
Thumb2 mode.

[apazos] There is nothing special we need to do handle Thumb2 mode.

I built EEMBC benchmarks in Thumb2 mode and the ABS optimization worked just
fine.

Note that in Thumb2 mode conditional execution is implemented with IT
instructions.

There is no need to explicitly write IT instructions though.

The assembler/code emitter inserts it for you where necessary changing the
pattern movs rsbmi into movs itmi rsbmi.

So I am not sure we need t2ABSri type of node.

 

You still need pseudo nodes for ARM and Thumb2. We do not allow two modes to
share an opcode.






 

 

My suggestion:

1. ISel should match to an instruction (say ABS, t2ABSri) that's marked with
usesCustomInserter = 1.

2. Add support to lower ABS to MOVr and RSBri with optional def and
predicate operands filled in. Please do the same for the Thumb2 variant.

[apazos] I think I understood what you meant. Let me give it a try.

 

Bonuses:

1. Is this sufficient to generate the best code sequence? e.g.

        movs    r0, r0

        rsbmi   r0, r0, #0

 

This is the best if r0 is a function input. But how about?

        add       r0, r1

        movs    r0, r0

        rsbmi   r0, r0, #0

 

Shouldn't we copy propagate the movs?

        adds     r0, r1

        rsbmi   r0, r0, #0

 

One possibility is the scheduler custom expansion code look for the
instruction that defines the source and tack the optional def on that
instruction. Any other ideas?

 

2. How to make the instruction selection code match more cases? My
suggestion is to add a new target independent opcode ABS. For targets where
this node is legal, i.e. ARM, dag combine can form this instruction rather
than the sra + sub sequence. This way, you can write a simple pattern to
match the instruction instead of C++ selection code.

 

[apazos] So instead of ARM/Thumb-specific DAG nodes create a new target
independent ABS DAG node.

SELECT_CC DAG nodes that represent integer ABS patterns are transformed into
target independent ABS DAG nodes.

And then ARM/Thumb target lowers it as movs/rsbmi and the other targets
lower it as sra + add + xor sequence.

 

No, other targets will not generate ABS dag nodes since it will not be
legal.






The problem with this solution is that it will imply changes to
target-independent LLVM code.

In my current solution I strived to not change any target-independent LLVM
code.

 

I believe this is a less fragile solution.






 

[apazos] Let me give more thought to the 2 bonuses suggestions above.

 

[apazos] But immediately I can address the other points.

 

 

Ok. Thanks.

 

Evan






 

Evan

 

On Sep 15, 2011, at 5:32 PM, Ana Pazos wrote:







Hello,

 

I worked on an LLVM patch to optimize integer ABS idiom for the ARM target
and would like to submit it to your review.

 

I experimented with EEMBC benchmarks, in particular MPEG encoding, and noted
integer ABS computation happens frequently. Significant speed up was
achieved with the  optimized idiom for ARM (20% for MPEG encoding).

 

Patch details:

 

LLVM lowers SELECT_CC nodes that represent an integer ABS pattern into
ASR/ADD/XOR instructions.

 

It is possible to create an optimized machine idiom for integer ABS on ARM
formed by MOVs/RSBmi predicated instructions.

 

This patch modifies ARM-specific files to implement the above optimized
machine idiom for integer ABS.

 

Generation of the optimized integer ABS idiom is turned on by default. To
turn this feature off set -disable-arm-int-abs feature flag.

 

abspatch.diff

Changes to ARM-specific files to implement optimized integer ABS idiom.

 

abstestpatch.diff

ARM/iabs.ll and Thumb/iabs.ll tests check for the non-optimized integer ABS
idiom (ASR/ADD/XOR).

When applying abspatch.diff these tests fail. So patched the tests to set
-disable-arm-int-abs flag to prevent the compiler from generating optimized
integer ABS pattern and allow the non-optimized idiom to be checked.

 

iabsopt.ll

Similar to ARM/iabs.ll and Thumb/iabs.ll tests except that it checks for the
optimized integer ABS idiom and checks for all possible test conditions.

 

failures.txt

Failure report and explanation from running llvm/test and
projects/test-suite on ARM.

I noted failures running llvm/test (svn version 139318) and
projects/test-suite (svn revision 139319) on ARM. Are these failures
expected?

 

Thank you,

Ana.

 

<abspatch.diff><abstestpatch.diff><failures.txt><iabsopt.ll>________________
_______________________________
llvm-commits mailing list
llvm-commits at cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits

 

 

<abspatch.diff><abstestpatch.diff><failures.txt><iabsopt.ll>

 

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20110929/07aeb28f/attachment.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: abspatch.diff
Type: application/octet-stream
Size: 8514 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20110929/07aeb28f/attachment.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: abstestpatch.diff
Type: application/octet-stream
Size: 849 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20110929/07aeb28f/attachment-0001.obj>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: failures.txt
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20110929/07aeb28f/attachment.txt>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: iabsopt.ll
Type: application/octet-stream
Size: 1573 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20110929/07aeb28f/attachment-0002.obj>


More information about the llvm-commits mailing list