[llvm-commits] two-byte return optimization for AMD

Sun Dec 7 09:29:55 PST 2008

Hi Nick,

The AMD family 10h (aka Barcelona) optimization guide recommends using
"ret 0" for this purpose, which is a three-byte ret.  The guide for
earlier AMD processors does recommend "rep ret" though.  I don't know
how much difference this makes.
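
For reference, here are the byte encodings of the three return forms
under discussion, written as a small standalone C++ table.  The constant
names are made up for illustration and are not LLVM identifiers.

```cpp
#include <array>
#include <cstdint>

// Encodings of the near-return forms discussed above.
// Names are illustrative only.
constexpr std::array<std::uint8_t, 1> PlainRet = {0xC3};             // ret
constexpr std::array<std::uint8_t, 2> RepRet   = {0xF3, 0xC3};       // rep ret
constexpr std::array<std::uint8_t, 3> RetImm0  = {0xC2, 0x00, 0x00}; // ret 0
```

"rep ret" is just the one-byte ret with an F3 prefix, while "ret 0" is
the ret-with-immediate form carrying a 16-bit zero.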

addAssemblyEmitter isn't an ideal place to add the pass because it's not
used on the JIT path.  I think it would be appropriate to add a new
TargetMachine hook here.

 > +    MachineInstr &MI = MBB.front();
 > +    if (MI.getOpcode() == X86::RET) {

It might be useful to skip past any labels or IMPLICIT_DEF at the
beginning of the block, since those don't produce any actual
instructions.
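
Roughly what I have in mind, as a toy model rather than real LLVM code.
The Opcode enum and the helper names below are hypothetical stand-ins
for MachineInstr opcodes and are only meant to show the shape of the
check.

```cpp
#include <cstddef>
#include <vector>

// Toy stand-ins for MachineInstr opcodes; not the real LLVM enums.
enum class Opcode { Label, ImplicitDef, Ret, RepRet, Other };

// True for pseudo-instructions that emit no machine code.
inline bool emitsNoCode(Opcode Op) {
  return Op == Opcode::Label || Op == Opcode::ImplicitDef;
}

// Index of the first instruction that produces real bytes,
// or Block.size() if the block has none.
inline std::size_t firstRealInstr(const std::vector<Opcode> &Block) {
  std::size_t I = 0;
  while (I < Block.size() && emitsNoCode(Block[I]))
    ++I;
  return I;
}

// True if the block effectively starts with a plain RET once leading
// labels and IMPLICIT_DEFs are ignored.
inline bool startsWithRet(const std::vector<Opcode> &Block) {
  std::size_t I = firstRealInstr(Block);
  return I < Block.size() && Block[I] == Opcode::Ret;
}
```

With this, a block consisting of a label, an IMPLICIT_DEF and then a RET
still counts as a branch target that begins with a ret.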

 > +      BuildMI(MBB, MBB.begin(), TII->get(X86::REP_RET));
 > +      MI.eraseFromParent();

You can use MI.setDesc(TII->get(X86::REP_RET)) to change the
instruction in-place.
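
To illustrate the difference, here is a standalone sketch with toy
types.  ToyDesc and ToyInstr are hypothetical stand-ins for the
instruction descriptor and MachineInstr, not the LLVM classes.  The
point is that swapping the descriptor leaves the instruction object in
the list untouched, so nothing has to be built or erased.

```cpp
#include <list>
#include <string>

// Toy stand-ins for the instruction descriptor and MachineInstr;
// not the LLVM classes.
struct ToyDesc { std::string Name; };

struct ToyInstr {
  const ToyDesc *Desc;
  // Swap the descriptor, keeping the instruction object where it is.
  void setDesc(const ToyDesc &D) { Desc = &D; }
};

static const ToyDesc RetDesc{"RET"};
static const ToyDesc RepRetDesc{"REP_RET"};

// Rewrite a leading RET into REP_RET in place.
// Returns true if the block was changed.
inline bool rewriteLeadingRet(std::list<ToyInstr> &Block) {
  if (Block.empty() || Block.front().Desc != &RetDesc)
    return false;
  Block.front().setDesc(RepRetDesc);
  return true;
}
```

Any pointers or iterators to the instruction remain valid afterwards,
which is not the case with the BuildMI plus eraseFromParent approach.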

Dan

On Dec 7, 2008, at 12:19 AM, Nick Lewycky wrote:

> The AMD optimization manual suggests avoiding branches to 'ret'  
> instructions, preferring to emit 'rep; ret'. The attached patch  
> implements this.
>
> This doesn't apply very often across llvm-test. The largest is  
> kimwitu++ which produces 28 of these two-byte rets. The performance  
> impact seems to be less than noise, at least on my system.
>
> Please review!
>
> Nick
> <x86-twobyteret.patch>