[llvm-commits] two-byte return optimization for AMD
Dan Gohman
gohman at apple.com
Sun Dec 7 09:29:55 PST 2008
Hi Nick,
The AMD Family 10h (Barcelona) optimization guide recommends using
"ret 0" for this purpose, which is a three-byte ret. The guide for
earlier AMD processors does recommend "rep ret", though. I don't know
how much difference this makes.
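For reference, here are the byte encodings of the three return forms in question, written as a small C++ sketch. The constant names are made up for illustration and do not come from the patch or from LLVM.

```cpp
#include <array>
#include <cstdint>

// Hypothetical names; only the byte values are standard x86 encodings.

// Plain near return: one byte.
constexpr std::array<std::uint8_t, 1> kRet = {0xC3};

// "rep ret": a REP prefix (0xF3) followed by the one-byte ret, two bytes.
constexpr std::array<std::uint8_t, 2> kRepRet = {0xF3, 0xC3};

// "ret 0": ret with a 16-bit immediate of zero, three bytes.
constexpr std::array<std::uint8_t, 3> kRetImm0 = {0xC2, 0x00, 0x00};
```

All three return to the caller without adjusting the stack further; they differ only in length.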
addAssemblyEmitter isn't an ideal place to add the pass because it's not
used on the JIT path. I think it would be appropriate to add a new
TargetMachine hook here.
> + MachineInstr &MI = MBB.front();
> + if (MI.getOpcode() == X86::RET) {
It might be useful to skip past any labels or IMPLICIT_DEF at the
beginning of the block, since those don't produce any actual
instructions.
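A rough sketch of the skipping logic, using simplified stand-in types rather than the real MachineBasicBlock and MachineInstr classes. Every name below (Opcode, Instr, emitsNoCode, firstRealInstr, startsWithRet) is hypothetical and only meant to show the shape of the check.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for illustration; these are not the LLVM classes.
enum class Opcode { Label, ImplicitDef, Ret, RepRet, Other };

struct Instr {
  Opcode op;
};

// True for pseudo-instructions that emit no machine code.
inline bool emitsNoCode(const Instr &I) {
  return I.op == Opcode::Label || I.op == Opcode::ImplicitDef;
}

// Index of the first instruction that produces real code,
// or block.size() if the block has none.
inline std::size_t firstRealInstr(const std::vector<Instr> &block) {
  std::size_t i = 0;
  while (i < block.size() && emitsNoCode(block[i]))
    ++i;
  return i;
}

// True if the first code-producing instruction is a plain ret, so a
// branch to this block would land directly on the ret.
inline bool startsWithRet(const std::vector<Instr> &block) {
  const std::size_t i = firstRealInstr(block);
  return i < block.size() && block[i].op == Opcode::Ret;
}
```

With this, a block consisting of a label, an IMPLICIT_DEF, and then a ret is still treated as starting with a ret, which checking only MBB.front() would miss.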
> + BuildMI(MBB, MBB.begin(), TII->get(X86::REP_RET));
> + MI.eraseFromParent();
You can use MI.setDesc(TII->get(X86::REP_RET)) to change the
instruction in place.
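To illustrate the difference between the two approaches, here is a toy model using std::list in place of the real instruction list. The types and function names (Instr, Block, replaceByRebuild, replaceInPlace) are hypothetical; the actual call is the setDesc one above.

```cpp
#include <list>
#include <string>

// Hypothetical stand-ins for illustration; these are not the LLVM classes.
enum class Opcode { Ret, RepRet };

struct Instr {
  Opcode op;
  std::string note;  // stands in for operands and other attached state
};

using Block = std::list<Instr>;

// Approach from the patch: insert a fresh instruction, then erase the
// old one. State attached to the old instruction is lost unless it is
// copied over by hand.
inline void replaceByRebuild(Block &block, Block::iterator it) {
  block.insert(it, Instr{Opcode::RepRet, ""});
  block.erase(it);
}

// In-place approach: change only the opcode and leave the node, and
// everything attached to it, where it is.
inline void replaceInPlace(Instr &I) {
  I.op = Opcode::RepRet;
}
```

The in-place version keeps the same object in the same position, so existing references to the instruction stay valid and nothing has to be reconstructed.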
Dan
On Dec 7, 2008, at 12:19 AM, Nick Lewycky wrote:
> The AMD optimization manual suggests avoiding branches to 'ret'
> instructions, preferring to emit 'rep; ret'. The attached patch
> implements this.
>
> This doesn't apply very often across llvm-test. The largest is
> kimwitu++ which produces 28 of these two-byte rets. The performance
> impact seems to be less than noise, at least on my system.
>
> Please review!
>
> Nick
> <x86-twobyteret.patch>