[PATCH] [AArch64] Lower sdiv x, pow2 using add + select + shift.

James Molloy james at jamesmolloy.co.uk
Sun Jul 13 11:58:56 PDT 2014


Hi Silviu,

Indeed - my testing showed that the branched version was no faster than the
csel version on an OoO core anyway, so I was not advocating the branched
solution.
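
For readers following the thread, here is a rough C sketch of the two shapes being compared. It is an illustration only, not the code from the patch: the function names are hypothetical, the branched variant is my guess at what the "alternative code sequence" looks like, and it assumes the compiler implements `>>` on negative signed values as an arithmetic shift and wraps on unsigned-to-signed conversion (both true of GCC and Clang, but implementation-defined in C).

```c
#include <assert.h>
#include <stdint.h>

/* Bias added to negative dividends so the shift rounds toward zero. */
static uint32_t pow2_bias(unsigned k) {
    assert(k <= 30);            /* keep 2^k representable as a positive int32_t */
    return (1u << k) - 1u;
}

/* add + select + shift: the add is computed unconditionally, then a
 * conditional select (csel on AArch64) picks the biased or original value. */
static int32_t sdiv_pow2_select(int32_t x, unsigned k) {
    /* Unsigned add avoids signed-overflow UB when the biased value is unused. */
    int32_t biased = (int32_t)((uint32_t)x + pow2_bias(k));
    int32_t picked = (x < 0) ? biased : x;
    return picked >> k;         /* assumed arithmetic shift */
}

/* Branched shape (assumed): skip the add when x >= 0. Cheap if the sign of x
 * is predictable, but exposed to mispredicts when it is not. */
static int32_t sdiv_pow2_branch(int32_t x, unsigned k) {
    if (x >= 0)
        return x >> k;
    return (int32_t)((uint32_t)x + pow2_bias(k)) >> k;
}
```

Both functions should agree with `x / (1 << k)` for `k` in 0..30; the only difference is whether the sign test becomes a select or a branch.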

James


On 10 July 2014 11:26, Silviu Baranga <silviu.baranga at gmail.com> wrote:

> James,
>
> It seems to me that the branch mispredict cost for the case where the
> values of X are random would outweigh the benefits of this transformation
> for your alternative code sequence, even on OoO cores.
>
> I don't think it would be entirely ok to make that assumption here (that
> X >= 0 is predictable).
>
> This point obviously doesn't matter for the csel solution.
>
> Thanks,
> Silviu
>
> http://reviews.llvm.org/D4438