[llvm-commits] PATCH: Enable direct selection of bsf and bsr instructions for cttz and ctlz with zero-undef behavior

Chandler Carruth chandlerc at gmail.com
Tue Dec 20 03:25:32 PST 2011


On Mon, Dec 19, 2011 at 12:07 PM, Evan Cheng <evan.cheng at apple.com> wrote:

>
> On Dec 17, 2011, at 2:29 AM, Chandler Carruth wrote:
>
> On Thu, Dec 15, 2011 at 7:52 AM, Stephen Canon <scanon at apple.com> wrote:
>
>> Just for the record, this is in no way unique to AMD.  Agner Fog's tables
>> list BSF/BSR as 10 µops/16 cycles on Atom as well.  BSF is a hazard to be
>> avoided on an unknown x86 processor.
>>
>
> I really wasn't trying to draw generalizations. I've read the same tables.
> =/ I'm not sure what you're concerned about here; this patch is orthogonal to
> any work on avoiding these instructions on architectures where they just
> decode to silly microcode.
>
> I'd still really appreciate some review on the actual patch. It's pretty
> simple.
>
>
> The patch looks fine to me.
>

Thanks!

Testing a bootstrap uncovered a pretty silly miscompile; I hadn't correctly
modeled the essentially insane semantics of bsr. Benjamin Kramer spotted the
fix and checked my (obvious) change. I've committed the fixed patch, with
improved tests, in r146974. I'll watch the bots just in case.
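
For readers unfamiliar with the semantics in question, here is a rough
sketch in portable C. It is background only and is not taken from the patch
or from r146974; the helper names are made up for illustration. bsr yields
the bit index of the highest set bit rather than a count of leading zeros,
and its destination is undefined when the source is zero. So for a nonzero
32-bit value, ctlz is 31 - bsr(x), which is the same as bsr(x) ^ 31, while
cttz maps directly onto bsf.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: emulates x86 BSR on a 32-bit value.
 * Returns the bit index (0..31) of the highest set bit.
 * The hardware result is undefined for a zero source, so the
 * emulation simply requires x != 0. */
static unsigned emulated_bsr32(uint32_t x) {
    assert(x != 0);
    unsigned index = 0;
    while (x >>= 1)
        ++index;
    return index;
}

/* Hypothetical helper: emulates x86 BSF on a 32-bit value.
 * Returns the bit index (0..31) of the lowest set bit; x must be nonzero. */
static unsigned emulated_bsf32(uint32_t x) {
    assert(x != 0);
    unsigned index = 0;
    while ((x & 1u) == 0) {
        x >>= 1;
        ++index;
    }
    return index;
}

/* ctlz with zero-undef behavior: BSR gives a bit index, so the count of
 * leading zeros is 31 - index. For an index in 0..31 that equals index ^ 31. */
static unsigned ctlz32_zero_undef(uint32_t x) {
    return emulated_bsr32(x) ^ 31u;
}

/* cttz with zero-undef behavior: the BSF index is already the count. */
static unsigned cttz32_zero_undef(uint32_t x) {
    return emulated_bsf32(x);
}
```

The zero-undef variants are what make direct selection possible: without
that guarantee, the zero input needs separate handling because neither
instruction defines its result there.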