[LLVMdev] X86TargetLowering::LowerToBT

Chris Sears chris.sears at gmail.com
Mon Jan 19 10:29:51 PST 2015


Looking at the Intel Optimization Reference Manual, page 14-14, for Atom:

    Instruction                      Latency   Throughput
    BT m16, imm8 / BT mem, imm8      2, 1      1
    BT m16, r16  / BT mem, reg       10, 9     8
    BT reg, imm8 / BT reg, reg       1         1

On page C-26 they lower that throughput to 0.5 clock cycles for BT reg, imm8.

The posted functions were simplified for tracking down the code generation
problem. In general, the comparison between BTQ reg,imm and SHRQ/ANDQ for
bit testing is even worse, because you have to MOV the tested reg to a
temporary before the SHRQ/ANDQ. All of these instructions except the AND
also require a REX prefix. The result is some code bloat (3 instructions
vs 1) and a little extra register pressure.
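
For illustration, here is a minimal sketch of the kind of source that raises
this question. These are hypothetical functions, not the ones posted earlier
in the thread, and the bit index 36 is an arbitrary choice that does not fit
in a 32-bit immediate mask. Both test the same bit of a 64-bit value; whether
a given compiler emits a single BTQ or a MOV/SHRQ/ANDQ sequence for them
depends on the compiler version and target, so no particular output is
claimed here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical example: test bit 36 by shifting it down to bit 0.
 * This is the shape that can turn into MOV + SHRQ + ANDQ when the
 * tested value must stay live in its original register. */
bool test_bit_shift(uint64_t x)
{
    return (x >> 36) & 1;
}

/* Hypothetical example: the same test written as a mask compare.
 * Semantically identical to test_bit_shift; a single BT reg, imm8
 * would be enough to implement either form. */
bool test_bit_mask(uint64_t x)
{
    return (x & (UINT64_C(1) << 36)) != 0;
}
```

Compiling both with optimization and comparing the generated assembly (for
example with clang -O2 -S) shows which sequence your compiler picks.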