[LLVMdev] X86TargetLowering::LowerToBT
Chris Sears
chris.sears at gmail.com
Mon Jan 19 10:29:51 PST 2015
Looking at the Intel Optimization Reference Manual, page 14-14, for Atom:

Instruction                      Latency   Throughput
BT m16, imm8 / BT mem, imm8      2, 1      1
BT m16, r16  / BT mem, reg       10, 9     8
BT reg, imm8 / BT reg, reg       1         1

On page C-26 they lower that throughput figure to 0.5 clock cycles for
BT reg, imm8.
The functions I posted were simplified to track down the code generation
problem. In general, the comparison between BTQ reg,imm and SHRQ/ANDQ for
bit testing comes out even worse for SHRQ/ANDQ, because you have to MOV the
tested register to a temporary before the SHRQ/ANDQ. And all of these
instructions require a REX prefix (well, not the AND). The result is some
code bloat (3 instructions vs 1) and a little register pressure.
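
For reference, here is a minimal C sketch of the kind of 64-bit bit test
being discussed. These are not the functions posted earlier in the thread;
the names bit_is_set and bit36_is_set and the bit index 36 are hypothetical,
chosen only for illustration. The comments describe the two instruction
sequences compared above, and which one a given compiler actually emits is
not something this sketch asserts.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: test bit n of a 64-bit value with shift-and-mask.
 * Written this way, the source mirrors the MOV + SHRQ + ANDQ sequence:
 * copy x to a temporary, shift the wanted bit down to bit 0, mask it. */
bool bit_is_set(uint64_t x, unsigned n) {
    return (x >> (n & 63)) & 1u;
}

/* Hypothetical helper: test one fixed bit (36, an arbitrary choice).
 * With a constant index, this is the case where a single
 * BTQ $36, %reg could replace the three-instruction sequence. */
bool bit36_is_set(uint64_t x) {
    return (x & (UINT64_C(1) << 36)) != 0;
}
```

Both helpers compute the same answer for bit 36; the difference under
discussion is only in how the test gets lowered to x86-64 instructions.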