[LLVMdev] X86TargetLowering::LowerToBT

Kreitzer, David L david.l.kreitzer at intel.com
Fri Jan 23 14:06:30 PST 2015


When targeting modern processors, it will usually be best to use “BT reg, imm” for testing bits 32-63. The alternative implementations will all use multiple instructions, will encode larger, and will usually run slower.  For testing bits 0-31, “TEST reg, imm” is preferred unless you are looking to minimize code size at the expense of performance, in which case you would still want to use BT in the cases where it encodes smaller. As Fiona pointed out, there are some processors where TEST has slightly better low-level performance properties than BT.
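The rule of thumb above can be written down as a small decision function. This is only a sketch in C: the names `choose_bit_test`, `USE_TEST_IMM`, and `USE_BT_IMM` are hypothetical, and the `bt_encodes_smaller` flag is assumed to come from whatever encoding-size comparison the compiler already performs. It is not LLVM's or the Intel compiler's actual code.

```c
#include <stdbool.h>

/* Hypothetical names: this models the rule of thumb, not any compiler's code. */
enum bit_test_insn { USE_TEST_IMM, USE_BT_IMM };

/*
 * Pick an instruction for testing one constant bit (0-63) of a 64-bit
 * register.
 *
 * minimize_size      - true when code size matters more than speed.
 * bt_encodes_smaller - assumed to be supplied by the caller from its own
 *                      comparison of the two encodings for this operand.
 */
enum bit_test_insn choose_bit_test(unsigned bit, bool minimize_size,
                                   bool bt_encodes_smaller)
{
    /* Bits 32-63: the alternatives to BT all need multiple instructions. */
    if (bit >= 32)
        return USE_BT_IMM;

    /* Bits 0-31: TEST is preferred unless size is the priority and BT
     * really is the shorter encoding. */
    if (minimize_size && bt_encodes_smaller)
        return USE_BT_IMM;

    return USE_TEST_IMM;
}
```

For example, `choose_bit_test(40, false, false)` selects BT, while `choose_bit_test(5, false, true)` still selects TEST because size is not being minimized.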


Regarding the partial EFLAGS write, modern OOO processors independently rename the carry flag, et al, so this is no longer a problem. I would have to check with the processor architects to figure out the exact processor generation where this problem was first fixed, but it was roughly a decade ago.  Steve’s Agner Fog quote, “BT, BTC, BTR, and BTS change the carry flag but leave the other flags unchanged. This causes a false dependence on the previous value of the flags and costs an extra μop. Use TEST, AND, OR and XOR instead of these instructions.”, was in reference to the Pentium 4.



FWIW, the Intel compiler itself doesn’t quite get all the “test bit” sequences right either. We will use a 64-bit TEST for testing bits 0-30, but we ought to be using a 32-bit test to avoid the REX byte. Similarly, for testing bit 31, we should use a 32-bit TEST rather than the “BT reg, 31” that we will currently generate. I intend to get those cases fixed.
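A likely reason bit 31 is the odd one out for the 64-bit TEST is that the 64-bit form takes a 32-bit immediate and sign-extends it, so a mask with only bit 31 set turns into a mask covering bits 31-63. The C sketch below models just that arithmetic. The function names are hypothetical, and the link to the compiler's current choice of “BT reg, 31” is an inference rather than something stated in this thread.

```c
#include <stdint.h>

/*
 * Mask a 64-bit TEST effectively applies for a given 32-bit immediate:
 * the immediate is sign-extended to 64 bits. Written without signed casts
 * so the result does not depend on implementation-defined conversions.
 */
uint64_t test64_effective_mask(uint32_t imm32)
{
    uint64_t mask = imm32;
    if (imm32 & UINT32_C(0x80000000))
        mask |= UINT64_C(0xFFFFFFFF00000000);
    return mask;
}

/*
 * Returns 1 if a 64-bit TEST with a 32-bit immediate can test exactly one
 * bit at position `bit`, and 0 otherwise.
 */
int test64_can_test_bit(unsigned bit)
{
    if (bit > 31)
        return 0; /* the immediate has no bit above 31 to set */
    return test64_effective_mask(UINT32_C(1) << bit) == (UINT64_C(1) << bit);
}
```

Under this model `test64_can_test_bit(30)` is 1 and `test64_can_test_bit(31)` is 0. A 32-bit TEST has no such problem, because its immediate is used as-is and it needs no REX.W prefix.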

Also, this thread hasn’t been focusing on the “BT reg, reg” and “BT[CSR] reg, reg” instruction forms, but these are good instructions to use where possible. The only instructions in the BT family that you really want to avoid at all costs are the memory forms. You never want to generate those. The multi-instruction expansions will almost always be faster.
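In source terms, the register forms and the multi-instruction alternative to a memory form look roughly like the C sketch below. The helper names are hypothetical, and whether a particular compiler maps each helper to BT, BTS, BTR, or BTC is an assumption, not something this sketch guarantees.

```c
#include <stddef.h>
#include <stdint.h>

/* Register-form test: the bit index is taken modulo 64, as BT does for a
 * 64-bit register operand. */
int bit_test_reg(uint64_t x, unsigned n)
{
    return (int)((x >> (n & 63)) & 1u);
}

/* Register-form updates in the style of BTS, BTR, and BTC. */
uint64_t bit_set(uint64_t x, unsigned n)        { return x |  (UINT64_C(1) << (n & 63)); }
uint64_t bit_reset(uint64_t x, unsigned n)      { return x & ~(UINT64_C(1) << (n & 63)); }
uint64_t bit_complement(uint64_t x, unsigned n) { return x ^  (UINT64_C(1) << (n & 63)); }

/*
 * Multi-instruction expansion for a bit array in memory: load the 64-bit
 * word that holds the bit, then test it in a register, rather than using a
 * memory-form BT. Assumes n indexes a bit that lies inside words[].
 */
int bit_test_array(const uint64_t *words, size_t n)
{
    uint64_t w = words[n >> 6];                 /* select the word */
    return bit_test_reg(w, (unsigned)(n & 63)); /* test within the word */
}
```

For instance, with `words[1]` equal to 8, `bit_test_array(words, 67)` returns 1, since bit 67 is bit 3 of the second word.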

David Kreitzer
IA-32/Intel64 Code Generation
Intel Compilers

From: Smith, Kevin B
Sent: Friday, January 23, 2015 1:03 PM
To: Chris Sears; Stephen Canon
Cc: LLVM Developers Mailing List; Kreitzer, David L
Subject: RE: [LLVMdev] X86TargetLowering::LowerToBT

I’ll be happy to run it for you.  Do you want Intel64, x86 or both?  The Intel compiler doesn’t have a -Oz option.  It has -Os and -O[123].

Also, FWIW, one of the Intel compiler experts on BT will comment on this thread, and on our rules for BT usage later this afternoon.

Kevin B. Smith

From: llvmdev-bounces at cs.uiuc.edu On Behalf Of Chris Sears
Sent: Friday, January 23, 2015 9:37 AM
To: Stephen Canon
Cc: LLVM Developers Mailing List
Subject: Re: [LLVMdev] X86TargetLowering::LowerToBT

Constant mask case.

Sanjay, could you run this through the Intel compiler with the appropriate flags?
They have an -O2 but I couldn't find an equivalent -Oz.
For LLVM, it generates BTQ for testing bits 32-63.
