[PATCH] D37418: [X86] Use btc/btr/bts to implement xor/and/or that affects a single bit in the upper 32-bits of a 64-bit operation.

Philip Reames via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Sep 4 13:46:33 PDT 2017


reames added a comment.

I ran into this case in a snippet of hot code just Friday, so I'm glad to see someone's working on it.  :)  In the case I had, the register allocator was spilling a register in a hot loop just to free up room for the constant.  I haven't had a chance to actually performance test the tweaked assembly, but I'd be very surprised if the btc variant weren't faster in that case.
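For anyone following along who hasn't seen the pattern, here is a minimal sketch of the kind of source that hits this case.  The function names and the choice of bit 40 are made up for illustration; this is not the code from my hot loop.  The mask has a single bit set above bit 31, so it doesn't fit in a sign-extended 32-bit immediate and would otherwise have to be materialized in a register.  Whether a given compiler build actually emits btc/btr/bts for these depends on the version and options.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical example: bit 40 is an arbitrary bit in the upper 32 bits.
constexpr std::uint64_t kHighBit = 1ULL << 40;

// xor with a single high bit: candidate for btc (complement bit).
constexpr std::uint64_t toggle_high_bit(std::uint64_t x) { return x ^ kHighBit; }

// and with the inverted single-bit mask: candidate for btr (reset bit).
constexpr std::uint64_t clear_high_bit(std::uint64_t x) { return x & ~kHighBit; }

// or with a single high bit: candidate for bts (set bit).
constexpr std::uint64_t set_high_bit(std::uint64_t x) { return x | kHighBit; }

// True when mask has exactly one bit set and that bit is in the upper 32 bits.
constexpr bool is_single_high_bit(std::uint64_t mask) {
    return mask != 0 && (mask & (mask - 1)) == 0 && (mask >> 32) != 0;
}

// Small runtime sanity check of the helpers above.
inline void check_examples() {
    assert(toggle_high_bit(0) == kHighBit);
    assert(toggle_high_bit(kHighBit) == 0);
    assert(clear_high_bit(~0ULL) == ~kHighBit);
    assert(set_high_bit(0) == kHighBit);
    assert(is_single_high_bit(kHighBit));
    assert(!is_single_high_bit(1ULL << 3));   // low bit: fits an immediate
    assert(!is_single_high_bit(3ULL << 40));  // two bits set
}
```

Note that the bit-test instructions take the bit index as an 8-bit immediate, which is why no separate register is needed for the constant.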

In https://reviews.llvm.org/D37418#859967, @spatel wrote:

> If the btc variant always reduces register pressure, that could be a good reason to choose it in isel?


Another option here would be to treat this as a register allocator rematerialization problem.  We have precedent for doing that for loads which can be folded into instructions, and the "fold constant into instruction" pattern we see here seems somewhat similar in spirit.

Out of the other options, I can't see a late pattern match that converts the and/constant form to btc working out very well.  In particular, the spilling will have already been done, and reversing that seems complicated.  Starting with a btc and expanding it to the constant/and pattern might be a bit better, but then I worry about missed hoisting opportunities in loops where registers that are live across the loop could have been spilled and filled outside it.

On a vaguely related topic, has anyone looked at the possible benefit of producing single-bit constants via an 8-bit move immediate and a shift immediate?  If I'm remembering correctly, it should encode smaller.


https://reviews.llvm.org/D37418




