[PATCH] D48606: [X86] Use bts/btr/btc for single bit set/clear/complement of a variable bit position

Craig Topper via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Jun 27 00:19:40 PDT 2018


craig.topper added a comment.

The 16-bit BTR fails to match because the 'and' got promoted to 32-bit and the rotate didn't. We need to fix the promotion of the rotate; I don't think we should try to pattern match the bit width mismatch. In practice, C type promotion rules make it likely that the IR for an "unsigned short" case is already in i32 before we even get to the backend, so it's probably not a huge issue, and I don't think it should hold up this patch. I'm happy to add a FIXME and/or file a bug.
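To illustrate the promotion point, here is a minimal C sketch of a 16-bit bit clear. The function names are hypothetical and not taken from the patch or its tests, and the sketch assumes a target where int is 32 bits wide, as on x86. Under the C integer promotion rules the 'and' and the shift are evaluated at int width and only the result is narrowed, which is why such code would typically reach the backend as i32 operations rather than i16.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: clear bit (n mod 16) of a 16-bit value.
 * x is promoted to int before the '&', and the mask is built as an
 * unsigned int, so the arithmetic runs at 32-bit width (assuming a
 * 32-bit int). Only the final cast narrows the result to 16 bits. */
uint16_t clear_bit16(uint16_t x, unsigned n) {
    return (uint16_t)(x & ~(1u << (n & 15)));
}

/* Small self-check for the sketch above. */
void self_check_promotion(void) {
    assert(clear_bit16(0xFFFF, 3) == 0xFFF7);
    assert(clear_bit16(0x0001, 0) == 0x0000);
}
```

Whether the frontend keeps any of this in i16 depends on the exact source and optimization level; the sketch only shows why i32 is the likely case.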



================
Comment at: test/CodeGen/X86/btc_bts_btr.ll:513-517
+; X64-NEXT:    movl $-2, %eax
+; X64-NEXT:    movl %esi, %ecx
+; X64-NEXT:    roll %cl, %eax
+; X64-NEXT:    andl (%rdi), %eax
+; X64-NEXT:    retq
----------------
lebedev.ri wrote:
> I wonder if something like
> ```
> mov (%rdi), %edi
> btr %esi, %edi
> ```
> would be better still than not folding at all?
Probably, but it's not exactly easy to do. Tablegen generates the match order by ranking how many SDNodes are covered by the pattern. The regular memory pattern for and/or/xor covers more nodes, so it gets higher priority. To override the priority you have to add an AddedComplexity line to the pattern. But I worry that significantly bumping the priority of this pattern to override the load pattern may have other effects and require other priorities to be adjusted. I might be being overly cautious, but I'd like to keep the simple approach. gcc doesn't do this when there is a memory op either.
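For readers following the quoted test output, the sketch below shows in C why the unfolded sequence (rotate -2 left by the bit index, then 'and' with the loaded value) computes the same result as a register BTR would. The helper names are hypothetical, and the code models only the register-operand behavior, where the bit index is taken modulo 32; it says nothing about how the backend selects instructions.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate left with the count taken modulo 32, like `roll %cl, %eax`. */
uint32_t rotl32(uint32_t v, uint32_t n) {
    n &= 31;
    return (v << n) | (v >> ((32 - n) & 31));
}

/* Form seen in the test output: rotate -2 (all ones except bit 0),
 * then 'and' with the value. */
uint32_t clear_bit_rotate(uint32_t x, uint32_t n) {
    return x & rotl32(0xFFFFFFFEu, n);
}

/* Model of a register-operand BTR: clear bit (n mod 32). */
uint32_t clear_bit_btr(uint32_t x, uint32_t n) {
    return x & ~(UINT32_C(1) << (n & 31));
}

/* Returns 1 if both forms agree for every index in [0, 64). */
int forms_agree(uint32_t x) {
    for (uint32_t n = 0; n < 64; ++n) {
        if (clear_bit_rotate(x, n) != clear_bit_btr(x, n))
            return 0;
    }
    return 1;
}

/* Small self-check for the sketch above. */
void self_check_rotate(void) {
    assert(forms_agree(0xDEADBEEFu));
}
```

Both forms mask the index the same way, so the equivalence holds for out-of-range indices as well as for 0 through 31.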


https://reviews.llvm.org/D48606




