[PATCH] D141653: [X86] Improve instruction ordering of constant `srl/shl` with `and` to get better and-masks

Fri Jan 13 18:23:49 PST 2023

pengfei added inline comments.

================
Comment at: llvm/test/CodeGen/X86/bitreverse.ll:538
-; X64-NEXT:    andb $8, %al
-; X64-NEXT:    leal (%rdi,%rdi), %ecx
-; X64-NEXT:    andb $4, %cl
----------------
IIRC, LEA is expensive, so this looks like a good deal. 

================
Comment at: llvm/test/CodeGen/X86/bmi-x86_64.ll:131-132
 }
+;; NOTE: These prefixes are unused and the list is autogenerated. Do not add tests below this line:
+; BEXTR-SLOW: {{.*}}
----------------
Remove the unused `BEXTR-SLOW` check prefix to eliminate this note.

================
Comment at: llvm/test/CodeGen/X86/btc_bts_btr.ll:988
 ; X86-NEXT:    movzbl {{[0-9]+}}(%esp), %ecx
+; X86-NEXT:    andb $7, %cl
 ; X86-NEXT:    shlb $2, %cl
----------------
This adds one more `andb`.

================
Comment at: llvm/test/CodeGen/X86/btc_bts_btr.ll:1011
 ; X86-NEXT:    movzbl {{[0-9]+}}(%esp), %ecx
+; X86-NEXT:    andb $7, %cl
 ; X86-NEXT:    shlb $2, %cl
----------------
ditto.

================
Comment at: llvm/test/CodeGen/X86/combine-bitreverse.ll:238
 ; X86-NEXT:    movl %ecx, %eax
-; X86-NEXT:    andl $5592405, %eax # imm = 0x555555
-; X86-NEXT:    shll $6, %ecx
-; X86-NEXT:    andl $-1431655808, %ecx # imm = 0xAAAAAA80
-; X86-NEXT:    shll $8, %eax
+; X86-NEXT:    shrl %eax
+; X86-NEXT:    andl $22369621, %eax # imm = 0x1555555
----------------
One more `shrl`?

================
Comment at: llvm/test/CodeGen/X86/combine-bitreverse.ll:289
+; X64-NEXT:    shll $4, %ecx
+; X64-NEXT:    shrl $4, %eax
+; X64-NEXT:    andl $135204623, %eax # imm = 0x80F0F0F
----------------
ditto.

================
Comment at: llvm/test/CodeGen/X86/combine-bitreverse.ll:327
 ; X86-NEXT:    andl $357913941, %ecx # imm = 0x15555555
-; X86-NEXT:    andl $-1431655766, %eax # imm = 0xAAAAAAAA
+; X86-NEXT:    shrl %eax
+; X86-NEXT:    andl $1431655765, %eax # imm = 0x55555555
----------------
ditto.
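
For reference, a minimal sketch of the identity these hunks appear to rely on, assuming the `and` being replaced is paired with a constant logical shift outside the quoted lines. The helper names and the 32-bit width are illustrative assumptions, not taken from the patch. Moving the mask to the other side of the shift means shifting the mask constant by the same amount, which is how `0xAAAAAAAA` can become `0x55555555` after a `shrl` by one.

```python
# Illustrative model of 32-bit register arithmetic (assumed width).
MASK32 = 0xFFFFFFFF


def and_then_srl(x, mask, shift):
    """(x & mask) >> shift, as a logical shift on 32 bits."""
    return ((x & mask) & MASK32) >> shift


def srl_then_and(x, mask, shift):
    """(x >> shift) & (mask >> shift): same value, different and-mask."""
    return ((x & MASK32) >> shift) & (mask >> shift)


def and_then_shl(x, mask, shift):
    """(x & mask) << shift, truncated to 32 bits."""
    return ((x & mask) << shift) & MASK32


def shl_then_and(x, mask, shift):
    """(x << shift) & (mask << shift), truncated to 32 bits."""
    return ((x << shift) & MASK32) & ((mask << shift) & MASK32)


if __name__ == "__main__":
    for x in (0, 1, 0x12345678, 0xDEADBEEF, 0xFFFFFFFF):
        # srl case: mask 0xAAAAAAAA before the shift vs 0x55555555 after it.
        assert and_then_srl(x, 0xAAAAAAAA, 1) == srl_then_and(x, 0xAAAAAAAA, 1)
        # shl case: mask 7 before the shift vs 0x1C after it.
        assert and_then_shl(x, 0x7, 2) == shl_then_and(x, 0x7, 2)
    print("both orderings agree on the sampled inputs")
```

Both orderings compute the same value, so the question for each hunk is only whether the chosen order costs an extra instruction or yields a cheaper immediate.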

================
Comment at: llvm/test/CodeGen/X86/const-shift-of-constmasked.ll:656
+
+; Explicit `movzbl 5(%esp), %eax` for X86 because the exact value is
+; necessary to optimize out the `shr`.
----------------
You can add the option `--no_x86_scrub_sp` when updating the test.

================
Comment at: llvm/test/CodeGen/X86/const-shift-of-constmasked.ll:1195
+
+; Explicit `movzwl 6(%esp), %eax` for X86 because the exact value is
+; necessary to optimize out the `shr`.
----------------
ditto.

================
Comment at: llvm/test/CodeGen/X86/const-shift-with-and.ll:5

 define i64 @and_shr_from_mask_i64(i64 %x) nounwind {
 ; X86-LABEL: and_shr_from_mask_i64:
----------------
What are these tests used for? They are not changed in this patch.
Besides, do you still need a new test case for this patch? Are these cases already covered by the other changes?

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D141653/new/

https://reviews.llvm.org/D141653