[PATCH] D140939: [X86] Transform AtomicRMW logic operations to BT{R|C|S} if only changing/testing a single bit.

Thu Jan 5 22:12:10 PST 2023

goldstein.w.n added inline comments.

================
Comment at: llvm/lib/Target/X86/X86ISelLowering.cpp:5653
       Info.ptrVal = I.getArgOperand(0);
       unsigned Size = I.getType()->getScalarSizeInBits();
       Info.memVT = EVT::getIntegerVT(I.getType()->getContext(), Size);
----------------
pengfei wrote:
> goldstein.w.n wrote:
> > pengfei wrote:
> > > Should the `Size` be same with operand 1 rather than result?
> > > Should the `Size` be same with operand 1 rather than result?
> > 
> > If I change it I run into assertion errors.
> > 
> > `I.getArgOperand(0)->getType()->getScalarSizeInBits()` the `Size` can be zero (bitwidth too small)
> > 
> > or `I.getArgOperand(1)->getType()->getScalarSizeInBits()` I run into:
> > ```
> > llc: /home/noah/programs/opensource/llvm-dev/src/llvm-project/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp:11108: llvm::MemSDNode::MemSDNode(unsigned int, unsigned int, const llvm::DebugLoc&, llvm::SDVTList, llvm::EVT, llvm::MachineMemOperand*): Assertion `memvt.getStoreSize().getKnownMinSize() <= MMO->getSize() && "Size mismatch!"' failed.
> > ```
> > 
> > Maybe that means there is a bug in the patch elsewhere?
> > 
> > Running the test here are the sizes I see:
> > ```
> > Sizes: 8 / 0 / 16
> > Sizes: 8 / 0 / 16
> > Sizes: 8 / 0 / 16
> > Sizes: 8 / 0 / 16
> > Sizes: 16 / 0 / 8
> > Sizes: 16 / 0 / 8
> > Sizes: 16 / 0 / 8
> > Sizes: 16 / 0 / 8
> > Sizes: 8 / 0 / 32
> > Sizes: 8 / 0 / 32
> > Sizes: 8 / 0 / 32
> > Sizes: 8 / 0 / 32
> > Sizes: 8 / 0 / 32
> > Sizes: 8 / 0 / 32
> > Sizes: 8 / 0 / 32
> > Sizes: 8 / 0 / 32
> > Sizes: 8 / 0 / 32
> > Sizes: 8 / 0 / 32
> > Sizes: 8 / 0 / 32
> > Sizes: 8 / 0 / 32
> > Sizes: 8 / 0 / 32
> > Sizes: 8 / 0 / 32
> > Sizes: 8 / 0 / 32
> > Sizes: 8 / 0 / 32
> > Sizes: 8 / 0 / 32
> > Sizes: 8 / 0 / 32
> > Sizes: 8 / 0 / 32
> > Sizes: 8 / 0 / 32
> > Sizes: 32 / 0 / 8
> > Sizes: 32 / 0 / 8
> > Sizes: 32 / 0 / 8
> > Sizes: 32 / 0 / 8
> > Sizes: 8 / 0 / 64
> > Sizes: 8 / 0 / 64
> > Sizes: 8 / 0 / 64
> > Sizes: 8 / 0 / 64
> > Sizes: 8 / 0 / 64
> > Sizes: 8 / 0 / 64
> > Sizes: 8 / 0 / 64
> > Sizes: 8 / 0 / 64
> > Sizes: 8 / 0 / 64
> > Sizes: 8 / 0 / 64
> > Sizes: 8 / 0 / 64
> > Sizes: 8 / 0 / 64
> > Sizes: 8 / 0 / 64
> > Sizes: 8 / 0 / 64
> > Sizes: 8 / 0 / 64
> > Sizes: 64 / 0 / 8
> > Sizes: 64 / 0 / 8
> > Sizes: 64 / 0 / 8
> > ```
> I mean split `x86_atomic_bt*_rm` from it and then change it to `I.getArgOperand(1)->getType()->getScalarSizeInBits()`
> Other intrinsics are correct to use `I.getType()->getScalarSizeInBits()`.
> I mean split `x86_atomic_bt*_rm` from it and then change it to `I.getArgOperand(1)->getType()->getScalarSizeInBits()`
> Other intrinsics are correct to use `I.getType()->getScalarSizeInBits()`.

What does `"split `x86_atomic_bt*_rm` from it" mean?

================
Comment at: llvm/test/CodeGen/X86/atomic-rm-bit-test.ll:4622-4628
+; X86-NEXT:    jae .LBB78_1
+; X86-NEXT:  # %bb.2: # %if.then
+; X86-NEXT:    movl (%ecx,%eax,4), %eax
+; X86-NEXT:    retl
+; X86-NEXT:  .LBB78_1:
 ; X86-NEXT:    movl $123, %eax
 ; X86-NEXT:    retl
----------------
pengfei wrote:
> goldstein.w.n wrote:
> > pengfei wrote:
> > > The branch code doesn't look necessary. Can we necessary it?
> > > The branch code doesn't look necessary. Can we necessary it?
> > 
> > I think it is b.c we don't `cmovcc` loads.
> > 
> > For
> > ```
> > if.then:                                          ; preds = %entry
> >   %idxprom = zext i32 %c to i64
> >   %arrayidx = getelementptr inbounds i32, ptr %v, i64 %idxprom
> >   %1 = load i32, ptr %arrayidx, align 4
> >   br label %return
> > ```
> > 
> > And
> > ```
> > return:                                           ; preds = %entry, %if.then
> >   %retval.0 = phi i32 [ %1, %if.then ], [ 123, %entry ]
> >   ret i32 %retval.0
> > ```
> > 
> > a branch seems correct.
> Sorry for the wrong words. I mean can we eliminate the branch by modifying the IR code, e.g.,
> ```
> entry:
>   %shl = shl nuw i32 1, %c
>   %0 = atomicrmw or ptr %v, i32 %shl monotonic, align 4
>   %and = and i32 %0, %shl
>   %tobool.not = icmp eq i32 %and, 0
>   %ret = zext i1 %tobool.not to i32
>   ret i32 %ret
> ```
> 
> This will help to reduce the nosie in reviewing the code and pay more attention to the change we expected.
> Sorry for the wrong words. I mean can we eliminate the branch by modifying the IR code, e.g.,
> ```
> entry:
>   %shl = shl nuw i32 1, %c
>   %0 = atomicrmw or ptr %v, i32 %shl monotonic, align 4
>   %and = and i32 %0, %shl
>   %tobool.not = icmp eq i32 %and, 0
>   %ret = zext i1 %tobool.not to i32
>   ret i32 %ret
> ```
> 
> This will help to reduce the nosie in reviewing the code and pay more attention to the change we expected.

I see, so there are 6 difference test types:
`_br` -> branch on value
`_brz` -> branch on !value
`_brnz` -> branch on !!value
`_val` -> return value
`_valz` -> return !value
`_valnz` -> return !!value

Imo they are all worth testing. For example we have logic that searches the uses to see if its only for a truth value to see if we can optimize out the shift.
IIRC when writing the code, there where some edge cases where `br` behaved differently than `setc` so think its worth keeping.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D140939/new/

https://reviews.llvm.org/D140939