[PATCH] D116270: [AMDGPU] Enable divergence-driven XNOR selection

Jay Foad via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Jan 10 01:13:40 PST 2022


foad added a comment.

> SITargetLowering::reassociateScalarOps exists to fix the instruction selection that is done in a wrong way.

No! It's not trying to fix anything. It's just trying to reassociate expressions so that more of the intermediate results stay uniform, which means we use fewer VGPRs and fewer VALU instructions. For example:

  // v1 = (v0 + s0) + s1
  v_add v1, v0, s0
  v_add v1, v1, s1
   -->
  // v1 = (s0 + s1) + v0 ; reassociated
  s_add s2, s0, s1
  v_add v1, s2, v0
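
To check that the reassociation above preserves the result, here is a minimal sketch in Python. It is a model, not LLVM code: a divergent value is assumed to be a list with one entry per lane, a uniform value is a single integer, registers are assumed to be 32 bits wide with wrapping addition, and the function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF  # assumption: 32-bit registers with wrapping arithmetic


def add32(a, b):
    """Wrapping 32-bit add."""
    return (a + b) & MASK32


def add_original(v0, s0, s1):
    # v1 = (v0 + s0) + s1: both adds run once per lane (two vector adds).
    tmp = [add32(x, s0) for x in v0]
    return [add32(x, s1) for x in tmp]


def add_reassociated(v0, s0, s1):
    # v1 = (s0 + s1) + v0: one scalar add, then a single per-lane add.
    s2 = add32(s0, s1)
    return [add32(s2, x) for x in v0]


lanes = [1, 2, 0xFFFFFFFF]
assert add_original(lanes, 5, 0xFFFFFFFE) == add_reassociated(lanes, 5, 0xFFFFFFFE)
```

Both forms agree because addition modulo 2^32 is associative and commutative; the only difference is that the intermediate s0 + s1 is computed once instead of once per lane.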

This is exactly the same kind of thing you need to do to restore the missed optimization in xnor.ll:

  // v1 = ~(s0 ^ v0)
  v_xor v1, s0, v0
  v_not v1, v1
   -->
  // v1 = ~s0 ^ v0
  s_not s1, s0
  v_xor v1, s1, v0
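
The xnor rewrite relies on the identity ~(s ^ v) == (~s) ^ v. A minimal Python sketch of that identity follows, under the same modelling assumptions as the pseudo-assembly: the divergent operand is a list with one entry per lane, the uniform operand is a single integer, and registers are 32 bits wide. The function names are hypothetical and do not correspond to anything in LLVM.

```python
MASK32 = 0xFFFFFFFF  # assumption: 32-bit registers


def not32(a):
    """Bitwise NOT truncated to 32 bits."""
    return ~a & MASK32


def xnor_original(s0, v0):
    # v1 = ~(s0 ^ v0): the xor and the not both run once per lane.
    tmp = [s0 ^ x for x in v0]
    return [not32(x) for x in tmp]


def xnor_reassociated(s0, v0):
    # v1 = ~s0 ^ v0: the not is done once on the uniform operand,
    # leaving a single per-lane xor.
    s1 = not32(s0)
    return [s1 ^ x for x in v0]


lanes = [0x00000000, 0xFFFFFFFF, 0x12345678]
assert xnor_original(0x0F0F0F0F, lanes) == xnor_reassociated(0x0F0F0F0F, lanes)
```

The identity holds bit by bit, since inverting one input of an xor inverts its output, so moving the not onto the uniform operand does not change any lane's result.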


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D116270/new/

https://reviews.llvm.org/D116270


