[PATCH] D137705: [AMDGPU] Add DAG Combine for right-shift carry add to uaddo

Pierre van Houtryve via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Nov 16 01:00:57 PST 2022


Pierre-vh added inline comments.


================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp:3205
 
+  // fold (i64 (shr (add a, b), 32)) -> (uaddo a, b).overflow
+  //   iff a/b have >= 32 leading zeroes
----------------
arsenm wrote:
> arsenm wrote:
> > Should also move to generic code.
> This is missing the extends in the input and output.
What do you mean by "generic"? Not checking the types, and instead checking that the shift amount is half of the type's size in bits?
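
For illustration, a rough sketch (hypothetical code, not what this patch does; the helper name and the use of MVT::i1 for the carry type are assumptions) of what such a type-independent combine could look like, keying on "shift amount == half the bit width" rather than hardcoding i64 and 32:

  // Sketch only: fold (shr (add a, b), HalfBits) -> zext((uaddo a, b).overflow)
  // when both addends have at least HalfBits leading zeroes.
  static SDValue performShrCombine(SDNode *N, SelectionDAG &DAG) {
    SDValue Add = N->getOperand(0);
    auto *ShAmt = dyn_cast<ConstantSDNode>(N->getOperand(1));
    EVT VT = N->getValueType(0);
    unsigned HalfBits = VT.getScalarSizeInBits() / 2;
    if (!ShAmt || ShAmt->getZExtValue() != HalfBits ||
        Add.getOpcode() != ISD::ADD)
      return SDValue();

    SDValue A = Add.getOperand(0);
    SDValue B = Add.getOperand(1);
    // Both addends must fit in the low half, so the only bit that can
    // reach the high half of the sum is the carry-out.
    if (DAG.computeKnownBits(A).countMinLeadingZeros() < HalfBits ||
        DAG.computeKnownBits(B).countMinLeadingZeros() < HalfBits)
      return SDValue();

    SDLoc SL(N);
    EVT HalfVT = EVT::getIntegerVT(*DAG.getContext(), HalfBits);
    SDValue LoA = DAG.getNode(ISD::TRUNCATE, SL, HalfVT, A);
    SDValue LoB = DAG.getNode(ISD::TRUNCATE, SL, HalfVT, B);
    SDValue UAddO = DAG.getNode(ISD::UADDO, SL,
                                DAG.getVTList(HalfVT, MVT::i1), LoA, LoB);
    // The shifted-down high half is exactly the carry bit, extended back to
    // the original type. A real implementation would use the target's
    // boolean/setcc result type rather than MVT::i1.
    return DAG.getNode(ISD::ZERO_EXTEND, SL, VT, UAddO.getValue(1));
  }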


================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp:3221
+      bool CanCombine = true;
+      for (SDNode *User : LHS->uses()) {
+        if (User == N)
----------------
foad wrote:
> arsenm wrote:
> > Looking at uses is unusual and I'm not sure why you're doing it
> As mentioned below, the thinking is that this transform is not profitable unless every use either only wants the overflow bit, or only wants the low 32 bits of the 64-bit result. Otherwise you might as well keep the full 64-bit add.
Indeed, as Jay said, it's because the transformation is only profitable when every user cares only about the lower 32 bits or only about the carry bit.
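
For reference, a rough sketch of that uses scan (hypothetical local names, not the exact patch code): the add may only be consumed by this shift, which becomes the carry bit, or by users that only want the low 32 bits, such as truncates; any other user forces the full 64-bit add to stay around, so the combine stops being a win.

  // N is the (srl (add a, b), 32) node; Add is the 64-bit add operand.
  bool CanCombine = true;
  for (SDNode *User : Add->uses()) {
    if (User == N)
      continue; // the shift itself: it only consumes the carry bit
    if (User->getOpcode() == ISD::TRUNCATE &&
        User->getValueType(0).getSizeInBits() <= 32)
      continue; // this user only wants the low 32 bits
    // Some user needs the full 64-bit value, so the wide add must be kept
    // anyway; splitting it into a uaddo would only add instructions.
    CanCombine = false;
    break;
  }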


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D137705/new/

https://reviews.llvm.org/D137705


