[llvm] [AMDGPU] - Add constant folding for s_bitreplicate (PR #72366)

Jay Foad via llvm-commits llvm-commits at lists.llvm.org
Wed Nov 15 01:58:07 PST 2023


================
@@ -2422,6 +2423,23 @@ static Constant *ConstantFoldScalarCall1(StringRef Name,
 
       return ConstantFP::get(Ty->getContext(), Val);
     }
+
+    case Intrinsic::amdgcn_s_bitreplicate: {
+      uint64_t Val = Op->getZExtValue();
+      uint64_t ReplicatedVal = 0;
+      uint64_t ReplicatedOnes = 0b11;
+      // Input operand is always b32
+      for (unsigned i = 0; i < 32; ++i, ReplicatedOnes <<= 2, Val >>= 1) {
+        uint64_t Bit = Val & 1;
+
+        if (!Bit)
+          continue;
+
+        ReplicatedVal |= ReplicatedOnes;
+      }
----------------
jayfoad wrote:

This is fun - there are lots of different ways to write it! For example:
```
  for (unsigned I = 0; I < 32; ++I)
    ReplicatedVal |= ((Val & (1ull << I)) * 3) << I;
```
Or:
```
  Val = (Val & 0x000000000000FFFF) | (Val & 0x00000000FFFF0000) << 16;
  Val = (Val & 0x000000FF000000FF) | (Val & 0x0000FF000000FF00) << 8;
  Val = (Val & 0x000F000F000F000F) | (Val & 0x00F000F000F000F0) << 4;
  Val = (Val & 0x0303030303030303) | (Val & 0x0C0C0C0C0C0C0C0C) << 2;
  Val = (Val & 0x5555555555555555) | (Val & 0xAAAAAAAAAAAAAAAA) << 1;
  ReplicatedVal = Val | Val << 1; // or "Val * 3"
```
But I think it is your decision how you want to write it - I don't mind too much.
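
For anyone who wants to convince themselves that the variants above are equivalent, here is a minimal standalone sketch that puts the loop from the patch next to the two alternatives. It is not part of the PR: the function names (`bitReplicateLoop`, `bitReplicateMul`, `bitReplicateSpread`, `allVariantsAgree`) are made up for illustration, and it assumes the input is a plain 32-bit value zero-extended to 64 bits, as in the patch.

```cpp
#include <cassert>
#include <cstdint>

// Loop in the style of the patch: every set input bit I becomes the
// two-bit pair at positions 2*I and 2*I+1.
static uint64_t bitReplicateLoop(uint32_t Src) {
  uint64_t Val = Src;
  uint64_t Result = 0;
  uint64_t Ones = 0b11;
  for (unsigned I = 0; I < 32; ++I, Ones <<= 2, Val >>= 1)
    if (Val & 1)
      Result |= Ones;
  return Result;
}

// Multiply variant: isolate bit I, widen it to two adjacent bits with *3,
// then shift the pair from position I up to position 2*I.
static uint64_t bitReplicateMul(uint32_t Src) {
  uint64_t Val = Src;
  uint64_t Result = 0;
  for (unsigned I = 0; I < 32; ++I)
    Result |= ((Val & (uint64_t(1) << I)) * 3) << I;
  return Result;
}

// Spread variant: move input bit I to position 2*I in five mask-and-shift
// steps, then copy each bit into its upper neighbour.
static uint64_t bitReplicateSpread(uint32_t Src) {
  uint64_t Val = Src;
  Val = (Val & 0x000000000000FFFF) | (Val & 0x00000000FFFF0000) << 16;
  Val = (Val & 0x000000FF000000FF) | (Val & 0x0000FF000000FF00) << 8;
  Val = (Val & 0x000F000F000F000F) | (Val & 0x00F000F000F000F0) << 4;
  Val = (Val & 0x0303030303030303) | (Val & 0x0C0C0C0C0C0C0C0C) << 2;
  Val = (Val & 0x5555555555555555) | (Val & 0xAAAAAAAAAAAAAAAA) << 1;
  return Val | Val << 1;
}

// Hypothetical helper: true if all three variants give the same result.
static bool allVariantsAgree(uint32_t Src) {
  uint64_t Expected = bitReplicateLoop(Src);
  return bitReplicateMul(Src) == Expected &&
         bitReplicateSpread(Src) == Expected;
}
```

Calling `allVariantsAgree` on a handful of edge cases (0, 1, 0x80000000, 0xFFFFFFFF) plus a few arbitrary patterns should be enough to catch a wrong mask or shift amount in any of the variants.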

https://github.com/llvm/llvm-project/pull/72366

