[llvm] [AMDGPU] Add wave reduce intrinsics for float types - 2 (PR #168859)

Thu Nov 20 06:19:58 PST 2025

================
@@ -5480,6 +5480,15 @@ static uint32_t getIdentityValueFor32BitWaveReduction(unsigned Opc) {
     return std::numeric_limits<uint32_t>::min();
   case AMDGPU::S_MAX_I32:
     return std::numeric_limits<int32_t>::min();
+  case AMDGPU::V_ADD_F32_e64:   // -0.0
+  case AMDGPU::V_SUB_F32_e64: { // +0.0
+    union {
+      uint32_t IntPattern;
+      float FloatPattern;
+    };
+    FloatPattern = Opc == AMDGPU::V_ADD_F32_e64 ? -0.0f : +0.0f;
+    return IntPattern;
----------------
jmmartinez wrote:

Sorry, it's too close to undefined-behavior for me not to be scared of it.

From https://en.cppreference.com/w/cpp/language/union.html:
> It is undefined behavior to read from the member of the union that wasn't most recently written. Many compilers implement, as a non-standard language extension, the ability to read inactive members of a union.

Maybe use memcpy:

```cpp
  case AMDGPU::V_ADD_F32_e64:   // -0.0
  case AMDGPU::V_SUB_F32_e64: { // +0.0
    float AsFloat = Opc == AMDGPU::V_ADD_F32_e64 ? -0.0f : +0.0f;
    uint32_t AsInt;
    memcpy(&AsInt, &AsFloat, sizeof(AsInt));
    return AsInt;
```
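
To make the idea checkable outside the backend, here is a minimal standalone sketch of the same `memcpy` approach. The helper names `floatBits` and `identityBits` are hypothetical and not part of the PR, and the `bool` parameter only stands in for the opcode comparison. It assumes a 32-bit IEEE-754 `float`.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper (not from the PR): returns the raw bit pattern of F.
// memcpy between two same-sized trivially copyable objects is well defined,
// unlike reading the inactive member of a union.
static uint32_t floatBits(float F) {
  static_assert(sizeof(uint32_t) == sizeof(float), "float must be 32 bits");
  uint32_t Bits;
  std::memcpy(&Bits, &F, sizeof(Bits));
  return Bits;
}

// Hypothetical stand-in for the two opcode cases in the hunk:
// IsAdd == true  models V_ADD_F32_e64 (identity -0.0),
// IsAdd == false models V_SUB_F32_e64 (identity +0.0).
static uint32_t identityBits(bool IsAdd) {
  return floatBits(IsAdd ? -0.0f : +0.0f);
}
```

Under that assumption, `identityBits(true)` yields `0x80000000` (only the sign bit set) and `identityBits(false)` yields `0`. If I'm not mistaken, `llvm::bit_cast` from `llvm/ADT/bit.h` would do the same thing in one line.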

---

BTW, thanks for teaching me about anonymous unions. I wasn't familiar with that syntax!

https://github.com/llvm/llvm-project/pull/168859