[all-commits] [llvm/llvm-project] b2e37b: [AMDGPU] Fix fmed3 constant-fold sign-of-zero misc...

Wooseok Lee via All-commits all-commits at lists.llvm.org
Mon Jun 8 11:43:00 PDT 2026


  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: b2e37b668bc4550bf8c72806402b116eb5d72e66
      https://github.com/llvm/llvm-project/commit/b2e37b668bc4550bf8c72806402b116eb5d72e66
  Author: Wooseok Lee <wolee at amd.com>
  Date:   2026-06-08 (Mon, 08 Jun 2026)

  Changed paths:
    M llvm/lib/Target/AMDGPU/AMDGPUInstCombineIntrinsic.cpp
    M llvm/test/Transforms/InstCombine/AMDGPU/fmed3.ll

  Log Message:
  -----------
  [AMDGPU] Fix fmed3 constant-fold sign-of-zero miscompile (#201896)

fmed3AMDGCN identifies the maximum of three operands via APFloat::compare,
then returns maxnum of the remaining two as the median. APFloat::compare
treats +0 and -0 as equal (cmpEqual), so for inputs like fmed3(-0, -0, +0),
Max3=+0 incorrectly compares equal to Src0=-0, causing the wrong arm to
fire and returning +0 instead of the correct median -0.
    
Hardware v_med3_f32 sorts with -0 < +0 uniformly across all generations,
so fmed3(-0, -0, +0) must return -0.
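
The ordering described above can be modeled in standalone C++ as a reference
median. This is only a sketch under stated assumptions: lessSignedZero and
med3Reference are hypothetical names that do not appear in the commit, float
stands in for the f32 operands, and NaN inputs are assumed to have been
excluded by the caller.

```cpp
#include <algorithm>
#include <array>
#include <cmath>

// Hypothetical comparator: numeric order, with -0 ordered below +0.
// Assumes neither argument is NaN.
static bool lessSignedZero(float a, float b) {
  if (a == b) // equal values, including the +0/-0 pair
    return std::signbit(a) && !std::signbit(b);
  return a < b;
}

// Hypothetical reference median: sort the three operands under the
// signed-zero-aware order and take the middle element.
static float med3Reference(float s0, float s1, float s2) {
  std::array<float, 3> v{s0, s1, s2};
  std::sort(v.begin(), v.end(), lessSignedZero);
  return v[1];
}
```

Under this model, med3Reference(-0.0f, -0.0f, 0.0f) sorts to (-0, -0, +0)
and returns the middle element, -0.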
    
Fix by replacing APFloat::compare equality checks with
APFloat::bitwiseIsEqual, which distinguishes +0 from -0 by bit pattern.
This is strictly correct: the only case where compare returns cmpEqual but
bitwiseIsEqual returns false is the +0/-0 pair, which is exactly the
misidentification being fixed. All three arms of the helper are covered.
    
Affected inputs (all returning wrong +0 before the fix):
  fmed3(-0, -0, +0), fmed3(-0, +0, -0), fmed3(+0, -0, -0)
  fmed3(N, -0, +0), fmed3(-0, N, +0), fmed3(-0, +0, N)  where N < 0
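
The difference between the two equality checks can be reproduced outside
LLVM with a small model. The sketch below is not the code from
AMDGPUInstCombineIntrinsic.cpp: it uses plain float instead of APFloat, and
maxnumModel, bitwiseEqualModel, fmed3Before and fmed3After are hypothetical
names introduced only for illustration. It assumes a 32-bit float, a maxnum
that orders -0 below +0, and no NaN inputs.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == sizeof(std::uint32_t), "model assumes 32-bit float");

// Hypothetical stand-in for maxnum: numeric maximum, with +0 chosen over -0.
static float maxnumModel(float a, float b) {
  if (a == b) // equal values, including the +0/-0 pair
    return std::signbit(a) ? b : a;
  return a > b ? a : b;
}

// Hypothetical stand-in for a bitwise equality check: compares bit patterns,
// so +0 and -0 are different.
static bool bitwiseEqualModel(float a, float b) {
  std::uint32_t ba, bb;
  std::memcpy(&ba, &a, sizeof ba);
  std::memcpy(&bb, &b, sizeof bb);
  return ba == bb;
}

// Model of the pre-fix logic: numeric equality treats +0 and -0 as equal.
static float fmed3Before(float s0, float s1, float s2) {
  assert(!std::isnan(s0) && !std::isnan(s1) && !std::isnan(s2));
  float max3 = maxnumModel(maxnumModel(s0, s1), s2);
  if (max3 == s0)
    return maxnumModel(s1, s2);
  if (max3 == s1)
    return maxnumModel(s0, s2);
  return maxnumModel(s0, s1);
}

// Model of the post-fix logic: the maximum is identified by bit pattern.
static float fmed3After(float s0, float s1, float s2) {
  assert(!std::isnan(s0) && !std::isnan(s1) && !std::isnan(s2));
  float max3 = maxnumModel(maxnumModel(s0, s1), s2);
  if (bitwiseEqualModel(max3, s0))
    return maxnumModel(s1, s2);
  if (bitwiseEqualModel(max3, s1))
    return maxnumModel(s0, s2);
  return maxnumModel(s0, s1);
}
```

In this model, fmed3Before(-0.0f, -0.0f, 0.0f) takes the first arm and
yields +0, while fmed3After(-0.0f, -0.0f, 0.0f) falls through to the last
arm and yields -0. Inputs without signed zeros, such as (1, 2, 3), give the
same result from both versions.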

