[all-commits] [llvm/llvm-project] b2e37b: [AMDGPU] Fix fmed3 constant-fold sign-of-zero misc...
Wooseok Lee via All-commits
all-commits at lists.llvm.org
Mon Jun 8 11:43:00 PDT 2026
Branch: refs/heads/main
Home: https://github.com/llvm/llvm-project
Commit: b2e37b668bc4550bf8c72806402b116eb5d72e66
https://github.com/llvm/llvm-project/commit/b2e37b668bc4550bf8c72806402b116eb5d72e66
Author: Wooseok Lee <wolee at amd.com>
Date: 2026-06-08 (Mon, 08 Jun 2026)
Changed paths:
M llvm/lib/Target/AMDGPU/AMDGPUInstCombineIntrinsic.cpp
M llvm/test/Transforms/InstCombine/AMDGPU/fmed3.ll
Log Message:
-----------
[AMDGPU] Fix fmed3 constant-fold sign-of-zero miscompile (#201896)
fmed3AMDGCN identifies the maximum of three operands via
APFloat::compare, then returns maxnum of the remaining two as the
median. APFloat::compare treats +0 and -0 as equal (cmpEqual), so for
inputs like fmed3(-0, -0, +0), Max3=+0 incorrectly compares equal to
Src0=-0, causing the wrong arm to fire and returning +0 instead of the
correct median -0.

Hardware v_med3_f32 sorts with -0 < +0 uniformly across all generations,
so fmed3(-0, -0, +0) must return -0.

Fix by replacing APFloat::compare equality checks with
APFloat::bitwiseIsEqual, which distinguishes +0 from -0 by bit pattern.
This is strictly correct: the only case where compare returns cmpEqual
but bitwiseIsEqual returns false is the +0/-0 pair, which is exactly the
misidentification being fixed. All three arms of the helper are covered.

Affected inputs (all returning wrong +0 before the fix):
fmed3(-0, -0, +0), fmed3(-0, +0, -0), fmed3(+0, -0, -0)
fmed3(N, -0, +0), fmed3(-0, N, +0), fmed3(-0, +0, N) where N < 0
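
The difference between the two equality checks can be modeled outside LLVM. The following is a minimal sketch on plain float, not the code from AMDGPUInstCombineIntrinsic.cpp: the helper names (maxnumSignedZero, bitwiseEqual, med3ValueCompare, med3BitwiseCompare) are hypothetical, the maxnum stand-in is assumed to order -0 below +0, and NaN inputs are assumed to be handled elsewhere.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for a signed-zero-aware maxnum (assumes -0 < +0).
// NaN inputs are assumed to be filtered out before this point.
static float maxnumSignedZero(float A, float B) {
  if (A == 0.0f && B == 0.0f)
    return std::signbit(A) ? B : A; // prefer +0 when the signs differ
  return A < B ? B : A;
}

// Bit-pattern equality: unlike operator==, +0 and -0 are different here.
static bool bitwiseEqual(float A, float B) {
  std::uint32_t X, Y;
  std::memcpy(&X, &A, sizeof X);
  std::memcpy(&Y, &B, sizeof Y);
  return X == Y;
}

// Model of the described pre-fix logic: find the max by value equality,
// then take maxnum of the other two. Value equality treats +0 == -0.
static float med3ValueCompare(float S0, float S1, float S2) {
  float Max3 = maxnumSignedZero(maxnumSignedZero(S0, S1), S2);
  if (Max3 == S0)
    return maxnumSignedZero(S1, S2);
  if (Max3 == S1)
    return maxnumSignedZero(S0, S2);
  return maxnumSignedZero(S0, S1);
}

// Model of the described fix: the same arms, but the max is identified
// by bit pattern, so a -0 operand is never mistaken for a +0 maximum.
static float med3BitwiseCompare(float S0, float S1, float S2) {
  float Max3 = maxnumSignedZero(maxnumSignedZero(S0, S1), S2);
  if (bitwiseEqual(Max3, S0))
    return maxnumSignedZero(S1, S2);
  if (bitwiseEqual(Max3, S1))
    return maxnumSignedZero(S0, S2);
  return maxnumSignedZero(S0, S1);
}
```

In this model, med3ValueCompare(-0.0f, -0.0f, 0.0f) yields +0 because Max3=+0 compares equal to S0=-0, while med3BitwiseCompare falls through to the last arm and yields -0. Checking the result with std::signbit rather than == is needed, since == cannot tell the two zeros apart.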