[PATCH] D48556: Fix asserts in AMDGCN fmed3 folding by handling more cases of NaN
Matt Arsenault via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Jun 25 12:51:19 PDT 2018
arsenm added inline comments.
================
Comment at: lib/Transforms/InstCombine/InstCombineCalls.cpp:3429-3432
+ // Checking for NaN before canonicalization provides better fidelity for
+ // mapping fclamp operations to fmed3. This will still not properly model
+ // NaN in fclamp if the intrinsic is canonicalized prior to the NaN being
+ // determined.
----------------
This comment is sort of confusing me. Is it saying there is still a problem?
Probably shouldn't mention fclamp, as no operation called fclamp is involved here.
================
Comment at: lib/Transforms/InstCombine/InstCombineCalls.cpp:3447-3452
+ if (match(Src2, m_NaN()) || isa<UndefValue>(Src2)) {
+ CallInst *NewCall = Builder.CreateMaxNum(Src0, Src1);
+ NewCall->copyFastMathFlags(II);
+ NewCall->takeName(II);
+ return replaceInstUsesWith(*II, NewCall);
+ }
----------------
I don't really like repeating all of these so many times, but it's probably clearer than the alternatives. Should this used FastMathFlagGuard to avoid repeating the copyFastMathFlags at least?
================
Comment at: test/Transforms/InstCombine/AMDGPU/amdgcn-intrinsics.ll:1337-1357
+; CHECK-LABEL: @fmed3_nan_0_1_f32(
+; CHECK: ret float 0.0
+define float @fmed3_nan_0_1_f32() {
+ %med3 = call float @llvm.amdgcn.fmed3.f32(float 0x7FF8001000000000, float 0.0, float 1.0)
+ ret float %med3
+}
+
----------------
Since the undef value check is repeated so many times, better add versions of these with undef instead of nan to make sure those work
Repository:
rL LLVM
https://reviews.llvm.org/D48556
More information about the llvm-commits
mailing list