[PATCH] D90901: [DAGCombiner] Don't fold ((fma (fneg X), Y, (fneg Z)) to fneg (fma X, Y, Z))
Qing Shan Zhang via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri Nov 6 02:43:00 PST 2020
steven.zhang added a comment.
In D90901#2378338 <https://reviews.llvm.org/D90901#2378338>, @Jim wrote:
> In D90901#2378066 <https://reviews.llvm.org/D90901#2378066>, @craig.topper wrote:
>
>> No test changes?
>>
>> The transform would be legal with no signed zeros fast math flag right? I believe that's checked in getNegatedExpression. But not checked in the X86 specific override.
>>
>> I think the same issue may also exist in ARMInstrVFP.td. It matches both (fneg (fma x, y, z)) and (fma (fneg x), y, (fneg z)) to the same instruction.
>>
>> // Match @llvm.fma.* intrinsics
>> // (fneg (fma x, y, z)) -> (vfnma z, x, y)
>> def : Pat<(fneg (fma (f64 DPR:$Dn), (f64 DPR:$Dm), (f64 DPR:$Ddin))),
>> (VFNMAD DPR:$Ddin, DPR:$Dn, DPR:$Dm)>,
>> Requires<[HasVFP4,HasDPVFP]>;
>> def : Pat<(fneg (fma (f32 SPR:$Sn), (f32 SPR:$Sm), (f32 SPR:$Sdin))),
>> (VFNMAS SPR:$Sdin, SPR:$Sn, SPR:$Sm)>,
>> Requires<[HasVFP4]>;
>> def : Pat<(fneg (fma (f16 HPR:$Sn), (f16 HPR:$Sm), (f16 (f16 HPR:$Sdin)))),
>> (VFNMAH (f16 HPR:$Sdin), (f16 HPR:$Sn), (f16 HPR:$Sm))>,
>> Requires<[HasFullFP16]>;
>> // (fma (fneg x), y, (fneg z)) -> (vfnma z, x, y)
>> def : Pat<(f64 (fma (fneg DPR:$Dn), DPR:$Dm, (fneg DPR:$Ddin))),
>> (VFNMAD DPR:$Ddin, DPR:$Dn, DPR:$Dm)>,
>> Requires<[HasVFP4,HasDPVFP]>;
>> def : Pat<(f32 (fma (fneg SPR:$Sn), SPR:$Sm, (fneg SPR:$Sdin))),
>> (VFNMAS SPR:$Sdin, SPR:$Sn, SPR:$Sm)>,
>> Requires<[HasVFP4]>;
>> def : Pat<(f16 (fma (fneg (f16 HPR:$Sn)), (f16 HPR:$Sm), (fneg (f16 HPR:$Sdin)))),
>> (VFNMAH (f16 HPR:$Sdin), (f16 HPR:$Sn), (f16 HPR:$Sm))>,
>> Requires<[HasFullFP16]>;
>
> Yes, it is legal with the no-signed-zeros fast-math flag. I see that PowerPC deals with this case specifically.
>
> Should I add the condition with no signed zero to permit this transform instead of deleting it?
I think so. As Craig pointed out, the default implementation of getNegatedExpression takes care of the fast-math flags. You need to check nsz inside X86::getNegatedExpression() when performing any fold that might change the sign bit of zero.
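
For reference, here is a small standalone sketch of why the fold needs nsz. It is plain C++ using std::fma, not LLVM code, and the helper names (fmaNegatedOperands, negatedFma, formsAgree, demo) are made up for illustration. It assumes the default round-to-nearest mode and no -ffast-math. With x = 1, y = 1, z = -1 the exact sum is zero, so (fma (fneg X), Y, (fneg Z)) gives +0 while (fneg (fma X, Y, Z)) gives -0.

```cpp
#include <cassert>
#include <cmath>

// Form before the fold: fma(-x, y, -z).
inline double fmaNegatedOperands(double x, double y, double z) {
  return std::fma(-x, y, -z);
}

// Form after the fold: -(fma(x, y, z)).
inline double negatedFma(double x, double y, double z) {
  return -std::fma(x, y, z);
}

// True when both forms agree in value and in sign bit.
inline bool formsAgree(double x, double y, double z) {
  const double a = fmaNegatedOperands(x, y, z);
  const double b = negatedFma(x, y, z);
  return a == b && std::signbit(a) == std::signbit(b);
}

// Illustrative checks (hypothetical helper, not part of the patch).
inline void demo() {
  // Non-zero result: both forms give -5.
  assert(formsAgree(1.0, 2.0, 3.0));
  // Exact cancellation: +0 versus -0, so the sign bits differ.
  assert(!formsAgree(1.0, 1.0, -1.0));
}
```

The two forms only disagree when the fma result is exactly zero, which is why gating the fold on nsz should be enough rather than dropping it entirely.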
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D90901/new/
https://reviews.llvm.org/D90901