[PATCH] D90901: [DAGCombiner] Don't fold ((fma (fneg X), Y, (fneg Z)) to fneg (fma X, Y, Z))

Qing Shan Zhang via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Nov 6 02:43:00 PST 2020


steven.zhang added a comment.

In D90901#2378338 <https://reviews.llvm.org/D90901#2378338>, @Jim wrote:

> In D90901#2378066 <https://reviews.llvm.org/D90901#2378066>, @craig.topper wrote:
>
>> No test changes?
>>
>> The transform would be legal with the no-signed-zeros fast-math flag, right? I believe that's checked in getNegatedExpression, but not in the X86-specific override.
>>
>> I think the same issue may also exist in ARMInstrVFP.td. It matches both (fneg (fma x, y, z)) and (fma (fneg x), y, (fneg z)) to the same instruction.
>>
>>   // Match @llvm.fma.* intrinsics
>>   // (fneg (fma x, y, z)) -> (vfnma z, x, y)
>>   def : Pat<(fneg (fma (f64 DPR:$Dn), (f64 DPR:$Dm), (f64 DPR:$Ddin))),
>>             (VFNMAD DPR:$Ddin, DPR:$Dn, DPR:$Dm)>,
>>         Requires<[HasVFP4,HasDPVFP]>;
>>   def : Pat<(fneg (fma (f32 SPR:$Sn), (f32 SPR:$Sm), (f32 SPR:$Sdin))),
>>             (VFNMAS SPR:$Sdin, SPR:$Sn, SPR:$Sm)>,
>>         Requires<[HasVFP4]>;
>>   def : Pat<(fneg (fma (f16 HPR:$Sn), (f16 HPR:$Sm), (f16 (f16 HPR:$Sdin)))),
>>             (VFNMAH (f16 HPR:$Sdin), (f16 HPR:$Sn), (f16 HPR:$Sm))>,
>>         Requires<[HasFullFP16]>;
>>   // (fma (fneg x), y, (fneg z)) -> (vfnma z, x, y)
>>   def : Pat<(f64 (fma (fneg DPR:$Dn), DPR:$Dm, (fneg DPR:$Ddin))),
>>             (VFNMAD DPR:$Ddin, DPR:$Dn, DPR:$Dm)>,
>>         Requires<[HasVFP4,HasDPVFP]>;
>>   def : Pat<(f32 (fma (fneg SPR:$Sn), SPR:$Sm, (fneg SPR:$Sdin))),
>>             (VFNMAS SPR:$Sdin, SPR:$Sn, SPR:$Sm)>,
>>         Requires<[HasVFP4]>;
>>   def : Pat<(f16 (fma (fneg (f16 HPR:$Sn)), (f16 HPR:$Sm), (fneg (f16 HPR:$Sdin)))),
>>             (VFNMAH (f16 HPR:$Sdin), (f16 HPR:$Sn), (f16 HPR:$Sm))>,
>>         Requires<[HasFullFP16]>;
>
> Yes, it is legal with the no-signed-zeros fast-math flag. I see that PowerPC deals with this case specifically.
>
> Should I add a no-signed-zeros condition to permit this transform instead of deleting it?

I think so. As Craig pointed out, the default implementation of getNegatedExpression takes care of the fast-math flags. You need to check nsz inside X86::getNegatedExpression() when performing any folding that might change the sign bit of zero.
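
For reference, here is a small standalone C++ sketch of why the fold needs nsz. It is not code from the patch: the function names are made up for illustration, and it assumes IEEE-754 doubles, the default round-to-nearest mode, and a build without fast-math options.

```cpp
#include <cmath>

// Illustrative helper (hypothetical name): the unfolded form, fma(-x, y, -z).
double fma_negated_operands(double x, double y, double z) {
    return std::fma(-x, y, -z);
}

// Illustrative helper (hypothetical name): the folded form, -(fma(x, y, z)).
double negated_fma(double x, double y, double z) {
    return -std::fma(x, y, z);
}

// True when the two forms agree in both value and sign bit.
// Plain == is not enough here because +0.0 == -0.0 compares equal.
bool forms_agree(double x, double y, double z) {
    double a = fma_negated_operands(x, y, z);
    double b = negated_fma(x, y, z);
    return a == b && std::signbit(a) == std::signbit(b);
}
```

With x = 1, y = 1, z = -1, the unfolded form computes (-1 * 1) + 1 = +0.0, while the folded form computes -((1 * 1) + (-1)) = -(+0.0) = -0.0, so forms_agree returns false. For inputs whose result is nonzero, such as x = 2, y = 3, z = 1, both forms give -7 and agree. The only difference is the sign of a zero result, which is exactly what nsz allows the compiler to ignore.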


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D90901/new/

https://reviews.llvm.org/D90901


