[llvm] r312095 - [AMDGPU] Use v_max_f* for fcanonicalize

Matt Arsenault via llvm-commits llvm-commits at lists.llvm.org
Tue Aug 29 20:09:18 PDT 2017


> On Aug 29, 2017, at 20:03, Stanislav Mekhanoshin via llvm-commits <llvm-commits at lists.llvm.org> wrote:
> 
> Author: rampitec
> Date: Tue Aug 29 20:03:38 2017
> New Revision: 312095
> 
> URL: http://llvm.org/viewvc/llvm-project?rev=312095&view=rev
> Log:
> [AMDGPU] Use v_max_f* for fcanonicalize
> 
> If denormals are not flushed, we can use max instead of multiplication
> by 1.0. For double that is simply faster, while for float and half
> the encoding is shorter, because mul uses the constant bus and VOP3.
> 
> Differential Revision: https://reviews.llvm.org/D36856
> 
> Modified:
>    llvm/trunk/lib/Target/AMDGPU/AMDGPUInstructions.td
>    llvm/trunk/lib/Target/AMDGPU/SIInstructions.td
>    llvm/trunk/test/CodeGen/AMDGPU/fcanonicalize-elimination.ll
>    llvm/trunk/test/CodeGen/AMDGPU/fcanonicalize.f16.ll
>    llvm/trunk/test/CodeGen/AMDGPU/fcanonicalize.ll
> 
> Modified: llvm/trunk/lib/Target/AMDGPU/AMDGPUInstructions.td
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AMDGPU/AMDGPUInstructions.td?rev=312095&r1=312094&r2=312095&view=diff
> ==============================================================================
> --- llvm/trunk/lib/Target/AMDGPU/AMDGPUInstructions.td (original)
> +++ llvm/trunk/lib/Target/AMDGPU/AMDGPUInstructions.td Tue Aug 29 20:03:38 2017
> @@ -42,9 +42,12 @@ class AMDGPUShaderInst <dag outs, dag in
>   field bits<32> Inst = 0xffffffff;
> }
> 
> -def FP16Denormals : Predicate<"Subtarget.hasFP16Denormals()">;
> -def FP32Denormals : Predicate<"Subtarget.hasFP32Denormals()">;
> -def FP64Denormals : Predicate<"Subtarget.hasFP64Denormals()">;
> +def FP16Denormals : Predicate<"Subtarget->hasFP16Denormals()">;
> +def FP32Denormals : Predicate<"Subtarget->hasFP32Denormals()">;
> +def FP64Denormals : Predicate<"Subtarget->hasFP64Denormals()">;
> +def NoFP16Denormals : Predicate<"!Subtarget->hasFP16Denormals()">;
> +def NoFP32Denormals : Predicate<"!Subtarget->hasFP32Denormals()">;
> +def NoFP64Denormals : Predicate<"!Subtarget->hasFP64Denormals()">;
> def UnsafeFPMath : Predicate<"TM.Options.UnsafeFPMath">;
> 
> def InstFlag : OperandWithDefaultOps <i32, (ops (i32 0))>;
> 
> Modified: llvm/trunk/lib/Target/AMDGPU/SIInstructions.td
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AMDGPU/SIInstructions.td?rev=312095&r1=312094&r2=312095&view=diff
> ==============================================================================
> --- llvm/trunk/lib/Target/AMDGPU/SIInstructions.td (original)
> +++ llvm/trunk/lib/Target/AMDGPU/SIInstructions.td Tue Aug 29 20:03:38 2017
> @@ -1278,20 +1278,47 @@ defm : BFMPatterns <i32, S_BFM_B32, S_MO
> // FIXME: defm : BFMPatterns <i64, S_BFM_B64, S_MOV_B64>;
> defm : BFEPattern <V_BFE_U32, V_BFE_I32, S_MOV_B32>;
> 
> +let Predicates = [NoFP16Denormals] in {
> def : Pat<
>   (fcanonicalize (f16 (VOP3Mods f16:$src, i32:$src_mods))),
>   (V_MUL_F16_e64 0, (i32 CONST.FP16_ONE), $src_mods, $src, 0, 0)
> >;
> +}
> 
> +let Predicates = [FP16Denormals] in {
> +def : Pat<
> +  (fcanonicalize (f16 (VOP3Mods f16:$src, i32:$src_mods))),
> +  (V_MAX_F16_e64 $src_mods, $src, $src_mods, $src, 0, 0)
> +>;
> +}
> +
> +let Predicates = [NoFP32Denormals] in {
> def : Pat<
>   (fcanonicalize (f32 (VOP3Mods f32:$src, i32:$src_mods))),
>   (V_MUL_F32_e64 0, (i32 CONST.FP32_ONE), $src_mods, $src, 0, 0)
> >;
> +}
> +
> +let Predicates = [FP32Denormals] in {
> +def : Pat<
> +  (fcanonicalize (f32 (VOP3Mods f32:$src, i32:$src_mods))),
> +  (V_MAX_F32_e64 $src_mods, $src, $src_mods, $src, 0, 0)
> +>;
> +}
> 
> +let Predicates = [NoFP64Denormals] in {
> def : Pat<
>   (fcanonicalize (f64 (VOP3Mods f64:$src, i32:$src_mods))),
>   (V_MUL_F64 0, CONST.FP64_ONE, $src_mods, $src, 0, 0)
> >;
> +}
> +
> +let Predicates = [FP64Denormals] in {
> +def : Pat<
> +  (fcanonicalize (f64 (VOP3Mods f64:$src, i32:$src_mods))),
> +  (V_MAX_F64 $src_mods, $src, $src_mods, $src, 0, 0)
> +>;
> +}
> 
> def : Pat<
>   (fcanonicalize (v2f16 (VOP3PMods v2f16:$src, i32:$src_mods))),


I just noticed you missed the packed (v2f16) case.
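
For anyone following the reasoning in the commit log, here is a toy bit-level sketch of why max(x, x) can stand in for a multiply by 1.0 only when denormals are kept. It is a model under stated assumptions, not a description of the hardware: the function names are hypothetical, and it assumes that max(x, x) quiets signaling NaNs and passes every other value through unchanged.

```python
import struct

# IEEE-754 binary32 field masks.
SIGN_MASK = 0x80000000
EXP_MASK = 0x7F800000
MANT_MASK = 0x007FFFFF
QUIET_BIT = 0x00400000


def is_nan(bits):
    """True if the f32 bit pattern encodes a NaN (quiet or signaling)."""
    return (bits & EXP_MASK) == EXP_MASK and (bits & MANT_MASK) != 0


def is_denormal(bits):
    """True if the f32 bit pattern encodes a nonzero denormal."""
    return (bits & EXP_MASK) == 0 and (bits & MANT_MASK) != 0


def canonicalize_f32(bits, flush_denormals):
    """Toy model of fcanonicalize (hypothetical helper, not an LLVM API)."""
    if is_nan(bits):
        return bits | QUIET_BIT      # quiet any signaling NaN
    if flush_denormals and is_denormal(bits):
        return bits & SIGN_MASK      # flush to a zero of the same sign
    return bits


def max_self_f32(bits):
    """Toy model of max(x, x).

    Assumption: NaNs are quieted and all other values, including
    denormals, are returned unchanged.
    """
    return bits | QUIET_BIT if is_nan(bits) else bits


def f32_bits(x):
    """Bit pattern of a Python float rounded to binary32."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


SAMPLES = [
    0x00000000,       # +0.0
    0x00000001,       # smallest positive denormal
    0x80000001,       # smallest negative denormal
    0x7F800000,       # +inf
    0x7F800001,       # signaling NaN
    0x7FC00000,       # quiet NaN
    f32_bits(1.5),    # ordinary normal value
]

# With denormals kept, the modeled max(x, x) agrees with canonicalize
# on every sample.
AGREES_WITHOUT_FLUSH = all(
    max_self_f32(b) == canonicalize_f32(b, flush_denormals=False)
    for b in SAMPLES
)

# With flushing enabled, the two disagree on denormal inputs.
DISAGREES_WITH_FLUSH = [
    b for b in SAMPLES
    if max_self_f32(b) != canonicalize_f32(b, flush_denormals=True)
]
```

In this model the two operations differ only on denormal inputs when flushing is enabled, which matches the patch keeping the multiply under the NoFP*Denormals predicates.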

-Matt

