[PATCHES] R600/SI: New V_FRACT fix, intrinsic for S_FLBIT_I32, and more
Marek Olšák
maraeo at gmail.com
Tue Mar 10 09:37:00 PDT 2015
On Tue, Mar 10, 2015 at 3:50 PM, Tom Stellard <tom at stellard.net> wrote:
> On Thu, Mar 05, 2015 at 10:33:15PM +0100, Marek Olšák wrote:
>> From e9f7ebe3fa7751e40b7d7cf4fadc17c7c8ef3a4a Mon Sep 17 00:00:00 2001
>> From: =?UTF-8?q?Marek=20Ol=C5=A1=C3=A1k?= <marek.olsak at amd.com>
>> Date: Sun, 1 Mar 2015 23:07:48 +0100
>> Subject: [PATCH 2/3] R600/SI: Expand fract to floor, then only select V_FRACT
>> on CI
>>
>> V_FRACT is buggy on SI.
>>
>> R600-specific code is left intact.
>>
>> v2: drop the multiclass, use complex VOP3 patterns
>> ---
>> lib/Target/R600/AMDGPUISelLowering.cpp | 3 ---
>> lib/Target/R600/R600ISelLowering.cpp | 4 +++
>> lib/Target/R600/SIISelLowering.cpp | 6 +++++
>> lib/Target/R600/SIInstructions.td | 22 +++++++++++++++++
>> test/CodeGen/R600/llvm.AMDGPU.fract.ll | 45 +++++++++++++++++++++++++++++++---
>> 5 files changed, 73 insertions(+), 7 deletions(-)
>>
>> diff --git a/lib/Target/R600/AMDGPUISelLowering.cpp b/lib/Target/R600/AMDGPUISelLowering.cpp
>> index 4707279..62a33fa 100644
>> --- a/lib/Target/R600/AMDGPUISelLowering.cpp
>> +++ b/lib/Target/R600/AMDGPUISelLowering.cpp
>> @@ -885,9 +885,6 @@ SDValue AMDGPUTargetLowering::LowerINTRINSIC_WO_CHAIN(SDValue Op,
>> return LowerIntrinsicIABS(Op, DAG);
>> case AMDGPUIntrinsic::AMDGPU_lrp:
>> return LowerIntrinsicLRP(Op, DAG);
>> - case AMDGPUIntrinsic::AMDGPU_fract:
>> - case AMDGPUIntrinsic::AMDIL_fraction: // Legacy name.
>> - return DAG.getNode(AMDGPUISD::FRACT, DL, VT, Op.getOperand(1));
>>
>> case AMDGPUIntrinsic::AMDGPU_clamp:
>> case AMDGPUIntrinsic::AMDIL_clamp: // Legacy name.
>> diff --git a/lib/Target/R600/R600ISelLowering.cpp b/lib/Target/R600/R600ISelLowering.cpp
>> index c738611..cf0a60f 100644
>> --- a/lib/Target/R600/R600ISelLowering.cpp
>> +++ b/lib/Target/R600/R600ISelLowering.cpp
>> @@ -837,6 +837,10 @@ SDValue R600TargetLowering::LowerOperation(SDValue Op, SelectionDAG &DAG) const
>> case Intrinsic::AMDGPU_rsq:
>> // XXX - I'm assuming SI's RSQ_LEGACY matches R600's behavior.
>> return DAG.getNode(AMDGPUISD::RSQ_LEGACY, DL, VT, Op.getOperand(1));
>> +
>> + case AMDGPUIntrinsic::AMDGPU_fract:
>> + case AMDGPUIntrinsic::AMDIL_fraction: // Legacy name.
>> + return DAG.getNode(AMDGPUISD::FRACT, DL, VT, Op.getOperand(1));
>> }
>> // break out of case ISD::INTRINSIC_WO_CHAIN in switch(Op.getOpcode())
>> break;
>> diff --git a/lib/Target/R600/SIISelLowering.cpp b/lib/Target/R600/SIISelLowering.cpp
>> index 7d794b8..5c9a9f9 100644
>> --- a/lib/Target/R600/SIISelLowering.cpp
>> +++ b/lib/Target/R600/SIISelLowering.cpp
>> @@ -932,6 +932,12 @@ SDValue SITargetLowering::LowerINTRINSIC_WO_CHAIN(SDValue Op,
>> Op.getOperand(1),
>> Op.getOperand(2),
>> Op.getOperand(3));
>> +
>> + case AMDGPUIntrinsic::AMDGPU_fract:
>> + case AMDGPUIntrinsic::AMDIL_fraction: // Legacy name.
>> + return DAG.getNode(ISD::FSUB, DL, VT, Op.getOperand(1),
>> + DAG.getNode(ISD::FFLOOR, DL, VT, Op.getOperand(1)));
>> +
>> default:
>> return AMDGPUTargetLowering::LowerOperation(Op, DAG);
>> }
>> diff --git a/lib/Target/R600/SIInstructions.td b/lib/Target/R600/SIInstructions.td
>> index ab1f08f..6b9230a 100644
>> --- a/lib/Target/R600/SIInstructions.td
>> +++ b/lib/Target/R600/SIInstructions.td
>> @@ -3288,6 +3288,28 @@ def : Pat <
>> (V_CNDMASK_B32_e64 $src0, $src1, $src2)
>> >;
>>
>> +//===----------------------------------------------------------------------===//
>> +// Fract Patterns
>> +//===----------------------------------------------------------------------===//
>> +
>> +let Predicates = [isCI] in {
>> +
>> +// Convert (x - floor(x)) to fract(x)
>> +def : Pat <
>> + (f32 (fsub (f32 (VOP3Mods f32:$x, i32:$mods)),
>> + (f32 (ffloor (f32 (VOP3Mods f32:$x, i32:$mods)))))),
>> + (V_FRACT_F32_e64 $mods, $x, DSTCLAMP.NONE, DSTOMOD.NONE)
>> +>;
>> +
>> +// Convert (x + (-floor(x))) to fract(x)
>> +def : Pat <
>> + (f64 (fadd (f64 (VOP3Mods f64:$x, i32:$mods)),
>> + (f64 (fneg (f64 (ffloor (f64 (VOP3Mods f64:$x, i32:$mods)))))))),
>> + (V_FRACT_F64_e64 $mods, $x, DSTCLAMP.NONE, DSTOMOD.NONE)
>> +>;
>> +
>> +} // End Predicates = [isCI]
>> +
>
> Why are there different patterns for f32 and f64? Also, can we match
> this pattern on VI too?
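The isCI predicate covers CI and all later chips, so these patterns are
already matched on VI as well.
For f64, fsub is expanded to fneg+fadd, so the f64 pattern has to match
x + (-floor(x)) rather than x - floor(x). That is why the two patterns differ.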
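
For illustration, here is a minimal C++ sketch of the two expression shapes the patterns match. The helper names fract_sub and fract_add_neg are hypothetical and not part of the patch. The code only models the arithmetic identity, not the instruction selection or the exact hardware behaviour of V_FRACT.

```cpp
#include <cmath>

// Hypothetical helpers for illustration only; they are not in the patch.

// Shape matched by the f32 pattern: x - floor(x).
float fract_sub(float x) {
    return x - std::floor(x);
}

// Shape matched by the f64 pattern: x + (-floor(x)).
// This is what remains once fsub has been expanded to fneg + fadd.
double fract_add_neg(double x) {
    return x + (-std::floor(x));
}
```

Both forms compute the same value; for example, fract_sub(2.5f) and fract_add_neg(2.5) both give 0.5. The DAG nodes differ, though, so each form needs its own pattern.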
Marek