[PATCH] D35218: [AMDGPU] fcanonicalize elimination optimization
Stanislav Mekhanoshin via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Jul 10 12:21:22 PDT 2017
rampitec added inline comments.
================
Comment at: lib/Target/AMDGPU/SIISelLowering.cpp:4638
+
+ case ISD::FCANONICALIZE:
+ return true;
----------------
arsenm wrote:
> fcanonicalize should have the idempotent optimization applied so there shouldn't be a need to handle it
It is not:
```
v_mul_f32_e32 v2, 1.0, v2
v_mul_f32_e32 v2, 1.0, v2
```
================
Comment at: lib/Target/AMDGPU/SIISelLowering.cpp:4662
+ case ISD::FCOS:
+ case ISD::FSINCOS:
+ return Op.getValueType().getScalarType() != MVT::f16;
----------------
arsenm wrote:
> We don't handle this
We do not handle this now, but may handle later. The idea is still the same as with FSIN and FCOS.
================
Comment at: lib/Target/AMDGPU/SIISelLowering.cpp:4665-4668
+ // In pre-GFX9 targets V_MIN_F32 and others do not flush denorms.
+ // For such targets need to check their input recursively.
+ case ISD::FMINNUM:
+ case ISD::FMAXNUM:
----------------
arsenm wrote:
> The output is never flushed anywhere?
GFX9 flushes.
================
Comment at: lib/Target/AMDGPU/SIISelLowering.cpp:4669-4670
+ case ISD::FMAXNUM:
+ case ISD::FMINNAN:
+ case ISD::FMAXNAN:
+ if (ST->getGeneration() >= SISubtarget::GFX9)
----------------
arsenm wrote:
> We don't handle these
This again only now. I had a use of it in HSAIL and may actually port it.
================
Comment at: lib/Target/AMDGPU/SIISelLowering.cpp:4671-4672
+ case ISD::FMAXNAN:
+ if (ST->getGeneration() >= SISubtarget::GFX9)
+ return true;
+
----------------
arsenm wrote:
> I don't think this is true, but should have a named check in the subtarget
I would rather think about denorm support flag in TD for every single instruction wrt subtarget. Why add just a single one?
================
Comment at: lib/Target/AMDGPU/SIISelLowering.cpp:4678-4680
+ case ISD::ConstantFP: {
+ auto F = cast<ConstantFPSDNode>(Op)->getValueAPF();
+ return !F.isDenormal() && !(F.isNaN() && F.isSignaling());
----------------
arsenm wrote:
> We constant fold this already so this shouldn't be necessary
This is necessary. We may constant fold it to remove fcanonicalize on a constant, but consider:
canonicalize(fmax(x, 0.0f));
Without above code it would not be handled because we have to look though arguments of min/max/fneg/fabs. I have a test for it actually.
In particular it is needed to fold an extremely common case like max(max, ...).
https://reviews.llvm.org/D35218
More information about the llvm-commits
mailing list