[PATCH] D43079: [TTI CostModel] change default cost of FP ops to 1 (PR36280)

Florian Hahn via llvm-commits llvm-commits at lists.llvm.org
Tue Feb 20 17:01:08 PST 2018


I have not thought this through fully yet, but for backends using the machine scheduler, couldn't we use the scheduling model to get the number of units available for certain instructions, and determine the throughput based on that? I will think about this a bit more when I am back.
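To make the idea concrete, here is a toy sketch (illustrative only; a real backend would query the target's scheduling model rather than hard-coded numbers): if an op can issue on N execution units and each unit can start a new op every C cycles, the reciprocal throughput is C / N.

```python
def reciprocal_throughput(issue_cycles, num_units):
    """Toy model: average cycles between successive independent ops.

    issue_cycles: cycles before a unit can accept the next op
    num_units: number of execution units that can run this op
    """
    return issue_cycles / num_units

# e.g. an FMUL that either of 2 FP pipes can start every cycle
print(reciprocal_throughput(1, 2))  # -> 0.5
```

A cost model derived this way would give cheaper costs to ops backed by more units, which is roughly what the scheduling model already encodes per subtarget.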

________________________________________
From: Adam Nemet via Phabricator <reviews at reviews.llvm.org>
Sent: Wednesday, February 21, 2018 12:34:37 AM
To: spatel at rotateright.com; hfinkel at anl.gov; a.bataev at hotmail.com; efriedma at codeaurora.org; Florian Hahn; llvm-dev at redking.me.uk; craig.topper at gmail.com
Cc: anemet at apple.com; Evgeny Astigeevich; aemerson at apple.com; mcrosier at codeaurora.org; Javed Absar; Kristof Beyls; llvm-commits at lists.llvm.org; t.p.northover at gmail.com; junbuml at codeaurora.org
Subject: [PATCH] D43079: [TTI CostModel] change default cost of FP ops to 1 (PR36280)

anemet added a comment.

In https://reviews.llvm.org/D43079#1013321, @spatel wrote:

> In https://reviews.llvm.org/D43079#1013269, @eastig wrote:
>
> > Hi Sanjay,
> >
> > The patch caused regressions in the LLVM benchmarks and in Spec2k/Spec2k6 benchmarks on AArch64 Cortex-A53:
> >
> > SingleSource/Benchmarks/Misc/matmul_f64_4x4:  49%
> >  MultiSource/Benchmarks/TSVC/LoopRerolling-flt/LoopRerolling-flt: 5.32%
> >  CFP2000/188.ammp/188.ammp: 3.58%
> >  CFP2000/177.mesa/177.mesa: 2.48%
> >  CFP2006/444.namd/444.namd: 2.49%
> >
> > The regression of SingleSource/Benchmarks/Misc/matmul_f64_4x4 can also be seen on the public bot: http://lnt.llvm.org/db_default/v4/nts/90636
> >  It is 128.85%.
> >
> > The main difference in generated code is FMUL(FP, scalar) instead of FMUL(SIMD, scalar):
> >
> >   fmul d20, d16, d2
> >
> >
> > instead of
> >
> >   fmul v17.2d, v1.2d, v5.2d
> >
> >
> > This also caused code size increase: 6.04% in SingleSource/Benchmarks/Misc/matmul_f64_4x4
> >
> > I am working on a reproducer.
>
>
> Thanks. We knew this change was likely to cause perf regressions based on some of the x86 diffs, so having those reductions will help tune the models in general and specifically for AArch64.
>
> I.e., we should be able to solve the AArch64 problems with AArch64-specific cost model changes rather than reverting this. For example, as @fhahn mentioned, we might want to make the int-to-FP ratio 3:2 for some cores. Another possibility is overriding the fmul/fsub/fadd AArch64 costs to be more realistic (as we also probably have to do for x86).
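Schematically, the per-target override suggested above amounts to a cost table consulted before the generic default. This is a hypothetical sketch, not the actual AArch64TTIImpl code, and the numbers are illustrative, not tuned values:

```python
# Hypothetical cost table. One reading of the 3:2 ratio mentioned
# above: FP arithmetic costs 3 where integer arithmetic costs 2.
DEFAULT_COST = 2  # generic (integer) op cost
FP_OP_COST = {"fmul": 3, "fadd": 3, "fsub": 3}

def arithmetic_cost(opcode):
    """Return the per-opcode cost, falling back to the default."""
    return FP_OP_COST.get(opcode, DEFAULT_COST)

print(arithmetic_cost("fmul"))  # -> 3
print(arithmetic_cost("add"))   # -> 2
```

The point is that such an override is purely target-local: it can restore the old FP/int cost ratio for affected cores without touching the new target-independent default.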


Please revert until these things get worked out so that we can properly track performance. We are seeing many regressions, including 17% on 444.namd and 12% on 482.sphinx3 in SPECfp 2006.


Repository:
  rL LLVM

https://reviews.llvm.org/D43079




