[llvm-dev] [GlobalISel][AArch64] Toward flipping the switch for O0: Please give it a try!

Thu Apr 6 12:06:29 PDT 2017

On Thu, Apr 6, 2017 at 6:53 AM, Kristof Beyls via llvm-dev
<llvm-dev at lists.llvm.org> wrote:
> I've been digging a little bit deeper into the biggest performance
> regressions I've observed.
>
> What I've observed so far is:
> * A lot of the biggest regressions are caused by unnecessarily moving
> floating point values through general purpose registers. I've raised
> http://bugs.llvm.org/show_bug.cgi?id=32550 for this. I think this one
> definitely needs fixing before enabling GlobalISel by default at -O0.
> * FastISel seems to transform division-by-constant-power-of-2 into right
> shift (see
> https://github.com/llvm-mirror/llvm/blob/master/lib/CodeGen/SelectionDAG/FastISel.cpp#L456-L468).
> GlobalISel does not. It seems to me that at -O0 there may be reasons not to
> perform this transformation, but maybe there is a good reason why FastISel
> does this?

So, FastISel on AArch64 isn't really an "O0" selector:  it has a lot
of smarts and peepholes, because some JIT users had it as the main
optimizing selector for a while.

In that sense, it's a pretty aggressive target that IMO we don't have to match.
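
To make the transformation under discussion concrete, here is a minimal C++
sketch of replacing an unsigned division by a constant power of two with a
logical right shift. The helper names (isPowerOf2, log2Exact, udivByConstant)
are hypothetical and are not LLVM APIs; this only illustrates the arithmetic
identity, not the FastISel code linked above.

```cpp
#include <cstdint>

// Hypothetical helper: true when v has exactly one bit set.
constexpr bool isPowerOf2(std::uint64_t v) {
  return v != 0 && (v & (v - 1)) == 0;
}

// Hypothetical helper: log2 of a power of two (position of its single set bit).
constexpr unsigned log2Exact(std::uint64_t v) {
  return v <= 1 ? 0u : 1u + log2Exact(v >> 1);
}

// Unsigned division by a nonzero constant d. When d is a power of two,
// the quotient equals a logical right shift by log2(d); otherwise fall
// back to an ordinary division.
constexpr std::uint64_t udivByConstant(std::uint64_t x, std::uint64_t d) {
  return isPowerOf2(d) ? (x >> log2Exact(d)) : (x / d);
}

// The shifted form and the plain division agree for unsigned values.
static_assert(udivByConstant(40, 8) == 40 / 8, "shift matches division");
static_assert(udivByConstant(41, 8) == 41 / 8, "remainder is discarded");
```

The identity holds as written only for unsigned values. Signed division
rounds toward zero while an arithmetic right shift rounds toward negative
infinity (-7 / 2 is -3, but -7 shifted right by one is -4), so a signed
version needs an adjustment or a guarantee that the division is exact.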

> * FastISel doesn't seem to handle functions with switch statements, so it
> falls back to DAGISel. DAGISel produces code that's a lot better than
> GlobalISel for switch statements at -O0. I'm not sure if we need to do
> something here before enabling GlobalISel by default. I'm thinking we may
> need to add a smarter way to lower switch statements rather than just a
> cascaded sequence of conditional branches.

D31080 seems promising; I've been wanting to take a look, hoping we
can use that to emit an optimized lowering.  I'm not sure we want that
at O0 though (even if only for FastISel+DAGISel parity).
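
For readers less familiar with the trade-off, the two lowering shapes being
compared can be sketched in C++ as follows. This is only an illustration
under my own assumptions: the function names and the case values are made
up, and it does not show what GlobalISel, DAGISel or D31080 actually emit.

```cpp
// Shape 1: a cascaded sequence of compares, one per case. The cost grows
// linearly with the number of cases that have to be tested before a match.
constexpr int cascaded(int x) {
  return x == 0 ? 10
       : x == 1 ? 20
       : x == 2 ? 30
       : x == 3 ? 40
       : -1;  // default case
}

// Shape 2: a single range check followed by an indexed table lookup,
// which is what a jump-table style lowering amounts to for dense cases.
constexpr int kCaseTable[] = {10, 20, 30, 40};

constexpr int viaTable(int x) {
  return (x >= 0 && x < 4) ? kCaseTable[x] : -1;  // default case
}

// Both shapes compute the same result; only the amount of work differs.
static_assert(cascaded(2) == viaTable(2), "same result for a matched case");
static_assert(cascaded(7) == viaTable(7), "same result for the default case");
```

The table form only pays off when the case values are dense enough; for
sparse values a production lowering would typically choose another strategy.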

> I'll try to add the above content to the document Diana created at
> https://goo.gl/IS2Bdw too.

Thanks for the investigation!  These are also some of the biggest
problems I've seen (in particular the FP regbanks).

I'll make sure I find the time to file bugs for all the other issues
I'm aware of.  (sorry I haven't done that earlier!)

-Ahmed

> Thanks,
>
> Kristof
>
>
>
> On 3 Apr 2017, at 17:10, Kristof Beyls <Kristof.Beyls at arm.com> wrote:
>
> I've kicked off a run to compare "-O0 -g" versus "-O0 -g -mllvm -global-isel
> -mllvm -global-isel-abort=2".
> I've selected the test-suite (albeit a version which is a couple of months
> old now) and a few short-running proprietary benchmarks to get data back
> quickly for an initial feel of where things are.
> This was running on Cortex-A57 AArch64 Linux.
>
> I saw one assertion failure in GlobalISel, see
> http://bugs.llvm.org/show_bug.cgi?id=32471. This is in a program compiled at
> -O2 (my out-dated test-suite still overrides -O0 and instead uses -O for
> that program). The root cause of the failure seems to be that LowLevelType
> does not support vectors of pointers. I think this demonstrates that for
> correctness, we should be trying to test more than -O0, or even more than
> just LLVM-IR produced by clang, as other front-ends could run into this even
> at -O0.
>
> Due to this assertion failure and the infrastructure I used, the numbers
> below do not include test-suite/MultiSource/Benchmarks results.
>
> On the non-correctness aspects, LNT tells me that:
> - The programs that report execution time, on geomean are about 17% slower.
> - The programs that report scores, on geomean are about 21% slower.
> - Code size is up on geomean about 11%.
> I'm afraid I don't have compile time numbers, nor any feel for debug info
> quality.
>
> I'll need quite a bit more time to dig into the details to come up with
> something actionable, although the fact that LowLevelType doesn't support
> vectors of pointers is already actionable.
> Nevertheless, I thought to share what I see as is, to see if others see
> similar results so far.
>
> I thought Diana was going to look into fallback rate on the test-suite on
> AArch64 Linux?
>
> Thanks,
>
> Kristof
>
> On 30 Mar 2017, at 10:54, Renato Golin <renato.golin at linaro.org> wrote:
>
> On 30 March 2017 at 00:27, Quentin Colombet <qcolombet at apple.com> wrote:
>
> On iOS we are at 100% pass rate at -O0 -g for the LLVM test suite, standard
> benchmarks and unit tests. In about 5% of all functions GlobalISel falls
> back to SDISel.
> (Kristof Beyls would have the Linux numbers.)
> The self host compiler correctly builds and runs the LLVM test suite in O0.
>
>
> Having done no tests at all on my side, I think we need to have
> similar numbers on Linux to be able to flip across the board.
>
> I don't want to flip it only for Darwin and not Linux, as that will
> fragment the effort too much.
>
> I'll check with Diana and Kristof to know what's the best way forward,
> but it should be reasonably quick.
>
> cheers,
> --renato