[llvm-dev] Ryzen (znver1) scheduler and instruction selection

Hal Finkel via llvm-dev llvm-dev at lists.llvm.org
Tue Mar 14 07:53:19 PDT 2017


On 03/14/2017 09:43 AM, Denis Steckelmacher via llvm-dev wrote:
> Hello,
>
> I have just bought an AMD Ryzen 7 1700 CPU that I use to run scientific computing programs (built on Numpy). Getting lower performance than expected, I started to profile my entire system using "perf top" and to fix any library that does not properly recognize my CPU. For instance, OpenBLAS was using non-vectorized fall-back paths everywhere, dividing performance by 3 on my benchmark.
>
> Having closely watched LLVM for a couple of years, I have seen that Zen support has recently been added. I have looked at the relevant commits (and the files as they now are in SVN), and I see that znver1 still uses the BtVer2 scheduler model. In my experiments, using the Haswell scheduler for znver1 leads to marginal gains, but I still wanted to develop a complete Zen scheduler model. However, we currently do not have enough information from AMD, and even reading the GCC patches for Zen did not allow me to produce a valid scheduler. Basically, my scheduler leads to performance consistently 5-10% below the Haswell scheduler (on C-Ray multithreaded and pgbench v9.4.3). I'm still quite impressed by how important a scheduler can be.
>
> Does anyone know whether someone else has already worked on a Zen scheduler? If not, I'll continue my work and keep you informed.
>
> Another small issue that I have found, and that may or may not be important, is how X86 instructions are selected. In lib/Target/X86/X86TargetTransformInfo.cpp, the cost of many instructions is given in tables. Separate tables allow different costs depending on the processor. However, a processor is mapped to a technology (Zen supports AVX2), then a technology is mapped to costs (AVX2 to costs optimized for Intel Haswell). My Ryzen CPU therefore gets Haswell costs. I do not know whether there is a significant difference in costs between CPU implementations, but this architecture may prevent LLVM from getting the most out of non-Intel CPUs. Has anyone looked into this?
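
The two-step mapping described above (processor to feature level, then feature level to costs) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code in X86TargetTransformInfo.cpp: every name here (FeatureLevel, Op, featureLevelFor, costFor, costForCpu) is hypothetical, and the cost numbers are arbitrary placeholders rather than measured values.

```cpp
#include <cassert>
#include <string_view>

// Hypothetical names throughout; this is not LLVM's actual API.
enum class FeatureLevel { SSE2, AVX, AVX2 };
enum class Op { VectorAdd, VectorShuffle };

// Step 1: a CPU name is reduced to the best feature level it supports.
FeatureLevel featureLevelFor(std::string_view cpu) {
  if (cpu == "haswell" || cpu == "znver1")
    return FeatureLevel::AVX2;
  if (cpu == "btver2")
    return FeatureLevel::AVX;
  return FeatureLevel::SSE2;
}

// Step 2: the cost is looked up by feature level only.
// The numbers are placeholders, not real instruction costs.
int costFor(FeatureLevel level, Op op) {
  switch (level) {
  case FeatureLevel::AVX2:
    return op == Op::VectorAdd ? 1 : 2;
  case FeatureLevel::AVX:
    return op == Op::VectorAdd ? 2 : 4;
  case FeatureLevel::SSE2:
    return op == Op::VectorAdd ? 4 : 8;
  }
  return 8; // not reached for valid enum values
}

// The CPU identity is lost after step 1, so two different
// microarchitectures with the same feature level share every cost.
int costForCpu(std::string_view cpu, Op op) {
  return costFor(featureLevelFor(cpu), op);
}

// Small self-check: znver1 and haswell are indistinguishable here.
void checkSharedCosts() {
  assert(costForCpu("znver1", Op::VectorShuffle) ==
         costForCpu("haswell", Op::VectorShuffle));
}
```

In this sketch the CPU name never reaches the cost lookup, which is the concern raised in the paragraph above: giving znver1 its own costs would need either a per-CPU table consulted before the feature-level one, or a cost function that takes the CPU as a key.
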
>
>
> I want to stress that this email is more a list of questions than a complaint. I am well aware that most developers are probably using an Intel-based machine, which introduces a natural bias towards Intel as it is the platform on which tests and benchmarks are run. I would like to start a discussion on how to make LLVM, and compilers in general, more architecture-independent with regard to optimization.

In general, LLVM is very good at being architecture independent. 
However, we have not had a sustained effort to support AMD CPUs in our 
X86 backend. I think it would be great if this received more attention. 
There are lots of different ways you (or anyone else) can contribute in 
this area. Patches are great. If you can narrow down problems to 
specific benchmarks, loops, etc. then filing bug reports is great too. 
Having a bunch of open performance-related bug reports is a good 
potential motivator as well.

  -Hal

>
> Best regards,
> Denis Steckelmacher
>
> (I am a PhD student and have no connection with AMD; I bought my Ryzen CPU with my own personal money)

-- 
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory


