[cfe-dev] Techniques for runtime codepath selections utilizing new ISAs

Janus Lynggaard Thorborg via cfe-dev cfe-dev at lists.llvm.org
Sat Sep 10 08:55:21 PDT 2016


I've programmed a codebase that uses a templated SIMD math library, which
allows all algorithms to be written using generic scalar/vector types. Entry
points then switch on the available instruction sets (AVX, SSE, etc.) at
runtime and invoke the fastest supported code path, so a single binary
supports every platform from old Pentiums to the newest CPUs, while the
critical code still runs optimally.
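
The dispatch pattern described above might look roughly like the following
sketch. All names here (float1, float4, scale, cpu_has_sse2, scale_dispatch)
are illustrative assumptions, not the actual library's API, and plain arrays
stand in for real SIMD registers so the example stays portable.

```cpp
#include <cstddef>

// Hypothetical one-lane "vector": the scalar fallback type.
struct float1 {
    static constexpr std::size_t width = 1;
    float lane[1];
};

// Hypothetical four-lane type. A real library would wrap __m128 or __m256
// here; a plain array keeps this sketch free of ISA-specific code.
struct float4 {
    static constexpr std::size_t width = 4;
    float lane[4];
};

// Generic algorithm, written once against the vector interface.
template <typename V>
void scale(float* data, std::size_t n, float gain) {
    std::size_t i = 0;
    for (; i + V::width <= n; i += V::width) {
        V v;
        for (std::size_t k = 0; k < V::width; ++k) v.lane[k] = data[i + k] * gain;
        for (std::size_t k = 0; k < V::width; ++k) data[i + k] = v.lane[k];
    }
    for (; i < n; ++i) data[i] *= gain;  // leftover elements, one at a time
}

// Runtime CPU check. Uses a GCC/Clang builtin on x86 and reports "no"
// everywhere else, so the scalar path is always a valid fallback.
inline bool cpu_has_sse2() {
#if (defined(__x86_64__) || defined(__i386__)) && (defined(__GNUC__) || defined(__clang__))
    return __builtin_cpu_supports("sse2") != 0;
#else
    return false;
#endif
}

// Entry point: select the widest supported instantiation at runtime.
inline void scale_dispatch(float* data, std::size_t n, float gain) {
    if (cpu_has_sse2()) scale<float4>(data, n, gain);
    else                scale<float1>(data, n, gain);
}
```

Callers only ever see scale_dispatch; the choice of instantiation stays
inside the entry point, which is what lets the algorithm body be written once.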

Of course, the whole codebase had to be compiled with no CPU/arch
extensions enabled, to ensure the compiler doesn't insert unsupported
instructions in non-controlled code paths.

On every compiler I've used, all was well and fine. My Xcode/clang compiler
was pretty dated though (the last Xcode 5 version). Unfortunately, users
compiling and using my open source projects are reporting that nothing
compiles anymore.

At some point, this functionality was broken as clang now emits errors like

error: always_inline function '_mm256_and_ps' requires target feature 'avx', but would be inlined into function 'vand' that is compiled without support for 'avx'
    inline v8sf vand(v8sf a, v8sf b) { return _mm256_and_ps(a, b); }

Suffice to say, this completely broke all of my projects. However, I know
this technique relies on non-standardized behaviour, and I'm pretty
sure there are reasons for avoiding mixing code with different targets.
Still, seeing as this technique is immensely beneficial for performance
and legacy support, I'm still extremely interested in making it work.
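
For reference, the diagnostic above names a per-function "target feature".
One way that is commonly expressed in GCC and Clang is the target function
attribute, which enables an ISA for a single function while the rest of the
file is built for the baseline. The following is only a sketch under that
assumption: the names and_avx, and_scalar and vand_array are hypothetical,
and every function that inlines an AVX intrinsic would need the attribute,
which may not fit well with heavily templated wrapper code.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

#if (defined(__x86_64__) || defined(__i386__)) && (defined(__GNUC__) || defined(__clang__))
#define SKETCH_X86_TARGET_ATTR 1
#include <immintrin.h>

// AVX is enabled for this function only. n must be a multiple of 8.
__attribute__((target("avx")))
static void and_avx(const float* a, const float* b, float* out, std::size_t n) {
    for (std::size_t i = 0; i < n; i += 8) {
        __m256 va = _mm256_loadu_ps(a + i);
        __m256 vb = _mm256_loadu_ps(b + i);
        _mm256_storeu_ps(out + i, _mm256_and_ps(va, vb));
    }
}
#endif

// Baseline path: bitwise AND of the float bit patterns via integer copies.
static void and_scalar(const float* a, const float* b, float* out, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        std::uint32_t x, y;
        std::memcpy(&x, a + i, sizeof x);
        std::memcpy(&y, b + i, sizeof y);
        x &= y;
        std::memcpy(out + i, &x, sizeof x);
    }
}

// Hypothetical entry point: AVX handles whole blocks of 8 when the CPU
// reports support at runtime; the scalar path handles whatever is left.
void vand_array(const float* a, const float* b, float* out, std::size_t n) {
    std::size_t done = 0;
#ifdef SKETCH_X86_TARGET_ATTR
    if (__builtin_cpu_supports("avx")) {
        done = n - n % 8;
        and_avx(a, b, out, done);
    }
#endif
    and_scalar(a + done, b + done, out + done, n - done);
}
```

The AVX instructions are confined to and_avx, so the file itself can still be
compiled without any -mavx style flag.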

As far as I know, the only current way to make this work is to compile
every source file with different compiler flags. This is, however, very
tedious and inflexible: it requires manual maintenance of every source file
every time something changes, as well as completely separating all code
requiring vector operations from normal code. Additionally, you now have to
specialize all code manually for each ISA instead of relying on
template-code generation, creating massive code duplication and maintenance.

The other alternative is to compile separate binaries for each supported


So I guess what I'm really interested in hearing is, whether there are
other options for achieving this, or if there are any plans to revert
and/or reimplement the old code generation? Additionally, it would be
helpful if anyone can shed some light upon why the code generation was
changed in this way.

Also: if this is an inappropriate forum, please let me know and/or direct me
to a more suitable place.

Regards, Janus