[cfe-dev] Techniques for runtime codepath selections utilizing new ISAs

Sat Sep 10 09:33:32 PDT 2016

I’m not 100% sure, but I think that the Clang support for ‘__attribute__((target(...)))’ might help.

See http://clang.llvm.org/docs/AttributeReference.html#target-gnu-target

And https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#Common-Function-Attributes

The GCC equivalent does allow you to have multiple target-specific versions in the same source file. However, I don’t see a mention of ‘target_clones’ for x86 in Clang, which is possibly closer to what you need.
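As a rough illustration of the per-function attribute, here is a minimal sketch, assuming a GCC-compatible compiler on x86. The function names (add_avx, add_scalar, add) are hypothetical and not taken from either set of documentation. Only add_avx is compiled with AVX enabled; the rest of the file keeps the baseline flags, and the entry point picks a path at runtime with __builtin_cpu_supports.

```cpp
#include <cstddef>

#if defined(__x86_64__) || defined(__i386__)
#include <immintrin.h>

// AVX is enabled for this one function only (hypothetical name).
// It takes plain pointers so that no __m256 value crosses into
// code compiled without AVX.
__attribute__((target("avx")))
static void add_avx(const float* a, const float* b, float* out, std::size_t n) {
    std::size_t i = 0;
    for (; i + 8 <= n; i += 8) {
        __m256 va = _mm256_loadu_ps(a + i);
        __m256 vb = _mm256_loadu_ps(b + i);
        _mm256_storeu_ps(out + i, _mm256_add_ps(va, vb));
    }
    for (; i < n; ++i) out[i] = a[i] + b[i];  // scalar tail
}
#endif

// Baseline path, compiled with the default flags of the translation unit.
static void add_scalar(const float* a, const float* b, float* out, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) out[i] = a[i] + b[i];
}

// Entry point: selects the code path at runtime.
void add(const float* a, const float* b, float* out, std::size_t n) {
#if defined(__x86_64__) || defined(__i386__)
    if (__builtin_cpu_supports("avx")) {
        add_avx(a, b, out, n);
        return;
    }
#endif
    add_scalar(a, b, out, n);
}
```

The AVX function deliberately takes pointers rather than vector arguments, since passing __m256 values between functions built for different targets can be rejected or change the calling convention.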

            MartinO

From: cfe-dev [mailto:cfe-dev-bounces at lists.llvm.org] On Behalf Of Janus Lynggaard Thorborg via cfe-dev
Sent: 10 September 2016 16:55
To: cfe-dev at lists.llvm.org
Subject: [cfe-dev] Techniques for runtime codepath selections utilizing new ISAs

Hello,

I've written a codebase around a templated SIMD math library that lets me write all algorithms using generic scalar/vector types. Entry points then switch on the available instruction sets (AVX, SSE, etc.) at runtime and invoke the most optimal supported code path, so a single binary supports every platform from old Pentiums to the newest CPUs, while the critical code still runs optimally.
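For readers unfamiliar with the pattern, the structure described above might look roughly like the following sketch. Every name in it (Scalar, Wide4, square_impl, Isa, detect_isa, square) is hypothetical, and the "vector" types are plain arrays standing in for real SIMD wrappers, so this shows only the shape of the dispatch and not the actual library.

```cpp
#include <cstddef>

// Stand-in value types; a real library would wrap __m128/__m256 here.
struct Scalar { static constexpr std::size_t width = 1; float v[1]; };
struct Wide4  { static constexpr std::size_t width = 4; float v[4]; };

template <typename V> V load(const float* p) {
    V r;
    for (std::size_t i = 0; i < V::width; ++i) r.v[i] = p[i];
    return r;
}
template <typename V> void store(float* p, const V& x) {
    for (std::size_t i = 0; i < V::width; ++i) p[i] = x.v[i];
}
template <typename V> V mul(const V& a, const V& b) {
    V r;
    for (std::size_t i = 0; i < V::width; ++i) r.v[i] = a.v[i] * b.v[i];
    return r;
}

// The algorithm is written once, generically over the vector type.
template <typename V>
void square_impl(const float* in, float* out, std::size_t n) {
    std::size_t i = 0;
    for (; i + V::width <= n; i += V::width) {
        V x = load<V>(in + i);
        store(out + i, mul(x, x));
    }
    for (; i < n; ++i) out[i] = in[i] * in[i];  // leftover elements
}

enum class Isa { Baseline, Wide };

// Assumption: GCC/Clang on x86 provide __builtin_cpu_supports;
// everywhere else this sketch falls back to the baseline path.
Isa detect_isa() {
#if (defined(__x86_64__) || defined(__i386__)) && defined(__GNUC__)
    if (__builtin_cpu_supports("sse2")) return Isa::Wide;
#endif
    return Isa::Baseline;
}

// Entry point: instantiates the template once per path and picks one at runtime.
void square(Isa isa, const float* in, float* out, std::size_t n) {
    switch (isa) {
        case Isa::Wide:     square_impl<Wide4>(in, out, n);  break;
        case Isa::Baseline: square_impl<Scalar>(in, out, n); break;
    }
}
```

A caller would typically query detect_isa() once at startup and pass the result to each entry point.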

Of course, the whole codebase had to be compiled with no CPU/arch extensions enabled, to ensure the compiler doesn't insert unsupported instructions in non-controlled code paths.

On every compiler I've used, all is well and fine. My xcode/clang compiler was pretty dated though (the last xcode 5 version). Unfortunately, users compiling and using my open source projects are reporting that nothing compiles anymore.

At some point, this functionality was broken as clang now emits errors like this:

error: always_inline function '_mm256_and_ps' requires target feature 'avx', but would be inlined into function 'vand' that is compiled without support for 'avx'

                        inline v8sf vand(v8sf a, v8sf b) { return _mm256_and_ps(a, b); }

Suffice to say, this completely broke all of my projects. However, I know this is a caveat of relying on non-standardized behaviour, and I'm pretty sure there are reasons for avoiding mixing code with different targets. Still, seeing as this technique is immensely beneficial for performance and legacy support, I'm extremely interested in making it work (somehow).
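One possible way to satisfy the check in the error above is to give the wrapper the required target feature itself. The following is a minimal sketch, assuming a GCC-compatible compiler on x86 and assuming that v8sf is an alias for __m256 (the original declaration is not shown). The names and8_avx and and8 are hypothetical.

```cpp
#if defined(__x86_64__) || defined(__i386__)
#include <immintrin.h>

typedef __m256 v8sf;  // assumption: the library's v8sf aliases __m256

// The wrapper now carries the 'avx' feature, so the always_inline
// intrinsic may be inlined into it.
__attribute__((target("avx")))
inline v8sf vand(v8sf a, v8sf b) { return _mm256_and_ps(a, b); }

// Functions that handle v8sf values by value need the same attribute;
// this kernel exposes only pointers to the rest of the program.
__attribute__((target("avx")))
void and8_avx(const float* a, const float* b, float* out) {
    _mm256_storeu_ps(out, vand(_mm256_loadu_ps(a), _mm256_loadu_ps(b)));
}
#endif

// Baseline-compiled entry point. Returns false when AVX is unavailable,
// in which case 'out' is left untouched.
bool and8(const float* a, const float* b, float* out) {
#if defined(__x86_64__) || defined(__i386__)
    if (__builtin_cpu_supports("avx")) {
        and8_avx(a, b, out);
        return true;
    }
#endif
    (void)a; (void)b; (void)out;
    return false;
}
```

The cost of this approach is that the attribute has to be repeated on every function in the AVX call chain, which may be awkward for heavily templated code.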

As far as I know, the only current way to make this work is to compile every source file with different compiler flags. This is, however, very tedious and inflexible: it requires manual maintenance on every source file every time something changes, as well as completely separating all code requiring vector operations from normal code. Additionally, you now have to specialize all code manually for each ISA instead of relying on template code generation, creating massive code duplication and maintenance work.

The other alternative is to compile separate binaries for each supported platform...

---

So I guess what I'm really interested in hearing is, whether there are other options for achieving this, or if there are any plans to revert and/or reimplement the old code generation? Additionally, it would be helpful if anyone can shed some light upon why the code generation was changed in this way.

Also: if this is an inappropriate forum, please let me know and/or direct me to some other place.

Regards, Janus
