<div dir="ltr">Hello,<div><br></div><div>I've written a codebase built on a templated SIMD math library that lets all algorithms be written using generic scalar/vector types. Entry points then switch on the available instruction sets (AVX, SSE, etc.) at runtime and invoke the best supported code path, so a single binary supports every platform from old Pentiums to the newest CPUs, while the critical code still runs optimally.</div><div><br></div><div>Of course, the whole codebase has to be compiled without any CPU/arch extensions enabled, to ensure the compiler doesn't insert unsupported instructions in non-controlled code paths.</div><div><br></div><div>With every compiler I've used, this has worked fine. My Xcode/clang compiler was pretty dated, though (the last Xcode 5 version). Unfortunately, users compiling and using my open source projects are now reporting that nothing compiles anymore.</div><div><br></div><div>At some point this stopped working; clang now emits errors like this:</div><div><br></div><div><div style="margin:0px 0px 0px 12px;font-size:11px;font-family:menlo">error: always_inline function '_mm256_and_ps' requires target feature 'avx', but would be inlined into function 'vand' that is compiled without support for 'avx'</div><div style="margin:0px 0px 0px 12px;font-size:11px;font-family:menlo">                        inline v8sf vand(v8sf a, v8sf b) { return _mm256_and_ps(a, b); }</div></div><div><br></div><div>Suffice to say, this completely broke all of my projects. I know the technique relies on non-standardized behaviour, and I'm pretty sure there are reasons for avoiding mixing code with different targets. Still, seeing as it is immensely beneficial for performance and legacy support, I'm very interested in making it work (somehow). </div><div><br></div><div>As far as I know, the only current way to make this work is to compile every source file with different compiler flags. 
This is, however, very tedious and inflexible: it requires manual maintenance of every source file whenever something changes, and it means completely separating all code that needs vector operations from normal code. Additionally, you now have to specialize all code manually for each ISA instead of relying on template code generation, creating massive code duplication and maintenance overhead.<br></div><div><br></div><div>The other alternative is to compile separate binaries for each supported platform...</div><div><br></div><div>---</div><div><br></div><div>So I guess what I'm really interested in hearing is whether there are other options for achieving this, or whether there are any plans to revert and/or reimplement the old code generation. Additionally, it would be helpful if anyone could shed some light on why the code generation was changed in this way.</div><div><br></div><div>Also: if this is an inappropriate forum, please let me know and/or direct me to a more suitable place.</div><div><br></div><div>Regards, Janus</div><div><br></div><div><br></div></div>