[cfe-dev] clang-cl's <intrin.h>, _tzcnt_u32, and compatibility with MSVC's <intrin.h>
Demikhovsky, Elena via cfe-dev
cfe-dev at lists.llvm.org
Thu Sep 29 12:19:42 PDT 2016
Forwarding your mail to Elad, who is now working on intrinsics. Please allow for a possible delay in Elad’s response; we have local holidays this week.
From: Reid Kleckner [mailto:rnk at google.com]
Sent: Thursday, September 29, 2016 22:02
To: Hans Wennborg <hans at chromium.org>; Blank, Guy <guy.blank at intel.com>; Aboud, Amjad <amjad.aboud at intel.com>; Demikhovsky, Elena <elena.demikhovsky at intel.com>; Craig Topper <craig.topper at gmail.com>; Badouh, Asaf <asaf.badouh at intel.com>; Zuckerman, Michael <michael.zuckerman at intel.com>; Breger, Igor <igor.breger at intel.com>
Cc: Nathan Froyd <nfroyd at mozilla.com>; Nico Weber <thakis at chromium.org>; cfe-dev <cfe-dev at lists.llvm.org>
Subject: Re: [cfe-dev] clang-cl's <intrin.h>, _tzcnt_u32, and compatibility with MSVC's <intrin.h>
We've been here before:
Nico's change to avoid including bmiintrin.h if _MSC_VER is set to reduce compile time brought the problem back.
Basically, Intel's immintrin.h interface is too big. It's windows.h all over again. It significantly slows down builds of simple projects that include basic STL headers like <string>. Nico was able to speed up the *overall* build time of Chromium by at least 10% (ask him for specifics) by adding all these '#if !defined(_MSC_VER)' checks to immintrin.h. However, for compatibility, I think we may need intrin.h to include most of that stuff by default.
I added a bunch of Intel folks to basically ask the question: Can we please go back to the old days of mmintrin.h, xmmintrin.h, then emmintrin.h?
If not, we will have to consider adding an evil configuration macro, like WIN32_LEAN_AND_MEAN for windows.h, that opts out of all this extra immintrin.h functionality to speed up compilation.
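A minimal sketch of what such an opt-out could look like, written as a guard pattern. The macro name `SKETCH_INTRIN_LEAN_AND_MEAN` and the `SKETCH_HAVE_*` markers are hypothetical; clang does not define or honor them. Stand-in macros replace the real `#include` lines so that the fragment compiles on any target.

```c
/* Hypothetical opt-out macro (an assumption for illustration only).
 * A user would define it before including the umbrella header:
 *
 *     #define SKETCH_INTRIN_LEAN_AND_MEAN
 */

/* Baseline section: stands in for the small, always-included headers
 * (the mmintrin.h / xmmintrin.h / emmintrin.h tier). */
#define SKETCH_HAVE_BASELINE 1

#if !defined(SKETCH_INTRIN_LEAN_AND_MEAN)
/* Stands in for the large optional headers that a real umbrella
 * header would pull in here with #include. */
#define SKETCH_HAVE_EXTENDED 1
#else
/* Opted out: the expensive headers are skipped entirely. */
#define SKETCH_HAVE_EXTENDED 0
#endif

/* Reports whether the extended section was compiled in. */
static inline int sketch_extended_available(void) {
    return SKETCH_HAVE_EXTENDED;
}
```

With the macro left undefined, the extended section is included by default, which matches the compatibility-first behavior suggested above; defining it would trade that compatibility for compile time.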
On Thu, Sep 29, 2016 at 11:04 AM, Hans Wennborg <hans at chromium.org> wrote:
On Thu, Sep 29, 2016 at 10:50 AM, Nathan Froyd via cfe-dev
<cfe-dev at lists.llvm.org> wrote:
> [While I filed this as https://llvm.org/bugs/show_bug.cgi?id=30506,
> Paul Robinson suggested in the bug that it might be worth bringing to
> cfe-dev for wider discussion.]
> Firefox's copy of FFmpeg includes <intrin.h> on MSVC:
> and MSVC's <intrin.h> (after including several other headers) declares
> _tzcnt_u32. clang-cl declares _tzcnt_u32 in <bmiintrin.h>, but
> <bmiintrin.h> is only included from <intrin.h> if an appropriate
> target CPU is detected:
> even though _tzcnt_u32 (or rather, its underlying implementation
> function, __tzcnt_u32) is explicitly declared to be available
> The net result is that Firefox's copy of FFmpeg doesn't compile with
> clang-cl because of this issue. Upstream has the same code, so I
> assume the same is true there:
> AFAICT, this behavior was changed by only including <bmiintrin.h> if
> __BMI__ is defined in:
> with the desirable goal of reducing compile time. But this change
> also broke things like _tzcnt_u32 being available.
> What is the right thing to do here? Should <bmiintrin.h> be
> unconditionally included, or should something else be done?
I think these intrinsics (it's really just tzcnt though?) that are
useful also for non-BMI targets should probably be moved out of the
+thakis and rnk if they have any thoughts on this.
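To illustrate why a trailing-zero count is useful on non-BMI targets as well, here is a minimal sketch of a software fallback that follows TZCNT's zero-input behavior (the result is the operand width, 32, when the input is zero). The name `sketch_tzcnt_u32` is hypothetical and is not what clang's or MSVC's headers define. `__builtin_ctz` is the GCC/Clang builtin; its result is undefined for zero, hence the explicit check.

```c
#include <stdint.h>

/* Hypothetical helper (name is an assumption, not from any real header).
 * Counts trailing zero bits; returns 32 for x == 0, as TZCNT does for a
 * 32-bit operand. */
static inline uint32_t sketch_tzcnt_u32(uint32_t x) {
    if (x == 0) {
        return 32u;
    }
#if defined(__GNUC__) || defined(__clang__)
    /* Safe here because x is known to be non-zero. */
    return (uint32_t)__builtin_ctz(x);
#else
    /* Portable loop for compilers without the builtin. */
    uint32_t n = 0;
    while ((x & 1u) == 0u) {
        x >>= 1;
        ++n;
    }
    return n;
#endif
}
```

For example, `sketch_tzcnt_u32(8u)` yields 3 and `sketch_tzcnt_u32(0u)` yields 32, regardless of whether the target CPU supports BMI.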