[cfe-dev] clang-cl's <intrin.h>, _tzcnt_u32, and compatibility with MSVC's <intrin.h>

Nathan Froyd via cfe-dev cfe-dev at lists.llvm.org
Fri Oct 28 15:46:07 PDT 2016


I've been following the review below and its successor, D25767, but
I'm unable to make the proposed solution (or its equivalent) work.  If
I specify:

-Xclang -fimplicit-module-maps -Xclang -fmodule-map-file=[builtin .modulemap]

I still get an error about an unresolved __tzcnt_u32 symbol.  If I
remove the "-Xclang -fimplicit-module-maps" from the above, I get the
same error.  If I try instead:

-Xclang -fmodules -Xclang -fmodule-map-file=[builtin .modulemap]

I get peculiar errors about x86intrin.h and the _Builtin_intrinsics
module not being available and implicit use of module files being
disabled.  (Same goes for _Builtin_stddef_max_align_t and
__stddef_max_align_t.h.)  Adding "-Xclang -fimplicit-module-maps"
gives me the same error.

I'm not sure what's going wrong, or if there's been some sort of
mistake in generating the module map.  Help?

Thanks,
-Nathan

On Thu, Oct 6, 2016 at 2:45 PM, Cohen, Elad2 <elad2.cohen at intel.com> wrote:
> I uploaded to review the patch that adds the option for enabling the modules
> feature exclusively for the intrinsic header files:
> https://reviews.llvm.org/D25337
> You can try it out by adding the –fexclusive-builtin-modules command line
> option. (all the different x86 *intrin.h files should be included by default
> since the #if guards already contain a __has_feature(modules) condition).
>
>
>
> Thanks, Elad
>
>
>
> From: Cohen, Elad2
> Sent: Friday, September 30, 2016 16:54
> To: Demikhovsky, Elena <elena.demikhovsky at intel.com>; Reid Kleckner
> <rnk at google.com>; Hans Wennborg <hans at chromium.org>; Blank, Guy
> <guy.blank at intel.com>; Aboud, Amjad <amjad.aboud at intel.com>; Craig Topper
> <craig.topper at gmail.com>; Badouh, Asaf <asaf.badouh at intel.com>; Zuckerman,
> Michael <michael.zuckerman at intel.com>; Breger, Igor <igor.breger at intel.com>
> Cc: Nathan Froyd <nfroyd at mozilla.com>; Nico Weber <thakis at chromium.org>;
> cfe-dev <cfe-dev at lists.llvm.org>; Ouriel, Boaz <boaz.ouriel at intel.com>;
> Rackover, Zvi <zvi.rackover at intel.com>
>
>
> Subject: RE: [cfe-dev] clang-cl's <intrin.h>, _tzcnt_u32, and compatibility
> with MSVC's <intrin.h>
>
>
>
> Hi All,
>
>
>
> As mentioned in past discussions, the introduction of AVX512 intrinsics had
> a severe impact on the compile time.
>
> The extreme case happens on windows where the mere inclusion of system
> headers like <string> causes a hit in the compile time of applications that
> don’t even make use of any intrinsics.
>
> http://lists.llvm.org/pipermail/cfe-dev/2016-May/048837.html
> http://lists.llvm.org/pipermail/cfe-dev/2016-June/049460.html
>
> During these discussions Chandler Carruth indicated that usage of clang
> modules is probably the best alternative to mitigate this problem:
>
> http://lists.llvm.org/pipermail/cfe-dev/2016-June/049513.html
>
>
>
> Based on this feedback, Intel has started working on this proposal. Initial
> Internal measurements indicated that this direction is promising.
>
> For example, the compile time of a single translation unit that includes all
> of the intel intrinsic header files would reduce by 90% starting from the
> 2nd compilation (by using cached pre-compiled modules). In general, we saw
> that compilation of projects that have more than one translation unit that
> includes x86intrin.h/immintrin.h will get big compile time reductions
> starting from the 1st compilation.
>
>
>
> The biggest advantage of this approach is that it scales and would help also
> applications that do rely on intrinsics and hence will not be able to avoid
> the compile time overhead.
>
>
>
> I (Intel) would like to add a Clang option that enables the modules feature
> exclusively for the Intel intrinsics header files, and set this option to be
> turned on by default. This allows clang to cache a pre-compiled (actually
> pre-parsed) module of all the headers included by x86intrin.h/immintrin.h,
> and any following compilation of #include<x86intrin.h>/<immintrin.h>
> directives will skip the parsing phase and save precious compile time. Since
> ‘clang modules’ and the regular ‘pre-processor include mechanism’ are not
> 100% compatible, I started by cleaning some small Intel intrinsics module
> bugs exposed by this work:
>
> https://reviews.llvm.org/D23871, https://reviews.llvm.org/D24825,
> https://reviews.llvm.org/D24752 (This one is still in review).
>
> I plan to upload the actual patch that adds the option I mentioned above
> somewhere next week hopefully. Once this patch is uploaded, you could
> experiment with it and see if it actually mitigates the problem on your end
> and provide us with valuable feedback on this direction.
>
>
>
> Thanks,
>
> Elad
>
>
>
>
>
> From: Demikhovsky, Elena
> Sent: Thursday, September 29, 2016 22:20
> To: Reid Kleckner <rnk at google.com>; Hans Wennborg <hans at chromium.org>;
> Blank, Guy <guy.blank at intel.com>; Aboud, Amjad <amjad.aboud at intel.com>;
> Craig Topper <craig.topper at gmail.com>; Badouh, Asaf <asaf.badouh at intel.com>;
> Zuckerman, Michael <michael.zuckerman at intel.com>; Breger, Igor
> <igor.breger at intel.com>
> Cc: Nathan Froyd <nfroyd at mozilla.com>; Nico Weber <thakis at chromium.org>;
> cfe-dev <cfe-dev at lists.llvm.org>; Ouriel, Boaz <boaz.ouriel at intel.com>;
> Cohen, Elad2 <elad2.cohen at intel.com>
> Subject: RE: [cfe-dev] clang-cl's <intrin.h>, _tzcnt_u32, and compatibility
> with MSVC's <intrin.h>
>
>
>
> Forwarding your mail to Elad who is working now on intrinsics. Please take
> into account a possible delay in Elad’s response, we are in local holidays
> this week.
>
>
>
> -           Elena
>
>
>
> From: Reid Kleckner [mailto:rnk at google.com]
> Sent: Thursday, September 29, 2016 22:02
> To: Hans Wennborg <hans at chromium.org>; Blank, Guy <guy.blank at intel.com>;
> Aboud, Amjad <amjad.aboud at intel.com>; Demikhovsky, Elena
> <elena.demikhovsky at intel.com>; Craig Topper <craig.topper at gmail.com>;
> Badouh, Asaf <asaf.badouh at intel.com>; Zuckerman, Michael
> <michael.zuckerman at intel.com>; Breger, Igor <igor.breger at intel.com>
> Cc: Nathan Froyd <nfroyd at mozilla.com>; Nico Weber <thakis at chromium.org>;
> cfe-dev <cfe-dev at lists.llvm.org>
> Subject: Re: [cfe-dev] clang-cl's <intrin.h>, _tzcnt_u32, and compatibility
> with MSVC's <intrin.h>
>
>
>
> We've been here before:
>
> https://llvm.org/bugs/show_bug.cgi?id=25544
>
> http://llvm.org/viewvc/llvm-project?view=revision&revision=253358
>
> https://bugs.chromium.org/p/chromium/issues/detail?id=556735
>
>
>
> Nico's change to avoid including bmiintrin.h if _MSC_VER is set to reduce
> compile time brought the problem back.
>
>
>
> Basically, Intel's immintrin.h interface is too big. It's windows.h all over
> again. It significantly slows down builds of simple projects that include
> basic STL headers like <string>. Nico was able to speed up the *overall*
> build time of Chromium by at least 10% (ask him for specifics) by adding all
> these '#if !defined(_MSC_VER)' checks to immintrin.h. However, for
> compatibility, I think we may need intrin.h to include most of that stuff by
> default.
>
>
>
> I added a bunch of Intel folks to basically ask the question: Can we please
> go back to the old days of mmintrin.h, xmmintrin.h, then emmintrin.h?
>
>
>
> If not, we will have to consider adding an evil configuration macro, like
> WIN32_LEAN_AND_MEAN for windows.h, that opts out of all this extra
> immintrin.h functionality to speed up compilation.
>
>
>
> On Thu, Sep 29, 2016 at 11:04 AM, Hans Wennborg <hans at chromium.org> wrote:
>
> On Thu, Sep 29, 2016 at 10:50 AM, Nathan Froyd via cfe-dev
> <cfe-dev at lists.llvm.org> wrote:
>> [While I filed this as https://llvm.org/bugs/show_bug.cgi?id=30506,
>> Paul Robinson suggested in the bug that it might be worth bringing to
>> cfe-dev for wider discussion.]
>>
>> Firefox's copy of FFmpeg includes <intrin.h> on MSVC:
>>
>>
>> https://dxr.mozilla.org/mozilla-central/source/media/ffvpx/libavutil/x86/intmath.h#26
>>
>> and MSVC's <intrin.h> (after including several other headers) declares
>> _tzcnt_u32.  clang-cl declares _tzcnt_u32 in <bmiintrin.h>, but
>> <bmiintrin.h> is only included from <intrin.h> if an appropriate
>> target CPU is detected:
>>
>>
>> https://github.com/llvm-mirror/clang/blob/master/lib/Headers/immintrin.h#L82
>>
>> even though _tzcnt_u32 (or rather, its underlying implementation
>> function, __tzcnt_u32) is explicitly declared to be available
>> everywhere:
>>
>>
>> https://github.com/llvm-mirror/clang/blob/master/lib/Headers/bmiintrin.h#L284
>>
>> The net result is that Firefox's copy of FFmpeg doesn't compile with
>> clang-cl because of this issue.  Upstream has the same code, so I
>> assume the same is true there:
>>
>> https://github.com/FFmpeg/FFmpeg/blob/master/libavutil/x86/intmath.h#L26
>>
>> AFAICT, this behavior was changed by only including <bmiitrin.h> if
>> __BMI__ is defined in:
>>
>> http://reviews.llvm.org/D20291
>>
>> with the desirable goal of reducing compile time.  But this change
>> also broke things like _tzcnt_u32 being available.
>>
>> What is the right thing to do here?  Should <bmiintrin.h> be
>> unconditionally included, or should something else be done?
>
> I think these intrinsics (it's really just tzcnt though?) that are
> useful also for non-BMI targets should probably be moved out of the
> BMI-specific  header.
>
> +thakis and rnk if they have any thoughts on this.
>
>
>
> ---------------------------------------------------------------------
> Intel Israel (74) Limited
>
> This e-mail and any attachments may contain confidential material for
> the sole use of the intended recipient(s). Any review or distribution
> by others is strictly prohibited. If you are not the intended
> recipient, please contact the sender and delete all copies.



More information about the cfe-dev mailing list