[cfe-commits] [PATCH] Fixing include order for Windows

Aaron Ballman aaron at aaronballman.com
Sun Feb 19 18:48:03 PST 2012


On Sun, Feb 19, 2012 at 7:41 PM, Chandler Carruth <chandlerc at google.com> wrote:
> On Sun, Feb 19, 2012 at 5:35 PM, Aaron Ballman <aaron at aaronballman.com>
> wrote:
>>
>> On Sun, Feb 19, 2012 at 5:18 PM, Chandler Carruth <chandlerc at google.com>
>> wrote:
>> > On Sun, Feb 19, 2012 at 3:09 PM, Aaron Ballman <aaron at aaronballman.com>
>> > wrote:
>> >>
>> >> On Sun, Feb 19, 2012 at 4:45 PM, Eli Friedman <eli.friedman at gmail.com>
>> >> wrote:
>> >> > On Sun, Feb 19, 2012 at 10:28 AM, Aaron Ballman
>> >> > <aaron at aaronballman.com>
>> >> > wrote:
>> >> >> On Windows, the default search paths for includes would put the
>> >> >> CMake-found headers (<build dir>\lib\clang\3.1\include) before the
>> >> >> OS
>> >> >> headers.  For the most part, this is fine.  However, there are some
>> >> >> header files that share names with the system headers, which are
>> >> >> incorrect.  For instance, immintrin.h is completely different from
>> >> >> the
>> >> >> SDK include.  This causes miscompilations in surprising places
>> >> >> because
>> >> >> a system header will attempt to include one of these duplicates, and
>> >> >> find the clang header instead of the other system header.
>> >> >>
>> >> >> A simple test case of: #include <map> when building from MSVC
>> >> >> demonstrates the issue.
>> >> >>
>> >> >> A user can work around this behavior by specifying -nobuiltininc on
>> >> >> the command line, but that seems rather obtuse for such normal use
>> >> >> cases.
>> >> >>
>> >> >> I've fixed this by changing the order that the headers are included
>> >> >> from the toolchain.  The SDK headers get added *before* the clang
>> >> >> headers, and this resolves the issue.
>> >> >>
>> >> >> This patch does cause one regression test to fail
>> >> >> (Modules\compiler_builtins.m).  This is because the platform SDK
>> >> >> intrinsics are differently structured from the ones that come with
>> >> >> clang.  The OS APIs expect the SDK intrinsics, but our tests seem to
>> >> >> require our own intrinsics.  I'm open to suggestions.
>> >> >
>> >> > If on Windows, we specifically need the OS version of immintrin.h, we
>> >> > should modify our version to forward to it; completely ignoring the
>> >> > builtin headers is a bad idea (for example,
>> >> > http://llvm.org/bugs/show_bug.cgi?id=11918 is a miscompile caused by
>> >> > pulling in the wrong version of stdarg.h).
>> >>
>> >> The patch doesn't completely ignore the builtin headers though -- it
>> >> simply prefers the OS headers over the builtin headers.
>> >
>> >
>> > That doesn't work in the general case though -- look at all the places
>> > in
>> > our builtin headers where we specifically forward *to* the system
>> > headers.
>> > If the builtin doesn't come first, lots of things go wrong here.
>> >
>> > We should make our builtins always come first, and always be correct
>> > when
>> > they come first.
>> >
>> >> That being
>> >> said, I am not against forwarding our version.  What would be a good
>> >> approach to it?
>> >
>> >
>> > Look at the other builtin headers that use '__has_include_next' and
>> > 'include_next' to fold together a Clang builtin header and a system
>> > builtin
>> > header.
>> >
>> > I suspect you will have to essentially tack on the system 'immintrin.h'
>> > header to the end of the builtin, as we do expect #include <immintrin.h>
>> > to
>> > pull in the Clang builtin bits... Essentially, this doesn't seem like it
>> > should be either-or, it should be both.
>>
>> So how should I handle the duplicate declarations?  For instance:
>>
>> // mmintrin.h from MSVC
>> typedef union __declspec(intrin_type) _CRT_ALIGN(8) __m64
>> {
>>    unsigned __int64    m64_u64;
>>    float               m64_f32[2];
>>    __int8              m64_i8[8];
>>    __int16             m64_i16[4];
>>    __int32             m64_i32[2];
>>    __int64             m64_i64;
>>    unsigned __int8     m64_u8[8];
>>    unsigned __int16    m64_u16[4];
>>    unsigned __int32    m64_u32[2];
>> } __m64;
>>
>> void  _m_empty(void);
>> #define _mm_empty         _m_empty
>>
>> // mmintrin.h from clang
>> typedef long long __m64 __attribute__((__vector_size__(8)));
>>
>> static __inline__ void __attribute__((__always_inline__, __nodebug__))
>> _mm_empty(void)
>> {
>>    __builtin_ia32_emms();
>> }
>> #define _m_empty _mm_empty
>>
>> It seems to me that trying to keep *both* headers is going to be
>> rather challenging given that both make the same typedefs in
>> incompatible ways, the declarations/macros are opposite one another,
>> etc.
>
>
> Hold on -- the MS one is trying to provide the same intel intrinsics? Now I
> think that the original search order and header file is correct -- we don't
> want the MS version of this header as it is likely to use MS extensions
> Clang doesn't support (or for which Clang has better models). Maybe what we
> really need is to understand what the clang immintrin.h *lacks* in order for
> MS code to use it?

Yes, MS has the same intrinsics, but with differently-structured
header files.  The MS immintrin.h has all of the AVX intrinsics and
includes only wmmintrin.h, which includes nmmintrin.h for SSE4.2, etc
The clang immintrin.h checks various defines (__AVX__, __AES__, etc)
before including all of the files.

One thing which does get us further in parsing is to force define the
includes from clang's intrinsics:

#define __AVX__
#define __AES__
#define __SSE3__
#define __SSSE3__
#define __SSE4_1__
#define __SSE4_2__

There are still compile errors, however, because of the way
Microsoft's intrin.h header file works.  It has some macro hacks to
basically erase some declarations.  For instance:

#define __MACHINE(X)          X;
#define __MACHINEX86X_IA64    __MACHINE
...
#define __MACHINEZ(X)         /* NOTHING */
...
#if !(defined(_M_IX86) || defined(_M_IA64))
#undef __MACHINEX86X_IA64
#define __MACHINEX86X_IA64    __MACHINEZ
#endif
...
__MACHINEX86X_IA64(__m64 _m_pshufw(__m64,int))

// Microsoft headers
extern __m64 _m_pshufw(__m64, int);
#define _mm_shuffle_pi16  _m_pshufw

// Clang headers
#define _mm_shuffle_pi16(a, n) __extension__ ({ \
  __m64 __a = (a); \
  (__m64)__builtin_ia32_pshufw((__v4hi)__a, (n)); })
#define _m_pshufw _mm_shuffle_pi16

This causes compile errors with clang like this:

error: expected unqualified-id
__MACHINEX86X_IA64(__m64 _m_pshufw(__m64,int))

Because this expands out into:

__m64 __extension__ ({ __m64 __a = (__m64);
(__m64)__builtin_ia32_pshufw((__v4hi)__a, (int)); });

Beyond that, I'm a bit more worried about the drastic differences
between the type declarations.  I know there is code out there which
makes use of the union for __m64, for instance.  So that code will
break because the clang intrinsics do not use it (they use long long
instead).

But perhaps that can be worked around by conditionally compiling in
the declarations used by the Microsoft headers?

~Aaron




More information about the cfe-commits mailing list