<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">On 15 April 2015 at 11:18, Sean Silva <span dir="ltr"><<a href="mailto:chisophugis@gmail.com" target="_blank">chisophugis@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote"><div><div class="h5">On Wed, Apr 15, 2015 at 6:46 PM, Nick Lewycky <span dir="ltr"><<a href="mailto:nlewycky@google.com" target="_blank">nlewycky@google.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><span>On 15 April 2015 at 10:27, Sean Silva <span dir="ltr"><<a href="mailto:chisophugis@gmail.com" target="_blank">chisophugis@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div dir="ltr">In the post-commit of r231792, I suggested the idea of using __attribute__((enable_if(...))) for avoiding the mess of the macros in the builtin headers. AFAIK, the macros are currently used to make sure that the "immediates" are constant expressions, piggybacking on the constant expression check in the __builtin_* call.<div><br></div><div>I've attached a file with a proof-of-concept for using __attribute__((enable_if(...))) for this purpose. I originally though using __builtin_constant_p in the enable_if, but that turns out to not be necessary (see the docs: <a href="http://clang.llvm.org/docs/AttributeReference.html#enable-if" target="_blank">http://clang.llvm.org/docs/AttributeReference.html#enable-if</a> ; the enable_if condition fails for non-constant expressions anyway). The core is:</div><div><br></div><div><div>// Current builtin headers:</div><div>//</div><div>//#define _mm256_insertf128_si256(V1, V2, M) __extension__ ({ \</div><div>// (__m256i)__builtin_shufflevector( \</div><div>// (__v4di)(V1), \</div><div>// (__v4di)_mm256_castsi128_si256((__m128i)(V2)), \</div><div>// (((M) & 1) ? 0 : 4), \</div><div>// (((M) & 1) ? 1 : 5), \</div><div>// (((M) & 1) ? 4 : 2), \</div><div>// (((M) & 1) ? 5 : 3) );})</div><div><br></div><div>// A bit cleaner.</div><div>static __inline __attribute__((__always_inline__, __nodebug__))</div><div>__m256i _mm256_insertf128_si256(__m256i __a, __m128i __b, int __imm8)</div><div>__attribute__((enable_if(__imm8, "'__imm8' must be a constant")))</div><div>{</div><div> if (__imm8 & 1)</div><div> return __builtin_shufflevector(__a, _mm256_castsi128_si256(__b), 0, 1, 4, 5);</div><div> else</div><div> return __builtin_shufflevector(__a, _mm256_castsi128_si256(__b), 4, 5, 2, 3);</div><div>}</div></div></div></blockquote><div><br></div></span><div>I think:</div><div><br></div><div><span>static __inline __attribute__((__always_inline__, __nodebug__))<br>__m256i _mm256_insertf128_si256(__m256i __a, __m128i __b, int __imm8)<br></span>__attribute__((enable_if(__imm8 & 1 == 1, "'__imm8' must be a constant")))<span><br>{<br> return __builtin_shufflevector(__a, _mm256_castsi128_si256(__b), 0, 1, 4, 5);<br>}</span></div><div><br></div><div><span>static __inline __attribute__((__always_inline__, __nodebug__))<br>__m256i _mm256_insertf128_si256(__m256i __a, __m128i __b, int __imm8)<br></span>__attribute__((enable_if(__imm8 & 1 == 0, "'__imm8' must be a constant")))<span><br>{<br> return __builtin_shufflevector(__a, _mm256_castsi128_si256(__b), 4, 5, 2, 3);<br>}<br></span></div></div></div></div></blockquote><div><br></div></div></div><div>Unfortunately this approach doesn't scale to some of the more nontrivial ones like _mm256_alignr_epi8 in <a href="http://reviews.llvm.org/D8301" target="_blank">http://reviews.llvm.org/D8301</a> .</div></div></div></div></blockquote><div><br></div><div>Fair enough. I would still try to hoist out a simple if-statement though, even if we can't do it for all of the builtins.</div><div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><span class=""><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><span><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div dir="ltr"><div>Nick, are you okay using enable_if for this? It's sort of a hack but if we are going to be carrying this attribute around forever (has it reached that level of compatibility guarantee yet?), we might as well use it to solve this problem for us.</div></div></blockquote><div><br></div></span><div>Attribute enable_if is here to stay, though it may change meaning in corner cases (notably multiple enable_if's on a single function). Using it in the compiler's own header files is fine. Does anyone ever try to take the address of _mm256_insertf128_si256 (I assume not if it's a macro)?</div></div></div></div>
</blockquote></span></div><br></div><div class="gmail_extra">Yeah, I think taking the address of these is a "don't do that" sort of thing. "static inline always_inline"</div></div></blockquote><div><br></div><div>You can still take the address of a static inline always_inline function (at least with GCC 4.7), but I guess we're okay not allowing that.</div><div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra">What changes do you foresee in the case of multiple enable_if's? The current behavior (in my empirical testing; haven't used the source) seems to be a good fit for this use case.</div></div></blockquote><div><br></div><div>I don't know what we'd change it to, but we currently have a problem where you end up with a combinatorial explosion in number of overloads necessary to represent the check you want. I don't recall the situation off the top of my head.</div><div><br></div></div></div></div>