[cfe-commits] r149086 - in /cfe/trunk: include/clang/Basic/TargetInfo.h lib/Basic/Targets.cpp lib/CodeGen/CGBuiltin.cpp test/CodeGen/builtin-count-zeros.c

Sat Jan 28 10:36:03 PST 2012

On Jan 26, 2012, at 7:16 PM, Chris Lattner wrote:

> 
> On Jan 26, 2012, at 4:47 PM, Chandler Carruth wrote:
> 
>> BTW, I agree about grandfathering in existing ARM code that relies on this, but still potentially exposing safe and sane entry points for future code. Maybe the Intel intrinsics provide this interface? I don't recall.
>> 
>> On Thu, Jan 26, 2012 at 3:02 PM, Bob Wilson <bob.wilson at apple.com> wrote:
>> How is it a divergence?  GCC says that the values are undefined.  This does not conflict with that.  It makes no difference on ARM codegen so the only real effect is to disable optimizations that try to take advantage of that undef.
>> 
>> It is a divergence in that Clang is providing stricter guarantees than GCC. When folks complain "Hey, I developed my code w/ Clang, and everything worked great, but when my customer built it with GCC it miscompiled and exploded. WTF?" I want to have documentation of the Clang language extension that their code depends on.
> 
> FWIW, I agree with Bob.  Call this "implementation defined" behavior.

I don't look at this as "clang providing stricter guarantees".  The built-ins are documented to be undefined for zero, and I don't want to change that.  If you want to write portable code, you need to avoid that undefined behavior.  That should be the message to clang users.

Instead, this change is just making clang be less aggressive about exploiting that undefined behavior on certain targets.  It's an internal implementation detail of the compiler.  There are lots of ways we can take advantage of various kinds of undefs, some we use and some we don't.  We don't want to get into documenting the different cases where we decide to optimize based on undefs.

> 
> As a random request, Bob, can you also do this for PowerPC?

Done in svn r149181

>  I note that LLVM's own MathExtras.h header has:
> 
> inline unsigned CountLeadingZeros_32(uint32_t Value) {
>   unsigned Count; 
>   // PowerPC is defined for __builtin_clz(0)
> #if !defined(__ppc__) && !defined(__ppc64__)
>   if (!Value) return 32;
> #endif
>   Count = __builtin_clz(Value);
>   return Count;
> }
> 
> How about that :)

Eww.

> 
>> As an unrelated (and orthogonal to your change) note, it would be good to make sure that LLVM folds code like:
>> 
>> int f(int x) { return x ? __builtin_clz(x) : sizeof(int)*8; }
>> 
>> on ARM to not have the dead condition. I tried to do this for a few cases, but I'm worried I might not have gotten it for all of them... Dunno when I'll get back to further hacking on this style of optimization.
> 
> Yes, we "should" fold that on any target for IR that is using "defined at zero" semantics.  In fact, we should probably fold that (when using "undefined at zero" semantics) to set the "defined at zero" bit and have the code generator expand out the right code.

I just tried that example, and it's not getting folded now.  I'll try fix it but I keep getting distracted, so in case I forget, I added a note in the ARM README file.  svn r149182
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-commits/attachments/20120128/2e5d1628/attachment.html>