[cfe-dev] RFC: do not optimize on basis of __attribute__((nonnull)) in glibc headers

Hal Finkel via cfe-dev cfe-dev at lists.llvm.org
Tue Jan 3 15:10:17 PST 2017


On 01/03/2017 04:06 PM, Richard Smith via cfe-dev wrote:
> Via https://reviews.llvm.org/D27855, LLVM is likely to gain the 
> ability to delete null checks in callers based on 
> __attribute__((nonnull)) on the callee. This interacts badly with 
> glibc's choice to mark the pointer parameters of memcpy, memmove, etc. 
> as __attribute__((nonnull)) -- it is relatively common for programs to 
> pass a null pointer and a zero size to these functions, and in 
> practice libc implementations accept such usage. Indeed, LLVM's 
> lowering of @llvm.memcpy intrinsics relies on these calls working.
>
> Deleting a null pointer check on p after a memcpy(p, q, 0) call seems 
> extremely user-hostile, and very unlikely to result in a valuable 
> improvement to the program, so I propose that we stop lowering 
> __attribute__((nonnull)) on these builtin library functions to the 
> llvm nonnull attribute.
>
> (Chandler is working on a paper for the C++ committee proposing to 
> give these functions defined behavior when given a null pointer and a 
> zero size, but optimizing on the basis of these particular nonnull 
> attributes seems like a bad idea regardless of the C or C++ 
> committees' decisions.)
>
> Thoughts?

On the one hand, GCC has already started optimizing based on the nonnull 
attribute on memcpy, and so that ship has sailed. We already need to fix 
all of our code to work in the face of those optimizations, regardless 
of what Clang/LLVM does, because the vast majority of our code is 
expected to work correctly when compiled with GCC. Even if we fix this 
in C/C++ at the standards level now, it seems likely to be too late. 
Moreover, a conforming implementation of the underlying library function 
could still do problematic things.

On the other hand, I don't feel like we should be unnecessarily hostile 
to users. This particular "feature" of how memcpy is (not) specified is 
not well known and gives fairly-obvious-looking code undefined behavior. 
No pointer casting or other likely-to-cause-problems constructs 
required. I suspect that it is useful to preserve the fact that 
memcpy(p, q, n) implies that p and q are dereferenceable[_or_null](n), 
although we can certainly do that in the optimizer regardless of what 
attributes are applied if we must, and strip the attribute in places 
that tend to be problematic if desired. We should have a flag that 
disables this stripping if we do it by default.

We could also have a mode that inserts a null check, or a zero-size 
check, and branch around all calls to memcpy and friends. This has the 
advantage of protecting against odd, but conforming, library 
implementations that assume the pointers are not null (not that I know 
of any such implementation).

In short, I don't have a strong feeling on this either way. I'm fine 
with stripping the attributes. I expect that ubsan will (continue to) 
complain about the situation regardless.

Thanks again,
Hal

>
>
> _______________________________________________
> cfe-dev mailing list
> cfe-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev

-- 
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20170103/28b24f17/attachment.html>


More information about the cfe-dev mailing list