[clang] 4ede887 - PR45402: Make the restrictions on constant evaluation of memcmp and

Richard Smith via cfe-commits cfe-commits at lists.llvm.org
Fri Apr 3 18:26:54 PDT 2020


Author: Richard Smith
Date: 2020-04-03T18:26:14-07:00
New Revision: 4ede8879924c08ae5b495d3f421c167d822a60be

URL: https://github.com/llvm/llvm-project/commit/4ede8879924c08ae5b495d3f421c167d822a60be
DIFF: https://github.com/llvm/llvm-project/commit/4ede8879924c08ae5b495d3f421c167d822a60be.diff

LOG: PR45402: Make the restrictions on constant evaluation of memcmp and
memchr consistent and comprehensible, and document them.

We previously allowed evaluation of memcmp on arrays of integers of any
size, so long as the call evaluated to 0, and allowed evaluation of
memchr on any array of integral type of size 1 (including enums). The
purpose of constant-evaluating these builtins is only to support
constexpr std::char_traits, so we now consistently allow them on arrays
of (possibly signed or unsigned) char only.

Added: 
    

Modified: 
    clang/docs/LanguageExtensions.rst
    clang/include/clang/Basic/DiagnosticASTKinds.td
    clang/lib/AST/ExprConstant.cpp
    clang/test/SemaCXX/constexpr-string.cpp

Removed: 
    


################################################################################
diff  --git a/clang/docs/LanguageExtensions.rst b/clang/docs/LanguageExtensions.rst
index 558ce7dee653..6dcfd1a49f06 100644
--- a/clang/docs/LanguageExtensions.rst
+++ b/clang/docs/LanguageExtensions.rst
@@ -2333,10 +2333,11 @@ String builtins
 ---------------
 
 Clang provides constant expression evaluation support for builtins forms of
-the following functions from the C standard library ``<string.h>`` header:
+the following functions from the C standard library headers
+``<string.h>`` and ``<wchar.h>``:
 
 * ``memchr``
-* ``memcmp``
+* ``memcmp`` (and its deprecated BSD / POSIX alias ``bcmp``)
 * ``strchr``
 * ``strcmp``
 * ``strlen``
@@ -2366,7 +2367,11 @@ In addition to the above, one further builtin is provided:
 constant expressions in C++11 onwards (where a cast from ``void*`` to ``char*``
 is disallowed in general).
 
-Support for constant expression evaluation for the above builtins be detected
+Constant evaluation support for the ``__builtin_mem*`` functions is provided
+only for arrays of ``char``, ``signed char``, or ``unsigned char``, despite
+these functions accepting an argument of type ``const void*``.
+
+Support for constant expression evaluation for the above builtins can be detected
 with ``__has_feature(cxx_constexpr_string_builtins)``.
 
 Memory builtins
@@ -2386,6 +2391,25 @@ more information.
 
 Note that the `size` argument must be a compile time constant.
 
+Clang provides constant expression evaluation support for builtin forms of the
+following functions from the C standard library headers
+``<string.h>`` and ``<wchar.h>``:
+
+* ``memcpy``
+* ``memmove``
+* ``wmemcpy``
+* ``wmemmove``
+
+In each case, the builtin form has the name of the C library function prefixed
+by ``__builtin_``.
+
+Constant evaluation support is only provided when the source and destination
+are pointers to arrays with the same trivially copyable element type, and the
+given size is an exact multiple of the element size that is no greater than
+the number of elements accessible through the source and destination operands.
+
+Constant evaluation support is not yet provided for ``__builtin_memcpy_inline``.
+
 Atomic Min/Max builtins with memory ordering
 --------------------------------------------
 

diff  --git a/clang/include/clang/Basic/DiagnosticASTKinds.td b/clang/include/clang/Basic/DiagnosticASTKinds.td
index 544573edffdf..a1415f9ec0e1 100644
--- a/clang/include/clang/Basic/DiagnosticASTKinds.td
+++ b/clang/include/clang/Basic/DiagnosticASTKinds.td
@@ -244,6 +244,12 @@ def note_constexpr_unsupported_unsized_array : Note<
 def note_constexpr_unsized_array_indexed : Note<
   "indexing of array without known bound is not allowed "
   "in a constant expression">;
+def note_constexpr_memcmp_unsupported : Note<
+  "constant evaluation of %0 between arrays of types %1 and %2 "
+  "is not supported; only arrays of narrow character types can be compared">;
+def note_constexpr_memchr_unsupported : Note<
+  "constant evaluation of %0 on array of type %1 "
+  "is not supported; only arrays of narrow character types can be searched">;
 def note_constexpr_memcpy_null : Note<
   "%select{source|destination}2 of "
   "'%select{%select{memcpy|wmemcpy}1|%select{memmove|wmemmove}1}0' "

diff  --git a/clang/lib/AST/ExprConstant.cpp b/clang/lib/AST/ExprConstant.cpp
index c6e1cc7b67df..a83b2e24e17f 100644
--- a/clang/lib/AST/ExprConstant.cpp
+++ b/clang/lib/AST/ExprConstant.cpp
@@ -8469,8 +8469,12 @@ bool PointerExprEvaluator::VisitBuiltinCallExpr(const CallExpr *E,
     }
     // Give up on byte-oriented matching against multibyte elements.
     // FIXME: We can compare the bytes in the correct order.
-    if (IsRawByte && Info.Ctx.getTypeSizeInChars(CharTy) != CharUnits::One())
+    if (IsRawByte && !CharTy->isCharType()) {
+      Info.FFDiag(E, diag::note_constexpr_memchr_unsupported)
+          << (std::string("'") + Info.Ctx.BuiltinInfo.getName(BuiltinOp) + "'")
+          << CharTy;
       return false;
+    }
     // Figure out what value we're actually looking for (after converting to
     // the corresponding unsigned type if necessary).
     uint64_t DesiredVal;
@@ -8586,6 +8590,7 @@ bool PointerExprEvaluator::VisitBuiltinCallExpr(const CallExpr *E,
     QualType T = Dest.Designator.getType(Info.Ctx);
     QualType SrcT = Src.Designator.getType(Info.Ctx);
     if (!Info.Ctx.hasSameUnqualifiedType(T, SrcT)) {
+      // FIXME: Consider using our bit_cast implementation to support this.
       Info.FFDiag(E, diag::note_constexpr_memcpy_type_pun) << Move << SrcT << T;
       return false;
     }
@@ -11149,6 +11154,16 @@ bool IntExprEvaluator::VisitBuiltinCallExpr(const CallExpr *E,
                 CharTy1, E->getArg(0)->getType()->getPointeeType()) &&
             Info.Ctx.hasSameUnqualifiedType(CharTy1, CharTy2)));
 
+    // For memcmp, allow comparing any arrays of '[[un]signed] char',
+    // but no other types.
+    if (IsRawByte && !(CharTy1->isCharType() && CharTy2->isCharType())) {
+      // FIXME: Consider using our bit_cast implementation to support this.
+      Info.FFDiag(E, diag::note_constexpr_memcmp_unsupported)
+          << (std::string("'") + Info.Ctx.BuiltinInfo.getName(BuiltinOp) + "'")
+          << CharTy1 << CharTy2;
+      return false;
+    }
+
     const auto &ReadCurElems = [&](APValue &Char1, APValue &Char2) {
       return handleLValueToRValueConversion(Info, E, CharTy1, String1, Char1) &&
              handleLValueToRValueConversion(Info, E, CharTy2, String2, Char2) &&
@@ -11159,57 +11174,6 @@ bool IntExprEvaluator::VisitBuiltinCallExpr(const CallExpr *E,
              HandleLValueArrayAdjustment(Info, E, String2, CharTy2, 1);
     };
 
-    if (IsRawByte) {
-      uint64_t BytesRemaining = MaxLength;
-      // Pointers to const void may point to objects of incomplete type.
-      if (CharTy1->isIncompleteType()) {
-        Info.FFDiag(E, diag::note_constexpr_ltor_incomplete_type) << CharTy1;
-        return false;
-      }
-      if (CharTy2->isIncompleteType()) {
-        Info.FFDiag(E, diag::note_constexpr_ltor_incomplete_type) << CharTy2;
-        return false;
-      }
-      uint64_t CharTy1Width{Info.Ctx.getTypeSize(CharTy1)};
-      CharUnits CharTy1Size = Info.Ctx.toCharUnitsFromBits(CharTy1Width);
-      // Give up on comparing between elements with disparate widths.
-      if (CharTy1Size != Info.Ctx.getTypeSizeInChars(CharTy2))
-        return false;
-      uint64_t BytesPerElement = CharTy1Size.getQuantity();
-      assert(BytesRemaining && "BytesRemaining should not be zero: the "
-                               "following loop considers at least one element");
-      while (true) {
-        APValue Char1, Char2;
-        if (!ReadCurElems(Char1, Char2))
-          return false;
-        // We have compatible in-memory widths, but a possible type and
-        // (for `bool`) internal representation mismatch.
-        // Assuming two's complement representation, including 0 for `false` and
-        // 1 for `true`, we can check an appropriate number of elements for
-        // equality even if they are not byte-sized.
-        APSInt Char1InMem = Char1.getInt().extOrTrunc(CharTy1Width);
-        APSInt Char2InMem = Char2.getInt().extOrTrunc(CharTy1Width);
-        if (Char1InMem.ne(Char2InMem)) {
-          // If the elements are byte-sized, then we can produce a three-way
-          // comparison result in a straightforward manner.
-          if (BytesPerElement == 1u) {
-            // memcmp always compares unsigned chars.
-            return Success(Char1InMem.ult(Char2InMem) ? -1 : 1, E);
-          }
-          // The result is byte-order sensitive, and we have multibyte elements.
-          // FIXME: We can compare the remaining bytes in the correct order.
-          return false;
-        }
-        if (!AdvanceElems())
-          return false;
-        if (BytesRemaining <= BytesPerElement)
-          break;
-        BytesRemaining -= BytesPerElement;
-      }
-      // Enough elements are equal to account for the memcmp limit.
-      return Success(0, E);
-    }
-
     bool StopAtNull =
         (BuiltinOp != Builtin::BImemcmp && BuiltinOp != Builtin::BIbcmp &&
          BuiltinOp != Builtin::BIwmemcmp &&
@@ -11227,7 +11191,7 @@ bool IntExprEvaluator::VisitBuiltinCallExpr(const CallExpr *E,
       APValue Char1, Char2;
       if (!ReadCurElems(Char1, Char2))
         return false;
-      if (Char1.getInt() != Char2.getInt()) {
+      if (Char1.getInt().ne(Char2.getInt())) {
         if (IsWide) // wmemcmp compares with wchar_t signedness.
           return Success(Char1.getInt() < Char2.getInt() ? -1 : 1, E);
         // memcmp always compares unsigned chars.

diff  --git a/clang/test/SemaCXX/constexpr-string.cpp b/clang/test/SemaCXX/constexpr-string.cpp
index f540be8f8e5b..79ac3bf2cc4d 100644
--- a/clang/test/SemaCXX/constexpr-string.cpp
+++ b/clang/test/SemaCXX/constexpr-string.cpp
@@ -47,7 +47,7 @@ extern "C" {
   extern wchar_t *wmemmove(wchar_t *d, const wchar_t *s, size_t n);
 }
 
-# 45 "SemaCXX/constexpr-string.cpp" 2
+# 51 "SemaCXX/constexpr-string.cpp" 2
 namespace Strlen {
   constexpr int n = __builtin_strlen("hello"); // ok
   static_assert(n == 5);
@@ -121,13 +121,13 @@ namespace StrcmpEtc {
   extern struct Incomplete incomplete;
   static_assert(__builtin_memcmp(&incomplete, "", 0u) == 0);
   static_assert(__builtin_memcmp("", &incomplete, 0u) == 0);
-  static_assert(__builtin_memcmp(&incomplete, "", 1u) == 42); // expected-error {{not an integral constant}} expected-note {{read of incomplete type 'struct Incomplete'}}
-  static_assert(__builtin_memcmp("", &incomplete, 1u) == 42); // expected-error {{not an integral constant}} expected-note {{read of incomplete type 'struct Incomplete'}}
+  static_assert(__builtin_memcmp(&incomplete, "", 1u) == 42); // expected-error {{not an integral constant}} expected-note {{not supported}}
+  static_assert(__builtin_memcmp("", &incomplete, 1u) == 42); // expected-error {{not an integral constant}} expected-note {{not supported}}
 
   static_assert(__builtin_bcmp(&incomplete, "", 0u) == 0);
   static_assert(__builtin_bcmp("", &incomplete, 0u) == 0);
-  static_assert(__builtin_bcmp(&incomplete, "", 1u) == 42); // expected-error {{not an integral constant}} expected-note {{read of incomplete type 'struct Incomplete'}}
-  static_assert(__builtin_bcmp("", &incomplete, 1u) == 42); // expected-error {{not an integral constant}} expected-note {{read of incomplete type 'struct Incomplete'}}
+  static_assert(__builtin_bcmp(&incomplete, "", 1u) == 42); // expected-error {{not an integral constant}} expected-note {{not supported}}
+  static_assert(__builtin_bcmp("", &incomplete, 1u) == 42); // expected-error {{not an integral constant}} expected-note {{not supported}}
 
   constexpr unsigned char ku00fe00[] = {0x00, 0xfe, 0x00};
   constexpr unsigned char ku00feff[] = {0x00, 0xfe, 0xff};
@@ -155,11 +155,11 @@ namespace StrcmpEtc {
 
   struct Bool3Tuple { bool bb[3]; };
   constexpr Bool3Tuple kb000100 = {{false, true, false}};
-  static_assert(sizeof(bool) != 1u || __builtin_memcmp(ks00fe00, kb000100.bb, 1) == 0);
-  static_assert(sizeof(bool) != 1u || __builtin_memcmp(ks00fe00, kb000100.bb, 2) == 1);
+  static_assert(sizeof(bool) != 1u || __builtin_memcmp(ks00fe00, kb000100.bb, 1) == 0); // expected-error {{constant}} expected-note {{not supported}}
+  static_assert(sizeof(bool) != 1u || __builtin_memcmp(ks00fe00, kb000100.bb, 2) == 1); // expected-error {{constant}} expected-note {{not supported}}
 
-  static_assert(sizeof(bool) != 1u || __builtin_bcmp(ks00fe00, kb000100.bb, 1) == 0);
-  static_assert(sizeof(bool) != 1u || __builtin_bcmp(ks00fe00, kb000100.bb, 2) != 0);
+  static_assert(sizeof(bool) != 1u || __builtin_bcmp(ks00fe00, kb000100.bb, 1) == 0); // expected-error {{constant}} expected-note {{not supported}}
+  static_assert(sizeof(bool) != 1u || __builtin_bcmp(ks00fe00, kb000100.bb, 2) != 0); // expected-error {{constant}} expected-note {{not supported}}
 
   constexpr long ksl[] = {0, -1};
   constexpr unsigned int kui[] = {0, 0u - 1};
@@ -173,25 +173,25 @@ namespace StrcmpEtc {
       return nullptr;
     }
   }
-  static_assert(__builtin_memcmp(ksl, kuSizeofLong(), sizeof(long) - 1) == 0);
-  static_assert(__builtin_memcmp(ksl, kuSizeofLong(), sizeof(long) + 0) == 0);
-  static_assert(__builtin_memcmp(ksl, kuSizeofLong(), sizeof(long) + 1) == 0);
-  static_assert(__builtin_memcmp(ksl, kuSizeofLong(), 2*sizeof(long) - 1) == 0);
-  static_assert(__builtin_memcmp(ksl, kuSizeofLong(), 2*sizeof(long) + 0) == 0);
-  static_assert(__builtin_memcmp(ksl, kuSizeofLong(), 2*sizeof(long) + 1) == 42); // expected-error {{not an integral constant}} expected-note {{dereferenced one-past-the-end}}
-  static_assert(__builtin_memcmp(ksl + 1, kuSizeofLong() + 1, sizeof(long) - 1) == 0);
-  static_assert(__builtin_memcmp(ksl + 1, kuSizeofLong() + 1, sizeof(long) + 0) == 0);
-  static_assert(__builtin_memcmp(ksl + 1, kuSizeofLong() + 1, sizeof(long) + 1) == 42); // expected-error {{not an integral constant}} expected-note {{dereferenced one-past-the-end}}
-
-  static_assert(__builtin_bcmp(ksl, kuSizeofLong(), sizeof(long) - 1) == 0);
-  static_assert(__builtin_bcmp(ksl, kuSizeofLong(), sizeof(long) + 0) == 0);
-  static_assert(__builtin_bcmp(ksl, kuSizeofLong(), sizeof(long) + 1) == 0);
-  static_assert(__builtin_bcmp(ksl, kuSizeofLong(), 2*sizeof(long) - 1) == 0);
-  static_assert(__builtin_bcmp(ksl, kuSizeofLong(), 2*sizeof(long) + 0) == 0);
-  static_assert(__builtin_bcmp(ksl, kuSizeofLong(), 2*sizeof(long) + 1) == 42); // expected-error {{not an integral constant}} expected-note {{dereferenced one-past-the-end}}
-  static_assert(__builtin_bcmp(ksl + 1, kuSizeofLong() + 1, sizeof(long) - 1) == 0);
-  static_assert(__builtin_bcmp(ksl + 1, kuSizeofLong() + 1, sizeof(long) + 0) == 0);
-  static_assert(__builtin_bcmp(ksl + 1, kuSizeofLong() + 1, sizeof(long) + 1) == 42); // expected-error {{not an integral constant}} expected-note {{dereferenced one-past-the-end}}
+  static_assert(__builtin_memcmp(ksl, kuSizeofLong(), sizeof(long) - 1) == 0); // expected-error {{constant}} expected-note {{not supported}}
+  static_assert(__builtin_memcmp(ksl, kuSizeofLong(), sizeof(long) + 0) == 0); // expected-error {{constant}} expected-note {{not supported}}
+  static_assert(__builtin_memcmp(ksl, kuSizeofLong(), sizeof(long) + 1) == 0); // expected-error {{constant}} expected-note {{not supported}}
+  static_assert(__builtin_memcmp(ksl, kuSizeofLong(), 2*sizeof(long) - 1) == 0); // expected-error {{constant}} expected-note {{not supported}}
+  static_assert(__builtin_memcmp(ksl, kuSizeofLong(), 2*sizeof(long) + 0) == 0); // expected-error {{constant}} expected-note {{not supported}}
+  static_assert(__builtin_memcmp(ksl, kuSizeofLong(), 2*sizeof(long) + 1) == 42); // expected-error {{constant}} expected-note {{not supported}}
+  static_assert(__builtin_memcmp(ksl + 1, kuSizeofLong() + 1, sizeof(long) - 1) == 0); // expected-error {{constant}} expected-note {{not supported}}
+  static_assert(__builtin_memcmp(ksl + 1, kuSizeofLong() + 1, sizeof(long) + 0) == 0); // expected-error {{constant}} expected-note {{not supported}}
+  static_assert(__builtin_memcmp(ksl + 1, kuSizeofLong() + 1, sizeof(long) + 1) == 42); // expected-error {{constant}} expected-note {{not supported}}
+
+  static_assert(__builtin_bcmp(ksl, kuSizeofLong(), sizeof(long) - 1) == 0); // expected-error {{constant}} expected-note {{not supported}}
+  static_assert(__builtin_bcmp(ksl, kuSizeofLong(), sizeof(long) + 0) == 0); // expected-error {{constant}} expected-note {{not supported}}
+  static_assert(__builtin_bcmp(ksl, kuSizeofLong(), sizeof(long) + 1) == 0); // expected-error {{constant}} expected-note {{not supported}}
+  static_assert(__builtin_bcmp(ksl, kuSizeofLong(), 2*sizeof(long) - 1) == 0); // expected-error {{constant}} expected-note {{not supported}}
+  static_assert(__builtin_bcmp(ksl, kuSizeofLong(), 2*sizeof(long) + 0) == 0); // expected-error {{constant}} expected-note {{not supported}}
+  static_assert(__builtin_bcmp(ksl, kuSizeofLong(), 2*sizeof(long) + 1) == 42); // expected-error {{constant}} expected-note {{not supported}}
+  static_assert(__builtin_bcmp(ksl + 1, kuSizeofLong() + 1, sizeof(long) - 1) == 0); // expected-error {{constant}} expected-note {{not supported}}
+  static_assert(__builtin_bcmp(ksl + 1, kuSizeofLong() + 1, sizeof(long) + 0) == 0); // expected-error {{constant}} expected-note {{not supported}}
+  static_assert(__builtin_bcmp(ksl + 1, kuSizeofLong() + 1, sizeof(long) + 1) == 42); // expected-error {{constant}} expected-note {{not supported}}
 
   constexpr int a = strcmp("hello", "world"); // expected-error {{constant expression}} expected-note {{non-constexpr function 'strcmp' cannot be used in a constant expression}}
   constexpr int b = strncmp("hello", "world", 3); // expected-error {{constant expression}} expected-note {{non-constexpr function 'strncmp' cannot be used in a constant expression}}
@@ -385,14 +385,14 @@ namespace StrchrEtc {
   enum class E : unsigned char {};
   struct EPair { E e, f; };
   constexpr EPair ee{E{240}};
-  static_assert(__builtin_memchr(&ee.e, 240, 1) == &ee.e);
+  static_assert(__builtin_memchr(&ee.e, 240, 1) == &ee.e); // expected-error {{constant}} expected-note {{not supported}}
 
   constexpr bool kBool[] = {false, true, false};
   constexpr const bool *const kBoolPastTheEndPtr = kBool + 3;
-  static_assert(sizeof(bool) != 1u || __builtin_memchr(kBoolPastTheEndPtr - 3, 1, 99) == kBool + 1);
-  static_assert(sizeof(bool) != 1u || __builtin_memchr(kBool + 1, 0, 99) == kBoolPastTheEndPtr - 1);
-  static_assert(sizeof(bool) != 1u || __builtin_memchr(kBoolPastTheEndPtr - 3, -1, 3) == nullptr);
-  static_assert(sizeof(bool) != 1u || __builtin_memchr(kBoolPastTheEndPtr, 0, 1) == nullptr); // expected-error {{not an integral constant}} expected-note {{dereferenced one-past-the-end}}
+  static_assert(sizeof(bool) != 1u || __builtin_memchr(kBoolPastTheEndPtr - 3, 1, 99) == kBool + 1); // expected-error {{constant}} expected-note {{not supported}}
+  static_assert(sizeof(bool) != 1u || __builtin_memchr(kBool + 1, 0, 99) == kBoolPastTheEndPtr - 1); // expected-error {{constant}} expected-note {{not supported}}
+  static_assert(sizeof(bool) != 1u || __builtin_memchr(kBoolPastTheEndPtr - 3, -1, 3) == nullptr); // expected-error {{constant}} expected-note {{not supported}}
+  static_assert(sizeof(bool) != 1u || __builtin_memchr(kBoolPastTheEndPtr, 0, 1) == nullptr); // expected-error {{constant}} expected-note {{not supported}}
 
   static_assert(__builtin_char_memchr(kStr, 'a', 0) == nullptr);
   static_assert(__builtin_char_memchr(kStr, 'a', 1) == kStr);


        


More information about the cfe-commits mailing list