[llvm] r200187 - StringRef: Extend constexpr capabilities and introduce ConstStringRef

Alp Toker alp at nuanti.com
Tue Jan 28 11:56:11 PST 2014


On 27/01/2014 17:40, David Blaikie wrote:
>
>
>
> On Sun, Jan 26, 2014 at 9:46 PM, Alp Toker <alp at nuanti.com 
> <mailto:alp at nuanti.com>> wrote:
>
>     They say you should practice what you preach, and this commit had
>     some changes that'll affect all users of StringRef so I've rolled
>     it back in r200187 until we get some discussion going.
>
>     The need derives from dozens of cases in clang where user input
>     was being passed in as a format string to the diagnostic system.
>     With all those fixed following ongoing work the last few months we
>     now want a stricter check for compile-time fixed strings and this
>     general solution has come together.
>
>     I'll kick off with a self-review, posted inline below.. :-)
>
>
>
>     On 27/01/2014 04:07, Alp Toker wrote:
>
>         Author: alp
>         Date: Sun Jan 26 22:07:17 2014
>         New Revision: 200187
>
>         URL: http://llvm.org/viewvc/llvm-project?rev=200187&view=rev
>         Log:
>         StringRef: Extend constexpr capabilities and introduce
>         ConstStringRef
>
>         (1) Add llvm_expect(), an asserting macro that can be
>         evaluated as a constexpr
>              expression as well as a runtime assert or compiler hint
>         in release builds. This
>              technique can be used to construct functions that are
>         both unevaluated and
>              compiled depending on usage.
>
>         (2) Update StringRef using llvm_expect() to preserve runtime
>         assertions while
>              extending the same checks to static asserts in C++11
>         builds that support the
>              feature.
>
>         (3) Introduce ConstStringRef, a strong subclass of StringRef
>         that references
>              compile-time constant strings. It's convertible to, but
>         not from, ordinary
>              StringRef and thus can be used to add compile-time safety
>         to various interfaces
>              in LLVM and clang that only accept fixed inputs such as
>         diagnostic format
>              strings that tend to get misused.
>
>         Modified:
>              llvm/trunk/include/llvm/ADT/StringRef.h
>              llvm/trunk/include/llvm/Support/ErrorHandling.h
>              llvm/trunk/lib/Support/ErrorHandling.cpp
>              llvm/trunk/unittests/ADT/StringRefTest.cpp
>
>         Modified: llvm/trunk/include/llvm/ADT/StringRef.h
>         URL:
>         http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/ADT/StringRef.h?rev=200187&r1=200186&r2=200187&view=diff
>         ==============================================================================
>         --- llvm/trunk/include/llvm/ADT/StringRef.h (original)
>         +++ llvm/trunk/include/llvm/ADT/StringRef.h Sun Jan 26
>         22:07:17 2014
>         @@ -10,6 +10,8 @@
>           #ifndef LLVM_ADT_STRINGREF_H
>           #define LLVM_ADT_STRINGREF_H
>           +#include "llvm/Support/Compiler.h"
>         +#include "llvm/Support/ErrorHandling.h"
>           #include "llvm/Support/type_traits.h"
>           #include <algorithm>
>           #include <cassert>
>         @@ -70,7 +72,7 @@ namespace llvm {
>               /// @{
>                 /// Construct an empty string ref.
>         -    /*implicit*/ StringRef() : Data(0), Length(0) {}
>         +    /*implicit*/ LLVM_CONSTEXPR StringRef() : Data(0),
>         Length(0) {}
>                 /// Construct a string ref from a cstring.
>               /*implicit*/ StringRef(const char *Str)
>         @@ -80,11 +82,8 @@ namespace llvm {
>                 }
>                 /// Construct a string ref from a pointer and length.
>         -    /*implicit*/ StringRef(const char *data, size_t length)
>         -      : Data(data), Length(length) {
>         -        assert((data || length == 0) &&
>         -        "StringRef cannot be built from a NULL argument with
>         non-null length");
>         -      }
>         +      /*implicit*/ LLVM_CONSTEXPR StringRef(const char *data,
>         size_t length)
>         +          : Data(data), Length((llvm_expect(data || length ==
>         0), length)) {}
>
>
>     We lose friendly messages here, but we get back subjectively equal
>     benefit by using the built-in LLVM fatal error handler instead of
>     the one in the system.
>
>
>                 /// Construct a string ref from an std::string.
>               /*implicit*/ StringRef(const std::string &Str)
>         @@ -104,24 +103,20 @@ namespace llvm {
>                 /// data - Get a pointer to the start of the string
>         (which may not be null
>               /// terminated).
>         -    const char *data() const { return Data; }
>         +    LLVM_CONSTEXPR const char *data() const { return Data; }
>                 /// empty - Check if the string is empty.
>         -    bool empty() const { return Length == 0; }
>         +    LLVM_CONSTEXPR bool empty() const { return Length == 0; }
>                 /// size - Get the string size.
>         -    size_t size() const { return Length; }
>         +    LLVM_CONSTEXPR size_t size() const { return Length; }
>                 /// front - Get the first character in the string.
>         -    char front() const {
>         -      assert(!empty());
>         -      return Data[0];
>         -    }
>         +    LLVM_CONSTEXPR char front() const { return
>         llvm_expect(!empty()), Data[0]; }
>                 /// back - Get the last character in the string.
>         -    char back() const {
>         -      assert(!empty());
>         -      return Data[Length-1];
>         +    LLVM_CONSTEXPR char back() const {
>         +      return llvm_expect(!empty()), Data[Length - 1];
>               }
>                 /// equals - Check for string equality, this is more
>         efficient than
>         @@ -187,9 +182,8 @@ namespace llvm {
>               /// @name Operator Overloads
>               /// @{
>           -    char operator[](size_t Index) const {
>         -      assert(Index < Length && "Invalid index!");
>         -      return Data[Index];
>         +    LLVM_CONSTEXPR char operator[](size_t Index) const {
>         +      return llvm_expect(Index < Length), Data[Index];
>               }
>
>
>     Ditto.
>
>
>                 /// @}
>         @@ -547,6 +541,20 @@ namespace llvm {
>               /// @}
>           +  /// ConstStringRef - A \c StringRef carrying the
>         additional stipulation that
>         +  /// the referenced string is a compile-time constant.
>         +  ///
>         +  /// Use this to specify function parameters that require
>         fixed inputs such
>         +  /// as debug and diagnostic messages or format strings.
>         +  class ConstStringRef : public StringRef {
>         +  public:
>         +    /*implicit*/ LLVM_CONSTEXPR ConstStringRef() : StringRef() {}
>         +
>         +    template <size_t N>
>         +    /*implicit*/ LLVM_CONSTEXPR ConstStringRef(const char
>         (&data)[N])
>         +        : StringRef(data, (llvm_expect(N > 0 && data[N - 1]
>         == '\0'), N - 1)) {}
>
>
>     This constructor doesn't technically require a constant string
>     literal and you can trick it into accepting a fixed-length char
>     array, but in all uses it detected errors that have been fixed. 
>
>
> Could you point to revisions that demonstrate the fixes that have been 
> prompted by this?

Hi David,

I'm including a few of them below. These include fixes for around 200 
diagnostics that went through 30 incorrect entry points that have been 
fixed from December. Not including fixes by others, or that were caught 
during pre-commit review.

Each of these would have previously leaked, been potentially 
misformatted or insecure except for two cases which actually tried to 
escaping the user input before still incorrectly using it as a format 
string.

There's a tendency for developers to land features when they appear to 
work without paying attention to conditions where they can fail, much 
like printf() but perhaps more serious because we don't do any 
validation at all for format strings. Safe facilities like 
ConstStringRef provide the kind of compile-time safety we need.

(For background, I'm holding back on the patches because Chandler 
spotted a possible missed optimization in llvm_expect. We need to make 
sure the test condition is never evaluated in release builds so 
StringRef won't regress.)



|commit d6de61539c778f1d8449208c381234d13f7643cb|

|Author: Alp Toker <alp at nuanti.com>|

|Date:   Sun Jan 26 05:08:49 2014 +0000|

|
||    Identify two more unsafe uses of getCustomDiagID()|

||

|    git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@200126 
91177308-0d34-0410-b5e6-96231b3b80d8|

|
||commit 0e20dca20ed1f55af630b4f117ff9add50b1cf7b|

|Author: Alp Toker <alp at nuanti.com>|

|Date:   Sun Jan 26 05:08:07 2014 +0000|

|
||    Remove buggy example code from the documentation|

||

|    Instead point readers to the latest, correct example code in SVN 
until we find|

|    a way to automatically include example sources into the 
documentation (or until|

|    someone steps up to maintain these actively).|

||

|    This ensures that the examples are up-to-date, buildable, and most 
of all that|

|    readers don't pick up incorrect usage.|

||

|    git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@200125 
91177308-0d34-0410-b5e6-96231b3b80d8|

|
||commit 35fe9a0c1e52c387dfadabafa4b2d26761595786|

|Author: Alp Toker <alp at nuanti.com>|

|Date:   Sun Jan 26 05:07:32 2014 +0000|

|
||    ARCMigrate: Introduce proper diagnostics for TransformActions|

||

|    This starts to switch ARCMT to use proper diagnostic messages. The 
old use was|

|    based on incorrect example code from the documentation.|

||

|    The logic of the previous report() functions has been retained to 
support any|

|    external consumers that might be intercepting diagnostic messages 
through the|

|    old interface.|

||

|    Note that the change in test/Misc/warning-flags.c isn't a new 
warning without a|

|    flag, rather one that was previously invisible to the test. Adding 
a flag might|

|    be a good idea though.|

||

|    git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@200124 
91177308-0d34-0410-b5e6-96231b3b80d8|

|
||commit 2a030a3aec48ed18846518aab60e5e5e72f381cc|

|Author: Alp Toker <alp at nuanti.com>|

|Date:   Thu Jan 2 15:08:04 2014 +0000|

|
||    Verify that clang TargetInfo descriptions match DataLayout strings 
from LLVM|

||

|    The backend string is only verified when available as it's possible 
to run|

|    clang IRGen for targets that haven't been built or don't exist in 
LLVM.|

||

|    git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@198309 
91177308-0d34-0410-b5e6-96231b3b80d8|

|
||commit d5e04a5d9e356fbf3634ab113885039da57eaadc|

|Author: Alp Toker <alp at nuanti.com>|

|Date:   Mon Dec 23 17:59:59 2013 +0000|

|
||    Fix another misuse of getCustomDiagID()|

||

|    There's no need to escape strings and generate new DiagIDs for each 
message.|

||

|    git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@197915 
91177308-0d34-0410-b5e6-96231b3b80d8|

|
||commit a2cd7434fdd6e55d780b290aaef4365fd011252c|

|Author: Alp Toker <alp at nuanti.com>|

|Date:   Sat Dec 21 05:20:03 2013 +0000|

|
||    Fix getCustomDiagID() usage in CodeGen and TextDiagnosticBuffer|

||

|    DiagIDs are a cached resource generally only constructed from 
compile-time|

|    constant or stable format strings.|

||

|    Escaping arbitrary messages and constructing DiagIDs from them 
didn't make|

|    sense.|

||

|    git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@197856 
91177308-0d34-0410-b5e6-96231b3b80d8|

|
||commit f83451512ecb54410e1ca80a1b3a762051bcc2d1|

|Author: Alp Toker <alp at nuanti.com>|

|Date:   Sat Dec 21 05:19:58 2013 +0000|

|
||    Fix getCustomDiagID() usage in example code|

||

|    This was setting a bad example. DiagIDs are a limited resource and 
the message|

|    argument is evaluated as a format string.|

||

|    git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@197855 
91177308-0d34-0410-b5e6-96231b3b80d8|

|
||commit a2474459beca778d352a16e6463cf4dd9e29dc06|

|Author: Alp Toker <alp at nuanti.com>|

|Date:   Sun Jan 26 06:17:37 2014 +0000|

|    Enforce safe usage of DiagnosticsEngine::getCustomDiagID()|

||

|    Replace the last incorrect uses and templatize the function to 
require a|

|    compile-time constant string preventing further misuse.|

||

|    The diagnostic formatter expects well-formed input and has 
undefined behaviour|

|    with arbitrary input or crafted user strings in source files. 
Accepting user|

|    input would also have caused unbounded generation of new diagnostic 
IDs which|

|    can be problematic in long-running sessions or language bindings.|

||

|    This completes the work to fix several incorrect callers that 
passed user|

|    input or raw messages to the diagnostics engine where a constant 
format string|

|    was expected.|

||

|    git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@200132 
91177308-0d34-0410-b5e6-96231b3b80d8|

|
||commit 33fc63be61210df4b2a89386b95fde1c26c4fe99|

|Author: Alp Toker <alp at nuanti.com>|

|Date:   Sun Jan 26 08:36:03 2014 +0000|

|
||    Fix another invalid getCustomDiagID() use to unbreak the build|

||

|    It was calling the utility wrapper that now requires a constant string|

|    following clang r200132. The StringRef version on DiagnosticIDs 
appears to have|

|    been what was intended so change to that.|

||

|    git-svn-id: 
https://llvm.org/svn/llvm-project/clang-tools-extra/trunk@200142 
91177308-0d34-0410-b5e6-96231b3b80d8|

|
||commit a81090fcf5ed403a580e74cfeb704eade3c2c41d|

|Author: Alp Toker <alp at nuanti.com>|

|Date:   Sun Jan 26 06:58:01 2014 +0000|

|
||    Prospective build fix for unsafe usage of getCustomDiagID()|

||

|    This now requires a compile-time constant string so let's build proper|

|    diagnostic IDs and pass through the inputs as arguments.|

||

|    Tracks clang changes in r200132.|

||

|    git-svn-id: 
https://llvm.org/svn/llvm-project/clang-tools-extra/trunk@200139 
91177308-0d34-0410-b5e6-96231b3b80d8|


Alp.


>     There's some interest in introducing a clang attribute to check
>     for literals and that would get added here as the feature evolves.
>
>
>         +  };
>         +
>             /// \brief Compute a hash_code for a StringRef.
>             hash_code hash_value(StringRef S);
>
>         Modified: llvm/trunk/include/llvm/Support/ErrorHandling.h
>         URL:
>         http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Support/ErrorHandling.h?rev=200187&r1=200186&r2=200187&view=diff
>         ==============================================================================
>         --- llvm/trunk/include/llvm/Support/ErrorHandling.h (original)
>         +++ llvm/trunk/include/llvm/Support/ErrorHandling.h Sun Jan 26
>         22:07:17 2014
>         @@ -15,11 +15,11 @@
>           #ifndef LLVM_SUPPORT_ERRORHANDLING_H
>           #define LLVM_SUPPORT_ERRORHANDLING_H
>           -#include "llvm/ADT/StringRef.h"
>           #include "llvm/Support/Compiler.h"
>           #include <string>
>             namespace llvm {
>         +  class StringRef;
>             class Twine;
>               /// An error handler callback.
>         @@ -78,7 +78,7 @@ namespace llvm {
>         bool gen_crash_diag = true);
>             LLVM_ATTRIBUTE_NORETURN void report_fatal_error(const
>         std::string &reason,
>         bool gen_crash_diag = true);
>         -  LLVM_ATTRIBUTE_NORETURN void report_fatal_error(StringRef
>         reason,
>         +  LLVM_ATTRIBUTE_NORETURN void report_fatal_error(const
>         StringRef &reason,
>         bool gen_crash_diag = true);
>
>
>     This tweak drops the early dependency on StringRef.h so we can use
>     ErrorHandling in StringRef itself.
>
>
>
>             LLVM_ATTRIBUTE_NORETURN void report_fatal_error(const
>         Twine &reason,
>         bool gen_crash_diag = true);
>         @@ -108,4 +108,14 @@ namespace llvm {
>           #define llvm_unreachable(msg)
>         ::llvm::llvm_unreachable_internal()
>           #endif
>           +/// An assert macro that's usable in constexprs and that
>         becomes an optimizer
>         +/// hint in NDEBUG builds.
>         +///
>         +/// Unlike \c assert() the \param test expression may be
>         evaluated in optimized
>         +/// builds and so should be simple, accurate and never have
>         side effects.
>         +#define llvm_expect(test) (void)(!!(test) ? 0 :
>         (llvm_unreachable(#test), 0))
>
>
>     This macro expansion could do with checking over. If it works out
>     llvm_expect() will have better characteristics in debug builds.
>
>     The optimizer hint approach is something to consider. We have to
>     evaluate 'test' to support constexpr evaluation but we don't
>     strictly need to use it as a hint if llvm_unreachable() feels too
>     strong here.
>
>     Alp.
>
>
>
>         +
>         +// TODO: Update other headers to explicitly include
>         StringRef.h and drop this.
>         +#include "llvm/ADT/StringRef.h"
>         +
>           #endif
>
>         Modified: llvm/trunk/lib/Support/ErrorHandling.cpp
>         URL:
>         http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Support/ErrorHandling.cpp?rev=200187&r1=200186&r2=200187&view=diff
>         ==============================================================================
>         --- llvm/trunk/lib/Support/ErrorHandling.cpp (original)
>         +++ llvm/trunk/lib/Support/ErrorHandling.cpp Sun Jan 26
>         22:07:17 2014
>         @@ -58,7 +58,7 @@ void llvm::report_fatal_error(const std:
>             report_fatal_error(Twine(Reason), GenCrashDiag);
>           }
>           -void llvm::report_fatal_error(StringRef Reason, bool
>         GenCrashDiag) {
>         +void llvm::report_fatal_error(const StringRef &Reason, bool
>         GenCrashDiag) {
>             report_fatal_error(Twine(Reason), GenCrashDiag);
>           }
>
>         Modified: llvm/trunk/unittests/ADT/StringRefTest.cpp
>         URL:
>         http://llvm.org/viewvc/llvm-project/llvm/trunk/unittests/ADT/StringRefTest.cpp?rev=200187&r1=200186&r2=200187&view=diff
>         ==============================================================================
>         --- llvm/trunk/unittests/ADT/StringRefTest.cpp (original)
>         +++ llvm/trunk/unittests/ADT/StringRefTest.cpp Sun Jan 26
>         22:07:17 2014
>         @@ -531,4 +531,22 @@ TEST(StringRefTest, joinStrings) {
>             EXPECT_TRUE(v2_join3);
>           }
>           +static void fn_stringref(StringRef str) {
>         +  EXPECT_TRUE(str == "hello");
>         +}
>         +static void fn_conststringref(ConstStringRef str) {
>         +  fn_stringref(str);
>         +}
>         +
>         +TEST(StringRefTest, constStringRef) {
>         +  LLVM_CONSTEXPR ConstStringRef csr("hello");
>         +#if __has_feature(cxx_constexpr) ||
>         defined(__GXX_EXPERIMENTAL_CXX0X__)
>         +  LLVM_STATIC_ASSERT(csr[0] != csr[1], "");
>         +  LLVM_STATIC_ASSERT(csr[2] == csr[3], "");
>         +  LLVM_STATIC_ASSERT(csr.size() == 5, "");
>         +#endif
>         +  llvm_expect(csr[2] == csr[3]);
>         +  fn_conststringref(csr);
>         +}
>         +
>           } // end anonymous namespace
>
>
>         _______________________________________________
>         llvm-commits mailing list
>         llvm-commits at cs.uiuc.edu <mailto:llvm-commits at cs.uiuc.edu>
>         http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>
>
>     -- 
>     http://www.nuanti.com
>     the browser experts
>
>
>     _______________________________________________
>     llvm-commits mailing list
>     llvm-commits at cs.uiuc.edu <mailto:llvm-commits at cs.uiuc.edu>
>     http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>
>

-- 
http://www.nuanti.com
the browser experts

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20140128/8282120b/attachment.html>


More information about the llvm-commits mailing list