[Lldb-commits] [PATCH] D73860: [lldb/StringPrinter] Avoid reading garbage in uninitialized strings
Raphael Isemann via Phabricator via lldb-commits
lldb-commits at lists.llvm.org
Tue Feb 4 01:35:30 PST 2020
teemperor requested changes to this revision.
teemperor added inline comments.
================
Comment at: lldb/packages/Python/lldbsuite/test/functionalities/data-formatter/data-formatter-stl/libcxx/string/main.cpp:8
+// A corrupt libcxx string which points to garbage and has a crazy length.
+static unsigned char garbage_string_long[] = {185, 52, 168, 29, 1, 0, 0, 0, 168, 61, 175, 29, 1, 0, 0, 0, 104, 222, 174, 29, 1, 0, 0, 0};
+
----------------
I think those byte arrays need a quick comment about which elements mean what (or how they trigger the respective code paths). Just pointing out which bytes are supposed to overwrite which `std::string` members is good enough. Something like a macro maybe? `#define STD_STRING_BYTES(cap, size, length) {cap, size, length}`
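
A rough sketch of what a self-describing alternative to the raw array could look like. `LongStringWords` and `MakeLongStringBytes` are hypothetical names, not part of the patch. The field order (capacity word, size, data pointer) is an assumption about the little-endian 64-bit libc++ long-string layout and should be checked against the library before use.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical: the three 64-bit words assumed to make up a "long" libc++
// std::string on a little-endian 64-bit target. The order is an assumption.
struct LongStringWords {
  uint64_t cap;  // capacity word (assumed to carry the long-mode flag)
  uint64_t size; // claimed length of the string
  uint64_t data; // pointer value the string claims to own
};

// Serialize the words into 24 bytes, least significant byte first, so each
// byte of the test array can be traced back to a named field.
inline std::array<unsigned char, 24>
MakeLongStringBytes(const LongStringWords &w) {
  const uint64_t words[3] = {w.cap, w.size, w.data};
  std::array<unsigned char, 24> bytes{};
  for (std::size_t i = 0; i < bytes.size(); ++i)
    bytes[i] =
        static_cast<unsigned char>((words[i / 8] >> (8 * (i % 8))) & 0xFF);
  return bytes;
}
```

With this, `MakeLongStringBytes({0x011DA834B9, 0x011DAF3DA8, 0x011DAEDE68})` reproduces the 24 bytes of `garbage_string_long`, and a reader can see which word is supposed to be bogus.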
================
Comment at: lldb/packages/Python/lldbsuite/test/functionalities/data-formatter/data-formatter-stl/libcxx/string/main.cpp:29
+ if (sizeof(std::string) == sizeof(garbage_string_sso))
+ memcpy((void *)&garbage1, &garbage_string_sso, sizeof(std::string));
+ if (sizeof(std::string) == sizeof(garbage_string_long))
----------------
shafik wrote:
> vsk wrote:
> > shafik wrote:
> > > While I get what you are doing here (we know the structure of libc++'s SSO implementation and are manually building a corrupt one), this is fragile to changes in the implementation.
> > >
> > > I don't have an immediate suggestion for an alternative approach, but if we stick with this we should add a big comment explaining it, perhaps laying out the assumptions we are making about the internal layout, and maybe some sanity checks using `offsetof` to verify that the fields exist and are where we expect them to be.
> > I don't see how this is fragile. The structure of libc++'s SSO implementation is ABI, and is unlikely to change (esp. not in a way that turns either one of the garbage strings into a valid string). I've left comments explaining what's wrong with both of the garbage strings, but can leave a pointer to https://joellaity.com/2020/01/31/string.html for more info?
> Sure, that note would be fine.
Can you instead do a `#if _LIBCPP_ABI_VERSION == 1` and have the `#else` be an `#error` saying that this test needs updating? We don't support any libc++ ABI besides 1 in LLDB, but if we ever do then this should not silently pass.
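
Roughly what I have in mind, as a sketch only. `kUsingLibcxx` and `kKnownStringLayout` are made-up names, and the outer `_LIBCPP_VERSION` guard is my own addition so the snippet also compiles against a non-libc++ standard library; the actual test only ever builds against libc++.

```cpp
#include <string> // pulls in the standard library's configuration macros

#if defined(_LIBCPP_VERSION)
constexpr bool kUsingLibcxx = true;
#if _LIBCPP_ABI_VERSION == 1
// The hard-coded byte arrays are assumed to match this ABI version only.
constexpr bool kKnownStringLayout = true;
#else
#error "This test hard-codes the libc++ ABI v1 std::string layout and needs updating."
#endif
#else
// Not libc++: nothing is known about the layout, so do not corrupt anything.
constexpr bool kUsingLibcxx = false;
constexpr bool kKnownStringLayout = false;
#endif
```

The point is that an unsupported ABI fails at compile time instead of skipping the `memcpy` and letting the test pass without exercising anything.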
================
Comment at: lldb/source/DataFormatters/StringPrinter.cpp:73
uint8_t *&next) {
+ assert(isInHalfOpenRange(buffer, buffer, buffer_end) &&
+ "Cannot read the first byte of ASCII string buffer");
----------------
Isn't this just `assert(buffer<buffer_end)`? That's less confusing IMHO (and I think in general this check can be in `GetPrintable` as this should always be true for all `GetPrintableImpl`).
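
To illustrate the equivalence: the `isInHalfOpenRange` below is a hypothetical stand-in, assuming the helper checks `begin <= p && p < end` (the real definition is not quoted in this mail). When the first two arguments are the same pointer, the lower-bound half is trivially true and only the upper bound remains.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the patch's helper: is p inside [begin, end)?
inline bool isInHalfOpenRange(const uint8_t *p, const uint8_t *begin,
                              const uint8_t *end) {
  return begin <= p && p < end;
}

// The simpler spelling: the first byte is readable iff the buffer is
// non-empty.
inline bool CanReadFirstByte(const uint8_t *buffer,
                             const uint8_t *buffer_end) {
  return buffer < buffer_end;
}
```

For any `buffer` and `buffer_end`, `isInHalfOpenRange(buffer, buffer, buffer_end)` and `CanReadFirstByte(buffer, buffer_end)` give the same answer under that assumption.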
================
Comment at: lldb/source/DataFormatters/StringPrinter.cpp:140
uint8_t *&next) {
+ assert(isInHalfOpenRange(buffer, buffer, buffer_end) &&
+ "Cannot read the first byte of UTF8 string buffer");
----------------
Same as above.
================
Comment at: lldb/source/DataFormatters/StringPrinter.cpp:149
+ if ((utf8_encoded_len == 0 || utf8_encoded_len > 4) ||
+ !isInHalfOpenRange(buffer + (utf8_encoded_len - 1), buffer, buffer_end))
return retval;
----------------
Isn't `!isInHalfOpenRange(buffer + (utf8_encoded_len - 1), buffer, buffer_end)` just `buffer + (utf8_encoded_len - 1U) >= buffer_end`? `utf8_encoded_len` is always positive, so adding it to `buffer` can only produce something smaller than `buffer` through an integer overflow IIUC (which we should probably check against more explicitly then).
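
One way to make the check explicit without ever forming a possibly overflowing pointer is to compare the length against the remaining byte count. This is only a sketch: `HasBytesAvailable` is a hypothetical name, and it assumes `buffer` and `buffer_end` point into the same allocation.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: true iff a UTF-8 sequence of utf8_encoded_len bytes
// starting at buffer lies entirely inside [buffer, buffer_end).
inline bool HasBytesAvailable(const uint8_t *buffer,
                              const uint8_t *buffer_end,
                              std::size_t utf8_encoded_len) {
  // UTF-8 sequences are 1 to 4 bytes long.
  if (utf8_encoded_len == 0 || utf8_encoded_len > 4)
    return false;
  // An empty (or inverted) range has no readable bytes.
  if (buffer >= buffer_end)
    return false;
  // Compare lengths instead of computing buffer + len, so no pointer
  // arithmetic can run past the end of the buffer.
  return utf8_encoded_len <= static_cast<std::size_t>(buffer_end - buffer);
}
```

The early return in the patch would then read `if (!HasBytesAvailable(buffer, buffer_end, utf8_encoded_len)) return retval;`.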
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D73860/new/
https://reviews.llvm.org/D73860