[Lldb-commits] [PATCH] D73860: [lldb/StringPrinter] Avoid reading garbage in uninitialized strings

Shafik Yaghmour via Phabricator via lldb-commits lldb-commits at lists.llvm.org
Mon Feb 3 19:09:13 PST 2020


shafik added inline comments.


================
Comment at: lldb/packages/Python/lldbsuite/test/functionalities/data-formatter/data-formatter-stl/libcxx/string/main.cpp:29
+    if (sizeof(std::string) == sizeof(garbage_string_sso))
+      memcpy((void *)&garbage1, &garbage_string_sso, sizeof(std::string));
+    if (sizeof(std::string) == sizeof(garbage_string_long))
----------------
vsk wrote:
> shafik wrote:
> > While I get what you are doing here (we know the structure of libc++'s SSO implementation and are manually building a corrupt one), this is fragile to changes in the implementation.
> > 
> > I don't have an immediate suggestion for an alternative approach, but if we stick with this we should add a big comment explaining it, perhaps laying out the assumptions we are making about the internal layout, and maybe add some sanity checks, e.g. using `offsetof`, to verify that the fields exist and are where we expect them to be.
> I don't see how this is fragile. The structure of libc++'s SSO implementation is ABI, and is unlikely to change (esp. not in a way that turns either one of the garbage strings into a valid string). I've left comments explaining what's wrong with both of the garbage strings, but can leave a pointer to https://joellaity.com/2020/01/31/string.html for more info?
Sure, that note would be fine.
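
A minimal sketch of the kind of sanity checks suggested above, assuming a hand-written mirror struct. `GarbageLongLayout`, its field names and order, and `overwriteIfSameSize` are hypothetical names for illustration only; they are not the declarations in the patch or in libc++.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical mirror of a three-word "long string" representation.
// The field names and their order are assumptions made for this sketch.
struct GarbageLongLayout {
  std::size_t cap;
  std::size_t size;
  char *data;
};

// Compile-time checks that the mirror has the layout the test assumes.
// If the assumed layout ever changes, the build fails instead of the
// test silently exercising something else.
static_assert(offsetof(GarbageLongLayout, cap) == 0,
              "cap is expected to be the first word");
static_assert(offsetof(GarbageLongLayout, size) == sizeof(std::size_t),
              "size is expected to be the second word");
static_assert(offsetof(GarbageLongLayout, data) == 2 * sizeof(std::size_t),
              "data is expected to be the third word");

// Overwrite the bytes of Dst with the bytes of Src, but only when the two
// types have the same size, mirroring the sizeof guard in the test.
// Returns true if the copy happened.
template <typename Dst, typename Src>
bool overwriteIfSameSize(Dst &dst, const Src &src) {
  if (sizeof(Dst) != sizeof(Src))
    return false;
  std::memcpy(static_cast<void *>(&dst), &src, sizeof(Dst));
  return true;
}
```

The `static_assert`s cost nothing at run time, and the size guard keeps the copy from running on a platform where the sizes differ.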


================
Comment at: lldb/source/DataFormatters/StringPrinter.cpp:64
+static bool isInHalfOpenRange(uint8_t *Needle, uint8_t *Start, uint8_t *End) {
+  return uintptr_t(Needle) >= uintptr_t(Start) &&
+         uintptr_t(Needle) < uintptr_t(End);
----------------
vsk wrote:
> shafik wrote:
> > Can we use `reinterpret_cast` as opposed to what is basically a C-style cast? This also has the advantage of pointing out potentially dangerous code to anyone refactoring this code in the future.
> As written, I think the casts to `uintptr_t` convey the assumptions being made here.
Perhaps, but I cannot easily search for C-style casts, whereas I can easily audit for `reinterpret_cast`. Also, `reinterpret_cast` sometimes has stronger semantics than C-style casts, e.g. [this example](https://twitter.com/shafikyaghmour/status/1059874016175972352).

So while this does not apply in this case, it is best to get used to using `reinterpret_cast` instead of saving a few characters by using a C-style cast.
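
For illustration, a sketch of how the range check might look with the casts spelled as `reinterpret_cast`. The `const` qualifiers on the parameters are an assumption of this sketch and are not in the patch as quoted.

```cpp
#include <cassert>
#include <cstdint>

// Same half-open range check, [Start, End), with explicit reinterpret_casts.
// Comparing the integer values avoids relational comparison of pointers that
// may not point into the same object.
static bool isInHalfOpenRange(const std::uint8_t *Needle,
                              const std::uint8_t *Start,
                              const std::uint8_t *End) {
  return reinterpret_cast<std::uintptr_t>(Needle) >=
             reinterpret_cast<std::uintptr_t>(Start) &&
         reinterpret_cast<std::uintptr_t>(Needle) <
             reinterpret_cast<std::uintptr_t>(End);
}
```

One difference that makes the named cast easier to audit: `reinterpret_cast` refuses to cast away `const`, while a C-style cast will do so silently.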


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D73860/new/

https://reviews.llvm.org/D73860




