<div dir="ltr">Also, maybe that code could just return a StringRef. As I mentioned a few days ago (don't remember if it was this thread or another), when you've got StringRefs all the way down, this problem pretty much disappears.</div><br><div class="gmail_quote"><div dir="ltr">On Tue, Sep 20, 2016 at 2:36 PM Zachary Turner <<a href="mailto:zturner@google.com">zturner@google.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr" class="gmail_msg">StringRef has `withNullAsEmpty`, which I added a few days ago. It will return an empty StringRef. Seems to me that should solve most of those kinds of problems.</div><br class="gmail_msg"><div class="gmail_quote gmail_msg"><div dir="ltr" class="gmail_msg">On Tue, Sep 20, 2016 at 2:31 PM Greg Clayton <<a href="mailto:gclayton@apple.com" class="gmail_msg" target="_blank">gclayton@apple.com</a>> wrote:<br class="gmail_msg"></div><blockquote class="gmail_quote gmail_msg" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">We should avoid crashing if there is a reasonable workaround when the input is bad. StringRef with NULL is easy: just store NULL and a zero length and don't crash. Just because it is documented doesn't mean it needs to stay that way, but I am not going to fight that battle.<br class="gmail_msg">
<br class="gmail_msg">
We should make every effort to not crash if we can. If it is way too difficult, document the issue and make sure clients know that this is the way things are. StringRef does this and we accept it. Doesn't mean it won't crash us. I just hate seeing the crash logs where we have:<br class="gmail_msg">
<br class="gmail_msg">
StringRef s(die.GetName());<br class="gmail_msg">
<br class="gmail_msg">
It shouldn't crash IMHO, but we know it does and we now code around it. Yes, we put in code like:<br class="gmail_msg">
<br class="gmail_msg">
StringRef s;<br class="gmail_msg">
const char *cstr = die.GetName();<br class="gmail_msg">
if (cstr)<br class="gmail_msg">
s = cstr;<br class="gmail_msg">
<br class="gmail_msg">
Is this nice code? I am glad it makes it simple for the LLVM side, but I would rather write:<br class="gmail_msg">
<br class="gmail_msg">
StringRef s(die.GetName());<br class="gmail_msg">
<br class="gmail_msg">
Maybe I will subclass llvm::StringRef as lldb::StringRef and override the constructor.<br class="gmail_msg">
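<br class="gmail_msg">
As a minimal sketch of the null-tolerant construction discussed above, a small free function can do the null check once, without subclassing. Everything here is an assumption for illustration: std::string_view stands in for llvm::StringRef so the snippet compiles on its own, and view_or_empty is a hypothetical name, not an LLVM or LLDB API.<br class="gmail_msg">

```cpp
#include <cassert>
#include <string_view>

// Hypothetical helper: std::string_view is used here as a stand-in for
// llvm::StringRef. A null pointer is mapped to an empty view instead of
// being handed to the constructor, which requires a non-null C string.
inline std::string_view view_or_empty(const char *cstr) {
  return cstr ? std::string_view(cstr) : std::string_view();
}

// Usage sketch: both calls are safe, and the null case yields an empty view.
inline void example_usage() {
  assert(view_or_empty(nullptr).empty());
  assert(view_or_empty("main") == "main");
}
```

With a helper like this, the call site stays a one-liner (for example, view_or_empty(die.GetName())), and the null check lives in exactly one place.<br class="gmail_msg">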
<br class="gmail_msg">
<br class="gmail_msg">
> On Sep 20, 2016, at 2:24 PM, Zachary Turner <<a href="mailto:zturner@google.com" class="gmail_msg" target="_blank">zturner@google.com</a>> wrote:<br class="gmail_msg">
><br class="gmail_msg">
> Well, but StringRef for example is well documented. So it seems to me like there's an example of a perfectly used assert. It's documented that you can't use null, and if you do it asserts. Just like strlen.<br class="gmail_msg">
><br class="gmail_msg">
> The issue I have with "you can't ever assert" is that it turns a guideline into an absolute when it really shouldn't be one. We already agreed (I think) that certain things that are well documented can assert. But when we talk in absolutes, it tends to persuade people that they should always do that thing, even when it's not the most appropriate solution. And I think some of that shows in the LLDB codebase where you've got hugely complicated logic that is very hard to follow, reason about, or test, because no assumptions are ever made about any of the inputs. Even when they are internal inputs that are entirely controlled by us.<br class="gmail_msg">
><br class="gmail_msg">
> On Tue, Sep 20, 2016 at 2:19 PM Greg Clayton <<a href="mailto:gclayton@apple.com" class="gmail_msg" target="_blank">gclayton@apple.com</a>> wrote:<br class="gmail_msg">
> Again, strlen is a stupid example as it is well documented. All of llvm and clang are not.<br class="gmail_msg">
> > On Sep 20, 2016, at 1:59 PM, Zachary Turner <<a href="mailto:zturner@google.com" class="gmail_msg" target="_blank">zturner@google.com</a>> wrote:<br class="gmail_msg">
> ><br class="gmail_msg">
> ><br class="gmail_msg">
> ><br class="gmail_msg">
> > On Tue, Sep 20, 2016 at 1:55 PM Greg Clayton <<a href="mailto:gclayton@apple.com" class="gmail_msg" target="_blank">gclayton@apple.com</a>> wrote:<br class="gmail_msg">
> ><br class="gmail_msg">
> > > On Sep 20, 2016, at 1:45 PM, Zachary Turner <<a href="mailto:zturner@google.com" class="gmail_msg" target="_blank">zturner@google.com</a>> wrote:<br class="gmail_msg">
> > ><br class="gmail_msg">
> > > I do agree that asserts are sometimes used improperly. But who's to say that the bug was the assert, and not the surrounding code? For example, consider this code:<br class="gmail_msg">
> > ><br class="gmail_msg">
> > > assert(p);<br class="gmail_msg">
> > > int x = *p;<br class="gmail_msg">
> ><br class="gmail_msg">
> > Should be written as:<br class="gmail_msg">
> ><br class="gmail_msg">
> > assert(p);<br class="gmail_msg">
> > int x = 0;<br class="gmail_msg">
> > if (!p)<br class="gmail_msg">
> > do_something_correct();<br class="gmail_msg">
> > else<br class="gmail_msg">
> > x = *p;<br class="gmail_msg">
> ><br class="gmail_msg">
> > ><br class="gmail_msg">
> > > Should this assert also not be here in library code? I mean it's obvious that the program is about to crash if p is invalid. Asserts should mean "you're about to invoke undefined behavior", and a crash is *better* than undefined behavior. It surfaces the problem so that you can't let it slip under the radar, and it also alerts you to the point that the UB is invoked, rather than later.<br class="gmail_msg">
> > ><br class="gmail_msg">
> > > What about this assert?<br class="gmail_msg">
> > ><br class="gmail_msg">
> > > assert(ptr);<br class="gmail_msg">
> > > int x = strlen(ptr);<br class="gmail_msg">
> > ><br class="gmail_msg">
> > > Surely that assert is ok right? Do we need to check whether ptr is valid EVERY SINGLE TIME we invoke strlen, or any other function for that matter? The code would be a disastrous mess.<br class="gmail_msg">
> ><br class="gmail_msg">
> > Again, check before you call if this is in a shared library! What is so hard about that? It is called software that doesn't crash.<br class="gmail_msg">
> ><br class="gmail_msg">
> > assert(ptr);<br class="gmail_msg">
> > int x = ptr ? strlen(ptr) : 0;<br class="gmail_msg">
> ><br class="gmail_msg">
> > I find it hard to believe that you are arguing that you cannot EVER know ANYTHING about the state of your program. :-/<br class="gmail_msg">
> ><br class="gmail_msg">
> > This is like arguing that you should run a full heap integrity check every time you perform a memory write, just to be sure you aren't about to crash.<br class="gmail_msg">
> ><br class="gmail_msg">
> > If you make a std::vector<>, do we need to verify that its internal pointer is not null before we write to it? Probably not, right? Why not? Because it has a specification of how it works, and it is documented that if you can construct one, you can use it.<br class="gmail_msg">
> ><br class="gmail_msg">
> > It's ok to document how functions work, and it is ok to assume that functions work the way they claim to work.<br class="gmail_msg">
><br class="gmail_msg">
<br class="gmail_msg">
</blockquote></div></blockquote></div>