[Lldb-commits] [PATCH] D62503: Add ReadCStringFromMemory for faster string reads

Thu Jun 13 12:01:55 PDT 2019

labath added inline comments.

================
Comment at: lldb/unittests/Host/NativeProcessProtocolTest.cpp:128-133
+  EXPECT_THAT_ERROR(
+      Process.ReadCStringFromMemory(0x0, &string[0], sizeof(string), bytes_read)
+          .ToError(),
+      llvm::Succeeded());
+  EXPECT_STREQ(string, "hel");
+  EXPECT_EQ(bytes_read, 3UL);
----------------
aadsm wrote:
> labath wrote:
> > aadsm wrote:
> > > labath wrote:
> > > > I'm wondering how will the caller know that the read has been truncated. Given that ReadMemory already returns an error in case of partial reads, maybe we could do the same here ? (return an error, but still set bytes_read and the string as they are now). What do you think ?
> > > I'm a bit hesitant on this one, because it's different semantics.
> > > ReadMemory returns an error when it was not able to read everything that it was told to.
> > > 
> > > In this test case we were able to read everything but didn't find a null terminator. I decided to set a null terminator in this case and say that we read 1 less byte in order to 1) actually have a C string (I don't want people to strlen it and fail) and 2) allow people to read in chunks if they want (however, I don't know how useful this will be if at all) or allow them to partially read a string or even to realize that the string is larger than originally expected.
> > > 
> > > There is a way to know if it didn't read a full string with `strlen(string) == bytes_read` but that'd be O(n). You can also do `bytes_read == sizeof-1 && string[size-2] != '\0'` but it kind of sucks.
> > > I have no good solution here :D unless we want to return an enum with the 3 different options.
> > > In this test case we were able to read everything but ...
> > Is that really true? We were told to read a c string. We did not find a c string, so instead we chose to mangle the data that we have read into a c string. That seems a bit error-ish.
> > 
> > I also find it confusing that in this (truncated) case the returned value corresponds with the length of the string, while in the "normal" case in includes the nul terminator. I think it would be more natural to do it the other way around, as that would be similar to what snprintf does.
> > 
> > How about we avoid this issue entirely, and have the function return `Expected<StringRef>` (the StringRef being backed by the buffer you pass in as the argument) instead? This way you don't have to do any null termination, as the StringRef carries the length implicitly. And one can tell that the read may be incomplete by comparing the  stringref size with the input buffer size.
> > Is that really true? We were told to read a c string. We did not find a c string, so instead we chose to mangle the data that we have read into a c string. That seems a bit error-ish.
> 
> That is fair.
> 
> > How about we avoid this issue entirely, and have the function return Expected<StringRef> (the StringRef being backed by the buffer you pass in as the argument) instead? This way you don't have to do any null termination, as the StringRef carries the length implicitly. And one can tell that the read may be incomplete by comparing the stringref size with the input buffer size.
> 
> I think my only concern with this is that people might think the passed in buffer would be null terminated after the call (this is true for snprintf). I'm not sure how to set the expectation that it wouldn't...
> 
That's true, but on the other hand, snprintf does not use StringRefs, only c strings. If the users only work with the returned StringRef (like most of our code should), then they shouldn't care about whether it's nul terminated or not.

I'd also be fine with nul-terminating the buffer, but still returning a StringRef to the part which does not include the nul terminator. That would make this function similar to how llvm::Twine::toNullTerminatedStringRef works (though the fact that they explicitly put "NullTerminated" in the name shows that one normally does not expect StringRefs to be null terminated).

This way, we'd lose the ability to tell whether the read was truncated or not, but that's probably not that important. We'd still have the ability to work with the string we've read in an llvm-like manner and avoid any additional length computations..

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D62503/new/

https://reviews.llvm.org/D62503