[PATCH] D42043: c-index: CXString: fix MSAN read-past-end bug

Steve O'Brien via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Thu Jan 18 11:23:12 PST 2018


elsteveogrande added a comment.

Thanks very much for looking at this @vsk !  I actually found an ASAN bug in my new code, mixing and matching `malloc/free` and `operator`s `new/delete`.



================
Comment at: tools/c-index-test/c-index-test.c:3268
 
-  filename = clang_getFileName(file);
-  index_data->main_filename = clang_getCString(filename);
-  clang_disposeString(filename);
+  index_data->main_filename = clang_getFileName(file);
 
----------------
vsk wrote:
> This looks like a separate bug fix. Is it possible to separate the main_filename changes from this patch?
Will do!


================
Comment at: tools/libclang/CXString.cpp:59
 CXString createEmpty() {
   CXString Str;
+  Str.Contents = (const void *) "";
----------------
vsk wrote:
> Why shouldn't this be defined as createRef("")?
Hmm, good question.  `createRef` below actually calls back to `createEmpty` and `createNull` in the nonnull-but-empty and null cases.

I think I'll do this the other way around, and let `createRef` have the responsibility of dealing with these fields and nulls and whatnot.


================
Comment at: tools/libclang/CXString.cpp:213
+  if (string.IsNullTerminated) {
+    CString = (const char *) string.Contents;
+  } else {
----------------
vsk wrote:
> Basic question: If a non-owning CXString is null-terminated, what provides the guarantee that the string is in fact valid when getCString() is called? Is the user of the C API responsible for ensuring the lifetime of the string is valid?
I believe the API itself is the one building `CXString` instances, and the user of the C API doesn't really create them, only use them.  So the API has to ensure the string stays "good" while there may be references to it.

(Which feels a little fragile.  But I think that's the tradeoff being made.  You'll get either "fast" strings, or data guaranteed to be sane.  I'd opt for safer data but I don't know who's using this C API and am afraid to introduce a serious perf regression.  So it'll stay this way and I'll try my best to solve *-SAN issues with these constraints :) )


Repository:
  rC Clang

https://reviews.llvm.org/D42043





More information about the cfe-commits mailing list