[PATCH] D102669: [analyzer][ctu] Fix wrong 'multiple definitions' errors caused by space characters in lookup names when parsing the ctu index file

Ella Ma via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Thu Dec 9 00:51:45 PST 2021


OikawaKirie added inline comments.


================
Comment at: clang/lib/CrossTU/CrossTranslationUnit.cpp:154-172
+  // Find the length delimiter.
+  const size_t LengthDelimiter = LineRef.find(':');
+  if (StringRef::npos == LengthDelimiter)
+    return false;
+
+  // Parse the length of LookupName as USRLength.
+  size_t USRLength = 0;
----------------
steakhal wrote:
> OikawaKirie wrote:
> > steakhal wrote:
> > > OikawaKirie wrote:
> > > > steakhal wrote:
> > > > > OikawaKirie wrote:
> > > > > > steakhal wrote:
> > > > > > > 
> > > > > > The source of lookup name of the function being imported is function `CrossTranslationUnitContext::getLookupName`. Keeping the length in the mapping can avoid parsing the lookup name during importing.
> > > > > Okay; you can copy the original StringRef to have that. But by consuming it on one path makes the code much more readable.
> > > > > 
> > > > The `getAsInterger` call can also check whether the content before the first colon is an integer. Therefore, a sub-string operation is required here.
> > > I don't doubt that your proposed way of doing this works and is efficient.
> > > What I say is that I think there is room for improvement in the non-functional aspects, in the readability. However, it's not really a blocking issue, more of a personal taste.
> > I know what you are considering, it is clearer and more readable by consuming the length, then the USR. However, to correctly separate the USR and file path, the length of `USR-Length` is also required, which makes it impossible to just *consume* the length at the beginning.
> > 
> > Another way of solving this problem is to re-create the string with the USR-Length and the USR after parsing, but I think it is not a good solution.
> > 
> > BTW, is it necessary to assert the `USR-Length` to be greater than zero? I think it is more reasonable to report *invalid format* rather than assert the value, as it can be provided by the user.
> I think what causes the misunderstanding is the definition of //consume// in the context of `StringRef`.
> ```lang=C++
> const StringRef Copy = Line;
> Line.consumeInteger(...); // Line advances forward by the number of characters that were parsed as an integral value.
> // Copy still refers to the unmodified, original characters.
> // I can use it however I want.
> 
> // `Line` is a suffix of `Copy`, and the `.end()` should be the same, only `.begin()` should differ.
> ```
> 
> I hope that caused the miscommunication.
> 
> ---
> > BTW, is it necessary to assert the USR-Length to be greater than zero? I think it is more reasonable to report *invalid format* rather than assert the value, as it can be provided by the user.
> Yeah, sure!
I think I have figured out what have been misunderstood.

In the current patch, I just modify function `CrossTranslationUnitContext::getLookupName` by adding a length at the beginning. Therefore, the lookup name for the CTU query will have the length part. And for the sake of simplicity and efficiency, the length together with the USR is stored in the mapping as the key.

To correctly parse the `<USR-Length>:<USR>` part, I cannot just consume the `<USR-Length>` at the beginning. Otherwise, I cannot know the length of `<USR-Length>:` part, which makes it impossible to parse the entire `<USR-Length>:<USR>` part, even though the original `Line` is copied.

I will update the approach of adding the length part. Since the length is only used during parsing the CTU index, I will modify function `createCrossTUIndexString` to add the length and revert the changes to function `CrossTranslationUnitContext::getLookupName` to keep the lookup name for CTU query unchanged.


================
Comment at: clang/test/Analysis/Inputs/ctu-lookup-name-with-space.cpp:12-14
+int importee(int X) {
+  return 1 / X;
+}
----------------
steakhal wrote:
> Also fixup the return type in the declaration within the main TU. Also add the `// expected-no-diagnostics` comment to the primary TU.
Yes, you are right.
I was misled by myself.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D102669/new/

https://reviews.llvm.org/D102669



More information about the cfe-commits mailing list