[PATCH] D82481: [XCOFF][AIX] Give symbol an internal name when desired symbol name contains invalid character(s)

Hubert Tong via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Jun 26 19:45:04 PDT 2020


hubert.reinterpretcast added inline comments.


================
Comment at: llvm/lib/MC/MCContext.cpp:313
+  SmallString<128> InvalidName(OriginalName);
+  // Replace the invalid character with '_'.
+  for (size_t I = 0; I < InvalidName.size(); ++I) {
----------------
jasonliu wrote:
> jasonliu wrote:
> > hubert.reinterpretcast wrote:
> > > Although this encoding generates runs of underscores, it does make it less intrusive in terms of being able to use `c++filt` on the result. Note that collisions increase with non-Latin scripts (which could be reduced if we append the hex values of all the underscores, including the original ones, to the end of the string).
> > > ```
> > > namespace ImplVerTags {
> > > 
> > > typedef struct
> > > 一世
> > > {} A;
> > > typedef struct
> > > 二世
> > > {} B;
> > > 
> > > }
> > > 
> > > template <typename> void *implStaticDataUnsafe = nullptr;
> > > 
> > > template void *implStaticDataUnsafe<ImplVerTags::A>;
> > > template void *implStaticDataUnsafe<ImplVerTags::B>;
> > > ```
> > Just want to make sure I understand your suggestion:
> > so if I have a "f`!o" as symbol name, I would get back `_Renamed..6021f__o back`?
> > Do we want to consider just adding those hex values up? Like getting back `_Renamed..81f__o` for that?
> If we choose to get the results like `_Renamed..6021f__o`, in what way I could trigger a collision? Do we need to have a collision handling mechanism then?
Hmm, I suggested appending but adding to the front does seem to make it easier to eyeball the split (in addition to making the more relevant part show up first for C identifiers in non-Latin scripts). I like it. Also, yes, I meant that f_$ would become (with the prefix) `_Renamed..5f24f__`.

We shouldn't end up with collisions in that case. The transformation is completely reversible by identifying the number of underscores to determine the prefix length and then replacing each underscore with the encoded value.


================
Comment at: llvm/lib/MC/MCContext.cpp:346
+  assert((NameEntry.second || !NameEntry.first->second) &&
+         "This name is used somewhere else.");
+  // Mark the name as used for a non-section symbol.
----------------
Moot point if we're going with the encoding change, but the collision handling mechanism needs to check that the output name is not also going to collide:
```
void f_$o() { }
void f$_o() { }
void f_$o1() { }
```



================
Comment at: llvm/test/CodeGen/PowerPC/aix-xcoff-symbol-rename.ll:2
+;; This file tests how llc handles symbol contains invalid character on XCOFF
+;; platform.
+;; Since symbol name resolution is the same between 32 bit and 64 bit,
----------------
Can't help but notice that none of the renamed entities are referenced via the TOC.

For example:
```
int f$o();
int (*bar())() { return f$o; }
```


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82481/new/

https://reviews.llvm.org/D82481





More information about the llvm-commits mailing list