[PATCH] D102707: Fix non-global-value-max-name-size not considered by LLParser

Dimitry Andric via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Jan 27 12:23:20 PST 2022


dim added a comment.

In D102707#3269962 <https://reviews.llvm.org/D102707#3269962>, @mehdi_amini wrote:

> Your profile shows a great amount of maps lookup though. It can be that the map size exploded for some reason or that we query it much more?

Yes, it appears that removing the capping in `Value::setNameImpl()` is the culprit. Adding debug output shows that this function regularly gets names hundreds of megabytes long! Maybe that is an issue in itself, but it wasn't a problem in the past, because the names were usually capped at 1024 bytes. Such truncated, similar-looking strings could later run into a name conflict, but in that case another unique name would simply be generated.
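To make the old behaviour concrete, here is a minimal standalone sketch of "cap first, then make unique on conflict". It does not use LLVM's classes: `kMaxNameSize`, `capName` and `NameTable` are hypothetical names chosen for illustration, and resolving a conflict by appending a counter is an assumption about how the unique name is produced.

```cpp
#include <cstddef>
#include <string>
#include <string_view>
#include <unordered_set>

// Hypothetical stand-in for the 1024-byte cap mentioned above.
constexpr std::size_t kMaxNameSize = 1024;

// Truncate a non-global name to at most kMaxNameSize bytes.
// Global names are left untouched.
std::string_view capName(std::string_view name, bool isGlobal) {
  if (!isGlobal && name.size() > kMaxNameSize)
    return name.substr(0, kMaxNameSize);
  return name;
}

// Hypothetical symbol table: the name is capped before it is stored,
// and a counter is appended (assumed scheme) until it no longer conflicts.
class NameTable {
  std::unordered_set<std::string> names_;
  unsigned lastUnique_ = 0;

public:
  std::string insert(std::string_view requested, bool isGlobal = false) {
    std::string base(capName(requested, isGlobal));
    std::string candidate = base;
    while (!names_.insert(candidate).second)
      candidate = base + std::to_string(++lastUnique_);
    return candidate;
  }
};
```

With the cap in place, the table never stores much more than `kMaxNameSize` bytes per non-global name, no matter how long the requested name was; two long names sharing their first 1024 bytes just end up with different counter suffixes.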



================
Comment at: llvm/lib/IR/Value.cpp:326
 
-  // Cap the size of non-GlobalValue names.
-  if (NameRef.size() > NonGlobalValueMaxNameSize && !isa<GlobalValue>(this))
----------------
If I put back *this* particular part, the huge memory usage goes away!

It looks like `setNameImpl()` is called for every loop rotation iteration, leading to identifiers like this one (cut off here at 1024 chars):

```
for.body.i.i.i.i.i.for.body.i.i.i.i.i_crit_edge.i.for.body.i.i.i.i.i.for.body.i.i.i.i.i_crit_edge.i_crit_edge.i.for.body.i.i.i.i.i.for.body.i.i.i.i.i_crit_edge.i.for.body.i.i.i.i.i.for.body.i.i.i.i.i_crit_edge.i_crit_edge.i_crit_edge.i.for.body.i.i.i.i.i.for.body.i.i.i.i.i_crit_edge.i.for.body.i.i.i.i.i.for.body.i.i.i.i.i_crit_edge.i_crit_edge.i.for.body.i.i.i.i.i.for.body.i.i.i.i.i_crit_edge.i.for.body.i.i.i.i.i.for.body.i.i.i.i.i_crit_edge.i_crit_edge.i_crit_edge.i_crit_edge.i.for.body.i.i.i.i.i.for.body.i.i.i.i.i_crit_edge.i.for.body.i.i.i.i.i.for.body.i.i.i.i.i_crit_edge.i_crit_edge.i.for.body.i.i.i.i.i.for.body.i.i.i.i.i_crit_edge.i.for.body.i.i.i.i.i.for.body.i.i.i.i.i_crit_edge.i_crit_edge.i_crit_edge.i.for.body.i.i.i.i.i.for.body.i.i.i.i.i_crit_edge.i.for.body.i.i.i.i.i.for.body.i.i.i.i.i_crit_edge.i_crit_edge.i.for.body.i.i.i.i.i.for.body.i.i.i.i.i_crit_edge.i.for.body.i.i.i.i.i.for.body.i.i.i.i.i_crit_edge.i_crit_edge.i_crit_edge.i_crit_edge.i_crit_edge.i.for.body.i.i.i.i.i.for.body.i.i.i.i.i_crit_e
```

However, if the capping is not done at this point, the full identifier is passed through the rest of the function!

When I print `NameRef.size()` to `dbgs()`, I get sizes from 1 up to 260,046,833 (!), so these strings become a huge memory hog. They might get capped later on, but for some reason they still seem to get inserted into ValueMaps.
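As a rough sanity check on those sizes, the truncated identifier above looks like it is built by repeatedly applying `next = prev + "." + prev + "_crit_edge" + ".i"`, which roughly doubles the length each time. The sketch below models that; the recurrence is inferred from the visible prefix only, and `nextName` and `nameLengthAfter` are hypothetical helpers, not LLVM functions.

```cpp
#include <cstdint>
#include <string>

// Assumed naming step, inferred from the truncated identifier above:
// next = prev + "." + prev + "_crit_edge" + ".i"
std::string nextName(const std::string &prev) {
  return prev + "." + prev + "_crit_edge" + ".i";
}

// Length after `steps` applications of nextName, computed without
// actually building the (potentially enormous) string.
std::uint64_t nameLengthAfter(std::uint64_t initialLen, unsigned steps) {
  std::uint64_t len = initialLen;
  for (unsigned i = 0; i < steps; ++i)
    len = 2 * len + 13; // 1 for ".", 10 for "_crit_edge", 2 for ".i"
  return len;
}
```

Starting from the 18-byte `for.body.i.i.i.i.i`, `nameLengthAfter(18, 23)` gives 260,046,835 under this model, two bytes more than the largest size reported above. That would be consistent with the largest name being measured before a trailing `.i` is appended, though that part is a guess.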




Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D102707/new/

https://reviews.llvm.org/D102707


