[PATCH] D41296: Limit size of non-GlobalValue name

Tue Jul 30 17:39:33 PDT 2019

aaronpuchert added a comment.
Herald added a project: LLVM.

In D41296#962215 <https://reviews.llvm.org/D41296#962215>, @hfinkel wrote:

> Why don't we just cap all names at some point (and then just start adding numbers as we generally do to break degeneracies). It seems like, otherwise, we'll end up with these kinds of fixes in many places. Fixing this in one common place seems better.

This change has the unfortunate side effect of creating unexpected error messages in the LLParser that have been observed in software using LLVM <https://github.com/numba/numba/issues/3669>. Reading the discussion here does not indicate that this effect was intended.

It can even lead to cases where we can't read our own output. To see how this happens, let's simulate a much smaller limit of 4. The normal output of the test case would have been (I've stripped the metadata here):

  $ opt -S -O2 -o - value-with-long-name.ll
  define i32 @f(i32 %a, i32 %b) local_unnamed_addr #0 {
    %reass.add = add i32 %b, %a
    %reass.mul = shl i32 %reass.add, 1
    ret i32 %reass.mul
  }

Now we abbreviate both value names to `reas`, which then needs to be disambiguated:

  $ opt -S -O2 -o trunc.ll -non-global-value-max-name-size=4 value-with-long-name.ll
  define i32 @f(i32 %a, i32 %b) local_unnamed_addr #0 {
    %reas = add i32 %b, %a
    %reas2 = shl i32 %reas, 1
    ret i32 %reas2
  }

Now let's run that through `opt` again with the same limit:

  $ opt -S -O2 -o - -non-global-value-max-name-size=4 trunc.ll
  opt: trunc.ll:7:3: error: multiple definition of local value named 'reas2'
    %reas2 = shl i32 %reas, 1
    ^

This error comes from a check at the end of `LLParser::PerFunctionState::SetInstName`. The parser calls `Value::setName`, which truncates "reas2" to "reas" and then disambiguates it as "reas1". Then it produces the error because `Value::getName()` = "reas1" isn't the original name "reas2".

Despite such long names having limited practical use, I think we should generally be able to parse our own output. One possibility would be to look for a disambiguation scheme that doesn't make the names longer. But I think that's pretty hard, and it would still produce strange errors when users actually produce long names on their own. The other possibility is that we skip the truncation when called from the LLParser. Off the top of my head, I see three possibilities:

1. Add a boolean parameter to `setName()`. Not ideal.
2. Create another function like `setName()` (factoring out common parts) that doesn't truncate, and while we are at it, also doesn't discard local value names. That would make the check for `Context.shouldDiscardValueNames()` in `LLParser::Run()` obsolete.
3. Add another field to `LLVMContext`. We already have the DiscardValueNames property, so why not add a TruncateValueNames? But then we would need to make sure it's set correctly in `LLParser::Run()` just like the other one.

I slightly prefer variant 2, although 3 is probably more in line with the existing code. Any ideas or opinions about that?

Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D41296/new/

https://reviews.llvm.org/D41296