[PATCH] D60616: Make parseBitcodeFile use a named StructType, if it exists and matches.

Fri Aug 2 05:55:00 PDT 2019

t.p.northover added inline comments.

================
Comment at: llvm/lib/Bitcode/Reader/BitcodeReader.cpp:1824-1826
+        // 3. This kind of conflict should not happen.
+        return error(
+            "named struct types match by name and differ by structure");
----------------
arnt wrote:
> t.p.northover wrote:
> > Really? I thought loading multiple bitcode files into the same LLVMContext was fine and conflicting types were automatically renamed.
> > 
> > Am I misreading or will this turn that situation into an error?
> The test for "conflicting type" was not terribly sensible. It assumed that if all modules contained opaque references to the same opaque type, then the the same name meant the same type. However, if one module had a defined type and the others had opaque types, then it was a naming conflict, which was silently resolved by renaming.
> 
> The case that blew up for me involved loading the superclasses for a class into the same context. Suppose that C inherits B, which inherits A, and each has its own type and module. A's module contains one or more A-related struct types, B's and C's module contain opaque references. B's module contains, etc. Loading B and C into the same context would break, because C's references to B would be renamed, while those to A would be preserved. I struggled to find a rationale for this.
> 
> I chose to make it an error because there are cases where renaming isn't safe, and I can't see a way to detect it. Suppose modules D and E both contain defined struct types T, and module F contains an opaque T. What is intended? The existing code would behave differently depending on load order. If you load only D and E, renaming is safe. But when loading the second of D/E, the reader doesn't know whether F will appear.
> Loading B and C into the same context would break, because C's references to B would be renamed, while those to A would be preserved. I struggled to find a rationale for this.

It sounds like it'd result in weird names and inconsistent usage of each type, but not fundamentally change the semantics of the IR. I believe this is how LTO actually works.

> Suppose modules D and E both contain defined struct types T, and module F contains an opaque T. What is intended?

D and E would continue to use a type structurally equivalent to their version of T, but one of them would have to give up the name (and get something like %T.1).

I don't believe there are any constraints on what happens to F: it could use yet another (still) opaque %T.2 or either of the versions from D/E. Because the type was opaque it can't be doing any operations that actually care.

Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D60616/new/

https://reviews.llvm.org/D60616