[llvm] [ThinLTO] Don't mark calloc function dead (PR #72673)

Fangrui Song via llvm-commits llvm-commits at lists.llvm.org
Wed Nov 22 14:14:25 PST 2023


MaskRay wrote:

> > Updated test cases
> > > > The problem arises only when two (or more) libcalls are folded into one (or more) other libcalls.
> > > 
> > > 
> > > The problem here is that none of the allocation functions are in the RuntimeLibcalls list. All RuntimeLibcalls functions are preserved by the LTO process (in general, unless they are unavailable on a specific platform) because the backend can produce them even if the bitcode object file doesn't reference them. This case is no different: `calloc` isn't referenced before optimization.
> > 
> > 
> > Well, I've done a brief investigation on this, and it looks like RuntimeLibcalls handling boils down to this snippet of code in lld
> > ```
> >   if (!ctx.bitcodeFiles.empty())
> >     for (auto *s : lto::LTO::getRuntimeLibcallSymbols())
> >       handleLibcall(s);
> > ```
> > 
> > The `handleLibcall` function checks whether the IR symbol representing a runtime library function is lazy, and adds its bitcode file to the LTO input set if so. This doesn't really help much, because `calloc` (even if added to the RuntimeLibcalls set) is not a lazy symbol in my case. The problem is not that we don't add it but that we explicitly remove it in DCE. I'm starting to believe that calloc is a unique case (I haven't found anything else of the sort).
> 
> It isn't lld that is doing the handling that preserves these symbols, it is LTO, specifically llvm/lib/Object/IRSymtab.cpp. See the use of RuntimeLibcalls.def here and where it is used to preserve symbols later in the file:

For a complete fix, we should also add `calloc` to `LTO.cpp:static const char *libcallRoutineNames[]` so that it affects the following code:

```
  if (!ctx.bitcodeFiles.empty())
    for (auto *s : lto::LTO::getRuntimeLibcallSymbols())
      handleLibcall(s);
```

Middle-end library function optimizations may reference a runtime library function that is not in the referencer's symbol table.
If the definition is provided by a lazy bitcode file, we will find that we need to extract the lazy bitcode file after LTO compilation.
However, the bitcode file did not participate in the LTO compilation, and our model does not allow repeated LTO compilation.
As a result, the runtime library function will be either undefined or defined as a symbol without an associated section.
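To make the scenario concrete, here is a minimal sketch of source code that only references `malloc` and `memset`. An optimizer may fold that pair into a single `calloc` call, which is how `calloc` can become referenced after optimization without appearing in the referencer's symbol table. The names `makeZeroedBuffer` and `allZero` are hypothetical helpers for illustration and are not taken from the PR or its tests.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical helper, not from the PR.
// The source only names malloc and memset. An optimizer may rewrite the
// pair as one calloc call, so calloc can appear in the generated code
// even though this translation unit never mentions it.
unsigned char *makeZeroedBuffer(std::size_t n) {
  void *p = std::malloc(n);
  if (p == nullptr)
    return nullptr;
  std::memset(p, 0, n);
  return static_cast<unsigned char *>(p);
}

// Returns true if every byte in buf[0..n) is zero.
bool allZero(const unsigned char *buf, std::size_t n) {
  assert(buf != nullptr);
  for (std::size_t i = 0; i < n; ++i)
    if (buf[i] != 0)
      return false;
  return true;
}
```

Whether the fold actually happens depends on the optimization level and the target, so treat this as an illustration of the pattern rather than a guaranteed reproducer.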

To address this issue, we make two changes:

* Change the linker to extract all runtime library functions defined in lazy bitcode files. We cannot foretell which runtime library functions will be referenced, so we conservatively retain all of them (<https://reviews.llvm.org/D50017>).
* Set the `VisibleToRegularObj` bit for all runtime library functions in an IR symbol table. This prevents the symbol from being internalized or discarded.

The current patch handles the second point, but not the first. If we place `malloc` and `calloc` in different lazy bitcode files, we will find that `calloc` may not be correctly extracted, which leads to an incorrect definition.
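The first point can be illustrated with a toy model of lazy extraction. This is only a sketch under simplifying assumptions: `LazyBitcodeFile` and `extractProviders` are hypothetical names, not lld or LLVM APIs, and the `names` argument stands for the union of the symbols the inputs reference and the linker's libcall name list.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for a lazy (not yet extracted) bitcode file.
struct LazyBitcodeFile {
  std::string name;               // file name, e.g. "calloc.o"
  std::set<std::string> defines;  // symbols the file defines
};

// Toy model: before LTO compilation starts, extract every lazy file that
// defines one of the given names. A file that is not extracted here
// cannot take part in the LTO compilation later.
std::set<std::string> extractProviders(
    const std::vector<LazyBitcodeFile> &lazyFiles,
    const std::vector<std::string> &names) {
  std::set<std::string> extracted;
  for (const std::string &sym : names)
    for (const LazyBitcodeFile &file : lazyFiles)
      if (file.defines.count(sym) != 0)
        extracted.insert(file.name);
  return extracted;
}
```

In this model, with `malloc` and `calloc` defined in separate lazy files, a name list containing only `malloc` extracts just the `malloc` file. The `calloc` file joins the LTO input set only once `calloc` is in the list as well.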

https://github.com/llvm/llvm-project/pull/72673

