[PATCH] D128863: Add switch to use "source_file_name" instead of Module ID for globally promoted local

Mon Jul 18 06:52:02 PDT 2022

tejohnson added a comment.

In D128863#3623605 <https://reviews.llvm.org/D128863#3623605>, @void wrote:

> In D128863#3620768 <https://reviews.llvm.org/D128863#3620768>, @tejohnson wrote:
>
>>> During LTO a local promoted to a global gets a unique suffix based partially on the Module's ID.
>>
>> The hash is actually a hash of the full module IR, computed when we write out the bitcode before the ThinLTO link, saved in the bitcode file, then read into a table indexed by module Id.
>>
>>> This has a problem, because the Module ID is not the same as the original Module ID.
>>
>> Can you clarify what the different module IDs you are comparing here?
>
> I may have a misunderstanding of the module ID. I thought it was based on the name of the Module and not a hash of the full IR. However, the result is the same for the tool we're using. Here're the full details:
>
> - We build the Linux kernel with LTO and PGO enabled.
> - Later on, we need to generate a "live-patch" for the kernel (one that'll be applied to the kernel without rebooting it).
> - The mechanism we use to generate the live-patch does a large number of checks to determine what changed between the pre and post patched object files.
> - Changes in certain sections aren't valid.
>   - So if a patched local function "foo" is promoted to a global "foo.llvm.1234567890abcdef", and that function is called from one of the sections where changes are invalid, changing its promoted name (to say "foo.llvm.fedcba0987654321") because of a different Module ID hash causes validation to fail.
>
> So I wanted to avoid this validation muck by not relying upon the Module ID hash, which can apparently change. :-)

Sorry for the slow reply, I've been out all of July due to vacation followed by illness.

Ok, I understand the issue. The hash changes due to code changes in a module A that is allowed to change, but this has the side effect of superficially changing the callsite in another module B that isn't allowed to change. Couple concerns below.

Fundamentally, LTO is doing whole program and cross-module optimization. How do you ensure that a source change to module A doesn't actually affect the optimization in module B? E.g. the reason you are getting a call to a local foo from module A in module B in the first place is that module B decided to import the caller of foo. If foo or its caller changed in module A the importing decisions into module B could actually change its code generation. We could decide not to import foo's caller or foo itself might get imported. There are other whole program optimizations that could also affect the code generation in B depending on changes in other modules. How will you handle or prevent these cases?

Using the source path this way will fix this particular instance, however, you have to be careful and ensure that all source files are fully and uniquely qualified by their path when they are compiled. The kernel build may do this already, but it is somewhat dangerous to assume this in general so we'd have to document this clearly.

Change also needs a test. But I'm concerned overall as described earlier that this isn't a full fix for potential issues due to LTO changes.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D128863/new/

https://reviews.llvm.org/D128863