[PATCH] D81269: Fix null pointer dereference in `ProfileSummaryInfo::getPSI()`

Pietro Fezzardi via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Jun 5 16:48:39 PDT 2020


fez added a comment.

In D81269#2077071 <https://reviews.llvm.org/D81269#2077071>, @vsk wrote:

> When is a null PSI accessed? It's not clear to me how this happens (based on `ProfileSummaryInfoWrapperPass::doInitialization`).
>
> Could you share a reduced regression test that reproduces the crash without this patch applied?


At the moment, I can reliably reproduce it by compiling llvm itself with clang-9 -O2. In `lib/Transforms/InstCombine/InstructionCombining.cpp` the nullptr check on `PSI` gets removed. I checked in the compiled object file `InstructionCombining.cpp.o` and the nullptr check at line 3785 `(PSI && PSI->hasProfileSummary())` is gone.

However, I cannot reproduce the crash directly inside llvm's codebase. I tracked down the offending code to a library we're using in our codebase. It was working until upgrading llvm. I have the source of the library and I have minimized the offending code to the following:

  legacy::FunctionPassManager PM(MyModule);
  PM.add(createInstructionCombiningPass());
  PM.run(MyFunction);

In this case `doInitialization` is not called, so `PSI` is null.

On one hand, I agree that without calling `PM.doInitialization()` the API of the `legacy::FunctionPassManager` is being misused.
The contract of the API is violated, because without calling `doInitialization` one could end up using some uninitialized state.

On the other hand, the crash I got with the snipped above was surprising. I was debugging my code to figure out the problem, and I saw that `getPSI` was called in `InstructionCombining.cpp`, returning `nullptr`. After the call to `getPSI` I saw a null pointer check, so I assumed that the presence of `PSI` is optional, as the rest of the code seemed to suggest. So, I expected the `nullptr` check to fail, but inspecting stepping through it I jumped directly to the next instruction, and checking the underlying assembly I realized that the compiler had optimized away the check. Everything was even more confused by the fact that compiling llvm (not the snippet) with `-O0` the bug went away, because the optimizer just left the nullptr check on `PSI` in its right place.
After all, if the new API after the cosmetic change returns a reference, why hasn't the call sites be updated to have references instead of pointers? And why are the returned values from `getPSI` even checked for null if they are actually references? You see what I mean?

So, it seems to me that there is some kind of ambiguity here. In llvm's codebase `doInitialization` is always properly called. Hence, whenever `getPSI` gets called, `PSI` is never null. This is good. And this is also the reason why I cannot reproduce the bug directly in llvm's codebase.

However nobody theoretically stops a user to do something similar to the small snippet above. And in that case what happens is that, before the cosmetic patch I referenced, the code would work just fine, simply skipping over the initialization of `BFI` (which is guarded by the nullptr check on PSI). Instead, after the cosmetic change I referenced, the code crashes. In principle, there's nothing wrong with the crash per se, because it happens when someone violates the contract (always call `doInitialization` before `run`). But the interaction of the cosmetic change with undefined behavior makes very hard for someone debugging the code to find out what the real root cause, because the optimizer gets in the way. It's not an API that is hard to misuse. And for a user may not be straightforward to figure out that a crash in `runOnFunction` depends on the fact that `doInitialization` was not called AND the compiler decided to optimize away the nullptr check.

I realize that strictly speaking this may not be a bug, because it gets triggered only when breaking the contract, but the fact that it depends on optimization levels makes it kinda surprising. So, I don't know where do you want to go from here. Do you think that this is worth discussing or I should just drop it?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D81269/new/

https://reviews.llvm.org/D81269





More information about the llvm-commits mailing list