[lld] Fix lld crash wrt generated thunks growing away from the patched code (PR #170495)

Peter Smith via llvm-commits llvm-commits at lists.llvm.org
Mon Dec 8 07:02:31 PST 2025


================
@@ -597,7 +595,7 @@ AArch64Err843419Patcher::patchInputSectionDescription(
         uint64_t startAddr = isec->getVA(off);
         if (uint64_t patcheeOffset =
                 scanCortexA53Errata843419(isec, off, limit))
-          implementPatch(ctx, startAddr, patcheeOffset, isec, patches);
+          implementPatch(ctx, startAddr, patcheeOffset, isec, patches, dyn_cast<Symbol>(*codeSym));
----------------
smithp35 wrote:

> Thank you for the careful review!
> 
> Wouldn't it be the same to keep using the `sectionMap` but check if the symbol at hand is a mapping symbol? - I did implement your suggestion and the one I'm proposing, and it looks like both work for the current test cases, but would avoid adding another map, while still avoiding using mapping symbols, I believe:
> 
> ```c++
> auto sectionSymbol = *codeSym;
> 
> if (isAArch64MappingSymbol(*sectionSymbol)) {
>   // It is an error for a relocation to reference a mapping symbol, so create one instead.
>   sectionSymbol = addSyntheticLocal(ctx, "", STT_SECTION, 0, 0, *isec);
> }
> ```
> 
> wyt?

I agree they are logically equivalent.

The reason I suggested that they be in a separate map is that the existing code assumes that the vector of symbols returned from init() is in alternating order of $x and $d [1], some extra care will be needed to skip over any instances of a section symbol. I guess this could be mitigated by making sure that only one Section Symbol is added and it is sorted in init() so it is guaranteed to appear first and can be skipped over.
```
    auto codeSym = mapSyms.begin();
    if (codeSym->isSection())
      codeSym = std::next(codeSym());
    while (codeSym != mapSyms.end()) {
```
Will need some comments updating too.

Thinking in a bit more, perhaps a better solution is to make the map from InputSection* to a pair of Section Symbol and Mapping Symbol list.
```
  // A cache of the SectionSymbol (if it exists) and the mapping symbols 
  // defined by the InputSection sorted in order of ascending value with
  // redundant symbols removed. These describe
  // the ranges of code and data in an executable InputSection.
  llvm::DenseMap<InputSection *, std::pair<Defined*, std::vector<Defined *>>> sectionMap;
```
Then we could easily extract the Section Symbol without having the rest of the code have to skip over it.

[1] Looks like my earlier comment about `isDataMapSymbol` being unused was incorrect. I had read it as only adding the mapping symbol for "$x" but it is just only adding mapping symbols for executable sections, so it will add "$d" as well. 

> 
> > While I don't know of any code-generator that doesn't add a Section Symbol for each allocatable section, it is not required by the ELF specification. If sectionSymbolMap[isec] returns nullptr we can call addSyntheticLocal() with a symbol type of STT_SECTION, offset of 0, size 0, and the empty string "" as a name to generate one.
> 
> Regardless of the strategy, that case does happen in my new test, where there doesn't seem to find a symbol specifically for `Arch64AbsLongThunk` (This is not the same as the `_ret` one previously discussed). So the symbols currently named `__AArch64AbsLongThunk_$x` in my new test case will get "renamed" to just `__AArch64AbsLongThunk_` (but won't be a mapping symbol anymore).

OK, I guess LLD is acting as a code-generator that doesn't add a section symbol. I think this is a bonus as we'll have tested the code-path.

https://github.com/llvm/llvm-project/pull/170495


More information about the llvm-commits mailing list