[clang] [C++20][Modules] Fix crash when function and lambda inside loaded from different modules (PR #109167)

Thu Sep 19 06:40:36 PDT 2024

dmpolukhin wrote:

> Would you like to explain more why this fail previously in more detail?

Original code in `ASTReader::finishPendingActions` looked like this:
```
    for (auto ID : PendingLambdas)
      GetDecl(ID);
    PendingLambdas.clear();
```
The issue here is that the code uses implicit iterators for `PendingLambdas`, but `GetDecl` may insert more elements into `PendingLambdas`, which can invalidate the iterators. In a good case, when there is no vector relocation, the new elements will be skipped. However, in the reproducer, the insertion caused the vector to relocate, resulting in a crash due to reading invalid values from deallocated memory. To address this issue, I have enclosed more than 5 lambdas to cause relocation in the new test cases.

In the new code, before reading the lambdas, I copy the existing lambdas to a new array. Alternatively, I could use an integer index for iteration and read the size of the vector on each iteration. Both approaches work fine, but I decided that running other pending actions might be better before starting to deserialize new lambdas. I can change it if you think iteration with an integer index is better.

> Also I am thinking if we can make the process more efficiently: (1) Can we avoid the visitor in the writing process? (2) Can we delay the loading of lambdas to the load of definitions of the functions?
> 
> I immediately expectation is that (1) is possible but (2) may not be not safe.
> 
> For (1), my idea is to record/register the information during the writing of LambdaExpr in ASTWriterStmt and then we can write this record after we write all the decls and types. Then we need to read this record eagerly. Maybe we can optimize it further by adding a slot in FunctionDecl to record the offset of such informations. So we can avoid the eager linear reading process.

Yeah, it will complicate things a lot, we visit statements only after serializing FunctionDecl. I run some profiling and it seems that `collectLambdas` inclusively takes only about 0.25% cycles own time is almost 0. I'll try to reduce it even further. I think we can scopes instead of statements - it should be more efficient and we do it for enumerating anonymous definitions and it seems to be fast enough.

> For (2), whether or not it is safe is, can we we load the lambdas without loading the definition of the functions? If not, we can add a code when getting the definition of the functions like:
> 
> ```
> if (lambdas not loaded)
>      getCanonicalDecl()->loadLambdas();
> ```
> 
> Then we can delay the loading of the lambdas to make it more efficient.

It will be too late in my example. Without my change lambdas get loaded during function body deserialization but it was too late because loading template specialization caused choosing canonical record for the lambda from another modules. Function itself is selected from the last loaded modules by lookup (i.e. it happens in reverse order of module loading) but template specialisations get loaded in direct order. I tried to change order of specilization loading to opposite but in real word it not always works because some modules can be loaded for example in case of `-fmodule-file=<module-name>=<path/to/BMI>`.

https://github.com/llvm/llvm-project/pull/109167