[PATCH] D120781: [IRLinker] materialize Functions before moving any

Fri Mar 25 15:35:13 PDT 2022

dexonsmith added a comment.

In D120781#3404235 <https://reviews.llvm.org/D120781#3404235>, @nickdesaulniers wrote:

> So I think when BitcodeWriter is serializing a function, it can check whether each of the function's basicblocks has its address taken.  If so, it can build the set of GlobalValues that are Users of that BlockAddress.  I'm curious if this should be part of the record or an Abbreviation? (Not sure I fully understand what an abbreviation is, despite just reading https://llvm.org/docs/BitCodeFormat.html#abbreviations). It seems like adding a record is unconditional though? So not sure about making it optional/only emitted if necessary?

You'd either want to add a new record or add something to an existing record.

- An abbreviation stores "meta-information", describing the encoding of any record that refers to it.
- A record stores "content". It can optionally refer to any already-emitted abbreviation, customizing the record's encoding to match the abbreviation. This doesn't affect the record's content when parsed; it just changes the binary representation.
- If you change the number/order/content of fields in a record, it's possible the abbreviation it was using will no longer apply. You might need to update it, or use a different one.
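
To make the distinction concrete, here is a toy sketch (not LLVM's `BitstreamWriter` API; `Record`, `Abbrev`, and every function name below are made up for illustration, and abbreviation IDs are left out). It encodes the same record once unabbreviated, with every value as VBR6, and once through an abbreviation that fixes the code and the bit width of each field. Both decode to the same content; only the size differs. Adding a field makes the abbreviation stop applying.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Toy model: a record is a code plus operand values ("content").
struct Record {
  unsigned Code;
  std::vector<uint64_t> Ops;
  bool operator==(const Record &O) const {
    return Code == O.Code && Ops == O.Ops;
  }
};

// Toy abbreviation ("meta-information"): a literal record code and a fixed
// bit width for each operand.
struct Abbrev {
  unsigned Code;
  std::vector<unsigned> Widths;
};

using Bits = std::vector<bool>;

void emitFixed(Bits &Out, uint64_t V, unsigned Width) {
  for (unsigned I = 0; I != Width; ++I)
    Out.push_back((V >> I) & 1);
}

uint64_t readFixed(const Bits &In, size_t &Pos, unsigned Width) {
  uint64_t V = 0;
  for (unsigned I = 0; I != Width; ++I)
    V |= uint64_t(In[Pos++]) << I;
  return V;
}

// VBR6: chunks of 6 bits, 5 payload bits each; the high bit means "more".
void emitVBR6(Bits &Out, uint64_t V) {
  while (V >= 32) {
    emitFixed(Out, (V & 31) | 32, 6);
    V >>= 5;
  }
  emitFixed(Out, V, 6);
}

uint64_t readVBR6(const Bits &In, size_t &Pos) {
  uint64_t V = 0;
  for (unsigned Shift = 0;; Shift += 5) {
    uint64_t Chunk = readFixed(In, Pos, 6);
    V |= (Chunk & 31) << Shift;
    if (!(Chunk & 32))
      return V;
  }
}

// Unabbreviated form: code, operand count, then operands, all VBR6.
Bits encodeUnabbreviated(const Record &R) {
  Bits Out;
  emitVBR6(Out, R.Code);
  emitVBR6(Out, R.Ops.size());
  for (uint64_t Op : R.Ops)
    emitVBR6(Out, Op);
  return Out;
}

Record decodeUnabbreviated(const Bits &In) {
  size_t Pos = 0;
  Record R{static_cast<unsigned>(readVBR6(In, Pos)), {}};
  uint64_t NumOps = readVBR6(In, Pos);
  for (uint64_t I = 0; I != NumOps; ++I)
    R.Ops.push_back(readVBR6(In, Pos));
  return R;
}

// Abbreviated form: only the operands are written, at the widths the
// abbreviation dictates. Returns nullopt if the abbreviation does not apply.
std::optional<Bits> encodeWithAbbrev(const Record &R, const Abbrev &A) {
  if (R.Code != A.Code || R.Ops.size() != A.Widths.size())
    return std::nullopt;
  Bits Out;
  for (size_t I = 0; I != R.Ops.size(); ++I) {
    if (A.Widths[I] < 64 && (R.Ops[I] >> A.Widths[I]) != 0)
      return std::nullopt; // value does not fit the declared width
    emitFixed(Out, R.Ops[I], A.Widths[I]);
  }
  return Out;
}

Record decodeWithAbbrev(const Bits &In, const Abbrev &A) {
  size_t Pos = 0;
  Record R{A.Code, {}};
  for (unsigned W : A.Widths)
    R.Ops.push_back(readFixed(In, Pos, W));
  return R;
}
```

With a record `{7, {3, 200, 1}}` and an abbreviation `{7, {2, 8, 1}}`, the unabbreviated encoding takes 36 bits and the abbreviated one takes 11, yet both parse back to the same record.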

The other concept is a block, which is a scope into which you can put abbreviations, records, and other blocks. Blocks are nested, and are guaranteed to be 32-bit aligned (records are not even byte-aligned IIRC; they can start on any bit). You can remember where a block is and go back to read it later.
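
A rough illustration of "remember where a block is and go back", again as a toy model and not LLVM's real stream classes (`ToyStream`, `ToyCursor`, and their methods are hypothetical, and the block header here is simplified to a single length word; the real format carries more in the header):

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy bitstream: one bool per bit.
struct ToyStream {
  std::vector<bool> Bits;

  // Records may start on any bit.
  void emit(uint64_t V, unsigned Width) {
    for (unsigned I = 0; I != Width; ++I)
      Bits.push_back((V >> I) & 1);
  }

  void alignTo32() {
    while (Bits.size() % 32 != 0)
      Bits.push_back(false);
  }

  // Writes a 32-bit-aligned block: a length word (in 32-bit words) followed
  // by the payload. Returns the bit offset where the block starts.
  size_t emitBlock(const std::vector<uint32_t> &Words) {
    alignTo32();
    size_t Start = Bits.size();
    emit(Words.size(), 32);
    for (uint32_t W : Words)
      emit(W, 32);
    return Start;
  }
};

struct ToyCursor {
  const std::vector<bool> &Bits;
  size_t Pos = 0;

  uint64_t read(unsigned Width) {
    uint64_t V = 0;
    for (unsigned I = 0; I != Width; ++I)
      V |= uint64_t(Bits[Pos++]) << I;
    return V;
  }

  void jumpToBit(size_t Bit) { Pos = Bit; }

  // Skips a block without parsing its payload, using the length word.
  void skipBlock() {
    uint64_t NumWords = read(32);
    Pos += NumWords * 32;
  }

  std::vector<uint32_t> readBlock() {
    uint64_t NumWords = read(32);
    std::vector<uint32_t> Words;
    for (uint64_t I = 0; I != NumWords; ++I)
      Words.push_back(static_cast<uint32_t>(read(32)));
    return Words;
  }
};
```

A lazy reader in this model would note each block's start offset, call `skipBlock()` on the first pass, and later `jumpToBit()` back to the saved offset to parse the payload.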

> Then, when BitcodeReader is parsing function records (`parseFunctionRecord`), it can read back these sub-records (or abbreviations). When parsing a function body (lazily), it can check these sub-records and call `DeferredFunctionInfo.find`, `findFunctionInStream`, `Stream.JumpToBit`, `parseFunctionBody` (basically the steps of `BitcodeReader::materialize`).

Instead of adding a new field to a function record, I'd suggest adding a new record to `FUNCTION_BLOCK`, which contains the instructions and basic blocks. Then you're tying together:

- Loading basic blocks of a function
- Pulling up dependents that might refer directly to the function's basic blocks

I'd suggest creating a new record kind for this.
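
Something like the following toy model is what I have in mind for the reader side. It is a sketch only: `LazyFunction`, `LazyModule`, and `BlockAddressUsers` are hypothetical names and not existing `BitcodeReader` members, function IDs stand in for real values, and it tracks functions only (not other kinds of users). Each function's block carries a record listing the functions that take the address of its basic blocks, and materializing a function pulls those in transitively.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <vector>

// Hypothetical stand-in for a lazily loaded function.
struct LazyFunction {
  bool Materialized = false;
  // Contents of the proposed new record in this function's block: IDs of
  // functions that refer to this function's basic blocks via blockaddress.
  std::vector<unsigned> BlockAddressUsers;
};

struct LazyModule {
  std::map<unsigned, LazyFunction> Functions;

  // Materializes F and, transitively, every function listed as a user of
  // its block addresses. Returns the set of functions loaded by this call.
  std::set<unsigned> materialize(unsigned F) {
    std::set<unsigned> Loaded;
    std::vector<unsigned> Worklist{F};
    while (!Worklist.empty()) {
      unsigned ID = Worklist.back();
      Worklist.pop_back();
      LazyFunction &Fn = Functions.at(ID);
      if (Fn.Materialized)
        continue;
      Fn.Materialized = true; // stands in for parsing the function body
      Loaded.insert(ID);
      for (unsigned User : Fn.BlockAddressUsers)
        Worklist.push_back(User);
    }
    return Loaded;
  }
};
```

Because the list lives next to the basic blocks, the reader sees it exactly when it parses the body, so nothing extra has to be read for functions that are never materialized.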

Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D120781