[llvm-dev] Question about METADATA_BLOCKs in bitcode

Tue Oct 20 16:51:23 PDT 2015

> On 2015-Oct-20, at 16:02, Teresa Johnson <tejohnson at google.com> wrote:
> 
> I have a question about module-level METADATA_BLOCKs in the bitcode.
> There are currently two blocks with the METADATA_BLOCK id at module
> scope. The first has the module-level metadata values (consisting of
> some combination of METADATA_* record codes except for METADATA_KIND).
> The second consists only of METADATA_KIND records. The latter is used
> only in the METADATA_ATTACHMENT block within function blocks (for
> metadata attached to instructions).
> 
> For ThinLTO we want to delay the parsing of module level metadata
> until all functions have been imported from that module (there is some
> bookkeeping used to suture it up when we read it during a post-pass).
> However, I do need the METADATA_KIND records since that is used when
> parsing the function body being imported. In my ThinLTO prototype I
> was simply skipping the first METADATA_BLOCK and always parsing the
> second, since I knew from inspecting the BitcodeWriter that the second
> was the one with the METADATA_KIND records.
> 
> When working on the implementation of this handling for the upstream
> patches, I am reassessing how to do this perhaps more robustly. A few
> options:
> 
> 1) Should the two module-level blocks use different block ids since
> they hold different types of records? I.e. a METADATA_BLOCK_ID
> (holding the actual metadata) and a new METADATA_KIND_BLOCK_ID (for
> the metadata kind records). That way it is more obvious what the two
> separate blocks are for and it is also much more trivial to identify
> and skip just the former.

This SGTM.  The other options sound fragile (not generally correct).

Note that the reader will still have to understand the old schema.

> 2) Should I simply continue to assume that we will have two
> METADATA_BLOCKs at module level and that the kinds will always be in
> the latter?
> 
> 3) Another option is to peek inside the METADATA_BLOCK to check the
> first record's type, and if it isn't METADATA_KIND, skip the rest of
> the block. Looks like EnterSubBlock already optionally returns the
> size of the block in words, so I could use that to compute the bitcode
> offset to jump to before reading the first record, in case it is the
> wrong METADATA_BLOCK and I want to skip it.
> 
> Thoughts?
> 
> Thanks,
> Teresa
> 
> -- 
> Teresa Johnson | Software Engineer | tejohnson at google.com | 408-460-2413