[PATCH] D105251: [IR] Allow multiple instruction metadata attachments with same type

Fri Jul 2 11:25:03 PDT 2021

dexonsmith added a comment.

In D105251#2854679 <https://reviews.llvm.org/D105251#2854679>, @tejohnson wrote:

> In D105251#2854351 <https://reviews.llvm.org/D105251#2854351>, @dexonsmith wrote:
>
>> - We probably need an API for "setting" the `!"branch_weights"` entry (since blindly calling `addMetadata()` would make the above verifier fail)
>
> Not sure I follow - right now there shouldn't be multiple branch weights prof metadata and I don't see a need for that going forward (the heap profiler data would use a different/new tag for its !prof metadata).

Agreed.

Were you intending to substitute all existing calls to `setMetadata()` with calls to `addMetadata()`? I wrote the above bullet after noticing that doing so would be incorrect when updating metadata (e.g., fixing up branch weights after a transformation). Instead of replacing the tag you'll end up with two of them (rendering the IR invalid). My suggestion is to add a helper for the logic needed to update metadata safely.

Walking through my logic in more detail:

This patch (the one we're looking at) will give `Instruction` the capability of having multiple attachments of the same kind. It'll have a verifier check that disallows it in practice (since it'll initially be invalid IR for all metadata kinds), but then specific metadata kinds can opt-in as needed (by opting out of the verifier check, and adding new more specific checks as is useful).

A future patch will leverage this feature for `!prof` attachments. The suggested new rules for the `!prof` attachment treat it as a set, where nodes are uniqued by their first operand. But it's not a multiset; only one node is allowed for each identity tag. (Alternatively, you could see this as a map, where the data is the node and the key is its first operand.)

In some cases, code will have a priori knowledge that an instruction does not yet have a given tag attached. In those cases, it'll be safe to call `addMetadata()` since it won't invalidate the set.

In other cases, code will need logic something like this for the replace operation:

  auto replaceProfAttachment = [](const Instruction &I, MDNode &NewN) {
    SmallVector<MDNode *> MDs;
    I.getMetadata(MD_prof, MDs);
    for (int N = 0, NE = MDs.size(); N != NE; ++N) {
      if (MDs[N]->getOperand(0) != NewN.getOperand(0))
        continue;

      // Found a node with the same tag to replace.
      MDs[N] = &NewN;
      I.setMetadata(MD_prof, MDs);
      return;
    }
    // Tag not found. Safe to append.
    I.addMetadata(MD_prof, &NewN);
  };

Maybe an API like this could be generalized?

  I.addOrReplaceMetadataWithTaggedOperand(MD_prof, NewN, /*TaggedOperandNo=*/0);

Scanning through the code, many existing calls to `setMetadata()` with `MD_prof` are on new instructions (the former case where addMetadata is safe). But there are also a few that look like updates to me (the latter case where it isn't).

I also noticed calls to `setMetadata(MD_prof, nullptr)` -- not sure if you'd want those updated to delete just the branch weights (which would need a set-erase operation), or if it always would make sense to clear all `!prof` attachments together.

> Do you mean add something new like !heapprof? I thought about that, but it seemed cleaner to put all profile data under !prof. But it would solve part of the issue (needing multiple !prof due to other types of profile metadata on the inst), and avoid the need for this change if we also use a level of indirection when there is more than one heap profile per allocation.

Yes, exactly.

If you end up going that way (and abandoning this patch), probably one of us should update LangRef to better document the existing behaviour (maybe the commit message can reference this review for a few design points to consider if someone else needs the feature).

Also in that case, I'm curious if you think it'd make sense to split out `!valueprof` at some point to make it orthogonal from branch weights? (Or maybe it doesn't matter, since they'll never/rarely be on the same instructions?)

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D105251/new/

https://reviews.llvm.org/D105251