[PATCH] D148216: Add support for annotations in UpdateTestChecks (NFC)

Thu May 25 08:32:28 PDT 2023

hnrklssn added inline comments.

================
Comment at: clang/test/utils/update_cc_test_checks/Inputs/annotations.c.expected:12
+// CHECK-NEXT:    [[TMP1:%.*]] = load i32, ptr [[X]], align 4
+// CHECK-NEXT:    ret i32 [[TMP1]]
+//
----------------
nikic wrote:
> hnrklssn wrote:
> > nikic wrote:
> > > nikic wrote:
> > > > hnrklssn wrote:
> > > > > hnrklssn wrote:
> > > > > > nikic wrote:
> > > > > > > hnrklssn wrote:
> > > > > > > > delcypher wrote:
> > > > > > > > > @hnrklssn I just noticed we don't have a `CHECK` for what `META2` actually refers to. Should we?
> > > > > > > > > 
> > > > > > > > > Not something that has to be fixed in this patch, more just an observation.
> > > > > > > > Indeed this is true for metadata in general, presumably because the RHS often refer to things like other metadata identifiers. In the case of annotations they seem to always refer to simple strings however, so it would be feasible to do a straight match without having to do recursive matching or complex regexes to determine which part of the metadata to match against.
> > > > > > > > 
> > > > > > > > In many cases with metadata attached to IR nodes, multiple nodes refer to the same metadata node, so at least you verify that they still are consistent. But I agree that verifying the content would be a great future addition.
> > > > > > > You need to pass `--check-globals` to check the actual metadata.
> > > > > > When I add that to this test case it adds
> > > > > > 
> > > > > > ```
> > > > > > //.
> > > > > > // CHECK: attributes #0 = { noinline nounwind optnone "min-legal-vector-width"="0" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-features"="+cx8,+mmx,+sse,+sse2,+x87" }
> > > > > > //.
> > > > > > // CHECK: !0 = !{i32 1, !"wchar_size", i32 4}
> > > > > > // CHECK: !1 = !{!"clang version 17.0.0 (git at github.com:llvm/llvm-project.git 684914f47cf59e9ab6d8b0f73c58ca6272ea28d4)"}
> > > > > > // CHECK: !2 = !{!"auto-init"}
> > > > > > //.
> > > > > > ```
> > > > > > 
> > > > > > So it seems to just be doing a simple literal matching on all metadata, regardless of whether we captured that metadata in any filecheck variable. And it still doesn't use the META2 variable to match the definition. Am I missing something? If we use the literal metadata names instead of variable matching for the definitions, there isn't much point in doing variable matching for the metadata uses either, since the test still very much relies on the metadata numbering being stable.
> > > > > @nikic Do you have more information to add about how metadata definition matchers can be generated without hardcoding everything (which is kind of the opposite of what this patch is trying to do), or in general if you're happy with the state of the PR?
> > > > This works fine with update_test_checks, so it must be some bug in update_cc_test_checks in particular. From a quick look, I suspect it's because https://github.com/llvm/llvm-project/blob/3d05ab6d3e24e76ff53b8d7d623c436b4be5b809/llvm/utils/update_cc_test_checks.py#L447 hardcodes a True value, while update_test_checks makes this dependent on `--preserve-names`, which is disabled by default there.
> > > Or maybe just test this with update_test_checks? This change is valuable for that script as well, and it doesn't have this extra issue.
> > I couldn't find any uses of 
> > `--preserve-names` in the entire repo (not even in tests for update_test_checks), so I went ahead and changed update_cc_test_checks to pass True instead.
> > 
> > While at it I added 2 new options to match some, but not necessarily all, globals. I set the least verbose one (other than "none") as the default, emitting checks for the definitions of all globals that we emit matches for in the IR. Do you agree that this is reasonable?
> That sounds like a nice approach. My main question would be whether `--check-globals seen` is the right default: My gut feeling here is that this will introduce additional checks for attributes/metadata in tests that don't actually care about it, but I'm not particularly sure about that.
> 
> You might want to do some mass test regeneration and take a look over the diff to see how useful the changes would be.
> 
> If we do change the default, the default change likely need to be part of `--version 3` to avoid substantial test churn.
I've regenerated checks before and after my patch and compared them. Some findings:
 - there's a surprising number of custom metadata types that didn't get variable checkers but instead checked hardcoded metadata IDs (!sloc, !pcsections, !alias.scope, !noalias, !nonnull, !noundef, !dereferenceable, !nontemporal, !fpmath etc.). The tests often have checkers, but were removed when regenerating them before my patch, so something seems to have broken.
 - I'm split on whether global variables should match the exact name or get a matcher. Right now the patch handles them like any other global, but I'm assuming that the name actually carries significance for linking?
 - matching attributes seems useful in some cases, not matter too much in others, but often there aren't that many attributes so it does not pollute much. Tests where attribute matchers pollute should probably use --scrub-attributes in update_check_tests (does not exist in update_cc_check_tests) anyways
 - tests that include debug info generally do so deliberately, so matching against that seems reasonable
 - TBAA metadata doesn't give too much insight unless transitive/all. That said not all TBAA tests check metadata at all today (at least not llvm/test/Transforms/ArgumentPromotion/reserve-tbaa.ll)
 - Similar goes for !llvm.loop, the useful info is pointed to indirectly
 - Profile info doesn't seem to be included very often. It's probably reasonable to match against it when it's there. Found one test that was specifically created to check profile info metadata but lost the check during conversion to update_check_tests (llvm/test/Transforms/ArgumentPromotion/profile.ll)

Since the `transitive` setting provides substantially more insight than the `seen` setting for loop and tbaa, and only really explodes for debug info (which is rarely included, and deliberately so when done) I'm open to making it the default instead. For `opt` based tests I kind of think that `all` should be the default since any metadata in the output was either created by the tested passes, or explicitly included in the input.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D148216/new/

https://reviews.llvm.org/D148216