[llvm] [clang] [Clang][IR] add TBAA metadata on pointer, union and array types. (PR #75177)

Wed Dec 20 14:16:00 PST 2023

================
@@ -4080,7 +4080,11 @@ LValue CodeGenFunction::EmitArraySubscriptExpr(const ArraySubscriptExpr *E,
         E->getType(), !getLangOpts().isSignedOverflowDefined(), SignedIndices,
         E->getExprLoc(), &arrayType, E->getBase());
     EltBaseInfo = ArrayLV.getBaseInfo();
-    EltTBAAInfo = CGM.getTBAAInfoForSubobject(ArrayLV, E->getType());
+    // If array is member of some aggregate, keep struct path TBAA information
+    // about it.
+    EltTBAAInfo = isa<MemberExpr>(Array) && CGM.getCodeGenOpts().ArrayTBAA
+                      ? ArrayLV.getTBAAInfo()
+                      : CGM.getTBAAInfoForSubobject(ArrayLV, E->getType());
----------------
rjmccall wrote:

Hmm. Okay, so if I understand correctly, the basic idea here is that TBAA for array types is just TBAA for the underlying element types, so if we have TBAA for an array l-value, whether it's struct-path TBAA or not, subscripting into the array can just preserve that TBAA onto the element. That gets complicated by the fact that we apparently use *char* as the TBAA for array types unless we're doing struct-path TBAA, so it's quite important that we override that, or else we basically lose TBAA completely for these subscripts.
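
To make the concern concrete, here is a minimal sketch of the kind of source this hunk affects. The `Packet` type and both function names are made up for illustration and are not part of the patch; whether the optimizer actually folds the reload depends on the optimization level and on the TBAA tags Clang emits.

```cpp
#include <cassert>

// Hypothetical aggregate with an array member (illustration only).
struct Packet {
  int words[4];
  float scale;
};

// p->words[0] is an int access reached through the member Packet::words.
// If that access is tagged as an int (optionally with a struct path), the
// optimizer may assume the float store below does not modify it and may
// reuse the stored value. If it were tagged as char, it would have to be
// treated as possibly aliasing the float store.
int store_then_reload(Packet *p, float *f) {
  p->words[0] = 1;    // int element store
  *f = 2.0f;          // float store
  return p->words[0]; // int element reload
}

// Small usage check: the result is 1 with or without the optimization,
// because an int object and a float object never overlap here.
void example_use() {
  Packet pkt{};
  int r = store_then_reload(&pkt, &pkt.scale);
  assert(r == 1);
  assert(pkt.scale == 2.0f);
}
```

The observable behaviour is the same either way; the difference is only in how much freedom the optimizer has between the store and the reload.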

At the very least, this needs to be reflected in the comment; the overall situation is very non-obvious locally. But I would actually prefer that we just unconditionally change the TBAA we use for array types, because using *char* there seems unjustifiable. And as far as I can see, that should be an NFC refactor, because it's not actually possible to perform an access directly of array type: arrays just decay into pointers in every context that would otherwise cause an access, and that decay ends up changing the TBAA we'd use anyway.
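
The decay argument can be checked at the language level. The sketch below uses a hypothetical `Holder` type and `sum` function; it only shows that subscripting and decay produce element-typed expressions, and says nothing about which TBAA tags Clang attaches to them.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical aggregate (illustration only).
struct Holder {
  int a[4];
};

// The member itself has array type...
static_assert(std::is_same<decltype(Holder::a), int[4]>::value,
              "member has array type");
// ...but it decays to a pointer to the element type...
static_assert(std::is_same<std::decay<decltype(Holder::a)>::type, int *>::value,
              "array decays to pointer to element");
// ...and a subscript expression is an l-value of the element type.
static_assert(
    std::is_same<decltype(std::declval<Holder &>().a[0]), int &>::value,
    "subscript yields an element l-value");

int sum(const Holder &h) {
  const int *p = h.a; // array-to-pointer decay
  int total = 0;
  for (int i = 0; i < 4; ++i)
    total += p[i]; // every load here is a load of int, never of int[4]
  return total;
}

void example_use() {
  Holder h{{1, 2, 3, 4}};
  assert(sum(h) == 10);
}
```

Copying a whole `Holder` does move the array bytes, but that is an aggregate copy rather than a scalar load or store of array type.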

That is, I think you should consider just doing the array part of this patch unconditionally.  The union and pointer changes are real increases in precision / risk, though, and should continue to be guarded with flags.

https://github.com/llvm/llvm-project/pull/75177