[PATCH] D102552: [GlobalOpt] recompute alignments for loads and stores of updated globals

Sat May 15 06:29:47 PDT 2021

spatel added a comment.

In D102552#2761376 <https://reviews.llvm.org/D102552#2761376>, @lebedev.ri wrote:

> I think the bug is the other way around. For each scalar we are going to split off of aggregate,
> we should first determine the alignment for scalar by determining what largest legal alignment
> could it had as part of the aggregate (based on the alignment of the outer type, and offset).
> Then, we shouldn't //need// to update uses, because by then we didn't change the alignment,
> and if any use overestimated it, then it is, and was, UB.

I think we are already correctly finding the alignment of the updated global itself (see inline code comment). But I'm not sure how that can propagate to potentially stale alignment specifiers on the users other than what I proposed here.

The problem is that if we have an alignment that's known better than the minimum (and that alignment is correct, not UB, for the original code), it may not hold for the new inner type (see inline test comment).

Let me know if I'm not seeing it correctly (not too familiar with this!).

================
Comment at: llvm/lib/Transforms/IPO/GlobalOpt.cpp:555-557
       // Calculate the known alignment of the field.  If the original aggregate
       // had 256 byte alignment for example, something might depend on that:
       // propagate info to each field.
----------------
Alignment of the new global is updated here.

================
Comment at: llvm/test/Transforms/GlobalOpt/globalsra-align.ll:33
   %x = load i32*, i32** getelementptr inbounds ([2 x [7 x i32*]], [2 x [7 x i32*]]* @a, i64 0, i64 0, i64 0), align 1
   store i32* null, i32** getelementptr inbounds ([2 x [7 x i32*]], [2 x [7 x i32*]]* @a, i64 0, i64 1, i64 1), align 16
   ret i32* %x
----------------
This is accessing element 8 (7 + 1) of the 16-byte aligned global with 4-byte elements, so "align 16" is correct and the best that it can be for the original code, but that's wrong after we strip away the outer array and only have [7 x i32*]. This is the test that most closely models what is happening in the bug reports.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D102552/new/

https://reviews.llvm.org/D102552