[PATCH] D156658: [clang][dataflow] When checking `ExprToLoc` convergence, only consider children of block terminator.

Tue Aug 22 04:46:23 PDT 2023

mboehme added a comment.

Sorry for the late response -- I was on vacation.

In D156658#4554347 <https://reviews.llvm.org/D156658#4554347>, @xazax.hun wrote:

> In D156658#4552965 <https://reviews.llvm.org/D156658#4552965>, @mboehme wrote:
>
>> I've investigated this in more detail. Unfortunately, it turns out that it's not quite as simple as just implementing widening on `ExprToLoc`.
>>
>> One of the reasons for this is that we only apply widening at loop heads, but the expressions that are "blocking" convergence may be contained in a block that is not a loop head.
>
> I am probably missing something, but why does it matter where we are doing the widening?

We only do widening at loop heads, and this means that widening only affects locations and values that flow into the loop from the outside or from a previous loop iteration.

But convergence can also be blocked by locations and values that are only used within the loop body. If these change from loop iteration to loop iteration and we don't perform widening on them, we will conclude that the state of the loop body never converges.
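
To make this concrete, here is a toy model of the problem. It is not the framework's code: `ToyState`, `transferLoopBody`, `equalOn` and `iterationsToConverge` are hypothetical names, and the assumption that the body expression `tmp` gets a fresh location on every pass is made up for the illustration. The sketch compares two convergence checks: one over the whole expression-to-location map, and one restricted to the expression that the block terminator depends on (here just `cond`), which is the approach this patch takes.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

// Toy stand-in for an expression-to-location map (expression name -> location id).
// Hypothetical; this is not the real Clang dataflow type.
using ExprToLocMap = std::map<std::string, int>;

struct ToyState {
  ExprToLocMap ExprToLoc;
  int NextLoc = 1; // counter used to hand out fresh location ids
};

// Simulates one pass over the loop body. Assumption for this sketch:
// "cond" always maps to the stable location 0, while "tmp" (an expression
// used only inside the body) is given a fresh location on every pass.
ToyState transferLoopBody(ToyState In) {
  ToyState Out = In;
  Out.ExprToLoc["cond"] = 0;
  Out.ExprToLoc["tmp"] = Out.NextLoc++;
  return Out;
}

// Returns true if A and B agree on every expression in Keys.
bool equalOn(const ExprToLocMap &A, const ExprToLocMap &B,
             const std::set<std::string> &Keys) {
  for (const auto &K : Keys) {
    auto ItA = A.find(K);
    auto ItB = B.find(K);
    bool HasA = ItA != A.end();
    bool HasB = ItB != B.end();
    if (HasA != HasB)
      return false;
    if (HasA && ItA->second != ItB->second)
      return false;
  }
  return true;
}

// Repeats the loop body until two consecutive states compare equal.
// Returns the number of passes needed, or -1 if MaxIters is exhausted.
int iterationsToConverge(bool OnlyTerminatorChildren, int MaxIters) {
  assert(MaxIters > 0);
  ToyState Prev = transferLoopBody(ToyState{});
  for (int I = 2; I <= MaxIters; ++I) {
    ToyState Cur = transferLoopBody(Prev);
    bool Same = OnlyTerminatorChildren
                    ? equalOn(Prev.ExprToLoc, Cur.ExprToLoc, {"cond"})
                    : Prev.ExprToLoc == Cur.ExprToLoc;
    if (Same)
      return I;
    Prev = Cur;
  }
  return -1;
}
```

In this toy, `iterationsToConverge(false, 50)` returns -1 because the location of `tmp` changes on every pass, while `iterationsToConverge(true, 50)` returns 2 because `cond` is stable.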

>> Essentially, what we want is a "top" `PointerValue` that does not have an associated `StorageLocation`. However, we don't want to eliminate the `PointerValue` entirely; we still want to be able to attach properties to it, so that, for example, an analysis can record that the `PointerValue` is non-null, even though we don't know what its exact location is.
>
> Another way to interpret "top": it points to a "summary" `StorageLocation` that can be any other `StorageLocation`, we just do not know which one. This interpretation/formulation has some advantages:
>
> - We have a `StorageLocation` to use when we dereference these top pointers.
> - It is compatible with the alias sets representation.
> - It is compatible with some other representations where we have other "summary" locations, like "UnknownStackLocation" or "UnknownHeapLocation".
>
> These summary memory locations are sort of the union of all the potential memory locations they could represent. I think in general it might be useful to embrace this idea, e.g., when we model arrays, we can have a single element region representing all the knowledge we know to be true for all elements of the array.

I like this!

I'll have to do some thinking about how we want to represent these unknown / "top" storage locations exactly. Is there going to be a singleton "top" storage location? Are we going to allow associating the "top" storage location with a value (probably not...)? And so on. But this seems like a good direction to investigate.
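
As a starting point for that discussion, here is a rough sketch of one possible shape. It assumes a singleton "top" location and is in no way a proposal for the actual API: `ToyStorageLocation`, `ToyPointerValue`, `topLocation`, `isTop` and `widen` are all hypothetical names, and the property map is a simplification. The idea it illustrates is that widening two pointers with different pointees yields a pointer to the summary location while keeping the properties both inputs agree on, such as being non-null. The sketch does not model values stored at locations, so it leaves open whether "top" may be associated with a value.

```cpp
#include <cassert>
#include <map>
#include <string>

// Toy storage location; hypothetical, not the framework's class.
struct ToyStorageLocation {
  std::string Name;
};

// Assumption for this sketch: a single "top" summary location that stands
// for "some storage location, we do not know which one".
ToyStorageLocation &topLocation() {
  static ToyStorageLocation Top{"<top>"};
  return Top;
}

bool isTop(const ToyStorageLocation &Loc) { return &Loc == &topLocation(); }

// Toy pointer value: always has a pointee (possibly "top") and carries
// boolean properties that an analysis may attach to it.
struct ToyPointerValue {
  ToyStorageLocation *Pointee;
  std::map<std::string, bool> Props;
};

// Widens Prev and Cur. If the pointees differ, the result points at "top".
// Only properties present in both inputs with the same value are kept.
ToyPointerValue widen(const ToyPointerValue &Prev, const ToyPointerValue &Cur) {
  assert(Prev.Pointee && Cur.Pointee);
  ToyPointerValue Result;
  Result.Pointee =
      Prev.Pointee == Cur.Pointee ? Cur.Pointee : &topLocation();
  for (const auto &KV : Prev.Props) {
    auto It = Cur.Props.find(KV.first);
    if (It != Cur.Props.end() && It->second == KV.second)
      Result.Props[KV.first] = KV.second;
  }
  return Result;
}
```

Under these assumptions, widening two non-null pointers to different locations gives a pointer whose pointee is "top" but which is still known to be non-null, and widening that result with itself changes nothing, so the process stabilizes.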

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D156658/new/

https://reviews.llvm.org/D156658