[PATCH] D88477: [analyzer] Overwrite cast type in getBinding only if that was null originally

Balázs Benics via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Wed Sep 30 07:11:30 PDT 2020


steakhal added a comment.

In D88477#2301641 <https://reviews.llvm.org/D88477#2301641>, @NoQ wrote:

> A load from a region of type `T` should always yield a value of type `T` because that's what a load is.
>
> I'd rather have it as an assertion: if the region is typed, the type shouldn't be specified. If such assertion is hard to satisfy, we could allow the same canonical type to be specified. But having a conflict between two sources of type information and resolving it by arbitrarily choosing one of them sounds scary and unreliable. This patch doesn't eliminate the source of the conflict, it simply changes the way we flip the coin to resolve it.

Oh, now I see. Thank you.

> Normally then loading through a casted pointer the conflict is resolved by representing the casted pointer differently, so that the pointer appeared to be a `TypedValueRegion` of the correct type. Namely, it is achieved by wrapping it into an `ElementRegion` with index 0 and the correct type.

Yes, I have seen that many times, now it's clear why :)

> Why didn't the same occur in this case? Do I understand correctly that `L` is just `&b`, not `&ElementRegion{b, 0 S32b, unsigned char **}`?

In the Summary's example, at location `#2` the value of `**p` is `&Element{SymRegion{reg_$0<int * a>},0 S64b,unsigned char}` - which is exactly what we stored at line `#1`.

  void test(int *a, char ***b, float *c) {
    *(unsigned char **)b = (unsigned char *)a; // #1
    if (**b == 0) // no-crash, #2
      ;
    // ...
  }

Dereferencing this location `**b` would result in a typed region of type `unsigned char`.
I suspect that the analyzer thinks that following a type-punned pointer will point to an object of the static type - which seems reasonable to me.

However, breaks the //invariant// that you said. Since we have mismatching types, namely `unsigned char` and `char*` - the type of the `TypedValueRegion` and the explicitly specified `LoadTy` respectively.

> [...] if the region is typed, the type shouldn't be specified.

IMO, we should relax this invariant in the following way:
If the region is typed and the type also explicitly specified, we will perform a cast to the explicitly stated type.
In other words, overwrite the `LoadTy` only if that was not given.

---

Another possibility could be to change the type of the region at the //bind//. By doing that, the Region's type would match the static type so any load afterward would use the correct //static// type as it should and we could assert your //relaxed invariant// requiring matching types when both are specified.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D88477/new/

https://reviews.llvm.org/D88477



More information about the cfe-commits mailing list