[PATCH] D81678: Introduce partialinit attribute at call sites for stricter poison analysis

Juneyoung Lee via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Jun 11 19:48:47 PDT 2020


aqjune added a comment.

Hi, this is really interesting. I was interested in statically analyzing whether a value is undef/poison, so I also thought about the existence of this attribute, but I never imagined that MSan would benefit from this attribute as well.

> The partialinit attribute is, in some sense, backwards: the definition is essentially "an argument *not* marked partialinit must not contain any poison values". We usually try to avoid negative reasoning like this; I'm afraid it'll make transforms harder to reason about.



> An alternative is to invert the meaning of the attribute and put it on all arguments that must be not poison. Those are a lot more common though.

I agree with these two opinions. IIUC, in LLVM IR, attaching attribute to an argument imposes more restriction to the value.
Changing the semantics of IR to raise UB when poison or undef is passed to a function call will affect soundness of optimizations as well: for example, function outlining or dead arg elimination can introduce such function calls.
If the change is big, maybe we can split the patch to (1) implement the attribute (2) enable adding the attribute, so (2) contains all the big & mechanical diffs.

> @efriedma 
>  The way that call argument coercion works is unsound in the presence of poison. An integer can't be partially poisoned: it's either poison, or not poison. We probably need to come up with some safer way to pass structs/unions.

This is true, clang frontend may lower an argument with aggregate type into one with large int type (such as i64).
However, can poison value be safely generated in C? Paddings or union with different size may contain undef bits, but not poison. Signed overflow is UB.
Undef value can exist bitwisely, so I think this is an orthogonal issue.

> @rsmith 
> It's not entirely clear to me what partialinit means. Does it mean that some bytes of the parameter might be undef / poison? Or does it mean that some bytes of the parameter that (for a struct parameter or array thereof) correspond to a struct member might be undef / poison? (Eg, if I pass a { i8, i32 } to a function, do we need to explicitly say that the three padding bytes are not initialized?)

For poison, I believe it is now used as a valuewise concept in LLVM, so if the argument is a non-aggregate type such as i32/float/i8* it should be just binary (whether it is poison or not).

If it is agreed that we should have inverted version of `partialinit`, I'd like to suggest calling it `frozen` instead... :)
We have the notion of frozen value in LangRef already: it is used as e.g. `the branch condition should be frozen, otherwise it is undefined behavior`.
If an argument is frozen, it does not have any undef bits or poison value.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D81678/new/

https://reviews.llvm.org/D81678





More information about the llvm-commits mailing list