[llvm-dev] Writing static analyzers to detect non-deterministic behavior?

David Blaikie via llvm-dev llvm-dev at lists.llvm.org
Fri Aug 10 11:20:36 PDT 2018


On Fri, Aug 10, 2018 at 11:17 AM George Karpenkov <ekarpenkov at apple.com>
wrote:

> Iterating over unordered_map and unordered_set (or user-defined?
> llvm::DenseSet/DenseMap?) would seem like usual suspects for such checks.
>
> > I don’t know of any nice way to track whether an ordered walk of an
> unordered container leaks out into the final output of the program
>
> Precisely, so if it’s not doable for a checker, it’s probably not reliably
> doable for an engineer as well.
>

I wouldn't think that was the case - you might iterate one hashed container
and build another deep in other function calls - I've certainly debugged
through many layers of that sort of thing when poking around in LLVM.


> In my experience, most iterations over unordered containers tend to affect
> the final output,
>

I'd be surprised if that was the case - though a bit hard to measure. I
would guess LLVM has maybe O(hundreds) of iterations over hashed containers
that don't effect the final output - but that's just a guess.


> and if project owners really care about avoiding non-determinism it would
> IMO make sense to ban those.
>

Seems a bit severe to me. But I could be wrong.

- Dave


>
> On Aug 9, 2018, at 11:52 AM, Grang, Mandeep Singh via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
> Thanks for your response David.
>
> 1) I'm not sure it's do-able. I don't know of any nice way to track
> whether an ordered walk of an unordered container leaks out into the final
> output of the program. Only iterating over an unordered container is
> probably not a sufficient hint (it'd have a high false positive rate to
> warn on every instance of that) - and I don't have any great ideas about
> whether that false positive rate could be sufficiently reduced with maybe
> some heuristics about connecting one iteration to another container
> population, and to the final program output
>
> The idea I had in mind was to check the commutativity of each operation
> inside the iteration of the unordered container. I realize we may not be
> able to do this for every operation and even if we could that would still
> be a heuristic and prone to false positives. Maybe we can start with a
> simpler problem to tackle instead of the iteration order. That's why in the
> patch I pushed I wrote a very simple checker for std::sort with raw
> pointers.
>
> > 2) Maybe - but I would think that'd still end up using heuristics to
> guess at whether one iteration order impacts the order of another
> container, etc.
>
> Yes, maybe iteration order is more difficult to problem but we should be
> able to detect non-deterministic sorting order of keys with same values.
> The idea I had in mind is similar to what we do in llvm::sort (which is
> random shuffle of the container). We instrument the user code to sort a
> container twice - first is the user's normal std::sort, the second time we
> random shuffle the container and then sort again, and then check if there
> is a difference in the two outputs. The run time cost would be the cost of
> the shuffle + the extra sort + the cost of checking the two containers [O(n
> lg n)  + 2 O(n)].
>
> --Mandeep
>
> On 8/9/2018 11:13 AM, David Blaikie wrote:
>
> 1) I'm not sure it's do-able. I don't know of any nice way to track
> whether an ordered walk of an unordered container leaks out into the final
> output of the program. Only iterating over an unordered container is
> probably not a sufficient hint (it'd have a high false positive rate to
> warn on every instance of that) - and I don't have any great ideas about
> whether that false positive rate could be sufficiently reduced with maybe
> some heuristics about connecting one iteration to another container
> population, and to the final program output...
>
> 2) Maybe - but I would think that'd still end up using heuristics to guess
> at whether one iteration order impacts the order of another container, etc.
>
> On Wed, Aug 8, 2018 at 7:43 PM Grang, Mandeep Singh via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
>> In the past, I had added the ability in LLVM to uncover 2 types of
>> non-deterministic behaviors: iteration of unordered containers with
>> pointer-like keys and sorting of elements with the same keys.
>>
>> Now, I wanted to add checkers for these (and other types of
>> non-deterministic behaviors) so that they could be applied more widely.
>> I also realize that not all of these may be doable at compile-time.
>>
>> With that in mind, I have pushed a patch which I adds a new category of
>> checks for non-determinism: https://reviews.llvm.org/D50488.
>>
>> The first checker which I have pushed here will simply check if raw
>> pointers are being sorted. In subsequent patches, I will fine tune this
>> by handling more complex cases of sorting of pointer-like keys.
>>
>> I have another patch which checks for iteration of unordered containers
>> but that may be very noisy in its current form.
>>
>> I would like comments/suggestions from the community on:
>>
>> 1. Are static analysis checks to detect non-determinism something worth
>> doing/doable?
>>
>> 2. How about writing sanitizers to detect non-deterministic behavior?
>> Would that be too costly in terms of run time and instrumentation?
>>
>> --Mandeep
>>
>> _______________________________________________
>> LLVM Developers mailing list
>> llvm-dev at lists.llvm.org
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>
>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20180810/efb33ae4/attachment.html>


More information about the llvm-dev mailing list