[llvm-dev] Writing static analyzers to detect non-deterministic behavior?

George Karpenkov via llvm-dev llvm-dev at lists.llvm.org
Fri Aug 10 11:17:22 PDT 2018


Iterating over unordered_map and unordered_set (or user-defined? llvm::DenseSet/DenseMap?) would seem like usual suspects for such checks.

> I don’t know of any nice way to track whether an ordered walk of an unordered container leaks out into the final output of the program

Precisely, so if it’s not doable for a checker, it’s probably not reliably doable for an engineer as well.

In my experience, most iterations over unordered containers tend to affect the final output,
and if project owners really care about avoiding non-determinism it would IMO make sense to ban those.

> On Aug 9, 2018, at 11:52 AM, Grang, Mandeep Singh via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> 
> Thanks for your response David.
> 
> 1) I'm not sure it's do-able. I don't know of any nice way to track whether an ordered walk of an unordered container leaks out into the final output of the program. Only iterating over an unordered container is probably not a sufficient hint (it'd have a high false positive rate to warn on every instance of that) - and I don't have any great ideas about whether that false positive rate could be sufficiently reduced with maybe some heuristics about connecting one iteration to another container population, and to the final program output
> 
> The idea I had in mind was to check the commutativity of each operation inside the iteration of the unordered container. I realize we may not be able to do this for every operation and even if we could that would still be a heuristic and prone to false positives. Maybe we can start with a simpler problem to tackle instead of the iteration order. That's why in the patch I pushed I wrote a very simple checker for std::sort with raw pointers.
> 
> > 2) Maybe - but I would think that'd still end up using heuristics to guess at whether one iteration order impacts the order of another container, etc.
> 
> Yes, maybe iteration order is more difficult to problem but we should be able to detect non-deterministic sorting order of keys with same values. The idea I had in mind is similar to what we do in llvm::sort (which is random shuffle of the container). We instrument the user code to sort a container twice - first is the user's normal std::sort, the second time we random shuffle the container and then sort again, and then check if there is a difference in the two outputs. The run time cost would be the cost of the shuffle + the extra sort + the cost of checking the two containers [O(n lg n)  + 2 O(n)].
> 
> --Mandeep
> 
> 
> On 8/9/2018 11:13 AM, David Blaikie wrote:
>> 1) I'm not sure it's do-able. I don't know of any nice way to track whether an ordered walk of an unordered container leaks out into the final output of the program. Only iterating over an unordered container is probably not a sufficient hint (it'd have a high false positive rate to warn on every instance of that) - and I don't have any great ideas about whether that false positive rate could be sufficiently reduced with maybe some heuristics about connecting one iteration to another container population, and to the final program output... 
>> 
>> 2) Maybe - but I would think that'd still end up using heuristics to guess at whether one iteration order impacts the order of another container, etc.
>> 
>> On Wed, Aug 8, 2018 at 7:43 PM Grang, Mandeep Singh via llvm-dev <llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>> wrote:
>> In the past, I had added the ability in LLVM to uncover 2 types of 
>> non-deterministic behaviors: iteration of unordered containers with 
>> pointer-like keys and sorting of elements with the same keys.
>> 
>> Now, I wanted to add checkers for these (and other types of 
>> non-deterministic behaviors) so that they could be applied more widely. 
>> I also realize that not all of these may be doable at compile-time.
>> 
>> With that in mind, I have pushed a patch which I adds a new category of 
>> checks for non-determinism: https://reviews.llvm.org/D50488 <https://reviews.llvm.org/D50488>.
>> 
>> The first checker which I have pushed here will simply check if raw 
>> pointers are being sorted. In subsequent patches, I will fine tune this 
>> by handling more complex cases of sorting of pointer-like keys.
>> 
>> I have another patch which checks for iteration of unordered containers 
>> but that may be very noisy in its current form.
>> 
>> I would like comments/suggestions from the community on:
>> 
>> 1. Are static analysis checks to detect non-determinism something worth 
>> doing/doable?
>> 
>> 2. How about writing sanitizers to detect non-deterministic behavior? 
>> Would that be too costly in terms of run time and instrumentation?
>> 
>> --Mandeep
>> 
>> _______________________________________________
>> LLVM Developers mailing list
>> llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev <http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev>
> 
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20180810/965cbaa0/attachment.html>


More information about the llvm-dev mailing list