[llvm-dev] Uncovering non-determinism in LLVM - The Next Steps

Thu Jul 6 17:14:47 PDT 2017

On Thu, Jul 6, 2017 at 1:08 PM, Daniel Berlin <dberlin at dberlin.org> wrote:

>
>
> On Thu, Jul 6, 2017 at 12:34 PM, Sean Silva <chisophugis at gmail.com> wrote:
>
>>
>>
>> On Thu, Jul 6, 2017 at 10:20 AM, Daniel Berlin via llvm-dev <
>> llvm-dev at lists.llvm.org> wrote:
>>
>>>
>>>
>>> On Thu, Jul 6, 2017 at 8:02 AM, Robinson, Paul via llvm-dev <
>>> llvm-dev at lists.llvm.org> wrote:
>>>
>>>>
>>>>
>>>> > -----Original Message-----
>>>> > From: llvm-dev [mailto:llvm-dev-bounces at lists.llvm.org] On Behalf Of
>>>> > Grang, Mandeep Singh via llvm-dev
>>>> > Sent: Thursday, July 06, 2017 2:56 AM
>>>> > To: llvm-dev at lists.llvm.org
>>>> > Subject: [llvm-dev] Uncovering non-determinism in LLVM - The Next
>>>> Steps
>>>> >
>>>> > Hi all,
>>>> >
>>>> > Last year I had shared with the community my findings about instances
>>>> of
>>>> > non-determinism in llvm codegen. The major source of which was the
>>>> > iteration of unordered containers resulting in non-deterministic
>>>> > iteration order. In order to uncover such instances we had introduced
>>>> > "reverse iteration" of unordered containers (currently only enabled
>>>> for
>>>> > SmallPtrSet).
>>>> > I would now like to take this effort forward and propose to do the
>>>> > following:
>>>> >
>>>> > 1. We are in the process of setting up an internal nightly buildbot
>>>> > which would build llvm with the cmake flag -
>>>> > DLLVM_REVERSE_ITERATION:BOOL=ON.
>>>> > This will make all supported containers iterate in reverse order by
>>>>
>>>> I hope you mean all supported *unordered* containers here. :-)
>>>>
>>>> > default. We would then run "ninja check-all". Any failing unit test
>>>> is a
>>>> > sign of a potential non-determinism.
>>>>
>>>> When you did this with SmallPtrSet, were there tests that failed but
>>>> did not actually indicate non-determinism?
>>>>
>>>
>>> An example of this is the order of predecessors in the IR in phi nodes.
>>> There are passes that will create them in different orders depending on
>>> smallptrset iteration.
>>> This is "non-deterministic" in the sense that the textual form is
>>> different, but has the same semantic meaning either way.
>>> (Let's put aside the fact that allowing them  to have a different order
>>> than the actual block predecessors is a pointless waste of time :P)
>>>
>>> Whether you consider this non-deterministic depends on your goal.
>>>
>>> I would argue that any pass that behaves differently given
>>> phi [[1, block 1], [2, block 2]]
>>> and
>>> phi [[2, block 2], [1, block 1]]
>>>
>>> is just flat out broken (and we have some that break due to poor design,
>>> etc)
>>>
>>> So i wouldn't consider the above to be non-deterministic in any
>>> meaningful sense, despite it outputting different textual form.
>>>
>>
>> One of our definitions of non-determinism is simply "output from command
>> line tools should always be bit identical given identical inputs", which is
>> suitable for content-based caching build systems like Bazel.
>>
> Just to point out: These systems already often have to ignore whitespace
> differences, etc.
>
> I'd also argue any of these content-based caching build systems is going
> to be rarely used if they require every single tool that uses them to
> produce bit identical output (IE boiling the ocean), as opposed to letting
> the tools define what identical means or something.
>
>
>> I don't know how our bitcode encoding compares to the textual IR in the
>> case of your phi example, but assuming that that difference makes it into
>> the bitcode too, it would cause e.g. ThinLTO bitcode artifacts to violate
>> the content-based caching assumptions, even if semantically to the compiler
>> the difference is immaterial.
>>
>
> You are certainly welcome to try to make all these things that are
> semantically identical try to be completely syntactically identical as
> well, but i'm pretty uninterested in slowing down or banning passes from
> doing certain things to do that.
>
> IE if you want this to happen, IMHO, this should be happening in output
> writing, not somewhere else.
>
> To give another example: Currently, IIRC, the order of basic blocks the
> function iterator goes through is "as they appear in input".
> There is no real defined or required ordering (IE it's not, for example,
> sorted in a preorder walk from the entry block or something) for
> correctness.
>
> I could make a pass that randomizes the list which for (auto &BB : F)
> walks, and nothing else in the compiler should change :)
>
> I don't think you can say such a pass is broken.
>
> Hence I think if you want such constraints, they need to be placed on
> output orderings, not on what passes do.
>

Yeah, I agree it is mostly (if not entirely) an output canonicalization
issue in these cases.

-- Sean Silva

>
> I know that Richard has mentioned in the past at least for Clang the
>> intention is bit-identical output for bit-identical input.
>>
>> -- Sean Silva
>>
>>
>>>
>>>
>>>
>>> _______________________________________________
>>> LLVM Developers mailing list
>>> llvm-dev at lists.llvm.org
>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>
>>>
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20170706/4f5e99f5/attachment.html>