[llvm-dev] Is clang+llvm deterministic?

章明 via llvm-dev llvm-dev at lists.llvm.org
Thu Jul 20 01:17:25 PDT 2017


Thank you for clarifying the status of the RNG feature!

What worries me is possible non-determinism in code generation in the latest release of LLVM/Clang.
It seems that I'll have to rely on the native assembly output of LLVM to get a consistent view of the control flow graph.
I may also have to dump the dominator trees and loop information produced by LLVM so that my instrumentation process can use them.
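
One way to probe for the non-determinism described above is to compile the same file twice with the same command line and compare the two assembly outputs byte for byte. The following is only a sketch: the function names are made up for illustration, and clang is assumed to be on the PATH. Identical outputs on a few runs are evidence, not a guarantee.

```python
import hashlib
import subprocess
import tempfile
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 hex digest of a file's bytes."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def outputs_match(path_a, path_b):
    """True if the two files have identical contents."""
    return file_digest(path_a) == file_digest(path_b)


def compile_to_asm(source, out_path, clang="clang"):
    """Emit assembly for `source`. -O2 is just an example optimization level."""
    subprocess.run([clang, "-O2", "-S", str(source), "-o", str(out_path)],
                   check=True)


def compiles_identically(source, clang="clang"):
    """Compile `source` twice with the same flags and compare the results."""
    with tempfile.TemporaryDirectory() as tmp:
        first = Path(tmp) / "run1.s"
        second = Path(tmp) / "run2.s"
        compile_to_asm(source, first, clang)
        compile_to_asm(source, second, clang)
        return outputs_match(first, second)
```

Calling compiles_identically("prog.c") (prog.c being a placeholder name) returns True when both runs produced the same assembly text.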


> -----Original Messages-----
> From: "Stephen Crane" <sjc at immunant.com>
> Sent Time: 2017-07-20 04:19:27 (Thursday)
> To: "章明" <editing at zju.edu.cn>
> Cc: "alexandre isoard" <alexandre.isoard at gmail.com>, llvm-dev <llvm-dev at lists.llvm.org>
> Subject: Re: [llvm-dev] Is clang+llvm deterministic?
> 
> That RNG is currently not used. There are some old stalled patches
> that use it, but they haven't been committed. These patches
> specifically use that RNG for intentionally randomizing compiler
> output.
> 
> I don't know of other major problems for reproducible control flow,
> but I'm not an expert. I guess there could always be weird edge cases
> like unstable iteration of hash tables of pointers?
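
To illustrate the hash-table concern just mentioned, here is a small Python analogy, not LLVM code. In CPython the default hash of an object is derived from its identity, so iterating a set of such objects yields an order that can change from run to run. Sorting by a key that is the same on every run removes the dependence on addresses. The Block class and both function names are hypothetical.

```python
class Block:
    """Stand-in for a basic block; uses the default identity-based hash."""

    def __init__(self, name):
        self.name = name


def emit_unstable(blocks):
    """Visit blocks in set order, which depends on object identity."""
    return [b.name for b in set(blocks)]


def emit_stable(blocks):
    """Visit blocks in an order derived only from run-independent data."""
    return [b.name for b in sorted(set(blocks), key=lambda b: b.name)]
```

As far as I know, LLVM's own answer to this kind of problem is to use insertion-ordered containers such as MapVector and SetVector wherever the iteration order can affect the output.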
> 
> - stephen
> 
> On Mon, Jul 17, 2017 at 12:36 AM, 章明 via llvm-dev
> <llvm-dev at lists.llvm.org> wrote:
> > I searched the source code of LLVM/Clang 4.0.0 for 'random_seed' with grep. It
> > seems the -frandom-seed option is not supported.
> >
> >
> > The -rng-seed option appears to be defined in
> > ./lib/Support/RandomNumberGenerator.cpp, which is the source file for class
> > RandomNumberGenerator. The constructor of class RandomNumberGenerator is
> > private and is only called by Module::createRNG (defined in
> > lib/IR/Module.cpp). But Module::createRNG does not seem to be called
> > anywhere, except by a unit test.
> >
> >
> > I also tried adding a line to print a message in Module::createRNG. The
> > modified code compiles without any error. However, when I run clang and llc
> > to compile a simple C program, the message is not printed out. This confirms
> > that Module::createRNG is not called by clang or llc.
> >
> >
> > -----Original Messages-----
> > From: "Alexandre Isoard" <alexandre.isoard at gmail.com>
> > Sent Time: 2017-07-17 03:49:48 (Monday)
> > To: "章明" <editing at zju.edu.cn>
> > Cc: llvm-dev <llvm-dev at lists.llvm.org>
> > Subject: Re: [llvm-dev] Is clang+llvm deterministic?
> >
> > Hi Ming Zhang,
> >
> > If you don't want to rely on Clang reproducibility, you could save the IR
> > into a .bc file. Clang can directly take a .bc file as input.
> >
> > You then:
> > - instrument a copy of that .bc file and run your counting
> > - add control flow checking on another copy of the original .bc file and
> > you have your final binary
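
The two steps above could be scripted roughly as follows. This is a sketch under assumptions: the helper names and file names are invented for illustration, -O2 is only an example optimization level, and extra_flags stands in for whatever switches enable the profiling or checking instrumentation, which depends on how that pass is built.

```python
def emit_bitcode_cmd(source, bitcode, clang="clang"):
    """Command that compiles `source` once and saves the IR as a .bc file."""
    return [clang, "-O2", "-emit-llvm", "-c", source, "-o", bitcode]


def build_from_bitcode_cmd(bitcode, output, extra_flags=(), clang="clang"):
    """Command that builds a binary from saved bitcode.

    `extra_flags` is a placeholder for instrumentation-specific options.
    """
    return [clang, bitcode, *extra_flags, "-o", output]
```

Both the profiling build and the checking build would then start from the same prog.bc, so the front end runs only once; each returned list can be passed to subprocess.run.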
> >
> > As for reproducibility, I think we try to preserve it, but sometimes we
> > lose it; you may have to specify -frandom-seed.
> >
> > On Sun, Jul 16, 2017 at 4:22 AM, 章明 via llvm-dev <llvm-dev at lists.llvm.org>
> > wrote:
> >>
> >> Hi, there,
> >>
> >>
> >> I am working on a project on software control flow checking, which
> >> instruments a program to check if the control flow at runtime matches the
> >> control flow graph computed at compile-time.
> >>
> >>
> >> My instrumentation process has to make use of control flow information,
> >> including the control flow graph and dominator/post-dominator trees, so it is
> >> better implemented as part of the compiler. On the other hand, I don't want any
> >> transformation pass to mess up the additional instrumentation code, so my
> >> instrumentation process has to be run after other transformation passes are
> >> complete. Therefore, I'd like to implement my instrumentation process as the
> >> last pass before the machine intermediate representation (MIR) is translated
> >> to native assembly code.
> >>
> >>
> >> My instrumentation process also needs to take basic block execution
> >> frequencies into consideration. So I have to compile the same program twice.
> >> First, the program is compiled, adding code to collect execution
> >> frequencies. Then, when the execution frequencies have been collected, the
> >> same program is compiled again to add control flow checking instructions,
> >> taking the execution frequencies into consideration. Obviously, the program
> >> profiled to collect execution frequencies and the program instrumented with
> >> control flow checking instructions have to be consistent. At least, they
> >> have to have the same basic blocks and identical control flow graphs. So my
> >> question is this: If I compile the same program twice using Clang, with the
> >> same command line, is it guaranteed that, at the point right before the MIRs
> >> are converted to native assembly code, the MIRs are identical?
> >>
> >>
> >> Thank you!
> >>
> >>
> >> Ming Zhang
> >>
> >>
> >> _______________________________________________
> >> LLVM Developers mailing list
> >> llvm-dev at lists.llvm.org
> >> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
> >>
> >
> >
> >
> > --
> > Alexandre Isoard
> >
> >
> >

